Non English Speaking Users of PyPI - We need Help!
Hello!
It was suggested to me that it would be a good idea for me to throw my net wider, so I’m posting this here in the hopes that someone experienced with translations, programming as a non-native English speaker, and ideally teaching non-native English students/children/new people how to program can chime in and help us.
Essentially, we’re trying to make decisions about translations, what to support and how to support it, on a project called Warehouse (which will become PyPI 2.0 once it’s ready). One of the things that we’ve tried to do is set up the groundwork so that (after we’ve launched it as pypi.python.org) people can come in and translate the UI, to be more welcoming to folks who either are not native English speakers or may not speak English at all.
Sadly, I feel completely unprepared to really make any decisions in this area. I am a native English speaker who only speaks English, so this issue has no impact on me or my ability to use PyPI. From the few non-native English speakers I’ve talked to, I’ve had answers ranging from “Don’t bother to translate PyPI, programmers need to learn English” to “Gettext is good enough, just use that” to “Gettext has shortcomings that make it hard to accurately get high-quality translations, use a different engine instead”.
I’m hoping that by reaching out here, we can get some more opinions, ideally from people in the affected demographic.
Some information:
* The bulk of the content of PyPI is English (and we’re looking at translating the UI right now, not content); however, not all of the content is English [1][2].
* We’re currently using L20n.js; I went into detail about what it provides over gettext on distutils-sig [3].
* L20n.js is client side, and thus we have to worry about browser support.
* zh-cn is our second largest language (behind English), with 8-9% of the views on current PyPI coming from Chrome + zh-cn.
* The open issue for translations is https://github.com/pypa/warehouse/issues/881 which has more information on it as well.
I’d love it if people familiar with this sort of thing could weigh in, either on this thread, or on the issue, or privately to me if you’d feel more comfortable doing that. Also, please pass this on to any person or group you may know that might have useful input!
Thanks!
[1] https://pypi.python.org/pypi/byrlogin/1.1.3
[2] https://pypi.python.org/pypi/bypy/1.2.14
[3] https://mail.python.org/pipermail/distutils-sig/2016-January/028134.html
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Hi Donald,
One particular concern is for some Slavic languages and counts. If any of your messages has a specific number of things, you need to know what that number is to have a correct sentence (e.g. saying “I am 34 years old” and “I am 35 years old” have different grammatical structures; notably, you need to know what the number is to know how to spell “years”). I believe this is currently a limitation of gettext, but I also know that in many languages gettext is actually gettext + FEATURES, and it’s not clear to me if Python’s gettext solves this problem.
lvh
On Jan 26, 2016, at 3:27 PM, Laurens Van Houtven <_@lvh.cc> wrote:
Hi Donald,
One particular concern is for some Slavic languages and counts. If any of your messages has a specific number of things, you need to know what that number is to have a correct sentence (e.g. saying “I am 34 years old” and “I am 35 years old” have different grammatical structures; notably, you need to know what the number is to know how to spell “years”). I believe this is currently a limitation of gettext, but I also know that in many languages gettext is actually gettext + FEATURES, and it’s not clear to me if Python’s gettext solves this problem.
Hey, thanks! That’s great information, and I bet that’s why L20n.js (what we’re currently using, though it’s client side, so we need to decide whether that’s OK and, if it is, whether the browser support is OK) lets you sort of program your translations a bit.
An example from the L20n website is here: http://l20n.org/learn/putting-it-all-together-complex-plurals-example/
I’m not sure how to tell if Python’s gettext supports that. The documentation says “If a translation is found, apply the plural formula to n, and return the resulting message (some languages have more than two plural forms).” which sounds like it might work like that? I found https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html which goes way over my head, but it mentions Slavic languages, so that seems promising?
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
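One way to check the question above is to feed Python's standard gettext module a catalog whose header carries the Russian plural formula from the GNU manual page and see which form comes back. This is only a rough sketch: the build_mo helper, the message ids, and the Russian strings are made up for illustration and are not taken from Warehouse, and a real project would compile a .po file with msgfmt rather than build the catalog by hand.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack a {msgid: msgstr} dict of bytes into the GNU .mo binary layout.

    Hypothetical helper for this sketch only.
    """
    keys = sorted(messages)  # the empty msgid (catalog header) sorts first
    ids = b""
    strs = b""
    spans = []
    for key in keys:
        value = messages[key]
        spans.append((len(key), len(ids), len(value), len(strs)))
        ids += key + b"\0"
        strs += value + b"\0"
    count = len(keys)
    ids_start = 7 * 4 + 16 * count  # 28-byte header plus two (length, offset) tables
    strs_start = ids_start + len(ids)
    table = []
    for key_len, key_off, _, _ in spans:
        table += [key_len, ids_start + key_off]
    for _, _, val_len, val_off in spans:
        table += [val_len, strs_start + val_off]
    header = struct.pack(
        "<7I", 0x950412DE, 0, count, 7 * 4, 7 * 4 + 8 * count, 0, 0
    )
    return header + struct.pack("<%dI" % len(table), *table) + ids + strs


# Plural-Forms header for Russian, as listed in the GNU gettext manual.
HEADER = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : "
    "n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
)

# A plural entry stores "singular\0plural" as the id and all forms as the value.
catalog = {
    b"": HEADER.encode("utf-8"),
    b"%d year\0%d years": "%d год\0%d года\0%d лет".encode("utf-8"),
}

ru = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))


def years(n):
    """Return the count with the plural form that gettext selects for n."""
    return ru.ngettext("%d year", "%d years", n) % n


for n in (1, 11, 21, 34, 35):
    print(years(n))
```

With this catalog, 34 and 35 select different forms of "years", which is the case lvh described, so a single count per message appears to be covered by ngettext.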
Donald Stufft wrote on 26/01/16 at 15:22:
Essentially, we’re trying to make decisions about translations, what to support and how to support it on a project called Warehouse (which will become PyPI 2.0 once it’s ready). One of the things that we’ve tried to do is setup the
How is Warehouse being implemented? Is it using a framework (Django, Flask, etc.)?
Regards,
--
. Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org.ar/
Twitter: @facundobatista
On Jan 26, 2016, at 3:50 PM, Facundo Batista <facundo@taniquetil.com.ar> wrote:
Donald Stufft wrote on 26/01/16 at 15:22:
Essentially, we’re trying to make decisions about translations, what to support and how to support it on a project called Warehouse (which will become PyPI 2.0 once it’s ready). One of the things that we’ve tried to do is setup the
How is Warehouse being implemented? Is it using a framework (Django, Flask, etc.)?
It’s in Pyramid (code at https://github.com/pypa/warehouse).
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On 01/26/2016 09:46 PM, Donald Stufft wrote:
On Jan 26, 2016, at 3:27 PM, Laurens Van Houtven <_@lvh.cc> wrote:
Hi Donald,
One particular concern is for some Slavic languages and counts. If any of your messages has a specific number of things, you need to know what that number is to have a correct sentence (e.g. saying “I am 34 years old” and “I am 35 years old” have different grammatical structures; notably, you need to know what the number is to know how to spell “years”). I believe this is currently a limitation of gettext, but I also know that in many languages gettext is actually gettext + FEATURES, and it’s not clear to me if Python’s gettext solves this problem.
Hey, thanks! That’s great information, and I bet that’s why L20n.js (what we’re currently using, though it’s client side, so we need to decide whether that’s OK and, if it is, whether the browser support is OK) lets you sort of program your translations a bit.
Any stats on how many visitors have Javascript turned off?
An example from the L20n website is here http://l20n.org/learn/putting-it-all-together-complex-plurals-example/
I’m not sure how to tell if Python’s gettext supports that. The documentation says “If a translation is found, apply the plural formula to n, and return the resulting message (some languages have more than two plural forms).” which sounds like it might work like that? I found https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html which goes way over my head, but it mentions Slavic languages, so that seems promising?
Gettext's "ngettext" supports one number per message (translation unit), which is good enough for most UI cases. So you can translate "I'm 34 years old" perfectly, but "remaining time: 5 hours 3 minutes 1 second" is not possible. (The good news is that such a message can't be expressed through a single ngettext call even in English, so a translator never ends up with a piece of text that can't be translated well.)
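To illustrate the one-number-per-message point, here is a small sketch of a common workaround: pluralize each unit with its own ngettext call and then join the pieces. The format_remaining function and its message ids are hypothetical, and it uses the standard library's NullTranslations fallback (no catalog, English rule only), so it shows the shape of the calls rather than a real translation. Joining fragments like this may not read naturally in every language.

```python
import gettext

# NullTranslations is the stdlib fallback: with no catalog loaded, ngettext
# returns the singular id when n == 1 and the plural id otherwise.
t = gettext.NullTranslations()


def format_remaining(hours, minutes, seconds):
    """Build a "remaining time" string from three separately pluralized units.

    Hypothetical helper: each ngettext call carries exactly one number,
    which is the constraint described above.
    """
    parts = [
        t.ngettext("%d hour", "%d hours", hours) % hours,
        t.ngettext("%d minute", "%d minutes", minutes) % minutes,
        t.ngettext("%d second", "%d seconds", seconds) % seconds,
    ]
    return t.gettext("remaining time: %s") % " ".join(parts)


print(format_remaining(5, 3, 1))
```

Each unit becomes its own translation unit, so a translator sees three simple plural messages plus one wrapper string instead of a single message with three counts.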
participants (5)
- Donald Stufft
- Facundo Batista
- Igor Starikov
- Laurens Van Houtven
- Petr Viktorin