Thanks to the 97 people who have so far answered yesterday’s scripts poll. The results are rather neatly bunched into four groups: 14 could be seen by at least 83 of the 97; there were a further 8 in the 47-62 range; there is a cluster of 3 which could be seen by 15-21 people; and one outlier which only I could see when logged in using Firefox.
94 Декабрь Russian (100-200 mn)
92 ธันวาคม Thai (50-100 mn)
92 ديسمبر Arabic (200-500 mn)
92 דצמבר Hebrew (5-10 mn)
90 दिसंबर Hindi (200-500 mn)
89 டிசம்பர் Tamil (50-100 mn)
89 Դեկտեմբեր Armenian (5-10 mn)
88 ਦਸੰਬਰ Punjabi (50-100 mn)
88 ડિસેમ્બર Gujarati (50-100 mn)
85 クリスマス Japanese (100-200 mn)
84 聖誕節 Chinese (over 1 bn)
83 크리스마스 Korean (50-100 mn)
83 დეკემბერი Georgian (2-5 mn)
Taking it for granted that everyone could see the Latin alphabet clearly, this list includes the correct scripts for nine of the world’s ten languages with most speakers, and 23 of the top 25. I’ll address the missing languages when I get to them. The odd inclusions here are Greek and Hebrew (understandable for cultural reasons); Armenian and Georgian (which, despite their small number of native speakers, are geographically convenient to the massive information technology hub of Russia, and also relatively easy to code); and Thai, which quite probably says something about the relative openness of Thailand compared to some of its neighbours.
61 డిసెంబర్ Telugu (50-100 mn)
61 ಡಿಸೆಂಬರ್ Kannada (20-50 mn)
61 ഡിസംബര് Malayalam (20-50 mn)
60 ޑިސެމްބަރު Divehi (200,000-500,000)
58 ܟܢܘܢ ܐ Aramaic (2-5 mn)
54 ᑎᓯᒻᐳᕆ Inuit (20,000-50,000)
47 ᏓᏂᏍᏓᏲᎯᎲ Cherokee (20,000-50,000)
Actually these eight subdivide pretty clearly into three groups. Bangla, Telugu, Kannada and Malayalam are South Asian scripts which somehow have not achieved the penetration that their number of speakers would suggest. This is particularly striking for Bangla, which, unlike the other three, is the sole official language of a sovereign state. Divehi (which is the official language of the Maldives) and Aramaic may not be obvious partners, but in fact both scripts are related to Arabic, so if you have coded for one you may as well code for the other. Inuit and Cherokee are the two least-spoken languages on the entire list, and I suspect that their alphabets may not be all that widely used even by native speakers (Latin transcription of both languages is fairly common), but like Georgian and Armenian they have the advantage of being relatively easy to code and on a convenient continent for coders.
18 දෙසැම්බර් Sinhalese (10-20 mn)
15 បុណ្យណូអែល Khmer (5-10 mn)
The Ge’ez script is used for Tigrinya as well as Amharic, so may need to be bumped up a population category; notably it is the only indigenous African script in the list. All three of these score rather lower than one would expect for the official language of a sovereign state (two sovereign states if one counts Eritrea as well as Ethiopia).
Isn’t that shocking? Burmese script is not easy for us alphabet-users, but really is no more difficult than the other South Asian and South-East Asian scripts. I would be interested to know more about the politics and policies which have put Thai so far ahead and Burmese so far behind compared with their neighbours. You may remember that Cory Doctorow’s book Little Brother is to be translated into Burmese, Karen (which also uses Burmese script), Shin and Kachin; I think this survey rather illustrates why that is a good idea.
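For anyone who wants to check the bunching for themselves, here is a minimal Python sketch that sorts tallies into the bands described at the top of this post and turns each one into a share of the 97 respondents. It uses only the tallies listed in this post; the band labels and the names `TALLIES`, `BANDS`, `band_for`, `group_by_band` and `share` are my own inventions for illustration, not part of the poll software.

```python
from collections import defaultdict

RESPONDENTS = 97  # number of people who had answered the poll so far

# Tallies as listed in this post: script -> number of people who could see it.
TALLIES = {
    "Russian": 94, "Thai": 92, "Arabic": 92, "Hebrew": 92, "Hindi": 90,
    "Tamil": 89, "Armenian": 89, "Punjabi": 88, "Gujarati": 88,
    "Japanese": 85, "Chinese": 84, "Korean": 83, "Georgian": 83,
    "Telugu": 61, "Kannada": 61, "Malayalam": 61, "Divehi": 60,
    "Aramaic": 58, "Inuit": 54, "Cherokee": 47,
    "Sinhalese": 18, "Khmer": 15,
}

# Inclusive bands taken from the ranges quoted in the post.
# The labels are hypothetical, chosen only for readability.
BANDS = [
    ("widely visible", 83, 97),
    ("middle", 47, 62),
    ("rarely visible", 15, 21),
]


def band_for(count):
    """Return the label of the band containing count, or 'unbanded'."""
    for label, low, high in BANDS:
        if low <= count <= high:
            return label
    return "unbanded"


def group_by_band(tallies):
    """Map each band label to the list of scripts whose tally falls in it."""
    groups = defaultdict(list)
    for script, count in tallies.items():
        groups[band_for(count)].append(script)
    return dict(groups)


def share(count, respondents=RESPONDENTS):
    """Percentage of respondents who could see a script, to one decimal."""
    return round(100 * count / respondents, 1)


groups = group_by_band(TALLIES)

if __name__ == "__main__":
    for label, _, _ in BANDS:
        scripts = groups.get(label, [])
        print(f"{label}: {len(scripts)} scripts")
        for script in scripts:
            print(f"  {script}: {TALLIES[script]} ({share(TALLIES[script])}%)")
```

Any tally that falls between the bands would land under "unbanded", which is a quick way to see how clean the gaps between the groups really are.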