The Internet Is Killing the World’s Languages. Can Google Help?
What languages do you speak, and are they different from the languages you use online? According to a recent study by prominent mathematical linguist András Kornai, fewer than 5 percent of the world’s languages will make the transition to the digital realm, and the Internet is contributing to the demise of the other 95 percent.
In his research, Kornai estimates that 2,500 of the more than 7,000 languages spoken in the world are endangered and that the vast majority of languages (spoken or otherwise) are not capable of what he terms “digital ascent.” That is, they won’t make the leap to online written usage. Kornai argues that the digital ascent is easier when the written form of a language is its primary method of communication. To survive in the coming years, languages must function digitally. Unfortunately, many of them are already dead.
The primary registry of data about the world’s languages is the Ethnologue database of the Summer Institute of Linguistics (SIL). As of December 2015, SIL estimates that there are 7,102 living languages in the world. Of that number, 916 (or 13 percent) are considered dying languages. Since 1950, 367 languages have gone extinct, a rate of loss of between five and six languages per year.
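The Ethnologue figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable and function names are invented for this example, and the 65-year span assumes the count runs from 1950 through the December 2015 date quoted above.

```python
# Figures quoted above from Ethnologue (December 2015).
TOTAL_LANGUAGES = 7102
DYING_LANGUAGES = 916
EXTINCT_SINCE_1950 = 367

# Assumption: the extinction count covers 1950 through 2015, i.e. 65 years.
YEARS_ELAPSED = 2015 - 1950


def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def loss_per_year(extinct, years):
    """Return the average number of languages lost per year."""
    return extinct / years


dying_share = share_percent(DYING_LANGUAGES, TOTAL_LANGUAGES)
annual_loss = loss_per_year(EXTINCT_SINCE_1950, YEARS_ELAPSED)

print(f"Dying share: {dying_share:.1f} percent")               # 12.9 percent
print(f"Average loss: {annual_loss:.1f} languages per year")   # 5.6 per year
```

Under these assumptions, the dying share rounds to 13 percent and the average loss falls between five and six languages per year.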
In terms of our spoken language capability, things are no better. In the European Union (EU), there are 24 officially recognized languages and more than 60 regional or minority languages. According to the European Commission, just over half of Europeans (54 percent) are able to hold a conversation in at least one additional language, a quarter (25 percent) are able to speak at least two additional languages, and one in 10 (10 percent) is conversant in at least three. In America, the numbers are more dismal: only one in four people can hold a conversation in a second language. Overall, according to Stockholm University linguist Mikael Parkvall, 80 percent of the global population speaks 1.69 languages per person, natively or non-natively.
And which languages rule the Internet? In a 2009 UNESCO report that monitored the languages of websites from 1996 to 2008, researchers found that 45 percent of the content on the web was in English (although this may have been caused by search engines only indexing English information). More recently, W3Techs reports that 54 percent of the web’s content is in English, outpacing the next language (Russian) by a factor of almost 10.
According to ITU’s latest data, 43 percent of the world’s population, or 3 billion people, are now online with some form of regular access to the Internet. And, if the numbers above are to be believed, much of the content they create and communicate with is in English. The World Economic Forum estimates that Google recognizes the most languages across its Search and Translate services, with 348 languages recognized on Google Search. That’s 348 languages out of the more than 7,000 identified by SIL, or just under 5 percent.
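As a rough check on that share, the short sketch below divides the 348 languages reported for Google Search by the SIL total of 7,102 quoted earlier, and also works out how many languages a 5 percent share would represent. The names are hypothetical and chosen only for this illustration.

```python
# Figures quoted in the article.
GOOGLE_SEARCH_LANGUAGES = 348   # World Economic Forum estimate
SIL_TOTAL_LANGUAGES = 7102      # Ethnologue, December 2015
SURVIVAL_SHARE = 0.05           # Kornai's rough upper bound for digital ascent


def coverage_percent(supported, total):
    """Return the percentage of all languages that are supported."""
    return 100.0 * supported / total


google_share = coverage_percent(GOOGLE_SEARCH_LANGUAGES, SIL_TOTAL_LANGUAGES)
languages_at_five_percent = SURVIVAL_SHARE * SIL_TOTAL_LANGUAGES

print(f"Google Search coverage: {google_share:.1f} percent")
print(f"5 percent of all languages: about {languages_at_five_percent:.0f}")
```

On these figures the coverage comes to about 4.9 percent, and a 5 percent share corresponds to roughly 355 languages, only a handful more than Google Search is reported to recognize.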
Interestingly, this 5 percent figure corresponds with Kornai’s language-survival estimate. Kornai says that around 5 percent of the world’s languages will make the digital leap, while the rest may face extinction. Google currently allows searching in just under 5 percent of all languages. Could increasing the number of languages Google indexes for searching also increase those languages’ chances of survival? It’s certainly an appealing avenue for further research.