Geo-linguistics is the study of language in relation to geography. It is used to map the areas where different varieties of a language are spoken, and it has also been used to compare the types of sounds found in the languages of different areas. Recent geo-location technology has made this kind of linguistic research easier.
These technological improvements have helped researchers and policy makers answer critical questions about language at international, national and urban levels (Cartwright, 2006). They have been helpful in media marketing as well: knowing which varieties of a language are spoken in which areas allows advertisements to be tailored to those varieties (Altman, Portilla, 2004).
2.1 Using Geo-linguistics to study language evolution
2.1.1 Cartogram and Choropleth
A study by Petzold found that Wikipedia has more than 270 language varieties. For example, Wikipedia provides different versions of Chinese, including mainland Chinese, the Singaporean and Malaysian versions of simplified Chinese, Taiwan’s orthodox Chinese and Hong Kong’s traditional Chinese. Choropleth and cartogram maps are used to ‘analyze if geographic and linguistic affinity has helped or hindered Wikipedia language versions to evolve’ (Petzold, 2011). The choropleth acts as an indicator of linguistic development across the world, while the cartogram scales the size of each region in proportion to the quantity being mapped in the dataset (Petzold, 2011). Repeating these mappings over several years would give researchers a visible record of linguistic development in Wikipedia. Image 1 shows an example of a cartogram of the total number of Wikipedia articles in Europe, in which English appears to dominate.
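The difference between the two map types can be sketched in a few lines of Python. This is a minimal illustration and not Petzold's method: the function names, the equal-interval classing and the region values are all assumptions made up for the example. A choropleth assigns each region a shading class based on its value, while a cartogram gives each region a share of the drawn area proportional to its value.

```python
def choropleth_classes(values, n_classes=4):
    """Assign each region a shading class (0 = lightest) using equal intervals."""
    lo, hi = min(values.values()), max(values.values())
    # Fall back to a width of 1 when all values are equal, to avoid dividing by zero.
    width = (hi - lo) / n_classes or 1
    return {
        region: min(int((v - lo) / width), n_classes - 1)
        for region, v in values.items()
    }


def cartogram_area_shares(values):
    """Return the share of the total drawn area each region receives in a cartogram."""
    total = sum(values.values())
    return {region: v / total for region, v in values.items()}


# Hypothetical article counts per region (illustrative numbers, not real data).
articles = {"region_a": 100, "region_b": 300, "region_c": 600}

classes = choropleth_classes(articles)    # shading class per region
shares = cartogram_area_shares(articles)  # fraction of map area per region
```

In this sketch the choropleth keeps the true outline of each region and varies only the shading, whereas the cartogram distorts the outlines so that a region with many articles occupies more of the map.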
2.1.2 Network graphs
Network graphs represent the interconnections between languages. A language nearer the centre of the graph is considered more universal. The graphs show a core-periphery structure in which researchers can analyse how the links among languages restore, strengthen or change the hierarchical relationships between them. Network graphs also serve as a benchmark for analysing the cartogram results (Petzold, 2010). Image 2 is an example of a network graph of selected language varieties of Wikipedia.
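One way to make "nearer to the centre" concrete is a centrality score. The sketch below uses degree centrality, which is an assumption chosen for simplicity; the source does not say which measure the network graphs rely on. The language names and links are placeholders, not real data.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return each node's degree divided by the maximum possible degree (n - 1)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical links between language versions (placeholders only).
links = [
    ("lang_a", "lang_b"),
    ("lang_a", "lang_c"),
    ("lang_a", "lang_d"),
    ("lang_b", "lang_c"),
]

scores = degree_centrality(links)
core = max(scores, key=scores.get)  # the most connected language sits at the core
```

Languages with high scores would fall in the core of the graph and those with low scores on the periphery. Other measures, such as closeness or betweenness centrality, could give a different ordering.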
2.2 Studies involving Geo-linguistics
2.2.1 Geo-twitter
Being able to tag one's geographical location on social media might have been unthinkable a few decades ago, but it is now a commonly used feature, and researchers have been using it to their advantage. Social media have become closely woven into people's lives: users post frequent updates about their lives, and they write those updates in their own language varieties and dialects. Table 1 below shows that only 36% of the language used on Twitter is English, while the remainder is made up of other languages.
With the help of Geo-twitter, researchers have been able to associate language varieties with particular regions. According to Statista (2013), a reputable statistics portal, English is the most used language on Twitter, with 36% of usage. However, this statistic does not take into account the different varieties of English used in various regions. With only 140 characters available to get a message across to the twitterverse, users have coined acronyms and shortened words.
A study by Altman and Portilla analyses the evolution of words used on Twitter. For example, the word ‘because’ has evolved into ‘cuz’, ‘coz’ and ‘cause’, depending on how people in a region pronounce it. The analysis shows how the word has evolved differently in various geographical locations. The variants used in different locations were collected through an application programming interface (API), and around 300 sample tweets containing ‘because’ were gathered for the analysis. The research showed that in most cases the first syllable of ‘because’ is clipped, as in ‘cuz’ and ‘coz’. These spellings mark regional use because they are written as they are pronounced in different accents: ‘coz’ is used mainly by British English speakers, while ‘cuz’ is used mostly by American English speakers. The form ‘cz’ was also used in place of ‘because’, though less frequently, whereas ‘cs’ was never found. Clipping ‘because’ without removing the first syllable was less common.
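The kind of per-region counting described above can be sketched as follows. This is a minimal illustration and not the authors' actual pipeline: the function name, the tokenisation rule and the sample tweets are all invented for the example, and the step of retrieving tweets through an API is left out.

```python
import re
from collections import Counter, defaultdict

# Spelling variants of "because" discussed in the study.
VARIANTS = {"because", "cause", "cuz", "coz", "cz"}


def count_variants(tweets):
    """Count each variant of 'because' per region.

    `tweets` is an iterable of (region, text) pairs.
    """
    counts = defaultdict(Counter)
    for region, text in tweets:
        for token in re.findall(r"[a-z']+", text.lower()):
            token = token.strip("'")  # so that 'cause and 'cuz also match
            if token in VARIANTS:
                counts[region][token] += 1
    return counts


# Hypothetical sample tweets (invented for illustration, not collected data).
sample = [
    ("UK", "Staying in coz it's raining"),
    ("UK", "late again coz of the trains"),
    ("US", "Can't come cuz I'm working"),
    ("US", "tired 'cause of the game"),
]

by_region = count_variants(sample)
```

Comparing the counters for each region would then show which spelling dominates where. With real data the region would come from the geo-tag attached to each tweet.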
2.2.2 Geo-location and phonetic differences
A recent study used improved geo-location technologies, Google Earth and ArcGIS V. 10.0, to identify differences in the sounds of languages spoken in different areas. Its findings suggest that languages evolve to fit their environment. The study focused on ejective consonants, which are found in twenty percent of the world’s languages. Languages with ejective consonants tend to be located nearer to high-elevation areas than languages without them. This may be due to the lower air pressure at high elevations, which reduces the effort needed to compress the air in the pharyngeal cavity (Everett, 2013). The image below shows that compression of air is needed to produce ejective consonants. The study also proposes that languages at high altitudes may have evolved ejective sounds because they are easier to produce in that environment.
(ArcGIS V. 10.0 is a geographic information system that acts as a geographic information database.)
2.3 Limitations
2.3.1 Limitations of Geo-twitter
Automatic correction prevents a fair analysis. Many people use Twitter on their smartphones, and most smartphones provide automatic correction. Most smartphone and computer autocorrect tools recognise only American or British English and do not identify other varieties. Other varieties of English, or different sentence structures, therefore get corrected, leaving the data collector with data that resembles American or British English.
In the Geo-twitter analysis, the clipped forms can have meanings other than ‘because’, so some tweets may have been included wrongly. For example, ‘cos’ could also refer to the mathematical term cosine, and ‘cz’ could refer to the Czech Republic in other contexts.
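A crude way to screen out such false matches is to look for context words that point to the other sense. The sketch below is only an assumption about how this could be done and is not part of the study: the cue lists and the function name are hypothetical, and a keyword check like this would still miss many cases.

```python
import re

# Hypothetical context cues suggesting a token does NOT mean "because".
NON_BECAUSE_CUES = {
    "cos": {"sin", "tan", "cosine", "angle", "theta"},
    "cz": {"czech", "prague", "republic"},
}


def likely_means_because(token, text):
    """Return False when nearby words suggest another sense of the token."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    cues = NON_BECAUSE_CUES.get(token, set())
    return not (words & cues)


# Invented example tweets for illustration.
maths_tweet = "cos of the angle is 0.5"
chat_tweet = "home early cos I'm tired"

keep_maths = likely_means_because("cos", maths_tweet)  # "angle" is a cue word
keep_chat = likely_means_because("cos", chat_tweet)    # no cue words present
```

Tweets flagged by such a filter could be set aside for manual checking rather than being discarded outright.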
Tweets are short and provide little context for analysis. Like fingerprints, every individual's use of language differs slightly. Even people from the same region who grew up in the same kind of environment may use language differently. Twitter allows only 140 characters per post, and users also try to use as few words as possible. This gives the data collector minimal context for analysis.
Not all parts of a country have access to the internet, let alone Twitter. The map below shows that Twitter usage in Europe is concentrated in cities rather than rural areas. It may therefore be difficult to obtain data on the varieties of English spoken in different regions of the continent. Data collection may be restricted to regions with WiFi access where people use Twitter, and these tend to be urban areas where more educated people live. Data from such a source would reflect only the language variety spoken by the educated urban population. If the data were collected in perhaps 10 years’ time, when technological development and knowledge of Twitter have reached the rural population, the collection and analysis could be considered fair.
2.3.2 Limitations of the Geographical location study
Some outlier languages are spoken at high elevations but do not use ejectives (Everett, 2013). It is also difficult to classify languages by their current geographical location. Speakers of languages that originated in lower-altitude areas may have moved to higher-altitude areas during different waves of migration or as a result of globalization. A language may also be spoken in several regions with different geographical characteristics rather than in one place. These factors reduce the credibility of the study.