Resemblance-based Model

Automated Similarity Judgement Program (ASJP)

The Automated Similarity Judgement Program (ASJP) is a resemblance-based method that compares two languages at a time for lexical similarity. In the study by Brown, Holman, Wichmann & Velupillai (2007), the ASJP database consists of a 100-item list of core vocabulary for each of a number of widely distributed languages, the exact set depending on the sample chosen by the researchers. This list is known as the Swadesh 100-item list of core vocabulary, and its hundred glosses, or meanings, are analysed comparatively. The aim is to establish the point at which two related languages began to diverge and split into different branches. “Core vocabulary” refers to words for things that are common in the human environment, such as body parts, colours, and natural objects like the sun, water and rain (Brown, Holman, Wichmann & Velupillai, 2007).

ASJP proceeds in five steps and produces a Lexical Similarity Percentage (LSP) for every pair of languages compared. First, ASJP counts the items on the list (the Swadesh 100-item list of core vocabulary) for which the two languages have phonologically similar words. Second, this count is divided by the number of meanings on the list for which both languages have a word. Third, the result is multiplied by 100 to give the LSP. Fourth, the LSP is corrected for confounding factors, which yields the Subtracted Similarity Percentage (SSP). Finally, the SSP values form a database from which ASJP trees for the languages are produced.

A diagram of ASJP would look like the following:

1.1 100-item list

For this particular research, only a subset of the 100-item list of Swadesh (1955), consisting of the 40 most stable items, is used. The method for measuring stability is described in Holman, Wichmann, Brown, Velupillai, Müller & Bakker (2008).

Blood Bone Breast Come Die Dog Drink Ear
Eye Fire Fish Full Hand Hear Horn I
Knee Leaf Liver Louse Mountain Name New Night
Nose One Path Person See Skin Star Stone
Sun Tongue Tooth Tree Two Water We You (sg)

1.2 ASJP Orthography

Languages around the world use different writing systems, so comparing vocabularies in their native scripts is not an effective way to measure lexical similarity. Words therefore need to be transcribed into a uniform, standard orthography. The vocabulary items are simplified with the help of International Phonetic Alphabet (IPA) symbols. A distinctive feature of the ASJP orthography is that it consists of symbols found on the QWERTY keyboard commonly used for English. This is possible because a single ASJP symbol can correspond to many sounds. The orthography is constructed to represent all sounds that are common across the world's languages. Some languages have less common sounds that the orthography does not represent precisely. Such rare sounds are assigned the ASJP symbol closest to them in place and manner of articulation. It is also worth noting that the ASJP procedure is not time-consuming: most of the processes, such as formulating a standardized orthography, often take less than an hour (Brown, Holman, Wichmann & Velupillai, 2007).
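The transcription step can be illustrated with a minimal Python sketch. The table below is a small, partial mapping recalled from the published ASJP symbol set and should be checked against it before use; the function name `to_asjp`, the `fallback` marker and the sample strings are hypothetical and exist only for this illustration. Symbols outside the partial table are flagged rather than guessed, so that the closest ASJP symbol can be assigned by hand.

```python
# Partial, illustrative IPA-to-ASJP table (an assumption to verify against
# the published ASJP orthography; it is far from complete).
IPA_TO_ASJP = {
    "ŋ": "N",  # velar nasal
    "ʃ": "S",  # voiceless postalveolar fricative
    "ʒ": "Z",  # voiced postalveolar fricative
    "θ": "8",  # voiceless dental fricative
    "ð": "8",  # voiced dental fricative shares the same symbol
    "ʔ": "7",  # glottal stop
    "ə": "3",  # mid central vowel
}


def to_asjp(ipa_word: str, fallback: str = "?") -> str:
    """Transcribe an IPA string into keyboard symbols, one character at a time."""
    out = []
    for ch in ipa_word:
        if ch in IPA_TO_ASJP:
            out.append(IPA_TO_ASJP[ch])
        elif ch.isascii():
            # Plain keyboard letters are passed through unchanged.
            out.append(ch)
        else:
            # Unknown symbol: flag it for manual assignment.
            out.append(fallback)
    return "".join(out)


print(to_asjp("ʃip"))   # Sip
print(to_asjp("θɪŋ"))   # 8?N  (ɪ is not in the partial table)
```

Because two IPA symbols can map to the same ASJP symbol, the transcription is deliberately lossy, which is what allows a keyboard-sized symbol set to cover many sounds.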

1.3 ASJP Formula

1.3.1 Lexical Similarity Percentage (LSP)

\[
\mathrm{LSP} = \frac{\text{No. of items on the 100-item list that are phonologically similar}}{\text{No. of meanings on the list for which both languages have words}} \times 100
\]
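The LSP calculation can be sketched in Python as follows. This is a minimal illustration rather than the ASJP implementation: the similarity test `share_initial_symbol` is a crude, hypothetical stand-in for ASJP's actual similarity judgement, and the word lists are invented toy data, not entries from the ASJP database.

```python
from typing import Callable, Dict, Optional


def share_initial_symbol(a: str, b: str) -> bool:
    """Hypothetical stand-in for a similarity judgement (not ASJP's rule)."""
    return bool(a) and bool(b) and a[0] == b[0]


def lexical_similarity_percentage(
    words_a: Dict[str, Optional[str]],
    words_b: Dict[str, Optional[str]],
    is_similar: Callable[[str, str], bool] = share_initial_symbol,
) -> float:
    """Return the LSP for two word lists keyed by meaning."""
    # Only meanings attested in both languages enter the denominator.
    shared = [m for m in words_a if words_a[m] and words_b.get(m)]
    if not shared:
        raise ValueError("the two lists have no meanings in common")
    similar = sum(1 for m in shared if is_similar(words_a[m], words_b[m]))
    return 100.0 * similar / len(shared)


# Invented toy lists; "louse" is missing in language A and is therefore skipped.
lang_a = {"water": "wat3r", "fire": "fay3r", "dog": "dog", "louse": None}
lang_b = {"water": "vas3r", "fire": "foy3r", "dog": "hunt", "louse": "laus"}

lsp = lexical_similarity_percentage(lang_a, lang_b)
print(round(lsp, 2))  # 33.33: one similar pair out of three shared meanings
```

Passing the similarity test as a parameter keeps the percentage calculation separate from the question of how phonological similarity is judged.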

1.3.2 Subtracted Similarity Percentage (SSP)

\[
\text{Subtracted Similarity (\%)} = \text{Lexical Similarity (\%)} - \text{Phonological Similarity (\%)}
\]
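A minimal Python sketch of the correction follows. It assumes that the subtracted percentage is the lexical percentage minus the phonological percentage, and that the phonological percentage is obtained by applying the same similarity test to word pairs with different meanings. The similarity test `same_first_symbol`, the function name `similarity_percentages` and the word lists are all hypothetical, chosen only to make the arithmetic easy to follow.

```python
from itertools import product
from typing import Callable, Dict, Tuple


def same_first_symbol(a: str, b: str) -> bool:
    """Hypothetical stand-in for a similarity judgement (not ASJP's rule)."""
    return a[0] == b[0]


def similarity_percentages(
    words_a: Dict[str, str],
    words_b: Dict[str, str],
    is_similar: Callable[[str, str], bool] = same_first_symbol,
) -> Tuple[float, float, float]:
    """Return (LSP, phonological percentage, SSP) for two word lists."""
    shared = [m for m in words_a if m in words_b]
    if len(shared) < 2:
        raise ValueError("need at least two shared meanings")
    # Same-meaning pairs give the lexical similarity percentage.
    same = [is_similar(words_a[m], words_b[m]) for m in shared]
    # Different-meaning pairs estimate similarity expected by chance.
    diff = [
        is_similar(words_a[m], words_b[n])
        for m, n in product(shared, shared)
        if m != n
    ]
    lsp = 100.0 * sum(same) / len(same)
    psp = 100.0 * sum(diff) / len(diff)
    return lsp, psp, lsp - psp


# Invented toy lists, not entries from the ASJP database.
lang_a = {"fire": "fay3r", "fish": "fiS", "water": "wat3r", "dog": "dog"}
lang_b = {"fire": "foy3r", "fish": "fiS", "water": "vas3r", "dog": "hunt"}

lsp, psp, ssp = similarity_percentages(lang_a, lang_b)
print(round(lsp, 2), round(psp, 2), round(ssp, 2))  # 50.0 16.67 33.33
```

In this toy case two of the four same-meaning pairs are similar (50%), while two of the twelve different-meaning pairs are similar (about 16.67%), leaving a subtracted value of about 33.33%.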

1.4 ASJP Trees

Figure 1. ASJP Tree for Indo-European Languages
(Brown, Holman, Wichmann & Velupillai, 2007).

Figure 1 illustrates the ASJP tree for certain Indo-European languages. Note that the tree is not a full representation of the language family, because only the languages included in a given comparative study appear in it. Languages on the same branch, such as German and English, are lexically more similar than languages on different branches, such as German and Irish. Once such trees have been constructed, researchers can compare the results produced by ASJP with trees constructed manually by historical linguists. In this particular example, researchers broadly agree on the relationships among the Indo-European languages.
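How pairwise similarity scores can give rise to a branching structure is sketched below in Python. The sketch uses simple average-linkage clustering as a stand-in; it is not necessarily the algorithm ASJP uses to build its trees. The function name `average_linkage_tree` is hypothetical, and the three scores are placeholder numbers chosen only to mirror the ordering described above (German and English closer to each other than either is to Irish), not values from the ASJP database.

```python
from itertools import combinations


def average_linkage_tree(labels, similarity):
    """Greedily merge the most similar clusters; return a nested tuple."""
    # Each cluster is a pair: (subtree, list of member languages).
    clusters = [(name, [name]) for name in labels]

    def sim(x, y):
        # Average similarity over all cross-cluster language pairs.
        pairs = [(a, b) for a in x[1] for b in y[1]]
        return sum(similarity[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: sim(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [merged]
    return clusters[0][0]


# Placeholder scores, invented for illustration only.
scores = {
    frozenset(("English", "German")): 40.0,
    frozenset(("English", "Irish")): 10.0,
    frozenset(("German", "Irish")): 12.0,
}

tree = average_linkage_tree(["English", "German", "Irish"], scores)
print(tree)  # ('Irish', ('English', 'German'))
```

The nested tuple reads like a tree: English and German are joined first because they have the highest score, and Irish attaches to that pair afterwards.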