June 23 – 27 2014 | Computer-Assisted Language Learning

Corpus Pinyin Distribution Analysis — Phonetic (Phoneme)

A phoneme is the basic sound unit of a syllable in Hanyu Pinyin. Compared with last week's work, analyzing the distribution of phonemes across the entire corpus is more challenging, because a single syllable may consist of several phonemes. For example, “ca” consists of the phonemes “ts^h” and “a”. This analysis makes it possible to check whether the distribution of phonemes in the corpus is similar to published statistics on how frequently each phoneme appears in everyday modern Chinese conversation. The following table shows the results.

Phoneme	Initial/Final	Percentage (Corpus)	Frequency (Resource)	Percentage (Resource)	Absolute difference (proportion)
ŋ	ang,ong,iong,ing,eng,iang,uang,yang,yong	8.16%	621	6.38%	0.02
a	ang,ai,uan,ao,an,uai,ia,a,iao,ian,van,iang,uang,ua,yang,yuan,yao,yan,ya,yvan	17.89%	1279	13.13%	0.05
f	f,	1.36%	119	1.22%	0.00
i	ei,ai,iu,uai,iong,vn,in,ia,ing,ie,iao,ian,iang,i,ui,ya, yan, yang, yao, ye, yi, yin, ying, yong, you	24.45%	1422	14.60%	0.10
k	g,	2.45%	141	1.45%	0.01
kh	k,	1.09%	93	0.95%	0.00
l	l,	3.03%	223	2.29%	0.01
m	m,	2.10%	143	1.47%	0.01
n	n,uan,an,vn,in,ian,van,un,yvn,yan,yin	8.41%	800	8.21%	0.00
p	b,	2.53%	159	1.63%	0.01
ph	p,	0.76%	118	1.21%	0.00
r	r,er	1.41%	58	0.60%	0.01
s	s,	0.71%	305	3.13%	0.02
t	d,	4.22%	165	1.69%	0.03
th	t	2.17%	144	1.48%	0.01
ts	j,z	5.17%	351	3.60%	0.02
tsh	c,q	2.47%	223	2.29%	0.00
u	uan,iu,ong,ao,uai,iong,iao,uo,un,u,ui,ou,uang,ua,w,yong,yao	17.02%	1339	13.75%	0.03
x	h,s	3.13%	168	1.73%	0.01
y	van,v,yv, yvan, yve	2.46%	187	1.92%	0.01
ʂ	sh,	3.58%	189	1.94%	0.02
ə	ei,ve,iu,en,e,ing,ie,er,eng,o,uo,un,ui,ou,yve,ye,ying	18.29%	1130	11.60%	0.07
ʈʂ	zh,	2.68%	218	2.24%	0.00
ʈʂh	ch,	1.54%	144	1.48%	0.00
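
The counting step behind a table like this can be sketched as follows. This is a minimal illustration, not the script actually used for the report: the mapping covers only a few rows of the table, the initial list is deliberately tiny, and the names `PHONEME_UNITS`, `INITIALS`, `split_syllable`, `phonemes_of` and `phoneme_shares` are hypothetical. It also assumes that a phoneme's share is its count divided by the total number of phoneme tokens, which the report does not state.

```python
from collections import Counter

# A few phoneme -> initials/finals rows taken from the table (not the full table).
PHONEME_UNITS = {
    "tsh": ["c", "q"],
    "p": ["b"],
    "m": ["m"],
    "a": ["a", "an", "ao", "ai"],
    "n": ["an"],
    "u": ["ao"],
    "i": ["ai"],
}

# Hypothetical, deliberately small list of initials for this sketch.
INITIALS = ("b", "m", "c", "q")

# Invert the table: each initial/final maps to the phonemes it contains.
UNIT_PHONEMES = {}
for phoneme, units in PHONEME_UNITS.items():
    for unit in units:
        UNIT_PHONEMES.setdefault(unit, []).append(phoneme)


def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in sorted(INITIALS, key=len, reverse=True):
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "ai"


def phonemes_of(syllable):
    """List the phonemes of one syllable (order is not meaningful)."""
    initial, final = split_syllable(syllable)
    return [p for unit in (initial, final) if unit for p in UNIT_PHONEMES[unit]]


def phoneme_shares(syllables):
    """Share of each phoneme among all phoneme tokens in the syllable list."""
    counts = Counter(p for s in syllables for p in phonemes_of(s))
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()}


shares = phoneme_shares(["ca", "ban", "mao", "ai"])
for phoneme, share in sorted(shares.items()):
    print(f"{phoneme}\t{share:.2%}")
```

Inverting the table in this way keeps the mapping in the same shape as the published rows, so extending the sketch to the full inventory would only mean adding rows and initials.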

The above table shows that the distribution of phonemes in our corpus is broadly similar to the published statistics on phoneme frequency in modern Chinese. For 21 of the 24 phonemes the difference is 0.03 or less, and the largest difference is 0.10, for the phoneme i. This suggests that the corpus is suitable for helping beginners learn Mandarin Chinese.
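
The last column of the table appears to be the absolute gap between the two percentage columns, expressed as a proportion and rounded to two decimal places. That reading is inferred from the numbers, not stated by the source. A small sketch of the arithmetic, using three rows copied from the table (the function names and the 0.03 threshold are illustrative choices only):

```python
# (phoneme, corpus %, resource %) copied from three rows of the table.
ROWS = [
    ("f", 1.36, 1.22),
    ("t", 4.22, 1.69),
    ("i", 24.45, 14.60),
]


def abs_diff(corpus_pct, resource_pct):
    """Absolute gap between two percentages, as a proportion rounded to 2 places."""
    return round(abs(corpus_pct - resource_pct) / 100, 2)


def within(rows, threshold):
    """Phonemes whose absolute difference does not exceed the threshold."""
    return [name for name, c, r in rows if abs_diff(c, r) <= threshold]


for name, c, r in ROWS:
    print(f"{name}: {abs_diff(c, r):.2f}")

# Illustrative threshold; the report does not define a cut-off for "similar".
print(within(ROWS, 0.03))
```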

Resources:

http://lingua.mtsu.edu/chinese-computing/phonology/phoneme3500.php

http://lingua.mtsu.edu/chinese-computing/phonology2004/py2phoneme.php