Week 3 & 4 (Aug.25 -- Sep.5 2014) | Computer-Assisted Language Learning

Permutation Test on Gender

1) Introduction

The utterances in the corpus were split into two groups based on the gender of the speaker. The test statistic was obtained by calculating the difference in mean fluency score between the two groups. The utterances in the two groups were then randomly permuted 50000 times in three experiments. In each permutation, the permuted difference was found by calculating the difference in mean fluency score between the two permuted groups. The p-value is the proportion of permutations in which the permuted difference is larger than the test statistic.
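
The procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the code used in the experiments: the function name, the seed and the example scores are assumptions made for the sketch, and the comparison uses "larger than" as stated in the text.

```python
import random


def permutation_test(group_a, group_b, n_permutations=50000, seed=0):
    """One-sided permutation test on the difference in means (a minus b).

    Returns the observed difference and the proportion of permutations
    whose permuted difference is larger than the observed one.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign utterances to the two groups
        permuted = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if permuted > observed:
            exceed += 1
    return observed, exceed / n_permutations


# Hypothetical fluency scores, for illustration only (not corpus data).
male_scores = [3.5, 4.0, 4.5, 3.0, 4.0]
female_scores = [3.0, 3.5, 2.5, 3.0, 4.0]
difference, p_value = permutation_test(male_scores, female_scores, n_permutations=2000)
print(difference, p_value)
```

The returned p-value is a proportion between 0 and 1, so it can be compared directly with the alpha value.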

2) Experiment Setup

Null Hypothesis: Male speakers do not have a better fluency score than female speakers

Alternative Hypothesis: Male speakers have a better fluency score than female speakers

Number of permutations: 50000

Alpha value: 5%

3) Result and Analysis

p-value: 0.0152

Figure 12 Standard Error Chart (Gender)

The p-value (0.0152) is smaller than the alpha value (0.05), so the null hypothesis is rejected. As shown in Figure 12, the standard error bars of the two groups do not overlap, and the mean fluency score of male speakers is significantly higher than that of female speakers. Thus, it can be concluded that male speakers have a better fluency score than female speakers, which is in contrast to conventional wisdom.
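
The standard error bars in a chart of this kind can be computed as in the sketch below. It assumes each bar spans the group mean plus or minus one standard error of the mean (sample standard deviation divided by the square root of the group size); the function names are hypothetical, and the report does not state how the chart was actually produced.

```python
import math
from statistics import mean, stdev


def mean_and_standard_error(scores):
    """Return the mean and the standard error of the mean of a list of scores."""
    m = mean(scores)
    se = stdev(scores) / math.sqrt(len(scores))  # sample standard deviation
    return m, se


def error_bars_overlap(scores_a, scores_b):
    """Check whether the (mean - SE, mean + SE) intervals of two groups overlap."""
    mean_a, se_a = mean_and_standard_error(scores_a)
    mean_b, se_b = mean_and_standard_error(scores_b)
    low_a, high_a = mean_a - se_a, mean_a + se_a
    low_b, high_b = mean_b - se_b, mean_b + se_b
    return low_a <= high_b and low_b <= high_a
```

Non-overlapping bars are a visual aid only; the decision to reject the null hypothesis rests on the permutation p-value.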

SCE13-0504
