In the lab session, students were introduced to the SPSS statistical software, which was used to carry out bivariate analysis:
- between a quantitative variable (salary) and a categorical variable (e.g. type of library) using a t-test
- between two quantitative variables (e.g. salary vs. age) using Pearson's r (the correlation coefficient)
- between two categorical variables (type of library vs. qualification) using a cross-tabulation and the chi-square test of independence.
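For anyone curious about what sits behind those three SPSS menu items, here is a minimal sketch of the test statistics in plain Python. It is not what was used in class (that was SPSS). The function names and the toy numbers are hypothetical, made up purely for illustration, and the sketch stops at the test statistic without computing a p-value. The t statistic uses the unequal-variance (Welch) form, which is an assumption on my part.

```python
import math
import statistics


def welch_t(a, b):
    """t statistic for two independent samples (unequal-variance form)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(va / len(a) + vb / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def chi_square(table):
    """Chi-square statistic and degrees of freedom for a cross-tab."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical toy data, invented for illustration only.
salary_public = [42, 45, 39, 48, 44]      # salary (thousands), public libraries
salary_academic = [50, 47, 53, 49, 51]    # salary (thousands), academic libraries
age = [25, 31, 38, 44, 52]
salary = [40, 43, 47, 52, 55]
crosstab = [[10, 20],                     # rows: type of library
            [20, 10]]                     # columns: qualification

print("t =", welch_t(salary_public, salary_academic))
print("r =", pearson_r(age, salary))
print("chi-square, df =", chi_square(crosstab))
```

SPSS reports the same kinds of statistics together with their p-values, and the p-value is the part students are expected to interpret.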
The lecture session reviewed three important statistical concepts that students should have some familiarity with from their undergraduate statistics class:
- normal distribution, standard deviation and standardised scores
- sampling distribution and interval estimation
- hypothesis testing and p-value
Students looked like they had a headache by the end of class.
Some international students asked me whether they needed to do any calculation by hand! I told them they just needed to know which icon to click in the software and how to interpret the output! Welcome to the 21st century!
I also told students they don’t have to memorise anything for the exam, except that in a normal distribution about 68% of the population lies within 1 SD of the mean, about 95% within 2 SD and about 99.7% within 3 SD. Not sure what they made of this!
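The 1, 2 and 3 SD figures can be checked in a few lines of Python using the standard library's `NormalDist`. This is only a sketch and assumes the population is exactly normal. The helper name `share_within` is my own, not something from the class or from SPSS.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal population lying within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The printed shares come out at 68.3%, 95.4% and 99.7%, so the rule of thumb is a rounded version of the exact figures.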
The next class will feature an “American Idol” competition — data mining version!