Assignment

Assignment due date: 18 Jan 2018

Part 1. Analysis of experimental data (70%)

A researcher investigates typing speed in French with two computerized systems, No Help and Auto-Completion, and with two different keyboards: (i) a QWERTY and (ii) an AZERTY keyboard.

The researcher makes two hypotheses:

To test these hypotheses, the researcher runs an experiment with 32 French-speaking participants. Participants are divided into two independent groups, each with 16 participants. The first group (G1) of participants use the AZERTY keyboard, while the second group (G2) of participants use the QWERTY keyboard. All 32 participants test both the No Help and the Auto-Completion system (repeated-measures), one after the other.

To evaluate typing performance, the researcher measures the words-per-minute (WPM) rate, which is the average number of words typed per minute. Assume that this measure is normally distributed.

Data Preparation. Suppose that the researcher has completed the experiment and collected the results. We simulate the data-generation process by random sampling from fixed populations, using this R script. Run the script and then use the resulting dataset.csv file as your dataset. Note that each of you will generate a different dataset, so the results of your analyses may differ.

The file should contain four columns; the third and fourth columns give the WPM rate for no-help and for auto-correction, respectively.

Exercise 1.1. Write an R script to calculate and report descriptive statistics for your experimental data. The descriptive statistics will include means and standard deviations for the full data set but also for individual conditions: for each group (azerty, qwerty) and for each system (no-help, auto-correction). In addition to descriptive statistics, graphically summarize your data by using box plots.
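As a starting point, here is a minimal sketch of how the descriptive statistics and box plots could be produced in base R. It is not a model solution. It uses made-up toy data in place of dataset.csv, and the column names (participant, keyboard, no_help, auto) are assumptions; adapt them to the names in your own file.

```r
# Toy data standing in for dataset.csv; values and column names are hypothetical.
set.seed(1)
toy <- data.frame(
  participant = 1:32,
  keyboard    = rep(c("azerty", "qwerty"), each = 16),
  no_help     = rnorm(32, mean = 40, sd = 8),
  auto        = rnorm(32, mean = 45, sd = 8)
)

# Long format: one row per participant and system.
long <- data.frame(
  participant = rep(toy$participant, times = 2),
  keyboard    = rep(toy$keyboard, times = 2),
  system      = rep(c("no_help", "auto"), each = nrow(toy)),
  wpm         = c(toy$no_help, toy$auto)
)

# Mean and standard deviation: full data set, per keyboard, per system.
mean_sd     <- function(x) c(mean = mean(x), sd = sd(x))
overall     <- mean_sd(long$wpm)
by_keyboard <- aggregate(wpm ~ keyboard, data = long, FUN = mean_sd)
by_system   <- aggregate(wpm ~ system, data = long, FUN = mean_sd)

print(overall)
print(by_keyboard)
print(by_system)

# Box plots of WPM for each combination of system and keyboard.
boxplot(wpm ~ system + keyboard, data = long, ylab = "WPM")
```

With your real data, replace the toy data frame with a call to read.csv() on your dataset file.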

Exercise 1.2. Write an R script that estimates the mean WPM of each system (no-help and auto-correction) and their mean WPM difference. Similarly, write an R script that estimates the mean WPM of each keyboard (azerty and qwerty) and their mean WPM difference. Use 95% confidence intervals to estimate those means. The R script should include code that calculates and also plots the confidence intervals.
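The sketch below illustrates one way to compute and plot t-based 95% confidence intervals in base R. It assumes toy data with hypothetical variable names (keyboard, no_help, auto). Comparing keyboards on each participant's average WPM is one possible choice here, not the only valid one.

```r
# Toy data standing in for dataset.csv (values and names are made up).
set.seed(42)
keyboard <- rep(c("azerty", "qwerty"), each = 16)
no_help  <- rnorm(32, mean = 40, sd = 8)
auto     <- no_help + rnorm(32, mean = 5, sd = 4)

# t-based confidence interval for the mean of x.
ci_mean <- function(x, level = 0.95) {
  n     <- length(x)
  se    <- sd(x) / sqrt(n)
  tcrit <- qt(1 - (1 - level) / 2, df = n - 1)
  c(mean = mean(x), lower = mean(x) - tcrit * se, upper = mean(x) + tcrit * se)
}

# Systems: every participant used both, so the difference is paired.
est <- rbind(
  no_help    = ci_mean(no_help),
  auto       = ci_mean(auto),
  difference = ci_mean(auto - no_help)
)
print(est)

# Keyboards: independent groups, compared on each participant's average WPM.
avg     <- (no_help + auto) / 2
kb_diff <- t.test(avg[keyboard == "azerty"], avg[keyboard == "qwerty"])$conf.int
print(kb_diff)

# Plot the two system means with their 95% CIs as error bars.
plot(1:2, est[1:2, "mean"], xlim = c(0.5, 2.5), ylim = range(est[1:2, ]),
     xaxt = "n", pch = 19, xlab = "System", ylab = "Mean WPM")
axis(1, at = 1:2, labels = rownames(est)[1:2])
arrows(1:2, est[1:2, "lower"], 1:2, est[1:2, "upper"],
       angle = 90, code = 3, length = 0.05)
```

The hand-written ci_mean() gives the same interval as t.test(x)$conf.int; writing it out makes the roles of the standard error and the t critical value explicit.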

Exercise 1.3. Write an R script that conducts the appropriate significance tests to test the two hypotheses presented above. What are your conclusions? Are the results of the significance tests consistent with the confidence intervals calculated for Exercise 1.2?
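For illustration only, the following sketch assumes that one hypothesis concerns the difference between the two systems and the other the difference between the two keyboards. It uses a paired t-test for the system comparison and a Welch two-sample t-test for the keyboard comparison; other analyses (for example a mixed-design ANOVA) may be equally or more appropriate. The data are toy values and the variable names are hypothetical.

```r
# Toy data standing in for dataset.csv (values and names are made up).
set.seed(7)
keyboard <- rep(c("azerty", "qwerty"), each = 16)
no_help  <- rnorm(32, mean = 40, sd = 8)
auto     <- no_help + rnorm(32, mean = 5, sd = 4)

# System effect: paired t-test (same participants in both conditions).
sys_test <- t.test(auto, no_help, paired = TRUE)

# Keyboard effect: Welch two-sample t-test on per-participant averages.
avg     <- (no_help + auto) / 2
kb_test <- t.test(avg[keyboard == "azerty"], avg[keyboard == "qwerty"])

print(sys_test)
print(kb_test)

# A 95% CI of the difference that excludes 0 corresponds to p < 0.05
# for the same test, so the two views should agree.
excludes_zero <- function(tt) tt$conf.int[1] > 0 || tt$conf.int[2] < 0
consistent <- c(
  system   = (sys_test$p.value < 0.05) == excludes_zero(sys_test),
  keyboard = (kb_test$p.value < 0.05) == excludes_zero(kb_test)
)
print(consistent)
```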

Exercise 1.4. Write a brief report (2 to 3 pages) that summarizes your data analyses for Exercises 1.1 to 1.3. The report should also present the generated graphs, report on the statistics that you used, the tests that you conducted and their results. Finally, it should summarize your conclusions.

Part 2. Non-normal distributions (30%)

The researcher suspects that WPM distributions are skewed and considers applying a log-transformation to correct the problem.

Exercise 2.1. Repeat Exercises 1.2 and 1.3, but now use log-transformed values to calculate confidence intervals and conduct your significance tests to test the two hypotheses.
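A minimal sketch of the log-transform approach is shown below, again on toy data with hypothetical variable names. Means and confidence limits are computed on the log scale and then back-transformed with exp(). Note that a back-transformed mean is a geometric mean, and a back-transformed difference of logs is a ratio rather than a difference in WPM.

```r
# Toy, right-skewed data standing in for dataset.csv (values are made up).
set.seed(3)
no_help <- rlnorm(32, meanlog = log(40), sdlog = 0.3)
auto    <- no_help * rlnorm(32, meanlog = log(1.1), sdlog = 0.1)

# 95% CI for the mean of log(WPM), back-transformed to the WPM scale.
log_test <- t.test(log(no_help))
geo_mean <- exp(unname(log_test$estimate))    # geometric mean of WPM
geo_ci   <- exp(as.numeric(log_test$conf.int))

# Paired test on the log scale; exp() of the mean log-difference is a ratio.
ratio_test <- t.test(log(auto), log(no_help), paired = TRUE)
ratio      <- exp(unname(ratio_test$estimate))
ratio_ci   <- exp(as.numeric(ratio_test$conf.int))

print(c(geo_mean = geo_mean, lower = geo_ci[1], upper = geo_ci[2]))
print(c(ratio = ratio, lower = ratio_ci[1], upper = ratio_ci[2]))
print(ratio_test$p.value)
```

On the ratio scale, "no difference" corresponds to a ratio of 1, so a confidence interval is compared against 1 instead of 0.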

Exercise 2.2. Update your previous report (1 - 2 additional pages) to include the results of this new analysis. Do your findings and conclusions change?

What to Submit: (1) your dataset (.csv file), (2) your R scripts (.R files), and (3) your report (.pdf). Make sure that your R code includes enough comments to explain what you do.

Advice: You are encouraged to discuss the problems and their solutions with your colleagues and with your instructor. However, your final solutions and report must be your own work.