Introduction to Statistical Methods

M1 International, Université Paris-Saclay, Winter 2018-19


The course is the second part of the module Probabilities and Statistics of the international Computer-Science Master (M1 IIT) program. The course mainly targets students and researchers who are interested in experimental research methods and often need to deal with small samples and messy data. Previous knowledge of statistics or probability theory is not required, but some basic understanding of probabilities could help.

The course will introduce fundamental concepts of descriptive and inferential statistics. The goal of the course is NOT to provide a set of statistical recipes or step-by-step instructions. Particular emphasis will be placed on understanding key principles, thinking about underlying assumptions, and recognizing the limitations of statistical methods.

The students will also learn how to use the R statistical software to analyze real datasets and how to apply computational methods to estimate parameters or evaluate statistical procedures.


Lectures take place on Mondays from 2 to 5pm at PUIO (Bat. 640) in room D201. For more information about the content and schedule of the course, see the course syllabus.

Classes are given by Theophanis Tsandilas.


I have posted some home exercises to help you prepare for the final exam.

The assignment is out!


Nov 19. Basic Concepts: data, populations, samples. Why learn statistics? Types of data and descriptive statistics. Starting with R.
[Lecture 1: Slides] [Lecture 1: R code]

- You can read Chapter 1 of Baguley's book, available online.

Nov 26. Discrete and continuous probability distributions: binomial, normal, log-normal, and chi-square. The sampling distribution of a statistic. The Central Limit Theorem.
[Lecture 2: Slides] [Lecture 2: R code]

- R and Probability Distributions
- Variance vs. Mean Absolute Difference

External Links:
- Lecture notes on the normal distribution and the Central Limit Theorem from MIT. The notes also explain how to use histograms to plot distributions.
- Interactively generate sampling distributions of various statistics from various population distributions.
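
As a rough illustration of the sampling distribution of a statistic and the Central Limit Theorem, the following R sketch draws many samples from a clearly non-normal population and looks at the distribution of their means. It is not taken from the lecture code: the exponential population, the sample size, the number of repetitions, and all variable names are assumptions chosen for illustration.

```r
# Illustrative sketch: sampling distribution of the mean (assumed setup)
set.seed(1)     # for reproducibility
n <- 30         # size of each sample (assumption)
reps <- 10000   # number of simulated samples (assumption)

# Population: exponential with rate 1 (mean 1, sd 1), clearly non-normal
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

# The CLT predicts a mean near 1 and a standard error near 1 / sqrt(n)
mean(sample_means)
sd(sample_means)
1 / sqrt(n)

# Histogram of the simulated means; it should look roughly bell-shaped
hist(sample_means, breaks = 50, main = "Sampling distribution of the mean")
```

Changing n to a smaller value such as 5 should make the skew of the underlying population more visible in the histogram.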

Dec 3. Confidence intervals. Monte Carlo simulations. Experimental design: independent groups, repeated measures.
[Lecture 3: Slides] [Lecture 3: R code] [R skeleton code for exercise]

Dec 10. Confidence intervals for non-normal distributions. Bootstrapping. Introduction to Null Hypothesis Significance Testing.
[Lecture 4: Slides] [Lecture 4: R code]
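
To give a sense of what bootstrapping looks like in practice, here is a minimal R sketch of a percentile bootstrap confidence interval for the median of a skewed sample. It is not the lecture's code: the simulated log-normal data, the number of resamples, and the variable names are assumptions, and the percentile interval is only one of several bootstrap interval methods.

```r
# Illustrative sketch: percentile bootstrap CI for the median (assumed data)
set.seed(2)
x <- rlnorm(40, meanlog = 0, sdlog = 1)  # skewed sample, simulated for illustration
B <- 5000                                # number of bootstrap resamples (assumption)

# Resample with replacement and recompute the statistic each time
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

# 95% percentile interval: the 2.5% and 97.5% quantiles of the bootstrap medians
ci <- quantile(boot_medians, probs = c(0.025, 0.975))
median(x)
ci
```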

Jan 7. Significance tests and p values. Type I and Type II errors. Statistical power. Publication bias. P-hacking and criticisms of NHST. Multiple comparisons. Preregistration.
[Lecture 5: Slides] [Lecture 5: R code]

- An article discussing several p-hacking methods and ways to avoid them.
- A recent article by Cockburn et al. on preregistration in HCI research.

Jan 14. Covariance and correlation. Simple linear regression.
[Lecture 6: Slides] [Lecture 6: R code]
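
As a small illustration of how covariance, correlation, and simple linear regression relate, the R sketch below fits a line to simulated data and checks that the least-squares slope equals cov(x, y) / var(x). The data, the true intercept and slope, and the variable names are assumptions made for this example; it is not the lecture's code.

```r
# Illustrative sketch: covariance, correlation, and simple linear regression
set.seed(3)
x <- runif(50, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(50, sd = 1)  # assumed true intercept 2 and slope 0.5

cov(x, y)  # covariance, depends on the units of x and y
cor(x, y)  # correlation, unit-free and always between -1 and 1

# Least-squares fit of y on x
fit <- lm(y ~ x)
coef(fit)

# The fitted slope equals the covariance divided by the variance of x
slope_by_hand <- cov(x, y) / var(x)
slope_by_hand
```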

Jan 21. Preparation for the final exam.
[Lecture 7: Home exercises]

R Tutorials

There is a large collection of online tutorials for learning R. The following list may not be representative. It will be updated during the course: