Introduction
This Section describes techniques for exploring data so that valid conclusions can be drawn.
Two diagrammatic methods, the stem-and-leaf diagram and the box-and-whisker diagram, are given prominence.
You will also learn how to summarize data using sets of statistics that remain meaningful when a data set is not symmetrical; statistics such as the mean and variance are of limited use in such situations. Finally, you will encounter outliers. These are values which lie outside the main body of the data set, and they can enable you to reach important conclusions about the behaviour of the data.
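As a rough illustration of why the mean can mislead for an asymmetrical data set, the sketch below computes a five-number summary (minimum, lower quartile, median, upper quartile, maximum) and flags possible outliers. It rests on several assumptions not stated in this introduction: the sample values are made up, the function names `five_number_summary` and `flag_outliers` are hypothetical, the quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, and the fence of 1.5 times the interquartile range is one common convention, not necessarily the criterion adopted later in this Section.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # "inclusive" treats the data as the whole population of interest;
    # other quartile conventions can give slightly different values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles.

    k = 1.5 is an assumed convention, not a rule taken from the text.
    """
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical, deliberately asymmetrical sample with one extreme value.
sample = [2, 3, 3, 4, 4, 5, 5, 6, 40]

print("mean:", statistics.mean(sample))
print("five-number summary:", five_number_summary(sample))
print("possible outliers:", flag_outliers(sample))
```

For this sample the single large value pulls the mean up to 8, while the median stays at 4, which is closer to the main body of the data.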
Prerequisites
- understand the ideas of sets and subsets (HELM booklet 35.1)
Learning Outcomes
- undertake Exploratory Data Analysis (EDA)
- construct stem-and-leaf diagrams and box-and-whisker plots
- explain the significance of outliers, skewness, gaps and multiple peaks
Contents
1 Exploratory data analysis
1.1 Introduction
1.2 The basics of EDA
1.3 The stem-and-leaf diagram
1.4 Drawing a stem-and-leaf diagram
1.5 The box-and-whisker diagram
2 Outliers
2.1 Criteria for rejecting outliers
3 Skewness, gaps and multiple peaks
3.1 Skewness
3.2 Gaps and multiple peaks
3.3 Final comments on data representations