1.5 Basic Statistical Concepts


  • The mean of a numerical distribution is found by summing up the values of all individual data points, then dividing by the number of data points.
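  • It is represented by either a capital letter with a bar drawn above it, such as \bar{X} (conventionally used for the mean of a sample), or the Greek letter mu, µ (conventionally used for the mean of a population):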

\bar{X}=\frac{\sum_{i=1}^{N} x_{i}}{N}

where N is the total number of data points, and x_{i} represents the i-th data point.

Note: the symbol \Sigma means “sum of”, so \sum_{i=1}^{N} x_{i} represents the sum of all individual data points, from data point 1 to data point N.
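
The formula can be checked with a short calculation. The following is a minimal Python sketch, not part of the original notes; the function name `mean` and the sample values are assumptions chosen purely for illustration.

```python
def mean(values):
    """Return the arithmetic mean: the sum of all data points divided by N."""
    n = len(values)  # N, the total number of data points
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n  # (x_1 + x_2 + ... + x_N) / N


# Hypothetical data set, chosen only to illustrate the calculation.
data = [2, 4, 6, 8]
x_bar = mean(data)
print(x_bar)  # 5.0, since (2 + 4 + 6 + 8) / 4 = 20 / 4
```

Python's standard library also offers `statistics.mean`, which performs the same calculation; the version above simply spells out the summation and the division by N.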

1.3 Statistical Analysis of Categorical Distributions

Answering Statistical Questions on Categorical Distributions


  • The mode of categorical data refers to the category with the highest frequency.

Note: the mode of a categorical distribution is also known as the modal category, or dominant category.


Given the bar chart:

Bar Chart for Categorical Data

Red has the highest frequency and so it is the modal category.
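
Finding the modal category amounts to counting how often each category occurs and picking the largest count. The following is a minimal Python sketch under stated assumptions: the function name `modal_categories` and the list of colours are hypothetical, and the data does not reproduce the actual frequencies in the bar chart.

```python
from collections import Counter


def modal_categories(observations):
    """Return the category (or categories, if tied) with the highest frequency."""
    counts = Counter(observations)  # frequency of each category
    if not counts:
        return []  # no data, so no modal category
    highest = max(counts.values())
    # Keep every category whose frequency equals the highest frequency.
    return [category for category, n in counts.items() if n == highest]


# Hypothetical observations, chosen so that Red is the most frequent category.
colours = ["Red", "Blue", "Red", "Green", "Red", "Blue"]
print(modal_categories(colours))  # ['Red']
```

The function returns a list rather than a single value because two or more categories can share the highest frequency, in which case each of them is a mode.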

Guidelines to analysing categorical distributions

