3.3 Using the Formula for a Fitted Line

Interpolation

  • After fitting a model to a dataset (through linear regression), we can use that model to estimate values we don’t have data points for.
  • When estimating values that lie within the range of available raw data points, we refer to it as interpolating.
  • Interpolation is generally considered reliable when the association is strong and the fit is based on a sufficient number of data points.

Example: if a linear fit is created using data points whose explanatory variable ranges in value from 1 to 10, estimating the value of the response variable when the explanatory variable has a value of 2 would be considered interpolation.

Extrapolation

  • When estimating values that lie outside of the range of available raw data points, we are extrapolating.
  • Extrapolation is generally not considered reliable, for three reasons. First, the form of the relationship between the explanatory and response variables may change outside the range of values tested. Second, the variables may not be able to take values beyond some point (e.g. it might not make sense for a variable to be negative, even though the fitted model can produce negative values). Third, any error in the fitted line has a greater effect at values significantly larger or smaller than those used in the fit.

Example: if a linear fit is created using data points whose explanatory variable ranges in value from 1 to 10, estimating the value of the response variable when the explanatory variable has a value of 11 would be considered extrapolation.
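
The distinction between the two examples can be sketched in Python. This is only an illustration: the function name classify_prediction and the list x_values are made up for this sketch, and it assumes the explanatory values used in the fit were the whole numbers 1 to 10.

```python
def classify_prediction(x_new, x_observed):
    """Label a prediction by comparing x_new with the range of observed x values."""
    lowest, highest = min(x_observed), max(x_observed)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"


# Assumed explanatory values used in the fit (1 to 10, as in the examples).
x_values = list(range(1, 11))

print(classify_prediction(2, x_values))   # interpolation
print(classify_prediction(11, x_values))  # extrapolation
```

A check like this only says which kind of estimate is being made; it does not say how strong the fit is.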

Interpreting the Slope of a Regression Line

  • The slope of a regression line can be found either simply as the value of the coefficient b in its formula, or with two points on the line and the formula:

b=\frac{\text{rise}}{\text{run}}=\frac{y_{2}-y_{1}}{x_{2}-x_{1}}

where (x_{1},\ y_{1}) and (x_{2},\ y_{2}) are the coordinates of two points on the line and x_{2}>x_{1}.

  • The slope shows how much the response variable increases (or decreases if the slope is negative) per unit change in the explanatory variable.
  • Remember to include units.

Example

The speed of a vehicle (in m\ s^{-1}) is recorded at different times (in seconds) and the following regression relationship is found:

v=2+3t

Where v is the speed of the vehicle and t is the time.

From the formula, we can see the slope of this regression relationship is 3 m\ s^{-2}. This suggests the speed of the vehicle changes by 3 m\ s^{-1} per second.
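
The same slope can be recovered from two points on the line using the rise-over-run formula. The sketch below assumes the fitted line v = 2 + 3t from this example; the helper names slope and speed, and the choice of t = 1 and t = 4, are illustrative only.

```python
def slope(x1, y1, x2, y2):
    """Rise over run between two points on a line."""
    if x1 == x2:
        raise ValueError("The two points must have different x values.")
    return (y2 - y1) / (x2 - x1)


def speed(t):
    """Fitted line from the example: v = 2 + 3t (v in m/s, t in s)."""
    return 2 + 3 * t


# Two arbitrary points on the line, at t = 1 s and t = 4 s.
b = slope(1, speed(1), 4, speed(4))
print(b)  # 3.0, i.e. 3 m/s per second
```

Any two distinct points on the line give the same value, which is why reading b directly from the formula is usually quicker.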

Interpreting the y-intercept of a Regression Line

  • The y-intercept of a regression line can be found either as the value of the constant a in its formula, or as the value of the response variable when the explanatory variable is equal to 0.
  • The y-intercept estimates the value of the response variable when the explanatory variable is equal to 0.

Note: in some cases it may not make sense for the explanatory variable to take on a value of 0. A linear regression model can still be used to predict a value at that point, even if the result makes no physical sense.

  • If the explanatory variable is time, the y-intercept represents the initial value of the response variable.

Example

The speed of a vehicle is recorded at differing values of time and the following regression relationship is found:

v=2+3t

From the formula we can see the y-intercept is 2 m\ s^{-1}. In the context of this situation, this means that the model predicts the initial speed of the vehicle to be 2 m\ s^{-1}.

Using a Fitted Line’s Formula to Predict Values

  • The formula for a fitted line can be used to predict values for both the response and explanatory variables.
  • Predicting values for the response variable is done by substituting a value for the explanatory variable and evaluating the right-hand side of the equation.
  • Predicting values for the explanatory variable requires the equation to be rearranged to make the explanatory variable the subject, giving the following form:

x=\frac{y-a}{b}

  • Keep in mind when predicting values whether you are interpolating or extrapolating and if the values make sense in the context of the association being analysed.
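
The rearranged form above can be reached in two steps. This working assumes the fitted line is written as y = a + bx, with a the y-intercept and b the slope (as the coefficient names used in this section suggest), and that b is not 0:

y=a+bx

y-a=bx

x=\frac{y-a}{b}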

Example: Predicting Response Variable Values

The speed of a vehicle is recorded at differing values of time and the following regression relationship is found:

v=2+3t

We wish to find what the speed of the vehicle is after 10 seconds. We do so by substituting t=10 into the formula:

v(t=10)=2+3\times 10=32\ m\ s^{-1}
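
The substitution above can be sketched in Python. The function name predict_response is hypothetical, and the line is assumed to have the form y = a + bx with a = 2 and b = 3 taken from this example.

```python
def predict_response(a, b, x):
    """Evaluate the fitted line y = a + b*x at a given x."""
    return a + b * x


# v = 2 + 3t evaluated at t = 10 s
print(predict_response(2, 3, 10))  # 32 (m/s)

# At t = 0 the prediction is the y-intercept, i.e. the initial speed.
print(predict_response(2, 3, 0))   # 2 (m/s)
```

Whether t = 10 is an interpolation or an extrapolation depends on the range of times that were recorded, which this example does not state.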

Example: Predicting Explanatory Variable Values

Using the situation and formula described above (v=2+3t), we wish to find the time at which the vehicle reaches a speed of 17\ m\ s^{-1}. We start by rearranging the formula to make time the subject:

t=\frac{v-2}{3}

Now we substitute the value v=17 into the formula:

t=\frac{17-2}{3}=5 \ s
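
The rearrangement can also be sketched in Python. The function name predict_explanatory is hypothetical, and the line is again assumed to have the form y = a + bx with a = 2 and b = 3 from this example.

```python
def predict_explanatory(a, b, y):
    """Solve y = a + b*x for x, which requires a non-zero slope."""
    if b == 0:
        raise ValueError("A line with zero slope cannot be solved for x.")
    return (y - a) / b


# v = 2 + 3t rearranged for t, with v = 17 m/s
t = predict_explanatory(2, 3, 17)
print(t)  # 5.0 (seconds)

# Substituting back into v = 2 + 3t recovers the original speed.
print(2 + 3 * t)  # 17.0
```

Substituting the answer back into the original formula is a quick way to confirm the rearrangement was done correctly.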
