Key Idea: Topic 2.6 takes the model types from 2.5 and asks: which one fits this data best? You use the GDC to run a regression, obtain an equation, and then use it to make predictions. The critical thinking skill here is knowing when to trust a prediction (interpolation: inside the data range) and when to be cautious (extrapolation: beyond the data range).
The modelling workflow
Example: Data: year (x) vs sales (y) for 5 years. GDC linear regression gives: y = 12.4x + 85.2, r = 0.97. Interpretation: Strong positive linear relationship (r is close to 1). For every additional year, sales increase by about 12.4 units on average. Predict sales in year 6 (interpolation or extrapolation?): x = 6 lies just beyond the last data point, so this is a slight extrapolation. Prediction: y = 12.4(6) + 85.2 = 159.6 (treat with some caution).
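The substitution step in this example can also be checked in code. The sketch below is only an illustration: it assumes the five years are coded x = 1 to 5 (the notes do not state the coding), and the function names are made up for this example.

```python
# Regression equation from the example: y = 12.4x + 85.2
SLOPE = 12.4
INTERCEPT = 85.2

# Assumption: the five years of data are coded x = 1, 2, ..., 5.
X_MIN, X_MAX = 1, 5


def predict(x):
    """Substitute x into the regression equation."""
    return SLOPE * x + INTERCEPT


def kind_of_prediction(x):
    """Label a prediction by whether x lies inside the observed range."""
    return "interpolation" if X_MIN <= x <= X_MAX else "extrapolation"


for x in (3.5, 6):
    print(f"x = {x}: y = {predict(x):.1f} ({kind_of_prediction(x)})")
```

Under that assumed coding, x = 3.5 is labelled interpolation and x = 6 is labelled extrapolation, which matches the caution attached to the 159.6 prediction.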
Selecting the regression type matters. Fitting a linear model to exponential data gives a poor fit even when r looks acceptable, so always check the graph of the regression against the scatter plot. If an exam question says 'use your regression equation to predict...', substitute your x-value into the equation and calculate y. Round to a sensible level of accuracy for the context.
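To see why r alone can mislead, here is a minimal sketch using made-up data that doubles at each step (these are not values from any exam question). It computes r for x against y, and again for x against ln(y). Correlating with ln(y) is used here only as a stand-in for an exponential fit; your GDC's own exponential regression may report its values differently.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data that doubles each step: y = 5 * 2**x
xs = list(range(7))
ys = [5 * 2 ** x for x in xs]

r_linear = pearson_r(xs, ys)
r_log = pearson_r(xs, [math.log(y) for y in ys])

print(f"r for x against y:     {r_linear:.2f}")
print(f"r for x against ln(y): {r_log:.2f}")
```

For this data the linear r comes out at about 0.88, which might pass as "strong", while the log-transformed r is essentially 1. The scatter plot would show the curve immediately, which is why the graph check matters.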
Paper 2 (GDC allowed): Write down the regression equation and the value of r or r², then use the equation to make the prediction. Show the substitution step. Whenever you extrapolate, acknowledge the limitation: 'this is beyond the data range so the prediction may be less reliable'. This is an explicit IB marking criterion.