
Lines of Regression 📉

In Linear Regression, we fit straight lines to the data. But did you know there are generally two regression lines, not one?


Why Two Lines? ✌️

You might think one "best fit" line is enough. However, we minimize errors differently depending on what we want to predict.

  1. Line of Y on X: Used to predict Y when X is known.
  2. Line of X on Y: Used to predict X when Y is known.

> [!IMPORTANT]
> We always pick the line that corresponds to the variable we want to predict (the Dependent Variable).


1. Regression Line of Y on X ➡️

  • Objective: Minimize errors in Y (vertical distances).
  • Dependent Variable: Y
  • Independent Variable: X
  • Equation:
    Y = a + b_yx X
    
  • Key Statistic: b_yx (Regression Coefficient of Y on X). It tells us how much Y changes, on average, for a unit change in X (see the sketch below).
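
A minimal Python sketch of this idea, assuming NumPy and a small set of made-up sample values (the numbers are purely illustrative): the slope b_yx is computed as Cov(X, Y) / Var(X), the standard least-squares slope when vertical errors are minimized.

```python
import numpy as np

# Hypothetical sample data (illustrative only)
x = np.array([1, 2, 3, 4, 5], dtype=float)   # independent variable X
y = np.array([2, 4, 5, 4, 5], dtype=float)   # dependent variable Y

# b_yx = Cov(X, Y) / Var(X): slope of the line of Y on X
b_yx = np.cov(x, y, ddof=0)[0, 1] / np.var(x)
a = y.mean() - b_yx * x.mean()               # intercept from Y = a + b_yx X

print(f"Y on X:  Y = {a:.2f} + {b_yx:.2f} X")  # Y on X:  Y = 2.20 + 0.60 X
```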

2. Regression Line of X on Y ⬆️

  • Objective: Minimize errors in X (horizontal distances).
  • Dependent Variable: X
  • Independent Variable: Y
  • Equation:
    X = a + b_xy Y
    
  • Key Statistic: b_xy (Regression Coefficient of X on Y). It tells us how much X changes, on average, for a unit change in Y (see the sketch below).
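
The same hypothetical data can be reused to obtain b_xy; the only change is that we now divide by the variance of Y, because Y is the independent variable here.

```python
import numpy as np

# Same hypothetical sample data as above (illustrative only)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# b_xy = Cov(X, Y) / Var(Y): slope of the line of X on Y
b_xy = np.cov(x, y, ddof=0)[0, 1] / np.var(y)
a_prime = x.mean() - b_xy * y.mean()          # intercept from X = a + b_xy Y

print(f"X on Y:  X = {a_prime:.2f} + {b_xy:.2f} Y")  # X on Y:  X = -1.00 + 1.00 Y
```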

Intersection on the Graph ❌

If you plot both regression lines on the same graph:

  1. Intersection: They always intersect at the point of means (X̄, Ȳ), as the sketch below verifies.
  2. Coincidence: If r = ±1 (Perfect Correlation), both lines overlap and become one single line.
  3. Perpendicular: If r = 0 (No Correlation), the lines cut each other at 90 degrees. The Y on X line becomes horizontal (Y = Ȳ) and the X on Y line becomes vertical (X = X̄).

Two Regression Lines (Imagine a graph where two lines cross at the average values of X and Y)
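
As a quick numerical check of property 1, the sketch below (reusing the same hypothetical data as earlier) solves the two line equations simultaneously and confirms that they cross at the point of means.

```python
import numpy as np

# Same hypothetical data as in the sketches above
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

cov_xy = np.cov(x, y, ddof=0)[0, 1]
b_yx = cov_xy / np.var(x)                    # slope of Y on X
b_xy = cov_xy / np.var(y)                    # slope of X on Y
a1 = y.mean() - b_yx * x.mean()              # intercept of Y = a1 + b_yx X
a2 = x.mean() - b_xy * y.mean()              # intercept of X = a2 + b_xy Y

# Solve the two lines simultaneously:
#   -b_yx*X + Y = a1   (line of Y on X)
#    X - b_xy*Y = a2   (line of X on Y)
A = np.array([[-b_yx, 1.0],
              [1.0, -b_xy]])
x_int, y_int = np.linalg.solve(A, np.array([a1, a2]))  # singular only if r = ±1 (lines coincide)

print(f"intersection: ({x_int:.1f}, {y_int:.1f})")       # intersection: (3.0, 4.0)
print(f"point of means: ({x.mean():.1f}, {y.mean():.1f})")  # point of means: (3.0, 4.0)
```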


Differences Summary 📝

| Feature | Line of Y on X | Line of X on Y |
| --- | --- | --- |
| Purpose | Prediction of Y | Prediction of X |
| Slope Coefficient | b_yx | b_xy |
| Minimizes Error in | Y direction (Vertical) | X direction (Horizontal) |
| Format | Y - Ȳ = b_yx(X - X̄) | X - X̄ = b_xy(Y - Ȳ) |

Why not just one line? 🧠

Mathematically, the "Line of Best Fit" minimizes the sum of squared deviations (Least Squares Method).

  • If we minimize vertical deviations, we get Y on X.
  • If we minimize horizontal deviations, we get X on Y.

Unless the correlation is perfect (r = ±1), these two minimization processes yield different lines, as the sketch below illustrates.
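
A small sketch makes this concrete, again assuming hypothetical data: the X on Y line is rewritten in the form Y = ... so its slope can be compared directly with the Y on X slope, and a second, perfectly correlated data set shows the two slopes coinciding when r = ±1.

```python
import numpy as np

# Hypothetical data with imperfect correlation (same values as above)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

cov_xy = np.cov(x, y, ddof=0)[0, 1]
r = cov_xy / (x.std() * y.std())            # correlation coefficient

slope_y_on_x = cov_xy / np.var(x)           # minimizes vertical deviations
slope_x_on_y = np.var(y) / cov_xy           # X on Y line rewritten as Y = ..., slope = 1 / b_xy

print(f"r = {r:.3f}")                       # about 0.775
print(slope_y_on_x, slope_x_on_y)           # 0.6 vs 1.0 -> two distinct lines

# With perfectly correlated data (r = 1) the two slopes coincide:
y_perfect = 3.0 * x + 1.0
cov_p = np.cov(x, y_perfect, ddof=0)[0, 1]
print(cov_p / np.var(x), np.var(y_perfect) / cov_p)   # both 3.0 -> one single line
```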

Summary

  • Two lines exist to ensure the best prediction accuracy for either variable.
  • They meet at the point of means (X̄, Ȳ).
  • The smaller the angle between them, the stronger the correlation.
