Reference 3
Dummy Variable Trap
Dummy Variable Trap (Perfect Multicollinearity)
Suppose our categorical variable is orange juice brand with 3 categories:
- Dominick’s (D)
- Minute Maid (M)
- Tropicana (T)
Create three dummy variables:
- \(D_D = 1\) if Dominick’s, 0 otherwise
- \(D_M = 1\) if Minute Maid, 0 otherwise
- \(D_T = 1\) if Tropicana, 0 otherwise
Because every purchase/week observation belongs to exactly one brand,
\[ D_D + D_M + D_T = 1 \quad \text{for every observation.} \]
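As a quick check of this identity, here is a minimal sketch in Python with NumPy. The six brand labels are made-up illustrative data, not the actual orange juice dataset, and the variable names are chosen only to mirror the notation above.

```python
import numpy as np

# Hypothetical brand labels for six observations (illustrative only).
brands = np.array(["D", "M", "T", "T", "D", "M"])

# One 0/1 indicator column per brand.
D_D = (brands == "D").astype(int)
D_M = (brands == "M").astype(int)
D_T = (brands == "T").astype(int)

# Each observation belongs to exactly one brand,
# so the three dummies add up to 1 in every row.
row_sums = D_D + D_M + D_T
print(row_sums)  # [1 1 1 1 1 1]
```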
What goes wrong if we include an intercept + all dummies?
Consider the regression model (e.g., for log sales):
\[ y_i = \beta_0 + \beta_D D_{D,i} + \beta_M D_{M,i} + \beta_T D_{T,i} + \varepsilon_i \]
But from \(D_D + D_M + D_T = 1\), we can solve for \(D_T\):
\[ D_{T,i} = 1 - D_{D,i} - D_{M,i} \]
Substitute into the regression:
\[ \begin{aligned} y_i &= \beta_0 + \beta_D D_{D,i} + \beta_M D_{M,i} + \beta_T(1 - D_{D,i} - D_{M,i}) + \varepsilon_i \\ &= (\beta_0 + \beta_T) + (\beta_D - \beta_T)D_{D,i} + (\beta_M - \beta_T)D_{M,i} + \varepsilon_i \end{aligned} \]
Key point
The model depends only on:
- \((\beta_0 + \beta_T)\)
- \((\beta_D - \beta_T)\)
- \((\beta_M - \beta_T)\)
So \(\beta_0, \beta_D, \beta_M, \beta_T\) are not uniquely identified (infinitely many coefficient sets give the same fitted values).
That is the dummy variable trap:
intercept + all brand dummies ⇒ perfect multicollinearity.
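The trap can also be seen numerically. The sketch below, again using made-up brand labels and arbitrary coefficient values, builds the design matrix with an intercept and all three dummies. It checks that the matrix has fewer independent columns than coefficients, and that shifting the coefficients by any constant \(c\) (add \(c\) to \(\beta_0\), subtract \(c\) from each brand coefficient) leaves the fitted values unchanged.

```python
import numpy as np

# Hypothetical brand labels (illustrative only).
brands = np.array(["D", "M", "T", "T", "D", "M"])

# Design matrix: intercept column plus all three brand dummies.
dummies = np.column_stack([(brands == b).astype(float) for b in ("D", "M", "T")])
X = np.column_stack([np.ones(len(brands)), dummies])

# Four columns, but only three are linearly independent,
# because the intercept equals D_D + D_M + D_T.
rank = np.linalg.matrix_rank(X)
print(X.shape[1], rank)  # 4 3

# Two different coefficient vectors (arbitrary values) ...
beta = np.array([1.0, 0.5, -0.2, 0.3])   # (beta_0, beta_D, beta_M, beta_T)
c = 10.0
beta_shifted = beta + np.array([c, -c, -c, -c])

# ... produce identical fitted values, so the data cannot tell them apart.
same_fit = np.allclose(X @ beta, X @ beta_shifted)
print(same_fit)  # True
```

Because \(c\) is arbitrary, there are infinitely many such coefficient vectors, which is why software either refuses to fit this model or silently drops a column.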
Fix: drop one dummy (choose a reference brand)
Drop \(D_T\) (Tropicana becomes the baseline/reference):
\[ y_i = \beta_0 + \beta_D D_{D,i} + \beta_M D_{M,i} + \varepsilon_i \]
Now:
- \(\beta_0\) = expected \(y\) for Tropicana (reference brand)
- \(\beta_D\) = difference in expected \(y\) (Dominick’s − Tropicana)
- \(\beta_M\) = difference in expected \(y\) (Minute Maid − Tropicana)
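To illustrate this interpretation, here is a minimal sketch that fits the reference-brand model by least squares with NumPy. The `y` values are hypothetical log sales invented for the example. With only brand dummies as regressors, the intercept should match the Tropicana group mean and each slope should match a difference in group means.

```python
import numpy as np

# Hypothetical data (illustrative only): brand labels and log sales.
brands = np.array(["D", "M", "T", "T", "D", "M"])
y = np.array([9.0, 9.5, 10.0, 10.4, 9.2, 9.7])

# Design matrix with an intercept, D_D and D_M; D_T is dropped,
# so Tropicana is the reference brand.
X = np.column_stack([
    np.ones(len(brands)),
    (brands == "D").astype(float),
    (brands == "M").astype(float),
])

# X now has full column rank, so least squares has a unique solution.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, bD, bM = coef

# Group means for comparison.
mean_D = y[brands == "D"].mean()
mean_M = y[brands == "M"].mean()
mean_T = y[brands == "T"].mean()

print(np.isclose(b0, mean_T))           # intercept = Tropicana mean
print(np.isclose(bD, mean_D - mean_T))  # beta_D = Dominick's minus Tropicana
print(np.isclose(bM, mean_M - mean_T))  # beta_M = Minute Maid minus Tropicana
```

Choosing a different reference brand changes the individual coefficients but not the fitted values; only the baseline against which differences are measured moves.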
Discussion
Welcome to our Reference 3 Discussion Board!
This space is designed for you to engage with your classmates about the material covered in Reference 3.
Whether you want to delve deeper into the content, share insights, or ask questions, this is the perfect place for you.
If you have any specific questions for Byeong-Hak (@bcdanl) regarding the Reference 3 materials or need clarification on any points, don’t hesitate to ask here.
All comments will be stored here.
Let’s collaborate and learn from each other!