A data science researcher is choosing between Ridge and Lasso regression for a high-dimensional dataset. What is the key distinguishing behavior of Lasso?
-
A
It penalizes the sum of squared coefficients
-
B
It can shrink coefficients exactly to zero, performing feature selection
-
C
It always produces lower bias than Ridge
-
D
It requires normally distributed predictors
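-
Illustration
The difference between the two penalties can be sketched in the special case of an orthonormal design, where both estimators have closed forms in terms of the ordinary least-squares coefficient b. Assuming the objectives are scaled as (1/2)||y - Xb||^2 + lam * ||b||_1 for Lasso and (1/2)||y - Xb||^2 + (lam/2) * ||b||^2 for Ridge, Lasso soft-thresholds b while Ridge rescales it. The function names, the coefficient values, and lam = 0.5 below are made up for illustration and do not come from the question.

```python
def lasso_shrink(b, lam):
    """Soft-thresholding: the Lasso estimate of one coefficient
    under an orthonormal design (illustrative assumption)."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # any |b| <= lam is set exactly to zero


def ridge_shrink(b, lam):
    """Proportional shrinkage: the Ridge estimate of one coefficient
    under the same orthonormal-design assumption."""
    return b / (1.0 + lam)


# Hypothetical least-squares coefficients for four features.
ols = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print("lasso:", lasso)
print("ridge:", ridge)
```

In this sketch the two small coefficients fall inside the threshold and become exactly zero under the L1 penalty, while the squared penalty shrinks every coefficient toward zero without eliminating any. With a general, non-orthonormal design the estimates no longer have these closed forms, so this is only an intuition aid.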