- Entropy (H(X)) measures the uncertainty in a random variable. It is lowest (zero) when all the probability mass is concentrated on a single value, and highest when the distribution is uniform over all possible values.
- Lower entropy means the variable's value is more predictable on average; higher entropy means more uncertainty.
- Information gain measures how much knowing the value of one variable (Y) reduces the uncertainty about the value of another variable (X): IG(X; Y) = H(X) − H(X | Y). It is highest when Y fully determines X, and zero when X and Y are independent; the dependence it captures is general statistical dependence, not just linear correlation.
- Information gain is used in decision tree learning to select the variable that best splits the data at each node, reducing uncertainty about the target variable. The variable with the highest information gain is chosen as the split, as illustrated in the sketch after this list.
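
To make these definitions concrete, here is a minimal Python sketch. The function names and the toy play/outlook/humidity dataset are invented for illustration; it computes entropy and information gain directly from lists of observed values, then shows a fully informative split scoring higher than a weak one.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy H(X), in bits, of a sequence of observed values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(xs, ys):
    """IG(X; Y) = H(X) - H(X | Y): how much knowing Y reduces uncertainty about X."""
    total = len(xs)
    # Group the X values by the co-occurring value of Y.
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    # H(X | Y): entropy of X within each group, weighted by group size.
    h_x_given_y = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(xs) - h_x_given_y

# Hypothetical toy data: target X is "play"; candidate splits are "outlook" and "humidity".
play     = ["yes", "yes", "no", "no", "yes", "no"]
outlook  = ["sunny", "sunny", "rain", "rain", "sunny", "rain"]
humidity = ["high", "low", "high", "low", "low", "high"]

print(entropy(play))                     # 1.0 bit: play is a 50/50 split, maximally uncertain
print(information_gain(play, outlook))   # 1.0: outlook fully determines play in this data
print(information_gain(play, humidity))  # ~0.08: humidity barely reduces uncertainty

# A decision tree would split on "outlook", the variable with the highest gain.
```

Note the design choice in `information_gain`: the conditional entropy H(X | Y) is a weighted average of the entropy of X within each Y-group, so a variable that sorts the data into pure groups drives that term toward zero and the gain toward H(X).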