Here is a quick summary of what you learned in the course:
- A decision tree is a model composed of a collection of conditions organized hierarchically in the shape of a tree. Conditions fall into various categories:
  - An axis-aligned condition involves only a single feature. An oblique condition involves multiple features.
  - A binary condition has two possible outcomes. A non-binary condition has more than two possible outcomes.
- Training a decision tree involves searching for the best condition at each node. The splitter routine uses metrics such as information gain or Gini impurity to determine the best condition (see the split-selection sketch after this list).
- A decision forest is a model made of multiple decision trees. The prediction of a decision forest is the aggregation of the predictions of its decision trees.
- A random forest is an ensemble of decision trees in which each decision tree is trained with a specific source of random noise.
- Bagging is a technique in which each decision tree in the random forest is trained on a different random subset of examples, sampled with replacement from the training set (see the forest sketch after this list).
- Random forests do not require a validation dataset. Instead, most random forests use a technique called out-of-bag evaluation, in which each tree is evaluated on the examples left out of its training sample, to estimate the quality of the model.
- A gradient boosted (decision) tree is a type of decision forest trained iteratively, with each new tree fitted to correct the errors of the trees trained so far. A value called shrinkage controls the rate at which a gradient boosted (decision) tree learns and the degree to which it can overfit (see the boosting sketch after this list).
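
To make the splitter idea concrete, here is a minimal sketch of how a splitter might score the thresholds of a single numerical feature with Gini impurity and keep the axis-aligned condition with the largest impurity reduction. The function names (`gini`, `best_axis_aligned_split`) and the toy data are illustrative assumptions, not code from the course.

```python
from collections import Counter

# Illustrative sketch, not the course's implementation.

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_axis_aligned_split(feature_values, labels):
    """Score every candidate threshold of one numerical feature and return
    the threshold with the largest impurity reduction (Gini gain)."""
    parent_impurity = gini(labels)
    best_threshold, best_gain = None, 0.0
    for threshold in sorted(set(feature_values)):
        left = [y for x, y in zip(feature_values, labels) if x >= threshold]
        right = [y for x, y in zip(feature_values, labels) if x < threshold]
        if not left or not right:
            continue  # the condition does not actually split the examples
        weighted_child_impurity = (
            len(left) / len(labels) * gini(left)
            + len(right) / len(labels) * gini(right)
        )
        gain = parent_impurity - weighted_child_impurity
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

# Example: one numerical feature and binary labels.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]
print(best_axis_aligned_split(xs, ys))  # -> (4.0, 0.5)
```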
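The next sketch illustrates bagging, prediction aggregation, and out-of-bag evaluation for a small random forest. It uses scikit-learn decision trees for brevity rather than the course's library, and the dataset, hyperparameter values, and variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative sketch of a random forest, not the course's implementation.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
n_trees, n = 25, len(y)
rng = np.random.default_rng(0)

trees, oob_votes, oob_counts = [], np.zeros((n, 2)), np.zeros(n)
for _ in range(n_trees):
    # Bagging: each tree sees a bootstrap sample drawn with replacement.
    sample = rng.integers(0, n, size=n)
    # max_features="sqrt" adds the random feature selection typical of random forests.
    tree = DecisionTreeClassifier(max_features="sqrt").fit(X[sample], y[sample])
    trees.append(tree)

    # Out-of-bag evaluation: each tree votes only on the examples it never saw.
    oob = np.setdiff1d(np.arange(n), sample)
    preds = tree.predict(X[oob])
    oob_votes[oob, preds] += 1
    oob_counts[oob] += 1

# The forest's prediction is the aggregation (majority vote) of its trees.
forest_pred = np.round(np.mean([t.predict(X) for t in trees], axis=0)).astype(int)

covered = oob_counts > 0
oob_accuracy = np.mean(oob_votes[covered].argmax(axis=1) == y[covered])
print(f"training accuracy: {np.mean(forest_pred == y):.3f}, OOB accuracy: {oob_accuracy:.3f}")
```

Because every example is out-of-bag for some of the trees, the OOB accuracy gives an estimate of generalization quality without setting aside a separate validation dataset.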
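Finally, a minimal sketch of gradient boosting for squared-error regression, showing how each new tree is fitted to the current residuals and how shrinkage scales each tree's contribution. The hyperparameter values and variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Illustrative sketch of gradient boosted trees, not the course's implementation.
# Toy regression problem: y = sin(x) plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

shrinkage = 0.1                 # also called the learning rate
n_iterations = 100
prediction = np.full_like(y, y.mean())   # start from a constant model
trees = []

for _ in range(n_iterations):
    # For squared error, the negative gradient is simply the residual.
    residual = y - prediction
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    # Shrinkage scales down each tree's correction, slowing learning and
    # reducing the risk of overfitting.
    prediction += shrinkage * tree.predict(X)
    trees.append(tree)

print(f"final training RMSE: {np.sqrt(np.mean((y - prediction) ** 2)):.3f}")
```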
References
- Greedy Function Approximation: A Gradient Boosting Machine, J. Friedman.
- The Elements of Statistical Learning, T. Hastie, R. Tibshirani, and J. Friedman. Chapter 10.
- Generalized Boosted Models: A guide to the gbm package, G. Ridgeway.