Variable importances
Variable importance (also known as feature importance) is a score that indicates how "important" a feature is to the model. For example, if a given model has two input features, "f1" and "f2", and the variable importances are {f1=5.8, f2=2.5}, then feature "f1" is more "important" to the model than feature "f2". As with other machine learning models, variable importance is a simple way to understand how a decision tree works.
You can apply model-agnostic variable importances, such as permutation variable importance, to decision trees.
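Permutation variable importance is model agnostic, so a minimal sketch needs no tree internals. The code below assumes a model exposing a scikit-learn-style predict method and a "higher is better" metric; all names here are illustrative, not a specific library's API.

```python
import numpy as np

def permutation_importance(model, features, labels, metric, num_repeats=5, seed=0):
    """Sketch of permutation variable importance (illustrative names).

    Shuffles one feature column at a time and measures how much the
    model's quality drops; a larger drop means a more important feature.
    """
    rng = np.random.default_rng(seed)
    baseline = metric(labels, model.predict(features))
    importances = {}
    for column in range(features.shape[1]):
        drops = []
        for _ in range(num_repeats):
            shuffled = features.copy()
            # Shuffling one column breaks its relationship with the labels.
            rng.shuffle(shuffled[:, column])
            drops.append(baseline - metric(labels, model.predict(shuffled)))
        # Importance = average quality lost when the feature is destroyed.
        importances[column] = float(np.mean(drops))
    return importances
```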
Decision trees also have their own variable importances, computed from the tree structure itself, such as the following (see the sketch after this list):
- The sum of the split scores of the conditions that use a given variable.
- The number of nodes whose condition uses a given variable.
- The average depth of the first occurrence of a variable across all the tree paths.
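As an illustration, the following sketch computes all three of these importances over a toy binary tree. The Node layout and field names are hypothetical; real decision-forest libraries store trees differently, and the depth computation here is one simple variant of "first occurrence per path".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: Optional[str] = None   # None marks a leaf
    split_score: float = 0.0        # e.g. the information gain of the split
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def structural_importances(root):
    counts, score_sums = {}, {}
    depth_sums, first_counts = {}, {}

    def visit(node, depth, seen):
        if node is None or node.feature is None:
            return
        f = node.feature
        counts[f] = counts.get(f, 0) + 1  # number of nodes with the variable
        score_sums[f] = score_sums.get(f, 0.0) + node.split_score  # sum of split scores
        if f not in seen:
            # First time this feature appears on the current root-to-node path.
            depth_sums[f] = depth_sums.get(f, 0) + depth
            first_counts[f] = first_counts.get(f, 0) + 1
            seen = seen | {f}
        visit(node.left, depth + 1, seen)
        visit(node.right, depth + 1, seen)

    visit(root, 0, frozenset())
    mean_first_depth = {f: depth_sums[f] / first_counts[f] for f in depth_sums}
    return counts, score_sums, mean_first_depth

tree = Node("f1", 5.0,
            Node("f2", 2.0, Node(), Node()),
            Node("f1", 1.5, Node(), Node()))
print(structural_importances(tree))
# ({'f1': 2, 'f2': 1}, {'f1': 6.5, 'f2': 2.0}, {'f1': 0.0, 'f2': 1.0})
```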
Variable importances can differ in qualities such as semantics, scale, and other properties. Furthermore, different variable importances provide different types of information about the model, the dataset, and the training process.
For example, the number of conditions that use a specific feature indicates how much a decision tree relies on that feature, which might indicate importance. After all, the learning algorithm would not have used a feature in multiple conditions if the feature did not matter. However, the same feature appearing in multiple conditions can also indicate that the model is trying, but failing, to generalize the feature's pattern. For example, this can happen when a feature is simply an example identifier, which carries no information to generalize.
On the other hand, a high permutation variable importance indicates that removing a feature hurts the model, which is an indication of importance. However, if the model is robust, removing any single feature might not hurt the model.
Because different variable importances inform about different aspects of the model, looking at several variable importances at the same time is informative. For example, if a feature is important according to all of the variable importances, that feature is likely important. As another example, if a feature has a high "number of nodes" variable importance but a low "permutation" variable importance, the feature might be hard to generalize and can hurt model quality.
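As a toy illustration of this cross-check, the snippet below compares feature ranks under two importances. The scores are made up, and the flag shown is only a heuristic; "id" plays the role of the identifier-like feature described above.

```python
def rank(importances):
    """Maps each feature to its rank (0 = most important)."""
    ordered = sorted(importances, key=importances.get, reverse=True)
    return {feature: position for position, feature in enumerate(ordered)}

# Hypothetical scores, for illustration only.
num_nodes = {"f1": 25, "f2": 3, "id": 40}
permutation = {"f1": 0.12, "f2": 0.05, "id": 0.00}

node_ranks, perm_ranks = rank(num_nodes), rank(permutation)
for feature in num_nodes:
    if node_ranks[feature] < perm_ranks[feature]:
        # Used in many conditions but cheap to remove: a generalization red flag.
        print(f"{feature}: high structural use, low permutation importance")
```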
You can view a model's variable importances by calling model.describe() and looking at the "variable importance" tab. See the Model understanding tutorial for more details.
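For instance, with the YDF Python package (one library whose models expose model.describe()), accessing variable importances might look like the sketch below; the CSV path and label column name are placeholders.

```python
import pandas as pd
import ydf  # pip install ydf

# Placeholder dataset with a "label" column.
train = pd.read_csv("train.csv")

model = ydf.RandomForestLearner(label="label").train(train)

# In a notebook, this renders a report that includes a
# "variable importance" tab.
model.describe()

# Importances are also available programmatically.
print(model.variable_importances())
```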