Decision trees: Check your understanding
This page challenges you to answer a series of multiple-choice exercises
about the material discussed in the "Training Decision Trees" unit.
Question 1
What is the effect of replacing the numerical features with their
negative values (for example, changing the value +8 to -8) when training
with the exact numerical splitter?

The structure of the decision tree will be completely different.
Incorrect. The structure of the decision tree will actually be pretty
much the same. The conditions will change, though.

The same conditions will be learned; only the positive/negative children
will be switched.
Correct. Fantastic.

Different conditions will be learned, but the overall structure of the
decision tree will remain the same.
Incorrect. If the features change, then the conditions will change.
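
One way to see the mirroring empirically is to train on a feature and on its negation, then compare the learned root splits. Below is a minimal sketch, assuming scikit-learn (whose decision-tree splitter, like the exact splitter, evaluates candidate thresholds exhaustively); the dataset is made up for illustration:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy dataset: one numerical feature; labels determined by a threshold.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = (X[:, 0] > 0.3).astype(int)

# Train once on the feature, once on its negation.
tree_pos = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y)
tree_neg = DecisionTreeClassifier(max_depth=1, random_state=0).fit(-X, y)

# The root thresholds are negations of each other, and the examples that
# went to one child now go to the other child (and vice versa).
print(tree_pos.tree_.threshold[0])  # some value t near 0.3
print(tree_neg.tree_.threshold[0])  # approximately -t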
Question 2
Which two answers best describe the effect of testing only half of the
candidate threshold values in X, selected at random?

The information gain would be lower or equal.
Correct. Well done.

The final decision tree would have worse testing accuracy.
Incorrect.

The information gain would be higher or equal.
Incorrect.

The final decision tree would have no better training accuracy.
Correct. Well done.
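
Both correct answers follow from a general fact: the maximum over a random subset of candidates can never exceed the maximum over all of them. Here is a minimal sketch of that argument in code; the helper names and the noisy toy data are ours, not from the course:

import math
import random

def entropy(p):
    """Binary entropy in nats; zero for degenerate distributions."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def information_gain(xs, ys, t):
    """Gain of splitting the labeled values (xs, ys) with the condition x >= t."""
    pos = [y for x, y in zip(xs, ys) if x >= t]
    neg = [y for x, y in zip(xs, ys) if x < t]
    h_parent = entropy(sum(ys) / len(ys))
    s = len(pos) / len(ys)
    h_children = s * entropy(sum(pos) / len(pos)) + (1 - s) * entropy(sum(neg) / len(neg))
    return h_parent - h_children

random.seed(1)
xs = [random.random() for _ in range(100)]
ys = [int(x > 0.6) ^ (random.random() < 0.2) for x in xs]  # noisy binary labels

# Candidate thresholds: midpoints between consecutive sorted feature values.
sx = sorted(set(xs))
thresholds = [(a + b) / 2 for a, b in zip(sx, sx[1:])]

best_all = max(information_gain(xs, ys, t) for t in thresholds)
half = random.sample(thresholds, len(thresholds) // 2)
best_half = max(information_gain(xs, ys, t) for t in half)

# The maximum over a subset can never exceed the maximum over the full set.
assert best_half <= best_all

The same subset argument explains the training-accuracy answer: a split chosen from fewer candidates is never better on the training data than the split chosen from all of them.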
Question 3
What would happen if the "information gain" versus "threshold" curve
had multiple local maxima?
The algorithm would select the local maximum with the smallest threshold
value.
Incorrect.

It is impossible to have multiple local maxima.
Incorrect. Multiple local maxima are possible.

The algorithm would select the global maximum.
Correct. Well done.
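
The exact splitter does not hill-climb along the threshold axis; it evaluates the information gain at every candidate threshold and keeps the best one, so local maxima are irrelevant. A tiny sketch with made-up gain values:

# Hypothetical information-gain values along increasing thresholds,
# with two local maxima (at indices 1 and 3).
gains = [0.02, 0.10, 0.05, 0.14, 0.07]

# An exhaustive splitter simply takes the argmax, i.e., the global maximum.
best_index = max(range(len(gains)), key=gains.__getitem__)
print(best_index, gains[best_index])  # 3 0.14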
Question 4
Compute the information gain of the following split:
Node         | # of positive examples | # of negative examples
parent node  | 10                     | 6
first child  | 8                      | 2
second child | 2                      | 4
Answer:
from math import log  # natural log, so the entropies below are in nats

# Positive label distribution
p_parent = 10 / (10 + 6)  # = 0.625
p_child_1 = 8 / (8 + 2)   # = 0.8
p_child_2 = 2 / (2 + 4)   # = 0.3333333

# Entropy
h_parent = -p_parent * log(p_parent) - (1 - p_parent) * log(1 - p_parent)          # = 0.6615632
h_child_1 = -p_child_1 * log(p_child_1) - (1 - p_child_1) * log(1 - p_child_1)     # = 0.5004024
h_child_2 = -p_child_2 * log(p_child_2) - (1 - p_child_2) * log(1 - p_child_2)     # = 0.6365142

# Fraction of the parent's examples that go to the first child
s = (8 + 2) / (10 + 6)  # = 0.625

# Weighted entropy of the children
h_children = s * h_child_1 + (1 - s) * h_child_2  # = 0.5514443

information_gain = h_parent - h_children  # = 0.1101189
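
The same arithmetic can be packaged as a small reusable function; the name and signature below are ours, not from the course:

from math import log

def split_information_gain(parent, child_1, child_2):
    """Each argument is a (num_positive, num_negative) tuple of example counts."""
    def entropy(pos, neg):
        p = pos / (pos + neg)
        if p in (0.0, 1.0):
            return 0.0
        return -p * log(p) - (1 - p) * log(1 - p)

    s = sum(child_1) / sum(parent)  # fraction of examples in the first child
    h_children = s * entropy(*child_1) + (1 - s) * entropy(*child_2)
    return entropy(*parent) - h_children

print(split_information_gain((10, 6), (8, 2), (2, 4)))  # ≈ 0.1101189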