Decision Trees: “Gini” vs. “Entropy” criteria

The scikit-learn documentation[1] describes an argument that controls how the decision tree algorithm splits nodes:

criterion : string, optional (default=”gini”)
The function to measure the quality of a split. 
Supported criteria are “gini” for the Gini impurity 
and “entropy” for the information gain.
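As a quick sanity check, here is a minimal sketch (assuming scikit-learn and its bundled iris dataset) that fits a tree with each criterion and compares cross-validated accuracy:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

results = {}
for criterion in ("gini", "entropy"):
    # Only the split-quality measure changes; everything else is identical.
    clf = DecisionTreeClassifier(criterion=criterion, random_state=0)
    results[criterion] = cross_val_score(clf, X, y, cv=5).mean()
    print(criterion, round(results[criterion], 3))
```

On a small, clean dataset like iris the two scores usually land within a couple of percentage points of each other, which previews the conclusion below.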

It seems like something that could be important, since this parameter determines the formula used to partition your dataset at each node.

Unfortunately the documentation doesn't tell you which one to use, other than to try each and see what happens, so here's what I found (spoiler: it doesn't appear to matter):

  • Gini is intended for continuous attributes, and entropy for attributes that occur in classes (e.g., colors)[2]
  • "Gini" tends to isolate the largest class, while "entropy" tends to find groups of classes that together make up ~50% of the data[2]
  • "Gini" to minimize misclassification[3]
  • "Entropy" for exploratory analysis[4]
  • Some studies show this doesn't matter – the two criteria differ in fewer than 2% of cases[5]
  • Entropy may be a little slower to compute, since it uses a logarithm[6]
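The two formulas are simple enough to compare by hand. For class proportions p_i, Gini impurity is 1 − Σ p_i², and entropy is −Σ p_i log₂ p_i. A minimal sketch in plain Python (the function names are my own):

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over class proportions."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

split = ["a"] * 5 + ["b"] * 5   # a perfectly mixed 50/50 node
print(gini(split))     # 0.5  (maximum for two classes)
print(entropy(split))  # 1.0  (one full bit of uncertainty)
```

Both measures are 0 for a pure node and peak at a 50/50 mix, which is why they so rarely choose different splits; the entropy version just pays for a logarithm on every class count.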

If you're looking for a Python book, Natural Language Processing with Python is a great way to learn the language while building some really interesting projects.

  1. http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
  2. http://paginas.fe.up.pt/~ec/files_1011/week%2008%20-%20Decision%20Trees.pdf
  3. http://www.quora.com/Machine-Learning/Are-gini-index-entropy-or-classification-error-measures-causing-any-difference-on-Decision-Tree-classification
  4. http://www.quora.com/Machine-Learning/Are-gini-index-entropy-or-classification-error-measures-causing-any-difference-on-Decision-Tree-classification
  5. https://rapid-i.com/rapidforum/index.php?topic=3060.0
  6. http://stats.stackexchange.com/questions/19639/which-is-a-better-cost-function-for-a-random-forest-tree-gini-index-or-entropy