Random Forests for multiclass classification: Random MultiNomial Logit

Anita Prinzie, Dirk Van den Poel

Research output: Contribution to journal › Article › peer-review

Abstract

Several supervised learning algorithms are suited to classify instances into a multiclass value space. MultiNomial Logit (MNL) is recognized as a robust classifier and is commonly applied within the CRM (Customer Relationship Management) domain. Unfortunately, to date, it is unable to handle the huge feature spaces typical of CRM applications. Hence, the analyst is forced to engage in feature selection. Surprisingly, in sharp contrast with binary logit, current software packages lack any feature-selection algorithm for MultiNomial Logit. Conversely, Random Forests, another algorithm for learning multiclass problems, is robust just like MNL but, unlike MNL, easily handles high-dimensional feature spaces. This paper investigates the potential of applying the Random Forests principles to the MNL framework. We propose the Random MultiNomial Logit (RMNL), i.e. a random forest of MNLs, and compare its predictive performance to that of (a) MNL with expert feature selection and (b) Random Forests of classification trees. We illustrate the Random MultiNomial Logit on a cross-sell CRM problem within the home-appliances industry. The results indicate a substantial increase in model accuracy of the RMNL model compared to that of the MNL model with expert feature selection. © 2007 Elsevier Ltd. All rights reserved.
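
As a rough illustration of the idea of "a random forest of MNLs", the sketch below fits several small multinomial logits, each on a bootstrap sample of the rows and a random subset of the features, and then combines them. It is not the authors' implementation: the abstract does not specify the estimation procedure, the aggregation rule, or any hyperparameters, so the gradient-descent fit, the averaging of predicted probabilities, the synthetic data, and all names (`fit_mnl`, `fit_rmnl`, `predict_rmnl`, `make_toy_data`, `n_models`, `n_sub`) are assumptions made for this example only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mnl_proba(W, row):
    """Class probabilities of one MNL; the last weight per class is the intercept."""
    xb = list(row) + [1.0]
    return softmax([sum(w * v for w, v in zip(wk, xb)) for wk in W])


def fit_mnl(X, y, n_classes, epochs=100, lr=0.2):
    """Fit a multinomial logit by plain batch gradient descent (assumed fit method)."""
    n_cols = len(X[0]) + 1
    W = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_cols for _ in range(n_classes)]
        for row, label in zip(X, y):
            probs = mnl_proba(W, row)
            xb = list(row) + [1.0]
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for j in range(n_cols):
                    grad[k][j] += err * xb[j]
        for k in range(n_classes):
            for j in range(n_cols):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


def fit_rmnl(X, y, n_classes, n_models=15, n_sub=2, seed=0):
    """Fit n_models MNLs, each on bootstrapped rows and n_sub random features."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    ensemble = []
    for _ in range(n_models):
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]  # sample rows with replacement
        feats = rng.sample(range(n_features), n_sub)           # features without replacement
        Xb = [[X[i][f] for f in feats] for i in rows]
        yb = [y[i] for i in rows]
        ensemble.append((feats, fit_mnl(Xb, yb, n_classes)))
    return ensemble


def predict_rmnl(ensemble, row, n_classes):
    """Average the members' probabilities (an assumed rule) and return the top class."""
    avg = [0.0] * n_classes
    for feats, W in ensemble:
        probs = mnl_proba(W, [row[f] for f in feats])
        for k in range(n_classes):
            avg[k] += probs[k] / len(ensemble)
    return max(range(n_classes), key=lambda k: avg[k])


def make_toy_data(n_per_class, seed, n_classes=3, n_features=6):
    """Synthetic data: feature j is centred at 3 for class j % n_classes, else at 0."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 if j % n_classes == label else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y


X_train, y_train = make_toy_data(20, seed=1)
X_test, y_test = make_toy_data(20, seed=2)
ensemble = fit_rmnl(X_train, y_train, n_classes=3)
hits = sum(predict_rmnl(ensemble, row, 3) == label
           for row, label in zip(X_test, y_test))
accuracy = hits / len(y_test)
print(f"toy hold-out accuracy: {accuracy:.2f}")
```

Because each member sees only `n_sub` features, no single logit has to cope with the full feature space, which is the property the abstract attributes to Random Forests. The toy data says nothing about the CRM results reported in the paper.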
Original language: English
Pages (from-to): 1721-1732
Number of pages: 11
Journal: Expert Systems with Applications
Volume: 34
Issue number: 3
Publication status: Published - Apr 2008

Keywords

  • Customer relationship management (CRM)
  • Data mining methods and algorithms
  • Feature evaluation and selection
  • Multiclass classifier design and evaluation
