Airbnb, Inc.
Classification for asymmetric error costs
Last updated:
Abstract:
A behavior detection module constructs a random forest classifier (RFC) that accounts for asymmetric misclassification costs among a set of classification labels. The RFC comprises a plurality of decision trees, and its classification label estimate is determined from the classification estimates of those trees. Each parent node of a decision tree is associated with a condition on an attribute that splits the parent node into two child nodes; the condition is chosen to maximize an improvement function evaluated on a training database. The improvement function is based on an asymmetric impurity function that biases the decision tree toward reducing the error for the label with the higher misclassification cost, at the expense of increasing the error for the label with the lower misclassification cost.
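The abstract does not give the form of the asymmetric impurity function, so the following Python sketch is only one possible illustration of the idea, not the claimed method. It assumes binary labels (1 for the high-cost label, 0 for the other) and substitutes a cost-weighted Gini impurity, a common textbook technique. The names `asymmetric_impurity`, `improvement`, `best_threshold`, `cost_high`, and `cost_low` are hypothetical and do not come from the patent.

```python
from typing import List, Optional, Sequence, Tuple


def _weighted_mass(labels: Sequence[int], cost_high: float, cost_low: float) -> Tuple[float, float]:
    """Return the cost-weighted mass of label 1 and label 0 in a node."""
    n_high = sum(labels)
    n_low = len(labels) - n_high
    return cost_high * n_high, cost_low * n_low


def asymmetric_impurity(labels: Sequence[int], cost_high: float, cost_low: float) -> float:
    """Cost-weighted Gini impurity (an assumed stand-in for the patent's function)."""
    w_high, w_low = _weighted_mass(labels, cost_high, cost_low)
    total = w_high + w_low
    if total == 0:
        return 0.0
    p = w_high / total  # cost-weighted share of the high-cost label
    return 2.0 * p * (1.0 - p)


def improvement(parent: Sequence[int], left: Sequence[int], right: Sequence[int],
                cost_high: float, cost_low: float) -> float:
    """Drop in mass-scaled impurity obtained by splitting parent into left and right."""
    def scaled(node: Sequence[int]) -> float:
        mass = sum(_weighted_mass(node, cost_high, cost_low))
        return mass * asymmetric_impurity(node, cost_high, cost_low)
    return scaled(parent) - scaled(left) - scaled(right)


def best_threshold(values: Sequence[float], labels: Sequence[int],
                   cost_high: float, cost_low: float) -> Tuple[Optional[float], float]:
    """Scan midpoints of one numeric attribute and keep the split with the largest improvement."""
    pairs = sorted(zip(values, labels))
    best: Tuple[Optional[float], float] = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal attribute values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left: List[int] = [y for x, y in pairs if x <= thr]
        right: List[int] = [y for x, y in pairs if x > thr]
        gain = improvement(labels, left, right, cost_high, cost_low)
        if gain > best[1]:
            best = (thr, gain)
    return best
```

In this sketch, raising `cost_high` relative to `cost_low` makes a node holding even a few high-cost examples look more impure, so the split search favors conditions that isolate those examples. With equal costs the function reduces to the ordinary Gini impurity.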
Utility
14 Dec 2015
23 Jul 2019