CORT: Classification OR Regression Trees

Tuesday, Nov. 12, 2002

2:00-3:30 p.m.

Hughes Room

Tree models are powerful tools for signal processing and pattern
analysis. The first tree-based methodology to gain wide recognition
was CART (Classification and Regression Trees). CART aims to balance
the fitness of a tree model to data with the complexity of a tree
(measured by the number of leaf nodes). In regression problems, this
is a means to balance the familiar bias-variance trade-off. Wavelet
denoising methods and complexity penalized decision trees can be
interpreted as instances or variations of CART. In this talk, three
basic elements of CART are challenged. The primary focus is the
penalization strategy employed to prune back an initial, overfitted
tree. It is shown that the pruning rule for classification should be
different from the one used for regression (contrary to CART); hence the
title, Classification or Regression Trees. Second, it is argued that
growing a tree-structured partition that is specifically fitted to the
data is unnecessary. Instead, an approach based on non-adapted
(fixed) dyadic tree structures and partitions, much like the trees
underlying wavelet analysis, is advocated. It is shown that dyadic
trees provide sufficient flexibility, are easy to construct, and
produce near-optimal results when properly pruned. Third, the use of
a negative log-likelihood measure of empirical risk for regression
problems is recommended, instead of the usual sum-of-squared errors
criterion. The likelihood-based criterion leads to regression trees
that extend wavelet denoising methods to many non-Gaussian regression
problems. Applications of tree models in networking and biomedicine
will also be discussed.
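To make the complexity-penalization and dyadic-partition ideas above concrete, here is a minimal sketch (not the speaker's algorithm): a regression tree on a 1-D signal that always splits at the interval midpoint (a fixed dyadic partition), pruned by comparing each subtree's penalized sum-of-squared-errors cost against collapsing it to a single leaf. The function name and the penalty form, lambda times the number of leaves, are illustrative assumptions.

```python
def dyadic_tree(y, lo, hi, lam):
    """Fit a complexity-penalized dyadic regression tree to y[lo:hi].

    Returns (cost, leaves), where cost = SSE + lam * (number of leaves)
    and leaves is the pruned partition as (lo, hi, mean) intervals.
    """
    seg = y[lo:hi]
    mean = sum(seg) / len(seg)
    sse = sum((v - mean) ** 2 for v in seg)
    leaf = (sse + lam, [(lo, hi, mean)])   # cost of stopping here
    if hi - lo < 2:
        return leaf
    mid = (lo + hi) // 2                   # dyadic split: always the midpoint
    lcost, lleaves = dyadic_tree(y, lo, mid, lam)
    rcost, rleaves = dyadic_tree(y, mid, hi, lam)
    # Prune: keep the split only if it lowers the penalized cost.
    if lcost + rcost < leaf[0]:
        return (lcost + rcost, lleaves + rleaves)
    return leaf

# A piecewise-constant signal: a small penalty recovers the two pieces,
# a large penalty prunes the tree back to a single leaf.
signal = [0, 0, 0, 0, 10, 10, 10, 10]
_, fine = dyadic_tree(signal, 0, len(signal), lam=1.0)     # two leaves
_, coarse = dyadic_tree(signal, 0, len(signal), lam=300.0)  # one leaf
```

A larger penalty trades variance for bias, which is exactly the balance the abstract attributes to CART's pruning rule.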

UC Berkeley Networking

Ashwin Pananjady and Orhan Ocal

Last Modification Date: Wednesday, February 10, 2016
