
Please use this identifier to cite or link to this item: http://hdl.handle.net/10012/6496

Title: Variable Ranking by Solution-path Algorithms
Authors: Wang, Bo
Keywords: Lasso; LARS; Variable Ranking; Solution path
Approved Date: 20-Jan-2012
Date Submitted: 19-Jan-2012
Abstract: Variable selection has long been an important problem in statistics. We often face a large data set and want to understand the relationship between the response and the candidate predictor variables. When the number of variables is large, the fitted model often remains big even after insignificant variables are deleted. A final model with too many variables is unsatisfactory for two reasons. The first is prediction accuracy: although the bias of a big model may be small, its variance is usually high. The second is interpretation: with many variables in the model, it is hard to identify a clear relationship or to explain the effects of the variables of interest. Many variable selection methods have been proposed. One disadvantage of variable selection, however, is that models of different sizes require different values of the tuning parameters, which are hard for non-statisticians to choose. Xin and Zhu advocate variable ranking instead of variable selection: once the variables are ranked properly, selection can be made by applying a threshold rule. In this thesis, we rank variables using Least Angle Regression (LARS). Shrinkage methods such as Lasso and LARS can shrink coefficients to zero, and they produce a solution path that describes the order in which variables enter the model. The path therefore provides an intuitive way to rank variables. Lasso, however, can be difficult to apply to variable ranking directly, because in a Lasso solution path a variable may enter the model and later be dropped, which makes it hard to rank by order of entry. LARS, which is closely related to Lasso, does not have this problem. We make use of this property and rank variables using the LARS solution path.
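The idea of ranking by order of entry can be sketched in a few lines of Python. This is a minimal illustration and not the procedure developed in the thesis: the function names `entry_order` and `dropped_variables` are hypothetical, and `toy_path` is made-up data standing in for a solution path (one coefficient vector per step), not a result from any real fit. It shows how a ranking is read off a path, how a threshold rule then selects the top-ranked variables, and how a variable that enters and is later dropped (as can happen on a Lasso path) makes the entry order ambiguous.

```python
from typing import List, Sequence


def entry_order(path: Sequence[Sequence[float]], tol: float = 1e-12) -> List[int]:
    """Rank variables by the first path step at which their coefficient is nonzero.

    Variables that never enter are appended at the end in index order.
    """
    n_vars = len(path[0])
    first_step = {}
    for step, coefs in enumerate(path):
        for j, b in enumerate(coefs):
            if abs(b) > tol and j not in first_step:
                first_step[j] = step
    entered = sorted(first_step, key=lambda j: (first_step[j], j))
    never = [j for j in range(n_vars) if j not in first_step]
    return entered + never


def dropped_variables(path: Sequence[Sequence[float]], tol: float = 1e-12) -> List[int]:
    """Return variables whose coefficient goes back to zero after being nonzero."""
    dropped = []
    for j in range(len(path[0])):
        active = False
        for coefs in path:
            if abs(coefs[j]) > tol:
                active = True
            elif active:
                dropped.append(j)
                break
    return dropped


# Hypothetical toy path with 4 variables and 5 steps (illustrative numbers only).
toy_path = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],  # variable 1 enters
    [0.3, 0.9, 0.0, 0.0],  # variable 0 enters
    [0.0, 1.2, 0.4, 0.0],  # variable 0 is dropped, variable 2 enters
    [0.0, 1.4, 0.7, 0.0],
]

ranking = entry_order(toy_path)         # ranking by order of entry
top_two = ranking[:2]                   # threshold rule: keep the first two
dropped = dropped_variables(toy_path)   # variables with an ambiguous rank
```

In this toy path, variable 0 is ranked second by order of entry even though it has left the model by the last step. On a path where no variable is ever dropped, `dropped_variables` returns an empty list and the entry order is unambiguous.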
Program: Statistics
Department: Statistics and Actuarial Science
Degree: Master of Mathematics
URI: http://hdl.handle.net/10012/6496
Appears in Collections: Electronic Theses and Dissertations (UW); Faculty of Mathematics Theses and Dissertations

Files in This Item: Wang_Bo.pdf (396.41 kB, Adobe PDF)


This item is protected by original copyright

All items in UWSpace are protected by copyright, with all rights reserved.


University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1