Show simple item record

dc.contributor.authorWang, Bo
dc.date.accessioned2012-01-20 21:29:11 (GMT)
dc.date.available2012-01-20 21:29:11 (GMT)
dc.date.issued2012-01-20T21:29:11Z
dc.date.submitted2012-01-19
dc.identifier.urihttp://hdl.handle.net/10012/6496
dc.description.abstractVariable Selection has always been a very important problem in statistics. We often meet situations where a huge data set is given and we want to find out the relationship between the response and the corresponding variables. With a huge number of variables, we often end up with a big model even if we delete those that are insignificant. There are two reasons why we are unsatisfied with a final model with too many variables. The first reason is the prediction accuracy. Though the prediction bias might be small under a big model, the variance is usually very high. The second reason is interpretation. With a large number of variables in the model, it's hard to determine a clear relationship and explain the effects of variables we are interested in. A lot of variable selection methods have been proposed. However, one disadvantage of variable selection is that different sizes of model require different tuning parameters in the analysis, which is hard to choose for non-statisticians. Xin and Zhu advocate variable ranking instead of variable selection. Once variables are ranked properly, we can make the selection by adopting a threshold rule. In this thesis, we try to rank the variables using Least Angle Regression (LARS). Some shrinkage methods like Lasso and LARS can shrink the coefficients to zero. The advantage of this kind of methods is that they can give a solution path which describes the order that variables enter the model. This provides an intuitive way to rank variables based on the path. However, Lasso can sometimes be difficult to apply to variable ranking directly. This is because that in a Lasso solution path, variables might enter the model and then get dropped. This dropping issue makes it hard to rank based on the order of entrance. However, LARS, which is a modified version of Lasso, doesn't have this problem. We'll make use of this property and rank variables using LARS solution path.en
dc.language.isoenen
dc.publisherUniversity of Waterlooen
dc.subjectLassoen
dc.subjectLARSen
dc.subjectVariable Rankingen
dc.subjectSolution pathen
dc.titleVariable Ranking by Solution-path Algorithmsen
dc.typeMaster Thesisen
dc.pendingfalseen
dc.subject.programStatisticsen
uws-etd.degree.departmentStatistics and Actuarial Scienceen
uws-etd.degreeMaster of Mathematicsen
uws.typeOfResourceTexten
uws.peerReviewStatusUnrevieweden
uws.scholarLevelGraduateen


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages