Generalizaciones de minimos cuadrados parciales con aplicación en clasificacion supervisada

Vega-Vilca, Jose C.

Files

CIIC_VegaVilcaJ_2004.pdf (749.65 KB)

Authors

Vega-Vilca, Jose C.

Advisor

Acuña-Fernández, Edgar

College

College of Engineering

Department

Department of Electrical and Computer Engineering

Degree Level

Ph.D.

Date

2004

Abstract

The development of technologies such as microarrays has generated a large amount of data. The main characteristic of this kind of data is the large number of predictors (genes) and the small number of observations (experiments). Thus, the data matrix X is of order n×p, where n is much smaller than p. Before applying any multivariate statistical technique, such as regression or classification, to analyze the information contained in these data, we need to apply feature selection, dimensionality reduction using orthogonal variables, or both. This eliminates multicollinearity among the predictor variables, which can lead to severe prediction errors, and it reduces the computational burden required to build and validate the classifier. Principal component analysis (PCA) is a technique that has been used for some time to reduce dimensionality. However, the first components, which capture most of the variability in the data structure, do not necessarily improve prediction when they are used for regression and classification (Yeung and Ruzzo, 2001). Partial least squares (PLS), introduced by Wold (1975), was an important contribution to reducing dimensionality in a regression context using orthogonal components. The certainty that the first PLS components improve prediction has made PLS a widely used technique, particularly in the area of chemistry known as chemometrics. Nguyen and Rocke (2002), working on supervised classification methods for microarray data, reduced the dimensionality by first applying feature selection using statistical techniques such as difference of means and analysis of variance, and then applying PLS regression with the vector of classes (a categorical variable) treated as a response vector (a continuous variable). This procedure is not adequate, since the predictions are not necessarily integers and must be rounded, which loses accuracy. In spite of these shortcomings, PLS regression yields reasonable results.
In this thesis we implement generalizations of PLS regression as a dimensionality reduction technique to be applied in supervised classification. We extend a technique introduced by Bastien et al. (2002), who combined PLS with ordinal logistic regression.
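
The baseline procedure described above (PLS regression on a class vector treated as continuous, followed by rounding) can be illustrated with a minimal sketch. This is not the thesis code or the implementation of Nguyen and Rocke. It assumes a standard single-response PLS fitted with the NIPALS algorithm, two classes coded 0 and 1, and simulated data in place of microarray measurements. The feature selection step is omitted, and the names `pls1_fit`, `simulate`, and all numeric settings are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (NIPALS); return centering terms and coefficients."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight: direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # score vector (one orthogonal component)
        tt = t @ t
        p = Xk.T @ t / tt             # X loading
        c = yk @ t / tt               # regression of y on the score
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original X space
    return x_mean, y_mean, beta


# Simulated stand-in for microarray data (assumption): n much smaller than p.
rng = np.random.default_rng(0)
p, n_informative = 200, 10


def simulate(n):
    y = np.repeat([0, 1], n // 2)
    X = rng.normal(size=(n, p))
    X[:, :n_informative] += 4.0 * y[:, None]   # only a few "genes" differ by class
    return X, y.astype(float)


X_train, y_train = simulate(20)
X_test, y_test = simulate(20)

x_mean, y_mean, beta = pls1_fit(X_train, y_train, n_components=2)
y_hat = y_mean + (X_test - x_mean) @ beta      # continuous, not necessarily an integer

# Round to the nearest label and clip to the valid range; this is the step
# the abstract criticizes as losing accuracy.
labels_pred = np.clip(np.rint(y_hat), 0, 1).astype(int)
accuracy = float(np.mean(labels_pred == y_test))
print("held-out accuracy:", accuracy)
```

The continuous values in `y_hat` show why the abstract calls the approach inadequate: the regression has no notion of class membership, so labels only appear after rounding. Feeding the PLS components into a model built for categorical responses avoids that step.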

Keywords

Supervised classification,
Partial least squares

Persistent URL

https://hdl.handle.net/20.500.11801/1811

Cite

Vega-Vilca, J. C. (2004). Generalizaciones de minimos cuadrados parciales con aplicación en clasificacion supervisada [Dissertation]. Retrieved from https://hdl.handle.net/20.500.11801/1811

Collections

Theses & Dissertations
