Publication:
Generalizaciones de mínimos cuadrados parciales con aplicación en clasificación supervisada

dc.contributor.advisor Acuña-Fernández, Edgar
dc.contributor.author Vega-Vilca, Jose C.
dc.contributor.college College of Engineering en_US
dc.contributor.committee Macchiavelli, Raul
dc.contributor.committee Romanach, Rodolfo
dc.contributor.committee Vega, Fernando
dc.contributor.department Department of Electrical and Computer Engineering en_US
dc.contributor.representative Calderón, Andrés
dc.date.accessioned 2019-02-12T16:03:44Z
dc.date.available 2019-02-12T16:03:44Z
dc.date.issued 2004
dc.description.abstract The development of technologies such as microarrays has generated a large amount of data. The main characteristic of this kind of data is the large number of predictors (genes) and the small number of observations (experiments). Thus, the data matrix X is of order n×p, where n is much smaller than p. Before using any multivariate statistical technique, such as regression or classification, to analyze the information contained in these data, we need to apply feature selection methods and/or dimensionality reduction using orthogonal variables. Doing so eliminates multicollinearity among the predictor variables, which can lead to severe prediction errors, and decreases the computational burden required to build and validate the classifier. Principal component analysis (PCA) is a technique that has been used for some time to reduce dimensionality. However, the first components, which capture most of the variability of the data structure, do not necessarily improve prediction when they are used for regression and classification (Yeung and Ruzzo, 2001). Partial least squares (PLS), introduced by Wold (1975), was an important contribution to reducing dimensionality in a regression context using orthogonal components. The certainty that the first PLS components improve prediction has made PLS a widely used technique, particularly in the area of chemistry known as chemometrics. Nguyen and Rocke (2002), working on supervised classification methods for microarray data, reduced the dimensionality by first applying feature selection using statistical techniques such as difference of means and analysis of variance, after which they applied PLS regression, treating the vector of classes (a categorical variable) as a response vector (a continuous variable). This procedure is not adequate, since the predictions are not necessarily integers and must be rounded, losing accuracy.
In spite of these shortcomings, PLS regression yields reasonable results. In this thesis we implement generalizations of PLS regression as a dimensionality reduction technique to be applied in supervised classification. We extend a technique introduced by Bastien et al. (2002), who combined PLS with ordinal logistic regression. en_US
dc.description.graduationYear 2004 en_US
dc.description.sponsorship Grant N00014-03-1-0359 from the Office of Naval Research (ONR). en_US
dc.identifier.uri https://hdl.handle.net/20.500.11801/1811
dc.language.iso Spanish en_US
dc.rights.holder (c) 2004 Jose Carlos Vega Vilca en_US
dc.rights.license All rights reserved en_US
dc.subject Clasificación supervisada en_US
dc.subject Cuadrados parciales en_US
dc.title Generalizaciones de mínimos cuadrados parciales con aplicación en clasificación supervisada en_US
dc.type Dissertation en_US
dspace.entity.type Publication
thesis.degree.discipline Computing and Information Sciences and Engineering en_US
thesis.degree.level Ph.D. en_US
Files
Original bundle
Name: CIIC_VegaVilcaJ_2004.pdf
Size: 749.65 KB
Format: Adobe Portable Document Format