Moyano-Niño, Héctor


Publication Search Results

Now showing 1 - 1 of 1
  • Publication
    Estimación de densidades multivariadas en flujo de datos usando mezclas adaptativas de componentes gaussianas [Estimation of multivariate densities in data streams using adaptive mixtures of Gaussian components]
    (2012-06) Moyano-Niño, Héctor; Acuña-Fernández, Edgar; College of Arts and Sciences - Sciences; Lorenzo, Edgardo; Macchiavelli, Raúl E.; Department of Mathematics; Córdoba, Mario
    In today's world of science and technology, data often arrive continuously over time. This type of data is called a data stream, and storing all of it is impractical, so traditional data mining and analysis techniques are not efficient enough for problems involving data streams; statistical models designed specifically for data streams are therefore needed. Adaptive mixtures (AM) is an estimation method that combines Gaussian mixture modeling with kernel density estimation. One of its main features is that the estimate is updated continuously as data arrive sequentially, which makes adaptive mixtures very attractive for modeling data streams. Adapting the adaptive-mixtures idea to data streams, however, presents some problems, such as mixture models with too many components, slight changes in the estimated model parameters depending on the order in which new data arrive, and limited applicability in high-dimensional spaces. Many of these problems have recently been addressed by adapting the online expectation-maximization algorithm (oEM) to the adaptive-mixtures process for data streams (oAM). This thesis studies adaptive mixtures with Gaussian components for modeling multidimensional data streams. It also presents an experimental study with artificial data aimed at controlling the growth in the number of components and improving the estimation of the model components, using what I call component-adjustment graphs. The theoretical framework and the algorithms presented here are formulated for estimating multivariate densities, but the experimental part was implemented in the R statistical programming language for data in two and three dimensions.
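
The adaptive-mixtures idea summarized in the abstract (recursively updating a Gaussian mixture as each observation arrives, and creating a new component when an observation is poorly explained by the current estimate) can be illustrated with a short R snippet, R being the thesis's implementation language. The sketch below is only an approximation of a generic Priebe-style AM update, not the thesis's oAM algorithm; the function names, the creation threshold tau, and the initial covariance sigma0 are all assumptions introduced for illustration.

```r
# Illustrative sketch only: a Priebe-style adaptive-mixture (AM) online update
# for a bivariate Gaussian mixture. Not the thesis's exact oAM algorithm;
# the creation threshold `tau` and initial covariance `sigma0` are assumed values.

dmvnorm_simple <- function(x, mu, Sigma) {
  # Multivariate normal density computed via the Cholesky factor of Sigma
  d <- length(mu)
  R <- chol(Sigma)
  z <- backsolve(R, x - mu, transpose = TRUE)
  exp(-0.5 * sum(z^2)) / ((2 * pi)^(d / 2) * prod(diag(R)))
}

am_update <- function(model, x, tau = 3, sigma0 = diag(length(x))) {
  k <- length(model$pi)
  n <- model$n
  # Mahalanobis distance from x to every current component
  maha <- sapply(seq_len(k), function(j)
    sqrt(mahalanobis(x, model$mu[[j]], model$Sigma[[j]])))
  if (min(maha) > tau) {
    # x is poorly explained by the current mixture: create a new component at x
    model$pi    <- c(model$pi * n / (n + 1), 1 / (n + 1))
    model$mu    <- c(model$mu, list(x))
    model$Sigma <- c(model$Sigma, list(sigma0))
  } else {
    # otherwise, a recursive (stochastic-approximation) update of all components
    dens <- sapply(seq_len(k), function(j)
      dmvnorm_simple(x, model$mu[[j]], model$Sigma[[j]]))
    r <- model$pi * dens
    r <- r / sum(r)                            # responsibilities of x
    for (j in seq_len(k)) {
      model$pi[j] <- model$pi[j] + (r[j] - model$pi[j]) / (n + 1)
      w    <- r[j] / ((n + 1) * model$pi[j])   # per-component step size
      diff <- x - model$mu[[j]]
      model$mu[[j]]    <- model$mu[[j]] + w * diff
      model$Sigma[[j]] <- model$Sigma[[j]] + w * (diff %*% t(diff) - model$Sigma[[j]])
    }
  }
  model$n <- n + 1
  model
}

# Example: stream 500 bivariate points one at a time
set.seed(1)
stream <- rbind(matrix(rnorm(500, 0, 1), ncol = 2),
                matrix(rnorm(500, 4, 1), ncol = 2))
model <- list(pi = 1, mu = list(stream[1, ]), Sigma = list(diag(2)), n = 1)
for (i in 2:nrow(stream)) model <- am_update(model, stream[i, ])
length(model$pi)   # number of components created while streaming
```

As in the abstract, the number of components in such a scheme tends to grow with the stream unless its creation rule is controlled, which is the issue the thesis's experimental study on component growth addresses.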