
EAGER-DynamicData: A new paradigm for data analytics: L1-norm based Learning and Processing

Project: Research

Project Details

Description

This project carries out fundamental work on a transformative idea: data analytics and data-feature extraction through newly defined and calculated principal-component vectors that best represent the main features of a given data set, even in the presence of faulty, missing, or outlier data. Based on this idea, an algorithmic framework will be developed to support several targeted applications, such as data processing in social networks, image processing and video libraries, wireless-sensor-network data fusion, economics, genomics and proteomics, and bioinformatics. The potential impact of the work may extend well beyond these applications to any field of science and engineering where conventional data-feature extraction has been used. Technically, the investigation aims to rewrite the chapter on L2-norm (eigenvector and singular-vector decomposition) data analysis, which has proved enormously rewarding over the past century. Optimal L1-norm data analytics are being developed that are inherently resistant to data contamination and as good as L2-norm analytics on "clean" data. L1-norm principal component analysis (PCA) has seen only limited prior research and is so far absent from education and textbooks. Several profound differences between L1-norm PCA and standard L2-norm PCA have, to date, blocked progress in the theoretical understanding of L1-norm PCA, and thus in the design of efficient algorithmic solutions. The project seeks to develop a novel approach to dimensionality reduction under the L1 norm for processing outlier-prone or contaminated "big data" (large amounts of high-dimensional data). The approach interprets the fundamental L1-norm principal-components optimization problem as an equivalent binary-field maximization problem, thereby opening up a range of potentially new analytical and algorithmic techniques.
The project goals include: (i) fundamental algorithmic research on the exact calculation of maximum-L1-norm-projection data features; (ii) fundamental understanding and execution of L1-norm-measured data dimensionality reduction; and (iii) sample-space reduction.
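
The binary-field reformulation mentioned in the description can be illustrated with a small sketch. For N samples x_1, ..., x_N, one standard way to write the equivalence is that the unit vector q maximizing the sum of |x_i . q| is the normalized signed sum of the samples, taken over the sign vector b in {+1, -1}^N that maximizes the Euclidean length of that signed sum. The code below is not the project's algorithm: it is a brute-force illustration under that assumption, the function names `l1_principal_component` and `l1_projection` are hypothetical, and the search costs on the order of 2^(N-1) evaluations, so it is only practical for very small N.

```python
import itertools
import math


def l1_projection(points, q):
    """Sum of absolute projections of the samples onto direction q."""
    return sum(abs(sum(xd * qd for xd, qd in zip(x, q))) for x in points)


def l1_principal_component(points):
    """Exact first L1-norm principal component by exhaustive sign search.

    points: non-empty list of N samples, each a list of D floats.
    Returns (q, value): q is a unit vector maximizing l1_projection(points, q)
    and value is the maximum attained.
    Assumption: brute force over sign vectors, illustrative only.
    """
    n = len(points)
    dim = len(points[0])
    best_norm, best_sum = -1.0, None
    # Fix the first sign to +1: b and -b give the same component up to sign.
    for tail in itertools.product((1.0, -1.0), repeat=n - 1):
        signs = (1.0,) + tail
        # Signed sum of the samples for this sign vector.
        s = [sum(b * x[d] for b, x in zip(signs, points)) for d in range(dim)]
        norm = math.sqrt(sum(v * v for v in s))
        if norm > best_norm:
            best_norm, best_sum = norm, s
    if best_norm == 0.0:
        raise ValueError("all samples are zero; the component is undefined")
    q = [v / best_norm for v in best_sum]
    return q, best_norm


if __name__ == "__main__":
    samples = [[3.0, 0.0], [-2.0, 0.0], [0.0, 1.0]]
    q, value = l1_principal_component(samples)
    print(q, value, l1_projection(samples, q))
```

In this sketch the returned value and `l1_projection(samples, q)` agree, which is a quick way to check the equivalence numerically on small inputs.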
Status: Finished
Effective start/end date: 09/1/15 to 08/31/18

Funding

  • National Science Foundation: $180,000.00
