
Otey, ME (2006)

Approaches to Abnormality Detection with Constraints

PhD thesis, The Ohio State University, USA.

ISSN/ISBN: Not available at this time. DOI: Not available at this time.



Abstract: A common problem in data analysis is that of discriminating between modes of normal behavior and modes of abnormal behavior. Of particular interest are techniques that can automatically detect abnormal activity in data. This is important because abnormal data may indicate measurement error in scientific data or malicious activity in security audit data. There are two basic approaches to the problem of automatically finding abnormalities. The first is known as signature detection, which involves finding known patterns of abnormality in a database. However, it has the drawback of not being able to detect abnormalities for which there is no prior information. The second approach is known as anomaly detection, which involves building a model of normal data and then searching for patterns that do not fit this model. Unlike the signature detection approach, it is able to detect abnormalities for which there is no prior information, but it has the drawback that the anomalies it does detect may not be (significantly) abnormal. The most successful approaches will use both signature detection and anomaly detection techniques to exploit their combined strengths. Much of the previous research in this area has focused on more general approaches to anomaly and signature detection. This work, however, focuses on carrying out anomaly and signature detection under various constraints. For example, the data may contain heterogeneous attribute types or have missing values. The data may also be distributed across several computers or streaming in at a high rate, or there may be limitations on the resources available to analyze the data. In this work, we develop novel solutions to the abnormality detection problem with constraints, and empirically test them on various real and synthetic data sets.




Reference Type: Thesis

Subject Area(s): Computer Science, Statistics