Anomaly detection is an automated process that identifies data that does not fit the rest of a set or an expected pattern. Data that doesn't match can be a sign of a problem with a system, and in large data streams, a person reviewing the data might not be able to spot the anomaly. An automated system can identify it, collect information, and generate a report. Some systems can also take action when an anomaly is a recognizable problem that calls for a response to protect the system or its users.
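As a minimal sketch of what "does not fit the rest of a set" can mean in practice, the example below scores each value by its distance from the median, scaled by the median absolute deviation (a modified z-score). This is one common statistical approach, not the only one. The function name, the sample readings, and the 3.5 threshold are assumptions chosen for illustration.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be scored
    anomalies = []
    for i, v in enumerate(values):
        # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores.
        score = 0.6745 * abs(v - med) / mad
        if score > threshold:
            anomalies.append((i, v))
    return anomalies


# Illustrative sensor readings with one value that clearly does not belong.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2]
print(find_anomalies(readings))
```

The median and the median absolute deviation are used here instead of the mean and standard deviation because a single extreme value shifts them very little, so the anomaly cannot hide itself by inflating the baseline.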
Anomalies can arise for a number of reasons. One is an error with a system that causes the generation of garbled, incomplete, or corrupt data. A system may also have data outliers because of an intrusion, where the data may be an injection from another source or a virus that is proliferating within the system. Fraud can also generate anomalies in a computer system.
From a systems architecture and security standpoint, anomaly detection is a valuable tool. Automated scanning can identify and block many attacks before the user is even aware of them, which can make the overall system much safer. Whether errors are the result of an internal issue or an outside attack, they need to be identified and resolved as quickly as possible. If the system encounters an anomaly and does not know how to respond, it may send a report to a system administrator for further action.
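The split between anomalies the system can handle itself and anomalies it must escalate can be sketched as a simple lookup. Everything named here is hypothetical: the anomaly labels, the `KNOWN_RESPONSES` table, and the list standing in for an administrator's report queue are assumptions made for illustration, not part of any particular product.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    REPORT = "report"


# Hypothetical table of anomaly kinds the system already knows how to handle.
KNOWN_RESPONSES = {
    "sql_injection": Action.BLOCK,
    "port_scan": Action.BLOCK,
}


def respond(anomaly_kind, details, admin_queue):
    """Block recognized anomalies; queue unrecognized ones for an administrator."""
    action = KNOWN_RESPONSES.get(anomaly_kind, Action.REPORT)
    if action is Action.REPORT:
        # The system does not know how to respond, so a person decides.
        admin_queue.append({"kind": anomaly_kind, "details": details})
    return action


admin_queue = []
print(respond("port_scan", "many ports probed from one host", admin_queue))
print(respond("unusual_payload", "unrecognized binary data in a form field", admin_queue))
print(len(admin_queue))
```

Defaulting to a report rather than a block keeps an unfamiliar but harmless event from being shut down automatically, at the cost of needing someone to review the queue.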
Detection of fraud can also be important. Insurance companies and other organizations can run anomaly detection scans on claims and reports to see if any stand out or appear unusual. This can help them identify likely instances of fraud for closer review. Likewise, banks and other financial companies use anomaly detection for security. If the account of a 90-year-old person with a very steady banking history suddenly shows unusual activity, for instance, the anomaly detection system might flag it and indicate suspected identity theft.
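A check of this kind compares new activity against the account's own history rather than against all customers. The sketch below is a deliberately simplified illustration under stated assumptions: the `Transaction` fields, the two rules, and the three-standard-deviation limit are hypothetical, and real fraud systems weigh many more signals.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Transaction:
    amount: float
    country: str


def flag_reasons(history, new, max_sigma=3.0):
    """Return the reasons, if any, that `new` departs from the account's history."""
    reasons = []
    amounts = [t.amount for t in history]
    if len(amounts) >= 2:  # stdev needs at least two data points
        mu, sigma = mean(amounts), stdev(amounts)
        if sigma > 0 and (new.amount - mu) / sigma > max_sigma:
            reasons.append("amount far above usual")
    if new.country not in {t.country for t in history}:
        reasons.append("unfamiliar country")
    return reasons


# A steady history, followed by one routine and one very unusual transaction.
history = [
    Transaction(40.0, "US"),
    Transaction(55.0, "US"),
    Transaction(48.0, "US"),
    Transaction(52.0, "US"),
]
print(flag_reasons(history, Transaction(50.0, "US")))
print(flag_reasons(history, Transaction(2500.0, "FR")))
```

Returning the reasons instead of a bare yes or no gives a human reviewer something to act on, since a flag only indicates suspicion and not proof of fraud.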
Anomaly detection is also a useful tool in the sciences. Researchers can use this tool to spot rogue microorganisms, DNA, and other elusive bits of data of interest in a sample. This can help them identify the source of a medical problem, track down and eliminate impurities in a sample, and perform other tasks. In epidemiology, for instance, automated programs scan reports from health care facilities to spot outliers that might be warning signs of an emerging epidemic, and can issue alerts to researchers and public health officials if anything unusual is detected.
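For reports that arrive over time, one simple way to spot a warning sign is to compare each new count with the weeks just before it. The sketch below assumes weekly case counts from a single facility and flags any week that rises well above its trailing baseline. The window size, the threshold, and the sample numbers are illustrative assumptions, not figures from any surveillance system.

```python
from statistics import mean, stdev


def flag_spikes(counts, window=4, max_sigma=3.0):
    """Return indices of counts that rise far above their trailing-window baseline."""
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        # Floor the spread at 1 so a perfectly flat baseline does not
        # turn every tiny increase into an alert.
        if counts[i] > mu + max_sigma * max(sigma, 1.0):
            flagged.append(i)
    return flagged


# Illustrative weekly case counts; week 6 is far above the preceding weeks.
weekly_cases = [12, 15, 11, 14, 13, 12, 41, 16]
print(flag_spikes(weekly_cases))
```

One limitation of this sketch is that a flagged spike stays in the baseline for the following weeks and raises the bar for later alerts. A more careful version might exclude flagged weeks from the baseline or account for seasonal patterns.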