Database normalization is a design technique used with relational databases to reduce redundant information and therefore minimize data anomalies. Anomalies arise from information that is poorly grouped or redundant and can cause a range of effects, such as inconsistent values after an update, the inability to insert a new fact, or the unintended loss of data when a record is deleted. By normalizing the database, these anomalies are largely avoided, and storage typically is reduced because each fact is kept in only one place. Normalization is applied when the schema is designed and revisited when the schema changes; it is not a maintenance routine that has to be run on a schedule.
No one designs a relational database to hold redundant data on purpose, but redundancy typically creeps in despite the database designer’s best effort. For example, in an employee database, an employee’s department name and office location might be repeated in every row for that department, or the same employee details might be stored in several tables. When redundancy occurs on a large scale, anomalies arise. Database administrators typically cannot catch every redundant value by hand, so normalizing the schema is the most reliable way to correct this issue.
The first task of database normalization is to remove repeating information. Repeating groups and multi-valued columns are split into separate rows, and data that describes a different kind of thing is moved into its own table, where each fact is stored once. Normalization therefore breaks large tables into smaller ones, connects them through keys, and isolates each piece of information so that a change has to be made in only one place. Nothing that the database needs is discarded, because the original rows can be rebuilt by joining the smaller tables. Removing repeated data typically reduces storage and makes updates cheaper, although queries that need the combined data must then join the tables.
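As a minimal sketch of that decomposition, the Python example below uses the standard sqlite3 module with an in-memory database. The table and column names (employee_flat, department, employee, dept_id) and the sample rows are hypothetical and chosen only for illustration; the sketch also assumes that each department name always appears with the same location.

```python
import sqlite3


def normalize_employees(conn):
    """Split the hypothetical employee_flat table into department and employee."""
    conn.executescript("""
        CREATE TABLE department (
            dept_id   INTEGER PRIMARY KEY,
            dept_name TEXT NOT NULL UNIQUE,
            location  TEXT NOT NULL
        );
        CREATE TABLE employee (
            emp_id   INTEGER PRIMARY KEY,
            emp_name TEXT NOT NULL,
            dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
        );
        -- Each department is stored once instead of once per employee.
        INSERT INTO department (dept_name, location)
            SELECT DISTINCT dept_name, location FROM employee_flat;
        -- Each employee row now points at its department through a key.
        INSERT INTO employee (emp_id, emp_name, dept_id)
            SELECT f.emp_id, f.emp_name, d.dept_id
            FROM employee_flat AS f
            JOIN department AS d ON d.dept_name = f.dept_name;
    """)
    conn.commit()


def build_demo():
    """Create an in-memory database with a redundant table, then normalize it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_flat ("
        "emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_name TEXT, location TEXT)"
    )
    conn.executemany(
        "INSERT INTO employee_flat VALUES (?, ?, ?, ?)",
        [
            (1, "Ada", "Engineering", "Boston"),
            (2, "Grace", "Engineering", "Boston"),  # department data repeated
            (3, "Linus", "Support", "Austin"),
        ],
    )
    normalize_employees(conn)
    return conn
```

Calling build_demo() returns a connection in which the department table holds two rows rather than three repeated department entries, and joining employee to department on dept_id rebuilds the original rows, so no information is lost by the split.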
Anomalies arise when a database is not normalized, and they can make the information unreliable. An update anomaly occurs when the same fact is stored in several rows and an update changes some copies but not others, leaving the data inconsistent. An insertion anomaly occurs when a fact cannot be recorded because unrelated data that the table requires does not exist yet; for example, a new department cannot be stored until it has at least one employee. A deletion anomaly occurs when deleting one record unintentionally removes other facts that were stored only in that record; for example, deleting a department’s last employee also erases the department’s details. These are the three common anomalies that occur if the database is not normalized.
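The three anomalies can be reproduced on a small unnormalized table. The sketch below is illustrative only: it uses Python's standard sqlite3 module, and the employee_flat table, its columns and its sample rows are hypothetical.

```python
import sqlite3


def flat_table():
    """Build a hypothetical unnormalized table that mixes employee and department data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_flat ("
        "emp_id INTEGER PRIMARY KEY, "
        "emp_name TEXT NOT NULL, "
        "dept_name TEXT NOT NULL, "
        "location TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO employee_flat VALUES (?, ?, ?, ?)",
        [
            (1, "Ada", "Engineering", "Boston"),
            (2, "Grace", "Engineering", "Boston"),
            (3, "Linus", "Support", "Austin"),
        ],
    )
    return conn


def update_anomaly(conn):
    """Update one copy of a repeated fact; return the locations now recorded for Engineering."""
    conn.execute("UPDATE employee_flat SET location = 'Denver' WHERE emp_id = 1")
    rows = conn.execute(
        "SELECT DISTINCT location FROM employee_flat "
        "WHERE dept_name = 'Engineering' ORDER BY location"
    )
    return [row[0] for row in rows]


def insertion_anomaly(conn):
    """Try to record a department that has no employees yet; return True if rejected."""
    try:
        conn.execute(
            "INSERT INTO employee_flat (emp_name, dept_name, location) "
            "VALUES (NULL, 'Research', 'Seattle')"
        )
    except sqlite3.IntegrityError:
        # emp_name is NOT NULL, so the department cannot exist without an employee.
        return True
    return False


def deletion_anomaly(conn):
    """Delete the only Support employee; return how many Support rows remain."""
    conn.execute("DELETE FROM employee_flat WHERE emp_id = 3")
    return conn.execute(
        "SELECT COUNT(*) FROM employee_flat WHERE dept_name = 'Support'"
    ).fetchone()[0]
```

Run in order on the connection returned by flat_table(), update_anomaly leaves Engineering with two conflicting locations, insertion_anomaly reports that the employee-less department is rejected, and deletion_anomaly shows that removing one employee also removed every trace of the Support department. With department data held in its own table, none of the three situations would arise.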
There is no firm standard telling a designer how far a database should be normalized. Normalization is carried out when the schema is designed and again whenever the schema changes, rather than on a weekly or monthly schedule, because a normalized schema stays normalized no matter how much data is added. Third normal form usually is considered sufficient for most databases, and designers of very large or read-heavy databases sometimes keep some redundancy on purpose to avoid costly joins. Because there is no standard, the database designer or administrator typically will choose the level of normalization that he or she thinks is best for the database.