The nested set model is also known as the modified preorder tree traversal algorithm and is a way of storing hierarchical data within relational databases. This model has the advantage of providing very fast access and is best implemented in hierarchies that are read more frequently than written to. Each node within the information model is assigned two numbers that are stored as attributes. Querying the nested set model is fairly easy because both values can be used to pull out the necessary data. Making insertions, deletions, moves, and updates, though, are far more cumbersome because they may involve renumbering the nodes.
Typically used to represent nested sets or hierarchical information in the form of trees, the nested set model was introduced by Joe Celko. A tree, in this instance, is a data structure that contains a number of linked nodes. For instance, a parent node may connect to several child nodes, and this structure is repeated through the tree through several levels.
Trees are a great way of storing information in a particular order within a relational database, which is a data set that stores data depending on common characteristics. For example, product information within the food section of a store may start with food, branching into fruits, vegetables, and meat. Fruits may further be subdivided into berries, melons, and apples and vegetables into tubers, greens, and others, and meat into pork, mutton, and veal.
A relational database stores all this information in an easy-to-understand form, and a nested set model allows the tree structure to be managed efficiently. Using the above example, the root node would be food, which is represented by two values. Given the left value for food as 1, the other items in the tree are assigned a number on the left in order. Fruits would get a value of 2 on the left, berries would be 3, and so on. The values are then assigned on the right side, working all the way through the tree, bottom up, through each branch until the last value is assigned to food on the right side.
Each item in the tree ends up with two values, say lft for left and rgt for right, which can be used to identify them and indicate their relationship to other items. For instance, if fruits have a value of 2 and 15, then all nodes that have left values greater than 2 and right values less than 15 are descendants of the fruit tree 2–15. It becomes easy to pull out information on all fruits in one go because these values can be specified in a single query to the database.
This model is excellent for storing information that is accessed often, but insertions, deletions, and reordering information in the nested set model become very tedious. Rewriting indexes and renumbering the information may cause the database to crash, especially if the tree grows to include hundreds of thousands of nodes. The nested set model is best for light content management systems that have minimal insertions and changes. Insertions can be made much faster in the nested interval model because it stores the position of each node in the tree using floating point decimals while also encoding the path information.