A primary key is a column, or combination of columns, in a database table whose value is unique to a single record. This key is generally derived in one of two ways: from a unique identification code that originates outside the database, or from a number generated within the database. When the data already contains a value that is always unique to the entry, such as a social security number or part identification number, that value is often used as the primary key. When the data has no such identifier, the database typically generates a number, for example from an internal counter, to give each record a unique code.
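The two ways of deriving a key can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the tables `parts` and `notes`, their columns, and the sample values are hypothetical.

```python
import sqlite3


def demo_keys():
    """Create one table with an outside key and one with a generated key."""
    conn = sqlite3.connect(":memory:")

    # Key from outside the database: the part number arrives with the data.
    conn.execute(
        "CREATE TABLE parts (part_number TEXT NOT NULL PRIMARY KEY, description TEXT)"
    )
    conn.execute("INSERT INTO parts VALUES ('PN-1001', 'hex bolt')")

    # Key generated inside the database: SQLite assigns note_id itself
    # because the column is declared INTEGER PRIMARY KEY and no value is given.
    conn.execute("CREATE TABLE notes (note_id INTEGER PRIMARY KEY, body TEXT)")
    first = conn.execute("INSERT INTO notes (body) VALUES ('first note')").lastrowid
    second = conn.execute("INSERT INTO notes (body) VALUES ('second note')").lastrowid
    return conn, first, second


if __name__ == "__main__":
    conn, first, second = demo_keys()
    print(conn.execute("SELECT part_number FROM parts").fetchall())
    print(first, second)
```

In the first table the application supplies the key; in the second the database hands back a different number for each new row.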
There are three main restrictions on a primary key: existence, uniqueness and immutability. A key must exist at the time the record is made; it can’t be left blank and filled in later. Each key must be different from every other key in the table. This means that common identifiers, such as name or birth date, can’t be used on their own, because two people can share both a name and a birthday. Lastly, a primary key should never be altered once created, since other records may refer to it.
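The existence and uniqueness rules can be seen in action with a small sketch using Python's built-in sqlite3 module. The `people` table, its columns, and the helper names are hypothetical. Note that SQLite only rejects a missing key on a text column when the column is also declared NOT NULL, so the sketch declares it explicitly.

```python
import sqlite3


def make_people_table():
    """Return an in-memory database with a table keyed on person_id."""
    conn = sqlite3.connect(":memory:")
    # NOT NULL enforces existence; PRIMARY KEY enforces uniqueness.
    conn.execute(
        "CREATE TABLE people (person_id TEXT NOT NULL PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def try_insert(conn, person_id, name):
    """Return True if the row was accepted, False if a key rule rejected it."""
    try:
        conn.execute(
            "INSERT INTO people (person_id, name) VALUES (?, ?)", (person_id, name)
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = make_people_table()
    print(try_insert(conn, "A-1", "Ann Lee"))   # accepted
    print(try_insert(conn, None, "Ann Lee"))    # rejected: key is missing
    print(try_insert(conn, "A-1", "Bob Ray"))   # rejected: key already used
    print(try_insert(conn, "A-2", "Ann Lee"))   # accepted: same name, new key
```

The third rule is different in kind: SQLite will not by itself stop an UPDATE from changing a key value, so immutability is a design rule that the application or additional constraints have to uphold.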
Since a database can grow to a very large number of entries, the range of possible key values must be large enough that it is never exhausted. For this reason, most generated keys are numeric. The range of a numeric key is not truly infinite, because it is limited by the size of the column’s data type, but a 64-bit integer allows more than nine quintillion distinct values, which is far more than a practical system will consume. Sometimes a key is built from information that is not unique by itself, with a unique identifier added to make sure the key is viable.
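How long a numeric key lasts can be estimated with rough arithmetic. The sketch below assumes the key is stored as a signed fixed-width integer, which has a finite range, and that rows are inserted at a constant rate. The function names and the insert rates are illustrative assumptions, not measurements.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def max_key(bits):
    """Largest value a signed integer of the given width can hold."""
    return 2 ** (bits - 1) - 1


def years_until_exhausted(bits, inserts_per_second):
    """Years before a counter of the given width runs out at a constant insert rate."""
    return max_key(bits) / inserts_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(max_key(32), years_until_exhausted(32, 1_000))
    print(max_key(64), years_until_exhausted(64, 1_000_000))
```

Under these assumed rates, a 32-bit key is used up in under a month at 1,000 inserts per second, while a 64-bit key lasts hundreds of thousands of years even at a million inserts per second.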
Databases use a primary key as a way of organizing data. Since the key is never repeated, it allows the database to keep every record separate from every other. Each piece of information in a record is connected back to the key, including related data stored in other tables, which refers to the record by its key value. That way, information that is spread across the system can always be matched back to the record it belongs to.
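The way information is connected back to the key can be sketched with two tables in Python's built-in sqlite3 module. The `customers` and `orders` tables and the sample rows are hypothetical. Two customers deliberately share a name, so only the key can tell their orders apart.

```python
import sqlite3


def build_demo():
    """Create customers and orders, with each order pointing at a customer key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this check off by default
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(customer_id),"
        " item TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (customer_id, name) VALUES (?, ?)",
        [(1, "Ann Lee"), (2, "Ann Lee")],
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
        [(1, "lamp"), (2, "desk"), (1, "chair")],
    )
    return conn


def items_for(conn, customer_id):
    """Collect the items that belong to one customer by following the key."""
    rows = conn.execute(
        "SELECT o.item FROM customers c"
        " JOIN orders o ON o.customer_id = c.customer_id"
        " WHERE c.customer_id = ? ORDER BY o.order_id",
        (customer_id,),
    )
    return [item for (item,) in rows]


if __name__ == "__main__":
    conn = build_demo()
    print(items_for(conn, 1))
    print(items_for(conn, 2))
```

Because each order stores the customer's key rather than the customer's name, the orders of the two identically named customers never get mixed up, and an order that points at a nonexistent key is rejected.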
Assigning a meaningful primary key is often seen as a better practice than auto-generating a value, because it gives the record an identifier that both works as a key and provides data. In small databases, this distinction is rarely important, but in large systems, the extra space used by a generated key column and its index can add up to noticeable database bloat. This can both slow the system down and make the database require more storage space.