In computer programming, serialization is the process of converting a data structure held in memory into a stream of bytes that can be transmitted over a network or written to disk, then reassembled and used by another program. Serialization can also be used to save the state of an object so the same program can reload it later. A more complex use is the remote procedure call (RPC), in which serialized arguments are sent across a network so that a procedure runs on another computer. The same mechanism allows data objects to be distributed across a large networked system.
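As a rough illustration of the RPC idea, the following Python sketch serializes a call into bytes and dispatches it on a simulated "remote" side. Everything here is hypothetical: the `add` procedure, the `PROCEDURES` table, and the JSON message layout are assumptions made for the example, and the network hop is replaced by a plain function call.

```python
import json


def add(a, b):
    """A procedure that the hypothetical remote side exposes."""
    return a + b


# Table of callable procedures on the "remote" side (illustrative only).
PROCEDURES = {"add": add}


def encode_call(name, *args) -> bytes:
    """Serialize a procedure name and its arguments into a byte stream."""
    return json.dumps({"proc": name, "args": list(args)}).encode("utf-8")


def handle_call(request: bytes) -> bytes:
    """Deserialize the request, run the named procedure, serialize the result."""
    message = json.loads(request.decode("utf-8"))
    result = PROCEDURES[message["proc"]](*message["args"])
    return json.dumps({"result": result}).encode("utf-8")


request = encode_call("add", 2, 3)
response = handle_call(request)  # in a real system these bytes would cross a network
result = json.loads(response.decode("utf-8"))["result"]
```

Both the request and the response travel as bytes, so the caller and the procedure need to agree only on the message format, not on each other's memory layout.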
Nearly every modern programming language has either native support for serialization or a library that adds it. When an object is serialized, its fields are flattened into a one-dimensional sequence of bytes that can be written to any output stream. The type of output stream does not matter; it could be a file or a network socket. This process is also known as marshalling or, less commonly, deflating.
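The flattening step can be sketched in Python as follows. This is a minimal example under stated assumptions: the `Point` class is hypothetical, JSON is just one of many possible encodings, and an in-memory `io.BytesIO` buffer stands in for a file or socket.

```python
import io
import json
from dataclasses import dataclass, asdict


@dataclass
class Point:
    # Hypothetical example class, not taken from any particular library.
    x: int
    y: int
    label: str


def serialize(obj) -> bytes:
    """Flatten the object's fields into a one-dimensional sequence of bytes."""
    return json.dumps(asdict(obj), sort_keys=True).encode("utf-8")


def write_to(stream, obj) -> int:
    """Write the serialized bytes to any binary stream; return the byte count."""
    payload = serialize(obj)
    stream.write(payload)
    return len(payload)


point = Point(x=3, y=4, label="origin offset")
buffer = io.BytesIO()  # stands in for a file or a network socket
written = write_to(buffer, point)
```

Because `write_to` relies only on the stream's `write` method, the same function should work with a file opened in binary mode or with the file-like object returned by a socket's `makefile("wb")`.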
Once the data has been serialized and delivered to its destination, it can be deserialized. The program that reads the byte stream restores the information and places it in a new instance of the original class, creating a copy with the same state. It is important to understand that only the data the object was holding is marshalled; the object's methods and other implementation details are not. The program that deserializes the data must therefore be able to create an instance of the class that was originally serialized.
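A small Python sketch can make the data-versus-methods distinction concrete. The `Account` class and the JSON encoding are assumptions chosen for illustration; the point is that the stream carries field values only, and the reader must already have the class definition to rebuild a usable object.

```python
import json


class Account:
    # Hypothetical class; both the writer and the reader need this definition.
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def describe(self):
        return f"{self.owner}: {self.balance}"


def serialize(account) -> bytes:
    """Write only the instance's data; the method code stays in the class."""
    return json.dumps(vars(account)).encode("utf-8")


def deserialize(data: bytes) -> Account:
    """Rebuild a new Account instance from the field values in the stream."""
    fields = json.loads(data.decode("utf-8"))
    return Account(**fields)


original = Account("alice", 120)
stream = serialize(original)
copy = deserialize(stream)
```

Here `copy` is a distinct object that holds the same state as `original`, and `copy.describe()` works only because `deserialize` has access to the `Account` class; nothing about `describe` is present in `stream`.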
Data structure serialization serves a variety of purposes. Object state can be written to persistent storage so that every object can be restored to the state it was in when program execution halted. Serialized messages can be sent to another computer to cause a remote procedure to run. Serialization can even be used to compare state changes efficiently in real-time applications, for example by comparing the serialized form of an object before and after an update.
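One possible way to detect state changes through serialization is to compare digests of the serialized form. The Python sketch below assumes the state is a JSON-compatible dictionary and uses SHA-256; the `snapshot` helper and the sample state are hypothetical, and other encodings or hash functions would serve equally well.

```python
import hashlib
import json


def snapshot(state: dict) -> str:
    """Return a digest of the canonically serialized state."""
    # sort_keys gives a stable byte sequence regardless of key insertion order.
    payload = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


state = {"position": [10, 20], "health": 100}
before = snapshot(state)

# Same content with keys in a different order yields the same digest.
unchanged = snapshot({"health": 100, "position": [10, 20]})

state["health"] = 95
after = snapshot(state)
```

Comparing two short digests is cheaper than walking two object graphs field by field, at the cost of serializing the state each time a snapshot is taken.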
Before using object serialization, it is important to understand the limitations it imposes. The most important is that converting an object into a byte stream exposes fields that are declared private. If the stream is captured during transmission, this data can be decoded, which creates a security hole. Most languages allow the serialization format to be customized (externalized), so a proprietary encoding can be used to help mitigate this risk.
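The exposure of private data can be demonstrated with a short Python sketch. Python has no enforced private fields, so a name-mangled attribute is used here as the nearest analogue; the `Card` class, its `__pin` attribute, and the JSON encoding are all assumptions made for the example.

```python
import json


class Card:
    # Hypothetical class holding a value that callers are not meant to read.
    def __init__(self, holder, pin):
        self.holder = holder
        self.__pin = pin  # name-mangled to _Card__pin; private by convention


def serialize(obj) -> bytes:
    """Naively serialize every instance attribute, private ones included."""
    return json.dumps(vars(obj)).encode("utf-8")


stream = serialize(Card("alice", "4921"))

# Anyone who captures the stream can decode it without access to the class.
captured = json.loads(stream.decode("utf-8"))
```

The access restriction exists only inside the running program; once the object is flattened, the PIN sits in the byte stream in plain form.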
Another factor to bear in mind is that serialization will, in general, work only when the class used for deserialization is exactly the same as the class that was serialized. If new fields or methods are added to a class, the signature of the class changes. Attempting to restore an object stored under the old signature will then raise an exception, and the data will remain unrecoverable until it is read by a program that uses the original, unmodified class definition.
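The signature check described above can be approximated in Python as follows. This is a simplified sketch, not how any particular language implements it: the `signature` helper is hypothetical and derives the signature from field names only (methods are ignored), and `UserV1` and `UserV2` are invented to represent the same class before and after a field is added.

```python
import json
from dataclasses import dataclass, asdict, fields


class SignatureMismatch(Exception):
    """Raised when stored data was written by a different class definition."""


def signature(cls) -> str:
    """Derive a simple signature from the class's field names (assumption)."""
    return ",".join(sorted(f.name for f in fields(cls)))


@dataclass
class UserV1:
    name: str


@dataclass
class UserV2:  # the "same" class after a field was added
    name: str
    email: str


def serialize(obj) -> bytes:
    """Store the class signature alongside the object's data."""
    envelope = {"signature": signature(type(obj)), "data": asdict(obj)}
    return json.dumps(envelope).encode("utf-8")


def deserialize(data: bytes, cls):
    """Restore an object only if the stored signature matches the class."""
    envelope = json.loads(data.decode("utf-8"))
    if envelope["signature"] != signature(cls):
        raise SignatureMismatch(f"stored data does not match {cls.__name__}")
    return cls(**envelope["data"])


stored = serialize(UserV1("alice"))
restored = deserialize(stored, UserV1)  # original definition: succeeds

try:
    deserialize(stored, UserV2)  # modified definition: rejected
    mismatch_detected = False
except SignatureMismatch:
    mismatch_detected = True
```

Reading the stored bytes with the original `UserV1` definition succeeds, while the modified `UserV2` definition is rejected, which mirrors the behavior described in the paragraph above.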