A data set is a collection of related data, usually gathered from a single source. The term has several applications, from information compiled from survey results to sets of scientific research results. In computing, a data set is a group of values, stored as bytes, that is often displayed as a table whose columns categorize the data into subsets. There are several kinds of data sets, including sequential, partitioned, and Virtual Storage Access Method (VSAM) data sets.
Data sets provide insight into a particular theme or concept. They also store the information that applications or operating systems need to function correctly. Typical contents include macro libraries, source programs, and system parameters or variables. Data sets can be cataloged so that they can be referred to by an easily understood name without reference to the specific storage area.
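The idea of cataloging can be pictured as a simple lookup table from a name to a storage location. The following is only a minimal sketch under assumed names: `Location`, `catalog_data_set`, `locate`, the volume label and the data set name are all hypothetical, and a real system catalog holds far more detail.

```python
from typing import Dict, NamedTuple


class Location(NamedTuple):
    """Hypothetical description of where a data set physically lives."""
    volume: str
    start_track: int


# The catalog maps an easily understood name to a storage location.
catalog: Dict[str, Location] = {}


def catalog_data_set(name: str, location: Location) -> None:
    """Register a data set under a name."""
    catalog[name] = location


def locate(name: str) -> Location:
    """Find a data set by name alone; the caller never supplies the location."""
    return catalog[name]


catalog_data_set("PAYROLL.SOURCE", Location(volume="VOL001", start_track=120))
found = locate("PAYROLL.SOURCE")
```

A program that asks for `PAYROLL.SOURCE` does not need to know which volume or track holds it. If the data set is moved, only the catalog entry has to change.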
Programs that manage databases of information, such as insurance or medical records, can also use data sets. The program running on the system stores its information in the data sets. Some of these data sets contain readable text that can be turned into reports. The information is stored as records, which are the basic units of information, and records are typically categorized by a single identifier, such as a customer or patient name.
Data sets are organized according to the quantity of data they hold and the frequency and method by which that data will be accessed. The format of an individual data set also depends on the intended use of the information. The different kinds of data sets are distinct but share many features.
Sequential data sets store information in consecutive order. This method is used most often for information that is organized numerically or alphabetically. To access an item in a sequential data set, the system must pass through all the items that precede it in the stored order.
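The cost of sequential access can be illustrated with a short sketch. It assumes a hypothetical record layout of (key, payload) pairs and made-up customer data. It is not how any particular system implements sequential reads, only a picture of the idea that every earlier item is visited first.

```python
from typing import Iterable, Optional, Tuple

Record = Tuple[str, str]  # hypothetical layout: (key, payload)


def sequential_read(records: Iterable[Record], wanted_key: str) -> Tuple[Optional[str], int]:
    """Scan records in stored order.

    Returns the matching payload (or None) and the number of records examined.
    """
    examined = 0
    for key, payload in records:
        examined += 1
        if key == wanted_key:
            return payload, examined
    return None, examined


# Illustrative data, stored alphabetically by customer name.
customers = [
    ("ADAMS", "policy 1001"),
    ("BAKER", "policy 1002"),
    ("CHEN", "policy 1003"),
    ("DIAZ", "policy 1004"),
]

payload, examined = sequential_read(customers, "CHEN")
```

Here the third record cannot be reached without first passing over the first two, so `examined` grows with the position of the item in the data set.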
Partitioned data sets allow for more direct access to items. This method is used when there are large quantities of information, such as an extensive database of addresses or client information. These data sets are also known as libraries. A partitioned data set is divided into members, and a directory records where each member begins, so a member can be retrieved by name without reading the ones before it. Within each member, the information is organized much as it is in a sequential data set.
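One way to picture the more direct access of a partitioned data set is a storage area paired with a directory that maps each member name to where that member's records start. The class below is a simplified sketch under that assumption. `PartitionedDataSet`, `add_member` and `read_member` are hypothetical names, not a real system interface.

```python
from typing import Dict, List, Tuple


class PartitionedDataSet:
    """Toy model: members stored back to back, located through a directory."""

    def __init__(self) -> None:
        self._storage: List[str] = []  # all member records, one after another
        self._directory: Dict[str, Tuple[int, int]] = {}  # name -> (start, length)

    def add_member(self, name: str, records: List[str]) -> None:
        """Append a member's records and note its position in the directory."""
        self._directory[name] = (len(self._storage), len(records))
        self._storage.extend(records)

    def read_member(self, name: str) -> List[str]:
        """Jump straight to a member without scanning the members before it."""
        start, length = self._directory[name]
        return self._storage[start:start + length]


library = PartitionedDataSet()
library.add_member("ADDRESSES", ["1 Main St", "22 Oak Ave"])
library.add_member("CLIENTS", ["Adams", "Baker", "Chen"])

clients = library.read_member("CLIENTS")
```

Reading `CLIENTS` uses the directory entry to skip over `ADDRESSES` entirely, while the records inside the member are still read in order, much like a small sequential data set.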
The Virtual Storage Access Method (VSAM) is an access method that supports several types of data set, one of the most widely used being the key-sequenced data set (KSDS). In a KSDS, each record is stored with a key, and an index of those keys lets the system locate a record without reading through the others. This system is best for data sets that are accessed frequently and in an unpredictable order.
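Keyed access of this kind can be sketched with a sorted list of keys and a binary search. This is an illustration only: the `KeyedDataSet` class and its methods are hypothetical, and the sorted list stands in for the index, whose real structure is more elaborate.

```python
import bisect
from typing import List, Optional


class KeyedDataSet:
    """Toy model of key-sequenced storage: records kept in key order."""

    def __init__(self) -> None:
        self._keys: List[str] = []     # sorted keys, acting as the index
        self._records: List[str] = []  # records, parallel to the keys

    def put(self, key: str, record: str) -> None:
        """Insert a record at its key-ordered position, or replace an existing one."""
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            self._records[pos] = record
        else:
            self._keys.insert(pos, key)
            self._records.insert(pos, record)

    def get(self, key: str) -> Optional[str]:
        """Locate a record by binary search on the keys instead of a full scan."""
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._records[pos]
        return None

    def keys_in_order(self) -> List[str]:
        """Return the keys in stored (sorted) order."""
        return list(self._keys)


patients = KeyedDataSet()
for name, record in [("NGUYEN", "chart 7"), ("ABBOTT", "chart 3"), ("LOPEZ", "chart 5")]:
    patients.put(name, record)

chart = patients.get("LOPEZ")
```

Because the keys stay sorted, a lookup needs only a handful of comparisons however the requests arrive, which suits frequent access in an unpredictable order.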