Natural language processing (NLP) is a field concerned with bridging human languages and the formal representations that computers work with. Its goal is to allow computers to work out what a text says without being handed precise values and equations for the data that the text contains. In essence, natural language processing automates the translation between human language and computer-readable form. Much of the field relies on statistics and models to determine the likely meaning of a phrase, but many different approaches to the problem exist and have existed. Findings in this field have applications in speech recognition, translation between human languages, information retrieval, and artificial intelligence more broadly.
Natural language processing grew out of computer science and linguistics, and it faces many problems because language is not always consistent and not all clues to meaning are contained in the language itself. Even a complete account of a language's grammar, including every exception, does not always allow a computer to parse the information contained in a text. Some sentences are syntactically ambiguous, words often have more than one meaning, and some sequences of sounds or symbols change their meaning depending on where the word boundaries fall. All of these can be problems for a computer that does not understand context. More importantly, much of language depends on a connection to the physical and social world: some sentences, such as speech acts, do not so much convey information as act on the world. Even a computer with a perfect grasp of human syntax and semantics can reliably determine what a text means only if the text is free of devices such as sarcasm or passive aggression, where the intended meaning differs from the literal one.
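The word-boundary problem mentioned above can be made concrete with a short sketch. The vocabulary and the input string below are hypothetical and chosen only for illustration; a real system would use a full lexicon and would rank the candidates by likelihood rather than simply list them.

```python
# Toy vocabulary (an assumption for this example, not a real lexicon).
VOCAB = {"i", "am", "no", "now", "here", "where", "nowhere"}


def segmentations(text, vocab):
    """Return every way to split `text` into a sequence of words from `vocab`."""
    if not text:
        # One way to segment the empty string: with no words at all.
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in vocab:
            # Keep this prefix as a word and segment the remainder recursively.
            for rest in segmentations(text[end:], vocab):
                results.append([prefix] + rest)
    return results


if __name__ == "__main__":
    for words in segmentations("iamnowhere", VOCAB):
        print(" ".join(words))
```

The same string of symbols yields three readings ("i am no where", "i am now here", "i am nowhere"), and nothing in the string itself says which one the writer intended. Choosing among them requires context, which is exactly what the computer lacks.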
Conceptually, natural language processing is an approach to human-computer interaction built on the idea that most computer users are more comfortable working in a human language they already know than adapting to a computer's language. It also capitalizes on the fact that much of human knowledge is already encoded in human language, and that the texts containing this knowledge can be translated into logical structures that a computer can process efficiently. While many projects in this field work to extract computer-readable data from human language texts, natural language processing is also used to generate human-readable text from computer data. A single technology can combine understanding and generation, as in applications that translate from one human language to another by first decoding the text into a computer-readable representation and then encoding that representation in the other language. The innovations produced by natural language processing are also highly applicable to artificial intelligence projects, because human-like intelligence is defined to a large degree by mastery of the complexities of human language.
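The decode-then-encode approach to translation described above can be sketched in a few lines. Everything here is an assumption made for illustration: the tiny lexicons, the single sentence pattern ("the NOUN VERB the NOUN"), the function names, and the choice of French as the target language. It shows the shape of the idea, not how any production translation system works.

```python
# Toy lexicons (hypothetical). English words map to language-neutral concept
# labels, and concept labels map to French words.
EN_NOUNS = {"cat": "CAT", "dog": "DOG"}
EN_VERBS = {"sees": "SEE", "chases": "CHASE"}
FR_NOUNS = {"CAT": "le chat", "DOG": "le chien"}
FR_VERBS = {"SEE": "voit", "CHASE": "poursuit"}


def decode_english(sentence):
    """Understanding step: turn 'the NOUN VERB the NOUN' into a structured frame."""
    tokens = sentence.lower().rstrip(".").split()
    if len(tokens) != 5 or tokens[0] != "the" or tokens[3] != "the":
        raise ValueError("only sentences of the form 'the X VERB the Y' are supported")
    _, subject, verb, _, obj = tokens
    try:
        return {
            "agent": EN_NOUNS[subject],
            "action": EN_VERBS[verb],
            "patient": EN_NOUNS[obj],
        }
    except KeyError as exc:
        raise ValueError(f"unknown word: {exc.args[0]}") from None


def encode_french(frame):
    """Generation step: render a structured frame as a French sentence."""
    return " ".join(
        [FR_NOUNS[frame["agent"]], FR_VERBS[frame["action"]], FR_NOUNS[frame["patient"]]]
    )


def translate(sentence):
    """Translate by decoding to the intermediate frame, then encoding it."""
    return encode_french(decode_english(sentence))


if __name__ == "__main__":
    print(decode_english("The cat sees the dog."))
    print(translate("The cat sees the dog."))
```

The intermediate frame is the computer-readable data: the decoding step alone is an instance of extracting structure from text, and the encoding step alone is an instance of generating text from structure. Keeping the two steps separate is a design choice that lets either side be swapped for another language without touching the other.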