Word sense disambiguation (WSD) lies at the core of software designed to interpret language. Ambiguous words or sentences can be understood in multiple ways, though only one meaning is intended. Disambiguation seeks to identify that intended meaning. The problem is extremely challenging for programmers who design interfaces between spoken and written language, and for those who build computer-generated translation systems.
Software designed to convert speech-to-text can “listen” to a user speaking into a microphone and translate spoken words into typed sentences. The user dictates punctuation, interjecting words like “comma” and “period” where appropriate. This sounds pretty straightforward except that many words sound exactly alike.
For example, “know” and “no,” or “I” and “eye,” are phonetically indistinguishable. Word sense disambiguation helps to transcribe “I should know by next week” correctly by applying what is basically a set of “if, then” rules that treat word placement and adjacent words as indicators of the intended word. This type of word sense disambiguation is known as the “shallow approach.” It is fairly accurate, but it can’t always be counted on.
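The “if, then” idea can be illustrated with a minimal Python sketch. The rule lists, the `<NO>` placeholder for the ambiguous sound, and the function names are all hypothetical choices made for this illustration; a real speech-to-text system would use far larger rule sets or statistical models.

```python
# Placeholder token standing for a sound that could be "know" or "no".
# (Hypothetical convention for this sketch only.)
AMBIGUOUS = "<NO>"

# Hand-picked cue words, assumed for illustration.
CUES_BEFORE_KNOW = {"i", "you", "we", "they", "should", "would", "could", "will", "to"}
CUES_AFTER_KNOW = {"that", "how", "what", "if", "by"}


def choose_know_or_no(prev_word, next_word):
    """Pick a spelling from the words on either side of the ambiguous sound."""
    prev_word = (prev_word or "").lower()
    next_word = (next_word or "").lower()
    # IF the previous word is a pronoun or modal verb, THEN expect the verb "know".
    if prev_word in CUES_BEFORE_KNOW:
        return "know"
    # IF the next word typically follows the verb, THEN expect "know".
    if next_word in CUES_AFTER_KNOW:
        return "know"
    # Otherwise fall back to "no".
    return "no"


def transcribe(tokens):
    """Replace every ambiguous placeholder using the adjacent-word rules."""
    result = []
    for i, token in enumerate(tokens):
        if token == AMBIGUOUS:
            prev_word = tokens[i - 1] if i > 0 else None
            next_word = tokens[i + 1] if i + 1 < len(tokens) else None
            result.append(choose_know_or_no(prev_word, next_word))
        else:
            result.append(token)
    return result


print(" ".join(transcribe(["I", "should", AMBIGUOUS, "by", "next", "week"])))
print(" ".join(transcribe(["there", "is", AMBIGUOUS, "time"])))
```

The fallback to “no” shows why such rules are only fairly accurate: any context the rule writer did not anticipate gets the default answer, right or wrong.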
Another approach is to apply “world knowledge,” or what computational linguists call the “deep approach.” This approach relies on lexicons such as dictionaries and thesauri to help determine a word’s proper sense. Unfortunately, designing a deep approach database that is comprehensive enough to provide better accuracy than the shallow approach is not an easy task.
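One simple way to use a lexicon is to compare the words around an ambiguous term with the dictionary definition of each of its senses and pick the sense that shares the most words. The sketch below assumes a tiny hand-written lexicon; the definitions, the stop-word list, and the names `LEXICON` and `pick_sense` are invented for illustration and do not come from any real dictionary or library. The technique is a simplified form of definition-overlap matching, not a full world-knowledge system.

```python
import re

# Toy lexicon with made-up definitions (assumption for this sketch).
LEXICON = {
    "bass": {
        "fish": "a freshwater or sea fish caught by anglers in a lake or river",
        "music": "the lowest part in music or an instrument such as a guitar that plays low notes",
    }
}

# Small stop-word list so that common words do not count as evidence.
STOPWORDS = {"a", "an", "the", "in", "or", "of", "by", "that", "such", "as", "is", "and", "to"}


def content_words(text):
    """Lowercase the text and return its set of non-stop-words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def pick_sense(word, context):
    """Return the sense whose definition overlaps most with the context, or None."""
    context_set = content_words(context)
    scores = {
        sense: len(context_set & content_words(definition))
        for sense, definition in LEXICON[word].items()
    }
    best = max(scores, key=scores.get)
    # With no overlap at all, the lexicon gives no basis for a decision.
    return best if scores[best] > 0 else None


print(pick_sense("bass", "He caught a bass in the lake"))
print(pick_sense("bass", "She plays bass guitar"))
print(pick_sense("bass", "The bass is heavy"))
```

The third call returns `None`, which illustrates the coverage problem: a lexicon helps only when its definitions happen to mention words that appear in the context.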
Software that reads text aloud (text-to-speech) also requires word sense disambiguation. The word “bass,” for example, might mean a musical instrument, a note, or a fish. In the last case it is pronounced differently, leaving it to WSD to deduce which pronunciation to use. If the typed sentence happens to be “The bass is heavy,” only a scan of surrounding sentences might reveal clues, such as the words “fishing,” “boat,” or “dock,” or conversely “band,” “music,” or “song.” If the program’s word sense disambiguation is not robust enough, or if additional clues are absent, the program can choose the wrong pronunciation.
In addition to the “if, then” rules of the shallow approach, other algorithms are used to determine correct interpretations. In the above example, an algorithm might find key words throughout the document that clearly point to a musical interpretation, or vice versa. Other approaches used in WSD are basically refinements or extensions of these basic approaches.
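A document-wide key word scan for the “bass” example might look like the following sketch. The cue words are the ones mentioned in the example; the counting scheme, the pronunciation hints, and the function names are assumptions made for illustration.

```python
import re

# Cue words taken from the example; a real system would need many more.
FISH_CUES = {"fishing", "boat", "dock"}
MUSIC_CUES = {"band", "music", "song"}

# Pronunciation hints a text-to-speech engine could map to phonemes.
PRONUNCIATION = {
    "fish": "rhymes with 'pass'",
    "music": "rhymes with 'face'",
}


def bass_sense(document):
    """Count cue words across the whole document and return the winning sense."""
    words = re.findall(r"[a-z]+", document.lower())
    fish_votes = sum(word in FISH_CUES for word in words)
    music_votes = sum(word in MUSIC_CUES for word in words)
    if fish_votes > music_votes:
        return "fish"
    if music_votes > fish_votes:
        return "music"
    # A tie, including zero clues on both sides, leaves the word ambiguous.
    return None


def bass_pronunciation(document):
    """Return a pronunciation hint, or None when the document gives no clear clue."""
    sense = bass_sense(document)
    return PRONUNCIATION.get(sense)


print(bass_sense("We took the boat out from the dock. The bass is heavy."))
print(bass_sense("The band started the song. The bass is heavy."))
print(bass_sense("The bass is heavy."))
```

Returning `None` on a tie makes the failure case explicit, so a caller can fall back to a default pronunciation instead of silently guessing.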
Word sense disambiguation is also vital in verbal command interfaces designed to replace the keyboard, not just for relaying simple operating system commands but for complex tasks such as searching the Web. Other areas where WSD plays a role include development of the Semantic Web and improved artificial intelligence models. Indeed, any area of science that relies on a linguistic bridge between human and machine will use word sense disambiguation.