Natural language processing (NLP) can be defined as the automatic (or semi-automatic) processing of human language.
What is NLP in other terms…
The term ‘NLP’ is sometimes used rather more narrowly than that, often excluding information retrieval and sometimes even excluding machine translation. NLP is sometimes contrasted with ‘computational linguistics’, with NLP being thought of as more applied.
Nowadays, alternative terms are often preferred, like ‘Language Technology’ or ‘Language Engineering’.
‘Language’ is often used in contrast with ‘speech’ (e.g., Speech and Language Technology), but I’m going to simply refer to NLP and use the term broadly.
NLP is essentially multidisciplinary: it is closely related to linguistics (although the extent to which NLP overtly draws on linguistic theory varies considerably).
It also has links to research in cognitive science, psychology, philosophy and maths (especially logic). Within CS, it relates to formal language theory, compiler techniques, theorem proving, machine learning and human-computer interaction. Of course it is also related to AI, though nowadays it’s not generally thought of as part of AI.
Some linguistic terminology
The course is organised so that there are six lectures corresponding to different NLP subareas, moving from relatively ‘shallow’ processing to areas which involve meaning and connections with the real world. These subareas loosely correspond to some of the standard subdivisions of linguistics:
- Morphology: the structure of words. For instance, ‘unusually’ can be thought of as composed of a prefix ‘un-’, a stem ‘usual’, and a suffix ‘-ly’. Similarly, ‘composed’ is ‘compose’ plus the inflectional affix ‘-ed’: a spelling rule means we end up with ‘composed’ rather than ‘composeed’.
- Syntax: the way words are used to form phrases. For example, it is part of English syntax that a determiner such as ‘the’ comes before a noun, and also that determiners are obligatory with certain singular nouns.
- Semantics: compositional semantics is the construction of meaning (generally expressed as logic) based on syntax. This is contrasted with lexical semantics, i.e., the meaning of individual words.
- Pragmatics: meaning in context.
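To make the morphology examples above concrete, here is a minimal sketch in Python. It is a toy, not a real morphological analyser: the function names (`attach_suffix`, `segment`), the tiny affix lists, and the single e-deletion spelling rule are all assumptions made for illustration, and the rule is a deliberate simplification of English spelling.

```python
# Toy morphological segmentation. Affix lists and the spelling rule are
# illustrative assumptions, far too small for real English.

PREFIXES = ("un",)
SUFFIXES = ("ly", "ed")


def attach_suffix(stem: str, suffix: str) -> str:
    """Join a stem and a suffix, applying a simplified e-deletion rule."""
    # Simplified spelling rule: if the stem ends in 'e' and the suffix
    # starts with 'e', keep only one 'e' (compose + ed -> composed).
    if stem.endswith("e") and suffix.startswith("e"):
        return stem + suffix[1:]
    return stem + suffix


def segment(word: str, stems: tuple):
    """Return (prefix, stem, suffix) if the word can be built from a known
    stem and the toy affix lists, otherwise None."""
    for prefix in ("",) + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            for stem in stems:
                # Generate the surface form and compare it with the input.
                if attach_suffix(stem, suffix) == rest:
                    return (prefix, stem, suffix)
    return None


STEMS = ("usual", "compose")  # hypothetical stem lexicon
print(segment("unusually", STEMS))
print(segment("composed", STEMS))
```

The sketch works by generation: it builds candidate surface forms from a stem and a suffix and checks them against the input, so the spelling rule only has to be stated in one direction. With the hypothetical lexicon above, ‘unusually’ is split into ‘un-’, ‘usual’ and ‘-ly’, and ‘composed’ into ‘compose’ and ‘-ed’.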