Natural language is context-sensitive and therefore difficult to parse, in contrast to programming languages, whose grammars are largely context-free.
Natural Language Processing
Natural language processing is done in several steps:
- Tokenization: Separate the text into individual words
- Tagging: Detect word type (Noun, Verb, etc.)
- Chunking: Group words into phrases
- Extraction: Analyze meaning
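The first step can be sketched with the standard library alone. This is a minimal illustration, not a production tokenizer: the `tokenize` helper and its regular expression are assumptions made for this example, and real tokenizers treat contractions, abbreviations, and hyphenated words far more carefully.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("The little dog will fly.")
print(tokens)  # ['The', 'little', 'dog', 'will', 'fly', '.']
```

Punctuation is kept as separate tokens here so that later steps can still see sentence boundaries.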
Part-of-Speech (POS) Tagging
|Tag|Word type|Examples|
|---|---|---|
|IN|Preposition or subordinating conjunction|at, on, if|
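As a rough illustration of tagging, the sketch below assigns tags by dictionary lookup. The `LEXICON` contents and the `pos_tag` helper are hypothetical and exist only for this example: only the `IN` entries come from the table above, while the other tags (`DT`, `JJ`, `NN`, `MD`, `VB`) are assumed Penn Treebank-style labels. Practical taggers use statistical models and the surrounding context rather than a fixed word list.

```python
# Hypothetical toy lexicon: word -> tag. Only the IN entries come from the
# table above; the remaining tags are assumed for illustration.
LEXICON = {
    "the": "DT", "little": "JJ", "dog": "NN",
    "will": "MD", "fly": "VB",
    "at": "IN", "on": "IN", "if": "IN",
}

def pos_tag(tokens, default="NN"):
    """Tag each token by dictionary lookup; unknown words get `default`."""
    return [(tok, LEXICON.get(tok.lower(), default)) for tok in tokens]

print(pos_tag(["The", "dog", "will", "fly", "on", "Monday"]))
# [('The', 'DT'), ('dog', 'NN'), ('will', 'MD'), ('fly', 'VB'), ('on', 'IN'), ('Monday', 'NN')]
```

Falling back to a noun tag for unknown words is a common baseline choice, but it is a simplification: a lookup table cannot disambiguate words such as "fly", which can be a noun or a verb.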
During chunking, the words are tagged for each type of phrase (e.g. noun phrase) with one of 3 tags:
- B if the word begins a phrase,
- I if the word belongs to the phrase (the words following B get this tag), or
- O for all other words.
|Tag|Phrase type|Example|
|---|---|---|
|NP|Noun Phrase|the little dog|
|VP|Verb Phrase|will fly|
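The B/I scheme described above can be sketched as follows for the two phrase types in the table. This is a deliberately simplified, rule-based illustration: the `PHRASE_OF` mapping and the `iob_tags` helper are assumptions made for this example, the input POS tags are assumed Penn Treebank-style labels, and words outside any phrase are labelled `O`, following the common IOB convention. Real chunkers use grammars or trained models rather than a fixed tag-to-phrase mapping.

```python
# Hypothetical mapping from POS tag to the phrase type it may belong to.
PHRASE_OF = {"DT": "NP", "JJ": "NP", "NN": "NP", "MD": "VP", "VB": "VP"}

def iob_tags(tagged):
    """Convert (word, pos) pairs into (word, IOB label) pairs."""
    result = []
    previous = None  # phrase type of the previous word, if any
    for word, pos in tagged:
        phrase = PHRASE_OF.get(pos)
        if phrase is None:
            label = "O"              # word belongs to no phrase
        elif phrase == previous:
            label = "I-" + phrase    # word continues the current phrase
        else:
            label = "B-" + phrase    # word begins a new phrase
        result.append((word, label))
        previous = phrase
    return result

sentence = [("the", "DT"), ("little", "JJ"), ("dog", "NN"),
            ("will", "MD"), ("fly", "VB"), (".", ".")]
for word, label in iob_tags(sentence):
    print(word, label)
# the B-NP
# little I-NP
# dog I-NP
# will B-VP
# fly I-VP
# . O
```

One limitation of this sketch is that two adjacent phrases of the same type are merged into one, because a new phrase only starts when the phrase type changes.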