Natural Language#
The natural language is a context sensitive language and therefore difficult to parse (in contrast to context free programming languages).
Natural Language Processing#
Is done in several steps.
- Tokenize: Separate individual words
- Tagging: Detect word type (Noun, Verb, etc.)
- Chunking: Group words into phrases
- Extraction: Analyze meaning
Part of Speech (POS) Tagging#
| Tag | Description | Example | 
|---|---|---|
| DT | Article | the, a | 
| NN | Noun | dog, car | 
| VB | Verb | fly | 
| JJ | Adjective | little | 
| IN | Preposition | at, on, if | 
| MD | Modal | shall, will | 
| EX | Existential | there | 
Chunking#
For each type of phrase (e.g. noun phrase) the words are tagged with 3
IOB Tags: I-inside, O-outside, B-begin. B if a phrase begins,
following words get I if the word belongs to the phrase, or O for
all other words.
| Chunk | Description | Example | 
|---|---|---|
| NP | Noun Phrase | the little dog | 
| VP | Verb Phrase | will fly | 
| P | Preposition Phrase | to |