
After tokenization, spaCy can **parse** and **tag** a given `Doc`. This is where
the trained pipeline and its statistical models come in: they enable spaCy to
**make predictions** about which tag or label most likely applies in a given
context. A trained component includes binary data that is produced by showing a
system enough examples for it to make predictions that generalize across the
language – for example, a word following "the" in English is most likely a noun.

Linguistic annotations are available as
[`Token` attributes](/api/token#attributes). Like many NLP libraries, spaCy
**encodes all strings to hash values** to reduce memory usage and improve
efficiency. An attribute such as `token.pos` therefore holds an integer ID. To
get its readable string representation, append an underscore `_` to the
attribute name, as in `token.pos_`:

```python
### {executable="true"}
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

for token in doc:
    print(token.text, token.lemma_, token.pos_, token.tag_, token.dep_,
            token.shape_, token.is_alpha, token.is_stop)
```

> - **Text:** The original word text.
> - **Lemma:** The base form of the word.
> - **POS:** The simple [UPOS](https://universaldependencies.org/u/pos/)
>   part-of-speech tag.
> - **Tag:** The detailed part-of-speech tag.
> - **Dep:** Syntactic dependency, i.e. the relation between tokens.
> - **Shape:** The word shape – capitalization, punctuation, digits.
> - **is alpha:** Does the token consist of alphabetic characters?
> - **is stop:** Is the token part of a stop list, i.e. the most common words of
>   the language?

| Text    | Lemma   | POS     | Tag   | Dep        | Shape   | alpha   | stop    |
| ------- | ------- | ------- | ----- | ---------- | ------- | ------- | ------- |
| Apple   | apple   | `PROPN` | `NNP` | `nsubj`    | `Xxxxx` | `True`  | `False` |
| is      | be      | `AUX`   | `VBZ` | `aux`      | `xx`    | `True`  | `True`  |
| looking | look    | `VERB`  | `VBG` | `ROOT`     | `xxxx`  | `True`  | `False` |
| at      | at      | `ADP`   | `IN`  | `prep`     | `xx`    | `True`  | `True`  |
| buying  | buy     | `VERB`  | `VBG` | `pcomp`    | `xxxx`  | `True`  | `False` |
| U.K.    | u.k.    | `PROPN` | `NNP` | `compound` | `X.X.`  | `False` | `False` |
| startup | startup | `NOUN`  | `NN`  | `dobj`     | `xxxx`  | `True`  | `False` |
| for     | for     | `ADP`   | `IN`  | `prep`     | `xxx`   | `True`  | `True`  |
| \$      | \$      | `SYM`   | `$`   | `quantmod` | `$`     | `False` | `False` |
| 1       | 1       | `NUM`   | `CD`  | `compound` | `d`     | `False` | `False` |
| billion | billion | `NUM`   | `CD`  | `pobj`     | `xxxx`  | `True`  | `False` |
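
The underscore convention can be checked directly. The following is a minimal
sketch rather than part of the original example: it assumes a blank English
pipeline (`spacy.blank("en")`), which only tokenizes, so it sticks to the
lexical attribute `shape`, which needs no trained component. Predicted
attributes such as `pos`/`pos_` or `dep`/`dep_` follow the same pattern once a
trained pipeline has run.

```python
import spacy

# Assumption: a blank pipeline is enough here, because `shape` is a lexical
# attribute computed from the token text rather than predicted by a model.
nlp = spacy.blank("en")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
token = doc[0]

# Without the underscore, the attribute is an integer hash.
print(type(token.shape).__name__, token.shape)
# With the underscore, it is the readable string.
print(token.shape_)

# The vocab's string store maps between the two in both directions.
shape_from_hash = nlp.vocab.strings[token.shape]
hash_from_shape = nlp.vocab.strings[token.shape_]
print(shape_from_hash == token.shape_, hash_from_shape == token.shape)
```

Looking a value up through `nlp.vocab.strings` is mostly useful for debugging;
in everyday code the underscore attributes are the more convenient route.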

> #### Tip: Understanding tags and labels
>
> Most of the tags and labels look pretty abstract, and they vary between
> languages. `spacy.explain` will show you a short description – for example,
> `spacy.explain("VBZ")` returns "verb, 3rd person singular present".
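
As a small illustration of the tip, the sketch below looks up a few of the
labels from the table. It assumes nothing beyond an installed spaCy:
`spacy.explain` reads from a built-in glossary, so no pipeline has to be loaded.
The choice of labels is arbitrary.

```python
import spacy

# Labels taken from the table above; no trained pipeline is needed for this.
for label in ["PROPN", "VBZ", "nsubj", "pobj"]:
    print(f"{label:<6} {spacy.explain(label)}")

vbz_description = spacy.explain("VBZ")
```

For a label that is not in the glossary, `spacy.explain` returns `None`, so it
is worth guarding against that when formatting the result.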

Using spaCy's built-in [displaCy visualizer](/usage/visualizers), here's what
our example sentence and its dependencies look like:

import DisplaCyLongHtml from 'images/displacy-long.html'
import { Iframe } from 'components/embed'

<Iframe title="displaCy visualization of dependencies and entities" html={DisplaCyLongHtml} height={450} />
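
A figure like the one above can be produced with `displacy.render`. The sketch
below is an illustration under stated assumptions, not a record of how the
embedded image was generated: to avoid depending on a downloaded pipeline, it
builds a three-token `Doc` by hand, copying the part-of-speech tags and
dependency labels for "Apple is looking" from the table, and passes
`jupyter=False` so that the SVG markup is returned as a string.

```python
import spacy
from spacy import displacy
from spacy.tokens import Doc

# Assumption: annotations are set by hand, mirroring the first three table rows.
words = ["Apple", "is", "looking"]
pos = ["PROPN", "AUX", "VERB"]
deps = ["nsubj", "aux", "ROOT"]
heads = [2, 2, 2]  # absolute index of each token's head; the root heads itself

vocab = spacy.blank("en").vocab
doc = Doc(vocab, words=words, pos=pos, heads=heads, deps=deps)

# style="dep" draws the dependency arcs; jupyter=False returns the markup.
svg = displacy.render(doc, style="dep", jupyter=False)
print(svg[:60])
```

With a trained pipeline, the hand-built `Doc` would simply be replaced by the
result of `nlp(text)`, and the returned markup could be written to an `.svg`
file.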