Merge pull request #6796 from svlandeg/docs/benchmarks [ci skip]

Ines Montani 2021-01-27 13:01:23 +11:00 committed by GitHub
commit 5d79d1af50
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 27 additions and 3 deletions

@@ -4,13 +4,13 @@ import { Help } from 'components/typography'; import Link from 'components/link'
 | Pipeline                                                   | Parser | Tagger |  NER |
 | ---------------------------------------------------------- | -----: | -----: | ---: |
-| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) |   95.5 |   98.3 | 89.4 |
-| [`en_core_web_lg`](/models/en#en_core_web_lg) (spaCy v3)   |   92.2 |   97.4 | 85.4 |
+| [`en_core_web_trf`](/models/en#en_core_web_trf) (spaCy v3) |   95.2 |   97.8 | 89.9 |
+| [`en_core_web_lg`](/models/en#en_core_web_lg) (spaCy v3)   |   91.9 |   97.4 | 85.5 |
 | `en_core_web_lg` (spaCy v2)                                |   91.9 |   97.2 | 85.5 |
 <figcaption class="caption">
-**Full pipeline accuracy and speed** on the
+**Full pipeline accuracy** on the
 [OntoNotes 5.0](https://catalog.ldc.upenn.edu/LDC2013T19) corpus (reported on
 the development set).

@@ -92,6 +92,30 @@ results. Project template:
 </figure>
+### Speed comparison {#benchmarks-speed}
+We compare the speed of different NLP libraries, measured in words per second
+(WPS) - higher is better. The evaluation was performed on 10,000 Reddit comments.
+<figure>
+| Library | Pipeline | WPS CPU <Help>words per second on CPU, higher is better</Help> | WPS GPU <Help>words per second on GPU, higher is better</Help> |
+| ------- | ----------------------------------------------- | -------------------------------------------------------------: | -------------------------------------------------------------: |
+| spaCy | [`en_core_web_lg`](/models/en#en_core_web_lg) | 10,014 | 14,954 |
+| spaCy | [`en_core_web_trf`](/models/en#en_core_web_trf) | 684 | 3,768 |
+| Stanza | `en_ewt` | 878 | 2,180 |
+| Flair | `pos`(`-fast`) & `ner`(`-fast`) | 323 | 1,184 |
+| UDPipe | `english-ewt-ud-2.5` | 1,101 | NA |
+<figcaption class="caption">
+**End-to-end processing speed** on raw unannotated text. Project template:
+[`benchmarks/speed`](%%GITHUB_PROJECTS/benchmarks/speed).
+</figcaption>
+</figure>
 <!-- TODO: ## Citing spaCy {#citation}
 -->
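
Note on the words-per-second (WPS) metric used in the added speed table: the diff does not show how the `benchmarks/speed` project computes it, so the following is only a minimal sketch of one plausible way to measure it. The helper names `count_words` and `words_per_second` are hypothetical, and counting whitespace-delimited words is an assumption; the actual project may count tokens differently and may handle warm-up or batching in its own way.

```python
import time
from typing import Callable, Iterable, List


def count_words(texts: Iterable[str]) -> int:
    """Count whitespace-delimited words (an assumption, not the project's tokenizer)."""
    return sum(len(text.split()) for text in texts)


def words_per_second(process: Callable[[List[str]], object], texts: Iterable[str]) -> float:
    """Time one end-to-end call of `process` on all texts and return words per second.

    `process` is any callable that takes the list of raw texts and fully
    processes them before returning.
    """
    batch = list(texts)
    n_words = count_words(batch)
    start = time.perf_counter()
    process(batch)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return n_words / elapsed


if __name__ == "__main__":
    sample = ["This is a short comment.", "Here is another one, slightly longer."]
    # Stand-in workload; with spaCy installed one could instead pass, for example,
    #   lambda batch: list(nlp.pipe(batch))   where nlp = spacy.load("en_core_web_lg")
    rate = words_per_second(lambda batch: [text.lower() for text in batch], sample)
    print(f"{rate:,.0f} words per second")
```

When timing a lazy API such as a generator-returning pipeline call, the callable has to consume the results (for instance by wrapping them in `list(...)`), otherwise only the setup cost is measured.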