spaCy/website/docs/usage/facts-figures.md

---
title: Facts & Figures
teaser: The hard numbers for spaCy and how it compares to other tools
next: /usage/spacy-101
menu:
  - ['Feature Comparison', 'comparison']
  - ['Benchmarks', 'benchmarks']
  # TODO: - ['Citing spaCy', 'citation']
---

## Comparison {#comparison hidden="true"}

spaCy is a **free, open-source library** for advanced **Natural Language
Processing** (NLP) in Python. It's designed specifically for **production use**
and helps you build applications that process and "understand" large volumes of
text. It can be used to build information extraction or natural language
understanding systems.

### Feature overview {#comparison-features}

import Features from 'widgets/features.js'

<Features />

### When should I use spaCy? {#comparison-usage}

- ✅ **I'm a beginner and just getting started with NLP.** – spaCy makes it easy
  to get started and comes with extensive documentation, including a
  beginner-friendly [101 guide](/usage/spacy-101), a free interactive
  [online course](https://course.spacy.io) and a range of
  [video tutorials](https://www.youtube.com/c/ExplosionAI).
- ✅ **I want to build an end-to-end production application.** – spaCy is
  specifically designed for production use and lets you build and train powerful
  NLP pipelines and package them for easy deployment.
- ✅ **I want my application to be efficient on GPU _and_ CPU.** – While spaCy
  lets you train modern NLP models that are best run on GPU, it also offers
  CPU-optimized pipelines, which are less accurate but much cheaper to run.
- ✅ **I want to try out different neural network architectures for NLP.** –
  spaCy lets you customize and swap out the model architectures powering its
  components, and implement your own using a framework like PyTorch or
  TensorFlow. The declarative configuration system makes it easy to mix and
  match functions and keep track of your hyperparameters to make sure your
  experiments are reproducible.
- ❌ **I want to build a language generation application.** – spaCy's focus is
  natural language _processing_ and extracting information from large volumes of
  text. While you can use it to help you re-write existing text, it doesn't
  include any specific functionality for language generation tasks.
- ❌ **I want to research machine learning algorithms.** – spaCy is built on the
  latest research, but it's not a research library. If your goal is to write
  papers and run benchmarks, spaCy is probably not a good choice. However, you
  can use it to make the results of your research easily available for others to
  use, e.g. via a custom spaCy component.

## Benchmarks {#benchmarks}

spaCy v3.0 introduces transformer-based pipelines that bring spaCy's accuracy
right up to **current state-of-the-art**. You can also use a CPU-optimized
pipeline, which is less accurate but much cheaper to run.

<!-- TODO: update benchmarks and intro -->

> #### Evaluation details
>
> - **OntoNotes 5.0:** spaCy's English models are trained on this corpus, as
>   it's several times larger than other English treebanks. However, most
>   systems do not report accuracies on it.
> - **Penn Treebank:** The "classic" parsing evaluation for research. However,
>   it's quite far removed from actual usage: it uses sentences with
>   gold-standard segmentation and tokenization, from a pretty specific type of
>   text (articles from a single newspaper, 1984-1989).

import Benchmarks from 'usage/\_benchmarks-models.md'

<Benchmarks />

<figure>

| Dependency Parsing System                                                      |  UAS |  LAS |
| ------------------------------------------------------------------------------ | ---: | ---: |
| spaCy RoBERTa (2020)                                                           | 95.5 | 94.3 |
| [Mrini et al.](https://khalilmrini.github.io/Label_Attention_Layer.pdf) (2019) | 97.4 | 96.3 |
| [Zhou and Zhao](https://www.aclweb.org/anthology/P19-1230/) (2019)             | 97.2 | 95.7 |

<figcaption class="caption">

**Dependency parsing accuracy** on the Penn Treebank. See
[NLP-progress](http://nlpprogress.com/english/dependency_parsing.html) for more
results. Project template:
[`benchmarks/parsing_penn_treebank`](%%GITHUB_PROJECTS/benchmarks/parsing_penn_treebank).

</figcaption>

</figure>
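
The UAS and LAS columns above are unlabeled and labeled attachment scores: the percentage of tokens whose predicted head is correct, and the percentage whose head _and_ dependency label are both correct. The following is a minimal sketch of that arithmetic, not the evaluation code used for the table. The function name `attachment_scores` and the toy sentence are assumptions made for illustration, and the sketch scores every token, whereas published evaluations often exclude punctuation.

```python
from typing import Sequence, Tuple


def attachment_scores(
    gold_heads: Sequence[int],
    gold_labels: Sequence[str],
    pred_heads: Sequence[int],
    pred_labels: Sequence[str],
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens (hypothetical helper)."""
    lengths = {len(gold_heads), len(gold_labels), len(pred_heads), len(pred_labels)}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    total = len(gold_heads)
    if total == 0:
        raise ValueError("cannot score an empty sequence")
    # UAS: the predicted head index matches the gold head.
    correct_heads = sum(g == p for g, p in zip(gold_heads, pred_heads))
    # LAS: the head and the dependency label both match.
    correct_labeled = sum(
        gh == ph and gl == pl
        for gh, ph, gl, pl in zip(gold_heads, pred_heads, gold_labels, pred_labels)
    )
    return 100.0 * correct_heads / total, 100.0 * correct_labeled / total


# Toy annotation for "She reads books quickly" (1-based heads, 0 = root).
gold_heads = [2, 0, 2, 2]
gold_labels = ["nsubj", "ROOT", "dobj", "advmod"]
pred_heads = [2, 0, 2, 3]  # the last head is wrong
pred_labels = ["nsubj", "ROOT", "obj", "advmod"]  # the third label is wrong

uas, las = attachment_scores(gold_heads, gold_labels, pred_heads, pred_labels)
print(f"UAS={uas:.1f} LAS={las:.1f}")  # UAS=75.0 LAS=50.0
```

LAS can never exceed UAS, because a token only counts towards LAS if its head is already correct.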

### Speed comparison {#benchmarks-speed}

We compare the speed of different NLP libraries, measured in words per second
(WPS), where higher is better. The evaluation was performed on 10,000 Reddit
comments.

<figure>

| Library | Pipeline                                        | WPS CPU <Help>words per second on CPU, higher is better</Help> | WPS GPU <Help>words per second on GPU, higher is better</Help> |
| ------- | ----------------------------------------------- | -------------------------------------------------------------: | -------------------------------------------------------------: |
| spaCy   | [`en_core_web_lg`](/models/en#en_core_web_lg)   |                                                         10,014 |                                                         14,954 |
| spaCy   | [`en_core_web_trf`](/models/en#en_core_web_trf) |                                                            684 |                                                          3,768 |
| Stanza  | `en_ewt`                                        |                                                            878 |                                                          2,180 |
| Flair   | `pos`(`-fast`) & `ner`(`-fast`)                 |                                                            323 |                                                          1,184 |
| UDPipe  | `english-ewt-ud-2.5`                            |                                                          1,101 |                                                          _n/a_ |

<figcaption class="caption">

**End-to-end processing speed** on raw unannotated text. Project template:
[`benchmarks/speed`](%%GITHUB_PROJECTS/benchmarks/speed).

</figcaption>

</figure>
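
To make the words-per-second metric concrete, here is a rough sketch of how such a number can be measured for any text-processing callable. It is not the code from the `benchmarks/speed` project template. The names `words_per_second` and `whitespace_pipeline` are hypothetical, the whitespace splitter only stands in for a real pipeline, and counting the tokens returned by the pipeline is an assumption about how words are tallied.

```python
import time
from typing import Callable, Iterable, List


def words_per_second(
    process: Callable[[str], List[str]],
    texts: Iterable[str],
) -> float:
    """Time `process` over all texts and return processed words per second."""
    n_words = 0
    start = time.perf_counter()
    for text in texts:
        # Count the tokens the pipeline produced for this text.
        n_words += len(process(text))
    elapsed = time.perf_counter() - start
    return n_words / elapsed if elapsed > 0 else float("inf")


def whitespace_pipeline(text: str) -> List[str]:
    # Hypothetical stand-in for a real NLP pipeline: split on whitespace only.
    return text.split()


texts = [
    "This is a short comment.",
    "Another comment, slightly longer than the first.",
] * 5000
wps = words_per_second(whitespace_pipeline, texts)
print(f"{wps:,.0f} words per second")
```

The result depends heavily on hardware, batch size and text length, so numbers are only comparable when every library is timed on the same machine and the same texts.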

<!-- TODO: ## Citing spaCy {#citation}

-->