Update features and add languages (see #598)

Ines Montani 2016-11-02 10:45:29 +01:00 committed by GitHub
parent 74a6e63a6b
commit 869570c2e7
1 changed file with 15 additions and 8 deletions

@@ -60,15 +60,22 @@ open-source software, released under the MIT license.
 Features
 ========
 
-* Labelled dependency parsing (91.8% accuracy on OntoNotes 5)
-* Named entity recognition (82.6% accuracy on OntoNotes 5)
-* Part-of-speech tagging (97.1% accuracy on OntoNotes 5)
-* Easy to use word vectors
-* All strings mapped to integer IDs
+* Non-destructive **tokenization**
+* Syntax-driven sentence segmentation
+* Pre-trained **word vectors**
+* Part-of-speech tagging
+* **Named entity** recognition
+* Labelled dependency parsing
+* Convenient string-to-int mapping
 * Export to numpy data arrays
-* Alignment maintained to original string, ensuring easy mark up calculation
-* Range of easy-to-use orthographic features.
-* No pre-processing required. spaCy takes raw text as input, warts and newlines and all.
+* GIL-free **multi-threading**
+* Efficient binary serialization
+* Easy **deep learning** integration
+* Statistical models for **English** and **German**
+* State-of-the-art speed
+* Robust, rigorously evaluated accuracy
+
+See `facts, figures and benchmarks <https://spacy.io/docs/api/>`_.
 
 Top Peformance
 ==============