From 869570c2e7dd2b33aa45057068472cec72788b5c Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Wed, 2 Nov 2016 10:45:29 +0100
Subject: [PATCH] Update features and add languages (see #598)

---
 README.rst | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/README.rst b/README.rst
index 278bd5a63..60eb1241a 100644
--- a/README.rst
+++ b/README.rst
@@ -60,15 +60,22 @@ open-source software, released under the MIT license.
 Features
 ========
 
-* Labelled dependency parsing (91.8% accuracy on OntoNotes 5)
-* Named entity recognition (82.6% accuracy on OntoNotes 5)
-* Part-of-speech tagging (97.1% accuracy on OntoNotes 5)
-* Easy to use word vectors
-* All strings mapped to integer IDs
+* Non-destructive **tokenization**
+* Syntax-driven sentence segmentation
+* Pre-trained **word vectors**
+* Part-of-speech tagging
+* **Named entity** recognition
+* Labelled dependency parsing
+* Convenient string-to-int mapping
 * Export to numpy data arrays
-* Alignment maintained to original string, ensuring easy mark up calculation
-* Range of easy-to-use orthographic features.
-* No pre-processing required. spaCy takes raw text as input, warts and newlines and all.
+* GIL-free **multi-threading**
+* Efficient binary serialization
+* Easy **deep learning** integration
+* Statistical models for **English** and **German**
+* State-of-the-art speed
+* Robust, rigorously evaluated accuracy
+
+See `facts, figures and benchmarks `_.
 
 Top Peformance
 ==============