From b8d58813339d4669c939ffd1a31eccfe4e1d46ef Mon Sep 17 00:00:00 2001
From: Matthew Honnibal
Date: Mon, 3 Nov 2014 13:54:18 +1100
Subject: [PATCH] * Update sales copy

---
 docs/source/index.rst | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/source/index.rst b/docs/source/index.rst
index 7aad9c231..e0339f1e2 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -9,13 +9,21 @@ spaCy NLP Tokenizer and Lexicon
 spaCy is a library for industrial strength NLP in Python and Cython. Its core
 values are efficiency, accuracy and minimalism.
 
-* Efficiency: spaCy is
+* Efficiency: spaCy is TODOx faster than the Stanford tools, and TODOx faster
+  than NLTK. You won't find faster NLP tools. Using spaCy will save you
+  thousands in server costs, and will force you to make fewer compromises.
 
-It does not attempt to be comprehensive,
-or to provide lavish syntactic sugar. This isn't a library that covers 43 known
-algorithms to do X. You get 1 --- the best one --- with a simple, low-level interface.
-For commercial users, the code is free but the data isn't. For researchers, both
-are free and always will be.
+* Accuracy: All spaCy tools are within 0.5% of the current published
+  state-of-the-art, on both news and web text. NLP moves fast, so always check
+  the numbers --- and don't settle for tools that aren't backed by
+  rigorous recent evaluation. An algorithm that was "close enough to state-of-the-art"
+  5 years ago is probably crap by today's standards.
+
+* Minimalism: This isn't a library that covers 43 known algorithms to do X. You
+  get 1 --- the best one --- with a simple, low-level interface. This keeps the
+  code-base small and concrete. Our Python APIs use lists and
+  dictionaries, and our C/Cython APIs use arrays and simple structs.
+
 
 Comparison
 ----------