* Edit API docs

Matthew Honnibal 2015-01-24 20:49:44 +11:00
parent 71b95202eb
commit 83a7e91f3c
1 changed files with 42 additions and 109 deletions


@ -2,27 +2,28 @@
Documentation
===============
Quick Ref
---------
.. class:: spacy.en.__init__.English(self, data_dir=join(dirname(__file__, 'data')))
.. py:currentmodule:: spacy
.. class:: en.English(self, data_dir=join(dirname(__file__, 'data')))
:noindex:
.. method:: __call__(self, unicode text, tag=True, parse=False) --> Tokens
+-----------+--------------+--------------+
| Attribute | Type         | Useful       |
+===========+==============+==============+
| strings   | StingStore   | __getitem__  |
+-----------+--------------+--------------+
| vocab     | Vocab        | __getitem__  |
+-----------+--------------+--------------+
| tokenizer | Tokenizer    | __call__     |
+-----------+--------------+--------------+
| tagger    | EnPosTagger  | __call__     |
+-----------+--------------+--------------+
| parser    | GreedyParser | __call__     |
+-----------+--------------+--------------+
+-----------+----------------------------------------+-------------+--------------------------+
| Attribute | Type                                   | Attr API    | Notes                    |
+===========+========================================+=============+==========================+
| strings   | :py:class:`strings.StringStore`        | __getitem__ | string <-> int mapping   |
+-----------+----------------------------------------+-------------+--------------------------+
| vocab     | :py:class:`vocab.Vocab`                | __getitem__ | Look up Lexeme object    |
+-----------+----------------------------------------+-------------+--------------------------+
| tokenizer | :py:class:`tokenizer.Tokenizer`        | __call__    | Get Tokens given unicode |
+-----------+----------------------------------------+-------------+--------------------------+
| tagger    | :py:class:`en.pos.EnPosTagger`         | __call__    | Set POS tags on Tokens   |
+-----------+----------------------------------------+-------------+--------------------------+
| parser    | :py:class:`syntax.parser.GreedyParser` | __call__    | Set parse on Tokens      |
+-----------+----------------------------------------+-------------+--------------------------+
.. py:class:: spacy.tokens.Tokens(self, vocab: Vocab, string_length=0)
@ -58,32 +59,30 @@ Quick Ref
.. py:attribute:: head: Token
.. py:method:: check_flag(self, attr_id: int) --> bool
+-----------+------+-----------+---------+-----------+-------+
| Attribute | Type | Attribute | Type    | Attribute | Type  |
+===========+======+===========+=========+===========+=======+
| sic       | int  | sic_      | unicode | idx       | int   |
+-----------+------+-----------+---------+-----------+-------+
| lemma     | int  | lemma_    | unicode | cluster   | int   |
+-----------+------+-----------+---------+-----------+-------+
| norm1     | int  | norm1_    | unicode | length    | int   |
+-----------+------+-----------+---------+-----------+-------+
| norm2     | int  | norm2_    | unicode | prob      | float |
+-----------+------+-----------+---------+-----------+-------+
| shape     | int  | shape_    | unicode | sentiment | float |
+-----------+------+-----------+---------+-----------+-------+
| prefix    | int  | prefix_   | unicode |                   |
+-----------+------+-----------+---------+-------------------+
| suffix    | int  | suffix_   | unicode |                   |
+-----------+------+-----------+---------+-------------------+
| pos       | int  | pos_      | unicode |                   |
+-----------+------+-----------+---------+-------------------+
| fine_pos  | int  | fine_pos_ | unicode |                   |
+-----------+------+-----------+---------+-------------------+
| dep_tag   | int  | dep_tag_  | unicode |                   |
+-----------+------+-----------+---------+-------------------+
+-----------+------+-----------+---------+-----------+------------------------------------+
| Attribute | Type | Attribute | Type    | Attribute | Type                               |
+===========+======+===========+=========+===========+====================================+
| orth      | int  | orth\_    | unicode | idx       | int                                |
+-----------+------+-----------+---------+-----------+------------------------------------+
| lemma     | int  | lemma\_   | unicode | cluster   | int                                |
+-----------+------+-----------+---------+-----------+------------------------------------+
| lower     | int  | lower\_   | unicode | length    | int                                |
+-----------+------+-----------+---------+-----------+------------------------------------+
| norm      | int  | norm\_    | unicode | prob      | float                              |
+-----------+------+-----------+---------+-----------+------------------------------------+
| shape     | int  | shape\_   | unicode | repvec    | ndarray(shape=(300,), dtype=float) |
+-----------+------+-----------+---------+-----------+------------------------------------+
| prefix    | int  | prefix\_  | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| suffix    | int  | suffix\_  | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| pos       | int  | pos\_     | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| tag       | int  | tag\_     | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
| dep       | int  | dep\_     | unicode |                                                |
+-----------+------+-----------+---------+------------------------------------------------+
.. py:class:: spacy.vocab.Vocab(self, data_dir=None, lex_props_getter=None)
@ -94,7 +93,7 @@ Quick Ref
.. py:method:: __getitem__(self, string: unicode) --> int
.. py:method:: __setitem__(self, py_str: unicode, props: Dict[str, int|float]) --> None
.. py:method:: __setitem__(self, py_str: unicode, props: Dict[str, Union[int, float]]) --> None
.. py:method:: dump(self, loc: unicode) --> None
@ -108,7 +107,7 @@ Quick Ref
.. py:method:: __getitem__(self, id: int) --> unicode
.. py:method:: __getitem__(self, string: byts) --> id
.. py:method:: __getitem__(self, string: bytes) --> id
.. py:method:: __getitem__(self, string: unicode) --> id
@ -137,69 +136,3 @@ Quick Ref
.. py:method:: __call__(self, tokens: spacy.tokens.Tokens) --> None
.. py:method:: train(self, tokens: spacy.tokens.Tokens) --> None
spaCy is designed to easily extend to multiple languages, although presently
only English components are implemented. The components are organised into
a pipeline in the spacy.en.English class.
Usually, you will only want to create one spacy.en.English object, and pass it
around your application. It manages the string-to-integers mapping, and you
will usually want only a single mapping for all strings.
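The reason for sharing a single mapping can be illustrated with a small, self-contained sketch. ``ToyStringStore`` below is a hypothetical stand-in written for this example, not spaCy's ``StringStore``; the only thing it borrows from the reference above is the idea that ``__getitem__`` accepts either an integer ID or a string. The ID assignment order and the reserved slot 0 are assumptions.

```python
class ToyStringStore(object):
    """Hypothetical string <-> int mapping; not spaCy's implementation."""

    def __init__(self):
        # Assumption for this sketch: ID 0 is reserved for the empty string.
        self._ids = {u"": 0}      # string -> int
        self._strings = [u""]     # int -> string

    def __getitem__(self, key):
        # An int key returns the stored string.
        if isinstance(key, int):
            return self._strings[key]
        # A string key returns its ID, assigning the next free ID if unseen.
        if key not in self._ids:
            self._ids[key] = len(self._strings)
            self._strings.append(key)
        return self._ids[key]


# One shared store: every ID round-trips to the string it came from.
shared = ToyStringStore()
apple_id = shared[u"apple"]
pear_id = shared[u"pear"]
assert shared[apple_id] == u"apple"

# A second, independent store assigns IDs in its own order, so an ID
# obtained from `shared` decodes to a different string here.
other = ToyStringStore()
other[u"pear"]
assert other[apple_id] == u"pear"
```

With two stores, the same integer refers to different strings, which is why passing one pipeline object around the application is the safer default.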
English Pipeline
----------------
The spacy.en package exports a single class, English, and several constants,
under spacy.en.defs.
.. autoclass:: spacy.en.English
:members:
.. autommodule:: spacy.en.pos
:members:
.. automodule:: spacy.en.attrs
:members:
:undoc-members:
Tokens
------
.. autoclass:: spacy.tokens.Tokens
:members:
.. autoclass:: spacy.tokens.Token
:members:
.. autoclass:: spacy.lexeme.Lexeme
:members:
Lexicon
-------
.. automodule:: spacy.vocab
:members:
.. automodule:: spacy.strings
:members:
Tokenizer
---------
.. automodule:: spacy.tokenizer
:members:
Parser
------
.. automodule:: spacy.syntax.parser
:members:
Utility Functions
-----------------
.. automodule:: spacy.orth
:members: