1100 Commits

Author SHA1 Message Date
Matthew Honnibal 4336397ecb Update develop from master 2018-08-14 03:04:28 +02:00
Wojciech Łukasiewicz 3953e967a0 Use correct variable name in the examples (#2664)
* correct naming

* add contributor agreement
2018-08-13 22:21:24 +02:00
Ines Montani 71723cece1 Add note on visualizing long texts and sentences (see #2636) [ci skip] 2018-08-08 15:28:21 +02:00
Ines Montani 6147bd3eb4 Fix link target (closes #2645) [ci skip] 2018-08-08 15:03:52 +02:00
Ines Montani 8c47da1f19 Update Language serialization docs (see #2628) [ci skip]
Add note on using from_disk and from_bytes via subclasses and add example
2018-08-07 14:17:57 +02:00
Matthew Honnibal 664cfc29bc Merge branch 'master' of https://github.com/explosion/spaCy 2018-08-07 10:49:39 +02:00
Matthew Honnibal 2278c9734e Fix spelling error #2640 2018-08-07 10:49:21 +02:00
Xiaoquan Kong f0c9652ed1 New Feature: display more detail when Error E067 (#2639)
* Fix off-by-one error

* Add verbose option

* Update verbose option

* Update documents for verbose option
2018-08-07 10:45:29 +02:00
Ines Montani 6a4360e425 Update universe [ci skip] 2018-08-02 17:33:08 +02:00
Sami dbc993f5b3 Updating description and code snippet spacy-lefff (#2623)
* updating description and code snippet spacy-lefff

* contributors agreement
2018-08-02 17:25:27 +02:00
Vikas Kumar Yadav d3e21aad64 Update _benchmarks.jade (#2618) 2018-08-02 00:28:28 +02:00
Brian Phillips 8227de0099 Update language.jade (#2616) 2018-07-31 12:34:42 +02:00
Ioannis Daras 055cc0de44 Bug fix to pseudocode for tokenizer customization (#2604) 2018-07-27 11:04:12 +02:00
Andriy Mulyar e9ef51137d Fixed typo (#2596)
Changed 'The index of the first character after the span.' to 'The index of the last character after the span' in the description of doc.char_span
2018-07-25 22:17:15 +02:00
Ines Montani 75f3234404 💫 Refactor test suite (#2568)
## Description

Related issues: #2379 (should be fixed by separating model tests)

* **total execution time down from > 300 seconds to under 60 seconds** 🎉
* removed all model-specific tests that could only really be run manually anyway – those will now live in a separate test suite in the [`spacy-models`](https://github.com/explosion/spacy-models) repository and are already integrated into our new model training infrastructure
* changed all relative imports to absolute imports to prepare for moving the test suite from `/spacy/tests` to `/tests` (it'll now always test against the installed version)
* merged old regression tests into collections, e.g. `test_issue1001-1500.py` (about 90% of the regression tests are very short anyways)
* tidied up and rewrote existing tests wherever possible

### Todo

- [ ] move tests to `/tests` and adjust CI commands accordingly
- [x] move model test suite from internal repo to `spacy-models`
- [x] ~~investigate why `pipeline/test_textcat.py` is flakey~~
- [x] review old regression tests (leftover files) and see if they can be merged, simplified or deleted
- [ ] update documentation on how to run tests


### Types of change
enhancement, tests

## Checklist
<!--- Before you submit the PR, go over this checklist and make sure you can
tick off all the boxes. [] -> [x] -->
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [ ] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-07-24 23:38:44 +02:00
kororo b1ec827ee0 Fix typo (#2579)
Update slogan, desc and code snippet to latest version
2018-07-24 22:47:33 +02:00
ines cd687091fb Remove nl examples from widget for now [ci skip]
Restore for next spaCy version when path to example sentences is fixed
2018-07-24 22:41:20 +02:00
ines 2d8ffb8bcd Fix formatting 2018-07-24 22:40:49 +02:00
ines 1b3da8d2ae Update website for v2.0.12 [ci skip] 2018-07-24 21:04:22 +02:00
ines ae5ed2d698 Update docs for v2.0.12 [ci skip] 2018-07-21 15:51:44 +02:00
ines d517dd4297 Document remove_extension methods 2018-07-21 15:51:28 +02:00
ines 153f41a5cc Use better examples for Doc extension methods 2018-07-21 15:51:11 +02:00
ines 3c30d1763c Merge branch 'master' into develop 2018-07-21 15:34:18 +02:00
kororo 2784babef9 Add ExcelCy into Universe list (#2572)
Hi guys,

This is my first spaCy extension. I am excited to be able to do this. Please let me know if there are any suggestions or modifications I need to make. Feel free to use or contribute to the repo that I made.

## Description
ExcelCy is a spaCy toolkit to help improve the data training experience. It provides easy annotation using the Excel file format. It has a helper to pre-train entity annotation with phrase and regex matcher pipes.

### Types of change
Update to Universe list in website.

## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-07-19 19:28:33 +02:00
ines 80e7485630 Merge branch 'master' into develop 2018-07-18 17:28:47 +02:00
Xiang Ji 19a5ef1c58 Fix venv command examples (#2560) [ci skip]
* Fix venv command examples

The documentation refers to `venv`, which is native to Python3.
However, the command examples are as if they were still `virtualenv`,
which is a package independent of `venv`:

- It doesn't need to be installed via `pip`. In fact `pip install venv` would
return an error.
- The correct way to invoke `venv` is `python3 -m venv`, not `venv`, which would
return command not found.

See https://docs.python.org/3/library/venv.html

I suspect the documentation simply replaced all occurrences of `virtualenv` with
`venv`. However they are different modules and are used differently.

* Update comment [ci skip]
2018-07-18 10:31:24 +02:00
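
The `venv` fix above (#2560) is about the command-line form `python3 -m venv`. As a rough illustration only, the same standard-library module can also be driven from Python. The target directory name `.env` and the use of a temporary parent directory are assumptions made for this sketch, not details taken from the commit.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical target directory; the name ".env" is illustrative only.
env_dir = Path(tempfile.mkdtemp()) / ".env"

# Roughly what `python3 -m venv --without-pip <env_dir>` does.
# with_pip=False keeps the sketch fast and avoids bootstrapping pip.
venv.create(env_dir, with_pip=False)

# A freshly created environment contains a pyvenv.cfg marker file.
print((env_dir / "pyvenv.cfg").exists())
```

Note that `venv` ships with Python 3, which is why no `pip install` step appears here, in line with the commit's point.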
ines 50c367ee96 Update meta [ci skip] 2018-07-10 13:51:45 +02:00
ines 3a321e79ac Merge branch 'master' into develop 2018-07-10 13:49:08 +02:00
ines 71bfc92913 Exclude models for non-stable versions [ci skip] 2018-07-10 13:44:55 +02:00
ines b5200962c0 Adjust formatting [ci skip] 2018-07-09 18:35:46 +02:00
Alex Villarreal bd35bf7f09 Guidance to handle binary files in git in Windows (#2526)
Adds guidance on what to do if users encounter the error described in [1634](https://github.com/explosion/spaCy/issues/1634), which probably only happens in Windows environments.
2018-07-09 18:31:37 +02:00
ines f575b01595 Update language and license meta [ci skip] 2018-07-04 15:09:36 +02:00
ines 63666af328 Merge branch 'master' into develop 2018-07-04 14:52:25 +02:00
Matthew Honnibal a85620a731 Note CoreNLP tokenizer correction on website 2018-07-02 11:35:31 +02:00
ines 06c6dc6fbc Update Juniper [ci skip] 2018-06-28 11:48:17 +02:00
Nipun Sadvilkar 741ba80bd5 Train model command n_iteration 20 -> 30 (#2454)
In the source code (`train.py`), the default number of iterations is 30
2018-06-18 11:57:08 +02:00
ines 53a2bc8c8d Only scroll sidebar item into view if needed [ci skip] 2018-06-12 10:58:50 +02:00
ines 65713a6593 Increment versions [ci skip] 2018-06-12 10:49:50 +02:00
Ines Montani 968f6f0bda 💫 Document Cython API (#2433)
## Description

This PR adds the most relevant documentation of spaCy's Cython API.

(Todo for when we publish this: rewrite `/api/#section-cython` and `/api/#cython` to `/api/cython#conventions`.)

### Types of change
docs

## Checklist
<!--- Before you submit the PR, go over this checklist and make sure you can
tick off all the boxes. [] -> [x] -->
- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-06-11 17:47:46 +02:00
GolanLevy 72d7e80f94 adding a missing apostrophe (#2436) 2018-06-11 17:47:24 +02:00
ines 778e5f4da3 Merge branch 'master' into develop 2018-06-11 00:38:04 +02:00
himkt 57311d5d47 replace janome with mecab in the documentation and the test (#2415)
* Add links to Reddit data (see #2401)

* replace janome with mecab in the documentation and the test

* add the assignment
2018-06-11 00:33:13 +02:00
ines effb55d591 Adjust formatting [ci skip] 2018-06-11 00:29:13 +02:00
Nathan Breit ba6d2cf393 Add EpiTator to Universe (#2429) 2018-06-11 00:24:13 +02:00
himkt 1a568f2e08 fix wrong documentations (#2423) 2018-06-11 00:21:06 +02:00
Bohdan Moskalevskyi d66292f767 fix UD data file extensions (#2425)
* fix UD data files extension

* add contributor agreement for msklvsk
2018-06-08 14:26:11 +02:00
ines a0017e4909 Merge branch 'master' into develop 2018-05-30 14:10:47 +02:00
ines 0baaf836cf Update formatting [ci skip] 2018-05-30 13:32:49 +02:00
ines 3913e18201 Add self-attentive-parser to universe (see #59) 2018-05-30 13:31:28 +02:00
ines 4a62486340 Merge branch 'master' into develop 2018-05-30 13:01:01 +02:00