Make "text" key in JSONL format optional when "tokens" key is provided (#3721)

* Fix issue with forcing text key when it is not required

* Extending the docs to reflect the new behavior
devforfu 2019-05-11 18:41:29 +05:00 committed by Ines Montani
parent 6cfa1e1f47
commit 21af12eb53
2 changed files with 3 additions and 2 deletions


@@ -181,10 +181,10 @@ def make_update(model, docs, optimizer, drop=0.0, objective="L2"):
 def make_docs(nlp, batch, min_length, max_length):
     docs = []
     for record in batch:
-        text = record["text"]
         if "tokens" in record:
             doc = Doc(nlp.vocab, words=record["tokens"])
         else:
+            text = record["text"]
             doc = nlp.make_doc(text)
         if "heads" in record:
             heads = record["heads"]


@@ -327,7 +327,7 @@ tokenization can be provided.
 | Key      | Type    | Description                                  |
 | -------- | ------- | -------------------------------------------- |
-| `text`   | unicode | The raw input text.                          |
+| `text`   | unicode | The raw input text. Is not required if `tokens` available. |
 | `tokens` | list    | Optional tokenization, one string per token. |
 ```json
@@ -335,6 +335,7 @@ tokenization can be provided.
 {"text": "Can I ask where you work now and what you do, and if you enjoy it?"}
 {"text": "They may just pull out of the Seattle market completely, at least until they have autonomous vehicles."}
 {"text": "My cynical view on this is that it will never be free to the public. Reason: what would be the draw of joining the military? Right now their selling point is free Healthcare and Education. Ironically both are run horribly and most, that I've talked to, come out wishing they never went in."}
+{"tokens": ["If", "tokens", "are", "provided", "then", "we", "can", "skip", "the", "raw", "input", "text"]}
 ```
 ## Init Model {#init-model new="2"}
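
As a rough illustration of the documented format, the following sketch parses JSONL lines and accepts a record when it has either a `text` or a `tokens` key. The function `load_records` and its error message are hypothetical and only use the standard library; spaCy's own loader may behave differently.

```python
import json


def load_records(lines):
    """Parse JSONL lines, requiring "text" or "tokens" in each record.

    Hypothetical helper for illustration; not part of spaCy.
    """
    records = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if "text" not in record and "tokens" not in record:
            raise ValueError(f"line {number}: needs a 'text' or 'tokens' key")
        records.append(record)
    return records


sample = [
    '{"text": "Can I ask where you work now and what you do, and if you enjoy it?"}',
    '{"tokens": ["If", "tokens", "are", "provided", "then", "we", "can", "skip", "the", "raw", "input", "text"]}',
]
records = load_records(sample)
```

Both sample lines are accepted: the first supplies only raw text, the second only a tokenization.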