Merge pull request #1647 from atomobianco/patch-1

Corrected char index instead of token index
Ines Montani 2017-11-27 12:01:04 +00:00 committed by GitHub
commit db2dbb2477
1 changed file with 2 additions and 1 deletion

@@ -354,7 +354,8 @@ p
     # append mock entity for match in displaCy style to matched_sents
     # get the match span by ofsetting the start and end of the span with the
     # start and end of the sentence in the doc
-    match_ents = [{'start': span.start-sent.start, 'end': span.end-sent.start,
+    match_ents = [{'start': span.start_char - sent.start_char,
+                   'end': span.end_char - sent.start_char,
                    'label': 'MATCH'}]
     matched_sents.append({'text': sent.text, 'ents': match_ents })
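
Why the change matters: the 'start' and 'end' values in match_ents are used to slice sent.text, so they need to be character offsets relative to the sentence, not token positions. The sketch below illustrates the difference without spaCy. It assumes a simple whitespace tokenizer as a stand-in for spaCy's tokenizer, and the helper names (tokenize, match_offsets) and the sample text are hypothetical, chosen only for illustration.

```python
def tokenize(text):
    """Split on whitespace and return (token, start_char, end_char) triples.

    Assumption: a whitespace split stands in for spaCy's tokenizer here.
    """
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append((word, start, end))
        pos = end
    return tokens


def match_offsets(tokens, sent_start_tok, span_start_tok, span_end_tok):
    """Return (token_based, char_based) offsets of a span relative to its sentence.

    span_end_tok is exclusive, mirroring how token spans are usually sliced.
    """
    sent_start_char = tokens[sent_start_tok][1]
    # Old behaviour in the diff: token positions relative to the sentence start.
    token_based = (span_start_tok - sent_start_tok,
                   span_end_tok - sent_start_tok)
    # New behaviour in the diff: character positions relative to the sentence start.
    span_start_char = tokens[span_start_tok][1]
    span_end_char = tokens[span_end_tok - 1][2]
    char_based = (span_start_char - sent_start_char,
                  span_end_char - sent_start_char)
    return token_based, char_based


text = "Hello there. I love spaCy a lot."
tokens = tokenize(text)

# The second sentence starts at token 2 ("I"); the match covers tokens 3..4.
sent_start_tok, span_start_tok, span_end_tok = 2, 3, 5
sent_text = text[tokens[sent_start_tok][1]:]

token_based, char_based = match_offsets(
    tokens, sent_start_tok, span_start_tok, span_end_tok)

# Slicing the sentence text only recovers the match with character offsets.
print(repr(sent_text[token_based[0]:token_based[1]]))
print(repr(sent_text[char_based[0]:char_based[1]]))
```

With token-based offsets the slice lands on unrelated characters near the start of the sentence, while the character-based offsets recover the matched text itself, which is the behaviour the patched lines aim for.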