From e2d53aa1a6add9e0bd242c9c61ea1c7e2a1d08c0 Mon Sep 17 00:00:00 2001
From: Calum Sieppert
Date: Fri, 9 Jul 2021 10:25:56 -0600
Subject: [PATCH 1/2] Typo fixes

---
 website/docs/usage/rule-based-matching.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/website/docs/usage/rule-based-matching.md b/website/docs/usage/rule-based-matching.md
index 22bf4f470..f2df71b98 100644
--- a/website/docs/usage/rule-based-matching.md
+++ b/website/docs/usage/rule-based-matching.md
@@ -63,7 +63,7 @@ another token that's at least 10 characters long.
 
 spaCy features a rule-matching engine, the [`Matcher`](/api/matcher), that
 operates over tokens, similar to regular expressions. The rules can refer to
-token annotations (e.g. the token `text` or `tag_`, and flags (e.g. `IS_PUNCT`).
+token annotations (e.g. the token `text` or `tag_`, and flags (e.g. `IS_PUNCT`)).
 The rule matcher also lets you pass in a custom callback to act on matches – for
 example, to merge entities and apply custom labels. You can also associate
 patterns with entity IDs, to allow some basic entity linking or disambiguation.
@@ -1552,7 +1552,7 @@ doc = nlp("Dr. Alex Smith chaired first board meeting of Acme Corp Inc.")
 print([(ent.text, ent.label_) for ent in doc.ents])
 ```
 
-An alternative approach would be to an
+An alternative approach would be to use an
 [extension attribute](/usage/processing-pipelines/#custom-components-attributes)
 like `._.person_title` and add it to `Span` objects (which includes entity spans
 in `doc.ents`).
The advantage here is that the entity text stays intact and can

From 50000d37e495b31fcb607d532120eab9068c75c9 Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Sat, 10 Jul 2021 10:52:01 +1000
Subject: [PATCH 2/2] Avoid double parentheses [ci skip]

---
 website/docs/usage/rule-based-matching.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/usage/rule-based-matching.md b/website/docs/usage/rule-based-matching.md
index f2df71b98..037850154 100644
--- a/website/docs/usage/rule-based-matching.md
+++ b/website/docs/usage/rule-based-matching.md
@@ -63,7 +63,7 @@ another token that's at least 10 characters long.
 
 spaCy features a rule-matching engine, the [`Matcher`](/api/matcher), that
 operates over tokens, similar to regular expressions. The rules can refer to
-token annotations (e.g. the token `text` or `tag_`, and flags (e.g. `IS_PUNCT`)).
+token annotations (e.g. the token `text` or `tag_`, and flags like `IS_PUNCT`).
 The rule matcher also lets you pass in a custom callback to act on matches – for
 example, to merge entities and apply custom labels. You can also associate
 patterns with entity IDs, to allow some basic entity linking or disambiguation.