Fix check for RIGHT_ATTRS in dep matcher (#8807)

* Fix check for RIGHT_ATTRs in dep matcher

If a non-anchor node does not have RIGHT_ATTRS, the dep matcher throws
an E100, which says that non-anchor nodes must have LEFT_ID, REL_OP, and
RIGHT_ID. It specifically does not say RIGHT_ATTRS is required.

A blank RIGHT_ATTRS is also valid, and patterns with one will be
excepted. While not normal, sometimes a REL_OP is enough to specify a
non-anchor node - maybe you just want the head of another node
unconditionally, for example.

This change just sets RIGHT_ATTRS to {} if not present. Alternatively
changing E100 to state RIGHT_ATTRS is required could also be reasonable.

* Fix test

This test was written on the assumption that if `RIGHT_ATTRS` isn't
present an error will be raised. Since the proposed changes make it so
an error won't be raised this is no longer necessary.

* Revert test, update error message

Error message now lists missing keys, and RIGHT_ATTRS is required.

* Use list of required keys in error message

Also removes unused key param arg.
This commit is contained in:
Paul O'Leary McCann 2021-08-04 16:20:41 +09:00 committed by GitHub
parent 941a591f3c
commit 77d698dcae
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 13 additions and 9 deletions

View File

@ -356,8 +356,8 @@ class Errors:
E098 = ("Invalid pattern: expected both RIGHT_ID and RIGHT_ATTRS.") E098 = ("Invalid pattern: expected both RIGHT_ID and RIGHT_ATTRS.")
E099 = ("Invalid pattern: the first node of pattern should be an anchor " E099 = ("Invalid pattern: the first node of pattern should be an anchor "
"node. The node should only contain RIGHT_ID and RIGHT_ATTRS.") "node. The node should only contain RIGHT_ID and RIGHT_ATTRS.")
E100 = ("Nodes other than the anchor node should all contain LEFT_ID, " E100 = ("Nodes other than the anchor node should all contain {required}, "
"REL_OP and RIGHT_ID.") "but these are missing: {missing}")
E101 = ("RIGHT_ID should be a new node and LEFT_ID should already have " E101 = ("RIGHT_ID should be a new node and LEFT_ID should already have "
"have been declared in previous edges.") "have been declared in previous edges.")
E102 = ("Can't merge non-disjoint spans. '{token}' is already part of " E102 = ("Can't merge non-disjoint spans. '{token}' is already part of "

View File

@ -122,13 +122,17 @@ cdef class DependencyMatcher:
raise ValueError(Errors.E099.format(key=key)) raise ValueError(Errors.E099.format(key=key))
visited_nodes[relation["RIGHT_ID"]] = True visited_nodes[relation["RIGHT_ID"]] = True
else: else:
if not( required_keys = set(
"RIGHT_ID" in relation ("RIGHT_ID", "RIGHT_ATTRS", "REL_OP", "LEFT_ID")
and "RIGHT_ATTRS" in relation )
and "REL_OP" in relation relation_keys = set(relation.keys())
and "LEFT_ID" in relation missing = required_keys - relation_keys
): if missing:
raise ValueError(Errors.E100.format(key=key)) missing_txt = ", ".join(list(missing))
raise ValueError(Errors.E100.format(
required=required_keys,
missing=missing_txt
))
if ( if (
relation["RIGHT_ID"] in visited_nodes relation["RIGHT_ID"] in visited_nodes
or relation["LEFT_ID"] not in visited_nodes or relation["LEFT_ID"] not in visited_nodes