KnowGL Knowledge Extractor
The knowgl-large model is trained by combining Wikidata with an extended version of the training data in the REBEL dataset. Given a sentence, KnowGL generates triple(s) in the following format:
[(subject mention # subject label # subject type) | relation label | (object mention # object label # object type)]
This KnowledgeExtractor does not rely on any predefined entities or relations.
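The triple format described above can be turned back into structured data with a small parser. The following is a minimal sketch and not the library's own `parse_result`: the names `Entity`, `Triple`, `TRIPLE_RE`, and `parse_triples` are hypothetical, the sample string is hand-written for illustration rather than real model output, and the regular expression assumes that no field contains `#`, `|`, parentheses, or square brackets.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    """One side of a triple: (mention # label # type). Hypothetical helper type."""
    mention: str
    label: str
    type: str


class Triple(NamedTuple):
    """A parsed (subject, relation, object) triple. Hypothetical helper type."""
    subject: Entity
    relation: str
    object: Entity


# Matches one bracketed triple:
# [(mention # label # type) | relation | (mention # label # type)]
# Assumption: fields never contain '#', '|', parentheses, or square brackets.
TRIPLE_RE = re.compile(
    r"\[\(([^#()]*)#([^#()]*)#([^#()]*)\)"   # subject
    r"\s*\|\s*([^|\[\]]*?)\s*\|\s*"          # relation label
    r"\(([^#()]*)#([^#()]*)#([^#()]*)\)\]"   # object
)


def parse_triples(text: str) -> List[Triple]:
    """Return every well-formed triple found in `text`, ignoring anything else."""
    triples = []
    for match in TRIPLE_RE.finditer(text):
        s_men, s_lab, s_typ, rel, o_men, o_lab, o_typ = (
            group.strip() for group in match.groups()
        )
        triples.append(
            Triple(Entity(s_men, s_lab, s_typ), rel, Entity(o_men, o_lab, o_typ))
        )
    return triples


# Hand-written sample in the documented format (not actual model output).
sample = (
    "[(Obama # Barack Obama # human) | place of birth | "
    "(Honolulu # Honolulu # city)]"
)
triples = parse_triples(sample)
```

Because the parser scans for every bracketed triple, it also works when the generated text holds several triples, whatever characters separate them. Malformed fragments are skipped silently, which may or may not be the behavior you want.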
Bases: KnowledgeExtractor
Instantiate the KnowGL Knowledge Extractor
load_models()
Load KnowGL model
parse_result(result, doc, encodings)
Parse the text result into a list of triples
Parameters:
Name | Type | Description | Default |
---|---|---|---|
result | str | Text generated by the KnowGL model | required |
doc | Doc | spaCy Doc the text was generated from | required |
encodings | Encoding | Encoding produced by tokenization | required |
Returns:
Type | Description |
---|---|
List[Tuple[Span, RelationSpan, Span]] | List of triples (subject, relation, object) |
predict(docs, batch_size=None)
Extract triples from docs
Parameters:
Name | Type | Description | Default |
---|---|---|---|
docs | Iterator[Doc] | spaCy Docs to process | required |
batch_size | Optional[Union[int, None]] | Batch size for processing | None |
Returns:
Type | Description |
---|---|
List[List[Tuple[Span, RelationSpan, Span]]] | Triples (subject, relation, object) extracted for each document |
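To illustrate what a `batch_size` parameter over an iterator of documents typically means, here is a minimal batching sketch. It is not taken from the library: `iter_batches` is a hypothetical helper, and treating `batch_size=None` as "process everything in one batch" is an assumption, since the reference above does not say how `None` is handled.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def iter_batches(
    items: Iterable[T], batch_size: Optional[int] = None
) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items.

    Assumption: batch_size=None yields all items as a single batch.
    """
    iterator = iter(items)
    if batch_size is None:
        batch = list(iterator)
        if batch:
            yield batch
        return
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer or None")
    while True:
        # islice consumes up to batch_size items without loading the rest.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Plain integers stand in for spaCy Docs here.
batches = list(iter_batches(range(5), batch_size=2))
```

Consuming the input lazily keeps memory bounded for large document streams, at the cost of a final batch that may be smaller than the others.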