Knowledge Integration Framework
KIF is a Python framework for knowledge integration from IBM Research.
It is based on Wikidata and licensed under the open-source Apache-2.0 license.
First time here? Check out the tutorial.
Looking for the sources? See the GitHub repository.
Hello world!
Install KIF using pip:
$ pip install kif-lib
Use KIF to query Wikidata:
>>> from kif_lib import Store
>>> from kif_lib.vocabulary import wd
>>> kb = Store('wikidata')
>>> next(kb.filter(subject=wd.Alan_Turing, property=wd.doctoral_advisor))
Statement(Item(IRI('http://www.wikidata.org/entity/Q7251')), ValueSnak(...))
Or, via KIF CLI (the command-line interface):
$ pip install kif-lib[cli] # KIF CLI is an optional dependency
$ kif filter --subject=wd.Alan_Turing --property=wd.doctoral_advisor
(Statement (Item Alan Turing) (ValueSnak (Property doctoral advisor) (Item Alonzo Church)))
KIF can also be used to query other knowledge sources. Here is a similar query over DBpedia (notice the -s dbpedia switch):
(Statement (Item dbr:Alan_Turing) (ValueSnak (Property dbo:doctoralAdvisor) (Item dbr:Alonzo_Church)))
The result is a stream of Wikidata-like statements containing DBpedia entities.
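Because the Python example above consumes `filter` with `next`, its result presumably behaves as an iterator. The following is a minimal sketch of collecting only the first few matches. The helper `take_statements` and the offline stand-in `FakeStore` are hypothetical names introduced for illustration and are not part of KIF; only the `filter(subject=..., property=...)` call shape is taken from the example above.

```python
from itertools import islice


def take_statements(kb, limit, **pattern):
    """Return at most `limit` results of `kb.filter(**pattern)` as a list."""
    # islice stops pulling from the iterator once `limit` items were taken.
    return list(islice(kb.filter(**pattern), limit))


class FakeStore:
    """Hypothetical offline stand-in for a store (not part of KIF)."""

    def __init__(self, triples):
        self._triples = triples

    def filter(self, subject=None, property=None):
        # Yield matching (subject, property, value) triples lazily;
        # None acts as a wildcard.
        for s, p, v in self._triples:
            if subject in (None, s) and property in (None, p):
                yield (s, p, v)


# Toy data for illustration only.
kb = FakeStore([
    ('Alan_Turing', 'doctoral_advisor', 'Alonzo_Church'),
    ('Alonzo_Church', 'doctoral_advisor', 'Oswald_Veblen'),
])
print(take_statements(kb, 1, subject='Alan_Turing'))
```

With KIF installed, the same helper should also accept `Store('wikidata')` in place of `FakeStore`, since it relies only on the `filter` method.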
KIF in a nutshell
KIF is a knowledge integration framework based on Wikidata. The idea behind it is to use Wikidata to standardize the syntax and (whenever possible) the vocabulary of the integrated knowledge sources. Users can then query the sources through filter patterns described in terms of the Wikidata data model.
The integration done by KIF is virtual in the sense that syntax and vocabulary translations happen dynamically (at query time) and are guided by user-provided mappings. KIF comes with built-in mappings for Wikidata, DBpedia, FactGrid, PubChem, and UniProt, among others. New mappings can be added programmatically.
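To make the idea of virtual integration concrete, here is a toy sketch in plain Python. It does not use KIF and does not reflect how KIF implements mappings; the names `SOURCE_TRIPLES`, `TO_SOURCE`, and `virtual_filter` are hypothetical. It only shows the general pattern: the filter is written in a Wikidata-style vocabulary, translated into the source vocabulary at query time, and the matches come back in one uniform statement-like shape.

```python
# Toy source data, using the DBpedia-style names from the example above.
SOURCE_TRIPLES = [
    ('dbr:Alan_Turing', 'dbo:doctoralAdvisor', 'dbr:Alonzo_Church'),
]

# Hypothetical user-provided mapping: Wikidata-style term -> source term.
TO_SOURCE = {
    'wd.Alan_Turing': 'dbr:Alan_Turing',
    'wd.doctoral_advisor': 'dbo:doctoralAdvisor',
}


def virtual_filter(subject=None, property=None):
    """Answer a Wikidata-style pattern against the source data."""
    # Translate the pattern at query time; None acts as a wildcard.
    src_subject = TO_SOURCE[subject] if subject is not None else None
    src_property = TO_SOURCE[property] if property is not None else None
    for s, p, v in SOURCE_TRIPLES:
        if src_subject in (None, s) and src_property in (None, p):
            # Emit each match in a uniform, statement-like shape.
            yield {'subject': s, 'property': p, 'value': v}


matches = list(virtual_filter(subject='wd.Alan_Turing',
                              property='wd.doctoral_advisor'))
print(matches)
```

Nothing is copied or converted ahead of time: the source data stays as it is, and only the pattern and the results pass through the mapping.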
Highlights
- KIF allows one to query knowledge sources as if they were Wikidata.
- KIF queries are written as simple, high-level filters using entities of the Wikidata data model, such as items, properties, quantities, snaks, statements, etc.
- KIF can be used to query Wikidata itself or other knowledge sources, provided proper mappings are given.
- KIF can run queries over local RDF data using RDFLib, Apache Jena, QLever, or RDFox.
- KIF has full support for Python's asyncio. The KIF async API can be used to run queries asynchronously, without blocking while waiting for their results.
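As a rough illustration of what non-blocking queries buy you, the sketch below runs two simulated queries concurrently with the standard asyncio library. It deliberately does not call KIF: `fake_query` is a hypothetical stand-in for an asynchronous query, and the actual names of KIF's async methods should be taken from the KIF documentation.

```python
import asyncio


async def fake_query(source, delay):
    """Hypothetical stand-in for an asynchronous query (not KIF's API)."""
    await asyncio.sleep(delay)  # simulates waiting on a remote endpoint
    return f'result from {source}'


async def main():
    # Both coroutines wait concurrently, so the total time is roughly the
    # longest delay rather than the sum of the delays.
    return await asyncio.gather(
        fake_query('wikidata', 0.02),
        fake_query('dbpedia', 0.01),
    )


results = asyncio.run(main())
print(results)
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which one finishes first.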
Installation
To install KIF, use:
$ pip install kif-lib
To include KIF CLI, use:
$ pip install kif-lib[cli]
To include all extras, use:
Documentation
KIF documentation is available at https://ibm.github.io/kif/.
For a primer on KIF, see the tutorial.
Dependencies
Required:
- httpx - HTTP support.
- lark - Parsing.
- more_itertools - Extra itertools.
- networkx - Graph algorithms.
- rdflib - RDF support.
- typing-extensions - Typing backports.
KIF CLI (optional):
- click - Option parsing. (Optional, with kif-lib[cli])
- rich - Rich terminal support. (Optional, with kif-lib[cli])
Extra (optional):
- graphviz - Graph drawing. (Optional, with kif-lib[extra])
- jpype1 - Java support. (Optional, with kif-lib[extra])
- pandas - CSV/DataFrame support. (Optional, with kif-lib[extra])
- psutil - Process information. (Optional, with kif-lib[extra])
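To see which of the optional dependencies are present in the current environment, a small check with the standard `importlib.metadata` module can help. This is a sketch, not a KIF utility: `OPTIONAL`, `installed_version`, and `report_optional` are hypothetical names, and the grouping simply mirrors the lists above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed above, grouped by the extra that pulls them in.
OPTIONAL = {
    'cli': ['click', 'rich'],
    'extra': ['graphviz', 'jpype1', 'pandas', 'psutil'],
}


def installed_version(dist_name):
    """Return the installed version of `dist_name`, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report_optional():
    """Map each extra to a {distribution: version-or-None} dictionary."""
    return {
        extra: {name: installed_version(name) for name in names}
        for extra, names in OPTIONAL.items()
    }


for extra, dists in report_optional().items():
    for name, ver in dists.items():
        print(f'kif-lib[{extra}] {name}: {ver or "not installed"}')
```

The check uses distribution names (for example `jpype1`), which may differ from the module names used in `import` statements.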
Citation
Guilherme Lima, João M. B. Rodrigues, Marcelo Machado, Elton Soares, Sandro R. Fiorini, Raphael Thiago, Leonardo G. Azevedo, Viviane T. da Silva, Renato Cerqueira. 2024. "KIF: A Wikidata-Based Framework for Integrating Heterogeneous Knowledge Sources". arXiv:2403.10304.