---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.13.1
kernelspec:
  display_name: Python 3 (ipykernel)
  language: python
  name: python3
---

# Quickstart

KIF is a Wikidata-based framework for integrating knowledge sources.

This quickstart guide presents the basic API of KIF.

## Hello world!

We start by importing the `kif_lib` namespace:

```{code-cell}
from kif_lib import *
```

We'll also need the Wikidata vocabulary module `wd`:

```{code-cell}
from kif_lib.vocabulary import wd
```

Let us now create a SPARQL store pointing to the official Wikidata query
service:

```{code-cell}
kb = Store('sparql', 'https://query.wikidata.org/sparql')
```

A KIF store is an interface to a knowledge source.  It allows us to view the
source as a set of Wikidata-like statements.

The store `kb` we just created is an interface to Wikidata itself. We can
use it, for example, to fetch from Wikidata three statements about Brazil:

```{code-cell}
it = kb.filter(subject=wd.Brazil, limit=3)
for stmt in it:
    display(stmt)
```

## Filters

The `kb.filter(...)` call searches for statements in `kb` matching the
restrictions `...`.

The result of a filter call is a (lazy) iterator `it` of statements:

```{code-cell}
it = kb.filter(subject=wd.Brazil)
```

We can advance `it` to obtain statements:

```{code-cell}
next(it)
```

If no `limit` argument is given to `kb.filter()`, the returned iterator
contains all matching statements.
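
Because the iterator is lazy, results are produced only as it is advanced. The sketch below illustrates this behavior with a plain Python generator standing in for `kb.filter()`. The name `fake_filter` and its string results are hypothetical and not part of KIF. It also shows how the standard `itertools.islice` can take a fixed number of items from an iterator on the client side.

```python
from itertools import islice

produced = []  # records which items the generator has actually produced


def fake_filter():
    """Hypothetical stand-in for kb.filter(): yields results one at a time."""
    n = 0
    while True:
        n += 1
        item = f'statement {n}'
        produced.append(item)
        yield item


it = fake_filter()
nothing_yet = list(produced)         # empty: nothing runs until `it` is advanced
first = next(it)                     # produces exactly one item
next_three = list(islice(it, 3))     # takes the next three items only
```

After this runs, only four items have been produced even though the generator is unbounded. Passing `limit` to `kb.filter()`, as in the "Hello world!" example, remains the direct way to bound the result.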

## Basic filters

We can filter statements by any combination of *subject*, *property*, and
*value*.

For example:

```{code-cell}
###
# match any statement
###
next(kb.filter())
```

```{code-cell}
###
# match statements with subject "Brazil" and property "official website"
###
next(kb.filter(subject=wd.Brazil, property=wd.official_website))
```

```{code-cell}
###
# match statements with property "official website" and value "https://www.ibm.com/"
###
next(kb.filter(property=wd.official_website, value=IRI('https://www.ibm.com/')))
```

```{code-cell}
###
# match statements with value "78.046950192 dalton"
###
next(kb.filter(value=Quantity('78.046950192', unit=wd.dalton)))
```

We can also match statements having *some* (unknown) value:

```{code-cell}
next(kb.filter(snak=wd.date_of_birth.some_value()))
```

Or *no* value:

```{code-cell}
next(kb.filter(snak=wd.date_of_death.no_value()))
```

## Fingerprints (indirect ids)

So far, we have been using the symbolic aliases defined in the `wd` module to
specify entities in filters:

```{code-cell}
display(wd.Brazil)
display(wd.continent)
```

Alternatively, we can use their numeric Wikidata ids:

```{code-cell}
###
# match statements with subject Q155 (Brazil) and property P30 (continent)
###
next(kb.filter(subject=wd.Q(155), property=wd.P(30)))
```

Sometimes, however, ids are not enough.  We might need to specify an entity
indirectly by giving not its id but a property it satisfies.

In cases like this, we can use a *fingerprint*:

```{code-cell}
###
# match statements whose subject "is a dog" and value "is a human"
###
next(kb.filter(subject=wd.instance_of(wd.dog), value=wd.instance_of(wd.human)))
```

Properties themselves can also be specified using fingerprints:

```{code-cell}
###
# match statements whose property is "equivalent to Schema.org's 'weight'"
###
next(kb.filter(property=wd.equivalent_property('https://schema.org/weight')))
```

The `-` (unary minus) operator can be used to invert the direction of the
property used in the fingerprint:

```{code-cell}
###
# match statements whose subject is "the continent of Brazil"
###
next(kb.filter(subject=-(wd.continent(wd.Brazil))))
```

## And-ing and or-ing fingerprints

Entity ids and fingerprints can be combined using the operators `&` (and) and
`|` (or).

For example:

```{code-cell}
###
# match three statements such that:
# - subject is "Brazil" or "Argentina"
# - property is "continent" or "highest point"
###
it = kb.filter(
        subject=wd.Brazil | wd.Argentina,
        property=wd.continent | wd.highest_point,
        limit=3)
for stmt in it:
    display(stmt)
```

```{code-cell}
###
# match three statements such that:
# - subject "has continent South America" and "official language is Portuguese"
# - value "is a river" or "is a mountain"
###
it = kb.filter(
        subject=wd.continent(wd.South_America) & wd.official_language(wd.Portuguese),
        value=wd.instance_of(wd.river) | wd.instance_of(wd.mountain),
        limit=3)
for stmt in it:
    display(stmt)
```

```{code-cell}
###
# match three statements such that:
# - subject "is a female" and ("was born in NYC" or "was born in Rio")
# - property is "field of work" or "is equivalent to Schema.org's 'hasOccupation'"
###
it = kb.filter(
        subject=wd.sex_or_gender(wd.female)
        & (wd.place_of_birth(wd.New_York_City) | wd.place_of_birth(wd.Rio_de_Janeiro)),
        property=wd.field_of_work
        | wd.equivalent_property(IRI('https://schema.org/hasOccupation')),
        limit=3)
for stmt in it:
    display(stmt)
```
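
The and/or semantics of `&` and `|` can be pictured with a small toy model. The `Match` class and the `has` helper below are hypothetical and are not how KIF implements fingerprints. They only illustrate, using Python's operator overloading over made-up in-memory dictionaries, how two restrictions combine into one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Match:
    """Hypothetical predicate wrapper (not part of KIF)."""
    test: Callable[[dict], bool]

    def __and__(self, other: 'Match') -> 'Match':
        # Both restrictions must hold.
        return Match(lambda e: self.test(e) and other.test(e))

    def __or__(self, other: 'Match') -> 'Match':
        # At least one restriction must hold.
        return Match(lambda e: self.test(e) or other.test(e))


def has(key: str, value: str) -> Match:
    """Build a restriction: the entity's `key` equals `value`."""
    return Match(lambda e: e.get(key) == value)


# Made-up toy entities, for illustration only.
entities = [
    {'name': 'a', 'gender': 'female', 'born': 'New York City'},
    {'name': 'b', 'gender': 'female', 'born': 'Paris'},
    {'name': 'c', 'gender': 'male', 'born': 'Rio de Janeiro'},
    {'name': 'd', 'gender': 'female', 'born': 'Rio de Janeiro'},
]

restriction = has('gender', 'female') & (
    has('born', 'New York City') | has('born', 'Rio de Janeiro'))
matched = [e['name'] for e in entities if restriction.test(e)]
```

In Python, `&` binds more tightly than `|`, so the parentheses around the or-group are required whenever it is and-ed with another restriction.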

## Count and contains

A variant of the filter call is `kb.count(...)`, which, instead of returning
statements, returns the number of statements matching the restrictions `...`:

```{code-cell}
kb.count(subject=wd.Brazil, property=wd.population | wd.official_language)
```

The `kb.contains()` call tests whether a given statement occurs in `kb`.

```{code-cell}
stmt1 = wd.official_language(wd.Brazil, wd.Portuguese)
kb.contains(stmt1)
```

```{code-cell}
stmt2 = wd.official_language(wd.Brazil, wd.Spanish)
kb.contains(stmt2)
```
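
Conceptually, counting and membership testing relate to filtering as sketched below. The sketch uses a plain list of tuples as a hypothetical in-memory stand-in for a store. The names `toy_filter`, `toy_count`, and `toy_contains` are illustrative only; the code does not use KIF and makes no claim about how KIF evaluates these calls.

```python
# Hypothetical in-memory stand-in for a store: (subject, property, value) triples.
statements = [
    ('Brazil', 'official language', 'Portuguese'),
    ('Brazil', 'continent', 'South America'),
    ('Argentina', 'official language', 'Spanish'),
]


def toy_filter(subject=None, property=None, value=None):
    """Lazily yield the triples that satisfy every given restriction."""
    for s, p, v in statements:
        if subject is not None and s != subject:
            continue
        if property is not None and p != property:
            continue
        if value is not None and v != value:
            continue
        yield (s, p, v)


def toy_count(**restrictions):
    """Count the matches without keeping them in memory."""
    return sum(1 for _ in toy_filter(**restrictions))


def toy_contains(stmt):
    """Test whether one fully specified triple is present."""
    return any(True for _ in toy_filter(*stmt))


n_brazil = toy_count(subject='Brazil')
has_pt = toy_contains(('Brazil', 'official language', 'Portuguese'))
has_es = toy_contains(('Brazil', 'official language', 'Spanish'))
```

In this toy model, counting is a filter whose results are tallied rather than returned, and containment is a filter in which subject, property, and value are all fixed.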

## Final remarks

This concludes the quickstart guide.

There are many other calls in the Store API of KIF.  For more information,
see the [API Reference](<reference/index>).