{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quick Start - Point Query\n", "## Your First Query\n", "The most basic Geospatial APIs query is the *point query*. To get you started with the Geospatial APIs SDK, we use it to run one:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2023-08-15 09:11:40 - paw - INFO - The client authentication method is assumed to be OAuth2.\n", "2023-08-15 09:11:40 - paw - INFO - Legacy Environment is False\n", "2023-08-15 09:11:43 - paw - INFO - Authentication success.\n", "2023-08-15 09:11:43 - paw - INFO - HOST: https://api.ibm.com/geospatial/run/na/core/v3\n", "2023-08-15 09:11:43 - paw - INFO - TASK: submit STARTING.\n", "2023-08-15 09:11:58 - paw - INFO - TASK: submit COMPLETED.\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>layer_id</th><th>layer_name</th><th>dataset</th><th>timestamp</th><th>longitude</th><th>latitude</th><th>value</th><th>datetime</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1672531200000</td><td>-1.483759</td><td>50.921633</td><td>0.0464</td><td>2023-01-01</td></tr>\n",
"    <tr><th>1</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1672790400000</td><td>-1.483759</td><td>50.921633</td><td>0.03069999999999995</td><td>2023-01-04</td></tr>\n",
"    <tr><th>2</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1672963200000</td><td>-1.483759</td><td>50.921633</td><td>0.16830000000000012</td><td>2023-01-06</td></tr>\n",
"    <tr><th>3</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1673222400000</td><td>-1.483759</td><td>50.921633</td><td>0.0746</td><td>2023-01-09</td></tr>\n",
"    <tr><th>4</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1673395200000</td><td>-1.483759</td><td>50.921633</td><td>0.34330000000000016</td><td>2023-01-11</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>65</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1687046400000</td><td>-1.483759</td><td>50.921633</td><td>0.02200000000000002</td><td>2023-06-18</td></tr>\n",
"    <tr><th>66</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1687219200000</td><td>-1.483759</td><td>50.921633</td><td>0.04310000000000014</td><td>2023-06-20</td></tr>\n",
"    <tr><th>67</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1687651200000</td><td>-1.483759</td><td>50.921633</td><td>0.5886</td><td>2023-06-25</td></tr>\n",
"    <tr><th>68</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1687910400000</td><td>-1.483759</td><td>50.921633</td><td>0.4386000000000001</td><td>2023-06-28</td></tr>\n",
"    <tr><th>69</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>1688083200000</td><td>-1.483759</td><td>50.921633</td><td>-0.01739999999999997</td><td>2023-06-30</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>70 rows × 8 columns</p>\n",
"</div>
" ], "text/plain": [ " layer_id layer_name \\\n", "0 49464 Normalized difference vegetation index \n", "1 49464 Normalized difference vegetation index \n", "2 49464 Normalized difference vegetation index \n", "3 49464 Normalized difference vegetation index \n", "4 49464 Normalized difference vegetation index \n", ".. ... ... \n", "65 49464 Normalized difference vegetation index \n", "66 49464 Normalized difference vegetation index \n", "67 49464 Normalized difference vegetation index \n", "68 49464 Normalized difference vegetation index \n", "69 49464 Normalized difference vegetation index \n", "\n", " dataset timestamp longitude latitude \\\n", "0 High res imagery (ESA Sentinel 2) 1672531200000 -1.483759 50.921633 \n", "1 High res imagery (ESA Sentinel 2) 1672790400000 -1.483759 50.921633 \n", "2 High res imagery (ESA Sentinel 2) 1672963200000 -1.483759 50.921633 \n", "3 High res imagery (ESA Sentinel 2) 1673222400000 -1.483759 50.921633 \n", "4 High res imagery (ESA Sentinel 2) 1673395200000 -1.483759 50.921633 \n", ".. ... ... ... ... \n", "65 High res imagery (ESA Sentinel 2) 1687046400000 -1.483759 50.921633 \n", "66 High res imagery (ESA Sentinel 2) 1687219200000 -1.483759 50.921633 \n", "67 High res imagery (ESA Sentinel 2) 1687651200000 -1.483759 50.921633 \n", "68 High res imagery (ESA Sentinel 2) 1687910400000 -1.483759 50.921633 \n", "69 High res imagery (ESA Sentinel 2) 1688083200000 -1.483759 50.921633 \n", "\n", " value datetime \n", "0 0.0464 2023-01-01 \n", "1 0.03069999999999995 2023-01-04 \n", "2 0.16830000000000012 2023-01-06 \n", "3 0.0746 2023-01-09 \n", "4 0.34330000000000016 2023-01-11 \n", ".. ... ... 
\n", "65 0.02200000000000002 2023-06-18 \n", "66 0.04310000000000014 2023-06-20 \n", "67 0.5886 2023-06-25 \n", "68 0.4386000000000001 2023-06-28 \n", "69 -0.01739999999999997 2023-06-30 \n", "\n", "[70 rows x 8 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import os\n", "import pandas as pd\n", "import ibmpairs.authentication as authentication\n", "import ibmpairs.client as client\n", "import ibmpairs.query as query\n", "\n", "# It is best practice not to include secrets in source code so \n", "# we read an api key, tenant id and org id from operating system \n", "# environment variables.\n", "EI_API_KEY = os.environ.get('EI_API_KEY')\n", "EI_TENANT_ID = os.environ.get('EI_TENANT_ID')\n", "EI_ORG_ID = os.environ.get('EI_ORG_ID')\n", "\n", "# Authenticate and get a client object.\n", "ei_client = client.get_client(api_key = EI_API_KEY,\n", " tenant_id = EI_TENANT_ID,\n", " org_id = EI_ORG_ID)\n", "\n", "# The Geospatial APIs query expressed as a JSON structure\n", "query_json = {\n", " \"layers\" : [\n", " {\"type\" : \"raster\", \"id\" : \"49464\"}\n", " ],\n", " \"spatial\" : {\n", " \"type\" : \"point\",\n", " \"coordinates\" : [\"50.92163290389907\", \"-1.4837586747526244\"]\n", " },\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2023-01-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", " ]}\n", " }\n", "\n", "# Submit the query\n", "query_result = query.submit(query_json)\n", "\n", "# Convert the results to a dataframe\n", "point_df = query_result.point_data_as_dataframe()\n", "# Convert the timestamp to a human readable format\n", "point_df['datetime'] = pd.to_datetime(point_df['timestamp'] * 1e6, errors = 'coerce')\n", "point_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above query requests NDVI values from Geospatial APIs layer 49464, the *High res imagery (ESA Sentinel 2)* dataset, for a location somewhere in Southampton, UK -- the coordinates 50.92/-1.48 
(latitude/longitude). \n", "\n", "Geospatial APIs returns 70 rows of data, which are now stored in the ``point_df`` dataframe.\n", "\n", "
\n", "Point queries such as the above are unique in that they return their result directly from the submit call. This makes them particularly suited to testing as well as exploration and experimentation. If you are unsure about the data you are interested in (its spatial coverage, frequency, or temporal extent), start with a point query. That said, note that some advanced features, most notably user-defined functions, are not available for point queries.\n", "
\n", "\n", "
\n", "Time intervals such as:\n", "\n", "```python\n", "{\"start\" : \"2023-01-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", "```\n", "\n", "are defined as follows: The start time is included, the end time is included. In other words, the interval is closed at both ends: ``2023-01-01T00:00:00Z <= t <= 2023-06-30T00:00:00Z``.\n", "
\n", "\n", "\n", "## Understanding the Example\n", "We start with various import statements:\n", "```python\n", "import os # used to read environment variables\n", "import ibmpairs.authentication as authentication # deals with Geospatial APIs authentication\n", "import ibmpairs.client as client # represents an authenticated HTTP client\n", "import ibmpairs.query as query # manages the submission of queries and retrieval of results\n", "```\n", "After the imports we create a client object and use an API_KEY, TENANT_ID (or CLIENT_ID) and an ORG_ID to create an authenticated HTTP client. \n", "```python\n", "ei_client = client.get_client(api_key = EI_API_KEY,\n", " tenant_id = EI_TENANT_ID,\n", " org_id = EI_ORG_ID)\n", "```\n", "This is a required step before you start doing queries but you only need to do it once. \n", "\n", "The most interesting part of the above example is the definition of the actual query JSON that we send to Geospatial APIs. \n", "```python\n", "query_json = {\n", " \"layers\" : [ \n", " {\"type\" : \"raster\", \"id\" : \"49464\"} # What - the data layer\n", " ], \n", " \"spatial\" : {\"type\" : \"point\", \"coordinates\" : [\"50.92163290389907\", \"-1.4837586747526244\"]}, # Where - the spatial location\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2023-01-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"} # When - the temporal range\n", " ]}\n", " }\n", "```\n", "In general, the ``query_json`` object answers the following questions: *what?*, *where?* and *when?*. What we are requesting is specified by the value associated to ``layers``. Here, we are requesting a single raster layer with ID 49464. This is the *NDVI* layer in the *High res imagery (ESA Sentinel 2)* dataset. Next we define the spatial coverage of the query with the ``spatial`` key. In the above, we only request data for a single point in the format ``[latitude, longitude]``. Note that longitudes in Geospatial APIs range from -180 to +180 degrees. 
Using values larger than +180 will lead to error messages. Similarly, latitudes range of course from -90 to +90 degrees. Finally we define a single time range via the ``temporal`` field.\n", "\n", "Subsequently we submit the query to Geospatial APIs. As this is a point query, the result is returned directly from the submit method call:\n", "\n", "```python\n", "query_result = query.submit(query_json)\n", "```\n", "\n", "Note that we don't explicitly need to tell the query object to use the authenticated client we created previously as it finds it automatically.\n", "\n", "Geospatial APIs returns the result of a point query as JSON data. We use a helper method to turn this data into a local data frame:\n", "\n", "```python\n", "point_df = query_result.point_data_as_dataframe()\n", "```\n", "From this point on all the data is in a local data frame and we can operate on it as we would any other data frame." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Not So Minimal Working Example\n", "\n", "The largest part of this documentation will be concerned with extensions to the ``query_json`` object. Once again let's just jump into a working example:\n", "\n", "
\n", "The layer IDs used here can be found using the catalogue sub-module.\n", "
\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2023-08-15 09:11:58 - paw - INFO - TASK: submit STARTING.\n", "2023-08-15 09:13:53 - paw - INFO - TASK: submit COMPLETED.\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>layer_id</th><th>layer_name</th><th>dataset</th><th>longitude</th><th>latitude</th><th>value</th><th>aggregation</th><th>alias</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>49464</td><td>Normalized difference vegetation index</td><td>High res imagery (ESA Sentinel 2)</td><td>-122.4194</td><td>37.7749</td><td>0.08440000000000003</td><td>Max</td><td>49464.1672531200000&gt;1688083200000</td></tr>\n",
"    <tr><th>1</th><td>91</td><td>Daily precipitation</td><td>Daily US weather (PRISM)</td><td>-122.4194</td><td>37.7749</td><td>4.467446735076344</td><td>Mean</td><td>91.1669852800000&gt;1688083200000</td></tr>\n",
"    <tr><th>2</th><td>91</td><td>Daily precipitation</td><td>Daily US weather (PRISM)</td><td>-74.0060</td><td>40.7128</td><td>3.035630824596353</td><td>Mean</td><td>91.1669852800000&gt;1688083200000</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " layer_id layer_name \\\n", "0 49464 Normalized difference vegetation index \n", "1 91 Daily precipitation \n", "2 91 Daily precipitation \n", "\n", " dataset longitude latitude \\\n", "0 High res imagery (ESA Sentinel 2) -122.4194 37.7749 \n", "1 Daily US weather (PRISM) -122.4194 37.7749 \n", "2 Daily US weather (PRISM) -74.0060 40.7128 \n", "\n", " value aggregation alias \n", "0 0.08440000000000003 Max 49464.1672531200000>1688083200000 \n", "1 4.467446735076344 Mean 91.1669852800000>1688083200000 \n", "2 3.035630824596353 Mean 91.1669852800000>1688083200000 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query_json = {\n", " \"layers\" : [\n", " {\n", " \"type\" : \"raster\", \"id\" : \"91\",\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2022-12-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", " ]},\n", " \"aggregation\" : \"Mean\"\n", " },\n", " {\n", " \"type\" : \"raster\", \"id\" : \"49464\",\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2023-01-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", " ]},\n", " \"aggregation\" : \"Max\"\n", " }\n", " ],\n", " \"spatial\" : {\"type\" : \"point\", \"coordinates\" : [\"40.7128\", \"-74.006\", \"37.7749\", \"-122.4194\"]},\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2023-01-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", " ]}\n", " }\n", "\n", "query_result = query.submit(query_json)\n", "\n", "point_df = query_result.point_data_as_dataframe()\n", "\n", "point_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is quite a lot going on in the above example. 
To begin, we are requesting data from two different layers: \n", "\n", "91 - Daily precipitation from the *Daily US weather (PRISM)* dataset\n", "```python\n", " {\n", " \"type\" : \"raster\", \"id\" : \"91\",\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2022-12-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", " ]},\n", " \"aggregation\" : \"Mean\"\n", " }\n", "```\n", "49464 - NDVI from the *High res imagery (ESA Sentinel 2)* dataset\n", "```python\n", " {\n", " \"type\" : \"raster\", \"id\" : \"49464\",\n", " \"temporal\" : {\"intervals\" : [\n", " {\"start\" : \"2023-01-01T00:00:00Z\", \"end\" : \"2023-06-30T00:00:00Z\"}\n", " ]},\n", " \"aggregation\" : \"Max\"\n", " }\n", "```\n", "Each layer uses its own temporal range, and both are aggregated over their respective ranges: ``Mean`` for 91 and ``Max`` for 49464. A layer can appear multiple times, for example once with ``Mean`` aggregation, once with ``Sum`` aggregation and once without aggregation; the results then reflect all three requests. The temporal aggregation functions currently supported are ``Mean``, ``Max``, ``Min`` and ``Sum``.\n", "\n", "The ``spatial`` specification describes two points using an array:\n", "```python\n", " \"spatial\" : {\"type\" : \"point\", \"coordinates\" : [\"40.7128\", \"-74.006\", \"37.7749\", \"-122.4194\"]},\n", "```\n", "The format is ``[lat-point1, long-point1, lat-point2, long-point2]``. You will see in the results that data is returned for each layer, for each timestamp (or once for an aggregation) and for each point. Note that in the output above, layer 49464 has a row only for the second point (37.7749, -122.4194).\n", "\n", "
\n", "The ``temporal`` section at the end of the above query -- outside the ``layers`` block -- gives a *default* time range that is used if an element of the ``layers`` block comes without a time range of its own. In the above example it is redundant. However, the current implementation requires its presence even if the information is not used.\n", "
\n", " \n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 2 }