Scenario
The logistics team want to track the hourly number of events captured by door sensors in each building.
Events from door sensors can be slow to reach the Kafka topic, so the team need to handle a Kafka topic with events that are out of sequence.
Before you begin
The instructions in this tutorial use the Tutorial environment, which includes a selection of topics each with a live stream of events, created to allow you to explore features in IBM Event Automation. Following the setup instructions to deploy the demo environment gives you a complete instance of IBM Event Automation that you can use to follow this tutorial for yourself.
Versions
This tutorial uses the following versions of Event Automation capabilities. Screenshots may differ from the current interface if you are using a newer version.
- Event Streams 11.5.0
- Event Endpoint Management 11.3.0
- Event Processing 1.2.0
Instructions
Step 1 : Discover the topic to use
For this scenario, you need to find information about the source of door badge events.
- Go to the Event Endpoint Management catalog.
  If you need a reminder of how to access the Event Endpoint Management catalog, you can review Accessing the tutorial environment.
  If there are no topics in the catalog, you may need to complete the tutorial setup step to populate the catalog.
- The Door badge events topic contains door badge events. Note the warning in the catalog:
  "Note that door events can take up to 3 minutes to reach the Kafka topic, so the badge time value in the message payload should be treated as the canonical timestamp for the event.
  This delay can be inconsistent, so messages on the topic are often out of sequence as a result."
- If the topic owner hadn’t provided this warning, you would have needed to observe messages on the Kafka topic itself to identify this. Confirm this by observing messages on the DOOR.BADGEIN topic in the Event Streams topic viewer.
  If you need a reminder about how to access Event Streams, you can review Accessing the tutorial environment.
  Look for examples of messages where, even on the same partition, an older badge event (according to the badgetime property) appears on the topic after a more recent badge event.
  Verify that the timestamp on the Kafka message is not a reliable indicator of when the event occurred, and is frequently up to a few minutes after the actual event.
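The check described in this step can also be sketched in code. The following is a minimal illustration, assuming each message has already been parsed into a dictionary and that badgetime is an ISO 8601 string; the sample messages (including the DE-1-2 door ID) are invented for illustration and are not taken from the DOOR.BADGEIN topic.

```python
from datetime import datetime


def find_out_of_sequence(messages):
    """Return (position, badgetime) for every message whose badgetime is
    earlier than the latest badgetime already seen in topic order."""
    late = []
    latest_seen = None
    for position, message in enumerate(messages):
        badge_time = datetime.fromisoformat(message["badgetime"])
        if latest_seen is not None and badge_time < latest_seen:
            late.append((position, message["badgetime"]))
        else:
            latest_seen = badge_time
    return late


# Hypothetical messages, listed in the order they appear on one partition.
sample = [
    {"door": "H-0-36", "badgetime": "2024-01-01T09:00:10"},
    {"door": "H-0-36", "badgetime": "2024-01-01T09:02:30"},
    {"door": "DE-1-2", "badgetime": "2024-01-01T09:00:45"},  # arrived late
    {"door": "H-0-36", "badgetime": "2024-01-01T09:03:00"},
]

out_of_sequence = find_out_of_sequence(sample)
print(out_of_sequence)  # [(2, '2024-01-01T09:00:45')]
```

Any non-empty result means that the order of messages on the topic cannot be used as the order in which the badge events occurred.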
Step 2 : Create the source of events
- Go to the Event Processing home page.
  If you need a reminder about how to access the Event Processing home page, you can review Accessing the tutorial environment.
- Create a flow, and give it a name and description to explain that you will use it to track hourly badge events.
- Update the Event source node.
- Add an event source.
- In the Cluster connection pane, fill in the server address by using the value copied from the Event Endpoint Management catalog.
  Click Next.
- In the Access credentials pane, enter the username and password that you created in Event Endpoint Management by using the Generate access credentials button.
- Select the DOOR.BADGEIN topic to process events from, and then click Next.
- The format JSON is auto-selected in the Message format drop-down and the sample message is auto-populated in the JSON sample message field. Click Next.
- In the Key and headers pane, click Next.
  Note: The key and headers are displayed automatically if they are available in the selected topic message.
- In the Event details pane, enter door events in the Node name field.
- Verify that the type of the badgetime property has been automatically detected as Timestamp.
- Configure the event source to use the badgetime property as the source of the event time, and to tolerate lateness of up to 3 minutes.
- Click Configure to finalize the event source.
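To build intuition for what a lateness tolerance of 3 minutes means, the following sketch models the general event-time idea: track the latest badgetime seen so far, and treat an event as on time if it is no more than 3 minutes behind it. This is a simplified model under stated assumptions (ISO 8601 badgetime strings, invented sample events), not a description of how Event Processing is implemented internally.

```python
from datetime import datetime, timedelta

ALLOWED_LATENESS = timedelta(minutes=3)


def classify_by_lateness(events, allowed_lateness=ALLOWED_LATENESS):
    """Split events into (on_time, too_late) using a simple watermark:
    the latest event time seen so far minus the allowed lateness."""
    on_time, too_late = [], []
    max_event_time = None
    for event in events:
        event_time = datetime.fromisoformat(event["badgetime"])
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time
        watermark = max_event_time - allowed_lateness
        if event_time >= watermark:
            on_time.append(event)
        else:
            too_late.append(event)
    return on_time, too_late


# Hypothetical events in arrival order.
events = [
    {"door": "H-0-36", "badgetime": "2024-01-01T09:00:00"},
    {"door": "H-0-36", "badgetime": "2024-01-01T09:05:00"},
    {"door": "H-0-36", "badgetime": "2024-01-01T09:03:00"},  # 2 minutes behind
    {"door": "H-0-36", "badgetime": "2024-01-01T09:01:00"},  # 4 minutes behind
]

on_time, too_late = classify_by_lateness(events)
print(len(on_time), len(too_late))  # 3 1
```

A larger tolerance accepts more out-of-sequence events at the cost of waiting longer before results for a time window can be considered complete.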
Step 3 : Extract information to aggregate on
- Add a Transform node to the flow.
  Create a transform node by dragging one onto the canvas. You can find this in the Processors section of the left panel.
- Configure the Transform node to extract the building ID from the door ID.
  Suggested function expression:
  REGEXP_EXTRACT(`door`, '([A-Z]+)\-([0-9]+)\-([0-9]+)', 1)
  The door ID is made up of:
  <building id> - <floor number> - <door number>
  For example:
  H-0-36
  The regular expression captures the letters before the first hyphen as its first group, and the final argument (1) selects that group as the result.
- Click Configure to finalize the transform.
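If you want to try the regular expression outside of Event Processing, the following sketch applies the same pattern with Python's re module. It assumes search semantics (the pattern may match anywhere in the string) and returns None when nothing matches; the helper name extract_building and the DE-1-2 sample are hypothetical.

```python
import re

# Same pattern as the suggested function expression.
DOOR_ID_PATTERN = re.compile(r"([A-Z]+)\-([0-9]+)\-([0-9]+)")


def extract_building(door_id):
    """Return the first capture group (the building ID), or None if no match."""
    match = DOOR_ID_PATTERN.search(door_id)
    return match.group(1) if match else None


print(extract_building("H-0-36"))   # H
print(extract_building("DE-1-2"))   # DE
print(extract_building("unknown"))  # None
```

Groups 2 and 3 of the same pattern hold the floor number and door number, so the same expression could be reused with a different index to extract those values.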
Step 4 : Count the occurrences
- Add an Aggregate node to the flow.
  Create an aggregate node by dragging one onto the canvas. You can find this in the Processors section of the left panel.
- Define a time window of 1 hour to group badge events by.
- Count the number of door badge events (by counting the unique record ID), grouped by the building.
- Rename the output properties to make the results easier to understand.
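The aggregation configured above can be pictured as a one-hour tumbling window keyed by building. The following is a minimal sketch, assuming events are dictionaries with door and badgetime properties, that badgetime is an ISO 8601 string, and that each event counts once (it does not deduplicate by record ID); the sample events are invented.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(events):
    """Count events per (window start, building) using one-hour windows
    aligned to the start of each hour of the event time."""
    counts = Counter()
    for event in events:
        event_time = datetime.fromisoformat(event["badgetime"])
        window_start = event_time.replace(minute=0, second=0, microsecond=0)
        building = event["door"].split("-")[0]  # text before the first hyphen
        counts[(window_start.isoformat(), building)] += 1
    return counts


# Hypothetical events; arrival order does not affect the result.
events = [
    {"door": "H-0-36", "badgetime": "2024-01-01T09:10:00"},
    {"door": "DE-0-1", "badgetime": "2024-01-01T09:30:00"},
    {"door": "H-0-36", "badgetime": "2024-01-01T10:00:00"},
    {"door": "H-1-2", "badgetime": "2024-01-01T09:59:59"},
]

counts = hourly_counts(events)
for (window_start, building), total in sorted(counts.items()):
    print(window_start, building, total)
```

Because the window is chosen from badgetime rather than from arrival order, the late event for 09:59:59 is still counted in the 09:00 window.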
Step 5 : Test the flow
The final step is to run your completed flow.
- Use the Run menu, and select Include historical to run your flow on the history of door badge events available on this topic.
- When you have finished reviewing the results, you can stop this flow.
Recap
It is not unusual to need to process events on a Kafka topic that are out of sequence and with unreliable timestamps in the message header.
Event Processing makes it easy to perform time-based analysis of such events, by allowing you to specify what value to use as a reliable source of time and describe how long it should wait for out-of-sequence events.