You can integrate external systems with Event Streams by using the Kafka Connect framework and connectors.
What is Kafka Connect?
The Kafka Connect framework is the standard way to connect Apache Kafka with other systems.
Use Kafka Connect to reliably move large amounts of data between your Kafka cluster and external systems. For example, it can ingest data from sources such as databases and make the data available for stream processing.
Source and sink connectors
Kafka Connect uses connectors for moving data into and out of Kafka. Source connectors import data from external systems into Kafka topics, and sink connectors export data from Kafka topics into external systems. A wide range of connectors exists, some of which are commercially supported. In addition, you can write your own connectors.
A number of source and sink connectors are available to use with Event Streams. See the connector catalog section for more information.
Workers
Kafka Connect connectors run inside a Java process called a worker. Kafka Connect can run in either stand-alone or distributed mode. Stand-alone mode is intended for testing and temporary connections between systems, and all work is performed in a single process. Distributed mode is more appropriate for production use, as it benefits from additional features such as automatic balancing of work, dynamic scaling up or down, and fault tolerance.
When you run Kafka Connect with a stand-alone worker, there are two configuration files:
- The worker configuration file contains the properties that the worker needs to connect to Kafka, such as the addresses of the Kafka brokers.
- The connector configuration file contains the properties that the connector needs, including the details for connecting to the external system (for example, IBM MQ).
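As a rough illustration of how the two files divide responsibilities, the following Python sketch renders a worker configuration and a connector configuration as properties text. Host names, queue names, and topic names are placeholders, and the property keys should be checked against the Kafka Connect and MQ source connector documentation. This is not a complete or authoritative configuration.

```python
def render_properties(props):
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Worker configuration: how the worker connects to Kafka.
# All values are hypothetical placeholders.
worker_props = {
    "bootstrap.servers": "my-kafka-bootstrap:9092",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "offset.storage.file.filename": "/tmp/connect.offsets",
}

# Connector configuration: how the connector reaches the external system.
# Assumed example for an IBM MQ source connector; values are placeholders.
connector_props = {
    "name": "mq-source",
    "connector.class": "com.ibm.eventstreams.connect.mqsource.MQSourceConnector",
    "tasks.max": "1",
    "mq.queue.manager": "QM1",
    "mq.connection.name.list": "mq-host(1414)",
    "mq.channel.name": "MYSVRCONN",
    "mq.queue": "MYQSOURCE",
    "mq.record.builder": "com.ibm.eventstreams.connect.mqsource.builders.DefaultRecordBuilder",
    "topic": "TSOURCE",
}

worker_text = render_properties(worker_props)
connector_text = render_properties(connector_props)
print(worker_text)
print(connector_text)
```

With Apache Kafka, a stand-alone worker is typically started by passing both files to the `connect-standalone.sh` script, worker file first.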
When you run Kafka Connect with the distributed worker, you still use a worker configuration file but the connector configuration is supplied by using a REST API. Refer to the Kafka Connect documentation for more details about the distributed worker.
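To illustrate supplying connector configuration through the REST API, the following Python sketch builds a request that creates a connector on a distributed worker. The base URL (`http://localhost:8083`), the connector name, and the configuration values are assumptions for illustration. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_create_connector_request(base_url, name, config):
    """Build (but do not send) a POST request that creates a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical connector settings; replace with the properties your connector needs.
request = build_create_connector_request(
    "http://localhost:8083",
    "mq-source",
    {
        "connector.class": "com.ibm.eventstreams.connect.mqsource.MQSourceConnector",
        "tasks.max": "1",
        "topic": "TSOURCE",
    },
)

print(request.get_method(), request.full_url)
# To submit the request to a running worker:
# urllib.request.urlopen(request)
```

The payload carries the same key-value pairs that a stand-alone connector configuration file would contain, wrapped in a JSON object with `name` and `config` fields.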
For getting started and problem diagnosis, the simplest setup is to run only one connector in each stand-alone worker. Kafka Connect workers log a lot of information, and the output is easier to follow when messages from multiple connectors are not interleaved.
Kafka Connect topics
When running in distributed mode, Kafka Connect uses three topics to store configuration, current offsets, and status. When Kafka Connect is started by the Event Streams operator, it can create these topics automatically. By default, the topics are:
- connect-configs: This topic stores the connector and task configurations.
- connect-offsets: This topic stores the offsets of source connectors.
- connect-status: This topic stores status updates of connectors and tasks.
Note: If you want to run multiple Kafka Connect environments on the same cluster, override the default topic names in the configuration of each environment so that the environments do not share topics.
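As a sketch of what overriding the topic names might look like, the following Python function derives a distinct set of worker properties for each Kafka Connect environment. The property keys are the standard distributed-worker settings in Apache Kafka. The naming scheme and environment names are hypothetical. The `group.id` entry is an addition beyond the text above, included because each distributed Kafka Connect cluster also needs its own group ID.

```python
def connect_topic_overrides(environment):
    """Return worker properties that keep one environment's topics separate."""
    return {
        # Each distributed Kafka Connect cluster needs a distinct group ID.
        "group.id": f"{environment}-connect-cluster",
        "config.storage.topic": f"{environment}-connect-configs",
        "offset.storage.topic": f"{environment}-connect-offsets",
        "status.storage.topic": f"{environment}-connect-status",
    }


# Hypothetical environment names.
for env in ("orders", "billing"):
    for key, value in connect_topic_overrides(env).items():
        print(f"{key}={value}")
```

When Kafka Connect is managed by an operator, settings like these usually go in the configuration section of the Kafka Connect custom resource rather than in a properties file.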
Authentication and authorization
Kafka Connect uses an Apache Kafka client just like a regular application, and the usual authentication and authorization rules apply.
Kafka Connect needs authorization to:
- Produce to and consume from the internal Kafka Connect topics and, if you want the topics to be created automatically, to create these topics.
- Produce to the target topics of any source connectors that you are using.
- Consume from the source topics of any sink connectors that you are using.
Note: For more information about authentication and the credentials and certificates required, see the information about managing access.
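The authorization requirements listed above can be summarized per topic. The following Python sketch is only a planning aid under stated assumptions: the operation labels (`produce`, `consume`, `create`) mirror the wording of this section and are not Kafka ACL names, and the connector topic names are hypothetical.

```python
# Default internal topic names used by Kafka Connect in distributed mode.
INTERNAL_TOPICS = ["connect-configs", "connect-offsets", "connect-status"]


def required_permissions(source_targets, sink_sources, auto_create_internal=True):
    """Map each topic to the set of operations Kafka Connect needs on it."""
    needed = {}

    def grant(topic, *operations):
        needed.setdefault(topic, set()).update(operations)

    for topic in INTERNAL_TOPICS:
        grant(topic, "produce", "consume")
        if auto_create_internal:
            grant(topic, "create")
    for topic in source_targets:  # topics that source connectors write to
        grant(topic, "produce")
    for topic in sink_sources:  # topics that sink connectors read from
        grant(topic, "consume")
    return needed


# Hypothetical topics for one source connector and one sink connector.
permissions = required_permissions(["TSOURCE"], ["TSINK"])
for topic, operations in sorted(permissions.items()):
    print(topic, sorted(operations))
```

If the internal topics are created ahead of time, passing `auto_create_internal=False` drops the `create` requirement from the summary.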
Connector catalog
The connector catalog contains a list of connectors that are supported either by IBM or by the community.
Community-supported connectors are supported through the community that maintains them. IBM-supported connectors are fully supported as part of the official Event Streams support entitlement if you have a license for IBM Event Automation or IBM Cloud Pak for Integration.
See the connector catalog for a list of connectors that work with Event Streams.
Setting up connectors
Event Streams provides help with setting up your Kafka Connect environment, adding connectors to that environment, and starting the connectors. See the instructions about setting up and running connectors.
Connectors for IBM MQ
Connectors are available for copying data between IBM MQ and Event Streams. There is an MQ source connector for copying data from IBM MQ into Event Streams or Apache Kafka, and an MQ sink connector for copying data from Event Streams or Apache Kafka into IBM MQ.
For more information about MQ connectors, see the topic about connecting to IBM MQ.