Kafka Connect and connectors

You can integrate external systems with Event Streams by using the Kafka Connect framework and connectors.

Event Streams connectors architecture

What is Kafka Connect?

Kafka Connect is the framework of choice for connecting Apache Kafka to other systems.

Use Kafka Connect to reliably move large amounts of data between your Kafka cluster and external systems. For example, it can ingest data from sources such as databases and make the data available for stream processing.

Kafka Connect: introduction

Source and sink connectors

Kafka Connect uses connectors for moving data into and out of Kafka. Source connectors import data from external systems into Kafka topics, and sink connectors export data from Kafka topics into external systems. A wide range of connectors exists, some of which are commercially supported. In addition, you can write your own connectors.
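To make the two directions concrete, the following is a toy sketch in Python. It does not use the Kafka Connect API: `ToyBroker`, `run_source`, and `run_sink` are hypothetical names, and an in-memory dictionary stands in for the Kafka cluster. It only illustrates that a source copies records into a topic and a sink copies records out of one.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a Kafka cluster: topic name -> list of records."""

    def __init__(self):
        self.topics = defaultdict(list)


def run_source(external_rows, broker, topic):
    """Source direction: copy records from an external system into a topic."""
    for row in external_rows:
        broker.topics[topic].append(row)
    return len(external_rows)


def run_sink(broker, topic, external_store):
    """Sink direction: copy records from a topic out to an external system."""
    for record in broker.topics[topic]:
        external_store.append(record)
    return len(broker.topics[topic])


broker = ToyBroker()
database_rows = [{"id": 1, "item": "pen"}, {"id": 2, "item": "ink"}]  # pretend source system
search_index = []  # pretend target system

run_source(database_rows, broker, "orders")
run_sink(broker, "orders", search_index)
print(search_index == database_rows)
```

A real connector also tracks offsets, converts data formats, and splits work into tasks, none of which this sketch attempts.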

A number of source and sink connectors are available to use with Event Streams. See the connector catalog section for more information.

Kafka Connect: source and sink connectors

Workers

Kafka Connect connectors run inside a Java process called a worker. Kafka Connect can run in either standalone or distributed mode. Standalone mode is intended for testing and temporary connections between systems, and all work is performed in a single process. Distributed mode is more appropriate for production use, as it benefits from additional features such as automatic balancing of work, dynamic scaling up or down, and fault tolerance.

Kafka Connect: workers

When you run Kafka Connect with a standalone worker, there are two configuration files:

  • The worker configuration file contains the properties that the worker needs to connect to Kafka, such as the bootstrap server addresses.
  • The connector configuration file contains the properties that the connector needs. This is where you provide the details for connecting to the external system (for example, IBM MQ).
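As an illustration of how the two files divide responsibility, the sketch below renders a minimal pair of standalone configuration files. The property keys are standard Apache Kafka Connect settings, and the connector shown is the FileStream source example connector from Apache Kafka. All values (broker address, file paths, names, topic) are assumptions chosen for illustration, and a real Event Streams connection would also need security settings that are not shown here.

```python
from pathlib import Path


def to_properties(settings):
    """Render a dict as Java .properties text, one key=value pair per line."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


# Worker configuration: how the worker itself connects to Kafka.
worker_settings = {
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "offset.storage.file.filename": "/tmp/connect.offsets",  # used in standalone mode
}

# Connector configuration: what the connector does and which external system it uses.
connector_settings = {
    "name": "example-file-source",  # hypothetical connector name
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/input.txt",  # the "external system" in this example is a local file
    "topic": "example-topic",  # hypothetical target topic
}


def write_configs(directory):
    """Write both files into the given directory and return their paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    worker_path = directory / "worker.properties"
    connector_path = directory / "connector.properties"
    worker_path.write_text(to_properties(worker_settings))
    connector_path.write_text(to_properties(connector_settings))
    return worker_path, connector_path


print(to_properties(worker_settings))
print(to_properties(connector_settings))
```

With an Apache Kafka distribution, a standalone worker is typically started by passing both files to the `connect-standalone.sh` script, with the worker file first. Recent Kafka versions may require the FileStream connector to be added to the worker's plugin path before it can be used.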

When you run Kafka Connect with a distributed worker, you still use a worker configuration file, but you supply the connector configuration through a REST API instead of a file. Refer to the Kafka Connect documentation for more details about the distributed worker.
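As a rough sketch of what supplying connector configuration over REST looks like, the following builds a `POST /connectors` request, which is how the Apache Kafka Connect REST API creates a connector. The function name `build_create_connector_request`, the base URL, and the connector settings are assumptions for illustration. Port 8083 is the Apache Kafka default, and the endpoint in your deployment may differ or be exposed in another way.

```python
import json
import urllib.request


def build_create_connector_request(base_url, name, config):
    """Build (but do not send) a POST /connectors request for a distributed worker."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_connector_request(
    "http://localhost:8083",  # assumed address of the Kafka Connect REST endpoint
    "example-file-source",  # hypothetical connector name
    {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/input.txt",
        "topic": "example-topic",
    },
)

print(request.get_method(), request.full_url)
# To submit the request to a running worker:
#     urllib.request.urlopen(request)
```

The JSON body carries the same key-value pairs that a standalone connector configuration file would hold, wrapped in a `config` object next to the connector `name`.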

For getting started and problem diagnosis, the simplest setup is to run only one connector in each standalone worker. Kafka Connect workers print a lot of information, and the output is easier to follow when the messages from multiple connectors are not interleaved.

Connector catalog

The connector catalog contains a list of connectors that are supported by either IBM or the community.

Community support means that the connectors are supported through the community by the people who created them. IBM-supported connectors are fully supported as part of the official Event Streams support entitlement if you have a license for IBM Event Automation or IBM Cloud Pak for Integration.

See the connector catalog for a list of connectors that work with Event Streams.

Kafka Connect: connector catalog

Setting up connectors

Event Streams provides help with setting up your Kafka Connect environment, adding connectors to that environment, and starting the connectors. See the instructions about setting up and running connectors.

Connectors for IBM MQ

Connectors are available for copying data between IBM MQ and Event Streams. An MQ source connector copies data from IBM MQ into Event Streams or Apache Kafka, and an MQ sink connector copies data from Event Streams or Apache Kafka into IBM MQ.

For more information about MQ connectors, see the topic about connecting to IBM MQ.
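As a hedged illustration of what an MQ source connector configuration might contain, the sketch below assembles one and checks it for an assumed minimum set of keys. The property names and class names are believed to match the sample configuration published with the IBM MQ source connector, but they should be verified against the connector version in use. The queue manager, channel, queue, and topic values are hypothetical, and `ASSUMED_MINIMUM_KEYS` and `missing_keys` are helper names invented for this example.

```python
def to_properties(settings):
    """Render a dict as Java .properties text, one key=value pair per line."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


# Hypothetical values; property names should be checked against the connector documentation.
mq_source_settings = {
    "name": "mq-source",
    "connector.class": "com.ibm.eventstreams.connect.mqsource.MQSourceConnector",
    "tasks.max": "1",
    "topic": "TSOURCE",  # Kafka topic that receives the messages
    "mq.queue.manager": "QM1",  # IBM MQ queue manager name
    "mq.connection.name.list": "localhost(1414)",  # host(port) of the queue manager
    "mq.channel.name": "MYSVRCONN",  # server-connection channel
    "mq.queue": "MYQSOURCE",  # queue to read messages from
    "mq.record.builder": "com.ibm.eventstreams.connect.mqsource.builders.DefaultRecordBuilder",
}

# Assumed minimum for this sketch, not an authoritative list of required properties.
ASSUMED_MINIMUM_KEYS = (
    "connector.class",
    "topic",
    "mq.queue.manager",
    "mq.connection.name.list",
    "mq.channel.name",
    "mq.queue",
)


def missing_keys(settings):
    """Return the assumed-minimum keys that are absent or empty in settings."""
    return [key for key in ASSUMED_MINIMUM_KEYS if not settings.get(key)]


print(missing_keys(mq_source_settings))
print(to_properties(mq_source_settings))
```

The rendered text can serve as the connector configuration file for a standalone worker, or the same key-value pairs can be sent to a distributed worker.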