You can back up and restore your Event Processing flows and Flink instances as follows.
Event Processing flows
You can export your existing Event Processing flows to save them, making them available to import later, as described in exporting flows.
Flink instances
You can back up your Flink instances by using savepoints. A Flink savepoint is a consistent image of a Flink job’s execution state. This procedure applies to Flink jobs running in an application-mode Flink cluster and to Flink jobs managed by the FlinkSessionJob custom resource.
Prerequisites
This procedure assumes that you have the following deployed:
- An instance of Flink deployed by the IBM Operator for Apache Flink and configured with persistent storage with a PersistentVolumeClaim (PVC).
- Flink jobs as application deployments.
- A Flink job managed by the FlinkSessionJob custom resource.
The FlinkDeployment custom resource that configures your Flink instance must define the following parameters, each pointing to a different directory on the persistent storage:
- spec.flinkConfiguration.execution.checkpointing.dir
- spec.flinkConfiguration.execution.checkpointing.savepoint-dir
- spec.flinkConfiguration.high-availability.storageDir (if high availability is required)
Note: These directories are automatically created by Flink if they do not exist.
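As an illustration, the three parameters might look like the following in a FlinkDeployment custom resource. This is a sketch rather than a complete resource: the resource name is taken from the backup example in this topic, the savepoint directory is inferred from the savepoint path shown in that example, and the flink-cp and flink-ha subdirectories are assumptions to adapt to the layout of your own PVC.

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: application-cluster-prod   # example name, replace with your own
spec:
  flinkConfiguration:
    # Each parameter points to a different directory on the persistent storage.
    # flink-cp and flink-ha are assumed directory names, not required values.
    execution.checkpointing.dir: file:///opt/flink/volume/flink-cp
    execution.checkpointing.savepoint-dir: file:///opt/flink/volume/flink-sp
    # Only needed if high availability is required.
    high-availability.storageDir: file:///opt/flink/volume/flink-ha
  # Other fields (image, job, pod template that mounts the PVC) are omitted.
```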
Backing up
The backup process captures the latest state of a running Flink job and its specification, so that you can re-create the job from the saved state when required. To back up your Flink instance, complete the following steps for each of your deployed instances:
- Ensure that the status section indicates that the Job Manager is in READY status and that the Flink job is in RUNNING status by checking the FlinkDeployment custom resource and FlinkSessionJob custom resource.

      status:
        jobManagerDeploymentStatus: READY
        jobStatus:
          state: RUNNING

- Use the FlinkStateSnapshot custom resource to create a savepoint. In the FlinkStateSnapshot custom resource, make the following modifications:

  a. Set the name of your FlinkDeployment custom resource in spec.jobReference.name.

  b. Set the value of spec.savepoint.disposeOnDelete to false to ensure that the savepoint is not deleted even after the FlinkStateSnapshot custom resource is deleted.

     For example:

         spec:
           [...]
           jobReference:
             kind: FlinkDeployment
             name: application-cluster-prod
           savepoint:
             alreadyExists: false
             disposeOnDelete: false

  c. Save the changes in the FlinkStateSnapshot custom resource.

  d. When the savepoint completes, its location in the PVC is reported in the status.path field of the FlinkStateSnapshot custom resource. For example:

         status:
           failures: 0
           path: 'file:/opt/flink/volume/flink-sp/savepoint-caf2b2-39d09a1c170c'
           state: COMPLETED

- Save a copy of the FlinkStateSnapshot and FlinkDeployment custom resources.
- Keep the FlinkDeployment custom resource and the PVC bound to a persistent volume (PV) containing the savepoint to make them available later for restoring your deployment.
- Keep the FlinkSessionJob custom resource and the FlinkDeployment custom resource where the FlinkSessionJob was deployed.
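For reference, a complete FlinkStateSnapshot custom resource for the savepoint step might look like the following sketch. The metadata name is hypothetical, and the apiVersion is assumed to match the one used by your installed operator; check both against your environment before applying.

```yaml
apiVersion: flink.apache.org/v1beta1   # assumed to match your installed CRD version
kind: FlinkStateSnapshot
metadata:
  name: application-cluster-prod-backup   # hypothetical name
spec:
  jobReference:
    # Kind and name of the resource that runs the job to back up.
    # For a session job, this sketch assumes kind FlinkSessionJob
    # with the name of that resource.
    kind: FlinkDeployment
    name: application-cluster-prod
  savepoint:
    # false requests a new savepoint instead of registering an existing one.
    alreadyExists: false
    # false keeps the savepoint files when this resource is deleted.
    disposeOnDelete: false
```

After the resource reports state COMPLETED, the status.path value is the savepoint location to record for restoring.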
Restoring
To restore a previously backed-up Flink instance, ensure that the PVC bound to a PV containing the snapshot is available, then update your FlinkDeployment custom resource as follows.
- Edit the FlinkDeployment custom resource that you saved earlier when backing up your instance:

  a. Set the value of spec.job.upgradeMode to savepoint.

  b. Set the value of spec.job.state to running to resume the Flink job.

  c. Set the value of spec.job.initialSavepointPath to the savepoint location reported in the status.path field of the FlinkStateSnapshot custom resource that you saved earlier.

     For example:

         job:
           [...]
           state: running
           upgradeMode: savepoint
           initialSavepointPath: file:/opt/flink/volume/flink-sp/savepoint-caf2b2-39d09a1c170c
           allowNonRestoredState: true

- Apply the modified FlinkDeployment custom resource.
To restore a previously backed-up FlinkSessionJob instance, ensure that the PVC bound to a PV containing the snapshot is available, then complete the following steps.
- Apply the previously backed-up FlinkDeployment custom resource where you want to deploy the FlinkSessionJob instance.
- Edit the FlinkSessionJob custom resource that you saved earlier when backing up your instance:

  a. Set the value of spec.job.upgradeMode to savepoint.

  b. Set the value of spec.job.state to running to resume the Flink job.

  c. Set the value of spec.job.initialSavepointPath to the savepoint location reported in the status.path field of the FlinkStateSnapshot custom resource that you saved earlier.

     For example:

         job:
           [...]
           state: running
           upgradeMode: savepoint
           initialSavepointPath: file:/opt/flink/volume/flink-sp/savepoint-caf2b2-39d09a1c170c
           allowNonRestoredState: true
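To show how the restored session job relates to its cluster, the following sketch places the job settings in the context of a FlinkSessionJob custom resource. Both resource names are hypothetical, and the sketch assumes that spec.deploymentName refers to the FlinkDeployment that you applied in the first step. The fields that describe the job itself are omitted and stay as they are in your saved resource.

```yaml
apiVersion: flink.apache.org/v1beta1   # assumed to match your installed CRD version
kind: FlinkSessionJob
metadata:
  name: session-job-prod               # hypothetical name
spec:
  # Name of the FlinkDeployment (session cluster) that runs this job.
  deploymentName: session-cluster-prod # hypothetical name
  job:
    # Fields that describe the job itself are omitted here; keep them unchanged.
    state: running
    upgradeMode: savepoint
    # Savepoint location from status.path of the saved FlinkStateSnapshot.
    initialSavepointPath: file:/opt/flink/volume/flink-sp/savepoint-caf2b2-39d09a1c170c
    allowNonRestoredState: true
```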