Backing up and restoring

You can back up and restore your Event Processing flows and Flink instances as follows.

Event Processing flows

You can export your existing Event Processing flows to save them, making them available to import later, as described in exporting flows.

Flink instances

A Flink savepoint is a consistent image of a Flink job’s execution state. Backing up your Flink instances involves backing up savepoints.

Prerequisites

This procedure assumes that you have a Flink instance deployed with persistent storage provided by a PersistentVolumeClaim (PVC).

The FlinkDeployment custom resource that configures your Flink instance must define the following parameters, each pointing to a different directory on the persistent storage:

  • spec.flinkConfiguration.state.checkpoints.dir
  • spec.flinkConfiguration.state.savepoints.dir
  • spec.flinkConfiguration.high-availability.storageDir (if high availability is required)

Note: These directories are automatically created by Flink if they do not exist.
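
As an illustration, the three parameters might appear in the FlinkDeployment custom resource as in the following sketch. The savepoint directory is inferred from the savepoint location shown in the backup example later in this topic. The flink-cp and flink-ha directory names are assumptions made for illustration only, so substitute the paths that apply to your persistent storage.

    spec:
      flinkConfiguration:
        # Assumed directory name, for illustration only
        state.checkpoints.dir: 'file:///opt/flink/volume/flink-cp'
        # Inferred from the savepoint location in the backup example
        state.savepoints.dir: 'file:///opt/flink/volume/flink-sp'
        # Assumed directory name; set only if high availability is required
        high-availability.storageDir: 'file:///opt/flink/volume/flink-ha'

Keeping the three directories separate means that a savepoint kept for a later restore is not mixed with checkpoint or high-availability data.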

Backing up

To back up your Flink instance, update each of your deployed instances by editing their respective FlinkDeployment custom resources as follows:

  1. In the status section of the FlinkDeployment custom resource, ensure that the Job Manager is in READY status and that the Flink job is in RUNNING status:

    status:
      jobManagerDeploymentStatus: READY
      jobStatus:
        state: RUNNING
    
  2. Set the following values in the FlinkDeployment custom resource:

    a. Set the value of spec.job.upgradeMode to savepoint.

    b. Set the value of spec.job.state to running.

    c. Set the value of spec.job.savepointTriggerNonce to an integer that has never been used before for that option.

    For example:

    job:
      [...]
      savepointTriggerNonce: <integer value>
      state: running
      upgradeMode: savepoint
    

    d. Save the changes in the FlinkDeployment custom resource.

    A savepoint is triggered and written to a location in the PVC. The location is indicated in the status.jobStatus.savepointInfo.lastSavepoint.location field of the FlinkDeployment custom resource.

    For example:

    status:
      [...]
      jobStatus:
        [...]
        savepointInfo:
          [...]
          lastSavepoint:
            formatType: CANONICAL
            location: 'file:/opt/flink/volume/flink-sp/savepoint-e372fa-9069a1c0563e'
            timeStamp: 1733957991559
            triggerNonce: 1
            triggerType: MANUAL
    
  3. Keep the FlinkDeployment custom resource and the PVC to make them available later for restoring your deployment.

Restoring

To restore a Flink instance that you previously backed up, ensure that the PVC to which the savepoint was written is available, and update your FlinkDeployment custom resource as follows.

  1. Edit the FlinkDeployment custom resource that you saved earlier when backing up your instance:

    a. Ensure that the value of spec.job.upgradeMode is savepoint.

    b. Ensure that the value of spec.job.state is running to resume the Flink job.

    c. Remove spec.job.savepointTriggerNonce and its value.

    d. Set the value of spec.job.initialSavepointPath to the savepoint location reported in step 2.d of the backup procedure, plus the suffix /_metadata.

    For example:

    job:
      [...]
      state: running
      upgradeMode: savepoint
      initialSavepointPath: file:/opt/flink/volume/flink-sp/savepoint-e372fa-9069a1c0563e/_metadata
      allowNonRestoredState: true
    
  2. Apply the modified FlinkDeployment custom resource.