Lab: Transform EBCIDIC data to CSV with COBOL Copybook¶

IBM DataStage Flow Designer allows you to read data from a mainframe. More specifically, you can specify inputs in your DataStage job to be in EBCIDIC format and to import COBOL copybooks as table definitions.

To keep things simple in this lab we're going to speak generally about mainframes and COBOL. Files from a mainframe are usually saved as binary files and sFTPed to some server where DataStage can access them. The binary files alone are not enough for DataStage to read the contents. A COBOL playbook is required to translate the data from binary to ascii. Both files are available here: https://github.com/IBM/datastage-standalone-workshop/tree/master/data/mainframe.

This lab consists of the following steps:

Create job layout
Add COBOL copybook as a table definition
Customize the job
Compile, run, view output

About the data¶

The example binary data should be downloaded to the server. Switch to the server and run:

cd /opt/IBM/InformationServer/Server/Projects/dstage1
wget https://raw.githubusercontent.com/IBM/datastage-standalone-workshop/master/data/mainframe/example.bin

The copybook can be downloaded to the client machine by going to the GitHub repo and saving the file to the desktop. The copybook we're using looks like this:

01  RECORD.
    05  ID                        PIC S9(4)  COMP.
    05  COMPANY.
        10  SHORT-NAME            PIC X(10).
        10  COMPANY-ID-NUM        PIC 9(5) COMP-3.
        10  COMPANY-ID-STR
    05  METADATA.
        10  CLIENTID              PIC X(15).
        10  REGISTRATION-NUM      PIC X(10).
        10  NUMBER-OF-ACCTS       PIC 9(03) COMP-3.

Before you start: Launching DataStage Flow Designer¶

Before we start the lab, let's switch to the iis-client VM and launch Firefox.

Switch to iis-client

Launch the desktop client by going to the start menu and searching for DataStage Designer.

DataStage Designer

1. Create job layout¶

Start a new Parallel Job project and create a job that looks like the image below. Remember to wire the elements together. It should have:

1 x Complex Flat File
1 x Peek
1 X Sequential File

mainframe job

2. Add COBOL copybook as a table definition¶

In the toolbar click on Import > Table Definitions > COBOL File Definitions.

COBOL definitions

Specify the downloaded copybook file and click Import.

Import copybook

You have just imported your copybook definitions!

3. Customize the job¶

The first step is to double click on the Complex Flat File node, go to the File options tab, and specify the example binary file as the input. Critically, we must specify Fixed block as the record type.

file options

Go to the Record options tab and choose the Binary data format and EBCDIC as the character set.

record options

Go to the Records tab and click the Load button, this will give us the option to specify a copybook.

load copybook

Choose to use on the copybook that was imported in the previous step.

find copybook

Select the >> icon to use all fields from the copybook.

specify fields

The Records tab should now show the various column names from the copybook.

add records

Double clicking on the Peek node allows us to map output from the Complex Flat File. Click on the Output section and choose the Columns tab.

Enter the following new columns:

ID
SHORT_NAME
CLIENTID
COMPANY_ID_NUM
COMPANY_ID_STR
REGISTRATION_NUM
NUMBER_OF_ACCTS
ACCOUNT_NUMBER
ACCOUNT_TYPE_X

add colums

Still in the Output section now click on the Mapping tab and choose to Auto-Match.

auto-match

Double clicking on the Sequential File node brings up a single option. To specify a filename, choose mainframe.csv for example.

file name

4. Compile, run, view output¶

Compile and run the job using the usual icons from the toolbar.

After running the job you can view the output from the Designer tool by clicking on the Sequential File node and clicking the View Data button. Click OK on the next dialog.

output file contents

You'll be able to see one row of data with an ID, SHORT_NAME and a few other fields.

shows the data

The output file is also written to the server. Switch to the server VM by clicking the first icon on the Environment VMs panel and selecting iis-server. Login as the root user with the password inf0Xerver.

Switch to server VM

Change your direcotory using cd to the location where you had stored the file.

cd /opt/IBM/InformationServer/Server/Projects/<project-name>/

Finally, output your results using the cat command.

cat mainframe.csv

in a terminal

CONGRATULATIONS!! You have completed this lab!