Skip to content

PDL Language Tutorial

The following sections give a step-by-step overview of PDL language features. All the examples in this tutorial can be found in examples/tutorial.

Simple text

The simplest PDL program is one that generates a small text (file):

description: Hello world!
text:
    Hello, world!

This program has a description field, which contains a title. The description field is optional. It also has a text field, which can be either a string, a block, or a list of strings and blocks. A block is a recipe for how to obtain data (e.g., model call, code call, etc...). In this case, there are no calls to an LLM or other tools, and text consists of a simple string.

To render the program into an actual text, we have a PDL interpreter that can be invoked as follows:

pdl examples/tutorial/simple_program.pdl

This results in the following output:

Hello, world!

Calling an LLM

description: Hello world calling a model
text:
- "Hello\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  parameters:
    stop_sequences: '!'

In this program (file), the text starts with the word "Hello\n", and we call a model (replicate/ibm-granite/granite-3.0-8b-instruct) with this as input prompt. The model is passed a parameter stop_sequences.

A PDL program computes 2 data structures. The first is a JSON corresponding to the result of the overall program, obtained by aggregating the results of each block. This is what is printed by default when we run the interpreter. The second is a conversational background context, which is a list of role/content pairs, where we implicitly keep track of roles and content for the purpose of communicating with models that support chat APIs. The contents in the latter correspond to the results of each block. The conversational background context is what is used to make calls to LLMs via LiteLLM.

In this example, since the input field is not specified in the model call, the entire text up to that point is passed to the model as input context, using the default role user.

When we execute this program using the interpreter, we obtain:

Hello
Hello

where the second Hello has been generated by Granite.

Here's another example of model call that includes an input field (file):

description: Hello world calling a model
text:
- "Hello\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  input: 
    Translate the word 'Hello' to French

In this case, the input passed to the model is the sentence: Translate the word 'Hello' to French and nothing else from the surrounding document. When we execute this program, we obtain:

Hello
The word 'Hello' translates to 'Bonjour' in French.
where the second line is generated by the model.

Using the input field, we can also give a directly an array of messages (role/content) to the model (file):

description: Hello world calling a model
text:
- "Hello\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  input:
    array:
    - role: system
      content: You are a helpful assistant that is fluent in French.
    - role: user
      content: Translate the word 'Hello' to French

This has the same output as the previous program.

Parameter defaults for watsonx Granite models

When using Granite models, we use the following defaults for model parameters (except granite-20b-code-instruct-r1.1): - decoding_method: greedy, (temperature: 0) - max_new_tokens: 1024 - min_new_tokens: 1 - repetition_penalty: 1.05

Also if the decoding_method is sample, then the following defaults are used: - temperature: 0.7 - top_p: 0.85 - top_k: 50

The user can override these defaults by explicitly including them in the model call.

Variable Definition and Use

Any block can define a variable using a def: <var> field. This means that the output of that block is assigned to the variable <var>, which may be reused at a later point in the document.

Consider the following example (file):

description: Hello world with variable def and use
text:
- "Hello\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  def: GEN
  parameters:
    stop_sequences: '!'
- "\nGEN is equal to: ${ GEN }"

Here we assign the output of the model to variable GEN using the def field. The last line of the program prints out the value of GEN. Notice the notation ${ } for accessing the value of a variable. Any Jinja expression is allowed to be used inside these braces. These expressions are also used to specify conditions for loops and conditionals. See for example this file.

When we execute this program, we obtain:

Hello
Hello
GEN is equal to: Hello

Model Chaining

In PDL, we can declaratively chain models together as in the following example (file):

description: Model chaining
text: 
- "Hello\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  parameters:
    stop_sequences: "!"
- "\nDid you just say Hello?\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  parameters:
    stop_sequences: "!"

In this program, the first call is to a Granite model to complete the sentence Hello, world!. The following block in the document prints out the sentence: Translate this to French. The final line of the program takes the entire document produced so far and passes it as input to the Granite multilingual model. Notice that the input passed to this model is the document up to that point, represented as a conversation. This makes it easy to chain models together and continue building on previous interactions.

When we execute this program, we obtain:

Hello
Hello
Did you just say Hello?
Yes, I did. How can I assist you today?

Function Definition

PDL also supports function definitions to make it easier to reuse code. Suppose we want to define a translation function that takes a string and calls a Granite model for the translation. This would be written in PDL as follows (file):

description: function def and call
text:
- def: translate
  function:
    sentence: str
    language: str
  return:
    lastOf:
    - "\nTranslate the sentence '${ sentence }' to ${ language }.\n"
    - model: replicate/ibm-granite/granite-3.0-8b-instruct
      parameters:
        stop_sequences: "\n"
        temperature: 0
- call: ${ translate }
  args:
    sentence: I love Paris!
    language: French
- "\n"
- call: ${ translate }
  args:
    sentence: I love Madrid!
    language: Spanish

In this program, the first block defines a function translate that takes as parameters sentence and language, both of which are of type string. The body of the function is defined by its return field. In this case, we formulate a translation prompt using the parameters and send it to a Granite multilingual model.

The last two blocks are calls to this function, as indicated by call: ${ translate }. This block specifies the arguments to be passed. When we execute this program, we obtain:

The translation of 'I love Paris!' to French is 'J'aime Paris!'.
The translation of 'I love Madrid!' to Spanish is 'Me encanta Madrid!'.

A function only contributes to the output document when it is called. So the definition itself results in "". When we call a function, we implicitly pass the current background context, and this is used as input to model calls inside the function body. In the above example, since the input field is omitted, the entire document produced at that point is passed as input to the Granite model.

To reset the context when calling a function, we can pass the special argument: pdl_context: [].

Notice that the arguments of function calls are expressions and cannot be arbitrary PDL blocks.

Grouping Variable Definitions in Defs

In PDL, the above program can be written more neatly by grouping certain variable definitions into a defs section, as follows (file):

description: function def and call
defs:
  translate:
    function:
      sentence: str
      language: str
    return:
      lastOf:
      - "\nTranslate the sentence '${ sentence }' to ${ language }.\n"
      - model: replicate/ibm-granite/granite-3.0-8b-instruct
        parameters:
          stop_sequences: "\n"
text:
- call: ${ translate }
  args:
    sentence: I love Paris!
    language: French
- "\n"
- call: ${ translate }
  args:
    sentence: I love Madrid!
    language: Spanish

This program has the same output has the one from the previous section.

Muting Block Output with contribute

By default, when a PDL block is executed it produces a result that is contributed to the overall result, and it also contributes to the background context. It is possible to mute both contributions by setting contribute to [] for any block. This feature allows the computation of intermediate values that are not necessarily output in the document. The value of the variable specified in def is still set to the result of the block.

Consider the similar example as above, but with contribute set to [] (file):

description: function def and call
defs:
  translate:
    function:
      sentence: str
      language: str
    return:
      text:
      - text: "\nTranslate the sentence '${ sentence }' to ${ language }.\n"
        contribute: [context]
      - model: replicate/ibm-granite/granite-3.0-8b-instruct
        parameters:
          stop_sequences: "\n"
text:
- call: ${ translate }
  contribute: []
  def: FRENCH
  args:
    sentence: I love Paris!
    language: French
- "The french sentence was: ${ FRENCH }"

The call to the translator with French as language does not produce an output. However, we save the result in variable FRENCH and use it in the last sentence of the document. When we execute this program, we obtain:

The french sentence was: The translation of 'I love Paris!' to French is 'J'aime Paris!'.

In general, contribute can be used to set how the result of the block contribute to the final result and the background context. Here are its possible values: - []: no contribution to either the final result or the background context

  • [result]: contribute to the final result but not the background context

  • [context]: contribute to the background context but not the final result

  • [result, context]: contribute to both, which is also the default setting.

Specifying Data

In PDL, the user specifies step by step the shape of data they wish to generate. A text block takes a list of blocks, stringifies the result of each block, and concatenates them.

An array takes a list of blocks and creates an array of the results of each block:

array:
  - apple
  - orange
  - banana

This results in the following output:

["apple", "orange", "banana"]

Each list item can contain any PDL block (strings are shown here), and the overall result is presented as an array.

An object constructs an object:

object:
  name: Bob
  job: manager

This results in the following output:

{"name": "Bob", "job": "manager"}

Each value in the object can be any PDL block, and the result is presented as an object.

A lastOf is a sequence, where each block in the sequence is executed and the overall result is that of the last block.

lastOf:
  - 1
  - 2
  - 3

This results in the following output:

3

Each list item can contain any PDL block (strings are shown here), and the result of the whole list is that of the last block.

Notice that block types that require lists (repeat, for, if-then-else) have the lastOf semantics by default. For more detailed discussion on this see this section.

The PDL interpreter will raise a warning for a list item inside a lastOf block that is not capturing the result in a variable definition meaning that the result is being implicitly ignored. If this is intended because the block is contributing to the context or doing a side effect for example, the warning can be turned off by including contribute: [context] or contribute: []. On the other hand, if this was a mistake, then capture the result of the block using a variable definition by adding def. You could also turn the list into a text or an array by surrounding it with a text or array block so that no result is lost.

Input from File or Stdin

PDL can accept textual input from a file or stdin. In the following example (file), the contents of this file are read by PDL and incorporated in the document. The result is also assigned to a variable HELLO.

description: PDL code with input block
text:
- read: ./data.txt
  def: HELLO

In the next example, prompts are obtained from stdin (file). This is indicated by assigning the value null to the read field.

description: PDL code with input block
text:
- "The following will prompt the user on stdin.\n"
- read:
  message: "Please provide an input: "
  def: STDIN

If the message field is omitted then one is provided for you.

The following example shows a multiline stdin input (file). When executing this code and to exit from the multiline input simply press control D (on macOS).

description: PDL code with input block
text:
- "A multiline stdin input.\n"
- read:
  multiline: true

Finally, the following example shows reading content in JSON format.

Consider the JSON content in this file:

{
    "name": "Bob",
    "address": {
        "number": 87,
        "street": "Smith Road",
        "town": "Armonk", 
        "state": "NY",
        "zip": 10504
    }
}

The following PDL program reads this content and assigns it to variable PERSON in JSON format (file). The reference PERSON.address.street then refers to that field inside the JSON object.

description: Input block example with json input
defs:
  PERSON:
    read: ./input.json
    parser: json
text:
- "${ PERSON.name } lives at the following address:\n"
- "${ PERSON.address.number } ${ PERSON.address.street } in the town of ${ PERSON.address.town }, ${ PERSON.address.state }"

When we execute this program, we obtain:

Bob lives at the following address:
87 Smith Road in the town of Armonk, NY

Calling code

The following script shows how to execute python code (file). The python code is executed locally (or in a containerized way if using pdl --sandbox). In principle, PDL is agnostic of any specific programming language, but we currently only support Python, Jinja, and shell commands. Variables defined in PDL are copied into the global scope of the Python code, so those variables can be used directly in the code. However, mutating variables in Python has no effect on the variables in the PDL program. The result of the code must be assigned to the variable result internally to be propagated to the result of the block. A variable def on the code block will then be set to this result.

In order to define variables that are carried over to the next Python code block, a special variable PDL_SESSION can be used, and variables assigned to it as fields. See for example: (file).

description: Hello world showing call to python code
text:
- "Hello, "
- lang: python
  code: 
    |
    import random
    import string
    result = random.choice(string.ascii_lowercase)

This results in the following output (for example):

Hello, r!

Calling REST APIs

PDL programs can contain calls to REST APIs with Python code. Consider a simple weather app (file):

description: Using a weather API and LLM to make a small weather app
text:
- def: QUERY
  text: "What is the weather in Madrid?\n"
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  input: |
      Extract the location from the question.
      Question: What is the weather in London?
      Answer: London
      Question: What's the weather in Paris?
      Answer: Paris
      Question: Tell me the weather in Lagos?
      Answer: Lagos
      Question: ${ QUERY }
  parameters:
    stop_sequences: "Question,What,!,\n"
  def: LOCATION
  contribute: []
- lang: python
  code: |
    import requests
    #response = requests.get('https://api.weatherapi.com/v1/current.json?key==XYZ=${ LOCATION }') 
    #Mock response:
    result = '{"location": {"name": "Madrid", "region": "Madrid", "country": "Spain", "lat": 40.4, "lon": -3.6833, "tz_id": "Europe/Madrid", "localtime_epoch": 1732543839, "localtime": "2024-11-25 15:10"}, "current": {"last_updated_epoch": 1732543200, "last_updated": "2024-11-25 15:00", "temp_c": 14.4, "temp_f": 57.9, "is_day": 1, "condition": {"text": "Partly cloudy", "icon": "//cdn.weatherapi.com/weather/64x64/day/116.png", "code": 1003}, "wind_mph": 13.2, "wind_kph": 21.2, "wind_degree": 265, "wind_dir": "W", "pressure_mb": 1017.0, "pressure_in": 30.03, "precip_mm": 0.01, "precip_in": 0.0, "humidity": 77, "cloud": 75, "feelslike_c": 12.8, "feelslike_f": 55.1, "windchill_c": 13.0, "windchill_f": 55.4, "heatindex_c": 14.5, "heatindex_f": 58.2, "dewpoint_c": 7.3, "dewpoint_f": 45.2, "vis_km": 10.0, "vis_miles": 6.0, "uv": 1.4, "gust_mph": 15.2, "gust_kph": 24.4}}'
  def: WEATHER
  parser: json
  contribute: []
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  input: |
      Explain the weather from the following JSON:
      ${ WEATHER }

In this program, we first define a query about the weather in some location (assigned to variable QUERY). The next block is a call to a Granite model with few-shot examples to extract the location, which we assign to variable LOCATION. The next block makes an API call with Python (mocked in this example). Here the LOCATION is appended to the url. The result is a JSON object, which may be hard to interpret for a human user. So we make a final call to an LLM to interpret the JSON in terms of weather. Notice that many blocks have contribute set to [] to hide intermediate results.

Data Block

PDL offers the ability to create JSON data as illustrated by the following example (described in detail in the Overview section). The data block can gather previously defined variables into a JSON structure. This feature is useful for data generation. Programs such as this one can be bootstrapped with a bash or Python script to generate data en masse (file).

description: Code explanation example
defs:
  CODE:
    read: ./data.yaml
    parser: yaml
  TRUTH:
    read: ./ground_truth.txt
lastOf:
- model: replicate/ibm-granite/granite-3.0-8b-instruct
  def: EXPLANATION
  input:
     |
      Here is some info about the location of the function in the repo.
      repo: 
      ${ CODE.repo_info.repo }
      path: ${ CODE.repo_info.path }
      Function_name: ${ CODE.repo_info.function_name }


      Explain the following code:
      ```
      ${ CODE.source_code }```
- def: EVAL
  lang: python
  code:
    |
    import textdistance
    expl = """
    ${ EXPLANATION }
    """
    truth = """
    ${ TRUTH }
    """
    result = textdistance.levenshtein.normalized_similarity(expl, truth)
- data: 
    input: ${ CODE }
    output: ${ EXPLANATION }
    metric: ${ EVAL }

Notice that in the data block the values are interpreted as Jinja expressions. If values need to be PDL programs to be interpreted, then you need to use the object block instead (see this section).

In the example above, the expressions inside the data block are interpreted. In some cases, it may be useful not to interpret the values in a data block. The raw field can be used to turn off the interpreter inside a data block. For example, consider the (file):

description: raw data block
data:
  name: ${ name }
  phone: ${ phone }
raw: True

The result of this program is the JSON object:

{
  "name": "${ name }",
  "phone": "${ phone }"
}

where the values of name and phone have been left uninterpreted.

Include Block

PDL allows programs to be defined over multiple files. The include block allows one file to incorporate another, as shown in the following example:

description: Granite Multi-Round Chat
text:
- include: ../granite/granite_defs.pdl
- read: ../granite/multi-prompts.json
  parser: json
  def: prompts
  spec: {prompts: [str]}
  contribute: []
- for:
    prompt: ${ prompts.prompts }
  repeat:
    text:
    - |

      ${ prompt }
    - model: replicate/ibm-granite/granite-3.0-8b-instruct
      parameters:
        decoding_method: sample
        max_new_tokens: 512
role: user

which includes the following file:

The include block means that the PDL code at that file is executed and its output is included at the point where the include block appears in the program. In this example, the file contains definitions, which are used in the program. This feature allows reuse of common templates and patterns and to build libraries. Notice that relative paths are relative to the containing file.

Conditionals and Loops

PDL supports conditionals and loops as illustrated in the following example (file), which implements a chatbot.

description: chatbot
text:
- read:
  message: "What is your query?\n"
  contribute: [context]
- repeat:
    text:
    - model: replicate/ibm-granite/granite-3.0-8b-instruct
    - read:
      def: eval
      message: "\nIs this a good answer[yes/no]?\n"
      contribute: []
    - if: ${ eval == 'no' }
      then:
        text:
        - read:
          message: "Why not?\n"
  until: ${ eval == 'yes'}

The first block prompts the user for a query, and this is contributed to the background context. The next block is a repeat-until, which repeats the contained text block until the condition in the until becomes true. The field repeat can contain a string, or a block, or a list. If it contains a list, then the list is interpreted to be a lastOf block. This means that all the blocks in the list are executed and the result of the body is that of the last block.

The example also shows the use of an if-then-else block. The if field contains a condition, the then field can also contain either a string, or a block, or a list (and similarly for else). If it contains a list, the list is interpreted to be a lastOf block. So again the blocks in the list are executed and the result is that of the last block.

The chatbot keeps looping by making a call to a model, asking the user if the generated text is a good answer, and asking why not? if the answer (stored in variable eval) is no. The loop ends when eval becomes yes. This is specified with a Jinja expression on line 18.

Notice that the repeat and then blocks are followed by text. This is because of the semantics of lists in PDL. If we want to aggregate the result by stringifying every element in the list and collating them together, then we need the keyword text to precede a list. If this is omitted then the list is treated as a programmatic sequence where all the blocks are executed in sequence but result of the overall list is the result of the {\em last} block in the sequence. This behavior can be marked explicitly with a lastOf block.

Another form of iteration in PDL is repeat followed by num_iterations, which repeats the body num_iterations times.

The way that the result of each iteration is collated with other iterations can be customized in PDL using the join feature (see the following section).

For Loops

PDL also offers for loops over lists. The following example stringifies and outputs each number.

description: for loop
for:
  i: [1, 2, 3, 4]
repeat: 
  ${ i }

This program outputs:

1234

To output a number of each line, we can specify which string to use to join the results.

description: for loop
for:
  i: [1, 2, 3, 4]
repeat:
  ${ i }
join:
  with: "\n"

1
2
3
4

To creates an array as a result of iteration, we would write:

description: for loop
for:
  i: [1, 2, 3, 4]
repeat:
  - ${ i }
join:
  as: array

which outputs the following list:

[1, 2, 3, 4]

To retain only the result of the last iteration of the loop, we would write:

description: for loop
for:
  i: [1, 2, 3, 4]
repeat:
  - ${ i }
join:
  as: lastOf

which outputs:

4

When join is not specified, the collation defaults to

join:
  as: text
  with: ""

meaning that result of each iteration is stringified and concatenated with that of other iterations. When using with, as: text can be elided.

Note that join can be added to any looping construct (repeat) not just for loops.

The for loop constructs also allows iterating over 2 or more lists of the same length simultaneously:

description: for loop
defs:
  numbers:
    data: [1, 2, 3, 4]
  names:
    data: ["Bob", "Carol", "David", "Ernest"]
for:
  number: ${ numbers }
  name: ${ names }
repeat:
  "${ name }'s number is ${ number }\n"

This results in the following output:

Bob's number is 1
Carol's number is 2
David's number is 3
Ernest's number is 4

Roles and Chat Templates

Consider again the chatbot example (file). By default blocks have role user, except for model call blocks, which have role assistant. If we write roles explicitly for the chatbot, we obtain:

description: chatbot
text:
- read:
  message: "What is your query?\n"
  contribute: [context]
- repeat:
    text:
    - model: replicate/ibm-granite/granite-3.0-8b-instruct
      role: assistant
    - read:
      def: eval
      message: "\nIs this a good answer[yes/no]?\n"
      contribute: []
    - if: ${ eval == 'no' }
      then:
        text:
        - read:
          message: "Why not?\n"
  until: ${ eval == 'yes'}
role: user

In PDL, any block can be adorned with a role field indicating the role for that block. These are high-level annotations that help to make programs more portable across different models. If the role of a block is not specified (except for model blocks that have assistant role), then the role is inherited from the surrounding block. So in the above example, we only need to specify role: user at the top level (this is the default, so it doesn't need to be specified explicitly).

PDL takes care of applying appropriate chat templates.

The prompt that is actually submitted to the first model call (with query What is APR?) is the following:

<|start_of_role|>user<|end_of_role|>What is APR?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>

To change the template that is applied, you can specify it as a parameter of the model call:

model: replicate/ibm-granite/granite-3.0-8b-instruct
parameters:
  roles:
    system:
       pre_message: <insert text here>
       post_message: <insert text here>
    user:
       pre_message: <insert text here>
       post_message: <insert text here>
    assistant:
       pre_message: <insert text here>
       post_message: <insert text here>

Type Checking

Consider the following PDL program (file). It first reads the data found here to form few-shot examples. These demonstrations show how to create some JSON data.

description: Creating JSON Data
defs:
  data:
    read: ./gen-data.yaml
    parser: yaml
    spec: { questions: [str], answers: [obj] }
text:
  - model: replicate/ibm-granite/granite-3.0-8b-instruct
    def: model_output
    spec: {name: str, age: int}
    input:
      text:
      - for: 
          question: ${ data.questions }
          answer: ${ data.answers }
        repeat: |
            ${ question }
            ${ answer }
      - > 
        Question: Create a JSON object with fields 'name' and 'age' 
        and set them appropriately. Write the age in letters.
    parser: yaml
    parameters:
      stop_sequences: "\n"
      temperature: 0

Upon reading the data we use a parser to parse it into a YAML. The spec field indicates the expected type for the data, which is an object with 2 fields: questions and answers that are a list of string and a list of objects, respectively. When the interpreter is executed, it checks this type dynamically and throws errors if necessary.

Similarly, the output of the model call is parsed as YAML, and the spec indicates that we expect an object with 2 fields: name of type string, and age of type integer.

When we run this program, we obtain the output:

gen-data.pdl:8 - Type errors during spec checking:
gen-data.pdl:8 - 30 should be of type <class 'int'>
{'name': 'John', 'age': '30'}

Notice that since we asked the age to be produced in letters, we got a string back and this causes a type error indicated above.

In general, spec definitions can be a subset of JSON schema, or use a shorthand notation as illustrated by the examples below:

  • bool: boolean
  • str: string
  • int: integer
  • float: float
  • {str: {pattern: '^[A-Za-z][A-Za-z0-9_]*$'}}: a string satisfying the indicated pattern
  • {float: {minimum: 0, exclusiveMaximum: 1}}: a float satisfying the indicated constraints
  • {list: int}: a list of integers
  • [int]: a list of integers
  • {list: {int: {minimum: 0}}}: a list of integers satisfying the indicated constraints
  • [{int: {minimum: 0}}]: same as above
  • {list: {minItems: 1, int: {}}}, a list satisfying the indicated constraints
  • {obj: {latitude: float, longitude: float}}: an object with fields latitude and longitude
  • {latitude: float, longitude: float}: same as above
  • {obj: {question: str, answer: str, context: {optional: str}}}: an object with an optional field
  • {question: str, answer: str, context: {optional: str}}: same as above
  • {list: {obj: {question: str, answer: str}}}: a list of objects
  • [{question: str, answer: str}]: same as above
  • {enum: [red, green, blue]}: an enumeration

Python SDK

See examples of PDL being called programmatically in Python here.

For a more sophisticated example, see here.

Debugging PDL Programs

We highly recommend to edit PDL programs using an editor that support YAML with JSON Schema validation. For example, you can use VSCode with the YAML extension and configure it to use the PDL schema. The PDL repository has been configured so that every *.pdl file is associated with the PDL grammar JSONSchema (see settings). This enables the editor to display error messages when the yaml deviates from the PDL syntax and grammar. It also provides code completion. The PDL interpreter also provides similar error messages. To make sure that the schema is associated with your PDL files, be sure that PDL Schemas appear at the bottom right of your VSCode window, or on top of the editor window.

The interpreter prints out a log by default in the file log.txt. This log contains the details of inputs and outputs to every block in the program. It is useful to examine this file when the program is behaving differently than expected.

To change the log filename, you can pass it to the interpreter as follows:

pdl --log <my-example.log> <my-example>

Live Document Visualizer

PDL has a Live Document visualizer to help in program understanding given an execution trace. To produce an execution trace consumable by the Live Document, you can run the interpreter with the --trace argument and set the value to either json or yaml:

pdl --trace <my-example_trace.json> <my-example>

This produces an additional file named my-example_trace.json that can be uploaded to the Live Document visualizer tool. Clicking on different parts of the Live Document will show the PDL code that produced that part in the right pane.

This is similar to a spreadsheet for tabular data, where data is in the forefront and the user can inspect the formula that generates the data in each cell. In the Live Document, cells are not uniform but can take arbitrary extents. Clicking on them similarly reveals the part of the code that produced them.

Using Ollama models

  1. Install Ollama e.g., brew install --cask ollama
  2. Run a model e.g., ollama run granite-code:34b-instruct-q5_K_M. See the Ollama library for more models
  3. An OpenAI style server is running locally at http://localhost:11434/, see the Ollama blog for more details.

Example:

text:
- Hello,
- model: ollama_chat/granite-code:34b-instruct-q5_K_M
  parameters:
    stop:
    - '!'
    decoding_method: greedy

If you want to use an external Ollama instance, the env variable OLLAMA_API_BASE should be defined, by default is http://localhost:11434.

Alternatively, one could also use Ollama's OpenAI-style endpoint using the openai/ prefix instead of ollama_chat/. In this case, set the OPENAI_API_BASE, OPENAI_API_KEY, and OPENAI_ORGANIZATION (if necessary) environment variables. If you were using the official OpenAI API, you would only have to set the api key and possibly the organization. For local use e.g., using Ollama, this could look like so:

export OPENAI_API_BASE=http://localhost:11434/v1
export OPENAI_API_KEY=ollama # required, but unused
export OPENAI_ORGANIZATION=ollama # not required

pdl <...>

Strings In Yaml

Multiline strings are commonly used when writing PDL programs. There are two types of formats that YAML supports for strings: block scalar and flow scalar formats. Scalars are what YAML calls basic values like numbers or strings, as opposed to complex types like arrays or objects. Block scalars have more control over how they are interpreted, whereas flow scalars have more limited escaping support. (Explanation here is thanks to Wolfgang Faust)

Block Scalars

Block Style Indicator: The block style indicates how newlines inside the block should behave. If you would like them to be kept as newlines, use the literal style, indicated by a pipe |. Note that without a chomping indicator, described next, only the last newline is kept.

PDL:

text:
  - |
    Several lines of text,
    with some "quotes" of various 'types',
    and also a blank line:

    and some text with
        extra indentation
    on the next line,
    plus another line at the end.


  - "End."

Output:

Several lines of text,
with some "quotes" of various 'types',
and also a blank line:

and some text with
    extra indentation
on the next line,
plus another line at the end.
End.

If instead you want them to be replaced by spaces, use the folded style, indicated by a right angle bracket >. To get a newline using the folded style, leave a blank line by putting two newlines in. Lines with extra indentation are also not folded.

PDL:

text:
  - >
    Several lines of text,
    with some "quotes" of various 'types',
    and also a blank line:

    and some text with
        extra indentation
    on the next line,
    plus another line at the end.


  - "End."

Output:

Several lines of text, with some "quotes" of various 'types', and also a blank line:
and some text with
    extra indentation
on the next line, plus another line at the end.
End.

Block Chomping Indicator: The chomping indicator controls what should happen with newlines at the end of the string. The default, clip, puts a single newline at the end of the string. To remove all newlines, strip them by putting a minus sign - after the style indicator. Both clip and strip ignore how many newlines are actually at the end of the block; to keep them all put a plus sign + after the style indicator.

PDL:

text:
  - |-
    Several lines of text,
    with some "quotes" of various 'types',
    and also a blank line:

    and some text with
        extra indentation
    on the next line,
    plus another line at the end.


  - "End."

Output:

Several lines of text,
with some "quotes" of various 'types',
and also a blank line:

and some text with
    extra indentation
on the next line,
plus another line at the end.End.

PDL:

text:
  - |+
    Several lines of text,
    with some "quotes" of various 'types',
    and also a blank line:

    and some text with
        extra indentation
    on the next line,
    plus another line at the end.


  - "End."

Output:

Several lines of text,
with some "quotes" of various 'types',
and also a blank line:

and some text with
    extra indentation
on the next line,
plus another line at the end.


End.

If you don't have enough newline characters using the above methods, you can always add more like so:

text:
  - |-
    Several lines of text,
    with some "quotes" of various 'types',
    and also a blank line:

    and some text with
        extra indentation
    on the next line,
    plus another line at the end.


  - "\n\n\n\n"
  - "End."

Output:

Several lines of text,
with some "quotes" of various 'types',
and also a blank line:

and some text with
    extra indentation
on the next line,
plus another line at the end.



End.

Indentation Indicator: Ordinarily, the number of spaces you're using to indent a block will be automatically guessed from its first line. You may need a block indentation indicator if the first line of the block starts with extra spaces. In this case, simply put the number of spaces used for indentation (between 1 and 9) at the end of the header.

PDL:

text:
  - |1
    Several lines of text,
    with some "quotes" of various 'types',
    and also a blank line:

    and some text with
        extra indentation
    on the next line.

Output:

 Several lines of text,
 with some "quotes" of various 'types',
 and also a blank line:

 and some text with
     extra indentation
 on the next line.

Flow Scalars

Single-quoted:

PDL:

text: 'Several lines of text,
  containing ''single quotes''. Escapes (like \n) don''t do anything.

  Newlines can be added by leaving a blank line.
    Leading whitespace on lines is ignored.'

Output:

Several lines of text, containing 'single quotes'. Escapes (like \n) don't do anything.
Newlines can be added by leaving a blank line. Leading whitespace on lines is ignored.

Double-quoted:

PDL:

text: "Several lines of text,
  containing \"double quotes\". Escapes (like \\n) work.\nIn addition,
  newlines can be esc\
  aped to prevent them from being converted to a space.

  Newlines can also be added by leaving a blank line.
    Leading whitespace on lines is ignored."

Output:

Several lines of text, containing "double quotes". Escapes (like \n) work.
In addition, newlines can be escaped to prevent them from being converted to a space.
Newlines can also be added by leaving a blank line. Leading whitespace on lines is ignored.

Plain:

PDL:

text: Several lines of text,
  with some "quotes" of various 'types'.
  Escapes (like \n) don't do anything.

  Newlines can be added by leaving a blank line.
    Additional leading whitespace is ignored.

Output:

Several lines of text, with some "quotes" of various 'types'. Escapes (like \n) don't do anything.
Newlines can be added by leaving a blank line. Additional leading whitespace is ignored.