Creating an assistant and configuring conversational search#

watsonx Orchestrate lets you create and configure an assistant with conversational search capabilities. In this section, you configure your assistant to use conversational search through a hosted OpenSearch instance. The preconfigured "Z RAG" instance, built on OpenSearch, contains over 220 knowledge sources and supports Retrieval Augmented Generation (RAG). The large language model (LLM) that provides conversational AI is augmented with this knowledge, which is based on IBM Z documentation, so that it generates IBM Z context-aware responses grounded in that content.

A high-level, logical architecture of the environment is illustrated in the following diagram.

Access the ITZ IBM Cloud account for the watsonx Assistant for Z Pilot environment#

In the IBM Technology Zone portal, expand My TechZone and select My Reservations, or click the following link.

ITZ My reservations
Click the watsonx Assistant for Z Pilot - watsonx Orchestrate tile.
Record the ITZ IBM Cloud account name associated with the reservation.

Did you read the tip on the welcome page about creating a reference card? Check it out here.
Click the IBM Cloud Login link.

Steps to authenticate to IBM Cloud are not illustrated here.

You may need to authenticate to IBM Cloud after clicking the link. These steps are not shown here as they may vary by individual.
Verify that the current IBM Cloud account is the same as the account name recorded in step 3. If the account is not the same, switch to the proper account.

Note: The formatting of the name can appear differently than what is shown in the ITZ reservation.

If the proper account is not listed, click the account drop down and select the proper account.

Note: If your browser window is narrow, the account drop down can be depicted with the Switch Account icon.

Create your Assistant#

Click the Resources icon.
Expand the AI / Machine Learning section and click the watsonx Orchestrate instance listed (your instance name differs from the one shown in the following image).
Click Launch watsonx Orchestrate.
Request a skills-based tenant of watsonx Orchestrate

As of April 30, 2025, every new watsonx Orchestrate tenant that you provision defaults to the new agentic experience. To complete the exercises in this guide and set up a wxa4z demo or pilot environment, you must manually revert your tenant to the skills-based experience.

This can be done by following the instructions outlined here.

For Business Partners, please reach out to your IBM contact to assist you with this step.

Click the AI assistant builder tile to start creating a new assistant.
Enter a name and optional description for your assistant and click Next.
Complete the Personalize your assistant form and click Next.

Explore the personalization options. In creating an assistant for a client pilot, consider specifying attributes that align with the client's business.

a. Select Web.

b. Select the industry of your choice.

c. Select the role of your choice.

d. Select the need of your choice.
Complete the Customize your chat UI form and click Next.

Explore the customization options. When creating an assistant for a client pilot, consider specifying attributes that align with the client (for example, colors and logos).
Preview your assistant and then click Create.

The assistant is now created.

Configure conversational search#

In the next steps, you configure conversational search for your assistant by using a hosted instance of OpenSearch.

Click the Generative AI menu item in the left navigation.
Select granite-3-8b-instruct for the base large language model (LLM) settings.
Click Set up your Search Integration.

By default, conversational search is not enabled when an assistant is created. Conversational search takes priority over general-purpose answering if both are enabled. Learn more about conversational search in watsonx here.
Click Custom service.
Complete the Custom service (a-e) form and then click Next (f).

a. Select By providing credentials.

b. Enter the following value in the URL field (use the copy icon to avoid typographical errors). This is the URL for the shared OpenSearch instance. In later sections, you create and customize a dedicated instance.
```
https://wxa4z-opensearch-wrapper-wxa4z-v2-2-9.wxa4z-zassistantdeploys-47e063e6a3ad1f71bf2e58f91c3b4c2e-0000.us-east.containers.appdomain.cloud/v1/query
```
c. Select Basic authentication in the Choose an authentication type drop-down list.

d. Enter admin in the Username field.

e. Enter secureP@ssw0rd! in the Password field.
```
secureP@ssw0rd!
```
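If you want to see how the values in this form fit together outside of the UI, the following is a minimal sketch in Python. It builds, but does not send, an HTTP request that carries the same URL and Basic authentication credentials. The helper names and the JSON body with a `query` key are assumptions made for illustration; this guide does not describe the request schema that the custom service expects.

```python
import base64
import json
import urllib.request

# Values copied from the Custom service form fields in this guide.
SERVICE_URL = (
    "https://wxa4z-opensearch-wrapper-wxa4z-v2-2-9."
    "wxa4z-zassistantdeploys-47e063e6a3ad1f71bf2e58f91c3b4c2e-0000."
    "us-east.containers.appdomain.cloud/v1/query"
)
USERNAME = "admin"
PASSWORD = "secureP@ssw0rd!"


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_query_request(question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the custom service.

    Assumption: the {"query": ...} body is illustrative only; the real
    request schema is not described in this guide.
    """
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        SERVICE_URL,
        data=body,
        headers={
            "Authorization": basic_auth_header(USERNAME, PASSWORD),
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_query_request("What is z/OS continuous delivery?")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

The sketch stops short of calling `urllib.request.urlopen` so that it runs without network access. The assistant itself performs the real call after you save the form.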
Enable conversational search and then click Save.
Update the conversational search custom service settings based on your requirements.

Note: The Settings page is divided into two sections in the following images to enhance the visibility of the screen captures.

Learn more about these custom service settings here.

The following settings are proven to work well. You can experiment with these settings to see how they affect queries for your client's pilot.

a. Enable Conversational search.

b. Select Single turn. Multi-turn conversation (by selecting Entire conversation) is supported by the offering, but has not been fully included in the lab guide. See the callout in the Testing conversational search section below.

c. Specify the text that appears to instruct the user to expand the list of citations in the assistant (except web chat client).

d. Select Lowest for the retrieval confidence threshold setting. This setting checks the confidence of the retrieved citations before a response is generated.

e. Select Verbose for the generated response length. This setting affects the average response length. Depending on user input, variations from the selected length can occur.

f. Select Lowest for the response confidence threshold. This setting checks the confidence of the generated citations after the response is generated.

g. Keep the default setting of All for the listing of citations.

h. Keep the Default filter field empty.

i. The Metadata field provides a way to adjust your assistant’s behavior during conversational search for your OpenSearch instance. This option is explored in detail in the Installing and using zassist to ingest client documents section. Leave the field empty for now.

j. The Search display text options specify the default text displayed when no results are found or when connectivity issues to the backend search service occur. You can keep the defaults or customize the service.
Click Save (a) and then click Close (b).
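When you experiment with these settings during a pilot, it can help to keep a record of the baseline so that you can see what you changed. The following Python sketch captures the values from steps a through i and reports any differences. The class and field names are hypothetical and exist only for note-keeping; they are not part of any watsonx API.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ConversationalSearchSettings:
    """Hypothetical record of the settings page; names are illustrative."""

    enabled: bool = True                            # step a
    contextual_awareness: str = "Single turn"       # step b
    retrieval_confidence_threshold: str = "Lowest"  # step d
    generated_response_length: str = "Verbose"      # step e
    response_confidence_threshold: str = "Lowest"   # step f
    citations: str = "All"                          # step g
    default_filter: str = ""                        # step h
    metadata: str = ""                              # step i


RECOMMENDED = ConversationalSearchSettings()


def diff_from_recommended(current: ConversationalSearchSettings) -> dict:
    """Return {field: (recommended, current)} for each setting that differs."""
    base = asdict(RECOMMENDED)
    return {
        name: (base[name], value)
        for name, value in asdict(current).items()
        if base[name] != value
    }


experiment = ConversationalSearchSettings(
    contextual_awareness="Entire conversation"
)
print(diff_from_recommended(experiment))
```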

Complete the configuration#

After you save and close the Conversational search configuration page, a few more configurations are needed to get the best experience from your conversational chat. Details on these settings are available here.

Hover over the Generative AI icon in the left navigation and click Actions.
Click Set by assistant under the All items menu.
Click No matches.
Click Step 1 under Conversation steps.
Select without conditions (a) in the Is taken drop-down menu and then click Clear conditions (b).

Note: The Is taken value does not change from with conditions after you select without conditions.
Delete the default text in the Assistant says entry field.
Expand the And then drop-down menu and select Search for the answer.
Click Edit settings.
Click After generation.
Select End the action after this step and then click Apply.
Click the Save icon.
Select Step 2 (No matches count) under Conversation steps and click the delete icon.
Click Delete in the confirmation dialog to delete Conversation step 2.
Click Close (the x icon) to close the Editor window.
Click Fallback in the Actions table.
Delete all of the Conversation steps except for the last one (Step 6).

Note: You need to select each step individually. Click the delete icon and confirm the deletion for each of the first 5 steps.
Verify that the first 5 Conversation steps are deleted and then click the x to close the Editor window.
Click the Global settings icon.
Click No matches under the Conversation routing tab.
Move the slider to More often (or select More often in the drop-down).

The setting helps ensure that actions are triggered less often unless the user’s query specifically matches the action’s input.
Click Autocorrection.
Click the autocorrection toggle to turn the feature Off.
Click Save (a) and then Close (b).
Hover over the Home icon and click Environments.
Click Web chat.
On the Style tab, click the Streaming toggle to enable streaming.

The streaming setting allows responses to be streamed to the assistant and displayed as they are generated versus waiting until the full response is received and then displayed.
Click the Home screen tab.
Customize the Home screen by setting a custom Greeting message and deleting the default Conversation starters. Optionally, adjust the Background style.
Click Suggestions.
Click the Suggestions toggle to turn this feature Off.
Click Save and exit (a) and then click Close (b).
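To make the effect of the Streaming toggle concrete, the following Python sketch contrasts the two display behaviors described earlier in this section. It is a generic illustration of streamed versus buffered output, not the web chat implementation; the function names and the word-by-word chunking are assumptions.

```python
from typing import Iterable, Iterator, List


def generate_chunks(answer: str) -> Iterator[str]:
    """Stand-in for a model that produces a response piece by piece."""
    for word in answer.split():
        yield word + " "


def display_streamed(chunks: Iterable[str]) -> List[str]:
    """Record what the user sees after each chunk arrives."""
    shown = ""
    frames = []
    for chunk in chunks:
        shown += chunk
        frames.append(shown)
    return frames


def display_buffered(chunks: Iterable[str]) -> List[str]:
    """Wait for the full response, then show it once."""
    return ["".join(chunks)]


answer = "An IPL loads the operating system"
streamed = display_streamed(generate_chunks(answer))
buffered = display_buffered(generate_chunks(answer))

print(len(streamed), "screen updates with streaming")
print(len(buffered), "screen update without streaming")
print(streamed[-1] == buffered[0])  # the final text is the same either way
```

Both paths end with the same text. Streaming only changes how soon the user starts to see it.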

Modifying the Chat UI#

Due to an update of watsonx Orchestrate SaaS on IBM Cloud, you need to change one setting that is set by default when you launch the UI for the first time. This update is not yet available in the watsonx Assistant for Z product. To successfully complete the skills exercises later on, switch the Chat settings to Legacy Chat by following the steps below before moving on.

Click on your profile icon in the top-right corner of the screen and click on Settings.
Click on the Chat version tab and then click on Switch to legacy chat in blue to make the change.
In the new pop-up window, confirm the setting change by clicking on Change to legacy chat.
Once changed, verify that your Chat version is now set to Legacy Chat as shown below.

Configure the base large language model#

There are enhancements that you can make to configure how the large language model (LLM) responds to your queries, including adding prompt instructions and configuring the LLM’s answer behavior. The options are summarized here.

Hover over the Home icon and click Generative AI.
Click Add instructions.
Enter a prompt instruction.

Your assistant's LLM gives refined responses by following the prompt's instructions, which clarify how to achieve the end-goal of an action.

Enter prompt instructions in the field. The maximum number of characters you can enter in the prompt instruction field is 1,000.

The following is an example prompt instruction that provides shorter, concise responses. Experiment with different prompt instructions, for example the longer, more verbose prompt instructions below.
```
You are a subject matter expert on mainframe systems. Please respond to all prompts with truth and accuracy. Keep all answers short and concise, unless requested to provide details.
```
Note: When you type the instructions in, they are saved automatically and the LLM applies them immediately.

Customizing prompt instructions

Prompt instructions are highly customizable and should be tested prior to delivering a demo or pilot. The prompt instructions provided above are just one example. If you prefer more detailed responses with example commands and bulleted lists, consider prompt instructions similar to the following:

You are a subject matter expert on IBM Z mainframe systems. Please respond to all prompts with truth and accuracy. Provide detailed and bulleted answers with headings, along with examples and commands when requested. DO NOT guess the answer.

Try it out yourself!
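Because the prompt instruction field accepts at most 1,000 characters, you may want to check a draft before you paste it in. The following Python sketch does that for the two example instructions from this section. The helper name is hypothetical, and counting characters with `len` is an assumption about how the field measures length.

```python
MAX_PROMPT_INSTRUCTION_CHARS = 1000  # limit stated in this guide


def remaining_characters(instruction: str) -> int:
    """Return how many characters are left, or raise if over the limit.

    Assumption: the field counts characters the same way len() does.
    """
    remaining = MAX_PROMPT_INSTRUCTION_CHARS - len(instruction)
    if remaining < 0:
        raise ValueError(
            f"Prompt instruction is {-remaining} characters over the limit"
        )
    return remaining


concise = (
    "You are a subject matter expert on mainframe systems. "
    "Please respond to all prompts with truth and accuracy. "
    "Keep all answers short and concise, unless requested to provide details."
)
detailed = (
    "You are a subject matter expert on IBM Z mainframe systems. "
    "Please respond to all prompts with truth and accuracy. "
    "Provide detailed and bulleted answers with headings, along with "
    "examples and commands when requested. DO NOT guess the answer."
)

for name, text in (("concise", concise), ("detailed", detailed)):
    print(name, "characters remaining:", remaining_characters(text))
```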
Toggle General-purpose answering to Off and then click the Save icon.

The ability exists to configure the answering behavior of your assistant to provide responses that are based on the preinstalled content or general content.

On the Generative AI page (under Prompt Instructions), you see the Answer behavior section. After you configure Conversational search, you see that it is enabled (toggled on) with the search integration added.

If you enable both general-purpose answering and conversational search, conversational search takes precedence over general-purpose answering.

Recommendation: For purposes of retrieving Z-specific answers and responses, it is recommended that you turn off general-purpose answering and leave only conversational search turned on.
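The precedence rule described above can be summarized as a small decision function. The following Python sketch is only a restatement of that rule for the four toggle combinations; the function name and return strings are hypothetical, and the product's internal routing is not described in this guide.

```python
def answer_source(conversational_search: bool, general_purpose: bool) -> str:
    """Return which answering behavior applies for a pair of toggles."""
    if conversational_search:
        # Conversational search wins whenever it is enabled.
        return "conversational search"
    if general_purpose:
        return "general-purpose answering"
    return "neither"


for cs in (True, False):
    for gp in (True, False):
        print(f"conversational search={cs}, general-purpose={gp} ->",
              answer_source(cs, gp))
```

The recommended configuration in this guide corresponds to `answer_source(True, False)`.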

Testing conversational search#

Now, you can begin issuing queries to test the assistant's responses. For more detailed responses, try appending "Please provide a detailed response." to the end of your question.

Important: Modify settings iteratively based on your assessment of response quality. Review and change them at any time. For example, add extra prompt instructions, change response verbosity, and modify OpenSearch indexes.

Hover over the Home icon and click Preview.
Experiment with different prompts and validate that the answers are reasonable and related to IBM Z.

Other prompts and responses follow.

Note: The responses that you receive can vary from the ones shown.

Prompt:
```
What is z/OS continuous delivery?
```
Example output:

Prompt:
```
What is the APF list in z/OS?
```
Example output:

Prompt:
```
Why is Db2 different than other database systems?
```
Example output:

Prompt:
```
What happens during an IPL on IBM Z?
```
Example output:
Experiment with multi-turn (entire conversation) contextual awareness.

Support for multi-turn contextual awareness was added in the December 2024 release of IBM watsonx Assistant for Z. This capability enables the assistant to use the entire session history when it retrieves search results and generates answers. It handles context-dependent questions well, but it may over-rely on past topics even after the user has moved on.

Experiment with this setting by changing your custom service contextual awareness setting from Single turn to Entire conversation.

Once enabled, try sequential prompts like:
```
What are some features of z/OS?
```
```
Give me an itemized list?
```
```
Tell me more about item 3.
```
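The sequential prompts above show why contextual awareness matters: the last prompt means nothing without the earlier ones. The following Python sketch illustrates the difference between the two settings in the simplest possible way. It is a conceptual model only; the function name is hypothetical, and the way the product actually combines session history is not described in this guide.

```python
from typing import List


def search_input(user_turns: List[str], mode: str) -> str:
    """Return the text that a search step would see for each setting.

    Assumption: joining turns with newlines is purely illustrative.
    """
    if not user_turns:
        raise ValueError("at least one user turn is required")
    if mode == "Single turn":
        return user_turns[-1]
    if mode == "Entire conversation":
        return "\n".join(user_turns)
    raise ValueError(f"unknown contextual awareness mode: {mode}")


turns = [
    "What are some features of z/OS?",
    "Give me an itemized list?",
    "Tell me more about item 3.",
]

print(search_input(turns, "Single turn"))
print("---")
print(search_input(turns, "Entire conversation"))
```

With Single turn, only the final prompt is available, so "item 3" has no referent. With Entire conversation, the earlier z/OS question travels along with it, which also shows how an older topic can keep influencing later answers.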

You now have a working assistant that uses IBM watsonx Assistant for Z. Explore different prompt instructions and settings. If you encounter issues, refer to the Troubleshooting section that follows for resolution.

Continue to the Creating a stand-alone OpenSearch instance for document ingestion section to learn how to configure a dedicated OpenSearch instance for ingesting client-specific documentation into the RAG model.

Troubleshooting#

The following are issues that you may encounter. If the provided resolutions do not work, contact support by using the methods that are mentioned in the Support section.

Assistant responds to all prompts with, "I might have information related to your query to share, but am unable to connect to my knowledge base at the moment"

The assistant is unable to connect to the specified custom service URL. The cause could be a network issue, or the service may be down, restarting, or no longer running at that URL.

Before reaching out to Support, try the following:

Wait a few minutes and try again. The service may have been in the process of restarting.
If you printed this demonstration guide or saved a copy, verify you are using the most current version of the lab guide and the correct service URL (https://wxa4z-opensearch-wrapper-wxa4z-v2-2-9.wxa4z-zassistantdeploys-47e063e6a3ad1f71bf2e58f91c3b4c2e-0000.us-east.containers.appdomain.cloud/v1/query). The URL may have changed since you saved or printed the lab guide.
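The "wait a few minutes and try again" advice can be expressed as a simple retry loop. The following Python sketch shows the pattern under stated assumptions: `wait_for_service` is a hypothetical helper, the probe is any function you supply that returns True when the service responds, and the attempt count and delay are arbitrary defaults rather than values recommended by IBM. The demonstration uses a fake probe so that it runs without network access.

```python
import time
from typing import Callable


def wait_for_service(
    probe: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, pausing between failed tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # give a restarting service time to recover
    return False


# Demonstration: a fake probe that fails twice and then succeeds.
outcomes = iter([False, False, True])
waits = []
reachable = wait_for_service(lambda: next(outcomes), sleep=waits.append)
print("reachable:", reachable)
print("pauses taken (seconds):", waits)
```

If the loop still returns False after the final attempt, verify the service URL and then contact Support as described above.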