RAG

Note

Added in 1.1.x release

Chunkers

LangChainChunker

class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.langchain_chunker.LangChainChunker(method='recursive', chunk_size=4000, chunk_overlap=200, encoding_name='gpt2', model_name=None, **kwargs)[source]

Bases: BaseChunker[Document]

Wrapper for LangChain TextSplitter.

Parameters:
  • method (Literal["recursive", "character", "token"], optional) – the type of TextSplitter used as the main instance performing the chunking, defaults to “recursive”

  • chunk_size (int, optional) – maximum size of a single chunk that is returned, defaults to 4000

  • chunk_overlap (int, optional) – overlap in characters between chunks, defaults to 200

  • encoding_name (str, optional) – encoding used in the TokenTextSplitter, defaults to “gpt2”

  • model_name (str, optional) – model used in the TokenTextSplitter

from ibm_watsonx_ai.foundation_models.extensions.rag.chunker import LangChainChunker

text_splitter = LangChainChunker(
    method="recursive",
    chunk_size=1000,
    chunk_overlap=200
)

chunks_ids = []

for i, document in enumerate(data_loader):
    chunks = text_splitter.split_documents([document])
    chunks_ids.append(vector_store.add_documents(chunks, batch_size=300))
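The interplay of chunk_size and chunk_overlap can be illustrated with a minimal sketch. This naive fixed-size character splitter is not the SDK's implementation (LangChainChunker delegates to LangChain's TextSplitter classes, which split on separators recursively), but it shows how overlapping windows are produced:

```python
def naive_split(text, chunk_size, chunk_overlap):
    """Split text into windows of at most chunk_size characters, where each
    window starts chunk_overlap characters before the previous window ends."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = naive_split("abcdefghij", chunk_size=4, chunk_overlap=2)
# Each chunk is at most 4 characters and shares 2 characters with its neighbor:
# ["abcd", "cdef", "efgh", "ghij", "ij"]
```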
classmethod from_dict(d)[source]

Create an instance from a dictionary.

split_documents(documents)[source]

Split a series of documents into smaller chunks based on the provided chunker settings. Each chunk's metadata includes the document_id, sequence_number, and start_index.

Parameters:

documents (Sequence[langchain_core.documents.Document]) – sequence of elements that contain context in the format of text

Returns:

list of documents split into smaller chunks, each with less content

Return type:

list[langchain_core.documents.Document]

supported_methods = ('recursive', 'character', 'token')
to_dict()[source]

Return dict that can be used to recreate instance of the LangChainChunker.
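Together, to_dict and from_dict let you round-trip chunker settings through JSON, for example to store a pipeline configuration. A minimal sketch, assuming to_dict returns a flat, JSON-serializable dict (the exact keys shown here are hypothetical, not the documented schema):

```python
import json

# Hypothetical settings dict in the shape that to_dict() is assumed to produce.
settings = {"method": "recursive", "chunk_size": 1000, "chunk_overlap": 200}

# Persist as JSON; LangChainChunker.from_dict(json.loads(payload)) would then
# rebuild an equivalent chunker from the stored settings.
payload = json.dumps(settings)
assert json.loads(payload) == settings
```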

BaseChunker

class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.base_chunker.BaseChunker[source]

Bases: ABC, Generic[ChunkType]

Class responsible for handling operations of splitting documents within the RAG application.

abstract classmethod from_dict(d)[source]

Create an instance from a dictionary.

abstract split_documents(documents)[source]

Split a series of documents into smaller parts based on the provided chunker settings.

Parameters:

documents – sequence of elements that contain context in the format of text

Type:

Sequence[ChunkType]

Returns:

list of documents split into smaller chunks, each with less content

Return type:

list[ChunkType]

abstract to_dict()[source]

Return dict that can be used to recreate instance of the BaseChunker.

Retrievers

class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.Retriever(vector_store, method=RetrievalMethod.SIMPLE, window_size=2, number_of_chunks=5)[source]

Bases: BaseRetriever

Retriever class that handles the retrieval operation for a RAG implementation. The retrieve method returns the number_of_chunks document segments most relevant to the provided query, using the configured method.

Parameters:
  • vector_store (BaseVectorStore) – vector store to use for the retrieval

  • method (RetrievalMethod, optional) – default retrieval method to use when calling retrieve, defaults to RetrievalMethod.SIMPLE

  • window_size (int, optional) – number of adjacent chunks to retrieve before and after the center chunk when method is RetrievalMethod.WINDOW, defaults to 2

  • number_of_chunks (int, optional) – number of expected document chunks to be returned, defaults to 5

You can create a repeatable retrieval that returns the three most similar documents by using a simple proximity search. To do this, create a VectorStore and then define a Retriever.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.extensions.rag import Retriever, RetrievalMethod
from ibm_watsonx_ai.foundation_models.embeddings import SentenceTransformerEmbeddings

api_client = APIClient(credentials)

vector_store = VectorStore(
        api_client,
        connection_id='***',
        params={
            'index_name': 'my_test_index',
        },
        embeddings=SentenceTransformerEmbeddings('sentence-transformers/all-MiniLM-L6-v2')
    )

retriever = Retriever(vector_store=vector_store, method=RetrievalMethod.SIMPLE, number_of_chunks=3)

retriever.retrieve("What is IBM known for?")
classmethod from_vector_store(vector_store, init_parameters=None)[source]

Deserialize this Retriever into a concrete instance using the provided arguments.

Parameters:
  • vector_store (BaseVectorStore) – vector store used to create a Retriever

  • init_parameters (dict[str, Any]) – parameters to initialize retriever with

Returns:

concrete Retriever or None if data is incorrect

Return type:

BaseRetriever | None

retrieve(query, **kwargs)[source]

Retrieve elements from the VectorStore by using the provided query.

Parameters:

query (str) – text query to be used for searching

Returns:

list of retrieved LangChain documents

Return type:

list[langchain_core.documents.Document]

to_dict()[source]

Serializes this Retriever's init_parameters so that the Retriever can be reconstructed by the from_vector_store class method.

Returns:

serialized init_parameters

Return type:

dict

class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.RetrievalMethod(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: str, Enum

SIMPLE = 'simple'
WINDOW = 'window'
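Because RetrievalMethod subclasses str, its members compare equal to their plain-string values, so either form can be passed where a method is expected. A minimal local equivalent illustrating this behavior (not the SDK class itself):

```python
from enum import Enum

class RetrievalMethod(str, Enum):
    """Local stand-in mirroring the documented str-valued enum."""
    SIMPLE = "simple"
    WINDOW = "window"

# str-valued enum members compare equal to their plain-string values,
# and the enum can be constructed back from a string.
assert RetrievalMethod.SIMPLE == "simple"
assert RetrievalMethod("window") is RetrievalMethod.WINDOW
```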
class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.BaseRetriever(vector_store)[source]

Bases: ABC

Abstract class for all retriever handlers for the chosen vector store. Returns document chunks in a RAG pipeline using a concrete retrieve implementation.

Parameters:

vector_store (BaseVectorStore) – vector store used in document retrieval

abstract classmethod from_vector_store(vector_store, init_parameters=None)[source]

Deserialize this Retriever into a concrete instance using the provided arguments.

Parameters:
  • vector_store (BaseVectorStore) – vector store used to create a Retriever

  • init_parameters (dict[str, Any]) – parameters to initialize retriever with

Returns:

concrete Retriever or None if data is incorrect

Return type:

BaseRetriever | None

abstract retrieve(query, **kwargs)[source]

Retrieve elements from the VectorStore using the provided query.

Parameters:

query (str) – text query to be used for searching

Returns:

list of retrieved LangChain documents

Return type:

list[langchain_core.documents.Document]

to_dict()[source]

Serializes this Retriever's init_parameters so that the Retriever can be reconstructed by the from_vector_store class method.

Returns:

serialized init_parameters

Return type:

dict

Vector Stores

VectorStore

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store.VectorStore(api_client=None, *, connection_id=None, embeddings=None, index_name=None, datasource_type=None, distance_metric=None, langchain_vector_store=None, **kwargs)[source]

Bases: BaseVectorStore

Universal vector store client for a RAG pattern.

Instantiates the vector store connection in the Watson Machine Learning environment and handles the necessary operations. The keyword arguments are passed to the constructor of the particular vector store client, so they may be parsed differently depending on the data source.

For details, refer to VectorStoreConnector get_... methods.

Can utilize the custom embedding function. This function can be provided in the constructor or by the set_embeddings method. For available embeddings, refer to the ibm_watsonx_ai.foundation_models.embeddings module.

Parameters:
  • api_client (APIClient, optional) – WatsonX API client required if connecting by connection_id, defaults to None

  • connection_id (str, optional) – connection asset ID, defaults to None

  • embeddings (BaseEmbeddings, optional) – default embeddings to be used, defaults to None

  • index_name (str, optional) – name of the vector database index, defaults to None

  • datasource_type (VectorStoreDataSourceType, str, optional) – data source type to use when connection_id is not provided; keyword arguments will be used to establish the connection, defaults to None

  • distance_metric (Literal["euclidean", "cosine"], optional) – metric used for determining vector distance, defaults to None

  • langchain_vector_store (VectorStore, optional) – use LangChain vector store, defaults to None

Example

To connect, provide the connection asset ID. You can use custom embeddings to add and search documents.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

api_client = APIClient(credentials)

embedding = Embeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
    api_client=api_client
)

vector_store = VectorStore(
    api_client,
    connection_id='***',
    index_name='my_test_index',
    embeddings=embedding
)

vector_store.add_documents([
    {'content': 'document one content', 'metadata': {'url': 'ibm.com'}},
    {'content': 'document two content', 'metadata': {'url': 'ibm.com'}}
])

vector_store.search('one', k=1)

Note

Optionally, as in LangChain, you can use cloud ID and API key parameters to connect to Elastic Cloud. The keyword arguments are passed directly to LangChain’s ElasticsearchStore constructor.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore

api_client = APIClient(credentials)

vector_store = VectorStore(
    api_client,
    index_name='my_test_index',
    model_id=".elser_model_2_linux-x86_64",
    cloud_id='***',
    api_key='***'
)

vector_store.add_documents([
    {'content': 'document one content', 'metadata': {'url': 'ibm.com'}},
    {'content': 'document two content', 'metadata': {'url': 'ibm.com'}}
])

vector_store.search('one', k=1)
add_documents(content, **kwargs)[source]

Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are derived from a hash of each document’s text content, so redundant duplicates will not be added.

The list must contain either strings, dicts with a required content field of str type, or LangChain Documents.

Parameters:

content (list[str] | list[dict] | list) – unstructured list of data to be added

Returns:

list of ids

Return type:

list[str]
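The upsert behavior rests on content-derived IDs. A sketch of the idea (the SDK's actual hashing scheme is not specified here, so this is illustrative only):

```python
import hashlib

def content_id(text):
    """Derive a deterministic ID from the document text, so re-adding
    the same content yields the same ID and behaves as an upsert."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# The duplicate "doc one" collapses to a single ID, leaving two unique IDs.
ids = {content_id(t) for t in ["doc one", "doc two", "doc one"]}
assert len(ids) == 2
```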

async add_documents_async(content, **kwargs)[source]

Add documents to the RAG’s vector store asynchronously. The list must contain either strings, dicts with a required content field of str type, or LangChain Documents.

Parameters:

content (list[str] | list[dict] | list) – unstructured list of data to be added

Returns:

list of ids

Return type:

list[str]

as_langchain_retriever(**kwargs)[source]

Creates a LangChain retriever from this vector store.

Returns:

LangChain retriever which can be used in LangChain pipelines

Return type:

langchain_core.vectorstores.VectorStoreRetriever

clear()[source]

Clears the current collection that is being used by the VectorStore. Removes all documents with all their metadata and embeddings.

count()[source]

Return the number of docs in the current collection.

Returns:

count of all documents in the collection

Return type:

int

delete(ids, **kwargs)[source]

Delete documents with provided ids.

Parameters:

ids (list[str]) – IDs of documents to delete

classmethod from_dict(client=None, data=None)[source]

Creates a VectorStore from a plain dict, such as one produced by the to_dict() method.

Parameters:
  • client (APIClient, optional) – API client to use for the reconstructed VectorStore, defaults to None

  • data (dict) – dict in the schema produced by the to_dict() method

Returns:

reconstructed VectorStore

Return type:

VectorStore

get_client()[source]

Returns underlying native VectorStore client.

Returns:

wrapped VectorStore client

Return type:

Any

search(query, k, include_scores=False, verbose=False, **kwargs)[source]

Searches for the documents most similar to the query.

The method is designed as a wrapper for respective LangChain VectorStores’ similarity search methods. Therefore, additional search parameters passed in kwargs should be consistent with those methods, and can be found in the LangChain documentation as they may differ depending on the connection type: Milvus, Chroma, Elasticsearch etc.

Parameters:
  • query (str) – text query

  • k (int) – number of documents to retrieve

  • include_scores (bool) – whether similarity scores of found documents should be returned, defaults to False

  • verbose (bool) – whether to display a table with found documents, defaults to False

Returns:

list of found documents

Return type:

list
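When include_scores=True, each result is assumed to carry a similarity score alongside the document; the exact shape may vary by connection type, so treat the (document, score) pairs below as an assumption. A small helper for thresholding such results:

```python
def filter_by_score(results, threshold):
    """Keep only (document, score) pairs at or above the threshold.
    Assumes include_scores=True yields (document, score) pairs, which
    may differ depending on the underlying connector."""
    return [(doc, score) for doc, score in results if score >= threshold]

hits = filter_by_score([("chunk a", 0.91), ("chunk b", 0.42)], threshold=0.5)
assert hits == [("chunk a", 0.91)]
```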

set_embeddings(embedding_fn)[source]

If possible, sets a default embedding function. Use types inherited from BaseEmbeddings if you want the store to remain compatible with RAGPattern deployment. The embedding_fn argument can be a LangChain embeddings object, but serialization issues will occur.

Deprecated: The set_embeddings method of the VectorStore class is deprecated because it may cause issues with langchain >= 0.2.0.

Parameters:

embedding_fn (BaseEmbeddings) – embedding function

to_dict()[source]

Serialize VectorStore into a dict that allows reconstruction using from_dict class method.

Returns:

dict for the from_dict initialization

Return type:

dict

window_search(query, k, include_scores=False, verbose=False, window_size=2)[source]

Similarly to the search method, gets documents (chunks) that fit the query. Each chunk is extended with its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged. This requires chunks to have document_id and sequence_number in their metadata.

Parameters:
  • query (str) – question asked by a user

  • k (int) – max number of similar documents

  • include_scores (bool, optional) – return scores for documents, defaults to False

  • verbose (bool, optional) – print formatted response to the output, defaults to False

  • window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.

Returns:

list of found documents (extended into windows).

Return type:

list
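The adjacent-chunk merging described above can be sketched in plain Python. This is illustrative only, not the SDK implementation: chunks are ordered by sequence_number and any overlapping text between neighbors is emitted once.

```python
def merge_window(chunks):
    """Merge adjacent chunks (sorted by sequence_number) into one text,
    joining any overlapping suffix/prefix text only once."""
    chunks = sorted(chunks, key=lambda c: c["metadata"]["sequence_number"])
    merged = chunks[0]["content"]
    for chunk in chunks[1:]:
        text = chunk["content"]
        # Find the longest suffix of `merged` that is a prefix of `text`.
        overlap = 0
        for size in range(min(len(merged), len(text)), 0, -1):
            if merged.endswith(text[:size]):
                overlap = size
                break
        merged += text[overlap:]
    return merged

window = [
    {"content": "quick brown fox", "metadata": {"document_id": "d1", "sequence_number": 2}},
    {"content": "the quick", "metadata": {"document_id": "d1", "sequence_number": 1}},
]
# Overlapping "quick" is merged once: "the quick brown fox"
assert merge_window(window) == "the quick brown fox"
```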

BaseVectorStore

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.base_vector_store.BaseVectorStore[source]

Bases: ABC

Base abstract class for all vector store-like classes. An interface that supports simple database operations.

abstract add_documents(content, **kwargs)[source]

Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are derived from a hash of each document’s text content, so redundant duplicates will not be added.

The list must contain either strings, dicts with a required content field of str type, or LangChain Documents.

Parameters:

content (list[str] | list[dict] | list) – unstructured list of data to be added

Returns:

list of ids

Return type:

list[str]

abstract async add_documents_async(content, **kwargs)[source]

Add documents to the RAG’s vector store asynchronously. The list must contain either strings, dicts with a required content field of str type, or LangChain Documents.

Parameters:

content (list[str] | list[dict] | list) – unstructured list of data to be added

Returns:

list of ids

Return type:

list[str]

abstract as_langchain_retriever(**kwargs)[source]

Creates a LangChain retriever from this vector store.

Returns:

LangChain retriever which can be used in LangChain pipelines

Return type:

langchain_core.vectorstores.VectorStoreRetriever

abstract clear()[source]

Clears the current collection that is being used by the VectorStore. Removes all documents with all their metadata and embeddings.

abstract count()[source]

Return the number of docs in the current collection.

Returns:

count of all documents in the collection

Return type:

int

abstract delete(ids, **kwargs)[source]

Delete documents with provided ids.

Parameters:

ids (list[str]) – IDs of documents to delete

abstract get_client()[source]

Returns underlying native VectorStore client.

Returns:

wrapped VectorStore client

Return type:

Any

abstract search(query, k, include_scores=False, verbose=False, **kwargs)[source]

Get documents that would fit the query.

Parameters:
  • query (str) – question asked by a user

  • k (int) – max number of similar documents

  • include_scores (bool, optional) – return scores for documents, defaults to False

  • verbose (bool, optional) – print formatted response to the output, defaults to False

Returns:

list of found documents

Return type:

list

abstract set_embeddings(embedding_fn)[source]

If possible, sets a default embedding function. Use types inherited from BaseEmbeddings if you want the store to remain compatible with RAGPattern deployment. The embedding_fn argument can be a LangChain embeddings object, but serialization issues will occur.

Deprecated: The set_embeddings method of the VectorStore class is deprecated because it may cause issues with langchain >= 0.2.0.

Parameters:

embedding_fn (BaseEmbeddings) – embedding function

abstract window_search(query, k, include_scores=False, verbose=False, window_size=2)[source]

Similarly to the search method, gets documents (chunks) that fit the query. Each chunk is extended with its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged. This requires chunks to have document_id and sequence_number in their metadata.

Parameters:
  • query (str) – question asked by a user

  • k (int) – max number of similar documents

  • include_scores (bool, optional) – return scores for documents, defaults to False

  • verbose (bool, optional) – print formatted response to the output, defaults to False

  • window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.

Returns:

list of found documents (extended into windows).

Return type:

list

VectorStoreConnector

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store_connector.VectorStoreConnector(properties=None)[source]

Bases: object

Creates the proper vector store client using the provided properties.

Properties are arguments to the LangChain VectorStore of the desired type. The connector also parses properties extracted from Connection assets into a form that fits the client’s initialization.

Custom or Connection asset properties that are parsed include:
  • index_name

  • distance_metric

  • username

  • password

  • ssl_certificate

  • embeddings

Parameters:

properties (dict) – dictionary with all required key values to establish connection.
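A hypothetical properties dict for an Elasticsearch-backed store is sketched below; beyond the parsed keys listed above, the accepted keys depend on the target datasource and its LangChain constructor, so treat the values as placeholders:

```python
# Hypothetical example of a properties dict passed to VectorStoreConnector.
# Only index_name and distance_metric are among the documented parsed keys;
# the remaining entries are placeholders for connection credentials.
properties = {
    "index_name": "my_test_index",
    "distance_metric": "cosine",
    "username": "elastic",
    "password": "***",
}
assert {"index_name", "distance_metric"} <= properties.keys()
```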

get_chroma()[source]

Creates Chroma in-memory vector store.

Raises:

ImportError – langchain required

Returns:

vector store adapter for LangChain’s Chroma

Return type:

LangChainVectorStoreAdapter

get_elasticsearch()[source]

Creates Elasticsearch vector store.

Raises:

ImportError – langchain required

Returns:

vector store adapter for LangChain’s Elasticsearch

Return type:

LangChainVectorStoreAdapter

get_from_type(type)[source]

Gets a vector store based on the provided type (matching DataSource names from the SDK API).

Parameters:

type (VectorStoreDataSourceType) – DataSource type string from SDK API

Raises:

TypeError – unsupported type

Returns:

proper BaseVectorStore type constructed from properties

Return type:

BaseVectorStore

get_langchain_adapter(langchain_vector_store)[source]

Creates adapter for concrete vector store from LangChain.

Parameters:

langchain_vector_store (Any) – object that is subclass of LangChain VectorStore

Raises:

ImportError – LangChain required

Returns:

proper adapter for the vector store

Return type:

LangChainVectorStoreAdapter

get_milvus()[source]

Creates Milvus vector store.

Raises:

ImportError – langchain required

Returns:

vector store adapter for LangChain’s Milvus

Return type:

LangChainVectorStoreAdapter

static get_type_from_langchain_vector_store(langchain_vector_store)[source]

Returns DataSourceType for concrete LangChain VectorStore class.

Parameters:

langchain_vector_store (Any) – vector store object from LangChain

Returns:

DataSourceType name

Return type:

VectorStoreDataSourceType

VectorStoreDataSourceType

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store_connector.VectorStoreDataSourceType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: str, Enum

CHROMA = 'chroma'
ELASTICSEARCH = 'elasticsearch'
MILVUS = 'milvus'
UNDEFINED = 'undefined'

LangChainVectorStoreAdapter

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.langchain_vector_store_adapter.LangChainVectorStoreAdapter(vector_store)[source]

Bases: BaseVectorStore

Adapter for the LangChain VectorStore base class.

Parameters:

vector_store (langchain_core.vectorstores.VectorStore) – concrete LangChain vector store object

add_documents(content, **kwargs)[source]

Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are derived from a hash of each document’s text content, so redundant duplicates will not be added.

The list must contain either strings, dicts with a required content field of str type, or LangChain Documents.

Parameters:

content (list[str] | list[dict] | list) – unstructured list of data to be added

Returns:

list of ids

Return type:

list[str]

async add_documents_async(content, **kwargs)[source]

Add documents to the RAG’s vector store asynchronously. The list must contain either strings, dicts with a required content field of str type, or LangChain Documents.

Parameters:

content (list[str] | list[dict] | list) – unstructured list of data to be added

Returns:

list of ids

Return type:

list[str]

as_langchain_retriever(**kwargs)[source]

Creates a LangChain retriever from this vector store.

Returns:

LangChain retriever which can be used in LangChain pipelines

Return type:

langchain_core.vectorstores.VectorStoreRetriever

clear()[source]

Clears the current collection that is being used by the VectorStore. Removes all documents with all their metadata and embeddings.

count()[source]

Return the number of docs in the current collection.

Returns:

count of all documents in the collection

Return type:

int

delete(ids, **kwargs)[source]

Delete documents with provided ids.

Parameters:

ids (list[str]) – IDs of documents to delete

get_client()[source]

Returns underlying native VectorStore client.

Returns:

wrapped VectorStore client

Return type:

Any

search(query, k, include_scores=False, verbose=False, **kwargs)[source]

Get documents that would fit the query.

Parameters:
  • query (str) – question asked by a user

  • k (int) – max number of similar documents

  • include_scores (bool, optional) – return scores for documents, defaults to False

  • verbose (bool, optional) – print formatted response to the output, defaults to False

Returns:

list of found documents

Return type:

list

set_embeddings(embedding_fn)[source]

If possible, sets a default embedding function. Use types inherited from BaseEmbeddings if you want the store to remain compatible with RAGPattern deployment. The embedding_fn argument can be a LangChain embeddings object, but serialization issues will occur.

Deprecated: The set_embeddings method of the VectorStore class is deprecated because it may cause issues with langchain >= 0.2.0.

Parameters:

embedding_fn (BaseEmbeddings) – embedding function

window_search(query, k, include_scores=False, verbose=False, window_size=2)[source]

Similarly to the search method, gets documents (chunks) that fit the query. Each chunk is extended with its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged. This requires chunks to have document_id and sequence_number in their metadata.

Parameters:
  • query (str) – question asked by a user

  • k (int) – max number of similar documents

  • include_scores (bool, optional) – return scores for documents, defaults to False

  • verbose (bool, optional) – print formatted response to the output, defaults to False

  • window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.

Returns:

list of found documents (extended into windows).

Return type:

list