RAG¶

Note

Added in 1.1.x release

Chunkers¶

LangChainChunker¶

class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.langchain_chunker.LangChainChunker(method='recursive', chunk_size=4000, chunk_overlap=200, encoding_name='gpt2', model_name=None, **kwargs)[source]¶

Bases: BaseChunker[Document]

Wrapper for LangChain TextSplitter.

Parameters:

method (Literal["recursive", "character", "token"], optional) – describes the type of TextSplitter as the main instance performing the chunking, defaults to “recursive”
chunk_size (int, optional) – maximum size of a single chunk that is returned, defaults to 4000
chunk_overlap (int, optional) – overlap in characters between chunks, defaults to 200
encoding_name (str, optional) – encoding used in the TokenTextSplitter, defaults to “gpt2”
model_name (str, optional) – model used in the TokenTextSplitter

from ibm_watsonx_ai.foundation_models.extensions.rag.chunker import LangChainChunker

text_splitter = LangChainChunker(
    method="recursive",
    chunk_size=1000,
    chunk_overlap=200
)

# `data_loader` (an iterable of documents) and `vector_store` are assumed to be created beforehand.
chunks_ids = []

for document in data_loader:
    chunks = text_splitter.split_documents([document])
    chunks_ids.append(vector_store.add_documents(chunks, batch_size=300))

classmethod from_dict(d)[source]¶: Create an instance from the dictionary.

split_documents(documents)[source]¶

Split series of documents into smaller chunks based on the provided chunker settings. Each chunk has metadata that includes the document_id, sequence_number, and start_index.

Parameters:: documents (Sequence[langchain_core.documents.Document]) – sequence of elements that contain context in a text format
Returns:: list of documents split into smaller ones, having less content
Return type:: list[langchain_core.documents.Document]
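
The metadata described above can be illustrated with a small standalone sketch. It does not call LangChainChunker; it uses a plain fixed-size sliding window, and the Chunk class, the split_with_metadata helper, and the choice to start sequence_number at 0 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a LangChain Document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_with_metadata(text, document_id, chunk_size=20, chunk_overlap=5):
    """Fixed-size sliding-window split that records where each chunk came from."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each new chunk starts `step` characters later
    chunks = []
    for sequence_number, start_index in enumerate(range(0, len(text), step)):
        piece = text[start_index:start_index + chunk_size]
        chunks.append(
            Chunk(
                piece,
                {
                    "document_id": document_id,
                    "sequence_number": sequence_number,  # numbering from 0 is an assumption
                    "start_index": start_index,
                },
            )
        )
        if start_index + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


text = "The quick brown fox jumps over the lazy dog."
chunks = split_with_metadata(text, document_id="doc-1")
for chunk in chunks:
    print(chunk.metadata, repr(chunk.page_content))
```

Because every chunk carries document_id and sequence_number, a later step can find the neighbours of a chunk within its source document, which is what window-based retrieval relies on.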

supported_methods = ('recursive', 'character', 'token')¶

to_dict()[source]¶: Return dictionary that can be used to recreate an instance of the LangChainChunker.

HybridSemanticChunker¶

class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.hybrid_semantic_chunker.HybridSemanticChunker(embeddings, chunk_size=1024, allowed_chunk_size_deviation=1, **kwargs)[source]¶

Bases: BaseChunker[Document]

Chunker which uses similarity between inner segments of text to find optimal breakpoints.

Note

Added in 1.3.25

Parameters:

embeddings (Embeddings) – embeddings to be used to generate dense vectors
chunk_size (int) – approximate chunk size
allowed_chunk_size_deviation (float) – specifies the fraction by which each chunk’s size may vary from the target chunk_size
kwargs (dict) – additional chunker parameters:
- tfidf_buffer_size: number of breakpoint chunks to the left or right of each potential breakpoint used to generate TF-IDF vectors for similarity analysis, defaults to 5
- embedding_buffer_size: number of breakpoint chunks to the left or right of each potential breakpoint used to generate embeddings for similarity analysis, defaults to 5
- tfidf_weight: weight of the TF-IDF vector representations in similarity analysis, defaults to 0.5
- embedding_weight: weight of the embeddings in similarity analysis, defaults to 0.5

Example:

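# `embeddings` and `documents` (a sequence of Document objects) are assumed to be created beforehand.
chunker = HybridSemanticChunker(embeddings=embeddings)
chunks = chunker.split_documents(documents)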

or with vectors precomputing:

chunker = HybridSemanticChunker(embeddings=embeddings)
chunker.precompute_vectors(documents)
chunks = chunker.get_chunks()

classmethod from_dict(d)[source]¶

Create an instance from the dictionary.

Parameters:: d (dict) – dictionary that can be used to create an instance of the HybridSemanticChunker.

get_chunks(chunk_size=None, allowed_chunk_size_deviation=None, tfidf_weight=None, embedding_weight=None, **kwargs)[source]¶

Performs similarity analysis on vector representations of the texts between potential breakpoints and identifies the optimal ones.

Parameters:

chunk_size (int, optional) – approximate chunk size
allowed_chunk_size_deviation (float, optional) – specifies the fraction by which each chunk’s size may vary from the target chunk_size
tfidf_weight (float, optional) – weight of tfidf vectors representations in similarity analysis
embedding_weight (float, optional) – weight of embeddings in similarity analysis

Returns:

list of chunks

Return type:

list[Document]
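
To make the role of tfidf_weight and embedding_weight more concrete, here is a minimal sketch that blends two similarity scores per candidate breakpoint and picks the breakpoint where the neighbouring text is least similar. The linear blend, the cosine measure, the helper names, and the toy vectors are all assumptions for illustration; the source does not state how the library combines the two signals.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(tfidf_pair, embedding_pair, tfidf_weight=0.5, embedding_weight=0.5):
    """Assumed linear blend of the TF-IDF similarity and the embedding similarity."""
    return tfidf_weight * cosine(*tfidf_pair) + embedding_weight * cosine(*embedding_pair)


# Hypothetical vectors for the text to the left and right of two candidate breakpoints.
candidates = {
    "breakpoint_a": {"tfidf": ([1.0, 0.0], [1.0, 0.0]), "embedding": ([0.6, 0.8], [0.6, 0.8])},
    "breakpoint_b": {"tfidf": ([1.0, 0.0], [0.0, 1.0]), "embedding": ([1.0, 0.0], [0.6, 0.8])},
}

scores = {
    name: combined_similarity(vectors["tfidf"], vectors["embedding"])
    for name, vectors in candidates.items()
}

# A low similarity across a breakpoint suggests a topic change, so it is the better split.
best_breakpoint = min(scores, key=scores.get)
print(scores, best_breakpoint)
```

Because get_chunks accepts the weights as arguments, different blends can be tried after a single precompute_vectors call without recomputing any vectors.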

precompute_vectors(documents, **kwargs)[source]¶

Performs an initial split using sentence_split_regex to identify potential breakpoints, then computes TF-IDF and embedding vector representations of the text segments between breakpoints for semantic similarity analysis. This method is useful for experimenting efficiently with different chunking parameters, because the vectors for the same input documents do not have to be recomputed.

Parameters:: documents (Sequence[Document]) – sequence of documents to perform chunking on

split_documents(documents, **kwargs)[source]¶

Executes the full chunking process, including vector computation, similarity analysis, and selection of optimal breakpoints.

Parameters:

documents (Sequence[Document]) – sequence of documents to perform chunking on
kwargs (Any) – chunking parameters

Returns:

list of chunks

Return type:

list[Document]

to_dict()[source]¶

Return dictionary that can be used to recreate an instance of the HybridSemanticChunker.

Returns:: dictionary which can be used to recreate an instance of the HybridSemanticChunker.
Return type:: dict

BaseChunker¶

class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.base_chunker.BaseChunker[source]¶

Bases: ABC, Generic[ChunkType]

Responsible for handling splitting document operations in the RAG application.

abstractmethod classmethod from_dict(d)[source]¶: Create an instance from the dictionary.

abstractmethod split_documents(documents)[source]¶

Split series of documents into smaller parts based on the provided chunker settings.

Parameters:: documents (Sequence[ChunkType]) – sequence of elements that contain context in a text format
Returns:: list of documents split into smaller ones, having less content
Return type:: list[ChunkType]

abstractmethod to_dict()[source]¶: Return dictionary that can be used to recreate an instance of the BaseChunker.
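
The three abstract methods above form a small contract: split documents, serialize the settings, and rebuild an instance from them. The following standalone sketch mirrors that contract with a local stand-in base class so that it runs without the library; SketchBaseChunker and ParagraphChunker are hypothetical names, and a real implementation would subclass BaseChunker instead.

```python
from abc import ABC, abstractmethod


class SketchBaseChunker(ABC):
    """Local stand-in that mirrors the three abstract methods listed above."""

    @classmethod
    @abstractmethod
    def from_dict(cls, d):
        """Create an instance from a dictionary."""

    @abstractmethod
    def split_documents(self, documents):
        """Split documents into smaller parts."""

    @abstractmethod
    def to_dict(self):
        """Return a dictionary that can recreate the instance."""


class ParagraphChunker(SketchBaseChunker):
    """Toy chunker that splits plain strings on a separator."""

    def __init__(self, separator="\n\n"):
        self.separator = separator

    def split_documents(self, documents):
        return [
            part
            for document in documents
            for part in document.split(self.separator)
            if part.strip()
        ]

    def to_dict(self):
        return {"separator": self.separator}

    @classmethod
    def from_dict(cls, d):
        return cls(**d)


chunker = ParagraphChunker()
restored = ParagraphChunker.from_dict(chunker.to_dict())  # serialization round trip
parts = restored.split_documents(["first paragraph\n\nsecond paragraph"])
print(parts)
```

Keeping to_dict limited to constructor arguments makes from_dict a one-line call, which is one simple way to keep the two methods in sync.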

Retrievers¶

class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.Retriever(vector_store, method=RetrievalMethod.SIMPLE, window_size=2, number_of_chunks=5)[source]¶

Bases: BaseRetriever

Retriever class that handles the retrieval operation for a RAG implementation. Its retrieve method uses the configured retrieval method to return number_of_chunks document segments that are relevant to the query.

Parameters:

vector_store (BaseVectorStore) – VectorStore to use for the retrieval
method (RetrievalMethod, optional) – default retrieval method to use when calling retrieve, defaults to RetrievalMethod.SIMPLE
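window_size (int, optional) – number of adjacent chunks to retrieve before and after each matched chunk when the window retrieval method is used, defaults to 2
number_of_chunks (int, optional) – number of expected document chunks to be returned, defaults to 5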

To run a repeatable retrieval that returns the three nearest documents by using a simple proximity search, create a VectorStore and then define a Retriever.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.extensions.rag import Retriever, RetrievalMethod
from ibm_watsonx_ai.foundation_models.embeddings import SentenceTransformerEmbeddings

api_client = APIClient(credentials)

vector_store = VectorStore(
    api_client,
    connection_id='***',
    index_name='my_test_index',
    embeddings=SentenceTransformerEmbeddings('sentence-transformers/all-MiniLM-L6-v2')
)

retriever = Retriever(vector_store=vector_store, method=RetrievalMethod.SIMPLE, number_of_chunks=3)

retriever.retrieve("What is IBM known for?")
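
As a rough picture of what a simple proximity search does, the sketch below ranks a handful of in-memory chunks by cosine similarity to a query vector and keeps the top number_of_chunks. It does not use Retriever or any vector database; the helper names and the toy two-dimensional vectors are assumptions, and a real vector store may use a different distance metric.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_simple(query_vector, indexed_chunks, number_of_chunks=5):
    """Return the texts of the chunks whose vectors are closest to the query vector."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:number_of_chunks]]


# Hypothetical (text, vector) pairs standing in for an indexed collection.
indexed_chunks = [
    ("chunk about mainframes", [0.9, 0.1]),
    ("chunk about bananas", [0.0, 1.0]),
    ("chunk about quantum computing", [0.8, 0.3]),
    ("chunk about the weather", [0.1, 0.9]),
]
query_vector = [1.0, 0.0]

top = retrieve_simple(query_vector, indexed_chunks, number_of_chunks=3)
print(top)
```

The ranking depends only on the stored vectors and the query, so repeating the call with the same inputs returns the same chunks in the same order.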

classmethod from_vector_store(vector_store, init_parameters=None)[source]¶

Creates a concrete retriever from the given vector store and the serialized init_parameters.

Parameters:

vector_store (BaseVectorStore) – vector store used to create the retriever
init_parameters (dict[str, Any]) – parameters to initialize the retriever with

Returns:

concrete Retriever or None if data is incorrect

Return type:

BaseRetriever | None

retrieve(query, **kwargs)[source]¶

Retrieve elements from the VectorStore by using the provided query.

Parameters:: query (str) – text query to be used for searching
Returns:: list of retrieved LangChain documents
Return type:: list[langchain_core.documents.Document]

to_dict()[source]¶

Serializes the retriever’s init_parameters so that the retriever can be reconstructed by the from_vector_store class method.

Returns:: serialized init_parameters
Return type:: dict

to_langchain_tool(*, name='retriever', description='Retriever tool', document_prompt='{document}', document_separator='\n\n', **retriever_kwargs)[source]¶

Create a LangChain tool to do retrieval of documents.

Parameters:

name (str, optional) – name of the tool that is passed to the language model; should be unique and reasonably descriptive, defaults to “retriever”
description (str, optional) – description of the tool that is passed to the language model, defaults to “Retriever tool”
document_prompt (str, optional) – prompt to use for each document, defaults to “{document}”
document_separator (str, optional) – separator to use between documents, defaults to “\n\n”
retriever_kwargs (Any, optional) – keyword arguments that are passed to the Retriever.retrieve method, defaults to {}

Returns:

instance of the LangChain Tool class to pass to an agent

Return type:

Tool
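
The document_prompt and document_separator parameters suggest that each retrieved document is rendered through a template and the results are joined into one string for the language model. The sketch below shows that idea with plain strings; format_retrieved is a hypothetical helper, and what exactly the library substitutes for {document} is not stated in the source.

```python
def format_retrieved(documents, document_prompt="{document}", document_separator="\n\n"):
    """Render each document through the prompt template, then join the results."""
    return document_separator.join(
        document_prompt.format(document=text) for text in documents
    )


retrieved = ["first chunk", "second chunk"]

default_output = format_retrieved(retrieved)
custom_output = format_retrieved(
    retrieved,
    document_prompt="[doc] {document}",
    document_separator="\n---\n",
)
print(custom_output)
```

A visible separator and a short per-document prefix can make it easier for a model to tell where one retrieved passage ends and the next begins.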

class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.RetrievalMethod(value)[source]¶

Bases: str, Enum

SIMPLE = 'simple'¶

WINDOW = 'window'¶
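
Because RetrievalMethod derives from both str and Enum, its members can be compared with plain strings and rebuilt from their values, which is convenient when a method name is stored in a serialized configuration. The snippet below re-declares an equivalent enum locally so that it runs without the library; in real code, import RetrievalMethod instead.

```python
from enum import Enum


class RetrievalMethod(str, Enum):
    """Local re-declaration for illustration, matching the members listed above."""
    SIMPLE = "simple"
    WINDOW = "window"


# Look a member up by its value, for example when reading it from a config dict.
method = RetrievalMethod("window")

print(method is RetrievalMethod.WINDOW)
print(RetrievalMethod.SIMPLE == "simple")
print([member.value for member in RetrievalMethod])
```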

class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.BaseRetriever(vector_store)[source]¶

Bases: ABC

Abstract class for all retriever handlers for the chosen vector store. Returns some document chunks in a RAG pipeline using a concrete retrieve implementation.

Parameters:: vector_store (BaseVectorStore) – vector store used in document retrieval

abstractmethod classmethod from_vector_store(vector_store, init_parameters=None)[source]¶

Creates a concrete retriever from the given vector store and the serialized init_parameters.

Parameters:

vector_store (BaseVectorStore) – vector store used to create the retriever
init_parameters (dict[str, Any]) – parameters to initialize the retriever with

Returns:

concrete Retriever or None if data is incorrect

Return type:

BaseRetriever | None

abstractmethod retrieve(query, **kwargs)[source]¶

Retrieve elements from the vector store using the provided query.

Parameters:: query (str) – text query to be used for searching
Returns:: list of retrieved LangChain documents
Return type:: list[langchain_core.documents.Document]

to_dict()[source]¶

Serializes the retriever’s init_parameters so that the retriever can be reconstructed by the from_vector_store class method.

Returns:: serialized init_parameters
Return type:: dict

Vector Stores¶

VectorStore¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store.VectorStore(api_client=None, *, connection_id=None, embeddings=None, index_name=None, datasource_type=None, distance_metric=None, langchain_vector_store=None, **kwargs)[source]¶

Bases: BaseVectorStore

Universal vector store client for a RAG pattern.

Instantiates the vector store connection in the Watson Machine Learning environment and handles the necessary operations. The parameters given by the keyword arguments are used to instantiate the vector store client in their particular constructor. Those parameters might be parsed differently.

For details, refer to the VectorStoreConnector get_... methods.

You can use a custom embedding function. It can be provided in the constructor or set later with the set_embeddings method. For available embeddings, refer to the ibm_watsonx_ai.foundation_models.embeddings module.

Parameters:

api_client (APIClient, optional) – api client is required if connecting by connection_id, defaults to None
connection_id (str, optional) – connection asset ID, defaults to None
embeddings (BaseEmbeddings, optional) – default embeddings to be used, defaults to None
index_name (str, optional) – name of the vector database index, defaults to None
datasource_type (VectorStoreDataSourceType, str, optional) – data source type to use when connection_id is not provided, keyword arguments will be used to establish connection, defaults to None
distance_metric (Literal["euclidean", "cosine"], optional) – metric used for determining vector distance, defaults to None
langchain_vector_store (VectorStore, optional) – use LangChain vector store, defaults to None

Example:

To connect, provide the connection asset ID. You can use custom embeddings to add and search documents.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

api_client = APIClient(credentials)

embedding = Embeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
    api_client=api_client
)

vector_store = VectorStore(
        api_client,
        connection_id='***',
        index_name='my_test_index',
        embeddings=embedding
    )

vector_store.add_documents([
    {'content': 'document one content', 'metadata':{'url':'ibm.com'}},
    {'content': 'document two content', 'metadata':{'url':'ibm.com'}}
])

vector_store.search('one', k=1)

Note

Optionally, like in LangChain, it is possible to use direct credentials to connect to Elastic Cloud. The keyword arguments can be used as direct params to LangChain’s ElasticsearchStore constructor.

from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore

api_client = APIClient(credentials)

vector_store = VectorStore(
        api_client,
        index_name='my_test_index',
        model_id=".elser_model_2_linux-x86_64",
        cloud_id='***',
        api_key=IAM_API_KEY
    )

vector_store.add_documents([
    {'content': 'document one content', 'metadata':{'url':'ibm.com'}},
    {'content': 'document two content', 'metadata':{'url':'ibm.com'}}
])

vector_store.search('one', k=1)

add_documents(content, **kwargs)[source]¶

Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are determined by the text content of the document (hash). Duplicates will not be added.

The list must contain strings, dictionaries with a required content field of a string type, or a LangChain Document.

Parameters:: content (list[str] | list[dict] | list) – unstructured list of data to be added
Returns:: list of IDs
Return type:: list[str]
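
The upsert behaviour described above, where the ID is derived from the text content so that duplicates are not added, can be pictured with a dictionary keyed by a content hash. This is only a sketch: SHA-256, the content_id and upsert helpers, and the in-memory dict are assumptions, and the source does not say which hash function the library uses.

```python
import hashlib


def content_id(text):
    """Derive a deterministic ID from the text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert(store, contents):
    """Insert texts keyed by content hash; identical text maps to the same key."""
    ids = []
    for text in contents:
        doc_id = content_id(text)
        store[doc_id] = text  # writing the same key again does not create a duplicate
        ids.append(doc_id)
    return ids


store = {}
first_ids = upsert(store, ["document one content", "document two content"])
second_ids = upsert(store, ["document one content"])  # same text as before

print(len(store), second_ids[0] == first_ids[0])
```

One consequence of content-derived IDs is that re-running an ingestion job over unchanged documents leaves the collection size unchanged.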

async add_documents_async(content, **kwargs)[source]¶

Adds documents to the RAG’s vector store asynchronously. The list must contain strings, dictionaries with a required content field of a string type, or a LangChain Document.

Parameters:: content (list[str] | list[dict] | list) – unstructured list of data to be added
Returns:: list of IDs
Return type:: list[str]
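
An asynchronous add method is typically awaited from a coroutine, optionally for several batches at once. The sketch below shows that calling pattern against a hypothetical in-memory stand-in; FakeAsyncStore, its sequential string IDs, and the ingest helper are assumptions, and whether a real vector store handles concurrent writes well depends on the backend.

```python
import asyncio


class FakeAsyncStore:
    """Hypothetical stand-in exposing an awaitable add_documents_async method."""

    def __init__(self):
        self.documents = []

    async def add_documents_async(self, content, **kwargs):
        await asyncio.sleep(0)  # yield control, as real network I/O would
        start = len(self.documents)
        self.documents.extend(content)
        return [str(index) for index in range(start, start + len(content))]


async def ingest(store, batches):
    """Submit all batches concurrently and flatten the returned ID lists."""
    results = await asyncio.gather(*(store.add_documents_async(batch) for batch in batches))
    return [doc_id for batch_ids in results for doc_id in batch_ids]


store = FakeAsyncStore()
ids = asyncio.run(ingest(store, [["a", "b"], ["c"]]))
print(ids)
```

Awaiting the batches one after another instead of using asyncio.gather is the more conservative option when the ordering of writes matters.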

as_langchain_retriever(**kwargs)[source]¶

Creates a LangChain retriever from this vector store.

Returns:: LangChain retriever that can be used in LangChain pipelines
Return type:: langchain_core.vectorstores.VectorStoreRetriever

clear()[source]¶: Clears the current collection that is being used by the vector store. Removes all documents with all their metadata and embeddings.

count()[source]¶

Returns the number of documents in the current collection.

Returns:: number of documents in the collection
Return type:: int

delete(ids, **kwargs)[source]¶

Delete documents with provided IDs.

Parameters:: ids (list[str]) – IDs of documents to be deleted

classmethod from_dict(api_client=None, data=None, **kwargs)[source]¶

Creates VectorStore using only a primitive data type dict.

Parameters:

api_client (APIClient, optional) – initialised APIClient used in vector store constructor, defaults to None
data (dict) – dict in schema like the to_dict() method

Returns:

reconstructed VectorStore

Return type:

VectorStore

get_client()[source]¶

Returns an underlying native vector store client.

Returns:: wrapped vector store client
Return type:: Any

search(query, k, include_scores=False, verbose=False, **kwargs)[source]¶

Searches for documents most similar to the query.

The method is designed as a wrapper for respective LangChain VectorStores’ similarity search methods. Therefore, additional search parameters passed in kwargs should be consistent with those methods, and can be found in the LangChain documentation as they may differ depending on the connection type: Milvus, Chroma, Elasticsearch, etc.

Parameters:

query (str) – text query
k (int) – number of documents to retrieve
include_scores (bool) – whether similarity scores of found documents should be returned, defaults to False
verbose (bool) – whether to display a table with the found documents, defaults to False

Returns:

list of found documents

Return type:

list

set_embeddings(embedding_fn)[source]¶

to_dict()[source]¶

Serialize VectorStore into a dict that allows reconstruction using the from_dict class method.

Returns:: dict for the from_dict initialization
Return type:: dict
Raises:: VectorStoreSerializationError – when instance is not serializable

window_search(query, k, include_scores=False, verbose=False, window_size=2, **kwargs)[source]¶

Similarly to the search method, gets documents (chunks) that would fit the query. Each chunk is extended to its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged (if it exists). This requires chunks to have “document_id” and “sequence_number” in their metadata.

Parameters:

query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.

Returns:

list of found documents (extended into windows).

Return type:

list
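
The window expansion described above can be sketched in plain Python: collect the neighbours of a matched chunk from the same document, order them by sequence_number, and merge them while collapsing any overlapping text. The dict-based chunks and both helper functions are assumptions for illustration and do not reproduce the library's implementation.

```python
def merge_overlap(left, right):
    """Concatenate two strings, collapsing the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return left + right


def expand_to_window(hit, all_chunks, window_size=2):
    """Merge the hit with up to `window_size` neighbours on each side from the same document."""
    document_id = hit["metadata"]["document_id"]
    center = hit["metadata"]["sequence_number"]
    neighbours = sorted(
        (
            chunk
            for chunk in all_chunks
            if chunk["metadata"]["document_id"] == document_id
            and abs(chunk["metadata"]["sequence_number"] - center) <= window_size
        ),
        key=lambda chunk: chunk["metadata"]["sequence_number"],
    )
    merged = neighbours[0]["content"]
    for chunk in neighbours[1:]:
        merged = merge_overlap(merged, chunk["content"])
    return merged


# Hypothetical chunks; overlapping text mimics a splitter with a non-zero chunk_overlap.
all_chunks = [
    {"content": "Alpha beta ", "metadata": {"document_id": "d1", "sequence_number": 0}},
    {"content": "beta gamma ", "metadata": {"document_id": "d1", "sequence_number": 1}},
    {"content": "gamma delta", "metadata": {"document_id": "d1", "sequence_number": 2}},
    {"content": "unrelated", "metadata": {"document_id": "d2", "sequence_number": 1}},
]

window_text = expand_to_window(all_chunks[1], all_chunks, window_size=1)
print(window_text)
```

Chunks from other documents are ignored even when their sequence_number is close, which is why both metadata fields are required.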

MilvusVectorStore¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.adapters.milvus_adapter.MilvusVectorStore(api_client=None, *, connection_id=None, vector_store=None, embedding_function=None, collection_name=None, **kwargs)[source]¶

Bases: LangChainVectorStoreAdapter[Milvus]

MilvusVectorStore vector store client for a RAG pattern.

Instantiates the vector store connection in the watsonx.ai environment and handles the necessary operations. The parameters given by the keyword arguments are used to instantiate the vector store client in their particular constructor. Those parameters might be parsed differently.

Parameters:

api_client (APIClient, optional) – api client is required if connecting by connection_id, defaults to None
connection_id (str, optional) – connection asset ID, defaults to None
vector_store (langchain_milvus.Milvus, optional) – initialized langchain_milvus vector store, defaults to None
embedding_function (BaseEmbeddings | LCEmbeddings | LCMilvusBaseSparseEmbedding | list[BaseEmbeddings | LCEmbeddings | LCMilvusBaseSparseEmbedding], optional) – a dense or sparse embedding function, or a list of such functions, defaults to None
collection_name (str, optional) – name of the Milvus vector database collection, defaults to None
kwargs (Any, optional) – keyword arguments that will be directly passed to langchain_milvus.Milvus constructor

Note

For hybrid search (multi-vector search), if no ranker_type is specified, a weighted reranker with default weights equal to 1 is used. For more details, see the langchain_milvus documentation https://python.langchain.com/docs/integrations/vectorstores/milvus/#hybrid-search.
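
To illustrate what a weighted reranker with equal weights does conceptually, the sketch below sums per-retriever scores after multiplying each by its weight. The weighted_fusion helper and the toy scores are assumptions; the actual Milvus ranker may normalize scores before combining them, which this sketch does not attempt.

```python
def weighted_fusion(score_lists, weights=None):
    """Rank document IDs by the weighted sum of their scores across retrievers."""
    if weights is None:
        weights = [1.0] * len(score_lists)  # mirrors the "all weights equal to 1" default
    fused = {}
    for scores, weight in zip(score_lists, weights):
        for doc_id, score in scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical similarity scores from a dense and a sparse retriever.
dense_scores = {"doc_a": 0.9, "doc_b": 0.2}
sparse_scores = {"doc_a": 0.1, "doc_b": 0.6}

default_ranking = weighted_fusion([dense_scores, sparse_scores])
sparse_only_ranking = weighted_fusion([dense_scores, sparse_scores], weights=[0.0, 1.0])
print(default_ranking, sparse_only_ranking)
```

Setting one weight to 0.0 effectively disables that retriever, which is a quick way to compare the contribution of each embedding function.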

Example:

To connect, provide the connection asset ID. You can use custom embeddings to add and search documents.

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import MilvusVectorStore
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

credentials = Credentials(
        api_key = IAM_API_KEY,
        url = "https://us-south.ml.cloud.ibm.com"
        )

api_client = APIClient(credentials)

embedding = Embeddings(
         model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
         api_client=api_client
         )

vector_store = MilvusVectorStore(
        api_client,
        connection_id='***',
        collection_name='my_test_collection',
        embedding_function=embedding
    )

vector_store.add_documents([
    {'content': 'document one content', 'metadata':{'url':'ibm.com'}},
    {'content': 'document two content', 'metadata':{'url':'ibm.com'}}
])

vector_store.search('one', k=1)

Note

To use hybrid search, pass multiple embedding functions.

Example with weighted ranker.

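from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    MilvusVectorStore,
    MilvusSpladeEmbeddingFunction
)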

credentials = Credentials(api_key=IAM_API_KEY, url="https://us-south.ml.cloud.ibm.com")

api_client = APIClient(credentials)

dense_embedding = Embeddings(
         model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
         api_client=api_client
         )

splade_func = MilvusSpladeEmbeddingFunction(model_name="naver/splade-cocondenser-selfdistil", device="cpu")

vector_store = MilvusVectorStore(
    api_client,
    connection_id=es_connection_id,
    collection_name="my_test_collection",
    embedding_function=[dense_embedding, splade_func]
)

vector_store.add_documents(
    [
        {"content": "document one content", "metadata": {"url": "ibm.com"}},
        {"content": "document two content", "metadata": {"url": "ibm.com"}},
    ]
)

# `weighted` ranker
vector_store.search("one", k=1, ranker_type="weighted", ranker_params={"weights": [0.0, 1.0]})

# `rrf` ranker
vector_store.search("one", k=1, ranker_type="rrf", ranker_params={"k": 50})

Note

Since Milvus v2.5, full-text search is also available. For details, see https://milvus.io/blog/introduce-milvus-2-5-full-text-search-powerful-metadata-filtering-and-more.md

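from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    MilvusVectorStore,
    MilvusBM25BuiltinFunction,
)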

credentials = Credentials(api_key=IAM_API_KEY, url="https://us-south.ml.cloud.ibm.com")

api_client = APIClient(credentials)

dense_embedding = Embeddings(
         model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
         api_client=api_client
         )
bm25_builtin_func = MilvusBM25BuiltinFunction()

vector_store = MilvusVectorStore(
    api_client,
    connection_id=es_connection_id,
    collection_name="my_test_collection",
    embedding_function=dense_embedding,
    builtin_function=bm25_builtin_func,
)

vector_store.add_documents(
    [
        {"content": "document one content", "metadata": {"url": "ibm.com"}},
        {"content": "document two content", "metadata": {"url": "ibm.com"}},
    ]
)

# `weighted` ranker
vector_store.search("one", k=1, ranker_type="weighted", ranker_params={"weights": [0.0, 1.0]})

# `rrf` ranker
vector_store.search("one", k=1, ranker_type="rrf", ranker_params={"k": 50})

add_documents(content, **kwargs)[source]¶

Embed documents and add to the vectorstore.

Parameters:: content (list[str] | list[dict] | list[langchain_core.documents.Document]) – Documents to add to the vectorstore.
Returns:: List of IDs of the added texts.
Return type:: list[str]

async add_documents_async(content, **kwargs)[source]¶

Embed documents and add to the vectorstore in asynchronous manner.

Parameters:: content (list[str] | list[dict] | list[langchain_core.documents.Document]) – Documents to add to the vectorstore.
Returns:: List of IDs of the added texts.
Return type:: list[str]

clear()[source]¶: Clear collection by removing all records.

count()[source]¶: Count number of records in collection.

classmethod from_dict(api_client=None, data=None)[source]¶

Creates MilvusVectorStore using only a primitive data type dict.

Parameters:

api_client (APIClient, optional) – initialised APIClient used in vector store constructor, defaults to None
data (dict) – dict in schema like the to_dict() method

Returns:

reconstructed MilvusVectorStore

Return type:

MilvusVectorStore

get_client()[source]¶: Get langchain_milvus.Milvus instance.

to_dict()[source]¶

Serialize MilvusVectorStore into a dict that allows reconstruction using the from_dict class method.

Returns:: dict for the from_dict initialization
Return type:: dict
Raises:: VectorStoreSerializationError – when instance is not serializable

ElasticsearchVectorStore¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.adapters.es_adapter.ElasticsearchVectorStore(api_client=None, *, connection_id=None, vector_store=None, index_name=None, embedding=None, **kwargs)[source]¶

Bases: LangChainVectorStoreAdapter[ElasticsearchStore]

Elasticsearch vector store client for a RAG pattern.

Parameters:

api_client (APIClient, optional) – api client is required if connecting by connection_id, defaults to None
connection_id (str, optional) – connection asset ID, defaults to None
vector_store (langchain_elasticsearch.ElasticsearchStore, optional) – initialized langchain_elasticsearch vector store, defaults to None
embedding (BaseEmbeddings, optional) – default dense embeddings to be used, defaults to None
index_name (str, optional) – name of the vector database index, defaults to None
kwargs (Any, optional) – keyword arguments that will be directly passed to langchain_elasticsearch.ElasticsearchStore constructor

Note

For hybrid search (multi-vector search), if no ranker type is specified in strategy, a weighted reranker with default weights equal to 1 is used. For more details, see the langchain-elasticsearch documentation and Elasticsearch documentation.

Warning

The default retrieval strategy is the same as in langchain_elasticsearch.ElasticsearchStore, i.e. when no strategy is specified the elasticsearch.helpers.vectorstore.DenseVectorStrategy will be used (see langchain-elasticsearch documentation).

Note that this strategy differs from the default one in ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.VectorStore, where elasticsearch.helpers.vectorstore.DenseVectorScriptScoreStrategy is used. To keep the same behavior when migrating from VectorStore to ElasticsearchVectorStore, you may want to pass DenseVectorScriptScoreStrategy(distance=distance_metric) explicitly to the ElasticsearchVectorStore constructor.

Example:

To connect, provide the connection asset ID. You can use custom embeddings to add and search documents.

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import ElasticsearchVectorStore
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

credentials = Credentials(
        api_key = IAM_API_KEY,
        url = "https://us-south.ml.cloud.ibm.com"
        )

api_client = APIClient(credentials)

embedding = Embeddings(
         model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
         api_client=api_client
         )

vector_store = ElasticsearchVectorStore(
        api_client,
        connection_id='***',
        index_name='my_test_index',
        embedding=embedding
    )

vector_store.add_documents([
    {'content': 'document one content', 'metadata':{'url':'ibm.com'}},
    {'content': 'document two content', 'metadata':{'url':'ibm.com'}}
])

vector_store.search('one', k=1)

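Note

As with LangChain, direct credentials can be used instead of a connection asset ID. The keyword arguments are passed directly to the langchain_elasticsearch.ElasticsearchStore constructor.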

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    ElasticsearchVectorStore,
    HybridStrategyElasticsearch,
    RetrievalOptions,
)

credentials = Credentials(api_key=IAM_API_KEY, url="https://us-south.ml.cloud.ibm.com")

api_client = APIClient(credentials)

vector_store = ElasticsearchVectorStore(
    api_client,
    index_name="my_test_index",
    strategy=HybridStrategyElasticsearch(
        retrieval_strategies={RetrievalOptions.SPARSE: {"model_id": ".elser"}}
    ),
    cloud_id="***",
    api_key=IAM_API_KEY,
)

vector_store.add_documents(
    [
        {"content": "document one content", "metadata": {"url": "ibm.com"}},
        {"content": "document two content", "metadata": {"url": "ibm.com"}},
    ]
)

vector_store.search("one", k=1)

Note

To use hybrid search, specify multiple retrieval strategies in HybridStrategyElasticsearch.

Example with weighted ranker.

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    ElasticsearchVectorStore,
    HybridStrategyElasticsearch,
    RetrievalOptions,
)

credentials = Credentials(api_key=IAM_API_KEY, url="https://us-south.ml.cloud.ibm.com")

api_client = APIClient(credentials)

vector_store = ElasticsearchVectorStore(
    api_client,
    connection_id=es_connection_id,
    index_name="my_test_index",
    strategy=HybridStrategyElasticsearch(
        retrieval_strategies={
            RetrievalOptions.SPARSE: {"model_id": ".elser", "boost": 0.5},
            RetrievalOptions.BM25: {"boost": 1},
        }
    ),
)

vector_store.add_documents(
    [
        {"content": "document one content", "metadata": {"url": "ibm.com"}},
        {"content": "document two content", "metadata": {"url": "ibm.com"}},
    ]
)

vector_store.search("one", k=1)

Example with rrf ranker:

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    ElasticsearchVectorStore,
    HybridStrategyElasticsearch,
    RetrievalOptions,
)

credentials = Credentials(api_key=IAM_API_KEY, url="https://us-south.ml.cloud.ibm.com")

api_client = APIClient(credentials)

vector_store = ElasticsearchVectorStore(
    api_client,
    connection_id=es_connection_id,
    index_name="my_test_index",
    strategy=HybridStrategyElasticsearch(
        retrieval_strategies={
            RetrievalOptions.SPARSE: {"model_id": ".elser"},
            RetrievalOptions.BM25: {},
        },
        use_rrf=True,
        rrf_params={"k": 50}
    ),
)

vector_store.add_documents(
    [
        {"content": "document one content", "metadata": {"url": "ibm.com"}},
        {"content": "document two content", "metadata": {"url": "ibm.com"}},
    ]
)

vector_store.search("one", k=1)

add_documents(content, **kwargs)[source]¶

Embed documents and add to the vectorstore.

Parameters:: content (list[str] | list[dict] | list[langchain_core.documents.Document]) – Documents to add to the vectorstore.
Returns:: List of IDs of the added texts.
Return type:: list[str]

clear()[source]¶: Clear index by removing all records.

count()[source]¶: Count number of records in index.

classmethod from_dict(api_client=None, data=None)[source]¶

Creates ElasticsearchVectorStore using only a primitive data type dict.

Parameters:

api_client (APIClient, optional) – initialised APIClient used in vector store constructor, defaults to None
data (dict) – dict in schema like the to_dict() method

Returns:

reconstructed ElasticsearchVectorStore

Return type:

ElasticsearchVectorStore

get_client()[source]¶: Get langchain_elasticsearch.ElasticsearchStore instance.

to_dict()[source]¶

Serialize ElasticsearchVectorStore into a dict that allows reconstruction using the from_dict class method.

Returns:: dict for the from_dict initialization
Return type:: dict
Raises:: VectorStoreSerializationError – when instance is not serializable

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.adapters.es_utils.HybridStrategyElasticsearch(retrieval_strategies, use_rrf=False, rrf_params=None, text_field='text_field')[source]¶

Bases: RetrievalStrategy

Hybrid strategy to be used in ElasticsearchVectorStore to take advantage of hybrid search.

Parameters:

retrieval_strategies (dict[str, dict[str, Any]]) – mapping containing retrieval type and its properties
use_rrf (bool, optional) – whether to use the Reciprocal Rank Fusion (RRF) ranker when combining the results of multiple searches in the hybrid approach, defaults to False. For more details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html
rrf_params (dict, optional) – parameters of the RRF method, defaults to None
text_field (str, optional) – name of the text field, defaults to “text_field”

Example:

When no ranker method is explicitly specified, the weighted ranker is used with all weights equal to 1. To change the weight of a particular strategy, add a boost field to the settings of that retrieval type.

from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    HybridStrategyElasticsearch,
    RetrievalOptions,
)


strategy=HybridStrategyElasticsearch(
    retrieval_strategies={
        RetrievalOptions.SPARSE: {"model_id": ".elser", "boost": 0.5},
        RetrievalOptions.BM25: {"boost": 1},
    }
)

Example with rrf ranker:

from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import (
    HybridStrategyElasticsearch,
    RetrievalOptions,
)



strategy=HybridStrategyElasticsearch(
    retrieval_strategies={
        RetrievalOptions.SPARSE: {"model_id": ".elser"},
        RetrievalOptions.BM25: {},
    },
    use_rrf=True,
    rrf_params={"k": 50},
)
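
As a rough illustration of what an RRF ranker does, the sketch below fuses two ranked result lists in plain Python. The function name `reciprocal_rank_fusion`, its `rank_constant` argument, and the sample document IDs are assumptions made for this example. Elasticsearch performs the fusion on the server side, and this is not its implementation; the constant 50 only mirrors the value used in the example above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse ranked lists of IDs: score(d) = sum over lists of 1 / (rank_constant + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical hits returned by two retrieval strategies for the same query.
sparse_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([sparse_hits, bm25_hits], rank_constant=50)
print(fused)
```

Documents that appear near the top of several lists accumulate the largest scores, so no score normalization between the strategies is needed.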

before_index_creation(*, client, text_field, vector_field)[source]¶

Executes before the index is created. Used for setting up any required Elasticsearch resources like a pipeline. Defaults to a no-op.

Parameters:

client – The Elasticsearch client.
text_field – The field containing the text data in the index.
vector_field – The field containing the vector representations in the index.

es_mappings_settings(*, text_field, vector_field, num_dimensions)[source]¶

Create the required index and do necessary preliminary work, like creating inference pipelines or checking if a required model was deployed.

Parameters:

client – Elasticsearch client connection.
text_field – The field containing the text data in the index.
vector_field – The field containing the vector representations in the index.
num_dimensions – If vectors are indexed, how many dimensions do they have.

Returns:

Dictionary with field and field type pairs that describe the schema.

es_query(*, query, query_vector, text_field, vector_field, k, num_candidates, filter=[])[source]¶

Returns the Elasticsearch query body for the given parameters. The store will execute the query.

Parameters:

query – The text query. Can be None if query_vector is given.
k – The total number of results to retrieve.
num_candidates – The number of results to fetch initially in knn search.
filter – List of filter clauses to apply to the query.
query_vector – The query vector. Can be None if a query string is given.

Returns:

The Elasticsearch query body.
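
As a rough sketch of what a query body can look like, the function below builds a BM25-only `match` query with optional filter clauses in the standard Elasticsearch query DSL. The function name `bm25_query_body` and the sample field and filter values are hypothetical. The body that HybridStrategyElasticsearch actually produces combines several retrieval types and will differ.

```python
from typing import Any, Dict, List, Optional


def bm25_query_body(
    query: str,
    text_field: str,
    k: int,
    filter: Optional[List[Dict[str, Any]]] = None,  # name mirrors the documented parameter
) -> Dict[str, Any]:
    """Build a BM25-only search body (illustrative, not the library's output)."""
    return {
        "query": {
            "bool": {
                # The match clause is scored with BM25 by default.
                "must": [{"match": {text_field: {"query": query}}}],
                # Filter clauses restrict the hits without affecting the score.
                "filter": list(filter or []),
            }
        },
        "size": k,
    }


body = bm25_query_body(
    "one",
    text_field="text_field",
    k=1,
    filter=[{"term": {"metadata.url": "ibm.com"}}],  # hypothetical filter clause
)
print(body)
```

In this sketch the filter defaults to `None` and is copied into a new list, so separate calls never share one list object.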

classmethod from_dict(data)[source]¶

Creates HybridStrategyElasticsearch using only a primitive data type dict.

Parameters:: data (dict) – dict in schema like the to_dict() method
Returns:: reconstructed HybridStrategyElasticsearch
Return type:: HybridStrategyElasticsearch

needs_inference()[source]¶: Some retrieval strategies index embedding vectors and allow search by embedding vector, for example the DenseVectorStrategy strategy. Mapping a user input query string to an embedding vector is called inference. Inference can be applied in Elasticsearch (using a model_id) or outside of Elasticsearch (using an EmbeddingService defined on the VectorStore). In the latter case, this method has to return True.

to_dict()[source]¶

Serialize HybridStrategyElasticsearch into a dict that allows reconstruction using the from_dict class method.

Returns:: dict for the from_dict initialization
Return type:: dict

DB2VectorStore¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.adapters.db2_adapter.DB2VectorStore(api_client=None, *, connection_id=None, vector_store=None, embedding_function=None, table_name=None, **kwargs)[source]¶

Bases: LangChainVectorStoreAdapter

DB2VectorStore vector store client for a RAG pattern.

Parameters:

api_client (APIClient, optional) – api client is required if connecting by connection_id, defaults to None
connection_id (str, optional) – connection asset ID, defaults to None
vector_store (langchain_db2.DB2VS, optional) – initialized langchain_db2 vector store, defaults to None
embedding_function (BaseEmbeddings | LCEmbeddings, optional) – dense embedding function, defaults to None
table_name (str, optional) – name of the DB2 table, defaults to None
kwargs (Any, optional) – keyword arguments that will be directly passed to langchain_db2.DB2VS constructor

Example:

To connect, provide the connection asset ID. You can use custom embeddings to add and search documents.

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores import DB2VectorStore
from ibm_watsonx_ai.foundation_models.embeddings import Embeddings
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

credentials = Credentials(
    api_key = IAM_API_KEY,
    url = "https://us-south.ml.cloud.ibm.com"
)

api_client = APIClient(credentials, project_id="<PROJECT_ID>")

embedding = Embeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
    api_client=api_client
)

vector_store = DB2VectorStore(
    api_client,
    connection_id='***',
    table_name='my_test_table',
    embedding_function=embedding
)

vector_store.add_documents([
    {'content': 'document one content', 'metadata':{'url':'ibm.com'}},
    {'content': 'document two content', 'metadata':{'url':'ibm.com'}}
])
# ['4CDDAF00329B3DF9', 'B8AE97421A8857E7']

vector_store.search('one', k=1)
# [Document(metadata={'url': 'ibm.com'}, page_content='document one content')]

add_documents(content, **kwargs)[source]¶

Embed documents and add to the vectorstore.

Parameters:: content (list[str] | list[dict] | list[langchain_core.documents.Document]) – Documents to add to the vectorstore.
Returns:: List of IDs of the added texts.
Return type:: list[str]

clear()[source]¶: Clear table by removing all records.

count()[source]¶: Count number of records in table.

classmethod from_dict(api_client=None, data=None)[source]¶

Creates DB2VectorStore using only a primitive data type dict.

Parameters:

api_client (APIClient, optional) – initialized APIClient used in vector store constructor, defaults to None
data (dict) – dict in schema like the to_dict() method

Returns:

reconstructed DB2VectorStore

Return type:

DB2VectorStore

get_client()[source]¶: Get langchain_db2.DB2VS instance.

to_dict()[source]¶

Serialize DB2VectorStore into a dict that allows reconstruction using the from_dict class method.

Returns:: dict for the from_dict initialization
Return type:: dict
Raises:: VectorStoreSerializationError – when instance is not serializable

BaseVectorStore¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.base_vector_store.BaseVectorStore[source]¶

Bases: ABC

Base abstract class for all vector store-like classes. Interface that supports simple database operations.

abstractmethod add_documents(content, **kwargs)[source]¶

Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are determined by the text content of the document (hash). Duplicates will not be added.

The list must contain strings, dictionaries with a required content field of a string type, or a LangChain Document.

Parameters:: content (list[str] | list[dict] | list) – unstructured list of data to be added
Returns:: list of IDs
Return type:: list[str]
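
The upsert behavior described above can be sketched with a small in-memory store. The class name `InMemoryStoreSketch`, the use of SHA-256, and the 16-character ID length are assumptions for illustration. The hash function and ID format used by concrete vector stores may differ.

```python
import hashlib


class InMemoryStoreSketch:
    """Hypothetical store illustrating content-hash IDs; not the library's code."""

    def __init__(self):
        self._records = {}

    def add_documents(self, content):
        ids = []
        for item in content:
            # Accept plain strings or dicts with a required "content" field.
            if isinstance(item, str):
                text, metadata = item, {}
            else:
                text, metadata = item["content"], item.get("metadata", {})
            # Assumption: the ID is a truncated SHA-256 digest of the text.
            doc_id = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
            # Upsert: identical text maps to the same ID, so no second record appears.
            self._records[doc_id] = {"content": text, "metadata": metadata}
            ids.append(doc_id)
        return ids

    def count(self):
        return len(self._records)


store = InMemoryStoreSketch()
first_ids = store.add_documents(["document one content", "document two content"])
second_ids = store.add_documents(
    [{"content": "document one content", "metadata": {"url": "ibm.com"}}]
)
print(store.count())
```

Because the ID depends only on the text, adding the same content again returns the existing ID instead of creating a duplicate.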

abstractmethod async add_documents_async(content, **kwargs)[source]¶

Adds a list of documents to the RAG’s vector store asynchronously. The list must contain strings, dictionaries with a required content field of a string type, or a LangChain Document.

Parameters:: content (list[str] | list[dict] | list) – unstructured list of data to be added
Returns:: list of IDs
Return type:: list[str]
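
Because add_documents_async is a coroutine, it has to be awaited or scheduled on an event loop. The sketch below shows one way to ingest several batches concurrently. `AsyncStoreSketch` and `ingest` are hypothetical names, and the stand-in store only imitates the shape of the method; a concrete vector store would perform network calls where the sketch yields to the event loop.

```python
import asyncio
import hashlib


class AsyncStoreSketch:
    """Hypothetical stand-in with the same coroutine shape as add_documents_async."""

    def __init__(self):
        self.records = {}

    async def add_documents_async(self, content, **kwargs):
        ids = []
        for text in content:
            # Yield control to the event loop, as a real network call would.
            await asyncio.sleep(0)
            doc_id = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
            self.records[doc_id] = text
            ids.append(doc_id)
        return ids


async def ingest(store, batches):
    # Schedule one coroutine per batch and wait for all of them to finish.
    return await asyncio.gather(
        *(store.add_documents_async(batch) for batch in batches)
    )


store = AsyncStoreSketch()
batch_ids = asyncio.run(
    ingest(store, [["document one content"], ["document two content"]])
)
print(batch_ids)
```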

abstractmethod as_langchain_retriever(**kwargs)[source]¶

Creates a LangChain retriever from this vector store.

Returns:: LangChain retriever that can be used in LangChain pipelines
Return type:: langchain_core.vectorstores.VectorStoreRetriever

abstractmethod clear()[source]¶: Clears the current collection that is being used by the vector store. Removes all documents with all their metadata and embeddings.

abstractmethod count()[source]¶

Returns the number of documents in the current collection.

Returns:: number of documents in the collection
Return type:: int

abstractmethod delete(ids, **kwargs)[source]¶

Delete documents with provided IDs.

Parameters:: ids (list[str]) – IDs of documents to be deleted

abstractmethod get_client()[source]¶

Returns an underlying native vector store client.

Returns:: wrapped vector store client
Return type:: Any

abstractmethod search(query, k, include_scores=False, verbose=False, **kwargs)[source]¶

Get documents that would fit the query.

Parameters:

query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False

Returns:

list of found documents

Return type:

list

abstractmethod window_search(query, k, include_scores=False, verbose=False, window_size=2, **kwargs)[source]¶

Parameters:

query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
window_size (int, optional) – number of adjacent chunks to retrieve before and after the center chunk, according to the sequence_number, defaults to 2

Returns:

list of found documents (extended into windows).

Return type:

list
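
The windowing idea can be sketched in plain Python. The helper `extend_to_window` and the dictionary layout of the chunks are assumptions for illustration; only the metadata keys document_id and sequence_number come from the chunker documentation. How a concrete vector store fetches and merges the neighboring chunks may differ.

```python
def extend_to_window(hit, all_chunks, window_size=2):
    """Return the hit plus up to window_size neighbors on each side, in order."""
    doc_id = hit["metadata"]["document_id"]
    center = hit["metadata"]["sequence_number"]
    window = [
        chunk
        for chunk in all_chunks
        # Keep only chunks from the same origin document ...
        if chunk["metadata"]["document_id"] == doc_id
        # ... whose position is within window_size of the center chunk.
        and abs(chunk["metadata"]["sequence_number"] - center) <= window_size
    ]
    return sorted(window, key=lambda chunk: chunk["metadata"]["sequence_number"])


# Hypothetical chunks: seven from "doc-1" and one from "doc-2".
chunks = [
    {"content": f"chunk {n}", "metadata": {"document_id": "doc-1", "sequence_number": n}}
    for n in range(1, 8)
]
chunks.append(
    {"content": "other", "metadata": {"document_id": "doc-2", "sequence_number": 4}}
)

window = extend_to_window(chunks[3], chunks, window_size=2)
print([chunk["content"] for chunk in window])
```

Near the start or end of a document the window is simply shorter, because the missing neighbors do not exist.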

VectorStoreConnector¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store_connector.VectorStoreConnector(properties=None)[source]¶

Bases: object

Creates a proper vector store client using the provided properties.

Properties are arguments to the LangChain vector store of the desired type. The connector also parses properties extracted from connection assets into a form suitable for initialization.

The custom or connection asset properties that are parsed are: index_name, distance_metric, username, password, ssl_certificate, and embeddings.

Parameters:: properties (dict) – dictionary with all the required key values to establish the connection

get_chroma()[source]¶

Creates an in-memory vector store for Chroma.

Raises:: ImportError – langchain required
Returns:: vector store adapter for LangChain’s Chroma
Return type:: LangChainVectorStoreAdapter

get_db2()[source]¶

Creates a DB2 vector store.

Raises:: ImportError – langchain-db2 required
Returns:: vector store adapter for LangChain’s DB2
Return type:: LangChainVectorStoreAdapter

get_elasticsearch()[source]¶

Creates an Elasticsearch vector store.

Raises:: ImportError – langchain required
Returns:: vector store adapter for LangChain’s Elasticsearch
Return type:: LangChainVectorStoreAdapter

get_from_type(type)[source]¶

Gets a vector store based on the provided type (matched against the DataSource names from the SDK API).

Parameters:: type (VectorStoreDataSourceType) – DataSource type string from SDK API
Raises:: TypeError – unsupported type
Returns:: proper BaseVectorStore type constructed from properties
Return type:: BaseVectorStore

get_langchain_adapter(langchain_vector_store)[source]¶

Creates an adapter for a concrete vector store from LangChain.

Parameters:: langchain_vector_store (Any) – object that is a subclass of the LangChain vector store
Raises:: ImportError – LangChain required
Returns:: proper adapter for the vector store
Return type:: LangChainVectorStoreAdapter

get_milvus()[source]¶

Creates a Milvus vector store.

Raises:: ImportError – langchain required
Returns:: vector store adapter for LangChain’s Milvus
Return type:: LangChainVectorStoreAdapter

static get_type_from_langchain_vector_store(langchain_vector_store)[source]¶

Returns DataSourceType for concrete LangChain VectorStore class.

Parameters:: langchain_vector_store (Any) – vector store object from LangChain
Returns:: DataSourceType name
Return type:: VectorStoreDataSourceType

VectorStoreDataSourceType¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store_connector.VectorStoreDataSourceType(value)[source]¶

Bases: str, Enum

CHROMA = 'chroma'¶

DB2 = 'db2'¶

ELASTICSEARCH = 'elasticsearch'¶

MILVUS = 'milvus'¶

MILVUS_WXD = 'milvuswxd'¶

UNDEFINED = 'undefined'¶
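
Because the enum derives from both str and Enum, its members behave like plain strings. The sketch below uses a local replica of the documented members, named `DataSourceTypeSketch` (a hypothetical name), so that it runs without the SDK installed.

```python
import json
from enum import Enum


class DataSourceTypeSketch(str, Enum):
    """Local replica of the documented members, used only for illustration."""

    CHROMA = "chroma"
    DB2 = "db2"
    ELASTICSEARCH = "elasticsearch"
    MILVUS = "milvus"
    MILVUS_WXD = "milvuswxd"
    UNDEFINED = "undefined"


# Members compare equal to their plain string values.
is_same = DataSourceTypeSketch.MILVUS == "milvus"

# A member can be looked up from its string value.
member = DataSourceTypeSketch("elasticsearch")

# str-based members serialize to JSON as ordinary strings.
serialized = json.dumps({"type": DataSourceTypeSketch.DB2})

# An unknown value raises ValueError instead of returning a member.
try:
    DataSourceTypeSketch("pinecone")
    lookup_failed = False
except ValueError:
    lookup_failed = True
```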

LangChainVectorStoreAdapter¶

class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.langchain_vector_store_adapter.LangChainVectorStoreAdapter(vector_store)[source]¶

Bases: Generic[T], BaseVectorStore

Adapter for LangChain VectorStore base class.

Parameters:: vector_store (langchain_core.vectorstores.VectorStore) – concrete LangChain vector store object

add_documents(content, **kwargs)[source]¶

Embed documents and add to the vectorstore.

Parameters:: content (list[str] | list[dict] | list[langchain_core.documents.Document]) – Documents to add to the vectorstore.
Returns:: List of IDs of the added texts.
Return type:: list[str]

async add_documents_async(content, **kwargs)[source]¶

Embed documents and add to the vectorstore in asynchronous manner.

Parameters:: content (list[str] | list[dict] | list[langchain_core.documents.Document]) – Documents to add to the vectorstore.
Returns:: List of IDs of the added texts.
Return type:: list[str]

as_langchain_retriever(**kwargs)[source]¶: Return a LangChain VectorStoreRetriever initialized from this VectorStore.

clear()[source]¶: Clears the current collection that is being used by the vector store. Removes all documents with all their metadata and embeddings.

count()[source]¶

Returns the number of documents in the current collection.

Returns:: number of documents in the collection
Return type:: int

delete(ids, **kwargs)[source]¶: Delete by vector ID or other criteria. For more details, see the LangChain documentation: https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html#langchain_core.vectorstores.base.VectorStore

get_client()[source]¶

Returns an underlying native vector store client.

Returns:: wrapped vector store client
Return type:: Any

search(query, k, include_scores=False, verbose=False, **kwargs)[source]¶

Searches for documents most similar to the query.

Parameters:

query (str) – text query
k (int) – number of documents to retrieve
include_scores (bool) – whether similarity scores of found documents should be returned, defaults to False
verbose (bool) – whether to display a table with the found documents, defaults to False

Returns:

list of found documents

Return type:

list

window_search(query, k, include_scores=False, verbose=False, window_size=2, **kwargs)[source]¶

Searches for documents most similar to the query and extends each found document (a chunk) with its adjacent chunks (if they exist) from the same origin document.

Parameters:

query (str) – text query
k (int) – number of documents to retrieve
include_scores (bool) – whether similarity scores of found documents should be returned, defaults to False
verbose (bool) – whether to display a table with the found documents, defaults to False
window_size (int, optional) – number of adjacent chunks to retrieve before and after the found chunk, defaults to 2

Returns:

list of found documents

Return type:

list