RAG¶
Note
Added in 1.1.x release
Chunkers¶
LangChainChunker¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.langchain_chunker.LangChainChunker(method='recursive', chunk_size=4000, chunk_overlap=200, encoding_name='gpt2', model_name=None, **kwargs)[source]¶
Bases:
BaseChunker[Document]
Wrapper for LangChain TextSplitter.
- Parameters:
method (Literal["recursive", "character", "token"], optional) – type of TextSplitter used as the main instance that performs the chunking, defaults to “recursive”
chunk_size (int, optional) – maximum size of a single chunk that is returned, defaults to 4000
chunk_overlap (int, optional) – overlap in characters between chunks, defaults to 200
encoding_name (str, optional) – encoding used in the TokenTextSplitter, defaults to “gpt2”
model_name (str, optional) – model used in the TokenTextSplitter
from ibm_watsonx_ai.foundation_models.extensions.rag.chunker import LangChainChunker

# Recursive splitter with chunks of at most 1000 characters and a 200-character overlap
text_splitter = LangChainChunker(
    method="recursive",
    chunk_size=1000,
    chunk_overlap=200
)

# data_loader and vector_store are assumed to be defined earlier
chunks_ids = []
for document in data_loader:
    chunks = text_splitter.split_documents([document])
    chunks_ids.append(vector_store.add_documents(chunks, batch_size=300))
- split_documents(documents)[source]¶
Split series of documents into smaller chunks based on the provided chunker settings. Each chunk has metadata that includes the document_id, sequence_number, and start_index.
- Parameters:
documents (Sequence[langchain_core.documents.Document]) – sequence of elements that contain context in a text format
- Returns:
list of documents split into smaller ones, having less content
- Return type:
list[langchain_core.documents.Document]
- supported_methods = ('recursive', 'character', 'token')¶
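To illustrate how chunk_size and chunk_overlap interact, here is a minimal pure-Python sketch of fixed-size character windows. It is not the LangChain implementation that LangChainChunker wraps; the Chunk class, the split_text helper, and numbering sequence_number from 1 are assumptions made for this illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_text(text: str, document_id: str, chunk_size: int, chunk_overlap: int) -> list[Chunk]:
    """Cut text into windows of chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbours share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks: list[Chunk] = []
    start = 0
    while start < len(text):
        metadata = {
            "document_id": document_id,
            "sequence_number": len(chunks) + 1,  # numbering from 1 is an assumption
            "start_index": start,
        }
        chunks.append(Chunk(text[start:start + chunk_size], metadata))
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


chunks = split_text("abcdefghijklmnopqrstuvwxyz", "doc-1", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk.metadata["start_index"], chunk.page_content)
```

With these settings every window begins 7 characters after the previous one, and the last 3 characters of a chunk reappear at the start of the next.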
BaseChunker¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.chunker.base_chunker.BaseChunker[source]¶
Bases:
ABC, Generic[ChunkType]
Responsible for handling splitting document operations in the RAG application.
- abstract split_documents(documents)[source]¶
Split series of documents into smaller parts based on the provided chunker settings.
- Parameters:
documents (Sequence[ChunkType]) – sequence of elements that contain context in a text format
- Returns:
list of documents split into smaller ones, having less content
- Return type:
list[ChunkType]
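A custom chunker could follow the same abstract, generic interface. The sketch below uses a local stand-in (SketchBaseChunker) instead of importing the library class, and ParagraphChunker is a hypothetical subclass invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Generic, Sequence, TypeVar

ChunkType = TypeVar("ChunkType")


class SketchBaseChunker(ABC, Generic[ChunkType]):
    """Local stand-in that mirrors the documented interface; not the library class."""

    @abstractmethod
    def split_documents(self, documents: Sequence[ChunkType]) -> list[ChunkType]:
        """Split documents into smaller parts."""


class ParagraphChunker(SketchBaseChunker[str]):
    """Hypothetical chunker that splits plain strings on blank lines."""

    def split_documents(self, documents: Sequence[str]) -> list[str]:
        parts: list[str] = []
        for document in documents:
            # Keep only non-empty paragraphs, stripped of surrounding whitespace
            parts.extend(p.strip() for p in document.split("\n\n") if p.strip())
        return parts


paragraphs = ParagraphChunker().split_documents(["first\n\nsecond", "third"])
print(paragraphs)
```

Because split_documents is abstract, the stand-in base class cannot be instantiated directly; only subclasses that implement the method can.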
Retrievers¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.Retriever(vector_store, method=RetrievalMethod.SIMPLE, window_size=2, number_of_chunks=5)[source]¶
Bases:
BaseRetriever
Retriever class that handles the retrieval operation for a RAG implementation. The retrieve method returns number_of_chunks document segments that are relevant to the query, using the provided method.
- Parameters:
vector_store (BaseVectorStore) – VectorStore to use for the retrieval
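method (RetrievalMethod, optional) – default retrieval method to use when calling retrieve, defaults to RetrievalMethod.SIMPLE
window_size (int, optional) – number of adjacent chunks to retrieve before and after the center chunk when the window method is used, defaults to 2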
number_of_chunks (int, optional) – number of expected document chunks to be returned, defaults to 5
You can create a repeatable retrieval and return the three nearest documents by using a simple proximity search. To do this, create a VectorStore and then define a Retriever.
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.extensions.rag import Retriever, RetrievalMethod
from ibm_watsonx_ai.foundation_models.embeddings import SentenceTransformerEmbeddings

api_client = APIClient(credentials)

vector_store = VectorStore(
    api_client,
    connection_id='***',
    params={
        'index_name': 'my_test_index',
    },
    embeddings=SentenceTransformerEmbeddings('sentence-transformers/all-MiniLM-L6-v2')
)

retriever = Retriever(vector_store=vector_store, method=RetrievalMethod.SIMPLE, number_of_chunks=3)
retriever.retrieve("What is IBM known for?")
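Conceptually, a simple proximity search ranks stored chunks by how close their embedding is to the query embedding and keeps the best number_of_chunks. The sketch below shows that idea with cosine similarity over hand-written toy vectors; the helper names and the vectors are assumptions for illustration, and the real search is delegated to the underlying vector store.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_chunks(query_vector, indexed_chunks, number_of_chunks):
    """Return the texts of the number_of_chunks entries most similar to the query."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:number_of_chunks]]


# Toy index of (text, embedding) pairs; the vectors are made up for this example
index = [
    ("IBM builds mainframes", [0.9, 0.1, 0.0]),
    ("Bananas are yellow", [0.0, 0.2, 0.9]),
    ("IBM Research works on AI", [0.8, 0.3, 0.1]),
    ("Rain is wet", [0.1, 0.9, 0.2]),
]

top = nearest_chunks([1.0, 0.2, 0.0], index, number_of_chunks=3)
print(top)
```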
- classmethod from_vector_store(vector_store, init_parameters=None)[source]¶
Deserializes the init_parameters retriever into a concrete one using arguments.
- Parameters:
vector_store (BaseVectorStore) – vector store used to create the retriever
init_parameters (dict[str, Any]) – parameters to initialize the retriever with
- Returns:
concrete Retriever or None if data is incorrect
- Return type:
BaseRetriever | None
- class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.RetrievalMethod(value)[source]¶
Bases:
str, Enum
An enumeration.
- SIMPLE = 'simple'¶
- WINDOW = 'window'¶
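Because the enumeration derives from both str and Enum, its members can be built from, and compared with, plain strings, which is convenient when the retrieval method is read from configuration. The sketch below uses a local replica (SketchRetrievalMethod) of the two documented members rather than the library class.

```python
import json
from enum import Enum


class SketchRetrievalMethod(str, Enum):
    """Local replica of the documented members, for illustration only."""
    SIMPLE = "simple"
    WINDOW = "window"


# Look up a member from a configuration string
method = SketchRetrievalMethod("window")

# Members compare equal to their string values and serialize as plain strings
print(method == "window")
print(json.dumps({"method": SketchRetrievalMethod.SIMPLE}))
```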
- class ibm_watsonx_ai.foundation_models.extensions.rag.retriever.retriever.BaseRetriever(vector_store)[source]¶
Bases:
ABC
Abstract class for all retriever handlers for the chosen vector store. Returns some document chunks in a RAG pipeline using a concrete retrieve implementation.
- Parameters:
vector_store (BaseVectorStore) – vector store used in document retrieval
- abstract classmethod from_vector_store(vector_store, init_parameters=None)[source]¶
Deserializes the init_parameters retriever into a concrete one using arguments.
- Parameters:
vector_store (BaseVectorStore) – vector store used to create the retriever
init_parameters (dict[str, Any]) – parameters to initialize the retriever with
- Returns:
concrete Retriever or None if data is incorrect
- Return type:
BaseRetriever | None
Vector Stores¶
VectorStore¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store.VectorStore(api_client=None, *, connection_id=None, embeddings=None, index_name=None, datasource_type=None, distance_metric=None, langchain_vector_store=None, **kwargs)[source]¶
Bases:
BaseVectorStore
Universal vector store client for a RAG pattern.
Instantiates the vector store connection in the Watson Machine Learning environment and handles the necessary operations. The parameters given by the keyword arguments are passed to the constructor of the particular vector store client. Those parameters might be parsed differently depending on the client.
For details, refer to the VectorStoreConnector get_... methods.
You can use a custom embedding function. Provide it in the constructor or with the set_embeddings method. For available embeddings, refer to the ibm_watsonx_ai.foundation_models.embeddings module.
- Parameters:
api_client (APIClient, optional) – WatsonX API client required if connecting by connection_id, defaults to None
connection_id (str, optional) – connection asset ID, defaults to None
embeddings (BaseEmbeddings, optional) – default embeddings to be used, defaults to None
index_name (str, optional) – name of the vector database index, defaults to None
datasource_type (VectorStoreDataSourceType, str, optional) – data source type to use when connection_id is not provided; keyword arguments are then used to establish the connection, defaults to None
distance_metric (Literal["euclidean", "cosine"], optional) – metric used for determining vector distance, defaults to None
langchain_vector_store (VectorStore, optional) – use LangChain vector store, defaults to None
Example:
To connect, provide the connection asset ID. You can use custom embeddings to add and search documents.
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models import Embeddings
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore
from ibm_watsonx_ai.foundation_models.utils.enums import EmbeddingTypes

api_client = APIClient(credentials)

# Embedding model used both for adding and for searching documents
embedding = Embeddings(
    model_id=EmbeddingTypes.IBM_SLATE_30M_ENG,
    api_client=api_client
)

vector_store = VectorStore(
    api_client,
    connection_id='***',
    index_name='my_test_index',
    embeddings=embedding
)

vector_store.add_documents([
    {'content': 'document one content', 'metadata': {'url': 'ibm.com'}},
    {'content': 'document two content', 'metadata': {'url': 'ibm.com'}}
])

vector_store.search('one', k=1)
Note
Optionally, as in LangChain, you can use the cloud ID and API key parameters to connect to Elastic Cloud. The keyword arguments are passed directly to LangChain’s ElasticsearchStore constructor.
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models.extensions.rag import VectorStore

api_client = APIClient(credentials)

vector_store = VectorStore(
    api_client,
    index_name='my_test_index',
    model_id=".elser_model_2_linux-x86_64",
    cloud_id='***',
    api_key=IAM_API_KEY
)

vector_store.add_documents([
    {'content': 'document one content', 'metadata': {'url': 'ibm.com'}},
    {'content': 'document two content', 'metadata': {'url': 'ibm.com'}}
])

vector_store.search('one', k=1)
- add_documents(content, **kwargs)[source]¶
Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are determined by the text content of the document (hash). Duplicates will not be added.
The list must contain strings, dictionaries with a required content field of a string type, or LangChain Document objects.
- Parameters:
content (list[str] | list[dict] | list) – unstructured list of data to be added
- Returns:
list of IDs
- Return type:
list[str]
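The upsert behavior described above can be pictured as keying each document by a hash of its text, so that adding the same text twice overwrites the entry instead of duplicating it. The sketch below is an in-memory illustration only: the use of SHA-256, the InMemoryStore class, and returning an ID for every input item are assumptions, and LangChain Document inputs are left out to keep it dependency-free.

```python
import hashlib


def normalize(item):
    """Accept a plain string or a dict with a required string 'content' field."""
    if isinstance(item, str):
        return {"content": item, "metadata": {}}
    if isinstance(item, dict) and isinstance(item.get("content"), str):
        return {"content": item["content"], "metadata": item.get("metadata", {})}
    raise TypeError("each item must be a str or a dict with a string 'content' field")


def content_id(text: str) -> str:
    """Derive a deterministic ID from the text (SHA-256 is an assumption here)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class InMemoryStore:
    """Hypothetical store used only to demonstrate hash-based upserts."""

    def __init__(self):
        self.documents = {}

    def add_documents(self, content):
        ids = []
        for item in content:
            document = normalize(item)
            document_id = content_id(document["content"])
            self.documents[document_id] = document  # same text, same key: upsert
            ids.append(document_id)
        return ids


store = InMemoryStore()
ids = store.add_documents([
    "document one content",
    {"content": "document one content", "metadata": {"url": "ibm.com"}},
    {"content": "document two content", "metadata": {"url": "ibm.com"}},
])
print(len(ids), len(store.documents))
```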
- async add_documents_async(content, **kwargs)[source]¶
Adds documents to the RAG’s vector store asynchronously. The list must contain strings, dictionaries with a required content field of a string type, or LangChain Document objects.
- Parameters:
content (list[str] | list[dict] | list) – unstructured list of data to be added
- Returns:
list of IDs
- Return type:
list[str]
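One way to use an asynchronous add method is to submit several batches concurrently and collect the returned IDs. The sketch below shows that pattern with asyncio.gather against a hypothetical in-memory SketchAsyncStore; the store, the ingest helper, and the ID format are invented for illustration and do not reflect how the library generates IDs.

```python
import asyncio


class SketchAsyncStore:
    """Hypothetical store whose async add method stands in for network I/O."""

    def __init__(self):
        self.documents = []

    async def add_documents_async(self, content):
        await asyncio.sleep(0)  # placeholder for the real asynchronous request
        start = len(self.documents)
        self.documents.extend(content)
        return [f"id-{start + offset}" for offset in range(len(content))]


async def ingest(store, documents, batch_size):
    """Split documents into batches and add all batches concurrently."""
    batches = [documents[i:i + batch_size] for i in range(0, len(documents), batch_size)]
    results = await asyncio.gather(*(store.add_documents_async(batch) for batch in batches))
    # Flatten the per-batch ID lists into one list
    return [document_id for batch_ids in results for document_id in batch_ids]


store = SketchAsyncStore()
ids = asyncio.run(ingest(store, ["a", "b", "c", "d", "e"], batch_size=2))
print(ids)
```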
- as_langchain_retriever(**kwargs)[source]¶
Creates a LangChain retriever from this vector store.
- Returns:
LangChain retriever that can be used in LangChain pipelines
- Return type:
langchain_core.vectorstores.VectorStoreRetriever
- clear()[source]¶
Clears the current collection that is being used by the vector store. Removes all documents with all their metadata and embeddings.
- count()[source]¶
Returns the number of documents in the current collection.
- Returns:
number of documents in the collection
- Return type:
int
- delete(ids, **kwargs)[source]¶
Delete documents with provided IDs.
- Parameters:
ids (list[str]) – IDs of documents to be deleted
- classmethod from_dict(client=None, data=None)[source]¶
Creates VectorStore using only a primitive data type dict.
- Parameters:
data (dict) – dict with the same schema as the one returned by the to_dict() method
- Returns:
reconstructed VectorStore
- Return type:
- get_client()[source]¶
Returns an underlying native vector store client.
- Returns:
wrapped vector store client
- Return type:
Any
- search(query, k, include_scores=False, verbose=False, **kwargs)[source]¶
Searches for documents most similar to the query.
The method is designed as a wrapper for respective LangChain VectorStores’ similarity search methods. Therefore, additional search parameters passed in kwargs should be consistent with those methods, and can be found in the LangChain documentation as they may differ depending on the connection type: Milvus, Chroma, Elasticsearch, etc.
- Parameters:
query (str) – text query
k (int) – number of documents to retrieve
include_scores (bool) – whether similarity scores of found documents should be returned, defaults to False
verbose (bool) – whether to display a table with the found documents, defaults to False
- Returns:
list of found documents
- Return type:
list
- set_embeddings(embedding_fn)[source]¶
If possible, sets a default embedding function. To make the function usable in a RAGPattern deployment, use types inherited from BaseEmbeddings. The embedding_fn argument can be a LangChain embedding, but issues with serialization will occur.
Deprecated: The set_embeddings method for the VectorStore class is deprecated, because it might cause issues for ‘langchain >= 0.2.0’.
- Parameters:
embedding_fn (BaseEmbeddings) – embedding function
- to_dict()[source]¶
Serialize VectorStore into a dict that allows reconstruction using the from_dict class method.
- Returns:
dict for the from_dict initialization
- Return type:
dict
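The to_dict and from_dict pair forms a serialization round trip: the dict contains only primitive values, so it can be stored as JSON and used later to rebuild an equivalent object. The sketch below demonstrates the pattern with a hypothetical SketchStore class; the two fields shown are assumptions and not the actual schema that VectorStore produces.

```python
import json


class SketchStore:
    """Hypothetical class illustrating a to_dict / from_dict round trip."""

    def __init__(self, index_name, distance_metric=None):
        self.index_name = index_name
        self.distance_metric = distance_metric

    def to_dict(self):
        # Only primitive values, so the result is JSON-serializable
        return {"index_name": self.index_name, "distance_metric": self.distance_metric}

    @classmethod
    def from_dict(cls, data):
        return cls(**data)


original = SketchStore("my_test_index", "cosine")
payload = json.dumps(original.to_dict())
restored = SketchStore.from_dict(json.loads(payload))
print(restored.to_dict())
```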
- window_search(query, k, include_scores=False, verbose=False, window_size=2, **kwargs)[source]¶
Similarly to the search method, gets documents (chunks) that would fit the query. Each chunk is extended to its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged (if it exists). This requires chunks to have “document_id” and “sequence_number” in their metadata.
- Parameters:
query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.
- Returns:
list of found documents (extended into windows).
- Return type:
list
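The windowing step can be sketched as follows: pick the chunks of the same document whose sequence_number lies within window_size of the matched chunk, order them, and join them while dropping text that overlaps between neighbours. This is an illustrative reimplementation under assumptions (chunks as dicts with content and metadata keys, overlap detected as the longest shared suffix/prefix), not the library's code.

```python
def merge_with_overlap(left: str, right: str) -> str:
    """Join two texts, dropping the longest suffix of left that is a prefix of right."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return left + right


def build_window(chunks, center, window_size):
    """Merge the center chunk with up to window_size neighbours on each side."""
    document_id = center["metadata"]["document_id"]
    sequence_number = center["metadata"]["sequence_number"]
    neighbours = sorted(
        (
            chunk for chunk in chunks
            if chunk["metadata"]["document_id"] == document_id
            and abs(chunk["metadata"]["sequence_number"] - sequence_number) <= window_size
        ),
        key=lambda chunk: chunk["metadata"]["sequence_number"],
    )
    text = neighbours[0]["content"]
    for chunk in neighbours[1:]:
        text = merge_with_overlap(text, chunk["content"])
    return text


def make_chunk(content, document_id, sequence_number):
    return {"content": content,
            "metadata": {"document_id": document_id, "sequence_number": sequence_number}}


chunks = [
    make_chunk("The quick brown", "doc-1", 1),
    make_chunk("brown fox jumps", "doc-1", 2),
    make_chunk("jumps over the", "doc-1", 3),
    make_chunk("the lazy dog", "doc-1", 4),
    make_chunk("unrelated text", "doc-2", 2),
]

window = build_window(chunks, center=chunks[1], window_size=1)
print(window)
```

Chunks from other documents are ignored even when their sequence_number falls inside the window, which is why both metadata fields are needed.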
BaseVectorStore¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.base_vector_store.BaseVectorStore[source]¶
Bases:
ABC
Base abstract class for all vector store-like classes. Interface that supports simple database operations.
- abstract add_documents(content, **kwargs)[source]¶
Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are determined by the text content of the document (hash). Duplicates will not be added.
The list must contain strings, dictionaries with a required content field of a string type, or LangChain Document objects.
- Parameters:
content (list[str] | list[dict] | list) – unstructured list of data to be added
- Returns:
list of IDs
- Return type:
list[str]
- abstract async add_documents_async(content, **kwargs)[source]¶
Adds documents to the RAG’s vector store asynchronously. The list must contain strings, dictionaries with a required content field of a string type, or LangChain Document objects.
- Parameters:
content (list[str] | list[dict] | list) – unstructured list of data to be added
- Returns:
list of IDs
- Return type:
list[str]
- abstract as_langchain_retriever(**kwargs)[source]¶
Creates a LangChain retriever from this vector store.
- Returns:
LangChain retriever that can be used in LangChain pipelines
- Return type:
langchain_core.vectorstores.VectorStoreRetriever
- abstract clear()[source]¶
Clears the current collection that is being used by the vector store. Removes all documents with all their metadata and embeddings.
- abstract count()[source]¶
Returns the number of documents in the current collection.
- Returns:
number of documents in the collection
- Return type:
int
- abstract delete(ids, **kwargs)[source]¶
Delete documents with provided IDs.
- Parameters:
ids (list[str]) – IDs of documents to be deleted
- abstract get_client()[source]¶
Returns an underlying native vector store client.
- Returns:
wrapped vector store client
- Return type:
Any
- abstract search(query, k, include_scores=False, verbose=False, **kwargs)[source]¶
Get documents that would fit the query.
- Parameters:
query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
- Returns:
list of found documents
- Return type:
list
- abstract set_embeddings(embedding_fn)[source]¶
If possible, sets a default embedding function. To make the function usable in a RAGPattern deployment, use types inherited from BaseEmbeddings. The embedding_fn argument can be a LangChain embedding, but issues with serialization will occur.
Deprecated: The set_embeddings method for the VectorStore class is deprecated, because it might cause issues for ‘langchain >= 0.2.0’.
- Parameters:
embedding_fn (BaseEmbeddings) – embedding function
- abstract window_search(query, k, include_scores=False, verbose=False, window_size=2, **kwargs)[source]¶
Similarly to the search method, gets documents (chunks) that would fit the query. Each chunk is extended to its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged (if it exists). This requires chunks to have “document_id” and “sequence_number” in their metadata.
- Parameters:
query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.
- Returns:
list of found documents (extended into windows).
- Return type:
list
VectorStoreConnector¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.vector_store_connector.VectorStoreConnector(properties=None)[source]¶
Bases:
object
Creates a proper vector store client using the provided properties.
Properties are arguments to the LangChain vector stores of a desired type. The connector also parses properties extracted from connection assets into a form suitable for initialization.
Custom or connection asset properties that are parsed are:
* index_name
* distance_metric
* username
* password
* ssl_certificate
* embeddings
- Parameters:
properties (dict) – dictionary with all the required key values to establish the connection
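The get_... methods of this class each build one kind of store from the same properties, and a type-based lookup can route to the right one. The sketch below shows such a dispatch with a hypothetical SketchConnector; the type strings, the tuple return values, and the error message are assumptions for illustration, whereas the real methods return vector store adapters.

```python
class SketchConnector:
    """Hypothetical connector that routes a type string to a factory method."""

    def __init__(self, properties=None):
        self.properties = dict(properties or {})

    # Each factory returns a (name, properties) tuple as a placeholder for a real client
    def get_chroma(self):
        return ("chroma", self.properties)

    def get_elasticsearch(self):
        return ("elasticsearch", self.properties)

    def get_milvus(self):
        return ("milvus", self.properties)

    def get_from_type(self, source_type):
        factories = {
            "chroma": self.get_chroma,
            "elasticsearch": self.get_elasticsearch,
            "milvus": self.get_milvus,
        }
        try:
            factory = factories[source_type]
        except KeyError:
            raise TypeError(f"Unsupported data source type: {source_type}") from None
        return factory()


connector = SketchConnector({"index_name": "my_test_index"})
print(connector.get_from_type("milvus"))
```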
- get_chroma()[source]¶
Creates an in-memory vector store for Chroma.
- Raises:
ImportError – langchain required
- Returns:
vector store adapter for LangChain’s Chroma
- Return type:
- get_elasticsearch()[source]¶
Creates an Elasticsearch vector store.
- Raises:
ImportError – langchain required
- Returns:
vector store adapter for LangChain’s Elasticsearch
- Return type:
- get_from_type(type)[source]¶
Gets a vector store based on the provided type (matching from DataSource names from SDK API).
- Parameters:
type (VectorStoreDataSourceType) – DataSource type string from SDK API
- Raises:
TypeError – unsupported type
- Returns:
proper BaseVectorStore type constructed from properties
- Return type:
- get_langchain_adapter(langchain_vector_store)[source]¶
Creates an adapter for a concrete vector store from LangChain.
- Parameters:
langchain_vector_store (Any) – object that is a subclass of the LangChain vector store
- Raises:
ImportError – LangChain required
- Returns:
proper adapter for the vector store
- Return type:
- get_milvus()[source]¶
Creates a Milvus vector store.
- Raises:
ImportError – langchain required
- Returns:
vector store adapter for LangChain’s Milvus
- Return type:
VectorStoreDataSourceType¶
LangChainVectorStoreAdapter¶
- class ibm_watsonx_ai.foundation_models.extensions.rag.vector_stores.langchain_vector_store_adapter.LangChainVectorStoreAdapter(vector_store)[source]¶
Bases:
BaseVectorStore
Adapter for LangChain VectorStore base class.
- Parameters:
vector_store (langchain_core.vectorstores.VectorStore) – concrete LangChain vector store object
- add_documents(content, **kwargs)[source]¶
Adds a list of documents to the RAG’s vector store as an upsert operation. IDs are determined by the text content of the document (hash). Duplicates will not be added.
The list must contain strings, dictionaries with a required content field of a string type, or LangChain Document objects.
- Parameters:
content (list[str] | list[dict] | list) – unstructured list of data to be added
- Returns:
list of IDs
- Return type:
list[str]
- async add_documents_async(content, **kwargs)[source]¶
Adds documents to the RAG’s vector store asynchronously. The list must contain strings, dictionaries with a required content field of a string type, or LangChain Document objects.
- Parameters:
content (list[str] | list[dict] | list) – unstructured list of data to be added
- Returns:
list of IDs
- Return type:
list[str]
- as_langchain_retriever(**kwargs)[source]¶
Creates a LangChain retriever from this vector store.
- Returns:
LangChain retriever that can be used in LangChain pipelines
- Return type:
langchain_core.vectorstores.VectorStoreRetriever
- clear()[source]¶
Clears the current collection that is being used by the vector store. Removes all documents with all their metadata and embeddings.
- count()[source]¶
Returns the number of documents in the current collection.
- Returns:
number of documents in the collection
- Return type:
int
- delete(ids, **kwargs)[source]¶
Delete documents with provided IDs.
- Parameters:
ids (list[str]) – IDs of documents to be deleted
- get_client()[source]¶
Returns an underlying native vector store client.
- Returns:
wrapped vector store client
- Return type:
Any
- search(query, k, include_scores=False, verbose=False, **kwargs)[source]¶
Get documents that would fit the query.
- Parameters:
query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
- Returns:
list of found documents
- Return type:
list
- set_embeddings(embedding_fn)[source]¶
If possible, sets a default embedding function. To make the function usable in a RAGPattern deployment, use types inherited from BaseEmbeddings. The embedding_fn argument can be a LangChain embedding, but issues with serialization will occur.
Deprecated: The set_embeddings method for the VectorStore class is deprecated, because it might cause issues for ‘langchain >= 0.2.0’.
- Parameters:
embedding_fn (BaseEmbeddings) – embedding function
- window_search(query, k, include_scores=False, verbose=False, window_size=2, **kwargs)[source]¶
Similarly to the search method, gets documents (chunks) that would fit the query. Each chunk is extended to its adjacent chunks (if they exist) from the same origin document. The adjacent chunks are merged into one chunk while keeping their order, and any intersecting text between them is merged (if it exists). This requires chunks to have “document_id” and “sequence_number” in their metadata.
- Parameters:
query (str) – question asked by a user
k (int) – maximum number of similar documents
include_scores (bool, optional) – return scores for documents, defaults to False
verbose (bool, optional) – print formatted response to the output, defaults to False
window_size (int) – number of adjacent chunks to retrieve before and after the center, according to the sequence_number.
- Returns:
list of found documents (extended into windows).
- Return type:
list