Working with AutoAI RAG class and rag_optimizer
===============================================

The :ref:`AutoAI experiment class` is responsible for creating experiments and scheduling training.
All experiment results are stored automatically in the user-specified Cloud Object Storage (COS).
The AutoAI feature can then fetch the results and provide them directly to the user for further use.

Configure rag_optimizer with one data source
--------------------------------------------

To initialize an AutoAI object, you need watsonx.ai credentials (with your API key and URL) and either the ``project_id`` or the ``space_id``.

.. hint::
    You can copy the ``project_id`` from the Project's Manage tab (Project -> Manage -> General -> Details).

.. code-block:: python

    from ibm_watsonx_ai.experiment import AutoAI

    # wx_credentials holds the watsonx.ai credentials (API key and URL)
    # and is assumed to be defined earlier.
    experiment = AutoAI(wx_credentials,
        space_id='76g53e0-0b32-4a0e-9152-3d50324855ddb'
    )

    rag_optimizer = experiment.rag_optimizer(
        name='AutoAI RAG test',
        description='Sample description',
        foundation_models=[
            "meta-llama/llama-3-70b-instruct",
            "ibm/granite-13b-chat-v2"
        ],
        embedding_models=[
            "ibm/slate-125m-english-rtrvr",
            "intfloat/multilingual-e5-large"
        ],
        max_number_of_rag_patterns=5,
        optimization_metrics=[AutoAI.RAGMetrics.ANSWER_CORRECTNESS],
    )

Get configuration parameters
----------------------------

To see the current configuration parameters, call the ``get_params()`` method.

.. code-block:: python

    config_parameters = rag_optimizer.get_params()
    print(config_parameters)
    {
        'name': 'RAG AutoAi tests without vector_store_references',
        'description': 'Sample description',
        'chunking_methods': None,
        'embedding_models': [
            'ibm/slate-125m-english-rtrvr',
            'intfloat/multilingual-e5-large'
        ],
        'retrieval_methods': None,
        'foundation_models': [
            'meta-llama/llama-3-70b-instruct',
            'ibm/granite-13b-chat-v2'
        ],
        'max_number_of_rag_patterns': 5,
        'optimization_metrics': ['answer_correctness']
    }

Run rag_optimizer
-----------------

To schedule an AutoAI RAG experiment, call the ``run()`` method. This triggers a training and optimization process on watsonx.ai.
The ``run()`` method can be synchronous (``background_mode=False``) or asynchronous (``background_mode=True``).
If you don't want to wait for the run to finish, use the asynchronous version, which returns immediately with only the run details.

.. code-block:: python

    # Asynchronous: returns immediately with the run details
    run_details = rag_optimizer.run(
        input_data_references=[input_data_connection],
        test_data_references=[test_data_connection],
        results_reference=results_connection,
        background_mode=True
    )

    # OR

    # Synchronous: blocks until the run ends
    run_details = rag_optimizer.run(
        input_data_references=[input_data_connection],
        test_data_references=[test_data_connection],
        results_reference=results_connection,
        background_mode=False
    )

Get the run status and run details
----------------------------------

If you use the ``run()`` method asynchronously, you can monitor the run details and status using the following two methods:

.. code-block:: python

    status = rag_optimizer.get_run_status()
    print(status)
    'running'

    # OR
    'completed'

    run_details = rag_optimizer.get_run_details()
    print(run_details)

RAG Optimizer summary
---------------------

You can get a ranking of all computed patterns, sorted by a scoring metric supplied when configuring the optimizer (the ``optimization_metrics`` parameter).
The output is a ``pandas.DataFrame`` with pattern names, computation timestamps, machine learning metrics, and the number of enhancements implemented in each pattern.

.. code-block:: python

    rag_optimizer.summary()
    rag_optimizer.summary(scoring='answer_correctness')
    rag_optimizer.summary(scoring=['answer_correctness', 'context_correctness'])

    # Result:
    #               mean_answer_correctness  ...  ci_high_faithfulness
    # Pattern_Name
    # Pattern3                      0.79165  ...                0.5102
    # Pattern1                      0.72915  ...                0.4839
    # Pattern2                      0.64585  ...                0.8333
    # Pattern4                      0.64585  ...                0.5312

Get pattern details
-------------------

To see the pattern details, use the ``get_pattern_details()`` method.
If you leave ``pattern_name`` empty, the method returns the details of the best computed pattern.

.. code-block:: python

    pattern_params = rag_optimizer.get_pattern_details(pattern_name='Pattern3')
    print(pattern_params)
    {
        'composition_steps': [
            'chunking',
            'embeddings',
            'vector_store',
            'retrieval',
            'generation'
        ],
        'location': {
            'evaluation_results': '4r55b555-63a6-4cc9-3d00-3d2y762b4vg/Pattern3/evaluation_results.json',
            'indexing_notebook': '4r55b555-63a6-4cc9-3d00-3d2y762b4vg/Pattern3/indexing_notebook.ipynb',
            'inference_notebook': '4r55b555-63a6-4cc9-3d00-3d2y762b4vg/Pattern3/inference_notebook.ipynb'
        },
        'name': 'Pattern3',
        'settings': {
            'chunking': {
                'chunk_size': 512,
                'method': 'recursive'
            },
            'embeddings': {
                'model_id': 'ibm/slate-125m-english-rtrvr',
                'truncate_input_tokens': 512,
                'truncate_strategy': 'left'
            },
            'generation': {
                'context_template_text': '[document]: {document}\n',
                'model_id': 'meta-llama/llama-3-70b-instruct',
                'parameters': {
                    'max_new_tokens': 500
                },
                'prompt_template_text': '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\nIf a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don’t know the answer to a question, please don’t share false information.\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n{reference_documents}\n[conversation]: {question}. Answer with no more than 150 words. If you cannot base your answer on the given document, please state that you do not have an answer.<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>'},
            'retrieval': {
                'method': 'simple',
                'number_of_chunks': 5,
                'window_size': 0
            },
            'vector_store': {
                'distance_metric': 'cosine',
                'index_name': 'autoai_rag_20240807124701'
            }
        }
    }

Get pattern
-----------

Use the ``get_pattern()`` method to load a specific pattern.
If you leave ``pattern_name`` empty, the method returns the best computed pattern.

.. code-block:: python

    pattern = rag_optimizer.get_pattern(pattern_name='Pattern3')
    print(type(pattern))
    'ibm_watsonx_ai.foundation_models.extensions.rag.pattern.pattern.RAGPattern'

Get inference and indexing notebooks
------------------------------------

To download the inference notebook of a specific pattern from the service, use the ``get_inference_notebook()`` method.
If you leave ``pattern_name`` empty, the method downloads the notebook of the best computed pattern.

.. code-block:: python

    rag_optimizer.get_inference_notebook(pattern_name='Pattern3')

To download the indexing notebook of a specific pattern from the service, use the ``get_indexing_notebook()`` method.
If you leave ``pattern_name`` empty, the method downloads the notebook of the best computed pattern.

.. code-block:: python

    rag_optimizer.get_indexing_notebook(pattern_name='Pattern3')

Get logs
--------

To download the logs of an AutoAI RAG job, use the ``get_logs()`` method.

.. code-block:: python

    rag_optimizer.get_logs()

Get evaluation results
----------------------

To download the evaluation results of an AutoAI RAG job, use the ``get_evaluation_results()`` method.
If you leave ``pattern_name`` empty, the method downloads the evaluation results of the best computed pattern.

.. code-block:: python

    rag_optimizer.get_evaluation_results(pattern_name="Pattern1")
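
When ``run()`` is started with ``background_mode=True``, results such as the summary, notebooks, or evaluation results are usually worth fetching only after ``get_run_status()`` stops reporting ``'running'``. The following is a minimal sketch of a polling loop for that case. The helper name ``wait_for_run``, the ``'failed'`` and ``'canceled'`` states, and the default interval and timeout are assumptions made for illustration; they are not part of the ``ibm_watsonx_ai`` API, and only ``'running'`` and ``'completed'`` appear in the examples above.

```python
import time
from typing import Callable

# Assumed set of terminal states. Only 'completed' is shown in this document;
# 'failed' and 'canceled' are hypothetical and may differ in the real service.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_run(get_status: Callable[[], str],
                 poll_interval: float = 30.0,
                 timeout: float = 3600.0) -> str:
    """Call get_status() repeatedly until it returns a terminal state.

    Raises TimeoutError if no terminal state is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Run still in state {status!r} after {timeout} seconds"
            )
        # Sleep between checks to avoid sending requests in a tight loop.
        time.sleep(poll_interval)
```

With an optimizer from the earlier sections, the helper would be called as ``wait_for_run(rag_optimizer.get_run_status)``. Passing the bound method instead of the optimizer keeps the helper independent of the library and easy to try out with a stub.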