Lab 6. Using AnythingLLM for a Local RAG
Configuration and Sanity Check
Open up AnythingLLM, and you should see something like the following:
If you see this, AnythingLLM is installed correctly and we can continue with configuration. If not, please find a workshop TA or raise your hand, and we'll be there to help you ASAP.
Next, as a sanity check, run the following command to confirm you have the granite3.1-dense model downloaded in ollama. This may take a bit, but we should have a way to copy it directly to your laptop.
ollama pull granite3.1-dense:8b
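If you want to double check that the download finished, you can list the models Ollama has stored locally; granite3.1-dense:8b should show up in the output.
ollama list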
If you didn't know, the supported languages with granite3.1-dense now include:
- English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese (Simplified)
And the capabilities also include (we'll try one out from the terminal right after this list):
- Summarization
- Text classification
- Text extraction
- Question-answering
- Retrieval Augmented Generation (RAG)
- Code related tasks
- Function-calling tasks
- Multilingual dialog use cases
- Long-context tasks including long document/meeting summarization, long document QA, etc.
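If you want to try one of these capabilities before touching AnythingLLM, you can send a one-off prompt to the model straight from the terminal. This assumes the pull above finished; the prompt itself is just an example, so feel free to swap in your own.
ollama run granite3.1-dense:8b "Summarize what Retrieval Augmented Generation is in two sentences."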
Next, click on the wrench icon and open up the settings. For now we are going to configure the global settings for ollama, but you may want to change them in the future.
Click on the "LLM" section, and select Ollama as the LLM Provider. Also select the granite3-dense:8b
model. (You should be able to
see all the models you have access to through ollama
there.)
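If no models show up in that dropdown, it usually means AnythingLLM can't reach Ollama. Ollama serves a local HTTP API, by default on port 11434, which is typically the base URL AnythingLLM points at. Assuming you haven't changed that default, you can check that Ollama is up and see which models it's serving with:
curl http://localhost:11434/api/tags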
Click the "Back to workspaces" button where the wrench was. And Click "New Workspace."
Name it something like "learning llm" or the name of the event we are at right now, so you know it's the workspace where you are learning how to use this LLM.
Now we can test our connection through AnythingLLM! I like the "Who is Batman?" question as a sanity check that the connection works and that the model knows something.
Now, you may notice that the answer is slightly different from the screenshot above. That's expected and nothing to worry about. If you have more questions about it, raise your hand and one of the helpers would love to talk to you about it.
Congratulations! You now have AnythingLLM running, configured to work with granite3.1-dense and ollama!
Creating your own local RAG
Now that you have everything set up, let's build our own RAG. You need a document of some sort to ask questions against. Let's start with something fun. As of right now, our Granite model doesn't know about the US Federal Budget in 2024, so let's ask it a question about it to verify.
Create a new workspace, and call it whatever you want:
Now that you have a new workspace, ask it a question like:
What was the US federal budget for 2024?
It should come back with something like the following; yours may be different, but the gist is there.
Not great, right? Well, now we need to give it a way to look up this data. Luckily, we have a backed-up copy of the budget PDF here. Go ahead and save it to your local machine, and be ready to grab it.
Now spin up a new workspace (yes, please a new workspace; it seems that sometimes AnythingLLM has issues with adding things, so a clean environment is always easier to teach in) and call it something else.
Click on the "upload a document" to get the pdf added.
Next we need to add it to the workspace.
Next click the upload or drag and drop and put the pdf in there, and then the arrow to move it to the workspace. Click Save and Embed.
You have now added the pdf to the workspace.
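A quick note on what "Save and Embed" is doing: AnythingLLM splits the PDF into chunks of text and turns each chunk into an embedding (a vector) so it can find the relevant chunks when you ask a question. AnythingLLM uses its own built-in embedder by default, but if you're curious what an embedding call looks like, Ollama exposes one too; this is purely illustrative and not what AnythingLLM is doing under the hood:
curl http://localhost:11434/api/embeddings -d '{"model": "granite3.1-dense:8b", "prompt": "What was the US federal budget for 2024?"}'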
Now, when the chat comes back up, ask the same question, and you should see some new answers!
It won't be exactly what we are looking for, but it's enough to see that the Granite model can leverage the local RAG setup and, in turn, look things up for you. You'll need some prompt engineering to get exactly what you want, but this is just the start of leveraging the AI!
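As a starting point for that prompt engineering, being specific about which document to use and what shape the answer should take tends to help. The wording below is just an illustration, not a magic formula:
Using the uploaded 2024 US federal budget document, what was the total federal budget for fiscal year 2024? Answer in one or two sentences and tell me which part of the document you used.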