NOTES

SETUP

The Makefile does something not quite right: requirements.in ends up containing unstructured[md]jupyter, which makes the later uv pip compile command fail with "Distribution not found". Strip that entry back to just unstructured and the rest of the commands will work.
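Concretely, the fix is a one-line edit to requirements.in; the surrounding contents of that file are an assumption here:

```
# requirements.in (assumed layout)
unstructured[md]jupyter    # merged entry -> `uv pip compile` fails with "Distribution not found"

# replace it with the plain package and the rest of the commands work
unstructured
```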

  • Prefer to use ollama; it is FOSS

Setting up ollama

  • Use wget https://github.com/ollama/ollama/releases/download/v0.3.4/ollama-linux-amd64 to get ollama
  • Use chmod u+x ollama (put in bin folder)
  • Use ./ollama serve (consider running it in the background). This starts the ollama API frontend, which serves requests over local HTTP (see the sketch after this list)
  • In a new terminal use ./ollama pull llama3.1:8b to get the llama3.1 model locally
  • ./ollama pull nomic-embed-text to get nomic text embedding locally
  • pip install unstructured[md]
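Once ollama serve is running, it listens on local HTTP (port 11434 by default), so anything that can make an HTTP request can talk to it. A minimal sketch, assuming the default host/port and the llama3.1:8b model pulled above:

```python
import json
import urllib.request

payload = {
    "model": "llama3.1:8b",   # pulled with `./ollama pull llama3.1:8b`
    "prompt": "Summarize what retrieval-augmented generation is in one sentence.",
    "stream": False,          # ask for a single JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```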

Langchain

Chunk size is defined empirically, as shown in the RAG notebook 3.0. A good rule of thumb: if you want granular access to information, use strategically small chunk sizes, then experiment across the metrics that matter to you, like relevancy, retrieving certain types of information, etc.
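A minimal sketch of that experiment using LangChain's RecursiveCharacterTextSplitter; the splitter choice, file path, and sizes are assumptions, not necessarily what notebook 3.0 uses:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text = open("docs/some_page.md").read()          # hypothetical input document

for chunk_size in (200, 500, 1000):              # strategically small -> larger
    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=50)
    chunks = splitter.split_text(text)
    print(f"chunk_size={chunk_size}: {len(chunks)} chunks")

# Score each setting against the metrics you care about
# (relevancy of retrieved chunks, recall of specific kinds of information, ...).
```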

Consider using llama-index

embeddings

  • vectorization of existing content as a "pre-analyzer" for the LLM proper
  • helps identify which files/data are most likely relevant to a given query
  • langchain helps build this and provides reasonable results
    • paths to files
    • content of files
    • question: is there a simple interface to get sections/anchor links from MD?
    • question: how can langchain load a pre-built database? (see the sketch after this list)
      • ultimately we'd like to have a chatbot that uses the docs site content as part of the vector store
      • it should build as part of ci/cd by cloning the docs repo, building the db, and hosting it on a server
      • the chatbot would then use that db for similarity search
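A sketch of both halves (build the store in CI/CD, reload it in the chatbot), assuming a Chroma backend (needs chromadb installed), the nomic-embed-text embeddings pulled above, unstructured[md] for the loader, and hypothetical paths:

```python
from langchain_community.document_loaders import DirectoryLoader, UnstructuredMarkdownLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = OllamaEmbeddings(model="nomic-embed-text")   # served by the local ollama instance

# Build step (e.g. in CI/CD after cloning the docs repo):
docs = DirectoryLoader("docs/", glob="**/*.md", loader_cls=UnstructuredMarkdownLoader).load()
Chroma.from_documents(docs, embeddings, persist_directory="./docs_db")

# Load step (the chatbot side): point Chroma at the pre-built directory.
db = Chroma(persist_directory="./docs_db", embedding_function=embeddings)
for hit in db.similarity_search("How do I configure the pipeline?", k=4):
    print(hit.metadata.get("source"), hit.page_content[:80])
```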

doc parsing

temperature

Closer to 0 is more "precise"; closer to 1 is more "creative".
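A quick sketch of setting temperature against the local model, assuming the langchain-ollama integration package is installed (model name and values are arbitrary):

```python
from langchain_ollama import ChatOllama

precise = ChatOllama(model="llama3.1:8b", temperature=0.0)    # terse, repeatable answers
creative = ChatOllama(model="llama3.1:8b", temperature=0.9)   # more varied, exploratory wording

print(precise.invoke("Name the capital of France.").content)
print(creative.invoke("Write a one-line slogan for a docs chatbot.").content)
```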

Tool Calling

  • This is a way of supplying the AI model with an API it can invoke: a message is supplied to the model, and it can use the supplied tool (function, DB query, etc.) as an additional source of information (see the sketch after this list).
  • This is an alternative to supplying it with pre-parsed documents
  • The LLM response may indicate or suggest a tool call be used, and provide arguments to use with that tool call.
  • Check 4.0 notebook
  • 8b model is less robust than 70b
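A minimal sketch of the flow, again assuming the langchain-ollama package; the get_page_url tool and its behavior are hypothetical:

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama

@tool
def get_page_url(topic: str) -> str:
    """Return the docs URL for a given topic."""
    return f"https://docs.example.com/{topic}"        # placeholder lookup

llm = ChatOllama(model="llama3.1:8b").bind_tools([get_page_url])

# The model does not execute the tool itself; its reply suggests a call plus arguments.
response = llm.invoke("Where can I read about authentication?")
for call in response.tool_calls:
    print(call["name"], call["args"])                 # e.g. get_page_url {'topic': 'authentication'}
    print(get_page_url.invoke(call["args"]))          # we run the tool and could feed the result back
```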

Chat Agent

  • Investigate 4.3
  • 5.0 brings everything together