Software & AI Entrepreneur • Engineering Leader • Software Architect • Follow me for actionable guidance on AI strategy
Simple RAG-based chatbot example to query your corporate or private knowledge base.

Start by storing the knowledge from your documents in a format that can be queried. Use an embedding model to embed the knowledge:
↳ Split the Text: Break the entire knowledge base into chunks. Each chunk represents a piece of context that can be queried. The data can come from various sources, such as Confluence documentation and PDF reports.
↳ Embed the Chunks: Use an embedding model to convert each chunk into a vector embedding.
↳ Store the Embeddings: Save all the vector embeddings in a vector database.
↳ Save Text with Embeddings: Keep the text for each embedding together with a pointer to the embedding.

Next, construct the answer to a query:
↳ Embed the Query: Use the same embedding model to convert the query into a vector embedding.
↳ Query the Database: Use this vector embedding to search the vector database. Decide how many vectors to retrieve; this determines how much context is used to answer the query.
↳ ANN Search: The vector database performs an Approximate Nearest Neighbor (ANN) search and returns the most similar context vectors.
↳ Map Vectors to Text: Map the returned vector embeddings to their corresponding text chunks.
↳ Generate Answer: Pass the question and the retrieved context chunks to the LLM (Large Language Model). Instruct the LLM to use only the provided context to answer the question, and to say so instead of making up an answer if the context lacks the needed information.

You may want to host it somewhere to make it available on the web:
- Build a web interface with a text input box for the chat.
- Run the submitted question through the pipeline.
- Display the generated answer.

This is how many chatbots based on internal knowledge sources are built today.
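The indexing and query steps above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not a production setup: `embed` is a bag-of-words stand-in for a real embedding model, `ToyVectorStore` is a hypothetical in-memory class that does exact brute-force cosine search where a real vector database would do ANN search, and `build_prompt` only assembles the text you would send to an LLM (no model is called). The sample documents are made up.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: a normalised sparse bag-of-words vector."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


class ToyVectorStore:
    """Hypothetical in-memory store; exact search stands in for ANN search."""

    def __init__(self):
        self.vectors = []  # one embedding per chunk
        self.texts = []    # chunk text, aligned by index with self.vectors

    def add(self, chunk):
        self.vectors.append(embed(chunk))
        self.texts.append(chunk)

    def search(self, query, k=2):
        query_vec = embed(query)  # same "model" as used for the chunks
        scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(self.vectors)]
        scored.sort(reverse=True)
        # Map the best-scoring vectors back to their text chunks.
        return [self.texts[i] for _, i in scored[:k]]


def build_prompt(question, chunks):
    """Assemble the instruction, retrieved context, and question for the LLM."""
    context = "\n---\n".join(chunks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Indexing: split, embed, and store (made-up sample documents).
docs = [
    "Vacation requests must be submitted in the HR portal two weeks in advance.",
    "The VPN client is installed from the internal software center.",
]
store = ToyVectorStore()
for doc in docs:
    for chunk in chunk_text(doc):
        store.add(chunk)

# Querying: embed the question, retrieve the top chunk, build the prompt.
question = "How do I submit vacation requests?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real deployment you would replace `embed` with calls to an actual embedding model, replace `ToyVectorStore` with a vector database, and send `prompt` to the LLM of your choice; the value of `k` controls how much context reaches the model.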
Evgeny Krapivin somewhere you have to insert a component that "preserves privacy and access rights", probably more than one. Not everybody is allowed to see all documents (standard access control management, but hard to achieve with RAG), and your code base will be a treasure trove of passwords and authentication tokens. Of course, I know that you should never put passwords in your code, in Confluence, or in Jira. But hey, there will be plenty of them.
Good to know!
Do you know which vector databases are good for getting started and have support with Langchain or LlamaIndex?
Brilliant illustration of RAG
This graphic is gorgeous my man! So well put together and it flows perfectly!
Love to see how you've explained it all and especially the visual Evgeny Krapivin...
Thanks for sharing this detailed guide, Evgeny. A RAG-based chatbot is a practical way to leverage corporate knowledge.
Breaking down knowledge into bite-sized, searchable chunks is a smart move for quick, accurate info retrieval... Evgeny Krapivin
This is so clear!
Founder & AI Specialist, Voice on AI Ethics & Inclusive leadership
Evgeny Krapivin RAG is easiest to get started with vectors but hardest to tune with text! And just like with data lakes, it's important to curate the data that is indexed. Great visualisation in your post!