Retrieval-Augmented Generation (RAG) is a technique that combines retrieving information from a knowledge base (KB) and generating responses using a large language model (LLM).

LLMs have certain limitations, such as hallucinations, a training-data cutoff date, and reliance on publicly available information. RAG addresses these limitations by grounding the LLM's responses in a controlled knowledge domain, such as enterprise-specific content. Because the retrieved content is passed to the LLM as additional context, RAG also helps overcome the issue of outdated training data.
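The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Zammo's implementation: the word-overlap retriever, the prompt wording, and the names `retrieve`, `build_prompt`, `answer`, and `echo_llm` are all assumptions made for the example, and `echo_llm` is only a stand-in for a real LLM call.

```python
import re
from typing import Callable, List


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, knowledge_base: List[str], top_k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank passages by shared words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p)), p) for p in knowledge_base]
    # Highest overlap first; passages with no overlap are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for score, passage in scored[:top_k] if score > 0]


def build_prompt(query: str, passages: List[str]) -> str:
    """Ground the LLM by placing the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


def answer(query: str, knowledge_base: List[str], generate: Callable[[str], str]) -> str:
    """Retrieve relevant passages, then hand the grounded prompt to the LLM."""
    passages = retrieve(query, knowledge_base)
    return generate(build_prompt(query, passages))


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: returns the prompt it was given."""
    return prompt


kb = [
    "Our office opens at 9 am on weekdays.",
    "Refunds are processed within 5 business days.",
    "The cafeteria serves lunch at noon.",
]
grounded_prompt = answer("When are refunds processed?", kb, echo_llm)
```

In a real deployment the retriever would be a search service and `generate` would call an LLM; the point of the sketch is only that the model sees the retrieved passages alongside the question.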

The RAG architecture depends on an efficient knowledge base and search capabilities. A larger knowledge base can improve the chatbot’s accuracy but may also increase latency, because retrieving relevant information takes longer. Optimizing the size and structure of the knowledge base is essential for balancing answer quality and response time. Zammo’s built-in knowledge base feature includes easy-to-use settings that control how Azure AI Search indexes and searches your content.

Azure AI Search supports various query types, including two key methods: vector search and semantic search. Both improve the relevance and accuracy of search results by considering the meaning and context of the query.

Semantic Search

Semantic search focuses on meaning and context instead of relying on keywords. It analyzes the relationships between the words in the query and the context they appear in, which makes the returned results more relevant.

Vector Search

Vector search is more math-driven than intent-driven. It represents the content in the knowledge base as numerical vectors, represents the query the same way, and uses a mathematical similarity measure to determine which content vectors are closest to the query vector. This approach enables matching based on semantic or contextual similarity, even across different languages and formats (such as text or images).
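To make the "most similar vectors" idea concrete, here is a minimal sketch that ranks documents by cosine similarity, one common similarity measure. The three-dimensional vectors and document IDs are invented for illustration; in practice the vectors come from an embedding model and have hundreds or thousands of dimensions, and this section does not state which measure Azure AI Search is configured to use.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def vector_search(
    query_vector: List[float],
    index: Dict[str, List[float]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every document against the query and return the top_k matches."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-made vectors standing in for real embeddings.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "office-hours": [0.1, 0.9, 0.1],
    "lunch-menu": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # imagined vector for a refund question

results = vector_search(query_vector, index)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

This brute-force loop compares the query with every document, which is fine for a handful of entries; production search services typically use approximate nearest-neighbor indexes so that lookups stay fast as the knowledge base grows.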

You can enable vector search when setting up a new knowledge base:

<aside>

physEnableVectosSearch.png

</aside>

<aside> 💡

Hybrid search (vector + semantic) is generally recommended for the best search results.

</aside>
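As a rough illustration of how two search methods can be combined, the sketch below merges two ranked result lists with reciprocal rank fusion (RRF), a widely used fusion technique. This section does not say how Zammo or Azure AI Search combines results internally, so treat the choice of RRF, the constant `k = 60`, and the document IDs as assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Documents ranked well by several methods accumulate the highest totals.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists from two different search methods.
vector_ranking = ["refund-policy", "returns-faq", "office-hours"]
semantic_ranking = ["returns-faq", "shipping-info", "refund-policy"]

fused = reciprocal_rank_fusion([vector_ranking, semantic_ranking])
print(fused)
```

A document that appears near the top of both lists ends up ahead of one that only a single method ranked highly, which is the intuition behind recommending a hybrid setup.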