Query Mate: Granular Control with Flexible LLM Selection

Query Mate provides intuitive controls for fine-tuning your RAG experience, combined with an unparalleled choice of Large Language Models (LLMs). This allows you to balance performance, creativity, and data privacy according to your specific needs.

RAG/LLM Parameters

These parameters allow you to control the behavior of the LLM and the RAG process:

  • Temperature: Controls the randomness of the LLM's output. Higher values produce more creative and diverse responses; lower values produce more deterministic, focused responses.
  • Top P: Filters out low-probability tokens, ensuring more coherent and less random responses, which is especially useful for factual queries.
  • Top K: Limits the LLM's choices to the top 'k' most likely next words, balancing creativity with relevance.
  • Number of chunks: Determines how many relevant data chunks from your knowledge base are fed to the LLM when generating a response. More chunks can mean more comprehensive answers, but also longer processing time.
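
The snippet below is a minimal Python sketch of what these parameters control at the model layer, using the public Ollama REST API (assumed to be running locally on its default port). The `retrieve_chunks` helper and the chunk count are hypothetical placeholders standing in for Query Mate's retrieval pipeline, not its actual API.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def retrieve_chunks(query: str, num_chunks: int) -> list[str]:
    """Hypothetical placeholder for the RAG retrieval step.

    In a real deployment this would query the knowledge base (e.g. a vector
    store) and return the `num_chunks` most relevant text chunks.
    """
    return [f"[chunk {i + 1} relevant to: {query}]" for i in range(num_chunks)]


def answer(query: str, *, temperature: float = 0.2, top_p: float = 0.9,
           top_k: int = 40, num_chunks: int = 4) -> str:
    # Feed the retrieved chunks to the LLM as context, then ask the question.
    context = "\n\n".join(retrieve_chunks(query, num_chunks))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    resp = requests.post(OLLAMA_URL, json={
        "model": "llama3.2:3b",   # example private model, served by Ollama
        "prompt": prompt,
        "stream": False,
        # Sampling controls: lower temperature plus top_p / top_k limits
        # make the answer more deterministic and focused.
        "options": {"temperature": temperature, "top_p": top_p, "top_k": top_k},
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"]


print(answer("What does the Q3 report say about churn?"))
```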

Flexible LLM Selection

Query Mate offers access to hundreds of LLMs, giving you the power to choose based on your data privacy concerns and performance needs. By default, private LLMs are preferred, ensuring your data never leaves your secure environment. However, you are in complete control:

  • Private LLMs: Run locally on your servers, ensuring maximum data confidentiality. Ideal for highly sensitive information.
  • Public LLMs: Available for less sensitive queries, or when you want to leverage the advanced capabilities of hosted models such as Claude, ChatGPT, or Gemini.
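
For illustration, here is a hypothetical routing sketch in Python showing how such a private-versus-public decision could look in code. The model names and the `select_model` helper are examples only, not Query Mate's internal logic or API.

```python
# Hypothetical routing sketch: direct queries to a private or public model
# based on how the data is classified. Model identifiers are illustrative.
PRIVATE_MODEL = "llama3.2:3b"   # runs locally via Ollama
PUBLIC_MODEL = "claude-sonnet"  # hosted by an external provider


def select_model(contains_sensitive_data: bool, prefer_private: bool = True) -> str:
    """Return the model to use for a query.

    Private models are preferred by default; a public model is only chosen
    when the data is explicitly marked non-sensitive and the private default
    is overridden.
    """
    if contains_sensitive_data or prefer_private:
        return PRIVATE_MODEL
    return PUBLIC_MODEL


print(select_model(contains_sensitive_data=True))                           # -> llama3.2:3b
print(select_model(contains_sensitive_data=False, prefer_private=False))    # -> claude-sonnet
```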

You decide what counts as private data and how much you trust external providers. Here's an example of a private model available:

Llama3.2 (1GB) - 3.21B Parameters

Version: llama3.2:3b
Provider: Ollama
Hosts: Ollama 87 nvidia (Private), Ollama fluxedge (Private)
Capabilities: Streaming
Context size: 128K tokens
Number of parameters: 3.21 billion
Quantization: Q4_K_M

This is just one example of the many private models you can deploy and manage with Query Mate.
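
As an illustration of the model's Streaming capability, here is a minimal sketch that queries llama3.2:3b through Ollama's public chat endpoint; it assumes Ollama is reachable on its default local port and does not show Query Mate's own interface.

```python
import json
import requests

# Stream a response from llama3.2:3b via the Ollama chat API (default port).
with requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.2:3b",
        "messages": [{"role": "user", "content": "Summarise our data-retention policy."}],
        "stream": True,  # use the model's Streaming capability
    },
    stream=True,
    timeout=120,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        part = json.loads(line)
        if part.get("done"):
            break
        # Each streamed chunk carries a fragment of the assistant's reply.
        print(part["message"]["content"], end="", flush=True)
print()
```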

This flexibility ensures that Query Mate can adapt to any enterprise security policy while providing powerful AI-driven insights.