Answers with sources. Knowledge under your control.
Upload your documents (PDF, DOCX, Excel, TXT, Markdown), configure your search engine and get precise answers with exact page citations. Secure, auditable and honest — when it doesn't know, it says so.
No credit card · Free plan forever · Data isolated per tenant
How it works
Three steps to turn your documents into intelligent answers
Upload your documents
PDF, DOCX, Excel, TXT or Markdown. Drag and drop. Automatic background processing with progress tracking.
Configure your engine
Adjust chunk size, top_k, BM25, reranker, temperature and more. Full control over the search pipeline, no code required.
Ask and get answers
Answers with exact page citations. Calibrated confidence. Instant FAQs with multimedia. Export to PDF.
See it in action
Create your first Knowledge Base in under 2 minutes
Everything you need to manage knowledge with AI
Features no other platform offers together
Intelligent hybrid search
Combines semantic search (vectors) + keyword search (BM25) + cross-encoder reranking. Precision the competition can't match.
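As a rough illustration of how a vector score and a BM25 score can be combined, here is a minimal sketch using weighted score fusion. The function names, the min-max normalization, and the default `bm25_weight` are assumptions made for this example, not Citai's actual implementation; the cross-encoder reranking stage would run on the fused candidates and is left out.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so vector and BM25 scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {chunk_id: 1.0 for chunk_id in scores}
    return {chunk_id: (s - lo) / (hi - lo) for chunk_id, s in scores.items()}


def hybrid_rank(
    vector_scores: Dict[str, float],
    bm25_scores: Dict[str, float],
    bm25_weight: float = 0.3,  # hypothetical default, tune per knowledge base
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse both score sets with a weighted sum and return the top_k chunks."""
    v, b = min_max(vector_scores), min_max(bm25_scores)
    fused = {
        chunk_id: (1 - bm25_weight) * v.get(chunk_id, 0.0)
        + bm25_weight * b.get(chunk_id, 0.0)
        for chunk_id in set(v) | set(b)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)[:top_k]
```

Raising `bm25_weight` favors exact keyword matches (product codes, article numbers); lowering it favors paraphrases and conceptual matches.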
Exact page citations
Every answer cites the exact page from the original document. Your team can verify the source in 10 seconds.
Auto calculations & charts
Ask a question that requires calculations, data comparison, or charts, and the system automatically generates the answer with code. You just ask and Citai solves it.
Instant FAQs at zero LLM cost
Predefined answers via embedding matching. Zero latency, zero cost, deterministic. Import from CSV/Excel or create manually. With images, videos and links.
Calibrated honesty
Confidence scoring with cross-encoder. When confidence is low, the system says "I don't know" instead of making things up. Configurable per KB.
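The abstention rule can be pictured as a simple threshold on the reranker score. The sketch below assumes the cross-encoder returns raw logits, that a sigmoid is an acceptable way to map them to a 0-1 confidence, and that the best-scoring chunk drives the decision; all names and the default threshold are hypothetical.

```python
import math
from typing import List, Tuple

ABSTAIN_MESSAGE = "I don't know."


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def answer_or_abstain(
    draft_answer: str,
    rerank_logits: List[float],
    min_confidence: float = 0.5,  # hypothetical per-KB setting
) -> Tuple[str, float]:
    """Return the draft answer only if the best chunk is confident enough."""
    if not rerank_logits:
        return ABSTAIN_MESSAGE, 0.0
    confidence = sigmoid(max(rerank_logits))
    if confidence < min_confidence:
        return ABSTAIN_MESSAGE, confidence
    return draft_answer, confidence
```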
Smart semantic cache
Repeated or similar questions are answered from cache, saving up to 60% in LLM costs. Embedding-based cache (not exact text), configurable per KB with TTL and automatic invalidation.
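To show what "embedding-based, not exact text" means in practice, here is a minimal in-memory sketch of a semantic cache with TTL. The class, the similarity threshold, and the linear scan are assumptions for illustration; a production cache would typically use a vector index and real embeddings.

```python
import math
import time
from dataclasses import dataclass
from typing import List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Entry:
    embedding: Sequence[float]
    answer: str
    expires_at: float


class SemanticCache:
    def __init__(self, threshold: float = 0.92, ttl_seconds: float = 3600.0):
        self.threshold = threshold      # hypothetical similarity cut-off
        self.ttl_seconds = ttl_seconds  # hypothetical time to live
        self._entries: List[Entry] = []

    def put(self, embedding: Sequence[float], answer: str,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._entries.append(Entry(embedding, answer, now + self.ttl_seconds))

    def get(self, embedding: Sequence[float],
            now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        # Drop expired entries, then look for the closest remaining question.
        self._entries = [e for e in self._entries if e.expires_at > now]
        best = max(self._entries,
                   key=lambda e: cosine(e.embedding, embedding), default=None)
        if best is not None and cosine(best.embedding, embedding) >= self.threshold:
            return best.answer
        return None  # cache miss: the caller falls through to the LLM

    def invalidate(self) -> None:
        """Clear everything, e.g. after a document is reprocessed."""
        self._entries.clear()
```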
Contextual Retrieval
An LLM enriches each chunk with full document context before indexing, reducing retrieval failures by 50-67%.
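The idea can be sketched in a few lines: before indexing, a short context description is generated for each chunk and prepended to it, so both the embedding and the BM25 index see the extra information. In this sketch the LLM call is replaced by a stand-in function, and every name is hypothetical.

```python
from typing import Callable


def contextualize(
    chunk: str,
    document: str,
    generate_context: Callable[[str, str], str],
) -> str:
    """Return the text to index: generated context followed by the raw chunk."""
    context = generate_context(document, chunk).strip()
    return f"{context}\n\n{chunk}" if context else chunk


def fake_context_llm(document: str, chunk: str) -> str:
    # Stand-in for a real LLM call that reads the whole document.
    return ("This fragment belongs to the Starter plan pricing section. "
            "It describes the monthly query limits included.")


chunk = "The plan includes 500 monthly queries and priority support."
indexed_text = contextualize(chunk, "(full document text)", fake_context_llm)
```

Without the prepended context, a query such as "How many queries does Starter include?" has nothing in the chunk that ties it to the Starter plan.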
Advanced RAG Playground
Three testing modes (individual, batch, A/B), full diagnostics, LLM preview, smart suggestions, and an interactive guide for 12 parameters.
Configurable RAG per KB
chunk_size, top_k, BM25 weight, reranker, temperature — all configurable per knowledge base, no code required.
Admin panel
Complete management of tenants, plans, users, global settings, and maintenance mode. Full control from a single place.
Security & privacy
Data isolated per tenant with encryption. Each organization operates in a completely separate environment. On-premise deployment available for Enterprise.
Embeddable widget
Add an intelligent chatbot to your website with a single line of code. Shadow DOM, SSE streaming, quick replies, business hours, and industry templates.
Real-time Webhooks
Connect Citai to your CRM, Slack, or Zapier. Receive automatic event notifications (messages, low confidence, escalations) with retries and HMAC-SHA256 signing.
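On the receiving side, an HMAC-SHA256 signature is checked by recomputing it over the raw request body with the shared secret. The sketch below uses only the Python standard library; that the signature is sent hex-encoded is an assumption for this example, and the header that carries it is not specified here.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid timing attacks."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since even a whitespace change alters the signature.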
Live Human Handoff
A human agent can take over the conversation in real-time. Waiting queue, bidirectional chat, CSAT ratings, and automatic timeout on inactivity.
Multi-language & themes
Interface in 6 languages (Spanish, English, Portuguese, German, French, Italian) with dynamic switching. Dark, light, or auto mode synced with your account.
Feedback Analytics
Dashboard with satisfaction by KB, dissatisfaction reasons, and paginated feedback table with filters. Identify where to improve your knowledge bases.
Knowledge gap detection
Automatic capture of unanswered queries with embedding-based deduplication. RAG quality dashboard, time series, and gap table with bulk resolution.
KB Health Score
Automatic diagnostics for each knowledge base with a 0-100 score, breakdown by 5 components, and actionable recommendations to improve quality.
Hallucination detection
An NLI model verifies that every claim in the answer is supported by the retrieved chunks. 0-100 score, in-chat badge, automatic webhook alerts.
Per-KB Regression Testing
Golden Set with smart diagnostics: each analyzed item self-diagnoses and suggests actions (open Playground, edit, adjust config). 5 metrics (faithfulness, answer relevancy, context precision, context recall, citation accuracy). Automatic regression detection vs baseline. Email alert when a metric gets worse.
Smart Routing
Citai automatically selects the most relevant knowledge bases for each question using embedding similarity with per-KB centroids.
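Centroid-based routing can be sketched as follows: each knowledge base is summarized by the mean of its chunk embeddings, and a query goes to the knowledge bases whose centroid is most similar. The function names, `max_kbs`, and `min_similarity` are assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Mean of a non-empty list of equal-length embeddings."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(
    query_embedding: Sequence[float],
    kb_centroids: Dict[str, Sequence[float]],
    max_kbs: int = 2,             # hypothetical limit
    min_similarity: float = 0.3,  # hypothetical cut-off
) -> List[str]:
    """Return the names of the most relevant knowledge bases, best first."""
    scored = sorted(
        ((cosine(query_embedding, c), name) for name, c in kb_centroids.items()),
        reverse=True,
    )
    return [name for score, name in scored if score >= min_similarity][:max_kbs]
```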
Auto-FAQ from knowledge gaps
Citai automatically generates FAQs from unanswered questions. The LLM searches your KB and proposes answers you approve with a single click. Close knowledge gaps effortlessly.
Interactive Knowledge Graph
Visualize your RAG pipeline as a node graph. The query sits at the center with documents around it, and each connection shows the relevance score. Diagnose clusters, detect topic gaps, and visually understand why the system chose each result.
Conversational Intake Filters
Personalize the widget experience with onboarding questions. Collect user preferences (passengers, difficulty, budget) injected as context in all responses. Supports conditional logic and real-time editable filters.
Real-time external APIs
Connect the agent with external APIs for real-time data. Ideal for tourism: weather, availability, alerts. The agent automatically decides when to invoke a tool and combines results with your KB knowledge. Predefined templates to get started in minutes.
Visual RAG: document images
Automatically extracts images from PDFs and DOCX (screenshots, diagrams, tables) and displays them in chat responses alongside relevant text. Image Manager to manage image-chunk associations. Carousel for multiple images.
AI Email Assistant
Generate professional email replies using your knowledge base. Smart pre-analysis, 4 tones, embedded images on copy. Paste the email, select the KB, and copy the response directly to Gmail or Outlook.
Email Threading
Chain incoming emails and responses into conversation threads. Citai uses the full chain as context to generate increasingly precise and consistent responses. Find a thread in history and continue the conversation with one click.
Custom Response Style
Upload 3-8 real examples of how your team replies to emails (auto-anonymized with regex + LLM) and Citai mimics that style: opening phrases, length, formality. Content comes from your KB; the style comes from your team. Global toggle and per-response override.
Email Templates
Pre-built templates like 'Order confirmation', 'Apology for delay' or 'Request for info'. Pick one from the dropdown and instructions are automatically injected into the LLM, giving structure to the response without writing instructions from scratch. Full CRUD so each tenant can create their own templates.
Every answer has a source. Verifiable in 10 seconds.
Don't blindly trust AI. Citai cites the exact page from the original document so your team can verify any answer.
- Exact page number for every cited source
- Multiple sources per answer with relevance score
- Export with included sources for auditing
Multimedia FAQs: instant answers without consuming tokens.
Frequently asked questions are answered via embedding matching, bypassing the LLM. Deterministic response, zero latency, zero cost.
- Bulk import FAQs from CSV or Excel in seconds
- Images, videos and links in every FAQ answer
- Smart matching: detects paraphrases and variations
Our business hours are Monday to Friday, 9am to 6pm.
Your data always protected and under your control.
Each tenant operates in an isolated environment. Your data is never mixed with other clients. For Enterprise, on-premise deployment with compiled containers.
- Complete data isolation between tenants
- Encryption in transit and at rest
- On-premise deployment available for Enterprise plan
Optimize your system with real data.
The Playground shows each stage of the RAG pipeline to diagnose issues. Feedback Analytics tells you what your users think. Together, they give you the control to continuously improve.
- Playground: test queries and see results from each pipeline stage
- Feedback: satisfaction by KB, thumbs-down reasons, filterable table
- Global settings, plan management, and maintenance mode
Fine-tune your RAG with surgical precision.
Three testing modes — individual, batch, and A/B comparison — with full diagnostics: language detection, cache, content rules, health score, smart routing, and per-stage latencies.
- LLM Preview: see exactly what the system would answer before going to chat
- Smart suggestions: each tip tells you which parameter to change and applies it in one click
- Batch testing: up to 10 queries with aggregate confidence statistics
- A/B comparison: run the same query with two different configurations side by side
- Diagnostics: language, cache hit, content rules, KB health, and smart routing
- Interactive guide for 12 parameters with explanations of when to increase or decrease each
Detect knowledge gaps before your users do.
Complete dashboard with usage metrics, RAG quality, knowledge gaps, automatic topic classification, estimated costs per provider, and CSV/JSON export. Everything to continuously optimize your KB.
- 7 key metrics: queries, conversations, confidence, gaps, resolution rate, FAQ savings, and cache hit rate
- Smart auto-tagging: each conversation is automatically classified by topic (billing, setup, bug, etc.)
- Top 10 frequent questions, LLM provider costs, and estimated FAQ savings
- CSV/JSON export, gaps with KB filter, bulk resolution, and interactive charts
An assistant with your brand's personality
Each knowledge base can have its own identity. Define how the agent speaks, what tone it uses, and what topics it covers. Core system rules (sources, format) remain untouched.
- Name, tone and behavior configurable per KB
- The agent only answers about the topics you define
- Same RAG engine, different personality for each use case
- Pre-built industry templates: configure your widget in one click
"I'm Sofi from ZapasMax. I'm friendly, casual, and only answer about sneakers and orders. If asked about politics, I politely redirect."
"I'm the legal assistant of Martinez & Associates. I use formal language, cite articles precisely, and clarify that my answers don't constitute legal advice."
"I'm the academic assistant of the Engineering Faculty. I explain clearly, use practical examples, and guide students to official resources."
Tell it how you want it to talk and AI writes the prompt for you.
No prompt engineering needed. Chat with our assistant, describe the tone, limits and style you want, and it generates the perfect system prompt for your agent.
- Describe in plain language how you want your assistant to behave
- AI generates the complete prompt ready to apply with one click
- Refine and adjust by chatting, no technical config needed
- Automatically includes tone rules, topic limits and response style
You're part of the support team. You know the platform inside out. Never say "according to the documentation" or phrases that reveal you consult documents...
A chatbot on your website in 60 seconds
Paste one line of code and your visitors can chat with your documents. No iframe, no dependencies, with Shadow DOM for total style isolation.
- Real-time SSE streaming, pre-chat forms and persistent sessions
- Human escalation: manual or automatic when confidence is low
- Quick replies + conversation starters as clickable buttons
- Configurable business hours with out-of-hours message
- Multilingual welcome: detects browser language (ES/EN/PT/DE/FR/IT)
- Industry templates: E-commerce, SaaS, Education, Health
Your knowledge bases diagnose themselves.
Each KB has a Health Score from 0 to 100 that evaluates 5 key factors. You know exactly what to improve without guessing.
- 5 components: Documents, FAQ, RAG Quality, Gaps and Freshness
- Color badge on each card (green, yellow, red)
- Automatic actionable recommendations for low-scoring components
- Smart Routing: queries are routed to the most relevant KB automatically
Chunks that understand their place in the document
Before indexing, an LLM analyzes each fragment and generates context about where it comes from and what concepts it covers. The result: more precise embeddings, richer BM25, and significantly better answers.
- 50-67% reduction in retrieval failures (Anthropic benchmarks)
- Especially effective for tables, lists, and technical documents
- Enabled per KB with a toggle, available on Starter and Pro plans
"The plan includes 500 monthly queries and priority support."
The chunk loses reference to which plan it refers to
This fragment belongs to the Starter plan pricing section. It describes the monthly query limits included.
"The plan includes 500 monthly queries and priority support."
External APIs: the agent queries live data.
Connect the agent with external services so it combines knowledge from your documents with real-time data. Ideal for tourism, logistics, or any domain where information changes constantly.
- The agent automatically decides when it needs external data
- Predefined templates: weather, currencies, webhooks — works in minutes
- Smart per-tool caching avoids repeated API calls
- Works in web chat and embeddable widget with no extra setup
Connect your APIs instantly
When the user enters a code or identifier, the system automatically queries your external API and combines the data with your document knowledge. No waiting, no LLM dependency.
- Two modes: LLM decides or deterministic auto-injection
- Visual mapping of widget filters to API parameters
- Compatible with any REST API (CRM, ERP, ticketing)
- If the API fails, the agent keeps answering from the KB without interruption
Your documents respond with images.
Automatically extracts screenshots, diagrams and tables from PDFs and DOCX. When a user asks a question, the response includes the exact image from the original document alongside the explanatory text.
- Automatic image extraction when uploading PDF and DOCX documents
- Smart filtering: ignores logos, icons and solid backgrounds
- Image Manager to manually manage image-text associations
- Interactive carousel when multiple relevant images are found
Screenshot automatically extracted from PDF (Figure 21)
Reply to emails with your knowledge base.
Paste a client email, select your KB, and generate a professional reply grounded in your documents. Smart pre-analysis detects topics and verifies coverage before generating.
- Pre-analysis: extracts topics from the email and verifies KB coverage
- 4 tones (professional, friendly, formal, concise) + auto language detection
- Source images embedded on copy — paste in Gmail/Outlook with inline photos
- Full history with search, filters and reuse of previous responses
- External tools: query APIs (CRM, weather, availability) for real-time data
Maria Rodriguez
El Chalten package inquiry
"Hi, I'm interested in the trekking package to El Chalten for 4 people in March. I'd like to know what's included, the price per person, and if there are hotel options."
The El Chalten trekking package includes 5 days / 4 nights with a bilingual guide, transfers from El Calafate, trekking equipment, and full board. The price per person for March 2026 is $1,850 USD based on double occupancy. We offer accommodation options: Senderos hostel (included in the base package) or upgrade to Chalten Suites for an additional $420 USD per person.
See your knowledge as a neural map.
The Knowledge Graph transforms RAG pipeline results into an interactive graph. Each query is visualized as a central node with relevant documents orbiting around it — distance proportional to score, colors by document, scores on every connection.
- Query Graph: your question at the center, chunks around it with relevance scores on each connection
- KB Map: diagnose topic clusters, document redundancy, and coverage gaps
- Click any node to see scores from each pipeline stage (Vector, BM25, Rerank, MMR)
- Interactive drag, zoom, pan, and toggle to show chunks filtered by the pipeline
- Isolated nodes in the KB Map reveal documents with no topic relationship — potential gaps
Automated tests that catch regressions before your customers do.
Build a per-KB Golden Set of queries with expected answers. Every change (LLM model, FAQs, parameters) runs the set automatically and compares against baseline. If any metric drops more than 5%, you get an email.
- 5 objective metrics: faithfulness, answer relevancy, context precision/recall, citation accuracy
- Auto-generate candidates from existing FAQs, real queries, or LLM-generated synthetic queries
- Side-by-side comparison with per-metric and per-item deltas
- Automatic triggers: on FAQ approval, when a suggestion is applied, on document reprocessing, and weekly
- Automatic email to admin when regression vs baseline is detected
Decision: revert or adjust — less relevant chunks are now leaking into the top-k.
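The baseline comparison can be sketched as a per-metric check. This example assumes the 5% threshold is a relative drop against the baseline value (an absolute drop in points is the other plausible reading); the function name and the metric keys are hypothetical.

```python
from typing import Dict


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    max_drop: float = 0.05,  # 5%, interpreted here as a relative drop
) -> Dict[str, float]:
    """Return each metric whose relative drop vs baseline exceeds max_drop."""
    regressions = {}
    for name, base in baseline.items():
        if base <= 0 or name not in current:
            continue  # nothing meaningful to compare
        drop = (base - current[name]) / base
        if drop > max_drop:
            regressions[name] = drop
    return regressions


baseline = {"faithfulness": 0.90, "citation_accuracy": 0.80}
current = {"faithfulness": 0.88, "citation_accuracy": 0.70}
regressions = find_regressions(baseline, current)
```

Here `faithfulness` drops by about 2% and passes, while `citation_accuracy` drops by 12.5% and would trigger the alert.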
Verify the LLM is not making things up.
After every answer, an NLI model evaluates whether each claim is supported by the retrieved chunks. The 0-100 score separates faithful answers from hallucinations even when confidence is high.
- Catches numbers, dates and details the LLM adds without source backing
- Configurable per KB: threshold, in-chat badge, automatic webhook alert
- No added latency for the end user (runs fire-and-forget after streaming finishes)
- Stat card 'Average faithfulness' + 'Hallucinations detected' table in Analytics
- Runs locally on CPU with a DeBERTa NLI model: no external API, no variable cost
"Refunds take seven business days from the moment payment is confirmed."
5/5 claims supported
"Refunds take five business days and require no additional approval."
2/5 claims supported
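One simple way to turn per-claim checks into a 0-100 score is the share of supported claims, so 5/5 maps to 100 and 2/5 maps to 40. That formula is an assumption for this sketch, and the NLI model is replaced by a toy substring check; a real setup would call an entailment model instead.

```python
from typing import Callable, Sequence


def faithfulness_score(
    claims: Sequence[str],
    chunks: Sequence[str],
    is_supported: Callable[[str, str], bool],
) -> int:
    """Percentage of claims supported by at least one retrieved chunk."""
    if not claims:
        return 100  # nothing was asserted, so nothing is unsupported
    supported = sum(
        1 for claim in claims
        if any(is_supported(chunk, claim) for chunk in chunks)
    )
    return round(100 * supported / len(claims))


def toy_entailment(premise: str, hypothesis: str) -> bool:
    # Stand-in for an NLI model: treats a literal substring as "entailed".
    return hypothesis.lower() in premise.lower()


chunks = ["Refunds take seven business days from the moment payment is confirmed."]
claims = ["refunds take", "five business days", "payment is confirmed",
          "no additional approval", "processed automatically"]
score = faithfulness_score(claims, chunks, toy_entailment)
```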
Possible scenarios
This is how Citai works across different industries.
Frequently asked questions
Everything you need to know before getting started