Case Study: Generative AI Chatbots
Building retrieval-augmented assistants for HR and Finance knowledge access.
Project Snapshot
- Role: Applied AI Lead
- Domain: Internal knowledge systems
- Stack: Python, vector retrieval, LLM orchestration, enterprise data controls
- Timeline: 2023 – Oct 2025 (enterprise phase), with independent workflow refinement continuing post-2025
User engagement
80% Retention
User retention rate over the first 8 months of the HR deployment.
Query volume
100s/Day
Query volume by month 8, demonstrating sustained adoption and daily use.
Document corpus
50+ Documents
Corporate policies, SOPs, and procedure documents with citation-backed retrieval.
Architecture pattern
Reusable RAG
Provenance-first design with direct source links enabled user trust and adoption.
Technical Architecture
graph TD
subgraph Input
A[Policy Documents] --> B[Chunking Layer]
C[SOPs] --> B
D[Process Guides] --> B
end
subgraph Embedding
B --> E[Embedding Model]
E --> F[Vector Database]
end
subgraph Retrieval
G[User Query] --> H[Query Embedding]
H --> I[Similarity Search]
F --> I
I --> J[Top-K Chunks]
end
subgraph Generation
J --> K[Context Assembly]
K --> L[LLM Generation]
L --> M[Response with Citations]
end
subgraph Guardrails
M --> N[Hallucination Filter]
N --> O[Grounded Response]
end
O --> P[User Interface]
Architecture: Policy documents, SOPs, and process guides are chunked, embedded, and stored in a vector database. Each user query is embedded and matched against the stored chunks by similarity search. The top-K chunks are assembled into a context and passed to an LLM, which generates a response with citations. A hallucination filter then checks that response before a grounded answer reaches the user interface.
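The retrieval path described above can be illustrated with a minimal sketch. It is not the deployed implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, an in-memory list stands in for the vector database, and the `Chunk` fields, document IDs, and sample texts are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str   # hypothetical source document identifier
    section: str  # hypothetical section label, used in the citation
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[Chunk], k: int = 2) -> list[tuple[float, Chunk]]:
    """Score every chunk against the query and keep the k best matches."""
    q = embed(query)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def assemble_context(hits: list[tuple[float, Chunk]]) -> str:
    """Number each retrieved chunk so the LLM can cite it as [1], [2], ..."""
    return "\n".join(
        f"[{i}] ({c.doc_id}, {c.section}) {c.text}"
        for i, (_, c) in enumerate(hits, start=1)
    )


# Hypothetical sample corpus, for illustration only.
corpus = [
    Chunk("pto-policy", "2.1", "Employees accrue paid time off each pay period."),
    Chunk("expense-sop", "4.3", "Submit expense reports within 30 days of travel."),
    Chunk("pto-policy", "3.2", "Unused paid time off carries over up to five days."),
]

hits = top_k("How much paid time off carries over?", corpus, k=2)
context = assemble_context(hits)
print(context)
```

Carrying the document ID and section alongside each chunk is what lets the generation step emit citations that link back to the source, which is the provenance-first idea the architecture relies on.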
Decision Tradeoffs
| Option Considered | Pros | Cons | Decision |
|---|---|---|---|
| Vector DB + Semantic Search | Flexible retrieval, handles ambiguous queries, citations built-in | Requires embedding model selection, chunk size tuning | Selected — best fit for policy/Q&A where exact keyword match fails |
| Keyword Search (Elasticsearch) | Fast, well-understood, no embedding cost | Poor on semantic similarity, can't handle paraphrased queries | Rejected — users rarely know exact policy names |
| Fine-tuned LLM | Deep domain knowledge, no retrieval latency | Expensive to train, knowledge frozen at training time, no citations | Rejected — policies change frequently, needed update flexibility |
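The "chunk size tuning" cost noted for the selected option can be made concrete with a small sketch. It assumes a fixed-size, word-based chunker with overlap, one common strategy among several. The case study does not state which strategy was used, and the `size` and `overlap` defaults here are hypothetical.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words.

    Both parameters are illustrative tuning knobs, not values from the project.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
pieces = chunk_words(sample, size=4, overlap=1)
print(pieces)
```

Smaller chunks tend to give more precise citations but less context per chunk, while larger chunks do the opposite, so the two values are usually tuned against real user queries.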
Problem
HR and Finance teams fielded recurring questions about policies, procedures, and SOPs. Staff spent significant time searching for information in PDFs, SharePoint, and email threads. Response times were slow and inconsistent.
Before/After
| Workflow | Before | After |
|---|---|---|
| Policy lookup | Search PDFs, SharePoint, email threads; ask colleagues | Natural language query → cited answer with source links |
| Answer consistency | Depends on who you ask; informal knowledge silos | Grounded in indexed documents; same source for everyone |
| Trust verification | “Where did you find that?” → uncertain provenance | Citation links directly to source document and section |
Approach
I led the design and deployment of a retrieval-augmented generation (RAG) system that indexes internal documents, matches user queries to them via semantic search, and generates grounded responses with source citations. The system includes guardrails that minimize hallucinations and keep every response traceable to approved documents.
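One way to picture such a guardrail is a simple grounding check, sketched below. This is an assumption-laden illustration, not the project's actual filter: it assumes answers cite sources with markers like `[1]`, and it uses a hypothetical vocabulary-overlap threshold (`min_overlap`) where a production system would likely use a stronger check.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set used for a crude lexical-overlap comparison."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_grounded(answer: str, sources: dict[int, str], min_overlap: float = 0.5) -> bool:
    """Accept an answer only if every sentence cites a retrieved source
    and shares enough vocabulary with the sources it cites.

    `sources` maps citation numbers to retrieved chunk text (hypothetical shape).
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    if not sentences:
        return False
    for sentence in sentences:
        cited = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        # Reject uncited sentences and citations to chunks that were not retrieved.
        if not cited or any(n not in sources for n in cited):
            return False
        claim = tokens(re.sub(r"\[\d+\]", "", sentence))
        support = set().union(*(tokens(sources[n]) for n in cited))
        if not claim or len(claim & support) / len(claim) < min_overlap:
            return False
    return True


# Hypothetical retrieved chunk, for illustration only.
retrieved = {1: "Unused paid time off carries over up to five days."}

print(is_grounded("Unused paid time off carries over up to five days [1].", retrieved))
print(is_grounded("Employees receive unlimited vacation every year [1].", retrieved))
```

An answer that fails the check could be replaced with a fallback such as a pointer to the source documents, so users never see an uncited claim.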
Outcome
Deployed for the HR and Finance functions, with a reusable pattern for subsequent assistants. The architecture supports multi-plant expansion with a consistent knowledge base and governance controls.
Leadership Contribution
- Architecture: Designed the RAG pipeline with provenance-first retrieval—every answer links directly to its source document, enabling the trust that drove 80% user retention.
- Team: Led development sprints, established code review for prompt templates, and coordinated UAT across US and Canada.
- Governance: Defined hallucination guardrails, source citation requirements, and document management workflows for policy compliance.
- Outcomes: HR users called the assistant an "amazing time saver" and trusted its accuracy because of the direct source links, which led to sustained use at hundreds of queries per day.