Case Study: Generative AI Chatbots

Building retrieval-augmented assistants for HR and Finance knowledge access.

Project Snapshot

  • Role: Applied AI Lead
  • Domain: Internal knowledge systems
  • Stack: Python, vector retrieval, LLM orchestration, enterprise data controls
  • Timeline: 2023 – Oct 2025 (enterprise phase), with independent workflow refinement continuing post-2025

Initial deployment scope

2 Functions Covered

Initial assistant deployments targeted HR and Finance policy and SOP workflows, creating measurable self-service coverage where support load had been recurring.

Reusable pattern

1 Deployment Model

Shipped a single reusable deployment pattern for future assistants, reducing time-to-deploy for subsequent use cases.
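A reusable deployment pattern of this kind can be as simple as a shared configuration schema that each new assistant instantiates. The sketch below is illustrative; the field names, defaults, and document paths are hypothetical, not the production schema:

```python
from dataclasses import dataclass, field


@dataclass
class AssistantConfig:
    """One reusable pattern: only the knobs change per business function."""
    name: str
    document_sources: list[str] = field(default_factory=list)
    chunk_size: int = 512        # characters per chunk, tuned per corpus
    top_k: int = 4               # retrieved chunks per query
    require_citations: bool = True


# Each new assistant is a configuration, not a new codebase.
hr = AssistantConfig("hr-assistant", ["policies/hr/", "sops/hr/"])
finance = AssistantConfig("finance-assistant", ["policies/finance/"], chunk_size=768)
```

With this shape, standing up a third assistant (e.g. for Operations) means writing one more config rather than rebuilding the pipeline.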

Enterprise rollout

Multi-Plant Adoption

The design supported eventual expansion to multiple plant locations with a consistent knowledge base and governance controls.

Technical Architecture

```mermaid
graph TD
    subgraph Input
        A[Policy Documents] --> B[Chunking Layer]
        C[SOPs] --> B
        D[Process Guides] --> B
    end

    subgraph Embedding
        B --> E[Embedding Model]
        E --> F[Vector Database]
    end

    subgraph Retrieval
        G[User Query] --> H[Query Embedding]
        H --> I[Similarity Search]
        F --> I
        I --> J[Top-K Chunks]
    end

    subgraph Generation
        J --> K[Context Assembly]
        K --> L[LLM Generation]
        L --> M[Response with Citations]
    end

    subgraph Guardrails
        M --> N[Hallucination Filter]
        N --> O[Grounded Response]
    end

    O --> P[User Interface]
```

Architecture: Documents are chunked and embedded into a vector database. User queries are embedded and matched via similarity search. Top-K chunks are assembled with context and passed to an LLM for generation. Guardrails filter hallucinations before returning grounded responses with citations.
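The retrieval path can be sketched end to end in a few lines. The bag-of-words embedding below is a deliberately toy stand-in for the production embedding model and vector database, and the chunk texts are illustrative:

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy stand-in for a real embedding model: normalized bag-of-words."""
    counts = Counter(tokenize(text))
    vec = [float(counts[t]) for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by cosine similarity to the query embedding."""
    vocab = sorted({t for c in chunks for t in tokenize(c)} | set(tokenize(query)))
    q = embed(query, vocab)

    def score(chunk: str) -> float:
        return sum(a * b for a, b in zip(q, embed(chunk, vocab)))

    return sorted(chunks, key=score, reverse=True)[:k]


chunks = [
    "Expense reports must be submitted within 30 days of travel.",
    "Annual leave requests require manager approval two weeks in advance.",
    "Reimbursement for travel expenses is processed by Finance.",
]
hits = top_k("How do I submit a travel expense report?", chunks)

# Context assembly: numbered sources become both LLM context and citations.
prompt = "Answer using only these sources:\n" + "\n".join(
    f"[{i + 1}] {c}" for i, c in enumerate(hits)
)
```

Note that the query shares no exact policy title with any chunk, which is exactly the case where similarity ranking outperforms keyword lookup.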

Decision Tradeoffs

| Option Considered | Pros | Cons | Decision |
| --- | --- | --- | --- |
| Vector DB + semantic search | Flexible retrieval; handles ambiguous queries; citations built in | Requires embedding model selection, chunk-size tuning | Selected — best fit for policy Q&A where exact keyword match fails |
| Keyword search (Elasticsearch) | Fast, well understood, no embedding cost | Poor on semantic similarity; can't handle paraphrased queries | Rejected — users rarely know exact policy names |
| Fine-tuned LLM | Deep domain knowledge, no retrieval latency | Expensive to train; knowledge frozen at training time; no citations | Rejected — policies change frequently; needed update flexibility |
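The chunk-size tuning mentioned for the selected option is the main preprocessing knob. A minimal sliding-window chunker looks like the sketch below; the size and overlap values are illustrative defaults, not the tuned production settings:

```python
def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size windows that overlap, so a sentence
    falling on a boundary still appears whole in at least one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


doc = "Expense reports must be submitted within 30 days. " * 20
pieces = chunk(doc)
```

Smaller chunks improve retrieval precision but strip context from the LLM; larger chunks do the opposite, which is why the value is tuned per corpus.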

Problem

HR and Finance teams fielded recurring questions about policies, procedures, and SOPs. Staff spent significant time searching for information in PDFs, SharePoint, and email threads. Response times were slow and inconsistent.

Approach

I led design and deployment of a retrieval-augmented generation (RAG) system that indexes internal documents, matches user queries via semantic search, and generates grounded responses with source citations. The system includes guardrails to minimize hallucinations and ensure responses are traceable to approved documents.
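One cheap form such a guardrail can take is a coverage check: flag answers whose content words are not largely present in the retrieved sources. This is a simplified proxy for the deployed filter; the threshold and stopword list below are illustrative assumptions:

```python
import re


def token_set(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounded(answer: str, sources: list[str], threshold: float = 0.6) -> bool:
    """Return True only if most of the answer's content words appear in
    the retrieved sources -- a crude hallucination screen."""
    stop = {"the", "a", "an", "is", "are", "to", "of", "in", "be", "must", "within"}
    answer_tokens = token_set(answer) - stop
    source_tokens = set().union(*(token_set(s) for s in sources)) - stop
    if not answer_tokens:
        return False
    coverage = len(answer_tokens & source_tokens) / len(answer_tokens)
    return coverage >= threshold


sources = ["Expense reports must be submitted within 30 days of travel."]
ok = grounded("Expense reports are due within 30 days of travel.", sources)
bad = grounded("Reports can be submitted within 90 days by emailing payroll.", sources)
```

Answers that fail the check are withheld or routed to a human, keeping every response traceable to an approved document.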

Outcome

Initial deployment covered HR and Finance functions, with a reusable pattern for future assistants. The system demonstrated measurable self-service coverage and created the foundation for multi-plant expansion.

Leadership Contribution

  • Architecture: Designed the RAG pipeline, selecting vector DB + semantic search over keyword search and fine-tuning after evaluating tradeoffs.
  • Team: Led development sprints, established code review for prompt templates and retrieval logic.
  • Governance: Defined hallucination guardrails and citation requirements for enterprise deployment.
  • Outcomes: Measured query resolution rates, user satisfaction, and time-to-answer improvements.