
Gradio Agents & MCP Hackathon Winter Edition 2025

Overview

This repository hosts our team's submission for Track 2: MCP in Action in the MCP's 1st Birthday Hackathon.

Our goal is to build an autonomous agentic system that demonstrates:

  • Planning, reasoning, and execution
  • Integration of custom tools, MCP tools, or external APIs
  • Effective context engineering
  • Clear, practical user value

We'll use LangGraph as our orchestration backbone for building multi-turn, tool-using, and context-aware agents.

Check the hackathon README for detailed requirements.

Tools & Frameworks

  • LangGraph: for multi-agent orchestration and planning
  • LLM Engines: OpenAI / Anthropic, used as reasoning and planning models
  • Gradio: for the UI and context-engineering demos
  • MCP Tools: standardized interfaces for Gmail, Google Calendar, voice technologies, and other APIs
  • Google Cloud Platform: optional backend for hosting MCP servers and integrated services
  • Twilio: enables automated voice calls and candidate interactions
  • ElevenLabs: (optional) natural text-to-speech for realistic voice screenings
  • Whisper-based Transcription API (or OpenAI Whisper API): for speech-to-text functionality in voice interviews
  • Langfuse or LangSmith: debugging, observability, and trace visualization
  • Docling: for parsing and analyzing uploaded CV documents
  • Pydantic: for structured outputs and data validation
  • Parlant: enables agents to handle multi-intent, free-form conversations by dynamically activating relevant guidelines instead of rigidly routing to a single sub-agent. This addresses the context fragmentation problem inherent in traditional LangGraph supervisor patterns.

References for Context Engineering

These resources guide our approach to memory management, planning transparency, and tool orchestration in autonomous agents.

HR Candidate Screening Multi-Agent System

An autonomous HR assistant that streamlines early recruitment through five steps:

  1. CV Upload (Application): candidate applications are uploaded and parsed
  2. CV Screening: rank and shortlist candidates using LLM reasoning
  3. Voice Screening: invite candidates and coordinate interviews using a voice agent
  4. Person-to-Person Screening: schedule HR interviews via Google Calendar integration
  5. Decision: generate a concise summary and notify HR

NOTE

  • The final decision on whether a candidate is hired is made by a human.
  • The system only automates the boring, tedious work while keeping the human's final decision in the loop.

Architecture:

  1. Main Planner Agent: orchestrates the workflow
  2. Subagents:
     • CV Screening Agent
     • Voice Screening Agent
     • Meeting Scheduler Agent
  3. Tools (via MCP) connect to Gmail, Calendar, and Voice APIs.
  4. Database stores both candidate info and persistent agent memory.
  5. Gradio UI visualizes workflow, reasoning, and results.
flowchart TD
    subgraph MainAgent["Main Planner Agent"]
        A1["Plans • Reasons • Executes"]
    end

    subgraph Subagents["Subagents"]
        S1["CV Screening"]
        S2["Voice Screening"]
        S3["Scheduling"]
        S4["Decision Summary"]
    end

    subgraph Tools["MCP & External Tools"]
        T1["Gmail"]
        T2["Google Calendar"]
        T3["Voice API"]
    end

    subgraph Data["Database"]
        D1["Candidate Data"]
        D2["Context Memory (Cognitive Offloading)"]
    end

    subgraph UI["Gradio Dashboard"]
        U1["HR View & Interaction"]
    end

    %% Connections
    MainAgent --> Subagents
    Subagents --> Tools
    Subagents --> Data
    MainAgent --> Data
    MainAgent --> UI

GCP Setup for Judges: A single demo Gmail/Calendar account ([email protected]) is pre-authorized via OAuth, with stored credentials in .env. Judges can run or view the live demo without any credential setup, experiencing real Gmail + Calendar automation safely.

We use hierarchical planning:

  • Main Agent: decides next step in the workflow (plan, adapt, replan)
  • Subagents: specialized executors (screening, scheduling, summarization)
  • Memory State: tracks plan progress and tool results
  • Dashboard Visualization: shows active plan steps and reasoning traces for transparency
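The hierarchy above can be sketched in plain Python. This is only a minimal illustration, not the project's LangGraph implementation: the stage names, `MemoryState`, `next_step`, and `run_subagent` are hypothetical, and the subagent is a stub where a real system would call an LLM or tools.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage names are assumptions derived from the workflow described in this README.
STAGES = ["cv_screening", "voice_screening", "scheduling", "decision"]


@dataclass
class MemoryState:
    """Hypothetical memory state: tracks plan progress and tool results."""
    completed: list[str] = field(default_factory=list)
    results: dict[str, str] = field(default_factory=dict)


def next_step(state: MemoryState) -> Optional[str]:
    """Main-agent decision: return the first stage that is not yet completed."""
    for stage in STAGES:
        if stage not in state.completed:
            return stage
    return None  # nothing left to plan


def run_subagent(stage: str) -> str:
    """Stub for a specialized executor (screening, scheduling, summarization)."""
    return f"{stage}: ok"


def run_plan(state: MemoryState) -> MemoryState:
    """Loop until the main agent reports that no step remains."""
    while (stage := next_step(state)) is not None:
        state.results[stage] = run_subagent(stage)
        state.completed.append(stage)
    return state


final_state = run_plan(MemoryState())
print(final_state.completed)
```

Because `next_step` reads only the memory state, a run that is interrupted and resumed with a partially filled `MemoryState` continues from the first unfinished stage.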

Why This Is an Agent (Not Just a Workflow)

| Criterion | Workflow | Our System |
|---|---|---|
| Autonomy | Executes fixed sequence of steps | Main agent decides next actions without manual triggers |
| Planning | Predefined order (A → B → C) | Main agent generates and adapts a plan (e.g., skip, retry, re-order) |
| Reasoning | No decision logic | Uses LLM reasoning to evaluate outputs and choose next subagent |
| Context Awareness | Stateless | Maintains shared memory of candidates, progress, and outcomes |
| Adaptation | Fails or stops on error | Re-plans (e.g., if calendar slots full or candidate unresponsive) |

Therefore, it qualifies as an agentic system because it plans, reasons, and executes autonomously rather than following a static workflow.

Project Structure

agentic-hr/
│
├── src/
│   │
│   ├── core/
│   │   ├── base_agent.py           # Abstract BaseAgent (LangGraph-compatible)
│   │   ├── supervisor.py           # Supervisor agent (LangGraph graph assembly)
│   │   ├── state.py                # Shared AgentState + context window
│   │   ├── planner.py              # High-level planning logic
│   │   └── executor.py             # Graph executor / runner
│   │
│   ├── agents/
│   │   │
│   │   ├── cv_screening/
│   │   │   ├── agent.py            # CVScreeningAgent implementation
│   │   │   ├── tools/
│   │   │   │   ├── doc_parser.py
│   │   │   │   ├── normalize_skills.py
│   │   │   │   ├── rank_candidates.py
│   │   │   │   └── match_to_jd.py
│   │   │   └── schemas/
│   │   │       ├── cv_schema.py    # Parsed CV Pydantic schema
│   │   │       └── jd_schema.py    # Job description schema
│   │   │
│   │   ├── voice_screening/
│   │   │   ├── agent.py            # VoiceScreeningAgent
│   │   │   ├── tools/
│   │   │   │   ├── twilio_client.py
│   │   │   │   ├── whisper_transcribe.py
│   │   │   │   └── tts_service.py
│   │   │   └── schemas/
│   │   │       ├── call_result.py
│   │   │       └── transcript.py
│   │   │
│   │   ├── scheduler/
│   │   │   ├── agent.py            # SchedulerAgent
│   │   │   ├── tools/
│   │   │   │   ├── calendar_tool.py
│   │   │   │   ├── gmail_tool.py
│   │   │   │   └── slot_optimizer.py
│   │   │   └── schemas/
│   │   │       └── meeting_schema.py
│   │   │
│   │   └── decision/
│   │       ├── agent.py            # DecisionAgent (final summarizer/Reporter)
│   │       └── schemas/
│   │           └── decision_report.py
│   │
│   ├── mcp_server/
│   │   ├── main.py
│   │   ├── endpoints/
│   │   ├── auth.py
│   │   └── schemas.py
│   │
│   ├── gradio/
│   │   ├── app.py                  # Main Gradio app (Hugging Face Space entry)
│   │   ├── dashboard.py            # Live agent graph & logs view
│   │   ├── candidate_portal.py     # Candidate upload / screening status
│   │   ├── hr_portal.py            # HR review + interview approval
│   │   ├── components.py           # Shared Gradio components
│   │   └── assets/                 # Logos, CSS, etc.
│   │
│   ├── cv_ui/
│   │   └── app.py
│   │
│   ├── voice_screening_ui/
│   │   └── app.py
│   │
│   ├── prompts/
│   │   ├── prompt_manager.py       # Centralized prompt versioning
│   │   ├── cv_prompts.py
│   │   ├── voice_prompts.py
│   │   └── scheduler_prompts.py
│   │
│   ├── database/
│   │   ├── models.py               # SQLAlchemy models
│   │   ├── db_client.py            # Connection & CRUD
│   │   └── context_sync.py         # Cognitive offloading (context ⇄ DB)
│   │
│   ├── main.py                     # CLI runner / local orchestrator entry
│   └── config.py                   # Environment configuration
│
├── tests/
│   ├── test_cv_agent.py
│   ├── test_voice_agent.py
│   ├── test_scheduler_agent.py
│   ├── test_mcp_server.py
│   └── test_integration.py
│
├── .env.example
├── requirements.txt
├── Dockerfile
├── app.py                          # Shortcut to src/ui/app.py
├── README.md
└── LICENSE

Multi-Agent System Architecture

Below is an overview of the subagent components that make up the entire system. More detailed information and brainstorming notes are kept in the docs/agents/.. directory.

1) Orchestrator

Overview

The orchestrator agent is responsible for supervising the subagents and triggering their tasks.

For more planning and info, go to docs/agents/agent_orchestrator.md

2) CV Screener

Overview

The CV screening agent scans applicants' CVs and, as a first filtering step, separates promising candidates from unpromising ones.

For more planning and info, go to docs/agents/cv_screening.md

3) Voice Screening Agent

Overview

The Voice Screening Agent conducts automated phone interviews and integrates with the LangGraph HR Orchestrator.
It uses Twilio for phone calls, Whisper/ASR for speech-to-text, ElevenLabs for natural voice output, and LangGraph for dialogue logic.

For more planning and info, go to docs/agents/voice_screening.md

4) Google MCP Agents

Overview

The Google MCP agents will be responsible for: a) writing emails and b) scheduling and managing Google Calendar events.

It is advisable to split this into two subagents to avoid context poisoning.

For more planning and info, go to docs/agents/google_mcp_agent.md

5) LLM as a Judge

Overview

LLM-as-a-judge will be leveraged to judge call screening results.

For more planning and info, go to docs/agents/judging_agent.md

Data Layer

The system uses a unified SQLAlchemy-based database for both candidate data management and context engineering.

Purpose

| Data Type | Description |
|---|---|
| Candidates | Stores CVs, parsed data, and screening results |
| Voice Results | Saves transcripts, evaluations, and tone analysis |
| Scheduling | Tracks HR availability and confirmed interviews |
| Agent Context Memory | Enables cognitive offloading: stores reasoning traces and summaries so the active context stays uncluttered and information can be recalled when needed |
| Logs / Tool History | Archives tool interactions and results for transparency and reuse |

We use SQLAlchemy as the ORM layer to manage both structured candidate data and persistent agent memory, allowing the system to offload, summarize, and retrieve context efficiently across sessions.
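To make the offload-and-recall idea concrete, here is a small sketch. It deliberately uses the standard-library `sqlite3` module instead of SQLAlchemy so that it runs without extra dependencies; the table name `agent_context`, its columns, and the `offload`/`recall` helpers are all hypothetical and not taken from the project's `models.py`.

```python
import sqlite3


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store that holds offloaded context summaries."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_context ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "candidate_id TEXT NOT NULL, "
        "stage TEXT NOT NULL, "
        "summary TEXT NOT NULL)"
    )
    return conn


def offload(conn: sqlite3.Connection, candidate_id: str, stage: str, summary: str) -> None:
    """Persist a short summary so it can be dropped from the active context."""
    conn.execute(
        "INSERT INTO agent_context (candidate_id, stage, summary) VALUES (?, ?, ?)",
        (candidate_id, stage, summary),
    )
    conn.commit()


def recall(conn: sqlite3.Connection, candidate_id: str) -> list[tuple[str, str]]:
    """Fetch all stored (stage, summary) pairs for one candidate, oldest first."""
    cursor = conn.execute(
        "SELECT stage, summary FROM agent_context WHERE candidate_id = ? ORDER BY id",
        (candidate_id,),
    )
    return cursor.fetchall()


memory = open_memory()
offload(memory, "cand-1", "cv_screening", "Strong match for Data Scientist role.")
offload(memory, "cand-2", "cv_screening", "Missing required experience.")
offload(memory, "cand-1", "voice_screening", "Completed voice interview.")
print(recall(memory, "cand-1"))
```

The agent would keep only the latest summary in its prompt and call something like `recall` when an earlier stage becomes relevant again.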

Prompt Archive

To ensure consistent behavior and easy experimentation across subagents, the system includes a centralized prompt management layer.

Purpose

| Component | Description |
|---|---|
| Prompt Templates | Stores standardized prompts for each subagent (CV screening, voice screening, scheduling) |
| Prompt Versioning | Allows tracking and updating of prompt iterations without changing agent code |
| Dynamic Injection | Enables context-dependent prompt construction using retrieved memory or database summaries |
| Archive | Keeps older prompt variants for reproducibility and ablation testing |

Gradio Interface

We use Gradio to demonstrate our agent's reasoning, planning, and tool use interactively, fully aligned with the Agents & MCP Hackathon focus on context engineering and user value.

Key Features

| Section | Purpose |
|---|---|
| Candidate Portal | Upload CVs, submit applications, and view screening results |
| HR Portal | Review shortlisted candidates, trigger voice screenings, and schedule interviews |
| Agent Dashboard | Visualizes the current plan, tool calls, and reasoning traces in real time |
| Tool Integration | Shows live MCP actions (Gmail send, Calendar scheduling) with status updates |
| Context View | Displays agent memory, current workflow stage, and adaptive plan updates |

Context Engineering Visualization?

This is what judges really care about: it must show that the system is agentic (reasoning, memory, planning).

Agent Plan Viewer

gr.JSON() or a custom visual showing the current plan state, e.g.:

{
  "plan": [
    "1. Screen CVs ✅",
    "2. Invite for voice screening 🔄",
    "3. Schedule HR interview ⬜",
    "4. Await HR decision ⬜"
  ]
}
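A plan state of this shape could be produced by a small helper such as the one below. The function name `render_plan`, the status keys, and the icon mapping are assumptions for illustration; the resulting dictionary is the kind of value that could then be handed to a `gr.JSON` component.

```python
import json

# Hypothetical mapping from internal step status to a display marker.
STATUS_ICONS = {"done": "✅", "in_progress": "🔄", "pending": "⬜"}


def render_plan(steps: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Turn (title, status) pairs into a numbered, annotated plan payload."""
    return {
        "plan": [
            f"{index}. {title} {STATUS_ICONS[status]}"
            for index, (title, status) in enumerate(steps, start=1)
        ]
    }


plan_state = render_plan(
    [
        ("Screen CVs", "done"),
        ("Invite for voice screening", "in_progress"),
        ("Schedule HR interview", "pending"),
        ("Await HR decision", "pending"),
    ]
)
print(json.dumps(plan_state, ensure_ascii=False, indent=2))
```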

Live Plan Progress

  • Use a progress bar or color-coded status list of steps.
  • Judges must see autonomous transitions (from one step to another).

Reasoning Log / Memory

  • Stream or text box showing LLM thought traces or context summary:
    • "Detected strong match for Data Scientist role."
    • "Candidate completed voice interview; confidence: 8.4/10."
    • "Next step: scheduling HR interview."

Tool Call Trace

  • Small table showing:

| Time | Tool | Action | Result |
|---|---|---|---|
| 12:05 | Gmail | send_invite() | Sent |
| 12:06 | Calendar | create_event() | Confirmed |
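Rows for such a trace can be collected by wrapping every tool call in a small recorder. The sketch below is an assumption about how this might be done: `ToolTrace` is a hypothetical helper, and `send_invite` and `create_event` are stubs standing in for the real Gmail and Calendar tools.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Callable


@dataclass
class ToolTrace:
    """Hypothetical recorder that logs one row per tool call."""
    rows: list[list[str]] = field(default_factory=list)

    def call(self, tool: str, action: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Run the action and append [time, tool, action, result] to the trace."""
        timestamp = datetime.now().strftime("%H:%M")
        label = f"{action.__name__}()"
        try:
            result = action(*args, **kwargs)
        except Exception as exc:
            # Failures are logged too, then re-raised for the caller to handle.
            self.rows.append([timestamp, tool, label, f"Error: {exc}"])
            raise
        self.rows.append([timestamp, tool, label, str(result)])
        return result


def send_invite(recipient: str) -> str:
    """Stub for the Gmail tool."""
    return "Sent"


def create_event(title: str) -> str:
    """Stub for the Calendar tool."""
    return "Confirmed"


trace = ToolTrace()
trace.call("Gmail", send_invite, "candidate@example.com")
trace.call("Calendar", create_event, "HR Interview")
for row in trace.rows:
    print(row)
```

The `rows` list has the same four columns as the table above, so it could be passed to a tabular UI component with matching headers.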

MCP Integration (Best Practice Setup)

To align fully with the Agents & MCP Hackathon standards, our system will use or extend a standardized MCP server for integrations such as Gmail and Google Calendar, and potentially Scion Voice in later stages.

Inspired by the Hugging Face MCP Course, which shows how to build an MCP app.

Why MCP?

| Benefit | Description |
|---|---|
| Standardized | Exposes Gmail & Calendar as reusable MCP tools with a consistent schema |
| Secure | OAuth handled once server-side; no tokens or secrets stored in the agent |
| Modular | Clean separation between the agent's reasoning logic and the integration layer |
| Reusable | Same MCP server can serve multiple projects or agents |
| Hackathon-Ready | Directly fulfills the "use MCP tools or external APIs" requirement |

Why Use MCP Instead of Just Defining Tools

| Approach | Limitation / Risk | MCP Advantage |
|---|---|---|
| Custom-defined tools (e.g., direct Gmail API calls in code) | Each project must re-implement auth, rate limits, and API logic | MCP provides a shared, pre-authorized interface any agent can use |
| Embedded credentials in .env | Security risk, harder for judges to test | Credentials handled server-side; no secrets in the repo |
| Tight coupling between agent and tool | Hard to swap or extend integrations | MCP creates a plug-and-play API boundary between reasoning and execution |
| Limited reuse | Tools only exist in one codebase | MCP servers can expose many tools to multiple agents dynamically |

MCP turns these one-off integrations into standardized, composable building blocks that work across agents, organizations, or platforms. This is the same philosophy used by Anthropic, LangChain, and Hugging Face in 2025 agent ecosystems.

We will build or extend the open-source mcp-gsuite server and host it securely on Google Cloud Run.
This server manages authentication, token refresh, and rate limiting while exposing standardized MCP actions like:

{
  "action": "gmail.send",
  "parameters": { "to": "[email protected]", "subject": "Interview Invite", "body": "..." }
}

and

{
  "action": "calendar.create_event",
  "parameters": { "summary": "HR Interview", "start": "...", "end": "..." }
}

This architecture lets our HR agent (and future projects) perform real email and scheduling actions via secure MCP endpoints, giving judges a safe, live demo of true agentic behavior with no local credential setup required.
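On the server side, requests of the shape shown above need to be validated and routed to the right integration. The sketch below illustrates one way to do that under stated assumptions: it mirrors this README's `action`/`parameters` payload rather than the MCP wire format, and the handler functions, the `REQUIRED` table, and `dispatch` are hypothetical stubs, not part of mcp-gsuite.

```python
from typing import Any, Callable

Params = dict[str, Any]

# Required parameters per action. Action names come from the examples above;
# everything else here is an assumption made for illustration.
REQUIRED: dict[str, set[str]] = {
    "gmail.send": {"to", "subject", "body"},
    "calendar.create_event": {"summary", "start", "end"},
}


def send_email(params: Params) -> Params:
    """Stub: a real handler would call the Gmail API with server-side credentials."""
    return {"status": "sent", "to": params["to"]}


def create_event(params: Params) -> Params:
    """Stub: a real handler would call the Google Calendar API."""
    return {"status": "confirmed", "summary": params["summary"]}


HANDLERS: dict[str, Callable[[Params], Params]] = {
    "gmail.send": send_email,
    "calendar.create_event": create_event,
}


def dispatch(request: Params) -> Params:
    """Validate the request and route it to the matching handler."""
    action = request.get("action")
    if action not in HANDLERS:
        raise ValueError(f"Unknown action: {action!r}")
    params = request.get("parameters", {})
    missing = REQUIRED[action] - set(params)
    if missing:
        raise ValueError(f"Missing parameters for {action}: {sorted(missing)}")
    return HANDLERS[action](params)


response = dispatch(
    {
        "action": "gmail.send",
        "parameters": {
            "to": "candidate@example.com",
            "subject": "Interview Invite",
            "body": "...",
        },
    }
)
print(response)
```

Rejecting unknown actions and incomplete parameters before any handler runs keeps malformed agent output from reaching the real Gmail or Calendar APIs.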

Agent Supervisor: Why Parlant + LangGraph

LangGraph provides a powerful orchestration backbone for planning, reasoning, and executing multi-agent workflows.
However, its common supervisor pattern has a key limitation: the supervisor routes each user query to only one sub-agent at a time.

Example Problem

"I uploaded my CV yesterday. Can I also reschedule my interview, and how long is the voice call?"

A standard LangGraph supervisor would forward this entire message to, say, the CV Screening Agent,
missing the scheduling and voice screening parts, which causes incomplete or fragmented responses.

Parlant as the Fix

Parlant solves this by replacing single-route logic with dynamic guideline activation.
Instead of rigid routing, it loads multiple relevant guidelines into context simultaneously, allowing coherent handling of mixed intents.

agent.create_guideline(
  condition="User asks about rescheduling",
  action="Call SchedulerAgent via LangGraph tool"
)

agent.create_guideline(
  condition="User asks about voice screening duration",
  action="Query VoiceScreeningAgent"
)

If a user blends both topics, both guidelines trigger, producing a unified, context-aware response.
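The difference from single-route supervision can be shown with a toy matcher. This is not how Parlant evaluates conditions; simple keyword matching is swapped in here purely to illustrate that several guidelines can be active for one message. The `Guideline` class, the keyword lists, and `active_actions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guideline:
    """Toy guideline: fires when any keyword appears in the message."""
    keywords: tuple[str, ...]
    action: str


GUIDELINES = [
    Guideline(("reschedule",), "Call SchedulerAgent via LangGraph tool"),
    Guideline(("voice call", "voice screening"), "Query VoiceScreeningAgent"),
]


def active_actions(message: str) -> list[str]:
    """Return the actions of ALL matching guidelines, not just the first one."""
    text = message.lower()
    return [
        guideline.action
        for guideline in GUIDELINES
        if any(keyword in text for keyword in guideline.keywords)
    ]


mixed = "Can I also reschedule my interview, and how long is the voice call?"
print(active_actions(mixed))
```

A single-route supervisor corresponds to returning only the first match; collecting every match is what allows one reply to cover both the rescheduling and the voice-screening question.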

Why Combine Them

| Layer | Framework | Role |
|---|---|---|
| Workflow Orchestration | LangGraph | Executes structured agent workflows (CV → Voice → Schedule → Decision). |
| Conversational Layer | Parlant | Dynamically manages mixed intents using guideline-based reasoning. |
| Integration Layer | MCP Tools | Provides standardized access to Gmail, Calendar, and Voice APIs. |

Together, Parlant + LangGraph merge structured planning with conversational adaptability, enabling our HR agent to reason, plan, and respond naturally to complex, multi-topic interactions.

Agentic Enhancements [BONUS]

To make the system more autonomous, interpretable, and resilient, we integrated a few lightweight yet powerful improvements:

  • Self-Reflection: before executing a step, the agent briefly states why it's taking that action, improving reasoning transparency.
  • Adaptive Re-Planning: if a subagent or tool call fails (e.g., no calendar slot, missing response, or API timeout), the main planner automatically updates its plan by skipping, retrying, or re-ordering steps instead of stopping.
  • LLM Self-Evaluation: after each stage (CV, voice, scheduling), a lightweight judge model rates the result and adds feedback for the next step.
  • Context Summary: the dashboard displays a live summary of all candidates, their current stage, and key outcomes.
  • Human-in-the-Loop Checkpoint: HR receives a short confirmation prompt before final scheduling to ensure responsible autonomy.

These enhancements demonstrate true agentic behavior (autonomous planning, adaptive execution, and transparent reasoning) in a simple, explainable way.
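The retry-then-skip part of adaptive re-planning can be sketched as a small loop. This is a simplified illustration under stated assumptions: `execute_with_replanning`, the step names, and the `flaky_step` stub are hypothetical, and re-ordering of steps is left out to keep the example short.

```python
from typing import Callable


def execute_with_replanning(
    plan: list[str],
    run_step: Callable[[str], bool],
    max_retries: int = 1,
) -> list[tuple[str, str]]:
    """Run each step; retry on failure, then skip instead of stopping the plan."""
    log: list[tuple[str, str]] = []
    for step in plan:
        attempts = 0
        while True:
            attempts += 1
            if run_step(step):
                log.append((step, "done" if attempts == 1 else "done after retry"))
                break
            if attempts > max_retries:
                log.append((step, "skipped"))
                break
    return log


call_counts: dict[str, int] = {}


def flaky_step(step: str) -> bool:
    """Stub subagent: one step always fails, one succeeds on its second attempt."""
    call_counts[step] = call_counts.get(step, 0) + 1
    if step == "voice_screening":
        return False
    if step == "schedule_interview":
        return call_counts[step] > 1
    return True


outcome = execute_with_replanning(
    ["cv_screening", "voice_screening", "schedule_interview"], flaky_step
)
print(outcome)
```

In a full system a skipped step would more likely be flagged for the human-in-the-loop checkpoint than silently dropped; the log returned here is the place where that signal would come from.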

Team

License

This project includes and builds upon gmail-mcp,
which is licensed under the GNU General Public License v3.0.

This repository extends gmail-mcp for experimental integration and automation with Claude Desktop.
All modifications are distributed under the same GPLv3 license.

Note: The original gmail-mcp code has not been modified at this stage.