🧩 LangGraph: for multi-agent orchestration and planning
- Why & how they built LangGraph for production agents
🧠 LLM Engines: OpenAI / Anthropic — reasoning and planning models
- gpt-oss inference providers
  - Open Router:
    - LangChain Wrapper: https://github.com/langchain-ai/langchain/discussions/27964
  - TogetherAI
💬 Gradio: for the UI and context-engineering demos
⚙️ MCP Tools: standardized interfaces for Gmail, Google Calendar, Voice technologies and other APIs
☁️ Google Cloud Platform: optional backend for hosting MCP servers and integrated services
📞 Twilio: enables automated voice calls and candidate interactions
🔊 ElevenLabs: (optional) natural text-to-speech for realistic voice screenings
🎙️ Whisper-based Transcription API (or OpenAI Whisper API): for speech-to-text functionality in voice interviews
🧭 Langfuse or LangSmith: debugging, observability, and trace visualization
📄 Docling: for parsing and analyzing uploaded CV documents
🧱 Pydantic: for structured outputs and data validation
🔀 Parlant: enables agents to handle multi-intent, free-form conversations by dynamically activating relevant guidelines instead of rigidly routing to a single sub-agent — solving the context fragmentation problem inherent in traditional LangGraph supervisor patterns.

📚 `References for Context Engineering`

These resources guide our approach to memory management, planning transparency, and tool orchestration in autonomous agents.

🧾 `HR Candidate Screening Multi-Agent System`

An autonomous HR assistant that streamlines early recruitment through five steps:

CV Upload (Application) — candidate applications uploaded and parsed
CV Screening — rank and shortlist candidates using LLM reasoning
Voice Screening — invite and coordinate interviews using a voice agent.
Person-to-Person Screening — schedule HR interviews via Google Calendar integration
Decision — generate a concise summary and notify HR
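The five steps above can be sketched as an ordered enumeration. This is a minimal illustration of the default order only; the names `Stage` and `next_stage` are hypothetical, and the planner described later may skip, retry, or re-order steps.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """The five recruitment steps, in their default order."""
    CV_UPLOAD = "CV Upload"
    CV_SCREENING = "CV Screening"
    VOICE_SCREENING = "Voice Screening"
    PERSON_TO_PERSON = "Person-to-Person Screening"
    DECISION = "Decision"


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage after `current`, or None once the pipeline is done."""
    stages = list(Stage)
    index = stages.index(current)
    return stages[index + 1] if index + 1 < len(stages) else None


print(next_stage(Stage.CV_SCREENING).value)
```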

NOTE

The final hiring decision is always made by a human.

The system automates the tedious, repetitive steps while keeping a human in the loop for the final decision.

Architecture:

Main Planner Agent: orchestrates the workflow
Subagents:

CV Screening Agent
Voice Screening Agent
Meeting Scheduler Agent

Tools (via MCP) connect to Gmail, Calendar, and Voice APIs.
Database stores both candidate info and persistent agent memory.
Gradio UI visualizes workflow, reasoning, and results.

flowchart TD
    subgraph MainAgent["🧠 Main Planner Agent"]
        A1["Plans • Reasons • Executes"]
    end

    subgraph Subagents["🤖 Subagents"]
        S1["📄 CV Screening"]
        S2["🎙️ Voice Screening"]
        S3["📅 Scheduling"]
        S4["🧾 Decision Summary"]
    end

    subgraph Tools["⚙️ MCP & External Tools"]
        T1["📧 Gmail"]
        T2["🗓️ Google Calendar"]
        T3["🗣️ Voice API"]
    end

    subgraph Data["🗄️ Database"]
        D1["Candidate Data"]
        D2["Context Memory (Cognitive Offloading)"]
    end

    subgraph UI["💬 Gradio Dashboard"]
        U1["HR View & Interaction"]
    end

    %% Connections
    MainAgent --> Subagents
    Subagents --> Tools
    Subagents --> Data
    MainAgent --> Data
    MainAgent --> UI

GCP Setup for Judges: A single demo Gmail/Calendar account ([email protected]) is pre-authorized via OAuth, with stored credentials in .env. Judges can run or view the live demo without any credential setup, experiencing real Gmail + Calendar automation safely.

We use hierarchical planning:

Main Agent: decides next step in the workflow (plan, adapt, replan)
Subagents: specialized executors (screening, scheduling, summarization)
Memory State: tracks plan progress and tool results
Dashboard Visualization: shows active plan steps and reasoning traces for transparency
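The planning loop described above can be sketched in plain Python. This is a simplified illustration, not the project's LangGraph implementation: `MemoryState`, `choose_next_step`, and `run_plan` are hypothetical names, and the planner here is a rule-based stand-in for the LLM that would decide the next step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class MemoryState:
    """Tracks plan progress and subagent results (hypothetical structure)."""
    completed: List[str] = field(default_factory=list)
    results: Dict[str, str] = field(default_factory=dict)


# A subagent takes the shared state and returns a short result string.
Subagent = Callable[[MemoryState], str]


def choose_next_step(state: MemoryState, plan: List[str]) -> Optional[str]:
    """Stand-in for the LLM planner: pick the first step not yet completed."""
    for step in plan:
        if step not in state.completed:
            return step
    return None


def run_plan(plan: List[str], subagents: Dict[str, Subagent]) -> MemoryState:
    """Main agent loop: decide the next step, delegate it, record the result."""
    state = MemoryState()
    while (step := choose_next_step(state, plan)) is not None:
        state.results[step] = subagents[step](state)
        state.completed.append(step)
    return state


# Stub subagents for illustration only.
subagents = {
    "screening": lambda s: "3 candidates shortlisted",
    "scheduling": lambda s: "interviews booked",
    "summarization": lambda s: f"summary of {len(s.completed)} steps",
}
final = run_plan(["screening", "scheduling", "summarization"], subagents)
print(final.results["summarization"])
```

Keeping the step choice in one function means a rule-based planner can later be swapped for an LLM call without touching the loop.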

🧠 Why This Is an Agent (Not Just a Workflow)

Criterion	Workflow	Our System
Autonomy	Executes fixed sequence of steps	Main agent decides next actions without manual triggers
Planning	Predefined order (A → B → C)	Main agent generates and adapts a plan (e.g., skip, retry, re-order)
Reasoning	No decision logic	Uses LLM reasoning to evaluate outputs and choose next subagent
Context Awareness	Stateless	Maintains shared memory of candidates, progress, and outcomes
Adaptation	Fails or stops on error	Re-plans (e.g., if calendar slots full or candidate unresponsive)

✅ Therefore: it qualifies as an agentic system because it plans, reasons, and executes autonomously rather than following a static workflow.

`Project Structure`

agentic-hr/
│
├── 📁 src/
│ │
│ ├── 📁 core/
│ │ │ ├── base_agent.py           # Abstract BaseAgent (LangGraph-compatible)
│ │ │ ├── supervisor.py           # Supervisor agent (LangGraph graph assembly)
│ │ │ ├── state.py                # Shared AgentState + context window
│ │ │ ├── planner.py              # High-level planning logic
│ │ │ └── executor.py             # Graph executor / runner
│ │
│ ├── 📁 agents/
│ │ │
│ │ ├── 📁 cv_screening/
│ │ │ │ ├── agent.py              # CVScreeningAgent implementation
│ │ │ │ ├── 📁 tools/
│ │ │ │ │ ├── doc_parser.py
│ │ │ │ │ ├── normalize_skills.py
│ │ │ │ │ ├── rank_candidates.py
│ │ │ │ │ └── match_to_jd.py
│ │ │ │ └── 📁 schemas/
│ │ │ │     ├── cv_schema.py      # Parsed CV Pydantic schema
│ │ │ │     └── jd_schema.py      # Job description schema
│ │ │
│ │ ├── 📁 voice_screening/
│ │ │ │ ├── agent.py              # VoiceScreeningAgent
│ │ │ │ ├── 📁 tools/
│ │ │ │ │ ├── twilio_client.py
│ │ │ │ │ ├── whisper_transcribe.py
│ │ │ │ │ └── tts_service.py
│ │ │ │ └── 📁 schemas/
│ │ │ │     ├── call_result.py
│ │ │ │     └── transcript.py
│ │ │
│ │ ├── 📁 scheduler/
│ │ │ │ ├── agent.py              # SchedulerAgent
│ │ │ │ ├── 📁 tools/
│ │ │ │ │ ├── calendar_tool.py
│ │ │ │ │ ├── gmail_tool.py
│ │ │ │ │ └── slot_optimizer.py
│ │ │ │ └── 📁 schemas/
│ │ │ │     └── meeting_schema.py
│ │ │
│ │ └── 📁 decision/
│ │     ├── agent.py              # DecisionAgent (final summarizer/Reporter)
│ │     └── 📁 schemas/
│ │         └── decision_report.py
│ │
│ ├── 📁 mcp_server/
│ │   ├── main.py
│ │   ├── 📁 endpoints/
│ │   ├── auth.py
│ │   └── schemas.py
│ │
│ ├── 📁 gradio/
│ │   ├── app.py                  # Main Gradio app (Hugging Face Space entry)
│ │   ├── dashboard.py            # Live agent graph & logs view
│ │   ├── candidate_portal.py     # Candidate upload / screening status
│ │   ├── hr_portal.py            # HR review + interview approval
│ │   ├── components.py           # Shared Gradio components
│ │   └── 📁 assets/              # Logos, CSS, etc.
│ │
│ ├── 📁 cv_ui/
│ │   ├── app.py 
│ │
│ ├── 📁 voice_screening_ui/
│ │   ├── app.py 
│ │
│ │
│ ├── 📁 prompts/
│ │   ├── prompt_manager.py       # Centralized prompt versioning
│ │   ├── cv_prompts.py
│ │   ├── voice_prompts.py
│ │   └── scheduler_prompts.py
│ │
│ ├── 📁 database/
│ │   ├── models.py               # SQLAlchemy models
│ │   ├── db_client.py            # Connection & CRUD
│ │   └── context_sync.py         # Cognitive offloading (context ⇄ DB)
│ │
│ ├── main.py                     # CLI runner / local orchestrator entry
│ └── config.py                   # Environment configuration
│
├── 📁 tests/
│ │ ├── test_cv_agent.py
│ │ ├── test_voice_agent.py
│ │ ├── test_scheduler_agent.py
│ │ ├── test_mcp_server.py
│ │ └── test_integration.py
│
├── .env.example
├── requirements.txt
├── Dockerfile
├── app.py                         # Shortcut to src/gradio/app.py
├── README.md
└── LICENSE

`Multi Agent System Architecture`

Below is an overview of the subagent components that make up the system. More detailed information and brainstorming notes are kept in the docs/agents/ directory.

1) `Orchestrator`

Overview

The orchestrator agent is responsible for supervising the subagents and triggering their tasks.

For more planning and info, go to docs/agents/agent_orchestrator.md

2) `CV Screener`

Overview

The CV screening agent scans applicants' CVs and, as a first filtering step, separates promising candidates from unpromising ones.
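As a rough illustration of this first filtering step, the sketch below ranks parsed CVs by skill overlap with a job description. It is a deliberately simple stand-in: the project describes LLM reasoning and Pydantic schemas, whereas `ParsedCV`, `shortlist`, and the 0.5 threshold here are hypothetical and use only the standard library.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ParsedCV:
    """Minimal parsed-CV record (hypothetical fields)."""
    name: str
    skills: Set[str]


def shortlist(cvs: List[ParsedCV], required: Set[str],
              threshold: float = 0.5) -> List[ParsedCV]:
    """Keep CVs covering at least `threshold` of the required skills, best first."""
    if not required:
        raise ValueError("required skills must not be empty")

    def score(cv: ParsedCV) -> float:
        return len(cv.skills & required) / len(required)

    kept = [cv for cv in cvs if score(cv) >= threshold]
    return sorted(kept, key=score, reverse=True)


cvs = [
    ParsedCV("Ben", {"python", "excel"}),
    ParsedCV("Cy", {"sql", "ml", "go"}),
    ParsedCV("Ana", {"python", "sql", "ml"}),
]
ranked = shortlist(cvs, required={"python", "sql", "ml"})
print([cv.name for cv in ranked])
```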

For more planning and info, go to docs/agents/cv_screening.md

3) 🎙️ `Voice Screening Agent`

Overview

The Voice Screening Agent conducts automated phone interviews and integrates with the LangGraph HR Orchestrator.
It uses Twilio for phone calls, Whisper/ASR for speech-to-text, ElevenLabs for natural voice output, and LangGraph for dialogue logic.
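The flow from speech-to-text through dialogue logic to text-to-speech can be sketched with injected callables. This is only an outline of the data flow: `run_voice_turns` and the three stub functions are hypothetical, and no real Twilio, Whisper, or ElevenLabs calls are made.

```python
from typing import Callable, List, Tuple


def run_voice_turns(
    audio_chunks: List[bytes],
    transcribe: Callable[[bytes], str],   # stand-in for the speech-to-text step
    respond: Callable[[str], str],        # stand-in for the dialogue logic
    synthesize: Callable[[str], bytes],   # stand-in for the text-to-speech step
) -> Tuple[List[Tuple[str, str]], List[bytes]]:
    """Process each candidate utterance and collect transcript plus spoken replies."""
    transcript: List[Tuple[str, str]] = []
    spoken: List[bytes] = []
    for chunk in audio_chunks:
        text = transcribe(chunk)
        answer = respond(text)
        transcript.append((text, answer))
        spoken.append(synthesize(answer))
    return transcript, spoken


# Stubs that treat the "audio" as UTF-8 text, for illustration only.
transcript, spoken = run_voice_turns(
    [b"I have five years of Python experience"],
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"Noted: {text}",
    synthesize=lambda text: text.encode("utf-8"),
)
print(transcript[0][1])
```

Passing the three services in as callables keeps the loop testable without network access.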

For more planning and info, go to docs/agents/voice_screening.md

4) `Google MCP Agents`

Overview

The Google MCP agents will be responsible for: a) writing emails and b) scheduling and managing Google Calendar events.

It is advisable to split this into two subagents to avoid context poisoning.

For more planning and info, go to docs/agents/google_mcp_agent.md

5) `LLM as a Judge`

Overview

LLM-as-a-judge will be leveraged to judge call screening results.

For more planning and info, go to docs/agents/judging_agent.md

🗄️ `Data Layer`

The system uses a unified SQLAlchemy-based database for both candidate data management and context engineering.

📦 Purpose

Data Type	Description
🧾 Candidates	Stores CVs, parsed data, and screening results
🎙️ Voice Results	Saves transcripts, evaluations, and tone analysis
🗓️ Scheduling	Tracks HR availability and confirmed interviews
🧠 Agent Context Memory	Enables cognitive offloading — storing reasoning traces and summaries so the active context stays uncluttered and information can be recalled when needed
📚 Logs / Tool History	Archives tool interactions and results for transparency and reuse

We use SQLAlchemy as the ORM layer to manage both structured candidate data and persistent agent memory, allowing the system to offload, summarize, and retrieve context efficiently across sessions.
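The offload-and-recall idea can be illustrated with a small sketch. It uses the standard-library `sqlite3` module as a stand-in for the SQLAlchemy layer, and the table name `agent_memory` and the functions `open_memory`, `offload`, and `recall` are hypothetical.

```python
import sqlite3
from typing import List


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) agent_memory table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_memory ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " candidate_id TEXT NOT NULL,"
        " summary TEXT NOT NULL)"
    )
    return conn


def offload(conn: sqlite3.Connection, candidate_id: str, summary: str) -> None:
    """Move a reasoning summary out of the active context into the database."""
    conn.execute(
        "INSERT INTO agent_memory (candidate_id, summary) VALUES (?, ?)",
        (candidate_id, summary),
    )
    conn.commit()


def recall(conn: sqlite3.Connection, candidate_id: str, limit: int = 3) -> List[str]:
    """Fetch the most recent summaries for a candidate, newest first."""
    rows = conn.execute(
        "SELECT summary FROM agent_memory"
        " WHERE candidate_id = ? ORDER BY id DESC LIMIT ?",
        (candidate_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


conn = open_memory()
offload(conn, "cand-1", "CV matches Data Scientist role")
offload(conn, "cand-1", "Voice screening completed")
print(recall(conn, "cand-1", limit=1))
```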

🗃️ `Prompt Archive`

To ensure consistent behavior and easy experimentation across subagents, the system includes a centralized prompt management layer.

📦 Purpose

Component	Description
🧠 Prompt Templates	Stores standardized prompts for each subagent (CV screening, voice screening, scheduling)
🔄 Prompt Versioning	Allows tracking and updating of prompt iterations without changing agent code
🧩 Dynamic Injection	Enables context-dependent prompt construction using retrieved memory or database summaries
📚 Archive	Keeps older prompt variants for reproducibility and ablation testing
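A prompt manager with versioning and dynamic injection might look roughly like the sketch below. It is an assumption about shape, not the contents of `prompt_manager.py`: the `PromptManager` class, its methods, and the example templates are hypothetical.

```python
from string import Template
from typing import Dict, List, Optional


class PromptManager:
    """Keeps every registered version of each named prompt template."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def register(self, name: str, template: str) -> int:
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault(name, [])
        versions.append(template)
        return len(versions)

    def render(self, name: str, version: Optional[int] = None, **context: str) -> str:
        """Fill a template with context; defaults to the latest version."""
        versions = self._versions[name]
        template = versions[-1] if version is None else versions[version - 1]
        return Template(template).substitute(**context)


pm = PromptManager()
v1 = pm.register("cv_screening", "Rank this CV for $role.")
v2 = pm.register("cv_screening", "Rank this CV for $role. Prior notes: $memory")

# Older versions stay available for reproducibility and ablation runs.
print(pm.render("cv_screening", version=1, role="Data Scientist"))
print(pm.render("cv_screening", role="Data Scientist", memory="strong Python"))
```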

📺 `Gradio Interface`

We use Gradio to demonstrate our agent's reasoning, planning, and tool use interactively — fully aligned with the Agents & MCP Hackathon focus on context engineering and user value.

🧩 Key Features

Section	Purpose
🧍 Candidate Portal	Upload CVs, submit applications, and view screening results
🧑‍💼 HR Portal	Review shortlisted candidates, trigger voice screenings, and schedule interviews
🧠 Agent Dashboard	Visualizes the current plan, tool calls, and reasoning traces in real time
⚙️ Tool Integration	Shows live MCP actions (Gmail send, Calendar scheduling) with status updates
📊 Context View	Displays agent memory, current workflow stage, and adaptive plan updates

Context Engineering Visualization?

This is what judges care about most: the interface must show that the system is agentic (reasoning, memory, planning).

🧠 Agent Plan Viewer

Use gr.JSON() or a custom visual to show the current plan state, e.g.:

{
  "plan": [
    "1. Screen CVs ✅",
    "2. Invite for voice screening 🔄",
    "3. Schedule HR interview ⬜",
    "4. Await HR decision ⬜"
  ]
}
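A plan state in this shape could be produced by a small helper such as the one sketched below. The function `plan_view` and the status names are hypothetical; the resulting dictionary is the kind of value one would hand to a JSON display component.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical status names mapped to the icons used in the example above.
STATUS_ICONS = {"done": "✅", "running": "🔄", "pending": "⬜"}


def plan_view(steps: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Turn (title, status) pairs into the numbered plan lines shown in the viewer."""
    lines = [
        f"{number}. {title} {STATUS_ICONS[status]}"
        for number, (title, status) in enumerate(steps, start=1)
    ]
    return {"plan": lines}


steps = [
    ("Screen CVs", "done"),
    ("Invite for voice screening", "running"),
    ("Schedule HR interview", "pending"),
    ("Await HR decision", "pending"),
]
print(json.dumps(plan_view(steps), indent=2, ensure_ascii=False))
```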

🗺️ Live Plan Progress

Use a progress bar or color-coded status list of steps.
Judges must see autonomous transitions (from one step to another).

💬 Reasoning Log / Memory

Stream or text box showing LLM thought traces or context summary:
- “Detected strong match for Data Scientist role.”
- “Candidate completed voice interview; confidence: 8.4/10.”
- “Next step: scheduling HR interview.”

⚙️ Tool Call Trace

Small table showing:

Time	Tool	Action	Result
12:05	Gmail	`send_invite()`	Sent
12:06	Calendar	`create_event()`	Confirmed
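One way to collect rows for such a table is a decorator that records every tool call. The sketch below is illustrative: `traced`, `TRACE`, and the stubbed `send_invite` and `create_event` bodies are hypothetical and do not contact Gmail or Calendar.

```python
import functools
from datetime import datetime
from typing import Any, Callable, Dict, List

TRACE: List[Dict[str, str]] = []  # rows for the Time / Tool / Action / Result table


def traced(tool: str) -> Callable:
    """Decorator that appends one trace row per call of the wrapped tool function."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            result = func(*args, **kwargs)
            TRACE.append({
                "time": datetime.now().strftime("%H:%M"),
                "tool": tool,
                "action": f"{func.__name__}()",
                "result": str(result),
            })
            return result
        return wrapper
    return decorator


@traced("Gmail")
def send_invite(to: str) -> str:
    return "Sent"  # stub: a real tool would call the mail integration here


@traced("Calendar")
def create_event(summary: str) -> str:
    return "Confirmed"  # stub: a real tool would call the calendar integration


send_invite("candidate@example.com")
create_event("HR Interview")
for row in TRACE:
    print(row["time"], row["tool"], row["action"], row["result"], sep="\t")
```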

🔗 `MCP Integration (Best Practice Setup)`

To align fully with the Agents & MCP Hackathon standards, our system will use or extend a standardized MCP server for integrations such as Gmail and Google Calendar — and potentially Scion Voice in later stages.

Inspired by the Hugging Face MCP Course, which shows how to build an MCP app.

🧩 Why MCP?

Benefit	Description
✅ Standardized	Exposes Gmail & Calendar as reusable MCP tools with a consistent schema
🔐 Secure	OAuth handled once server-side — no tokens or secrets stored in the agent
🧱 Modular	Clean separation between the agent's reasoning logic and the integration layer
🔄 Reusable	Same MCP server can serve multiple projects or agents
🚀 Hackathon-Ready	Directly fulfills the “use MCP tools or external APIs” requirement

⚙️ Why Use MCP Instead of Just Defining Tools

Approach	Limitation / Risk	MCP Advantage
Custom-defined tools (e.g., direct Gmail API calls in code)	Each project must re-implement auth, rate limits, and API logic	MCP provides a shared, pre-authorized interface any agent can use
Embedded credentials in `.env`	Security risk, harder for judges to test	Credentials handled server-side — no secrets in the repo
Tight coupling between agent and tool	Hard to swap or extend integrations	MCP creates a plug-and-play API boundary between reasoning and execution
Limited reuse	Tools only exist in one codebase	MCP servers can expose many tools to multiple agents dynamically

MCP turns these one-off integrations into standardized, composable building blocks that work across agents, organizations, or platforms — the same philosophy used by Anthropic, LangChain, and Hugging Face in 2025 agent ecosystems.

We will build or extend the open-source mcp-gsuite server and host it securely on Google Cloud Run.
This server manages authentication, token refresh, and rate limiting — while exposing standardized MCP actions like:

{
  "action": "gmail.send",
  "parameters": { "to": "[email protected]", "subject": "Interview Invite", "body": "..." }
}

and

{
  "action": "calendar.create_event",
  "parameters": { "summary": "HR Interview", "start": "...", "end": "..." }
}

This architecture lets our HR agent (and future projects) perform real email and scheduling actions via secure MCP endpoints — giving judges a safe, live demo of true agentic behavior with no local credential setup required.
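To make the action payloads above concrete, the sketch below validates and routes them to handlers. It is an assumption-laden illustration: the `action`/`parameters` shape simply mirrors the examples in this README, the handlers are fakes, and a real client would talk to the server through an MCP SDK rather than this hand-rolled `dispatch` function.

```python
from typing import Any, Callable, Dict, Set

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]

# Required parameters per action, taken from the two example payloads.
REQUIRED: Dict[str, Set[str]] = {
    "gmail.send": {"to", "subject", "body"},
    "calendar.create_event": {"summary", "start", "end"},
}


def fake_gmail_send(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "sent", "to": params["to"]}  # no email is actually sent


def fake_create_event(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "confirmed", "summary": params["summary"]}  # no event is created


HANDLERS: Dict[str, Handler] = {
    "gmail.send": fake_gmail_send,
    "calendar.create_event": fake_create_event,
}


def dispatch(request: Dict[str, Any]) -> Dict[str, Any]:
    """Check the action name and its parameters, then call the matching handler."""
    action = request.get("action")
    if action not in HANDLERS:
        raise ValueError(f"unknown action: {action!r}")
    params = request.get("parameters", {})
    missing = REQUIRED[action] - set(params)
    if missing:
        raise ValueError(f"missing parameters for {action}: {sorted(missing)}")
    return HANDLERS[action](params)


result = dispatch({
    "action": "gmail.send",
    "parameters": {"to": "candidate@example.com",
                   "subject": "Interview Invite", "body": "..."},
})
print(result)
```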

🧠 `Agent Supervisor — Why Parlant + LangGraph`

LangGraph provides a powerful orchestration backbone for planning, reasoning, and executing multi-agent workflows.
However, its common supervisor pattern has a key limitation: the supervisor routes each user query to only one sub-agent at a time.

⚠️ Example Problem

“I uploaded my CV yesterday. Can I also reschedule my interview — and how long is the voice call?”

A standard LangGraph supervisor would forward this entire message to, say, the CV Screening Agent,
missing the scheduling and voice screening parts — causing incomplete or fragmented responses.

💡 Parlant as the Fix

Parlant solves this by replacing single-route logic with dynamic guideline activation.
Instead of rigid routing, it loads multiple relevant guidelines into context simultaneously, allowing coherent handling of mixed intents.

agent.create_guideline(
  condition="User asks about rescheduling",
  action="Call SchedulerAgent via LangGraph tool"
)

agent.create_guideline(
  condition="User asks about voice screening duration",
  action="Query VoiceScreeningAgent"
)

If a user blends both topics, both guidelines trigger, producing a unified, context-aware response.
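The difference from single-route logic can be shown with a toy matcher that activates every guideline whose condition applies. This is not Parlant's matching mechanism: the `Guideline` dataclass, the keyword lists, and the third CV guideline are hypothetical stand-ins chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Guideline:
    condition: str
    keywords: Tuple[str, ...]  # hypothetical stand-in for semantic condition matching
    action: str


GUIDELINES = [
    Guideline("User asks about rescheduling", ("reschedule",),
              "Call SchedulerAgent via LangGraph tool"),
    Guideline("User asks about voice screening duration", ("voice call", "how long"),
              "Query VoiceScreeningAgent"),
    Guideline("User asks about CV status", ("cv",),  # hypothetical third guideline
              "Query CVScreeningAgent"),
]


def active_guidelines(message: str) -> List[Guideline]:
    """Return every guideline that applies, instead of routing to a single one."""
    text = message.lower()
    return [g for g in GUIDELINES if any(k in text for k in g.keywords)]


message = ("I uploaded my CV yesterday. Can I also reschedule my interview, "
           "and how long is the voice call?")
for guideline in active_guidelines(message):
    print(guideline.action)
```

With the example message, all three guidelines activate, so each part of the question gets an action rather than only the first one.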

⚙️ Why Combine Them

Layer	Framework	Role
🧠 Workflow Orchestration	LangGraph	Executes structured agent workflows (CV → Voice → Schedule → Decision).
💬 Conversational Layer	Parlant	Dynamically manages mixed intents using guideline-based reasoning.
🔧 Integration Layer	MCP Tools	Provides standardized access to Gmail, Calendar, and Voice APIs.

Together, Parlant + LangGraph merge structured planning with conversational adaptability — enabling our HR agent to reason, plan, and respond naturally to complex, multi-topic interactions.

✨ `Agentic Enhancements [BONUS]`

To make the system more autonomous, interpretable, and resilient, we integrated a few lightweight yet powerful improvements:

🧠 Self-Reflection – before executing a step, the agent briefly states why it's taking that action, improving reasoning transparency.
🔄 Adaptive Re-Planning – if a subagent or tool call fails (e.g., no calendar slot, missing response, or API timeout), the main planner automatically updates its plan — skipping, retrying, or re-ordering steps instead of stopping.
🧮 LLM Self-Evaluation – after each stage (CV, voice, scheduling), a lightweight judge model rates the result and adds feedback for the next step.
🗂️ Context Summary – the dashboard displays a live summary of all candidates, their current stage, and key outcomes.
🤝 Human-in-the-Loop Checkpoint – HR receives a short confirmation prompt before final scheduling to ensure responsible autonomy.

These enhancements demonstrate true agentic behavior — autonomous planning, adaptive execution, and transparent reasoning — in a simple, explainable way.
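Adaptive re-planning can be sketched as a loop that retries a failing step and then skips it rather than stopping. The function `execute_with_replanning`, the retry budget, and the stub tools below are hypothetical; the real planner would decide with an LLM whether to skip, retry, or re-order.

```python
from typing import Callable, Dict, List


def execute_with_replanning(plan: List[str], tools: Dict[str, Callable[[], str]],
                            max_retries: int = 1) -> List[str]:
    """Run each step, retrying on failure and skipping it once retries run out."""
    log: List[str] = []
    for step in plan:
        for attempt in range(max_retries + 1):
            try:
                outcome = tools[step]()
                log.append(f"{step}: ok ({outcome})")
                break
            except Exception as exc:  # any tool failure triggers the retry path
                log.append(f"{step}: attempt {attempt + 1} failed ({exc})")
        else:
            # The inner loop never hit `break`, so every attempt failed.
            log.append(f"{step}: skipped after {max_retries + 1} attempts")
    return log


calls = {"count": 0}


def flaky_calendar() -> str:
    """Stub that times out once, then succeeds."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("API timeout")
    return "slot booked"


def no_response() -> str:
    """Stub that always fails."""
    raise RuntimeError("candidate unresponsive")


log = execute_with_replanning(
    ["schedule", "follow_up", "summarize"],
    {"schedule": flaky_calendar, "follow_up": no_response,
     "summarize": lambda: "report ready"},
)
print("\n".join(log))
```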

👥 `Team`

Member
Sebastian Wefers
Owen Kaplinsky
SrikarMK
Dmitri Moscoglo

`License`

This project includes and builds upon gmail-mcp,
which is licensed under the GNU General Public License v3.0.

This repository extends gmail-mcp for experimental integration and automation with Claude Desktop.
All modifications are distributed under the same GPLv3 license.

Note: The original gmail-mcp code has not been modified at this stage.

Gradio Agents & MCP Hackathon Winter Edition 2025
