Gradio Agents & MCP Hackathon Winter Edition 2025
Overview
This repository hosts our team's submission for Track 2: MCP in Action in the MCP's 1st Birthday Hackathon.
Our goal is to build an autonomous agentic system that demonstrates:
- Planning, reasoning, and execution
- Integration of custom tools, MCP tools, or external APIs
- Effective context engineering
- Clear, practical user value
We'll use LangGraph as our orchestration backbone for building multi-turn, tool-using, and context-aware agents.
Check the hackathon README for detailed requirements.
Tools & Frameworks
- LangGraph: for multi-agent orchestration and planning
  - Why & how they built LangGraph for production agents
- LLM Engines: OpenAI / Anthropic, used as reasoning and planning models
  - gpt-oss inference providers
    - OpenRouter:
      - LangChain Wrapper: https://github.com/langchain-ai/langchain/discussions/27964
    - TogetherAI
- Gradio: for the UI and context-engineering demos
- MCP Tools: standardized interfaces for Gmail, Google Calendar, voice technologies, and other APIs
- Google Cloud Platform: optional backend for hosting MCP servers and integrated services
- Twilio: enables automated voice calls and candidate interactions
- ElevenLabs: (optional) natural text-to-speech for realistic voice screenings
- Whisper-based Transcription API (or OpenAI Whisper API): for speech-to-text functionality in voice interviews
- Langfuse or LangSmith: debugging, observability, and trace visualization
- Docling: for parsing and analyzing uploaded CV documents
- Pydantic: for structured outputs and data validation
- Parlant: enables agents to handle multi-intent, free-form conversations by dynamically activating relevant guidelines instead of rigidly routing to a single sub-agent. This addresses the context fragmentation problem inherent in traditional LangGraph supervisor patterns.
References for Context Engineering
- Context Engineering for AI Agents (Manus Blog)
- YouTube Talk Manus
- LangGraph Overview
- https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
- https://medium.com/fundamentals-of-artificial-intelligence/mitigate-context-poisoning-in-ai-agents-using-context-engineering-96cf40dbb38d
- https://blog.langchain.com/context-engineering-for-agents/
- LangGraph implementations
- LangGraph summary of what frontier labs and firms apply
These resources guide our approach to memory management, planning transparency, and tool orchestration in autonomous agents.
HR Candidate Screening Multi-Agent System
An autonomous HR assistant that streamlines early recruitment through five steps:
- CV Upload (Application): candidate applications are uploaded and parsed
- CV Screening: rank and shortlist candidates using LLM reasoning
- Voice Screening: invite candidates and coordinate interviews using a voice agent
- Person-to-Person Screening: schedule HR interviews via Google Calendar integration
- Decision: generate a concise summary and notify HR
NOTE
- The final decision on whether a candidate is hired is made by a human.
- The system only automates the tedious, repetitive work and keeps the human decision in the loop.
Architecture:
- Main Planner Agent: orchestrates the workflow
- Subagents:
- CV Screening Agent
- Voice Screening Agent
- Meeting Scheduler Agent
- Tools (via MCP) connect to Gmail, Calendar, and Voice APIs.
- Database stores both candidate info and persistent agent memory.
- Gradio UI visualizes workflow, reasoning, and results.
flowchart TD
    subgraph MainAgent["Main Planner Agent"]
        A1["Plans • Reasons • Executes"]
    end
    subgraph Subagents["Subagents"]
        S1["CV Screening"]
        S2["Voice Screening"]
        S3["Scheduling"]
        S4["Decision Summary"]
    end
    subgraph Tools["MCP & External Tools"]
        T1["Gmail"]
        T2["Google Calendar"]
        T3["Voice API"]
    end
    subgraph Data["Database"]
        D1["Candidate Data"]
        D2["Context Memory (Cognitive Offloading)"]
    end
    subgraph UI["Gradio Dashboard"]
        U1["HR View & Interaction"]
    end
    %% Connections
    MainAgent --> Subagents
    Subagents --> Tools
    Subagents --> Data
    MainAgent --> Data
    MainAgent --> UI
GCP Setup for Judges:
A single demo Gmail/Calendar account ([email protected]) is pre-authorized via OAuth, with stored credentials in .env.
Judges can run or view the live demo without any credential setup, experiencing real Gmail + Calendar automation safely.
We use hierarchical planning:
- Main Agent: decides next step in the workflow (plan, adapt, replan)
- Subagents: specialized executors (screening, scheduling, summarization)
- Memory State: tracks plan progress and tool results
- Dashboard Visualization: shows active plan steps and reasoning traces for transparency
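A minimal sketch of the hierarchical loop described above, assuming a deterministic stand-in for the LLM planner. The names `STAGES`, `MemoryState`, `next_step`, and `run_step` are hypothetical and are not taken from the repository; in the real system the main agent would pick the next step with LLM reasoning rather than by list order.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical stage names mirroring the five workflow steps.
STAGES = ["cv_upload", "cv_screening", "voice_screening", "hr_interview", "decision"]


@dataclass
class MemoryState:
    """Tracks plan progress and tool results for one candidate."""
    completed: list[str] = field(default_factory=list)
    tool_results: dict[str, str] = field(default_factory=dict)


def next_step(state: MemoryState) -> Optional[str]:
    """Return the first stage not yet completed, or None when the plan is done."""
    for stage in STAGES:
        if stage not in state.completed:
            return stage
    return None


def run_step(state: MemoryState, stage: str, result: str) -> None:
    """Record a subagent's result and mark the stage as completed."""
    state.tool_results[stage] = result
    state.completed.append(stage)


state = MemoryState()
run_step(state, "cv_upload", "parsed")
print(next_step(state))  # cv_screening
```

Keeping plan progress in a plain state object is what lets a dashboard read the active step without touching the planner itself.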
Why This Is an Agent (Not Just a Workflow)
| Criterion | Workflow | Our System |
|---|---|---|
| Autonomy | Executes fixed sequence of steps | Main agent decides next actions without manual triggers |
| Planning | Predefined order (A → B → C) | Main agent generates and adapts a plan (e.g., skip, retry, re-order) |
| Reasoning | No decision logic | Uses LLM reasoning to evaluate outputs and choose next subagent |
| Context Awareness | Stateless | Maintains shared memory of candidates, progress, and outcomes |
| Adaptation | Fails or stops on error | Re-plans (e.g., if calendar slots full or candidate unresponsive) |
Therefore, it qualifies as an agentic system because it plans, reasons, and executes autonomously rather than following a static workflow.
Project Structure
agentic-hr/
│
├── src/
│   │
│   ├── core/
│   │   ├── base_agent.py  # Abstract BaseAgent (LangGraph-compatible)
│   │   ├── supervisor.py  # Supervisor agent (LangGraph graph assembly)
│   │   ├── state.py  # Shared AgentState + context window
│   │   ├── planner.py  # High-level planning logic
│   │   └── executor.py  # Graph executor / runner
│   │
│   ├── agents/
│   │   │
│   │   ├── cv_screening/
│   │   │   ├── agent.py  # CVScreeningAgent implementation
│   │   │   ├── tools/
│   │   │   │   ├── doc_parser.py
│   │   │   │   ├── normalize_skills.py
│   │   │   │   ├── rank_candidates.py
│   │   │   │   └── match_to_jd.py
│   │   │   └── schemas/
│   │   │       ├── cv_schema.py  # Parsed CV Pydantic schema
│   │   │       └── jd_schema.py  # Job description schema
│   │   │
│   │   ├── voice_screening/
│   │   │   ├── agent.py  # VoiceScreeningAgent
│   │   │   ├── tools/
│   │   │   │   ├── twilio_client.py
│   │   │   │   ├── whisper_transcribe.py
│   │   │   │   └── tts_service.py
│   │   │   └── schemas/
│   │   │       ├── call_result.py
│   │   │       └── transcript.py
│   │   │
│   │   ├── scheduler/
│   │   │   ├── agent.py  # SchedulerAgent
│   │   │   ├── tools/
│   │   │   │   ├── calendar_tool.py
│   │   │   │   ├── gmail_tool.py
│   │   │   │   └── slot_optimizer.py
│   │   │   └── schemas/
│   │   │       └── meeting_schema.py
│   │   │
│   │   └── decision/
│   │       ├── agent.py  # DecisionAgent (final summarizer/Reporter)
│   │       └── schemas/
│   │           └── decision_report.py
│   │
│   ├── mcp_server/
│   │   ├── main.py
│   │   ├── endpoints/
│   │   ├── auth.py
│   │   └── schemas.py
│   │
│   ├── gradio/
│   │   ├── app.py  # Main Gradio app (Hugging Face Space entry)
│   │   ├── dashboard.py  # Live agent graph & logs view
│   │   ├── candidate_portal.py  # Candidate upload / screening status
│   │   ├── hr_portal.py  # HR review + interview approval
│   │   ├── components.py  # Shared Gradio components
│   │   └── assets/  # Logos, CSS, etc.
│   │
│   ├── cv_ui/
│   │   └── app.py
│   │
│   ├── voice_screening_ui/
│   │   └── app.py
│   │
│   ├── prompts/
│   │   ├── prompt_manager.py  # Centralized prompt versioning
│   │   ├── cv_prompts.py
│   │   ├── voice_prompts.py
│   │   └── scheduler_prompts.py
│   │
│   ├── database/
│   │   ├── models.py  # SQLAlchemy models
│   │   ├── db_client.py  # Connection & CRUD
│   │   └── context_sync.py  # Cognitive offloading (context → DB)
│   │
│   ├── main.py  # CLI runner / local orchestrator entry
│   └── config.py  # Environment configuration
│
├── tests/
│   ├── test_cv_agent.py
│   ├── test_voice_agent.py
│   ├── test_scheduler_agent.py
│   ├── test_mcp_server.py
│   └── test_integration.py
│
├── .env.example
├── requirements.txt
├── Dockerfile
├── app.py  # Shortcut to src/ui/app.py
├── README.md
└── LICENSE
Multi Agent System Architecture
Below is an overview of the subagent components that make up the entire system. More detailed information and brainstorming notes live in the docs/agents/ directory.
1) Orchestrator
Overview
The orchestrator agent is responsible for supervising and triggering the tasks of the subagents.
For more planning and info, go to
docs/agents/agent_orchestrator.md
2) CV Screener
Overview
The CV screening agent scans applicants' CVs and separates promising candidates from unpromising ones as a first filtering step.
For more planning and info, go to
docs/agents/cv_screening.md
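As a rough illustration of this first filtering step, the sketch below ranks candidates by the fraction of required skills found on each CV. Skill overlap is only a simple stand-in for the LLM reasoning the project plans to use, and the function names and sample data are hypothetical (they echo the tool file names in the project tree but are not the actual implementation).

```python
def normalize(skills: list[str]) -> set[str]:
    """Lowercase and trim skill names so that 'Python ' matches 'python'."""
    return {s.strip().lower() for s in skills}


def rank_candidates(
    job_skills: list[str], candidates: dict[str, list[str]]
) -> list[tuple[str, float]]:
    """Score each candidate by the share of required skills they list, best first."""
    required = normalize(job_skills)
    if not required:
        raise ValueError("job description must list at least one skill")
    scored = [
        (name, len(required & normalize(skills)) / len(required))
        for name, skills in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(
    ["Python", "SQL"],
    {"alice": ["python", "sql", "docker"], "bob": ["Java", " SQL "]},
)
print(ranking)  # [('alice', 1.0), ('bob', 0.5)]
```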
3) Voice Screening Agent
Overview
The Voice Screening Agent conducts automated phone interviews and integrates with the LangGraph HR Orchestrator.
It uses Twilio for phone calls, Whisper/ASR for speech-to-text, ElevenLabs for natural voice output, and LangGraph for dialogue logic.
For more planning and info, go to
docs/agents/voice_screening.md
4) Google MCP Agents
Overview
The Google MCP agents will be responsible for: a) writing emails, and b) scheduling and managing Google Calendar events.
It is advisable to split this into two subagents to avoid context poisoning.
For more planning and info, go to
docs/agents/google_mcp_agent.md
5) LLM as a Judge
Overview
LLM-as-a-judge will be leveraged to judge call screening results.
For more planning and info, go to
docs/agents/judging_agent.md
Data Layer
The system uses a unified SQLAlchemy-based database for both candidate data management and context engineering.
Purpose
| Data Type | Description |
|---|---|
| Candidates | Stores CVs, parsed data, and screening results |
| Voice Results | Saves transcripts, evaluations, and tone analysis |
| Scheduling | Tracks HR availability and confirmed interviews |
| Agent Context Memory | Enables cognitive offloading: storing reasoning traces and summaries so the active context stays uncluttered and information can be recalled when needed |
| Logs / Tool History | Archives tool interactions and results for transparency and reuse |
We use SQLAlchemy as the ORM layer to manage both structured candidate data and persistent agent memory, allowing the system to offload, summarize, and retrieve context efficiently across sessions.
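To make the offload-and-recall idea concrete, here is a minimal sketch that uses the standard-library `sqlite3` module in place of SQLAlchemy so that it runs without extra dependencies. The table names, columns, and the helpers `offload_context` and `recall_context` are assumptions for illustration, not the project's schema.

```python
import sqlite3

# In-memory database; a real deployment would point at a persistent store.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE candidates (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        stage TEXT NOT NULL
    );
    CREATE TABLE agent_memory (
        id INTEGER PRIMARY KEY,
        candidate_id INTEGER NOT NULL REFERENCES candidates(id),
        summary TEXT NOT NULL
    );
    """
)


def offload_context(candidate_id: int, summary: str) -> None:
    """Persist a reasoning summary so it can leave the active context."""
    conn.execute(
        "INSERT INTO agent_memory (candidate_id, summary) VALUES (?, ?)",
        (candidate_id, summary),
    )
    conn.commit()


def recall_context(candidate_id: int) -> list[str]:
    """Fetch stored summaries for a candidate in insertion order."""
    rows = conn.execute(
        "SELECT summary FROM agent_memory WHERE candidate_id = ? ORDER BY id",
        (candidate_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn.execute("INSERT INTO candidates (id, name, stage) VALUES (1, 'Ada', 'cv_screening')")
offload_context(1, "Strong match for Data Scientist role.")
print(recall_context(1))  # ['Strong match for Data Scientist role.']
```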
Prompt Archive
To ensure consistent behavior and easy experimentation across subagents, the system includes a centralized prompt management layer.
Purpose
| Component | Description |
|---|---|
| Prompt Templates | Stores standardized prompts for each subagent (CV screening, voice screening, scheduling) |
| Prompt Versioning | Allows tracking and updating of prompt iterations without changing agent code |
| Dynamic Injection | Enables context-dependent prompt construction using retrieved memory or database summaries |
| Archive | Keeps older prompt variants for reproducibility and ablation testing |
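The components in this table could fit together roughly as in the sketch below. The `PromptManager` class and its methods are hypothetical and are not the contents of `prompt_manager.py`; the sketch only shows versioned templates with context injected at render time.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PromptManager:
    """Keeps every registered version of each named prompt template."""
    _versions: dict[str, list[str]] = field(default_factory=dict)

    def register(self, name: str, template: str) -> int:
        """Store a new template version and return its 1-based version number."""
        versions = self._versions.setdefault(name, [])
        versions.append(template)
        return len(versions)

    def render(self, name: str, version: Optional[int] = None, **context: str) -> str:
        """Fill a template with context; defaults to the latest version."""
        versions = self._versions[name]
        template = versions[-1] if version is None else versions[version - 1]
        return template.format(**context)


prompts = PromptManager()
prompts.register("cv_screening", "Rate this CV: {cv}")
prompts.register("cv_screening", "Rate this CV against the role {role}: {cv}")
print(prompts.render("cv_screening", role="Data Scientist", cv="..."))
# Rate this CV against the role Data Scientist: ...
```

Older versions stay addressable by number, which is what makes ablation runs against earlier prompt variants possible.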
Gradio Interface
We use Gradio to demonstrate our agent's reasoning, planning, and tool use interactively, fully aligned with the Agents & MCP Hackathon focus on context engineering and user value.
Key Features
| Section | Purpose |
|---|---|
| Candidate Portal | Upload CVs, submit applications, and view screening results |
| HR Portal | Review shortlisted candidates, trigger voice screenings, and schedule interviews |
| Agent Dashboard | Visualizes the current plan, tool calls, and reasoning traces in real time |
| Tool Integration | Shows live MCP actions (Gmail send, Calendar scheduling) with status updates |
| Context View | Displays agent memory, current workflow stage, and adaptive plan updates |
Context Engineering Visualization?
This is what judges really care about: it must show that the system is agentic (reasoning, memory, planning).
Agent Plan Viewer
gr.JSON() or a custom visual showing the current plan state, e.g.:
{
  "plan": [
    "1. Screen CVs [done]",
    "2. Invite for voice screening [in progress]",
    "3. Schedule HR interview [pending]",
    "4. Await HR decision [pending]"
  ]
}
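A plan state of this shape could be generated from the planner's progress with a small helper like the one below. `build_plan_view` and its text status labels are assumptions for illustration; the returned dictionary is the kind of value a `gr.JSON` component can display.

```python
def build_plan_view(steps: list[str], current: int) -> dict[str, list[str]]:
    """Label steps before `current` as done, step `current` as in progress, the rest as pending."""
    labels = []
    for index, step in enumerate(steps):
        if index < current:
            status = "done"
        elif index == current:
            status = "in progress"
        else:
            status = "pending"
        labels.append(f"{index + 1}. {step} [{status}]")
    return {"plan": labels}


view = build_plan_view(
    ["Screen CVs", "Invite for voice screening", "Schedule HR interview", "Await HR decision"],
    current=1,
)
print(view["plan"][1])  # 2. Invite for voice screening [in progress]
```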
Live Plan Progress
- Use a progress bar or color-coded status list of steps.
- Judges must see autonomous transitions (from one step to another).
Reasoning Log / Memory
- Stream or text box showing LLM thought traces or context summary:
  - "Detected strong match for Data Scientist role."
  - "Candidate completed voice interview; confidence: 8.4/10."
  - "Next step: scheduling HR interview."
Tool Call Trace
- Small table showing:
| Time | Tool | Action | Result |
|---|---|---|---|
| 12:05 | Gmail | send_invite() | Sent |
| 12:06 | Calendar | create_event() | Confirmed |
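One possible way to collect such a trace is sketched below. `ToolCall` and `ToolTrace` are hypothetical names; the rows returned by `as_rows` use the same four columns as the table above and could be handed to a tabular UI component.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ToolCall:
    time: str
    tool: str
    action: str
    result: str


class ToolTrace:
    """Collects tool calls in the order they happen."""

    def __init__(self) -> None:
        self.calls: list[ToolCall] = []

    def record(self, tool: str, action: str, result: str, when: Optional[str] = None) -> None:
        """Append one call; `when` defaults to the current HH:MM."""
        stamp = when if when is not None else datetime.now().strftime("%H:%M")
        self.calls.append(ToolCall(stamp, tool, action, result))

    def as_rows(self) -> list[list[str]]:
        """Return rows in Time / Tool / Action / Result column order."""
        return [[c.time, c.tool, c.action, c.result] for c in self.calls]


trace = ToolTrace()
trace.record("Gmail", "send_invite()", "Sent", when="12:05")
trace.record("Calendar", "create_event()", "Confirmed", when="12:06")
print(trace.as_rows()[0])  # ['12:05', 'Gmail', 'send_invite()', 'Sent']
```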
MCP Integration (Best Practice Setup)
To align fully with the Agents & MCP Hackathon standards, our system will use or extend a standardized MCP server for integrations such as Gmail and Google Calendar, and potentially Scion Voice in later stages.
Inspired by the Hugging Face MCP Course, which shows how to build an MCP app.
Why MCP?
| Benefit | Description |
|---|---|
| Standardized | Exposes Gmail & Calendar as reusable MCP tools with a consistent schema |
| Secure | OAuth handled once server-side, with no tokens or secrets stored in the agent |
| Modular | Clean separation between the agent's reasoning logic and the integration layer |
| Reusable | Same MCP server can serve multiple projects or agents |
| Hackathon-Ready | Directly fulfills the "use MCP tools or external APIs" requirement |
Why Use MCP Instead of Just Defining Tools
| Approach | Limitation / Risk | MCP Advantage |
|---|---|---|
| Custom-defined tools (e.g., direct Gmail API calls in code) | Each project must re-implement auth, rate limits, and API logic | MCP provides a shared, pre-authorized interface any agent can use |
| Embedded credentials in .env | Security risk, harder for judges to test | Credentials handled server-side, with no secrets in the repo |
| Tight coupling between agent and tool | Hard to swap or extend integrations | MCP creates a plug-and-play API boundary between reasoning and execution |
| Limited reuse | Tools only exist in one codebase | MCP servers can expose many tools to multiple agents dynamically |
MCP turns these one-off integrations into standardized, composable building blocks that work across agents, organizations, or platforms. This is the same philosophy used by Anthropic, LangChain, and Hugging Face in 2025 agent ecosystems.
We will build or extend the open-source mcp-gsuite server and host it securely on Google Cloud Run.
This server manages authentication, token refresh, and rate limiting, while exposing standardized MCP actions like:
{
"action": "gmail.send",
"parameters": { "to": "[email protected]", "subject": "Interview Invite", "body": "..." }
}
and
{
"action": "calendar.create_event",
"parameters": { "summary": "HR Interview", "start": "...", "end": "..." }
}
This architecture lets our HR agent (and future projects) perform real email and scheduling actions via secure MCP endpoints, giving judges a safe, live demo of true agentic behavior with no local credential setup required.
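On the agent side, payloads like the two shown above could be assembled and checked before they are sent. The sketch below is an assumption-laden illustration: `build_action` and `REQUIRED_PARAMS` are hypothetical, and the required parameter sets are inferred only from the two example payloads, not from the mcp-gsuite server.

```python
import json
from typing import Any

# Required parameters per action, inferred from the example payloads (assumption).
REQUIRED_PARAMS: dict[str, set[str]] = {
    "gmail.send": {"to", "subject", "body"},
    "calendar.create_event": {"summary", "start", "end"},
}


def build_action(action: str, **parameters: Any) -> dict[str, Any]:
    """Return an action payload, rejecting unknown actions and missing parameters."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action}")
    missing = REQUIRED_PARAMS[action] - set(parameters)
    if missing:
        raise ValueError(f"missing parameters for {action}: {sorted(missing)}")
    return {"action": action, "parameters": parameters}


payload = build_action(
    "gmail.send",
    to="candidate@example.com",
    subject="Interview Invite",
    body="...",
)
print(json.dumps(payload, indent=2))
```

Validating on the agent side keeps malformed requests from ever reaching the pre-authorized server.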
Agent Supervisor: Why Parlant + LangGraph
LangGraph provides a powerful orchestration backbone for planning, reasoning, and executing multi-agent workflows.
However, its common supervisor pattern has a key limitation: the supervisor routes each user query to only one sub-agent at a time.
Example Problem
"I uploaded my CV yesterday. Can I also reschedule my interview, and how long is the voice call?"
A standard LangGraph supervisor would forward this entire message to, say, the CV Screening Agent,
missing the scheduling and voice screening parts and causing incomplete or fragmented responses.
Parlant as the Fix
Parlant solves this by replacing single-route logic with dynamic guideline activation.
Instead of rigid routing, it loads multiple relevant guidelines into context simultaneously, allowing coherent handling of mixed intents.
agent.create_guideline(
condition="User asks about rescheduling",
action="Call SchedulerAgent via LangGraph tool"
)
agent.create_guideline(
condition="User asks about voice screening duration",
action="Query VoiceScreeningAgent"
)
If a user blends both topics, both guidelines trigger, producing a unified, context-aware response.
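The difference from single-route dispatch can be illustrated with a toy matcher in which every guideline whose condition fits the message is activated. Keyword matching here is a deliberately simple stand-in and is not how Parlant evaluates conditions; `Guideline`, `GUIDELINES`, and `active_actions` are hypothetical names, with the action strings borrowed from the two guidelines above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guideline:
    keywords: tuple[str, ...]  # crude proxy for a natural-language condition
    action: str


GUIDELINES = [
    Guideline(("reschedule", "rescheduling"), "Call SchedulerAgent via LangGraph tool"),
    Guideline(("how long", "duration"), "Query VoiceScreeningAgent"),
]


def active_actions(message: str) -> list[str]:
    """Return the action of every guideline matched by the message, not just the first."""
    text = message.lower()
    return [g.action for g in GUIDELINES if any(k in text for k in g.keywords)]


actions = active_actions(
    "I uploaded my CV yesterday. Can I also reschedule my interview, "
    "and how long is the voice call?"
)
print(actions)
# ['Call SchedulerAgent via LangGraph tool', 'Query VoiceScreeningAgent']
```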
Why Combine Them
| Layer | Framework | Role |
|---|---|---|
| Workflow Orchestration | LangGraph | Executes structured agent workflows (CV → Voice → Schedule → Decision). |
| Conversational Layer | Parlant | Dynamically manages mixed intents using guideline-based reasoning. |
| Integration Layer | MCP Tools | Provides standardized access to Gmail, Calendar, and Voice APIs. |
Together, Parlant + LangGraph merge structured planning with conversational adaptability, enabling our HR agent to reason, plan, and respond naturally to complex, multi-topic interactions.
Agentic Enhancements [BONUS]
To make the system more autonomous, interpretable, and resilient, we integrated a few lightweight yet powerful improvements:
- Self-Reflection: before executing a step, the agent briefly states why it's taking that action, improving reasoning transparency.
- Adaptive Re-Planning: if a subagent or tool call fails (e.g., no calendar slot, missing response, or API timeout), the main planner automatically updates its plan by skipping, retrying, or re-ordering steps instead of stopping.
- LLM Self-Evaluation: after each stage (CV, voice, scheduling), a lightweight judge model rates the result and adds feedback for the next step.
- Context Summary: the dashboard displays a live summary of all candidates, their current stage, and key outcomes.
- Human-in-the-Loop Checkpoint: HR receives a short confirmation prompt before final scheduling to ensure responsible autonomy.
These enhancements demonstrate true agentic behavior (autonomous planning, adaptive execution, and transparent reasoning) in a simple, explainable way.
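The retry-then-skip part of adaptive re-planning might look roughly like the sketch below. `run_with_replanning`, the step names, and the simulated failures are all hypothetical; a real planner would also consider re-ordering steps and would ask the LLM how to adapt.

```python
from typing import Callable


def run_with_replanning(
    plan: list[str],
    handlers: dict[str, Callable[[], None]],
    max_retries: int = 1,
) -> list[tuple[str, str]]:
    """Run each step; retry a failed step up to max_retries, then skip it and continue."""
    log: list[tuple[str, str]] = []
    for step in plan:
        for attempt in range(max_retries + 1):
            try:
                handlers[step]()
            except Exception as exc:
                # Out of retries: record the skip and move on to the next step.
                if attempt == max_retries:
                    log.append((step, f"skipped: {exc}"))
            else:
                log.append((step, "done"))
                break
    return log


attempts = {"schedule": 0}


def schedule() -> None:
    """Fails once (simulated API timeout), then succeeds on retry."""
    attempts["schedule"] += 1
    if attempts["schedule"] == 1:
        raise TimeoutError("calendar API timeout")


def notify() -> None:
    """Always fails, so the planner skips it after the retry budget is spent."""
    raise RuntimeError("candidate unresponsive")


result = run_with_replanning(["schedule", "notify"], {"schedule": schedule, "notify": notify})
print(result)
# [('schedule', 'done'), ('notify', 'skipped: candidate unresponsive')]
```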
Team
License
This project includes and builds upon gmail-mcp,
which is licensed under the GNU General Public License v3.0.
This repository extends gmail-mcp for experimental integration and automation with Claude Desktop.
All modifications are distributed under the same GPLv3 license.
Note: The original gmail-mcp code has not been modified at this stage.