Space: wenlianghuang/Deep-Agent-Tool

wenlianghuang committed on Jan 5

Commit

0d900a6

2 Parent(s): 485f02a 979763a

Add main source

Files changed (31)

.gitignore +9 -0
Deep_Agent_Gradio_RAG_localLLM_main.py +1 -1
OLLAMA_SETUP.md +153 -0
PRIVATE_FILE_RAG_GUIDE.md +129 -0
README.md +117 -0
deep_agent_rag/config.py +7 -0
deep_agent_rag/rag/adaptive_rag_selector.py +292 -0
deep_agent_rag/rag/llm_adapter.py +151 -0
deep_agent_rag/rag/private_file_rag.py +0 -0
deep_agent_rag/ui/calendar_interface.py +653 -0
deep_agent_rag/ui/email_interface.py +259 -0
deep_agent_rag/ui/gradio_interface.py +13 -891
deep_agent_rag/ui/private_file_rag_interface.py +663 -0
deep_agent_rag/utils/llm_utils.py +68 -45
pyproject.toml +11 -0
src/__init__.py +37 -0
src/document_processor.py +590 -0
src/hybrid_subquery_hyde_rag.py +399 -0
src/hyde_rag.py +235 -0
src/llm_integration.py +246 -0
src/prompt_formatter.py +395 -0
src/retrievers/__init__.py +17 -0
src/retrievers/base.py +32 -0
src/retrievers/bm25_retriever.py +127 -0
src/retrievers/hybrid_search.py +298 -0
src/retrievers/reranker.py +448 -0
src/retrievers/vector_retriever.py +254 -0
src/step_back_rag.py +305 -0
src/subquery_rag.py +361 -0
src/triple_hybrid_rag.py +467 -0
uv.lock +105 -0

.gitignore CHANGED

@@ -19,3 +19,12 @@ token.json
 *test_check*
 .DS_Store
+<<<<<<< HEAD
+.cursor/*
+chroma_db*/*
+=======
+chroma_db*/
+.cursor/*
+>>>>>>> 8862b07082bc878942f9e22816227a2e9a718b23
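The added lines above commit unresolved Git merge-conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`) into `.gitignore`, and the same pattern appears in the `README.md` diff. A check like the following could flag such lines before a commit. This is a minimal sketch: the function name `find_conflict_markers` is hypothetical and is not part of this repository.

```python
import re
from typing import List, Tuple

# Git writes "<<<<<<< label", "=======", and ">>>>>>> label" at the start of a line.
_MARKER = re.compile(r"^(<{7}|>{7})( |$)|^={7}$")


def find_conflict_markers(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if _MARKER.match(line):
            hits.append((number, line))
    return hits


sample = "\n".join([
    ".DS_Store",
    "<<<<<<< HEAD",
    ".cursor/*",
    "=======",
    "chroma_db*/",
    ">>>>>>> 8862b07",
])
for number, line in find_conflict_markers(sample):
    print(f"line {number}: {line}")
```

A bare `=======` line is also a valid Markdown heading underline, so hits in `.md` files may need a manual look.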

Deep_Agent_Gradio_RAG_localLLM_main.py CHANGED

@@ -25,7 +25,7 @@ def main():
     """Main function: initialize the system and launch the Gradio interface"""
     print("\n🚀 Deep Research Agent with RAG (Local MLX Edition) 啟動！")
     print("💡 本系統整合了：股票查詢、網路搜尋、PDF 知識庫查詢功能\n")
-    print("📦 使用本地 MLX 模型，保護隱私，無需 API 金鑰\n")
     # Initialize the Parlant SDK
     print("🔧 正在初始化 Parlant SDK...")

OLLAMA_SETUP.md ADDED

	@@ -0,0 +1,153 @@

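+# Ollama Setup Guide
+This guide explains how to set up and use Ollama with the Deep Agentic AI Tool, in particular the Llama 3.2 3B model.
+## 📋 Prerequisites
+- macOS or Linux
+- At least 16GB of memory (recommended)
+- Python >= 3.13
+## 🚀 Installation
+### 1. Install Ollama
+**macOS:**
+```bash
+brew install ollama
+```
+Or download it from the official site: https://ollama.com
+**Linux:**
+```bash
+curl -fsSL https://ollama.com/install.sh | sh
+```
+### 2. Download the Llama 3.2 model
+```bash
+ollama pull llama3.2:3b
+```
+This downloads a model file of about 2GB.
+### 3. Start the Ollama service
+Ollama usually starts automatically. To start it manually:
+```bash
+ollama serve
+```
+The service listens on `http://localhost:11434` by default.
+### 4. Verify the installation
+Test that the model is available:
+```bash
+ollama run llama3.2:3b "Hello, how are you?"
+```
+## ⚙️ Configuring the Project
+### 1. Update environment variables
+Add the following to the `.env` file in the project root:
+```env
+# Enable Ollama
+USE_OLLAMA=true
+OLLAMA_BASE_URL=http://localhost:11434
+OLLAMA_MODEL=llama3.2:3b
+```
+### 2. Optional configuration
+To use a different Ollama model, change `OLLAMA_MODEL` to one of the following (keep only one of these lines):
+```env
+OLLAMA_MODEL=qwen2.5:7b        # Use Qwen2.5
+OLLAMA_MODEL=llama3.1:8b        # Use Llama 3.1
+OLLAMA_MODEL=deepseek-r1:7b     # Use DeepSeek-R1
+OLLAMA_MODEL=mistral:7b         # Use Mistral
+```
+## 🎯 Usage
+The system selects an LLM automatically, in this priority order:
+1. **Groq API** (if `GROQ_API_KEY` is configured)
+2. **Ollama** (if `USE_OLLAMA=true` and the service is reachable)
+3. **MLX model** (fallback)
+When the Groq API quota is exhausted, the system switches to Ollama if it is enabled, and otherwise uses the MLX model.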
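The priority order described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the project's actual selection code (which lives in `deep_agent_rag/utils/llm_utils.py` and is not shown here): the name `choose_backend` and the injected `ollama_reachable` callback are hypothetical, and the runtime switch on quota exhaustion is not modeled.

```python
from typing import Callable, Mapping


def choose_backend(env: Mapping[str, str], ollama_reachable: Callable[[], bool]) -> str:
    """Pick a backend name following the Groq > Ollama > MLX priority."""
    # 1. Groq is preferred whenever an API key is configured.
    if env.get("GROQ_API_KEY"):
        return "groq"
    # 2. Ollama is used only when explicitly enabled and the service responds.
    use_ollama = env.get("USE_OLLAMA", "false").lower() == "true"
    if use_ollama and ollama_reachable():
        return "ollama"
    # 3. Otherwise fall back to the local MLX model.
    return "mlx"


print(choose_backend({"GROQ_API_KEY": "example-key"}, lambda: True))
print(choose_backend({"USE_OLLAMA": "true"}, lambda: True))
print(choose_backend({"USE_OLLAMA": "true"}, lambda: False))
```

Passing the reachability check in as a callback keeps the decision logic testable without a running Ollama service.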
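+## 🔍 Checking Which Model Is in Use
+After starting the app, check the console output:
+- `✅ 使用 Groq API (優先)`: the Groq API is in use
+- `✅ 使用 Ollama 模型 (llama3.2:3b)`: Ollama is in use
+- `ℹ️ 使用本地 MLX 模型`: the local MLX model is in use
+## 🐛 Troubleshooting
+### Cannot connect to the Ollama service
+**Problem:** `⚠️ Ollama 初始化失敗: Connection refused`
+**Solution:**
+1. Make sure the Ollama service is running: `ollama serve`
+2. Check whether the port is already in use: `lsof -i :11434`
+3. Confirm that `OLLAMA_BASE_URL` is set correctly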
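The connection checks above can also be scripted. The sketch below probes Ollama's model-list endpoint (`GET /api/tags`) using only the standard library. The helper name `ollama_is_reachable` and the 2-second timeout are assumptions made for illustration, and this is not how the project itself performs the check.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to the Ollama model-list endpoint succeeds."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed URL, or a broken HTTP response.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_reachable())
```

A `False` result here corresponds to the `Connection refused` case in the list above. It does not tell you whether a particular model has been pulled.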
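+### Model not found
+**Problem:** `⚠️ Ollama 初始化失敗: model not found`
+**Solution:**
+```bash
+# Download the model
+ollama pull llama3.2:3b
+# List installed models
+ollama list
+```
+### Out of memory
+**Problem:** The system runs slowly or crashes
+**Solution:**
+- Llama 3.2:3B needs about 2GB of RAM
+- Make sure the system has enough free memory (at least 8GB recommended)
+- This model is already lightweight and suits systems with 16GB of memory
+## 📊 Model Comparison
+| Model | Size | Memory required | Notes |
+|------|------|-----------|------|
+| llama3.2:3b | ~2GB | ~4GB | Lightweight and efficient, suits 16GB systems, open model from Meta |
+| deepseek-r1:7b | ~4.7GB | ~8GB | Strong reasoning, suited to math and programming |
+| qwen2.5:7b | ~4.5GB | ~8GB | Strong general ability, good Chinese and English support |
+| llama3.1:8b | ~4.6GB | ~8GB | Open model from Meta, stable performance |
+| mistral:7b | ~4.1GB | ~7GB | Fast and efficient |
+## 💡 Performance Tips
+1. **Prefer the Groq API**: when available, the Groq API is the fastest option
+2. **Ollama as the backup**: when Groq is unavailable, Ollama provides good local inference
+3. **MLX as the last resort**: on Apple Silicon, MLX models are hardware-optimized
+## 📚 Related Resources
+- [Ollama official documentation](https://ollama.com/docs)
+- [Llama 3.2 model information](https://ollama.com/library/llama3.2)
+- [LangChain Ollama integration](https://python.langchain.com/docs/integrations/llms/ollama)
+---
+**Note**: On first use, Ollama downloads the model files, which can take some time.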

PRIVATE_FILE_RAG_GUIDE.md ADDED

	@@ -0,0 +1,129 @@

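+# Private File RAG User Guide
+## Overview
+This feature integrates the RAG system from the `Learn_RAG` project. Users can upload private files (such as PDF, DOCX, or TXT) and ask questions that are answered from their contents. The system combines hybrid retrieval with an LLM to return accurate, relevant answers.
+## Features
+- ✅ **Multiple file formats**: PDF, DOCX, DOC, TXT.
+- ✅ **Multi-file upload**: upload and process several files at once.
+- ✅ **Hybrid retrieval**: combines BM25 (keyword retrieval) with vector retrieval (semantic retrieval) to improve retrieval accuracy.
+- ✅ **Optional reranking**: uses a BGE Reranker model to further refine the relevance of the retrieved results.
+- ✅ **Two chunking modes**:
+    - **Semantic chunking (recommended)**: splits documents at semantic boundaries so that sentences are not cut in the middle, which improves retrieval quality.
+    - **Character chunking**: splits by a fixed character count; fast, and suited to quick tests.
+- ✅ **Answer generation**:
+    - **Automatic LLM switching**: the system picks an LLM based on availability, in the priority order **Groq API > Ollama > local MLX model**.
+    - **Automated prompt engineering**: detects the document type (for example academic paper, résumé, or general document) and adjusts the prompt style to produce a better-fitting answer.
+- ✅ **Chinese and English Q&A**.
+## Usage
+### 1. Preparation
+Make sure the `Learn_RAG` project and this project (`Deep_Agentic_AI_Tool`) sit under the **same parent directory**. The layout should look like this:
+```
+/some_parent_directory/
+├─── Deep_Agentic_AI_Tool/ (this project)
+└─── Learn_RAG/
+```
+Also make sure the `Learn_RAG` dependencies are installed. You can run `uv sync` or `pip install -r requirements.txt` in the `Learn_RAG` directory.
+### 2. Start the system
+Run the main program to launch the Gradio web interface:
+```bash
+python Deep_Agent_Gradio_RAG_localLLM_main.py
+```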
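+### 3. Steps
+1.  **Open the interface**: open the Gradio interface in a browser (by default at `http://0.0.0.0:7860`) and click the **"📚 Private File RAG"** tab.
+2.  **Upload files**: click the **"📁 上傳檔案"** (Upload files) area or drag files onto it. You can select one or more PDF, DOCX, or TXT files.
+3.  **Choose the chunking mode**:
+    - **Use semantic chunking (recommended)**: tick this option for the best retrieval quality. Processing takes longer but gives the best results.
+    - **Adjust parameters (optional)**: the interface exposes advanced parameters for both chunking modes. Adjust them as needed, or keep the tuned defaults.
+4.  **Process the files**: click the **"📝 處理檔案"** (Process files) button. The system chunks the files, builds the index, and initializes the RAG system according to your settings. Wait until the status shows "✅ 文件處理完成" (document processing complete).
+    - **Note**: on first use the system has to download the embedding model, which can take several minutes.
+5.  **Ask a question**: type your question about the file contents into the **"❓ 請輸入您的問題"** (Enter your question) box.
+6.  **Adjust the query options**:
+    - **Number of results**: the number of relevant document chunks to retrieve (default 3).
+    - **Generate the answer with an LLM**: when ticked, the AI summarizes the retrieved content into a fluent answer. When unticked, only the raw document chunks are shown.
+7.  **Run the query**: click the **"🔍 查詢"** (Query) button.
+8.  **Review the results**:
+    - **💬 AI 回答** (AI answer): the final answer generated by the LLM.
+    - **📄 檢索到的文件片段** (Retrieved document chunks): the source text used to generate the answer, with its origin and relevance score, so that you can verify it.
+9.  **Clear and reset**: click the **"🗑️ 清除"** (Clear) button to reset the current session so that you can upload files again.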
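+## Technical Details
+### LLM selection strategy
+The system schedules LLMs flexibly, with no manual configuration:
+1.  🥇 **Groq API**: if `GROQ_API_KEY` is set in the environment, the system uses the fast Groq API first.
+2.  🥈 **Ollama**: if the Groq API is unavailable or its quota is exhausted, the system switches to a locally running Ollama model (such as Llama3.2).
+3.  🥉 **MLX**: if neither is available, the system runs a model locally with Apple MLX (such as Qwen2.5) as the final fallback.
+### Chunking modes in detail
+- **Character chunking** (defaults `chunk_size: 500`, `chunk_overlap: 100`):
+  - **Advantage**: fast processing.
+  - **Use for**: quick tests and documents where semantic integrity matters less.
+  - **Parameters**:
+    - `分塊大小` (chunk size): the number of characters per chunk. Smaller values give finer granularity; larger values keep more context.
+    - `分塊重疊` (chunk overlap): the number of characters shared by adjacent chunks, which helps preserve context across chunk boundaries.
+- **Semantic chunking** (defaults `threshold: 1.0`, `min_chunk_size: 100`):
+  - **Advantage**: splits at semantic boundaries, keeping sentences and paragraphs intact, for higher retrieval quality.
+  - **Use for**: technical documents, reports, papers, and other material that needs precise contextual understanding.
+  - **Parameters**:
+    - `語義分塊閾值` (semantic chunking threshold): controls how sensitive the splitter is. Smaller values produce finer chunks. Suggested range: 0.8-1.2.
+    - `最小分塊大小` (minimum chunk size): chunks smaller than this are merged to avoid fragmentation.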
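To make the `chunk_size` and `chunk_overlap` parameters of the character chunking mode concrete, here is a minimal sliding-window sketch using the guide's default values. It is an illustration only: the project's real splitter is not shown in this diff, and the function name `chunk_by_characters` is hypothetical.

```python
from typing import List


def chunk_by_characters(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> List[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


text = "".join(str(i % 10) for i in range(1200))
chunks = chunk_by_characters(text)
print(len(chunks), [len(c) for c in chunks])
```

With the defaults each window advances 400 characters, so the last 100 characters of one chunk reappear at the start of the next.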
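+### Retrieval system
+1.  **Embedding model**: `sentence-transformers/all-MiniLM-L6-v2` converts text to vectors; the model is lightweight and efficient.
+2.  **Vector database**: `ChromaDB` stores and indexes the vectors, persisting the database to the `./chroma_db_private` directory.
+3.  **Hybrid retrieval**: combines `BM25` (keyword-based) with vector retrieval (semantic) and fuses the two result lists with `RRF` (Reciprocal Rank Fusion), balancing keyword matching and semantic similarity.
+4.  **Reranker**: the `BAAI/bge-reranker-base` model re-sorts the hybrid retrieval results so that the most relevant chunks come first, which improves the quality of the final answer.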
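Reciprocal Rank Fusion, named in item 3 above, scores each document as the sum of `1 / (k + rank)` over every ranked list it appears in. The sketch below shows the idea on two toy rankings. The constant `k = 60` is the commonly used default and is an assumption here; the value used in `src/retrievers/hybrid_search.py` is not visible in this diff, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse several ranked lists of document ids into one list, best first."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked highly in any list receive a larger contribution.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


bm25_ranking = ["a", "b", "c"]      # keyword-based order
vector_ranking = ["b", "c", "d"]    # embedding-based order
print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

Because only ranks are used, the BM25 and cosine-similarity scores do not need to be normalized against each other.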
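+## FAQ
+### Q: File processing fails, or a message says the `Learn_RAG` module is unavailable?
+**A:** Check the following:
+1.  **Project location**: make sure the `Learn_RAG` project directory and `Deep_Agentic_AI_Tool` are under the same parent directory.
+2.  **Dependencies**: confirm that all Python dependencies of `Learn_RAG` are installed. The simplest way is to run `uv sync` inside the `Learn_RAG` directory.
+3.  **File format**: confirm that the uploaded file is in a supported format (PDF, DOCX, DOC, TXT) and is not corrupted.
+### Q: The AI cannot generate an answer, or answers are slow?
+**A:** Check the following:
+1.  **LLM service**: to use Ollama, make sure the Ollama service is running locally (run `ollama serve` in a terminal).
+2.  **Model download**: confirm that the model Ollama needs has been downloaded (for example `ollama pull llama3.2:3b`).
+3.  **LLM status**: the system switches from Groq to a local model automatically. If it falls back to MLX, processing is slower. If you do not need a generated answer, untick "使用 LLM 生成回答" (Generate the answer with an LLM) to see the retrieval results directly.
+### Q: What does the "清除" (Clear) button do?
+**A:** It resets the RAG system for the current Gradio session, clearing the uploaded files and the in-memory index so that you can upload and process a new batch of files.
+**Note**: this button does **not** delete the persisted vector database in the `./chroma_db_private` directory on disk. To remove all data, delete that directory manually.
+### Q: File processing is slow?
+**A:**
+1.  On first use the system downloads an embedding model of several hundred MB.
+2.  Semantic chunking performs more computation than character chunking, so it takes longer but gives better results.
+3.  Larger files and more files take longer to process.
+## Dependencies
+- The `Learn_RAG` project and all of its dependencies.
+- `langchain-community`, `sentence-transformers`, `chromadb`, `rank_bm25`, `pypdf`, `docx2txt`, and others.
+- `ollama` (when using a local Ollama LLM).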

README.md CHANGED

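@@ -66,9 +66,29 @@
 # Using uv (recommended)
 uv sync
 # Or using pip
 pip install -e .
 ```
 ### 2. Environment variable configuration
@@ -220,6 +240,7 @@ Deep_Agentic_AI_Tool/
 3. Add the tool in `get_tools_list()`
 4. The agent discovers and uses the new tool automatically
 ### Modifying agent logic
 - **Planning logic**: edit `deep_agent_rag/agents/planner.py`
@@ -231,6 +252,41 @@ Deep_Agentic_AI_Tool/
 Edit `deep_agent_rag/ui/gradio_interface.py` to modify the web interface.
 **For the detailed development guide, see [System Architecture](ARCHITECTURE.md#開發指南)**
 ## 📦 Main Dependencies
@@ -288,14 +344,75 @@ Deep_Agentic_AI_Tool/
 ## 📧 Contact
 [Add contact information]
 ## 🙏 Acknowledgments
 - **LangChain & LangGraph**: excellent agent frameworks
 - **MLX Team**: efficient local model inference
 - **Qwen Team**: the Qwen2.5 model
 - **Jina AI**: embedding model
 ---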

 # Using uv (recommended)
 uv sync
+<<<<<<< HEAD
 # Or using pip
 pip install -e .
 ```
+=======
+3. **Set up environment variables** (create a `.env` file in the root directory):
+   ```env
+   # Optional: Groq API (for faster inference)
+   GROQ_API_KEY=your_groq_api_key_here
+   # Optional: Ollama (for local inference with Llama 3.2 or other models)
+   USE_OLLAMA=true
+   OLLAMA_BASE_URL=http://localhost:11434
+   OLLAMA_MODEL=llama3.2:3b
+   # Optional: Tavily API (for web search)
+   TAVILY_API_KEY=your_tavily_api_key_here
+   # Optional: Gmail API credentials
+   GMAIL_CREDENTIALS_FILE=credentials.json
+   GMAIL_TOKEN_FILE=token.json
+   ```
+>>>>>>> 5beccbe9dfa0ef53e4123976ad54e2f1c28b72f8
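 ### 2. Environment variable configuration
 3. Add the tool in `get_tools_list()`
 4. The agent discovers and uses the new tool automatically
+<<<<<<< HEAD
 ### Modifying agent logic
 - **Planning logic**: edit `deep_agent_rag/agents/planner.py`
 Edit `deep_agent_rag/ui/gradio_interface.py` to modify the web interface.
 **For the detailed development guide, see [System Architecture](ARCHITECTURE.md#開發指南)**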
+=======
+The system supports multiple LLM backends with automatic fallback (priority order):
+1. **Primary**: Groq API (fastest, requires API key)
+   - Model: `llama-3.3-70b-versatile`
+   - Automatically used if `GROQ_API_KEY` is set
+2. **Secondary**: Ollama (local inference, excellent reasoning capabilities)
+   - Default Model: `llama3.2:3b` (Llama 3.2 3B)
+   - Requires Ollama installed and model downloaded
+   - Enable with `USE_OLLAMA=true` in `.env`
+   - Lightweight and efficient, suitable for 16GB memory systems
+   - Automatically used when Groq API is unavailable or quota exhausted
+3. **Fallback**: Local MLX Model (privacy-preserving, no API key needed)
+   - Model: `mlx-community/Qwen2.5-Coder-7B-Instruct-4bit`
+   - Automatically used when both Groq API and Ollama are unavailable
+The system automatically switches between backends based on availability.
+**Setting up Ollama:**
+```bash
+# Install Ollama (if not already installed)
+# macOS: brew install ollama
+# Or download from https://ollama.com
+# Download Llama 3.2 model
+ollama pull llama3.2:3b
+# Start Ollama service (usually runs automatically)
+ollama serve
+```
+## ⚙️ Configuration
+>>>>>>> 5beccbe9dfa0ef53e4123976ad54e2f1c28b72f8
 ## 📦 Main Dependencies
 ## 📧 Contact
+<<<<<<< HEAD
 [Add contact information]
+=======
+- **LangChain**: Agent framework and tool integration
+- **LangGraph**: Agent orchestration and workflow management
+- **MLX/MLX-LM**: Local model inference (Apple Silicon optimized)
+- **LangChain Ollama**: Ollama integration for local models
+- **Gradio**: Web interface
+- **ChromaDB**: Vector database for RAG
+- **Tavily**: Web search API
+- **yfinance**: Stock data retrieval
+- **Google API Client**: Gmail API integration
+>>>>>>> 5beccbe9dfa0ef53e4123976ad54e2f1c28b72f8
 ## 🙏 Acknowledgments
+<<<<<<< HEAD
 - **LangChain & LangGraph**: excellent agent frameworks
 - **MLX Team**: efficient local model inference
 - **Qwen Team**: the Qwen2.5 model
 - **Jina AI**: embedding model
+=======
+### MLX Model Issues
+- **Model not loading**: Ensure you have sufficient disk space and memory
+- **Slow inference**: This is normal for local models. Consider using Groq API for faster results
+### Groq API Issues
+- **Quota exhausted**: The system automatically falls back to Ollama (if enabled) or local MLX model
+- **API errors**: Check your `GROQ_API_KEY` in `.env` file
+### Ollama Issues
+- **Ollama not starting**: Ensure Ollama service is running (`ollama serve`)
+- **Model not found**: Download the model first (`ollama pull llama3.2:3b`)
+- **Connection errors**: Check `OLLAMA_BASE_URL` in `.env` (default: `http://localhost:11434`)
+- **Memory issues**: Llama 3.2:3B requires ~2GB RAM, suitable for systems with 16GB memory
+### RAG System Issues
+- **PDF not found**: Ensure the PDF file exists at the path specified in `config.py`
+- **Embedding model errors**: The system will attempt to re-download the model if cache is corrupted
+### Gmail API Issues
+- **Authorization errors**: Delete `token.json` and re-authorize
+- **Credentials not found**: Ensure `credentials.json` is in the project root
+- See `GMAIL_API_SETUP.md` for detailed setup instructions
+## 📝 License
+[Add your license information here]
+## 🤝 Contributing
+[Add contribution guidelines here]
+## 📧 Contact
+[Add contact information here]
+## 🙏 Acknowledgments
+- **LangChain & LangGraph**: For the excellent agent framework
+- **MLX Team**: For efficient local model inference
+- **Qwen Team**: For the Qwen2.5 model
+- **Jina AI**: For the embedding model
+>>>>>>> 5beccbe9dfa0ef53e4123976ad54e2f1c28b72f8
 ---

deep_agent_rag/config.py CHANGED

@@ -49,6 +49,13 @@ GROQ_MAX_TOKENS = 2048
 GROQ_TEMPERATURE = 0.7
 USE_GROQ_FIRST = True  # Whether to prefer the Groq API
+# Ollama configuration
+OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
+OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.2:3b")  # Llama 3.2 3B
+OLLAMA_MAX_TOKENS = 2048
+OLLAMA_TEMPERATURE = 0.7
+USE_OLLAMA = os.getenv("USE_OLLAMA", "false").lower() == "true"  # Whether to enable Ollama
 # Email configuration - uses the Gmail API
 EMAIL_SENDER = "matthuang46@gmail.com"
 # Gmail API configuration
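`USE_OLLAMA` above is enabled only when the variable, lowercased, is exactly `"true"`, so values such as `1`, `yes`, or `" true "` with stray whitespace leave Ollama disabled. If a more forgiving parse is wanted, a helper along these lines is one option. The name `env_flag` and the set of accepted spellings are assumptions for illustration and are not part of `config.py`.

```python
import os
from typing import Mapping, Optional

# Spellings treated as "enabled"; anything else counts as disabled.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False, env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean flag from the environment, tolerating case and whitespace."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


# Example with an explicit mapping instead of the real process environment.
print(env_flag("USE_OLLAMA", env={"USE_OLLAMA": " True "}))
```

The optional `env` argument exists only so the helper can be exercised without modifying `os.environ`.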

deep_agent_rag/rag/adaptive_rag_selector.py ADDED

	@@ -0,0 +1,292 @@

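+"""
+Adaptive RAG method selector
+Automatically selects the best RAG method based on query and file features
+"""
+from typing import Dict, List, Optional
+from enum import Enum
+import re
+import logging
+logger = logging.getLogger(__name__)
+class RAGMethod(Enum):
+    """Available RAG methods"""
+    BASIC = "basic"  # Basic RAG (the one currently in use)
+    SUBQUERY = "subquery"  # Sub-query decomposition
+    HYDE = "hyde"  # Hypothetical Document Embeddings
+    STEP_BACK = "step_back"  # Step-back reasoning
+    HYBRID_SUBQUERY_HYDE = "hybrid_subquery_hyde"  # Hybrid sub-query + HyDE
+    TRIPLE_HYBRID = "triple_hybrid"  # Triple hybrid
+class QueryComplexity(Enum):
+    """Query complexity"""
+    SIMPLE = "simple"  # Simple query (single question, short sentence)
+    MODERATE = "moderate"  # Moderate complexity (several concepts)
+    COMPLEX = "complex"  # Complex query (multi-part question that needs decomposition)
+    VERY_COMPLEX = "very_complex"  # Very complex (several related questions)
+class QueryType(Enum):
+    """Query type"""
+    FACTUAL = "factual"  # Factual query ("what is X")
+    CONCEPTUAL = "conceptual"  # Conceptual query ("how to understand X")
+    COMPARATIVE = "comparative"  # Comparative query ("difference between X and Y")
+    PRINCIPLE = "principle"  # Principle query ("how X works")
+    MULTI_ASPECT = "multi_aspect"  # Multi-aspect query (contains several questions)
+class AdaptiveRAGSelector:
+    """
+    Adaptive RAG method selector
+    Selects the best RAG method based on:
+    1. Query complexity
+    2. Query type
+    3. Number and type of files
+    4. Document complexity
+    """
+    def __init__(self):
+        """Initialize the selector"""
+        pass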
+    def analyze_query(self, query: str) -> Dict:
+        """
+        Analyze query features
+        Args:
+            query: The user's query
+        Returns:
+            Dictionary of query features
+        """
+        query_lower = query.lower()
+        query_len = len(query)
+        # Whitespace-based count: text without spaces (e.g. Chinese) counts as one word
+        word_count = len(query.split())
+        # Detect query complexity
+        complexity = self._detect_complexity(query, word_count)
+        # Detect query type
+        query_type = self._detect_query_type(query, query_lower)
+        # Detect whether the query contains several questions
+        question_count = query.count('?') + query.count('？')
+        has_multiple_questions = question_count > 1
+        # Detect comparison keywords
+        # Note: this is substring matching, so 'and' also matches inside words such as 'understand'
+        comparison_keywords = ['vs', 'versus', 'difference', '区别', '比较', 'compare', '对比', '和', 'and', '与']
+        is_comparative = any(kw in query_lower for kw in comparison_keywords)
+        # Detect technical terms
+        technical_indicators = [
+            'mechanism', 'algorithm', 'architecture', 'model', 'system',
+            '原理', '机制', '算法', '架构', '模型', '系统', '方法', 'method',
+            '如何工作', 'how does', 'how do', 'work', 'function'
+        ]
+        has_technical_terms = any(ind in query_lower for ind in technical_indicators)
+        # Detect words such as "why" and "how" that call for an explanation
+        explanation_keywords = ['为什么', 'why', '如何', 'how', 'explain', '解释', '说明']
+        needs_explanation = any(kw in query_lower for kw in explanation_keywords)
+        return {
+            'complexity': complexity,
+            'type': query_type,
+            'word_count': word_count,
+            'length': query_len,
+            'has_multiple_questions': has_multiple_questions,
+            'is_comparative': is_comparative,
+            'has_technical_terms': has_technical_terms,
+            'needs_explanation': needs_explanation,
+            'question_count': question_count
+        }
+    def _detect_complexity(self, query: str, word_count: int) -> QueryComplexity:
+        """Detect query complexity"""
+        # Simple query: short sentence, single question
+        if word_count <= 10 and query.count('?') + query.count('？') <= 1:
+            return QueryComplexity.SIMPLE
+        # Moderate complexity: medium length, may contain several concepts
+        if word_count <= 25:
+            return QueryComplexity.MODERATE
+        # Complex query: long sentence, several questions or concepts
+        if word_count <= 50:
+            return QueryComplexity.COMPLEX
+        # Very complex: very long, several questions
+        return QueryComplexity.VERY_COMPLEX
+    def _detect_query_type(self, query: str, query_lower: str) -> QueryType:
+        """Detect query type"""
+        # Comparative query
+        if any(kw in query_lower for kw in ['vs', 'versus', 'difference', '区别', '比较', 'compare', '对比', '和', 'and', '与']):
+            return QueryType.COMPARATIVE
+        # Principle query
+        if any(kw in query_lower for kw in ['原理', 'principle', 'how does', 'how do', 'mechanism', '如何工作', '工作原理']):
+            return QueryType.PRINCIPLE
+        # Conceptual query
+        if any(kw in query_lower for kw in ['什么是', 'what is', '理解', 'understand', 'explain', '解释']):
+            return QueryType.CONCEPTUAL
+        # Multi-aspect query
+        if query.count('?') + query.count('？') > 1:
+            return QueryType.MULTI_ASPECT
+        # Default: factual query
+        return QueryType.FACTUAL
+    def analyze_files(self, file_paths: List[str], documents: Optional[List[Dict]] = None) -> Dict:
+        """
+        Analyze file features
+        Args:
+            file_paths: List of file paths
+            documents: List of documents (optional, if already processed)
+        Returns:
+            Dictionary of file features
+        """
+        file_count = len(file_paths)
+        # Detect file types
+        file_types = []
+        for path in file_paths:
+            if path.endswith('.pdf'):
+                file_types.append('pdf')
+            elif path.endswith(('.docx', '.doc')):
+                file_types.append('docx')
+            else:
+                file_types.append('txt')
+        # Analyze document complexity (when documents are provided)
+        total_chunks = len(documents) if documents else 0
+        avg_chunk_size = 0
+        if documents:
+            chunk_sizes = [len(doc.get('content', '')) for doc in documents]
+            avg_chunk_size = sum(chunk_sizes) / len(chunk_sizes) if chunk_sizes else 0
+        # Treat the input as academic if any filename contains 'paper' or 'arxiv',
+        # or if any file is a PDF (filename heuristic only; content is not inspected)
+        is_academic = any('paper' in path.lower() or 'arxiv' in path.lower() or
+                         path.endswith('.pdf') for path in file_paths)
+        return {
+            'file_count': file_count,
+            'file_types': file_types,
+            'total_chunks': total_chunks,
+            'avg_chunk_size': avg_chunk_size,
+            'is_academic': is_academic,
+            'is_single_file': file_count == 1,
+            'is_multi_file': file_count > 1
+        }
+    def select_best_method(
+        self,
+        query_features: Dict,
+        file_features: Dict,
+        enable_advanced: bool = True
+    ) -> RAGMethod:
+        """
+        Select the best RAG method from the features
+        Args:
+            query_features: Query features (from analyze_query)
+            file_features: File features (from analyze_files)
+            enable_advanced: Whether to enable advanced methods (if False, only the basic method is used)
+        Returns:
+            The selected RAG method
+        """
+        if not enable_advanced:
+            return RAGMethod.BASIC
+        complexity = query_features['complexity']
+        query_type = query_features['type']
+        has_multiple_questions = query_features['has_multiple_questions']
+        is_comparative = query_features['is_comparative']
+        has_technical_terms = query_features['has_technical_terms']
+        needs_explanation = query_features['needs_explanation']
+        file_count = file_features['file_count']
+        is_multi_file = file_features['is_multi_file']
+        # Decision tree
+        # 1. Very complex query + multiple files -> Triple Hybrid (strongest)
+        if complexity == QueryComplexity.VERY_COMPLEX and is_multi_file:
+            return RAGMethod.TRIPLE_HYBRID
+        # 2. Complex query + multiple questions -> SubQuery or Hybrid Subquery+HyDE
+        if complexity in [QueryComplexity.COMPLEX, QueryComplexity.VERY_COMPLEX]:
+            if has_multiple_questions or query_type == QueryType.MULTI_ASPECT:
+                if is_multi_file:
+                    return RAGMethod.HYBRID_SUBQUERY_HYDE
+                else:
+                    return RAGMethod.SUBQUERY
+        # 3. Principle query -> Step-back (needs background knowledge)
+        if query_type == QueryType.PRINCIPLE:
+            if complexity in [QueryComplexity.MODERATE, QueryComplexity.COMPLEX]:
+                return RAGMethod.STEP_BACK
+        # 4. Query with technical terms -> HyDE (generates a hypothetical document)
+        if has_technical_terms and complexity in [QueryComplexity.MODERATE, QueryComplexity.COMPLEX]:
+            return RAGMethod.HYDE
+        # 5. Comparative query + multiple files -> SubQuery (retrieve each part separately)
+        if is_comparative and is_multi_file:
+            return RAGMethod.SUBQUERY
+        # 6. Moderate complexity + multiple files -> Hybrid Subquery+HyDE
+        if complexity == QueryComplexity.MODERATE and is_multi_file:
+            return RAGMethod.HYBRID_SUBQUERY_HYDE
+        # 7. Simple query -> basic RAG or HyDE
+        if complexity == QueryComplexity.SIMPLE:
+            if has_technical_terms:
+                return RAGMethod.HYDE
+            else:
+                return RAGMethod.BASIC
+        # 8. Query that needs an explanation -> Step-back (provides background knowledge)
+        if needs_explanation and complexity == QueryComplexity.MODERATE:
+            return RAGMethod.STEP_BACK
+        # 9. Default: moderate complexity uses Step-back
+        if complexity == QueryComplexity.MODERATE:
+            return RAGMethod.STEP_BACK
+        # 10. Default: complex queries use SubQuery
+        return RAGMethod.SUBQUERY
+    def get_method_reason(self, method: RAGMethod, query_features: Dict, file_features: Dict) -> str:
+        """
+        Get the reason for selecting a method
+        Args:
+            method: The selected RAG method
+            query_features: Query features
+            file_features: File features
+        Returns:
+            A string explaining the selection
+        """
+        complexity = query_features['complexity'].value
+        query_type = query_features['type'].value
+        file_count = file_features['file_count']
+        reasons = {
+            RAGMethod.BASIC: f"简单查询（{complexity}），使用基础 RAG 方法即可",
+            RAGMethod.SUBQUERY: f"查询包含多个方面（{query_features['question_count']}个问题，{complexity}），使用子查询分解以全面检索",
+            RAGMethod.HYDE: f"查询包含专业术语（{complexity}），使用假设文档嵌入以改善语义检索",
+            RAGMethod.STEP_BACK: f"原理性查询（{query_type}，{complexity}），使用后退推理获取背景知识和原理",
+            RAGMethod.HYBRID_SUBQUERY_HYDE: f"复杂查询（{complexity}）+ {file_count}个文件，使用混合子查询+HyDE方法以全面检索",
+            RAGMethod.TRIPLE_HYBRID: f"非常复杂的查询（{complexity}）+ {file_count}个文件，使用三重混合方法（SubQuery+HyDE+Step-back）以获得最佳效果"
+        }
+        return reasons.get(method, f"使用 {method.value} 方法")
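The selector above detects keywords with plain substring tests (`kw in query_lower`), so short English keywords also match inside longer words: `'and'` matches "understand" and `'how'` matches "show". The following sketch shows one stricter alternative that matches ASCII keywords as whole words while keeping substring matching for Chinese keywords. The helper name `contains_keyword` is hypothetical, keywords are assumed to be lowercase, and this is not the repository's implementation.

```python
import re
from typing import Iterable


def contains_keyword(query: str, keywords: Iterable[str]) -> bool:
    """Return True if any keyword occurs in the query.

    ASCII keywords must appear as whole words; non-ASCII keywords (for
    example Chinese) are matched as substrings, since such text has no spaces.
    """
    lowered = query.lower()
    for keyword in keywords:
        if keyword.isascii():
            # Explicit ASCII lookarounds are used instead of \b so that a keyword
            # directly adjacent to CJK characters still counts as a whole word.
            pattern = r"(?<![a-z0-9])" + re.escape(keyword) + r"(?![a-z0-9])"
            if re.search(pattern, lowered):
                return True
        elif keyword in lowered:
            return True
    return False


print(contains_keyword("Please help me understand transformers", ["and"]))
print(contains_keyword("BM25 and vector search", ["and"]))
```

Very common Chinese connectives such as `'和'` would still match most queries under this scheme, so whether they belong in the comparison list is a separate design question.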

deep_agent_rag/rag/llm_adapter.py ADDED

	@@ -0,0 +1,151 @@

+"""
+LLM adapter: wraps a LangChain ChatModel behind the OllamaLLM interface
+Provides compatibility with the advanced RAG methods in the Learn_RAG project
+"""
+from typing import Optional
+from langchain_core.messages import HumanMessage
+from langchain_core.language_models.chat_models import BaseChatModel
+import logging
+logger = logging.getLogger(__name__)
+class LangChainLLMAdapter:
+    """
+    Adapts a LangChain ChatModel to the OllamaLLM interface
+    This adapter lets the advanced RAG methods in the Learn_RAG project (which expect an OllamaLLM)
+    use the unified LLM system of Deep_Agentic_AI_Tool (Groq -> Ollama -> MLX)
+    """
+    def __init__(self, langchain_llm: BaseChatModel):
+        """
+        Initialize the adapter
+        Args:
+            langchain_llm: LangChain ChatModel instance (from get_llm())
+        """
+        self.llm = langchain_llm
+        self.model_name = self._detect_model_name()
+        self.base_url = "http://localhost:11434"  # Default value; not actually used
+        self.timeout = 120  # Default value; not actually used
+        logger.info(f"✅ LLM 适配器初始化完成 (模型类型: {self.model_name})")
+    def _detect_model_name(self) -> str:
+        """
+        Detect the LLM type and model name
+        Returns:
+            Model name string
+        """
+        llm_type = type(self.llm).__name__
+        # Detect Groq
+        if "Groq" in llm_type or "ChatGroq" in llm_type:
+            model_name = getattr(self.llm, 'model_name', 'groq-unknown')
+            return f"groq:{model_name}"
+        # Detect Ollama
+        if "Ollama" in llm_type or "ChatOllama" in llm_type:
+            model_name = getattr(self.llm, 'model', 'ollama-unknown')
+            return f"ollama:{model_name}"
+        # Detect MLX
+        if "MLX" in llm_type or "MLXChatModel" in llm_type:
+            return "mlx:qwen2.5"
+        # Default
+        return f"langchain:{llm_type}"
+    def _check_ollama_connection(self) -> bool:
+        """
+        Check whether the Ollama service is available (compatibility method; not actually used)
+        Returns:
+            Always True (because the unified LLM system is used)
+        """
+        return True
+    def _check_model_available(self) -> bool:
+        """
+        Check whether the model is available (compatibility method; not actually used)
+        Returns:
+            Always True (because the unified LLM system is used)
+        """
+        return True
+    def generate(
+        self,
+        prompt: str,
+        temperature: float = 0.7,
+        max_tokens: Optional[int] = None,
+        stream: bool = False
+    ) -> str:
+        """
+        Generate an answer (compatible with the OllamaLLM.generate interface)
+        Args:
+            prompt: Input prompt
+            temperature: Sampling temperature (0.0-1.0); controls randomness
+            max_tokens: Maximum number of tokens to generate (None uses the model default)
+            stream: Whether to stream output (not supported; the full result is always returned)
+        Returns:
+            The generated answer string
+        """
+        try:
+            # Convert the prompt to the LangChain message format
+            messages = [HumanMessage(content=prompt)]
+            # Extra keyword arguments for invoke() (currently none)
+            invoke_kwargs = {}
+            # Temporarily override the sampling parameters the model exposes,
+            # remembering the originals so they can be restored afterwards
+            overrides = {'temperature': temperature}
+            if max_tokens:
+                overrides['max_tokens'] = max_tokens
+            originals = {}
+            for name, value in overrides.items():
+                if not hasattr(self.llm, name):
+                    continue
+                original = getattr(self.llm, name, None)
+                try:
+                    setattr(self.llm, name, value)
+                    originals[name] = original
+                except Exception:
+                    pass  # The attribute cannot be set; ignore it
+            try:
+                # Call the LangChain LLM
+                response = self.llm.invoke(messages, **invoke_kwargs)
+            finally:
+                # Restore the original parameters, even if invoke() raised
+                for name, original in originals.items():
+                    try:
+                        setattr(self.llm, name, original)
+                    except Exception:
+                        pass
+            # Extract the answer text
+            if hasattr(response, 'content'):
+                answer = response.content
+            elif isinstance(response, str):
+                answer = response
+            else:
+                answer = str(response)
+            return answer.strip()
+        except Exception as e:
+            logger.error(f"⚠️ LLM 生成回答时出错: {e}")
+            raise RuntimeError(f"LLM 生成失败: {e}") from e
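`generate()` above temporarily overrides `temperature` and `max_tokens` on the shared model object for the duration of one call. One way to package that save, override, and restore pattern so that restoration is guaranteed even when the call raises is a small context manager. The sketch below is an illustration only: `temporary_attrs` and `FakeLLM` are hypothetical names, and `FakeLLM` stands in for a real chat model rather than being a LangChain class.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def temporary_attrs(obj: Any, **overrides: Any) -> Iterator[None]:
    """Temporarily set existing attributes on obj and restore them on exit."""
    originals = {}
    try:
        for name, value in overrides.items():
            # Skip unset overrides and attributes the object does not expose.
            if value is None or not hasattr(obj, name):
                continue
            previous = getattr(obj, name)
            setattr(obj, name, value)
            originals[name] = previous
        yield
    finally:
        # Runs on normal exit and when the body raises.
        for name, previous in originals.items():
            setattr(obj, name, previous)


class FakeLLM:
    """Stand-in for a chat model; not a LangChain class."""
    temperature = 0.7

    def invoke(self, prompt: str) -> str:
        return f"T={self.temperature}: {prompt}"


llm = FakeLLM()
with temporary_attrs(llm, temperature=0.0, max_tokens=128):
    print(llm.invoke("hi"))
print(llm.temperature)
```

Mutating a shared model object is still not safe when several requests run concurrently. Where the model supports it, passing per-call parameters avoids the mutation entirely.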

deep_agent_rag/rag/private_file_rag.py ADDED

The diff for this file is too large to render. See raw diff

deep_agent_rag/ui/calendar_interface.py ADDED

	@@ -0,0 +1,653 @@

+# deep_agent_rag/ui/calendar_interface.py
+import gradio as gr
+from datetime import datetime, timedelta
+import re
+import json
+import time
+from ..agents.calendar_agent import generate_calendar_draft, create_calendar_draft
+# Assuming is_using_local_llm might be used for warnings/status, similar to email_interface
+# from ..utils.llm_utils import is_using_local_llm
+# Agent log path for debugging (if needed)
+log_path = "/Users/matthuang/Desktop/Deep_Agentic_AI_Tool/.cursor/debug.log"
+def _create_calendar_interface():
+    """Create the Calendar Tool interface"""
+    gr.Markdown(
+        """
+        ### 📅 智能行事曆管理助手
+        使用 AI 根據您的完整提示自動生成行事曆事件草稿，您可以在創建前檢查和修改。
+        **使用方式：**
+        1. **快速選擇**：點擊下方常見事件按鈕，自動生成草稿
+        2. **自定義輸入**：在下方輸入完整的事件提示，包含：事件、日期、時間、地點、參與者
+        3. 查看 AI 反思評估結果和改進建議（如有）
+        4. 如果有缺失的資訊（如時間），系統會顯示下拉選單讓您選擇
+        5. 檢查並修改生成的事件內容
+        6. 確認無誤後點擊「創建事件」按鈕
+        **✨ 新功能：AI 迭代反思評估 + Google Maps 地點驗證**
+        - 系統會自動進行多輪反思評估（最多 3 輪）
+        - 自動驗證並標準化地址，計算交通時間
+        - 每輪評估後，如果有改進建議，會自動生成改進版本
+        - 改進後的版本會再次評估，直到 AI 認為滿意為止
+        """
+    )
+    # Quick-select button area
+    gr.Markdown("### 🚀 快速選擇常見事件")
+    with gr.Row():
+        quick_meeting_btn = gr.Button("📋 團隊會議", variant="secondary", scale=1)
+        quick_client_btn = gr.Button("🤝 客戶拜訪", variant="secondary", scale=1)
+        quick_lunch_btn = gr.Button("🍽️ 午餐會議", variant="secondary", scale=1)
+        quick_oneonone_btn = gr.Button("💬 一對一會議", variant="secondary", scale=1)
+    with gr.Row():
+        quick_project_btn = gr.Button("📊 項目討論", variant="secondary", scale=1)
+        quick_training_btn = gr.Button("🎓 培訓/學習", variant="secondary", scale=1)
+        quick_social_btn = gr.Button("🎉 社交活動", variant="secondary", scale=1)
+        quick_custom_btn = gr.Button("✏️ 自定義輸入", variant="secondary", scale=1)
+    with gr.Row():
+        with gr.Column(scale=1):
+            # Single prompt input
+            calendar_prompt_input = gr.Textbox(
+                label="📝 事件提示（包含事件、日期、時間、地點、參與者）",
+                placeholder="例如：明天下午2點團隊會議，討論項目進度，地點在會議室A，參與者包括john@example.com和mary@example.com",
+                lines=5,
+                value=""
+            )
+            # Buttons
+            with gr.Row():
+                generate_draft_btn = gr.Button("📝 生成事件草稿", variant="primary", scale=1)
+                clear_calendar_btn = gr.Button("🗑️ 清除", variant="secondary", scale=1)
+            # Status display
+            calendar_status_display = gr.Textbox(
+                label="📊 狀態",
+                value="等待操作...",
+                interactive=False,
+                lines=2
+            )
+            # Reflection result display
+            calendar_reflection_display = gr.Textbox(
+                label="🔍 AI 反思評估",
+                value="等待生成事件...",
+                interactive=False,
+                lines=8,
+                visible=True
+            )
+            # Area for filling in missing information (shown dynamically)
+            missing_info_group = gr.Group(visible=False)
+            with missing_info_group:
+                gr.Markdown("**⚠️ 請補充以下缺失的資訊：**")
+                # Date selection (if missing)
+                missing_date_display = gr.Dropdown(
+                    label="📆 選擇日期",
+                    choices=[],
+                    visible=False,
+                    interactive=True
+                )
+                # Time selection (if missing)
+                missing_time_display = gr.Dropdown(
+                    label="🕐 選擇時間",
+                    choices=[],
+                    visible=False,
+                    interactive=True
+                )
+                fill_missing_btn = gr.Button("✅ 確認補充資訊", variant="primary", visible=False)
+            # Hidden state variable used to store event_dict
+            event_dict_storage = gr.State(value={})
+        with gr.Column(scale=1):
+            # Event detail display and editing area
+            event_summary_display = gr.Textbox(
+                label="📌 事件標題",
+                placeholder="事件標題將在這裡顯示",
+                lines=1,
+                interactive=True
+            )
+            event_start_display = gr.Textbox(
+                label="🕐 開始時間",
+                placeholder="開始時間將在這裡顯示（格式: YYYY-MM-DDTHH:MM:SS+08:00）",
+                lines=1,
+                interactive=True
+            )
+            event_end_display = gr.Textbox(
+                label="🕐 結束時間",
+                placeholder="結束時間將在這裡顯示（格式: YYYY-MM-DDTHH:MM:SS+08:00）",
+                lines=1,
+                interactive=True
+            )
+            event_description_display = gr.Textbox(
+                label="📄 事件描述（可編輯）",
+                placeholder="事件描述將在這裡顯示，您可以編輯",
+                lines=6,
+                interactive=True
+            )
+            event_location_display = gr.Textbox(
+                label="📍 地點（可編輯，已自動驗證並標準化）",
+                placeholder="事件地點將在這裡顯示，您可以編輯",
+                lines=2,
+                interactive=True
+            )
+            event_attendees_display = gr.Textbox(
+                label="👥 參與者郵箱（可編輯，多個用逗號分隔）",
+                placeholder="參與者郵箱將在這裡顯示，您可以編輯",
+                lines=1,
+                interactive=True
+            )
+            # 創建按鈕
+            create_event_btn = gr.Button("✅ 創建事件", variant="primary", scale=1)
+            # 操作結果顯示
+            calendar_result_display = gr.Textbox(
+                label="📊 操作結果",
+                lines=8,
+                interactive=False
+            )
+    # 生成時間選項（每30分鐘一個選項）
+    def generate_time_options():
+        """生成時間選項列表"""
+        times = []
+        for hour in range(24):
+            for minute in [0, 30]:
+                time_str = f"{hour:02d}:{minute:02d}"
+                times.append(time_str)
+        return times
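+    # Date options: today, tomorrow, the day after, then the next four days (seven days in total)
+    def generate_date_options():
+        """Return the list of date options for the dropdown."""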
+        dates = []
+        today = datetime.now()
+        date_names = ["今天", "明天", "後天"]
+        for i in range(3):
+            date_obj = today + timedelta(days=i)
+            date_str = date_obj.strftime('%Y-%m-%d')
+            dates.append(f"{date_names[i]} ({date_str})")
+        for i in range(3, 7):
+            date_obj = today + timedelta(days=i)
+            date_str = date_obj.strftime('%Y-%m-%d')
+            dates.append(date_str)
+        return dates
+    # Template generator for the quick-select event buttons
+    def generate_quick_prompt(event_type: str) -> str:
+        """Return the preset prompt for an event type ("" for custom or unknown types)."""
+        # Plain strings: the templates contain no placeholders, so no f-prefix is needed.
+        templates = {
+            "meeting": "明天下午2點團隊會議，討論項目進度和下週計劃，地點在會議室，參與者包括團隊成員",
+            "client": "明天上午10點客戶拜訪，討論合作方案和需求，地點在客戶公司或會議室",
+            "lunch": "明天中午12點午餐會議，與合作夥伴討論業務合作，地點在附近的餐廳",
+            "oneonone": "明天下午3點一對一會議，討論工作進展和職業發展，地點在會議室或咖啡廳",
+            "project": "明天上午9點項目討論會議，審查項目進度和解決問題，地點在項目室，參與者包括項目團隊",
+            "training": "明天下午2點培訓課程，學習新技能和最佳實踐，地點在培訓室或線上",
+            "social": "明天晚上6點團隊聚餐，慶祝項目完成，地點在餐廳，參與者包括團隊成員",
+            "custom": ""  # custom: return an empty prompt so the user types their own
+        }
+        return templates.get(event_type, "")
+    # 快速選擇按鈕處理函數（自動生成草稿）
+    def quick_select_and_generate(event_type: str):
+        """快速選擇事件類型並自動生成草稿"""
+        prompt = generate_quick_prompt(event_type)
+        if not prompt:
+            # 如果是自定義，只返回空提示，不自動生成
+            return (
+                prompt,  # calendar_prompt_input
+                "請在下方輸入框中輸入事件提示，然後點擊「生成事件草稿」",  # calendar_status_display
+                "等待輸入...",  # calendar_reflection_display
+                gr.update(visible=False),  # missing_info_group
+                gr.update(visible=False, choices=[]),  # missing_date_display
+                gr.update(visible=False, choices=[]),  # missing_time_display
+                gr.update(visible=False),  # fill_missing_btn
+                "", "", "", "", "", "",  # event fields
+                {},
+                ""  # calendar_result_display
+            )
+        # 自動生成草稿（調用 generate_draft 並返回所有輸出）
+        draft_result = generate_draft(prompt)
+        # generate_draft 返回的格式是：(status, reflection_display, missing_info_group, ...)
+        # 但我們需要返回 (prompt, status, reflection_display, ...)
+        # draft_result 是一個元組，我們需要將 prompt 添加到開頭
+        return (prompt,) + draft_result
+    def quick_select_meeting():
+        """快速選擇：團隊會議"""
+        return quick_select_and_generate("meeting")
+    def quick_select_client():
+        """快速選擇：客戶拜訪"""
+        return quick_select_and_generate("client")
+    def quick_select_lunch():
+        """快速選擇：午餐會議"""
+        return quick_select_and_generate("lunch")
+    def quick_select_oneonone():
+        """快速選擇：一對一會議"""
+        return quick_select_and_generate("oneonone")
+    def quick_select_project():
+        """快速選擇：項目討論"""
+        return quick_select_and_generate("project")
+    def quick_select_training():
+        """快速選擇：培訓/學習"""
+        return quick_select_and_generate("training")
+    def quick_select_social():
+        """快速選擇：社交活動"""
+        return quick_select_and_generate("social")
+    def quick_select_custom():
+        """快速選擇：自定義輸入（只清空，不自動生成）"""
+        return (
+            "",  # calendar_prompt_input
+            "請在下方輸入框中輸入事件提示，然後點擊「生成事件草稿」",  # calendar_status_display
+            "等待輸入...",  # calendar_reflection_display
+            gr.update(visible=False),  # missing_info_group
+            gr.update(visible=False, choices=[]),  # missing_date_display
+            gr.update(visible=False, choices=[]),  # missing_time_display
+            gr.update(visible=False),  # fill_missing_btn
+            "", "", "", "", "", "",  # event fields
+            {},
+            ""  # calendar_result_display
+        )
+    # 事件處理函數
+    def generate_draft(prompt):
+        """生成行事曆事件草稿（包含反思功能）"""
+        if not prompt or not prompt.strip():
+            return (
+                "❌ 請輸入事件提示",
+                "❌ 請輸入事件提示",
+                gr.update(visible=False),
+                gr.update(visible=False, choices=[]),
+                gr.update(visible=False, choices=[]),
+                gr.update(visible=False),
+                "", "", "", "", "", "", {},
+                "❌ 請輸入事件提示"
+            )
+        try:
+            status_msg = "🔄 正在生成事件草稿..."
+            # 生成事件草稿（包含反思功能）
+            event_dict, status, missing_info, reflection_result, was_improved = generate_calendar_draft(
+                prompt.strip(), enable_reflection=True
+            )
+            if not event_dict:
+                # Generation failed: return all 14 outputs, including the reflection display
+                return (
+                    status,
+                    "❌ 生成失敗，無法進行反思評估",
+                    gr.update(visible=False),
+                    gr.update(visible=False, choices=[]),
+                    gr.update(visible=False, choices=[]),
+                    gr.update(visible=False),
+                    "", "", "", "", "", "", {},
+                    status
+                )
+            # 格式化反思結果顯示
+            if reflection_result:
+                # Count reflection rounds; str.count already returns 0 when the marker is absent
+                reflection_count = reflection_result.count("【第")
+                if was_improved:
+                    if reflection_count > 1:
+                        reflection_display = (
+                            f"🔍 **AI 迭代反思評估結果**（共 {reflection_count} 輪）\n\n"
+                            f"{reflection_result}\n\n"
+                            f"✨ **已自動應用改進建議，經過 {reflection_count} 輪優化，當前顯示的是最終優化版本**"
+                        )
+                    else:
+                        reflection_display = (
+                            f"🔍 **AI 反思評估結果**\n\n"
+                            f"{reflection_result}\n\n"
+                            f"✨ **已自動應用改進建議，當前顯示的是優化後的版本**"
+                        )
+                else:
+                    reflection_display = (
+                        f"🔍 **AI 反思評估結果**\n\n"
+                        f"{reflection_result}\n\n"
+                        f"✅ **事件質量良好，無需改進**"
+                    )
+            else:
+                reflection_display = "⚠️ 反思功能未返回結果"
+            # 【Google Maps 整合】添加地點建議訊息
+            location_suggestion = event_dict.get("location_suggestion", "")
+            if location_suggestion:
+                # 將地點建議添加到狀態訊息中
+                if status:
+                    status = f"{status}\n\n🗺️ **地點資訊：**\n{location_suggestion}"
+                else:
+                    status = f"🗺️ **地點資訊：**\n{location_suggestion}"
+            # 檢查是否有缺失資訊
+            has_missing = bool(missing_info)
+            if has_missing:
+                # 顯示缺失資訊區域
+                date_visible = missing_info.get("date", False)
+                time_visible = missing_info.get("time", False)
+                date_choices = generate_date_options() if date_visible else []
+                time_choices = generate_time_options() if time_visible else []
+                return (
+                    status,
+                    reflection_display,
+                    gr.update(visible=True),  # 顯示缺失資訊區域
+                    gr.update(visible=date_visible, choices=date_choices, value=date_choices[0] if date_choices else None),
+                    gr.update(visible=time_visible, choices=time_choices, value=time_choices[0] if time_choices else None),
+                    gr.update(visible=True),  # 顯示確認按鈕
+                    event_dict.get("summary", ""),
+                    event_dict.get("start_datetime", ""),
+                    event_dict.get("end_datetime", ""),
+                    event_dict.get("description", ""),
+                    event_dict.get("location", ""),
+                    event_dict.get("attendees", ""),
+                    event_dict,  # 傳遞完整的事件字典以便後續使用
+                    ""
+                )
+            else:
+                # 沒有缺失資訊，直接顯示結果
+                return (
+                    status,
+                    reflection_display,
+                    gr.update(visible=False),
+                    gr.update(visible=False, choices=[]),
+                    gr.update(visible=False, choices=[]),
+                    gr.update(visible=False),
+                    event_dict.get("summary", ""),
+                    event_dict.get("start_datetime", ""),
+                    event_dict.get("end_datetime", ""),
+                    event_dict.get("description", ""),
+                    event_dict.get("location", ""),
+                    event_dict.get("attendees", ""),
+                    event_dict,
+                    ""
+                )
+        except Exception as e:
+            error_msg = f"❌ 發生錯誤：{str(e)}"
+            print(f"Calendar Tool 錯誤：{e}")
+            import traceback
+            traceback.print_exc()
+            return (
+                "❌ 發生錯誤",
+                f"❌ 發生錯誤：{str(e)}",
+                gr.update(visible=False),
+                gr.update(visible=False, choices=[]),
+                gr.update(visible=False, choices=[]),
+                gr.update(visible=False),
+                "", "", "", "", "", "", {},
+                error_msg
+            )
+    def fill_missing_info(event_dict_storage, selected_date, selected_time):
+        """填充缺失的資訊"""
+        if not event_dict_storage:
+            return (
+                "❌ 沒有事件資料",
+                gr.update(visible=False),
+                gr.update(visible=False, choices=[]),
+                gr.update(visible=False, choices=[]),
+                gr.update(visible=False),
+                "", "", "", "", "", "",
+                {}
+            )
+        # 更新日期和時間
+        if selected_date:
+            # 從選項中提取日期字串（例如："明天 (2026-01-25)" -> "2026-01-25"）
+            if "(" in selected_date:
+                date_str = selected_date.split("(")[1].split(")")[0]
+            else:
+                date_str = selected_date
+        else:
+            date_str = event_dict_storage.get("date", "今天")
+        if selected_time:
+            time_str = selected_time
+        else:
+            time_str = "09:00"  # 預設時間
+        # Re-parse the date and time.
+        # Imported here to avoid a circular dependency or an unnecessary global import.
+        from ..agents.calendar_agent import parse_datetime
+        start_datetime, end_datetime = parse_datetime(date_str, time_str)
+        # 更新事件字典
+        event_dict_storage["start_datetime"] = start_datetime
+        event_dict_storage["end_datetime"] = end_datetime
+        return (
+            "✅ 資訊已補充，請檢查並創建事件",
+            gr.update(visible=False),  # 隱藏缺失資訊區域
+            gr.update(visible=False, choices=[]),
+            gr.update(visible=False, choices=[]),
+            gr.update(visible=False),
+            event_dict_storage.get("summary", ""),
+            start_datetime,
+            end_datetime,
+            event_dict_storage.get("description", ""),
+            event_dict_storage.get("location", ""),
+            event_dict_storage.get("attendees", ""),
+            event_dict_storage
+        )
+    def create_event(summary, start_datetime, end_datetime, description, location, attendees):
+        """創建行事曆事件"""
+        if not summary or not summary.strip():
+            return "❌ 請輸入事件標題", "❌ 請輸入事件標題"
+        if not start_datetime or not start_datetime.strip():
+            return "❌ 請輸入開始時間", "❌ 請輸入開始時間"
+        if not end_datetime or not end_datetime.strip():
+            return "❌ 請輸入結束時間", "❌ 請輸入結束時間"
+        try:
+            status_msg = "🔄 正在創建事件..."
+            # 構建事件字典
+            event_dict = {
+                "summary": summary.strip(),
+                "start_datetime": start_datetime.strip(),
+                "end_datetime": end_datetime.strip(),
+                "description": description.strip() if description else "",
+                "location": location.strip() if location else "",
+                "attendees": attendees.strip() if attendees else "",
+                "timezone": "Asia/Taipei"
+            }
+            # 創建事件
+            result = create_calendar_draft(event_dict)
+            return "✅ 事件已創建", result
+        except Exception as e:
+            error_msg = f"❌ 創建事件時發生錯誤：{str(e)}"
+            print(f"Calendar Tool 錯誤：{e}")
+            import traceback
+            traceback.print_exc()
+            return "❌ 發生錯誤", error_msg
+    def clear_calendar():
+        """清除行事曆相關輸入和輸出"""
+        return (
+            "",  # prompt
+            "等待操作...",  # status
+            "等待生成事件...",  # reflection_display
+            gr.update(visible=False),  # missing_info_group
+            gr.update(visible=False, choices=[]),  # missing_date
+            gr.update(visible=False, choices=[]),  # missing_time
+            gr.update(visible=False),  # fill_missing_btn
+            "", "", "", "", "", "",  # event fields
+            {},
+            ""  # result
+        )
+    # 綁定事件
+    generate_draft_btn.click(
+        fn=generate_draft,
+        inputs=[calendar_prompt_input],
+        outputs=[
+            calendar_status_display,
+            calendar_reflection_display,
+            missing_info_group,
+            missing_date_display,
+            missing_time_display,
+            fill_missing_btn,
+            event_summary_display,
+            event_start_display,
+            event_end_display,
+            event_description_display,
+            event_location_display,
+            event_attendees_display,
+            event_dict_storage,
+            calendar_result_display
+        ]
+    )
+    # 綁定快速選擇按鈕（自動填充提示並生成草稿）
+    quick_outputs = [
+        calendar_prompt_input,  # 更新提示輸入框
+        calendar_status_display,
+        calendar_reflection_display,
+        missing_info_group,
+        missing_date_display,
+        missing_time_display,
+        fill_missing_btn,
+        event_summary_display,
+        event_start_display,
+        event_end_display,
+        event_description_display,
+        event_location_display,
+        event_attendees_display,
+        event_dict_storage,
+        calendar_result_display
+    ]
+    quick_meeting_btn.click(fn=quick_select_meeting, outputs=quick_outputs)
+    quick_client_btn.click(fn=quick_select_client, outputs=quick_outputs)
+    quick_lunch_btn.click(fn=quick_select_lunch, outputs=quick_outputs)
+    quick_oneonone_btn.click(fn=quick_select_oneonone, outputs=quick_outputs)
+    quick_project_btn.click(fn=quick_select_project, outputs=quick_outputs)
+    quick_training_btn.click(fn=quick_select_training, outputs=quick_outputs)
+    quick_social_btn.click(fn=quick_select_social, outputs=quick_outputs)
+    quick_custom_btn.click(fn=quick_select_custom, outputs=quick_outputs)
+    fill_missing_btn.click(
+        fn=fill_missing_info,
+        inputs=[event_dict_storage, missing_date_display, missing_time_display],
+        outputs=[
+            calendar_status_display,
+            missing_info_group,
+            missing_date_display,
+            missing_time_display,
+            fill_missing_btn,
+            event_summary_display,
+            event_start_display,
+            event_end_display,
+            event_description_display,
+            event_location_display,
+            event_attendees_display,
+            event_dict_storage
+        ]
+    )
+    create_event_btn.click(
+        fn=create_event,
+        inputs=[
+            event_summary_display,
+            event_start_display,
+            event_end_display,
+            event_description_display,
+            event_location_display,
+            event_attendees_display
+        ],
+        outputs=[calendar_status_display, calendar_result_display]
+    )
+    clear_calendar_btn.click(
+        fn=clear_calendar,
+        outputs=[
+            calendar_prompt_input,
+            calendar_status_display,
+            calendar_reflection_display,
+            missing_info_group,
+            missing_date_display,
+            missing_time_display,
+            fill_missing_btn,
+            event_summary_display,
+            event_start_display,
+            event_end_display,
+            event_description_display,
+            event_location_display,
+            event_attendees_display,
+            event_dict_storage,
+            calendar_result_display
+        ]
+    )
+    # 示例
+    gr.Examples(
+        examples=[
+            "明天下午2點團隊會議，討論項目進度，地點在會議室A，參與者包括john@example.com",
+            "2026-01-25 上午9點產品發布會，介紹新功能和改進，地點在總部大樓",
+            "後天下午3點客戶會議，討論合作細節，參與者包括客戶代表",
+            "下週一上午10點技術分享會，分享最新的 AI 技術，地點在研發中心"
+        ],
+        inputs=[calendar_prompt_input]
+    )
+    # 頁腳說明
+    gr.Markdown(
+        """
+        ---
+        **注意事項：**
+        1. 使用 Google Calendar API 管理行事曆事件
+        2. 首次使用需要在專案根目錄放置 `credentials.json`（從 Google Cloud Console 下載的 OAuth2 憑證）
+        3. 首次運行時會自動開啟瀏覽器進行授權，授權後會生成 `token.json` 文件
+        4. 事件內容由 AI 自動生成，請在創建前檢查結果
+        5. 在提示中包含所有資訊：事件、日期、時間、地點、參與者
+        6. 如果缺少日期或時間，系統會顯示下拉選單讓您選擇
+        7. 日期格式支援：YYYY-MM-DD（例如：2026-01-25）或相對日期（今天、明天、後天）
+        8. 時間格式支援：24小時制（14:00）或12小時制（2:00 PM）
+        **設置步驟：**
+        - 前往 [Google Cloud Console](https://console.cloud.google.com/) 創建專案
+        - 啟用 Google Calendar API
+        - 創建 OAuth2 憑證並下載為 `credentials.json`
+        - 將 `credentials.json` 放在專案根目錄
+        - 確保授予 Calendar API 的完整存取權限
+        """
+    )
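
Reviewer sketch (not part of the commit): the labels built by `generate_date_options` and parsed back in `fill_missing_info` form a round trip that can be checked in isolation. The stand-alone version below is a minimal sketch under stated assumptions: it assumes the labels keep the `name (YYYY-MM-DD)` shape used above, and the `today` parameter, the `days` parameter, and the `extract_date` helper name are hypothetical additions made only so the logic can be tested without Gradio.

```python
from datetime import date, timedelta

# Relative names used for the first three labels: today, tomorrow, the day after.
DATE_NAMES = ["今天", "明天", "後天"]


def generate_date_options(today: date, days: int = 7) -> list[str]:
    """Return dropdown labels for `days` consecutive days starting at `today`."""
    options = []
    for offset in range(days):
        iso = (today + timedelta(days=offset)).isoformat()
        if offset < len(DATE_NAMES):
            # Named days carry the ISO date in parentheses, e.g. "明天 (2026-01-25)".
            options.append(f"{DATE_NAMES[offset]} ({iso})")
        else:
            options.append(iso)
    return options


def extract_date(label: str) -> str:
    """Recover the ISO date from a label; bare ISO dates pass through unchanged."""
    if "(" in label and ")" in label:
        return label.split("(", 1)[1].split(")", 1)[0]
    return label
```

Passing `today` explicitly, rather than calling `datetime.now()` inside the function, keeps the output deterministic, which is what makes the round trip easy to assert on.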

deep_agent_rag/ui/email_interface.py ADDED Viewed

	@@ -0,0 +1,259 @@

+# deep_agent_rag/ui/email_interface.py
+import gradio as gr
+from ..agents.email_agent import generate_email_draft, send_email_draft
+from ..config import EMAIL_SENDER
+def _create_email_interface():
+    """創建 Email Tool 界面"""
+    gr.Markdown(
+        f"""
+        ### 📧 智能郵件助手
+        使用 AI 根據您的關鍵提示自動生成專業郵件草稿，您可以在發送前檢查和修改。
+        **寄件者：** {EMAIL_SENDER}
+        **使用方式：**
+        1. 在下方輸入郵件提示（例如："寫一封感謝信"、"邀請參加會議"等）
+        2. 輸入收件人 Gmail 郵箱地址（僅支援 @gmail.com 或 @googlemail.com）
+        3. 點擊「生成郵件草稿」按鈕
+        4. 查看 AI 反思評估結果和改進建議（如有）
+        5. 檢查並修改生成的郵件內容（特別是簽名部分）
+        6. 確認無誤後點擊「發送郵件」按鈕
+        **✨ 新功能：AI 迭代反思評估**
+        - 系統會自動進行多輪反思評估（最多 3 輪）
+        - 每輪評估後，如果有改進建議，會自動生成改進版本
+        - 改進後的版本會再次評估，直到 AI 認為滿意為止
+        - 您可以看到完整的反思過程和每輪的改進建議
+        **注意：此工具僅支援 Gmail 郵箱，收件人必須使用 Gmail 郵箱地址。**
+        """
+    )
+    with gr.Row():
+        with gr.Column(scale=1):
+            # 郵件提示輸入
+            email_prompt_input = gr.Textbox(
+                label="📝 郵件提示",
+                placeholder="例如：寫一封感謝信，感謝對方在項目中的幫助",
+                lines=5,
+                value="寫一封專業的郵件，介紹我們的 AI 產品"
+            )
+            # 收件人輸入
+            recipient_input = gr.Textbox(
+                label="📮 收件人郵箱（僅支援 Gmail）",
+                placeholder="recipient@gmail.com",
+                lines=1
+            )
+            # 按鈕
+            with gr.Row():
+                generate_draft_btn = gr.Button("📝 生成郵件草稿", variant="primary", scale=1)
+                clear_email_btn = gr.Button("🗑️ 清除", variant="secondary", scale=1)
+            # 狀態顯示
+            email_status_display = gr.Textbox(
+                label="📊 狀態",
+                value="等待操作...",
+                interactive=False,
+                lines=2
+            )
+            # 反思結果顯示
+            email_reflection_display = gr.Textbox(
+                label="🔍 AI 反思評估",
+                value="等待生成郵件...",
+                interactive=False,
+                lines=8,
+                visible=True
+            )
+        with gr.Column(scale=1):
+            # 郵件主題（可編輯）
+            email_subject_input = gr.Textbox(
+                label="📌 郵件主題",
+                placeholder="郵件主題將在這裡顯示，您可以編輯",
+                lines=1,
+                interactive=True
+            )
+            # 郵件正文（可編輯）
+            email_body_input = gr.Textbox(
+                label="📄 郵件正文（可編輯）",
+                placeholder="郵件內容將在這裡顯示，您可以編輯",
+                lines=15,
+                interactive=True
+            )
+            # 發送按鈕
+            send_draft_btn = gr.Button("📧 發送郵件", variant="primary", scale=1)
+            # 發送結果顯示
+            email_result_display = gr.Textbox(
+                label="📊 發送結果",
+                lines=5,
+                interactive=False
+            )
+    # 事件處理函數
+    def generate_draft(prompt, recipient):
+        """生成郵件草稿（包含反思功能）"""
+        if not prompt or not prompt.strip():
+            return "❌ 請輸入郵件提示", "", "", "❌ 請輸入郵件提示", "❌ 請輸入郵件提示"
+        if not recipient or not recipient.strip():
+            return "❌ 請輸入收件人郵箱", "", "", "❌ 請輸入收件人郵箱", "❌ 請輸入收件人郵箱"
+        # 驗證郵箱格式和 Gmail 限制
+        if "@" not in recipient or "." not in recipient.split("@")[1]:
+            return "❌ 郵箱格式不正確", "", "", "❌ 郵箱格式不正確，請輸入有效的郵箱地址", "❌ 郵箱格式不正確，請輸入有效的郵箱地址"
+        # 驗證是否為 Gmail 郵箱
+        recipient_lower = recipient.strip().lower()
+        if not (recipient_lower.endswith("@gmail.com") or recipient_lower.endswith("@googlemail.com")):
+            return "❌ 僅支援 Gmail 郵箱", "", "", "❌ 此工具僅支援 Gmail 郵箱（@gmail.com 或 @googlemail.com），請輸入 Gmail 郵箱地址", "❌ 此工具僅支援 Gmail 郵箱（@gmail.com 或 @googlemail.com），請輸入 Gmail 郵箱地址"
+        try:
+            status_msg = "🔄 正在生成郵件草稿..."
+            reflection_msg = "🔄 正在生成郵件草稿..."
+            # 生成郵件草稿（包含反思功能，會自動改進）
+            subject, body, status, reflection_result, was_improved = generate_email_draft(
+                prompt, recipient.strip(), enable_reflection=True
+            )
+            if subject and body:
+                # 格式化反思結果顯示
+                if reflection_result:
+                    # Count reflection rounds; str.count already returns 0 when the marker is absent
+                    reflection_count = reflection_result.count("【第")
+                    if was_improved:
+                        if reflection_count > 1:
+                            reflection_display = (
+                                f"🔍 **AI 迭代反思評估結果**（共 {reflection_count} 輪）\n\n"
+                                f"{reflection_result}\n\n"
+                                f"✨ **已自動應用改進建議，經過 {reflection_count} 輪優化，當前顯示的是最終優化版本**"
+                            )
+                        else:
+                            reflection_display = (
+                                f"🔍 **AI 反思評估結果**\n\n"
+                                f"{reflection_result}\n\n"
+                                f"✨ **已自動應用改進建議，當前顯示的是優化後的版本**"
+                            )
+                    else:
+                        reflection_display = (
+                            f"🔍 **AI 反思評估結果**\n\n"
+                            f"{reflection_result}\n\n"
+                            f"✅ **郵件質量良好，無需改進**"
+                        )
+                else:
+                    reflection_display = "⚠️ 反思功能未返回結果"
+                return status, subject, body, "", reflection_display
+            else:
+                return status, "", "", status, "❌ 生成失敗，無法進行反思評估"
+        except Exception as e:
+            error_msg = f"❌ 發生錯誤：{str(e)}"
+            print(f"Email Tool 錯誤：{e}")
+            import traceback
+            traceback.print_exc()
+            return "❌ 發生錯誤", "", "", error_msg, f"❌ 發生錯誤：{str(e)}"
+    def send_draft(recipient, subject, body):
+        """發送已編輯的郵件草稿"""
+        if not recipient or not recipient.strip():
+            return "❌ 請輸入收件人郵箱", "❌ 請輸入收件人郵箱"
+        if not subject or not subject.strip():
+            return "❌ 請輸入郵件主題", "❌ 請輸入郵件主題"
+        if not body or not body.strip():
+            return "❌ 請輸入郵件內容", "❌ 請輸入郵件內容"
+        # 驗證郵箱格式和 Gmail 限制
+        if "@" not in recipient or "." not in recipient.split("@")[1]:
+            return "❌ 郵箱格式不正確", "❌ 郵箱格式不正確，請輸入有效的郵箱地址"
+        # 驗證是否為 Gmail 郵箱
+        recipient_lower = recipient.strip().lower()
+        if not (recipient_lower.endswith("@gmail.com") or recipient_lower.endswith("@googlemail.com")):
+            return "❌ 僅支援 Gmail 郵箱", "❌ 此工具僅支援 Gmail 郵箱（@gmail.com 或 @googlemail.com），請輸入 Gmail 郵箱地址"
+        try:
+            status_msg = "🔄 正在發送郵件..."
+            # 發送郵件
+            result = send_email_draft(recipient.strip(), subject.strip(), body.strip())
+            return "✅ 郵件已發送", result
+        except Exception as e:
+            error_msg = f"❌ 發送郵件時發生錯誤：{str(e)}"
+            print(f"Email Tool 錯誤：{e}")
+            import traceback
+            traceback.print_exc()
+            return "❌ 發生錯誤", error_msg
+    def clear_email():
+        """Reset all email inputs and outputs (one value per bound output, seven in total)."""
+        return "", "", "等待操作...", "", "", "", "等待生成郵件..."
+    # 綁定事件
+    generate_draft_btn.click(
+        fn=generate_draft,
+        inputs=[email_prompt_input, recipient_input],
+        outputs=[email_status_display, email_subject_input, email_body_input, email_result_display, email_reflection_display]
+    )
+    send_draft_btn.click(
+        fn=send_draft,
+        inputs=[recipient_input, email_subject_input, email_body_input],
+        outputs=[email_status_display, email_result_display]
+    )
+    clear_email_btn.click(
+        fn=clear_email,
+        outputs=[email_prompt_input, recipient_input, email_status_display, email_subject_input, email_body_input, email_result_display, email_reflection_display]
+    )
+    # 示例
+    gr.Examples(
+        examples=[
+            ["寫一封感謝信，感謝對方在項目中的幫助和支持", "example@gmail.com"],
+            ["邀請參加下週的產品發布會", "colleague@gmail.com"],
+            ["詢問項目進度並提供更新", "partner@gmail.com"],
+            ["發送會議記錄和後續行動項目", "team@gmail.com"]
+        ],
+        inputs=[email_prompt_input, recipient_input]
+    )
+    # 頁腳說明
+    gr.Markdown(
+        f"""
+        ---
+        **注意事項：**
+        1. 使用 Gmail API 發送郵件，避免被歸類為垃圾郵件
+        2. **此工具僅支援 Gmail 郵箱，收件人必須使用 @gmail.com 或 @googlemail.com 結尾的郵箱地址**
+        3. 首次使用需要在專案根目錄放置 `credentials.json`（從 Google Cloud Console 下載的 OAuth2 憑證）
+        4. 首次運行時會自動開啟瀏覽器進行授權，授權後會生成 `token.json` 文件
+        5. 郵件內容由 AI 自動生成，請在發送前檢查結果
+        6. 寄件者固定為：{EMAIL_SENDER}
+        **設置步驟：**
+        - 前往 [Google Cloud Console](https://console.cloud.google.com/) 創建專案
+        - 啟用 Gmail API
+        - 創建 OAuth2 憑證並下載為 `credentials.json`
+        - 將 `credentials.json` 放在專案根目錄
+        """
+    )
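
Reviewer sketch (not part of the commit): `generate_draft` and `send_draft` repeat the same recipient checks, so a shared helper could keep the two in step. The function below is a minimal sketch, not the author's implementation. The names `validate_gmail_recipient` and `GMAIL_DOMAINS` are hypothetical, and it uses `str.rpartition` where the diff uses `split("@")[1]`, which makes it slightly stricter: an empty local part is rejected. The error strings mirror the short status messages in the diff.

```python
# Domains the tool accepts, as stated in the UI notes above.
GMAIL_DOMAINS = ("gmail.com", "googlemail.com")


def validate_gmail_recipient(recipient):
    """Return an error message for an invalid recipient, or None if it is acceptable."""
    address = (recipient or "").strip().lower()
    if not address:
        return "❌ 請輸入收件人郵箱"
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or "." not in domain:
        return "❌ 郵箱格式不正確"
    if domain not in GMAIL_DOMAINS:
        return "❌ 僅支援 Gmail 郵箱"
    return None
```

Each handler could then call the helper once and, when it returns a message, fill its own output tuple with that message.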

deep_agent_rag/ui/gradio_interface.py CHANGED Viewed

@@ -5,12 +5,17 @@ Gradio 界面模組
 import uuid
 import re
 import time
 from typing import Iterator, Tuple
 import gradio as gr
 from langchain_core.messages import HumanMessage
 # graph 和 rag_retriever 將從外部傳入，不在這裡導入
 from ..utils.llm_utils import get_llm_type, is_using_local_llm
 def run_research_agent(query: str, graph, thread_id: str = None) -> Iterator[Tuple[str, str, str, str, str]]:
@@ -204,9 +209,9 @@ def create_gradio_interface(graph):
         gr.Markdown(
             """
             <div class="header">
-            <h1>🚀 Deep Research Agent with RAG (Local MLX)</h1>
             <p><strong>功能特色：</strong></p>
-            <p>📊 股票資訊查詢 | 🌐 網路搜尋 | 📚 PDF 知識庫查詢（Tree of Thoughts 論文）| 📧 智能郵件助手 | 📅 智能行事曆管理</p>
             <p><strong>智能規劃：</strong> 系統會根據問題類型自動選擇合適的研究工具</p>
             <p><strong>本地模型：</strong> 使用 MLX 本地模型，保護隱私，無需 API 金鑰</p>
             </div>
@@ -227,6 +232,10 @@ def create_gradio_interface(graph):
             # Tab 3: Calendar Tool
             with gr.Tab("📅 Calendar Tool"):
                 _create_calendar_interface()
     return demo
@@ -363,895 +372,8 @@ def _create_research_interface(graph):
     )
-def _create_email_interface():
-    """創建 Email Tool 界面"""
-    from ..agents.email_agent import generate_email_draft, send_email_draft
-    from ..config import EMAIL_SENDER
-    gr.Markdown(
-        f"""
-        ### 📧 智能郵件助手
-        使用 AI 根據您的關鍵提示自動生成專業郵件草稿，您可以在發送前檢查和修改。
-        **寄件者：** {EMAIL_SENDER}
-        **使用方式：**
-        1. 在下方輸入郵件提示（例如："寫一封感謝信"、"邀請參加會議"等）
-        2. 輸入收件人 Gmail 郵箱地址（僅支援 @gmail.com 或 @googlemail.com）
-        3. 點擊「生成郵件草稿」按鈕
-        4. 查看 AI 反思評估結果和改進建議（如有）
-        5. 檢查並修改生成的郵件內容（特別是簽名部分）
-        6. 確認無誤後點擊「發送郵件」按鈕
-        **✨ 新功能：AI 迭代反思評估**
-        - 系統會自動進行多輪反思評估（最多 3 輪）
-        - 每輪評估後，如果有改進建議，會自動生成改進版本
-        - 改進後的版本會再次評估，直到 AI 認為滿意為止
-        - 您可以看到完整的反思過程和每輪的改進建議
-        **注意：此工具僅支援 Gmail 郵箱，收件人必須使用 Gmail 郵箱地址。**
-        """
-    )
-    with gr.Row():
-        with gr.Column(scale=1):
-            # 郵件提示輸入
-            email_prompt_input = gr.Textbox(
-                label="📝 郵件提示",
-                placeholder="例如：寫一封感謝信，感謝對方在項目中的幫助",
-                lines=5,
-                value="寫一封專業的郵件，介紹我們的 AI 產品"
-            )
-            # 收件人輸入
-            recipient_input = gr.Textbox(
-                label="📮 收件人郵箱（僅支援 Gmail）",
-                placeholder="recipient@gmail.com",
-                lines=1
-            )
-            # 按鈕
-            with gr.Row():
-                generate_draft_btn = gr.Button("📝 生成郵件草稿", variant="primary", scale=1)
-                clear_email_btn = gr.Button("🗑️ 清除", variant="secondary", scale=1)
-            # 狀態顯示
-            email_status_display = gr.Textbox(
-                label="📊 狀態",
-                value="等待操作...",
-                interactive=False,
-                lines=2
-            )
-            # 反思結果顯示
-            email_reflection_display = gr.Textbox(
-                label="🔍 AI 反思評估",
-                value="等待生成郵件...",
-                interactive=False,
-                lines=8,
-                visible=True
-            )
-        with gr.Column(scale=1):
-            # 郵件主題（可編輯）
-            email_subject_input = gr.Textbox(
-                label="📌 郵件主題",
-                placeholder="郵件主題將在這裡顯示，您可以編輯",
-                lines=1,
-                interactive=True
-            )
-            # 郵件正文（可編輯）
-            email_body_input = gr.Textbox(
-                label="📄 郵件正文（可編輯）",
-                placeholder="郵件內容將在這裡顯示，您可以編輯",
-                lines=15,
-                interactive=True
-            )
-            # 發送按鈕
-            send_draft_btn = gr.Button("📧 發送郵件", variant="primary", scale=1)
-            # 發送結果顯示
-            email_result_display = gr.Textbox(
-                label="📊 發送結果",
-                lines=5,
-                interactive=False
-            )
-    # 事件處理函數
-    def generate_draft(prompt, recipient):
-        """生成郵件草稿（包含反思功能）"""
-        if not prompt or not prompt.strip():
-            return "❌ 請輸入郵件提示", "", "", "❌ 請輸入郵件提示", "❌ 請輸入郵件提示"
-        if not recipient or not recipient.strip():
-            return "❌ 請輸入收件人郵箱", "", "", "❌ 請輸入收件人郵箱", "❌ 請輸入收件人郵箱"
-        # 驗證郵箱格式和 Gmail 限制
-        if "@" not in recipient or "." not in recipient.split("@")[1]:
-            return "❌ 郵箱格式不正確", "", "", "❌ 郵箱格式不正確，請輸入有效的郵箱地址", "❌ 郵箱格式不正確，請輸入有效的郵箱地址"
-        # 驗證是否為 Gmail 郵箱
-        recipient_lower = recipient.strip().lower()
-        if not (recipient_lower.endswith("@gmail.com") or recipient_lower.endswith("@googlemail.com")):
-            return "❌ 僅支援 Gmail 郵箱", "", "", "❌ 此工具僅支援 Gmail 郵箱（@gmail.com 或 @googlemail.com），請輸入 Gmail 郵箱地址", "❌ 此工具僅支援 Gmail 郵箱（@gmail.com 或 @googlemail.com），請輸入 Gmail 郵箱地址"
-        try:
-            status_msg = "🔄 正在生成郵件草稿..."
-            reflection_msg = "🔄 正在生成郵件草稿..."
-            # 生成郵件草稿（包含反思功能，會自動改進）
-            subject, body, status, reflection_result, was_improved = generate_email_draft(
-                prompt, recipient.strip(), enable_reflection=True
-            )
-            if subject and body:
-                # 格式化反思結果顯示
-                if reflection_result:
-                    # 計算反思輪數
-                    reflection_count = reflection_result.count("【第") if "【第" in reflection_result else 0
-                    if was_improved:
-                        if reflection_count > 1:
-                            reflection_display = (
-                                f"🔍 **AI 迭代反思評估結果**（共 {reflection_count} 輪）\n\n"
-                                f"{reflection_result}\n\n"
-                                f"✨ **已自動應用改進建議，經過 {reflection_count} 輪優化，當前顯示的是最終優化版本**"
-                            )
-                        else:
-                            reflection_display = (
-                                f"🔍 **AI 反思評估結果**\n\n"
-                                f"{reflection_result}\n\n"
-                                f"✨ **已自動應用改進建議，當前顯示的是優化後的版本**"
-                            )
-                    else:
-                        reflection_display = (
-                            f"🔍 **AI 反思評估結果**\n\n"
-                            f"{reflection_result}\n\n"
-                            f"✅ **郵件質量良好，無需改進**"
-                        )
-                else:
-                    reflection_display = "⚠️ 反思功能未返回結果"
-                return status, subject, body, "", reflection_display
-            else:
-                return status, "", "", status, "❌ 生成失敗，無法進行反思評估"
-        except Exception as e:
-            error_msg = f"❌ 發生錯誤：{str(e)}"
-            print(f"Email Tool 錯誤：{e}")
-            import traceback
-            traceback.print_exc()
-            return "❌ 發生錯誤", "", "", error_msg, f"❌ 發生錯誤：{str(e)}"
-    def send_draft(recipient, subject, body):
-        """發送已編輯的郵件草稿"""
-        if not recipient or not recipient.strip():
-            return "❌ 請輸入收件人郵箱", "❌ 請輸入收件人郵箱"
-        if not subject or not subject.strip():
-            return "❌ 請輸入郵件主題", "❌ 請輸入郵件主題"
-        if not body or not body.strip():
-            return "❌ 請輸入郵件內容", "❌ 請輸入郵件內容"
-        # 驗證郵箱格式和 Gmail 限制
-        if "@" not in recipient or "." not in recipient.split("@")[1]:
-            return "❌ 郵箱格式不正確", "❌ 郵箱格式不正確，請輸入有效的郵箱地址"
-        # 驗證是否為 Gmail 郵箱
-        recipient_lower = recipient.strip().lower()
-        if not (recipient_lower.endswith("@gmail.com") or recipient_lower.endswith("@googlemail.com")):
-            return "❌ 僅支援 Gmail 郵箱", "❌ 此工具僅支援 Gmail 郵箱（@gmail.com 或 @googlemail.com），請輸入 Gmail 郵箱地址"
-        try:
-            status_msg = "🔄 正在發送郵件..."
-            # 發送郵件
-            result = send_email_draft(recipient.strip(), subject.strip(), body.strip())
-            return "✅ 郵件已發送", result
-        except Exception as e:
-            error_msg = f"❌ 發送郵件時發生錯誤：{str(e)}"
-            print(f"Email Tool 錯誤：{e}")
-            import traceback
-            traceback.print_exc()
-            return "❌ 發生錯誤", error_msg
-    def clear_email():
-        """清除郵件相關輸入和輸出"""
-        return "", "", "等待操作...", "", "", "等待生成郵件..."
-    # 綁定事件
-    generate_draft_btn.click(
-        fn=generate_draft,
-        inputs=[email_prompt_input, recipient_input],
-        outputs=[email_status_display, email_subject_input, email_body_input, email_result_display, email_reflection_display]
-    )
-    send_draft_btn.click(
-        fn=send_draft,
-        inputs=[recipient_input, email_subject_input, email_body_input],
-        outputs=[email_status_display, email_result_display]
-    )
-    clear_email_btn.click(
-        fn=clear_email,
-        outputs=[email_prompt_input, recipient_input, email_status_display, email_subject_input, email_body_input, email_result_display, email_reflection_display]
-    )
-    # 示例
-    gr.Examples(
-        examples=[
-            ["寫一封感謝信，感謝對方在項目中的幫助和支持", "example@gmail.com"],
-            ["邀請參加下週的產品發布會", "colleague@gmail.com"],
-            ["詢問項目進度並提供更新", "partner@gmail.com"],
-            ["發送會議記錄和後續行動項目", "team@gmail.com"]
-        ],
-        inputs=[email_prompt_input, recipient_input]
-    )
-    # 頁腳說明
-    gr.Markdown(
-        f"""
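-        ---
-        **Notes:**
-        1. Emails are sent through the Gmail API to avoid being classified as spam
-        2. **This tool only supports Gmail; the recipient address must end with @gmail.com or @googlemail.com**
-        3. Before first use, place `credentials.json` (the OAuth2 credentials downloaded from Google Cloud Console) in the project root
-        4. On first run, a browser opens automatically for authorization, after which a `token.json` file is generated
-        5. Email content is generated by AI; check the result before sending
-        6. The sender is fixed as: {EMAIL_SENDER}
-        **Setup steps:**
-        - Go to [Google Cloud Console](https://console.cloud.google.com/) and create a project
-        - Enable the Gmail API
-        - Create OAuth2 credentials and download them as `credentials.json`
-        - Place `credentials.json` in the project root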
-    )
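Review note: the removed `generate_draft` and `send_draft` handlers repeat the same recipient checks (non-empty, contains `@` and a dotted domain, ends with a Gmail domain). A minimal sketch of how those checks could be shared is shown here; the helper name `validate_gmail_recipient`, the constant `GMAIL_DOMAINS`, and the English error strings are hypothetical and do not appear in the commit.

```python
from typing import Optional

# Domains accepted by the email tool, per the UI text in this diff.
GMAIL_DOMAINS = ("gmail.com", "googlemail.com")


def validate_gmail_recipient(recipient: Optional[str]) -> Optional[str]:
    """Return an error message, or None when the address passes all checks.

    Hypothetical helper: it mirrors the inline checks in the handlers,
    but normalizes (strip + lowercase) before checking anything.
    """
    address = (recipient or "").strip().lower()
    if not address:
        return "missing recipient"

    # Split on the last "@" so the domain part is unambiguous.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or "@" in local or "." not in domain:
        return "malformed address"

    if domain not in GMAIL_DOMAINS:
        return "not a Gmail address"
    return None
```

Each handler could then call the helper once and map a non-`None` result to its own tuple of outputs, so the validation rules live in one place.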
-def _create_calendar_interface():
-    """創建 Calendar Tool 界面"""
-    from ..agents.calendar_agent import generate_calendar_draft, create_calendar_draft
-    from datetime import datetime, timedelta
-    gr.Markdown(
-        """
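-        ### 📅 Smart Calendar Assistant
-        Uses AI to generate a calendar event draft from your full prompt, which you can review and edit before creating it.
-        **How to use:**
-        1. **Quick select**: click one of the common-event buttons below to generate a draft automatically
-        2. **Custom input**: enter a full event prompt below, including the event, date, time, location, and attendees
-        3. Review the AI reflection assessment and any improvement suggestions
-        4. If information is missing (such as the time), the system shows dropdown menus for you to choose from
-        5. Check and edit the generated event content
-        6. Once everything is correct, click the "✅ 創建事件" (Create Event) button
-        **✨ New feature: iterative AI reflection + Google Maps location validation**
-        - The system automatically runs multiple rounds of reflection (up to 3)
-        - Addresses are validated and standardized automatically, and travel time is calculated
-        - After each round, if there are improvement suggestions, an improved version is generated automatically
-        - The improved version is assessed again until the AI is satisfied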
-    )
-    # 快速選擇按鈕區域
-    gr.Markdown("### 🚀 快速選擇常見事件")
-    with gr.Row():
-        quick_meeting_btn = gr.Button("📋 團隊會議", variant="secondary", scale=1)
-        quick_client_btn = gr.Button("🤝 客戶拜訪", variant="secondary", scale=1)
-        quick_lunch_btn = gr.Button("🍽️ 午餐會議", variant="secondary", scale=1)
-        quick_oneonone_btn = gr.Button("💬 一對一會議", variant="secondary", scale=1)
-    with gr.Row():
-        quick_project_btn = gr.Button("📊 項目討論", variant="secondary", scale=1)
-        quick_training_btn = gr.Button("🎓 培訓/學習", variant="secondary", scale=1)
-        quick_social_btn = gr.Button("🎉 社交活動", variant="secondary", scale=1)
-        quick_custom_btn = gr.Button("✏️ 自定義輸入", variant="secondary", scale=1)
-    with gr.Row():
-        with gr.Column(scale=1):
-            # 單一 prompt 輸入
-            calendar_prompt_input = gr.Textbox(
-                label="📝 事件提示（包含事件、日期、時間、地點、參與者）",
-                placeholder="例如：明天下午2點團隊會議，討論項目進度，地點在會議室A，參與者包括john@example.com和mary@example.com",
-                lines=5,
-                value=""
-            )
-            # 按鈕
-            with gr.Row():
-                generate_draft_btn = gr.Button("📝 生成事件草稿", variant="primary", scale=1)
-                clear_calendar_btn = gr.Button("🗑️ 清除", variant="secondary", scale=1)
-            # 狀態顯示
-            calendar_status_display = gr.Textbox(
-                label="📊 狀態",
-                value="等待操作...",
-                interactive=False,
-                lines=2
-            )
-            # 反思結果顯示
-            calendar_reflection_display = gr.Textbox(
-                label="🔍 AI 反思評估",
-                value="等待生成事件...",
-                interactive=False,
-                lines=8,
-                visible=True
-            )
-            # 缺失資訊的補充區域（動態顯示）
-            missing_info_group = gr.Group(visible=False)
-            with missing_info_group:
-                gr.Markdown("**⚠️ 請補充以下缺失的資訊：**")
-                # 日期選擇（如果缺失）
-                missing_date_display = gr.Dropdown(
-                    label="📆 選擇日期",
-                    choices=[],
-                    visible=False,
-                    interactive=True
-                )
-                # 時間選擇（如果缺失）
-                missing_time_display = gr.Dropdown(
-                    label="🕐 選擇時間",
-                    choices=[],
-                    visible=False,
-                    interactive=True
-                )
-                fill_missing_btn = gr.Button("✅ 確認補充資訊", variant="primary", visible=False)
-            # 隱藏狀態變數，用於存儲 event_dict
-            event_dict_storage = gr.State(value={})
-        with gr.Column(scale=1):
-            # 事件詳情顯示和編輯區域
-            event_summary_display = gr.Textbox(
-                label="📌 事件標題",
-                placeholder="事件標題將在這裡顯示",
-                lines=1,
-                interactive=True
-            )
-            event_start_display = gr.Textbox(
-                label="🕐 開始時間",
-                placeholder="開始時間將在這裡顯示（格式: YYYY-MM-DDTHH:MM:SS+08:00）",
-                lines=1,
-                interactive=True
-            )
-            event_end_display = gr.Textbox(
-                label="🕐 結束時間",
-                placeholder="結束時間將在這裡顯示（格式: YYYY-MM-DDTHH:MM:SS+08:00）",
-                lines=1,
-                interactive=True
-            )
-            event_description_display = gr.Textbox(
-                label="📄 事件描述（可編輯）",
-                placeholder="事件描述將在這裡顯示，您可以編輯",
-                lines=6,
-                interactive=True
-            )
-            event_location_display = gr.Textbox(
-                label="📍 地點（可編輯，已自動驗證並標準化）",
-                placeholder="事件地點將在這裡顯示，您可以編輯",
-                lines=2,
-                interactive=True
-            )
-            event_attendees_display = gr.Textbox(
-                label="👥 參與者郵箱（可編輯，多個用逗號分隔）",
-                placeholder="參與者郵箱將在這裡顯示，您可以編輯",
-                lines=1,
-                interactive=True
-            )
-            # 創建按鈕
-            create_event_btn = gr.Button("✅ 創建事件", variant="primary", scale=1)
-            # 操作結果顯示
-            calendar_result_display = gr.Textbox(
-                label="📊 操作結果",
-                lines=8,
-                interactive=False
-            )
-    # 生成時間選項（每30分鐘一個選項）
-    def generate_time_options():
-        """生成時間選項列表"""
-        times = []
-        for hour in range(24):
-            for minute in [0, 30]:
-                time_str = f"{hour:02d}:{minute:02d}"
-                times.append(time_str)
-        return times
-    # 生成日期選項（今天、明天、後天，以及未來7天）
-    def generate_date_options():
-        """生成日期選項列表"""
-        dates = []
-        today = datetime.now()
-        date_names = ["今天", "明天", "後天"]
-        for i in range(3):
-            date_obj = today + timedelta(days=i)
-            date_str = date_obj.strftime('%Y-%m-%d')
-            dates.append(f"{date_names[i]} ({date_str})")
-        for i in range(3, 7):
-            date_obj = today + timedelta(days=i)
-            date_str = date_obj.strftime('%Y-%m-%d')
-            dates.append(date_str)
-        return dates
-    # 快速選擇事件模板生成函數
-    def generate_quick_prompt(event_type: str) -> str:
-        """根據事件類型生成預設提示"""
-        from datetime import datetime, timedelta
-        # 獲取明天的日期
-        tomorrow = datetime.now() + timedelta(days=1)
-        tomorrow_str = tomorrow.strftime("%Y-%m-%d")
-        templates = {
-            "meeting": f"明天下午2點團隊會議，討論項目進度和下週計劃，地點在會議室，參與者包括團隊成員",
-            "client": f"明天上午10點客戶拜訪，討論合作方案和需求，地點在客戶公司或會議室",
-            "lunch": f"明天中午12點午餐會議，與合作夥伴討論業務合作，地點在附近的餐廳",
-            "oneonone": f"明天下午3點一對一會議，討論工作進展和職業發展，地點在會議室或咖啡廳",
-            "project": f"明天上午9點項目討論會議，審查項目進度和解決問題，地點在項目室，參與者包括項目團隊",
-            "training": f"明天下午2點培訓課程，學習新技能和最佳實踐，地點在培訓室或線上",
-            "social": f"明天晚上6點團隊聚餐，慶祝項目完成，地點在餐廳，參與者包括團隊成員",
-            "custom": ""  # 自定義，返回空讓用戶輸入
-        }
-        return templates.get(event_type, "")
-    # 快速選擇按鈕處理函數（自動生成草稿）
-    def quick_select_and_generate(event_type: str):
-        """快速選擇事件類型並自動生成草稿"""
-        prompt = generate_quick_prompt(event_type)
-        if not prompt:
-            # 如果是自定義，只返回空提示，不自動生成
-            return (
-                prompt,  # calendar_prompt_input
-                "請在下方輸入框中輸入事件提示，然後點擊「生成事件草稿」",  # calendar_status_display
-                "等待輸入...",  # calendar_reflection_display
-                gr.update(visible=False),  # missing_info_group
-                gr.update(visible=False, choices=[]),  # missing_date_display
-                gr.update(visible=False, choices=[]),  # missing_time_display
-                gr.update(visible=False),  # fill_missing_btn
-                "", "", "", "", "", "",  # event fields
-                {},  # event_dict_storage
-                ""  # calendar_result_display
-            )
-        # 自動生成草稿（調用 generate_draft 並返回所有輸出）
-        draft_result = generate_draft(prompt)
-        # generate_draft 返回的格式是：(status, reflection_display, missing_info_group, ...)
-        # 但我們需要返回 (prompt, status, reflection_display, ...)
-        # draft_result 是一個元組，我們需要將 prompt 添加到開頭
-        return (prompt,) + draft_result
-    def quick_select_meeting():
-        """快速選擇：團隊會議"""
-        return quick_select_and_generate("meeting")
-    def quick_select_client():
-        """快速選擇：客戶拜訪"""
-        return quick_select_and_generate("client")
-    def quick_select_lunch():
-        """快速選擇：午餐會議"""
-        return quick_select_and_generate("lunch")
-    def quick_select_oneonone():
-        """快速選擇：一對一會議"""
-        return quick_select_and_generate("oneonone")
-    def quick_select_project():
-        """快速選擇：項目討論"""
-        return quick_select_and_generate("project")
-    def quick_select_training():
-        """快速選擇：培訓/學習"""
-        return quick_select_and_generate("training")
-    def quick_select_social():
-        """快速選擇：社交活動"""
-        return quick_select_and_generate("social")
-    def quick_select_custom():
-        """快速選擇：自定義輸入（只清空，不自動生成）"""
-        return (
-            "",  # calendar_prompt_input
-            "請在下方輸入框中輸入事件提示，然後點擊「生成事件草稿」",  # calendar_status_display
-            "等待輸入...",  # calendar_reflection_display
-            gr.update(visible=False),  # missing_info_group
-            gr.update(visible=False, choices=[]),  # missing_date_display
-            gr.update(visible=False, choices=[]),  # missing_time_display
-            gr.update(visible=False),  # fill_missing_btn
-            "", "", "", "", "", "",  # event fields
-            {},  # event_dict_storage
-            ""  # calendar_result_display
-        )
-    # 事件處理函數
-    def generate_draft(prompt):
-        """生成行事曆事件草稿（包含反思功能）"""
-        if not prompt or not prompt.strip():
-            return (
-                "❌ 請輸入事件提示",
-                "❌ 請輸入事件提示",
-                gr.update(visible=False),
-                gr.update(visible=False, choices=[]),
-                gr.update(visible=False, choices=[]),
-                gr.update(visible=False),
-                "", "", "", "", "", "", {},
-                "❌ 請輸入事件提示"
-            )
-        try:
-            status_msg = "🔄 正在生成事件草稿..."
-            # 生成事件草稿（包含反思功能）
-            event_dict, status, missing_info, reflection_result, was_improved = generate_calendar_draft(
-                prompt.strip(), enable_reflection=True
-            )
-            if not event_dict:
-                # Return all 14 outputs, in the same order as the success path
-                return (
-                    status,
-                    "❌ 生成失敗，無法進行反思評估",
-                    gr.update(visible=False),
-                    gr.update(visible=False, choices=[]),
-                    gr.update(visible=False, choices=[]),
-                    gr.update(visible=False),
-                    "", "", "", "", "", "", {},
-                    status
-                )
-            # 格式化反思結果顯示
-            if reflection_result:
-                # 計算反思輪數
-                reflection_count = reflection_result.count("【第") if "【第" in reflection_result else 0
-                if was_improved:
-                    if reflection_count > 1:
-                        reflection_display = (
-                            f"🔍 **AI 迭代反思評估結果**（共 {reflection_count} 輪）\n\n"
-                            f"{reflection_result}\n\n"
-                            f"✨ **已自動應用改進建議，經過 {reflection_count} 輪優化，當前顯示的是最終優化版本**"
-                        )
-                    else:
-                        reflection_display = (
-                            f"🔍 **AI 反思評估結果**\n\n"
-                            f"{reflection_result}\n\n"
-                            f"✨ **已自動應用改進建議，當前顯示的是優化後的版本**"
-                        )
-                else:
-                    reflection_display = (
-                        f"🔍 **AI 反思評估結果**\n\n"
-                        f"{reflection_result}\n\n"
-                        f"✅ **事件質量良好，無需改進**"
-                    )
-            else:
-                reflection_display = "⚠️ 反思功能未返回結果"
-            # 【Google Maps 整合】添加地點建議訊息
-            location_suggestion = event_dict.get("location_suggestion", "")
-            if location_suggestion:
-                # 將地點建議添加到狀態訊息中
-                if status:
-                    status = f"{status}\n\n🗺️ **地點資訊：**\n{location_suggestion}"
-                else:
-                    status = f"🗺️ **地點資訊：**\n{location_suggestion}"
-            # 檢查是否有缺失資訊
-            has_missing = bool(missing_info)
-            if has_missing:
-                # 顯示缺失資訊區域
-                date_visible = missing_info.get("date", False)
-                time_visible = missing_info.get("time", False)
-                date_choices = generate_date_options() if date_visible else []
-                time_choices = generate_time_options() if time_visible else []
-                return (
-                    status,
-                    reflection_display,
-                    gr.update(visible=True),  # 顯示缺失資訊區域
-                    gr.update(visible=date_visible, choices=date_choices, value=date_choices[0] if date_choices else None),
-                    gr.update(visible=time_visible, choices=time_choices, value=time_choices[0] if time_choices else None),
-                    gr.update(visible=True),  # 顯示確認按鈕
-                    event_dict.get("summary", ""),
-                    event_dict.get("start_datetime", ""),
-                    event_dict.get("end_datetime", ""),
-                    event_dict.get("description", ""),
-                    event_dict.get("location", ""),
-                    event_dict.get("attendees", ""),
-                    event_dict,  # 傳遞完整的事件字典以便後續使用
-                    ""
-                )
-            else:
-                # 沒有缺失資訊，直接顯示結果
-                return (
-                    status,
-                    reflection_display,
-                    gr.update(visible=False),
-                    gr.update(visible=False, choices=[]),
-                    gr.update(visible=False, choices=[]),
-                    gr.update(visible=False),
-                    event_dict.get("summary", ""),
-                    event_dict.get("start_datetime", ""),
-                    event_dict.get("end_datetime", ""),
-                    event_dict.get("description", ""),
-                    event_dict.get("location", ""),
-                    event_dict.get("attendees", ""),
-                    event_dict,
-                    ""
-                )
-        except Exception as e:
-            error_msg = f"❌ 發生錯誤：{str(e)}"
-            print(f"Calendar Tool 錯誤：{e}")
-            import traceback
-            traceback.print_exc()
-            return (
-                "❌ 發生錯誤",
-                f"❌ 發生錯誤：{str(e)}",
-                gr.update(visible=False),
-                gr.update(visible=False, choices=[]),
-                gr.update(visible=False, choices=[]),
-                gr.update(visible=False),
-                "", "", "", "", "", "", {},
-                error_msg
-            )
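Review note: Gradio expects a handler to return one value per component in its `outputs` list, and `generate_draft` has several return paths that each build a long tuple by hand. A small guard can catch a path that returns the wrong number of values. The sketch below is an illustration only: `expect_outputs` and `create_event_stub` are hypothetical names, not part of this commit or of Gradio.

```python
import functools
from typing import Callable, Tuple


def expect_outputs(count: int) -> Callable:
    """Hypothetical decorator: fail fast if a handler returns the wrong number of values."""

    def decorator(handler: Callable) -> Callable:
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            result = handler(*args, **kwargs)
            # A non-tuple return counts as a single output value.
            got = len(result) if isinstance(result, tuple) else 1
            if got != count:
                raise ValueError(
                    f"{handler.__name__} returned {got} values, expected {count}"
                )
            return result

        return wrapper

    return decorator


@expect_outputs(2)
def create_event_stub(summary: str) -> Tuple[str, str]:
    """Stand-in for a two-output handler such as a (status, result) pair."""
    if not summary.strip():
        return "error", "missing title"
    return "ok", f"created {summary.strip()}"
```

Decorating each handler with the length of its `outputs` list would turn a silent wiring mismatch into an immediate, named error during development.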
-    def fill_missing_info(event_dict_storage, selected_date, selected_time):
-        """填充缺失的資訊"""
-        if not event_dict_storage:
-            return (
-                "❌ 沒有事件資料",
-                gr.update(visible=False),
-                gr.update(visible=False, choices=[]),
-                gr.update(visible=False, choices=[]),
-                gr.update(visible=False),
-                "", "", "", "", "", "",
-                {}
-            )
-        # 更新日期和時間
-        if selected_date:
-            # 從選項中提取日期字串（例如："明天 (2026-01-25)" -> "2026-01-25"）
-            if "(" in selected_date:
-                date_str = selected_date.split("(")[1].split(")")[0]
-            else:
-                date_str = selected_date
-        else:
-            date_str = event_dict_storage.get("date", "今天")
-        if selected_time:
-            time_str = selected_time
-        else:
-            time_str = "09:00"  # 預設時間
-        # 重新解析日期和時間
-        from ..agents.calendar_agent import parse_datetime
-        start_datetime, end_datetime = parse_datetime(date_str, time_str)
-        # 更新事件字典
-        event_dict_storage["start_datetime"] = start_datetime
-        event_dict_storage["end_datetime"] = end_datetime
-        return (
-            "✅ 資訊已補充，請檢查並創建事件",
-            gr.update(visible=False),  # 隱藏缺失資訊區域
-            gr.update(visible=False, choices=[]),
-            gr.update(visible=False, choices=[]),
-            gr.update(visible=False),
-            event_dict_storage.get("summary", ""),
-            start_datetime,
-            end_datetime,
-            event_dict_storage.get("description", ""),
-            event_dict_storage.get("location", ""),
-            event_dict_storage.get("attendees", ""),
-            event_dict_storage
-        )
-    def create_event(summary, start_datetime, end_datetime, description, location, attendees):
-        """創建行事曆事件"""
-        if not summary or not summary.strip():
-            return "❌ 請輸入事件標題", "❌ 請輸入事件標題"
-        if not start_datetime or not start_datetime.strip():
-            return "❌ 請輸入開始時間", "❌ 請輸入開始時間"
-        if not end_datetime or not end_datetime.strip():
-            return "❌ 請輸入結束時間", "❌ 請輸入結束時間"
-        try:
-            status_msg = "🔄 正在創建事件..."
-            # 構建事件字典
-            event_dict = {
-                "summary": summary.strip(),
-                "start_datetime": start_datetime.strip(),
-                "end_datetime": end_datetime.strip(),
-                "description": description.strip() if description else "",
-                "location": location.strip() if location else "",
-                "attendees": attendees.strip() if attendees else "",
-                "timezone": "Asia/Taipei"
-            }
-            # 創建事件
-            result = create_calendar_draft(event_dict)
-            return "✅ 事件已創建", result
-        except Exception as e:
-            error_msg = f"❌ 創建事件時發生錯誤：{str(e)}"
-            print(f"Calendar Tool 錯誤：{e}")
-            import traceback
-            traceback.print_exc()
-            return "❌ 發生錯誤", error_msg
-    def clear_calendar():
-        """清除行事曆相關輸入和輸出"""
-        return (
-            "",  # prompt
-            "等待操作...",  # status
-            "等待生成事件...",  # reflection_display
-            gr.update(visible=False),  # missing_info_group
-            gr.update(visible=False, choices=[]),  # missing_date
-            gr.update(visible=False, choices=[]),  # missing_time
-            gr.update(visible=False),  # fill_missing_btn
-            "", "", "", "", "", "",  # event fields
-            {},  # event_dict_storage
-            ""  # result
-        )
-    # 綁定事件
-    generate_draft_btn.click(
-        fn=generate_draft,
-        inputs=[calendar_prompt_input],
-        outputs=[
-            calendar_status_display,
-            calendar_reflection_display,
-            missing_info_group,
-            missing_date_display,
-            missing_time_display,
-            fill_missing_btn,
-            event_summary_display,
-            event_start_display,
-            event_end_display,
-            event_description_display,
-            event_location_display,
-            event_attendees_display,
-            event_dict_storage,
-            calendar_result_display
-        ]
-    )
-    # 綁定快速選擇按鈕（自動填充提示並生成草稿）
-    quick_outputs = [
-        calendar_prompt_input,  # 更新提示輸入框
-        calendar_status_display,
-        calendar_reflection_display,
-        missing_info_group,
-        missing_date_display,
-        missing_time_display,
-        fill_missing_btn,
-        event_summary_display,
-        event_start_display,
-        event_end_display,
-        event_description_display,
-        event_location_display,
-        event_attendees_display,
-        event_dict_storage,
-        calendar_result_display
-    ]
-    quick_meeting_btn.click(fn=quick_select_meeting, outputs=quick_outputs)
-    quick_client_btn.click(fn=quick_select_client, outputs=quick_outputs)
-    quick_lunch_btn.click(fn=quick_select_lunch, outputs=quick_outputs)
-    quick_oneonone_btn.click(fn=quick_select_oneonone, outputs=quick_outputs)
-    quick_project_btn.click(fn=quick_select_project, outputs=quick_outputs)
-    quick_training_btn.click(fn=quick_select_training, outputs=quick_outputs)
-    quick_social_btn.click(fn=quick_select_social, outputs=quick_outputs)
-    quick_custom_btn.click(fn=quick_select_custom, outputs=quick_outputs)
-    fill_missing_btn.click(
-        fn=fill_missing_info,
-        inputs=[event_dict_storage, missing_date_display, missing_time_display],
-        outputs=[
-            calendar_status_display,
-            missing_info_group,
-            missing_date_display,
-            missing_time_display,
-            fill_missing_btn,
-            event_summary_display,
-            event_start_display,
-            event_end_display,
-            event_description_display,
-            event_location_display,
-            event_attendees_display,
-            event_dict_storage
-        ]
-    )
-    create_event_btn.click(
-        fn=create_event,
-        inputs=[
-            event_summary_display,
-            event_start_display,
-            event_end_display,
-            event_description_display,
-            event_location_display,
-            event_attendees_display
-        ],
-        outputs=[calendar_status_display, calendar_result_display]
-    )
-    clear_calendar_btn.click(
-        fn=clear_calendar,
-        outputs=[
-            calendar_prompt_input,
-            calendar_status_display,
-            calendar_reflection_display,
-            missing_info_group,
-            missing_date_display,
-            missing_time_display,
-            fill_missing_btn,
-            event_summary_display,
-            event_start_display,
-            event_end_display,
-            event_description_display,
-            event_location_display,
-            event_attendees_display,
-            event_dict_storage,
-            calendar_result_display
-        ]
-    )
-    # 示例
-    gr.Examples(
-        examples=[
-            "明天下午2點團隊會議，討論項目進度，地點在會議室A，參與者包括john@example.com",
-            "2026-01-25 上午9點產品發布會，介紹新功能和改進，地點在總部大樓",
-            "後天下午3點客戶會議，討論合作細節，參與者包括客戶代表",
-            "下週一上午10點技術分享會，分享最新的 AI 技術，地點在研發中心"
-        ],
-        inputs=[calendar_prompt_input]
-    )
-    # 頁腳說明
-    gr.Markdown(
-        """
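-        ---
-        **Notes:**
-        1. Calendar events are managed through the Google Calendar API
-        2. Before first use, place `credentials.json` (the OAuth2 credentials downloaded from Google Cloud Console) in the project root
-        3. On first run, a browser opens automatically for authorization, after which a `token.json` file is generated
-        4. Event content is generated by AI; check the result before creating the event
-        5. Include all information in the prompt: event, date, time, location, and attendees
-        6. If the date or time is missing, the system shows dropdown menus for you to choose from
-        7. Supported date formats: YYYY-MM-DD (for example, 2026-01-25) or relative dates (今天, 明天, 後天: today, tomorrow, the day after tomorrow)
-        8. Supported time formats: 24-hour (14:00) or 12-hour (2:00 PM)
-        **Setup steps:**
-        - Go to [Google Cloud Console](https://console.cloud.google.com/) and create a project
-        - Enable the Google Calendar API
-        - Create OAuth2 credentials and download them as `credentials.json`
-        - Place `credentials.json` in the project root
-        - Make sure full access to the Calendar API is granted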
-    )
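Review note: the calendar UI fills its dropdowns from `generate_date_options` and `generate_time_options`, and `fill_missing_info` later pulls the ISO date back out of a label such as `明天 (2026-01-25)`. The sketch below restates that round trip as pure functions that take `today` as a parameter, which makes them easy to test. The function names are hypothetical; only the label format and the 30-minute step come from the diff.

```python
from datetime import date, timedelta
from typing import List

# Relative labels used in the UI: today, tomorrow, the day after tomorrow.
RELATIVE_NAMES = ["今天", "明天", "後天"]


def build_date_options(today: date, days: int = 7) -> List[str]:
    """Build dropdown labels: the first three get a relative name, the rest are plain ISO dates."""
    options = []
    for offset in range(days):
        iso = (today + timedelta(days=offset)).isoformat()
        if offset < len(RELATIVE_NAMES):
            options.append(f"{RELATIVE_NAMES[offset]} ({iso})")
        else:
            options.append(iso)
    return options


def extract_iso_date(label: str) -> str:
    """Recover the ISO date from a label; plain ISO strings pass through unchanged."""
    if "(" in label and ")" in label:
        return label.split("(", 1)[1].split(")", 1)[0].strip()
    return label.strip()


def build_time_options(step_minutes: int = 30) -> List[str]:
    """List HH:MM strings across one day at a fixed step."""
    return [f"{m // 60:02d}:{m % 60:02d}" for m in range(0, 24 * 60, step_minutes)]
```

Passing `today` in, instead of calling `datetime.now()` inside the function, keeps the label format and the parser verifiable against a fixed date.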

 import uuid
 import re
 import time
+import json
+import os
 from typing import Iterator, Tuple
 import gradio as gr
 from langchain_core.messages import HumanMessage
 # graph 和 rag_retriever 將從外部傳入，不在這裡導入
 from ..utils.llm_utils import get_llm_type, is_using_local_llm
+from .email_interface import _create_email_interface
+from .calendar_interface import _create_calendar_interface
+from .private_file_rag_interface import _create_private_file_rag_interface
 def run_research_agent(query: str, graph, thread_id: str = None) -> Iterator[Tuple[str, str, str, str, str]]:
         gr.Markdown(
             """
             <div class="header">
+            <h1>🚀 Deep Research Agent with RAG</h1>
             <p><strong>功能特色：</strong></p>
+            <p>📊 股票資訊查詢 | 🌐 網路搜尋 | 📚 PDF 知識庫查詢（Tree of Thoughts 論文）| 📧 智能郵件助手 | 📅 智能行事曆管理 | 📄 私有文件 RAG 問答</p>
             <p><strong>智能規劃：</strong> 系統會根據問題類型自動選擇合適的研究工具</p>
             <p><strong>本地模型：</strong> 使用 MLX 本地模型，保護隱私，無需 API 金鑰</p>
             </div>
             # Tab 3: Calendar Tool
             with gr.Tab("📅 Calendar Tool"):
                 _create_calendar_interface()
+            # Tab 4: Private File RAG
+            with gr.Tab("📚 Private File RAG"):
+                _create_private_file_rag_interface()
     return demo
     )

deep_agent_rag/ui/private_file_rag_interface.py ADDED Viewed

	@@ -0,0 +1,663 @@

+# deep_agent_rag/ui/private_file_rag_interface.py
+import gradio as gr
+import re
+import json
+import os
+import time
+from ..rag.private_file_rag import get_private_rag_instance, reset_private_rag_instance
+# Assuming is_using_local_llm might be used for warnings/status, similar to email_interface
+# from ..utils.llm_utils import is_using_local_llm
+# Agent log path for debugging (if needed)
+log_path = "/Users/matthuang/Desktop/Deep_Agentic_AI_Tool/.cursor/debug.log"
+def _create_private_file_rag_interface():
+    """創建私有文件 RAG 界面（對話式 Chatbot）"""
+    gr.Markdown(
+        """
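+        ### 📚 Private File RAG Chat
+        Upload your private files (PDF, DOCX, TXT) and the system automatically builds a RAG knowledge base so the AI can answer questions about them.
+        Multi-turn conversation is supported: the AI remembers earlier turns and gives more coherent answers.
+        **How to use:**
+        1. Upload one or more files (PDF, DOCX, TXT)
+        2. Click the "📝 處理文件" (Process Files) button; the system processes the files and builds the RAG system
+        3. Type your question in the chat box, then press Enter or click the "發送" (Send) button
+        4. The AI answers based on the uploaded files, with multi-turn conversation supported
+        **Features:**
+        - 💬 **Conversational interface**: a Gemini-like chat experience with multi-turn conversation
+        - 📄 Multiple file formats: PDF, DOCX, TXT
+        - 🔍 Hybrid search (BM25 + vector retrieval) to improve retrieval accuracy
+        - 🎯 Optional reranking to further refine results
+        - 🧠 Semantic chunking to preserve semantic integrity
+        - 🌐 Automatic document-type detection, with the answer style adjusted to match
+        **LLM selection strategy:**
+        - 🥇 **Groq API first**: if an API key is configured, Groq is preferred (fast, high quality)
+        - 🥈 **Ollama second**: if Groq is unavailable, the system switches automatically to a local Ollama model
+        - 🥉 **MLX last**: if neither is available, a local MLX model is used as the fallback
+        - 💡 **Automatic switching**: the system picks the most suitable LLM based on API quota, service status, and similar signals
+        **Note:** This feature requires the Learn_RAG project to be in the correct location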
+        """
+    )
+    # 對話歷史狀態
+    chat_history = gr.State(value=[])
+    with gr.Row():
+        # 左側：文件上傳和設置
+        with gr.Column(scale=1):
+            # 文件上傳區域
+            file_upload = gr.File(
+                label="📁 上傳文件（PDF、DOCX、TXT）",
+                file_count="multiple",
+                file_types=[ ".pdf", ".docx", ".doc", ".txt"]
+            )
+            # 處理按鈕
+            with gr.Row():
+                process_btn = gr.Button("📝 處理文件", variant="primary", scale=1)
+                clear_files_btn = gr.Button("🗑️ 清除所有", variant="secondary", scale=1)
+            # 處理狀態
+            process_status = gr.Textbox(
+                label="📊 處理狀態",
+                value="等待上傳文件...",
+                interactive=False,
+                lines=2
+            )
+            # 設置區域（使用 Accordion 摺疊）
+            with gr.Accordion("⚙️ 進階設置", open=False):
+                # 處理選項
+                use_semantic_chunking = gr.Checkbox(
+                    label="使用語義分塊（推薦）",
+                    value=False,
+                    info="語義分塊能保持語義完整性，但處理時間較長"
+                )
+                # 分塊參數調整（字符分塊模式）
+                gr.Markdown("**📏 字符分塊參數（僅在未使用語義分塊時有效）**")
+                chunk_size_slider = gr.Slider(
+                    minimum=200,
+                    maximum=1500,
+                    value=500,
+                    step=50,
+                    label="分塊大小（字符數）",
+                    info="建議：300-800"
+                )
+                chunk_overlap_slider = gr.Slider(
+                    minimum=0,
+                    maximum=300,
+                    value=100,
+                    step=25,
+                    label="分塊重疊（字符數）",
+                    info="建議：chunk_size 的 15-25%"
+                )
+                # 語義分塊參數調整（僅在使用語義分塊時有效）
+                gr.Markdown("**🔬 語義分塊參數（僅在使用語義分塊時有效）**")
+                semantic_threshold_slider = gr.Slider(
+                    minimum=0.5,
+                    maximum=2.5,
+                    value=1.0,
+                    step=0.1,
+                    label="語義分塊閾值（敏感度）",
+                    info="建議：0.8-1.2（細粒度）"
+                )
+                semantic_min_chunk_slider = gr.Slider(
+                    minimum=50,
+                    maximum=300,
+                    value=100,
+                    step=25,
+                    label="最小分塊大小（字符數）",
+                    info="建議：50-200"
+                )
+                # RAG 方法選擇
+                gr.Markdown("**🎯 RAG 方法選擇**")
+                enable_adaptive_selection = gr.Checkbox(
+                    label="自動選擇最佳 RAG 方法（推薦）",
+                    value=True,
+                    info="系統會根據查詢和文件特征自動選擇最合適的 RAG 方法"
+                )
+                manual_rag_method = gr.Dropdown(
+                    choices=[
+                        "basic",
+                        "subquery",
+                        "hyde",
+                        "step_back",
+                        "hybrid_subquery_hyde",
+                        "triple_hybrid"
+                    ],
+                    value="basic",
+                    label="手動選擇 RAG 方法",
+                    info="僅在自動選擇關閉時生效",
+                    visible=False
+                )
+                # 查詢選項
+                top_k_slider = gr.Slider(
+                    minimum=1,
+                    maximum=10,
+                    value=3,
+                    step=1,
+                    label="返回結果數量"
+                )
+                use_llm_checkbox = gr.Checkbox(
+                    label="使用 LLM 生成回答",
+                    value=True
+                )
+        # 右側：對話界面
+        with gr.Column(scale=2):
+            # Chatbot 組件
+            # #region agent log
+            try:
+                with open(log_path, "a", encoding="utf-8") as f:
+                    log_entry = {
+                        "sessionId": "debug-session",
+                        "runId": "run1",
+                        "hypothesisId": "A",
+                        "location": "private_file_rag_interface.py:1409", # Adjusted line number
+                        "message": "Before Chatbot creation",
+                        "data": {
+                            "gradio_version": gr.__version__ if hasattr(gr, '__version__') else "unknown"
+                        },
+                        "timestamp": int(time.time() * 1000)
+                    }
+                    f.write(json.dumps(log_entry, ensure_ascii=False) + "\n")
+            except Exception:  # debug logging is best-effort; never block UI creation
+                pass
+            # #endregion
+            # 創建 Chatbot（移除不支持的參數：show_copy_button 和 avatar_images）
+            # #region agent log
+            try:
+                with open(log_path, "a", encoding="utf-8") as f:
+                    log_entry = {
+                        "sessionId": "debug-session",
+                        "runId": "run1",
+                        "hypothesisId": "A",
+                        "location": "private_file_rag_interface.py:1430", # Adjusted line number
+                        "message": "Creating Chatbot with minimal params",
+                        "data": {"params": ["label", "height"]},
+                        "timestamp": int(time.time() * 1000)
+                    }
+                    f.write(json.dumps(log_entry, ensure_ascii=False) + "\n")
+            except Exception:  # debug logging is best-effort; never block UI creation
+                pass
+            # #endregion
+            try:
+                chatbot = gr.Chatbot(
+                    label="💬 對話",
+                    height=500
+                )
+                # #region agent log
+                try:
+                    with open(log_path, "a", encoding="utf-8") as f:
+                        log_entry = {
+                            "sessionId": "debug-session",
+                            "runId": "run1",
+                            "hypothesisId": "A",
+                            "location": "private_file_rag_interface.py:1448", # Adjusted line number
+                            "message": "Chatbot created successfully",
+                            "data": {"success": True},
+                            "timestamp": int(time.time() * 1000)
+                        }
+                        f.write(json.dumps(log_entry, ensure_ascii=False) + "\n")
+                except Exception:  # debug logging is best-effort; never block UI creation
+                    pass
+                # #endregion
+            except Exception as e:
+                # #region agent log
+                try:
+                    with open(log_path, "a", encoding="utf-8") as f:
+                        log_entry = {
+                            "sessionId": "debug-session",
+                            "runId": "run1",
+                            "hypothesisId": "A",
+                            "location": "private_file_rag_interface.py:1460", # Adjusted line number
+                            "message": "Chatbot creation failed",
+                            "data": {
+                                "error_type": type(e).__name__,
+                                "error_message": str(e)
+                            },
+                            "timestamp": int(time.time() * 1000)
+                        }
+                        f.write(json.dumps(log_entry, ensure_ascii=False) + "\n")
+                except Exception:  # debug logging is best-effort; never mask the original error
+                    pass
+                # #endregion
+                raise
+            # 輸入框
+            msg = gr.Textbox(
+                label="輸入問題",
+                placeholder="輸入您的問題，按 Enter 發送...",
+                lines=2,
+                scale=4
+            )
+            # 按鈕區域
+            with gr.Row():
+                submit_btn = gr.Button("📤 發送", variant="primary", scale=1)
+                clear_chat_btn = gr.Button("🗑️ 清除對話", variant="secondary", scale=1)
+            # 查詢狀態
+            query_status = gr.Textbox(
+                label="📊 狀態",
+                value="等待查詢...",
+                interactive=False,
+                lines=1
+            )
+    # Helper functions: convert between the Gradio history format (dict) and the RAG history format (tuple)
+    def history_dict_to_tuple(history_dict):
+        """
+        Convert Gradio-format history (List[Dict]) to RAG-format history (List[Tuple[str, str]]).
+        Args:
+            history_dict: Gradio-format history; each item is {"role": "user"/"assistant", "content": "..."}
+        Returns:
+            RAG-format history; each item is (user_message, assistant_message)
+        """
+        if not history_dict:
+            return []
+        conversation_history = []
+        current_user_msg = None
+        for msg in history_dict:
+            if isinstance(msg, dict):
+                role = msg.get("role", "")
+                content = msg.get("content", "")
+                if role == "user":
+                    current_user_msg = content
+                elif role == "assistant" and current_user_msg is not None:
+                    conversation_history.append((current_user_msg, content))
+                    current_user_msg = None
+            elif isinstance(msg, tuple) and len(msg) == 2:
+                # Already in tuple format: use it as-is (backward compatibility)
+                conversation_history.append(msg)
+        return conversation_history
+    def history_tuple_to_dict(history_tuple):
+        """
+        Convert RAG-format history (List[Tuple[str, str]]) to Gradio-format history (List[Dict]).
+        Args:
+            history_tuple: RAG-format history; each item is (user_message, assistant_message)
+        Returns:
+            Gradio-format history; each item is {"role": "user"/"assistant", "content": "..."}
+        """
+        if not history_tuple:
+            return []
+        history_dict = []
+        for msg in history_tuple:
+            if isinstance(msg, dict):
+                # Already in dict format: use it as-is
+                history_dict.append(msg)
+            elif isinstance(msg, tuple) and len(msg) == 2:
+                # Expand the tuple into two dict-format messages
+                user_msg, assistant_msg = msg
+                history_dict.append({"role": "user", "content": user_msg})
+                history_dict.append({"role": "assistant", "content": assistant_msg})
+        return history_dict
+    def ensure_dict_format(history):
+        """
+        Ensure the history is in Gradio dict format.
+        Args:
+            history: history list (dict or tuple format, or None)
+        Returns:
+            Gradio-format history (List[Dict])
+        """
+        if not history:
+            return []
+        # Inspect the type of the first element to determine the format
+        try:
+            if isinstance(history[0], dict):
+                return history
+            elif isinstance(history[0], tuple):
+                return history_tuple_to_dict(history)
+            else:
+                # Unknown format: return an empty list
+                return []
+        except (IndexError, TypeError):
+            # history is empty or cannot be indexed: return an empty list
+            return []
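
Review note: the three helpers above translate between role-tagged messages and (user, assistant) pairs. The standalone sketch below illustrates the same pairing idea outside Gradio. The names `messages_to_turns` and `turns_to_messages` are hypothetical and are not part of this commit. It also shows one consequence of the pairing rule: a trailing user message without a reply is dropped.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]
Turn = Tuple[str, str]


def messages_to_turns(messages: List[Message]) -> List[Turn]:
    """Pair each user message with the assistant reply that follows it."""
    turns: List[Turn] = []
    pending_user = None
    for m in messages:
        if m["role"] == "user":
            pending_user = m["content"]
        elif m["role"] == "assistant" and pending_user is not None:
            turns.append((pending_user, m["content"]))
            pending_user = None
    return turns


def turns_to_messages(turns: List[Turn]) -> List[Message]:
    """Expand each (user, assistant) pair back into two role-tagged messages."""
    messages: List[Message] = []
    for user_text, assistant_text in turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    return messages


messages = [
    {"role": "user", "content": "What is BM25?"},
    {"role": "assistant", "content": "A ranking function."},
    {"role": "user", "content": "And HyDE?"},  # unanswered: dropped by the pairing
]
turns = messages_to_turns(messages)
round_trip = turns_to_messages(turns)
```

Because unanswered user messages are not carried over, the current question has to be passed to the RAG query separately from the converted history, which is what `query_rag_stream` does.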
+    # 事件處理函數
+    def process_files(files, use_semantic, chunk_size, chunk_overlap, semantic_threshold, semantic_min_chunk):
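+        """
+        Process the uploaded files.
+        Args:
+            files: list of uploaded files
+            use_semantic: whether to use semantic chunking
+            chunk_size: chunk size in characters (character chunking mode only)
+            chunk_overlap: chunk overlap in characters (character chunking mode only)
+            semantic_threshold: semantic chunking threshold (semantic chunking mode only)
+            semantic_min_chunk: minimum chunk size for semantic chunking (semantic chunking mode only)
+        """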
+        if not files:
+            return "❌ 請先上傳文件", "等待上傳文件..."
+        try:
+            # 獲取 RAG 實例
+            rag = get_private_rag_instance()
+            # 更新配置
+            rag.use_semantic_chunking = use_semantic
+            # 更新分塊參數（根據分塊模式選擇）
+            if not use_semantic:
+                # 字符分塊模式：更新字符分塊參數
+                rag.chunk_size = int(chunk_size)
+                rag.chunk_overlap = int(chunk_overlap)
+                print(f"📏 使用字符分塊：chunk_size={rag.chunk_size}, chunk_overlap={rag.chunk_overlap}")
+            else:
+                # 語義分塊模式：更新語義分塊參數
+                rag.semantic_threshold = float(semantic_threshold)
+                rag.semantic_min_chunk_size = int(semantic_min_chunk)
+                print(f"📏 使用語義分塊：threshold={rag.semantic_threshold}, min_chunk_size={rag.semantic_min_chunk_size}")
+            # 處理上傳的文件（Gradio 會自動保存到臨時目錄）
+            # Gradio 6.x 返回的是文件路徑字符串列表
+            file_paths = []
+            for file in files:
+                # Gradio 6.x 返回字符串路徑，舊版本可能返回文件對象
+                if isinstance(file, str):
+                    file_path = file
+                elif hasattr(file, 'name'):
+                    # 舊版本 Gradio 文件對象
+                    file_path = file.name
+                else:
+                    # 嘗試轉換為字符串
+                    file_path = str(file)
+                if os.path.exists(file_path):
+                    file_paths.append(file_path)
+                else:
+                    return f"❌ 文件不存在: {file_path}", "處理失敗"
+            if not file_paths:
+                return "❌ 沒有有效的文件路徑", "處理失敗"
+            # 處理文件
+            documents, status_msg = rag.process_files(file_paths)
+            if documents:
+                return status_msg, "✅ 文件處理完成，可以開始查詢"
+            else:
+                return status_msg, "❌ 處理失敗"
+        except Exception as e:
+            error_msg = f"❌ 處理文件時發生錯誤: {str(e)}"
+            print(error_msg)
+            import traceback
+            traceback.print_exc()
+            return error_msg, "❌ 處理失敗"
+    def query_rag_stream(message, history, top_k, use_llm, enable_adaptive, manual_method):
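+        """
+        Query the RAG system (conversational, streaming output).
+        Args:
+            message: the current user message
+            history: conversation history (Gradio format: List[Dict] or List[Tuple[str, str]])
+            top_k: number of results to return
+            use_llm: whether to use the LLM to generate an answer
+            enable_adaptive: whether automatic RAG method selection is enabled
+            manual_method: manually selected method (only used when automatic selection is off)
+        Yields:
+            Tuple[history, status_msg]: the progressively updated conversation history and status message
+        """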
+        if not message or not message.strip():
+            yield history, "❌ 請輸入問題"
+            return
+        try:
+            # 獲取 RAG 實例
+            rag = get_private_rag_instance()
+            if not rag.is_initialized:
+                error_msg = "❌ RAG 系統尚未初始化，請先處理文件"
+                # 確保 history 是 dict 格式
+                history = ensure_dict_format(history)
+                history.append({"role": "user", "content": message})
+                history.append({"role": "assistant", "content": error_msg})
+                yield history, error_msg
+                return
+            # 設置 RAG 方法選擇參數
+            rag.enable_adaptive_selection = enable_adaptive
+            if not enable_adaptive:
+                rag.selected_rag_method = manual_method
+            else:
+                rag.selected_rag_method = None
+            # 準備對話歷史：轉換為 RAG 需要的 tuple 格式
+            conversation_history = history_dict_to_tuple(history) if history else []
+            # 確保 history 是 dict 格式並添加用戶消息
+            history = ensure_dict_format(history)
+            history.append({"role": "user", "content": message})
+            # 執行查詢（傳入對話歷史，使用流式輸出）
+            if use_llm:
+                # 使用流式查詢
+                answer_generator = rag.query_stream(
+                    query=message,
+                    top_k=int(top_k),
+                    conversation_history=conversation_history
+                )
+                # 初始化回答
+                accumulated_answer = ""
+                history_with_user = history.copy()
+                final_result = {}
+                # 逐步接收流式回答
+                for chunk in answer_generator:
+                    if chunk.get("success") is False:
+                        error = chunk.get("error", "未知錯誤")
+                        error_msg = f"❌ 查詢失敗: {error}"
+                        history_with_user.append({"role": "assistant", "content": error_msg})
+                        yield history_with_user, error_msg
+                        return
+                    # 保存最後一個 chunk 作為最終結果
+                    final_result = chunk
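+                    # Read the answer text carried by this chunk
+                    new_answer = chunk.get("answer", "")
+                    if new_answer:
+                        # Replace rather than append: this assumes each chunk carries the full answer so far
+                        accumulated_answer = new_answer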
+                        # 更新歷史
+                        history_with_answer = history_with_user.copy()
+                        history_with_answer.append({"role": "assistant", "content": accumulated_answer})
+                        yield history_with_answer, "🔄 正在生成回答..."
+                # 獲取最終結果（包含統計信息）
+                rag_method = final_result.get("rag_method", "basic")
+                stats = final_result.get("stats", {})
+                status_msg = f"✅ 查詢完成（方法: {rag_method.upper()}）"
+                if stats:
+                    total_time = stats.get("total_time", 0)
+                    if total_time > 0:
+                        status_msg += f" | 耗時: {total_time:.2f}秒"
+                # 確保最終回答完整
+                if accumulated_answer:
+                    history_with_answer = history_with_user.copy()
+                    history_with_answer.append({"role": "assistant", "content": accumulated_answer})
+                    yield history_with_answer, status_msg
+                else:
+                    error_msg = "⚠️ LLM 未生成回答（可能 LLM 服務未啟動）"
+                    history_with_answer = history_with_user.copy()
+                    history_with_answer.append({"role": "assistant", "content": error_msg})
+                    # Report the warning as the status instead of the "query complete" message
+                    yield history_with_answer, error_msg
+            else:
+                # 不使用 LLM，直接返回檢索結果
+                result = rag.query(
+                    query=message,
+                    top_k=int(top_k),
+                    use_llm=False,
+                    conversation_history=conversation_history
+                )
+                if not result.get("success"):
+                    error = result.get("error", "未知錯誤")
+                    error_msg = f"❌ 查詢失敗: {error}"
+                    history.append({"role": "assistant", "content": error_msg})
+                    yield history, error_msg
+                    return
+                # 格式化檢索結果
+                formatted_context = result.get("formatted_context", "")
+                answer = f"📄 檢索到的相關內容：\n\n{formatted_context}"
+                # 獲取 RAG 方法信息
+                rag_method = result.get("rag_method", "basic")
+                stats = result.get("stats", {})
+                status_msg = f"✅ 查詢完成（方法: {rag_method.upper()}）"
+                if stats:
+                    total_time = stats.get("total_time", 0)
+                    if total_time > 0:
+                        status_msg += f" | 耗時: {total_time:.2f}秒"
+                history.append({"role": "assistant", "content": answer})
+                yield history, status_msg
+        except Exception as e:
+            error_msg = f"❌ 查詢時發生錯誤: {str(e)}"
+            print(error_msg)
+            import traceback
+            traceback.print_exc()
+            # 確保 history 是 dict 格式
+            history = ensure_dict_format(history)
+            if not any(msg.get("role") == "user" and msg.get("content") == message for msg in history):
+                history.append({"role": "user", "content": message})
+            history.append({"role": "assistant", "content": error_msg})
+            yield history, error_msg
+    def clear_chat():
+        """清除對話歷史（不重置 RAG 系統）"""
+        return [], "對話已清除"
+    def clear_all():
+        """清除所有內容（包括 RAG 系統）"""
+        reset_private_rag_instance()
+        empty_history = []
+        return (
+            None,  # file_upload
+            False,  # use_semantic_chunking
+            500,  # chunk_size_slider
+            100,  # chunk_overlap_slider
+            1.0,  # semantic_threshold_slider
+            100,  # semantic_min_chunk_slider
+            True,  # enable_adaptive_selection
+            "basic",  # manual_rag_method
+            "等待上傳文件...",  # process_status
+            empty_history,  # chatbot (對話歷史)
+            empty_history,  # chat_history (狀態)
+            "等待查詢...",  # query_status
+        )
+    # 綁定事件
+    process_btn.click(
+        fn=process_files,
+        inputs=[
+            file_upload,
+            use_semantic_chunking,
+            chunk_size_slider,
+            chunk_overlap_slider,
+            semantic_threshold_slider,
+            semantic_min_chunk_slider
+        ],
+        outputs=[process_status, query_status]
+    )
+    # 自動選擇開關時顯示/隱藏手動選擇下拉菜單
+    def toggle_manual_method(enable_adaptive):
+        return gr.update(visible=not enable_adaptive)
+    enable_adaptive_selection.change(
+        fn=toggle_manual_method,
+        inputs=[enable_adaptive_selection],
+        outputs=[manual_rag_method]
+    )
+    # 提交消息（按鈕點擊或 Enter 鍵）
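+    def submit_message(message, history, top_k, use_llm, enable_adaptive, manual_method):
+        """Submit a message and update the conversation history (streaming output)."""
+        if not message or not message.strip():
+            # Make sure history is in dict format
+            history = ensure_dict_format(history)
+            # This function is a generator, so outputs must be yielded;
+            # a value passed to `return` never reaches the caller's loop.
+            yield history, history, "", "等待查詢..."
+            return
+        # Clear the input box and run the streaming query
+        for new_history, status in query_rag_stream(message, history, top_k, use_llm, enable_adaptive, manual_method):
+            yield new_history, new_history, "", status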
+    # 綁定提交按鈕和 Enter 鍵
+    submit_btn.click(
+        fn=submit_message,
+        inputs=[msg, chat_history, top_k_slider, use_llm_checkbox, enable_adaptive_selection, manual_rag_method],
+        outputs=[chatbot, chat_history, msg, query_status]
+    )
+    msg.submit(
+        fn=submit_message,
+        inputs=[msg, chat_history, top_k_slider, use_llm_checkbox, enable_adaptive_selection, manual_rag_method],
+        outputs=[chatbot, chat_history, msg, query_status]
+    )
+    # 清除對話按鈕（需要更新 chat_history 狀態）
+    def clear_chat_with_state():
+        """清除對話歷史並更新狀態"""
+        empty_history = []
+        return empty_history, empty_history, "對話已清除"
+    clear_chat_btn.click(
+        fn=clear_chat_with_state,
+        outputs=[chatbot, chat_history, query_status]
+    )
+    # 清除所有按鈕
+    clear_files_btn.click(
+        fn=clear_all,
+        outputs=[
+            file_upload,
+            use_semantic_chunking,
+            chunk_size_slider,
+            chunk_overlap_slider,
+            semantic_threshold_slider,
+            semantic_min_chunk_slider,
+            enable_adaptive_selection,
+            manual_rag_method,
+            process_status,
+            chatbot,  # 更新 chatbot 顯示
+            chat_history,  # 更新 chat_history 狀態
+            query_status
+        ]
+    )
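
Review note: `submit_message` streams by yielding `(chatbot, chat_history, msg, query_status)` tuples, so every exit path has to yield its outputs. A plain `return value` inside a generator is not seen by whatever iterates it. The Gradio-free sketch below shows the pattern with a reduced two-value output. `stream_answer` and `submit` are hypothetical stand-ins, not functions from this commit, and the assumption that each chunk carries the full answer so far mirrors what the handler's replace-not-append logic implies.

```python
from typing import Iterator, List, Tuple

History = List[dict]


def stream_answer(message: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming RAG query: yields the answer so far."""
    text = ""
    for word in ("echo:", message):
        text = f"{text} {word}".strip()
        yield text


def submit(message: str, history: History) -> Iterator[Tuple[History, str]]:
    """Yield (history, status) pairs, the shape a streaming UI callback consumes."""
    if not message.strip():
        # In a generator, `return value` would be swallowed: yield first, then stop.
        yield history, "waiting"
        return
    base = history + [{"role": "user", "content": message}]
    for partial in stream_answer(message):
        # Build a fresh list per update so earlier yields are never mutated.
        yield base + [{"role": "assistant", "content": partial}], "generating"


updates = list(submit("hi", []))
empty = list(submit("   ", []))
```

Here `updates` holds one entry per streamed chunk, and `empty` holds the single update emitted for a blank message.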

deep_agent_rag/utils/llm_utils.py CHANGED Viewed

@@ -1,11 +1,12 @@
 """
 LLM 工具函數
 提供 LLM 實例的創建和管理
-優先使用 Groq API，額度用完後自動切換到本地 MLX 模型
 """
 import warnings
 from typing import Optional
 from langchain_groq import ChatGroq
 from ..models import MLXChatModel, load_mlx_model
 from ..config import (
     MLX_MAX_TOKENS,
@@ -14,7 +15,12 @@ from ..config import (
     GROQ_MODEL,
     GROQ_MAX_TOKENS,
     GROQ_TEMPERATURE,
-    USE_GROQ_FIRST
 )
 # 全局變量：跟踪當前使用的 LLM 類型
@@ -29,30 +35,17 @@ def get_llm_type() -> str:
 def is_using_local_llm() -> bool:
     """檢查是否正在使用本地 LLM"""
-    return _current_llm_type == "mlx" or _groq_quota_exceeded
 def get_llm():
     """
     獲取 LLM 實例
-    優先使用 Groq API，額度用完後自動切換到本地 MLX 模型
     """
     global _current_llm_type, _groq_quota_exceeded
-    # 如果已經知道 Groq 額度用完，直接使用本地模型
-    if _groq_quota_exceeded:
-        if _current_llm_type != "mlx":
-            print("⚠️ 警告：Groq API 額度已用完，已切換到本地 MLX 模型 (Qwen2.5)")
-        _current_llm_type = "mlx"
-        model, tokenizer = load_mlx_model()
-        return MLXChatModel(
-            model=model,
-            tokenizer=tokenizer,
-            max_tokens=MLX_MAX_TOKENS,
-            temperature=MLX_TEMPERATURE
-        )
-    # 嘗試使用 Groq API
     if USE_GROQ_FIRST and GROQ_API_KEY:
         try:
             groq_llm = ChatGroq(
@@ -61,48 +54,61 @@ def get_llm():
                 max_tokens=GROQ_MAX_TOKENS,
                 temperature=GROQ_TEMPERATURE
             )
-            # 測試連接（通過一個簡單的調用來驗證）
-            # 注意：這裡不實際調用，只是創建實例
             _current_llm_type = "groq"
             print("✅ 使用 Groq API (優先)")
             return groq_llm
         except Exception as e:
-            # 如果創建失敗，可能是 API key 無效
             print(f"⚠️ Groq API 初始化失敗: {e}")
-            _groq_quota_exceeded = True
-            _current_llm_type = "mlx"
-            print("⚠️ 警告：已切換到本地 MLX 模型 (Qwen2.5)")
-            model, tokenizer = load_mlx_model()
-            return MLXChatModel(
-                model=model,
-                tokenizer=tokenizer,
-                max_tokens=MLX_MAX_TOKENS,
-                temperature=MLX_TEMPERATURE
             )
-    else:
-        # 如果沒有配置 Groq 或選擇不使用，直接使用本地模型
-        if not GROQ_API_KEY:
-            print("ℹ️ 未配置 GROQ_API_KEY，使用本地 MLX 模型")
-        _current_llm_type = "mlx"
-        model, tokenizer = load_mlx_model()
-        return MLXChatModel(
-            model=model,
-            tokenizer=tokenizer,
-            max_tokens=MLX_MAX_TOKENS,
-            temperature=MLX_TEMPERATURE
-        )
 def handle_groq_error(error: Exception) -> Optional[MLXChatModel]:
     """
     處理 Groq API 錯誤
-    如果是額度用完錯誤，切換到本地模型
     Args:
         error: 捕獲的異常
     Returns:
-        如果切換到本地模型，返回 MLXChatModel；否則返回 None
     """
     global _current_llm_type, _groq_quota_exceeded
@@ -121,10 +127,27 @@ def handle_groq_error(error: Exception) -> Optional[MLXChatModel]:
     if any(indicator in error_str for indicator in quota_indicators):
         if not _groq_quota_exceeded:
             _groq_quota_exceeded = True
-            warning_msg = "⚠️ 警告：Groq API 額度已用完，正在切換到本地 MLX 模型 (Qwen2.5)"
             print(warning_msg)
             warnings.warn(warning_msg, UserWarning)
         _current_llm_type = "mlx"
         model, tokenizer = load_mlx_model()
         return MLXChatModel(

 """
 LLM utility functions.
 Creates and manages LLM instances.
+Priority order: Groq API > Ollama > MLX model
 """
 import warnings
 from typing import Optional
 from langchain_groq import ChatGroq
+from langchain_ollama import ChatOllama
 from ..models import MLXChatModel, load_mlx_model
 from ..config import (
     MLX_MAX_TOKENS,
     GROQ_MODEL,
     GROQ_MAX_TOKENS,
     GROQ_TEMPERATURE,
+    USE_GROQ_FIRST,
+    OLLAMA_BASE_URL,
+    OLLAMA_MODEL,
+    OLLAMA_MAX_TOKENS,
+    OLLAMA_TEMPERATURE,
+    USE_OLLAMA,
 )
 # 全局變量：跟踪當前使用的 LLM 類型
 def is_using_local_llm() -> bool:
     """檢查是否正在使用本地 LLM"""
+    return _current_llm_type in ["mlx", "ollama"] or _groq_quota_exceeded
 def get_llm():
     """
     獲取 LLM 實例
+    優先順序：Groq API > Ollama > MLX 模型
     """
     global _current_llm_type, _groq_quota_exceeded
+    # 優先順序 1: Groq API
     if USE_GROQ_FIRST and GROQ_API_KEY:
         try:
             groq_llm = ChatGroq(
                 max_tokens=GROQ_MAX_TOKENS,
                 temperature=GROQ_TEMPERATURE
             )
             _current_llm_type = "groq"
             print("✅ 使用 Groq API (優先)")
             return groq_llm
         except Exception as e:
+            # 如果創建失敗，繼續嘗試其他選項
             print(f"⚠️ Groq API 初始化失敗: {e}")
+            # 不立即設置 _groq_quota_exceeded，先嘗試 Ollama
+    # 優先順序 2: Ollama (Llama 3.2 或其他模型)
+    if USE_OLLAMA:
+        try:
+            ollama_llm = ChatOllama(
+                base_url=OLLAMA_BASE_URL,
+                model=OLLAMA_MODEL,
+                num_predict=OLLAMA_MAX_TOKENS,
+                temperature=OLLAMA_TEMPERATURE,
             )
+            _current_llm_type = "ollama"
+            print(f"✅ 使用 Ollama 模型 ({OLLAMA_MODEL})")
+            return ollama_llm
+        except Exception as e:
+            print(f"⚠️ Ollama 初始化失敗: {e}")
+            print("   請確保 Ollama 服務正在運行: ollama serve")
+            print("   或檢查模型是否已下載: ollama pull " + OLLAMA_MODEL)
+    # 優先順序 3: MLX 模型（備援）
+    # 如果 Groq 額度用完，記錄狀態
+    if _groq_quota_exceeded and _current_llm_type != "mlx":
+        print("⚠️ 警告：Groq API 額度已用完，已切換到本地 MLX 模型 (Qwen2.5)")
+    elif _current_llm_type != "mlx":
+        if not GROQ_API_KEY and not USE_OLLAMA:
+            print("ℹ️ 未配置 Groq API 或 Ollama，使用本地 MLX 模型")
+        elif not USE_OLLAMA:
+            print("ℹ️ Ollama 未啟用，使用本地 MLX 模型作為備援")
+    _current_llm_type = "mlx"
+    model, tokenizer = load_mlx_model()
+    return MLXChatModel(
+        model=model,
+        tokenizer=tokenizer,
+        max_tokens=MLX_MAX_TOKENS,
+        temperature=MLX_TEMPERATURE
+    )
 def handle_groq_error(error: Exception) -> Optional[MLXChatModel]:
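     """
     Handle a Groq API error.
+    On a quota-exceeded error, try switching to Ollama first; if that fails, fall back to the MLX model.
     Args:
         error: the caught exception
     Returns:
+        ChatOllama or MLXChatModel if the handler switched to a local model; otherwise None
     """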
     global _current_llm_type, _groq_quota_exceeded
     if any(indicator in error_str for indicator in quota_indicators):
         if not _groq_quota_exceeded:
             _groq_quota_exceeded = True
+            warning_msg = "⚠️ 警告：Groq API 額度已用完"
             print(warning_msg)
             warnings.warn(warning_msg, UserWarning)
+        # 先嘗試使用 Ollama
+        if USE_OLLAMA:
+            try:
+                ollama_llm = ChatOllama(
+                    base_url=OLLAMA_BASE_URL,
+                    model=OLLAMA_MODEL,
+                    num_predict=OLLAMA_MAX_TOKENS,
+                    temperature=OLLAMA_TEMPERATURE,
+                )
+                _current_llm_type = "ollama"
+                print(f"✅ 已切換到 Ollama 模型 ({OLLAMA_MODEL})")
+                return ollama_llm
+            except Exception as e:
+                print(f"⚠️ Ollama 切換失敗: {e}")
+                print("   回退到 MLX 模型")
+        # 回退到 MLX 模型
         _current_llm_type = "mlx"
         model, tokenizer = load_mlx_model()
         return MLXChatModel(
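
Review note: the new `get_llm` walks a fixed priority list (Groq, then Ollama, then MLX) and falls through whenever a provider is disabled or its construction raises. The sketch below shows that pattern in isolation. `pick_first_available` and the factories are hypothetical stand-ins, not code from this commit. One caveat worth checking against the real clients: a client constructor may succeed even when the service is unreachable, in which case the failure only appears on the first call and this kind of construction-time fallback would not catch it.

```python
from typing import Callable, List, Optional, Tuple

# Each entry is (name, enabled, factory); the factories here are hypothetical stand-ins.
Provider = Tuple[str, bool, Callable[[], object]]


def pick_first_available(providers: List[Provider]) -> Tuple[str, object]:
    """Return (name, instance) for the first enabled provider whose factory succeeds."""
    last_error: Optional[Exception] = None
    for name, enabled, factory in providers:
        if not enabled:
            continue
        try:
            return name, factory()
        except Exception as exc:  # keep going: the next provider is the fallback
            last_error = exc
            print(f"{name} failed to initialise: {exc}")
    raise RuntimeError("no provider available") from last_error


def failing_factory() -> object:
    """Simulate a provider whose initialisation raises."""
    raise ConnectionError("service unreachable")


chosen, llm = pick_first_available([
    ("groq", False, lambda: "groq-client"),  # disabled, e.g. no API key configured
    ("ollama", True, failing_factory),       # enabled but fails to initialise
    ("mlx", True, lambda: "mlx-model"),      # last-resort local model
])
```

Keeping the priority order in one list would also let `get_llm` and `handle_groq_error` share the same fallback logic instead of repeating the `ChatOllama(...)` construction in both places.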

pyproject.toml CHANGED Viewed

@@ -17,6 +17,7 @@ dependencies = [
     "yfinance>=0.2.66",
     "langgraph>=1.0.4",
     "langchain-groq>=1.1.0",
     "grandalf>=0.8",
     "langserve[all]>=0.3.3",
     "fastapi>=0.124.2",
@@ -36,5 +37,15 @@ dependencies = [
     "google-auth-httplib2>=0.3.0",
     "google-auth-oauthlib>=1.2.3",
     "googlemaps>=4.10.0",
     "parlant @ git+https://github.com/emcie-co/parlant@develop",
 ]

     "yfinance>=0.2.66",
     "langgraph>=1.0.4",
     "langchain-groq>=1.1.0",
+    "langchain-ollama>=0.1.0",
     "grandalf>=0.8",
     "langserve[all]>=0.3.3",
     "fastapi>=0.124.2",
     "google-auth-httplib2>=0.3.0",
     "google-auth-oauthlib>=1.2.3",
     "googlemaps>=4.10.0",
     "parlant @ git+https://github.com/emcie-co/parlant@develop",
+    # Learn_RAG project dependencies (used by the private-file RAG feature)
+    "arxiv>=2.3.1",
+    "langchain-text-splitters>=0.0.1",
+    "rank-bm25>=0.2.2",
+    "chromadb>=0.4.22",
+    "docx2txt>=0.8",
+    "langchain-experimental>=0.0.50",
 ]

src/__init__.py ADDED Viewed

	@@ -0,0 +1,37 @@

+"""
+RAG 系統模組套件
+"""
+from .document_processor import DocumentProcessor
+from .retrievers import (
+    BaseRetriever,
+    BM25Retriever,
+    VectorRetriever,
+    HybridSearch,
+    Reranker,
+    RAGPipeline,
+)
+from .prompt_formatter import PromptFormatter
+from .llm_integration import OllamaLLM
+from .subquery_rag import SubQueryDecompositionRAG
+from .hyde_rag import HyDERAG
+from .hybrid_subquery_hyde_rag import HybridSubqueryHyDERAG
+from .step_back_rag import StepBackRAG
+from .triple_hybrid_rag import TripleHybridRAG
+__all__ = [
+    "DocumentProcessor",
+    "BaseRetriever",
+    "BM25Retriever",
+    "VectorRetriever",
+    "HybridSearch",
+    "Reranker",
+    "RAGPipeline",
+    "PromptFormatter",
+    "OllamaLLM",
+    "SubQueryDecompositionRAG",
+    "HyDERAG",
+    "HybridSubqueryHyDERAG",
+    "StepBackRAG",
+    "TripleHybridRAG",
+]

src/document_processor.py ADDED Viewed

	@@ -0,0 +1,590 @@

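+"""
+Document processing module: loads arXiv papers and splits their text.
+Supports local files: PDF, DOCX, TXT.
+Supports two chunking strategies:
+1. Character chunking (default): fixed-size chunks by character count; fast
+2. Semantic chunking (optional): chunks split by semantic similarity; preserves semantic coherence
+"""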
+from typing import List, Dict, Optional, Any
+from langchain_text_splitters import RecursiveCharacterTextSplitter
+from pathlib import Path
+import os
+import arxiv
+import re
+# 嘗試導入語義分塊器（需要 langchain-experimental）
+try:
+    from langchain_experimental.text_splitter import SemanticChunker
+    SEMANTIC_CHUNKER_AVAILABLE = True
+except ImportError:
+    SEMANTIC_CHUNKER_AVAILABLE = False
+class DocumentProcessor:
+    """
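+    Processes arXiv paper documents: splitting and preparation
+    Supports two chunking modes:
+    - Character chunking (default): fast and stable, suitable for most scenarios
+    - Semantic chunking (optional): smarter and preserves semantic coherence, but needs extra dependencies and compute time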
+    """
+    def __init__(
+        self,
+        chunk_size: int = 1000,
+        chunk_overlap: int = 200,
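+        embeddings: Optional[Any] = None,  # optional: embedding model used for semantic chunking
+        use_semantic_chunking: bool = False,  # whether to use semantic chunking
+        breakpoint_threshold_amount: float = 1.5,  # semantic chunking sensitivity (multiple of the standard deviation)
+        min_chunk_size: int = 100  # minimum chunk size for semantic chunking (characters)
+    ):
+        """
+        Initialize the document processor
+        Args:
+            chunk_size: size of each chunk in characters; character chunking mode only
+            chunk_overlap: overlap between chunks in characters; character chunking mode only
+            embeddings: embedding model object used to compute semantic distance (optional)
+                       Required when use_semantic_chunking=True
+            use_semantic_chunking: whether to use semantic chunking
+                                  True: semantic chunking (embeddings must be provided)
+                                  False: character chunking (default)
+            breakpoint_threshold_amount: sensitivity parameter for semantic chunking
+                                        Larger values give fewer, larger chunks
+                                        Smaller values give more, smaller chunks
+                                        Suggested range: 1.0 - 2.0, default 1.5
+            min_chunk_size: minimum chunk size for semantic chunking (characters)
+                           Chunks below this size are merged into adjacent chunks
+                           Default: 100 characters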
+        """
+        self.embeddings = embeddings
+        self.use_semantic_chunking = use_semantic_chunking
+        self.min_chunk_size = min_chunk_size
+        # Semantic chunking was requested
+        if use_semantic_chunking:
+            # Check that the required dependency is installed
+            if not SEMANTIC_CHUNKER_AVAILABLE:
+                raise ImportError(
+                    "Semantic chunking requires the langchain-experimental package.\n"
+                    "Run: pip install langchain-experimental\n"
+                    "or use character chunking (use_semantic_chunking=False)"
+                )
+            # Check that embeddings were provided
+            if embeddings is None:
+                raise ValueError(
+                    "The embeddings argument is required for semantic chunking.\n"
+                    "Example:\n"
+                    "  from langchain_community.embeddings import HuggingFaceEmbeddings\n"
+                    "  embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')\n"
+                    "  processor = DocumentProcessor(embeddings=embeddings, use_semantic_chunking=True)"
+                )
+            # Initialize the semantic chunker
+            # "standard_deviation" strategy: split where the semantic distance between adjacent sentences
+            # exceeds the mean distance by breakpoint_threshold_amount standard deviations
+            self.text_splitter = SemanticChunker(
+                embeddings,
+                breakpoint_threshold_type="standard_deviation",
+                breakpoint_threshold_amount=breakpoint_threshold_amount
+            )
+            print(f"✓ Using semantic chunking (sensitivity: {breakpoint_threshold_amount}, minimum chunk size: {min_chunk_size} characters)")
+        else:
+            # Use traditional character chunking (default mode)
+            self.text_splitter = RecursiveCharacterTextSplitter(
+                chunk_size=chunk_size,
+                chunk_overlap=chunk_overlap,
+                length_function=len,
+            )
+            print(f"✓ Using character chunking (size: {chunk_size} characters, overlap: {chunk_overlap} characters)")
+    def _post_process_chunks(self, chunks: List[str]) -> List[str]:
+        """
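+        Post-process chunks: merge chunks that are too small
+        Semantic chunking can produce very small chunks (for example, only a few words),
+        which may not carry enough context. This method:
+        1. Merges chunks smaller than min_chunk_size into adjacent chunks
+        2. Ensures, where possible, that the final chunks are large enough
+        Args:
+            chunks: original list of chunks produced by the splitter
+        Returns:
+            List of chunks after merging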
+        """
+        # 如果使用字符分塊，不需要後處理（因為已經有固定大小）
+        if not self.use_semantic_chunking:
+            return chunks
+        # 如果沒有 chunks，直接返回
+        if not chunks:
+            return chunks
+        processed = []
+        current_small_chunk = ""  # 累積的小 chunk
+        for chunk in chunks:
+            chunk_stripped = chunk.strip()
+            chunk_length = len(chunk_stripped)
+            # 如果當前 chunk 太小，嘗試與下一個合併
+            if chunk_length < self.min_chunk_size:
+                # 累積到臨時變數中
+                if current_small_chunk:
+                    current_small_chunk += "\n\n" + chunk
+                else:
+                    current_small_chunk = chunk
+            else:
+                # 當前 chunk 足夠大
+                # 如果有累積的小 chunk，先處理它
+                if current_small_chunk:
+                    current_small_chunk_stripped = current_small_chunk.strip()
+                    if len(current_small_chunk_stripped) >= self.min_chunk_size:
+                        # 累積後足夠大，作為獨立 chunk
+                        processed.append(current_small_chunk)
+                    else:
+                        # 累積後還是太小，合併到上一個 chunk（如果存在）
+                        if processed:
+                            processed[-1] += "\n\n" + current_small_chunk
+                        else:
+                            # 如果沒有上一個 chunk，還是要保留
+                            processed.append(current_small_chunk)
+                    current_small_chunk = ""
+                # 添加當前足夠大的 chunk
+                processed.append(chunk)
+        # 處理最後的累積小 chunk
+        if current_small_chunk:
+            current_small_chunk_stripped = current_small_chunk.strip()
+            if len(current_small_chunk_stripped) >= self.min_chunk_size:
+                # 足夠大，作為獨立 chunk
+                processed.append(current_small_chunk)
+            elif processed:
+                # 太小，合併到最後一個 chunk
+                processed[-1] += "\n\n" + current_small_chunk
+            else:
+                # 如果沒有其他 chunks，還是要保留
+                processed.append(current_small_chunk)
+        return processed
+    def fetch_papers(self, query: str, max_results: int = 10) -> List[Dict]:
+        """
+        Fetch papers from arXiv
+        Args:
+            query: search query (for example "cat:cs.AI")
+            max_results: maximum number of results
+        Returns:
+            List of papers, each with title, abstract, and related fields
+        """
+        search = arxiv.Search(
+            query=query,
+            max_results=max_results,
+            sort_by=arxiv.SortCriterion.SubmittedDate
+        )
+        papers = []
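+        # Search.results() is deprecated in arxiv 2.x; fetch results through a Client instead
+        for paper in arxiv.Client().results(search):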
+            papers.append({
+                "title": paper.title,
+                "authors": [author.name for author in paper.authors],
+                "summary": paper.summary,
+                "published": str(paper.published),
+                "arxiv_id": paper.entry_id.split('/')[-1],
+                "arxiv_url": paper.entry_id,
+                "pdf_url": paper.pdf_url,
+                "categories": paper.categories,
+            })
+        return papers
+    def process_documents(self, papers: List[Dict]) -> List[Dict]:
+        """
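+        Process papers, splitting each one into chunks
+        Args:
+            papers: list of papers
+        Returns:
+            Processed document chunks, each with content and metadata
+        """
+        documents = []
+        for paper in papers:
+            # Combine the paper's text (title + abstract); only these two fields are chunked, not the full PDF
+            # Keep the "\n\n" separator as a structural hint for semantic breakpoints
+            full_text = f"Title: {paper['title']}\n\nAbstract: {paper['summary']}"
+            # Split the text (character or semantic chunking, depending on the selected mode)
+            chunks = self.text_splitter.split_text(full_text)
+            # Post-process: merge undersized chunks (semantic chunking mode only)
+            chunks = self._post_process_chunks(chunks)
+            # Create a document object for each chunk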
+            for i, chunk in enumerate(chunks):
+                doc = {
+                    "content": chunk,
+                    "metadata": {
+                        "title": paper['title'],
+                        "arxiv_id": paper['arxiv_id'],
+                        "arxiv_url": paper['arxiv_url'],
+                        "pdf_url": paper['pdf_url'],
+                        "authors": paper['authors'],
+                        "published": paper['published'],
+                        "categories": paper['categories'],
+                        "chunk_index": i,
+                        "total_chunks": len(chunks),
+                        "chunking_method": "semantic" if self.use_semantic_chunking else "character"
+                    }
+                }
+                documents.append(doc)
+        return documents
+    def get_texts_and_metadatas(self, documents: List[Dict]):
+        """
+        Extract texts and metadata from a list of documents
+        Args:
+            documents: list of documents
+        Returns:
+            (texts, metadatas) tuple
+        """
+        texts = [doc["content"] for doc in documents]
+        metadatas = [doc["metadata"] for doc in documents]
+        return texts, metadatas
+    @staticmethod
+    def clean_extracted_text(text: str) -> str:
+        """
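+        Clean text extracted from PDF/DOCX: remove extra spaces and repair per-character line breaks
+        Some PDF extraction tools insert a space or newline between characters, especially in Chinese text.
+        This method:
+        1. Repairs the "one character per line" problem (merges single-character lines)
+        2. Removes extra spaces between Chinese characters
+        3. Keeps spaces between English words
+        4. Keeps appropriate spacing around punctuation
+        5. Aims to keep genuine paragraph breaks
+        Args:
+            text: raw extracted text
+        Returns:
+            Cleaned text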
+        """
+        if not text:
+            return text
+        # 步驟 0: 修復「每個字符一行」的問題
+        # 檢測模式：每行只有一個字符（可能是中文字符、標點、或單個字母/數字）
+        # 將這些單字符行合併成連續文本
+        lines = text.split('\n')
+        merged_lines = []
+        i = 0
+        def is_single_char_line(line: str) -> bool:
+            """
+            判斷是否為單字符行
+            考慮：去除空格後長度 <= 3（可能是單字符+標點，或單字符+空格）
+            """
+            stripped = line.strip()
+            if not stripped:
+                return False  # 空行不算
+            # 如果去除空格後長度 <= 3，且主要是中文字符、標點或單個字母/數字
+            if len(stripped) <= 3:
+                # 檢查是否主要是單個字符（可能帶標點或空格）
+                # 移除所有空格後，如果長度 <= 2，認為是單字符行
+                no_space = stripped.replace(' ', '')
+                if len(no_space) <= 2:
+                    return True
+            return False
+        while i < len(lines):
+            line = lines[i]
+            stripped_line = line.strip()
+            # 如果當前行是單字符行
+            if is_single_char_line(line):
+                # 收集連續的單字符行（包括空行，因為空行可能是分隔符）
+                merged_chars = []
+                j = i
+                consecutive_single_chars = 0
+                while j < len(lines):
+                    current_line = lines[j]
+                    current_stripped = current_line.strip()
+                    if is_single_char_line(current_line):
+                        # 是單字符行，收集字符（去除空格）
+                        char = current_stripped.replace(' ', '')
+                        if char:
+                            merged_chars.append(char)
+                        consecutive_single_chars += 1
+                        j += 1
+                    elif not current_stripped:
+                        # 空行：如果前面有單字符，且後面可能還有單字符，跳過空行
+                        # 檢查下一行是否也是單字符
+                        if j + 1 < len(lines) and is_single_char_line(lines[j + 1]):
+                            # 空行後面還有單字符，跳過空行繼續收集
+                            j += 1
+                        else:
+                            # 空行後面沒有單字符了，停止收集
+                            break
+                    else:
+                        # 遇到正常行，停止收集
+                        break
+                # 如果收集到多個單字符，合併它們
+                if len(merged_chars) > 1:
+                    merged_text = ''.join(merged_chars)
+                    merged_lines.append(merged_text)
+                    i = j
+                    continue
+                elif len(merged_chars) == 1 and consecutive_single_chars > 1:
+                    # 只有一個字符但有多行（可能是空格導致的），也合併
+                    merged_text = ''.join(merged_chars)
+                    merged_lines.append(merged_text)
+                    i = j
+                    continue
+                else:
+                    # 只有一個單字符，且確實只有一行，保留原樣
+                    if merged_chars:
+                        merged_lines.append(merged_chars[0])
+                    i = j
+                    continue
+            else:
+                # 正常行，直接添加
+                if stripped_line:  # 非空行
+                    merged_lines.append(stripped_line)
+                i += 1
+        # 重新組合文本
+        text = '\n'.join(merged_lines)
+        # 步驟 0.5: 再次處理可能的殘留問題
+        # 如果還有單字符行（可能是第一次處理遺漏的），再次處理
+        lines = text.split('\n')
+        final_lines = []
+        i = 0
+        while i < len(lines):
+            line = lines[i].strip()
+            if is_single_char_line(line):
+                # 再次收集連續的單字符行
+                merged_chars = []
+                j = i
+                while j < len(lines) and is_single_char_line(lines[j]):
+                    char = lines[j].strip().replace(' ', '')
+                    if char:
+                        merged_chars.append(char)
+                    j += 1
+                if len(merged_chars) > 1:
+                    final_lines.append(''.join(merged_chars))
+                    i = j
+                else:
+                    if merged_chars:
+                        final_lines.append(merged_chars[0])
+                    i = j
+            else:
+                if line:
+                    final_lines.append(line)
+                i += 1
+        text = '\n'.join(final_lines)
+        # 1. Remove whitespace between Chinese characters
+        # Pattern: Chinese character + whitespace + Chinese character
+        chinese_char_pattern = r'([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])\s+([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])'
+        text = re.sub(chinese_char_pattern, r'\1\2', text)
+        # 2. 移除中文和標點符號之間的多餘空格
+        # 中文 + 空格 + 標點符號
+        chinese_punct_pattern = r'([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])\s+([，。、；：！？""''（）【】《》])'
+        text = re.sub(chinese_punct_pattern, r'\1\2', text)
+        # 標點符號 + 空格 + 中文
+        # 使用 re.escape 來正確處理標點符號，避免轉義序列警告
+        punct_chars = '，。、；：！？""''（）【】《》'
+        punct_chinese_pattern = f'([{re.escape(punct_chars)}])\\s+([\\u4e00-\\u9fff\\u3400-\\u4dbf\\uf900-\\ufaff])'
+        text = re.sub(punct_chinese_pattern, r'\1\2', text)
+        # 3. 移除數字和中文之間的多餘空格（例如："500  公里" -> "500公里"）
+        number_chinese_pattern = r'(\d+)\s+([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])'
+        text = re.sub(number_chinese_pattern, r'\1\2', text)
+        chinese_number_pattern = r'([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])\s+(\d+)'
+        text = re.sub(chinese_number_pattern, r'\1\2', text)
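+        # 4. Second pass over spaces between Chinese characters (e.g. "Nebula-X 跨次 元量" -> "Nebula-X 跨次元量")
+        # This is the same pattern as step 1: re.sub matches do not overlap, so in a run like "中 文 字"
+        # one pass removes only every other gap. Spaces between English words are untouched.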
+        mixed_space_pattern = r'([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])\s+([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])'
+        text = re.sub(mixed_space_pattern, r'\1\2', text)
+        # 5. 移除多個連續空格（保留單個空格，用於英文單詞之間）
+        text = re.sub(r' +', ' ', text)
+        # 6. 清理行首行尾的空格（但保留換行符）
+        lines = text.split('\n')
+        cleaned_lines = [line.strip() for line in lines]
+        text = '\n'.join(cleaned_lines)
+        # 7. 移除多個連續的換行符（保留最多兩個，用於段落分隔）
+        text = re.sub(r'\n{3,}', '\n\n', text)
+        # 8. Final cleanup: remove any whitespace still left between Chinese characters
+        # (same pattern as steps 1 and 4, re-applied after whitespace normalization)
+        text = re.sub(r'([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])\s+([\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff])', r'\1\2', text)
+        return text
+    def load_from_file(self, file_path: str) -> Dict:
+        """
+        從本地檔案載入文檔（支援 PDF, DOCX, TXT 等）
+        Args:
+            file_path: 檔案路徑
+        Returns:
+            文檔字典，包含內容和元數據
+        """
+        file_path = Path(file_path)
+        if not file_path.exists():
+            raise FileNotFoundError(f"檔案不存在: {file_path}")
+        file_ext = file_path.suffix.lower()
+        file_name = file_path.stem
+        file_size = os.path.getsize(file_path)
+        # 根據檔案類型選擇不同的加載器
+        if file_ext == '.pdf':
+            try:
+                from langchain_community.document_loaders import PyPDFLoader
+                loader = PyPDFLoader(str(file_path))
+                pages = loader.load()
+                # 合併所有頁面
+                full_text = "\n\n".join([page.page_content for page in pages])
+                # 清理提取的文本（移除多餘空格）
+                full_text = self.clean_extracted_text(full_text)
+            except ImportError:
+                raise ImportError(
+                    "需要安裝 pypdf 來處理 PDF 檔案: pip install pypdf"
+                )
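+        elif file_ext in ['.docx', '.doc']:
+            # Note: docx2txt reads the .docx (Office Open XML) format; legacy binary .doc files are likely to fail here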
+            try:
+                from langchain_community.document_loaders import Docx2txtLoader
+                loader = Docx2txtLoader(str(file_path))
+                pages = loader.load()
+                full_text = "\n\n".join([page.page_content for page in pages])
+                # 清理提取的文本（移除多餘空格）
+                full_text = self.clean_extracted_text(full_text)
+            except ImportError:
+                raise ImportError(
+                    "需要安裝 docx2txt 來處理 DOCX 檔案: pip install docx2txt"
+                )
+        elif file_ext == '.txt':
+            # 嘗試不同的編碼
+            encodings = ['utf-8', 'gbk', 'big5', 'latin-1']
+            full_text = None
+            for encoding in encodings:
+                try:
+                    with open(file_path, 'r', encoding=encoding) as f:
+                        full_text = f.read()
+                    break
+                except UnicodeDecodeError:
+                    continue
+            if full_text is None:
+                raise ValueError(f"無法讀取檔案，嘗試的編碼都不適用: {encodings}")
+        else:
+            raise ValueError(
+                f"不支援的檔案類型: {file_ext}\n"
+                f"支援的格式: .pdf, .docx, .doc, .txt"
+            )
+        if not full_text or len(full_text.strip()) == 0:
+            raise ValueError(f"檔案為空或無法提取文字: {file_path}")
+        return {
+            "title": file_name,
+            "content": full_text,
+            "file_path": str(file_path),
+            "file_type": file_ext,
+            "file_size": file_size,
+        }
+    def process_file(self, file_path: str) -> List[Dict]:
+        """
+        處理單個檔案，分割成 chunks
+        Args:
+            file_path: 檔案路徑
+        Returns:
+            處理後的文檔 chunks 列表
+        """
+        # 載入檔案
+        file_doc = self.load_from_file(file_path)
+        # 分割文字（根據選擇的模式：字符分塊或語義分塊）
+        chunks = self.text_splitter.split_text(file_doc["content"])
+        # 後處理：過濾和合併太小的 chunks（僅語義分塊模式）
+        chunks = self._post_process_chunks(chunks)
+        if not chunks:
+            raise ValueError(f"檔案分割後沒有內容: {file_path}")
+        # 創建文檔 chunks
+        documents = []
+        for i, chunk in enumerate(chunks):
+            doc = {
+                "content": chunk,
+                "metadata": {
+                    "title": file_doc["title"],
+                    "file_path": file_doc["file_path"],
+                    "file_type": file_doc["file_type"],
+                    "file_size": file_doc["file_size"],
+                    "chunk_index": i,
+                    "total_chunks": len(chunks),
+                    "chunking_method": "semantic" if self.use_semantic_chunking else "character"
+                }
+            }
+            documents.append(doc)
+        return documents
+    def process_files(self, file_paths: List[str]) -> List[Dict]:
+        """
+        處理多個檔案
+        Args:
+            file_paths: 檔案路徑列表
+        Returns:
+            所有檔案的文檔 chunks 列表
+        """
+        all_documents = []
+        for file_path in file_paths:
+            try:
+                print(f"處理檔案: {file_path}")
+                documents = self.process_file(file_path)
+                all_documents.extend(documents)
+                print(f"  ✓ 創建了 {len(documents)} 個 chunks")
+            except Exception as e:
+                print(f"  ✗ 處理檔案失敗: {file_path}")
+                print(f"    錯誤: {e}")
+                continue
+        return all_documents
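
The merge rule described in `_post_process_chunks` can be looked at in isolation. The following is a minimal standalone sketch, not the committed method: `merge_small_chunks` and its `min_size` parameter are hypothetical names, and it assumes the same policy as the diff (consecutive undersized chunks are pooled; a pool that is still too small is appended to the previous chunk, or kept as-is when there is no previous chunk).

```python
from typing import List


def merge_small_chunks(chunks: List[str], min_size: int = 100) -> List[str]:
    """Merge chunks shorter than min_size (after stripping) into their neighbours."""
    merged: List[str] = []
    pending = ""  # pool of consecutive undersized chunks

    def flush() -> None:
        nonlocal pending
        if not pending:
            return
        if len(pending.strip()) >= min_size or not merged:
            # Large enough after pooling, or nothing to attach to: keep as its own chunk
            merged.append(pending)
        else:
            # Still too small: attach to the previous chunk
            merged[-1] += "\n\n" + pending
        pending = ""

    for chunk in chunks:
        if len(chunk.strip()) < min_size:
            pending = f"{pending}\n\n{chunk}" if pending else chunk
        else:
            flush()
            merged.append(chunk)
    flush()
    return merged
```

Note that nothing is ever dropped under this policy, so the total text is preserved; only the chunk boundaries change.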

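`clean_extracted_text` applies the same "Chinese character + whitespace + Chinese character" substitution several times, because a capturing pattern consumes both neighbours and therefore skips every other gap in a run. One possible alternative, sketched here under assumptions, is a lookbehind/lookahead pattern that clears every gap in one pass. `CJK_GAP` and `strip_cjk_spaces` are hypothetical names, and this sketch matches only spaces and tabs (not `\s+` as in the diff) so that newlines survive; that is a deliberate difference, not the committed behavior.

```python
import re

# Same three Unicode ranges used by the patterns in the diff
CJK = r"\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff"

# Lookbehind and lookahead assert a CJK neighbour without consuming it,
# so adjacent gaps such as those in "中 文 字" can all match in one pass.
CJK_GAP = re.compile(rf"(?<=[{CJK}])[ \t]+(?=[{CJK}])")


def strip_cjk_spaces(text: str) -> str:
    """Remove spaces and tabs that sit directly between two CJK characters."""
    return CJK_GAP.sub("", text)
```

Spaces between a CJK character and a Latin word are left alone, so mixed text such as `量 子 AI model` keeps the space before `AI`.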
src/hybrid_subquery_hyde_rag.py ADDED Viewed

	@@ -0,0 +1,399 @@

+"""
+Hybrid Sub-query + HyDE RAG: combines Sub-query Decomposition with HyDE
+Uses the strengths of both methods to improve retrieval precision
+"""
+from typing import List, Dict, Optional
+from .retrievers.reranker import RAGPipeline
+from .retrievers.vector_retriever import VectorRetriever
+from .prompt_formatter import PromptFormatter
+from .llm_integration import OllamaLLM
+import hashlib
+import time
+import logging
+from concurrent.futures import ThreadPoolExecutor, as_completed
+logger = logging.getLogger(__name__)
+class HybridSubqueryHyDERAG:
+    """RAG system that combines Sub-query Decomposition and HyDE"""
+    def __init__(
+        self,
+        rag_pipeline: RAGPipeline,
+        vector_retriever: VectorRetriever,
+        llm: OllamaLLM,
+        max_sub_queries: int = 3,
+        top_k_per_subquery: int = 5,
+        hypothetical_length: int = 200,
+        temperature_subquery: float = 0.3,
+        temperature_hyde: float = 0.7,
+        enable_parallel: bool = True
+    ):
+        """
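+        Initialize the hybrid RAG
+        Args:
+            rag_pipeline: RAG pipeline instance
+            vector_retriever: vector retriever
+            llm: LLM instance
+            max_sub_queries: maximum number of sub-queries to generate
+            top_k_per_subquery: number of results retrieved per sub-query
+            hypothetical_length: target length of the hypothetical document (characters)
+            temperature_subquery: temperature for sub-query generation (lower, more stable)
+            temperature_hyde: temperature for hypothetical-document generation (higher, more domain terminology)
+            enable_parallel: whether to process sub-queries in parallel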
+        """
+        self.rag_pipeline = rag_pipeline
+        self.vector_retriever = vector_retriever
+        self.llm = llm
+        self.max_sub_queries = max_sub_queries
+        self.top_k_per_subquery = top_k_per_subquery
+        self.hypothetical_length = hypothetical_length
+        self.temperature_subquery = temperature_subquery
+        self.temperature_hyde = temperature_hyde
+        self.enable_parallel = enable_parallel
+    def _generate_sub_queries(self, question: str) -> List[str]:
+        """
+        生成子問題（與 SubQueryDecompositionRAG 相同）
+        Args:
+            question: 原始問題
+        Returns:
+            子問題列表
+        """
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            prompt = f"""你是一個專業助理。請將以下原始問題拆解成最多 {self.max_sub_queries} 個具體的子問題，以便進行資料搜尋。
+每個子問題應專注於原始問題的一個特定面向。請以換行符號分隔問題。
+原始問題: {question}
+子問題清單:"""
+        else:
+            prompt = f"""You are a professional assistant. Please decompose the following original question into at most {self.max_sub_queries} specific sub-questions for information retrieval.
+Each sub-question should focus on a specific aspect of the original question. Please separate questions with newlines.
+Original question: {question}
+Sub-question list:"""
+        try:
+            response = self.llm.generate(
+                prompt=prompt,
+                temperature=self.temperature_subquery,
+                max_tokens=500
+            )
+            sub_queries = [
+                q.strip()
+                for q in response.strip().split("\n")
+                if q.strip() and not q.strip().startswith("#")
+            ]
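+            # Strip numbering prefixes such as "1. " or "1) "
+            # Caveat: lstrip removes every leading digit, dot, space, and ")", so a question that starts with a number loses it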
+            cleaned_queries = []
+            for q in sub_queries:
+                q = q.lstrip("0123456789. )")
+                q = q.strip()
+                if q:
+                    cleaned_queries.append(q)
+            cleaned_queries = cleaned_queries[:self.max_sub_queries]
+            if not cleaned_queries:
+                logger.warning("⚠️  未生成子問題，使用原始問題")
+                cleaned_queries = [question]
+            return cleaned_queries
+        except Exception as e:
+            logger.error(f"⚠️  生成子問題時出錯: {e}")
+            return [question]
+    def _generate_hypothetical_document(self, sub_query: str) -> str:
+        """
+        為子問題生成假設性文檔（與 HyDERAG 相同）
+        Args:
+            sub_query: 子問題
+        Returns:
+            假設性文檔文本
+        """
+        is_chinese = PromptFormatter.detect_language(sub_query) == "zh"
+        if is_chinese:
+            prompt = f"""請針對以下問題，寫出一段約 {self.hypothetical_length} 字的專業技術檔案內容。
+這段內容應包含該領域常見的專業術語與原理說明，以便用於後續的語義檢索。
+請使用專業的術語和概念，即使你對某些細節不確定，也要包含相關的專業詞彙。
+問題: {sub_query}
+專業技術內容："""
+        else:
+            prompt = f"""Please write a professional technical document of approximately {self.hypothetical_length} words in response to the following question.
+This content should include common professional terminology and principle explanations in this field, to be used for subsequent semantic retrieval.
+Please use professional terms and concepts, and include relevant professional vocabulary even if you are uncertain about some details.
+Question: {sub_query}
+Professional technical content:"""
+        try:
+            hypothetical_doc = self.llm.generate(
+                prompt=prompt,
+                temperature=self.temperature_hyde,
+                max_tokens=500
+            )
+            hypothetical_doc = hypothetical_doc.strip()
+            if not hypothetical_doc:
+                logger.warning(f"⚠️  子問題 '{sub_query}' 的假設性文檔為空，使用子問題本身")
+                return sub_query
+            logger.debug(f"✅ 為子問題生成假設性文檔（長度: {len(hypothetical_doc)} 字符）")
+            return hypothetical_doc
+        except Exception as e:
+            logger.error(f"⚠️  生成假設性文檔時出錯: {e}")
+            return sub_query
+    def _get_doc_id(self, doc: Dict) -> str:
+        """
+        生成文檔的唯一標識符
+        Args:
+            doc: 文檔字典
+        Returns:
+            唯一 ID
+        """
+        metadata = doc.get("metadata", {})
+        content = doc.get("content", "")
+        if "arxiv_id" in metadata and "chunk_index" in metadata:
+            return f"{metadata['arxiv_id']}_{metadata['chunk_index']}"
+        elif "file_path" in metadata and "chunk_index" in metadata:
+            return f"{metadata['file_path']}_{metadata['chunk_index']}"
+        else:
+            content_hash = hashlib.md5(content.encode()).hexdigest()[:16]
+            return f"doc_{content_hash}"
+    def _process_subquery_with_hyde(
+        self,
+        sub_query: str,
+        metadata_filter: Optional[Dict] = None
+    ) -> tuple:
+        """
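+        Process one sub-query: generate a hypothetical document, then retrieve with it
+        Args:
+            sub_query: sub-query
+            metadata_filter: optional metadata filter
+        Returns:
+            (list of retrieval results, hypothetical document)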
+        """
+        try:
+            # Generate the hypothetical document
+            hypothetical_doc = self._generate_hypothetical_document(sub_query)
+            # Retrieve using the hypothetical document
+            results = self.vector_retriever.retrieve(
+                query=hypothetical_doc,  # the hypothetical document, not the sub-query itself
+                top_k=self.top_k_per_subquery,
+                metadata_filter=metadata_filter
+            )
+            return results, hypothetical_doc
+        except Exception as e:
+            logger.error(f"⚠️  處理子問題 '{sub_query}' 時出錯: {e}")
+            return [], ""
+    def query(
+        self,
+        question: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        return_sub_queries: bool = False,
+        return_hypothetical: bool = False
+    ) -> Dict:
+        """
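+        Run hybrid RAG retrieval (no answer generation)
+        Args:
+            question: original question
+            top_k: return the top k results
+            metadata_filter: optional metadata filter
+            return_sub_queries: whether to return the list of sub-queries
+            return_hypothetical: whether to return the hypothetical-document dict (sub-query -> hypothetical document)
+        Returns:
+            Dict with retrieval results and statistics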
+        """
+        start_time = time.time()
+        # 第一步：生成子問題
+        logger.info(f"🔍 拆解問題: '{question}'")
+        sub_queries = self._generate_sub_queries(question)
+        logger.info(f"✅ 生成 {len(sub_queries)} 個子問題")
+        # 第二步：為每個子問題生成假設性文檔並檢索
+        logger.info(f"📚 為每個子問題生成假設性文檔並檢索...")
+        unique_docs = {}
+        hypothetical_docs = {}
+        if self.enable_parallel and len(sub_queries) > 1:
+            # 並行處理
+            logger.info(f"🔄 並行處理 {len(sub_queries)} 個子問題...")
+            with ThreadPoolExecutor(max_workers=min(len(sub_queries), 5)) as executor:
+                future_to_query = {
+                    executor.submit(self._process_subquery_with_hyde, sq, metadata_filter): sq
+                    for sq in sub_queries
+                }
+                for future in as_completed(future_to_query):
+                    sub_query = future_to_query[future]
+                    try:
+                        results, hypo_doc = future.result()
+                        hypothetical_docs[sub_query] = hypo_doc
+                        logger.debug(f"✅ 子問題 '{sub_query}' 找到 {len(results)} 個結果")
+                        for doc in results:
+                            doc_id = self._get_doc_id(doc)
+                            if doc_id not in unique_docs:
+                                unique_docs[doc_id] = doc
+                            else:
+                                # 保留分數更高的
+                                existing_score = unique_docs[doc_id].get('score', 0)
+                                new_score = doc.get('score', 0)
+                                if new_score > existing_score:
+                                    unique_docs[doc_id] = doc
+                    except Exception as e:
+                        logger.error(f"⚠️  處理子問題 '{sub_query}' 時出錯: {e}")
+        else:
+            # 串行處理
+            logger.info(f"🔄 串行處理 {len(sub_queries)} 個子問題...")
+            for sub_query in sub_queries:
+                results, hypo_doc = self._process_subquery_with_hyde(sub_query, metadata_filter)
+                hypothetical_docs[sub_query] = hypo_doc
+                logger.debug(f"✅ 子問題 '{sub_query}' 找到 {len(results)} 個結果")
+                for doc in results:
+                    doc_id = self._get_doc_id(doc)
+                    if doc_id not in unique_docs:
+                        unique_docs[doc_id] = doc
+                    else:
+                        existing_score = unique_docs[doc_id].get('score', 0)
+                        new_score = doc.get('score', 0)
+                        if new_score > existing_score:
+                            unique_docs[doc_id] = doc
+        # Step 3: sort by score and return the top_k results
+        result_list = list(unique_docs.values())
+        result_list.sort(key=lambda x: x.get('score', 0), reverse=True)
+        final_results = result_list[:top_k]
+        elapsed_time = time.time() - start_time
+        logger.info(f"✅ 找到 {len(final_results)} 個唯一文檔（去重後，總共 {len(result_list)} 個）")
+        return {
+            "results": final_results,
+            "total_docs_found": len(result_list),
+            "sub_queries": sub_queries if return_sub_queries else None,
+            "hypothetical_documents": hypothetical_docs if return_hypothetical else None,
+            "elapsed_time": elapsed_time
+        }
+    def generate_answer(
+        self,
+        question: str,
+        formatter: PromptFormatter,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        document_type: str = "general",
+        return_sub_queries: bool = False,
+        return_hypothetical: bool = False
+    ) -> Dict:
+        """
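+        Full hybrid RAG flow: retrieval + answer generation
+        Args:
+            question: original question
+            formatter: prompt formatter
+            top_k: number of documents used to generate the answer
+            metadata_filter: optional metadata filter
+            document_type: document type ("paper", "cv", "general")
+            return_sub_queries: whether to return the list of sub-queries
+            return_hypothetical: whether to return the hypothetical-document dict
+        Returns:
+            Dict with retrieval results, the generated answer, and statistics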
+        """
+        start_time = time.time()
+        # 檢索
+        retrieval_result = self.query(
+            question=question,
+            top_k=top_k,
+            metadata_filter=metadata_filter,
+            return_sub_queries=return_sub_queries,
+            return_hypothetical=return_hypothetical
+        )
+        if not retrieval_result["results"]:
+            return {
+                **retrieval_result,
+                "answer": "抱歉，未找到相關文檔來回答此問題。",
+                "formatted_context": None,
+                "answer_time": 0.0,
+                "total_time": retrieval_result["elapsed_time"]
+            }
+        # 格式化上下文
+        formatted_context = formatter.format_context(
+            retrieval_result["results"],
+            document_type=document_type
+        )
+        # Build the prompt (using the original question)
+        prompt = formatter.create_prompt(
+            question,
+            formatted_context,
+            document_type=document_type
+        )
+        # Generate the answer
+        logger.info("🤖 生成回答中...")
+        answer_start = time.time()
+        try:
+            answer = self.llm.generate(
+                prompt=prompt,
+                temperature=0.7,
+                max_tokens=2048
+            )
+            answer_time = time.time() - answer_start
+            logger.info(f"✅ 回答生成完成（耗時: {answer_time:.2f}s）")
+        except Exception as e:
+            logger.error(f"❌ 生成回答時出錯: {e}")
+            answer = f"生成回答時出錯: {e}"
+            answer_time = time.time() - answer_start
+        total_time = time.time() - start_time
+        return {
+            **retrieval_result,
+            "answer": answer,
+            "formatted_context": formatted_context,
+            "answer_time": answer_time,
+            "total_time": total_time
+        }

src/hyde_rag.py ADDED

	@@ -0,0 +1,235 @@

+"""
+HyDE (Hypothetical Document Embeddings) RAG: improve retrieval by searching with a hypothetical document
+"""
+from typing import List, Dict, Optional
+from .retrievers.reranker import RAGPipeline
+from .retrievers.vector_retriever import VectorRetriever
+from .prompt_formatter import PromptFormatter
+from .llm_integration import OllamaLLM
+import time
+import logging
+logger = logging.getLogger(__name__)
+class HyDERAG:
+    """RAG system that uses HyDE (Hypothetical Document Embeddings)"""
+    def __init__(
+        self,
+        rag_pipeline: RAGPipeline,
+        vector_retriever: VectorRetriever,
+        llm: OllamaLLM,
+        hypothetical_length: int = 200,
+        temperature: float = 0.7
+    ):
+        """
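+        Initialize HyDE RAG.
+        Args:
+            rag_pipeline: RAG pipeline instance (for final answer generation).
+            vector_retriever: Vector retriever (for retrieval based on the hypothetical document).
+            llm: LLM instance (for generating the hypothetical document).
+            hypothetical_length: Target length of the hypothetical document. It is inserted
+                into the prompt as a soft target (字 in the Chinese prompt, words in the
+                English prompt) and is not enforced on the output.
+            temperature: Sampling temperature for generating the hypothetical document
+                (0.7 suggested, to elicit more domain terminology).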
+        """
+        self.rag_pipeline = rag_pipeline
+        self.vector_retriever = vector_retriever
+        self.llm = llm
+        self.hypothetical_length = hypothetical_length
+        self.temperature = temperature
+    def _generate_hypothetical_document(self, question: str) -> str:
+        """
+        Generate the hypothetical document.
+        Args:
+            question: The user's question.
+        Returns:
+            The hypothetical document text.
+        """
+        # Detect the language
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            prompt = f"""請針對以下問題，寫出一段約 {self.hypothetical_length} 字的專業技術檔案內容。
+這段內容應包含該領域常見的專業術語與原理說明，以便用於後續的語義檢索。
+請使用專業的術語和概念，即使你對某些細節不確定，也要包含相關的專業詞彙。
+問題: {question}
+專業技術內容："""
+        else:
+            prompt = f"""Please write a professional technical document of approximately {self.hypothetical_length} words in response to the following question.
+This content should include common professional terminology and principle explanations in this field, to be used for subsequent semantic retrieval.
+Please use professional terms and concepts, and include relevant professional vocabulary even if you are uncertain about some details.
+Question: {question}
+Professional technical content:"""
+        try:
+            hypothetical_doc = self.llm.generate(
+                prompt=prompt,
+                temperature=self.temperature,  # Higher temperature to elicit more domain terminology
+                max_tokens=500
+            )
+            # Clean up the output
+            hypothetical_doc = hypothetical_doc.strip()
+            if not hypothetical_doc:
+                logger.warning("⚠️  生成的假設性文檔為空，使用原始問題")
+                return question
+            logger.info(f"✅ 生成假設性文檔（長度: {len(hypothetical_doc)} 字符）")
+            return hypothetical_doc
+        except Exception as e:
+            logger.error(f"⚠️  生成假設性文檔時出錯: {e}")
+            # Fall back to the original question
+            return question
+    def query(
+        self,
+        question: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        return_hypothetical: bool = False
+    ) -> Dict:
+        """
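+        Run HyDE retrieval (no answer generation).
+        Args:
+            question: The original question.
+            top_k: Return the top k results.
+            metadata_filter: Optional metadata filter conditions.
+            return_hypothetical: Whether to include the hypothetical document in the result.
+        Returns:
+            A dict with the retrieval results and statistics.
+        """
+        start_time = time.time()
+        # Step 1: generate the hypothetical document
+        logger.info(f"🔍 生成假設性文檔: '{question}'")
+        hypothetical_doc = self._generate_hypothetical_document(question)
+        # Step 2: retrieve using the hypothetical document
+        logger.info("📚 使用假設性文檔進行檢索...")
+        results = self.vector_retriever.retrieve(
+            query=hypothetical_doc,  # Use the hypothetical document, not the original question
+            top_k=top_k,
+            metadata_filter=metadata_filter
+        )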
+        elapsed_time = time.time() - start_time
+        logger.info(f"✅ 找到 {len(results)} 個結果（耗時: {elapsed_time:.2f}s）")
+        result = {
+            "results": results,
+            "total_docs_found": len(results),
+            "hypothetical_document": hypothetical_doc if return_hypothetical else None,
+            "elapsed_time": elapsed_time
+        }
+        return result
+    def generate_answer(
+        self,
+        question: str,
+        formatter: PromptFormatter,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        document_type: str = "general",
+        return_hypothetical: bool = False
+    ) -> Dict:
+        """
+        Full HyDE RAG flow: generate a hypothetical document, retrieve, then generate an answer.
+        Args:
+            question: The original question.
+            formatter: Prompt formatter.
+            top_k: Number of documents used to generate the answer.
+            metadata_filter: Optional metadata filter conditions.
+            document_type: Document type ("paper", "cv", "general").
+            return_hypothetical: Whether to include the hypothetical document in the result.
+        Returns:
+            A dict with the retrieval results, the generated answer, and statistics.
+        """
+        start_time = time.time()
+        # Step 1: generate the hypothetical document
+        logger.info(f"🔍 生成假設性文檔: '{question}'")
+        hypothetical_start = time.time()
+        hypothetical_doc = self._generate_hypothetical_document(question)
+        hypothetical_time = time.time() - hypothetical_start
+        # Step 2: retrieve using the hypothetical document
+        logger.info("📚 使用假設性文檔進行檢索...")
+        retrieval_start = time.time()
+        results = self.vector_retriever.retrieve(
+            query=hypothetical_doc,  # Use the hypothetical document, not the original question
+            top_k=top_k,
+            metadata_filter=metadata_filter
+        )
+        retrieval_time = time.time() - retrieval_start
+        if not results:
+            return {
+                "results": [],
+                "total_docs_found": 0,
+                "hypothetical_document": hypothetical_doc if return_hypothetical else None,
+                "elapsed_time": retrieval_time + hypothetical_time,
+                "answer": "抱歉，未找到相關文檔來回答此問題。",
+                "formatted_context": None,
+                "answer_time": 0.0,
+                "total_time": retrieval_time + hypothetical_time
+            }
+        # Step 3: format the context
+        formatted_context = formatter.format_context(
+            results,
+            document_type=document_type
+        )
+        # Step 4: build the prompt (from the original question, not the hypothetical document)
+        prompt = formatter.create_prompt(
+            question,  # Answer the original question
+            formatted_context,
+            document_type=document_type
+        )
+        # Step 5: generate the answer
+        logger.info("🤖 生成回答中...")
+        answer_start = time.time()
+        try:
+            answer = self.llm.generate(
+                prompt=prompt,
+                temperature=0.7,
+                max_tokens=2048
+            )
+            answer_time = time.time() - answer_start
+            logger.info(f"✅ 回答生成完成（耗時: {answer_time:.2f}s）")
+        except Exception as e:
+            logger.error(f"❌ 生成回答時出錯: {e}")
+            answer = f"生成回答時出錯: {e}"
+            answer_time = time.time() - answer_start
+        total_time = time.time() - start_time
+        return {
+            "results": results,
+            "total_docs_found": len(results),
+            "hypothetical_document": hypothetical_doc if return_hypothetical else None,
+            "elapsed_time": retrieval_time + hypothetical_time,
+            "hypothetical_time": hypothetical_time,
+            "retrieval_time": retrieval_time,
+            "answer": answer,
+            "formatted_context": formatted_context,
+            "answer_time": answer_time,
+            "total_time": total_time
+        }
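
Reviewer sketch (not part of the commit): the HyDE flow above boils down to "search with an LLM-written passage, fall back to the question if generation fails or is empty". The snippet below is a minimal illustration of that idea under stated assumptions. `hyde_retrieve`, `fake_generate`, and `fake_retrieve` are hypothetical stand-ins so it runs without Ollama or a vector store; they are not the `HyDERAG`, `OllamaLLM`, or `VectorRetriever` classes from this diff.

```python
from typing import Callable, Dict, List


def hyde_retrieve(
    question: str,
    generate: Callable[[str], str],
    retrieve: Callable[[str, int], List[Dict]],
    top_k: int = 5,
) -> Dict:
    """Retrieve with an LLM-written hypothetical document instead of the raw question."""
    prompt = f"Write a short technical passage that answers: {question}"
    try:
        hypothetical = generate(prompt).strip()
    except Exception:
        hypothetical = ""
    # Fall back to the question when generation fails or returns nothing.
    search_text = hypothetical or question
    return {"search_text": search_text, "results": retrieve(search_text, top_k)}


# Hypothetical stand-in for an LLM call.
def fake_generate(prompt: str) -> str:
    return "Self-attention computes weighted sums over token representations."


# Hypothetical stand-in for a retriever: score = number of shared words.
def fake_retrieve(query: str, top_k: int) -> List[Dict]:
    corpus = ["attention weighted sums", "convolution kernels", "token representations"]
    query_words = set(query.lower().replace(".", "").split())
    scored = [
        {"content": doc, "score": len(query_words & set(doc.split()))}
        for doc in corpus
    ]
    scored.sort(key=lambda d: d["score"], reverse=True)
    return scored[:top_k]


out = hyde_retrieve("What is attention?", fake_generate, fake_retrieve, top_k=2)
print(out["search_text"])
print([r["content"] for r in out["results"]])
```

The final answer prompt would still be built from the original question; only the retrieval step sees the hypothetical passage.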

src/llm_integration.py ADDED

	@@ -0,0 +1,246 @@

+"""
+LLM integration module: local LLM inference with Ollama
+"""
+from typing import Optional, Dict, List
+import logging
+import requests
+import json
+logger = logging.getLogger(__name__)
+class OllamaLLM:
+    """Local LLM inference through Ollama"""
+    # Recommended models for a 16GB MacBook Air
+    RECOMMENDED_MODELS = {
+        "deepseek-r1:7b": {
+            "name": "deepseek-r1:7b",
+            "description": "DeepSeek R1 7B - 大模型，高質量",
+            "memory_required": "~8GB",
+            "quality": "優秀"
+        },
+        "llama3.2:3b": {
+            "name": "llama3.2:3b",
+            "description": "Meta Llama 3.2 3B - 輕量級，適合 16GB 內存",
+            "memory_required": "~4GB",
+            "quality": "良好"
+        },
+        "llama3.2:1b": {
+            "name": "llama3.2:1b",
+            "description": "Meta Llama 3.2 1B - 極輕量級，快速響應",
+            "memory_required": "~2GB",
+            "quality": "基礎"
+        },
+        "phi3:mini": {
+            "name": "phi3:mini",
+            "description": "Microsoft Phi-3 Mini - 小模型，高質量",
+            "memory_required": "~3GB",
+            "quality": "良好"
+        },
+        "gemma:2b": {
+            "name": "gemma:2b",
+            "description": "Google Gemma 2B - 輕量級，開源",
+            "memory_required": "~3GB",
+            "quality": "良好"
+        },
+        "mistral:7b": {
+            "name": "mistral:7b",
+            "description": "Mistral 7B - 較大但質量高（如果內存足夠）",
+            "memory_required": "~8GB",
+            "quality": "優秀"
+        }
+    }
+    def __init__(
+        self,
+        model_name: str = "llama3.2:3b",
+        base_url: str = "http://localhost:11434",
+        timeout: int = 120
+    ):
+        """
+        Initialize the Ollama LLM.
+        Args:
+            model_name: Ollama model name (default: llama3.2:3b).
+            base_url: Base URL of the Ollama API.
+            timeout: Request timeout in seconds.
+        """
+        self.model_name = model_name
+        self.base_url = base_url.rstrip('/')
+        self.timeout = timeout
+        self.api_url = f"{self.base_url}/api"
+        # Warn if the model is not in the recommended list
+        if model_name not in self.RECOMMENDED_MODELS:
+            logger.warning(
+                f"⚠️  模型 '{model_name}' 不在推薦列表中。"
+                f"推薦的模型: {', '.join(self.RECOMMENDED_MODELS.keys())}"
+            )
+        logger.info(f"✅ Ollama LLM 初始化完成 (模型: {model_name})")
+    def _check_ollama_connection(self) -> bool:
+        """
+        Check whether the Ollama service is reachable.
+        Returns:
+            True if the connection succeeded.
+        """
+        try:
+            response = requests.get(f"{self.base_url}/api/tags", timeout=5)
+            return response.status_code == 200
+        except Exception as e:
+            logger.error(f"❌ 無法連接到 Ollama: {e}")
+            logger.error("   請確保 Ollama 正在運行: ollama serve")
+            return False
+    def _check_model_available(self) -> bool:
+        """
+        Check whether the model has been downloaded.
+        Returns:
+            True if the model is available locally.
+        """
+        try:
+            response = requests.get(f"{self.base_url}/api/tags", timeout=5)
+            if response.status_code == 200:
+                models = response.json().get('models', [])
+                model_names = [m.get('name', '') for m in models]
+                return any(self.model_name in name for name in model_names)
+            return False
+        except Exception as e:
+            logger.error(f"❌ 檢查模型時出錯: {e}")
+            return False
+    def generate(
+        self,
+        prompt: str,
+        temperature: float = 0.7,
+        max_tokens: Optional[int] = None,
+        stream: bool = False
+    ) -> str:
+        """
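+        Generate a response.
+        Args:
+            prompt: Input prompt.
+            temperature: Sampling temperature (0.0-1.0); controls randomness.
+            max_tokens: Maximum number of tokens to generate (None uses the model default).
+            stream: Whether to stream the output.
+        Returns:
+            The generated response text.
+        """
+        # Check the connection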
+        if not self._check_ollama_connection():
+            raise ConnectionError(
+                f"無法連接到 Ollama 服務 ({self.base_url})\n"
+                f"請確保 Ollama 正在運行：\n"
+                f"  1. 安裝 Ollama: https://ollama.ai\n"
+                f"  2. 啟動服務: ollama serve\n"
+                f"  3. 下載模型: ollama pull {self.model_name}"
+            )
+        # Check the model
+        if not self._check_model_available():
+            logger.warning(
+                f"⚠️  模型 '{self.model_name}' 可能未下載。"
+                f"請運行: ollama pull {self.model_name}"
+            )
+        # Build the request payload
+        payload = {
+            "model": self.model_name,
+            "prompt": prompt,
+            "stream": stream,
+            "options": {
+                "temperature": temperature,
+            }
+        }
+        if max_tokens:
+            payload["options"]["num_predict"] = max_tokens
+        try:
+            # Send the request
+            response = requests.post(
+                f"{self.api_url}/generate",
+                json=payload,
+                timeout=self.timeout,
+                stream=stream
+            )
+            if response.status_code != 200:
+                error_msg = response.text
+                raise RuntimeError(f"Ollama API 錯誤: {error_msg}")
+            if stream:
+                # Streaming: accumulate chunks as they arrive
+                full_response = ""
+                for line in response.iter_lines():
+                    if line:
+                        try:
+                            data = json.loads(line)
+                            if 'response' in data:
+                                chunk = data['response']
+                                full_response += chunk
+                                print(chunk, end='', flush=True)
+                            if data.get('done', False):
+                                break
+                        except json.JSONDecodeError:
+                            continue
+                print()  # Final newline
+                return full_response
+            else:
+                # Non-streaming: read the whole response
+                data = response.json()
+                return data.get('response', '')
+        except requests.exceptions.Timeout:
+            raise TimeoutError(
+                f"請求超時（{self.timeout}秒）。"
+                f"可以嘗試增加 timeout 或使用更小的模型。"
+            )
+        except requests.exceptions.ConnectionError:
+            raise ConnectionError(
+                f"無法連接到 Ollama 服務。"
+                f"請確保 Ollama 正在運行：ollama serve"
+            )
+        except Exception as e:
+            logger.error(f"❌ 生成回答時出錯: {e}")
+            raise
+    def list_available_models(self) -> List[str]:
+        """
+        List the models available locally.
+        Returns:
+            List of available model names.
+        """
+        try:
+            response = requests.get(f"{self.base_url}/api/tags", timeout=5)
+            if response.status_code == 200:
+                models = response.json().get('models', [])
+                return [m.get('name', '') for m in models]
+            return []
+        except Exception as e:
+            logger.error(f"❌ 獲取模型列表時出錯: {e}")
+            return []
+    @classmethod
+    def print_recommended_models(cls):
+        """Print the list of recommended models"""
+        print("\n" + "="*60)
+        print("適合 16GB MacBook Air 的 Ollama 模型推薦")
+        print("="*60)
+        print()
+        for model_key, info in cls.RECOMMENDED_MODELS.items():
+            print(f"📦 {info['name']}")
+            print(f"   描述: {info['description']}")
+            print(f"   內存需求: {info['memory_required']}")
+            print(f"   質量: {info['quality']}")
+            print(f"   下載命令: ollama pull {info['name']}")
+            print()
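
Reviewer sketch (not part of the commit): the streaming branch of `generate` reads newline-delimited JSON chunks and concatenates their `response` fields until a chunk reports `done`. The snippet below isolates that accumulation step so it can be exercised without a running server. `collect_stream` is a hypothetical helper, and the sample byte lines are made up to mimic the chunk shape used in the code above.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[bytes]) -> str:
    """Concatenate the 'response' fields of newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # Skip blank keep-alive lines
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip malformed chunks
        parts.append(data.get("response", ""))
        if data.get("done", False):
            break  # Stop at the final chunk
    return "".join(parts)


# Made-up sample chunks, including a blank line and a malformed line.
sample = [
    b'{"response": "Hel", "done": false}',
    b"",
    b"not json",
    b'{"response": "lo", "done": false}',
    b'{"response": "", "done": true}',
    b'{"response": "ignored", "done": false}',
]
print(collect_stream(sample))
```

Keeping the parsing in a pure function like this would make the streaming path testable independently of `requests`.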

src/prompt_formatter.py ADDED

	@@ -0,0 +1,395 @@

+"""
+Prompt formatting module: format retrieval results into LLM-readable context
+"""
+from typing import List, Dict, Optional
+import re
+class PromptFormatter:
+    """Format retrieval results for use by an LLM"""
+    def __init__(
+        self,
+        include_metadata: bool = True,
+        format_style: str = "detailed",
+        max_context_length: Optional[int] = None,
+        auto_detect_language: bool = True
+    ):
+        """
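+        Initialize the prompt formatter.
+        Args:
+            include_metadata: Whether to include source information.
+            format_style: Format style ("detailed", "simple", "minimal").
+            max_context_length: Maximum context length in characters; None means no limit.
+            auto_detect_language: Whether to detect the query language and answer in it.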
+        """
+        self.include_metadata = include_metadata
+        self.format_style = format_style
+        self.max_context_length = max_context_length
+        self.auto_detect_language = auto_detect_language
+    @staticmethod
+    def detect_language(text: str) -> str:
+        """
+        Detect the dominant language of a text.
+        Args:
+            text: Input text.
+        Returns:
+            "zh" for Chinese, "en" for English.
+        """
+        # Count Chinese characters (CJK Unified Ideographs, Extension A, Compatibility Ideographs)
+        chinese_pattern = re.compile(r'[\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff]')
+        chinese_chars = len(chinese_pattern.findall(text))
+        # Ratio of Chinese characters among alphanumeric and whitespace characters
+        total_chars = len([c for c in text if c.isalnum() or c.isspace()])
+        if total_chars == 0:
+            return "en"  # Default to English
+        chinese_ratio = chinese_chars / total_chars
+        # Treat the text as Chinese when the ratio exceeds 20%
+        if chinese_ratio > 0.2:
+            return "zh"
+        else:
+            return "en"
+    def get_system_prompt(self, language: str = "zh", document_type: str = "general") -> str:
+        """
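+        Return the system prompt for a language and document type.
+        Args:
+            language: Language code ("zh" or "en").
+            document_type: Document type ("paper", "cv", "general").
+                          "paper": academic paper
+                          "cv": resume / CV
+                          "general": general document (default)
+        Returns:
+            The system prompt string.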
+        """
+        if language == "zh":
+            if document_type == "paper":
+                return (
+                    "你是一個專業的 AI 研究助手，專門回答關於機器學習、"
+                    "深度學習和自然語言處理的問題。\n\n"
+                    "請基於以下提供的學術論文片段來回答用戶的問題。"
+                    "每個片段都標註了來源論文的資訊。\n\n"
+                    "回答要求：\n"
+                    "1. 基於提供的上下文回答問題\n"
+                    "2. 如果上下文不足以回答，請明確說明\n"
+                    "3. 在回答中引用具體的論文來源（使用 arXiv ID）\n"
+                    "4. 如果不同論文有不同觀點，請分別說明\n"
+                    "5. 保持回答簡潔、準確、專業\n"
+                    "6. **重要：請使用與用戶問題相同的語言回答**\n"
+                )
+            elif document_type == "cv":
+                return (
+                    "你是一個專業的 AI 助手，專門幫助分析和介紹簡歷（CV）內容。\n\n"
+                    "請基於以下提供的文檔片段來回答用戶的問題。"
+                    "這些片段來自一份簡歷或履歷表。\n\n"
+                    "回答要求：\n"
+                    "1. 基於提供的上下文回答問題\n"
+                    "2. 如果上下文不足以回答，請明確說明\n"
+                    "3. 在回答中引用具體的文檔內容\n"
+                    "4. 保持回答簡潔、準確、專業\n"
+                    "5. **重要：請使用與用戶問題相同的語言回答**\n"
+                    "6. **請理解：這些片段就是簡歷的內容，請直接基於這些內容回答問題**\n"
+                )
+            else:  # general
+                return (
+                    "你是一個專業的 AI 助手。\n\n"
+                    "請基於以下提供的文檔片段來回答用戶的問題。"
+                    "每個片段都標註了來源資訊。\n\n"
+                    "回答要求：\n"
+                    "1. 基於提供的上下文回答問題\n"
+                    "2. 如果上下文不足以回答，請明確說明\n"
+                    "3. 在回答中引用具體的文檔內容\n"
+                    "4. 保持回答簡潔、準確、專業\n"
+                    "5. **重要：請使用與用戶問題相同的語言回答**\n"
+                )
+        else:  # English
+            if document_type == "paper":
+                return (
+                    "You are a professional AI research assistant specializing in "
+                    "machine learning, deep learning, and natural language processing.\n\n"
+                    "Please answer the user's question based on the provided academic paper excerpts. "
+                    "Each excerpt is labeled with source paper information.\n\n"
+                    "Answer requirements:\n"
+                    "1. Answer the question based on the provided context\n"
+                    "2. If the context is insufficient, clearly state so\n"
+                    "3. Cite specific paper sources in your answer (using arXiv ID)\n"
+                    "4. If different papers have different viewpoints, explain them separately\n"
+                    "5. Keep answers concise, accurate, and professional\n"
+                    "6. **Important: Please answer in the same language as the user's question**\n"
+                )
+            elif document_type == "cv":
+                return (
+                    "You are a professional AI assistant specializing in analyzing and introducing CV (Curriculum Vitae) content.\n\n"
+                    "Please answer the user's question based on the provided document excerpts. "
+                    "These excerpts are from a CV or resume.\n\n"
+                    "Answer requirements:\n"
+                    "1. Answer the question based on the provided context\n"
+                    "2. If the context is insufficient, clearly state so\n"
+                    "3. Cite specific document content in your answer\n"
+                    "4. Keep answers concise, accurate, and professional\n"
+                    "5. **Important: Please answer in the same language as the user's question**\n"
+                    "6. **Please understand: These excerpts ARE the CV content. Please answer directly based on this content.**\n"
+                )
+            else:  # general
+                return (
+                    "You are a professional AI assistant.\n\n"
+                    "Please answer the user's question based on the provided document excerpts. "
+                    "Each excerpt is labeled with source information.\n\n"
+                    "Answer requirements:\n"
+                    "1. Answer the question based on the provided context\n"
+                    "2. If the context is insufficient, clearly state so\n"
+                    "3. Cite specific document content in your answer\n"
+                    "4. Keep answers concise, accurate, and professional\n"
+                    "5. **Important: Please answer in the same language as the user's question**\n"
+                )
+    def format_context(
+        self,
+        results: List[Dict],
+        include_metadata: Optional[bool] = None,
+        format_style: Optional[str] = None,
+        document_type: str = "general"
+    ) -> str:
+        """
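+        Format retrieval results into LLM-readable context.
+        Args:
+            results: List of retrieval results.
+            include_metadata: Whether to include source information (overrides the constructor argument).
+            format_style: Format style (overrides the constructor argument).
+            document_type: Document type ("paper", "cv", "general"), used to adjust the layout.
+        Returns:
+            The formatted context string.
+        """
+        if include_metadata is None:
+            include_metadata = self.include_metadata
+        if format_style is None:
+            format_style = self.format_style
+        if not results:
+            # Pick the placeholder language from the format style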
+            if format_style == "detailed" or format_style == "simple":
+                return "（未找到相關文檔片段）"
+            else:
+                return "(No relevant excerpts found)"
+        formatted_parts = []
+        for i, result in enumerate(results, 1):
+            content = result.get("content", "")
+            metadata = result.get("metadata", {})
+            if not include_metadata:
+                # No source information: use the content as is
+                formatted_parts.append(f"{content}\n")
+            elif format_style == "detailed":
+                # Detailed format: adjust the displayed fields by document type
+                if document_type == "cv":
+                    # CV format: show the file title and path
+                    source_info = (
+                        f"[來源 {i}]\n"
+                        f"檔案標題: {metadata.get('title', 'N/A')}\n"
+                    )
+                    if 'file_path' in metadata:
+                        source_info += f"檔案路徑: {metadata.get('file_path', 'N/A')}\n"
+                    if 'file_type' in metadata:
+                        source_info += f"檔案類型: {metadata.get('file_type', 'N/A')}\n"
+                elif document_type == "paper":
+                    # Paper format: show paper information
+                    authors = metadata.get('authors', [])
+                    if isinstance(authors, str):
+                        authors_str = authors
+                    elif isinstance(authors, list):
+                        authors_str = ', '.join(authors[:3])  # Show at most 3 authors
+                        if len(authors) > 3:
+                            authors_str += f" 等 {len(authors)} 位作者"
+                    else:
+                        authors_str = 'N/A'
+                    source_info = (
+                        f"[來源 {i}]\n"
+                        f"論文標題: {metadata.get('title', 'N/A')}\n"
+                        f"arXiv ID: {metadata.get('arxiv_id', 'N/A')}\n"
+                        f"作者: {authors_str}\n"
+                        f"發布日期: {metadata.get('published', 'N/A')}\n"
+                    )
+                else:
+                    # General format: show whatever information is available
+                    source_info = f"[來源 {i}]\n"
+                    if 'title' in metadata:
+                        source_info += f"標題: {metadata.get('title', 'N/A')}\n"
+                    if 'file_path' in metadata:
+                        source_info += f"檔案: {metadata.get('file_path', 'N/A')}\n"
+                    if 'arxiv_id' in metadata:
+                        source_info += f"arXiv ID: {metadata.get('arxiv_id', 'N/A')}\n"
+                # Add the relevance score, if present
+                rerank_score = result.get('rerank_score')
+                hybrid_score = result.get('hybrid_score')
+                if rerank_score is not None:
+                    source_info += f"相關性分數: {rerank_score:.4f}\n"
+                elif hybrid_score is not None:
+                    source_info += f"相關性分數: {hybrid_score:.4f}\n"
+                source_info += f"---\n{content}\n"
+                formatted_parts.append(source_info)
+            elif format_style == "simple":
+                # Simple format: key information only
+                title = metadata.get('title', 'N/A')
+                if document_type == "paper" and 'arxiv_id' in metadata:
+                    arxiv_id = metadata.get('arxiv_id', 'N/A')
+                    source_info = (
+                        f"[來源 {i}: {title} "
+                        f"(arXiv:{arxiv_id})]\n"
+                        f"{content}\n"
+                    )
+                elif document_type == "cv" and 'file_path' in metadata:
+                    file_path = metadata.get('file_path', 'N/A')
+                    source_info = (
+                        f"[來源 {i}: {title} "
+                        f"({file_path})]\n"
+                        f"{content}\n"
+                    )
+                else:
+                    source_info = (
+                        f"[來源 {i}: {title}]\n"
+                        f"{content}\n"
+                    )
+                formatted_parts.append(source_info)
+            else:  # minimal
+                # Minimal format: source label only
+                if document_type == "paper" and 'arxiv_id' in metadata:
+                    arxiv_id = metadata.get('arxiv_id', 'N/A')
+                    source_info = (
+                        f"[arXiv:{arxiv_id}]\n"
+                        f"{content}\n"
+                    )
+                elif 'title' in metadata:
+                    title = metadata.get('title', 'N/A')
+                    source_info = (
+                        f"[{title}]\n"
+                        f"{content}\n"
+                    )
+                else:
+                    source_info = (
+                        f"[來源 {i}]\n"
+                        f"{content}\n"
+                    )
+                formatted_parts.append(source_info)
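+        # Join the excerpts with a full separator line between them
+        # (str.join binds tighter than +, so the separator is built first)
+        separator = "\n" + "="*60 + "\n"
+        formatted_text = separator.join(formatted_parts)
+        # Truncate if a maximum length is set
+        if self.max_context_length and len(formatted_text) > self.max_context_length:
+            # Cut the tail at the length limit
+            formatted_text = formatted_text[:self.max_context_length]
+            # Back up to the last separator so the remaining excerpts stay intact
+            last_separator = formatted_text.rfind("="*60)
+            if last_separator > 0:
+                formatted_text = formatted_text[:last_separator] + "\n（內容已截斷...）"
+        return formatted_text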
+    def create_prompt(
+        self,
+        query: str,
+        context: str,
+        system_prompt: Optional[str] = None,
+        document_type: str = "general"
+    ) -> str:
+        """
+        Build the complete LLM prompt.
+        Args:
+            query: The user's query.
+            context: The formatted context.
+            system_prompt: Optional system prompt (if None, one is chosen from the language and document type).
+            document_type: Document type ("paper", "cv", "general").
+        Returns:
+            The complete prompt string.
+        """
+        # Detect the query language once (Chinese is the default when auto-detection is off)
+        detected_language = self.detect_language(query) if self.auto_detect_language else "zh"
+        # Pick the matching system prompt unless the caller supplied one
+        if system_prompt is None:
+            system_prompt = self.get_system_prompt(detected_language, document_type)
+        # Choose the closing instruction by document type
+        if document_type == "paper":
+            if detected_language == "zh":
+                ending = "## 請基於上述文獻片段回答問題，並在回答中引用具體的論文來源。"
+            else:
+                ending = "## Please answer the question based on the above document excerpts and cite specific paper sources in your answer."
+        else:
+            if detected_language == "zh":
+                ending = "## 請基於上述文檔片段回答問題，並在回答中引用具體的文檔內容。"
+            else:
+                ending = "## Please answer the question based on the above document excerpts and cite specific document content in your answer."
+        if detected_language == "zh":
+            prompt = f"""{system_prompt}
+## 相關文檔片段：
+{context}
+## 用戶問題：
+{query}
+{ending}"""
+        else:  # English
+            prompt = f"""{system_prompt}
+## Relevant Document Excerpts:
+{context}
+## User Question:
+{query}
+{ending}"""
+        return prompt
+    def format_for_llm(
+        self,
+        query: str,
+        results: List[Dict],
+        system_prompt: Optional[str] = None,
+        document_type: str = "general"
+    ) -> str:
+        """
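+        One-stop method: format the retrieval results and build the complete prompt.
+        Args:
+            query: The user's query.
+            results: List of retrieval results.
+            system_prompt: Optional system prompt.
+            document_type: Document type ("paper", "cv", "general").
+        Returns:
+            The complete prompt string.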
+        """
+        context = self.format_context(results, document_type=document_type)
+        return self.create_prompt(query, context, system_prompt, document_type)
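
Reviewer sketch (not part of the commit): `format_context` joins excerpts and, when a length limit is set, cuts back to the last separator. The snippet below illustrates that pattern in isolation, assuming the intent is one full separator line between excerpts. `join_excerpts`, `RULE`, `SEPARATOR`, and the English notice text are hypothetical names chosen for this illustration.

```python
from typing import List, Optional

RULE = "=" * 60
SEPARATOR = "\n" + RULE + "\n"


def join_excerpts(parts: List[str], max_len: Optional[int] = None,
                  notice: str = "\n(content truncated...)") -> str:
    """Join excerpts with a separator line; on overflow, cut back to the last full separator."""
    # Build the separator first: str.join binds tighter than +.
    text = SEPARATOR.join(parts)
    if max_len is not None and len(text) > max_len:
        text = text[:max_len]
        cut = text.rfind(RULE)
        if cut > 0:
            # Drop everything after the last full separator and mark the cut.
            text = text[:cut].rstrip("\n") + notice
    return text


parts = ["A" * 10, "B" * 10, "C" * 10]
full = join_excerpts(parts)
short = join_excerpts(parts, max_len=100)
print(len(full))
print(short)
```

One caveat of this approach: when the cut lands inside a separator, the excerpt just before it is complete but is still dropped, because only a full 60-character rule is recognised as a boundary.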

src/retrievers/__init__.py ADDED

	@@ -0,0 +1,17 @@

+"""
+Retriever modules
+"""
+from .base import BaseRetriever
+from .bm25_retriever import BM25Retriever
+from .vector_retriever import VectorRetriever
+from .hybrid_search import HybridSearch
+from .reranker import Reranker, RAGPipeline
+__all__ = [
+    "BaseRetriever",
+    "BM25Retriever",
+    "VectorRetriever",
+    "HybridSearch",
+    "Reranker",
+    "RAGPipeline",
+]

src/retrievers/base.py ADDED

	@@ -0,0 +1,32 @@

+"""
+Abstract base class for the retriever modules
+"""
+from abc import ABC, abstractmethod
+from typing import List, Dict, Optional
+class BaseRetriever(ABC):
+    """Abstract base class for retrievers"""
+    @abstractmethod
+    def retrieve(
+        self,
+        query: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None
+    ) -> List[Dict]:
+        """
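+        Retrieve relevant documents and return them with scores.
+        Args:
+            query: Query text.
+            top_k: Number of top results to return.
+            metadata_filter: Optional dictionary of metadata filter conditions,
+                            e.g. {"arxiv_id": "1234.5678"} or {"title": "Machine Learning"}.
+                            Multiple conditions are supported and must all hold (AND logic).
+        Returns:
+            A list of relevant documents. Each document dict should contain a "score" key,
+            where a higher score means more relevant. Results are filtered by metadata_filter.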
+        """
+        pass

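The contract above (return dicts with a `"score"` key, higher is better, honor `metadata_filter` with AND logic) can be illustrated with a toy subclass. This is only a sketch: `KeywordCountRetriever` is a hypothetical class that is not part of the commit, and `BaseRetriever` is redeclared here so the snippet runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class BaseRetriever(ABC):
    """Stand-in for the abstract base class in src/retrievers/base.py."""

    @abstractmethod
    def retrieve(
        self, query: str, top_k: int = 5, metadata_filter: Optional[Dict] = None
    ) -> List[Dict]:
        ...


class KeywordCountRetriever(BaseRetriever):
    """Hypothetical retriever: scores a document by counting query-term occurrences."""

    def __init__(self, documents: List[Dict]):
        self.documents = documents

    def retrieve(self, query, top_k=5, metadata_filter=None):
        terms = query.lower().split()
        results = []
        for doc in self.documents:
            meta = doc.get("metadata", {})
            # AND logic: skip the document if any filter condition fails (exact match here)
            if metadata_filter and any(meta.get(k) != v for k, v in metadata_filter.items()):
                continue
            score = float(sum(doc["content"].lower().count(t) for t in terms))
            results.append({"content": doc["content"], "metadata": meta, "score": score})
        # Higher score means more relevant
        results.sort(key=lambda r: r["score"], reverse=True)
        return results[:top_k]


docs = [
    {"content": "BM25 is a ranking function", "metadata": {"arxiv_id": "a"}},
    {"content": "Vectors capture meaning", "metadata": {"arxiv_id": "b"}},
]
retriever = KeywordCountRetriever(docs)
top = retriever.retrieve("bm25 ranking", top_k=1)
only_b = retriever.retrieve("bm25 ranking", metadata_filter={"arxiv_id": "b"})
```
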
src/retrievers/bm25_retriever.py ADDED Viewed

	@@ -0,0 +1,127 @@

+"""
+BM25 retriever module
+"""
+from typing import List, Dict, Optional
+from rank_bm25 import BM25Okapi
+import re
+from .base import BaseRetriever
+class BM25Retriever(BaseRetriever):
+    """Keyword retrieval using the BM25 algorithm."""
+    def __init__(self, documents: List[Dict]):
+        """
+        Initialize the BM25 retriever.
+        Args:
+            documents: List of documents, each containing "content" and "metadata".
+        """
+        self.documents = documents
+        self.texts = [doc["content"] for doc in documents]
+        # Tokenize the texts (simple word splitting)
+        tokenized_texts = [self._tokenize(text) for text in self.texts]
+        # Build the BM25 index
+        self.bm25 = BM25Okapi(tokenized_texts)
+    def _tokenize(self, text: str) -> List[str]:
+        """
+        Convert text into tokens (simple implementation).
+        Args:
+            text: Input text.
+        Returns:
+            List of tokens.
+        """
+        # Lowercase the text
+        text = text.lower()
+        # Extract runs of word characters (letters, digits, underscore).
+        # Note: unsegmented CJK text has no internal word boundaries, so a run of
+        # Chinese characters becomes a single token.
+        tokens = re.findall(r'\b\w+\b', text)
+        return tokens
+    def retrieve(
+        self,
+        query: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None
+    ) -> List[Dict]:
+        """
+        Retrieve relevant documents, with optional filtering by metadata.
+        Args:
+            query: Query text.
+            top_k: Number of top results to return.
+            metadata_filter: Optional dictionary of metadata filter conditions.
+                            E.g. {"arxiv_id": "1234.5678"} retrieves only chunks of that paper,
+                            or {"title": "Machine Learning"} retrieves only papers with that title.
+                            Multiple conditions are supported and must all hold (AND logic).
+                            Note: BM25 filtering happens after retrieval, so fewer than top_k results may be returned.
+        Returns:
+            A list of relevant documents, each containing "content", "metadata", "score".
+            Results are filtered by metadata_filter.
+        """
+        # Tokenize the query
+        tokenized_query = self._tokenize(query)
+        # Compute BM25 scores
+        scores = self.bm25.get_scores(tokenized_query)
+        # Rank the documents, fetching extra candidates to compensate for filtering.
+        # Without a filter, top_k candidates suffice; with a filter, more candidates are needed.
+        candidate_k = top_k * 3 if metadata_filter else top_k
+        # Candidate indices, sorted by score in descending order
+        sorted_indices = sorted(
+            range(len(scores)),
+            key=lambda i: scores[i],
+            reverse=True
+        )[:candidate_k]
+        # Build the candidate results
+        candidate_results = []
+        for idx in sorted_indices:
+            candidate_results.append({
+                "content": self.documents[idx]["content"],
+                "metadata": self.documents[idx]["metadata"],
+                "score": float(scores[idx]),
+            })
+        # Apply the metadata filter, if one was given
+        if metadata_filter:
+            filtered_results = []
+            for result in candidate_results:
+                # Check whether this result's metadata satisfies every filter condition
+                metadata = result.get("metadata", {})
+                matches_all = True
+                for filter_key, filter_value in metadata_filter.items():
+                    # Look up the corresponding metadata value in the document
+                    doc_value = metadata.get(filter_key)
+                    # Check for a match.
+                    # When both values are strings, a partial match is accepted.
+                    if isinstance(filter_value, str) and isinstance(doc_value, str):
+                        # String match: case-insensitive substring (which also covers exact matches)
+                        if filter_value.lower() not in doc_value.lower():
+                            matches_all = False
+                            break
+                    else:
+                        # Other types (numbers, booleans, etc.) use exact matching
+                        if doc_value != filter_value:
+                            matches_all = False
+                            break
+                # Keep the result only if every condition holds
+                if matches_all:
+                    filtered_results.append(result)
+            # Return the filtered results (at most top_k)
+            return filtered_results[:top_k]
+        else:
+            # No filter: return the candidates directly
+            return candidate_results

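Two details of the BM25 retriever are easy to miss when reading the diff: how the regex tokenizer treats unsegmented Chinese text, and how the metadata filter mixes substring matching (for strings) with exact matching (for everything else). The sketch below pulls both pieces out as standalone functions so they can be tried in isolation. The names `tokenize` and `matches_filter` are illustrative and do not exist in the commit.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Same regex as BM25Retriever._tokenize: lowercase, then runs of word characters."""
    return re.findall(r"\b\w+\b", text.lower())


def matches_filter(metadata: Dict, metadata_filter: Dict) -> bool:
    """AND over all conditions: case-insensitive substring for strings, equality otherwise."""
    for key, wanted in metadata_filter.items():
        actual = metadata.get(key)
        if isinstance(wanted, str) and isinstance(actual, str):
            if wanted.lower() not in actual.lower():
                return False
        elif actual != wanted:
            return False
    return True


english_tokens = tokenize("Hybrid-Search with BM25!")
chinese_tokens = tokenize("機器學習 is fun")
print(english_tokens)  # ['hybrid', 'search', 'with', 'bm25']
print(chinese_tokens)  # ['機器學習', 'is', 'fun']

meta = {"title": "Intro to Machine Learning", "year": 2024}
print(matches_filter(meta, {"title": "machine learning"}))                # True
print(matches_filter(meta, {"title": "machine learning", "year": 2023}))  # False
```

Because the whole Chinese run becomes one token, BM25 can only match such text when the query contains exactly the same run. A word segmenter would be needed for finer-grained keyword matching on Chinese documents.
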
src/retrievers/hybrid_search.py ADDED Viewed

	@@ -0,0 +1,298 @@

+"""
+Hybrid Search module: combines BM25 and vector retrieval
+Supports two fusion methods: Weighted Sum and Reciprocal Rank Fusion (RRF)
+"""
+from typing import List, Dict, Optional, Literal
+from .base import BaseRetriever
+import numpy as np
+class HybridSearch(BaseRetriever):
+    """Hybrid search that combines sparse and dense retrieval."""
+    def __init__(
+        self,
+        sparse_retriever: BaseRetriever,
+        dense_retriever: BaseRetriever,
+        sparse_weight: float = 0.4,
+        dense_weight: float = 0.6,
+        fusion_method: Literal["weighted_sum", "rrf"] = "rrf",
+        rrf_k: int = 60,
+    ):
+        """
+        Initialize Hybrid Search.
+        Args:
+            sparse_retriever: Sparse retriever (e.g. BM25).
+            dense_retriever: Dense retriever (e.g. vector retrieval).
+            sparse_weight: Weight of the sparse retrieval score (weighted_sum method only).
+            dense_weight: Weight of the dense retrieval score (weighted_sum method only).
+            fusion_method: Fusion method, either "weighted_sum" or "rrf".
+                          - "weighted_sum": weighted sum; requires normalized scores and weights.
+                          - "rrf": Reciprocal Rank Fusion;
+                                  needs no score normalization and is more robust to differing score distributions.
+            rrf_k: The constant k in RRF, usually set to 60 (rrf method only).
+                   A larger k flattens the differences between ranks, giving lower-ranked documents relatively more weight.
+        """
+        self.sparse_retriever = sparse_retriever
+        self.dense_retriever = dense_retriever
+        self.fusion_method = fusion_method
+        self.rrf_k = rrf_k
+        # Weights are only used by the weighted_sum method
+        if fusion_method == "weighted_sum":
+            self.sparse_weight = sparse_weight
+            self.dense_weight = dense_weight
+            # Make sure the weights sum to 1
+            total_weight = sparse_weight + dense_weight
+            if abs(total_weight - 1.0) > 1e-6:
+                self.sparse_weight = sparse_weight / total_weight
+                self.dense_weight = dense_weight / total_weight
+    def _normalize_scores(self, results: List[Dict]) -> List[Dict]:
+        """
+        Normalize scores to the [0, 1] range (min-max normalization).
+        Only used by the weighted_sum method.
+        Args:
+            results: List of retrieval results; each dict contains 'score'.
+        Returns:
+            A list of result copies carrying the normalized scores.
+        """
+        scores = [res.get("score", 0.0) for res in results]
+        if not scores:
+            return results
+        scores_array = np.array(scores)
+        min_score = scores_array.min()
+        max_score = scores_array.max()
+        if max_score == min_score:
+            # If all scores are identical, set them all to 1.0
+            normalized_scores = [1.0] * len(scores)
+        else:
+            normalized_scores = ((scores_array - min_score) / (max_score - min_score)).tolist()
+        # Return copies so the retrievers' original result dicts are not mutated
+        return [{**res, "score": score} for res, score in zip(results, normalized_scores)]
+    def _get_doc_id(self, doc: Dict) -> str:
+        """
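+        Extract a unique identifier from a document.
+        Args:
+            doc: Document dict.
+        Returns:
+            The document's unique ID.
+        """
+        metadata = doc.get("metadata", {})
+        if "arxiv_id" not in metadata:
+            # Without an arxiv_id every document would collapse to "unknown_0" and be
+            # merged during fusion, so fall back to a hash of the content instead.
+            return f"content_{hash(doc.get('content', ''))}"
+        return f"{metadata['arxiv_id']}_{metadata.get('chunk_index', 0)}"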
+    def _apply_rrf(
+        self,
+        sparse_results: List[Dict],
+        dense_results: List[Dict]
+    ) -> List[Dict]:
+        """
+        Apply Reciprocal Rank Fusion (RRF).
+        RRF formula: RRF(d) = Σ(1 / (k + rank_i(d)))
+        where:
+        - d is a document
+        - rank_i(d) is the rank of the document in the i-th result list (starting from 1)
+        - k is a constant (60 by default)
+        Advantages of RRF:
+        1. Needs no score normalization and is more robust across retrievers with different score distributions
+        2. Depends only on rank positions, not on score values
+        3. Handles differences in score distributions automatically
+        Args:
+            sparse_results: List of sparse retrieval results.
+            dense_results: List of dense retrieval results.
+        Returns:
+            The fused result list, sorted by RRF score.
+        """
+        # Map each document ID to its RRF score
+        doc_to_rrf_score = {}
+        # Process the sparse retrieval results (BM25)
+        for rank, result in enumerate(sparse_results, start=1):
+            doc_id = self._get_doc_id(result)
+            if doc_id not in doc_to_rrf_score:
+                doc_to_rrf_score[doc_id] = {
+                    "doc": result,
+                    "rrf_score": 0.0,
+                    "sparse_rank": None,
+                    "dense_rank": None
+                }
+            # RRF contribution: 1 / (k + rank)
+            doc_to_rrf_score[doc_id]["rrf_score"] += 1.0 / (self.rrf_k + rank)
+            doc_to_rrf_score[doc_id]["sparse_rank"] = rank
+        # Process the dense retrieval results (vectors)
+        for rank, result in enumerate(dense_results, start=1):
+            doc_id = self._get_doc_id(result)
+            if doc_id not in doc_to_rrf_score:
+                doc_to_rrf_score[doc_id] = {
+                    "doc": result,
+                    "rrf_score": 0.0,
+                    "sparse_rank": None,
+                    "dense_rank": None
+                }
+            # RRF contribution: 1 / (k + rank)
+            doc_to_rrf_score[doc_id]["rrf_score"] += 1.0 / (self.rrf_k + rank)
+            doc_to_rrf_score[doc_id]["dense_rank"] = rank
+        # Build the result list
+        rrf_results = []
+        for doc_id, data in doc_to_rrf_score.items():
+            result = data["doc"].copy()
+            result["hybrid_score"] = data["rrf_score"]
+            result["rrf_score"] = data["rrf_score"]
+            result["sparse_rank"] = data["sparse_rank"]
+            result["dense_rank"] = data["dense_rank"]
+            # Carry over the original scores for reference
+            if data["sparse_rank"] is not None:
+                # Look up the original score in the sparse results
+                for sparse_res in sparse_results:
+                    if self._get_doc_id(sparse_res) == doc_id:
+                        result["sparse_score"] = sparse_res.get("score", 0.0)
+                        break
+            else:
+                result["sparse_score"] = None
+            if data["dense_rank"] is not None:
+                # Look up the original score in the dense results
+                for dense_res in dense_results:
+                    if self._get_doc_id(dense_res) == doc_id:
+                        result["dense_score"] = dense_res.get("score", 0.0)
+                        break
+            else:
+                result["dense_score"] = None
+            rrf_results.append(result)
+        # Sort by RRF score, highest first
+        rrf_results.sort(key=lambda x: x["rrf_score"], reverse=True)
+        return rrf_results
+    def _apply_weighted_sum(
+        self,
+        sparse_results: List[Dict],
+        dense_results: List[Dict]
+    ) -> List[Dict]:
+        """
+        Apply the Weighted Sum method.
+        This method requires:
+        1. Normalizing both score sets to the same range
+        2. Combining them as a weighted sum
+        Args:
+            sparse_results: List of sparse retrieval results.
+            dense_results: List of dense retrieval results.
+        Returns:
+            The fused result list, sorted by hybrid score.
+        """
+        # Normalize both score sets
+        normalized_sparse = self._normalize_scores(sparse_results)
+        normalized_dense = self._normalize_scores(dense_results)
+        # Combine the scores
+        doc_to_scores = {}
+        # Process the sparse retrieval results
+        for res in normalized_sparse:
+            doc_id = self._get_doc_id(res)
+            if doc_id not in doc_to_scores:
+                doc_to_scores[doc_id] = {"doc": res, "sparse": 0.0, "dense": 0.0}
+            doc_to_scores[doc_id]["sparse"] = res["score"]
+        # Process the dense retrieval results
+        for res in normalized_dense:
+            doc_id = self._get_doc_id(res)
+            if doc_id not in doc_to_scores:
+                doc_to_scores[doc_id] = {"doc": res, "sparse": 0.0, "dense": 0.0}
+            doc_to_scores[doc_id]["dense"] = res["score"]
+        # Compute the hybrid scores and sort
+        hybrid_results = []
+        for doc_id, scores in doc_to_scores.items():
+            hybrid_score = (
+                self.sparse_weight * scores["sparse"] +
+                self.dense_weight * scores["dense"]
+            )
+            result = scores["doc"].copy()
+            result["hybrid_score"] = hybrid_score
+            result["sparse_score"] = scores["sparse"]
+            result["dense_score"] = scores["dense"]
+            hybrid_results.append(result)
+        # Sort by hybrid score, highest first
+        hybrid_results.sort(key=lambda x: x["hybrid_score"], reverse=True)
+        return hybrid_results
+    def retrieve(
+        self,
+        query: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None
+    ) -> List[Dict]:
+        """
+        Run the hybrid search, with optional filtering by metadata.
+        Args:
+            query: Query text.
+            top_k: Number of top results to return.
+            metadata_filter: Optional dictionary of metadata filter conditions.
+                            E.g. {"arxiv_id": "1234.5678"} retrieves only chunks of that paper,
+                            or {"title": "Machine Learning"} retrieves only papers with that title.
+                            Multiple conditions are supported and must all hold (AND logic).
+                            The filter is passed down to the underlying sparse and dense retrievers.
+        Returns:
+            A list of relevant documents, each containing "content", "metadata", "hybrid_score".
+            Results are filtered by metadata_filter.
+            Both methods add "sparse_score" and "dense_score"; the remaining fields depend on fusion_method:
+            - RRF: also "rrf_score", "sparse_rank", "dense_rank"; the scores are the raw retriever scores
+            - Weighted Sum: the scores are the normalized values
+        """
+        # 1. Fetch results from both retrievers (request extra results for better coverage)
+        # and pass metadata_filter down to them
+        sparse_results = self.sparse_retriever.retrieve(
+            query,
+            top_k=top_k * 2,
+            metadata_filter=metadata_filter
+        )
+        dense_results = self.dense_retriever.retrieve(
+            query,
+            top_k=top_k * 2,
+            metadata_filter=metadata_filter
+        )
+        # 2. Fuse the results with the selected method
+        if self.fusion_method == "rrf":
+            # Reciprocal Rank Fusion:
+            # needs no score normalization and fuses purely on rank
+            hybrid_results = self._apply_rrf(sparse_results, dense_results)
+        else:
+            # Weighted sum:
+            # normalizes the scores first, then combines them by weight
+            hybrid_results = self._apply_weighted_sum(sparse_results, dense_results)
+        # 3. Return the top_k results
+        return hybrid_results[:top_k]
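
The weighted-sum path is easiest to follow on a tiny example. The sketch below works on plain `{doc_id: score}` dictionaries instead of the result dicts used in `HybridSearch`; the function names `min_max` and `weighted_sum` are illustrative only. It mirrors the two steps of `_apply_weighted_sum`: min-max normalize each score set, then combine with the default weights 0.4 (sparse) and 0.6 (dense). A document missing from one list contributes 0.0 for that side.

```python
from typing import Dict


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalize to [0, 1]; identical scores all map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_sum(
    sparse: Dict[str, float],
    dense: Dict[str, float],
    sparse_weight: float = 0.4,
    dense_weight: float = 0.6,
) -> Dict[str, float]:
    """Fuse two score sets; returns doc IDs ordered by fused score, highest first."""
    ns, nd = min_max(sparse), min_max(dense)
    fused = {
        doc_id: sparse_weight * ns.get(doc_id, 0.0) + dense_weight * nd.get(doc_id, 0.0)
        for doc_id in set(ns) | set(nd)
    }
    return dict(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))


# BM25 scores are unbounded, cosine-style similarities are not: normalization aligns them
sparse_scores = {"a": 12.0, "b": 6.0, "c": 0.0}
dense_scores = {"b": 0.9, "d": 0.5, "a": 0.1}
fused = weighted_sum(sparse_scores, dense_scores)
print(list(fused))  # ['b', 'a', 'd', 'c']
```

Note that min-max normalization maps the lowest-scoring document in each list to exactly 0.0, so the weakest candidate of one retriever gets no credit from that retriever at all.
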

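The RRF formula in the docstring above, RRF(d) = Σ 1 / (k + rank_i(d)), can be checked by hand with a small standalone function. This is a sketch over lists of document IDs rather than the result dicts used in `_apply_rrf`; the name `rrf_fuse` is illustrative and not part of the commit.

```python
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion over ranked lists of document IDs (best first)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


sparse_ranking = ["a", "b", "c"]  # e.g. BM25 order
dense_ranking = ["b", "d", "a"]   # e.g. vector-search order
rrf_scores = rrf_fuse([sparse_ranking, dense_ranking])
print(list(rrf_scores))  # ['b', 'a', 'd', 'c']
```

Document `b` wins because it ranks high in both lists (1/62 + 1/61), while `a` is first in one list but only third in the other (1/61 + 1/63). Documents found by a single retriever (`d`, `c`) trail behind, which is the behavior that makes RRF reward agreement between retrievers.
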
src/retrievers/reranker.py ADDED Viewed

	@@ -0,0 +1,448 @@

+"""
+Reranking module: precise reranking with a Cross-Encoder
+"""
+from typing import List, Dict, Optional, Tuple
+from sentence_transformers import CrossEncoder
+import time
+import logging
+# Try to import torch to detect the available device
+try:
+    import torch
+    TORCH_AVAILABLE = True
+except ImportError:
+    TORCH_AVAILABLE = False
+# Configure logging
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+def get_device() -> str:
+    """
+    Detect and return the best available device.
+    Returns:
+        Device name: 'mps' (macOS GPU), 'cuda' (NVIDIA GPU), or 'cpu'
+    """
+    if not TORCH_AVAILABLE:
+        return 'cpu'
+    # Priority: MPS (macOS) > CUDA (NVIDIA) > CPU
+    if torch.backends.mps.is_available():
+        return 'mps'
+    elif torch.cuda.is_available():
+        return 'cuda'
+    else:
+        return 'cpu'
+class Reranker:
+    """Reranking component: precise reranking with a Cross-Encoder."""
+    def __init__(
+        self,
+        model_name: str = "BAAI/bge-reranker-base",
+        device: Optional[str] = None,
+        max_length: int = 512,
+        batch_size: int = 32,
+        enable_cache: bool = True
+    ):
+        """
+        Initialize the Cross-Encoder model.
+        Args:
+            model_name: Cross-Encoder model name.
+            device: Device name ('cuda', 'cpu', 'mps'); auto-detected when None.
+            max_length: Maximum token length (model limit).
+            batch_size: Batch size, used to limit memory usage.
+            enable_cache: Whether to enable model caching (accepted but currently unused).
+        """
+        try:
+            # Auto-detect the device if none was specified
+            if device is None:
+                device = get_device()
+            device_name_map = {
+                'mps': 'MPS (macOS GPU)',
+                'cuda': 'CUDA (NVIDIA GPU)',
+                'cpu': 'CPU'
+            }
+            device_display = device_name_map.get(device, device)
+            self.model = CrossEncoder(
+                model_name,
+                device=device,
+                max_length=max_length
+            )
+            self.max_length = max_length
+            self.batch_size = batch_size
+            self.model_name = model_name
+            logger.info(f"✅ Reranker model {model_name} loaded (device: {device_display})")
+        except Exception as e:
+            logger.error(f"❌ Failed to load the model: {e}")
+            raise
+    def _truncate_text(self, text: str, max_chars: int = 2000) -> str:
+        """
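+        Truncate overly long text (a rough estimate, to avoid exceeding the token limit).
+        Args:
+            text: Original text.
+            max_chars: Maximum number of characters (conservative estimate, about 500 tokens).
+        Returns:
+            The truncated text.
+        """
+        if len(text) <= max_chars:
+            return text
+        # Truncate and append an ellipsis
+        return text[:max_chars - 3] + "..."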
+    def _prepare_pairs(
+        self,
+        query: str,
+        documents: List[Dict]
+    ) -> List[Tuple[str, str]]:
+        """
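+        Build the (query, document) pairs, truncating overly long documents.
+        Args:
+            query: Query text.
+            documents: List of documents.
+        Returns:
+            List of (query, content) pairs.
+        """
+        pairs = []
+        truncated_indices = []  # indices of the documents that were truncated
+        # Rough character budget for each document: 70% of max_length minus the query length.
+        # This counts one character as one token, which is deliberately conservative
+        # (the CrossEncoder also truncates to max_length tokens on its own).
+        # The floor of 64 characters keeps the budget positive for very long queries.
+        max_doc_chars = max(int(self.max_length * 0.7) - len(query), 64)
+        for i, doc in enumerate(documents):
+            content = doc.get("content", "")
+            # Truncate the content if it is too long
+            if len(content) > max_doc_chars:
+                content = self._truncate_text(content, max_doc_chars)
+                truncated_indices.append(i)
+            pairs.append((query, content))
+        if truncated_indices:
+            logger.warning(
+                f"⚠️  {len(truncated_indices)} documents were truncated for being too long "
+                f"(max length: {max_doc_chars} characters)"
+            )
+        return pairs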
+    def rerank(
+        self,
+        query: str,
+        documents: List[Dict],
+        top_k: int = 5,
+        preserve_original_scores: bool = True
+    ) -> List[Dict]:
+        """
+        Perform precise reranking.
+        Args:
+            query: Query text.
+            documents: List of documents; each should contain "content" and optionally "hybrid_score".
+            top_k: Number of top results to return.
+            preserve_original_scores: Whether to keep the original score (hybrid_score).
+        Returns:
+            The reranked documents, sorted by rerank_score in descending order.
+        """
+        if not documents:
+            logger.warning("⚠️  Document list is empty, returning no results")
+            return []
+        if not query or not query.strip():
+            logger.warning("⚠️  Query is empty, returning documents in their original order")
+            return documents[:top_k]
+        start_time = time.time()
+        logger.info(f"🔄 Reranking {len(documents)} documents...")
+        try:
+            # 1. Build the (query, document) pairs
+            pairs = self._prepare_pairs(query, documents)
+            # 2. Score in batches (limits memory usage)
+            scores = []
+            for i in range(0, len(pairs), self.batch_size):
+                batch_pairs = pairs[i:i + self.batch_size]
+                batch_scores = self.model.predict(batch_pairs)
+                scores.extend(batch_scores.tolist() if hasattr(batch_scores, 'tolist') else batch_scores)
+            # 3. Attach the scores to copies so the caller's list and dicts stay untouched
+            scored_docs = []
+            for doc, score in zip(documents, scores):
+                doc = doc.copy()
+                doc["rerank_score"] = float(score)
+                # Keep the original score for reference
+                if preserve_original_scores:
+                    if "hybrid_score" not in doc:
+                        # No hybrid_score available, so fall back to the plain score
+                        doc["original_score"] = doc.get("score", 0.0)
+                scored_docs.append(doc)
+            # 4. Re-sort by rerank_score
+            reranked_docs = sorted(
+                scored_docs,
+                key=lambda x: x.get("rerank_score", float('-inf')),
+                reverse=True
+            )
+            # 5. Statistics
+            elapsed_time = time.time() - start_time
+            avg_score = sum(scores) / len(scores) if scores else 0.0
+            max_score = max(scores) if scores else 0.0
+            min_score = min(scores) if scores else 0.0
+            logger.info(
+                f"✅ Reranking finished (elapsed: {elapsed_time:.2f}s, "
+                f"average score: {avg_score:.4f}, "
+                f"range: [{min_score:.4f}, {max_score:.4f}])"
+            )
+            return reranked_docs[:top_k]
+        except Exception as e:
+            logger.error(f"❌ Reranking failed: {e}")
+            # Fallback: return the first top_k documents in their original order
+            logger.warning("⚠️  Falling back to the original order")
+            return documents[:top_k]
+class RAGPipeline:
+    """Orchestration pipeline: manages the full RAG flow (recall + rerank)."""
+    def __init__(
+        self,
+        hybrid_search,
+        reranker,
+        recall_k: int = 25,
+        adaptive_recall: bool = True,
+        min_recall_k: int = 10,
+        max_recall_k: int = 50
+    ):
+        """
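+        Initialize the RAG pipeline.
+        Args:
+            hybrid_search: HybridSearch instance.
+            reranker: Reranker instance.
+            recall_k: Number of candidates to recall in the first stage (default value).
+            adaptive_recall: Whether to adjust recall_k dynamically based on the query.
+            min_recall_k: Minimum number of candidates to recall.
+            max_recall_k: Maximum number of candidates to recall.
+        """
+        self.hybrid_search = hybrid_search
+        self.reranker = reranker
+        self.base_recall_k = recall_k
+        self.adaptive_recall = adaptive_recall
+        self.min_recall_k = min_recall_k
+        self.max_recall_k = max_recall_k
+        # Performance statistics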
+        self.stats = {
+            "total_queries": 0,
+            "avg_recall_time": 0.0,
+            "avg_rerank_time": 0.0,
+            "avg_total_time": 0.0
+        }
+    def _calculate_adaptive_recall_k(self, query: str) -> int:
+        """
+        Compute recall_k dynamically from the query complexity.
+        Args:
+            query: Query text.
+        Returns:
+            The adjusted recall_k.
+        """
+        if not self.adaptive_recall:
+            return self.base_recall_k
+        # Simple heuristic based on the query length and the number of distinct words.
+        # Words are split on whitespace, so a CJK query without spaces counts as one word.
+        query_length = len(query.split())
+        keyword_count = len(set(query.lower().split()))
+        # Complex queries need more candidates
+        if query_length > 10 or keyword_count > 5:
+            recall_k = min(self.base_recall_k * 2, self.max_recall_k)
+        elif query_length < 3:
+            recall_k = max(self.base_recall_k // 2, self.min_recall_k)
+        else:
+            recall_k = self.base_recall_k
+        return recall_k
+    def query(
+        self,
+        text: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        enable_rerank: bool = True,
+        return_stats: bool = False
+    ) -> List[Dict]:
+        """
+        Run the complete search flow.
+        Args:
+            text: Query text.
+            top_k: Number of results to return in the end.
+            metadata_filter: Optional metadata filter conditions.
+            enable_rerank: Whether to enable reranking (optional, for performance testing).
+            return_stats: Whether to return performance statistics.
+        Returns:
+            A list of relevant documents. If return_stats=True, a (results, stats) tuple is returned instead.
+        """
+        if not text or not text.strip():
+            logger.warning("⚠️  Query is empty")
+            return ([], {}) if return_stats else []
+        total_start = time.time()
+        self.stats["total_queries"] += 1
+        # Compute recall_k dynamically
+        recall_k = self._calculate_adaptive_recall_k(text)
+        logger.info(
+            f"🔍 Searching: '{text[:50]}...' "
+            f"(recall stage: {recall_k} candidates, final results: {top_k})"
+        )
+        try:
+            # Stage 1: hybrid search (recall stage)
+            recall_start = time.time()
+            initial_results = self.hybrid_search.retrieve(
+                query=text,
+                top_k=recall_k,
+                metadata_filter=metadata_filter
+            )
+            recall_time = time.time() - recall_start
+            if not initial_results:
+                logger.warning("⚠️  The recall stage found no results")
+                return ([], {}) if return_stats else []
+            logger.info(
+                f"✅ Recall stage finished: found {len(initial_results)} candidates "
+                f"(elapsed: {recall_time:.2f}s)"
+            )
+            # Stage 2: reranking (fine-filtering stage)
+            if enable_rerank and len(initial_results) > top_k:
+                rerank_start = time.time()
+                final_results = self.reranker.rerank(
+                    query=text,
+                    documents=initial_results,
+                    top_k=top_k
+                )
+                rerank_time = time.time() - rerank_start
+                logger.info(
+                    f"✅ Rerank stage finished: selected {len(final_results)} results "
+                    f"from {len(initial_results)} candidates (elapsed: {rerank_time:.2f}s)"
+                )
+            else:
+                # Skip reranking (for performance testing or when there are few candidates)
+                final_results = initial_results[:top_k]
+                rerank_time = 0.0
+                logger.info("⏭️  Skipping the rerank stage (too few candidates or disabled)")
+            # Update the statistics
+            total_time = time.time() - total_start
+            self._update_stats(recall_time, rerank_time, total_time)
+            # Attach performance information to the results (optional)
+            if return_stats:
+                stats = {
+                    "recall_time": recall_time,
+                    "rerank_time": rerank_time,
+                    "total_time": total_time,
+                    "recall_k": recall_k,
+                    "candidates_found": len(initial_results),
+                    "final_results": len(final_results)
+                }
+                return final_results, stats
+            return final_results
+        except Exception as e:
+            logger.error(f"❌ Query failed: {e}")
+            # Fallback: try the recall stage only
+            try:
+                logger.warning("⚠️  Trying the fallback: recall results only")
+                fallback_results = self.hybrid_search.retrieve(text, top_k=top_k, metadata_filter=metadata_filter)
+                return (fallback_results, {}) if return_stats else fallback_results
+            except Exception as e2:
+                logger.error(f"❌ The fallback failed as well: {e2}")
+                return ([], {}) if return_stats else []
+    def _update_stats(self, recall_time: float, rerank_time: float, total_time: float):
+        """Update the performance statistics (running averages)."""
+        n = self.stats["total_queries"]
+        self.stats["avg_recall_time"] = (
+            (self.stats["avg_recall_time"] * (n - 1) + recall_time) / n
+        )
+        self.stats["avg_rerank_time"] = (
+            (self.stats["avg_rerank_time"] * (n - 1) + rerank_time) / n
+        )
+        self.stats["avg_total_time"] = (
+            (self.stats["avg_total_time"] * (n - 1) + total_time) / n
+        )
+    def get_stats(self) -> Dict:
+        """Return the performance statistics."""
+        return self.stats.copy()
+    def reset_stats(self):
+        """Reset the statistics."""
+        self.stats = {
+            "total_queries": 0,
+            "avg_recall_time": 0.0,
+            "avg_rerank_time": 0.0,
+            "avg_total_time": 0.0
+        }
+    def format_results_for_llm(
+        self,
+        results: List[Dict],
+        format_style: str = "detailed"
+    ) -> str:
+        """
+        Format the retrieval results for the LLM (requires importing PromptFormatter).
+        Args:
+            results: List of retrieval results.
+            format_style: Format style ("detailed", "simple", "minimal").
+        Returns:
+            The formatted context string.
+        """
+        try:
+            from ..prompt_formatter import PromptFormatter
+            formatter = PromptFormatter(format_style=format_style)
+            return formatter.format_context(results)
+        except ImportError:
+            # Fall back to a simple format if the import fails
+            formatted_parts = []
+            for i, result in enumerate(results, 1):
+                metadata = result.get("metadata", {})
+                content = result.get("content", "")
+                arxiv_id = metadata.get('arxiv_id', 'N/A')
+                title = metadata.get('title', 'N/A')
+                formatted_parts.append(
+                    f"[Source {i}: {title} (arXiv:{arxiv_id})]\n{content}\n"
+                )
+            # Join the parts with a full separator line; without the explicit variable,
+            # operator precedence would apply .join() to "\n" alone
+            separator = "\n" + "=" * 60 + "\n"
+            return separator.join(formatted_parts)
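
`_update_stats` keeps running averages without storing every timing: the previous mean is scaled back up by `n - 1`, the new sample is added, and the sum is divided by `n`. A minimal standalone sketch of that update rule is shown below. The function name `update_running_mean` is illustrative only.

```python
def update_running_mean(current_mean: float, new_value: float, n: int) -> float:
    """Incremental mean: n is the sample count *including* new_value."""
    return (current_mean * (n - 1) + new_value) / n


timings = [0.2, 0.4, 0.9]  # e.g. total query times in seconds
mean = 0.0
for n, t in enumerate(timings, start=1):
    mean = update_running_mean(mean, t, n)

print(round(mean, 6))  # 0.5, the same as sum(timings) / len(timings)
```

The rule is only exact when every counted query also contributes a sample. In `RAGPipeline.query`, `total_queries` is incremented before the early return for an empty recall stage, so queries that find nothing are counted without adding a timing and pull the averages down slightly.
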

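The adaptive recall heuristic in `RAGPipeline._calculate_adaptive_recall_k` can be tried out as a standalone function. The sketch below copies the thresholds and defaults from the diff (`recall_k=25`, `min_recall_k=10`, `max_recall_k=50`); the function name `adaptive_recall_k` is illustrative.

```python
def adaptive_recall_k(query: str, base_k: int = 25, min_k: int = 10, max_k: int = 50) -> int:
    """Widen the recall stage for long queries, narrow it for very short ones."""
    words = query.split()                      # whitespace split, as in the diff
    unique_words = len({w.lower() for w in words})
    if len(words) > 10 or unique_words > 5:    # complex query: more candidates
        return min(base_k * 2, max_k)
    if len(words) < 3:                         # very short query: fewer candidates
        return max(base_k // 2, min_k)
    return base_k


short_k = adaptive_recall_k("RRF")
medium_k = adaptive_recall_k("what is reciprocal rank fusion")
long_k = adaptive_recall_k("how does hybrid search combine bm25 and dense vector retrieval")
chinese_k = adaptive_recall_k("混合搜尋如何結合稀疏與密集檢索")
print(short_k, medium_k, long_k, chinese_k)  # 12 25 50 12
```

The last call shows a limitation worth keeping in mind for this bilingual app: a Chinese query without spaces is a single whitespace-separated word, so it is always treated as a short query regardless of how complex it is.
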
src/retrievers/vector_retriever.py ADDED Viewed

	@@ -0,0 +1,254 @@

+"""
+Vector retriever module: semantic retrieval with embeddings and a vector database
+Supports two initialization modes:
+1. Automatic embeddings initialization (default): creates a new embedding model from the parameters
+2. External embeddings: accepts an already initialized embedding model (can be shared with DocumentProcessor)
+"""
+from typing import List, Dict, Optional, Any
+from langchain_community.vectorstores import Chroma
+from langchain_core.documents import Document
+import os
+from .base import BaseRetriever
+# Try to import HuggingFaceEmbeddings (free models)
+try:
+    from langchain_community.embeddings import HuggingFaceEmbeddings
+except ImportError:
+    try:
+        from langchain_huggingface import HuggingFaceEmbeddings
+    except ImportError:
+        raise ImportError("langchain-community or langchain-huggingface must be installed to use Hugging Face embeddings")
+# Import torch to detect the available device
+try:
+    import torch
+    TORCH_AVAILABLE = True
+except ImportError:
+    TORCH_AVAILABLE = False
+def get_device() -> str:
+    """
+    Detect and return the best available device.
+    Returns:
+        Device name: 'mps' (macOS GPU), 'cuda' (NVIDIA GPU), or 'cpu'
+    """
+    if not TORCH_AVAILABLE:
+        return 'cpu'
+    # Priority: MPS (macOS) > CUDA (NVIDIA) > CPU
+    if torch.backends.mps.is_available():
+        return 'mps'
+    elif torch.cuda.is_available():
+        return 'cuda'
+    else:
+        return 'cpu'
+class VectorRetriever(BaseRetriever):
+    """Semantic search using vector retrieval."""
+    def __init__(
+        self,
+        documents: List[Dict],
+        embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2",
+        persist_directory: Optional[str] = "./chroma_db",
+        hf_cache_dir: Optional[str] = None,
+        device: Optional[str] = None,
+        embeddings: Optional[Any] = None  # 可選：外部傳入的 embedding 模型（優先使用）
+    ):
+        """
+        初始化向量檢索器（使用 Hugging Face embeddings）
+        Args:
+            documents: 文檔列表，每個文檔包含 "content" 和 "metadata"
+            embedding_model: Hugging Face embedding 模型名稱（預設: "sentence-transformers/all-MiniLM-L6-v2"）
+                            僅在 embeddings=None 時使用
+            persist_directory: Chroma 資料庫持久化目錄
+            hf_cache_dir: Hugging Face 模型緩存目錄（例如外接硬碟路徑）
+                         如果為 None，則使用環境變數 HF_HOME 或預設位置 ~/.cache/huggingface/
+                         僅在 embeddings=None 時使用
+            device: 設備名稱 ('mps', 'cuda', 'cpu')，如果為 None 則自動檢測最佳設備
+                   僅在 embeddings=None 時使用
+            embeddings: 可選的外部 embedding 模型物件
+                       如果提供，將優先使用此模型，忽略其他參數（embedding_model, hf_cache_dir, device）
+                       這允許與 DocumentProcessor 共用同一個 embedding 模型實例
+                       優點：
+                       - 節省內存（只加載一次模型）
+                       - 節省時間（避免重複初始化）
+                       - 確保一致性（分塊和檢索使用相同的模型）
+        """
+        # 優先使用傳入的共用模型
+        if embeddings is not None:
+            self.embeddings = embeddings
+            print("✓ 使用外部傳入的 embeddings 模型（與 DocumentProcessor 共用）")
+        else:
+            # 若無傳入，則執行原有的初始化邏輯
+            print(f"使用 Hugging Face embedding 模型: {embedding_model}")
+            # 設置 Hugging Face 緩存目錄
+            if hf_cache_dir:
+                # 如果指定了緩存目錄，設置環境變數
+                os.environ['HF_HOME'] = hf_cache_dir
+                os.environ['TRANSFORMERS_CACHE'] = hf_cache_dir
+                print(f"模型將存儲在: {hf_cache_dir}")
+            else:
+                # 檢查是否已經設置了環境變數
+                default_cache = os.path.expanduser("~/.cache/huggingface")
+                current_cache = os.getenv('HF_HOME', default_cache)
+                print(f"模型緩存位置: {current_cache}")
+                print("提示: 可以通過設置 hf_cache_dir 參數或環境變數 HF_HOME 來指定外接硬碟路徑")
+            # 自動檢測或使用指定的設備
+            if device is None:
+                device = get_device()
+            device_name_map = {
+                'mps': 'MPS (macOS GPU)',
+                'cuda': 'CUDA (NVIDIA GPU)',
+                'cpu': 'CPU'
+            }
+            print(f"使用設備: {device_name_map.get(device, device)}")
+            print("首次使用時會下載模型，請稍候...")
+            # 構建 model_kwargs，包含緩存目錄和設備
+            model_kwargs = {'device': device}
+            if hf_cache_dir:
+                model_kwargs['cache_dir'] = hf_cache_dir
+            self.embeddings = HuggingFaceEmbeddings(
+                model_name=embedding_model,
+                model_kwargs=model_kwargs,
+                encode_kwargs={'normalize_embeddings': True}  # 正規化 embeddings 以提升效果
+            )
+        # 將文檔轉換為 LangChain Document 格式
+        # 需要將 metadata 中的列表轉換為字串，因為 ChromaDB 不接受列表類型
+        def sanitize_metadata(metadata: Dict) -> Dict:
+            """將 metadata 中的列表轉換為字串，以符合 ChromaDB 的要求"""
+            sanitized = {}
+            for key, value in metadata.items():
+                if isinstance(value, list):
+                    # Join the list into a comma-separated string
+                    sanitized[key] = ", ".join(str(v) for v in value)
+                elif isinstance(value, (dict, set)):
+                    # Stringify dicts and sets
+                    sanitized[key] = str(value)
+                else:
+                    # Keep other types (str, int, float, bool, None) unchanged
+                    sanitized[key] = value
+            return sanitized
+        langchain_docs = [
+            Document(
+                page_content=doc["content"],
+                metadata=sanitize_metadata(doc["metadata"])
+            )
+            for doc in documents
+        ]
+        # Create the vector store
+        self.vectorstore = Chroma.from_documents(
+            documents=langchain_docs,
+            embedding=self.embeddings,
+            persist_directory=persist_directory
+        )
+        # Create the retriever
+        self.retriever = self.vectorstore.as_retriever()
+    def retrieve(
+        self,
+        query: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None
+    ) -> List[Dict]:
+        """
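+        Retrieve relevant documents and return normalized similarity scores (higher is better).
+        Supports filtering on metadata.
+        Args:
+            query: Query text
+            top_k: Number of results to return
+            metadata_filter: Optional dict of metadata filter conditions.
+                            Example: {"arxiv_id": "1234.5678"} retrieves only chunks of that paper,
+                            or {"title": "Machine Learning"} retrieves only papers with that title.
+                            Multiple conditions are combined with AND logic.
+                            Note: filtering runs in Python after retrieval. String values match
+                            case-insensitively by substring; {"$eq": value} and non-string values
+                            match exactly.
+        Returns:
+            List of relevant documents, each with "content", "metadata", and "score",
+            filtered by metadata_filter when provided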
+        """
+        # Build the filter.
+        # When metadata_filter is given, over-fetch and filter in Python, because
+        # support for the filter argument of LangChain Chroma's
+        # similarity_search_with_score may vary between versions.
+        if metadata_filter:
+            # Over-fetch so enough candidates survive filtering
+            results_with_scores = self.vectorstore.similarity_search_with_score(
+                query,
+                k=top_k * 10  # Over-fetch by 10x
+            )
+            # Filter in Python
+            filtered_results = []
+            for doc, distance_score in results_with_scores:
+                metadata = doc.metadata
+                matches = True
+                for key, value in metadata_filter.items():
+                    doc_value = metadata.get(key)
+                    # Check whether this condition matches
+                    if isinstance(value, dict):
+                        # Operator form, e.g. {"$eq": "value"}
+                        if "$eq" in value:
+                            if doc_value != value["$eq"]:
+                                matches = False
+                                break
+                        else:
+                            # Other operators can be added here; unsupported ones never match
+                            matches = False
+                            break
+                    elif isinstance(value, str) and isinstance(doc_value, str):
+                        # Strings: case-insensitive substring match
+                        if value.lower() not in doc_value.lower():
+                            matches = False
+                            break
+                    else:
+                        # Other types: exact match
+                        if doc_value != value:
+                            matches = False
+                            break
+                if matches:
+                    filtered_results.append((doc, distance_score))
+            # Keep only the first top_k results
+            results_with_scores = filtered_results[:top_k]
+        else:
+            # No filter: fetch the results directly
+            results_with_scores = self.vectorstore.similarity_search_with_score(
+                query,
+                k=top_k
+            )
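+        # Build the results and convert distances to similarities
+        results = []
+        for doc, distance_score in results_with_scores:
+            # Chroma's default "l2" space returns the squared L2 distance.
+            # For normalized embeddings, squared distance = 2 - 2 * cos_sim,
+            # so cos_sim = 1 - squared_distance / 2.
+            # The score lies in [-1, 1]; higher means more similar.
+            similarity_score = 1 - (distance_score / 2)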
+            results.append({
+                "content": doc.page_content,
+                "metadata": doc.metadata,
+                "score": float(similarity_score),
+            })
+        return results
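
The score conversion in `retrieve` rests on an identity for unit-length vectors: the squared Euclidean distance equals 2 - 2 * cos_sim. The sketch below checks that identity with plain Python. The helper names (`normalize`, `cosine`, `squared_l2`, `similarity_from_squared_l2`) are illustrative and not part of the repository, and whether a given vector store reports the distance or its square depends on how the collection is configured, so treat the input as an assumption to verify.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity of two unit vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def similarity_from_squared_l2(d2):
    """Recover cosine similarity from a squared L2 distance (unit vectors only)."""
    return 1.0 - d2 / 2.0

# Hypothetical embeddings, normalized as encode_kwargs requests
a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 0.5, 1.0])

d2 = squared_l2(a, b)
recovered = similarity_from_squared_l2(d2)
print(f"cosine={cosine(a, b):.6f} recovered={recovered:.6f}")
```

If the store hands back the plain (unsquared) distance instead, square it before applying the same formula.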

src/step_back_rag.py ADDED Viewed

	@@ -0,0 +1,305 @@

+"""
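+Step-back Prompting dual-track RAG: combines specific facts with abstract principles.
+Uses Step-back Prompting to retrieve specific facts and abstract principles together to improve answer quality.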
+"""
+from typing import List, Dict, Optional
+from .retrievers.reranker import RAGPipeline
+from .retrievers.vector_retriever import VectorRetriever
+from .prompt_formatter import PromptFormatter
+from .llm_integration import OllamaLLM
+import time
+import logging
+import hashlib
+from concurrent.futures import ThreadPoolExecutor
+logger = logging.getLogger(__name__)
+class StepBackRAG:
+    """Dual-track RAG system built on Step-back Prompting."""
+    def __init__(
+        self,
+        rag_pipeline: RAGPipeline,
+        vector_retriever: VectorRetriever,
+        llm: OllamaLLM,
+        step_back_temperature: float = 0.3,  # Lower temperature when generating the abstract question
+        answer_temperature: float = 0.7,
+        enable_parallel: bool = True
+    ):
+        """
+        Initialize Step-back RAG.
+        Args:
+            rag_pipeline: RAG pipeline instance (for final answer generation)
+            vector_retriever: Vector retriever
+            llm: LLM instance
+            step_back_temperature: Temperature for generating the abstract question (lower, more stable)
+            answer_temperature: Temperature for generating the answer
+            enable_parallel: Whether to run the two retrieval tracks in parallel
+        """
+        self.rag_pipeline = rag_pipeline
+        self.vector_retriever = vector_retriever
+        self.llm = llm
+        self.step_back_temperature = step_back_temperature
+        self.answer_temperature = answer_temperature
+        self.enable_parallel = enable_parallel
+    def _generate_step_back_question(self, question: str) -> str:
+        """
+        Generate the step-back (abstract) question.
+        Args:
+            question: Original specific question
+        Returns:
+            The abstract question
+        """
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            prompt = f"""你是一個資深專家。請將以下具體問題轉換為一個更抽象、更基礎的原理性問題。
+這個抽象問題應該幫助理解該領域的基礎概念和原理，而不是直接回答具體問題。
+具體問題: {question}
+請生成一個抽象問題，用於檢索相關的原理和背景知識：
+"""
+        else:
+            prompt = f"""You are a senior expert. Please convert the following specific question into a more abstract, fundamental question about principles and concepts.
+This abstract question should help understand the basic concepts and principles in this field, rather than directly answering the specific question.
+Specific question: {question}
+Please generate an abstract question for retrieving relevant principles and background knowledge:
+"""
+        try:
+            abstract_question = self.llm.generate(
+                prompt=prompt,
+                temperature=self.step_back_temperature,
+                max_tokens=200
+            )
+            abstract_question = abstract_question.strip()
+            if not abstract_question:
+                logger.warning("⚠️  生成的抽象問題為空，使用原始問題")
+                return question
+            logger.info(f"✅ 生成抽象問題: '{abstract_question}'")
+            return abstract_question
+        except Exception as e:
+            logger.error(f"⚠️  生成抽象問題時出錯: {e}")
+            return question
+    def _get_doc_id(self, doc: Dict) -> str:
+        """
+        Build a unique identifier for a document.
+        Args:
+            doc: Document dict
+        Returns:
+            Unique ID
+        """
+        metadata = doc.get("metadata", {})
+        content = doc.get("content", "")
+        if "arxiv_id" in metadata and "chunk_index" in metadata:
+            return f"{metadata['arxiv_id']}_{metadata['chunk_index']}"
+        elif "file_path" in metadata and "chunk_index" in metadata:
+            return f"{metadata['file_path']}_{metadata['chunk_index']}"
+        else:
+            content_hash = hashlib.md5(content.encode()).hexdigest()[:16]
+            return f"doc_{content_hash}"
+    def _retrieve_direct(self, question: str, top_k: int, metadata_filter: Optional[Dict] = None) -> List[Dict]:
+        """Retrieve directly with the original question (specific facts)."""
+        return self.vector_retriever.retrieve(
+            query=question,
+            top_k=top_k,
+            metadata_filter=metadata_filter
+        )
+    def _retrieve_step_back(self, question: str, top_k: int, metadata_filter: Optional[Dict] = None) -> tuple:
+        """Step-back retrieval (abstract principles)."""
+        abstract_question = self._generate_step_back_question(question)
+        results = self.vector_retriever.retrieve(
+            query=abstract_question,
+            top_k=top_k,
+            metadata_filter=metadata_filter
+        )
+        return results, abstract_question
+    def query(
+        self,
+        question: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        return_abstract_question: bool = False
+    ) -> Dict:
+        """
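+        Run the dual-track retrieval (no answer generation).
+        Args:
+            question: Original question
+            top_k: Number of results returned per track
+            metadata_filter: Optional metadata filter conditions
+            return_abstract_question: Whether to include the abstract question in the result
+        Returns:
+            Dict with the results of both retrieval tracks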
+        """
+        start_time = time.time()
+        if self.enable_parallel:
+            # Run both retrieval tracks in parallel
+            logger.info(f"🔄 並行執行雙軌檢索: '{question}'")
+            with ThreadPoolExecutor(max_workers=2) as executor:
+                direct_future = executor.submit(
+                    self._retrieve_direct, question, top_k, metadata_filter
+                )
+                step_back_future = executor.submit(
+                    self._retrieve_step_back, question, top_k, metadata_filter
+                )
+                specific_results = direct_future.result()
+                abstract_results, abstract_question = step_back_future.result()
+        else:
+            # Run sequentially
+            logger.info(f"🔄 串行執行雙軌檢索: '{question}'")
+            specific_results = self._retrieve_direct(question, top_k, metadata_filter)
+            abstract_results, abstract_question = self._retrieve_step_back(question, top_k, metadata_filter)
+        elapsed_time = time.time() - start_time
+        logger.info(
+            f"✅ 雙軌檢索完成（耗時: {elapsed_time:.2f}s）\n"
+            f"   具體事實: {len(specific_results)} 個結果\n"
+            f"   抽象原理: {len(abstract_results)} 個結果"
+        )
+        return {
+            "specific_context": specific_results,
+            "abstract_context": abstract_results,
+            "abstract_question": abstract_question if return_abstract_question else None,
+            "question": question,
+            "elapsed_time": elapsed_time
+        }
+    def generate_answer(
+        self,
+        question: str,
+        formatter: PromptFormatter,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        document_type: str = "general",
+        return_abstract_question: bool = False
+    ) -> Dict:
+        """
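+        Full Step-back RAG flow: dual-track retrieval -> answer generation.
+        Args:
+            question: Original question
+            formatter: Prompt formatter
+            top_k: Number of documents per track used to generate the answer
+            metadata_filter: Optional metadata filter conditions
+            document_type: Document type ("paper", "cv", "general")
+            return_abstract_question: Whether to include the abstract question in the result
+        Returns:
+            Dict with the retrieval results, the generated answer, and timing statistics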
+        """
+        start_time = time.time()
+        # Step 1: dual-track retrieval
+        retrieval_result = self.query(
+            question=question,
+            top_k=top_k,
+            metadata_filter=metadata_filter,
+            return_abstract_question=return_abstract_question
+        )
+        specific_results = retrieval_result["specific_context"]
+        abstract_results = retrieval_result["abstract_context"]
+        if not specific_results and not abstract_results:
+            return {
+                **retrieval_result,
+                "answer": "抱歉，未找到相關文檔來回答此問題。",
+                "formatted_context": None,
+                "answer_time": 0.0,
+                "total_time": retrieval_result["elapsed_time"]
+            }
+        # Step 2: format the context of both tracks
+        specific_context = formatter.format_context(
+            specific_results,
+            document_type=document_type
+        ) if specific_results else "未找到相關的具體事實資料。"
+        abstract_context = formatter.format_context(
+            abstract_results,
+            document_type=document_type
+        ) if abstract_results else "未找到相關的基礎原理資料。"
+        # Step 3: build the fused prompt (the key step)
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            final_prompt = f"""你是一個資深專家。請結合以下兩類資訊來回答使用者的具體問題。
+【基礎原理與背景】
+{abstract_context}
+【具體事實資料】
+{specific_context}
+使用者問題：{question}
+請根據原理推導並結合事實，給出一個專業且具備邏輯的回答：
+"""
+        else:
+            final_prompt = f"""You are a senior expert. Please answer the user's specific question by combining the following two types of information.
+【Fundamental Principles and Background】
+{abstract_context}
+【Specific Facts and Data】
+{specific_context}
+User question: {question}
+Please provide a professional and logical answer based on principles and facts:
+"""
+        # Step 4: generate the answer
+        logger.info("🤖 生成回答中...")
+        answer_start = time.time()
+        try:
+            answer = self.llm.generate(
+                prompt=final_prompt,
+                temperature=self.answer_temperature,
+                max_tokens=2048
+            )
+            answer_time = time.time() - answer_start
+            logger.info(f"✅ 回答生成完成（耗時: {answer_time:.2f}s）")
+        except Exception as e:
+            logger.error(f"❌ 生成回答時出錯: {e}")
+            answer = f"生成回答時出錯: {e}"
+            answer_time = time.time() - answer_start
+        total_time = time.time() - start_time
+        return {
+            **retrieval_result,
+            "answer": answer,
+            "formatted_context": {
+                "specific": specific_context,
+                "abstract": abstract_context
+            },
+            "answer_time": answer_time,
+            "total_time": total_time
+        }
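
The dual-track pattern in `StepBackRAG.query` can be tried without a vector store or an LLM. The following is a minimal sketch, assuming two stand-in callables: `fake_retrieve` and `fake_abstract` are hypothetical and only mimic the shapes returned by `VectorRetriever.retrieve` and `_generate_step_back_question`. It shows the direct track and the step-back track submitted to a two-worker `ThreadPoolExecutor` and joined with `result()`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

def dual_track_retrieve(
    question: str,
    retrieve: Callable[[str], List[Dict]],
    make_abstract: Callable[[str], str],
) -> Tuple[List[Dict], List[Dict], str]:
    """Run direct retrieval and step-back retrieval concurrently."""

    def step_back_track() -> Tuple[List[Dict], str]:
        # The abstract question must exist before its retrieval can start,
        # so both steps stay inside the same worker.
        abstract_question = make_abstract(question)
        return retrieve(abstract_question), abstract_question

    with ThreadPoolExecutor(max_workers=2) as executor:
        direct_future = executor.submit(retrieve, question)
        step_back_future = executor.submit(step_back_track)
        specific = direct_future.result()
        abstract, abstract_question = step_back_future.result()
    return specific, abstract, abstract_question

# Hypothetical stand-ins for the retriever and the LLM call
def fake_retrieve(query: str) -> List[Dict]:
    return [{"content": f"hit for: {query}", "metadata": {}}]

def fake_abstract(question: str) -> str:
    return f"What principles underlie: {question}"

specific, abstract, abstract_q = dual_track_retrieve(
    "Why does X fail at 90C?", fake_retrieve, fake_abstract
)
print(abstract_q)
```

Because the direct track needs no LLM call, it usually finishes while the step-back track is still waiting on the abstract question, which is where the parallel mode would be expected to save time.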

src/subquery_rag.py ADDED Viewed

	@@ -0,0 +1,361 @@

+"""
+Sub-query Decomposition RAG: decomposes a complex question into sub-questions before retrieval
+"""
+from typing import List, Dict, Optional
+from .retrievers.reranker import RAGPipeline
+from .prompt_formatter import PromptFormatter
+from .llm_integration import OllamaLLM
+import hashlib
+import time
+import logging
+from concurrent.futures import ThreadPoolExecutor, as_completed
+logger = logging.getLogger(__name__)
+class SubQueryDecompositionRAG:
+    """RAG system based on sub-question decomposition."""
+    def __init__(
+        self,
+        rag_pipeline: RAGPipeline,
+        llm: OllamaLLM,
+        max_sub_queries: int = 3,
+        top_k_per_subquery: int = 5,
+        enable_parallel: bool = True
+    ):
+        """
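+        Initialize Sub-query Decomposition RAG.
+        Args:
+            rag_pipeline: Existing RAG pipeline instance
+            llm: LLM instance (used to generate sub-questions)
+            max_sub_queries: Maximum number of sub-questions to generate
+            top_k_per_subquery: Number of results retrieved per sub-question
+            enable_parallel: Whether to process sub-queries in parallel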
+        """
+        self.rag_pipeline = rag_pipeline
+        self.llm = llm
+        self.max_sub_queries = max_sub_queries
+        self.top_k_per_subquery = top_k_per_subquery
+        self.enable_parallel = enable_parallel
+    def _generate_sub_queries(self, question: str) -> List[str]:
+        """
+        Decompose the original question into sub-questions.
+        Args:
+            question: Original question
+        Returns:
+            List of sub-questions
+        """
+        # Detect the language
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            prompt = f"""你是一個專業助理。請將以下原始問題拆解成最多 {self.max_sub_queries} 個具體的子問題，以便進行資料搜尋。
+每個子問題應專注於原始問題的一個特定面向。請以換行符號分隔問題。
+原始問題: {question}
+子問題清單:"""
+        else:
+            prompt = f"""You are a professional assistant. Please decompose the following original question into at most {self.max_sub_queries} specific sub-questions for information retrieval.
+Each sub-question should focus on a specific aspect of the original question. Please separate questions with newlines.
+Original question: {question}
+Sub-question list:"""
+        try:
+            response = self.llm.generate(
+                prompt=prompt,
+                temperature=0.3,  # Lower temperature for more stable output
+                max_tokens=500
+            )
+            # Parse the sub-questions
+            sub_queries = [
+                q.strip()
+                for q in response.strip().split("\n")
+                if q.strip() and not q.strip().startswith("#")
+            ]
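+            # Remove numbering prefixes such as "1. " or "1) "
+            cleaned_queries = []
+            for q in sub_queries:
+                # Strip leading digits only when a list delimiter follows,
+                # so a sub-question that starts with a number (e.g. a year) stays intact
+                stripped = q.lstrip("0123456789")
+                if stripped != q and stripped[:1] in (".", ")", "、"):
+                    q = stripped[1:]
+                q = q.strip()
+                if q:
+                    cleaned_queries.append(q)
+            # Cap the number of sub-questions
+            cleaned_queries = cleaned_queries[:self.max_sub_queries]
+            # Fall back to the original question if nothing was generated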
+            if not cleaned_queries:
+                logger.warning("⚠️  未生成子問題，使用原始問題")
+                cleaned_queries = [question]
+            return cleaned_queries
+        except Exception as e:
+            logger.error(f"⚠️  生成子問題時出錯: {e}")
+            # Fall back to the original question
+            return [question]
+    def _get_doc_id(self, doc: Dict) -> str:
+        """
+        Build a unique identifier for a document.
+        Args:
+            doc: Document dict
+        Returns:
+            Unique ID
+        """
+        metadata = doc.get("metadata", {})
+        content = doc.get("content", "")
+        # Prefer a unique identifier from metadata when available
+        if "arxiv_id" in metadata and "chunk_index" in metadata:
+            return f"{metadata['arxiv_id']}_{metadata['chunk_index']}"
+        elif "file_path" in metadata and "chunk_index" in metadata:
+            return f"{metadata['file_path']}_{metadata['chunk_index']}"
+        else:
+            # Fall back to a hash of the content
+            content_hash = hashlib.md5(content.encode()).hexdigest()[:16]
+            return f"doc_{content_hash}"
+    def _retrieve_for_subquery(
+        self,
+        sub_query: str,
+        metadata_filter: Optional[Dict] = None
+    ) -> List[Dict]:
+        """
+        Retrieve for a single sub-question.
+        Args:
+            sub_query: Sub-question
+            metadata_filter: Optional metadata filter conditions
+        Returns:
+            List of retrieval results
+        """
+        try:
+            results = self.rag_pipeline.query(
+                text=sub_query,
+                top_k=self.top_k_per_subquery,
+                metadata_filter=metadata_filter,
+                enable_rerank=True
+            )
+            return results
+        except Exception as e:
+            logger.error(f"⚠️  檢索子問題 '{sub_query}' 時出錯: {e}")
+            return []
+    def _get_unique_documents(
+        self,
+        sub_queries: List[str],
+        metadata_filter: Optional[Dict] = None
+    ) -> List[Dict]:
+        """
+        Retrieve for every sub-question and remove duplicate documents.
+        Args:
+            sub_queries: List of sub-questions
+            metadata_filter: Optional metadata filter conditions
+        Returns:
+            Deduplicated list of documents
+        """
+        unique_docs = {}
+        if self.enable_parallel and len(sub_queries) > 1:
+            # Process the sub-queries in parallel
+            logger.info(f"🔄 並行處理 {len(sub_queries)} 個子查詢...")
+            with ThreadPoolExecutor(max_workers=min(len(sub_queries), 5)) as executor:
+                future_to_query = {
+                    executor.submit(self._retrieve_for_subquery, q, metadata_filter): q
+                    for q in sub_queries
+                }
+                for future in as_completed(future_to_query):
+                    sub_query = future_to_query[future]
+                    try:
+                        docs = future.result()
+                        logger.debug(f"✅ 子問題 '{sub_query}' 找到 {len(docs)} 個結果")
+                        for doc in docs:
+                            doc_id = self._get_doc_id(doc)
+                            if doc_id not in unique_docs:
+                                unique_docs[doc_id] = doc
+                            else:
+                                # Already present: keep the higher-scoring copy
+                                existing_score = unique_docs[doc_id].get(
+                                    'rerank_score',
+                                    unique_docs[doc_id].get('hybrid_score', 0)
+                                )
+                                new_score = doc.get(
+                                    'rerank_score',
+                                    doc.get('hybrid_score', 0)
+                                )
+                                if new_score > existing_score:
+                                    unique_docs[doc_id] = doc
+                    except Exception as e:
+                        logger.error(f"⚠️  處理子問題 '{sub_query}' 時出錯: {e}")
+        else:
+            # Process sequentially
+            logger.info(f"🔄 串行處理 {len(sub_queries)} 個子查詢...")
+            for sub_query in sub_queries:
+                docs = self._retrieve_for_subquery(sub_query, metadata_filter)
+                logger.debug(f"✅ 子問題 '{sub_query}' 找到 {len(docs)} 個結果")
+                for doc in docs:
+                    doc_id = self._get_doc_id(doc)
+                    if doc_id not in unique_docs:
+                        unique_docs[doc_id] = doc
+                    else:
+                        # Keep the higher-scoring copy
+                        existing_score = unique_docs[doc_id].get(
+                            'rerank_score',
+                            unique_docs[doc_id].get('hybrid_score', 0)
+                        )
+                        new_score = doc.get(
+                            'rerank_score',
+                            doc.get('hybrid_score', 0)
+                        )
+                        if new_score > existing_score:
+                            unique_docs[doc_id] = doc
+        # Sort by score
+        result_list = list(unique_docs.values())
+        result_list.sort(
+            key=lambda x: x.get('rerank_score', x.get('hybrid_score', 0)),
+            reverse=True
+        )
+        return result_list
+    def query(
+        self,
+        question: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        return_sub_queries: bool = False
+    ) -> Dict:
+        """
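+        Run a Sub-query Decomposition RAG query.
+        Args:
+            question: Original question
+            top_k: Number of results to return
+            metadata_filter: Optional metadata filter conditions
+            return_sub_queries: Whether to include the sub-question list in the result
+        Returns:
+            Dict with the retrieval results and statistics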
+        """
+        start_time = time.time()
+        # Step 1: generate the sub-questions
+        logger.info(f"🔍 拆解問題: '{question}'")
+        sub_queries = self._generate_sub_queries(question)
+        logger.info(f"✅ 生成 {len(sub_queries)} 個子問題:")
+        for i, sq in enumerate(sub_queries, 1):
+            logger.info(f"   {i}. {sq}")
+        # Step 2: retrieve and deduplicate
+        logger.info("📚 檢索相關文檔...")
+        docs = self._get_unique_documents(sub_queries, metadata_filter)
+        logger.info(f"✅ 找到 {len(docs)} 個唯一文檔（去重後）")
+        # Step 3: return the first top_k results
+        final_results = docs[:top_k]
+        elapsed_time = time.time() - start_time
+        result = {
+            "results": final_results,
+            "total_docs_found": len(docs),
+            "sub_queries": sub_queries if return_sub_queries else None,
+            "elapsed_time": elapsed_time
+        }
+        return result
+    def generate_answer(
+        self,
+        question: str,
+        formatter: PromptFormatter,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        document_type: str = "general",
+        return_sub_queries: bool = False
+    ) -> Dict:
+        """
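+        Full Sub-query Decomposition RAG flow: retrieval + answer generation.
+        Args:
+            question: Original question
+            formatter: Prompt formatter
+            top_k: Number of top results used to generate the answer
+            metadata_filter: Optional metadata filter conditions
+            document_type: Document type ("paper", "cv", "general")
+            return_sub_queries: Whether to include the sub-question list in the result
+        Returns:
+            Dict with the retrieval results, the generated answer, and statistics
+        """
+        # Retrieve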
+        retrieval_result = self.query(
+            question=question,
+            top_k=top_k,
+            metadata_filter=metadata_filter,
+            return_sub_queries=return_sub_queries
+        )
+        if not retrieval_result["results"]:
+            return {
+                **retrieval_result,
+                "answer": "抱歉，未找到相關文檔來回答此問題。",
+                "formatted_context": None,
+                "answer_time": 0.0,
+                "total_time": retrieval_result["elapsed_time"]
+            }
+        # Format the context
+        formatted_context = formatter.format_context(
+            retrieval_result["results"],
+            document_type=document_type
+        )
+        # Build the prompt
+        prompt = formatter.create_prompt(
+            question,
+            formatted_context,
+            document_type=document_type
+        )
+        # Generate the answer
+        logger.info("🤖 生成回答中...")
+        answer_start = time.time()
+        try:
+            answer = self.llm.generate(
+                prompt=prompt,
+                temperature=0.7,
+                max_tokens=2048
+            )
+            answer_time = time.time() - answer_start
+            logger.info(f"✅ 回答生成完成（耗時: {answer_time:.2f}s）")
+        except Exception as e:
+            logger.error(f"❌ 生成回答時出錯: {e}")
+            answer = f"生成回答時出錯: {e}"
+            answer_time = time.time() - answer_start
+        return {
+            **retrieval_result,
+            "answer": answer,
+            "formatted_context": formatted_context,
+            "answer_time": answer_time,
+            "total_time": retrieval_result["elapsed_time"] + answer_time
+        }
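
`_get_unique_documents` repeats the same "keep the higher-scoring copy" logic in its parallel and sequential branches. One possible way to factor it out is sketched below; `score_of`, `merge_unique`, and `chunk_id` are hypothetical helpers, not functions from the repository, and the sample hits only mimic the `rerank_score` / `hybrid_score` fields used above.

```python
from typing import Callable, Dict, Iterable, List

def score_of(doc: Dict) -> float:
    """Prefer the rerank score, then the hybrid score, then 0."""
    return doc.get("rerank_score", doc.get("hybrid_score", 0))

def merge_unique(
    result_lists: Iterable[List[Dict]],
    doc_id: Callable[[Dict], str],
) -> List[Dict]:
    """Merge several result lists, keeping the best-scoring copy of each document."""
    best: Dict[str, Dict] = {}
    for docs in result_lists:
        for doc in docs:
            key = doc_id(doc)
            if key not in best or score_of(doc) > score_of(best[key]):
                best[key] = doc
    # Highest score first
    return sorted(best.values(), key=score_of, reverse=True)

def chunk_id(doc: Dict) -> str:
    """Hypothetical ID scheme: file path plus chunk index."""
    meta = doc["metadata"]
    return f"{meta['file_path']}_{meta['chunk_index']}"

# Two sub-queries that both hit chunk 0 of a.pdf with different scores
hits_a = [{"metadata": {"file_path": "a.pdf", "chunk_index": 0}, "rerank_score": 0.4}]
hits_b = [
    {"metadata": {"file_path": "a.pdf", "chunk_index": 0}, "rerank_score": 0.9},
    {"metadata": {"file_path": "b.pdf", "chunk_index": 2}, "hybrid_score": 0.6},
]

merged = merge_unique([hits_a, hits_b], chunk_id)
print([chunk_id(d) for d in merged])
```

With a helper of this shape, both branches would only need to collect each sub-query's result list and hand the lists to a single merge step.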

src/triple_hybrid_rag.py ADDED Viewed

	@@ -0,0 +1,467 @@

+"""
+Triple Hybrid RAG: fuses SubQuery + HyDE + Step-back Prompting
+Combines the strengths of the three techniques in a single RAG system
+"""
+from typing import List, Dict, Optional
+from .retrievers.reranker import RAGPipeline
+from .retrievers.vector_retriever import VectorRetriever
+from .prompt_formatter import PromptFormatter
+from .llm_integration import OllamaLLM
+import hashlib
+import time
+import logging
+from concurrent.futures import ThreadPoolExecutor, as_completed
+logger = logging.getLogger(__name__)
+class TripleHybridRAG:
+    """Triple hybrid RAG system that fuses SubQuery + HyDE + Step-back."""
+    def __init__(
+        self,
+        rag_pipeline: RAGPipeline,
+        vector_retriever: VectorRetriever,
+        llm: OllamaLLM,
+        max_sub_queries: int = 3,
+        top_k_per_subquery: int = 5,
+        hypothetical_length: int = 200,
+        temperature_subquery: float = 0.3,
+        temperature_hyde: float = 0.7,
+        temperature_stepback: float = 0.3,
+        answer_temperature: float = 0.7,
+        enable_parallel: bool = True
+    ):
+        """
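+        Initialize the triple hybrid RAG.
+        Args:
+            rag_pipeline: RAG pipeline instance
+            vector_retriever: Vector retriever
+            llm: LLM instance
+            max_sub_queries: Maximum number of sub-questions to generate
+            top_k_per_subquery: Number of results retrieved per sub-question
+            hypothetical_length: Target length of the hypothetical document (in characters)
+            temperature_subquery: Temperature for generating sub-questions (lower, more stable)
+            temperature_hyde: Temperature for generating hypothetical documents (higher, more domain terminology)
+            temperature_stepback: Temperature for generating the abstract question (lower, more stable)
+            answer_temperature: Temperature for generating the answer
+            enable_parallel: Whether to process in parallel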
+        """
+        self.rag_pipeline = rag_pipeline
+        self.vector_retriever = vector_retriever
+        self.llm = llm
+        self.max_sub_queries = max_sub_queries
+        self.top_k_per_subquery = top_k_per_subquery
+        self.hypothetical_length = hypothetical_length
+        self.temperature_subquery = temperature_subquery
+        self.temperature_hyde = temperature_hyde
+        self.temperature_stepback = temperature_stepback
+        self.answer_temperature = answer_temperature
+        self.enable_parallel = enable_parallel
+    def _generate_sub_queries(self, question: str) -> List[str]:
+        """Generate sub-questions (SubQuery)."""
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            prompt = f"""你是一個專業助理。請將以下原始問題拆解成最多 {self.max_sub_queries} 個具體的子問題，以便進行資料搜尋。
+每個子問題應專注於原始問題的一個特定面向。請以換行符號分隔問題。
+原始問題: {question}
+子問題清單:"""
+        else:
+            prompt = f"""You are a professional assistant. Please decompose the following original question into at most {self.max_sub_queries} specific sub-questions for information retrieval.
+Each sub-question should focus on a specific aspect of the original question. Please separate questions with newlines.
+Original question: {question}
+Sub-question list:"""
+        try:
+            response = self.llm.generate(
+                prompt=prompt,
+                temperature=self.temperature_subquery,
+                max_tokens=500
+            )
+            sub_queries = [
+                q.strip()
+                for q in response.strip().split("\n")
+                if q.strip() and not q.strip().startswith("#")
+            ]
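+            # Remove numbering prefixes
+            cleaned_queries = []
+            for q in sub_queries:
+                # Strip leading digits only when a list delimiter follows
+                stripped = q.lstrip("0123456789")
+                if stripped != q and stripped[:1] in (".", ")", "、"):
+                    q = stripped[1:]
+                q = q.strip()
+                if q:
+                    cleaned_queries.append(q)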
+            cleaned_queries = cleaned_queries[:self.max_sub_queries]
+            if not cleaned_queries:
+                logger.warning("⚠️  未生成子問題，使用原始問題")
+                cleaned_queries = [question]
+            return cleaned_queries
+        except Exception as e:
+            logger.error(f"⚠️  生成子問題時出錯: {e}")
+            return [question]
+    def _generate_hypothetical_document(self, sub_query: str) -> str:
+        """Generate a hypothetical document for a sub-question (HyDE)."""
+        is_chinese = PromptFormatter.detect_language(sub_query) == "zh"
+        if is_chinese:
+            prompt = f"""請針對以下問題，寫出一段約 {self.hypothetical_length} 字的專業技術檔案內容。
+這段內容應包含該領域常見的專業術語與原理說明，以便用於後續的語義檢索。
+請使用專業的術語和概念，即使你對某些細節不確定，也要包含相關的專業詞彙。
+問題: {sub_query}
+專業技術內容："""
+        else:
+            prompt = f"""Please write a professional technical document of approximately {self.hypothetical_length} words in response to the following question.
+This content should include common professional terminology and principle explanations in this field, to be used for subsequent semantic retrieval.
+Please use professional terms and concepts, and include relevant professional vocabulary even if you are uncertain about some details.
+Question: {sub_query}
+Professional technical content:"""
+        try:
+            hypothetical_doc = self.llm.generate(
+                prompt=prompt,
+                temperature=self.temperature_hyde,
+                max_tokens=500
+            )
+            hypothetical_doc = hypothetical_doc.strip()
+            if not hypothetical_doc:
+                logger.warning(f"⚠️  子問題 '{sub_query}' 的假設性文檔為空，使用子問題本身")
+                return sub_query
+            return hypothetical_doc
+        except Exception as e:
+            logger.error(f"⚠️  生成假設性文檔時出錯: {e}")
+            return sub_query
+    def _generate_step_back_question(self, question: str) -> str:
+        """Generate the step-back (abstract) question."""
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            prompt = f"""你是一個資深專家。請將以下具體問題轉換為一個更抽象、更基礎的原理性問題。
+這個抽象問題應該幫助理解該領域的基礎概念和原理，而不是直接回答具體問題。
+具體問題: {question}
+請生成一個抽象問題，用於檢索相關的原理和背景知識：
+"""
+        else:
+            prompt = f"""You are a senior expert. Please convert the following specific question into a more abstract, fundamental question about principles and concepts.
+This abstract question should help understand the basic concepts and principles in this field, rather than directly answering the specific question.
+Specific question: {question}
+Please generate an abstract question for retrieving relevant principles and background knowledge:
+"""
+        try:
+            abstract_question = self.llm.generate(
+                prompt=prompt,
+                temperature=self.temperature_stepback,
+                max_tokens=200
+            )
+            abstract_question = abstract_question.strip()
+            if not abstract_question:
+                logger.warning("⚠️  生成的抽象問題為空，使用原始問題")
+                return question
+            return abstract_question
+        except Exception as e:
+            logger.error(f"⚠️  Error generating abstract question: {e}")
+            return question
+    def _get_doc_id(self, doc: Dict) -> str:
+        """Build a unique identifier for a document."""
+        metadata = doc.get("metadata", {})
+        content = doc.get("content", "")
+        if "arxiv_id" in metadata and "chunk_index" in metadata:
+            return f"{metadata['arxiv_id']}_{metadata['chunk_index']}"
+        elif "file_path" in metadata and "chunk_index" in metadata:
+            return f"{metadata['file_path']}_{metadata['chunk_index']}"
+        else:
+            content_hash = hashlib.md5(content.encode()).hexdigest()[:16]
+            return f"doc_{content_hash}"
+    def _process_subquery_with_hyde(
+        self,
+        sub_query: str,
+        metadata_filter: Optional[Dict] = None
+    ) -> tuple:
+        """Process one sub-query: generate a hypothetical document and retrieve with it."""
+        try:
+            hypothetical_doc = self._generate_hypothetical_document(sub_query)
+            results = self.vector_retriever.retrieve(
+                query=hypothetical_doc,
+                top_k=self.top_k_per_subquery,
+                metadata_filter=metadata_filter
+            )
+            return results, hypothetical_doc
+        except Exception as e:
+            logger.error(f"⚠️  Error processing sub-query '{sub_query}': {e}")
+            return [], ""
+    def query(
+        self,
+        question: str,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        return_sub_queries: bool = False,
+        return_hypothetical: bool = False,
+        return_abstract_question: bool = False
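+    ) -> Dict:
+        """
+        Run triple-hybrid RAG retrieval.
+        Pipeline:
+        1. Decompose the question into sub-queries (SubQuery).
+        2. Generate a hypothetical document for each sub-query and retrieve with it (HyDE).
+        3. Retrieve with the original question directly (specific facts).
+        4. Generate an abstract question and retrieve with it (Step-back, abstract principles).
+        5. Merge all results and deduplicate.
+        """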
+        start_time = time.time()
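+        # Step 1: generate sub-queries
+        logger.info(f"🔍 [SubQuery] Decomposing question: '{question}'")
+        sub_queries = self._generate_sub_queries(question)
+        logger.info(f"✅ Generated {len(sub_queries)} sub-queries")
+        # Step 2: generate a hypothetical document for each sub-query and retrieve with it (HyDE)
+        logger.info("📚 [HyDE] Generating a hypothetical document for each sub-query and retrieving...")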
+        subquery_results = []
+        hypothetical_docs = {}
+        if self.enable_parallel and len(sub_queries) > 1:
+            with ThreadPoolExecutor(max_workers=min(len(sub_queries), 5)) as executor:
+                future_to_query = {
+                    executor.submit(self._process_subquery_with_hyde, sq, metadata_filter): sq
+                    for sq in sub_queries
+                }
+                for future in as_completed(future_to_query):
+                    sub_query = future_to_query[future]
+                    try:
+                        results, hypo_doc = future.result()
+                        hypothetical_docs[sub_query] = hypo_doc
+                        subquery_results.extend(results)
+                    except Exception as e:
+                        logger.error(f"⚠️  Error processing sub-query '{sub_query}': {e}")
+        else:
+            for sub_query in sub_queries:
+                results, hypo_doc = self._process_subquery_with_hyde(sub_query, metadata_filter)
+                hypothetical_docs[sub_query] = hypo_doc
+                subquery_results.extend(results)
+        # Step 3: Step-back dual-track retrieval (specific question + abstract question)
+        logger.info("🔍 [Step-back] Running dual-track retrieval...")
+        if self.enable_parallel:
+            with ThreadPoolExecutor(max_workers=2) as executor:
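+                # Pass keyword arguments so the call matches the sequential branch
+                # and does not depend on the positional order of retrieve().
+                direct_future = executor.submit(
+                    self.vector_retriever.retrieve,
+                    query=question, top_k=top_k, metadata_filter=metadata_filter
+                )
+                abstract_question = self._generate_step_back_question(question)
+                step_back_future = executor.submit(
+                    self.vector_retriever.retrieve,
+                    query=abstract_question, top_k=top_k, metadata_filter=metadata_filter
+                )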
+                specific_results = direct_future.result()
+                abstract_results = step_back_future.result()
+        else:
+            specific_results = self.vector_retriever.retrieve(
+                query=question,
+                top_k=top_k,
+                metadata_filter=metadata_filter
+            )
+            abstract_question = self._generate_step_back_question(question)
+            abstract_results = self.vector_retriever.retrieve(
+                query=abstract_question,
+                top_k=top_k,
+                metadata_filter=metadata_filter
+            )
+        # Step 4: merge all results and deduplicate
+        logger.info("🔄 Merging and deduplicating all retrieval results...")
+        all_results = subquery_results + specific_results + abstract_results
+        unique_docs = {}
+        for doc in all_results:
+            doc_id = self._get_doc_id(doc)
+            if doc_id not in unique_docs:
+                unique_docs[doc_id] = doc
+            else:
+                # Keep the copy with the higher score
+                existing_score = unique_docs[doc_id].get('score', 0)
+                new_score = doc.get('score', 0)
+                if new_score > existing_score:
+                    unique_docs[doc_id] = doc
+        # Sort by score and return the top_k results
+        result_list = list(unique_docs.values())
+        result_list.sort(key=lambda x: x.get('score', 0), reverse=True)
+        final_results = result_list[:top_k]
+        elapsed_time = time.time() - start_time
+        logger.info(
+            f"✅ Triple-hybrid retrieval finished (elapsed: {elapsed_time:.2f}s)\n"
+            f"   Sub-query retrieval: {len(subquery_results)} results\n"
+            f"   Specific facts: {len(specific_results)} results\n"
+            f"   Abstract principles: {len(abstract_results)} results\n"
+            f"   Total after deduplication: {len(result_list)}, returning top {len(final_results)}"
+        )
+        return {
+            "results": final_results,
+            "total_docs_found": len(result_list),
+            "sub_queries": sub_queries if return_sub_queries else None,
+            "hypothetical_documents": hypothetical_docs if return_hypothetical else None,
+            "abstract_question": abstract_question if return_abstract_question else None,
+            "subquery_results": subquery_results,
+            "specific_context": specific_results,
+            "abstract_context": abstract_results,
+            "question": question,
+            "elapsed_time": elapsed_time
+        }
+    def generate_answer(
+        self,
+        question: str,
+        formatter: PromptFormatter,
+        top_k: int = 5,
+        metadata_filter: Optional[Dict] = None,
+        document_type: str = "general",
+        return_sub_queries: bool = False,
+        return_hypothetical: bool = False,
+        return_abstract_question: bool = False
+    ) -> Dict:
+        """
+        Full triple-hybrid RAG flow: retrieval followed by answer generation.
+        """
+        start_time = time.time()
+        # Retrieval
+        retrieval_result = self.query(
+            question=question,
+            top_k=top_k,
+            metadata_filter=metadata_filter,
+            return_sub_queries=return_sub_queries,
+            return_hypothetical=return_hypothetical,
+            return_abstract_question=return_abstract_question
+        )
+        if not retrieval_result["results"]:
+            return {
+                **retrieval_result,
+                "answer": "抱歉，未找到相關文檔來回答此問題。",
+                "formatted_context": None,
+                "answer_time": 0.0,
+                "total_time": retrieval_result["elapsed_time"]
+            }
+        # Format the three kinds of context
+        subquery_context = formatter.format_context(
+            retrieval_result["subquery_results"][:top_k],
+            document_type=document_type
+        ) if retrieval_result.get("subquery_results") else "未找到相關的子問題檢索結果。"
+        specific_context = formatter.format_context(
+            retrieval_result["specific_context"],
+            document_type=document_type
+        ) if retrieval_result.get("specific_context") else "未找到相關的具體事實資料。"
+        abstract_context = formatter.format_context(
+            retrieval_result["abstract_context"],
+            document_type=document_type
+        ) if retrieval_result.get("abstract_context") else "未找到相關的基礎原理資料。"
+        # Build the fused prompt (the key step)
+        is_chinese = PromptFormatter.detect_language(question) == "zh"
+        if is_chinese:
+            final_prompt = f"""你是一個資深專家。請結合以下三類資訊來回答使用者的具體問題。
+【基礎原理與背景】（來自 Step-back 抽象問題檢索）
+{abstract_context}
+【具體事實資料】（來自直接問題檢索）
+{specific_context}
+【子問題相關資料】（來自 SubQuery + HyDE 檢索）
+{subquery_context}
+使用者問題：{question}
+請根據原理推導、結合具體事實，並參考子問題的相關資料，給出一個專業、全面且具備邏輯的回答：
+"""
+        else:
+            final_prompt = f"""You are a senior expert. Please answer the user's specific question by combining the following three types of information.
+【Fundamental Principles and Background】(from Step-back abstract question retrieval)
+{abstract_context}
+【Specific Facts and Data】(from direct question retrieval)
+{specific_context}
+【Sub-question Related Information】(from SubQuery + HyDE retrieval)
+{subquery_context}
+User question: {question}
+Please provide a professional, comprehensive, and logical answer based on principles, facts, and sub-question related information:
+"""
+        # Generate the answer
+        logger.info("🤖 Generating answer...")
+        answer_start = time.time()
+        try:
+            answer = self.llm.generate(
+                prompt=final_prompt,
+                temperature=self.answer_temperature,
+                max_tokens=2048
+            )
+            answer_time = time.time() - answer_start
+            logger.info(f"✅ Answer generated (elapsed: {answer_time:.2f}s)")
+        except Exception as e:
+            logger.error(f"❌ Error generating answer: {e}")
+            answer = f"生成回答時出錯: {e}"
+            answer_time = time.time() - answer_start
+        total_time = time.time() - start_time
+        return {
+            **retrieval_result,
+            "answer": answer,
+            "formatted_context": {
+                "subquery": subquery_context,
+                "specific": specific_context,
+                "abstract": abstract_context
+            },
+            "answer_time": answer_time,
+            "total_time": total_time
+        }
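
Note on the merge step in `query` above: the three result sets are concatenated, deduplicated by document ID with the higher-scoring copy kept, then sorted by score and cut to `top_k`. The standalone sketch below isolates that logic so it can be tried without the retriever or the LLM. It is not part of this commit; `doc_id` and `merge_results` are hypothetical names, and it assumes documents are plain dicts with `content`, `metadata` and `score` keys, as the diff suggests.

```python
import hashlib
from typing import Dict, List


def doc_id(doc: Dict) -> str:
    """Source id plus chunk index when available, else a short content hash."""
    meta = doc.get("metadata", {})
    # Same precedence as _get_doc_id: arxiv_id first, then file_path.
    for key in ("arxiv_id", "file_path"):
        if key in meta and "chunk_index" in meta:
            return f"{meta[key]}_{meta['chunk_index']}"
    digest = hashlib.md5(doc.get("content", "").encode("utf-8")).hexdigest()[:16]
    return f"doc_{digest}"


def merge_results(*result_lists: List[Dict], top_k: int = 5) -> List[Dict]:
    """Merge result lists, keep the best-scoring copy per ID, return the top_k."""
    best: Dict[str, Dict] = {}
    for results in result_lists:
        for doc in results:
            key = doc_id(doc)
            # Missing scores are treated as 0, as in the original code.
            if key not in best or doc.get("score", 0) > best[key].get("score", 0):
                best[key] = doc
    ranked = sorted(best.values(), key=lambda d: d.get("score", 0), reverse=True)
    return ranked[:top_k]
```

One thing this makes visible: scores from the HyDE, direct and Step-back retrievals are compared directly, which is only meaningful if all three come from the same retriever and the same similarity metric, as they do here.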

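Note on the parallel HyDE branch in `query`: each sub-query is submitted to a thread pool, and results are collected as they complete, so the order of `subquery_results` is not deterministic. A minimal sketch of that fan-out pattern follows, assuming a worker that returns a `(documents, hypothetical_doc)` pair like `_process_subquery_with_hyde`. The names `fan_out` and `worker` are hypothetical and not taken from the commit.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger(__name__)


def fan_out(
    sub_queries: List[str],
    worker: Callable[[str], Tuple[List[Dict], str]],
    max_workers: int = 5,
) -> Tuple[List[Dict], Dict[str, str]]:
    """Run worker(sub_query) concurrently and gather documents and hypothetical docs."""
    results: List[Dict] = []
    hypothetical_docs: Dict[str, str] = {}
    if not sub_queries:
        # ThreadPoolExecutor rejects max_workers=0, so return early.
        return results, hypothetical_docs
    with ThreadPoolExecutor(max_workers=min(len(sub_queries), max_workers)) as pool:
        future_to_query = {pool.submit(worker, sq): sq for sq in sub_queries}
        for future in as_completed(future_to_query):
            sub_query = future_to_query[future]
            try:
                docs, hypo_doc = future.result()
            except Exception as exc:
                # A failed sub-query is logged and skipped, not fatal.
                logger.error("Error processing sub-query %r: %s", sub_query, exc)
                continue
            hypothetical_docs[sub_query] = hypo_doc
            results.extend(docs)
    return results, hypothetical_docs
```

Because completion order varies between runs, any downstream step that depends on ordering should sort explicitly, which the score-based sort in `query` already does.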
uv.lock CHANGED Viewed

@@ -179,6 +179,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/7f/9c/36c5c37947ebfb8c7f22e0eb6e4d188ee2d53aa3880f3f2744fb894f0cb1/anyio-4.12.0-py3-none-any.whl", hash = "sha256:dad2376a628f98eeca4881fc56cd06affd18f659b17a747d3ff0307ced94b1bb", size = 113362, upload-time = "2025-11-28T23:36:57.897Z" },
 ]
 [[package]]
 name = "attrs"
 version = "25.4.0"
@@ -808,6 +821,9 @@ version = "0.1.0"
 source = { virtual = "." }
 dependencies = [
     { name = "accelerate" },
     { name = "einops" },
     { name = "fastapi" },
     { name = "google-api-python-client" },
@@ -820,9 +836,12 @@ dependencies = [
     { name = "langchain" },
     { name = "langchain-chroma" },
     { name = "langchain-community" },
     { name = "langchain-google-genai" },
     { name = "langchain-groq" },
     { name = "langchain-tavily" },
     { name = "langgraph" },
     { name = "langserve", extra = ["all"] },
     { name = "mcp" },
@@ -833,6 +852,7 @@ dependencies = [
     { name = "pillow" },
     { name = "pypdf" },
     { name = "python-dotenv" },
     { name = "sentence-transformers" },
     { name = "tavily-python" },
     { name = "torch" },
@@ -844,6 +864,9 @@ dependencies = [
 [package.metadata]
 requires-dist = [
     { name = "accelerate", specifier = ">=1.12.0" },
     { name = "einops", specifier = ">=0.8.1" },
     { name = "fastapi", specifier = ">=0.124.2" },
     { name = "google-api-python-client", specifier = ">=2.187.0" },
@@ -856,9 +879,12 @@ requires-dist = [
     { name = "langchain", specifier = ">=1.1.3" },
     { name = "langchain-chroma", specifier = ">=1.0.0" },
     { name = "langchain-community", specifier = ">=0.4.1" },
     { name = "langchain-google-genai", specifier = ">=4.0.0" },
     { name = "langchain-groq", specifier = ">=1.1.0" },
     { name = "langchain-tavily", specifier = ">=0.2.13" },
     { name = "langgraph", specifier = ">=1.0.4" },
     { name = "langserve", extras = ["all"], specifier = ">=0.3.3" },
     { name = "mcp", specifier = ">=1.24.0" },
@@ -869,6 +895,7 @@ requires-dist = [
     { name = "pillow", specifier = ">=12.0.0" },
     { name = "pypdf", specifier = ">=6.4.1" },
     { name = "python-dotenv", specifier = ">=1.2.1" },
     { name = "sentence-transformers", specifier = ">=5.2.0" },
     { name = "tavily-python", specifier = ">=0.7.14" },
     { name = "torch", specifier = ">=2.9.1" },
@@ -908,6 +935,7 @@ wheels = [
 ]
 [[package]]
 name = "dnspython"
 version = "2.8.0"
 source = { registry = "https://pypi.org/simple" }
@@ -932,6 +960,14 @@ source = { registry = "https://pypi.org/simple" }
 sdist = { url = "https://files.pythonhosted.org/packages/ae/b6/03bb70946330e88ffec97aefd3ea75ba575cb2e762061e0e62a213befee8/docutils-0.22.4.tar.gz", hash = "sha256:4db53b1fde9abecbb74d91230d32ab626d94f6badfc575d6db9194a49df29968", size = 2291750, upload-time = "2025-12-18T19:00:26.443Z" }
 wheels = [
     { url = "https://files.pythonhosted.org/packages/02/10/5da547df7a391dcde17f59520a231527b8571e6f46fc8efb02ccb370ab12/docutils-0.22.4-py3-none-any.whl", hash = "sha256:d0013f540772d1420576855455d050a2180186c91c15779301ac2ccb3eeb68de", size = 633196, upload-time = "2025-12-18T19:00:18.077Z" },
 ]
 [[package]]
@@ -990,6 +1026,7 @@ wheels = [
 ]
 [[package]]
 name = "fastmcp"
 version = "2.13.0"
 source = { registry = "https://pypi.org/simple" }
@@ -1012,6 +1049,17 @@ dependencies = [
 sdist = { url = "https://files.pythonhosted.org/packages/bc/3b/c30af894db2c3ec439d0e4168ba7ce705474cabdd0a599033ad9a19ad977/fastmcp-2.13.0.tar.gz", hash = "sha256:57f7b7503363e1babc0d1a13af18252b80366a409e1de85f1256cce66a4bee35", size = 7767346, upload-time = "2025-10-25T12:54:10.957Z" }
 wheels = [
     { url = "https://files.pythonhosted.org/packages/c0/7f/09942135f506953fc61bb81b9e5eaf50a8eea923b83d9135bd959168ef2d/fastmcp-2.13.0-py3-none-any.whl", hash = "sha256:bdff1399d3b7ebb79286edfd43eb660182432514a5ab8e4cbfb45f1d841d2aa0", size = 367134, upload-time = "2025-10-25T12:54:09.284Z" },
 ]
 [[package]]
@@ -2116,6 +2164,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/83/bd/9df897cbc98290bf71140104ee5b9777cf5291afb80333aa7da5a497339b/langchain_core-1.2.5-py3-none-any.whl", hash = "sha256:3255944ef4e21b2551facb319bfc426057a40247c0a05de5bd6f2fc021fbfa34", size = 484851, upload-time = "2025-12-22T23:45:30.525Z" },
 ]
 [[package]]
 name = "langchain-google-genai"
 version = "4.1.1"
@@ -2144,6 +2205,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/af/4a/3d6227a16fe9f79968414b50e50869519378b20653805e2e8fab283908e6/langchain_groq-1.1.1-py3-none-any.whl", hash = "sha256:1c6d5146f60205dcde09d7e47bb5291c295d3f0c7bcd2417e4d3a73a04bd1050", size = 19039, upload-time = "2025-12-12T22:00:45.86Z" },
 ]
 [[package]]
 name = "langchain-tavily"
 version = "0.2.16"
@@ -2989,6 +3063,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/be/9c/92789c596b8df838baa98fa71844d84283302f7604ed565dafe5a6b5041a/oauthlib-3.3.1-py3-none-any.whl", hash = "sha256:88119c938d2b8fb88561af5f6ee0eec8cc8d552b7bb1f712743136eb7523b7a1", size = 160065, upload-time = "2025-06-19T22:48:06.508Z" },
 ]
 [[package]]
 name = "onnxruntime"
 version = "1.23.2"
@@ -4101,6 +4188,18 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/f1/12/de94a39c2ef588c7e6455cfbe7343d3b2dc9d6b6b2f40c4c6565744c873d/pyyaml-6.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b", size = 149341, upload-time = "2025-09-25T21:32:56.828Z" },
 ]
 [[package]]
 name = "referencing"
 version = "0.36.2"
@@ -4585,6 +4684,12 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/a3/dc/17031897dae0efacfea57dfd3a82fdd2a2aeb58e0ff71b77b87e44edc772/setuptools-80.9.0-py3-none-any.whl", hash = "sha256:062d34222ad13e0cc312a4c02d73f059e86a4acbfbdea8f8f76b28c99f306922", size = 1201486, upload-time = "2025-05-27T00:56:49.664Z" },
 ]
 [[package]]
 name = "shellingham"
 version = "1.5.4"

     { url = "https://files.pythonhosted.org/packages/7f/9c/36c5c37947ebfb8c7f22e0eb6e4d188ee2d53aa3880f3f2744fb894f0cb1/anyio-4.12.0-py3-none-any.whl", hash = "sha256:dad2376a628f98eeca4881fc56cd06affd18f659b17a747d3ff0307ced94b1bb", size = 113362, upload-time = "2025-11-28T23:36:57.897Z" },
 ]
+[[package]]
+name = "arxiv"
+version = "2.3.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "feedparser" },
+    { name = "requests" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/dd/95/65e38ddfb54762a8f1777bbe80da2cebf7941376e67a2212de487d9372db/arxiv-2.3.1.tar.gz", hash = "sha256:08567185dfc102c8d349de4b9e84dfde0af46d6402486e3009afc90f8ccf9709", size = 16692, upload-time = "2025-11-13T06:22:59.853Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/90/7f/340847023184305a6378d75ec71e1dd38a942dfe71b7c29314b8fbe26948/arxiv-2.3.1-py3-none-any.whl", hash = "sha256:eb5a0b76808cc0a16de0c1448df0f927a3cf576096686d8e335a98b8872df1be", size = 11565, upload-time = "2025-11-13T06:22:58.662Z" },
+]
 [[package]]
 name = "attrs"
 version = "25.4.0"
 source = { virtual = "." }
 dependencies = [
     { name = "accelerate" },
+    { name = "arxiv" },
+    { name = "chromadb" },
+    { name = "docx2txt" },
     { name = "einops" },
     { name = "fastapi" },
     { name = "google-api-python-client" },
     { name = "langchain" },
     { name = "langchain-chroma" },
     { name = "langchain-community" },
+    { name = "langchain-experimental" },
     { name = "langchain-google-genai" },
     { name = "langchain-groq" },
+    { name = "langchain-ollama" },
     { name = "langchain-tavily" },
+    { name = "langchain-text-splitters" },
     { name = "langgraph" },
     { name = "langserve", extra = ["all"] },
     { name = "mcp" },
     { name = "pillow" },
     { name = "pypdf" },
     { name = "python-dotenv" },
+    { name = "rank-bm25" },
     { name = "sentence-transformers" },
     { name = "tavily-python" },
     { name = "torch" },
 [package.metadata]
 requires-dist = [
     { name = "accelerate", specifier = ">=1.12.0" },
+    { name = "arxiv", specifier = ">=2.3.1" },
+    { name = "chromadb", specifier = ">=0.4.22" },
+    { name = "docx2txt", specifier = ">=0.8" },
     { name = "einops", specifier = ">=0.8.1" },
     { name = "fastapi", specifier = ">=0.124.2" },
     { name = "google-api-python-client", specifier = ">=2.187.0" },
     { name = "langchain", specifier = ">=1.1.3" },
     { name = "langchain-chroma", specifier = ">=1.0.0" },
     { name = "langchain-community", specifier = ">=0.4.1" },
+    { name = "langchain-experimental", specifier = ">=0.0.50" },
     { name = "langchain-google-genai", specifier = ">=4.0.0" },
     { name = "langchain-groq", specifier = ">=1.1.0" },
+    { name = "langchain-ollama", specifier = ">=0.1.0" },
     { name = "langchain-tavily", specifier = ">=0.2.13" },
+    { name = "langchain-text-splitters", specifier = ">=0.0.1" },
     { name = "langgraph", specifier = ">=1.0.4" },
     { name = "langserve", extras = ["all"], specifier = ">=0.3.3" },
     { name = "mcp", specifier = ">=1.24.0" },
     { name = "pillow", specifier = ">=12.0.0" },
     { name = "pypdf", specifier = ">=6.4.1" },
     { name = "python-dotenv", specifier = ">=1.2.1" },
+    { name = "rank-bm25", specifier = ">=0.2.2" },
     { name = "sentence-transformers", specifier = ">=5.2.0" },
     { name = "tavily-python", specifier = ">=0.7.14" },
     { name = "torch", specifier = ">=2.9.1" },
 ]
 [[package]]
+<<<<<<< HEAD
 name = "dnspython"
 version = "2.8.0"
 source = { registry = "https://pypi.org/simple" }
 sdist = { url = "https://files.pythonhosted.org/packages/ae/b6/03bb70946330e88ffec97aefd3ea75ba575cb2e762061e0e62a213befee8/docutils-0.22.4.tar.gz", hash = "sha256:4db53b1fde9abecbb74d91230d32ab626d94f6badfc575d6db9194a49df29968", size = 2291750, upload-time = "2025-12-18T19:00:26.443Z" }
 wheels = [
     { url = "https://files.pythonhosted.org/packages/02/10/5da547df7a391dcde17f59520a231527b8571e6f46fc8efb02ccb370ab12/docutils-0.22.4-py3-none-any.whl", hash = "sha256:d0013f540772d1420576855455d050a2180186c91c15779301ac2ccb3eeb68de", size = 633196, upload-time = "2025-12-18T19:00:18.077Z" },
+=======
+name = "docx2txt"
+version = "0.9"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/ea/07/4486a038624e885e227fe79111914c01f55aa70a51920ff1a7f2bd216d10/docx2txt-0.9.tar.gz", hash = "sha256:18013f6229b14909028b19aa7bf4f8f3d6e4632d7b089ab29f7f0a4d1f660e28", size = 3613, upload-time = "2025-03-24T20:59:25.21Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/d6/51/756e71bec48ece0ecc2a10e921ef2756e197dcb7e478f2b43673b6683902/docx2txt-0.9-py3-none-any.whl", hash = "sha256:e3718c0653fd6f2fcf4b51b02a61452ad1c38a4c163bcf0a6fd9486cd38f529a", size = 4025, upload-time = "2025-03-24T20:59:24.394Z" },
+>>>>>>> 5beccbe9dfa0ef53e4123976ad54e2f1c28b72f8
 ]
 [[package]]
 ]
 [[package]]
+<<<<<<< HEAD
 name = "fastmcp"
 version = "2.13.0"
 source = { registry = "https://pypi.org/simple" }
 sdist = { url = "https://files.pythonhosted.org/packages/bc/3b/c30af894db2c3ec439d0e4168ba7ce705474cabdd0a599033ad9a19ad977/fastmcp-2.13.0.tar.gz", hash = "sha256:57f7b7503363e1babc0d1a13af18252b80366a409e1de85f1256cce66a4bee35", size = 7767346, upload-time = "2025-10-25T12:54:10.957Z" }
 wheels = [
     { url = "https://files.pythonhosted.org/packages/c0/7f/09942135f506953fc61bb81b9e5eaf50a8eea923b83d9135bd959168ef2d/fastmcp-2.13.0-py3-none-any.whl", hash = "sha256:bdff1399d3b7ebb79286edfd43eb660182432514a5ab8e4cbfb45f1d841d2aa0", size = 367134, upload-time = "2025-10-25T12:54:09.284Z" },
+=======
+name = "feedparser"
+version = "6.0.12"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "sgmllib3k" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/dc/79/db7edb5e77d6dfbc54d7d9df72828be4318275b2e580549ff45a962f6461/feedparser-6.0.12.tar.gz", hash = "sha256:64f76ce90ae3e8ef5d1ede0f8d3b50ce26bcce71dd8ae5e82b1cd2d4a5f94228", size = 286579, upload-time = "2025-09-10T13:33:59.486Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/4e/eb/c96d64137e29ae17d83ad2552470bafe3a7a915e85434d9942077d7fd011/feedparser-6.0.12-py3-none-any.whl", hash = "sha256:6bbff10f5a52662c00a2e3f86a38928c37c48f77b3c511aedcd51de933549324", size = 81480, upload-time = "2025-09-10T13:33:58.022Z" },
+>>>>>>> 5beccbe9dfa0ef53e4123976ad54e2f1c28b72f8
 ]
 [[package]]
     { url = "https://files.pythonhosted.org/packages/83/bd/9df897cbc98290bf71140104ee5b9777cf5291afb80333aa7da5a497339b/langchain_core-1.2.5-py3-none-any.whl", hash = "sha256:3255944ef4e21b2551facb319bfc426057a40247c0a05de5bd6f2fc021fbfa34", size = 484851, upload-time = "2025-12-22T23:45:30.525Z" },
 ]
+[[package]]
+name = "langchain-experimental"
+version = "0.4.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "langchain-community" },
+    { name = "langchain-core" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/a2/ec/6fe7b2e3c105b4f4fc6b943d8fc1b5b10f883429edc36c58a09fc2e28419/langchain_experimental-0.4.1.tar.gz", hash = "sha256:ab6b19a0b98fbc15225fbfcf096176fec339b7e3e930bcf328bb717985fc1da5", size = 170449, upload-time = "2025-12-11T05:30:48.455Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/24/fa/fb2c8b6418e1c9ef50c82b3b6e0184bce321582577240bb4b8ed3274a4aa/langchain_experimental-0.4.1-py3-none-any.whl", hash = "sha256:b6ee2f42b50aaadb45e581439ecf5ee50f3a6a0986d52e74d1e64721309e387d", size = 210096, upload-time = "2025-12-11T05:30:47.234Z" },
+]
 [[package]]
 name = "langchain-google-genai"
 version = "4.1.1"
     { url = "https://files.pythonhosted.org/packages/af/4a/3d6227a16fe9f79968414b50e50869519378b20653805e2e8fab283908e6/langchain_groq-1.1.1-py3-none-any.whl", hash = "sha256:1c6d5146f60205dcde09d7e47bb5291c295d3f0c7bcd2417e4d3a73a04bd1050", size = 19039, upload-time = "2025-12-12T22:00:45.86Z" },
 ]
+[[package]]
+name = "langchain-ollama"
+version = "1.0.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "langchain-core" },
+    { name = "ollama" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/73/51/72cd04d74278f3575f921084f34280e2f837211dc008c9671c268c578afe/langchain_ollama-1.0.1.tar.gz", hash = "sha256:e37880c2f41cdb0895e863b1cfd0c2c840a117868b3f32e44fef42569e367443", size = 153850, upload-time = "2025-12-12T21:48:28.68Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e3/46/f2907da16dc5a5a6c679f83b7de21176178afad8d2ca635a581429580ef6/langchain_ollama-1.0.1-py3-none-any.whl", hash = "sha256:37eb939a4718a0255fe31e19fbb0def044746c717b01b97d397606ebc3e9b440", size = 29207, upload-time = "2025-12-12T21:48:27.832Z" },
+]
 [[package]]
 name = "langchain-tavily"
 version = "0.2.16"
     { url = "https://files.pythonhosted.org/packages/be/9c/92789c596b8df838baa98fa71844d84283302f7604ed565dafe5a6b5041a/oauthlib-3.3.1-py3-none-any.whl", hash = "sha256:88119c938d2b8fb88561af5f6ee0eec8cc8d552b7bb1f712743136eb7523b7a1", size = 160065, upload-time = "2025-06-19T22:48:06.508Z" },
 ]
+[[package]]
+name = "ollama"
+version = "0.6.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "httpx" },
+    { name = "pydantic" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/9d/5a/652dac4b7affc2b37b95386f8ae78f22808af09d720689e3d7a86b6ed98e/ollama-0.6.1.tar.gz", hash = "sha256:478c67546836430034b415ed64fa890fd3d1ff91781a9d548b3325274e69d7c6", size = 51620, upload-time = "2025-11-13T23:02:17.416Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/47/4f/4a617ee93d8208d2bcf26b2d8b9402ceaed03e3853c754940e2290fed063/ollama-0.6.1-py3-none-any.whl", hash = "sha256:fc4c984b345735c5486faeee67d8a265214a31cbb828167782dc642ce0a2bf8c", size = 14354, upload-time = "2025-11-13T23:02:16.292Z" },
+]
 [[package]]
 name = "onnxruntime"
 version = "1.23.2"
     { url = "https://files.pythonhosted.org/packages/f1/12/de94a39c2ef588c7e6455cfbe7343d3b2dc9d6b6b2f40c4c6565744c873d/pyyaml-6.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b", size = 149341, upload-time = "2025-09-25T21:32:56.828Z" },
 ]
+[[package]]
+name = "rank-bm25"
+version = "0.2.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "numpy" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/fc/0a/f9579384aa017d8b4c15613f86954b92a95a93d641cc849182467cf0bb3b/rank_bm25-0.2.2.tar.gz", hash = "sha256:096ccef76f8188563419aaf384a02f0ea459503fdf77901378d4fd9d87e5e51d", size = 8347, upload-time = "2022-02-16T12:10:52.196Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2a/21/f691fb2613100a62b3fa91e9988c991e9ca5b89ea31c0d3152a3210344f9/rank_bm25-0.2.2-py3-none-any.whl", hash = "sha256:7bd4a95571adadfc271746fa146a4bcfd89c0cf731e49c3d1ad863290adbe8ae", size = 8584, upload-time = "2022-02-16T12:10:50.626Z" },
+]
 [[package]]
 name = "referencing"
 version = "0.36.2"
     { url = "https://files.pythonhosted.org/packages/a3/dc/17031897dae0efacfea57dfd3a82fdd2a2aeb58e0ff71b77b87e44edc772/setuptools-80.9.0-py3-none-any.whl", hash = "sha256:062d34222ad13e0cc312a4c02d73f059e86a4acbfbdea8f8f76b28c99f306922", size = 1201486, upload-time = "2025-05-27T00:56:49.664Z" },
 ]
+[[package]]
+name = "sgmllib3k"
+version = "1.0.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/9e/bd/3704a8c3e0942d711c1299ebf7b9091930adae6675d7c8f476a7ce48653c/sgmllib3k-1.0.0.tar.gz", hash = "sha256:7868fb1c8bfa764c1ac563d3cf369c381d1325d36124933a726f29fcdaa812e9", size = 5750, upload-time = "2010-08-24T14:33:52.445Z" }
 [[package]]
 name = "shellingham"
 version = "1.5.4"