Commit · 228b06f
Parent(s): 2a9e37b

fix guardrails and let the developer decide which similar questions the LLM should answer
Files changed:

- .gitignore +2 -0
- GUARDRAILS.md +120 -61
- deep_agent_rag/guardrails/__init__.py +9 -0
- deep_agent_rag/guardrails/nemo_manager.py +322 -0
- deep_agent_rag/ui/gradio_interface.py +10 -2
- deep_agent_rag/ui/simple_chatbot_interface.py +198 -34
- pyproject.toml +1 -0
- test_guardrails.py +0 -132
- test_parlant_integration.py +0 -152
- test_simple_chatbot.py +0 -150
- uv.lock +2 -0
.gitignore
CHANGED

```diff
@@ -22,3 +22,5 @@ token.json
 
 .cursor/*
 chroma_db*/*
+tests/
+*/guardrails/config/*
```
GUARDRAILS.md
CHANGED

@@ -1,83 +1,142 @@

Removed (old single-layer documentation; recoverable excerpts, translated from Chinese):

* This system implemented content-filtering guardrails for the Simple Chatbot. It used `jieba` for precise Chinese word segmentation, supported case-insensitive matching of English terms, and automatically detected and blocked AI responses containing sensitive content.
* Keyword list: `"伊斯蘭教", "阿拉", "回教徒", "默罕默德"` (Chinese); `"Islam", "Allah", "Muslim", "Muhammad"` (English)
* **Density threshold**: `0.05` (5%); **calculation**: `sensitive word count / total word count`
* Blocked message: "Sorry, your question contains sensitive content and cannot be answered. Please change the topic or rephrase your question."
* Usage: open the **Simple Chatbot** tab in the Gradio interface; the current guardrail settings were shown in the "🛡️ 內容過濾 Guardrails" accordion.
* Note: the `jieba` custom dictionary was updated automatically at initialization.
* Troubleshooting: inaccurate `jieba` segmentation → confirm `_init_jieba_custom_dict()` was called to register new keywords; false positives → adjust the density threshold or review the keyword list; performance → `jieba` caching makes the first load slow, but subsequent checks take `< 1ms`.
Added (new version):

# 🛡️ Hybrid Guardrails System Documentation

## Overview
This system implements a **Hybrid Guardrails Content Filtering System**, inspired by NVIDIA NeMo Guardrails. It combines fast keyword density checks with deep semantic topic filtering to provide dual-layer, bi-directional content protection for the Simple Chatbot.

## Key Features

### 🎯 Dual-Layer Filtering Strategy
1. **Layer 1: Keyword Density Check (Fast)**
   * **Speed**: < 1ms
   * **Mechanism**: Uses `jieba` for precise Chinese/English tokenization.
   * **Logic**: Blocks if `(Sensitive Words / Total Words)` > Threshold (default 5%).
2. **Layer 2: Semantic Topic Filtering (Deep)**
   * **Speed**: ~100-200ms
   * **Mechanism**: Uses `Sentence Transformers` for semantic similarity.
   * **Logic**: Blocks if input matches restricted topics (e.g., politics, sensitive religious debates) based on defined examples.
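The Layer 1 check above can be sketched in a few lines. This is a minimal illustration, not the project's exact implementation: whitespace splitting stands in for `jieba` tokenization, and matching is case-insensitive as documented.

```python
def keyword_density_blocks(text, blocked_keywords, threshold=0.05):
    """Return (should_block, density) for the keyword-density rule.

    Sketch only: whitespace split stands in for jieba tokenization;
    matching is case-insensitive, as the documentation describes.
    """
    words = text.lower().split()
    if not words:
        return False, 0.0
    hits = sum(1 for w in words if w in blocked_keywords)
    density = hits / len(words)
    return density >= threshold, density

blocked = {"islam", "allah", "muslim", "muhammad"}
print(keyword_density_blocks("tell me about allah please", blocked))  # → (True, 0.2)
```

Because a single hit in a short message easily clears 5%, the density rule mostly matters for long responses where isolated mentions should not trigger a block.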
### 🔒 Bi-Directional Protection
* **Input Rails**: Filters user queries before they reach the LLM.
* **Output Rails**: Filters LLM responses before they are displayed.

### 🎛️ Dynamic Control
* **UI Checkbox**: Easily enable/disable Guardrails directly from the Chatbot interface.
  * ☑ **Enabled**: Full protection (recommended for production/public use).
  * ☐ **Disabled**: No filtering (useful for research/debugging).
## Architecture
```mermaid
graph TD
    UserInput --> InputRails
    subgraph InputRails
        KeywordCheck1{Keyword Density > 5%?}
        SemanticCheck1{Semantic Match > 75%?}
    end
    KeywordCheck1 -- Yes --> Block[Block Message]
    KeywordCheck1 -- No --> SemanticCheck1
    SemanticCheck1 -- Yes --> Block
    SemanticCheck1 -- No --> LLM
    LLM --> OutputRails
    subgraph OutputRails
        KeywordCheck2{Keyword Density > 5%?}
        SemanticCheck2{Semantic Match > 75%?}
    end
    KeywordCheck2 -- Yes --> Block
    KeywordCheck2 -- No --> SemanticCheck2
    SemanticCheck2 -- Yes --> Block
    SemanticCheck2 -- No --> Display
```
## Quick Start

### 1. Launch the Application
```bash
uv run python main.py
```
Go to the **Simple Chatbot** tab.

### 2. Guardrails Controls
* **Checkbox**: Located at the top of the chat interface. Toggle to enable/disable protection.
* **Status Panel**: Expand "🛡️ Guardrails Content Filtering" to view active configurations and topics.

### 3. Run Tests
Verify the system integrity:
```bash
uv run python test_nemo_guardrails.py
```
## Configuration

Configuration files are located in `deep_agent_rag/guardrails/config/`.

### 1. `config.yml` (Main Config)
Controls global settings and thresholds.
```yaml
enabled:
  keyword_filter: true
  semantic_filter: true
  input_rails: true
  output_rails: true

keyword_filter:
  threshold: 0.05  # 5% density
  blocked_keywords: ["keyword1", "keyword2"]
  blocked_message: "Blocked content message..."

semantic_filter:
  similarity_threshold: 0.75

embeddings:
  model: "sentence-transformers/all-MiniLM-L6-v2"
```
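The `similarity_threshold: 0.75` setting governs the cosine-similarity comparison the semantic layer performs between the input embedding and every stored topic example. A minimal sketch of that comparison, using toy 3-d vectors in place of real sentence-transformer embeddings:

```python
import numpy as np

def semantic_match(text_vec, topic_vecs, threshold=0.75):
    """Cosine similarity of the input against every topic example;
    block when the best match clears the threshold.

    Sketch of the documented semantic_filter logic: real vectors
    would come from the sentence-transformers model, the toy 3-d
    arrays below are stand-ins.
    """
    sims = topic_vecs @ text_vec / (
        np.linalg.norm(topic_vecs, axis=1) * np.linalg.norm(text_vec)
    )
    best = float(np.max(sims))
    return best >= threshold, best

# Two example embeddings for one topic, and a query close to the first
topic = np.array([[1.0, 0.0, 0.0], [0.7, 0.7, 0.0]])
blocked, score = semantic_match(np.array([0.9, 0.1, 0.0]), topic)
print(blocked, round(score, 3))
```

Taking the maximum over the examples (rather than the mean) means a single close paraphrase is enough to trip the rail, which is why a handful of well-chosen examples per topic usually suffices.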
### 2. `rails.txt` (Topic Definitions)
Defines semantic topics using a simplified Colang syntax.
```text
TOPIC: politics
DISPLAY: Politics
EXAMPLES:
- Who should I vote for?
- Political scandals
MESSAGE: I cannot discuss political topics.
---
```
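A loader for this simplified format could look like the following. This is only a sketch: the project's actual `rails.txt` parser is not shown in this commit (the code in `nemo_manager.py` loads topics from the YAML config), so the field handling here is an assumption based on the format above.

```python
def parse_rails(text):
    """Parse the simplified Colang-style topic format.

    Sketch only (hypothetical loader): splits topic blocks on the
    "---" separator and reads TOPIC / DISPLAY / EXAMPLES / MESSAGE.
    """
    topics = []
    for block in text.split("---"):
        topic = {"examples": []}
        mode = None
        for line in block.strip().splitlines():
            line = line.strip()
            if line.startswith("TOPIC:"):
                topic["name"] = line.split(":", 1)[1].strip()
            elif line.startswith("DISPLAY:"):
                topic["display"] = line.split(":", 1)[1].strip()
            elif line.startswith("MESSAGE:"):
                topic["message"] = line.split(":", 1)[1].strip()
                mode = None
            elif line.startswith("EXAMPLES:"):
                mode = "examples"
            elif mode == "examples" and line.startswith("-"):
                topic["examples"].append(line[1:].strip())
        if "name" in topic:  # skip empty trailing blocks
            topics.append(topic)
    return topics

sample = """TOPIC: politics
DISPLAY: Politics
EXAMPLES:
- Who should I vote for?
- Political scandals
MESSAGE: I cannot discuss political topics.
---"""
print(parse_rails(sample))
```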
## Customization

### Adding Keywords
Edit `config.yml` under `keyword_filter.blocked_keywords`.

### Adding Semantic Topics
Append to `rails.txt`:
```text
TOPIC: new_topic
DISPLAY: New Topic Name
EXAMPLES:
- Example phrase 1
- Example phrase 2
MESSAGE: Custom blocking message.
---
```
## Implementation Details
* **Why Custom?**: Standard `nemoguardrails` had dependency conflicts (langchain/pillow versions). This custom pure-Python implementation resolves those while retaining core functionality.
* **Performance**:
  * **Lazy Loading**: Semantic models load only when needed.
  * **Caching**: Topic embeddings are pre-computed and cached.
  * **Fast-Fail**: Keyword checks run first (<1ms) to avoid unnecessary semantic computation.

## Programmatic Usage
```python
from deep_agent_rag.guardrails.nemo_manager import get_guardrail_manager

manager = get_guardrail_manager()

# Check Input
should_block, msg = manager.check_input("User query")

# Check Output
should_block, msg = manager.check_output("LLM response")
```

---
**Version**: 2.0 (Hybrid Architecture) | **Last Updated**: 2026-01-13
deep_agent_rag/guardrails/__init__.py
ADDED

@@ -0,0 +1,9 @@

```python
"""
Guardrails Module
Custom content-filtering system inspired by NeMo Guardrails.
Supports a hybrid strategy: keyword density checks plus semantic topic filtering.
"""

from .nemo_manager import HybridGuardrailManager

__all__ = ["HybridGuardrailManager"]
```
deep_agent_rag/guardrails/nemo_manager.py
ADDED

@@ -0,0 +1,322 @@

```python
"""
Hybrid Guardrail Manager
Hybrid content-filtering manager inspired by NeMo Guardrails.

Combines keyword density checks (fast layer) with semantic topic
filtering (deep layer); supports bi-directional input/output filtering.
"""

import os
import yaml
from typing import List, Dict, Tuple, Optional
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer
import jieba

# Configuration file paths
GUARDRAILS_CONFIG_DIR = Path(__file__).parent / "config"
CONFIG_FILE = GUARDRAILS_CONFIG_DIR / "config.yml"
RAILS_FILE = GUARDRAILS_CONFIG_DIR / "rails.txt"


class SemanticTopic:
    """A semantic topic definition."""
    def __init__(self, name: str, display_name: str, examples: List[str], blocked_message: str):
        self.name = name
        self.display_name = display_name
        self.examples = examples
        self.blocked_message = blocked_message
        self.embeddings: Optional[np.ndarray] = None


class HybridGuardrailManager:
    """
    Hybrid guardrail manager.

    Features:
    1. Fast keyword density check (millisecond-level)
    2. Semantic topic matching (via sentence-transformers)
    3. Bi-directional input/output filtering
    4. Configurable enable/disable options
    """

    def __init__(self, config_path: Optional[Path] = None):
        """
        Initialize the guardrail manager.

        Args:
            config_path: Path to the config file (defaults to the built-in config)
        """
        self.config_path = config_path or CONFIG_FILE
        self.config: Dict = {}
        self.topics: List[SemanticTopic] = []
        self.model: Optional[SentenceTransformer] = None
        self._initialized = False

        # Load configuration
        self._load_config()

        # Initialize jieba
        self._init_jieba()

        # Lazy-load the embedding model (only when semantic filtering is enabled)
        if self.config.get("enabled", {}).get("semantic_filter", False):
            self._init_semantic_model()

    def _load_config(self):
        """Load the configuration file."""
        try:
            with open(self.config_path, 'r', encoding='utf-8') as f:
                self.config = yaml.safe_load(f)
            print(f"✅ 載入 Guardrails 配置: {self.config_path}")
        except Exception as e:
            print(f"⚠️ 無法載入 Guardrails 配置,使用默認設定: {e}")
            self._load_default_config()

    def _load_default_config(self):
        """Load the default configuration."""
        self.config = {
            "enabled": {
                "keyword_filter": True,
                "semantic_filter": False,  # semantic filtering off by default
                "input_rails": True,
                "output_rails": True
            },
            "keyword_filter": {
                "threshold": 0.05,
                "blocked_keywords": [
                    "伊斯蘭教", "阿拉", "回教徒", "默罕默德",
                    "Islam", "Allah", "Muslim", "Muhammad"
                ],
                "blocked_message": "抱歉,您的問題包含敏感內容,無法回答。請換個話題或重新表述您的問題。"
            },
            "semantic_filter": {
                "similarity_threshold": 0.75,
                "topics": []
            },
            "embeddings": {
                "model": "sentence-transformers/all-MiniLM-L6-v2",
                "cache_embeddings": True
            }
        }

    def _init_jieba(self):
        """Register blocked keywords with the jieba tokenizer."""
        keywords = self.config.get("keyword_filter", {}).get("blocked_keywords", [])
        for keyword in keywords:
            jieba.add_word(keyword, freq=10000, tag='sensitive')

    def _init_semantic_model(self):
        """Initialize the semantic model (lazy-loaded)."""
        if self._initialized:
            return

        try:
            model_name = self.config.get("embeddings", {}).get("model", "sentence-transformers/all-MiniLM-L6-v2")
            print(f"🔄 正在載入語義模型: {model_name}")
            self.model = SentenceTransformer(model_name)

            # Load topic definitions
            self._load_topics()

            # Pre-compute topic embeddings
            self._precompute_topic_embeddings()

            self._initialized = True
            print(f"✅ 語義模型載入完成,共 {len(self.topics)} 個主題")
        except Exception as e:
            print(f"⚠️ 無法載入語義模型: {e}")
            self.config["enabled"]["semantic_filter"] = False

    def _load_topics(self):
        """Load topic definitions from the configuration."""
        self.topics = []

        # Load from the YAML config
        topics_config = self.config.get("semantic_filter", {}).get("topics", [])
        for topic_data in topics_config:
            topic = SemanticTopic(
                name=topic_data.get("name", ""),
                display_name=topic_data.get("display_name", ""),
                examples=topic_data.get("examples", []),
                blocked_message=topic_data.get("blocked_message", "抱歉,無法回答此問題。")
            )
            self.topics.append(topic)

        print(f"📋 載入了 {len(self.topics)} 個語義主題")

    def _precompute_topic_embeddings(self):
        """Pre-compute embeddings for all topics."""
        if not self.model:
            return

        for topic in self.topics:
            if topic.examples:
                topic.embeddings = self.model.encode(topic.examples, convert_to_numpy=True)

    def _check_keyword_density(self, text: str) -> Tuple[bool, float, str]:
        """
        Check the keyword density.

        Returns:
            (should_block, density, message)
        """
        if not text or not text.strip():
            return False, 0.0, ""

        # Tokenize with jieba
        words = list(jieba.cut(text))
        total_words = len(words)

        if total_words == 0:
            return False, 0.0, ""

        # Build a lowercase set of blocked keywords
        blocked_keywords = self.config.get("keyword_filter", {}).get("blocked_keywords", [])
        blocked_keywords_lower = {k.lower() for k in blocked_keywords}

        # Count sensitive words
        sensitive_word_count = sum(
            1 for word in words
            if word.strip().lower() in blocked_keywords_lower
        )

        # Compute the density
        density = sensitive_word_count / total_words
        threshold = self.config.get("keyword_filter", {}).get("threshold", 0.05)

        should_block = density >= threshold
        message = self.config.get("keyword_filter", {}).get("blocked_message", "") if should_block else ""

        return should_block, density, message

    def _check_semantic_topic(self, text: str) -> Tuple[bool, Optional[str], Optional[str]]:
        """
        Check for a semantic topic match.

        Returns:
            (should_block, topic_name, blocked_message)
        """
        if not self.model or not self.topics:
            return False, None, None

        # Embed the input text
        text_embedding = self.model.encode([text], convert_to_numpy=True)[0]

        # Similarity threshold
        threshold = self.config.get("semantic_filter", {}).get("similarity_threshold", 0.75)

        # Check each topic
        for topic in self.topics:
            if topic.embeddings is None or len(topic.embeddings) == 0:
                continue

            # Cosine similarity against every example
            similarities = np.dot(topic.embeddings, text_embedding) / (
                np.linalg.norm(topic.embeddings, axis=1) * np.linalg.norm(text_embedding)
            )

            # Take the maximum similarity
            max_similarity = np.max(similarities)

            # Block if it exceeds the threshold
            if max_similarity >= threshold:
                print(f"🚫 語義匹配: {topic.display_name} (相似度: {max_similarity:.2%})")
                return True, topic.name, topic.blocked_message

        return False, None, None

    def check_input(self, text: str) -> Tuple[bool, str]:
        """
        Check user input.

        Args:
            text: User input text

        Returns:
            (should_block, message): whether to block, and the block message if so
        """
        if not self.config.get("enabled", {}).get("input_rails", True):
            return False, ""

        # 1. Fast keyword check
        if self.config.get("enabled", {}).get("keyword_filter", True):
            should_block, density, message = self._check_keyword_density(text)
            if should_block:
                print(f"🚫 關鍵字過濾: 密度 {density:.2%}")
                return True, message

        # 2. Semantic topic check
        if self.config.get("enabled", {}).get("semantic_filter", False):
            should_block, topic, message = self._check_semantic_topic(text)
            if should_block:
                return True, message or "抱歉,無法回答此問題。"

        return False, ""

    def check_output(self, text: str) -> Tuple[bool, str]:
        """
        Check LLM output.

        Args:
            text: LLM output text

        Returns:
            (should_block, filtered_text): whether to block, and the filtered text
        """
        if not self.config.get("enabled", {}).get("output_rails", True):
            return False, text

        # 1. Fast keyword check
        if self.config.get("enabled", {}).get("keyword_filter", True):
            should_block, density, message = self._check_keyword_density(text)
            if should_block:
                print(f"🚫 輸出過濾: 密度 {density:.2%}")
                return True, message

        # 2. Semantic topic check
        if self.config.get("enabled", {}).get("semantic_filter", False):
            should_block, topic, message = self._check_semantic_topic(text)
            if should_block:
                return True, message or "抱歉,無法提供此回應。"

        return False, text

    def get_status(self) -> Dict:
        """Get the current Guardrails status."""
        return {
            "enabled": self.config.get("enabled", {}),
            "keyword_filter": {
                "threshold": self.config.get("keyword_filter", {}).get("threshold", 0.05),
                "keywords_count": len(self.config.get("keyword_filter", {}).get("blocked_keywords", []))
            },
            "semantic_filter": {
                "initialized": self._initialized,
                "topics_count": len(self.topics),
                "threshold": self.config.get("semantic_filter", {}).get("similarity_threshold", 0.75)
            }
        }

    def get_topics_info(self) -> List[Dict]:
        """Get topic information."""
        return [
            {
                "name": topic.name,
                "display_name": topic.display_name,
                "examples_count": len(topic.examples)
            }
            for topic in self.topics
        ]


# Global singleton
_guardrail_manager: Optional[HybridGuardrailManager] = None


def get_guardrail_manager() -> HybridGuardrailManager:
    """Get the global guardrail manager singleton."""
    global _guardrail_manager
    if _guardrail_manager is None:
        _guardrail_manager = HybridGuardrailManager()
    return _guardrail_manager
```
deep_agent_rag/ui/gradio_interface.py
CHANGED

```diff
@@ -397,6 +397,14 @@ def _create_simple_chatbot_tab():
             elem_classes=["warning-box"]
         )
 
+        # Guardrails enable toggle
+        with gr.Row():
+            enable_guardrails_checkbox = gr.Checkbox(
+                label="🛡️ 啟用 Guardrails 內容過濾",
+                value=True,
+                info="啟用後將檢查輸入和輸出內容,阻擋敏感話題"
+            )
+
         # System-prompt settings
         with gr.Accordion("⚙️ 進階設定", open=False):
             system_prompt = gr.Textbox(

@@ -461,7 +469,7 @@ def _create_simple_chatbot_tab():
         # Message-submit event
         msg.submit(
             fn=chat_with_llm_streaming,
-            inputs=[msg, chatbot, system_prompt],
+            inputs=[msg, chatbot, system_prompt, enable_guardrails_checkbox],
             outputs=[chatbot],
             queue=True
         ).then(

@@ -472,7 +480,7 @@ def _create_simple_chatbot_tab():
 
         submit_btn.click(
             fn=chat_with_llm_streaming,
-            inputs=[msg, chatbot, system_prompt],
+            inputs=[msg, chatbot, system_prompt, enable_guardrails_checkbox],
             outputs=[chatbot],
             queue=True
         ).then(
```
deep_agent_rag/ui/simple_chatbot_interface.py
CHANGED

```diff
@@ -2,7 +2,7 @@
 Simple Chatbot Interface
 A simple chatbot UI without the RAG and Deep AI Agent features
 A purely conversational chatbot
-Includes content-filtering Guardrails
+Includes content-filtering Guardrails (hybrid: keyword + semantic filtering)
 """
 import gradio as gr
 from typing import List, Dict, Any

@@ -12,6 +12,7 @@ from langchain_core.runnables import RunnableLambda
 import jieba
 
 from ..utils.llm_utils import get_llm_type, is_using_local_llm, get_llm
+from ..guardrails.nemo_manager import get_guardrail_manager
 
 
 # ==================== Guardrails configuration ====================

@@ -114,15 +115,18 @@ guardrail_runnable = RunnableLambda(guardrail_filter)
 def chat_with_llm_streaming(
     message: str,
     history: List[Dict[str, str]],
-    system_prompt: str = "你是一個有幫助的AI助手。請用繁體中文回答問題。"
+    system_prompt: str = "你是一個有幫助的AI助手。請用繁體中文回答問題。",
+    enable_guardrails: bool = True
 ):
     """
     Streamed chat with the LLM (character-by-character display)
+    Integrates hybrid Guardrails (keyword + semantic filtering)
 
     Args:
         message: the user's message
         history: conversation history (dict format: [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}, ...])
         system_prompt: the system prompt
+        enable_guardrails: whether to enable Guardrails content filtering
 
     Yields:
         List[Dict[str, str]]: the history as it updates

@@ -137,6 +141,28 @@ def chat_with_llm_streaming(
         yield new_history
 
     try:
+        # ==================== Input filtering ====================
+        # The checkbox state decides whether Guardrails run at all
+        if enable_guardrails:
+            guardrail_mgr = get_guardrail_manager()
+            should_block_input, blocked_message = guardrail_mgr.check_input(message)
+
+            if should_block_input:
+                # Input blocked: type out the block message
+                print(f"🚫 輸入被阻擋")
+                new_history.append({"role": "assistant", "content": ""})
+
+                # Reveal the block message one character at a time (typing effect)
+                for i in range(len(blocked_message)):
+                    new_history[-1] = {"role": "assistant", "content": blocked_message[:i+1]}
+                    yield new_history
+                    time.sleep(0.01)  # 10 ms delay
+
+                # Ensure the full message is displayed
+                new_history[-1] = {"role": "assistant", "content": blocked_message}
+                yield new_history
+                return
+
         # Get the LLM
         llm = get_llm()

@@ -153,27 +179,74 @@ def chat_with_llm_streaming(
         # Append the current user message
         messages.append(HumanMessage(content=message))
 
-        # Invoke the LLM for the full response
-        response = llm.invoke(messages)
-        full_response = response.content
-
-        # ==================== Apply Guardrails filtering ====================
-        # Content filtering via RunnableLambda
-        filtered_response = guardrail_runnable.invoke(full_response)
-
         # Append an empty assistant response (filled in incrementally)
         new_history.append({"role": "assistant", "content": ""})
+        full_response = ""
 
+        # Stream the response
+        for chunk in llm.stream(messages):
+            # Extract the content (chunk may be a BaseMessageChunk)
+            content = chunk.content if hasattr(chunk, "content") else str(chunk)
+
+            # Reveal the content character by character for a typing feel
+            for char in content:
+                full_response += char
+                new_history[-1] = {"role": "assistant", "content": full_response}
+                yield new_history
+                # Slight delay so fast models still look smooth
+                time.sleep(0.005)
+
+        # ==================== Immediate output check (fast layer) ====================
+        if enable_guardrails:
+            # Run the fast keyword-density check right away, rather than
+            # noticing a violation only after generation completes
+            should_block_fast, _ = check_content_guardrails(full_response)
+            if should_block_fast:
+                print(f"🚫 輸出因關鍵字密度被即時阻擋")
+                # Clear the current content, then type out the block message
+                new_history[-1] = {"role": "assistant", "content": ""}
+                yield new_history
+
+                # Type out the block message
+                for i in range(len(DEFAULT_BLOCKED_MESSAGE)):
+                    new_history[-1] = {"role": "assistant", "content": DEFAULT_BLOCKED_MESSAGE[:i+1]}
+                    yield new_history
+                    time.sleep(0.01)
+
+                # Ensure the full message is displayed
+                new_history[-1] = {"role": "assistant", "content": DEFAULT_BLOCKED_MESSAGE}
+                yield new_history
+                return
 
+        # ==================== Final output check (including semantic) ====================
+        # The checkbox state decides whether Guardrails run
+        if enable_guardrails:
+            guardrail_mgr = get_guardrail_manager()
+            # Full check, including the potentially slower semantic filter
+            should_block_output, filtered_response = guardrail_mgr.check_output(full_response)
+
+            if should_block_output:
+                print(f"🚫 輸出被最終語義過濾阻擋")
+                # Clear the current content, then type out the filtered message
+                new_history[-1] = {"role": "assistant", "content": ""}
+                yield new_history
+
+                # Type out the filtered message (e.g. a custom topic-block message)
+                for i in range(len(filtered_response)):
+                    new_history[-1] = {"role": "assistant", "content": filtered_response[:i+1]}
+                    yield new_history
+                    time.sleep(0.01)
+
+                # Ensure the full message is displayed
+                new_history[-1] = {"role": "assistant", "content": filtered_response}
+                yield new_history
+            else:
+                # Make sure the final display is the complete response
+                new_history[-1] = {"role": "assistant", "content": full_response}
+                yield new_history
+        else:
+            # Ensure the full response is displayed
+            new_history[-1] = {"role": "assistant", "content": full_response}
+            yield new_history
 
     except Exception as e:
         error_msg = f"❌ 發生錯誤: {str(e)}"
```

Further hunks in this file (@@ -198,6 +271,62 @@, @@ -225,6 +354,14 @@, @@ -244,23 +381,39 @@, @@ -307,10 +460,14 @@, @@ -321,7 +478,7 @@, @@ -342,6 +499,12 @@, @@ -359,7 +522,8 @@) update `create_simple_chatbot_interface`; their added lines did not survive in this rendering. The removed lines show the old `inputs=[msg, chatbot, system_prompt]` event wiring, the old "🛡️ 內容過濾 Guardrails" accordion text, and the old "🛡️" entry in the footer feature list (alongside "- 🔧 可自訂系統提示詞", "- 📝 保留完整對話歷史", "- 🚀 支持本地模型和雲端 API").
|
| 272 |
|
| 273 |
|
| 274 |
+
def get_guardrails_status() -> str:
|
| 275 |
+
"""獲取當前 Guardrails 狀態信息"""
|
| 276 |
+
try:
|
| 277 |
+
guardrail_mgr = get_guardrail_manager()
|
| 278 |
+
status = guardrail_mgr.get_status()
|
| 279 |
+
topics = guardrail_mgr.get_topics_info()
|
| 280 |
+
|
| 281 |
+
enabled = status.get("enabled", {})
|
| 282 |
+
keyword_filter = status.get("keyword_filter", {})
|
| 283 |
+
semantic_filter = status.get("semantic_filter", {})
|
| 284 |
+
|
| 285 |
+
status_text = "# 🛡️ Guardrails 狀態\n\n"
|
| 286 |
+
status_text += "## 混合過濾策略\n\n"
|
| 287 |
+
|
| 288 |
+
# 關鍵字過濾狀態
|
| 289 |
+
if enabled.get("keyword_filter", False):
|
| 290 |
+
status_text += f"✅ **關鍵字過濾**:已啟用\n"
|
| 291 |
+
status_text += f" - 密度門檻:{keyword_filter.get('threshold', 0.05):.1%}\n"
|
| 292 |
+
status_text += f" - 關鍵字數量:{keyword_filter.get('keywords_count', 0)} 個\n\n"
|
| 293 |
+
else:
|
| 294 |
+
status_text += "❌ **關鍵字過濾**:已停用\n\n"
|
| 295 |
+
|
| 296 |
+
# 語義過濾狀態
|
| 297 |
+
if enabled.get("semantic_filter", False):
|
| 298 |
+
if semantic_filter.get("initialized", False):
|
| 299 |
+
status_text += f"✅ **語義主題過濾**:已啟用\n"
|
| 300 |
+
status_text += f" - 相似度門檻:{semantic_filter.get('threshold', 0.75):.1%}\n"
|
| 301 |
+
status_text += f" - 主題數量:{semantic_filter.get('topics_count', 0)} 個\n\n"
|
| 302 |
+
|
| 303 |
+
if topics:
|
| 304 |
+
status_text += " **主題列表**:\n"
|
| 305 |
+
for topic in topics:
|
| 306 |
+
status_text += f" - {topic['display_name']} ({topic['examples_count']} 個範例)\n"
|
| 307 |
+
else:
|
| 308 |
+
status_text += "⚠️ **語義主題過濾**:啟用中(模型未初始化)\n\n"
|
| 309 |
+
else:
|
| 310 |
+
status_text += "❌ **語義主題過濾**:已停用\n\n"
|
| 311 |
+
|
| 312 |
+
# 防護方向
|
| 313 |
+
status_text += "\n## 防護方向\n\n"
|
| 314 |
+
if enabled.get("input_rails", False):
|
| 315 |
+
status_text += "✅ **輸入過濾**:已啟用(阻擋敏感問題)\n"
|
| 316 |
+
else:
|
| 317 |
+
status_text += "❌ **輸入過濾**:已停用\n"
|
| 318 |
+
|
| 319 |
+
if enabled.get("output_rails", False):
|
| 320 |
+
status_text += "✅ **輸出過濾**:已啟用(過濾回應內容)\n"
|
| 321 |
+
else:
|
| 322 |
+
status_text += "❌ **輸出過濾**:已停用\n"
|
| 323 |
+
|
| 324 |
+
return status_text
|
| 325 |
+
|
| 326 |
+
except Exception as e:
|
| 327 |
+
return f"⚠️ 無法獲取 Guardrails 狀態:{str(e)}"
|
| 328 |
+
|
| 329 |
+
|
| 330 |
def create_simple_chatbot_interface():
|
| 331 |
"""
|
| 332 |
創建簡單聊天機器人界面
|
|
|
|
| 354 |
elem_classes=["warning-box"]
|
| 355 |
)
|
| 356 |
|
| 357 |
+
# Guardrails 啟用開關
|
| 358 |
+
with gr.Row():
|
| 359 |
+
enable_guardrails_checkbox = gr.Checkbox(
|
| 360 |
+
label="🛡️ 啟用 Guardrails 內容過濾",
|
| 361 |
+
value=True,
|
| 362 |
+
info="啟用後將檢查輸入和輸出內容,阻擋敏感話題"
|
| 363 |
+
)
|
| 364 |
+
|
| 365 |
# 系統提示詞設定(可選)
|
| 366 |
with gr.Accordion("⚙️ 進階設定", open=False):
|
| 367 |
system_prompt = gr.Textbox(
|
|
|
|
| 381 |
)
|
| 382 |
|
| 383 |
# Guardrails 設定顯示
|
| 384 |
+
with gr.Accordion("🛡️ 內容過濾 Guardrails(混合策略)", open=False):
|
| 385 |
+
guardrails_status_md = gr.Markdown(
|
| 386 |
+
value=get_guardrails_status()
|
| 387 |
+
)
|
| 388 |
+
|
| 389 |
+
with gr.Row():
|
| 390 |
+
refresh_guardrails_btn = gr.Button("🔄 更新 Guardrails 狀態", variant="secondary", size="sm")
|
| 391 |
+
|
| 392 |
gr.Markdown(
|
| 393 |
+
"""
|
| 394 |
+
---
|
| 395 |
+
|
| 396 |
+
## 混合策略說明
|
| 397 |
|
| 398 |
+
本系統採用**雙層過濾**策略,受 NeMo Guardrails 啟發:
|
| 399 |
|
| 400 |
+
### 第一層:關鍵字密度檢查(快速層)
|
| 401 |
+
- ⚡ 速度:< 1ms
|
| 402 |
+
- 🔍 使用 `jieba` 進行中英文斷詞
|
| 403 |
+
- 📊 計算敏感詞密度(敏感詞數 / 總詞數)
|
| 404 |
+
- 🎯 適用於:明確的關鍵字匹配
|
| 405 |
|
| 406 |
+
### 第二層:語義主題過濾(深度層)
|
| 407 |
+
- 🤖 使用 Sentence Transformers 語義理解
|
| 408 |
+
- 🎭 可偵測改寫、隱喻等複雜表達
|
| 409 |
+
- 📝 基於主題範例進行相似度匹配
|
| 410 |
+
- 🎯 適用於:主題層級的內容控制
|
| 411 |
|
| 412 |
+
### 雙向防護
|
| 413 |
+
- 🔒 **輸入過濾**:阻擋敏感問題
|
| 414 |
+
- 🛡️ **輸出過濾**:確保回應安全
|
| 415 |
+
|
| 416 |
+
ℹ️ 配置文件位於:`deep_agent_rag/guardrails/config/`
|
| 417 |
"""
|
| 418 |
)
|
| 419 |
|
|
|
|
| 460 |
"""更新 LLM 狀態"""
|
| 461 |
return get_llm_status()
|
| 462 |
|
| 463 |
+
def refresh_guardrails_status():
|
| 464 |
+
"""更新 Guardrails 狀態"""
|
| 465 |
+
return get_guardrails_status()
|
| 466 |
+
|
| 467 |
# 發送消息事件
|
| 468 |
msg.submit(
|
| 469 |
fn=chat_with_llm_streaming,
|
| 470 |
+
inputs=[msg, chatbot, system_prompt, enable_guardrails_checkbox],
|
| 471 |
outputs=[chatbot],
|
| 472 |
queue=True
|
| 473 |
).then(
|
|
|
|
| 478 |
|
| 479 |
submit_btn.click(
|
| 480 |
fn=chat_with_llm_streaming,
|
| 481 |
+
inputs=[msg, chatbot, system_prompt, enable_guardrails_checkbox],
|
| 482 |
outputs=[chatbot],
|
| 483 |
queue=True
|
| 484 |
).then(
|
|
|
|
| 499 |
queue=False
|
| 500 |
)
|
| 501 |
|
| 502 |
+
refresh_guardrails_btn.click(
|
| 503 |
+
fn=refresh_guardrails_status,
|
| 504 |
+
outputs=[guardrails_status_md],
|
| 505 |
+
queue=False
|
| 506 |
+
)
|
| 507 |
+
|
| 508 |
# 頁腳
|
| 509 |
gr.Markdown(
|
| 510 |
"""
|
|
|
|
| 522 |
- 🔧 可自訂系統提示詞
|
| 523 |
- 📝 保留完整對話歷史
|
| 524 |
- 🚀 支持本地模型和雲端 API
|
| 525 |
+
- 🛡️ 混合式 Guardrails 內容過濾(關鍵字 + 語義雙層防護)
|
| 526 |
+
- 🔒 雙向過濾(輸入阻擋 + 輸出過濾)
|
| 527 |
"""
|
| 528 |
)
|
| 529 |
|
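上述串流處理流程(先逐字輸出,生成結束後依序執行快速關鍵字檢查與較慢的深度檢查,被攔截時改為逐字顯示阻擋訊息)可以濃縮成以下極簡草圖;其中 `fast_check`、`deep_check` 與 `BLOCKED_MESSAGE` 均為示意用的假設名稱,並非專案中的實際 API:

```python
from typing import Callable, Iterator

# 示意用的阻擋訊息(假設值,非專案實際文案)
BLOCKED_MESSAGE = "抱歉,您的問題包含敏感內容,無法回答。"

def stream_with_guardrails(
    chunks: Iterator[str],
    fast_check: Callable[[str], bool],
    deep_check: Callable[[str], bool],
) -> Iterator[str]:
    """逐字產出累積回應;生成結束後先跑快速檢查,再跑深度檢查。"""
    full = ""
    for chunk in chunks:
        for ch in chunk:  # 逐字累積,模擬打字效果
            full += ch
            yield full
    # 第一層:快速關鍵字檢查;第二層:較慢的語義檢查
    if fast_check(full) or deep_check(full):
        # 被攔截:改為逐字輸出阻擋訊息
        for i in range(len(BLOCKED_MESSAGE)):
            yield BLOCKED_MESSAGE[: i + 1]

# 示範:包含「禁詞」的回應會在生成結束後被攔截
outputs = list(stream_with_guardrails(
    iter(["這裡有", "禁詞"]),
    fast_check=lambda t: "禁詞" in t,
    deep_check=lambda t: False,
))
print(outputs[-1])
```

在實際程式中,快速層對應 `check_content_guardrails`,深度層對應 `guardrail_mgr.check_output`。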
pyproject.toml
CHANGED
```diff
@@ -45,4 +45,5 @@ dependencies = [
     "docx2txt>=0.8",
     "langchain-experimental>=0.0.50",
     "jieba>=0.42.1",  # 中文分詞工具(用於 Guardrails 內容過濾)
+    "pyyaml>=6.0.0",  # YAML 配置文件解析(用於自定義 Guardrails 配置)
 ]
```
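新增的 `pyyaml` 依賴用於解析自定義 Guardrails 設定檔。以下為讀取此類 YAML 設定的示意草圖;欄位名稱純屬假設,實際格式以 `deep_agent_rag/guardrails/config/` 內的檔案為準:

```python
import yaml

# 示意用的設定內容(欄位名稱為假設,非專案實際 schema)
CONFIG_YAML = """
keyword_filter:
  enabled: true
  threshold: 0.05
  keywords:
    - 伊斯蘭教
    - Allah
semantic_filter:
  enabled: true
  threshold: 0.75
"""

# safe_load 只解析純資料(dict/list/str/數值/布林),不執行任意建構子
config = yaml.safe_load(CONFIG_YAML)
print(config["keyword_filter"]["threshold"])
```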
test_guardrails.py
DELETED
整個檔案已刪除,刪除前內容如下:
```python
"""
測試 Guardrails 內容過濾功能
Test script for content guardrails
"""
import jieba
from deep_agent_rag.ui.simple_chatbot_interface import (
    check_content_guardrails,
    guardrail_filter,
    BLOCKED_KEYWORDS,
    KEYWORD_DENSITY_THRESHOLD,
    _init_jieba_custom_dict
)

# 確保 jieba 自定義詞典已初始化
_init_jieba_custom_dict()


def test_guardrails():
    """測試 Guardrails 功能"""
    print("=" * 80)
    print("🛡️ Guardrails 內容過濾測試")
    print("=" * 80)
    print()

    print(f"📋 敏感關鍵字列表:{BLOCKED_KEYWORDS}")
    print(f"🎯 攔截門檻:{KEYWORD_DENSITY_THRESHOLD:.1%} (關鍵字密度)")
    print()
    print("=" * 80)
    print()

    # 測試案例
    test_cases = [
        {
            "name": "正常內容 - 不應該被攔截",
            "text": "今天天氣很好,我們一起去公園散步吧。這是一個美好的日子。"
        },
        {
            "name": "包含少量敏感詞 - 低於門檻",
            "text": "伊斯蘭教是世界主要宗教之一,有著悠久的歷史和豐富的文化傳統。許多信徒在世界各地實踐他們的信仰,並為社會做出貢獻。"
        },
        {
            "name": "包含多個敏感詞 - 超過門檻",
            "text": "伊斯蘭教的先知默罕默德教導信徒向阿拉禱告。"
        },
        {
            "name": "高密度敏感詞 - 明顯超過門檻",
            "text": "阿拉默罕默德伊斯蘭教"
        },
        {
            "name": "技術討論 - 正常內容",
            "text": "機器學習是人工智能的一個分支,它使用統計技術讓計算機系統能夠從數據中學習。深度學習是機器學習的一個子集。"
        }
    ]

    # 執行測試
    for i, test_case in enumerate(test_cases, 1):
        print(f"測試案例 {i}: {test_case['name']}")
        print("-" * 80)

        text = test_case['text']
        print(f"📝 原文本:{text}")
        print()

        # 使用 jieba 分詞
        words = list(jieba.cut(text))
        print(f"🔤 分詞結果:{' / '.join(words)}")
        print(f"📊 總詞數:{len(words)}")
        print()

        # 檢查敏感詞
        sensitive_words_found = [w for w in words if w in BLOCKED_KEYWORDS]
        print(f"⚠️ 發現敏感詞:{sensitive_words_found if sensitive_words_found else '無'}")
        print(f"🔢 敏感詞數量:{len(sensitive_words_found)}")
        print()

        # 執行 Guardrails 檢查
        should_block, density = check_content_guardrails(text)
        print(f"📈 關鍵字密度:{density:.2%} (門檻:{KEYWORD_DENSITY_THRESHOLD:.2%})")
        print(f"🚦 判定結果:{'🚫 攔截' if should_block else '✅ 通過'}")
        print()

        # 應用過濾器
        filtered = guardrail_filter(text)
        if filtered != text:
            print(f"🛡️ 過濾後輸出:{filtered}")
        else:
            print(f"✅ 原文通過,無需過濾")

        print()
        print("=" * 80)
        print()


def test_edge_cases():
    """測試邊界情況"""
    print("🔬 邊界測試")
    print("=" * 80)
    print()

    edge_cases = [
        ("空字符串", ""),
        ("純空格", "   "),
        ("單個敏感詞", "伊斯蘭教"),
        ("重複敏感詞", "阿拉阿拉阿拉"),
        ("長文本混合", "今天我們要討論世界宗教的歷史。" * 10 + "伊斯蘭教是其中之一。"),
    ]

    for name, text in edge_cases:
        should_block, density = check_content_guardrails(text)
        print(f"{name}:")
        print(f"  文本長度:{len(text)}")
        print(f"  關鍵字密度:{density:.2%}")
        print(f"  結果:{'🚫 攔截' if should_block else '✅ 通過'}")
        print()


if __name__ == "__main__":
    try:
        # 執行主要測試
        test_guardrails()

        # 執行邊界測試
        test_edge_cases()

        print("✅ 所有測試完成!")

    except Exception as e:
        print(f"❌ 測試失敗:{e}")
        import traceback
        traceback.print_exc()
```
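上面被刪除的測試所驗證的密度公式(敏感詞數 / 總詞數)可以用純 Python 重現。以下草圖為求自包含,直接以現成的詞列表代替 jieba 的斷詞結果;關鍵字集合與門檻值僅為示意:

```python
def keyword_density(words, blocked):
    """計算敏感詞密度:敏感詞數 / 總詞數(總詞數為 0 時回傳 0)。"""
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in blocked)
    return hits / len(words)

# 示意用的關鍵字集合與門檻(與 GUARDRAILS.md 中的 5% 門檻一致)
BLOCKED = {"伊斯蘭教", "阿拉", "默罕默德"}
THRESHOLD = 0.05

# 對應「高密度敏感詞」測試案例:斷詞後全為敏感詞,密度 100%,超過門檻
words = ["阿拉", "默罕默德", "伊斯蘭教"]
density = keyword_density(words, BLOCKED)
print(f"{density:.0%}", density > THRESHOLD)
```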
test_parlant_integration.py
DELETED
整個檔案已刪除,刪除前內容如下:
```python
"""
測試 Parlant SDK 整合
驗證指南系統是否正常工作
"""

from deep_agent_rag.guidelines import (
    get_guideline,
    get_customer_journey,
    initialize_parlant_sync
)


def test_guidelines():
    """測試指南獲取功能"""
    print("=" * 60)
    print("測試指南系統")
    print("=" * 60)

    # 測試研究代理指南
    print("\n1. 測試研究代理的工具選擇指南...")
    tool_guideline = get_guideline("research", "tool_selection")
    assert tool_guideline, "❌ 工具選擇指南不應為空"
    assert "query_pdf_knowledge" in tool_guideline, "❌ 應包含 PDF 工具說明"
    assert "get_company_deep_info" in tool_guideline, "❌ 應包含股票工具說明"
    assert "search_web" in tool_guideline, "❌ 應包含網路搜尋工具說明"
    print("  ✅ 工具選擇指南獲取成功")
    print(f"  📄 指南長度: {len(tool_guideline)} 字符")

    print("\n2. 測試研究代理的任務規劃指南...")
    task_guideline = get_guideline("research", "task_planning")
    assert task_guideline, "❌ 任務規劃指南不應為空"
    assert "學術理論問題" in task_guideline, "❌ 應包含學術問題說明"
    assert "股票相關問題" in task_guideline, "❌ 應包含股票問題說明"
    print("  ✅ 任務規劃指南獲取成功")

    print("\n3. 測試研究代理的研究行為指南...")
    behavior_guideline = get_guideline("research", "research_behavior")
    assert behavior_guideline, "❌ 研究行為指南不應為空"
    print("  ✅ 研究行為指南獲取成功")

    # 測試郵件代理指南
    print("\n4. 測試郵件代理的撰寫指南...")
    email_guideline = get_guideline("email", "email_writing")
    assert email_guideline, "❌ 郵件撰寫指南不應為空"
    print("  ✅ 郵件撰寫指南獲取成功")

    # 測試行事曆代理指南
    print("\n5. 測試行事曆代理的創建指南...")
    calendar_guideline = get_guideline("calendar", "event_creation")
    assert calendar_guideline, "❌ 事件創建指南不應為空"
    print("  ✅ 事件創建指南獲取成功")

    # 測試不存在的指南
    print("\n6. 測試錯誤處理(不存在的指南)...")
    invalid_guideline = get_guideline("research", "nonexistent")
    assert invalid_guideline == "", "❌ 不存在的指南應返回空字符串"
    print("  ✅ 錯誤處理正常")


def test_customer_journey():
    """測試客戶旅程獲取功能"""
    print("\n" + "=" * 60)
    print("測試客戶旅程系統")
    print("=" * 60)

    print("\n1. 測試研究代理的客戶旅程...")
    research_journey = get_customer_journey("research")
    assert research_journey, "❌ 研究代理客戶旅程不應為空"
    assert "steps" in research_journey, "❌ 應包含步驟定義"
    assert "checkpoints" in research_journey, "❌ 應包含檢查點"
    print("  ✅ 研究代理客戶旅程獲取成功")
    print(f"  📋 步驟: {research_journey['steps'][0]}")
    print(f"  🔍 檢查點數量: {len(research_journey['checkpoints'])}")

    print("\n2. 測試郵件代理的客戶旅程...")
    email_journey = get_customer_journey("email")
    assert email_journey, "❌ 郵件代理客戶旅程不應為空"
    print("  ✅ 郵件代理客戶旅程獲取成功")

    print("\n3. 測試行事曆代理的客戶旅程...")
    calendar_journey = get_customer_journey("calendar")
    assert calendar_journey, "❌ 行事曆代理客戶旅程不應為空"
    print("  ✅ 行事曆代理客戶旅程獲取成功")

    # 測試不存在的客戶旅程
    print("\n4. 測試錯誤處理(不存在的客戶旅程)...")
    invalid_journey = get_customer_journey("nonexistent")
    assert invalid_journey == {}, "❌ 不存在的客戶旅程應返回空字典"
    print("  ✅ 錯誤處理正常")


def test_guideline_structure():
    """測試指南結構完整性"""
    print("\n" + "=" * 60)
    print("測試指南結構完整性")
    print("=" * 60)

    print("\n1. 檢查研究代理指南...")
    tool_guideline = get_guideline("research", "tool_selection")
    task_guideline = get_guideline("research", "task_planning")
    behavior_guideline = get_guideline("research", "research_behavior")
    assert tool_guideline, "❌ 缺少工具選擇指南"
    assert task_guideline, "❌ 缺少任務規劃指南"
    assert behavior_guideline, "❌ 缺少研究行為指南"
    print("  ✅ 研究代理指南結構完整")

    print("\n2. 檢查郵件代理指南...")
    email_guideline = get_guideline("email", "email_writing")
    assert email_guideline, "❌ 缺少郵件撰寫指南"
    print("  ✅ 郵件代理指南結構完整")

    print("\n3. 檢查行事曆代理指南...")
    calendar_guideline = get_guideline("calendar", "event_creation")
    assert calendar_guideline, "❌ 缺少事件創建指南"
    print("  ✅ 行事曆代理指南結構完整")


def main():
    """運行所有測試"""
    print("\n" + "🚀 " * 20)
    print("開始測試 Parlant SDK 整合")
    print("🚀 " * 20 + "\n")

    try:
        # 初始化 Parlant SDK
        print("初始化 Parlant SDK...")
        initialize_parlant_sync()
        print()

        test_guidelines()
        test_customer_journey()

        print("\n" + "=" * 60)
        print("✅ 所有測試通過!")
        print("=" * 60)
        print("\nParlant SDK 指南系統已成功整合,可以開始使用了!")

    except AssertionError as e:
        print(f"\n❌ 測試失敗: {e}")
        return 1
    except Exception as e:
        print(f"\n❌ 發生錯誤: {e}")
        import traceback
        traceback.print_exc()
        return 1

    return 0


if __name__ == "__main__":
    exit(main())
```
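被刪除的測試驗證了指南查詢的錯誤處理行為:存在的指南回傳內容,不存在的指南回傳空字串。這本質上是帶預設值的字典查詢;以下為示意草圖,指南內容為假設,僅示範此回傳慣例:

```python
# 以 (代理, 指南名稱) 為鍵的簡化指南表(內容為示意)
GUIDELINES = {
    ("research", "tool_selection"): "使用 query_pdf_knowledge 處理 PDF 問題…",
    ("email", "email_writing"): "郵件撰寫指南…",
}

def get_guideline(agent: str, name: str) -> str:
    # 查無指南時回傳空字串,呼叫端不需額外處理 None
    return GUIDELINES.get((agent, name), "")

print(repr(get_guideline("research", "nonexistent")))
```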
test_simple_chatbot.py
DELETED
整個檔案已刪除,刪除前內容如下:
```python
"""
Simple Chatbot 測試腳本
用於驗證聊天機器人功能是否正常
"""
import sys
import os

# 添加項目根目錄到 Python 路徑
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

from deep_agent_rag.ui.simple_chatbot_interface import chat_with_llm, get_llm_status


def test_llm_status():
    """測試 LLM 狀態檢測"""
    print("=" * 60)
    print("測試 1: LLM 狀態檢測")
    print("=" * 60)

    try:
        status = get_llm_status()
        print(f"✅ LLM 狀態: {status}")
        return True
    except Exception as e:
        print(f"❌ LLM 狀態檢測失敗: {e}")
        return False


def test_simple_chat():
    """測試基本對話功能"""
    print("\n" + "=" * 60)
    print("測試 2: 基本對話功能")
    print("=" * 60)

    try:
        # 測試對話
        history = []
        test_message = "你好!請簡單介紹你自己。"

        print(f"\n用戶: {test_message}")
        print("AI: 正在生成回應...")

        _, updated_history = chat_with_llm(
            message=test_message,
            history=history,
            system_prompt="你是一個有幫助的AI助手。請用繁體中文簡短回答。"
        )

        if updated_history:
            user_msg, bot_msg = updated_history[0]
            print(f"\nAI 回應: {bot_msg[:100]}..." if len(bot_msg) > 100 else f"\nAI 回應: {bot_msg}")
            print("\n✅ 基本對話功能測試通過")
            return True
        else:
            print("❌ 對話歷史為空")
            return False

    except Exception as e:
        print(f"❌ 基本對話功能測試失敗: {e}")
        import traceback
        traceback.print_exc()
        return False


def test_multi_turn_chat():
    """測試多輪對話"""
    print("\n" + "=" * 60)
    print("測試 3: 多輪對話功能")
    print("=" * 60)

    try:
        history = []

        # 第一輪對話
        print("\n--- 第一輪 ---")
        _, history = chat_with_llm(
            message="我叫小明",
            history=history,
            system_prompt="你是一個有幫助的AI助手。請記住用戶的信息。"
        )
        print(f"用戶: 我叫小明")
        print(f"AI: {history[-1][1][:50]}...")

        # 第二輪對話
        print("\n--- 第二輪 ---")
        _, history = chat_with_llm(
            message="我剛才告訴你我叫什麼名字?",
            history=history,
            system_prompt="你是一個有幫助的AI助手。請記住用戶的信息。"
        )
        print(f"用戶: 我剛才告訴你我叫什麼名字?")
        print(f"AI: {history[-1][1][:50]}...")

        # 檢查是否記住了名字
        if "小明" in history[-1][1]:
            print("\n✅ 多輪對話功能測試通過(AI 記住了上下文)")
            return True
        else:
            print("\n⚠️ 多輪對話功能測試部分通過(AI 可能沒有完全記住上下文)")
            return True  # 仍然算通過,因為功能本身是正常的

    except Exception as e:
        print(f"❌ 多輪對話功能測試失敗: {e}")
        import traceback
        traceback.print_exc()
        return False


def main():
    """執行所有測試"""
    print("\n")
    print("🚀 開始測試 Simple Chatbot 功能")
    print("=" * 60)

    results = []

    # 執行測試
    results.append(("LLM 狀態檢測", test_llm_status()))
    results.append(("基本對話功能", test_simple_chat()))
    results.append(("多輪對話功能", test_multi_turn_chat()))

    # 顯示結果摘要
    print("\n" + "=" * 60)
    print("測試結果摘要")
    print("=" * 60)

    passed = sum(1 for _, result in results if result)
    total = len(results)

    for test_name, result in results:
        status = "✅ 通過" if result else "❌ 失敗"
        print(f"{test_name}: {status}")

    print(f"\n總計: {passed}/{total} 測試通過")

    if passed == total:
        print("\n🎉 所有測試通過!Simple Chatbot 功能正常。")
        print("\n你可以執行以下命令啟動界面:")
        print("  python Deep_Agent_Gradio_RAG_localLLM_main.py")
        print("  或使用:uv run Deep_Agent_Gradio_RAG_localLLM_main.py")
        print("\n然後點擊「💬 Simple Chatbot」標籤頁。")
    else:
        print("\n⚠️ 部分測試失敗,請檢查錯誤訊息。")

    return passed == total


if __name__ == "__main__":
    success = main()
    sys.exit(0 if success else 1)
```
uv.lock
CHANGED
```diff
@@ -853,6 +853,7 @@ dependencies = [
     { name = "pillow" },
     { name = "pypdf" },
     { name = "python-dotenv" },
+    { name = "pyyaml" },
     { name = "rank-bm25" },
     { name = "sentence-transformers" },
     { name = "tavily-python" },
@@ -897,6 +898,7 @@ requires-dist = [
     { name = "pillow", specifier = ">=12.0.0" },
     { name = "pypdf", specifier = ">=6.4.1" },
     { name = "python-dotenv", specifier = ">=1.2.1" },
+    { name = "pyyaml", specifier = ">=6.0.0" },
     { name = "rank-bm25", specifier = ">=0.2.2" },
     { name = "sentence-transformers", specifier = ">=5.2.0" },
     { name = "tavily-python", specifier = ">=0.7.14" },
```