---
title: CognitiveKernel-Launchpad
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.44.1
app_file: app.py
pinned: false
license: mit
hf_oauth: true
hf_oauth_expiration_minutes: 480
---
# 🧠 CognitiveKernel-Launchpad — Hugging Face Space

This Space hosts a Gradio UI for CognitiveKernel-Launchpad and is tailored for Hugging Face Spaces.

- Original project (full source & docs): https://github.com/charSLee013/CognitiveKernel-Launchpad
- Access: Sign in with Hugging Face is required (OAuth enabled via metadata above).

## πŸ” Access Control
Only authenticated users can use this Space. Optionally restrict to org members by adding to the metadata:

```
hf_oauth_authorized_org: YOUR_ORG_NAME
```

## 🚀 How to Use (in this Space)
1) Click "Sign in with Hugging Face".
2) Ensure API secrets are set in Space → Settings → Secrets.
3) Ask a question in the input box and submit.

## 🔧 Required Secrets (Space Settings → Secrets)
- OPENAI_API_KEY: your provider key
- OPENAI_API_BASE: e.g., https://api-inference.modelscope.cn/v1/chat/completions
- OPENAI_API_MODEL: e.g., Qwen/Qwen3-235B-A22B-Instruct-2507

Optional:
- SEARCH_BACKEND: duckduckgo | google (default: duckduckgo)
- WEB_AGENT_MODEL / WEB_MULTIMODAL_MODEL: override web models
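
For reference, a startup check of these variables can be sketched in Python. This is a minimal illustration, assuming nothing about the repo's actual code: `load_settings` is a hypothetical helper, and `app.py` may validate secrets differently. The `SEARCH_BACKEND` default follows the list above.

```python
import os

# Illustrative startup check (load_settings is hypothetical, not repo code):
# the three OPENAI_* secrets are required; SEARCH_BACKEND defaults to
# "duckduckgo" as documented above.
def load_settings(env=os.environ):
    required = ["OPENAI_API_KEY", "OPENAI_API_BASE", "OPENAI_API_MODEL"]
    missing = [name for name in required if not env.get(name)]
    settings = {name: env.get(name) for name in required}
    settings["SEARCH_BACKEND"] = env.get("SEARCH_BACKEND", "duckduckgo")
    return settings, missing

settings, missing = load_settings({
    "OPENAI_API_KEY": "sk-test",
    "OPENAI_API_BASE": "https://api.openai.com/v1",
    "OPENAI_API_MODEL": "gpt-4o-mini",
})
```

Failing fast on `missing` at startup gives a clearer error than a mid-run API failure.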

## 🖥️ Runtime Notes
- CPU is fine; GPU optional.
- Playwright browsers are prepared automatically at startup.
- To persist files/logs, enable Persistent Storage (uses /data).

---


# 🧠 CognitiveKernel-Launchpad — Open Framework for Deep Research Agents & Agent Foundation Models

> 🎓 **Academic Research & Educational Use Only** — No Commercial Use
> 📄 [Paper (arXiv:2508.00414)](https://arxiv.org/abs/2508.00414) | 🇨🇳 [中文文档 (Chinese docs)](README_zh.md) | 📜 [LICENSE](LICENSE.txt)

[![Python 3.10+](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![arXiv](https://img.shields.io/badge/arXiv-2508.00414-b31b1b.svg)](https://arxiv.org/abs/2508.00414)

---

## 🌟 Why CognitiveKernel-Launchpad?

This research-only fork of Tencent's original CognitiveKernel-Pro is purpose-built for inference-time use. It strips out the complex training/SFT and heavy testing pipelines, leaving a clean reasoning runtime that is easy to deploy for distributed inference, and adds a lightweight Gradio web UI for convenient use.

---

## 🚀 Quick Start

### 1. Install (No GPU Required)

```bash
git clone https://github.com/charSLee013/CognitiveKernel-Launchpad.git
cd CognitiveKernel-Launchpad
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

### 2. Set Environment (Minimal Setup)

```bash
export OPENAI_API_KEY="sk-..."
export OPENAI_API_BASE="https://api.openai.com/v1"
export OPENAI_API_MODEL="gpt-4o-mini"
```

### 3. Run a Single Question

```bash
python -m ck_pro "What is the capital of France?"
```

✅ That's it! You're running a deep research agent.

---

## 🛠️ Core Features

### 🖥️ CLI Interface
```bash
python -m ck_pro \
  --config config.toml \
  --input questions.txt \
  --output answers.txt \
  --interactive \
  --verbose
```

| Flag          | Description                          |
|---------------|--------------------------------------|
| `-c, --config`| TOML config path (optional)          |
| `-i, --input` | Batch input file (one Q per line)    |
| `-o, --output`| Output answers to file               |
| `--interactive`| Start interactive Q&A session       |
| `-v, --verbose`| Show reasoning steps & timing       |

---

### ⚙️ Configuration (config.toml)

> Precedence: `TOML > env vars > defaults`

Use the examples in this repo:
- Minimal config: [config.minimal.toml](config.minimal.toml) — details in [CONFIG_EXAMPLES.md](CONFIG_EXAMPLES.md)
- Comprehensive config: [config.comprehensive.toml](config.comprehensive.toml) — full explanation in [CONFIG_EXAMPLES.md](CONFIG_EXAMPLES.md)
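
The precedence rule above can be sketched in Python. This illustrates only the lookup order; `resolve` is a hypothetical helper, and the repo's actual config loader will differ in detail.

```python
# Lookup order from the note above: TOML overrides env vars, which
# override built-in defaults. `resolve` is illustrative, not repo code.
def resolve(key, toml_cfg, env, defaults):
    if key in toml_cfg:
        return toml_cfg[key]
    if key in env:
        return env[key]
    return defaults.get(key)

defaults = {"model": "gpt-4o-mini", "temperature": 0.6}
env = {"model": "env-model"}
toml_cfg = {"temperature": 0.2}

model = resolve("model", toml_cfg, env, defaults)              # from env
temperature = resolve("temperature", toml_cfg, env, defaults)  # from TOML
```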

#### 🚀 Recommended Configuration

The configuration below is recommended as a solid starting point:

```toml
# Core Agent Configuration
[ck.model]
call_target = "https://api-inference.modelscope.cn/v1/chat/completions"
api_key = "your-modelscope-api-key-here"  # Replace with your actual key
model = "Qwen/Qwen3-235B-A22B-Instruct-2507"

[ck.model.extract_body]
temperature = 0.6
max_tokens = 8192

# Web Agent Configuration (for web browsing tasks)
[web]
max_steps = 20
use_multimodal = "auto"  # Automatically use multimodal when needed

[web.model]
call_target = "https://api-inference.modelscope.cn/v1/chat/completions"
api_key = "your-modelscope-api-key-here"  # Replace with your actual key
model = "moonshotai/Kimi-K2-Instruct"
request_timeout = 600
max_retry_times = 5
max_token_num = 8192

[web.model.extract_body]
temperature = 0.0
top_p = 0.95
max_tokens = 8192

# Multimodal Web Agent (for visual tasks)
[web.model_multimodal]
call_target = "https://api-inference.modelscope.cn/v1/chat/completions"
api_key = "your-modelscope-api-key-here"  # Replace with your actual key
model = "Qwen/Qwen2.5-VL-72B-Instruct"
request_timeout = 600
max_retry_times = 5
max_token_num = 8192

[web.model_multimodal.extract_body]
temperature = 0.0
top_p = 0.95
max_tokens = 8192

# Search Configuration
[search]
backend = "duckduckgo"  # Recommended: reliable and no API key required
```

#### 🔑 API Key Setup

1. **Get ModelScope API Key**: Visit [ModelScope](https://www.modelscope.cn/) to obtain your API key
2. **Replace placeholders**: Update all `your-modelscope-api-key-here` with your actual API key
3. **Alternative**: Use environment variables:
   ```bash
   export OPENAI_API_KEY="your-actual-key"
   ```

#### 📋 Model Selection Rationale

- **Main Agent**: `Qwen3-235B-A22B-Instruct-2507` - Latest high-performance reasoning model
- **Web Agent**: `Kimi-K2-Instruct` - Optimized for web interaction tasks
- **Multimodal**: `Qwen2.5-VL-72B-Instruct` - Advanced vision-language capabilities

For all other options, see [CONFIG_EXAMPLES.md](CONFIG_EXAMPLES.md).

---

### 📊 GAIA Benchmark Evaluation

Evaluate your agent on the GAIA benchmark:

```bash
python -m gaia.cli.simple_validate \
  --data gaia_val.jsonl \
  --level all \
  --count 10 \
  --output results.jsonl
```

→ Outputs a detailed performance summary and per-task results.
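
The per-task results can then be aggregated with a few lines of Python. The record fields below (`task_id`, `level`, `correct`) are assumptions made for illustration; check the actual schema of the `results.jsonl` your run produces.

```python
import json

# Three fabricated JSONL result lines; the field names are assumptions
# about the results.jsonl schema, not the verified output format.
lines = [
    '{"task_id": "t1", "level": 1, "correct": true}',
    '{"task_id": "t2", "level": 1, "correct": false}',
    '{"task_id": "t3", "level": 2, "correct": true}',
]

records = [json.loads(line) for line in lines]
total = len(records)
solved = sum(r["correct"] for r in records)  # True counts as 1
accuracy = solved / total
```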

---

### 🌐 Gradio Web UI

Launch a user-friendly web interface:

```bash
python -m ck_pro.gradio_app --host 0.0.0.0 --port 7860
```

→ Open `http://localhost:7860` in your browser.


Note: install the Playwright browsers before running browsing tasks (or when you hit related errors): `python -m playwright install` (on Linux you may also need `python -m playwright install-deps`).

---

### 📂 Logging

- Console: `INFO` level by default
- Session logs: `logs/ck_session_*.log`
- Configurable via `[logging]` section in TOML
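
As a rough sketch of the setup described above (logger and handler names here are illustrative assumptions, not the repo's actual wiring), the split between INFO-level console output and a more detailed session log looks like:

```python
import logging

# Illustrative wiring: INFO and above reach the console, while the logger
# itself stays at DEBUG so a file handler (e.g. logs/ck_session_*.log)
# could capture more detail. Names here are assumptions, not repo code.
logger = logging.getLogger("ck_session")
logger.setLevel(logging.DEBUG)

console = logging.StreamHandler()
console.setLevel(logging.INFO)
console.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(console)

logger.info("session started")   # shown on the console
logger.debug("verbose detail")   # suppressed by the console handler
```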

---

## 🧩 Architecture Highlights

- **Modular Design**: Web, File, Code, Reasoning modules
- **Fallback Mechanism**: HTTP API → Playwright browser automation
- **Reflection & Voting**: Novel test-time strategies for improved accuracy
- **Extensible**: Easy to plug in new models, tools, or datasets
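
The HTTP-to-Playwright fallback is, at heart, a try/except pattern. A hedged sketch with plain callables standing in for the repo's real HTTP and browser layers (`fetch_page` and both helpers are hypothetical):

```python
# Sketch of "HTTP API first, browser automation as fallback", with
# injected callables in place of the repo's real modules.
def fetch_page(url, http_fetch, browser_fetch):
    try:
        return http_fetch(url)      # cheap path: plain HTTP request
    except Exception:
        return browser_fetch(url)   # fallback: full browser rendering

def flaky_http(url):
    raise ConnectionError("blocked by the site")

def browser_like(url):
    return f"<html>rendered {url}</html>"

page = fetch_page("https://example.com", flaky_http, browser_like)
```

Injecting the two fetchers keeps the policy testable without a network or a browser.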

---

## 📜 License & Attribution

This is a research-only fork of **Tencent's CognitiveKernel-Pro**.
🔗 Original: https://github.com/Tencent/CognitiveKernel-Pro

> ⚠️ **Strictly for academic research and educational purposes. Commercial use is prohibited.**
> See `LICENSE.txt` for full terms.