HybriKo: Korean Hybrid Language Model

A Korean hybrid language model based on the Griffin architecture, which combines RNN and attention mechanisms.

Model Details

Parameters: 117.8M
Architecture: 2:1 RNN-to-Attention ratio (Griffin-based)
Context length: 1024 tokens
Vocabulary size: 32,000 (SentencePiece)
Training data: Korean Wikipedia

Training Results (Exp3)

Phase	Steps	Loss	PPL
Phase 1	0-10K	1.80	~6.0
Phase 2	10K-30K	1.60	~4.95
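
The PPL column is consistent with perplexity being the exponential of the loss. The model card does not say how the loss is defined, so the sketch below assumes it is the mean per-token cross-entropy in nats; the helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Perplexity as exp(loss); assumes the loss is mean per-token cross-entropy in nats."""
    return math.exp(mean_cross_entropy_nats)


# Loss values taken from the table above.
for phase, loss in [("Phase 1", 1.80), ("Phase 2", 1.60)]:
    print(f"{phase}: loss={loss:.2f} -> PPL={perplexity(loss):.2f}")
```

Under that assumption the computed values round to roughly 6.05 and 4.95, in line with the approximate figures in the table.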

Architecture

HybriKo (117.8M params)
├── Embedding (32000 → 768)
├── Layers (12x)
│   ├── Layer 1,2: GriffinBlock (RNN)
│   ├── Layer 3: AttentionBlock
│   └── (pattern repeats)
└── LM Head (weight-tied)
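
The repeating layout in the tree can be sketched as follows. This is only an illustration of the 2:1 pattern across 12 layers, not the project's actual construction code; the function name `layer_pattern` is hypothetical, and the embedding count simply multiplies the two sizes shown in the tree.

```python
def layer_pattern(num_layers: int = 12, rnn_per_attention: int = 2) -> list[str]:
    """Block type per layer: `rnn_per_attention` recurrent blocks, then one attention block, repeated."""
    period = rnn_per_attention + 1
    return [
        "AttentionBlock" if (i + 1) % period == 0 else "GriffinBlock"
        for i in range(num_layers)
    ]


pattern = layer_pattern()
for idx, name in enumerate(pattern, start=1):
    print(f"Layer {idx:2d}: {name}")

# Embedding table size from the tree above (32000 x 768). With a weight-tied
# LM head, the output projection reuses this matrix and adds no parameters.
embedding_params = 32_000 * 768
print(f"Embedding parameters: {embedding_params:,}")
```

With these defaults, layers 3, 6, 9, and 12 are attention blocks and the remaining eight are recurrent blocks.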

Key features:

RGLRU: Real-Gated Linear Recurrent Unit (RG-LRU), the recurrent layer from Griffin
GQA: Grouped Query Attention (1:4 KV reduction)
Flash Attention 2: optimized attention computation
GeGLU: gated activation in the FFN
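
To make the GQA entry concrete, here is a minimal sketch of how query heads can share key/value heads. It reads "1:4 KV reduction" as four query heads per KV head, which is an interpretation. The head layout (12 query heads of size 64) is an assumption, since the model card gives only the hidden size of 768, and both function names are hypothetical.

```python
def kv_head_for_query_head(query_head: int, num_query_heads: int, num_kv_heads: int) -> int:
    """Map a query head index to the KV head it shares under grouped-query attention."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be divisible by num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def kv_cache_values(num_kv_heads: int, head_dim: int, seq_len: int) -> int:
    """Number of cached values (keys plus values) for one attention layer."""
    return 2 * num_kv_heads * head_dim * seq_len


# Assumed head layout (not stated in the model card): 12 query heads of size 64.
NUM_QUERY_HEADS = 12
HEAD_DIM = 768 // NUM_QUERY_HEADS
NUM_KV_HEADS = NUM_QUERY_HEADS // 4  # 1:4 KV reduction -> 3 KV heads

mapping = [
    kv_head_for_query_head(q, NUM_QUERY_HEADS, NUM_KV_HEADS)
    for q in range(NUM_QUERY_HEADS)
]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

full = kv_cache_values(NUM_QUERY_HEADS, HEAD_DIM, seq_len=1024)
grouped = kv_cache_values(NUM_KV_HEADS, HEAD_DIM, seq_len=1024)
print(full // grouped)  # 4
```

Under these assumptions, the KV cache of each attention layer is a quarter of the size that full multi-head attention would need at the 1024-token context length.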

Quick Start (Google Colab)

import torch
from hybridko.model import HybriKoModel, HybriKoConfig
from hybridko.data import load_tokenizer
from hybridko.inference import generate_with_cache

# Load the model configuration and weights
config = HybriKoConfig.from_yaml("config.yaml")
model = HybriKoModel(config)
model.load_state_dict(torch.load("pytorch_model.pt", map_location="cpu"))
model.eval()  # switch to inference mode

# Load the SentencePiece tokenizer
tokenizer = load_tokenizer("HybriKo_tok.model")

# Generate text (the prompt means "The capital of Korea is")
output = generate_with_cache(model, tokenizer, "한국의 수도는", max_tokens=50)
print(output)

Testing Multiple Prompts

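import sentencepiece as spm

# Assumes `torch` and `model` from the quick start above
sp = spm.SentencePieceProcessor(model_file="HybriKo_tok.model")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

prompts = ["한국어", "대한민국", "서울", "인공지능", "오늘 날씨가"]

for prompt in prompts:
    # Token id 2 is prepended, presumably as the BOS token
    input_ids = torch.tensor([[2] + sp.EncodeAsIds(prompt)]).to(device)
    output = model.generate(input_ids, max_new_tokens=30, temperature=0.8, top_k=50)
    generated = sp.DecodeIds(output[0].tolist())
    print(f"📝 {prompt}")
    print(f"   → {generated}")
    print("-" * 50)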
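
The `temperature` and `top_k` arguments in the call above usually refer to temperature-scaled top-k sampling. The sketch below shows that general technique on toy logits in plain Python. Whether `model.generate` implements it exactly this way is not confirmed by the model card, and `sample_top_k` is a hypothetical helper.

```python
import math
import random


def sample_top_k(logits, temperature=0.8, top_k=50, rng=None):
    """Sample one token id from the top_k highest logits after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()
    k = min(top_k, len(logits))
    # Keep only the k token ids with the largest logits.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    scaled = [logits[i] / temperature for i in top_ids]
    peak = max(scaled)
    # Unnormalized softmax weights (shifted by the max for numerical stability).
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top_ids, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)
samples = [sample_top_k(toy_logits, temperature=0.8, top_k=2, rng=rng) for _ in range(1000)]
print({tok: samples.count(tok) for tok in sorted(set(samples))})
```

With `top_k=2`, only token ids 0 and 1 can ever be drawn, and id 0 is drawn more often because it has the larger logit.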

Files

pytorch_model.pt: model weights (450MB)
config.yaml: model configuration
HybriKo_tok.model: SentencePiece tokenizer
HybriKo_tok.vocab: tokenizer vocabulary
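
As a rough sanity check on the listed weight file size, the arithmetic below estimates the checkpoint size from the parameter count. It assumes the weights are stored as 32-bit floats with negligible file overhead, which the model card does not state; `checkpoint_size_mib` is a hypothetical helper.

```python
def checkpoint_size_mib(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate checkpoint size in MiB, ignoring file-format overhead."""
    return num_params * bytes_per_param / 2**20


NUM_PARAMS = 117_800_000  # 117.8M parameters, as listed under model details

for dtype_name, nbytes in [("float32", 4), ("float16/bfloat16", 2)]:
    size = checkpoint_size_mib(NUM_PARAMS, nbytes)
    print(f"{dtype_name}: ~{size:.0f} MiB")
```

The float32 estimate comes out at about 449 MiB, close to the 450MB listed for `pytorch_model.pt`, so the checkpoint is plausibly stored in full precision.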

Citation

@misc{hybridko2026,
  title={HybriKo: Korean Hybrid Language Model},
  year={2026},
  url={https://huggingface.co/gyunggyung/HybriKo-117M}
}

License

Apache 2.0
