---
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen3-4B-Instruct-2507
tags:
- question-generation
- rl
- grpo
- lora
pipeline_tag: text-generation
---

# qwen3-4b-question-gen

Fine-tuned model for generating technical screening questions, trained using GRPO (Group Relative Policy Optimization) with LoRA adapters.

## Base Model

- **Base**: [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
- **Training**: LoRA fine-tuning with RL (GRPO algorithm)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ash256/qwen3-4b-question-gen")
tokenizer = AutoTokenizer.from_pretrained("ash256/qwen3-4b-question-gen")

prompt = "Generate a technical screening question for a senior backend engineer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Or with vLLM for faster inference:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="ash256/qwen3-4b-question-gen")
outputs = llm.generate(
    ["Generate a technical screening question for a senior backend engineer:"],
    SamplingParams(max_tokens=256),
)
print(outputs[0].outputs[0].text)
```
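## About GRPO

The actual training code and reward function for this model are not included here. As background, GRPO works by sampling a *group* of completions per prompt, scoring each with a reward function, and normalizing every reward against its group's mean and standard deviation to get a relative advantage. A minimal, illustrative sketch of that normalization step (not the training code used for this model):

```python
from statistics import mean, stdev

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each completion's reward
    against the mean and standard deviation of its sampling group."""
    mu = mean(rewards)
    sigma = stdev(rewards)
    if sigma == 0:
        # All completions scored equally: no learning signal for this group.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Example: three sampled questions scored 1.0, 2.0, 3.0 by the reward model.
print(group_relative_advantages([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

Because advantages are computed relative to the group rather than a learned value baseline, GRPO avoids training a separate critic model, which is what makes it attractive for LoRA-based RL fine-tuning on modest hardware.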