🏗️ Building on HF

Ali El Filali PRO

alielfilali01

in-ternal

AI & ML interests

AI psychometrician | NLP (mainly for Arabic) | Interests include Reinforcement Learning and Cognitive sciences among others

Recent Activity

liked a model 15 days ago

ibm-research/granite-4.0-h-3b-ar

updated a dataset about 1 month ago

OALL/requests_v2

posted an update about 1 month ago

Plans in HTML > Plans in Markdown

upvoted an article 2 months ago

Article

🪢 Langfuse and 🤗 Hugging Face: 5 Ways to use them Together

MJannik

•

Mar 14, 2025

• 14

upvoted 2 articles 3 months ago

Article

QIMMA قِمّة ⛰: A Quality-First Arabic LLM Leaderboard

tiiuae

•

Apr 21

• 13

Article

Stop benchmarking inference providers

SaylorTwift

•

Apr 14

• 8

upvoted a collection 3 months ago

Gemma 4

15 items • Updated 30 days ago • 1.02k

upvoted an article 3 months ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift

•

Apr 2

• 914

upvoted a paper 5 months ago

ResearchGym: Evaluating Language Model Agents on Real-World AI Research

Paper • 2602.15112 • Published Feb 16 • 21

upvoted 5 articles 5 months ago

Article

OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments

christian-washington, ajasuja, santosh-iima, lewtun, burtenshaw

•

Feb 12

• 35

Article

Custom Kernels for All from Codex and Claude

burtenshaw, sayakpaul, ariG23498, evalstate

•

Feb 13

• 80

Article

We Got Claude to Build CUDA Kernels and teach open models!

burtenshaw, evalstate, merve, pcuenq

•

Jan 28

• 158

Article

Community Evals: Because we're done trusting black-box leaderboards over the community

burtenshaw, SaylorTwift, kramp, merve, davanstrien, nielsr, julien-c

•

Feb 4

• 90

Article

Alyah ⭐️: Toward Robust Evaluation of Emirati Dialect Capabilities in Arabic LLMs

tiiuae

•

Jan 27

• 26

upvoted 4 articles 6 months ago

Article

AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality

ibm-research

•

Jan 21

• 33

Article

Open Responses: What you need to know

evalstate, burtenshaw, merve, pcuenq

•

Jan 15

• 112

Article

NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI

nvidia

•

Jan 5

• 64

Article

New in llama.cpp: Anthropic Messages API

ggml-org

•

Jan 19

• 45

upvoted a collection 6 months ago

AgriLLM

A collection of the artifacts for the AgriLLM initiative. • 5 items • Updated Dec 15, 2025 • 6

upvoted 2 articles 6 months ago

Article

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

nvidia

•

Dec 17, 2025

• 50

Article

Introducing Falcon-H1-Arabic: Pushing the Boundaries of Arabic Language AI with Hybrid Architecture

tiiuae

•

Jan 5

• 43

upvoted a collection 7 months ago

Dalla models

Dalla is a family of Arabic language models optimized for Arabic text processing through advanced tokenization techniques. • 4 items • Updated Dec 16, 2025 • 3

upvoted a paper 7 months ago

Can a Multichoice Dataset be Repurposed for Extractive Question Answering?

Paper • 2404.17342 • Published Apr 26, 2024 • 2