AI & ML interests

Benchmark datasets for evaluating language models in Portuguese and assessing their knowledge about Brazil

We currently offer one dataset for evaluating the performance of language models on Brazilian Leading Universities Entrance eXams (BLUEX).
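As a rough illustration of how a multiple-choice exam benchmark of this kind is usually scored, the sketch below computes plain accuracy against an answer key. The field names (`question`, `alternatives`, `answer`), the toy records, and the `always_b` stand-in predictor are all hypothetical and are not taken from the BLUEX dataset schema or from the papers' evaluation code.

```python
from typing import Callable, Mapping, Sequence


def multiple_choice_accuracy(
    questions: Sequence[Mapping[str, object]],
    predict: Callable[[Mapping[str, object]], str],
) -> float:
    """Return the fraction of questions whose predicted letter matches the answer key."""
    if not questions:
        raise ValueError("no questions to score")
    correct = sum(
        1
        for q in questions
        # Compare letters case-insensitively and ignore surrounding whitespace.
        if predict(q).strip().upper() == str(q["answer"]).strip().upper()
    )
    return correct / len(questions)


# Toy records with hypothetical field names (assumption: not the real BLUEX schema).
toy_questions = [
    {
        "question": "Qual é a capital do Brasil?",
        "alternatives": ["A) Rio de Janeiro", "B) Brasília", "C) Salvador"],
        "answer": "B",
    },
    {
        "question": "Quanto é 2 + 3?",
        "alternatives": ["A) 4", "B) 6", "C) 5"],
        "answer": "C",
    },
]


def always_b(question: Mapping[str, object]) -> str:
    """Stand-in for a language model: always answers 'B'."""
    return "B"


print(multiple_choice_accuracy(toy_questions, always_b))  # prints 0.5
```

In practice `predict` would wrap a call to the language model under evaluation and extract the chosen letter from its output; that extraction step is left out here because it depends on the model and prompt format.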

If you use BLUEX in your research, please cite the following papers:

Our initial paper introducing the benchmark:

@misc{almeida2023bluex,
      title={BLUEX: A benchmark based on Brazilian Leading Universities Entrance eXams}, 
      author={Thales Sales Almeida and Thiago Laitz and Giovana K. Bonás and Rodrigo Nogueira},
      year={2023},
      eprint={2307.05410},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
} 

Our most recent expansion, which adds image captions and the 2024 and 2025 exams:

@misc{santos2025bluexrevisitedenhancingbenchmark,
      title={BLUEX Revisited: Enhancing Benchmark Coverage with Automatic Captioning}, 
      author={João Guilherme Alves Santos and Giovana Kerche Bonás and Thales Sales Almeida},
      year={2025},
      eprint={2508.21294},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.21294}, 
} 
