🚀 [2025-02-01]: Initial release of the FAMMA-LivePro dataset, collected from invited experts! 🌟
🚀 [2025-01-01]: Full release of the FAMMA-Basic dataset, now including answers and explanations with enhanced quality! 🌟
🔥 [2024-06-01]: Initial public release of the FAMMA benchmark (based on the FAMMA-Basic dataset), along with our paper FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering. We welcome all submissions and look forward to your participation! 😆
FAMMA is a multilingual, multimodal benchmark dataset for financial question answering.
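Model | Arithmetic: Overall (Pass@1) | Arithmetic: Easy (Pass@1) | Arithmetic: Medium (Pass@1) | Arithmetic: Hard (Pass@1) | Non-Arithmetic: Overall (Pass@1) | Non-Arithmetic: Easy (Pass@1) | Non-Arithmetic: Medium (Pass@1) | Non-Arithmetic: Hard (Pass@1) | Overall (Pass@1) |
---|---|---|---|---|---|---|---|---|---|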
Overall results of different models on the FAMMA-LivePro leaderboard. The best-performing model in each category is in bold, and the second best is underlined.
*: OCR is used to extract the image content, which is then passed to the model.
GPT o1 version: 2024-12-17, o1-mini version: 2024-09-12, 4o version: 2024-08-06
DeepSeek-R1 version: 2025-01-20, Qwen-QwQ-32B version: 2025-03-05, Qwen-VL-Max version: 2025-01-25
Claude 3.5 Sonnet version: 2024-10-22
Gemini 2.0 Flash Thinking version: exp-0120, Gemini 1.5 Pro version: 002
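Model | Arithmetic: Overall (Pass@1) | Arithmetic: Easy (Pass@1) | Arithmetic: Medium (Pass@1) | Arithmetic: Hard (Pass@1) | Non-Arithmetic: Overall (Pass@1) | Non-Arithmetic: Easy (Pass@1) | Non-Arithmetic: Medium (Pass@1) | Non-Arithmetic: Hard (Pass@1) | Overall (Pass@1) |
---|---|---|---|---|---|---|---|---|---|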
Overall results of different models on the FAMMA leaderboard. The best-performing model in each category is in bold, and the second best is underlined.
*: OCR is used to extract the image content, which is then passed to the model.
GPT o1 version: 2024-12-17, o1-mini version: 2024-09-12, 4o version: 2024-08-06
DeepSeek-R1 version: 2025-01-20, Qwen-QwQ-32B version: 2025-03-05, Qwen-VL-Max version: 2025-01-25
Claude 3.5 Sonnet version: 2024-10-22
Gemini 2.0 Flash Thinking version: exp-0120, Gemini 2.0 Flash version: exp, Gemini 2.0 Pro version: exp-0205, Gemini 1.5 Pro version: 002
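The leaderboards report Pass@1 broken down by question type (arithmetic vs. non-arithmetic) and difficulty (easy, medium, hard). As a rough illustration of how such a breakdown can be aggregated, here is a minimal Python sketch. It assumes one graded answer per question, so Pass@1 reduces to the fraction of questions answered correctly on the first attempt. The record fields `question_type`, `difficulty`, and `is_correct` are hypothetical names chosen for this example; they are not taken from the FAMMA data schema, and the benchmark's own grading procedure is not reproduced here.

```python
from collections import defaultdict


def pass_at_1(records):
    """Aggregate Pass@1 per leaderboard cell from single-attempt results.

    Each record is a dict with hypothetical keys:
      question_type: "arithmetic" or "non-arithmetic"
      difficulty:    "easy", "medium", or "hard"
      is_correct:    True if the first (only) attempt was judged correct
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        # A question counts toward its own (type, difficulty) cell,
        # the per-type "overall" column, and the grand overall column.
        cells = [
            (record["question_type"], record["difficulty"]),
            (record["question_type"], "overall"),
            ("overall", "overall"),
        ]
        for cell in cells:
            totals[cell] += 1
            correct[cell] += int(record["is_correct"])
    return {cell: correct[cell] / totals[cell] for cell in totals}


# Toy, made-up results used only to exercise the function.
records = [
    {"question_type": "arithmetic", "difficulty": "easy", "is_correct": True},
    {"question_type": "arithmetic", "difficulty": "hard", "is_correct": False},
    {"question_type": "non-arithmetic", "difficulty": "easy", "is_correct": True},
    {"question_type": "non-arithmetic", "difficulty": "easy", "is_correct": False},
]
scores = pass_at_1(records)
for cell in sorted(scores):
    print(cell, scores[cell])
```

Cells with no questions are simply absent from the returned dictionary, which avoids a division by zero; a caller that needs every leaderboard column would have to fill in the missing cells itself.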
@article{xue2024famma,
  title={FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering},
  author={Siqiao Xue and Tingting Chen and Fan Zhou and Qingyang Dai and Zhixuan Chu and Hongyuan Mei},
  journal={arXiv preprint arXiv:2410.04526},
  year={2024},
  url={https://arxiv.org/abs/2410.04526}
}