#HackathonSomosNLP 2026

We are going to drive the creation of language models aligned with the culture of LATAM and the Iberian Peninsula.


There are 600M Spanish speakers and 265M Portuguese speakers in the world. Spanish and Portuguese are the main languages in 29 countries, each of them rich in culture. Although language models show ever-growing multilingual capabilities, are they truly multicultural? Join the #HackathonSomosNLP now, the largest open-source Natural Language Processing hackathon in Spanish and Portuguese 🚀



📊 We’re launching the fifth edition!

Since 2022, we have brought together…

  • 4 editions
  • 1500+ participants
  • 30 countries
  • 100+ projects
  • 60 events

In this fifth edition we will focus on creating resources that let us evaluate and improve the cultural adequacy of large language models for each of the LATAM and Iberian Peninsula countries.

The best part? EVERYONE can contribute! 🎉



🚀 How to participate

📚

Send questions about your culture to LLMs

Ask LLMs questions and choose which answers align best with your culture. Open to everyone!

💻

Build a language model

Develop an LLM aligned with your culture. Teams of 1–5 people: generate a dataset, align a model and build a demo.

By participating you will have the opportunity to:

  • ✨ Learn through live workshops and talks
  • ✨ Access hundreds of USD in GPU and API credits to build your project
  • ✨ Win prizes worth 1500, 1000 or 500 USD (1st, 2nd and 3rd place)
  • ✨ Earn conference tickets and nominations to the Nova talent network
  • ✨ Get mentored by leading people in the NLP field
  • ✨ Co-author papers at international NLP conferences
  • ✨ Get a participation (or winning team) certificate for the hackathon

Let’s go for it!

Have questions? Check the FAQ and contact info at the bottom of the page.


🚀 Other ways to support us

Help us organize this free, non-profit event!

📣

Spread the word

Help us reach more people with this initiative. After 4 posts, we add your logo to the website.

🤗

Join the team

Collaborate by creating content, support resources, tutorials, articles, or by researching Cultural NLP.

🧑‍🏫

Offer a mentorship

Share your experience by helping teams build quality datasets and train good LLMs. Mentorships can be one-off or ongoing.

🙌

Sponsor the event

Support our mission through visibility, vouchers or donations. SomosNLP is a non-profit community.


🏆 Success stories

Hackathon projects generate real-world impact:

2022 · 1st Prize

🏅 BiomedIA

Voice-to-voice biomedical Q&A system. It led to a paper at NAACL 2022 that won the Best Poster Presentation Award.

2022 · 2nd Prize

⚖️ Mexican Legal Model

Legal knowledge model used by the Supreme Court of Justice of Mexico.

2024 · 1st Prize

📰 NoticIA

Corpus of 850 Spanish clickbait news articles with high-quality summaries, tackling digital misinformation. Published at SEPLN 2024.

2024 · 2nd Prize

🤝 AsistenciaRefugiados

Legal assistant for refugees, making legal information in Spain more accessible.

2024 · 1st Prize

🤝 Sustainable BERT

Identification of texts related to climate change and sustainability using pre-trained language models in Spanish. Presented at the LatinX in AI (LXAI) Research Workshop @NAACL 2024. Best paper at KHIPU 2025.

2024 · 1st Prize

🤝 Healthy Cooking

Learning to cook in healthy ways with Large Language Models, Supervised Fine-Tuning and Retrieval Augmented Generation. Presented at the LatinX in AI (LXAI) Research Workshop @NAACL 2024.

2024 · Collective achievement

📚 Instruction dataset

More than 1M instructions generated, creating the largest supervised training dataset in Spanish. The #Somos600M paper was published at the LatinX in NLP workshop @NAACL 2024, and the project was featured in an interview in the newspaper El País.

2025 · Collective achievement

📚 INCLUDE: cultural knowledge benchmark

More than 38,000 exam questions were collected from 23 countries, creating the largest cultural-knowledge evaluation benchmark for LLMs in Spanish and Portuguese.


💡 Talks and mentorships

You will have the chance to learn from leaders in academia and industry — we will keep announcing new talks and mentorships!


👏 Acknowledgments

Thank you so much for your time and for helping our initiative reach further. Let’s make language models more inclusive!

🚀 Organized by

SomosNLP, UNED

🥇 Gold Sponsors

NextGenerationEU, SEDIA, redes, PERTE, UNED
Hugging Face

🥈 Silver Sponsors

Universidad Politécnica de Madrid, CENIA

❓ Frequently asked questions

Why should I participate?

By joining this hackathon you will have the chance to:

  • ✅ Understand how large language models work, both text (LLMs) and multimodal (VLLMs), and discover the challenges of each development stage: corpus creation, training, alignment and evaluation
  • ✅ Take part in building the first high-quality, diverse preference corpus to align LLMs with the cultures of LATAM and the Iberian Peninsula (great experience and great for your CV)
  • ✅ Be part of the team that creates some of the datasets for the first open leaderboard of LLMs in Spanish: La Leaderboard
  • ✅ Get all your NLP questions answered during “Ask Me Anything” mentoring sessions
  • ✅ Get support to present your work in a paper
  • ✅ Win prizes to keep growing as a professional and earn a certificate you can share on LinkedIn
  • ✅ Join the largest community of Spanish speakers studying, working and researching in NLP

What level is required?

The SomosNLP team encourages you to participate regardless of your current knowledge. In previous editions we have had groups from research institutes and undergraduate student groups — every project counts!

  • 📖 We will run a series of hands-on workshops showing you how to build a project so you have a reference example.
  • ❓ We will organize AMAs (Ask Me Anything) with experts and mentors so they can answer your questions.

What determines the complexity of the projects?

We will provide an example of how to create a dataset, train a model and build a demo. It’s up to you and your team to decide how much to research and work to improve on the baseline. The difficulty also depends on the use case, the origin of the data, how much time you spend curating it, the training technique, how many iterations you do and how polished you want your demo to be. You are free to choose everything!

Do we really need 4 weeks?

No. It depends on your availability; you can develop a good project in a week. We know people have studies and jobs, so we leave more time than strictly necessary so everyone can participate. We also want to give you extra time to enjoy the live talks and mentorships held during the hackathon.

Until when can I create a team?

UPDATED: We welcome new teams until May 23. The final project submission day is May 31.

How do I join a team?

Read the “To create a team:” section at the start of this page and the README in the #encuentra-equipo channel of our Discord server :)

Can teams be just 1 person?

Yes, we accept teams of 1 to 5 people.

How do you recommend organizing ourselves?

  • Use your project’s Discord channel to communicate and get organized.
  • Since this is an international hackathon, we recommend async communication or splitting the work and holding smaller meetings
  • Schedule meetings or chat spontaneously using the new voice channels in the “MEETING ROOMS” category on Discord
  • Pin the important messages in your project channel, e.g. task allocation, next meeting date, etc. To pin a message click the three dots and select “Pin message”
  • For extra clarity, you can also create a shared document with the team where you write down the project goal, split tasks and so on (and pin the link in the chat)

I don't understand Discord. Which are the most important channels?

  • Check the #anuncios channel. We recommend enabling notifications; we post 2–3 times a week
  • Ask your questions in the Discord #pide-ayuda channel so everyone can benefit from the answer
  • Events are announced in #eventos and added to the Google Calendar

How can I find out about the events?

How can I give feedback about the event?

  • You can give us feedback to improve the challenge guides via this form (anonymous)
  • We will also share a general feedback form at the end of the event

If we pointed you to information on this page and you can’t find it, clear your cookies and reload the page.


🤗 Connect!

To keep up with all the events and updates: