About
I am a third-year Ph.D. student at the Sardine Lab, advised by Dr. Andre F.T. Martins and Dr. Giuseppe Attanasio. I completed my master's in Language Science and Technology at Saarland University, where I worked at the Multilingual Technologies Lab, DFKI, under the supervision of Dr. Cristina Espana-Bonet. Prior to this, I received my bachelor's as a Mathematics and Computer Science double major at École Polytechnique.
My current research interests center on learning better representations of languages in multilingual and multimodal settings. My work has focused on adding speech and visual modalities to pre-existing text-only LLMs, moving towards the most natural way of communicating, which is through all modes. In the future, I want to create models that are robust across multiple languages and modalities, so that these work in collaboration rather than interfering with one another. Please reach out for any interesting collaborations!
News
- Aug. 20, 2025 SPIRE has been accepted to EMNLP 2025 (Findings)! See you in China! We also release two new works on multimodality: TowerVision and the MF2 Benchmark!
- Feb. 1, 2025 Back to sunny Lisbon, and we now have a patent accepted from my work at Amazon!
- Oct. 1, 2024 Started my internship at Amazon Alexa on semantically aware speech representations.
- Apr. 10, 2024 First author paper accepted at SIGIR'24 on synthetic data generation via LLMs for virtual assistants!
- Oct. 1, 2023 Completed my master's, my internship at Apple, and moved to Lisbon, PT!
- Apr. 10, 2023 First author paper at EAMT, Finland with Inria Paris!
- Jan. 30, 2023 Spending the summer at Apple AI/ML Research in Germany!
- Jan. 22, 2023 First author paper in EACL about document representations with the MLT team at DFKI! See you in Croatia!
- Jul. 28, 2022 Awarded the ACM SIGHPC Computational Science Fellowship for 2022! Immensely thankful for all the support!
- May 31, 2022 Awarded the Palantir Women in Technology Scholarship! So grateful <3
- May 22-27, 2022 Attended my first conference, ACL! Yay! Presented my work at RepL4NLP :)
- Dec. 29, 2021 Excited to be joining Bloomberg LP in the spring for their technology insights week!
- Oct. 15, 2021 Started my master's degree at Saarland Uni
- Sep. 20, 2021 I am going to be giving a talk at PyCon ZA'21 on Self-Supervised Action Classification!
- Jul. 16, 2021 I have successfully completed my bachelor's degree with honours!
- May 21, 2021 I have been awarded a scholarship to attend the first vGHC EMEA conference, bringing together women technologists from all over the world!
- Apr. 21, 2021 I successfully defended my thesis on cross-lingual word embeddings for low-resource machine translation, receiving the highest grade!
- Oct. 20, 2020 I held my first event at Code Week EU by Code.org!
- Sep. 7, 2020 I completed the Google Get Ahead programme, an invite-only development programme for students.
- Jul. 20, 2020 I was part of the Programme Committee of the NLP Beyond Text Workshop at EMNLP'20!
- May 4, 2020 I won the Google WomenTechmakers Computer Science Scholarship in EMEA!
Publications
From Tower to Spire: Adding the Speech Modality to a Translation-specialist LLM Ambilduke, Kshitij*, Peters, Ben*, Sannigrahi, Sonal* et al. 30th Conference on Empirical Methods in Natural Language Processing, 2025
Synthetic query generation using large language models for virtual assistants Sannigrahi, Sonal*, Thiago Fraga-Silva, Youssef Oualil, Christophe Van Gysel* 47th International ACM SIGIR Conference, 2024
Investigating Lexical Sharing in Multilingual Machine Translation of Indian Languages Sannigrahi, Sonal and Bawden, Rachel 24th Conference of the European Association for Machine Translation, 2023
Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings? Sannigrahi, Sonal and van Genabith, Josef and Espana-Bonet, Cristina 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023
Isomorphic Cross-Lingual Word Embeddings for Low-Resource Languages Sannigrahi, Sonal and Read, Jesse 7th Workshop on Representation Learning for NLP at the 60th Annual Meeting of the Association for Computational Linguistics, 2022
Media and Outreach Work
In my time at university, my work has been recognised in a few different areas. Have a look!
Interviews
- 2020 Women in Science Interview
I talk about the position of women in STEM fields, why I think representation matters, and what I'm doing to break down language barriers for people around the world, with hopes of making tech an inclusive field!
Talks
- 2021 PyCon ZA: Moving In-Sync
- 2019 PyCon FR: APIs and Language Processing with Python for Twitter
Tech Outreach
My love for languages and inclusivity goes far beyond the realms of research and academic work! In my free time, I volunteer at a bunch of places whose missions I believe in! I am a proud contributor to all of these wonderful organisations :)
- 2020 Code.org
Code.org allows tech enthusiasts from around the world to register as volunteers and teach basic programming skills to kids everywhere. One event I conducted was part of EU Code Week 2020, where I spoke about different scholarship opportunities available for women in STEM fields within the EMEA region.
- 2019 RaspberryPi Translation Volunteer
I joined RaspberryPi Translation in the summer of 2019 as a translator for the English-Hindi team. Here, I translated and reviewed numerous projects, reaching more than 250 code clubs in India. I also successfully organised a translation hackathon where we translated 15 projects into 7 different languages! This work was highly impactful in rural communities throughout India. I would strongly recommend joining the translation team, as they are super welcoming to new volunteers and the work is extremely rewarding!
Website built from Marcos Horro's template
Last updated Oct. 30, 2025