
Hello World! I'm Sonal

sonal.sannigrahi[at]tecnico.ulisboa.pt



About


I am a third-year Ph.D. student at the Sardine Lab, advised by Dr. Andre F.T. Martins and Dr. Giuseppe Attanasio. I have also conducted research at Amazon, Apple, and INRIA. I completed my master's in Language Science and Technology at Saarland University, where I worked at the Multilingual Technologies Lab, DFKI, under the supervision of Dr. Cristina Espana-Bonet. Prior to this, I received my bachelor's as a Mathematics and Computer Science double major at École Polytechnique.

My current research interests largely revolve around learning better representations of languages in multilingual and multimodal setups. My work has focused on adding speech and visual modalities to pre-existing text-only LLMs in order to move towards the most natural way of communicating, which is through all modes. In the future, I want to create models that are robust across multiple languages and modalities, so that these work in collaboration rather than interfering with one another. Please reach out for any interesting collaborations!

I am from Mumbai, India, and I have had the good fortune to spend time in several places: France, Germany, the UK, and now Portugal! I am an avid surfer, and I enjoy spending my time travelling around the world hiking, eating, and taking photos :) Nice to e-meet you!

News


Aug. 20, 2025

SPIRE has been accepted to EMNLP 2025 (Findings)! See you in China! We also release two new works on multimodality: TowerVision and the MF2 Benchmark!

Feb. 1, 2025

Back to sunny Lisbon, and we now have a patent accepted from my work at Amazon!

Oct. 1, 2024

Started my internship at Amazon Alexa on semantically aware speech representations.

Apr. 10, 2024

First-author paper accepted at SIGIR'24 on synthetic data generation via LLMs for virtual assistants!

Oct. 1, 2023

Completed my master's, my internship at Apple, and moved to Lisbon, PT!

Apr. 10, 2023

First-author paper at EAMT, Finland with Inria Paris!

Jan. 30, 2023

Spending the summer at Apple AI/ML Research in Germany!

Jan. 22, 2023

First-author paper at EACL about document representations with the MLT team at DFKI! See you in Croatia!

Jul. 28, 2022

Awarded the ACM SIGHPC Computational Science Fellowship for 2022!

May 31, 2022

Awarded the Palantir Women in Technology Scholarship!

May 22-27, 2022

Attended my first conference, ACL! Presented my work at RepL4NLP :)

Dec. 29, 2021

Excited to be joining Bloomberg LP in the spring for their technology insights week!

Oct. 15, 2021

Started my master's degree at Saarland University.

Sep. 20, 2021

I am going to be giving a talk at PyCon ZA'21 on Self-Supervised Action Classification!

Jul. 16, 2021

I have successfully completed my bachelor's degree with honours!

May 21, 2021

Awarded a scholarship to attend the first vGHC EMEA conference!

Apr. 21, 2021

Successfully defended my thesis on cross-lingual word embeddings for low-resource machine translation, receiving the highest grade!

Oct. 20, 2020

Held my first event at Code Week EU by Code.org!

Sep. 7, 2020

Completed the Google Get Ahead programme, an invite-only development programme for students.

Jul. 20, 2020

Part of the Programme Committee at NLP Beyond Text Workshop at EMNLP'20!

May. 4, 2020

Won the Google Women Techmakers Computer Science Scholarship in EMEA!

Publications


From Tower to Spire: Adding the Speech Modality to a Translation-specialist LLM
Ambilduke, Kshitij*, Peters, Ben*, Sannigrahi, Sonal* et al.
30th Conference on Empirical Methods in Natural Language Processing, 2025

Synthetic query generation using large language models for virtual assistants
Sannigrahi, Sonal*, Fraga-Silva, Thiago, Oualil, Youssef and Van Gysel, Christophe*
47th International ACM SIGIR Conference, 2024

Investigating Lexical Sharing in Multilingual Machine Translation of Indian Languages
Sannigrahi, Sonal and Bawden, Rachel
24th Conference of the European Association for Machine Translation, 2023

Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?
Sannigrahi, Sonal, van Genabith, Josef and Espana-Bonet, Cristina
17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Isomorphic Cross-Lingual Word Embeddings for Low-Resource Languages
Sannigrahi, Sonal and Read, Jesse
7th Workshop on Representation Learning for NLP at ACL, 2022

Media and Outreach Work


In my time at university, my work has been recognised in a few different areas. Have a look!

Interviews

Women in Science Interview (2020)

I discuss my views on the position of women in STEM fields, why I think representation matters, and what I'm doing to break down language barriers for people around the world, in the hope of making tech an inclusive field!

Read Interview

Talks

PyCon ZA 2021: Moving In-Sync
PyCon FR 2019: APIs and Language Processing with Python for Twitter

Tech Outreach

My love for languages and inclusivity goes far beyond the realm of research and academic work! In my free time, I volunteer at organizations whose missions I believe in!

Code.org (2020)

Code.org allows tech enthusiasts from around the world to register as volunteers and teach basic programming skills to kids everywhere. I conducted an event as part of EU Code Week 2020, where I spoke about the different scholarship opportunities available for women in STEM fields within the EMEA region.

Raspberry Pi Translation Volunteer (2019)

I joined Raspberry Pi Translation in the summer of 2019 as a translator for the English-Hindi team. I translated and reviewed numerous projects, reaching more than 250 code clubs in India. I also organized a translation hackathon where we translated 15 projects into 7 different languages! This work was highly impactful in rural communities throughout India.