MONROVIA, Calif., March 19, 2025 /PRNewswire/ -- Nexdata, a leading global provider of AI data services, today announces the start of the Multilingual Conversational Speech LLM (MLC-SLM) Challenge, an officially approved satellite event of Interspeech 2025.
This challenge, hosted by Meta, Google, Samsung, Naver, China Mobile, Northwestern Polytechnical University and Nexdata, aims to advance multilingual conversational speech AI by providing a real-world dataset and encouraging innovation in speech language models.
The challenge consists of two tasks, both of which require participants to explore the development of speech language models (SLMs):
Task I: Multilingual Conversational Speech Recognition
Objective: Develop a multilingual LLM-based ASR model. Participants will be provided with oracle segmentation and speaker labels for each conversation.
Task II: Multilingual Conversational Speech Diarization and Recognition
Objective: Develop a system for both speaker diarization (identifying who is speaking when) and recognition (transcribing speech to text). No prior or oracle information, such as pre-segmented utterances or speaker labels, will be provided during evaluation. Both pipeline-based and end-to-end systems are encouraged, giving participants flexibility in system design and implementation.
The training set (Train) covers 11 languages: English (en), French (fr), German (de), Italian (it), Portuguese (pt), Spanish (es), Japanese (jp), Korean (ko), Russian (ru), Thai (th), and Vietnamese (vi). It is designed to provide a rich resource for training and evaluating multilingual conversational speech language models (MLC-SLM), addressing the challenges of linguistic diversity, speaker variability, and contextual understanding.
Important Dates (AOT Time)
March 10, 2025: Registration opens
March 15, 2025: Training data release
March 20, 2025: Development set and baseline system release
May 15, 2025: Evaluation set release and Leaderboard open
May 30, 2025: Leaderboard freeze and paper submission portal opens (CMT system)
June 15, 2025: Paper submission deadline
July 1, 2025: Notification of acceptance
August 18, 2025: Workshop date
We have set a total prize pool of $20,000 for the winners. Based on performance, the top three teams in each of the two tasks will be awarded:
1st Prize: $5,000
2nd Prize: $3,000
3rd Prize: $2,000
For more details, please check out the challenge website: https://www.nexdata.ai/competition/mlc-slm
Participate here: https://docs.google.com/forms/d/e/1FAIpQLSftZCRQQWvO5NZd-bPo1VT2Xsaieu_ZYCklw6MhW6LqjWnuYQ/viewform?usp=send_form
For inquiries: [email protected]
Join us in shaping the future of multilingual conversational AI and be part of this groundbreaking challenge!
About Nexdata
Nexdata provides top-notch training data solutions and serves as your reliable partner. With an extensive array of off-the-shelf datasets and flexible data collection and annotation services, our mission revolves around unleashing AI's full potential and expediting the AI industry's growth.
SOURCE Nexdata