Bhashini's Vaani Project: Open-Sourcing Speech Data Across India

Technology
Jul 11 2024 12:56 PM
P C Thomas

Project Vaani, run under the Bhashini initiative by IISc-ARTPARK and Google, is set to open-source 16,000 hours of speech from 80 districts in its first phase, and ultimately 150,000 hours of speech from one million people across 773 districts in India. The project is funded by the Bill and Melinda Gates Foundation.

Project Overview

Under the Ministry of Electronics and Information Technology’s Bhashini initiative, the Indian Institute of Science’s AI and Robotics Technology Park (IISc-ARTPARK) aims to support AI development in Indic languages. As part of Project Vaani, IISc-ARTPARK is collaborating with Google to collect and open-source spontaneous speech data.

Data Collection and Coverage

The project is collecting 200 hours of speech per district and supports 58 language variants, with Karya assisting in data collection. The first phase, which began in late 2022 and covers 80 districts, is nearing completion. The second phase will extend to 160 districts, each again contributing 200 hours of speech from approximately 1,000 people.
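The per-district figures can be checked against the headline totals with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes the rate of 200 hours and roughly 1,000 speakers per district applies uniformly to every district, which the article states only for the individual phases.

```python
# Illustrative arithmetic only; names are hypothetical, not part of any Vaani tooling.
HOURS_PER_DISTRICT = 200        # stated per-district collection target
SPEAKERS_PER_DISTRICT = 1_000   # approximate figure; assumed uniform across districts


def projected_totals(districts: int) -> dict[str, int]:
    """Project total hours and speakers for a given number of districts."""
    return {
        "districts": districts,
        "hours": districts * HOURS_PER_DISTRICT,
        "speakers": districts * SPEAKERS_PER_DISTRICT,
    }


phase_one = projected_totals(80)     # first-phase coverage
phase_two = projected_totals(160)    # second-phase coverage as reported
nationwide = projected_totals(773)   # all districts named in the overall goal

for label, totals in [("Phase 1", phase_one), ("Phase 2", phase_two), ("Nationwide", nationwide)]:
    print(f"{label}: {totals['districts']} districts, "
          f"{totals['hours']:,} hours, about {totals['speakers']:,} speakers")
```

Under these assumptions, 80 districts give the 16,000 hours quoted for the first phase, and 773 districts give 154,600 hours, in line with the rounded 150,000-hour goal. The speaker projection of about 773,000 falls short of one million, so the headline figure presumably rests on a rounder or higher per-district estimate.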

Significance of Bhashini

Amitabh Nag, Chief Executive of Bhashini, said the project matters because it creates digital data for low-resource languages, which AI models for those languages depend on. "Bhashini, for the first time, helped create digital data for low-resource languages in its effort to build AI models," said Nag. "The Vaani project with Google ensures coverage across the country, with open-sourced datasets available for startups to build applications and AI models."

Applications and Benefits

This extensive dataset will serve as base data for training speech-to-text AI models, which underpin applications such as conversational AI platforms and chatbots. Any organization developing speech models for Indian languages and dialects can download the data and use it according to its business needs.

Funding and Support

The project is funded by the Bill and Melinda Gates Foundation and supported by SYSPIN (Synthesizing Speech in Indian Languages), which started in 2021 with funding from GIZ, the German development cooperation agency. SYSPIN aims to create text-to-speech synthesizers in nine Indian languages, while RESPIN, another initiative, focuses on speech recognition in agriculture and finance for the poor.

Conclusion

The Vaani project is a significant step towards enhancing AI capabilities for Indic languages, providing valuable resources for the development of diverse AI applications. By open-sourcing this data, Bhashini is fostering innovation and collaboration across the tech community.
