Baidu Makes Speech APIs Available for Free

Written By
Mike Vizard
Nov 22, 2016

It’s estimated that some 1.2 billion people, roughly 16 percent of the world’s population, speak some form of Chinese. Many developers would like to reach that market without necessarily having to translate their applications into Chinese.

To help accelerate that process, Baidu, the most widely used search engine in China, announced today that it is making available to developers four speech application programming interfaces (APIs).

The APIs specifically address Long Utterance Speech Recognition, Far-Field Speech Recognition, Expressive Speech Synthesis and Wake Word. Long Utterance Speech Recognition enables the transcription of long audio clips such as interviews, speeches and lectures. Far-Field Speech Recognition enables the recognition of speech from audio sources that are up to 16 feet away. Expressive Speech Synthesis provides a collection of realistic voices that can be used to read aloud. Wake Word allows developers to create customized short words or phrases that can be spoken to turn devices on.

Sanjeev Satheesh, a research scientist at Baidu who specializes in machine learning, says Baidu is trying to drive globalization of applications by making speech APIs available along with other services such as facial recognition, optical character recognition and natural language processing.

“We’re not just talking about speech to text,” says Satheesh.

To drive that effort, Satheesh says that, in contrast to other API providers, Baidu has opted to make the four speech APIs available to developers for free. Longer term, Satheesh says Baidu envisions speech replacing the keyboard as the primary interface through which users access applications. In fact, Satheesh says Baidu expects natural language APIs to be used in concert with speech APIs and language translation services enabled by a deep learning framework it developed to translate audio. Earlier this year, Baidu made that deep learning framework, dubbed PaddlePaddle, available as an open source project.

Satheesh says free access to APIs coupled with open source software will eventually force other providers of these types of technologies to follow suit by making advanced technologies more widely available to all developers. In the meantime, developers might want to start considering what access to a billion more potential users might mean for their applications.


Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.


Property of TechnologyAdvice. © 2025 TechnologyAdvice. All Rights Reserved
