Google Voice Search Gets Faster and More Accurate

Google Voice Search

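The Google Research Blog announced today that Google Voice Search now runs on an all-new core technology that analyzes speech input faster and more accurately before running the search.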

Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques. These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!

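The older approach split the waveform of the user's speech into frames of 10 milliseconds each and analyzed them one by one. Separate models then estimated how likely each candidate sound and word was within that utterance, and the most probable candidate was chosen as the recognition result.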
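The framing step described above can be sketched in a few lines. This is only an illustration, not Google's code: the function name `split_into_frames`, the 16 kHz sample rate, and the choice to drop a trailing partial frame are all assumptions made for the example. A frame length of 10 ms is used as the default because it is a common choice in speech recognition.

```python
def split_into_frames(samples, sample_rate, frame_ms=10):
    """Split a waveform into consecutive, non-overlapping frames.

    samples     -- sequence of audio samples (hypothetical input)
    sample_rate -- samples per second, e.g. 16000 (an assumed value)
    frame_ms    -- frame length in milliseconds
    A trailing partial frame is dropped (an assumption of this sketch).
    """
    frame_len = sample_rate * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len]
            for i in range(0, last_start + 1, frame_len)]


if __name__ == "__main__":
    one_second = [0.0] * 16000           # one second of silence at 16 kHz
    frames = split_into_frames(one_second, 16000)
    print(len(frames), len(frames[0]))   # number of frames, samples per frame
```

Each frame would then be handed to the acoustic model in turn; real systems typically compute spectral features per frame rather than feeding in raw samples.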

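The new technology uses Recurrent Neural Networks (RNNs), whose feedback loops let the model take into account the sounds surrounding each sound, which improves recognition. In addition, through continued training, the acoustic model can recognize speech with less computation. Noise was also deliberately added during training to make the model more robust in noisy conditions.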
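To give a feel for the noise-augmentation idea, here is a minimal sketch that mixes a noise recording into a clean one at a chosen signal-to-noise ratio (SNR). The post does not say how Google mixed in the noise, so the SNR-based scaling, the function name `mix_noise`, and the synthetic signals below are assumptions for illustration only.

```python
import math
import random


def mean_power(signal):
    """Average power of a signal (mean of squared samples)."""
    return sum(x * x for x in signal) / len(signal)


def mix_noise(clean, noise, snr_db):
    """Return clean + scaled noise so that the mix has the requested SNR.

    Hypothetical helper: both inputs must have the same length and the
    noise must not be all zeros.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise signal has zero power")
    # SNR(dB) = 10 * log10(P_clean / P_noise), solved for the noise power.
    target_noise_power = mean_power(clean) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [c + scale * n for c, n in zip(clean, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # A 440 Hz tone stands in for clean speech; Gaussian noise for background.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    noisy = mix_noise(clean, noise, snr_db=10)
    print(len(noisy))
```

Training on such mixed examples at several SNR levels is one common way to make a recognizer less sensitive to background noise.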

We are happy to announce that our new acoustic models are now used for voice searches and commands in the Google app (on Android and iOS), and for dictation on Android devices. In addition to requiring much lower computational resources, the new models are more accurate, robust to noise, and faster to respond to voice search queries – so give it a try, and happy (voice) searching!

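In short, the new technology makes Voice Search faster and more accurate. It is already live in the Google app on Android and iOS and is also used for voice dictation. Give Voice Search a try and see whether it really is faster than before.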

Source: Google Research Blog
