
Streaming Punctuation for Long-Form Dictation with Transformers


Authors

Piyush Behre, Sharman Tan, Padma Varadharajan and Shuangyu Chang, Microsoft Corporation, USA

Abstract

While speech recognition Word Error Rate (WER) has reached human parity for English, long-form dictation scenarios still suffer from segmentation and punctuation problems caused by irregular pausing patterns or slow speakers. Transformer sequence tagging models are effective at capturing long bidirectional context, which is crucial for automatic punctuation. Production Automatic Speech Recognition (ASR) systems, however, are constrained by real-time requirements, which makes it hard to incorporate the right context when making punctuation decisions. In this paper, we propose a streaming approach for punctuation or re-punctuation of ASR output using dynamic decoding windows and measure its impact on punctuation and segmentation accuracy across scenarios. The new system tackles over-segmentation issues, improving the segmentation F0.5-score by 13.9%. Streaming punctuation achieves an average BLEU-score improvement of 0.66 for the downstream task of Machine Translation (MT).
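For readers unfamiliar with the F0.5-score mentioned in the abstract, the sketch below shows the standard F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. With beta = 0.5, precision counts more than recall, which suits a metric meant to penalize over-segmentation (spurious sentence boundaries). The function name `f_beta` and the precision/recall values are illustrative assumptions; they are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily; beta = 1 gives the usual F1.
    """
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0, beta must be > 0")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero.
        return 0.0
    return (1 + b2) * precision * recall / denom


# Hypothetical values, chosen only to show the asymmetry of F0.5.
high_precision = f_beta(precision=0.8, recall=0.5)  # few spurious boundaries
high_recall = f_beta(precision=0.5, recall=0.8)     # many spurious boundaries
print(high_precision > high_recall)
```

Swapping precision and recall leaves F1 unchanged but lowers F0.5 when precision is the smaller of the two, so a system that inserts too many boundaries scores worse under this metric.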

Keywords

Automatic punctuation, automatic speech recognition, re-punctuation, speech segmentation.


