Interspeech 2018

Unsupervised Word Segmentation from Speech with Attention

Pierre Godard, Marcely Zanon Boito, Lucas Ondel, Alexandre Berard, François Yvon, Aline Villavicencio and Laurent Besacier

Abstract:

We present a first attempt at attentional word segmentation from the speech signal, with the final goal of automatically identifying lexical units in a low-resource, unwritten language (UL). Our methodology assumes that recordings in the UL are paired with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones, which is then segmented using the soft alignments produced by a neural machine translation model. We evaluate on an actual Bantu UL, Mboshi; comparisons with monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
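
To make the segmentation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a soft-alignment (attention) matrix is already available, with one row per pseudo-phone and one column per translation word, and it assumes a simple decision rule: each pseudo-phone is attached to the translation word with the highest attention weight, and a boundary is placed wherever that word changes. The function name `segment_by_attention`, the unit labels, and the attention values are all hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def segment_by_attention(
    units: Sequence[str],
    attention: Sequence[Sequence[float]],
) -> List[List[str]]:
    """Group pseudo-phones into word-like segments.

    attention[i][j] is the soft-alignment weight between pseudo-phone i
    and translation word j (assumed layout, for illustration only).
    """
    if len(units) != len(attention):
        raise ValueError("one attention row is required per pseudo-phone")

    segments: List[List[str]] = []
    previous_word = None
    for unit, row in zip(units, attention):
        # Index of the translation word with the highest attention weight.
        best_word = max(range(len(row)), key=row.__getitem__)
        # Start a new segment whenever the aligned word changes.
        if best_word != previous_word:
            segments.append([])
            previous_word = best_word
        segments[-1].append(unit)
    return segments


# Hypothetical pseudo-phone labels and attention weights (not real data).
units = ["a1", "a7", "a3", "a3", "a9"]
attention = [
    [0.8, 0.2],
    [0.7, 0.3],
    [0.4, 0.6],
    [0.1, 0.9],
    [0.6, 0.4],
]
segments = segment_by_attention(units, attention)
print(segments)
```

Under this rule, two non-adjacent runs aligned to the same translation word stay separate segments, since boundaries depend only on changes between neighbouring pseudo-phones.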

Cite as: Godard, P., Boito, M.Z., Ondel, L., Berard, A., Yvon, F., Villavicencio, A., Besacier, L. (2018) Unsupervised Word Segmentation from Speech with Attention. Proc. Interspeech 2018, 2678-2682, DOI: 10.21437/Interspeech.2018-1308.

BibTeX Entry:

@inproceedings{Godard2018,
author={Pierre Godard and Marcely Zanon Boito and Lucas Ondel and Alexandre Berard and François Yvon and Aline Villavicencio and Laurent Besacier},
title={Unsupervised Word Segmentation from Speech with Attention},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={2678--2682},
doi={10.21437/Interspeech.2018-1308},
url={http://dx.doi.org/10.21437/Interspeech.2018-1308} }