Fearless Steps: Apollo-11 Corpus Advancements for Speech Technologies from Earth to the Moon
John H.L. Hansen, Abhijeet Sangwan, Aditya Joglekar, Ahmet E. Bulut, Lakshmish Kaushik and Chengzhu Yu
Abstract:
The Apollo Program is one of the most significant benchmarks for technology and innovation in human history. The previously introduced UTD-CRSS Apollo initiative resulted in the digitization of the original analog audio tapes recorded during the Apollo Space Missions. The entire speech dataset is now being made publicly available with the release of the Fearless Steps Corpus. The corpus consists of a cumulative 19,000 hours of conversational speech spanning over thirty time-synchronized channels. With over six hundred speakers, it offers a rich collection of information that can benefit research and advancement in the Speech and Language Community. Recent efforts on this data have produced pipeline diarization transcripts for the entire corpus. Research has also addressed speech and natural language tasks such as speech activity detection, speech recognition and sentiment analysis. This paper provides an overview of the Fearless Steps Corpus, summarizes previous research on it, and highlights the factors that make processing this data a challenging problem. To initiate further development of algorithms on this corpus, five challenge tasks have also been organized, and we describe these tasks along with their associated transcriptions.
Cite as: Hansen, J.H., Sangwan, A., Joglekar, A., Bulut, A.E., Kaushik, L., Yu, C. (2018) Fearless Steps: Apollo-11 Corpus Advancements for Speech Technologies from Earth to the Moon. Proc. Interspeech 2018, 2758-2762, DOI: 10.21437/Interspeech.2018-1942.
BibTeX Entry:
@inproceedings{Hansen2018,
author={John H.L. Hansen and Abhijeet Sangwan and Aditya Joglekar and Ahmet E. Bulut and Lakshmish Kaushik and Chengzhu Yu},
title={Fearless Steps: Apollo-11 Corpus Advancements for Speech Technologies from Earth to the Moon},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={2758--2762},
doi={10.21437/Interspeech.2018-1942},
url={http://dx.doi.org/10.21437/Interspeech.2018-1942} }