ASRU 2007

Project Talks

P.1 THE GALE PROJECT: A DESCRIPTION AND AN UPDATE

Jordan Cohen, SRI International, United States

GALE is a DARPA program to develop and apply computer software technologies to absorb, translate, analyze, and interpret huge volumes of speech and text in multiple languages. This program has been active for two years, and the GALE contractors have been engaged in developing highly robust speech recognition, machine translation, and information delivery systems in Chinese and Arabic. Several GALE-developed talks will be given in this workshop. This overview talk will review the program goals, the technical highlights, and the technical issues remaining in the GALE project.

P.2 RECOGNITION AND UNDERSTANDING OF MEETINGS: THE AMI AND AMIDA PROJECTS

Steve Renals, University of Edinburgh, United Kingdom; Thomas Hain, University of Sheffield, United Kingdom; Hervé Bourlard, IDIAP Research Institute, Switzerland

The AMI and AMIDA projects are concerned with the recognition and interpretation of multiparty meetings. Within these projects we have: developed an infrastructure for recording meetings using multiple microphones and cameras; released a 100-hour annotated corpus of meetings; developed techniques for the recognition and interpretation of meetings based primarily on speech recognition and computer vision; and developed an evaluation framework at both component and system levels. In this paper we present an overview of these projects, with an emphasis on speech recognition and content extraction.

P.3 INTRODUCTION OF THE METI PROJECT “DEVELOPMENT OF FUNDAMENTAL SPEECH RECOGNITION TECHNOLOGY”

Sadaoki Furui, Tokyo Institute of Technology, Japan; Tetsunori Kobayashi, Waseda University, Japan

Waseda University, Tokyo Institute of Technology, and six companies (Asahi-kasei, Hitachi, Mitsubishi, NEC, Oki and Toshiba) initiated a three-year project in 2006, supported by the Ministry of Economy, Trade and Industry (METI), Japan, to jointly develop fundamental automatic speech recognition (ASR) technology. The project focuses on utilizing ASR technology in car and home environments. Seven subtasks are being investigated: speech/non-speech separation using multiple microphones, speech/non-speech separation for a single audio stream, developing a high-performance WFST-based decoder, multi-lingual ASR modeling, higher-order language modeling, developing a system for assisting speech interface development, and overall technology evaluation. This talk will give an overview of the intermediate technological progress achieved by the project.