Title:
Video and Audio Analysis for Remembering Conversations
Abstract:
The Informedia digital video library project pioneered the
automatic analysis of television broadcast news and its retrieval
on demand. Building on that system, we have developed a
wearable, personalized Informedia system, which listens to and
transcribes the wearer's part of a conversation, recognizes the
face of the current dialog partner and remembers his/her voice.
The next time the system sees the same person's face and hears
the same voice, it can retrieve the audio from the last
conversation, replaying in compressed form the names and
major issues that were mentioned. All of this happens
unobtrusively, somewhat like an intelligent assistant who
whispers to you: "That's Bob Jones from Tech Solutions; two
weeks ago in London you discussed solar panels." This paper
outlines the general system components as well as interface
considerations. Initial implementations showed that both face
recognition methods and speaker identification technology
have serious shortfalls that must be overcome.
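
The retrieval step described above, in which a remembered conversation is replayed only when both the face and the voice match a known person, might be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the feature vectors, the cosine similarity measure, the 0.9 threshold, and the names `ConversationMemory`, `remember`, and `recall` are all hypothetical.

```python
from dataclasses import dataclass, field
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Person:
    face: list   # hypothetical face feature vector
    voice: list  # hypothetical voice feature vector
    summaries: list = field(default_factory=list)  # one entry per past conversation


class ConversationMemory:
    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed value, not taken from the paper
        self.people = []

    def find(self, face, voice):
        """Return the best-matching known person, or None.

        Both the face score and the voice score must clear the threshold.
        """
        best, best_score = None, 0.0
        for p in self.people:
            score = min(cosine(face, p.face), cosine(voice, p.voice))
            if score >= self.threshold and score > best_score:
                best, best_score = p, score
        return best

    def remember(self, face, voice, summary):
        """Store a conversation summary, enrolling the person if unknown."""
        person = self.find(face, voice)
        if person is None:
            person = Person(face, voice)
            self.people.append(person)
        person.summaries.append(summary)

    def recall(self, face, voice):
        """Return the summary of the last conversation with this person, if any."""
        person = self.find(face, voice)
        return person.summaries[-1] if person and person.summaries else None
```

Requiring both modalities to agree is one way to limit false matches when either recognizer is unreliable on its own, which is the kind of shortfall the initial implementations exposed.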