Identifying people in audio

At Pop Up Archive, we use several approaches to store and represent information about the people in audio. We’ve written before about machine- and crowd-based techniques for describing audio: computers can collect data on a large scale, and people can refine that data. You can see both of these approaches in action at Pop Up Archive and through our latest project, Audiosear.ch.

People at Pop Up Archive: a user-based approach

The technology behind Pop Up Archive and Audiosear.ch generates high-accuracy machine transcripts and keywords. At Pop Up Archive, users can add their own information about the people in their audio through contributor fields like “interviewer,” “interviewee,” “producer,” and “host.”
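For illustration, contributor metadata of this kind could be stored as a simple structure attached to each audio item. This is a hypothetical sketch, not Pop Up Archive’s actual schema or API; the field names are assumptions drawn from the contributor roles listed above.

```python
# Hypothetical sketch of per-item contributor metadata.
# Not Pop Up Archive's actual schema; names and fields are illustrative only.
audio_item = {
    "title": "Episode 42: The State of Public Radio",
    "contributors": [
        {"name": "Jane Doe", "role": "host"},
        {"name": "John Smith", "role": "interviewer"},
        {"name": "Alex Lee", "role": "interviewee"},
        {"name": "Sam Jones", "role": "producer"},
    ],
}

# Because users supply these fields themselves, the names can be indexed
# and searched directly, without any transcript analysis.
contributor_names = [c["name"] for c in audio_item["contributors"]]
print(contributor_names)
```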

People at Audiosear.ch: an exhaustive A-Z index

While Pop Up Archive relies on users to identify contributors, Audiosear.ch leverages the information we extract from the audio itself. We’ve identified over 10,000 people in the Audiosear.ch podcast database using automated methods alone.
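One common way to pull person names out of transcript text automatically is named entity recognition. The sketch below uses the open source spaCy library to show the general idea; it is not the pipeline Audiosear.ch actually runs, just a minimal example of collecting PERSON entities from a transcript snippet.

```python
# Minimal sketch of extracting people from transcript text with spaCy.
# Illustrative only -- not the actual Audiosear.ch pipeline.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")

transcript = (
    "This week on the show, Terry Gross talks with Lin-Manuel Miranda "
    "about Hamilton, and producer Danny Miller joins the conversation."
)

doc = nlp(transcript)

# Count how often each PERSON entity appears in the transcript.
people = Counter(ent.text for ent in doc.ents if ent.label_ == "PERSON")
print(people.most_common())
```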

Understanding people in context with roles

With so many people in our database, we naturally want to know why they’re being talked about. We use transcripts to parse out people’s roles, so we can see at a glance whether someone is a host, producer, or guest on a podcast, or simply mentioned in it.
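As a rough illustration of how roles might be inferred from transcript text, the toy heuristic below looks at the words immediately before a detected name and matches them against role keywords, falling back to “mentioned.” This is a simplified sketch for demonstration purposes, not the method Audiosear.ch uses.

```python
# Toy heuristic for guessing a person's role from nearby transcript words.
# Simplified for illustration; not the actual Audiosear.ch approach.
import re

ROLE_KEYWORDS = {
    "host": ["host", "hosted by"],
    "producer": ["producer", "produced by"],
    "guest": ["guest", "joined by", "talks with"],
}

def guess_role(transcript: str, name: str, window: int = 20) -> str:
    """Return a best-guess role for `name`, or 'mentioned' as a fallback."""
    for match in re.finditer(re.escape(name), transcript):
        # Look at a short window of text just before the name.
        context = transcript[max(0, match.start() - window):match.start()].lower()
        for role, keywords in ROLE_KEYWORDS.items():
            if any(kw in context for kw in keywords):
                return role
    return "mentioned"

transcript = "Your host Terry Gross talks with guest Lin-Manuel Miranda."
print(guess_role(transcript, "Terry Gross"))         # host
print(guess_role(transcript, "Lin-Manuel Miranda"))  # guest
```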

As with any purely computational approach, there are bound to be inaccuracies (such as Jeb Bush being identified as a producer on the Glenn Beck podcast – whoops). We’re continually improving the people index because we’re excited to be building the only resource of this kind for podcasts.