Searching 40,000 hours of broadcasting history

Pop Up Archive and WGBH embark on a landmark project to make the American Archive searchable

On August 31, the Institute of Museum and Library Services (IMLS) awarded $14.16 million in grant funding to libraries across the United States. We’re thrilled to announce that the WGBH Educational Foundation, together with the American Archive of Public Broadcasting and Pop Up Archive, received one of 276 National Leadership Grants.

The $898,474 grant funds transcription, analysis, and crowdsourcing tools for almost 40,000 hours of digital audio from the American Archive of Public Broadcasting over the next two and a half years. This will be the first major media archive of its kind: the new American Archive site will integrate full-text, searchable transcripts and crowdsourced metadata for thousands of hours of audiovisual materials.

Read more about the IMLS grantees announced last week.

Free premium transcripts for everyone

Take premium transcripts for a spin — for free

We’ve had this change in the works for a while, and we’re excited to share it with our community: as of this week, anyone who joins the 1 Hour Demo plan can try out the best of Pop Up Archive without paying a thing.

Sign up here and spread the word!

Is public media ready for machine transcription? (A Socratic dialogue)

A conversation between Peter Karman of Pop Up Archive & Andy Kruse of American Public Media on the trade-offs between access and accuracy when it comes to machine transcripts for media.

Andy Kruse:

If I thought we could start working with someone now and the accuracy would be five 9s by the time we turned it on, that would be assuring. But people have thought this technology is close for years now. The future always seems annoyingly just out of reach.
Just kidding. There is no audio in the future. We’ll just swallow pills.

Peter Karman:

The red pill? Or the blue one?

Can’t type, won’t type

Thanks for the shout out, and for checking out our premium software release! Just send us an email, or fill in the contact form on our site (from the Enterprise plan) if you have questions about getting started. 

Also, to speak to your concerns about security issues: we do have a thorough privacy policy, which you can find in our terms of service. All of the audio that’s public on the site has either been publicly uploaded by users or is audio that we’ve found in the public domain. Users can also choose to make their audio and its machine-generated data private, and all data is transferred using a secure protocol. Information security is very important to us.

Hope that helps! 

rootofnothing:

Everyone hates logging. Which is why, half the time, we don’t do it.

So, what if I gave you access to someone who would type up all of your interviews and make them effortlessly searchable, at any time of day? Thanks to machine transcription, that’s about to happen. It’s just a question of when.

The BBC already has work underway in this area using the R&D product Comma, but there’s also an explosion in third parties offering this kind of service.

I’ve had my eye on Pop-up Archive for a while. Their interface looks slick and their prices are keen but they’ve also looked a bit iffy in the accuracy department with a few information security issues thrown in too (how happy are you with your sensitive rushes potentially being accessible to all?) But now they have upped their game with an enterprise offering which looks very tasty. I plan to give it a go, so watch this space for the verdict.

Check out the site here

https://www.popuparchive.com

Will the murder case in Serial be cracked by speech-to-text?

(Spoiler alert: probably not.)

This week we analyzed the speech-to-text output of all of the available episodes of Serial and ran them through Pop Up Archive data visualization tools to see which terms and themes appear most frequently across episodes.

If you’ve been listening along, you’ll instantly understand why certain tags, like “Best Buy” and “Lincoln Park” [sic], appear so prominently.

Though the output stops short of pairing “Adnan” with “innocent” or “Jay” with “liar,” it’s thrilling nonetheless to see Serial’s audio-based saga reflected so accurately in machine-made keyword tags. 

Fun fact: Wondering why Serial producer Julie Snyder and This American Life creator Ira Glass both appear in the tags, while Serial’s own creator, Sarah Koenig, does not? Turns out, Pop Up’s software thinks the name Sarah “Koenig” sounds a lot more like “Sarah Kane. Ick.” We edited this tag out, along with some other winners (like “breakfast cereal”). Bear with us, speech recognition is a work in progress! 
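
For the curious, here is a minimal sketch of the general idea behind frequency-based tags: count how often each term shows up across a set of transcripts and keep the most common ones. It is not the Pop Up Archive tool itself. The function name `top_terms`, the tiny stopword list, and the sample sentences are all made up for illustration (they are not real Serial transcripts).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this example.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "at", "was", "i", "he", "it", "that"}

def top_terms(transcripts, n=3):
    """Return the n most frequent non-stopword terms across all transcripts."""
    counts = Counter()
    for text in transcripts:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Placeholder text standing in for machine transcripts of three episodes.
episodes = [
    "he was at the best buy that afternoon",
    "the call from best buy",
    "i went to the library",
]

print(top_terms(episodes, 2))
```

A real tagging pipeline would likely also group multi-word phrases (so that “Best Buy” counts as one tag) and filter out misrecognized names, which is the manual cleanup described above.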

Speech Recognition for Media: Rethinking Accuracy
Adapted from our post Speech Recognition for Media (PBS Idea Lab)

“How accurate are your automatic transcripts?” It’s one of the most frequently asked questions at Pop Up Archive, and one of the hardest to answer. It’s a fair question, yet it often anticipates an unfair answer: 100% accurate. Media producers want the ease and speed of automatic transcripts and captions, but are often loath to publish anything short of this mystical percentage.

The barrier to perfect accuracy: If this is what the people want, why don’t we give it to them? The fact is, machine transcription for media voices is a tricky business: you have to factor in background noise, overlapping speech, and poor audio quality. There’s no way to guarantee the accuracy of automatic transcription across audio of varying quality and content.

We’d like to pose our own question: do you really need 100% accuracy? To value automatic transcripts only at 100% accuracy is to misunderstand the way the Internet reads text. After all, search engines don’t need perfect transcripts. Neither do producers looking for particular moments in hours of interviews. Harnessed the right way, speech-to-text software means effortless drag-and-drop access to crucial keywords and moments hidden deep within hours of content. 
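
To make that concrete, here is a rough sketch of why an imperfect transcript can still be useful for finding moments in audio. It assumes the transcript comes as a list of (start time in seconds, text) segments; that layout, the function name `find_moments`, and the sample segments are assumptions for illustration, not our actual data format.

```python
def find_moments(segments, keyword):
    """Return the start times (in seconds) of segments that mention keyword."""
    needle = keyword.lower()
    return [start for start, text in segments if needle in text.lower()]

# Made-up transcript segments. The second one contains a deliberate
# transcription error ("two" instead of "to").
segments = [
    (0.0, "welcome to the show"),
    (12.5, "we went two the city council meeting"),
    (30.0, "the council voted last night"),
]

print(find_moments(segments, "council"))
```

Even though one segment is transcribed wrong, a search for “council” still lands on both moments where the word was spoken. The error only hurts if it falls on the exact word being searched for.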

Toward more searchable transcripts: That said, more accurate text still means more accurate search. Pop Up Archive is accomplishing this through speech-to-text that we target at specific genres of media — for example, news broadcasts, first-person interviews, and archival audio from different decades.

Intrigued? Get a free sample transcript for a short audio file from our new and improved speech-to-text software.

***Email us at founders@popuparchive.com to test the new software with your own audio.***

Introducing our speech-to-text partners

We are thrilled to announce the incredible slate of partners working with us to build custom speech-to-text software for news organizations, historical audio collections, and religious institutions: CUNY Television, the Hoover Institution at Stanford University, Illinois Public Media, KCRW Los Angeles, KQED San Francisco, NPR, the Presbyterian Historical Society, the Princeton Theological Seminary, Snap Judgment, and StoryCorps.

Curious where this idea came from? We started with some big problems:

  • Transcription is time-consuming. There are no human hands fast enough to transcribe the amount of recorded sound we process. And even if there were…

  • Transcription is expensive. Enough said.

  • Out-of-the-box automatic transcription services are inaccurate. “India” becomes “ninja,” “quitter” becomes “Twitter,” and a meaningful broadcast or oral history can end up reading like tech gibberish.

Read more in our initial blog post announcing the project.

Over the next two months, we’re creating unique speech-to-text vocabularies tailored specifically to our partners’ content: contemporary news broadcasts, oral histories, archival recordings, religious lecture series and sermons. We’ll be blogging about our partners, their amazing audio, and the speech-to-text customization process as it unfolds, so check back for updates.

The vocabularies are built directly from words and phrases found in our partners’ content. The results won’t be 100% accurate, but these special vocabularies enable our speech-to-text software to gauge the likelihood that sounds in certain contexts correspond to particular words or phrases, so that, for example, when someone recording an oral history for StoryCorps says “quitter,” it doesn’t get transcribed as “Twitter.” Unless, of course, the person actually said “Twitter,” which our software can guess by looking at the placement of the word and other nearby words within a sentence.
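
As a toy illustration of using nearby words to choose between similar-sounding candidates, the sketch below counts word pairs in a small sample of text and picks whichever candidate most often follows the previous word. This is a simplified stand-in, not our actual software: real speech recognizers combine acoustic scores with much larger language models. The function names and the three-sentence corpus are invented for the example.

```python
from collections import Counter

def build_bigram_counts(corpus):
    """Count adjacent word pairs across all sentences in the corpus."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts

def pick_word(prev_word, candidates, bigram_counts):
    """Choose the candidate seen most often after prev_word.

    Ties go to whichever candidate is listed first.
    """
    return max(candidates, key=lambda w: bigram_counts[(prev_word.lower(), w)])

# Made-up sentences standing in for a partner's existing transcripts.
corpus = [
    "my father was never a quitter",
    "she posted it on twitter",
    "he is not a quitter",
]
counts = build_bigram_counts(corpus)

print(pick_word("a", ["twitter", "quitter"], counts))
print(pick_word("on", ["quitter", "twitter"], counts))
```

With this sample text, “a” is followed by “quitter” twice and never by “Twitter,” while “on” is followed only by “Twitter,” so the same pair of candidates resolves differently depending on context.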

Our software was initially trained on a subset of audio and transcripts from NPR, StoryCorps, the Washington Post, the Broadcasting Board of Governors, and numerous independent producers, reporters, and radio stations. If you want to learn more about advancements in speech-to-text, watch this short Google Tech Talk.

We can’t wait to bring cutting-edge speech recognition methods to organizations that would otherwise never benefit from this technology. Want to be a part of the custom speech-to-text magic? Just let us know. We’ll be onboarding more organizations in the coming weeks, and yours could be one of them.

Advised by the British Broadcasting Corp. R&D team and partnered with the Public Radio Exchange, Pop Up Archive is supported by the Knight Foundation, the National Endowment for the Humanities, and 500 Startups.