Introducing our speech-to-text partners

We are thrilled to announce the incredible slate of partners working with us to build custom speech-to-text software for news organizations, historical audio collections, and religious institutions: CUNY Television, the Hoover Institution at Stanford University, Illinois Public Media, KCRW Los Angeles, KQED San Francisco, NPR, the Presbyterian Historical Society, the Princeton Theological Seminary, Snap Judgment, and StoryCorps.

Curious where this idea came from? We started with some big problems:

  • Transcription is time-consuming. There are no human hands fast enough to transcribe the amount of recorded sound we process. And even if there were…

  • Transcription is expensive. Enough said.

  • Out-of-the-box automatic transcription services are inaccurate. “India” becomes “ninja,” “quitter” becomes “Twitter,” and a meaningful broadcast or oral history can end up reading like tech gibberish.

Read more in our initial blog post announcing the project.

Over the next two months, we’re creating unique speech-to-text vocabularies tailored specifically to our partners’ content: contemporary news broadcasts, oral histories, archival recordings, religious lecture series and sermons. We’ll be blogging about our partners, their amazing audio, and the speech-to-text customization process as it unfolds, so check back for updates.

The vocabularies are built directly from words and phrases found in our partners’ content. The transcripts won’t be 100% accurate, but these special vocabularies help our speech-to-text software gauge the likelihood that sounds in certain contexts correspond to particular words or phrases. For example, when someone recording an oral history for StoryCorps says “quitter,” it doesn’t get transcribed as “Twitter.” Unless, of course, the person actually said “Twitter,” which our software can guess by looking at the placement of the word and the other words near it in the sentence.
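For the curious, here is a toy illustration of that idea. It is only a sketch, not our actual software: it assumes a simple bigram (word-pair) model with add-one smoothing, and the sample sentences and the function names `train_bigrams`, `score`, and `pick_word` are made up for this example. Given two words that sound alike, it picks the one that more often follows the previous word in the sample text.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count single words and adjacent word pairs in sample transcripts."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        # "<s>" marks the start of a sentence so the first word has context too.
        words = ["<s>"] + sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def score(prev_word, candidate, unigrams, bigrams):
    """Add-one smoothed estimate of P(candidate | prev_word)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev_word, candidate)] + 1) / (unigrams[prev_word] + vocab_size)


def pick_word(prev_word, candidates, unigrams, bigrams):
    """Choose the sound-alike candidate that best fits the preceding word."""
    prev_word = prev_word.lower()
    return max(candidates, key=lambda w: score(prev_word, w, unigrams, bigrams))


# Hypothetical sample text standing in for a partner's transcripts.
SAMPLE_SENTENCES = [
    "my father was never a quitter",
    "she told me not to be a quitter",
    "he posted the news on twitter",
    "follow the station on twitter",
]

unigrams, bigrams = train_bigrams(SAMPLE_SENTENCES)
print(pick_word("a", ["quitter", "twitter"], unigrams, bigrams))
print(pick_word("on", ["quitter", "twitter"], unigrams, bigrams))
```

In this toy sample, “a” is followed by “quitter” and “on” is followed by “twitter,” so the context decides between the two sound-alike words. Real speech recognition systems combine this kind of language evidence with acoustic evidence and use far larger models.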

Our software was initially trained on a subset of audio and transcripts from NPR, StoryCorps, the Washington Post, the Broadcasting Board of Governors, and numerous independent producers, reporters, and radio stations. If you want to learn more about advancements in speech-to-text, watch this short Google Tech Talk.

We can’t wait to bring cutting-edge speech recognition methods to organizations that would otherwise never benefit from this technology. Want to be a part of the custom speech-to-text magic? Just let us know. We’ll be onboarding more organizations in the coming weeks, and yours could be one of them.

Advised by the British Broadcasting Corp. R&D team and partnered with the Public Radio Exchange, Pop Up Archive is supported by the Knight Foundation, the National Endowment for the Humanities, and 500 Startups.

 

“I don’t think that the caged bird flings a prayer up to heaven… the caged bird sings about freedom all the time, and that song was so rich and so beautiful, that I knew that was the title.”

-Maya Angelou on titling her 1969 autobiography I Know Why The Caged Bird Sings.

Listen to Angelou’s 1970 interview with Studs Terkel in remembrance of her remarkable life and works.

Giving history a voice: introducing our partnership with HistoryIT

This week, we are very happy to announce Pop Up Archive’s partnership with HistoryIT to make archival sound more discoverable on the web. Together, we’ll provide an end-to-end multimedia archiving experience, so that significant sound recordings can be integrated into web-based archives that are routinely indexed by Google.

Read more in the press release.

A big part of our mission at Pop Up Archive is to create better access to recorded sound, so organizations can save, find, reuse, and monetize their content. Archives, libraries, and museums address this challenge constantly. Others find themselves stewards of “accidental archives,” like San Francisco-based radio producers The Kitchen Sisters, with whom we began our work. But there are no simple, web-based tools for quickly accessing the historic voices contained within digital audio, not even at some of the biggest institutions in the U.S. We set out to change that by automatically transcribing and tagging audio files, using speech-to-text software uniquely trained for media and cultural heritage organizations, without requiring anyone to painstakingly listen through every file in its entirety.

  • Hidden media (audio and video that used to be physically stashed away on shelves, on hard drives, behind locked doors) should be accessible and reusable in new and different contexts.

  • It’s time to break down institutional silos that keep incredible content from being discovered on the web.  

  • Adding a search-engine-optimized text layer to media improves productivity and facilitates new revenue streams.

But we’re not archival consultants: Pop Up Archive provides easy-to-use technology, addressing one of many aspects of the larger digital archiving ecosystem. And this is where HistoryIT comes in. HistoryIT helps build digital archives from start to finish, from feasibility studies to comprehensive digitization, metadata creation, and curated portals to digital archives. It’s a good thing we found each other, because our services are a perfect complement.

Our first project together involves the audio files from the Digital Mayoral Archive at the University of Indianapolis, which will contain more than 1.5 million records subject-tagged at the item level, including hundreds of hours of sound from the archives of former Senator Richard Lugar.

Building digital archives is not about access for the sake of access. It’s about what meaningful access enables:

  • New and improved information exchange and dialogue with local communities.

  • Immediate relevancy and ability to inform current events through a living archive of searchable voices, both contemporary and historical, cross-referenced with documents and images.

  • Opportunities to monetize content through search engine optimization, new audiences, and resulting increased donor support.

Institutions are beginning to understand the wealth of opportunities that will be afforded them if they clean up their archives and treat them not as static artifacts of the past, but rather as active tools for community and relationship building. Technology has enabled archival collections to be instantly accessible around the world — and they can (and should) be accessible to the public in ways that fit with how the public finds information on the Internet today. Here’s a hint: the public doesn’t use finding aids or Library of Congress subject headings. They ask Google a question and see what it tells them. When it comes to meaningful public access, if there’s a struggle between the finding aid and the search box, we know who’s winning.

So, what are you waiting for? Let’s let history speak for itself — there’s no time to waste. Drop us a line and get started today.

thegloballibrarian:

Photos from the visit to the Library of Congress’ Packard campus a few weeks ago.  They pretty much speak for themselves: this place is amazing!

Maybe it’s just us, but there’s nothing like that feeling when decaying, forgotten reels are digitized and uploaded online to become instantly searchable. Digitize all the things! 

Happy Mother’s Day from Pop Up Archive! Here are five of our favorite listens from the Archive about all things maternal:

1. The Seven Daughters of Eve: The Science that Reveals Our Genetic Ancestry (Illinois Public Media) Iceman and his mama.

2. Home As Career-Killer (The Broad Experience) An interview with Liz O’Donnell, author of Mogul, Mom & Maid, on leaning in on the home front.

3. Dear Mama (Snap Judgment) Snap Judgment sits down at the “family table [and puts their noses] right in the middle of the most important relationship of all.”

4. Hidden Kitchen Mama (Kitchen Sisters) “Kitchens and mothers. The food they cooked or didn’t.”

5. Pregnant (Audio Smut) Three unusual tales about the path to motherhood, and how it can be everything from torture to ecstasy.

On radio: “Few inventions evoke such nostalgia, such deeply personal and vivid memories, such a sense of loss and regret. And there are few devices with which people from different generations and backgrounds have had such an intimate relationship. Ask anyone born before World War II about the role of radio in his or her life, and in the life of the country, and you will see that person begin to time-travel, with almost euphoric pleasure, to other eras and places, when words and music filled their heads and their hearts.”

From the intro of Listening In: Radio and the American Imagination by Susan J. Douglas. Douglas captures the special magic of radio, and why we’re working to preserve old radio interviews and make them searchable at Pop Up Archive. 
