Pop Up Archive: Breakthrough in Speech to Text?

Illinois Public Media is a delight. It’s wonderful to work with people who care as much about audio preservation and access as we do!

willtech:


For the past year I’ve been working with a startup called the Pop Up Archive to improve the way WILL is archiving digital audio. It’s one thing to save a number of files in some form of managed digital storage – and in WILL’s case it’s a very large number of audio files. The challenge is…

Pop Up Archive: Breakthrough in Speech to Text?

Can’t type, won’t type

Thanks for the shout out, and for checking out our premium software release! Just send us an email, or fill in the contact form on our site (from the Enterprise plan) if you have questions about getting started. 

Also, to speak to your concerns about security issues: we do have a thorough privacy policy, which you can find in our terms of service. All of the audio that’s public on the site has either been publicly uploaded by users, or is audio that we’ve found from the public domain. Users can also choose to make their audio and its machine-generated data private; all data is transferred using a secure protocol. Information security is very important to us. 

Hope that helps! 

rootofnothing:

Everyone hates logging. Which is why, half the time, we don’t do it.

So, what if I gave you access to someone who would type up all of your interviews and make them effortlessly searchable, at any time of day? Thanks to machine transcription, that’s about to happen. It’s just a question of when.

The BBC already has work underway in this area using the R&D product Comma, but there’s also an explosion in third parties offering this kind of service.

I’ve had my eye on Pop Up Archive for a while. Their interface looks slick and their prices are keen, but they’ve also looked a bit iffy in the accuracy department, with a few information security issues thrown in too (how happy are you with your sensitive rushes potentially being accessible to all?). But now they have upped their game with an enterprise offering which looks very tasty. I plan to give it a go, so watch this space for the verdict.

Check out the site here

https://www.popuparchive.com

Can’t type, won’t type

I Have A Valuable API Resource, What Now?

One of the most enjoyable things about being the API Evangelist is talking to API providers about their strategy, and helping brainstorm what they should do next. I have multiple APIs I do this with regularly, either because I’m an advisor, a big fan, or simply because they pay me. 😉 My favorite discussions are with the providers who are fine with me retelling their story publicly, APIs like the Cashtie API and Pop Up Archive.

Anne and Bailey over at Pop Up, an audio transcription API, talk with me regularly about their API development, deployment and now evangelism strategy. The Pop Up Archive API meets the first rule of APIs for me—do one thing and do it well, providing clear value for developers. The Pop Up Archive does this with audio transcription, opening up a whole world of audio to being searchable via an API. Think about the potential of making old radio programs indexed and searchable online, giving new life to legacy content.

Anne and Bailey have built a high-value web site and API, recently finished their interactive documentation, and are ready for business. They are at that amazingly exciting, and at the same time completely terrifying, step of having to evangelize their API and make the rubber meet the road–they have a valuable API resource, but now what?

Anne and Bailey have some great ideas about how the audio transcription API could be used, and they have some partners who are kicking the tires, figuring out what is possible when integrating with the Pop Up Archive API. This is where you start! You harness the ideas and the early integrations, and you tell the story of how the Pop Up Archive is providing a solution. Tell these stories on the Pop Up Archive Blog, on the Twitterz, and anywhere else you can find an audience.

Then launch a formal idea showcase where people can submit ideas on how the Pop Up Archive could be put to use, and allow people to browse and search the idea showcase and imagine what is possible. Next establish a section for actual case studies, and as partners and other developers successfully integrate, tell these stories as well, but in a more formal way, demonstrating established approaches for putting the Pop Up Archive API to use—not just the dreamy ideas.

Next I told them to start monitoring the landscape of where they think their potential users are: find the radio stations and media outfits who have Twitter accounts. Get to know the audiophiles, archivists, DJs, performers and bloggers who care the most about the audio space. Spend time each week discovering, living and understanding this space, all while you are telling stories about the Pop Up Archive and its valuable API.

Gather your ideas on how an API could be put to use, showcase how the API is being actively used, tell the stories, and actively explore and get to know the landscape of the online world you think will find the API resource most valuable. Eventually you will find more stories, build relationships with new developers, and discover other business interests that can put the API to use. It’s a natural, ongoing API evangelism cycle—not to be confused with marketing or sales.

In the end it’s not just API evangelism, it’s about building your own awareness of a space, and telling stories of the value your API delivers—then repeat, repeat, repeat. Each week you will learn more, your storytelling voice will get stronger, your community will get bigger, and your API will grow and mature.

I Have A Valuable API Resource, What Now?

Building the Studs Terkel Radio Archive: Part One | Let’s Get Working: Chicago Celebrates Studs Terkel

By Tony Macaluso
Director of Network Syndication / The WFMT Radio Network
& the Studs Terkel Radio Archive

Note: Over the course of 2014 the Studs Terkel Radio Archive will begin to be made available to the public via a free streaming website. The process is being led by Studs’ long-time radio home WFMT in partnership with the Chicago History Museum, with assistance from many other organizations. This blog entry (and several that will follow) provides a window onto the process of creating this giant audio archive. The project will be discussed during a panel at the University of Chicago’s “Let’s Get Working” Studs Terkel festival in May 2014.

https://www.popuparchive.org/embed_player/Studs-Ali%20excerpt/12739/9946/938

Studs Terkel played many roles in his 96 years. Author of oral histories, television actor, agitator for social justice, embodiment of a Chicago attitude toward art and society that was neither apologetic nor boastful (just honestly enthusiastic), to name a few. The role that stuck with him the longest, perhaps, was that of a radio man. He often referred to himself as a disc jockey (although his way of using radio to communicate defies conventional job titles).

Radio was a trade that he plied for eight decades. It started when he was in his 20s and worked as a radio theater actor, primarily playing gangsters. It became a life’s calling in 1952 when, at the age of 40, he was hired to host a radio show, initially about music, for WFMT, an upstart, independent radio station broadcasting out of a down-on-its-heels art deco hotel on the far west side of Chicago (the Hotel Guyon). His show quickly evolved into something much more: a space for long-form, slow, unabashedly intellectual yet profoundly playful conversations with writers, musicians, scientists, social activists, dancers, actors, philosophers, historians and working class people. It literally spanned the globe. And changed radio. This was decades before NPR. Much of the other work that he became known for (including the oral history books) grew out of his radio work. It remained his home base until the end of his life.

He hosted a daily radio show until 1998. Forty-six years. Every day at 10am (later it moved to 10pm). It’s estimated that he produced at least 7,500 programs, of which around 5,400 survive (for a stretch Studs was taping over the reel-to-reels of old shows until WFMT staff convinced him that they ought to be saved). Here’s how Alex Kotlowitz recently summed up the significance of Studs’ radio opus:

“The interviews he’s done for his radio program are… a riff on the American way of life, striking notes of hope and despair, of laughter and tears, of stubbornness and transformation. They reveal who we are — and who we want to be. Studs had an uncanny ability to scratch away the veneer of celebrities and the crustiness of the alienated so that listening to these interviews is like peering into the soul of this country.”

Most of Studs Terkel’s radio work (perhaps 90%) has been completely inaccessible to all but the most determined scholars. We’re talking about enchanting, historically significant interviews with giants of 20th century culture. To pick a tiny sample: Studs’ archive contains Martin Luther King discussing civil rights strategies while sitting in the south-side Chicago kitchen of gospel singer Mahalia Jackson; filmmakers such as Buster Keaton, Fellini, Sidney Poitier and Jacques Tati chatting about the techniques of their craft; Chinese and Russian artists, activists and common people trying to understand the complexities of the Cold War world; leaders of the struggle to give women an equal voice in society such as Simone de Beauvoir, Susan Sontag, Gloria Steinem, Adrienne Rich, Erica Jong, Kate Millett, Maya Angelou, Nora Ephron, Dorothy Parker, Eudora Welty, Nadine Gordimer, and on and on.

The opportunity to make the Studs Terkel Radio Archive available is the result of careful planning and persistence, above all by many of Studs’ colleagues and friends. His reel-to-reel archive was maintained and moved first to the Chicago History Museum (in 1998) and then to the Library of Congress (in 2011) with the help of people such as Lois Baum, Sydney Lewis and Tony Judge (just a few of Studs’ WFMT colleagues who have been especially active in tending to his legacy). The archive was further tended to by Gary Johnson and Russell Lewis (from the Chicago History Museum) and Gene DeAnna (the head of audio at the Library of Congress, who oversaw the invaluable process of digitizing) and his staff.

Studs’ life work has remained vividly present at WFMT in the years since his death in 2008. All of the staff is aware of how he helped shape the place. A few examples: Andrew Patner, who has known Studs since he was a child, keeps his legacy of unfettered inquiry alive with his program Critical Thinking, on which listeners can encounter hour-long conversations with blues historians, architects, poets or the likes of Riccardo Muti from one week to the next. Steve Robinson, the station’s general manager, helped keep Studs engaged with the radio station in his final years, and his vision for radio’s future is as bold and unbounded as Studs’. Louise Frank produces the weekly Best of Studs Terkel broadcasts (heard on Fridays at 10pm). David Polk, the station’s new program director, and Andi Lamoreaux, music director, both channel Studs’ eclectic curiosity in shaping the sound of the station.

As much as everyone at WFMT feels Studs’ presence at the station even five years after his death, undertaking the project of making his radio archive available to the world was not inevitable. It was helped along when, in the summer of 2013, the station’s long-running, nationally syndicated music appreciation show Exploring Music with Bill McGlaughlin had its entire archive of 900+ hours put online on a searchable streaming website. This project, important in its own right, was a deliberate precedent for putting the even more vast Studs Terkel archive online. Bill tested the water. When the Exploring Music site was deemed a success, we plunged in with Studs.

In subsequent blogs we’ll share some stories about how the Studs Terkel Radio Archive is being built and the wide-ranging plans for how people might use it (and how it will become a catalyst for the creation of new audio art). While the website won’t formally launch until the autumn of 2014, several dozen of the newly digitized programs (thanks to the Library of Congress and their stunning Culpeper facility inside a mountain outside of Washington, D.C.) are already available on a temporary site hosted by our partners Pop Up Archive, an audio archive based in Oakland. Click here to sample Studs talking to James Baldwin, Simone de Beauvoir, Shel Silverstein, Edward Said and others:

https://www.popuparchive.org/collections/938

I can’t imagine chatting about Studs Terkel’s radio programs without recommending a favorite excerpt. In fact some of the special features of the forthcoming archive will be monthly guest curators who will share some of their favorite bits, with their own comments, and the ability for anyone to clip their own highlights and share them (or use them to create new artistic audio mashups). I’ll end this blog with one recently discovered highlight. Every excursion into Studs’ radio archive seems to yield a new ‘favorite moment.’ Today’s “soup of the day” just happens to be a brief exchange with Muhammad Ali. In a matter of a few minutes the listener gets a micro-vision of Terkel’s interview style and methods, including a poetic riff on pain and a fascinating moment when Terkel reacts sharply to being compared to Howard Cosell. I won’t say much more, except to point out that this peculiar bit of dialogue is something of a synecdoche for Studs’ work as a whole.

There are thousands of other moments in the Studs Terkel radio archive that are equally surprising and revelatory. You can literally dive in anywhere and be almost certain to have your preconceptions upended and horizons expanded.

*     *     *

We don’t know how the Studs Terkel Radio Archive website will look and work. The design process is just getting started and new partners are joining. The list of organizations involved already includes PRX, the BBC, The Nation, StoryCorps, In These Times, The Chicago Humanities Festival, Third Coast Audio Festival and others. One goal for the archive website is to give journalists, scholars, artists and the general public free rein to stitch together their own fresh audio collages drawing on Studs’ work, creating their own audio mashups, cutting, clipping and rearranging snippets of dialogue mixed with their own audio files, whether music, interviews or other acoustic artifacts. We would like to hear a hip-hop song mixing samples of Studs talking with James Baldwin, John Cage or Maya Angelou, or an independent film based on the voices of Nelson Algren, Diane Arbus, Big Bill Broonzy or Laurie Anderson. We agree that Studs would have enjoyed nothing more than to be surprised and puzzled to hear what the world might make of his audio treasure trove in the 21st century and beyond.

Stay tuned for more updates on how Studs’ work is being prepared for sharing.

Please send comments, ideas and questions about the Studs Terkel Radio Archive to: tmacaluso@wfmt.com

Building the Studs Terkel Radio Archive: Part One | Let’s Get Working: Chicago Celebrates Studs Terkel

Five Cool Things I’ve Worked On So Far

“My dream is to make our archive open and accessible — but also to add context and value to current reporting efforts.”

socialmediadesk:

I’ve been at NPR for four months now and wanted to highlight a couple of neat projects that I’ve worked on so far. I should say: my gig is not all lollipops and roses. Sometimes it’s quite stressful (which is the nature of this daily, deadline-driven business…), sometimes I’m completely…

Five Cool Things I’ve Worked On So Far

An API Is Research And Development For Your Business Model

This is a guest post by Kin Lane of apievangelist.com.

I spend a lot of time talking to folks on the phone, Skype, in Google Hangouts and in person about their API business models. Not everyone I talk with is willing to share their story publicly, so I’m also happy when I meet folks who are as open and transparent about figuring all of this out as I am.

This morning I spoke with Anne and Bailey over at the Pop Up Archive about their upcoming API, and potential business model(s) for their API when ready. We talked about their immediate API release and then brainstormed about how to get the word out, how people might use the API, and possible approaches to monetization.

Pop Up Archive is an audio transcription service, allowing you to publish audio files and receive back full text transcriptions of the audio. It is a pretty straightforward service, and they will be releasing an API so others can use the audio transcription service in their own apps, as well as access the wealth of audio resources they are amassing in their archive.
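For readers who think in code, here is a rough sketch of what that publish-and-receive workflow could look like from a developer’s seat. The endpoint paths, field names and auth scheme are purely illustrative assumptions; the actual Pop Up Archive API was still unreleased at the time of this conversation.

```python
# Hypothetical sketch of the upload-and-poll transcription flow.
# Endpoints, field names, and auth are illustrative assumptions,
# not the actual Pop Up Archive API.
import time
import requests

API_BASE = "https://www.popuparchive.com/api"  # assumed base URL
API_TOKEN = "your-token-here"                  # assumed auth scheme
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}

def transcribe(path):
    # 1. Upload the audio file for processing (hypothetical endpoint).
    with open(path, "rb") as f:
        resp = requests.post(f"{API_BASE}/items", headers=HEADERS,
                             files={"audio": f})
    resp.raise_for_status()
    item_id = resp.json()["id"]

    # 2. Poll until the machine transcript is ready (hypothetical shape).
    while True:
        item = requests.get(f"{API_BASE}/items/{item_id}",
                            headers=HEADERS).json()
        if item.get("transcript"):
            return item["transcript"]
        time.sleep(30)  # transcription takes minutes, not seconds
```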

Even with the straightforward nature of this upcoming API resource, the question of who will use this service, and what they are willing to pay, comes up–something we spent the good part of an hour discussing. We should have recorded it, then we’d have the transcription, but since we didn’t, here are some of my thoughts from today’s discussion.

Your API Is Tech and Business Research & Development
The tech space moves fast, and success is all about getting your API up and running, allowing developers to use it, even if it is just within a trusted group, then iterating and evolving as you gain more knowledge about what people want and how they will be using it. You can speculate from now until the cows come home about how people will use it, but until you have it up and running you won’t know for sure. When you approach your API with an R&D mindset, you will be much more open to opportunities and more resilient when things go wrong.

Rolling Out With Minimum Viable Monetization Strategy
When you are pioneering into a new area of API resources, it can be tough to know what the market is willing to pay. Start with identifying your hard costs like compute, storage and bandwidth, tack on a reasonable profit and get your API open for business in a beta release. Make it clear that things will change, but let people start using your API, pay for basic services and begin understanding more about their usage and what your first wave of customers are willing to pay. Over time, you can evolve your monetization strategy, building on what your hard costs are, the value you deliver to your users, and ultimately what they are willing to pay for your service(s).
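To make the arithmetic concrete, here is a minimal sketch of that “hard costs plus a reasonable profit” calculation. Every dollar figure is an invented assumption, not Pop Up Archive’s actual cost structure.

```python
# Back-of-the-envelope sketch of "hard costs plus reasonable profit"
# pricing. All dollar figures are made-up assumptions.

COMPUTE_PER_MIN = 0.020    # speech-to-text compute, $/audio minute (assumed)
STORAGE_PER_MIN = 0.002    # storage amortized per audio minute (assumed)
BANDWIDTH_PER_MIN = 0.001  # transfer costs per audio minute (assumed)
PROFIT_MARGIN = 0.40       # the "reasonable profit" markup

hard_cost = COMPUTE_PER_MIN + STORAGE_PER_MIN + BANDWIDTH_PER_MIN
beta_price = hard_cost * (1 + PROFIT_MARGIN)

print(f"hard cost: ${hard_cost:.3f}/min, beta price: ${beta_price:.3f}/min")
# As usage data accumulates, revisit PROFIT_MARGIN and the per-minute
# rate based on what the first wave of customers is willing to pay.
```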

Casting A Wide Net To Identify Target Audience Beyond Obvious Ones
When identifying their potential target audience, Pop Up Archive started with the obvious ones like radio and journalists, but then how do they identify other areas that aren’t so obvious? You do this first by getting your API up and available in some sort of private beta. To support the API release, publish a blog and Twitter account and get to work telling the stories of your API, which Pop Up Archive has done. Then get to work telling the details of every step of your journey, and every use case of the obvious target groups. These stories will become your net: the wider you cast it, with as many keywords as you can include and as many stories as you can tell, the wider the possible audience you will draw in. The Pop Up Archive introductory video says that if you don’t provide text representations of your audio, it won’t be found in searches–when it comes to APIs, if you don’t tell stories of the problems your API solves, users will never find it when searching for solutions to their problems.

Not All API Consumers Are Created Equally
When identifying your API consumers, make sure you get to know their needs and goals. In the case of the Pop Up Archive, some users may be using the API for audio transcription, while others may be using it to gain access to the rich library of audio uploaded by other users. These two groups will have radically different needs, and possess very different thresholds of what they will pay for API access. While it makes sense to charge audio transcription users for the heavy lifting of transcribing, you want to incentivize archive users to access, syndicate and share as much content as they can. Why even charge them? With a proper branding strategy these users can become the marketing vehicle of the API, building directories, sites, widgets and other content syndication that could potentially drive new users to the API.
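One way to picture the two groups is as separate plans in a billing or API-gateway layer. This is a sketch of the idea only; the plan names, rates and limits are assumptions, not anything Pop Up Archive has published.

```python
# Sketch of mapping the two consumer groups to plan definitions.
# Names and numbers are illustrative assumptions.

PLANS = {
    # Heavy lifting: charge per processed minute of audio.
    "transcription": {
        "price_per_audio_minute": 0.03,  # assumed rate
        "rate_limit_per_hour": 500,
    },
    # Archive access: free, to incentivize syndication and sharing,
    # turning consumers into a marketing vehicle for the API.
    "archive": {
        "price_per_audio_minute": 0.0,
        "rate_limit_per_hour": 5000,
        "requires_attribution": True,    # branding strategy hook
    },
}

def monthly_bill(plan, minutes_processed):
    return PLANS[plan]["price_per_audio_minute"] * minutes_processed

print(monthly_bill("transcription", 1200))  # e.g. 1200 minutes -> 36.0
```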

Anne and Bailey are doing a great job of approaching their API strategy in a very open and agile way. The worst thing any API provider can do is approach their strategy with a very rigid view, thinking they understand exactly how developers should use an API resource. Pop Up Archive has built a great service that does two things, and does them well:

  • Audio Transcription
  • Audio + Transcription Archive

They provide a clean, easy to use website, and now they are preparing a first version of their API, with accompanying documentation, code samples and widgets. They have identified their operational costs and are developing an appropriate fee for the service that is based upon the number of minutes of audio you are processing.

In 2014, Pop Up Archive will open their API for business with a handful of beta partners and begin iterating on the technology and business of their API platform. While there are a lot of unknowns, Anne and Bailey realize they just need to get up and running, engage their users, and establish a feedback loop that will then help define the future of their API and its business model.

Your API is an external R&D lab for your business. While you will be producing real products and services out of this lab, you need to approach operations with an agile state of mind, enabling you to be open to use cases and business models you may never have conceived when you set out on this road. If you do this, your chances of success will significantly increase.

What sort of use cases and business models do you envision for Pop Up Archive?

An API Is Research And Development For Your Business Model

Digital Stewardship and the Digital Public Library of America’s Approach: An Interview with Emily Gore | The Signal: Digital Preservation

Emily Gore, Director for Content at the Digital Public Library of America

The following is a guest post by Anne Wootton, CEO of Pop Up Archive, National Digital Stewardship Alliance Innovation Working Group member and Knight News Challenge winner.

In this installment of the Insights Interviews series, a project of the Innovation Working Group of the National Digital Stewardship Alliance, I caught up with Emily Gore, Director for Content at the Digital Public Library of America.

Anne:  The DPLA launched publicly in April 2013 — an impressive turnaround from the first planning meeting in 2010. Tell us how it came to be, and how you ended up in your role as content director?

Emily:  I started building digital projects fairly early in my career, in the early 2000s, when I was an entry-level librarian at East Carolina University. In the past, I’ve worked on a lot of collaborative projects at the state level. In North Carolina and South Carolina, I worked on a number of either small scale or large scale statewide collaborations. I led a project in North Carolina for a little over a year called NC-ECHO (Exploring Cultural History Online) and so have always been interested in what we can do together as opposed to what we can do individually or on an institutional level. Standards are important. When we create data at our local institutions, we need to be thinking about that data on a global level. We need to think about the power of our data getting reused instead of just building a project for every institution — which is where all of us started, frankly. We all started in that way. We thought about our own box first, and then we started thinking about the other boxes, right? I think now we’re beginning to think broader and more globally. It’s always been where my passion has been, in these collaborations, especially across libraries, archives, and museums.

I was involved in the DPLA work streams early on and saw the power and promise of what DPLA could be, and I jumped at the offer to lead the content development. At the time, I had taken an associate dean of libraries position and been at Florida State for about a year, and it was a real struggle for me to think about leaving after only being somewhere for a year… but I think, I guess we have to take leaps in our life. So I took the leap, and you know, I think we’re doing some pretty cool things. We’ve come really far from when I started last September, really fast. I haven’t even been working on the project for a year and we’ve already aggregated millions of objects and we’re adding millions more.

I love all the energy around the project and that a lot of people are excited about it and want to contribute. One of the first projects I coordinated was with a local farm museum, dealing with the actual museum objects, and marrying those with the rich text materials we had in the library’s special collections. And telling a whole story — people being able to actually see those museum objects described in that text. I just saw the power of that kind of collaboration from early on and what it could be more than just kind of a static, each-one-of-us-building-our-own-little-online-presence. The concept of the DPLA has really been a dream for me, to take these collaborations that have been built on the statewide, regional and organizational levels and expand them.

Image of DPLA homepage

Anne: There are ongoing efforts in lots of countries outside the United States to create national libraries, many of which have been underway since before the DPLA. Are there any particular examples you’ve looked to for inspiration?

Emily: Europeana, a multi-country aggregation in Europe, has been around for about five years now. We’ve learned quite a bit from them, and talked to them a lot during the planning phase. They have shared advice with us regarding things they might have done differently if given the opportunity to start again. One particularly valuable piece of advice has been not to be so focused on adding content to DPLA that we forget to nurture our partnerships and to work with our users. Of course, my job is largely focused on content and partnerships, but we really want to make sure that the data we are bringing in to DPLA is getting used, that there are avenues for reuse, that people are developing apps, that we continue to make sure the GitHub code is updated, and that everything is open and we promote that openness and take advantage of showing off apps that have been built, encouraging other people (through hackathons, for example) to build on what we’ve got.

Europeana has also done a lot of work building their data model, and testing that data model, and making it work with their partners. That’s been a huge help for us starting off, to take their data model and adapt it for our use. They’ve also held rights workshops — Europeana formed 12 standardized rights statements starting with CC0 and various Creative Commons level licensing, down to rights restricted or rights unknown. We all need to work with our partners to help them understand their rights and their collections better, and to place appropriate rights on them. Most of the collections we see coming in are “contact so-and-so,” “rights reserved,” that kind of thing. This is largely because people are afraid or there is a lack of history regarding rights. We want to work with Europeana and our partners to clarify rights regarding reuse for our end users. Europeana has started to work with their partners on that, and we want to do that together, so that the rights statements are the same between organizations, and we promote interoperability in that way.

Anne:  So much of the DPLA is based on state hubs and the relationships that existing institutions have with those state hubs. How much collaboration do you see among the states?

[For uninitiated readers: the DPLA Digital Hubs Program is building a national network of state/regional digital libraries and myriad large digital libraries in the US, with a goal of uniting digitized content from across the country into a single access point for end users and developers. The DPLA Service Hubs are state or regional digital libraries that aggregate information about digital objects from libraries, archives, museums, and other cultural heritage institutions within its given state or region. Each Service Hub offers its state or regional partners a full menu of standardized digital services, including digitization, metadata, data aggregation and storage services, as well as locally hosted community outreach programs to bring users in contact with digital content of local relevance.]

Emily: When the DPLA working groups started to examine how we should go about getting content into the DPLA, I remember saying “We should build off of existing infrastructure, because these collaborative projects exist in many states.” They’ve been working with the local institutions for a number of years. So if we can start working with those institutions, then we can build a network and get content. Trust is so important. I think that the small institutions often trust that institution that’s been aggregating their content for a number of years, and they might not trust someone from the DPLA coming in and saying, “I want your content.”

The states work extremely well together. We have project leads and other relevant staff from each state or region; right now we’re working with five states and one region covering multiple states. We come together to talk about issues that are relevant to all of the states. The models are very different. Some of them have centralized repositories where the metadata work, the digitization work, everything is done in one central place. They work with partners to help provide initial data, and to get the actual objects, but then all the work is done centrally to enhance that metadata and do the digitization work. In other places it’s totally distributed. I’ll take South Carolina as an example. The three major universities in the state have regional scan centers, and they work with the people in their respective regions to get materials digitized, described and online. They’ll accept contributions from institutions who have already digitized their content and provided metadata for it, and then they’ll take it into their regional repository, and then the three regional repositories are linked together to form one feed. It’s wonderful to hear the exchanges among the hubs: “this is what works in our state, and here are the reasons why.” And they figure out, “Maybe we’ll try this, maybe this will work better to attract folks.”

Anne: Have the state hubs helped build relationships with small institutions? Or how has the DPLA mission and reputation preceded it in these communities?

Emily: In several of the regions, because of the participation of the DPLA, people who refused to partner before are actually saying, “I want my content exposed through the DPLA, so can we partner with you?” Partnerships are expanding in the hub states/region as a result of this. I think being at the national level is really helping. I think a lot of [the state hubs] are trying to do outreach and education — they’re doing webinars, they’re talking to people in their state, they’re trying to educate people about what the DPLA is and what the possibilities are. And trying to alleviate fears, where possible. There’s a lot of fear. Even opening metadata, it’s been interesting to see what people’s reactions to that are sometimes. I guess in my mind, I never thought about metadata having any rights. These states have had a challenge explaining what a CC0 license really means for metadata. I think that that has been a hurdle, but most of them are overcoming it, and partners in general are ok with it once they understand the importance of open data. They’re explaining why it’s important, and they’re talking about linked data and the power of possibility in a LOD world, and that that’s only going to happen if data is open.

Anne: How do you effectively provide context for these 4,000,000+ digital records? How do you root a museum artifact in the daily life of that place, and how do you do it within a given state versus across states?

Emily: We’ve done exhibitions of some of the content in the DPLA so far. We have worked with our service hubs to build some initial exhibitions around topics of national interest. Our goal initially was for different states to work together to help provide data from multiple collections. That happened on a very small scale. Mostly the exhibitions were built with collections from their own institutions, largely because of time constraints we were under to get the exhibitions launched. But also, it’s easier. You know the curator down the hall, you can get permission to get the large-scale images that are needed to actually go in the exhibitions. We did have some exceptions to that; we had a couple of institutions work together and share images with the others. We hope to do more of that — we pulled out 40 or 50 themes of national significance that we could potentially build exhibitions around and there are a number of institutions who want to build more. Right now we’re working on a proposal to actually work with public librarians in several states, to reach some of the small rural public libraries that may have some collections that haven’t been exposed through the hubs, that would in turn help build some of these exhibitions at a national level. And those would be cross-state: local content into national-level topics of interest. We’re also doing a pilot with a couple of library schools on exhibition building. And we’ve given them the same themes, and they’re going to use content that already exists in the DPLA.

Anne: You mentioned hackathons and encouraging people to build things using the DPLA API. What are people building so far?

Emily: To date, I think there are approximately nine apps on the site. There is a cross-search between Europeana and the DPLA — a little widget app where you can search both at the same time and get results, which is awesome. That was built early on. Ed Summers built the DPLA map tool that automatically recognizes where you are so you can look at what DPLA content is available around you. The Open Pics app is iOS-based — you can search and find images around all the topics in the DPLA and use them on your phone. It’s pretty cool. Culture Collage is the same kind of app – it visualizes search results for images from the DPLA. StackLife is a way to visualize book material in a browsing way, like you would actually in the stacks in a library.
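For the curious, all of these apps sit on top of the same open HTTP interface. A minimal query against the DPLA items API, which is documented publicly and requires a free api_key, might look like the sketch below (field names follow the public v2 API at the time of writing, and titles can come back as either strings or lists):

```python
# Minimal sketch of querying the DPLA items API that these apps build on.
# Endpoint and field names follow the public v2 API docs; an api_key
# (free on request from DPLA) is required.
import requests

API_KEY = "your-dpla-api-key"  # request one from DPLA

resp = requests.get(
    "https://api.dp.la/v2/items",
    params={"q": "Studs Terkel", "page_size": 5, "api_key": API_KEY},
)
resp.raise_for_status()
data = resp.json()

print(f"{data['count']} matching records")
for doc in data["docs"]:
    # sourceResource holds the provider-supplied descriptive metadata
    print("-", doc["sourceResource"].get("title"))
```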

We also hope to continue to have hackathons. We’ve talked a little bit to Code for America and hope to get more plugged in to their community, and we were involved in the National Day of Civic Hacking. We’re hoping to continue to promote the fact that we do have this open data API that people can interface with and build these cool apps. We really want to encourage more of that.

Anne: Explain your vision for the Scannebago mobile scanning units.

Emily: When I was working in North Carolina years ago, we did a really extensive collections care survey of all the cultural heritage institutions in the state of North Carolina — about 1,000 institutions.

That survey took five years and two or three different cars! We surveyed these cultural heritage institutions looking specifically at their collections care and the conditions their collections were in, but also with an eye toward what might need to be preserved for the long term, what needs to be digitized and made available, and what gem collections we could essentially help them expose. We saw so many amazing collections that, without physically going to these institutions, you would never ever see. Take the Museum of the Cherokee Indian as an example.

There we discovered wonderful textiles and pottery and other collections that, unless you physically go there, you will likely never see. And of course, like most museums, they only display a small portion of their collection at any time. Otherwise the collections are in storage, and on shelves, and until they rotate those collections in you never see them. It’s not only in North Carolina where we find those examples — it’s everywhere. The ability to see those objects online, I think, is so powerful. And even to potentially tell that rich contextual story, build exhibitions around that, talk about the important history there — I think can be very powerful. But we know that it took a trust relationship for us to even go there and survey their collections. There had to be a trust relationship built, instead of, “Hi, we’re from the state government and we’re coming here to survey your collections.” Obviously that is not really what a lot of people want to hear. So [during the North Carolina survey] we worked with cultural heritage professionals who had existing trust relationships with institutions and they helped us forge our own relationships.  In the end, most institutions were confident that we were indeed only there to survey the collections, and that we had good intentions to help get funding, to help preserve these collections for the long term.

We use that network a lot. We’re not going to get local content without the local people, without the connections, without the trust relationships that have already been built. These people aren’t going to let materials out of their building to be digitized. They’re not going to send them to a regional scan center, or a statewide scan center — they’re just not going to do that. They care about those objects so much — they represent their history, and in many cases they’re not going to let them out of their sight. We have to come to them — how do we do that? Some of these places are up these long winding mountain roads — how in the world do we get up here, and how in the world do we get equipment to them to get this done? That’s where I came up with the concept of a mobile digitization vehicle that I called a Scannebago, a Winnebago shell that we can build out with scanning/camera equipment to get to these rural and culturally rich institutions. That’s the concept.

People ask me about taking content directly into DPLA, and I think the importance is the sustaining of that content. Somebody has to be responsible for the long term maintenance of that content — and at this point, that’s not us. We’re aggregating that content, exposing that content for reuse, but we are not the long-term preserver of that content. And these small institutions are not the long-term preservers of that content either — that’s why the hubs model continues to be important. When we go out with the Scannebago, I still want that digital material to go to the hubs to be preserved for the long term. The Scannebago is another way to make content available with its appropriate metadata through the DPLA, but we really want to see the digital objects preserved and maintained for the long term at some level, and right now that’s through the hubs. It doesn’t have to be geography-based — hubs could be organized around media type or organization type.  But right now, a lot of these relationships exist already based on geography, so it seems logical to continue to build out hubs by geography as we build out other potential collaboratives as well.

The Scannebago has always been a dream — I had really hoped when I was working at the state of North Carolina that we’d be able to do it on some level, and it just didn’t become a reality — but John Palfrey (Head of School at Phillips Academy, Andover and chair of the DPLA board of directors) heard about what I wanted to do and picked it up and was really excited about the potential of doing this. We’re drawing out a schematic of what it would look like. We might potentially launch a Kickstarter campaign to try to build one out in the future. We really want to at least pilot the concept. I would also love to do a documentary on it — I think the stories we’ll find when we actually get to these places are just as important to preserve as the content — the curators, the people who are looking over this stuff and how important it is. I get chills just thinking about it, but one step at a time. One step at a time.

Digital Stewardship and the Digital Public Library of America’s Approach: An Interview with Emily Gore | The Signal: Digital Preservation

Pop Up Archive and PRX Launch Audio Archiving Service | Idea Lab | PBS

This post was written with help from Jake Shapiro, CEO of Public Radio Exchange.

We believe that archiving can be so easy it doesn’t even feel like archiving — and that an entire new generation of meaningful content is waiting to be created, unlocked from the archive and unleashed on the world.

 

It’s been an amazing few months since Pop Up Archive won the Knight News Challenge in September, and we’re excited to share our progress with you. At the end of 2012, Pop Up Archive and PRX teamed up to build a web-based archive system that addresses the needs of content creators and archivists as they attempt to harness quality audio material. We’re energized and inspired by the Pop Up-PRX collaboration, and we’ve spent 2013 so far hunkered down in Oakland, Calif., and weathering a blizzardy week together in Cambridge, Mass., developing the system. At the time of this writing, Pop Up Archive is just about ready for use by anyone with audio and the will to save it. Together with PRX and audiovisual preservation expert Dave Rice, we’ll debut Pop Up Archive in March at a workshop we’re giving at SXSW:Interactive.

Pop Up Archive is a simple system for organizing and accessing audio and related material. We’re leveraging emerging technology and standards to build an archive of oral history material with new methods for search and discovery as well as public and private storage options. We’ve secured records and accompanying content from larger organizations like WGBH and Illinois Public Media as well as independent producers and small archives like The Kitchen Sisters.

Pop Up Archive has looked to PRX’s approach and unique role in the public media community ever since we started thinking about audio archiving at the Berkeley School of Information. We’re excited about this partnership because we share ideals and motives, but also because we get things done and we do them well. We’re both passionate about architecting the future of media while staying very close to the ground when it comes to user habits and workflows. We’re thrilled that PRX members will be some of the first to test and take advantage of Pop Up Archive.

WHAT IT DOES

So what does Pop Up Archive do exactly?

  1. Preserves digital audio. Valuable cultural material is lost every time a hard drive dies or a folder gets erased to make more space on your laptop. Pop Up Archive enables anyone to add archival records and safeguard media privately on Pop Up Archive servers or publicly at the Internet Archive.
  2. Makes it easy to add metadata. Pop Up Archive uses speech-to-text software to create useful subject tags about your audio automatically. You’ll also be able to add custom metadata using a simple form or by importing your existing CSV or XML records (a minimal CSV sketch follows this list).
  3. Enables anyone to search, filter, and access a substantive database of archival material from oral history archives, media stations, and individuals. However, we realize that not all audio is ready to be shared, so users will also be given the choice of storing their audio publicly or privately.
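As a concrete illustration of item 2 above, here is a minimal sketch of preparing a CSV of descriptive metadata for batch import. The column names are illustrative assumptions, not Pop Up Archive’s actual import schema.

```python
# Hypothetical sketch of a CSV of descriptive metadata for batch import.
# Column names are illustrative assumptions, not the real import schema.
import csv

records = [
    {"filename": "interview_001.wav",
     "title": "Oral history: neighborhood grocers",
     "date": "1987-06-12",
     "tags": "oral history;Chicago;small business"},
]

with open("metadata.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["filename", "title", "date", "tags"])
    writer.writeheader()
    writer.writerows(records)
```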

We’re eager for any and everyone to test our batch upload and auto-tagging process so we can improve the service and find out what features are most desirable. Come visit Pop Up Archive and PRX at our workshop, where we’ll provide archiving advice of all stripes to anyone interested, open house style. Registration full? Come anyway, for a short presentation at 9:30 a.m. or anytime through 1:30 p.m. on Monday, March 11. Follow us on Twitter or visit popuparchive.org for updates.

Pop Up Archive and PRX Launch Audio Archiving Service | Idea Lab | PBS

Pop Up Archive: Build an Archive & Make It Count | See you in Austin!

Pop Up Archive is a Knight-funded project that makes it easy to tag and store audio and related material on the web. We’re debuting an alpha version of the site in March at SXSW:Interactive. We’d love to see you there.

If you have amazing audio on the shelf (or hard drive), consider taking the system for a two step while you’re at SXSW. What does Pop Up Archive do exactly? Put plainly, we’re making audio searchable and easy to organize. 

  • Save digital audio.

Valuable material gets lost every time a hard drive dies or a folder gets erased to make more space on your laptop. Pop Up Archive enables anyone to track their material and safeguard media privately on Pop Up Archive servers or publicly at the Internet Archive.

  • Make it searchable.

Pop Up Archive uses speech-to-text software to create transcripts and useful subject tags about your audio automatically (a toy sketch of the idea follows this list). You can also add your own descriptive tags and metadata using a simple form.

  • Build a nexus of archival audio from around the world.

Save time exploring and finding material within your own work as well as a wide range of material from producers, oral historians, archives and scholars. Private storage options enable you to protect any audio that’s not ready to be shared.
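To make the auto-tagging idea above concrete, here is a toy illustration: derive candidate subject tags from a machine transcript by counting content words. Pop Up Archive’s real pipeline is certainly more sophisticated; this only demonstrates the basic concept.

```python
# Toy illustration of deriving subject tags from a transcript:
# count content words and keep the most frequent. Not the actual
# Pop Up Archive tagging pipeline, just the basic idea.
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "that", "it",
             "is", "was", "i", "you", "we", "so", "this", "on"}

def auto_tags(transcript, n=5):
    words = re.findall(r"[a-z']+", transcript.lower())
    content = [w for w in words if w not in STOPWORDS and len(w) > 3]
    return [word for word, _ in Counter(content).most_common(n)]

print(auto_tags("We recorded the blues musicians of Chicago, and the "
                "blues clubs where those musicians played every night."))
# -> ['blues', 'musicians', 'recorded', 'chicago', 'clubs']
```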

If you’re planning to be at SXSW, we’d love to see you there. Registration for the workshop is required, but no one is expected to come for all four hours. You can read more on our Facebook event page. Tell your friends!

Pop Up Archive: Build an Archive & Make It Count | See you in Austin!