In the digital collection building workshops we do for SAA, we have always emphasized the idea that you should never digitize a collection that isn’t already processed. We generally leave the definition of “processed” a bit vague. At the most basic level, we mean that until you have some organized list of the items that you want to digitize, you shouldn’t start slapping random content on the scanner bed. In practice this has meant that you didn’t digitize until you had item-level control of the collection, even if that was only a title without any other descriptive information. The value-added descriptive information is something we would advocate adding as part of the digitizing workflow.
Now I am beginning to wonder if that idea is not quite as valid for born-digital content. Perhaps if we just put the stuff out there with the absolute minimum of control, and let the crowd of interested amateur experts fill in the details beyond what we can derive automatically, we might be better off, or at least further ahead.
For most born-digital content I can know a few basic things mostly automatically: where it came from, who created it (sometimes), what it is (document, photograph, moving image, etc.), and its file format (jpg, pdf, mp4, mp3, etc.). I can fill in the few required fields in a management system automatically, with something as basic as the title being simply the file name. Could I then just toss it out there and allow the crowd to fill in the other details?
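To make that concrete, here is a rough sketch of what the automatic part might look like. It is only an illustration: the function name `minimal_record`, the field names, and the extension-to-type table are all my own assumptions, not the required fields of any particular management system, and the creator is simply left blank because it can only sometimes be known.

```python
from pathlib import PurePath

# Hypothetical lookup from file extension to a broad content type.
# A real repository would need a much longer, locally agreed list.
BROAD_TYPES = {
    "jpg": "photograph",
    "pdf": "document",
    "mp4": "moving image",
    "mp3": "sound recording",
}


def minimal_record(path, source):
    """Build the bare-minimum description of one born-digital file.

    `source` is whatever we know about where the file came from
    (a donor, an office, a transfer), supplied by the caller.
    """
    p = PurePath(path)
    # File format is taken from the extension, lowercased, without the dot.
    file_format = p.suffix.lstrip(".").lower()
    return {
        "title": p.name,      # the title is simply the file name
        "source": source,     # where it came from
        "creator": None,      # only sometimes known; left for later
        "type": BROAD_TYPES.get(file_format, "unknown"),
        "format": file_format or "unknown",
    }


record = minimal_record("transfers/box1/commencement_1947.JPG",
                        "University Archives transfer")
```

Everything beyond these five fields is what would be left for the crowd.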
Even if I assume that there are equal parts “Wisdom of the Masses” and “Madness of the Mob” out there, would I get enough good information to make it worth the work of separating the wheat from the chaff?
One argument on the positive side is that, unless you have a very highly focused collection with a very small temporal span, no one organization or institution can possibly have all the expertise to create high-quality, in-depth information about all of its collections. And there are a lot of people out there who may know more about Ukraine, or about DU in the 1940s, than the folks here in Denver in the early part of the 21st century.
Could our role as archivists and repository managers be to view and review, rather than to create and catalog?
I don’t think this can really work. Or can it?