
Archive for the ‘Discovery’ Category

Going Mobile

May 10th, 2010 Quantum Archivist 4 comments

A recent post in the AOTUS blog by David Ferriero entitled “The Future is in the Palm of Our Hands” called for archivists to think about ways to connect archival collections to potential users through mobile devices. Ferriero was speaking specifically about NARA and its collections, but this idea is of course broadly applicable to all archives and collections.

The great opportunity for archives in connecting to users through mobile devices comes from one special property of these devices: they can locate themselves in space; that is, they know where they are. And since they know where they are, we can link digital objects in our collections to those locations and have them pop up on a mobile device and announce their presence, with the user doing practically nothing except holding up a smartphone.
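To make the idea of linking objects to locations concrete, here is a minimal sketch of the lookup a mobile service might perform: given the device's position, return the geo-coded items within walking distance, nearest first. Everything in it is an assumption for illustration: the `ArchivalItem` record, the `items_near` function, the 200-metre radius, and the sample coordinates (which are only approximate). It is not a description of any existing system.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass
class ArchivalItem:
    """Hypothetical record: a digital object tagged with a location."""
    title: str
    lat: float
    lon: float


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def items_near(items, lat, lon, radius_m=200):
    """Return items within radius_m of the device position, nearest first."""
    hits = [(distance_m(lat, lon, item.lat, item.lon), item) for item in items]
    hits.sort(key=lambda pair: pair[0])
    return [item for dist, item in hits if dist <= radius_m]


# Illustrative sample data; coordinates are approximate.
COLLECTION = [
    ArchivalItem("Speech recording, Lincoln Memorial steps", 38.8893, -77.0502),
    ArchivalItem("Street photograph, San Francisco, 1906", 37.7749, -122.4194),
]

if __name__ == "__main__":
    # A device standing a few metres from the first item's coordinates.
    for item in items_near(COLLECTION, 38.8893, -77.0500):
        print(item.title)
```

A linear scan like this is fine for a few thousand items; a larger collection would presumably want a spatial index, but the principle is the same.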

The idea of geo-coding locations for historical documents (especially photographs) has been around for some time. I was a part of some work in the late 1990s at Tufts University in collaboration with the Perseus Digital Library to overlay historical resources of London and Boston onto historical maps. These were large-scale, programming-intensive projects that used what we would now consider primitive, web-based GIS tools to display and deliver historical information through a web browser. They certainly were not optimized for mobile devices because, of course, those devices didn’t really exist then. While these tools were good at showing a visual representation of the location of historical information, we didn’t yet have the ability to do what we could imagine, which was to stand in a particular spot on the earth and connect with the historical record of that particular place.

"Boston Streets" at Tufts University

The advent and general adoption of the Google Maps API made it possible to more easily connect content to maps, and the development of smartphones and web-enabled mobile devices makes it possible to deliver historical documentation to people right where the history happened, even though the resources that document that history are stored in our repositories.

How great would it be to stand on the steps of the Lincoln Memorial and hear Martin Luther King Jr.’s “I Have a Dream” speech? Or stand on a street in San Francisco and see photos of that street after the 1906 earthquake? Actually, I don’t know that you CAN’T do this right now. The technology exists; I just don’t know if anyone has done it yet.

Of course, there are people already doing this sort of thing. For example, if you are in Philadelphia, you can point your iPhone to http://phillyhistory.org/i/ and be shown historic photos of Philadelphia based on your location. North Carolina State has produced WolfWalk (http://www.lib.ncsu.edu/wolfwalk/), which provides information on the history of approximately 60 major sites on the NCSU campus drawn from resources at the University’s Special Collections. In both cases, though, I need to know that Phillyhistory or WolfWalk exists and what the URL is.

What would it take for my Google Maps app to list not only restaurants or barber shops, but historical documents, images, and media related to nearby places? Well, maybe that’s getting a bit too optimistic, but we can still dream, can’t we?

Keep Your Friends Close…

April 19th, 2010 Quantum Archivist 1 comment

… and your enemies closer. Whether this comes from the Godfather, or Napoleon, or an ancient Chinese philosopher, it may explain what a fervent believer in open source like me is doing giving a presentation at an ILS vendor’s user group meeting here in Chicago.

Image from Wikipedia

Like most academic libraries, we use a combination of tools, applications, and resources to collect and deliver our content. In the past few years, we have made an explicit choice to move toward open source software solutions, at least for our presentation layer.

Why did we do this? There are a number of reasons, most of them philosophical and operational rather than economic. Although open source is free (like a puppy), there are many costs associated with development and maintenance. I don’t think the economic argument has a lot of value in terms of decision making, since anything big costs a lot of money. Big products from vendors and big software development projects seem to me to be in the same ballpark cost-wise.

I’m not going to go deeply into the whole argument here, and it is possible to argue any of these points. But my opinion is that, given a certain level of technical expertise (that not everyone has or can get), the advantage of open source is the ability to be nimble in the face of new demands and to serve your user base in a much more focused way than vendor solutions can offer. The downside, of course, is that you have to maintain it all yourself, and there is no easy phone call to customer support that you can make to say “just fix it please!”

Which brings me back to Chicago, physically and intellectually. I am part of a panel with two colleagues from our library to talk about harvesting and aggregating metadata–including primary source metadata–into a presentation layer that is usable and useful for researchers.

We will of course talk about the vendor-supplied option that we currently use to harvest and aggregate book and primary source metadata. But I’m going to go a step beyond that to talk about the value of standards-based data exchange, and to demonstrate not only the vendor-based model but also a few open source applications that we have developed here at the library. My point is that data aggregation is a matter of policy and practice, not applications.

What I am saying is that aggregated metadata can be used in a variety of ways to support discovery, and that open source applications based on standards that can be re-used and re-purposed for different audiences can go a long way toward serving the needs of our local audiences in ways that “one-size-fits-all” vendor products don’t seem to be doing.

We’ll see what sort of reception this gets in a room full of people who presumably (at least in my mind) are here to hear about the latest product from their vendor and why they should buy it.

Who Wants to Know?

April 13th, 2010 Quantum Archivist No comments

Penrose Library's new web site

Recently, the Penrose Library launched a brand new “user-centered” web site. I’m not a big fan of the term “user centered” since I think it is often used as an excuse not to be creative. But what we are trying to do is make available to each group of users the things that they are most interested in right up front. Rather than forcing them to learn how the library is organized administratively, we wanted the site to answer the question: “What do I want to do?” based on a second question:  “Who wants to know?”

Some of this approach was informed by a workshop given by Nancy Fried Foster, library anthropologist at the University of Rochester, that some of us attended a year or so ago. She had recently completed an ethnographic study of undergraduate research behavior at the University of Rochester. Her findings were published in 2007 in a book called “Studying Students: The Undergraduate Research Project at the University of Rochester.”

Other parts of the design were informed by our own observations of the behavior of faculty, students (both graduate and undergraduate), and University staff. If you are interested, there is a short “tour” of the new site, narrated by our Instruction Librarian, Carrie Forbes.

My point is really that, in order to be successful, especially in a library that hopes to teach research and scholarship skills as well as provide information, one size does not fit all, and there should be as many different library experiences as there are groups we wish to serve. Our next step is to extend the granularity of experience down to the individual, and provide each person (or at least each person who is affiliated with DU) with an experience that is tailored to his or her own interests and experience. I mean, if Amazon and L.L. Bean can do it, why can’t a library?

Digitize First, Catalog Later?

April 1st, 2010 Quantum Archivist No comments

In the digital collection building workshops we do for SAA, we have always emphasized the idea that you should never digitize a collection that isn’t already processed. We generally leave the definition of “processed” a bit vague. At the most basic level, we mean that until you have some organized list of the items that you want to digitize, you shouldn’t start slapping random content on the scanner bed. In practice this meant that you didn’t digitize until you had item-level control of the collection, even if there was only a title without any other descriptive information. The value-added descriptive information is something we would advocate adding as part of the digitizing workflow process.

Now I am beginning to wonder if that idea is not quite as valid for born digital content. Perhaps if we just put the stuff out there with the absolute minimum of control, and let the crowd of interested amateur experts fill in the details beyond what we can derive automatically we might be better off, or at least farther ahead.

For most born digital content I can know a few basic things mostly automatically: where it came from, who created it (sometimes), what it is (document, photograph, moving image, etc.), and its file format (jpg, pdf, mp4, mp3, etc.). I can assign it the few required fields in a management system automatically, with something as basic as the title being simply the file name. Could I then just toss it out there and allow the crowd to fill in the other details?
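As a rough illustration of how little work that automatic step is, here is a minimal sketch that derives such a bare-bones record from nothing but a file name and a stated source. The field names (`title`, `source`, `format`, `type`, `creator`), the `minimal_record` function, and the sample file name are my own assumptions for the example, not the schema of any particular management system.

```python
import mimetypes
from pathlib import Path

# Assumed mapping from the major MIME type to a broad category of material.
BROAD_TYPES = {
    "image": "photograph",
    "video": "moving image",
    "audio": "sound recording",
    "text": "document",
    "application": "document",
}


def minimal_record(path, source):
    """Build the smallest useful record from what the file itself tells us."""
    p = Path(path)
    # guess_type looks only at the file extension; it does not open the file.
    mime, _encoding = mimetypes.guess_type(p.name)
    major = mime.split("/")[0] if mime else None
    return {
        "title": p.name,                                  # title is simply the file name
        "source": source,                                 # where it came from
        "format": p.suffix.lstrip(".").lower() or "unknown",
        "type": BROAD_TYPES.get(major, "unknown"),
        "creator": None,                                  # only sometimes known; left for the crowd
    }


if __name__ == "__main__":
    print(minimal_record("transfers/commencement_1947.jpg", "University Relations"))
```

Everything beyond these fields (who is in the picture, where it was taken, why it matters) is what the crowd would be asked to supply.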

Even if I assume that there are equal parts “Wisdom of the Masses” and “Madness of the Mob” out there, would I get enough good information to make it worth the work of separating the wheat from the chaff?

One argument on the positive side is that, unless you have a very highly focused collection with a very small temporal span, no one organization or institution can possibly have all the expertise to create high quality, in-depth information about all of its collections. And there are a lot of people out there who may know more about the Ukraine, or about DU in the 1940s than the folks here in Denver in the early part of the 21st century.

Could our role as archivists and repository managers be to view and review, rather than to create and catalog?

I don’t think this really can work, or can it?

What if Raymond Loewy Designed Our Access Tools?

March 26th, 2010 Quantum Archivist 2 comments

S-1 Locomotive (Library of Congress via Wikipedia)

Known as the father of industrial design, Raymond Loewy practically invented the look of “modernism” in industrial and consumer products. The iconic S-1 locomotive with its streamlined design became a model for everything from locomotives to automobiles to toasters in mid-century America.

The point is not that we need streamlined access tools (well we DO, but not in this way), but that maybe we should look to industrial designers as inspiration for the design of our access tools as much as we look at information architecture. This thought was inspired by a conversation I had at the recent IMLS WebWise conference here in Denver a couple of weeks ago. Jodi Allison-Bunnell of the Northwest Digital Archives and I were talking about building user interfaces and how the idea of user-centered design could lead to stagnation unless it was possible to translate users’ often unarticulated desires into something completely new. At which point I pulled out my iPhone and said something like “If somebody had asked me what I wanted in a handheld communications device I wouldn’t have described this!” Yet the design of my iPhone (and other smartphones) suits the needs of my mobile information seeking activities very well, even if I couldn’t have explained it to someone ahead of time.

University of Wyoming Libraries web site

Does this mean we should design all of our discovery portals to mimic the experience of my iPhone? Perhaps, perhaps not. I know that there is an entire academic discipline of Human Computer Interaction, and there are Information Architects galore. But maybe we need to broaden our thinking a bit and reach out to people who are not necessarily in the world of information management but are a part of a world that makes useful things elegant as well as utilitarian.  Should I feel a sense of joy or excitement when I use an archival discovery and delivery system rather than just satisfaction that I discovered something? When we designed our access tools we spent a lot of time thinking about the functionality, and by and large we got that right. Maybe we should have taken a bit more time to think about the elegance of the tool as well. Maybe we will pretty soon.

Deliver the Moment

March 12th, 2010 Quantum Archivist 3 comments

As archivists, we are always trying to find the best way to connect to our user community to give them what they want in the best way possible. The idea of quantum archives is to connect people to the content in as granular a way as possible while preserving the opportunity for them to contextualize the content in ways that they want. I was recently involved in a conversation where someone who wouldn’t ever consider himself an archivist put this idea in the most succinct way.

Thought Equity Motion is a for-profit stock footage fulfillment and video asset management service that manages the video libraries of some of the biggest media organizations in the world. They happen to be based in Denver, and I’ve had a couple of opportunities over the past few months to talk with Frank Cardello, the EVP for Corporate Development at TEM. TEM has just launched a joint venture with the NCAA called the “NCAA Vault.” Timed to coincide with the beginning of the annual Men’s basketball tournament, the Vault features “ten years of full games and highlights” of the Sweet 16. As a basketball fan I appreciate this opportunity; as an archivist I am even more impressed with how TEM and the NCAA thought about presenting historical information.

NCAA Vault graphic

A model for archival access?

While I can watch an entire game, I can also use search terms to limit to particular teams, years, and players. There are also some pre-defined categories like “great shots” or “great finishes.” Next, but not finally, you have the opportunity to search (using a text-based search box) through the play-by-play track of the video footage for a particular moment or play within a game. You can select this clip and share it in other applications.
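I have no knowledge of how the Vault is actually built, but the general technique of searching a timed text track to find a moment inside a longer video can be sketched in a few lines. The `Play` record, the `find_moments` function, the five-second lead-in, the twenty-second clip length, and the sample plays below are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Play:
    """Hypothetical play-by-play entry tied to an offset in the game video."""
    start_s: int  # seconds from the start of the video
    text: str     # the play-by-play description


def find_moments(plays, query, lead_in_s=5, duration_s=20):
    """Return (clip_start, clip_end, description) for plays matching every query term."""
    terms = query.lower().split()
    clips = []
    for play in plays:
        description = play.text.lower()
        if all(term in description for term in terms):
            # Start slightly before the play so the clip has some context.
            start = max(0, play.start_s - lead_in_s)
            clips.append((start, start + duration_s, play.text))
    return clips


# Invented sample track for one game.
GAME = [
    Play(3, "Smith makes three point jumper"),
    Play(100, "Jones misses layup"),
    Play(2395, "Smith makes layup at the buzzer"),
]

if __name__ == "__main__":
    for start, end, text in find_moments(GAME, "smith layup"):
        print(f"{start}-{end}s: {text}")
```

The point for archives is that the unit being delivered is the clip, not the game: the text track is what makes the moment findable.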

Frank said that the idea behind this approach was that people initially don’t want to watch the entire game, they want to “experience the moment” and share that moment with others. It was the purpose of the Vault to allow people to experience the moment.

Although he was talking about entertainment consumers, I thought that this was an interesting way to view all types of historical research. Researchers seldom want everything in a collection or a book, but those “moments” that help them prove their points, support their thesis or just inform themselves. This seems to me to be the essence of quantum archives, to reduce archival material to a collection of “moments” that can be used, shared, and re-used both in ways that we define–the pre-defined “great shots”–and the unexpected ways that result from users making their own moment out of a larger whole.

I’d like to coin a new phrase that I think I’ll add to the next version of the Quickstart Guide. It is “Deliver the Moment.” It simply means that we can manage our content according to traditional principles, but always seek to deliver that content in ways that resonate with our users.

I don’t know how scalable this idea is in terms of delivering real-life archival access. The NCAA Vault, for now, focuses on just one sport (Men’s basketball), in a very short time frame (10 years), and over a very limited scope (the last three rounds of the annual tournament). Given these limited parameters it is relatively easy to craft a satisfying user experience based on the principle of delivering the moment. There are plans to add more sports and a greater time span. I’m rooting for them.

Distributed Cataloging and the Semantic Web

March 9th, 2010 Quantum Archivist 2 comments

In the first couple of Harry Potter books, the editions that were offered for sale in the United States were Americanized versions of the original works. What was a “jumper” in the original became a “sweater” in the US version. Lorries became trucks, boots became trunks, etc. Even the title of the first book was changed to suit the American audience. Once the books became a world-wide phenomenon, everyone was presumably familiar with Britishisms and the practice stopped, I believe.

This is an interesting and possibly significant issue as we begin to develop our distributed cataloging project for the work of Semyon Fridlyand. Will we need to develop a semantic thesaurus of some kind that will help us bridge the gap between how we think about and name things and how others do? Adding to the dilemma is the fact that we will also be dealing with multiple languages and even multiple alphabets.
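To show what I mean by a thesaurus that bridges naming differences, here is a minimal sketch: each concept has an identifier, and every variant term, whatever its dialect or alphabet, points back to that identifier. The concept identifiers, the tiny vocabulary (including the Russian equivalents), and the function names are assumptions made up for the example; a real project would need a far larger, curated vocabulary.

```python
import unicodedata

# Hypothetical thesaurus: concept identifier -> variant terms contributors might use.
THESAURUS = {
    "garment:sweater": ["sweater", "jumper", "свитер"],
    "vehicle:truck": ["truck", "lorry", "грузовик"],
}


def normalize(term):
    """Normalize Unicode form and case so variants compare reliably across alphabets."""
    return unicodedata.normalize("NFKC", term).casefold().strip()


# Reverse index: normalized variant -> concept identifier.
LOOKUP = {
    normalize(variant): concept
    for concept, variants in THESAURUS.items()
    for variant in variants
}


def concept_for(term):
    """Return the concept identifier for a term, or None if it is not in the thesaurus."""
    return LOOKUP.get(normalize(term))


def same_concept(a, b):
    """True when two terms are known variants of the same concept."""
    concept_a, concept_b = concept_for(a), concept_for(b)
    return concept_a is not None and concept_a == concept_b


if __name__ == "__main__":
    print(same_concept("Jumper", "sweater"))
    print(concept_for("грузовик"))
```

A flat lookup like this ignores context (a word can mean different things to different audiences), which is exactly the part that makes the real problem hard.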

At the WebWise conference last week, I heard Monika Hagedorn-Saupe of Europeana, the EU’s aggregator of digital libraries. They are dealing with a huge alphabetic, semantic, and language issue and are developing a semantic search engine that you can test. I think it has promise, and I’m hoping to find out more about the project and will report it here.

The concept of the semantic web has been around for a number of years, and for at least 10 years we’ve been hearing how the semantic web would change the way we use the web. The automatic linking of similar ideas, even if those ideas are not specifically indicated in the resource, has been something of a holy grail for information professionals since the digital age began and we realized that it would be impossible to maintain metadata about digital content in the way that we did for analog content.

Finding a way out of our semantic/language/alphabet dilemma is going to be a bigger deal than we had originally thought when we came up with this idea.

From Being to Becoming: Archivists Confront the Twentieth Century

February 22nd, 2010 Quantum Archivist No comments

Ten years into the twenty-first century we are beginning to see a movement among archivists to move forward into the twentieth century. All this really means is that archivists are beginning to understand the balance between being and becoming. The idea that constant change, balanced and tempered by a consistent theoretical foundation, might just be the roadmap for the profession is slowly permeating the ethos of more “modern” or forward-thinking archivists.

Howard Besser says that most new technology is used at first to mimic the old ways in a new form. “The conceptual steps [of technology development] typically include first trying to replicate core activities that functioned in the analog environment.”  So it stands to reason then that before we could invent new forms of access we had to re-invent the paper finding aid in the form of EAD.

However, it seemed that in doing so we raised the finding aid, rather than access to the material itself, to the level of an objective and actually prevented archivists from providing good service in the Internet environment. Before the age of computers and digitization,  there really was no point in providing highly granular content information to users. You still had to come to the repository and interact with the content in ways that did not disrupt the physical order of the boxes and folders. This filing system approach, while efficient and effective in its time and place, was a barrier to use. Everyone understood this, but no one had any real idea of how to do it better or differently. Thus, archivists became the interpreters of collections, a kind of human finding aid.

As with any bureaucratic system (and I mean this in the most positive possible sense), once it was devised, a class of apparatchiks grew up to tend the system and became vested in its perpetuation. The essentially conservative nature of archives also contributed to the idea that the finding aid was sacred and that the archivist as gatekeeper was the best possible way to provide service. I’ll wager that this access method was never satisfactory to the general user, but then, only “serious” researchers were supposed to use archives anyway.

I don’t want to say that the parents of EAD didn’t do creative work. But they were working within a context of thinking from which they were unable to break free. It would have been surprising if they had, and if they did, perhaps no one would have listened to them anyway.

My own introduction to formal archival education came just before the introduction of EAD, and since I came from a research and teaching background, my idea of what an archives could or should be was based on user-centered ideas (although I didn’t know that term at the time). I wanted to get to the “stuff” and draw my own conclusions; after all, that’s what I was there for. Most of my work when I was a classroom teacher centered on teaching with primary sources. It was always surprising what a group of students would see in a set of documents that I had never seen.

When I crossed over from researcher/teacher to service provider, I always believed that keeping the researcher as close to the content as possible was the most important thing an archivist could do. Give them the stuff and get out of the way!

Although it is obviously not quite as simple as that, we do have the capability to do this now. The Internet and digitization make this all possible. But just how will we do it? By constantly trying new ways to present our archival material.

A couple of years ago, I gave a presentation at the SAA meeting in Chicago that I called “Where Have All the Binders Gone?” That introduced an idea that we should try to manage and provide access as close to the content as possible. This later evolved into the theory of quantum archives and is the inspiration for this blog.

The Essence of Self-Government is Information

February 18th, 2010 Quantum Archivist No comments

With that statement from George Mitchell as a governing principle, I set out  in 1994 to process 1,000 linear feet of papers from the former Senate Majority Leader. They came in a truck like the ones they use to move households. I had never processed anything on the scale of this collection or of this complexity. It challenged me to think differently about processing and access.

George Mitchell web site, 1999


The first thing we did was to think about a productivity approach to processing, although in a very paper-based way. We used a primitive, but effective, database system to manage the series and folders and, using the “report writer” function, planned to create an electronic finding aid on the College’s gopher. (Anyone remember gopher?)

Well, the web exploded onto the scene not too long after we started, and to our good fortune but not surprise, we found that with just a little adjustment to our report templates we could export HTML pages from our database. In the true fashion of reinventing the past in a new technology, we created a finding aid for the collection in a few short weeks that looked suspiciously like a paper finding aid in its construction and organization. We didn’t really know what to do with this new thing, but we knew we had to be there. So we were on the web and we had pictures, and video! Even then we were exploring the potential of the web for organizing and reorganizing information. We had a photograph “database” that was really just a categorized alphabetical list of digitized photos. We believed in searching and indexing, but didn’t have the tools in place to be able to do it, so we faked it. Similarly, the “menu” system on the left side of the finding aid is not dynamically generated, but is a set of images hard-coded into every page. We could imagine what we wanted to do, but didn’t have the tools or the expertise to do it.

Somewhat to my astonishment, more than 10 years later, this finding tool is still available on the web as part of a larger project to document the former Senator’s career. Take a minute to visit the George J. Mitchell Papers at Bowdoin College for a look at the past envisioned as the future. Good enough for its time, and a beginning of understanding the power of this new thing called the World Wide Web. It is also a story of attempting, but not completely succeeding, to think outside the box. Even though many of the elements of what would become quantum archives were there for us, we just didn’t have enough experience to see it then.

p.s. Another round of thanks to Eliot Wilczek and Calley Gurley who embraced and supported the experiment. Both of them went on to careers in archives in other institutions. You were wonderful people to work with.

The Quick Start Guide to Becoming a Professional Archivist

February 15th, 2010 Quantum Archivist No comments

When we were first developing a productivity-based processing workflow system for the Digital Collections and Archives at Tufts University, we had a whiteboard on which we wrote motivational phrases that reminded us of the things that were important for us to remember. These guiding principles were later codified into what we called the “Quickstart Guide to Becoming a Professional Archivist.” It had two sections, one on archival principles and one on attitudes about processing. We used the Quickstart Guide as an introductory and training tool for new staff members.

The Guide introduced concepts like “lumpers vs. splitters” and “ruthless efficiency and dogged persistence” as ideas related to archival processing, as well as asking more philosophical questions about the role of the archivist in creating knowledge.

Back then the Quickstart Guide was mostly focused on processing paper records. As time went on and I began to use the Quickstart Guide as a teaching tool, I realized that in the born digital age, processing had changed significantly and that the old Guide was a bit out of touch. For example, the original Guide emphasized that good archival description proceeded from the General to the Specific and moved down that continuum as time and resources allowed. Quantum Archival theory turns that idea on its head, and says that good archival description focuses on specifics first and moves to generalities as time allows.

So I went back and revised it for the digital world. The result is the Quick Start Guide 2.1.

The Quick Start Guide, 2.1

The key change was to emphasize that “management is not access.” That is, the way we manage our collections is not necessarily (or even desirably) the way we want users to access our collections. The ability to separate management from access is one of the key values of digitized and born digital archival content.

The Quick Start Guide remains a central statement of what I consider to be “good” archival attitudes. It is the first thing I teach in my classes.