Distributed Cataloging and the Semantic Web

In the first couple of Harry Potter books, the editions sold in the United States were Americanized versions of the original works. What was a “jumper” in the original became a “sweater” in the US version. Lorries became trucks, boots became trunks, and so on. Even the title of the first book was changed to suit the American audience. Once the books became a worldwide phenomenon, everyone was presumably familiar with the Britishisms, and I believe the practice stopped.

This is an interesting and possibly significant issue as we begin to develop our distributed cataloging project for the work of Semyon Fridlyand. Will we need to develop a semantic thesaurus of some kind that will help us bridge the gap between how we think about and name things and how others do? Adding to the dilemma is the fact that we will also be dealing with multiple languages and even multiple alphabets.

At the Web Wise conference last week, I heard a talk by Monika Hagedorn-Saupe of Europeana, the EU’s aggregator of digital libraries. They are dealing with huge alphabetic, semantic, and language issues and are developing a semantic search engine that you can test. I think it has promise, and I’m hoping to find out more about the project and will report on it here.

The concept of the semantic web has been around for a number of years, and for at least 10 years we’ve been hearing how it would change the way we use the web. The automatic linking of similar ideas, even if those ideas are not specifically indicated in the resource, has been something of a holy grail for information professionals since the digital age began and we realized that it would be impossible to maintain metadata about digital content in the way that we did for analog content.

Finding a way out of our semantic/language/alphabet dilemma is going to be a bigger deal than we had originally thought when we came up with this idea.

  1. Nathan
    March 9th, 2010 at 16:41 | #1

    This is something we handle pretty frequently in the linked data sub-genre of the semantic web; in short, we just @lang tag every name. So, for instance, if you look up London on DBpedia you’ll find the name for London in virtually every language; further adding en-US vs en-GB really wouldn’t be that big a chore imho :)

    when it comes to translating from extracted chunks of text though... well, that may be a bit trickier to detect, especially as many mix up en-US and en-GB so much (with z/s swapping etc.)

    nice article :)

  2. March 9th, 2010 at 19:33 | #2

    @Nathan
    I confess I know very little about this (but will soon learn more). Thanks for the starting point for my education. Cheers.
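
To make the language-tagging idea from comment #1 a little more concrete, here is a minimal sketch in plain Python. It is only an illustration, not how DBpedia or Europeana actually store their data: the concept identifiers, the `LABELS` table, and the `label_for` and `find_concepts` helpers are all hypothetical names made up for this example. Each concept carries one label per language tag, so “jumper” and “sweater” point at the same thing, and a Cyrillic label sits alongside them.

```python
# Hypothetical concept identifiers and labels, for illustration only.
# Each concept maps language tags (en-GB, en-US, ru, ...) to a label.
LABELS = {
    "concept:knitted-pullover": {
        "en-GB": "jumper",
        "en-US": "sweater",
        "ru": "свитер",
    },
    "concept:goods-vehicle": {
        "en-GB": "lorry",
        "en-US": "truck",
        "ru": "грузовик",
    },
}


def label_for(concept, lang, default="en-GB"):
    """Return the label for a concept in the requested language.

    Tries the exact tag first, then any label that shares the same
    primary language (so "en-AU" can fall back to an "en-*" label),
    and finally the default tag.
    """
    labels = LABELS[concept]
    if lang in labels:
        return labels[lang]
    primary = lang.split("-")[0]
    for tag, text in labels.items():
        if tag.split("-")[0] == primary:
            return text
    return labels[default]


def find_concepts(term):
    """Return the concepts that have `term` as a label in any language."""
    wanted = term.casefold()
    return [
        concept
        for concept, labels in LABELS.items()
        if any(text.casefold() == wanted for text in labels.values())
    ]


if __name__ == "__main__":
    print(label_for("concept:knitted-pullover", "en-US"))
    print(find_concepts("Lorry"))
```

With a table like this, a cataloger who types “lorry” and one who types “truck” both land on the same concept, which is the bridging behavior the post asks about. As the comment notes, the harder part is deciding which dialect a chunk of free text is written in to begin with.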
