December 12, 2012

Representation of recordings through Annotation

Our practice of oral history content management, which we often refer to as digital indexing, began by questioning the assumption that recordings must be transcribed word for word before they can be used. In the database-driven environments we work in, we much prefer summary annotations to full transcription. The work of the annotator is at the heart of digital indexing, though we continue to reiterate that full transcription is always an option if the need is there and time and resources allow. Here are some random musings about annotation...

  • Annotation is about representing what is on a recording in a text format. In this sense, it is no different from a transcription.
  • An annotation needs to describe a passage of audio well enough to lead a user to that passage. Defining the users well may be as important as, or more important than, the specificity with which the annotation represents the passage.
  • An annotation can be enhanced by using strategic vocabulary words within the prose of the annotation. Full-text searches will then get hits on that digital object (the passage of audio or video).
  • All annotations are subjective, and that is totally okay. It is all part of recognizing, defining, and composing toward an audience of users. Our subjectivity saves them time.
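
To make the points above concrete, here is a minimal sketch in Python of how a summary annotation might be tied to a passage of a recording and found through a full-text search. Everything in it is an assumption made for illustration: the `Annotation` record, its field names, the `search` helper, and the sample annotations are hypothetical and do not describe any particular database or product.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A summary annotation attached to one passage of a recording (hypothetical schema)."""
    recording_id: str
    start_seconds: int
    end_seconds: int
    text: str


def search(annotations, query):
    """Return annotations whose text contains every word of the query (case-insensitive)."""
    terms = query.lower().split()
    return [a for a in annotations if all(t in a.text.lower() for t in terms)]


# Made-up sample annotations; the second one deliberately uses the
# vocabulary words a user would be likely to type into a search box.
annotations = [
    Annotation("interview-01", 0, 95,
               "Narrator describes childhood on a family farm and early schooling."),
    Annotation("interview-01", 95, 240,
               "Recalls salmon fishing and subsistence practices along the river."),
]

hits = search(annotations, "salmon fishing")
for hit in hits:
    # The time codes are what lead the user to the passage itself.
    print(hit.recording_id, hit.start_seconds, hit.end_seconds)
```

The search never touches the audio. It only matches words in the annotation text, which is why the choice of vocabulary in the prose matters so much.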

December 4, 2012

Multi-Dimensional Indexing: A Dynamic Process

Traditional cataloging and indexing are typically bounded in scope, as in the case of a collection catalogue or an index developed for a particular book. But oral history collections and other digital collections are often associated with active projects that grow over time. If the indexing process is strongly content-informed, and the content is changing over time, then the indexing process must be not only iterative but dynamic as well.

What does this mean for our controlled vocabulary development process? If an index is to capture the breadth and diversity of a collection, then the more content that contributes to its development, the better. But how much of the content should inform the index, and when? Is it okay not to reevaluate the earliest indexed material and recode it with the evolving framework?
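
One way to keep that last question answerable, offered only as a sketch, is to record which version of the controlled vocabulary each passage was coded against, so that earlier material can at least be identified later. The `IndexedPassage` record, the `vocabulary_version` field, and the `needs_review` helper below are hypothetical names chosen for illustration, and the sample passages are made up.

```python
from dataclasses import dataclass


@dataclass
class IndexedPassage:
    """A passage plus the vocabulary version it was coded with (hypothetical schema)."""
    passage_id: str
    terms: set
    vocabulary_version: int


def needs_review(passages, current_version):
    """Return passages coded under an older version of the controlled vocabulary."""
    return [p for p in passages if p.vocabulary_version < current_version]


# Made-up sample data: an early passage and a recently indexed one.
passages = [
    IndexedPassage("interview-01/0-95", {"farming"}, 1),
    IndexedPassage("interview-07/30-210", {"farming", "land tenure"}, 3),
]

stale = needs_review(passages, current_version=3)
for passage in stale:
    print(passage.passage_id, "was coded with vocabulary version", passage.vocabulary_version)
```

This does not settle whether the early material should be recoded. It only makes the backlog visible, so the decision can be weighed against available time and resources.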

