Wednesday, December 12, 2012

Representation of recordings through Annotation

Our practice of Oral History content management, which we often refer to as digital indexing, began by questioning the assumption that recordings must be transcribed word for word before they can be used.  In the database-driven environments we work in, summary annotations are much preferred to full transcription. The work of the annotator is at the heart of digital indexing and we continue to reiterate that full transcription is always an option if the need is there and time and resources allow. Here are some random musings about annotation...

  • Annotation is about representing what is on a recording in a text format.  In this sense, it is no different than a transcription.
  • An annotation needs to describe a passage of audio well enough to lead a user to that passage. Defining the users well may be as important as, or more important than, the specificity with which the annotation represents the passage.
  • An annotation can be enhanced by using strategic vocabulary words within the prose of the annotation. Thus full text searches will get hits on that digital object (passage of audio or video). 
  • All annotations are subjective, and that is totally okay. It is all part of recognizing, defining, and composing toward an audience of users. Our subjectivity saves them time.
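As a rough illustration of the point about strategic vocabulary and full-text search, here is a minimal Python sketch. The `Annotation` record, its field names, the sample passages, and the `search` helper are all hypothetical, invented for this example; they do not describe any particular database or the tools we actually use.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One summary annotation tied to a passage of a recording (hypothetical schema)."""
    start_seconds: int
    end_seconds: int
    text: str


def search(annotations, query):
    """Return annotations whose text contains every word of the query, ignoring case."""
    words = query.lower().split()
    return [a for a in annotations if all(w in a.text.lower() for w in words)]


# Invented sample annotations; the vocabulary words ("labor unions", "strikes")
# are placed deliberately in the prose so a full-text search can find the passage.
annotations = [
    Annotation(0, 95, "Introduces herself and describes growing up on a dairy farm."),
    Annotation(95, 240, "Recalls a winter blizzard; snow removal and school closings."),
    Annotation(240, 410, "Talks about her first job at the steel plant; labor unions and strikes."),
]

hits = search(annotations, "labor unions")
for hit in hits:
    print(f"{hit.start_seconds}s-{hit.end_seconds}s: {hit.text}")
```

The search here is plain substring matching, which is enough to show why word choice in the annotation matters: a passage can only be found by the words the annotator chose to write.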

Tuesday, December 4, 2012

Multi-Dimensional Indexing: A Dynamic Process

Traditional cataloging and indexing are typically bounded in scope, as with a collection catalogue or an index developed for a particular book. But oral history collections and other digital collections are often associated with active projects that are growing over time. If the indexing process is strongly content-informed, and the content is changing dynamically over time, then the indexing process must not only be an iterative one, but a dynamic one as well.

What does this mean for our controlled vocabulary development process? In order to begin to capture the breadth and diversity of a collection via an index, the more content used to contribute to its development, the better. But how much of the content should inform the index, and when? Is it okay not to reevaluate the earliest indexed material and code it up with the evolving framework?

Wednesday, November 28, 2012

Multi-Dimensional Indexing: An Iterative Process

Developing and refining a catalogue or index is an iterative process that includes cycles of brainstorming, organizing, arranging, testing, reorganizing, editing, and publication. Iterative means we expect to go back and adjust work done earlier in the process, informed by things we learn later in the process. This may seem inefficient compared to other types of work. But indexing is more like writing or music composition, where the goal is quality content and the product comes as a result of a number of drafts. Although some brilliant artists create and compose well spontaneously, others require multiple revisions of initial drafts. In indexing, we behave more like the latter artists, significantly revising our first attempts based on an evolving conception of what's important or on feedback from an audience.

Thursday, October 25, 2012

Digital Humanities Discussion

In November I'll be attending the American Studies conference in San Juan and will be on a panel talking about Digital Humanities in graduate education. Although our primary identity at Randforce is more in Digital Indexing than digital humanities, we've worked with people like Mark Tebeau at Cleveland State and folks at the Center for History and New Media at George Mason University--who run THATcamp and have developed Omeka--and generally have a lot to say about this diverse and emerging field. As part of the ramp-up to the conference I've been invited/encouraged to post and comment on this blog:

Digital Dimensions of Grad Ed in Am Studies

Check it out to join or view the discussion... The session is co-sponsored by the Graduate Education Committee and the Digital Humanities Caucus of the American Studies Association. Thanks to Rob Snyder for inviting me to join that panel!

Thursday, August 30, 2012


A producer may be involved at any stage of an oral history digital indexing project, and can focus on concise published content that features parts of a collection even as comprehensive annotation and indexing proceeds. Production can begin immediately after an interview is recorded or long after it was conducted. Naturally, finding the best segments to produce from older footage is more convenient when the content is annotated and/or indexed.

Producers work in a variety of ways, but fundamentally their role is to identify content they wish to work with, redact that content as part of an editorial process, and reproduce the material in edited videos, podcasts, segments for radio, etc. The producer uses the content of the audio/video resource, brings additional research and vision, and adds other historical recordings, music, images and other documents to create palatable products for audiences. Annotation and indexing are crucial for creating access to oral history audio and video, but production and publication are important for spreading awareness of and exposure to the collection. Even as we attempt to preserve and create better access to oral history collections, ultimately we want to see them used for the education and enjoyment of others, and producers are crucial players in that effort.



As with any published document, an editor is needed to assure consistency and quality. An editorial role is also important to resolve inconsistencies that may emerge between different annotators of the audio and video by dictating a style and giving feedback until the desired “voice” is achieved. The work may be as much managerial in nature as it is an editing role. In general, the editor should be the master of all the text created in the annotation/indexing process. The editor would not be expected to have heard every minute of the original audio or video, but they would be expected to have read most or every word annotated within the collection. In a larger project, the editor might delegate editorial tasks to trusted personnel, but there should be one person at the top of the hierarchy who is ultimately responsible for all published content.

For newly developed controlled vocabulary, there is also an editorial role associated with approving new terms. In the context of a custom/local controlled vocabulary being developed from scratch, the editorial process occurs as proposed terms are agreed upon and finalized.  In the case of updating, amending or expanding on an existing standard or local controlled vocabulary, content management systems like CONTENTdm have features that cue and allow a librarian to approve specific terms that have been added. In that case, the librarian who has the authority to approve or disapprove of a term being added is acting as an editor as well.

Thursday, August 23, 2012


The role of indexer entails two parts of the indexing work. The first is developing the index, i.e., brainstorming and organizing the customized controlled vocabulary to be used. The second is actually applying the index to the content, which could be referred to as coding. As with all the other roles discussed here, this work may all be done by the same person.

Developing an index, or more specifically the controlled vocabulary used as the index, may include contributions from anyone familiar with the content of the oral history interviews and the subject area in general. The annotator is typically best equipped to provide the most specific and topical input relative to the material that has been annotated. However, the controlled vocabulary is based not only on the very specific content of the collection but on outside factors as well. Existing thesauri such as TGM and LCSH can be drawn upon to develop the controlled vocabulary. Also, the users--the anticipated audience of the collection--should be explicitly defined and considered when choosing terms (e.g., a local term for an object might be more appropriate than what LOC calls it). Developing a controlled vocabulary is particularly challenging in an on-going project, as the terms chosen and the architectural structure of terms (e.g., hierarchy) will necessarily change as more material is added. Ideally, the controlled vocabulary is developed after all interviewing and annotation are complete, though sometimes this is not possible. Developing the controlled vocabulary works best as a collaborative, iterative process (including drafting, debating, and test application) aimed at a comprehensive taxonomy that represents the whole collection.
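To make the idea of a hierarchical controlled vocabulary that changes as material is added a little more concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the terms are invented (not taken from TGM or LCSH), and the `add_term` and `broader_path` helpers are hypothetical, not features of any system mentioned in this post.

```python
# A hypothetical local controlled vocabulary: each term maps to its broader
# (parent) term, or to None if it sits at the top of the hierarchy.
vocabulary = {
    "Work": None,
    "Steel industry": "Work",
    "Labor unions": "Work",
    "Strikes": "Labor unions",
}


def add_term(vocabulary, term, broader=None):
    """Add an agreed-upon term, refusing a broader term that is not yet in the vocabulary."""
    if broader is not None and broader not in vocabulary:
        raise ValueError(f"Unknown broader term: {broader}")
    vocabulary[term] = broader


def broader_path(vocabulary, term):
    """Return the term followed by each successively broader term, up to the top level."""
    path = [term]
    while vocabulary[path[-1]] is not None:
        path.append(vocabulary[path[-1]])
    return path


# As new interviews are annotated, a new, narrower term gets proposed and agreed upon.
add_term(vocabulary, "Picket lines", broader="Strikes")
path = broader_path(vocabulary, "Picket lines")
print(" > ".join(reversed(path)))
```

A structure like this makes it cheap to add or move terms during the drafting and debating cycles, which is where most of the real work happens.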

Applying the controlled vocabulary is another task of the indexer, or more specifically the coder. Coding is sometimes done by more than one person, and often by someone other than the person who originally composed the annotation. Frequently, the coder applies the controlled vocabulary based on the annotation summary only, not by actually listening to the original recorded passage, which has certain advantages and disadvantages. The key challenge with this job is maintaining a reasonable amount of “intercoder reliability,” i.e., that two indexers/coders working independently will assign the same vocabulary term to the same passage. Some inconsistency and subjectivity are expected in this process, just as two people would not index a book in exactly the same way. Ideally, one person, such as a lead indexer or the editor, should oversee the indexing/coding and develop some means of quality control to assure a reasonable amount of consistency.
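One possible way to put a number on intercoder reliability is sketched below in Python. It computes simple percent agreement and Cohen's kappa, a common chance-corrected agreement statistic. This is an illustration rather than a description of our own quality control: the sample codes are invented, and the sketch assumes each coder assigns exactly one term per passage, which is a simplification since a passage often receives several terms.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of passages to which both coders assigned the same term."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of passages")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same term independently.
    expected = sum(counts_a[t] * counts_b[t] for t in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical term throughout
    return (observed - expected) / (1 - expected)


# Invented example: the term each coder assigned to the same six passages.
coder_a = ["Farming", "Weather", "Labor", "Labor", "Family", "Weather"]
coder_b = ["Farming", "Weather", "Labor", "Family", "Family", "Weather"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"Percent agreement: {agreement:.2f}")
print(f"Cohen's kappa: {kappa:.2f}")
```

What counts as a "reasonable amount" of agreement is still a judgment call for the lead indexer or editor; a figure like this only helps spot where coders are drifting apart.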


Tuesday, August 21, 2012


Annotation is a key step upon which indexing is built, and the role of the annotator is to link the content of the audio or video with meaningful text. Annotation is itself a form of indexing (creating text that directs a user to content of interest) that simply takes a linear form, similar to a transcript. The basic skill is listening to a recording and composing text that summarizes the content. Although annotation seeks to lighten the tedious burden of word-for-word transcription, it still takes significant time to complete (at least 1 ¼ hours per hour of interview, and sometimes 2 hours or more if a great deal of detail is desired). Annotation proceeds faster with practice and with increasing familiarity with the content. Familiarity with the content from the outset (whether because of personal background/interest or because the annotator conducted the interviews) is typically an additional advantage. Because annotation is the foundation upon which the indexing is built, consistency of style, density, and overall quality of annotation is strongly recommended. Generally, annotations that are shorter but consistent are better than a highly varied collection of sparse and detailed annotations written by different people in different styles. When there is more than one annotator, an editor is an essential player in the indexing process.

Thursday, August 16, 2012


Digital Indexing of oral histories begins with the work of the interviewer. In an ideal digital indexing project, the role of the interviewer may continue throughout the process of indexing, and in some cases right through to editing and production. In other cases, the interviewer cannot be involved, for example in older, archival collections. In any oral history, a key responsibility of the interviewer is to know their subject and to have conducted thorough background and contextual research. An interviewer is by default a steward of the recordings, and is thus on some level responsible for seeing through certain follow-up steps after the recording is made. In a project that includes indexing, an interviewer is well positioned, because of their familiarity with the content, to be an annotator and indexer of the material they recorded. This is not imperative, and sometimes has the downside that they are not as objective as a less familiar annotator might be. However, the original interviewer, especially shortly after the interview was conducted, has the potential to annotate material most efficiently, taking advantage of their own memory of the recorded event. When someone besides the interviewer does the annotation, they have the disadvantage of being unfamiliar with the content.

Friday, June 1, 2012

Oral History Digital Indexing Roles

Digital Indexing work is more akin to art than science. Accordingly, breaking down the process into a series of universally applicable steps or elements is challenging. This is both because of the nature of the work and because every project we work on is different. Our clients’ placement on the digital spectrum varies--ranging from brand new oral history projects where not a single recording has yet been made to well-developed projects with large amounts of digital material that need to be multi-dimensionally indexed. Breaking projects into phases, tasks, or other elemental organizational schemes can be done, but is not necessarily the most useful organizational indexing schema for this type of content (if you know what I mean).

The indexing process can be described by the roles of the people involved. The roles described are not mutually exclusive, but they do comprehensively cover the phases of work necessary to get a project from beginning to end. The following titles for the various “roles” are conceptual only. In some smaller projects, one or two people take on all of the roles. In other projects, several people may take on a single role (for example, where volunteers are organized to create annotations). These roles can be filled by people within or outside of an organization, be paid or volunteer positions, and engage highly skilled and knowledgeable people, or not. They are presented here as a basic guideline of “who and what” is needed within an annotation/indexing project, and each role will be described in more detail in its own post.

Tuesday, April 24, 2012

Browsing vs. Searching: An "analogue" analogue

From NPR's Eulogy for a Record Store, a great quote about the merits of record store browsing:

"Browsing is a form of learning that is intense but indirect," Wieseltier says. "It's learning by osmosis, by serendipity. It's being surprised because when you browse, you don't know what you're looking for. You're not indulging in 'search.' You're basically making a bet on the richness of the world and immersing yourself in it and coming away with something that you've discovered. ... Look, we grow by discoveries. We don't grow by what we already know. But for these interventions and revelations and illuminations, we would be only what we already are. Who wants to be that?"

Special thanks to for pointing out this link!

Wednesday, April 4, 2012


It seems I’m getting behind on my posts…. Or actually, it seems I was getting ahead of myself.

I want to talk about a concept I’ve been dabbling with for a couple of years. I call it “inliteration”. It’s a word I made up to capture the essence of what it means to make up a word. Inliteration is similar to “incarnation”, except instead of meaning to become “embodied” in a thing that is physical, it is limited to taking form as words.

Here’s Merriam-Webster’s definition(s) of incarnation:

So to parallel those definitions, I propose:


1a (1): the embodiment of a deity or spirit in a word or group of words in human language (2): the union of concept with language analogous to the union of divinity with humanity in Christianity b: a quality or concept definable as a word or group of words

2: the act of inliterating: the state of being inliterated

3: language

This is a cut at a definition… enough, I think, to be able to refer to it for other discussions. I’ve found it a handy word to have in conversation with a small group of friends who talk about library science, indexing, and life philosophy in general.

Web Site Copyright © 2011 The Randforce Associates, LLC