What is tantalizing about the proverbial task of finding the needle in the haystack is that you are assured the needle is there. The space is restricted, the object is unique, it must be possible to find it. And no doubt, with enough care and patience it could be found with just a pair of eyes and a pair of hands.(1) Archival research closely parallels that proverbial haystack: it suggests the large quantity of unorganized materials one must mine to find the useful nugget of information.
In the age of Infoglut, the haystack is not static; the amount of information to search through grows faster than one can sort through it. The last time I checked, more than 11 billion photographs were made in the United States each year, and that was in 1981. Mass media is relying more and more on visuals for news and entertainment. Publishers that were once satisfied with a few photographs for a book now are asking for thousands for Multimedia/CD-ROM products. The enormous number of images coupled with the high demand for visual information make Infoglut seem imminent for graphic archives.
I'd like to offer a somewhat contrary opinion that Infoglut is merely a hip term for an old problem. The proliferation of computers and the growth of electronic information on the Internet have not increased the amount of information so much as they have increased the availability of information that already exists in another form.(2)
Infoglut is just old hay in new stacks. The fundamental problem of managing large quantities of information remains the same. Archivists have been dealing with Infoglut for years. It wasn't an electronic glut, but it was just as overwhelming. Acquiring a few hundred items may represent a very good year for an art museum, but it's peanuts by comparison to archives which may acquire a single collection containing hundreds of thousands of items. Arizona State University has been designated as the repository for the records of American Continental Corporation; more than 7,000 linear feet of records. Paper records. If more than a mile of records isn't glut, I don't know what glut is.
Although many people fear being overwhelmed by electronic information in the future, others already see themselves overwhelmed with hardcopy records. In particular, many repositories find their photographic collections in a quagmire. It is hardly unusual for repositories to have ignored the photographs that came into their collections over the years; with the increased use of visuals in the mass media and the rise of CD-ROM "edutainment," these repositories find they need to get control of their graphic collections quickly.
Many archives have looked at computers as a magical solution to their problems of collections management; they dumped information about their collections into a computer, only to discover that they had far from solved their problems--they had automated them, so that they could make more powerful mistakes even faster!
In spite of some notable failures, automation has had a positive effect on archival practice. Archivists now have a much better understanding of the underlying principles of descriptive practices and the needs of their patrons. The two fundamental principles of archives have been reaffirmed: provenance and original order preserve important contextual information and offer basic access points. We've also learned some limits to those principles: we need to provide access to the secondary value of collections and beef up finding guides so that they are sufficient to aid patrons in selecting materials.
Traditional archival practices have benefited from these insights to produce even more effective strategies for gaining control over Infoglut so that our collections are accessible to researchers. When we have mastered those strategies, we can automate them to take advantage of the computer.
Museums and libraries are used to working with item-level control; overcoming the bias towards item-level control is an enormous challenge, especially to those accustomed to it. If you have a small enough collection or a large enough staff, item-level control is practical. Otherwise, if you're suffering from Infoglut, forget item-level control except as a long-term goal that may never be attained.
To gain control of your holdings, make sure you have basic information about your collections. One effective method to capture this information is a retrospective accessioning project. Make a list of every collection in the repository, capturing the storage location, provenance, and a very brief description.
Over a period of time, add additional information about the collection. Work in "layers," capturing one or two pieces of information during successive surveys. Many people try to capture complete information for each collection before moving on to the next; however, this means many collections have no control while others have exhaustive control. The layered approach spreads access and control evenly through the repository. The layered approach has a built-in editing process, providing a chance to check previously recorded information in successive sweeps. Over time, create a complete accession record for each collection in pieces, recording:
Next, search administrative files, finding guides, and other documents, comparing your inventory against the records. The files may reference collections you can't locate; these often turn out to be collections which you have misidentified or left unidentified. Some collections may have no documentation; take this opportunity to begin a collections file.
Going through the administrative files corrects and supplements the shelf inventory. Collections unidentified on the shelves may be identified in the files. The files often contain accession information not found with the materials, including the immediate source of acquisition and restrictions on the use of the materials.
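The comparison of shelf inventory against administrative files can be sketched as a simple set comparison. This is only an illustration: the function name `reconcile`, the record layout, and the sample collection names are hypothetical, and it assumes both lists already use the same name for the same collection, which in practice is exactly what the survey has to establish.

```python
def reconcile(shelf_inventory, administrative_files):
    """Compare collections found on the shelves with those named in the files.

    Both arguments map a collection name to a dict of whatever is known
    about it (hypothetical layout, for illustration only).
    """
    on_shelf = set(shelf_inventory)
    in_files = set(administrative_files)
    return {
        # Found on the shelf and documented in the files.
        "documented": sorted(on_shelf & in_files),
        # Referenced in the files but not located; possibly misidentified.
        "not_located": sorted(in_files - on_shelf),
        # On the shelf with no documentation; start a collections file.
        "undocumented": sorted(on_shelf - in_files),
    }


# Made-up sample data.
shelf = {
    "County Recorder records": {"location": "Bay 3"},
    "Unidentified portraits": {"location": "Bay 7"},
}
files = {
    "County Recorder records": {"source": "transfer", "restrictions": "none"},
    "Smith family papers": {"source": "gift"},
}
report = reconcile(shelf, files)
```

The "not_located" and "undocumented" lists are the ones worth reading side by side, since an unidentified box on the shelf may well be the collection the files say you cannot find.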
Once you have basic control of your collections, then you can begin to refine your control at the series level. After you have control at the series level, you can continue a process of stepwise refinement to the folder level and finally (if you ever get there) to the item level. As you go through each layer, you can revise and correct previous descriptions based on your more complete knowledge of the collection. Realistically, you may decide to survey some important collections in depth before you describe less valuable collections; but I encourage you to have at least a minimum collection-level record for all your holdings.
This layered approach rapidly provides a broad level of access to all your collections. First, you don't have a large number of items without any access because you've spent so much time providing detailed access to a few items. Second, providing patrons with a basic list of collections by provenance with a terse description typically indicates the records' primary value--the reason the records were made in the first place. If you're searching for land ownership, you know to check the records of the County Recorder rather than the Fire Department. Secondary values can often be inferred.
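As a rough sketch of working in layers, each survey pass below adds or corrects a field or two for every collection it touches, and a helper reports which collections still lack a given field. The names `apply_layer` and `missing`, the field names, and the sample values are assumptions made for illustration, not a prescribed record format.

```python
def apply_layer(records, layer):
    """Merge one survey pass into the records.

    Later passes add new fields and overwrite earlier values, which is
    the built-in chance to correct what an earlier sweep recorded.
    """
    for collection, fields in layer.items():
        records.setdefault(collection, {}).update(fields)
    return records


def missing(records, field):
    """List the collections that still have no value for the given field."""
    return sorted(name for name, rec in records.items() if not rec.get(field))


records = {}

# First sweep: location and provenance for every collection.
apply_layer(records, {
    "County Recorder records": {"location": "Bay 3", "provenance": "County Recorder"},
    "Fire Department records": {"location": "Bay 4", "provenance": "Fire Department"},
})

# Second sweep: a terse description, plus a corrected location.
apply_layer(records, {
    "County Recorder records": {"description": "Land ownership records", "location": "Bay 5"},
})

still_undescribed = missing(records, "description")
```

After the second sweep every collection still has at least a location and provenance, and `still_undescribed` shows where the next pass should go.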
Original order is particularly important in organic collections because the contextual relationship of the records may contain as much information as the records themselves. Many people try to save work by reorganizing the materials by subject headings, but that destroys important information. I can't emphasize enough: if you aren't trained in appraising records for contextual order, don't change their order; you can organize descriptions on paper to your heart's content, but leave the originals alone unless you know what you're doing. (Now, I'll step off my soap box.)
In your finding guides, describe the order of the materials with instructions to patrons as to how to request materials. For example, a collection of portraits might include the note "Arranged by name of sitter; include the name of the sitter on your call slip and you will be brought the appropriate box."
If there is an existing finding guide, direct the patron to request it. Ideally, the finding guide will index entries in a meaningful fashion, but many are registers in chronological order. Even though the important elements may be randomized by date, scanning the entries for a relevant name is likely going to be quicker than trying to scan the records themselves.
I believe that finding guides are very important documents. They provide a single place to record the archivist's intimate knowledge of the materials gained during processing. I also use the finding guide as the place to record information about the collection that researchers need to know in the reading room; it's usually the reference archivist who knows to direct the patron there, but information on restrictions, credit lines, and similar administrivia is in an easily accessible place rather than buried in a file somewhere else.
The collection is typically described hierarchically, beginning with an administrative history or biographical note to place the collection in the context of its creation and then giving an overview of the collection in a scope and contents note. The next level in the hierarchy is the series, and within each series is a list of folder headings.
The strategy we need to learn is one of effective description. What do patrons need to know about the materials in order to make an intelligent selection? More and more, I believe that folder headings are not sufficient; the record creators were so familiar with the materials that they only needed a mnemonic. Researchers unfamiliar with the materials need more information.
Rob Spindler, a colleague on the reference desk, observed that patrons tend to ask questions in terms of a topic, a format, and a date. Minimally, we should provide that information with the folder heading. Depending on the nature of our mission, we may want to describe other aspects. The Center for Creative Photography is interested in 20th century photographers; they might not choose to note every individual referenced in the records, but they might always note a photographer.
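One way to picture a folder heading that carries topic, format, and date is a small record that can be filtered on any combination of the three. This is a sketch only: the `FolderEntry` fields and the `search` function are hypothetical, the two headings are borrowed from sample descriptions elsewhere in this paper, the topic terms are assumed for illustration, and "ca. 1910" is simplified to the single year 1910.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FolderEntry:
    heading: str
    topics: tuple
    formats: tuple
    first_year: int
    last_year: int


def search(entries, topic=None, fmt=None, year=None):
    """Return the entries that match every criterion the patron supplied."""
    hits = []
    for entry in entries:
        if topic is not None and topic not in entry.topics:
            continue
        if fmt is not None and fmt not in entry.formats:
            continue
        if year is not None and not (entry.first_year <= year <= entry.last_year):
            continue
        hits.append(entry)
    return hits


entries = [
    FolderEntry("Indian trip journals", ("travel",), ("notebooks",), 1959, 1989),
    FolderEntry(
        "Karl Moon portraits of Native Americans",
        ("Native Americans", "portraits"),
        ("photographic prints",),
        1910,
        1910,
    ),
]

hits = search(entries, fmt="photographic prints", year=1910)
```

A repository with a narrower mission could add one more field, such as photographer, without changing how the filter works.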
In studying patron understanding of descriptions of archival collections in online library catalogs, Rob Spindler and I discovered that people frequently misinterpreted the descriptions written according to Archives, Personal Papers, and Manuscripts(3) (APPM)--the standard rules for describing archives in archival collections.
Archival description has traditionally been a list of folder headings. However, those headings often fail to capture much of the information researchers need to make a meaningful selection of materials. We must go beyond transcription to analysis; we must provide a sense of the contents of the folders.
Starlie Lomayaktewa, Mishongnovi town crier, ca. 1982 / by Suzanne and Jake Page. 1 photographic print.
Geronimo--Apache, before 1907 / by Edward S. Curtis. 1 photogravure print.
Indian trip journals, 1959-1989 / by R. Brownell McGrew. 32 spiral notebooks of mss.
Karl Moon portraits of Native Americans, ca. 1910. 5 photographic prints. 487. Desert dawn, c1907 -- 488. On the way to the trading post -- 489. A pool by the trail -- 490. In the Indian country = Awaiting the signal : To illustrate The End of the Trail -- 491. The meeting place : To illustrate The End of the Trail.
The key is that I am interpreting the materials. Minimally, I am synthesizing information in the records that researchers need in order to decide if the material will be relevant. Traditional bibliographic description relies on transcription of existing information, but archivists often do not have the luxury of having something to transcribe. At this level, interpretation is no different from an annotation in a bibliographic record.
Any interpretation beyond synthesis verges on anathema to bibliographic catalogers; they believe strongly that the cataloger must leave judgment of a work to the patron. Ultimately, I agree; it is the patron who must interpret the materials. My intent is to provide information that will aid in the selection of materials. When possible I rely on scholars' interpretations and annotations, which are more authoritative than my own. Also, I put a rhetorical flag on anything that might be considered a value judgment, such as "May be interpreted as . . . . "
Archives differ from general collections, and I believe this difference justifies interpretation. For example, a cataloger would never label a book as propaganda. But in an archive that specializes in World War II, what good is it to collect examples of propaganda if it will not provide access to it under that heading?
Photographic collections place particular demands on archival description for several reasons. Photographs belong to the realm of vision, textual records to that of hearing. Photographic records' particular value is their ability to communicate information words cannot; we use photographs to "point" to things we cannot describe.
Effective description is not necessarily measured by the amount of detail. To the contrary, if we bury researchers under too much description, we might as well bury them under the records themselves. Our description should support the repository's mission; don't note every datum, but do note those that directly relate to the mission.
In the worst case scenario the repository created a finding guide for each collection and the researcher was forced to look through each guide. That was not so onerous when there were only a handful of guides; but searching a hundred guides was impractical.
One technique to supplement access is a topical bibliography or essay describing relevant collections. This technique is a particularly handy reference tool to help direct patrons to collections related to common queries.
Another common technique to provide access is to create a single index containing entries for all collections. The index may be nothing more than a heading followed by a pointer to a collection guide or to the materials (in terms of the collection, box and folder numbers). Or the index may be more like a bibliographic catalog, containing both the callmark and a description of the materials.
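The simpler form of such an index, a heading followed by pointers, might be sketched as follows. The function name `build_index`, the input layout, and the collection names and headings are all hypothetical; each pointer is a (collection, box, folder) triple as described above.

```python
from collections import defaultdict


def build_index(collections):
    """Merge folder headings from every collection into one index.

    `collections` maps a collection name to a dict whose keys are
    (box, folder) pairs and whose values are lists of index headings.
    """
    index = defaultdict(list)
    for collection, folders in collections.items():
        for (box, folder), headings in folders.items():
            for heading in headings:
                index[heading].append((collection, box, folder))
    # Alphabetize the headings and the pointers under each heading.
    return {heading: sorted(index[heading]) for heading in sorted(index)}


# Made-up sample data.
collections = {
    "Collection A": {(1, 2): ["Portraits", "Trading posts"]},
    "Collection B": {(3, 1): ["Portraits"]},
}
index = build_index(collections)
```

A researcher looking up "Portraits" is sent to box 1, folder 2 of one collection and box 3, folder 1 of another without opening either finding guide.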
Providing description with the index headings can aid the researcher in the selection of materials. Great in theory, but the actual mechanics of producing an archival catalog containing adequate description of materials under the many different headings is an enormous amount of work.
Computers promised to reduce the amount of work, and the Society of American Archivists adopted APPM and the USMARC-AMC format as a means to automate the cataloging process and improve access. With mixed results.
The use of APPM and AMC has changed significantly in the ten years since they were brought online. Originally, archivists tended to try to lump the entire finding guide into one enormous record that took up a dozen screens for description and a dozen dozen screens for index headings. Search and retrieval engines have generally been developed for bibliographic databases that organize the hits according to a bibliographic notion of main entry rather than the archival principles of provenance and original order. Creating an APPM/MARC record was generally done after a traditional finding guide had been written. Instead of saving work, automation became another layer in the descriptive process and took--rather than saved--time.
At the same time archivists were learning the basics of AACR/MARC-based online description, the Internet was born. Instead of adding another step to create a MARC record, many archives are loading their word-processed finding guides onto Gophers. Instead of indexing the materials, they're using Veronica and WAIS to provide access through full-text searches of the guides.
Gopher descriptions are scattered throughout the Internet and lack consistency; ironically, we've dumped the haystack on the researcher. But, the researchers seem to love it. The positive reception of finding guides on Gophers may give us some clues to how we can facilitate access.
First, Gophers are often organized regionally and many topics lend themselves to regional access. Also, many patrons know which repositories hold relevant materials; check Yale for Texas and the Southwest, check the Bancroft for California and the Southwest.
Wired researchers haven't jumped into an electronic haystack; the haystack is organized in ways that will lead them to many useful collections. They may miss some important materials in out-of-the-way places, but the poly-hierarchical organization of Gophers seems to be an effective way to winnow down the size of the haystack. And Veronica and WAIS searches are still primitive, but they often help researchers find those out-of-the-way collections.
Second, Gopher guides are much more detailed than the collection-level AMC records. My sense is that the AMC records are so broad that they do not aid researchers in the selection of materials.
Gophers are great browsing tools. I think they will show some limitations as their novelty wears off and the size of the Internet grows.
Researchers on the leading edge of the Internet are some of the most intelligent and diligent researchers; but Gophers may not be as effective for those who want more direct access. Many of those on the Internet pay no access charges, so their only expense in searching is their time; if people start being charged for time or file transfers, they will want more direct access to relevant materials to save money.
In the middle is virtue. I happen to believe that APPM/MARC descriptions can work effectively in archival description. We need to play more with their implementation to find out what really works and what doesn't work, but I think they can work.
Why use a thirty-year-old data architecture? My first response is simple: because it's there, so why reinvent the wheel. Second, the format and standards are not thirty years old; they've evolved and will change, and if something needs to be tweaked to make it work better, we'll tweak it.
I believe the significant problem that APPM/MARC has made apparent existed before we tried automation.
Indexing--automated or manual--leaves much to be desired. A number of currents are driving us in the right direction.
Researchers will ultimately have to look at the materials to make the final selection of what's relevant. The catalog is the first step, not the final step. We need to provide a way to navigate subject headings. Incorporating cross-references and the syndetic relationships between headings into the index will help researchers refine their search strategies by pointing to broader, narrower, and related concepts.
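A minimal sketch of navigating such relationships follows, assuming a small hand-built table of broader, narrower, and related headings. The table contents and the function names `suggest` and `expand_narrower` are invented for illustration and do not reproduce any published subject heading list.

```python
# Hypothetical cross-reference table: each heading lists its broader,
# narrower, and related headings.
THESAURUS = {
    "Visual materials": {"narrower": ["Photographs"]},
    "Photographs": {
        "broader": ["Visual materials"],
        "narrower": ["Portraits", "Photogravures"],
        "related": ["Photographers"],
    },
    "Portraits": {"broader": ["Photographs"]},
    "Photogravures": {"broader": ["Photographs"]},
}


def suggest(heading, thesaurus):
    """Show where a researcher could go next from a given heading."""
    entry = thesaurus.get(heading, {})
    return {rel: list(entry.get(rel, [])) for rel in ("broader", "narrower", "related")}


def expand_narrower(heading, thesaurus):
    """Collect a heading together with every heading beneath it."""
    seen = set()
    stack = [heading]
    while stack:
        term = stack.pop()
        if term in seen:
            continue  # guard against circular references in the table
        seen.add(term)
        stack.extend(thesaurus.get(term, {}).get("narrower", []))
    return sorted(seen)


next_steps = suggest("Photographs", THESAURUS)
all_terms = expand_narrower("Visual materials", THESAURUS)
```

`suggest` supports the refinement described above by offering broader, narrower, and related headings, while `expand_narrower` lets a search on a broad heading also pick up material indexed only under its more specific headings.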
SAA's CAIE is investigating ways to trace administrative history/biographical notes in a name authority database, and then provide links to the descriptive database.