Building the Digital Collection: The Evolution of the Conservation Movement, 1850-1920

The 793 documents comprising The Evolution of the Conservation Movement, 1850-1920 include 62 books and pamphlets, 140 Federal statutes and Congressional resolutions, 34 additional legislative documents, 23 excerpts from the Congressional Globe and the Congressional Record, 360 Presidential proclamations, 170 prints and photographic images, two historic manuscripts, and two motion pictures. The materials for the collection come from the General Collections of the Library of Congress, from the Law Library of Congress, and from the Manuscript Division; Motion Picture, Broadcasting and Recorded Sound Division; Rare Book and Special Collections Division; and Prints and Photographs Division.


The documents consist of printed matter and manuscript items. All documents are reproduced as a set of facsimile page images. Most of the longer documents have a searchable text transcription as well. Systems Integration Group (SIG) of Lanham, Maryland, digitized the documents. The image capture occurred at the Library and, in order to preserve the originals, bound works were scanned in their bindings. The master, or archival version, of the images for most pages that consist of typography and line-art is the 300-dpi bitonal image stored in TIFF format using Group IV compression standards.

Although a few document images were produced in the late 1990s, most of the collection's documents were scanned in 1994-95 using the Xerox (Kurzweil) K5200 scanner. Like certain specialized photocopiers, this scanner has a book-edge design that allows a bound volume to be inverted and positioned along a beveled edge for scanning.

In addition to the master images of typographic or line-art pages, the Library of Congress created additional image types to reproduce printed halftone illustrations. Printed halftones present special problems that result from interference between the spatial frequency of the halftone dot pattern and the spatial frequency applied by scanning and/or output devices. When the two frequencies combine, the interference between them manifests itself as moiré patterns that degrade the image.

Most of the images of printed halftones in books were produced on the Xerox K5200 scanner, which offers a diffuse-dithering algorithm that randomizes the scanner's pattern of dots to produce bitonal images in which the moiré pattern is suppressed or reduced. However, the algorithm adds speckles to the white areas surrounding an illustration, which reduces the legibility of captions or other typography included in the same scan as the illustration. When the Xerox K5200's diffuse-dithering treatment is applied, the software creates files in the PCX format.
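The source does not say which algorithm the K5200 implements. As a rough illustration of the general idea (spreading quantization error to neighboring pixels so that the output dots do not fall on a regular grid), the following Python sketch uses Floyd-Steinberg error diffusion. That technique is swapped in here as an assumption, and the function name and sample values are hypothetical.

```python
def floyd_steinberg(gray):
    """Convert a 2-D list of 0-255 gray values to a bitonal (0 or 255) image.

    Illustrative only: Floyd-Steinberg error diffusion is an assumption,
    not the documented behavior of the Xerox K5200.
    """
    height = len(gray)
    width = len(gray[0])
    # Work on a float copy so diffused error can accumulate between pixels.
    work = [[float(v) for v in row] for row in gray]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= 128 else 0
            out[y][x] = new
            err = old - new
            # Push the quantization error onto pixels not yet visited.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


# Hypothetical sample: slightly off-white "paper" around an illustration.
paper = [[240] * 32 for _ in range(32)]
speckled = floyd_steinberg(paper)
black_pixels = sum(row.count(0) for row in speckled)
```

In this sketch a pure white patch (255) stays white, but a slightly off-white patch such as the one above picks up isolated black pixels, which is consistent with the speckling of white areas described in the text.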

Derivative GIF-format images were created for the page-turner utility and, in the case of the TIFF and PCX formats, for browser compatibility. The derivatives were produced in a batch conversion procedure using Image Alchemy. Each resulting image was reduced in scale to fit the typical display monitor and sharpened to enhance legibility. When bitonal images are processed, gray tones are added, and the resulting image is blurred to mimic grayscale. Tonal resolution for grayscale and color images is generally maintained when they are converted at 16 colors, as these derivatives were.
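One way to see how a reduced bitonal image can take on gray tones is to average blocks of black and white pixels and round the result to one of 16 levels. The Python sketch below does this. It is a simplified illustration under stated assumptions, not the procedure Image Alchemy actually applies; the function name, the integer reduction factor, and the sample image are hypothetical.

```python
def scale_to_gray(bitonal, factor, levels=16):
    """Shrink a bitonal image by an integer factor, producing gray values.

    bitonal: 2-D list with 0 for black and 1 for white.
    Returns a 2-D list of 0-255 values limited to `levels` distinct tones.
    Assumption: simple block averaging stands in for the real conversion.
    """
    out_height = len(bitonal) // factor
    out_width = len(bitonal[0]) // factor
    result = []
    for by in range(out_height):
        row = []
        for bx in range(out_width):
            # Count the white pixels in this factor-by-factor block.
            white = 0
            for dy in range(factor):
                for dx in range(factor):
                    white += bitonal[by * factor + dy][bx * factor + dx]
            fraction = white / (factor * factor)
            # Snap the white fraction to the nearest of the available tones.
            step = round(fraction * (levels - 1))
            row.append(step * 255 // (levels - 1))
        result.append(row)
    return result


# Hypothetical 4 x 8 sample: left block half black, right block all white.
sample = [[1, 1, 0, 0, 1, 1, 1, 1] for _ in range(4)]
reduced = scale_to_gray(sample, 4)
```

Blocks that straddle the edge of a letterform come out as intermediate grays, which is what makes small type easier to read on screen than a straight bitonal reduction.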

After the documents were digitized, the searchable texts were prepared offsite by a subcontractor, who transcribed the text from the document page images and then encoded it in SGML (Standard Generalized Markup Language). The texts have a transcription accuracy of 99.95 percent. The online presentation offers access to the SGML versions of the texts for users with appropriate viewing software. The SGML markup produced in 1994-95 used the then-current version of the American Memory document type definition (DTD) and was later updated to conform to the DTD as modified in 1996. The online presentation of the texts also includes a version in HTML (Hypertext Markup Language), which is easier for most users to access since no special software is required.

Pictorial items

The 170 prints and photographs were first copied to 8 x 10-inch negatives or 4 x 5-inch color transparencies to serve patron requests for copy prints. During a videodisc production activity in the early 1990s, these large negatives were themselves copied to 35 mm film, which was then converted to video for the discs. In 1999, JJT Incorporated of Austin, Texas, rescanned the 35 mm film to produce most of the digital images presented here. This set was scanned to approximately 1500 pixels on the long side. A small set of these images was scanned directly from the 8 x 10-inch negatives in 1993 at a resolution of approximately 1280 pixels on the long side. All the images can be viewed at thumbnail size and as a 640-pixel JPEG-format service image. Some items also have a larger JPEG-format image available at 1024 pixels on the long side. The uncompressed TIFF-format file is also available for downloading where it exists.

Motion pictures

The two motion pictures included in The Evolution of the Conservation Movement, 1850-1920 were copied to videotape by Roland House in Arlington, Virginia. The films were shot at varying frame rates; therefore, in the video mastering process, the playback speeds were adjusted to present the appearance of natural motion to the greatest degree possible. Main title frames for each part were added during the editing process.

The video copies of the films were digitized by Crawford Multimedia in Atlanta, Georgia. MPEG, QuickTime, and RealMedia digital versions are available. For all American Memory collections, the MPEG and QuickTime versions of titles with running times greater than four minutes have been divided into segments to reduce the file sizes to 40 MB or less. A typical 28.8-kbps Internet connection achieves a theoretical maximum download rate of approximately 3.5 KB/sec (210 KB/min) under ideal conditions. Therefore, a file of 40 MB would take approximately 190 minutes (3 hours, 10 minutes) under optimal conditions and possibly two to three times longer, depending on Internet traffic load. The MPEG files have a spatial resolution of 320 x 240 pixels and run at a nominal rate of 30 fps. The QuickTime files have a resolution of 160 x 120 pixels and run at 10-15 fps.
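The download estimate above can be checked with a few lines of arithmetic. The Python sketch below assumes decimal units (1 MB = 1,000 KB), which is the reading that reproduces the roughly 190-minute figure; the function name is hypothetical.

```python
def download_minutes(size_mb, rate_kb_per_sec=3.5):
    """Estimate download time in minutes for a file of size_mb megabytes.

    Assumes 1 MB = 1,000 KB and a steady transfer rate, both of which
    are simplifications made for this illustration.
    """
    size_kb = size_mb * 1000
    seconds = size_kb / rate_kb_per_sec
    return seconds / 60


ideal = download_minutes(40)
print(f"Ideal conditions: about {ideal:.0f} minutes")
print(f"Under heavy traffic: about {2 * ideal:.0f} to {3 * ideal:.0f} minutes")
```

The 3.5 KB/sec rate is itself a rounding: 28.8 kilobits per second divided by 8 bits per byte is 3.6 KB/sec before any protocol overhead.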

