Westward by Sea: A Maritime Perspective on American Expansion, 1820-1890

Integrating the Collection into American Memory

The original materials in this collection are heterogeneous, and the corresponding digital reproductions fall into several categories that the Library of Congress uses to represent converted materials online:

Pictorial items
Represented by thumbnail and higher resolution image(s)

Example: Oil painting

Large images (such as maps)
Represented by thumbnail and "zoom" view. Mystic Seaport chose to use this approach for various charts, ships' plans, and large manuscript documents. Files compressed using MrSID software were delivered to the Library of Congress for this purpose.

Example: Ship's plan

Page-turning items
Each page may be available at higher resolutions or through a zoom view

Example: Set of plans for a ship

Searchable text items (marked up in SGML)
The Library of Congress uses the simple American Memory DTD for its own transcriptions and converted Mystic Seaport's plain ASCII transcriptions to the same format. Most transcriptions are accompanied by page images; the American Memory presentation supports links from the searchable text to the corresponding page images. Long documents are usually divided into sections, such as chapters.

Example: Letters written on a journey west aboard ship

Contact-sheet items
Represented by a grid of captioned thumbnails. This presentation is used when a set of graphical items has been described (cataloged) as a group rather than with individual records.

Example: Collection of sailing cards

The G. W. Blunt Library at Mystic Seaport delivered image files in various formats (GIF, JPEG, MrSID), transcribed text as ASCII files, and catalog records in the MARC format to the Library of Congress. Each MARC record included a local field that indicated the item's unique identifier within the collection and its category. For use in American Memory, this information was transferred to subfields $f and $q of the 856 field. This enables a link from bibliographic displays to the appropriate presentation for each digital reproduction. Local fields and subfields were added for consistency with other American Memory collections and to support cross-collection searching.
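
As a rough illustration of how the two 856 subfields could drive the link from a bibliographic display to a presentation, the following Python sketch reads $f and $q from a text rendering of the field. It is not the Library of Congress's code. The text layout of the field, the sample values, the function names, and the assumption that $f holds the identifier while $q holds the category are all hypothetical; the description above names the two subfields but does not say which carries which.

```python
import re

# Assumption: a subfield is written as "$" plus a one-character code,
# followed by its value, as in many plain-text MARC displays.
SUBFIELD = re.compile(r"\$([a-z0-9])\s*([^$]*)")


def parse_856(field_text):
    """Return a dict mapping subfield code to value for one 856 field."""
    return {code: value.strip() for code, value in SUBFIELD.findall(field_text)}


def presentation_link(field_text):
    """Return (category, identifier) for choosing a presentation.

    Assumption: $q holds the category and $f the item identifier.
    """
    subfields = parse_856(field_text)
    return subfields.get("q"), subfields.get("f")


# Hypothetical field content, for illustration only.
example = "856 41 $f msp0001 $q page-turning"
print(presentation_link(example))
```

A display layer could then use the category to choose among the presentations listed earlier (pictorial, zoom view, page-turning, searchable text, contact sheet) and the identifier to locate the files.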

Transcribed text was delivered as ASCII files, one file per item as cataloged. Within the transcription, page-breaks had been indicated and the corresponding image file identified. For some works, a second file was delivered, representing a table of contents with each entry identifying the starting page-image. The Library of Congress took the transcriptions and tables of contents and marked them up automatically in SGML, according to the American Memory DTD. Documents for which tables of contents were available were divided into chapters for more effective searching. Special characters (such as the degree sign) were converted to SGML character entities. Positions (defined by latitude and longitude) and dates in diaries and logbooks had been encoded by Mystic Seaport in normalized form in addition to the form used in the original document. Positions were transformed for readability and dates were transformed to ISO 8601. As text strings, they can be searched for within the full text. The embedded references from the transcription files to page images were used to generate the datasets that support the Library of Congress's page-turning interface. The full text and the MARC records were indexed using Aurora (formerly InQuery), the search engine used for the American Memory service.
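
The conversion steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used. The page-break marker syntax (`[page: filename]`), the input date format (month/day/year), the entity table, and all function names are hypothetical. Only the degree sign, ISO 8601 output, and the use of embedded image references for page-turning come from the description above.

```python
import re
from datetime import datetime

# Only the degree sign is named in the description; a real table would be longer.
# Assumption: the special character arrives as a Unicode character.
ENTITIES = {"\u00b0": "&deg;"}

# Hypothetical marker syntax for a page break and its image file.
PAGE_MARK = re.compile(r"\[page:\s*(\S+?)\]")


def to_sgml_entities(text):
    """Escape markup characters, then map special characters to entities."""
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    for char, entity in ENTITIES.items():
        text = text.replace(char, entity)
    return text


def to_iso_date(value, fmt="%m/%d/%Y"):
    """Convert a normalized date string to ISO 8601 (YYYY-MM-DD).

    Assumption: the delivered normalized form is month/day/year.
    """
    return datetime.strptime(value, fmt).date().isoformat()


def page_sequence(text):
    """List (page number, image file) pairs in reading order."""
    return list(enumerate(PAGE_MARK.findall(text), start=1))


# Hypothetical transcription fragment, for illustration only.
sample = "[page: 0001.gif] Lat. 41\u00b0 N [page: 0002.gif] Calm seas."
print(to_sgml_entities(sample))
print(to_iso_date("03/14/1849"))
print(page_sequence(sample))
```

Keeping the ISO 8601 date in the text as a plain string means a search for a value such as 1849-03-14 matches the diary or logbook entry directly, and the ordered page list is the kind of dataset a page-turning interface needs.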

