MIP, in collaboration with the Instructional and Collections Computing Facility (ICCF), provides database support services to museums and archives. These services include support for database management systems, data modeling, data model integration, and software for data retrieval and analysis. To date, MIP has developed and supports nine databases, each with its own collection management application. These databases contain over 1.5 million cataloging records from campus museums and archives.

All MIP-supported databases are implemented using the Sybase database management system. MIP provides services for implementation, backup, maintenance, and migration to new versions of Sybase. Database reporting has been supported through the commercial application Business Objects and through locally developed tools written in Perl, C, and C++.

Enhancements to Sybase for spatial data and full-text data are being added to expand MIP-supported database management services. The Spatial Query Server extension to Sybase has been implemented, and the Spatial Database Engine from Environmental Systems Research Institute (ESRI) will be evaluated for its interoperability with Sybase. The full-text products from Verity for use with Sybase are under consideration; an enhanced version of the Verity full-text product with a broader range of retrieval functions is due for release later this year. Unicode support for multilingual character sets is available in the current release of Sybase.

MIP deferred spatial and full-text database support pending the completion of site licensing or the availability of appropriate software versions. The campus recently obtained an ESRI site license. Because the needed software tools are now available, prototype database support for projects using spatial data, full-text data, and non-Roman character sets will be developed during the coming year in collaboration with the Electronic Cultural Atlas Initiative (ECAI) and the campus Geographic Information Science Center.

For development of prototype database support services for spatial and full-text data, MIP and ITP will collaborate with the ECAI project to:

Implement the Time Map database structures and software developed at the University of Sydney.
Store and retrieve SGML/XML documents created by ECAI projects.
Convert and implement a Microsoft Access database of Chinese, Japanese, and Korean characters developed by the Academia Sinica, Taiwan for encoding historical manuscripts.

MIP and ITP are also investigating the application of multivalent document technology developed by the Digital Library for the Environment to documents produced by ECAI.

As part of the development of an image repository (see Repository Services), MIP has begun the development of a database of images for networked presentation. To date, all images available through MIP have been stored in file systems rather than databases. With enhanced support for binary large objects by Sybase in its latest releases, it is now feasible to support such data in a database. Database support for images will be extended to include audio and video in the coming years.
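The move from file-system storage to binary large objects in a database can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for Sybase, and the table name, column names, and checksum step are hypothetical choices made for illustration; they do not describe the MIP image database.

```python
import hashlib
import sqlite3


def create_image_store(conn):
    """Create a hypothetical table holding image bytes alongside metadata."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS image (
            image_id  INTEGER PRIMARY KEY,
            object_id TEXT NOT NULL,   -- assumed link to a catalog record
            mime_type TEXT NOT NULL,
            sha256    TEXT NOT NULL,   -- checksum of the stored bytes
            content   BLOB NOT NULL    -- the image itself
        )
    """)


def store_image(conn, object_id, mime_type, data):
    """Insert image bytes as a BLOB and return the new row id."""
    digest = hashlib.sha256(data).hexdigest()
    cur = conn.execute(
        "INSERT INTO image (object_id, mime_type, sha256, content) "
        "VALUES (?, ?, ?, ?)",
        (object_id, mime_type, digest, data),
    )
    conn.commit()
    return cur.lastrowid


def fetch_image(conn, image_id):
    """Return (mime_type, bytes) for an image, verifying its checksum."""
    row = conn.execute(
        "SELECT mime_type, sha256, content FROM image WHERE image_id = ?",
        (image_id,),
    ).fetchone()
    if row is None:
        raise KeyError(image_id)
    mime_type, digest, content = row
    data = bytes(content)
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("stored image failed checksum verification")
    return mime_type, data
```

Keeping the image bytes in the same database as the cataloging record lets one backup and one transaction cover both, which is one plausible reason to prefer this layout over separate files.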

The University Library has extensive experience with SGML documents but has not developed a long-term strategy for their management. Through the Berkeley Finding Aids Project and its successor, the Encoded Archival Description, the University Library has prepared a large number of SGML documents that describe portions of its archival collections. The Library's experience with SGML collection description documents complements MIP’s use of relational object catalogs. Expanding our database support services will facilitate the development of a coordinated strategy by IST and the University Library for SGML / XML document support.

Database management system support is a core service of MIP. As the number of databases has grown and the set of services needed has expanded, support staff has shrunk due to staff attrition in 1996 and 1997 and the inability of IST to refill these positions due to budget restrictions. Funding for staff to bring MIP back to its 1996 staffing level has been requested as part of the IST campus budget request. Returning to the 1996 staffing level is critical for MIP to provide this and other services discussed below.

For each database supported by MIP, a data model has been developed. These data models are useful for defining the scope and functions of the database, for describing the processes these databases support, and for implementing the database. In addition, comparative analyses of these data models have allowed MIP to design general data structures to be applied across projects.

MIP has developed general data structures for thesauri and for agents (people and institutions), including the roles agents play relative to a database or to the objects it describes. The thesaurus data structure is consistent with the ANSI / NISO Z39.19-1993 standard, Guidelines for the Construction, Format, and Management of Monolingual Thesauri. A report on this data structure can be found at MIP's Relational Data Structures for Implementing Thesauri page.

This thesaurus data structure has been applied to the following thesauri used by MIP projects:

Art and Architecture Thesaurus - a comprehensive vocabulary of nearly 120,000 terms for describing objects, textual materials, images, architecture, and material culture from antiquity to the present.
Thesaurus of Geographic Names - approximately 900,000 records for places, arranged in hierarchies representing all nations of the modern world, and including vernacular and historical names, coordinates, place types, and other relevant information.
Ethnologue - a catalogue of more than 6,700 languages spoken in 228 countries. The Ethnologue lists over 39,000 language names, dialect names, and alternate names and organizes languages according to language families.
Synonymized Checklist of the Vascular Flora of North America North of Mexico - the most recent and most comprehensive compilation of the taxonomy of the North American vascular flora.
Mammal Species of the World - names of 4,629 currently recognized species of mammals, in a taxonomic hierarchy that includes class, order, family, subfamily, and genus.
World Geographic Scheme for Recording Botanical Distributions - a four-level categorization of the major political subdivisions of the world with particular emphasis on the definition of botanical recording units for documenting plant distributions.
Library of Congress Subject Headings - a collection of approximately 250,000 terms widely used in cataloging library materials.
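The idea of one general thesaurus structure shared across vocabularies can be sketched as a pair of relational tables: one for terms and one for typed relationships between them. The sketch below is an illustration only, not the structure in MIP's report. It uses Python's sqlite3 module as a stand-in for Sybase, the table and function names are hypothetical, and it assumes the broader-term (BT), related-term (RT), and USE relationship types described in Z39.19, with BT links that contain no cycles.

```python
import sqlite3


def build_thesaurus(conn):
    """Create hypothetical term and relationship tables."""
    conn.executescript("""
        CREATE TABLE term (
            term_id   INTEGER PRIMARY KEY,
            label     TEXT NOT NULL UNIQUE,
            preferred INTEGER NOT NULL DEFAULT 1   -- 0 marks an entry term
        );
        CREATE TABLE term_relation (
            from_id  INTEGER NOT NULL REFERENCES term(term_id),
            to_id    INTEGER NOT NULL REFERENCES term(term_id),
            rel_type TEXT NOT NULL CHECK (rel_type IN ('BT', 'RT', 'USE')),
            PRIMARY KEY (from_id, to_id, rel_type)
        );
    """)


def add_term(conn, label, preferred=True):
    """Insert a term and return its id."""
    cur = conn.execute(
        "INSERT INTO term (label, preferred) VALUES (?, ?)",
        (label, int(preferred)),
    )
    return cur.lastrowid


def relate(conn, from_id, rel_type, to_id):
    """Record that from_id has relationship rel_type to to_id."""
    conn.execute(
        "INSERT INTO term_relation (from_id, to_id, rel_type) VALUES (?, ?, ?)",
        (from_id, to_id, rel_type),
    )


def broader_chain(conn, term_id):
    """Follow BT links upward; return labels ordered nearest first."""
    rows = conn.execute("""
        WITH RECURSIVE up(id, depth) AS (
            SELECT to_id, 1 FROM term_relation
            WHERE from_id = ? AND rel_type = 'BT'
            UNION ALL
            SELECT r.to_id, up.depth + 1
            FROM term_relation r JOIN up ON r.from_id = up.id
            WHERE r.rel_type = 'BT'
        )
        SELECT t.label FROM up JOIN term t ON t.term_id = up.id
        ORDER BY up.depth
    """, (term_id,)).fetchall()
    return [row[0] for row in rows]


def preferred_for(conn, label):
    """Resolve an entry term to its preferred term via a USE link."""
    row = conn.execute("""
        SELECT p.label
        FROM term t
        JOIN term_relation r ON r.from_id = t.term_id AND r.rel_type = 'USE'
        JOIN term p ON p.term_id = r.to_id
        WHERE t.label = ?
    """, (label,)).fetchone()
    return row[0] if row else label
```

Because narrower-term and used-for relationships are the inverses of BT and USE, this sketch stores each link once and derives the inverse by querying in the other direction. The same two tables could hold a vocabulary of object types, a place-name hierarchy, or a taxonomic checklist, which is the sense in which the structure is general.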

MIP is developing a database of data models for data dictionary support, schema integration, schema reconciliation, and database maintenance. PowerDesigner by Powersoft is being used for this activity. The development of this database is underway and will be an ongoing activity of MIP.

