Samples Database
The database MBARI_Samples is the heart of the MBARI Sample Data Management System. It resides on the machine Godzilla in Microsoft SQL Server. It is administered using Enterprise Manager and queried through ISQL_W, MS Access, or from the Web. Besides Enterprise Manager, it requires ODBC, MS Internet Information Server (IIS), and MS FrontPage extensions to be fully functional, and it must run on a WindowNT machine. The ODBC name must be mapped on your computer for access to the database other than through the web. It may be queried from MBARI's internal web only at this time.
Database structure
| See design diagram (diagram of table relationships and structure) | |
| See table design (field names, data types, and etc.) |
The structure of the database is relational. The core tables are CollectionGroup, CollectionEvent, and Sample. There can be one or many samples related to a collecting event, and one or many collecting events to a collection group. The CollectionGroup table is the root table. It functions as an expedition table; most of the collection groups are ship expeditions, but such is not a requirement. The CollectionEvent table stores the rest of the information pertaining to collection events, which are when the samples are gathered. The Sample table contains information about individual specimens. All the other tables are about these three core tables. They are either look-up tables (eg., Person, Equipment, PreservativeMethod) or tables that may contain zero, one or many records related to a sample (eg., Document, Citation, Loan, StorageLocation, SeedVIMSConcepts).
Procedures and interfaces developed to maintain and interact with the database are discussed elsewhere:
| CTDO lookup | |
| VIMS explosion | |
| Import spreadsheet data | |
| Backups | |
| Web interfaces | |
| MS Access satellite databases |
Database design criteria
Stand alone
The database needs to be able to go to sea eventually. It was designed to be as
self-contained as possible, therefore, some items that could have been sought from
databases elsewhere, such as the CTD data and descriptive keywords, are included in the
specimen database. Images are not included however, as they are such large files. The
version of the database that will go to sea will probably be a read-only copy to avoid
synchronization issues.
Diversity of specimens
Samples are generally biological or geological in nature, but the system may easily be
extended to other materials, such as frozen tissues and "derived objects", such
as thin section slides or SEM stubs. Because of the diversity of specimens, sample
descriptions were difficult to model; the way descriptions are handled is discussed below.
Specimens are generally stored in "lots", in that material in individual containers is generally collected at one time and place. The database is more or less "lot based". However, this scheme is complicated by the reality that the material may be mixtures (e.g., not sorted to individual species or lithology), or it may be material derived from specimens (e. g., subsamples). Mixtures are handled by assigning more than one description keyword (VIMS concept) to describe the specimen and marking it with the database field Sorted equal to No (when they are sorted to species or lithology, Sorted equals Yes). The way subsamples are handled is described below.
Sample numbers
Each specimen (sample and subsample) has a unique SampleID (also known as MBARI ID# ) so
that it may be tracked individually. This number is assigned by the database as an
autoincrementing integer. The SampleID is independent of the sample number/name that a
specimen may be given by the Collector (which is stored in the CollectionRefName and/or
SampleRefName fields), and allows for the collector's personal preference in sample
numbering codes and the inevitable duplication or omission of sample numbers. The SampleID
is the foreign key in many of the outlying tables in the database.
The CollectionRefName is the name (commonly known as "sample number") given by the collector to a specimen at the time of collection. SampleRefName is the name of an individual specimen; for the original sample it is the same as the CollectionRefName, but for subsamples it might include additional characters to indicate multiples or treatments.
There can be more than one sample collected during a collection event, and they all share a CollectionEventID (the foreign key in the Sample table) and therefore share the CollectionRefName and other collecting event information. Because they are separate samples, however, they each have their own SampleID, SampleRefName, description keywords, and storage and handling information.
Subsamples
There may be many subsamples derived from a sample, and even many subsamples may derived
from a subsample. This is handled by a recursive link within the Sample table. Each
specimen, sample or subsample, has its own SampleID. Its lineage is maintained through the
use of the ParentSampleID field: subsamples have their parent's SampleID entered into
the ParentSampleID field to show from which specimen it was derived. Subsamples share the
CollectionEventID of their ancestral sample, which is the one that was initially
collected, and therefore share the collecting event information. The ancestral sample's
ParentSampleID field remains null. Because subsamples are treated as separate specimens,
they have their own description keywords, and storage and handling information.
Descriptions
Descriptions of samples are handled by the database in a rather unique way, necessitated
by the diversity of the materials and by the reality that many of the biological specimens
are not known to the level of species. The Video Information
Management System knowledge base (mbari.kno) was employed for this purpose. It is an
listing of all the taxa, geological features, organism parts, and equipment observed with
MBARI video, structured in an object-oriented inheritance hierarchy (rather than
relational tables). It enforces consistency of spelling and context, and using it solved
our problems that we don't know the taxonomy of our biological specimens down to a
consistent taxonomic level and that lithological nomenclature doesn't map to as strict a
structure as taxonomic nomenclature.
| Using VICKI: Samples are described using VICKI when they are collected: as part of the video annotation, a "concept" (also called here "keyword" or "term") from the knowledge base is chosen to best describe the sample. A step in the subsequent extraction and database load process is to enter this concept into the field CollectionVIMSConcept in the CollectionEvent table and also as a SeedConcept in the SeedVIMSConcepts table. | |
| Subsamples: An ancestral sample has as one if its SeedConcepts the CollectionVIMSConcept, but it needn't be one of the keywords for a subsample if it is inappropriate. | |
| Multiple concepts: A specimen may be described by more than one term from the knowledge base; these may also be entered as SeedConcepts in the table SeedVIMSConcepts. SeedConcepts are used for the descriptions of the specimens in the "Sample Summary" and "Sample Details" web pages. | |
| Queries: The SeedConcepts are also used as "seeds" for the "VIMS Explosion" (the Java routine "UpdateAllSampleKeywords.bat"), which populates the KeywordString field of the VIMSkeywords table. This routine searches the VIMS knowledge base for each concept in the SeedConcept field, and then extracts all the concepts and aliases (e.g., common names) from above that level concept in the inheritance hierarchy, and populates new records in the VIMSkeywords table for each concept and alias extracted. For example, a specimen having a SeedConcept of "basalt" would have the keywords "basalt", "volcanic-rock", "igneous-rock", "consolidated", "rock", and "geological-feature" added to the VIMSkeywords table. The web queries that use keyword as a criteria go against the KeywordString field of the VIMSkeywords table. This way, by searching for a keyword in the hierarchy above the specimen's SeedConcept, the specimen will be found (e.g., query for keyword "igneous-rock", and specimens of "basalt" and "granite" can be found). | |
| Further descriptions: More involved descriptions go in the field DescriptionComment. This field allows free text, and is appropriate for observations of size and appearance. |
![]()
»Next: Design diagram
Last updated: 07 December 2004 by Jennifer Paduan
Copyright © 1998 MBARI