We at the New York Botanical Garden have been going through
a very arduous process of creating a collections management
and research database for our 6-million-specimen herbarium.
It is the largest collection of this type in the Western
Hemisphere, and the most active in the world, in terms of
loans and visitor use. As you can well imagine, a collection
this size is a real bear to manage.
As a first step, we are extensively modifying a program
created for the Harvard University Herbarium, called HUHpc.
Our modification is being done by the developer who created
HUHpc, with massive staff involvement. This is a Revelation
application, and any of you who have worked with databases
probably have real strong feelings about Revelation, either
pro or con (mostly the latter, it seems). This process of
modification has taken about two years, and we are just now
getting NYpc (as it is imaginatively called) to the point of
a test data load.
The second step is to specify and develop a client-server
database program on a relational database system with 4th
generation language tools, such as Oracle or Ingres. This
effort has taken about two
years to date, and we are just about ready with full specs.
We are doing this as a collaborative project with another
major botanical collection. I'm not sure what the current
projection is for getting this up and running, but very
detailed implementation plans are being drawn up, with
associated budgets. When it is ready, it will run on a UNIX
platform, with PC clients. Of course, all records created in
the interim database will be convertible to this new
database.
In 1992, a consortium of freestanding natural science
collections (i.e., non-university museum collections) did a
massive survey and analysis of computing needs. This
project, called the MITRE project after the consulting firm
that prepared the report, specifies in a pretty
straightforward fashion the model architecture for a
collections management system. W. Wayt Thomas, an NYBG
botanist, was the PI for this NSF-funded project, and we are
following the MITRE recommendations. An NSF reviewer called
this report "no less useful for being blindingly obvious..."
Getting to this stage has required massive institutional
commitment of resources, including a full-time manager of
science computing, and extensive participation of herbarium
and research staff. And the big bucks for hardware
and...gulp... data entry are yet to come.
Though this collection may be an extreme case, and I have
been watching the process more or less from the sidelines
(as a fundraiser), I have been amazed at the complexity of
this databasing project. It requires incredible
institutional stamina and persistence to get a long-term
project of this scope underway.
For perspective's sake, we keep in mind that the project to
automate the NYBG library catalog, with several hundred
thousand items, took from 1967 to 1994. It just went on
line, by the way, and you can telnet to it at
librisc.nybg.org or 192.77.202.200.
Now that I've thoroughly scared anyone contemplating a
collections databasing effort...
Eric Siegel