[tdwg-ncd] FW: NCD toolkit
N.Thomson at nhm.ac.uk
Mon Apr 23 11:09:14 CEST 2007
I'm forwarding this message to the list to maintain the sequence,
otherwise my response will be meaningless to the others. It would be
great if you would join the list at the URL provided by Ricardo, then we
can all share in these exciting developments.
From: Ruud Altenburg [mailto:ruud at eti.uva.nl]
Sent: 20 April 2007 15:19
To: Neil Thomson
Cc: tdwg-ncd at lists.tdwg.org
Subject: Re: NCD toolkit
I don't think I'm on the list you cc'ed, so if this bounces could
you please forward my reply? I have combined both of your answers of
today into one mail.
But before I get to that I have an important announcement: I have
made our NLBIF metadatabase maintenance site available on a public
server, so you and other NCD participants can have a look. Please
note that my first concern has been the functionality, not its looks!
This should easily be improved with style sheets but frankly I never
heard complaints from our ETI people about its 'Spartan look' ; )
The system is available at http://www.nlbif.nl/ncd-preview/
It currently contains no data, but anyone can have a try at adding/
updating/deleting records. The login is test/test. A final note: I
would not play too much with the "Send mailing" option, as this may
seriously annoy people you have entered as test persons! :)
Then the Q&A revisited. I only reply to issues that have remained
open or that I feel need some more discussion.
> 3. The contract mentions import from NoDIT databases as a requirement
> (plus the EAD standard which I'm not familiar with). To cater for
> this we have added several fields to the metadatabase, otherwise we
> would not have been able to migrate all data. However, strictly
> speaking the metadatabase therefore does not match the NCD standard
> one-on-one. How do we deal with this?
> ## NHT: Would you be able to let me have a list of the elements that
> do not match, please? Then we can evaluate whether they should be
> added to NCD or whether it does not matter. Since NCD was derived
> from the schema that underlies NoDIT they should be very similar.
> Markus may be able to advise on this, since he built NoDIT for
These are elements such as PercentageDatabased. We could maintain
them in the database (so they are not lost) but omit them from the
> 4. Which fields in the NCD standard are required and which are not? I
> assume ones which have a closed box in XML Spy, but this may be too
> loose for a database setup. In the NLBIF metadatabase, a collection
> must be connected to either a person or an organisation, otherwise it
> would be orphaned. If I interpret the NCD schema well, information
> about a collection can be entered without providing information about
> the organisation or person to which/whom it belongs.
> ## NHT: You are right in your interpretation of the schema. We have
> tried to keep the number of required elements to a minimum to
> encourage data entry - there are currently only 8 required elements
> and 5 of these are about who created the entry and when. The
> <FamilyName> of a person is required if a person is entered. It would
> be easy enough (and makes sense) to make an institution required and
> should not be any hardship for data entry folk since for a session, at
> least, the institution details will be the same and so entered just
This IMO needs further discussion. A collection can also be related
to a person, not exclusively to an organisation (private
collections!). Also, please keep in mind that information on
collections by themselves may be virtually useless if there is no
information stored about where the collection is stored (for access)
and whom to address for further information. The virtue of a database
is that you can search for information, but if only an extremely
limited set of data is required, this will seriously impede its use.
Please have a look at the setup of the metadatabase maintenance site.
It's easy to add collections to existing organisations or persons,
which only have to be created once.
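
The ownership rule described above (a collection must point to an organisation or a person, and private collections point to a person) could be sketched roughly as follows. This is only an illustration: the class and the field names `organisation_id` and `person_id` are assumptions, not the actual NLBIF metadatabase columns or NCD element names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Collection:
    name: str
    # Hypothetical foreign keys; the real NLBIF column names are not
    # given in this thread.
    organisation_id: Optional[int] = None
    person_id: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject orphaned collections: at least one owner must be set.
        if self.organisation_id is None and self.person_id is None:
            raise ValueError(
                "a collection must belong to an organisation or a person"
            )
```

With a check like this, a private collection is created with only `person_id` set, while a collection without any owner is refused at entry time rather than ending up unreachable in searches.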
> 5. We have copied some value lists from the NoDIT database to our
> database, e.g. keywords describing collections. However, the NCD
> standard has a broader scope than the NoDIT database and the NLBIF
> metadatabase. In the interface, I assume that many values too should
> be selected from pop-up menus, to guarantee a uniform database --
> e.g. AgentRole, CollectionPurpose, UnitOfMeasure, DigitalFileFormats
> to name but a few. To create such lists, we need to have these values
> stored in the database, so somehow these lists should be compiled. If
> multiple languages should be supported, we may need to have these
> values in languages other than English as well.
> ## NHT: These terminology lists and their association with the TDWG
> are the subject of a separate development being undertaken at the
> Smithsonian Institution by Carol Butler in association with Roger
> Hyam and Markus. They will result in sets of terms that can be used
> as pick-lists for those elements that should have just a few terms or
> for which consistency is important.
> We should see the first draft of these at the NCD Workshop, if not
> before. It would be good for anyone interested to make suggestions for
> such terms and their definition through this list to help Carol to
> compile them.
This is good news! I could start with the basic set we have compiled
for NLBIF and replace these with the NCD entries once these have been
determined. I assume these lists will be in English and will not be
translated to other languages?
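
As a rough sketch of starting with a provisional set and swapping in the NCD terms later: the element name `UnitOfMeasure` comes from the discussion above, but the terms themselves and both function names are made-up placeholders, not the official vocabulary or any existing code.

```python
from typing import Dict, Iterable, Set

# Provisional pick-lists keyed by element name. The terms are
# placeholders for illustration only, not official NCD terms.
PICK_LISTS: Dict[str, Set[str]] = {
    "UnitOfMeasure": {"specimens", "drawers", "boxes"},
}


def is_allowed(element: str, value: str) -> bool:
    """Return True if value is permitted for element.

    Elements without a pick-list accept any value.
    """
    allowed = PICK_LISTS.get(element)
    return allowed is None or value in allowed


def replace_pick_list(element: str, terms: Iterable[str]) -> None:
    """Swap the provisional terms for a newly agreed set."""
    PICK_LISTS[element] = set(terms)
```

Keeping the lists as data rather than hard-coding them in the interface means that replacing the basic set only requires one call per element once the agreed terms are available.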
> 2. More or less the same applies to Agents. My idea is that this is
> filled in once when setting up the database and that this is returned
> only when the database is accessed externally. But apparently more
> than one agent with one or more roles can be responsible for a single
> record, so possibly this solution is incomplete. However, many
> changes have to be made to the current database to implement agents
> at the record level. E.g. consider that a collection is appended to an
> organisation that has previously been created by another agent.
> ## NHT: You are right that someone entering new records should only
> need to record their details once for the session. The agents
> section of the NCD Header (which is based on the METS header) is
> intended to record metadata about the record itself, rather than
> the collection that the record describes. The date of record
> creation should default to "today" and be fixed. However, an
> agent can come along later and amend the record, maybe adding to
> it or correcting it. In this case the role of the person is
> Appending a collection to an existing organisation record should not
> affect the organisation record, since the relationship is
> established simply by recording the organisation ID in the
> record. Nothing is added to or changed in the organisation record.
Thanks for the explanation. The system tracks which records
(organisations, collections and persons) have been updated by whom
(by Users). The Agent information e.g. is fixed for each site providing
a web service on their local database. This could be entered once
when setting up the database. Correct me if I'm wrong!
> 3. Shouldn't PostalAddressText and PhysicalAddressText each have an
> independent zipcode and town? In reverse, is it necessary to use
> language attributes to towns and regions? We have stored this data in
> a non-language-dependent field.
> ## NHT: Not sure about this one - are there cases where an
> has a postal address in a different town from the organisation? This
> may be solved, as the annotation text suggests, by having the
> address text in the postal element. The ZIP code and town are
> for your wizzy Google Maps link and for folk to determine what
> they may visit in a particular area. So if we are agreed that only one
> of these is required, the documentation will need to make it clear
> that the ZIP code and Town refer to the actual organisation, rather
> than its postbox.
> While we are on the topic, does anyone have a preference for
> over "institution", or does it not matter?
Well, normally both towns are the same, but not always. From a
database point of view, it is not desirable to enter data that ideally
should be stored in separate fields into one field. E.g. it would
probably make Google Maps queries much more complicated if not
impossible, as addresses can be entered in various forms.
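
To illustrate why separate fields help, here is a minimal sketch of building a map search string from a structured physical address. The class, its field names and the example address are all assumptions for illustration; the function only produces a URL-encoded query parameter and does not call any particular mapping service.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class PhysicalAddress:
    # Hypothetical field names; one value per field, no free text.
    street: str
    zipcode: str
    town: str
    country: str


def map_query(addr: PhysicalAddress) -> str:
    """Build a URL-encoded 'q=...' parameter from the separate fields."""
    parts = [addr.street, f"{addr.zipcode} {addr.town}", addr.country]
    # Skip empty parts so missing fields do not leave stray commas.
    text = ", ".join(p.strip() for p in parts if p.strip())
    return urlencode({"q": text})


# Made-up example address.
example = PhysicalAddress("Example Street 1", "1000 AA", "Amsterdam", "Netherlands")
query = map_query(example)
```

With a single free-text address field the same query would first require parsing out the zipcode and town, which is exactly the step that becomes unreliable when addresses are entered in various forms.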