Hi Ruud,
Many thanks for providing the URL to the test site - don't worry about its looks, we are all aware that the functionality needs to be in place first.
For your questions below:
- We decided to drop <PercentageDatabased> as an unhelpful element. It would need constant updating, which we thought would not happen. The URLs in the Related Materials Group could alert a user to the fact that a database exists, and the user could then go directly to it. As to EAD, Barbara Mathé at the AMNH has prepared a mapping between EAD and NCD which you may find helpful once it reaches its final form.
- Yes, this approach sounds good. I did try to put some dummy data into the test site, but I got thrown out with an SQL error. No matter, having an organisation and a person as a precursor to a collection description sounds perfectly reasonable and is similar to the approach taken by NoDIT.
- Yes, the lists will be in English, but we may be able to get other European language equivalents from GEMET? One for us to discuss with Roger.
- There may be more than one agent per site that will be creating and editing records. At my own place there may be a dozen or more authors.
Perhaps we should have an "open issues" page on the TDWG wiki, otherwise it is going to become difficult to keep track of queries, responses and decisions on emails?
All best wishes, Neil
-----Original Message-----
From: Ruud Altenburg [mailto:ruud@eti.uva.nl]
Sent: 20 April 2007 15:19
To: Neil Thomson
Cc: tdwg-ncd@lists.tdwg.org
Subject: Re: NCD toolkit
Hallo Neil,
I don't think I'm on the list you cc'ed, so if this bounces could you please forward my reply? I have combined both of your answers of today into one mail.
But before I get to that I have an important announcement: I have made our NLBIF metadatabase maintenance site available on a public server, so you and other NCD participants can have a look. Please note that my first concern has been the functionality, not its looks! This should easily be improved with style sheets but frankly I never heard complaints from our ETI people about its 'Spartan look' ; )
The system is available at http://www.nlbif.nl/ncd-preview/
It currently contains no data, but anyone can have a try at adding/updating/deleting records. The login is test/test. A final note: I would not play too much with the "Send mailing" option, as this may seriously annoy people you have entered as test persons! :)
Then the Q&A revisited. I only reply to issues that have remained open or that I feel need some more discussion.
- The contract mentions import from NoDIT databases as a requirement (plus the EAD standard, which I'm not familiar with). To cater for this we have added several fields to the metadatabase, otherwise we would not have been able to migrate all data. However, strictly speaking the metadatabase therefore does not match the NCD standard one-to-one. How do we deal with this?
## NHT: Would you be able to let me have a list of the elements that do not match, please? Then we can evaluate whether they should be added to NCD or whether it does not matter. Since NCD was derived from the schema that underlies NoDIT they should be very similar. Markus may be able to advise on this, since he built NoDIT for BioCASE.
These are elements such as PercentageDatabased. We could maintain them in the database (so they are not lost) but omit them from the web service.
- Which fields in the NCD standard are required and which are not? I assume ones which have a closed box in XML Spy, but this may be too loose for a database setup. In the NLBIF metadatabase, a collection must be connected to either a person or an organisation, otherwise it would be orphaned. If I interpret the NCD schema well, information about a collection can be entered without providing information about the organisation or person to which/whom it belongs.
## NHT: You are right in your interpretation of the schema. We have tried to keep the number of required elements to a minimum to encourage data entry - there are currently only 8 required elements and 5 of these are about who created the entry and when. The <FamilyName> of a person is required if a person is entered. It would be easy enough (and makes sense) to make an institution required, and this should not be any hardship for data entry folk since, for a session at least, the institution details will be the same and so entered just once.
This IMO needs further discussion. A collection can also be related to a person, not exclusively to an organisation (private collections!). Also, please keep in mind that information on collections by themselves may be virtually useless if there is no information stored about where the collection is stored (for access) and whom to address for further information. The virtue of a database is that you can search for information, but if only an extremely limited set of data is required, this will seriously impede its use.
Please have a look at the setup of the metadatabase maintenance site. It's easy to add collections to existing organisations or persons, which only have to be created once.
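To illustrate the "no orphaned collections" rule discussed above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical (they are not taken from the NLBIF metadatabase or the NCD schema), and the rule is read here as "at least one of organisation or person must be given".

```python
import sqlite3

# Hypothetical table and column names; not the actual NLBIF or NCD schema.
SCHEMA = """
CREATE TABLE organisation (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person (id INTEGER PRIMARY KEY, family_name TEXT NOT NULL);
CREATE TABLE collection (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    organisation_id INTEGER REFERENCES organisation(id),
    person_id INTEGER REFERENCES person(id),
    -- A collection needs at least one owner, so it can never be orphaned.
    CHECK (organisation_id IS NOT NULL OR person_id IS NOT NULL)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def add_collection(conn, title, organisation_id=None, person_id=None):
    """Insert a collection; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO collection (title, organisation_id, person_id) "
                "VALUES (?, ?, ?)",
                (title, organisation_id, person_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this setup an organisation or person is created once, and any number of collections can then be attached to it; a collection without either owner is refused by the database itself rather than by the interface.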
- We have copied some value lists from the NoDIT database to our database, e.g. keywords describing collections. However, the NCD standard has a broader scope than the NoDIT database and the NLBIF metadatabase. In the interface, I assume that many values should also be selected from pop-up menus, to guarantee a uniform database -- e.g. AgentRole, CollectionPurpose, UnitOfMeasure, DigitalFileFormats to name but a few. To create such lists, we need to have these values stored in the database, so somehow these lists should be compiled. If multiple languages should be supported, we may need to have these values in languages other than English as well.
## NHT: These terminology lists and their association with the TDWG ontology are the subject of a separate development being undertaken at the Smithsonian Institution by Carol Butler in association with Roger Hyam and Markus. They will result in sets of terms that can be used as pick-lists for those elements that should have just a few terms or for which consistency is important. We should see the first draft of these at the NCD Workshop, if not before. It would be good for anyone interested to make suggestions for such terms and their definition through this list to help Carol to compile them.
This is good news! I could start with the basic set we have compiled for NLBIF and replace these with the NCD entries once these have been determined. I assume these lists will be in English and will not be translated to other languages?
- More or less the same applies to Agents. My idea is that this is filled in once when setting up the database and that this is returned only when the database is accessed externally. But apparently more than one agent, with one or more roles, can be responsible for a single record, so this solution may be incomplete. However, many changes would have to be made to the current database to implement agents at record level. E.g. consider that a collection is appended to an organisation that has previously been created by another agent.
## NHT: You are right that someone entering new records should only need to record their details once for the session. The agents section of the NCD Header (which is based on the METS header) is intended to record metadata about the record itself, rather than the collection that the record describes. The date of record creation should default to "today" and be fixed. However, an agent can come along later and amend the record, maybe adding to it or correcting it. In this case the role of the person is "editor". Appending a collection to an existing organisation record should not affect the organisation record, since the relationship is established simply by recording the organisation ID in the collection record. Nothing is added to or changed in the organisation record.
Thanks for the explanation. The system tracks which records (organisations, collections and persons) have been updated by whom (by Users). The Agent information, for example, is fixed for each site providing a web service on its local database. This could be entered once when setting up the database. Correct me if I'm wrong!
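The record-level metadata described in this exchange could be sketched roughly as follows. This is only an illustration under assumptions: the class and field names (RecordHeader, AgentAction, organisation_id and so on) are invented here and are not the NCD element names. It shows a creation date that defaults to today and cannot be changed afterwards, later amendments recorded with the role "editor", and a collection linked to an organisation purely by storing the organisation's ID.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentAction:
    """One agent acting on a record in a given role (hypothetical structure)."""
    agent: str
    role: str  # "creator" or "editor" in this sketch
    on: date


class RecordHeader:
    """Metadata about the record itself, not about the collection it describes."""

    def __init__(self, creator, created=None):
        self._created = created or date.today()  # defaults to today
        self.agents = [AgentAction(creator, "creator", self._created)]

    @property
    def created(self):
        # Read-only: no setter, so the creation date stays fixed.
        return self._created

    def amend(self, agent, on=None):
        """Record that an agent later added to or corrected the record."""
        self.agents.append(AgentAction(agent, "editor", on or date.today()))


@dataclass
class Organisation:
    id: int
    name: str
    header: RecordHeader


@dataclass
class Collection:
    id: int
    title: str
    organisation_id: int  # the only link to the organisation record
    header: RecordHeader
```

Attaching a new collection to an existing organisation only touches the collection's own header; the organisation record, and the list of agents who created or edited it, stays as it was.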
- Shouldn't PostalAddressText and PhysicalAddressText each have an independent zipcode and town? Conversely, is it necessary to use language attributes for towns and regions? We have stored this data in a non-language-dependent field.
## NHT: Not sure about this one - are there cases where an organisation has a postal address in a different town from the organisation? This may be solved, as the annotation text suggests, by having the complete address text in the postal element. The ZIP code and town are intended for your wizzy Google Maps link and for folk to determine what collections they may visit in a particular area. So if we are agreed that only one of these is required, the documentation will need to make it clear that the ZIP code and Town refer to the actual organisation, rather than its postbox. While we are on the topic, does anyone have a preference for "organisation" over "institution", or does it not matter?
Well, normally both towns are the same, but not always. From a database point of view, it is not desirable to enter data that ideally should be stored in separate fields into one field. E.g. it would probably make Google Maps queries much more complicated, if not impossible, as addresses can be entered in various forms.
Best regards,
Ruud