[tdwg-ncd] RE: NCD status summary

Neil Thomson N.Thomson at nhm.ac.uk
Tue Jun 5 16:12:01 CEST 2007


Hi Markus,

Comments on stripped down version below ...

Neil 
-----

> ##NT: Only the NCDID needs to be generated, probably as an LSID.  
> All other identifiers are taken from elsewhere (i.e. the "source")

so my assumption is correct. One NCD id plus any number of associated
IDs that NCD software doesn't need to understand. That's what the current
schema reflects.

NT- Correct: although it may need to understand <parentCollectionID>

> ------------------
>
> - are multiple parent collections & parent institutions allowed?
> currently this is not allowed
>
> ##NT: Yes, they should be allowed
both, multiple institutions and multiple parent collections? I know  
we had this in biocase, but not many people made use of it. And this  
is the kind of change in data structure which affects our software a  
lot. With a single hierarchy you have a nice tree, with multiple  
parents you get a nasty graph which is hard to present. If this is
needed only in 1% of the cases I actually think we should ignore it,
people should rather indicate this in notes I believe. But if it's an
important and frequent problem we have to deal with it, sure. What do  
you reckon?

NT- Yes, Ruud also says he cannot cope with this and is unable to think of any examples, so maybe we should not bother adding it and just leave the status quo. 
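
A rough illustration of the tree-versus-graph point, as a minimal sketch only. The identifiers ("inst:A", "coll:botany", ...) and the function names are made up for the example and are not part of the NCD schema. With one parent per record a plain indented tree falls out directly; with a list of parents a record can appear under several branches and no single indentation is correct.

```python
from collections import defaultdict

def build_tree(parent_of):
    """parent_of maps a record id to its single parent id (None for a root)."""
    children = defaultdict(list)
    roots = []
    for record_id, parent_id in parent_of.items():
        if parent_id is None:
            roots.append(record_id)
        else:
            children[parent_id].append(record_id)
    return roots, children

def render(ids, children, depth=0):
    """Return the hierarchy as indented lines, one record per line."""
    lines = []
    for record_id in sorted(ids):
        lines.append("  " * depth + record_id)
        lines.extend(render(children.get(record_id, []), children, depth + 1))
    return lines

def has_multiple_parents(parents_of):
    """parents_of maps a record id to a list of parent ids."""
    return any(len(parents) > 1 for parents in parents_of.values())

# Hypothetical single-parent data: an institution, a collection, a sub-collection.
single = {
    "inst:A": None,
    "coll:botany": "inst:A",
    "coll:herbarium": "coll:botany",
}
roots, children = build_tree(single)
print("\n".join(render(roots, children)))
```

Under the status quo (single parent) `render` is all the presentation layer needs. Once `has_multiple_parents` can return True, the display has to decide whether to duplicate a record under each parent or to draw a graph.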

> ------------------
>
> - what is "richRecordSearchString" in contrast to
> "collectionDatabaseURL"? what about "objectAccessService" and simply
> "furtherInformation" ?
>
> ##NT: The "collectionDatabaseURL" is the URL of the database that  
> holds item-level data for the collection being described in this  
> NCD record. The "richRecordSearchString" is what should be used in  
> that database to get directly to the relevant record. To take an  
> example, there will be an NCD record describing a herbarium. The  
> "collectionDatabaseURL" will be to the Index Herbariorum  
> WebDatabase and the "richRecordSearchString" will contain the coden  
> for the herbarium so that the full description of the herbarium can  
> be found by adding the two together. I have no objection to  
> "objectAccessService" for the latter, but the former is more  
> explicit than "furtherInformation".

Now I'm confused. I assumed we deal with 2 different things here.

A) link to a service that gives access to individual collection items.
This doesn't necessarily have to be a database, so this is why I
suggest renaming collectionDatabaseURL to collectionObjectAccess
(or collectionItemAccess, objectLevelAccess?)

NT- OK, I can go with that.

B) further, possibly more detailed, information about this collection
meta record. That's why I suggested furtherInformation (or the Dublin
Core equivalent)

NT- This is probably not needed, since there is a <relatedResources> element just above.

If I understand you correctly the richRecordSearchString will be used
together with the database url? Isn't that mixing item-level with
collection meta-level?

NT- Yes, that was the intention. An objective of NCD records is to point to item-level databases where these exist. Perhaps we should leave it as just the <collectionObjectAccess> element, since folk can then do their own search when they get there. The <richRecordSearchString> element would require too much maintenance if the database system changed.
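
For reference, the "adding the two together" idea can be sketched as below. This assumes it means plain string concatenation, which the thread does not spell out. The base URL is a placeholder on example.org, not the real Index Herbariorum address, and `rich_record_url` is a made-up helper name.

```python
from urllib.parse import quote

def rich_record_url(collection_database_url, rich_record_search_string):
    """Join the database URL and the search string into one link.

    Assumption: the stored URL already ends where the search value
    belongs (e.g. "...?code="), so the two parts are simply concatenated.
    """
    return collection_database_url + quote(rich_record_search_string, safe="")

# Placeholder values for illustration only.
base = "https://example.org/herbaria/search?code="
print(rich_record_url(base, "BM"))
```

The sketch also shows the maintenance worry: the stored search string only works while the database keeps the same query format, so a change on the database side would invalidate every stored value.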

> -------------------
>
> - I have replaced all address information with vcard elements. That
> is vc:N for person names and vc:ADR for addresses. Some existing NCD
> elements do not exist in vcard though. How shall we treat them? Right
> now I've simply left them as they were before in addition to the vcard
> elements. It looks a bit weird, but it's ok I believe. In particular
> that is byear and deathyear for persons (full birthday exists in
> vcard) and fax, email, logourl for institutional addresses. Maybe we
> can remove the email and fax and add a website instead? The
> collections have real contacts anyway already, so institutions don't
> really need this, do they?
>
> ##NT: Yes, institutions do require contact details too so I would  
> prefer that these elements remain. If they are not present as vc:  
> elements then can they just be ncd: elements?
ok. but isn't fax+email a pretty limited set of potential contact
methods? what about telephone and websites, aren't those even more
important?

NT- Your suggestion at the foot, to have a single contact class for use with either a collection or an institution or both sounds like a good way to go.

> -------------------
>
> - many elements cover the same ideas that dublin core covers. Should
> we use the explicit dublin core elements for title, description,
> other resources, keyword, rights, modified, created and citation? I
> would be in favor of doing so. It would mean that NCD mainly is about
> the extra metadata bits that dublin core doesn't cover!
>
> ##NT: We certainly could do this and Doug has produced an NCD <-->  
> DC mapping. Since this would just be a re-labelling exercise we can  
> have this as a discussion item at the workshop. NCD would then be  
> an application profile, re-using elements from mets:, dc: and vc:  
> along with its own.

I think that's a good approach. Please let's put this on the agenda for
the workshop

NT- OK - there is already a session on mappings, so it can be discussed then.

> -------------------
>
> *** ONTOLOGY ***
> - collection extent is a pure integer without a unit definition. I
> know this is bad, but having complex data (i.e. with multiple
> elements or attributes) in an ontology means I would have to define a
> new class just for this. And Roger is very keen on keeping the number
> of classes low.
>
> ##NT: In this case the single attribute should be a string not an  
> integer. Describing a collection extent as "2" is spectacularly  
> meaningless. If the string is "2 shelves" or "2 cupboards" or "2  
> rooms" it means a little bit more. Not much, but enough for someone  
> to assess whether it is worth a visit or not.

totally right. I'll do that

NT- OK, thanks.

> --------------------
> - all keywords lack the strength flag because I have created a single
> keyword class that is used by all keyword types. And the strength
> flag didn't exist for all keywords. Should it or should we drop it?
>
> ##NT: I would like to keep the strength flag if possible for those
> elements where it makes sense - it would be good to indicate where
> the collection owner believes the collection to be particularly
> strong.

well. I added strength now to the "DefinedTerm" class so it applies
to all keywords. But it might be incorrect, because now the class
describes terms that can be traced back to where they come from
(source, idinsource). If we also add strength, then the strength
really is a property of the collection (or the term in regard to the
collection), not the term itself. To model this in an easier way I
will create 2 properties on the collection class for each keyword
that carries strength, e.g. TaxonCoverage and TaxonCoverageStrength. Then
both properties can use the DefinedTerm class (which only describes a
term, no matter where it's used).

Looks as if this problem is solved!

NT- Excellent! 
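
One way to picture the modelling decision above, as a hedged sketch rather than the actual ontology. The Python class and field names are illustrative, and it assumes that TaxonCoverageStrength simply lists the terms the owner regards as strengths. The term carries only its provenance; strength lives on the collection.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class DefinedTerm:
    """A term traceable to its origin; knows nothing about any collection."""
    term: str
    source: str
    id_in_source: str

@dataclass
class Collection:
    """Strength is expressed by a second property, not by the term itself."""
    ncd_id: str
    taxon_coverage: List[DefinedTerm] = field(default_factory=list)
    taxon_coverage_strength: List[DefinedTerm] = field(default_factory=list)

# Hypothetical values for illustration only.
orchids = DefinedTerm("Orchidaceae", "example-vocabulary", "t-001")
ferns = DefinedTerm("Polypodiopsida", "example-vocabulary", "t-002")

herbarium = Collection(
    ncd_id="ncd:example-1",
    taxon_coverage=[orchids, ferns],
    taxon_coverage_strength=[orchids],
)
print([t.term for t in herbarium.taxon_coverage_strength])
```

The same `DefinedTerm` instance can be reused by another collection that does not consider it a strength, which is the point of keeping strength off the term.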

> --------------------
>
> - a person's role in regard to a specific collection is missing. I
> think we need it, but then again we need at least a basic controlled
> vocabulary
>
> ##NT: Yes, we do need this. Example terms would be owner,  
> administrator, collector, preparator etc.

ok, I'll start with these roles then. How many roles do you expect? If
it's fewer than 10 we can avoid new classes and create separate
properties for the different roles. Just like I did with strength in
the keywords. I've added those 4 person types to the ontology now.

NT- Carol will be collecting terms for use with popups so any new suggestions should be forwarded to her.

+++++++++++++++++++++++++++++++++++++
I think we need to still take a decision on:

  - multiple parent hierarchy. maybe we leave it single and discuss  
it at the workshop?

NT- Leave as is.

  - wording and meaning of "richRecordSearchString" vs  
"collectionDatabaseURL". Can we agree to use "collectionObjectAccess"  
and "furtherDetails" both of which are URIs?

NT- Agree first, probably drop second as already covered

  - CONTACTS. Wouldn't it be easier if we have a single contact class
(complextype) i.e. a vcard contact, that can be used for institution
contacts as well as collection contacts? I would prefer this option
to the strict selection of relevant contact fields for different
purposes. The editor interface can still restrict this if we think
that's needed. But NCD would be able to exchange an entire vcard
record for a contact no matter where it is being used.

NT- Agreed

+++++++++++++++++++++++++++++++++++++

changes to the schema when cleaning up:

about an institution:
- PostalAddressText and PhysicalAddressText removed. An institution
can now have multiple vcard addresses, each of which can specify a "type"
such as postal, parcel, home, work, pref

NT- Agreed

So, Markus, with these changes are we able to announce a v1.0 of the schema?

All best wishes,
Neil




