[tdwg-ncd] RE: NCD status summary
Hi Markus,
Sorry for the delay in responding to this. I have interspersed comments below and Cc:'d in the RAVNS since there are some changes here that are potential workshop agenda items.
Thanks for your help with this ...
-----Original Message----- From: Markus Döring [mailto:m.doering@BGBM.org] Sent: 01 June 2007 14:51 To: Neil Thomson Cc: Ruud Altenburg Subject: NCD status summary
Neil, I have been working on the schema and the ontology lately. They don't match 100%. In my opinion it would be best to focus on one specification only, instead of trying to maintain and keep in sync two specifications for different target platforms (XML + RDF). If we do need both, I strongly suggest doing the real modelling in an abstract form with UML and deriving the schema and the ontology from that! You can also create a suitable database schema from UML automatically if you want. I'm not sure how to go about this decision. Maybe it's best to discuss this during the workshop?
## NT: We should complete the schema v1.0 at least before transferring to another technology. Maybe you could cover the reasons for moving and how it will affect future developments in your session at the workshop? -----------------------
In regard to the PHP toolkit, I think this decision doesn't have a very strong impact as long as the toolkit doesn't implement import/export via XML or RDF yet. I suggest working with the schema for now because it is 99% complete. On the other hand, I believe the RDF version is the preferred TDWG format, so we should use this ontology as the reference. And as I said, it's not exactly the same as the NCD schema, as it is part of the TDWG ontology and uses external ontologies too (like vCard).
## NT: ok, that's good in that it should not hold Ruud up too much. ---------
Detailed issues that still need to be resolved are:
*** SCHEMA *** - what IDs should exist? 1 NCD guid (lsid or whatever) + many others with source. Is this correct?
##NT: Only the NCDID needs to be generated, probably as an LSID. All other identifiers are taken from elsewhere (i.e. the "source") ------------------
- are multiple parent collections & parent institutions allowed? Currently they are not.
##NT: Yes, they should be allowed ------------------
- person roles not fixed or at least core suggested? I think we need this
##NT: Yes, we do need this ------------------
- what is "richRecordSearchString" in contrast to "collectionDatabaseURL"? what about "objectAccessService" and simply "furtherInformation" ?
##NT: The "collectionDatabaseURL" is the URL of the database that holds item-level data for the collection being described in this NCD record. The "richRecordSearchString" is what should be used in that database to get directly to the relevant record. To take an example, there will be an NCD record describing a herbarium. The "collectionDatabaseURL" will be to the Index Herbariorum WebDatabase and the "richRecordSearchString" will contain the coden for the herbarium so that the full description of the herbarium can be found by adding the two together. I have no objection to "objectAccessService" for the latter, but the former is more explicit than "furtherInformation". -------------------
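A minimal sketch of what "adding the two together" could look like. The thread does not say how the two values are combined, so this assumes the search string is passed as a query parameter; the base URL, the parameter name `code` and the coden `XYZ` are all hypothetical placeholders, not the real Index Herbariorum interface.

```python
from urllib.parse import urlencode


def rich_record_url(collection_database_url: str,
                    rich_record_search_string: str,
                    param: str = "q") -> str:
    """Combine the database URL and the search string into one link.

    Assumption: the target database accepts the search string as a
    query parameter named `param`. A real service may need another scheme.
    """
    # Append with "&" if the base URL already carries a query string.
    separator = "&" if "?" in collection_database_url else "?"
    query = urlencode({param: rich_record_search_string})
    return collection_database_url + separator + query


# Hypothetical values for an NCD record describing a herbarium.
link = rich_record_url("https://example.org/ih/search", "XYZ", param="code")
```

The sketch also shows why the pair is fragile: if the database changes its query syntax, every stored search string has to be revisited.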
- I have replaced all address information with vCard elements. That is vc:N for person names and vc:ADR for addresses. Some existing NCD elements do not exist in vCard though. How shall we treat them? Right now I've simply left them as they were before, in addition to the vCard elements. It looks a bit weird, but it's OK I believe. In particular these are byear and deathyear for persons (a full birthday exists in vCard) and fax, email, logourl for institutional addresses. Maybe we can remove the email and fax and add a website instead? The collections have real contacts already anyway, so institutions don't really need this, do they?
##NT: Yes, institutions do require contact details too so I would prefer that these elements remain. If they are not present as vc: elements then can they just be ncd: elements? -------------------
- many elements cover the same ideas that dublin core covers. Should we use the explicit dublin core elements for title, description, other resources, keyword, rights, modified, created and citation? I would be in favor to do so. It would mean that NCD mainly is about the extra metadata bits that dublin core doesn't cover!
##NT: We certainly could do this and Doug has produced an NCD <--> DC mapping. Since this would just be a re-labelling exercise we can have this as a discussion item at the workshop. NCD would then be an application profile, re-using elements from mets:, dc: and vc: along with its own. -------------------
*** ONTOLOGY *** - collection extent is a pure integer without a unit definition. I know this is bad, but having complex data (i.e. with multiple elements or attributes) in an ontology means I would have to define a new class just for this. And Roger is very keen on keeping the number of classes low.
##NT: In this case the single attribute should be a string not an integer. Describing a collection extent as "2" is spectacularly meaningless. If the string is "2 shelves" or "2 cupboards" or "2 rooms" it means a little bit more. Not much, but enough for someone to assess whether it is worth a visit or not. --------------------
- all keywords lack the strength flag because I have created a single keyword class that is used by all keyword types, and the strength flag didn't exist for all keywords. Should it, or should we drop it?
##NT: I would like to keep the strength flag if possible for those elements where it makes sense - it would be good to indicate where the collection owner believes the collection to be particularly strong. --------------------
- a persons role in regard to a specific collection is missing. I think we need it, but then again we need at least a basic controlled vocabulary
##NT: Yes, we do need this. Example terms would be owner, administrator, collector, preparator etc. -------------------
I think that's all there is. It looks like a lot, but most issues are of a cosmetic nature and do not change the abstract standard. Some, like multiple parents, do of course.
I would be glad if you could reply to these questions as soon as possible, even if only partly, because I hope to find some time over the weekend to settle some of them.
##NT: Thanks again for looking at these details. It was always known that changes would be needed for the toolkit to work and others may well come to light during testing. We can also talk that through at the workshop, since I think that some of the RAVNS are keen to help with the testing.
Is everyone in agreement with the changes proposed above? We need to quickly move to v1.0 so that Ruud can strut his stuff. Apologies again for the unintended out-time recently.
All best wishes, Neil -----------
best wishes -- Markus
http://rs.tdwg.org/ontology/voc/ http://rs.tdwg.org/ncd/dev/schema/ncd.xsd
Neil, my comments inline... There is some documentation of the latest schema changes at the very bottom, as well as a summary of the decisions that still need to be taken. Please comment! -- Markus
Neil, I have been working on the schema and the ontology lately. They don't match a 100%. In my opinion it would be best to focus on one specification only instead of trying to maintain and keep in sync 2 specifications for different target platforms (XML+RDF). If we need to do that, I strongly suggest to do the real modelling in an abstract form with UML and derive a schema and ontology from that! You can also create a suitable database schema from UML automatically if you want. Im not sure how to go about this decision. Maybe its best to discuss this during the workshop?
## NT: We should complete the schema v1.0 at least before transferring to another technology. Maybe you could cover the reasons for moving and how it will affect future developments in your session at the workshop?
Roger is giving reasons why TDWG aims at RDF over XML schemas
Detailed issues that still need to be resolved are:
*** SCHEMA ***
- what IDs should exist? 1 NCD guid (lsid or whatever) + many others
with source. Is this correct?
##NT: Only the NCDID needs to be generated, probably as an LSID. All other identifiers are taken from elsewhere (i.e. the "source")
So my assumption is correct: one NCD ID plus any number of associated IDs that NCD software doesn't need to understand. That's what the current schema reflects.
- are multiple parent collections & parent institutions allowed?
currently this is not
##NT: Yes, they should be allowed
Both multiple institutions and multiple parent collections? I know we had this in BioCASE, but not many people made use of it. And this is the kind of change in data structure which affects our software a lot. With a single hierarchy you have a nice tree; with multiple parents you get a nasty graph which is hard to present. If this is needed in only 1% of cases, I actually think we should ignore it; people should rather indicate this in the notes, I believe. But if it's an important and frequent problem we have to deal with it, sure. What do you reckon?
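To illustrate the tree-versus-graph point, here is a small sketch. The collection names and the dictionary layout are invented for the example and are not taken from the NCD schema. With one parent per collection, finding the ancestors is a simple walk up a chain; with several parents it becomes a graph traversal that has to guard against visiting shared ancestors twice.

```python
from typing import Dict, List, Optional

# Single-parent model: each collection has at most one parent (a tree).
single_parent: Dict[str, Optional[str]] = {
    "herbarium": None,
    "bryophytes": "herbarium",
    "mosses": "bryophytes",
}


def ancestors_tree(coll_id: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Walk up the single chain of parents, nearest ancestor first."""
    chain: List[str] = []
    parent = parents[coll_id]
    while parent is not None:
        chain.append(parent)
        parent = parents[parent]
    return chain


# Multi-parent model: each collection may list several parents (a graph).
multi_parent: Dict[str, List[str]] = {
    "herbarium": [],
    "type-specimens": [],
    "bryophytes": ["herbarium"],
    "moss-types": ["bryophytes", "type-specimens"],
}


def ancestors_graph(coll_id: str, parents: Dict[str, List[str]]) -> List[str]:
    """Collect all ancestors; order depends on traversal, so no single 'path' exists."""
    seen: List[str] = []
    stack = list(parents[coll_id])
    while stack:
        node = stack.pop()
        if node not in seen:  # a shared ancestor may be reached twice
            seen.append(node)
            stack.extend(parents[node])
    return seen
```

In the tree case a breadcrumb-style display falls out of `ancestors_tree` directly; in the graph case there is no single path to show, which is the presentation problem described above.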
- person roles not fixed or at least core suggested? I think we need
this
##NT: Yes, we do need this
so I will use the ones below
- what is "richRecordSearchString" in contrast to
"collectionDatabaseURL"? what about "objectAccessService" and simply "furtherInformation" ?
##NT: The "collectionDatabaseURL" is the URL of the database that holds item-level data for the collection being described in this NCD record. The "richRecordSearchString" is what should be used in that database to get directly to the relevant record. To take an example, there will be an NCD record describing a herbarium. The "collectionDatabaseURL" will be to the Index Herbariorum WebDatabase and the "richRecordSearchString" will contain the coden for the herbarium so that the full description of the herbarium can be found by adding the two together. I have no objection to "objectAccessService" for the latter, but the former is more explicit than "furtherInformation".
Now I'm confused. I assumed we were dealing with two different things here.
A) A link to a service that gives access to individual collection items. This doesn't necessarily have to be a database, which is why I suggest renaming collectionDatabaseURL to collectionObjectAccess (or collectionItemAccess, objectLevelAccess?)
B) Further, possibly more detailed, information about this collection meta record. That's why I suggested furtherInformation (or the Dublin Core equivalent).
If I understand you correctly, the richRecordSearchString will be used together with the database URL? Isn't that mixing item level with collection meta level?
- I have replaced all address information with vcard elements. That
is vc:N for person names and vc:ADR for addresses. Some existing NCD elements do not exist in vcard though. How shall we treat them? Right now Ive simply left them as they were before in addition the vcard elements. It looks a bit weird, but its ok I believe. In particular that is byear and deathyear for persons (full birthday exists in vcard) and fax, email, logourl for institutional addresses. Maybe we can remove the email and fax and add a website instead? The collection have real contacts anyway already, so institutions dont really need this, do they?
##NT: Yes, institutions do require contact details too so I would prefer that these elements remain. If they are not present as vc: elements then can they just be ncd: elements?
OK. But isn't fax + email a pretty limited set of potential contact methods? What about telephone and websites, aren't those even more important?
- many elements cover the same ideas that dublin core covers. Should
we use the explicit dublin core elements for title, description, other resources, keyword, rights, modified, created and citation? I would be in favor to do so. It would mean that NCD mainly is about the extra metadata bits that dublin core doesn't cover!
##NT: We certainly could do this and Doug has produced an NCD <--> DC mapping. Since this would just be a re-labelling exercise we can have this as a discussion item at the workshop. NCD would then be an application profile, re-using elements from mets:, dc: and vc: along with its own.
I think that's a good approach. Please let's put this on the agenda for the workshop.
*** ONTOLOGY ***
- collection extent is a pure integer without a unit definition. I
know this is bad, but having complex data (i.e. with multiple elements or attributes) in an ontology means I would have to define a new class just for this. And Roger is very keen on keeping the number of classes low.
##NT: In this case the single attribute should be a string not an integer. Describing a collection extent as "2" is spectacularly meaningless. If the string is "2 shelves" or "2 cupboards" or "2 rooms" it means a little bit more. Not much, but enough for someone to assess whether it is worth a visit or not.
Totally right. I'll do that.
- all keywords lack the strength flag because I have created a single
keyword class that is used by all keyword types, and the strength flag didn't exist for all keywords. Should it, or should we drop it?
##NT: I would like to keep the strength flag if possible for those elements where it makes sense - it would be good to indicate where the collection owner believes the collection to be particularly strong.
Well, I have now added strength to the "DefinedTerm" class so it applies to all keywords. But that might be incorrect, because the class describes terms that can be traced back to where they come from (source, idinsource). If we also add strength there, we misplace it: the strength really is a property of the collection (or of the term in regard to the collection), not of the term itself. To model this more easily I will create two properties on the collection class for each keyword that carries a strength, e.g. TaxonCoverage and TaxonCoverageStrength. Both properties can then use the DefinedTerm class (which only describes a term, no matter where it's used). Looks as if this problem is solved!
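One possible reading of this modelling choice, sketched with plain Python classes rather than the actual ontology. The class and attribute names follow the ones mentioned above, but the structure, the `add_taxon` helper and the example values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DefinedTerm:
    """A term traceable to its origin; it knows nothing about any collection."""
    label: str
    source: str
    id_in_source: str


@dataclass
class Collection:
    ncd_id: str
    # All taxa the collection covers.
    taxon_coverage: List[DefinedTerm] = field(default_factory=list)
    # The subset of terms where the owner considers the collection strong.
    taxon_coverage_strength: List[DefinedTerm] = field(default_factory=list)

    def add_taxon(self, term: DefinedTerm, strong: bool = False) -> None:
        """Hypothetical helper: record a term and, optionally, its strength."""
        self.taxon_coverage.append(term)
        if strong:
            self.taxon_coverage_strength.append(term)


# Placeholder identifiers and vocabulary, not real NCD data.
collection = Collection("urn:lsid:example.org:ncd:1")
mosses = DefinedTerm("Bryophyta", "ExampleThesaurus", "42")
fungi = DefinedTerm("Fungi", "ExampleThesaurus", "7")
collection.add_taxon(mosses, strong=True)
collection.add_taxon(fungi)
```

The same `DefinedTerm` instance can be reused by any number of collections, each of which decides for itself whether the term marks a strength.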
- a persons role in regard to a specific collection is missing. I
think we need it, but then again we need at least a basic controlled vocabulary
##NT: Yes, we do need this. Example terms would be owner, administrator, collector, preparator etc.
OK, I'll start with these roles then. How many roles do you expect? If it's fewer than 10 we can avoid new classes and create separate properties for the different roles, just like I did with strength in the keywords. I've added those 4 person types to the ontology now.
+++++++++++++++++++++++++++++++++++++ I think we need to still take a decision on:
- multiple parent hierarchy. maybe we leave it single and discuss it at the workshop?
- wording and meaning of "richRecordSearchString" vs "collectionDatabaseURL". Can we agree to use "collectionObjectAccess" and "furtherDetails" both of which are URIs?
- CONTACTS. Wouldn't it be easier if we had a single contact class (complexType), i.e. a vCard contact, that can be used for institution contacts as well as collection contacts? I would prefer this option to a strict selection of relevant contact fields for different purposes. The editor interface can still restrict this if we think that's needed. But NCD would be able to exchange an entire vCard record for a contact no matter where it is being used.
+++++++++++++++++++++++++++++++++++++
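A rough sketch of the single contact class idea, again in plain Python rather than XML Schema. The field names are loosely inspired by vCard but are not the actual vCard or NCD element names, and the fallback rule in `effective_contacts` is an illustrative assumption, not something agreed in this thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Address:
    type: str  # e.g. "postal", "parcel", "work"
    street: str
    locality: str
    country: str


@dataclass
class Contact:
    """One reusable contact structure for institutions and collections alike."""
    full_name: str
    email: Optional[str] = None
    telephone: Optional[str] = None
    fax: Optional[str] = None
    url: Optional[str] = None
    addresses: List[Address] = field(default_factory=list)


@dataclass
class Institution:
    name: str
    contacts: List[Contact] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    institution: Institution
    contacts: List[Contact] = field(default_factory=list)


def effective_contacts(collection: Collection) -> List[Contact]:
    """Assumed rule: use the collection's own contacts, else the institution's."""
    return collection.contacts or collection.institution.contacts


# Invented example data.
museum = Institution("Example Museum", [Contact("Front Desk", email="info@example.org")])
herbarium = Collection("Herbarium", museum)
insects = Collection("Insects", museum, [Contact("A. Curator", telephone="+00 000 0000")])
```

An editor interface could still hide fields that are irrelevant in a given context, while the exchanged record keeps the full structure.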
changes to the schema when cleaning up:
about an institution: - PostalAddressText and PhysicalAddressText removed. An institution can now have multiple vCard addresses, each of which allows a "type" to be specified, such as postal, parcel, home, work, pref
I just looked through the schema again and noticed that there is no contact defined for a collection. The institutional contact is the only one we have. This is probably because the institution was previously not "normalised" but part of a collection record, so for every collection the institution could have provided a different contact. I suggest adding a collection contact to the schema (same structure as for the institution, preferably vCard). -- Markus
On 05.06.2007, at 13:52, Markus Döring wrote:
+++++++++++++++++++++++++++++++++++++ I think we need to still take a decision on:
- multiple parent hierarchy. maybe we leave it single and discuss
it at the workshop?
- wording and meaning of "richRecordSearchString" vs
"collectionDatabaseURL". Can we agree to use "collectionObjectAccess" and "furtherDetails" both of which are URIs?
- CONTACTS. Wouldnt it be easier if we have a single contact class
(complextype) i.e. a vcard contact, that can be used for institution contacts as well as collection contacts? I would prefer this option to the strict selection of relevant contact fields for different purposes. The editor interface can still restrict this if we think thats needed. But NCD would be able to e3xchange an entire vcard record for a contact no matter where it is being used.
+++++++++++++++++++++++++++++++++++++
Hi Markus,
Comments on stripped down version below ...
Neil -----
##NT: Only the NCDID needs to be generated, probably as an LSID. All other identifiers are taken from elsewhere (i.e.the "source")
so my assumption is correct. One NCD id plus any number of associated IDs NCD software doesnt need to understand. Thats what the current schema reflects.
NT- Correct: although it may need to understand <parentCollectionID>
- are multiple parent collections & parent institutions allowed?
currently this is not
##NT: Yes, they should be allowed
both, multiple institutions and multiple parent collections? I know we had this in biocase, but not many people made use of it. And this is the kind of change in data structure which affects our software a lot. With a single hierarchy you have a nice tree, with multiple parents you got a nasty graph which is hard to present. If this is needed only in 1% of the cases I actually think we should ignore it, people should rather indicate this in notes I believe. But if its an important and frequent problem we have to deal with it, sure. What do you reckon?
NT- Yes, Ruud also says he cannot cope with this and is unable to think of any examples, so maybe we should not bother adding it and just leave the status quo.
- what is "richRecordSearchString" in contrast to
"collectionDatabaseURL"? what about "objectAccessService" and simply "furtherInformation" ?
##NT: The "collectionDatabaseURL" is the URL of the database that holds item-level data for the collection being described in this NCD record. The "richRecordSearchString" is what should be used in that database to get directly to the relevant record. To take an example, there will be an NCD record describing a herbarium. The "collectionDatabaseURL" will be to the Index Herbariorum WebDatabase and the "richRecordSearchString" will contain the coden for the herbarium so that the full description of the herbarium can be found by adding the two together. I have no objection to "objectAccessService" for the latter, but the former is more explicit than "furtherInformation".
Now Im confused. I assumed we deal with 2 different things here.
A) link to a service that gives access to individual collection items. This doesnt necessarily have to be a database, so this is why I suggest to rename collectionDatabaseURL into collectionObjectAccess (or collectionItemAccess, objectLevelAccess?)
NT- OK, I can go with that.
B) further eventually more detailed information about this collection meta record. Thats why I suggested furtherInformation (or the dublin core pendant)
NT- This is probably not needed, since there is a <relatedResources> element just above.
If I understand you correctly the richRecordSearchString will be used together with the database url? Isnt that mixing item-level with collection meta-level?
NT- Yes, that was the intention. An objective of NCD records is to point to item-level databases where these exist. Perhaps we should leave it as just the <collectionObjectAccess> element since folk can then do their own search when they get there. This element would require too much in the way of maintenance in the event that the database system changed
- I have replaced all address information with vcard elements. That
is vc:N for person names and vc:ADR for addresses. Some existing NCD elements do not exist in vcard though. How shall we treat them? Right now Ive simply left them as they were before in addition the vcard elements. It looks a bit weird, but its ok I believe. In particular that is byear and deathyear for persons (full birthday exists in vcard) and fax, email, logourl for institutional addresses. Maybe we can remove the email and fax and add a website instead? The collection have real contacts anyway already, so institutions dont really need this, do they?
##NT: Yes, institutions do require contact details too so I would prefer that these elements remain. If they are not present as vc: elements then can they just be ncd: elements?
ok. but isnt fax+email a pretty limited set of potential contact methods? what about telephone and websites, isnt that even more important?
NT- Your suggestion at the foot, to have a single contact class for use with either a collection or an institution or both sounds like a good way to go.
- many elements cover the same ideas that dublin core covers. Should
we use the explicit dublin core elements for title, description, other resources, keyword, rights, modified, created and citation? I would be in favor to do so. It would mean that NCD mainly is about the extra metadata bits that dublin core doesn't cover!
##NT: We certainly could do this and Doug has produced an NCD <--> DC mapping. Since this would just be a re-labelling exercise we can have this as a discussion item at the workshop. NCD would then be an application profile, re-using elements from mets:, dc: and vc: along with its own.
I think thats a good approach. Please lets put this on the agenda for the workshop
NT- OK - there is already a session on mappings, so it can be discussed then
*** ONTOLOGY ***
- collection extent is a pure integer without a unit definition. I
know this is bad, but having complex data (i.e. with multiple elements or attributes) in an ontology means I would have to define a new class just for this. And Roger is very keen on keeping the number of classes low.
##NT: In this case the single attribute should be a string not an integer. Describing a collection extent as "2" is spectacularly meaningless. If the string is "2 shelves" or "2 cupboards" or "2 rooms" it means a little bit more. Not much, but enough for someone to assess whether it is worth a visit or not.
totally right. Ill do that
NT- OK, thanks.
- all keywords lacking strength flag cause I have created a single
keyword class that is used by all keyword types. And the strewngth flag didnt exist for all keywords. Should it or should we drop it?
##NT: I would like to keep the strength flag if possible fot those elements where it makes sense - it would be good to indicate where the collection owner believe's the collection to be particularly strong.
well. I added strength now to the "DefinedTerm" class so it applies to all keywords (). But it might be incorrect, cause now the class describes terms that can be traced back to where they come from (source, idinsource). if we also add strength, then the strength really is a property of the collection (or the term in regard to the collection), not the term itself. To model this in an easier way I will create 2 properties for each strength containing keyword of the collection class. E.g TaxonCoverage and TaxonCoverageStrength. Then both properties can use the DefinedTerm class (which only describes a term, no matter where its used).
Looks as if this problem is solved!
NT- Excellent!
- a persons role in regard to a specific collection is missing. I
think we need it, but then again we need at least a basic controlled vocabulary
##NT: Yes, we do need this. Example terms would be owner, adminstrator, collector, preparator etc.
ok, Ill start with these roles then. How many roles do you expect? If its less than 10 we can avoid new classes and create seperate properties for the different roles. Just like I did with strength in the keywords. I've added those 4 person types to the ontology now.
NT- Carol will be collecting terms for use with popups so any new suggestions should be forwarded to her.
+++++++++++++++++++++++++++++++++++++ I think we need to still take a decision on:
- multiple parent hierarchy. maybe we leave it single and discuss it at the workshop?
NT- Leave as is.
- wording and meaning of "richRecordSearchString" vs "collectionDatabaseURL". Can we agree to use "collectionObjectAccess" and "furtherDetails" both of which are URIs?
NT- Agree first, probably drop second as already covered
- CONTACTS. Wouldnt it be easier if we have a single contact class (complextype) i.e. a vcard contact, that can be used for institution contacts as well as collection contacts? I would prefer this option to the strict selection of relevant contact fields for different purposes. The editor interface can still restrict this if we think thats needed. But NCD would be able to e3xchange an entire vcard record for a contact no matter where it is being used.
NT- Agreed
+++++++++++++++++++++++++++++++++++++
changes to the schema when cleaning up:
about an institution: - PostalAddressText and PhysicalAddressText removed. An institution can have multiple vcard addresses now that allow to specify a "type" such as postal,parcel,home,work,pref
NT- Agreed
So, Markus, with these changes are we able to announce a v1.0 of the schema?
All best wishes, Neil
We are getting there. I think we have covered all issues now. -- Markus
On 05.06.2007, at 16:12, Neil Thomson wrote:
Hi Markus,
Comments on stripped down version below ...
Neil
##NT: Only the NCDID needs to be generated, probably as an LSID. All other identifiers are taken from elsewhere (i.e.the "source")
so my assumption is correct. One NCD id plus any number of associated IDs NCD software doesnt need to understand. Thats what the current schema reflects.
NT- Correct: although it may need to understand <parentCollectionID>
yep
- are multiple parent collections & parent institutions allowed?
currently this is not
##NT: Yes, they should be allowed
both, multiple institutions and multiple parent collections? I know we had this in biocase, but not many people made use of it. And this is the kind of change in data structure which affects our software a lot. With a single hierarchy you have a nice tree, with multiple parents you got a nasty graph which is hard to present. If this is needed only in 1% of the cases I actually think we should ignore it, people should rather indicate this in notes I believe. But if its an important and frequent problem we have to deal with it, sure. What do you reckon?
NT- Yes, Ruud also says he cannot cope with this and is unable to think of any examples, so maybe we should not bother adding it and just leave the status quo.
good. this makes things a lot easier
- what is "richRecordSearchString" in contrast to
"collectionDatabaseURL"? what about "objectAccessService" and simply "furtherInformation" ?
##NT: The "collectionDatabaseURL" is the URL of the database that holds item-level data for the collection being described in this NCD record. The "richRecordSearchString" is what should be used in that database to get directly to the relevant record. To take an example, there will be an NCD record describing a herbarium. The "collectionDatabaseURL" will be to the Index Herbariorum WebDatabase and the "richRecordSearchString" will contain the coden for the herbarium so that the full description of the herbarium can be found by adding the two together. I have no objection to "objectAccessService" for the latter, but the former is more explicit than "furtherInformation".
Now Im confused. I assumed we deal with 2 different things here.
A) link to a service that gives access to individual collection items. This doesnt necessarily have to be a database, so this is why I suggest to rename collectionDatabaseURL into collectionObjectAccess (or collectionItemAccess, objectLevelAccess?)
NT- OK, I can go with that.
B) further eventually more detailed information about this collection meta record. Thats why I suggested furtherInformation (or the dublin core pendant)
NT- This is probably not needed, since there is a <relatedResources> element just above.
If I understand you correctly the richRecordSearchString will be used together with the database url? Isnt that mixing item-level with collection meta-level?
NT- Yes, that was the intention. An objective of NCD records is to point to item-level databases where these exist. Perhaps we should leave it as just the <collectionObjectAccess> element since folk can then do their own search when they get there. This element would require too much in the way of maintenance in the event that the database system changed
OK, now I understand! The intention is to list all objects of this collection. As you said, this is going to be a maintenance nightmare, so I would be very happy if we omit this element altogether. I'll remove it from the schema then.
- I have replaced all address information with vcard elements. That
is vc:N for person names and vc:ADR for addresses. Some existing NCD elements do not exist in vcard though. How shall we treat them? Right now Ive simply left them as they were before in addition the vcard elements. It looks a bit weird, but its ok I believe. In particular that is byear and deathyear for persons (full birthday exists in vcard) and fax, email, logourl for institutional addresses. Maybe we can remove the email and fax and add a website instead? The collection have real contacts anyway already, so institutions dont really need this, do they?
##NT: Yes, institutions do require contact details too so I would prefer that these elements remain. If they are not present as vc: elements then can they just be ncd: elements?
ok. but isnt fax+email a pretty limited set of potential contact methods? what about telephone and websites, isnt that even more important?
NT- Your suggestion at the foot, to have a single contact class for use with either a collection or an institution or both sounds like a good way to go.
OK, I'll give it a go then.
- many elements cover the same ideas that dublin core covers. Should
we use the explicit dublin core elements for title, description, other resources, keyword, rights, modified, created and citation? I would be in favor to do so. It would mean that NCD mainly is about the extra metadata bits that dublin core doesn't cover!
##NT: We certainly could do this and Doug has produced an NCD <--> DC mapping. Since this would just be a re-labelling exercise we can have this as a discussion item at the workshop. NCD would then be an application profile, re-using elements from mets:, dc: and vc: along with its own.
I think that's a good approach. Please let's put this on the agenda for the workshop.
NT- OK - there is already a session on mappings, so it can be discussed then
*** ONTOLOGY ***
- collection extent is a pure integer without a unit definition. I know this is bad, but having complex data (i.e. with multiple elements or attributes) in an ontology means I would have to define a new class just for this. And Roger is very keen on keeping the number of classes low.
##NT: In this case the single attribute should be a string not an integer. Describing a collection extent as "2" is spectacularly meaningless. If the string is "2 shelves" or "2 cupboards" or "2 rooms" it means a little bit more. Not much, but enough for someone to assess whether it is worth a visit or not.
Totally right. I'll do that.
NT- OK, thanks.
- All keywords now lack the strength flag, because I have created a single keyword class that is used by all keyword types, and the strength flag didn't exist for all keywords. Should it, or should we drop it?
##NT: I would like to keep the strength flag if possible for those elements where it makes sense - it would be good to indicate where the collection owner believes the collection to be particularly strong.
Well, I have now added strength to the "DefinedTerm" class so it applies to all keywords. But that might be incorrect, because the class describes terms that can be traced back to where they come from (source, idinsource). If we also add strength, then the strength really is a property of the collection (or of the term in regard to the collection), not of the term itself. To model this more simply I will create 2 properties on the collection class for each keyword type that carries a strength, e.g. TaxonCoverage and TaxonCoverageStrength. Then both properties can use the DefinedTerm class (which only describes a term, no matter where it is used).
Looks as if this problem is solved!
NT- Excellent!
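A minimal sketch of the modelling described above, written in Python purely for illustration (the thread itself is about an XML schema and an RDF ontology). DefinedTerm, TaxonCoverage and TaxonCoverageStrength are the names used in the message; the field names, the example source and the identifier values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DefinedTerm:
    """A term that can be traced back to its origin; it carries no strength."""
    label: str
    source: str        # where the term comes from
    id_in_source: str  # identifier of the term within that source


@dataclass
class Collection:
    """Strength lives on the collection, as a second property per keyword type."""
    title: str
    taxon_coverage: List[DefinedTerm] = field(default_factory=list)
    taxon_coverage_strength: List[DefinedTerm] = field(default_factory=list)


# Hypothetical example values, not taken from any real NCD record.
ferns = DefinedTerm("Pteridophyta", source="example-checklist", id_in_source="X1")
mosses = DefinedTerm("Bryophyta", source="example-checklist", id_in_source="X2")

herbarium = Collection(
    title="Example herbarium",
    taxon_coverage=[ferns, mosses],
    taxon_coverage_strength=[ferns],  # the owner regards ferns as a strength
)
```

Because the term itself stays free of any strength attribute, the same DefinedTerm can be reused by another collection that does not consider it a strength.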
- A person's role in regard to a specific collection is missing. I think we need it, but then again we need at least a basic controlled vocabulary.
##NT: Yes, we do need this. Example terms would be owner, administrator, collector, preparator etc.
OK, I'll start with these roles then. How many roles do you expect? If it's fewer than 10 we can avoid new classes and create separate properties for the different roles, just like I did with strength in the keywords. I've added those 4 person types to the ontology now.
NT- Carol will be collecting terms for use with popups so any new suggestions should be forwarded to her.
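As a rough illustration of "separate properties for the different roles", the sketch below gives a collection one property per role rather than introducing a role class. It is Python used only for illustration; the four role names come from the thread, while the class layout, the roles_of helper and the example people are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Person:
    name: str


@dataclass
class Collection:
    title: str
    # One property per role instead of a separate role class.
    owners: List[Person] = field(default_factory=list)
    administrators: List[Person] = field(default_factory=list)
    collectors: List[Person] = field(default_factory=list)
    preparators: List[Person] = field(default_factory=list)

    def roles_of(self, person: Person) -> List[str]:
        """Return the names of the role properties that list this person."""
        role_properties = ("owners", "administrators", "collectors", "preparators")
        return [r for r in role_properties if person in getattr(self, r)]


# Hypothetical people and collection, for illustration only.
alice = Person("A. Example")
bob = Person("B. Example")
cabinet = Collection("Example cabinet", owners=[alice], collectors=[alice, bob])
```

This stays manageable only while the role vocabulary is small, since every new role term means a new property.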
+++++++++++++++++++++++++++++++++++++
I think we still need to take a decision on:
- multiple parent hierarchy. Maybe we leave it single and discuss it at the workshop?
NT- Leave as is.
- wording and meaning of "richRecordSearchString" vs "collectionDatabaseURL". Can we agree to use "collectionObjectAccess" and "furtherDetails", both of which are URIs?
NT- Agree first, probably drop second as already covered
done
- CONTACTS. Wouldn't it be easier if we had a single contact class (complextype), i.e. a vcard contact, that can be used for institution contacts as well as collection contacts? I would prefer this option to the strict selection of relevant contact fields for different purposes. The editor interface can still restrict this if we think that's needed. But NCD would be able to exchange an entire vcard record for a contact no matter where it is being used.
NT- Agreed
will do in a minute
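The single contact class agreed here could look roughly like the sketch below. It is Python for illustration only, not the schema's complextype; the field names are assumptions loosely modelled on vcard, and all values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Address:
    text: str
    # vcard-style address types, e.g. postal, parcel, home, work, pref
    types: List[str] = field(default_factory=list)


@dataclass
class Contact:
    """One contact record, reusable wherever a contact is needed."""
    name: str
    addresses: List[Address] = field(default_factory=list)
    telephone: Optional[str] = None
    fax: Optional[str] = None
    email: Optional[str] = None
    website: Optional[str] = None


@dataclass
class Institution:
    name: str
    contacts: List[Contact] = field(default_factory=list)


@dataclass
class Collection:
    title: str
    contacts: List[Contact] = field(default_factory=list)


# Hypothetical data: the same Contact serves an institution and a collection.
curator = Contact(
    name="C. Example",
    addresses=[Address("1 Example Street, Exampletown", types=["postal", "work"])],
    email="curator@example.org",
)
museum = Institution("Example Museum", contacts=[curator])
shells = Collection("Example shell collection", contacts=[curator])
```

All contact fields are optional here, so any restriction per use (institution versus collection) would be left to the editor interface, as suggested in the message.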
+++++++++++++++++++++++++++++++++++++
changes to the schema when cleaning up:
about an institution:
- PostalAddressText and PhysicalAddressText removed. An institution can now have multiple vcard addresses, each of which can specify a "type" such as postal, parcel, home, work, pref.
NT- Agreed
So, Markus, with these changes are we able to announce a v1.0 of the schema?
I think so! But I would leave the current versioning until we get accepted by TDWG, don't you think? So this next version will be 0.60. If we need any adjustments after the workshop we can raise it to 0.70 and save the 1.0 for the final TDWG-accepted version!
All best wishes, Neil
tdwg-ncd mailing list tdwg-ncd@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-ncd