[tdwg-tapir] tapir metadata issues

Wouter Addink

28 Jun 2007

07:15

Hi,

I am currently implementing Tapir metadata responses and have some issues.

- Indexing preferences frequency: the documentation states @Frequency (starting with uppercase), while the schema uses @frequency. I assume I have to implement @frequency.

- dc:language is mandatory. What to do with data that is not language specific? Example: we are going to use Tapir for sharing lists of scientific names. Should the language be Latin in that case? We are thinking of specifying English (eng) as the default in that case. The recommendation is to use IANA Language subtags. It would probably be better to recommend the languages from ethnologue.org (3-letter abbreviations), because the data can be in many more languages than the IANA languages cover, for instance common names in extinct languages. This is different from the xml:lang attribute, which is primarily for application development.

- The example for an accesspoint is http://example.net/tapir.cgi, which I think may be misleading; it would perhaps be better to use http://example.net/tapir.cgi/yourdatasource as the example.

Wouter

Renato De Giovanni

28 Jun

09:58

Hi Wouter,

Many thanks for raising these issues. I'll comment on each one below...

...

-indexing preferences frequency in the documentation states @Frequency (starting with uppercase), the schema uses @frequency. I assume I have to implement @frequency.

You're right. I've just fixed the specification and I plan to release a new version in the coming weeks.

...

-dc:language is mandatory. What to do with data that is not language specific? Example: we are going to use Tapir for sharing lists of scientific names. Should the language be Latin in that case? We think about using specifying English (eng) as default in that case. The recommendation is to use IANA Language subtags. Probably better to recommend the languages from ethnologue.org (3-letter abbreviations). This because the data can be in much more languages then the IANA Languages, for instance common names in extinct languages. This is different from the xml:lang attribute, which is primarily for application development.

I agree with you, and I think it should not be a problem for the existing applications if we make these changes:

- Make dc:language an optional element.
- Change the cardinality of dc:language to "unbounded".
- Change the recommendation about the content of dc:language by including ethnologue codes as another option (probably the main option). Note that it will still be just a recommendation, not a normative statement.

I'll wait to see if there's some feedback about this before making the changes.

...

-the example for an accesspoint is http://example.net/tapir.cgi I think this may be misleading, it would perhaps be better to use http://example.net/tapir.cgi/yourdatasource as example.

I think this one is OK to keep as it is. A TAPIR accesspoint is just a URL, so it's really up to the provider to decide which pattern to use.

Thanks again,
--
Renato

Wouter Addink

29 Jun

01:44

Renato, thanks for your comments.

...

- Make dc:language an optional element. - Change the cardinality of dc:language to "unbounded". - Change the recommendation about the content of dc:language by including ethnologue codes as another option (probably the main option). Note that it will still be just a recommendation, not a normative statement.

OK. Perhaps we should also add an optional attribute for specifying which code standard is used, if any? I think that should not affect current implementations. The problem is that you cannot do anything with an abbreviation if you do not know what it means, and making assumptions can be dangerous. For instance, you could assume that "SW" means Swedish, or that it means Swahili. If you know that it is an IANA subtag, you can use it, and you can also raise an error if an abbreviation is not present in the declared standard.

Another comment about the Tapir metadata: when giving courses on installing Tapirlink, I noticed that none of the roughly 10 (Dutch) students could figure out by themselves what 'relatedEntity' means. They all needed help with that. Perhaps the documentation of that element should be expanded?

Cheers,
Wouter
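The "SW" problem described above can be illustrated with a minimal Python sketch. It assumes a hypothetical `resolve_language` helper and two deliberately tiny, hand-written code tables: `"iana"` holds a few IANA subtags, and `"iso639-3"` stands in for the 3-letter ethnologue codes. None of these names come from the TAPIR specification; they only show how a declared code standard allows both lookup and error reporting.

```python
# Hypothetical, deliberately tiny code tables for illustration only;
# the real registries contain thousands of entries.
KNOWN_CODES = {
    "iana": {"sv": "Swedish", "sw": "Swahili", "en": "English", "la": "Latin"},
    "iso639-3": {"swe": "Swedish", "swh": "Swahili", "eng": "English", "lat": "Latin"},
}


def resolve_language(code, standard):
    """Return the language name for `code` under the declared `standard`.

    Raises ValueError if the standard is unknown or the code is not part
    of it, instead of guessing what the abbreviation might mean.
    """
    try:
        table = KNOWN_CODES[standard]
    except KeyError:
        raise ValueError(f"unknown code standard: {standard!r}")
    key = code.lower()  # treat codes as case-insensitive in this sketch
    if key not in table:
        raise ValueError(f"{code!r} is not a valid {standard} code")
    return table[key]
```

With the standard declared, `resolve_language("SW", "iana")` resolves to Swahili, while `resolve_language("SW", "iso639-3")` raises an error because no such 3-letter code exists in the table.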

Jim Graham

06:52

Greetings,

I understand making dc:language optional, but I'd be really concerned about allowing the language code to come from different standards. The "SW" example Wouter points out would be a real concern. Can we have multiple language elements, each of which is tied to a specific language code standard? That way we cannot make the type of mistake where "SW" is misinterpreted as Swedish or Swahili.

Thanks,
Jim

Jim Graham
Natural Resource Ecology Laboratory
Colorado State University
Fort Collins, CO 80524
jim@nrel.colostate.edu
970-491-0410

-----Original Message-----
From: tdwg-tapir-bounces@lists.tdwg.org [mailto:tdwg-tapir-bounces@lists.tdwg.org] On Behalf Of Wouter Addink
Sent: Friday, June 29, 2007 2:45 AM
To: tdwg-tapir@lists.tdwg.org
Subject: Re: [tdwg-tapir] tapir metadata issues

Renato, thanks for your comments.

...

- Make dc:language an optional element. - Change the cardinality of dc:language to "unbounded". - Change the recommendation about the content of dc:language by including ethnologue codes as another option (probably the main option). Note that it will still be just a recommendation, not a normative statement.
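As a rough sketch of what several language elements, each tied to a code standard, might look like to a consuming application, the Python below parses a made-up metadata fragment. The `metadata` wrapper element and the `codeStandard` attribute are assumptions for illustration only and are not part of the TAPIR schema; only the Dublin Core namespace URI is real.

```python
import xml.etree.ElementTree as ET

# Dublin Core elements namespace (real); everything else below is hypothetical.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Made-up fragment: the wrapper element and the codeStandard attribute
# are illustrative assumptions, not taken from the TAPIR schema.
SAMPLE = f"""
<metadata xmlns:dc="{DC_NS}">
  <dc:language codeStandard="iana">sw</dc:language>
  <dc:language codeStandard="iso639-3">swe</dc:language>
</metadata>
"""


def read_languages(xml_text):
    """Return (standard, code) pairs for every dc:language element.

    Raises ValueError when an element does not declare its code standard,
    so that an ambiguous code is never interpreted by guesswork.
    """
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for element in root.iter(f"{{{DC_NS}}}language"):
        standard = element.get("codeStandard")
        if standard is None:
            raise ValueError("dc:language without a declared code standard")
        pairs.append((standard, (element.text or "").strip()))
    return pairs
```

Calling `read_languages(SAMPLE)` yields one pair per element, so a consumer always knows which standard each code belongs to before interpreting it.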

Döring, Markus

2 Jul 2 Jul