Hi there,
This really is weird. I was confident that the XML Schema language data type was used, which defines natural language identifiers as specified by RFC 3066:
http://www.w3.org/TR/xmlschema-2/#language
http://www.ietf.org/rfc/rfc3066.txt
But when I looked up the TAPIR schema, I found that it uses the Dublin Core schema, which states the following:
<xs:element name="language" substitutionGroup="any"/>
<xs:element name="any" type="SimpleLiteral" abstract="true"/>
<xs:complexType name="SimpleLiteral">
  <xs:complexContent mixed="true">
    <xs:restriction base="xs:anyType">
      <xs:sequence>
        <xs:any processContents="lax" minOccurs="0" maxOccurs="0"/>
      </xs:sequence>
      <xs:attribute ref="xml:lang" use="optional"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
So you can use any text content for the dc:language element AND tag it with an optional xml:lang attribute. That's weird:
<dc:language xml:lang="en">swuaheli</dc:language>
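To illustrate the point, here is a minimal Python sketch (standard library only) that reads such an element. The helper name read_language is hypothetical, and the xmlns:dc declaration is added only so the fragment parses on its own. It shows that the element content and the xml:lang attribute are two independent values, and nothing ties the content to a language code list.

```python
import xml.etree.ElementTree as ET

# Namespace URIs: Dublin Core elements and the predeclared "xml" prefix.
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def read_language(xml_text):
    """Return (content, xml_lang) for a single dc:language element.

    Hypothetical helper for illustration; it performs no validation,
    mirroring the fact that SimpleLiteral accepts any text content.
    """
    elem = ET.fromstring(xml_text)
    if elem.tag != "{%s}language" % DC_NS:
        raise ValueError("not a dc:language element: %s" % elem.tag)
    # xml:lang describes the language the content is written in,
    # not the language that the content names.
    return elem.text, elem.get("{%s}lang" % XML_NS)


sample = (
    '<dc:language xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'xml:lang="en">swuaheli</dc:language>'
)
content, xml_lang = read_language(sample)
print(content, xml_lang)
```

The free-text content comes back unchanged, so any check against a code standard would have to be done by the consumer.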
Markus
On 29.06.2007 at 15:52, "Jim Graham" <jim@nrel.colostate.edu> wrote:
Greetings,
I understand making dc:language optional, but I'd be really concerned about allowing the language code to come from different standards. The "SW" example that Wouter points out would be a real concern. Can we have multiple language elements, each of which is tied to a specific language code standard? That way we cannot make the kind of mistake where "SW" is misinterpreted as Swedish or Swahili.
Thanks, Jim
Jim Graham Natural Resource Ecology Laboratory Colorado State University Fort Collins, CO 80524 jim@nrel.colostate.edu 970-491-0410
-----Original Message-----
From: tdwg-tapir-bounces@lists.tdwg.org [mailto:tdwg-tapir-bounces@lists.tdwg.org] On Behalf Of Wouter Addink
Sent: Friday, June 29, 2007 2:45 AM
To: tdwg-tapir@lists.tdwg.org
Subject: Re: [tdwg-tapir] tapir metadata issues
Renato, thanks for your comments.
- Make dc:language an optional element.
- Change the cardinality of dc:language to "unbounded".
- Change the recommendation about the content of dc:language by including ethnologue codes as another option (probably the main option). Note that it will still be just a recommendation, not a normative statement.
OK. Perhaps we should also add an optional attribute for specifying which code standard is used, if any? I think that should not affect current implementations. The problem is that you cannot do anything with an abbreviation if you do not know what it means, and making assumptions can be dangerous. For instance, you could assume that "SW" means Swedish, or that it means Swahili. If you know that it is an IANA subtag, you can use it, and you can also raise an error if the abbreviation is not present in the stated standard.
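As a rough sketch of how a consumer could use such an attribute, the Python below resolves a code only when the standard is stated, and raises an error otherwise. Everything named here is an assumption for illustration: the function resolve_language, the standard keys "iso639-1" and "iso639-3", and the two-entry tables, which are tiny excerpts and not complete code lists. No attribute of this kind exists in the TAPIR schema today.

```python
# Illustrative excerpts only; a real implementation would load the full
# code lists of whichever standards the schema ends up recommending.
CODE_TABLES = {
    "iso639-1": {"sv": "Swedish", "sw": "Swahili"},
    "iso639-3": {"swe": "Swedish", "swa": "Swahili (macrolanguage)"},
}


def resolve_language(code, standard):
    """Look up a language code in the table of the stated standard.

    Raises ValueError when the standard is unknown or the code is not
    part of it, instead of guessing what the abbreviation means.
    """
    if standard not in CODE_TABLES:
        raise ValueError("unknown code standard: %r" % standard)
    table = CODE_TABLES[standard]
    key = code.lower()  # codes are compared case-insensitively here
    if key not in table:
        raise ValueError("%r is not a valid %s code" % (code, standard))
    return table[key]


print(resolve_language("SW", "iso639-1"))
```

With the standard stated, "SW" resolves to one language under the two-letter list and is rejected under the three-letter list, so there is no room for the Swedish/Swahili guess.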
Another comment about the TAPIR metadata: when giving courses on installing TapirLink, I noticed that none of the roughly 10 (Dutch) students could figure out by themselves what 'relatedEntity' means. They all needed help with that. Perhaps the documentation of that element should be expanded?
Cheers, Wouter
tdwg-tapir mailing list tdwg-tapir@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tapir