I think it may not be enough. ISO 639-2 (three-letter codes) lists about 500 languages, if I am right; Ethnologue lists about 7000. The data can be in any language or dialect, especially common names or herbal information. The Ethnologue three-letter code list has the advantage of linking languages to countries, although the ISO country list it uses is not completely up to date. Usually I prefer ISO standards, but in this case I am not sure.
Wouter
----- Original Message -----
From: "Döring, Markus" m.doering@BGBM.org
To: "Wouter Addink" wouter@eti.uva.nl; tdwg-tapir@lists.tdwg.org
Sent: Wednesday, July 04, 2007 12:08 PM
Subject: Re: [tdwg-tapir] tapir metadata issues
Isn't rfc3066 as used by xml schema enough? Any arguments against it?
RFC 3066 takes the primary language subtag from ISO 639: the two-letter ISO 639-1 code where one exists, otherwise the three-letter ISO 639-2 code. The Library of Congress, maintainer of ISO 639-2, has made the list of registered languages available on the Internet. It can be found at
http://www.loc.gov/standards/iso639-2/langhome.html
http://www.w3.org/TR/xmlschema-2/#language
http://www.ietf.org/rfc/rfc3066.txt
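As a rough illustration of what "as used by XML Schema" means in practice, the sketch below checks a tag against the lexical pattern given for xs:language in XML Schema Part 2 (second edition). The function name is made up for this example. Note that the check is purely syntactic: a three-letter Ethnologue code passes the pattern, but that says nothing about whether the code is registered in ISO 639-2.

```python
import re

# Lexical pattern for xs:language as given in XML Schema Part 2 (second edition).
XS_LANGUAGE = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_lexically_valid_language(tag: str) -> bool:
    """Return True if the tag matches the xs:language lexical pattern.

    Hypothetical helper: it checks syntax only, not registration
    in ISO 639 or any other code list.
    """
    return XS_LANGUAGE.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ["en", "de-DE", "aaa", "en_US", ""]:
        print(repr(candidate), is_lexically_valid_language(candidate))
```

So the schema type alone would not tell a client which code list a three-letter value comes from.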
On 04.07.2007 at 10:28, "Wouter Addink" (wouter@eti.uva.nl) wrote:
If we are confident we have a standard that suits all, I have nothing against it.
Wouter
----- Original Message -----
From: "Döring, Markus" m.doering@BGBM.org
To: "Renato De Giovanni" renato@cria.org.br; tdwg-tapir@lists.tdwg.org
Sent: Wednesday, July 04, 2007 9:48 AM
Subject: Re: [tdwg-tapir] tapir metadata issues
I can't see why we shouldn't mandate one specific standard. One variable less. I would vote for option #1.
Markus
On 03.07.2007 at 3:55, "Renato De Giovanni" (renato@cria.org.br) wrote:
Hi all,
I see the following alternatives to the language issue:
- Indicate through the specification one particular standard to be used
by dc:language.
or
- Include dc:language elements inside a new element with an attribute
indicating the standard being used, such as:
<contentLanguages standard="ethnologue">
  <dc:language>aaa</dc:language>
  <dc:language>aab</dc:language>
</contentLanguages>
Where "standard" could be an extensible controlled vocabulary.
or
- Extend the dc:language type so that it accepts a similar "standard"
attribute.
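To make the second alternative more concrete, here is a minimal sketch of how a client might read such a wrapper element. It assumes the wrapper itself is un-namespaced and that the dc prefix is bound to the usual Dublin Core elements namespace; both are assumptions for illustration, as is the function name.

```python
import xml.etree.ElementTree as ET

# Dublin Core elements namespace (assumed binding for the dc prefix).
DC_NS = "http://purl.org/dc/elements/1.1/"

# Sample based on the fragment in the message, with a namespace
# declaration added so that it parses on its own.
SAMPLE = """
<contentLanguages xmlns:dc="http://purl.org/dc/elements/1.1/" standard="ethnologue">
  <dc:language>aaa</dc:language>
  <dc:language>aab</dc:language>
</contentLanguages>
"""


def read_content_languages(xml_text):
    """Return (standard, [codes]) from a contentLanguages element."""
    root = ET.fromstring(xml_text.strip())
    standard = root.get("standard")  # None if the attribute is missing
    codes = [
        el.text.strip()
        for el in root.findall(f"{{{DC_NS}}}language")
        if el.text and el.text.strip()
    ]
    return standard, codes


if __name__ == "__main__":
    print(read_content_languages(SAMPLE))
```

With this layout a single attribute lookup tells the client how to interpret every code in the group.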
Are there other alternatives we should consider?
I think the requirements are that:
- Language can be optional.
- There can be multiple languages.
- We must somehow know which standard is used for the language.
I don't think it would be necessary to allow multiple language elements where each one could be potentially related to different standards.
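The requirements above could be modelled roughly as follows. This is only a sketch; the class and field names are hypothetical and not part of any TAPIR schema. The language block is optional, it holds any number of codes, and one standard applies to the whole group.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentLanguages:
    """A group of language codes that all follow one named standard."""
    standard: str
    codes: List[str] = field(default_factory=list)

    def __post_init__(self):
        # The standard must always be known when languages are given.
        if not self.standard:
            raise ValueError("a language standard must be named")


@dataclass
class ResourceMetadata:
    """Hypothetical metadata holder; languages may be absent entirely."""
    content_languages: Optional[ContentLanguages] = None


if __name__ == "__main__":
    print(ResourceMetadata())
    print(ResourceMetadata(ContentLanguages("ethnologue", ["aaa", "aab"])))
```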
I don't have strong feelings about this, although I would be more inclined to choose option 2. Option 1 would have less impact on existing implementations and installations, but we would need to be sure that the standard we choose really covers all needs.
What do you think?
Regards,
Renato
tdwg-tapir mailing list tdwg-tapir@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tapir