[tdwg-tapir] tapir metadata issues

Wouter Addink wouter at eti.uva.nl
Wed Jul 4 12:33:37 CEST 2007


I think it may not be enough. ISO 639-2 (three-letter codes) lists about 500 
languages, if I am right; Ethnologue lists about 7000. The data can be in any 
language or dialect, especially common names or herbal information. The 
Ethnologue three-letter code list has the advantage of linking languages to 
countries, although the ISO country list they use is not completely up to 
date. Usually I prefer ISO standards, but in this case I am not sure.

Wouter

----- Original Message ----- 
From: "Döring, Markus" <m.doering at BGBM.org>
To: "Wouter Addink" <wouter at eti.uva.nl>; <tdwg-tapir at lists.tdwg.org>
Sent: Wednesday, July 04, 2007 12:08 PM
Subject: Re: [tdwg-tapir] tapir metadata issues


Isn't RFC 3066, as used by XML Schema, enough?
Any arguments against it?

RFC 3066 takes the primary language subtag from ISO 639: the two-letter
ISO 639-1 code where one exists, otherwise the three-letter ISO 639-2 code.
The Library of Congress, which maintains ISO 639-2, has made the list of
registered languages available on the Internet:

http://www.loc.gov/standards/iso639-2/langhome.html

The XML Schema language type and RFC 3066 itself are at:

http://www.w3.org/TR/xmlschema-2/#language
http://www.ietf.org/rfc/rfc3066.txt
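
As an illustration only, with RFC 3066 tags mandated (option 1) the
dc:language values could look like this. The particular tags are just
examples and are not taken from any provider:

  <dc:language>en</dc:language>
  <dc:language>pt-BR</dc:language>
  <dc:language>nds</dc:language>

Here "pt-BR" adds a country subtag, and "nds" (Low German) is a
three-letter ISO 639-2 code, used because no two-letter code exists.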



On 04.07.2007 at 10:28, "Wouter Addink" <wouter at eti.uva.nl> wrote:

> If we are confident we have a standard that suits all, I have nothing
> against it.
>
> Wouter
>
>
> ----- Original Message -----
> From: "Döring, Markus" <m.doering at BGBM.org>
> To: "Renato De Giovanni" <renato at cria.org.br>; <tdwg-tapir at lists.tdwg.org>
> Sent: Wednesday, July 04, 2007 9:48 AM
> Subject: Re: [tdwg-tapir] tapir metadata issues
>
>
>> I can't see why we shouldn't mandate one specific standard. One variable
>> less.
>> I would vote for option #1.
>>
>> Markus
>>
>>
>>
>>
>> On 03.07.2007 at 3:55, "Renato De Giovanni"
>> <renato at cria.org.br> wrote:
>>
>>> Hi all,
>>>
>>> I see the following alternatives for the language issue:
>>>
>>> 1) Indicate through the specification one particular standard to be used
>>> by dc:language.
>>>
>>> or
>>>
>>> 2) Include dc:language elements inside a new element with an attribute
>>> indicating the standard being used, such as:
>>>
>>> <contentLanguages standard="ethnologue">
>>>   <dc:language>aaa</dc:language>
>>>   <dc:language>aab</dc:language>
>>> </contentLanguages>
>>>
>>> Where "standard" could be an extensible controlled vocabulary.
>>>
>>> or
>>>
>>> 3) Extend the dc:language type so that it accepts a similar "standard"
>>> attribute.
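>>>
>>> As a rough sketch of option 3, the instance data might look like the
>>> lines below. The "standard" attribute is borrowed from the option 2
>>> example and is hypothetical: dc:language as defined by Dublin Core
>>> has no such attribute, so this assumes a derived type is declared.
>>>
>>> <dc:language standard="ethnologue">aaa</dc:language>
>>> <dc:language standard="ethnologue">aab</dc:language>
>>>
>>> Unlike option 2, this form needs no wrapper element, but it would let
>>> each element name a different standard.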
>>>
>>> Are there other alternatives we should consider?
>>>
>>> I think the requirements are that:
>>>
>>> * Language can be optional.
>>> * There can be multiple languages.
>>> * We must somehow know which standard is used for the language.
>>>
>>> I don't think it would be necessary to allow multiple language elements
>>> where each one could potentially relate to a different standard.
>>>
>>> I don't have strong feelings about this, although I would be more
>>> inclined to choose option 2. Option 1 would have less impact on existing
>>> implementations and installations, but we would need to be sure that the
>>> standard we choose would really cover all needs.
>>>
>>> What do you think?
>>>
>>> Regards,
>>> --
>>> Renato
>>>
>>>
>>> _______________________________________________
>>> tdwg-tapir mailing list
>>> tdwg-tapir at lists.tdwg.org
>>> http://lists.tdwg.org/mailman/listinfo/tdwg-tapir






