[tdwg-content] Why UUIDs alone are not adequate as GUIDs, was Re: ITIS TSNID to uBio NamebankIDs mapping
morris.bob at gmail.com
Wed Jun 8 19:56:40 CEST 2011
And, honoring that tradition, actionable identifiers should fit in 16
bits. Like "ls", "rm", "sh", "wc", and, most honored of all, "cc"
On Wed, Jun 8, 2011 at 1:43 PM, Dmitry Mozzherin <dmozzherin at eol.org> wrote:
> I would like to add an observation that comes from a computer science
> tradition: an object should have one responsibility, and several
> responsibilities should be achieved by combining objects.
> In the case of identifiers, a scientific name is a good illustration. It
> works both as an identifier and as a tiny classification. For example,
> Pinus sylvestris identifies a species and also tells us the genus of
> the species. As a result, the identifier changes when the classification
> changes, and you cannot identify a species until you find a genus
> placement. Combining two responsibilities has, in my opinion, decreased
> the usefulness of this particular identifier dramatically.
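
The dual-responsibility problem described above can be sketched in a few lines of Python. This is only an illustration: the `Taxon` class and the generated GUID are hypothetical, and the genus transfer shown (Papilio to Danaus for the monarch, which the parenthesized authorship "(Linnaeus 1758)" elsewhere in this thread implies) stands in for any reclassification.

```python
from dataclasses import dataclass
import uuid


@dataclass
class Taxon:
    guid: str      # opaque identifier: carries no classification
    genus: str
    epithet: str

    @property
    def name(self) -> str:
        # The scientific name doubles as a tiny classification.
        return f"{self.genus} {self.epithet}"


# Hypothetical record; the GUID is generated here for illustration only.
monarch = Taxon(guid=f"urn:uuid:{uuid.uuid4()}", genus="Papilio", epithet="plexippus")
guid_before, name_before = monarch.guid, monarch.name

# Reclassification: the species is moved to the genus Danaus.
monarch.genus = "Danaus"

print(name_before, "->", monarch.name)  # the name-as-identifier changed
print(guid_before == monarch.guid)      # the opaque identifier did not
```

Anything that stored the name as a key has to be updated after the move; anything that stored the opaque identifier does not.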
> Now imagine that Linnaeus had also added a resolution responsibility
> to the identifier. Would a resolution mechanism that he had built into
> the identifier still be appropriate in this day and age?
> On Wed, Jun 8, 2011 at 1:56 AM, Bob Morris <morris.bob at gmail.com> wrote:
>> Several points in your dialog with Rich Pyle confuse me. I can't tell
>> at places where you are relying on TDWG Applicability Statements,
>> where on W3 Recommendations, where on IETF RFCs, and where on Cool
>> URIs. The "easier to read" opinion piece by Tim Berners-Lee that you
>> cite carries a warning that it is obsolete in places. (I find it hard to
>> identify those places, but,
>> http://www.w3.org/TR/2008/NOTE-cooluris-20081203/ has a status a
>> little less personal than the TBL pieces, and I assume that's what you
>> are resting on.) That CoolURI NOTE explicitly disclaims discussion of
>> non-http URIs, so it is hard for me to see how it supports arguments
>> about http URIs vs non-http URIs, although the last few sections make
>> arguments to bolster its position. Also, as written, the document
>> seems to have a vision of applicability to static web documents.
>> Hence(?) extrapolating to data services seems to require choosing an
>> RDF-based model of data services, and by no means is LOD the only
>> possible such model, ab-hominem (sic) arguments notwithstanding.
>> I've eliminated so much of the dialog with Rich that I may be ignoring
>> context that will show me wrong below.
>> 2011/6/7 Steve Baskauf <steve.baskauf at vanderbilt.edu>:
>>>>[Rich Pyle said:]
>>>>Here is where I completely disagree. I've said it before, and I'll keep
>>>>saying it: GUIDs are (should be) intended and necessary for
>>>>computer-computer communication; *NOT* for human-computer or human-human
>>>>communication. Their beauty or ugliness should be determined by what's
>>>>beautiful or ugly to a computer, not to a human. A consistent 128 bits is
>>>>"beautiful" to a computer, but a UUID is ugly to a human; whereas "Danaus
>>>>plexippus (Linnaeus 1758)" is beautiful to a human, but ugly to a computer
>>>>(for reasons Dima already outlined).
>>>>[cyrillic character and other mumbles omitted]
>>>>Almost by definition, then, a "beautiful" identifier for computer-computer
>>>>communication should be "ugly" to a pair of human eyeballs.
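
To make the "consistent 128 bits" concrete, here is a minimal sketch using Python's standard `uuid` module. It only contrasts the fixed-width binary form a machine compares with the hyphenated text form a human has to read; the particular UUID is random and has no meaning.

```python
import uuid

u = uuid.uuid4()     # random (version 4) UUID

raw = u.bytes        # what a machine stores and compares: 16 bytes
text = str(u)        # what a human sees: 36 characters with hyphens

print(len(raw) * 8)  # number of bits in the binary form
print(len(text), text)

# Round trip: both forms denote the same identifier.
assert uuid.UUID(bytes=raw) == uuid.UUID(text)
```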
>>>[Steve replied: ]
>>> I disagree with you completely here. If you haven't read the "Cool URIs"
>>> piece, you should before we talk about this more. It is full of examples
>>> that are easy to read and type and are intended to be "understood" by
>>> both humans and computers. The piece at
>>> http://www.w3.org/Provider/Style/URI is an even easier read. GUIDs CAN
>>> be easy to "read" and type, although they don't have to be. The degree
>>> to which it "matters" whether a GUID is human readable or not depends
>>> primarily on the likelihood that humans will see it in print or type it
>>> in the URL box of a web browser. In the examples of GUIDs for names that
>>> you provided, I will agree that it's not very likely that humans will be
>>> seeing them. But if the GUID is of a specimen, an image, or a tree (which
>>> could easily appear in print or be written down by somebody to look at
>>> its web page), I would argue that readability is desirable, e.g.
>>> http://bioimages.vanderbilt.edu/uncg/966 . I realize that everyone does
>>> not agree with me on this, particularly the fans of UUIDs. As far as I
>>> know, there isn't any rule about what characters should be in an HTTP
>>> URI.
>> The ASCII control characters are forbidden in URIs used in RDF, but I
>> guess that has no impact on your arguments.
>>>But there is a general understanding that it is a best practice
>>> that an HTTP URI that is intended as an identifier should do content
>>> negotiation and produce both HTML for humans and RDF for machines.
>> This "general understanding" is about particular models of how to
>> solve the dual use problem, and it's quite bound to the http protocol
>> and web browsers as clients. Historically, such problems have
>> sometimes been solved at the client side also. For example, most
>> (all?) modern browsers can do pretty well with the FTP URI and the
>> MAILTO URI.
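
The server-side content negotiation mentioned above (HTML for humans, RDF for machines) can be sketched roughly as follows. This is a simplified illustration, not a complete implementation of HTTP's rules: the function name and the list of offered media types are assumptions, and wildcards such as `*/*` are ignored.

```python
def choose_representation(accept_header: str) -> str:
    """Pick a media type for a request, given its Accept header.

    Simplified: honours q-values but ignores wildcards such as */*.
    """
    offered = ["text/html", "application/rdf+xml"]  # assumed server capabilities
    best, best_q = offered[0], 0.0                  # fall back to HTML
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type, q = parts[0], 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if media_type in offered and q > best_q:
            best, best_q = media_type, q
    return best


# A browser-like request and a machine-client request for the same URI.
print(choose_representation("text/html,application/xhtml+xml;q=0.9"))
print(choose_representation("application/rdf+xml"))
```

The same identifier yields two representations; which one is returned depends only on what the client says it can accept.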
>> History should not be ignored, especially the history of using
>> protocols du-jour. The convergence of mobile telephony and
>> information management and access arrived rather faster than most
>> predicted. In a mobile world, http may well prove a junior player for
>> data-centric apps, and http servers may not be whence data is
>> fetched. (This is already the case for Android phones. See
>> which describes Android's CONTENT URI scheme ). Similarly, a number
>> of popular P2P network clients implement the MAGNET URI scheme
>> http://en.wikipedia.org/wiki/Magnet_URI_scheme. Indeed, a cynical view
>> of http://www.w3.org/Mobile/ would hold that W3C's direction is a plan
>> to keep the worldwide web relevant. Will it succeed for data?
>> Possibly only with a redefinition of the web. For databases, it's not
>> any harder to make android content: protocol servers than to make
>> http: protocol servers. See
>>> [lots of stuff cut out here that will have to wait for another email]
>>>> [Rich said:]
>>>> Errr..sort of. I say we identify things using GUIDs, and provide services
>>>> that resolve those GUIDs via actionable HTTP URIs (or, if you prefer,
>>>> embedding those GUIDs within a resolution metadata "wrapper"). Yes, I know
>>>> it's all the rage to collapse the functions of actionability and globally
>>>> unique identification into the same text-string URI (what I've been
>>>> referring to as the TB-L perspective). But to be perfectly blunt, I see
>>>> this as a mistake that will, in the long run, slow down our progress.
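
The separation Rich describes, a bare GUID for identity plus a resolution "wrapper" for actionability, might look like the sketch below. The resolver base URLs are hypothetical (example.org and example.net are reserved names, not real services), and the function names are invented for illustration.

```python
import uuid
from urllib.parse import urlsplit

# Hypothetical resolver base; not a real service.
RESOLVER_BASE = "http://resolver.example.org/guid/"


def to_actionable(guid: uuid.UUID, base: str = RESOLVER_BASE) -> str:
    """Wrap a bare GUID in an HTTP URI that a resolver could act on."""
    return base + str(guid)


def to_guid(uri: str) -> uuid.UUID:
    """Recover the bare GUID from the last path segment of a wrapper URI."""
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    return uuid.UUID(last_segment)


guid = uuid.uuid4()
uri = to_actionable(guid)
print(uri)

# The identity survives a change of resolver: only the wrapper differs.
moved = to_actionable(guid, base="http://other.example.net/lookup/")
assert to_guid(uri) == to_guid(moved) == guid
```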
>>>[Steve replied: ]
>>> Why does this slow down our progress? I don't get that at all. I see your
>>> viewpoint as the one impeding progress because non-HTTP GUIDs make it
>>> difficult or impossible to describe things in RDF.
>> Non-http GUIDs at worst make it difficult to play with data providers
>> and clients that only understand the http protocol, which is a
>> circular argument. Certainly, for example, non-http GUIDs do not
>> interfere with SPARQL queries, or with RDF reasoners or with RDF data
>> integration. In fact, even LOD has no need of http URIs except for
>> the convenience of the existing infrastructure. Any dereferenceable
>> URI scheme would work. And so would multiple ones, provided only the
>> clients and servers both understood the schemes.
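
As a rough illustration of this point, the sketch below stores triples whose subject is a `urn:uuid:` identifier and matches them with a simple pattern, then dispatches dereferencing on the URI scheme. Everything here is a toy under stated assumptions: the UUID and literal values are made up, the triple store is a plain list rather than an RDF library, and the resolver handlers are placeholders that only describe what a real client would do.

```python
from urllib.parse import urlsplit

# Triples as plain tuples; the subject uses a non-http scheme (urn:uuid).
# The UUID and the literal values are made up for illustration.
SPECIMEN = "urn:uuid:00000000-0000-4000-8000-000000000001"
triples = [
    (SPECIMEN, "http://rs.tdwg.org/dwc/terms/scientificName", "Danaus plexippus"),
    (SPECIMEN, "http://rs.tdwg.org/dwc/terms/catalogNumber", "966"),
]


def match(pattern, store):
    """Return triples matching an (s, p, o) pattern; None is a wildcard."""
    return [t for t in store
            if all(want is None or want == got for want, got in zip(pattern, t))]


# Dereferencing is a separate concern, dispatched on the URI scheme.
# These handlers are placeholders; a real client would do I/O here.
resolvers = {
    "http": lambda uri: f"HTTP GET {uri}",
    "urn": lambda uri: f"look up {uri} in a local index",
}


def dereference(uri: str) -> str:
    scheme = urlsplit(uri).scheme
    if scheme not in resolvers:
        raise ValueError(f"no resolver registered for scheme {scheme!r}")
    return resolvers[scheme](uri)


print(match((SPECIMEN, None, None), triples))
print(dereference(SPECIMEN))
```

Pattern matching never inspects the scheme of the subject; only `dereference` does, and it works for any scheme that both sides have agreed to register.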
>> Finally a social issue about your arguments on the importance of the
>> ease of transcribing URIs from paper. Of course, wholly within
>> electronic clients for humans, this is irrelevant because the client
>> can render the identifier in any form mutually agreeable to the human
>> and the software. With no insult intended (well---maybe a friendly
>> little poke... :-) ) , the social issue is this: mainly it's people
>> over 30 who find paper publication anything other than a quaint
>> annoyance. Others will be bemused if not astonished that some people
>> think that paper is important for prospective publishing in the
>> In the spirit of ending on agreement: I agree with everything where
>> you and Rich agree...oh, wait, that's because I agree with everything
>> Rich said. :-)
>> Bob Morris
>>> Richard L. Pyle, PhD
>>> Database Coordinator for Natural Sciences
>>> Associate Zoologist in Ichthyology
>>> Dive Safety Officer
>>> Department of Natural Sciences, Bishop Museum
>>> 1525 Bernice St., Honolulu, HI 96817
>>> Ph: (808)848-4115, Fax: (808)847-8252
>>> email: deepreef at bishopmuseum.org
>>> tdwg-content mailing list
>>> tdwg-content at lists.tdwg.org
>>> Steven J. Baskauf, Ph.D., Senior Lecturer
>>> Vanderbilt University Dept. of Biological Sciences
>>> postal mail address:
>>> VU Station B 351634
>>> Nashville, TN 37235-1634, U.S.A.
>>> delivery address:
>>> 2125 Stevenson Center
>>> 1161 21st Ave., S.
>>> Nashville, TN 37235
>>> office: 2128 Stevenson Center
>>> phone: (615) 343-4582, fax: (615) 343-6707
>> Robert A. Morris
>> Emeritus Professor of Computer Science
>> 100 Morrissey Blvd
>> Boston, MA 02125-3390
>> IT Staff
>> Filtered Push Project
>> Department of Organismal and Evolutionary Biology
>> Harvard University
>> email: morris.bob at gmail.com
>> web: http://efg.cs.umb.edu/
>> web: http://etaxonomy.org/mw/FilteredPush
>> phone (+1) 857 222 7992 (mobile)