[tdwg-content] Why UUIDs alone are not adequate as GUIDs, was Re: ITIS TSNID to uBio NamebankIDs mapping

Bob Morris morris.bob at gmail.com
Wed Jun 8 19:56:40 CEST 2011


And, honoring that tradition, actionable identifiers should fit in 16
bits.  Like "ls", "rm", "sh", "wc", and, most honored of all, "cc"
:-)
Bob


On Wed, Jun 8, 2011 at 1:43 PM, Dmitry Mozzherin <dmozzherin at eol.org> wrote:
> I would like to add an observation that comes from a computer science tradition:
>
> An object should have one responsibility, and several responsibilities
> should be achieved by a combination of objects.
>
> In the case of identifiers, a scientific name is a good illustration. It
> works both as an identifier and as a tiny classification. For example,
> Pinus sylvestris identifies a species and also tells us the genus of
> the species. As a result, the identifier changes when the classification
> changes, and you cannot identify a species until you find a genus
> placement. Combining two responsibilities has, in my opinion, decreased
> the usefulness of this particular identifier dramatically.
>
> Now imagine that Linnaeus had also added a resolution responsibility
> to the identifier. Would his inclusion of a resolution mechanism in
> the identifier not look inappropriate in this day and age?
>
> Dima
>
> On Wed, Jun 8, 2011 at 1:56 AM, Bob Morris <morris.bob at gmail.com> wrote:
>> Several points in your dialog with Rich Pyle confuse me. In places I
>> can't tell where you are relying on TDWG Applicability Statements,
>> where on W3C Recommendations, where on IETF RFCs, and where on Cool
>> URIs. The "easier to read" opinion pieces by Tim Berners-Lee that you
>> cite carry a warning that they are obsolete in places. (I find it hard
>> to identify those places, but
>> http://www.w3.org/TR/2008/NOTE-cooluris-20081203/ has a status a
>> little less personal than the TBL pieces, and I assume that's what you
>> are resting on.)  That CoolURI NOTE explicitly disclaims discussion of
>> non-http URIs, so it is hard for me to see how it supports arguments
>> about http URIs vs non-http URIs, although the last few sections make
>> arguments to bolster its position. Also, as written, the document
>> seems to have a vision of applicability to static web documents.
>> Hence(?) extrapolating to data services seems to require choosing an
>> RDF-based model of data services, and by no means is LOD the only
>> possible such model, ab-hominem (sic) arguments notwithstanding.
>>
>> I've eliminated so much of the dialog with Rich that I may be ignoring
>> context that will show me wrong below.
>>
>> 2011/6/7 Steve Baskauf <steve.baskauf at vanderbilt.edu>:
>>> Rich,
>>>>[Rich Pyle said:]
>>>>Here is where I completely disagree.  I've said it before, and I'll keep
>>>>saying it:  GUIDs are (should be) intended and necessary for
>>>>computer-computer communication; *NOT* for human-computer or human-human
>>>>communication.  Their beauty or ugliness should be determined by what's
>>>>beautiful or ugly to a computer, not to a human.  A consistent 128 bits is
>>>>"beautiful" to a computer, but a UUID is ugly to a human; whereas "Danaus
>>>>plexippus (Linnaeus 1758)" is beautiful to a human, but ugly to a computer
>>>>(for reasons Dima already outlined).
>>
>>>>[cyrillic character and other mumbles omitted]
>>
>>>>Almost by definition, then, a "beautiful" identifier for computer-computer
>>>>communication should be "ugly" to a pair of human eyeballs.
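To make the "consistent 128 bits" point concrete, here is a minimal Python sketch using only the standard library. The UUID value is a made-up example, not an identifier from any real dataset, and the helper name describe is hypothetical.

```python
import uuid

# Hypothetical example value; any UUID in the standard textual form would do.
guid = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")


def describe(u: uuid.UUID) -> dict:
    """Summarize the machine form and the human-visible form of a UUID."""
    text = str(u)
    return {
        "bits": len(u.bytes) * 8,   # fixed-width binary form
        "text": text,               # canonical hyphenated hex form
        "text_length": len(text),   # what a human would have to type
    }


info = describe(guid)
```

The binary form is always the same width, while the textual form is the part a human finds hard to read or transcribe.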
>>
>>>[Steve replied: ]
>>>I disagree with you completely here.  If you haven't read the "Cool URIs"
>>>piece, you should before we talk about this more.  It is full of examples
>>>that are easy to read and type and are intended to be "understood" by both
>>>humans and computers.  The piece at http://www.w3.org/Provider/Style/URI
>>>is an even easier read.  GUIDs CAN be easy to "read" and type, although
>>>they don't have to be.  The degree to which it "matters" whether a GUID is
>>>human readable or not depends primarily on the likelihood that humans will
>>>see it in print or type it in the URL box of a web browser.  In the
>>>examples of GUIDs for names that you provided, I will agree that it's not
>>>very likely that humans will be seeing them.  But if the GUID is of a
>>>specimen, an image, or a tree (which could easily appear in print or be
>>>written down by somebody to look at its web page), I would argue that
>>>readability is desirable, e.g. http://bioimages.vanderbilt.edu/uncg/966 .
>>>I realize that everyone does not agree with me on this, particularly the
>>>fans of UUIDs.
>>
>>>  As
>>> far as I know, there isn't any rule about what characters should be in an
>>> HTTP URI.
>>
>> The ASCII control characters are forbidden in URIs used in RDF, but I
>> guess that has no impact on your arguments.
>>
>>>[Steve continued:]
>>>But there is a general understanding that it is a best practice
>>> that an HTTP URI that is intended as an identifier should do content
>>> negotiation and produce both HTML for humans and RDF for machines.
>>>
>>
>> This "general understanding" is about particular models of how to
>> solve the dual use problem, and it's quite bound to the http protocol
>> and web browsers as clients.  Historically, such problems have
>> sometimes been solved at the client side also. For example, most
>> (all?) modern browsers can do pretty well with the FTP URI and the
>> MAILTO URI.
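For readers unfamiliar with the content negotiation being discussed, the following is a rough Python sketch of the server-side decision: inspect the Accept header and pick HTML or RDF. It is a simplified illustration, not a full implementation of HTTP's negotiation rules. The function name choose_representation and the choice of application/rdf+xml as the RDF media type are assumptions made for the example.

```python
# Representations this hypothetical server can produce, in default order.
SUPPORTED = ["text/html", "application/rdf+xml"]


def choose_representation(accept_header: str) -> str:
    """Pick a media type from a simplified reading of an Accept header."""
    prefs = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # default quality when no q= parameter is given
        for f in fields[1:]:
            if f.startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0
        prefs.append((q, media_type))
    # Highest quality first; the sort is stable, so ties keep header order.
    prefs.sort(key=lambda p: -p[0])
    for q, media_type in prefs:
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return SUPPORTED[0]
    # Nothing matched: fall back to the human-readable page.
    return SUPPORTED[0]
```

A browser sending "text/html,application/xhtml+xml,*/*;q=0.8" would get the HTML page, while a client sending "application/rdf+xml" would get RDF from the same URI.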
>>
>> History should not be ignored, especially the history of using
>> protocols du jour.  The convergence of mobile telephony and
>> information management and access arrived rather faster than most
>> predicted. In a mobile world, http may well prove a junior player for
>> data-centric apps, and http servers may not be whence data is
>> fetched. (This is already the case for Android phones. See
>> http://developer.android.com/guide/topics/providers/content-providers.html#urisum
>> which describes Android's CONTENT URI scheme.)  Similarly, a number
>> of popular P2P network clients implement the MAGNET URI scheme
>> (http://en.wikipedia.org/wiki/Magnet_URI_scheme). Indeed, a cynical view
>> of http://www.w3.org/Mobile/ would hold that W3C's direction is a plan
>> to keep the worldwide web relevant.  Will it succeed for data?
>> Possibly only with a redefinition of the web. For databases, it's not
>> any harder to make android content: protocol servers than to make
>> http: protocol servers. See
>> http://developer.android.com/guide/topics/providers/content-providers.html
>>
>>
>>
>>> [lots of stuff cut out here that will have to wait for another email]
>>>
>>>> [Rich said:]
>>>> Errr..sort of.  I say we identify things using GUIDs, and provide services
>>>> that resolve those GUIDs via actionable HTTP URIs (or, if you prefer,
>>>> embedding those GUIDs within a resolution metadata "wrapper").  Yes, I know
>>>> it's all the rage to collapse the functions of actionability and globally
>>>> unique identification into the same text-string URI (what I've been
>>>> referring to as the TB-L perspective).  But to be perfectly blunt, I see
>>>> this as a mistake that will, in the long run, slow down our progress.
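The separation Rich describes (an opaque GUID as the identifier, with an HTTP URI as a resolution "wrapper" around it) can be sketched in a few lines of Python. The resolver base URL and the function names to_actionable and from_actionable are hypothetical, chosen only for illustration; no real resolution service is implied.

```python
import uuid
from urllib.parse import urlsplit

# Hypothetical resolver endpoint; the GUID itself carries no such information.
RESOLVER_BASE = "http://resolver.example.org/guid/"


def to_actionable(guid: uuid.UUID, base: str = RESOLVER_BASE) -> str:
    """Wrap a bare GUID in an HTTP URI that a resolver could act on."""
    return base + str(guid)


def from_actionable(uri: str) -> uuid.UUID:
    """Recover the bare GUID from the last path segment of a wrapper URI."""
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    return uuid.UUID(last_segment)
```

Under this arrangement the resolver can change (a different host, a different protocol) while the GUID stays the same, which is the property the single-responsibility argument is after.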
>>>
>>>[Steve replied:  ]
>>> Why does this slow down our progress?  I don't get that at all.  I see your
>>> viewpoint as the one impeding progress because non-HTTP GUIDs make it
>>> difficult or impossible to describe things in RDF.
>>
>> Non-http GUIDs at worst make it difficult to play with data providers
>> and clients that only understand the http protocol, which is a
>> circular argument.  Certainly, for example, non-http GUIDs do not
>> interfere with SPARQL queries, or with RDF reasoners or with RDF data
>> integration.  In fact, even LOD has no need of http URIs except for
>> the convenience of the existing infrastructure.  Any dereferenceable
>> URI scheme would work. And so would multiple ones, provided only the
>> clients and servers both understood the schemes.
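As a rough illustration of why the URI scheme does not matter to graph matching, here is a toy in-memory triple store in Python. It is a stand-in for a basic triple pattern, not SPARQL and not an RDF library. All identifiers, the ex:hasName predicate, and the match function are made up for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Subjects deliberately mix URI schemes; every value here is hypothetical.
TRIPLES: List[Triple] = [
    ("urn:uuid:123e4567-e89b-12d3-a456-426614174000", "ex:hasName", "Danaus plexippus"),
    ("urn:lsid:example.org:names:42", "ex:hasName", "Pinus sylvestris"),
    ("http://example.org/taxon/7", "ex:hasName", "Pinus sylvestris"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Matching compares identifiers as opaque strings, whatever their scheme.
pinus_subjects = {t[0] for t in match(TRIPLES, o="Pinus sylvestris")}
```

The query returns the urn:lsid and http subjects side by side. The scheme only becomes relevant when a client tries to dereference one of them.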
>>
>> Finally, a social issue about your arguments on the importance of the
>> ease of transcribing URIs from paper.  Of course, wholly within
>> electronic clients for humans, this is irrelevant because the client
>> can render the identifier in any form mutually agreeable to the human
>> and the software. With no insult intended (well, maybe a friendly
>> little poke... :-) ), the social issue is this: mainly it's people
>> over 30 who find paper publication anything other than a quaint
>> annoyance. Others will be bemused if not astonished that some people
>> think that paper is important for prospective publishing in the
>> sciences.
>>
>> In the spirit of ending on agreement: I agree with everything where
>> you and Rich agree...oh, wait, that's because I agree with everything
>> Rich said. :-)
>>
>>
>> Bob Morris
>>
>>
>>>
>>>
>>> Richard L. Pyle, PhD
>>> Database Coordinator for Natural Sciences
>>> Associate Zoologist in Ichthyology
>>> Dive Safety Officer
>>> Department of Natural Sciences, Bishop Museum
>>> 1525 Bernice St., Honolulu, HI 96817
>>> Ph: (808)848-4115, Fax: (808)847-8252
>>> email: deepreef at bishopmuseum.org
>>> http://hbs.bishopmuseum.org/staff/pylerichard.html
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> tdwg-content mailing list
>>> tdwg-content at lists.tdwg.org
>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>>
>>>
>>>
>>> --
>>> Steven J. Baskauf, Ph.D., Senior Lecturer
>>> Vanderbilt University Dept. of Biological Sciences
>>>
>>> postal mail address:
>>> VU Station B 351634
>>> Nashville, TN  37235-1634,  U.S.A.
>>>
>>> delivery address:
>>> 2125 Stevenson Center
>>> 1161 21st Ave., S.
>>> Nashville, TN 37235
>>>
>>> office: 2128 Stevenson Center
>>> phone: (615) 343-4582,  fax: (615) 343-6707
>>> http://bioimages.vanderbilt.edu
>>>
>>>
>>>
>>
>>
>>
>> --
>> Robert A. Morris
>>
>> Emeritus Professor  of Computer Science
>> UMASS-Boston
>> 100 Morrissey Blvd
>> Boston, MA 02125-3390
>> IT Staff
>> Filtered Push Project
>> Department of Organismal and Evolutionary Biology
>> Harvard University
>>
>>
>> email: morris.bob at gmail.com
>> web: http://efg.cs.umb.edu/
>> web: http://etaxonomy.org/mw/FilteredPush
>> http://www.cs.umb.edu/~ram
>> phone (+1) 857 222 7992 (mobile)
>>
>




