Topic 1: What do we mean by "GUID"?

Donald Hobern dhobern at GBIF.ORG
Tue Oct 11 16:37:18 CEST 2005


[ I will be trying to provide some structure to discussions in this mailing
list by raising specific topics and looking for comments.  Please keep the
Topic number in responses ]



Topic 1: What do we mean by GUID?



The most fundamental thing that we need to establish as we consider a GUID
implementation is a definition for "GUID" in this context.  We have been
using a number of terms to describe the identifiers we need (unique,
resolvable, persistent, etc.).



I've been spending some time following up on Rod Page's recommendation that
we consider the use of Archival Resource Keys (ARK) from the California
Digital Library (see http://wiki.gbif.org/guidwiki/wikka.php?wakka=ARK).
The CDL web site includes an excellent overview of this GUID model, which
also serves as a good introduction to the issues involved.  I would urge
you all to read this document (it's only nine pages long!):



http://www.cdlib.org/inside/diglib/ark/arkcdl.pdf



This document arrives at the following problem definition for persistent,
actionable identifiers:



1.  The goal: long-term actionable identifiers.

    a.  Requirement: that identifiers deliver you to objects (where
        feasible).
    b.  Requirement: that identifiers deliver you to object metadata.
    c.  Desirable: each object should wear its own identifier.
    d.  Requirement: that identifiers deliver you to statements of
        commitment.

2.  The problem: URLs break for some objects (that is, associations
    between URLs and objects are not maintained), and we have no way to
    tell which ones will or won't break.

3.  Why URLs break: because objects are moved, removed, and replaced -
    completely normal activities - and the provider in each case
    demonstrates insufficient commitment to update indirection tables, or
    to plan identifier assignment carefully. Persistence is in the mission
    of few organizations.

4.  Conventional hypothesis: use indirect names (PURLs, URNs, Handles)
    instead of URLs; what worked for DNS should work for digital object
    references.  Wrong. Indirection is spectacularly successful and
    elegant in DNS, but it's a side issue in the provision of digital
    object persistence.



This document clearly identifies issues around provider service commitments
as the key problem that needs solving.  The construction of ARKs seeks to
address this in a couple of ways.  It separates the role of Name Assigning
Authority (i.e. who initially assigns the identifier) from that of the Name
Mapping Authority (i.e. who is able to map the identifier to the data object
at any particular time).  It also defines a simple standard relationship
between three things: the data object, the metadata for the object, and a
commitment statement from the provider as to what aspects of persistence are
guaranteed.
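
To make the separation of roles concrete, here is a rough sketch in Python
of how an ARK could be taken apart and put back together.  It assumes the
general ARK form [http://NMA-host/]ark:/NAAN/Name and the convention that
appending "?" asks for metadata and "??" asks for the commitment statement,
as I understand them from the ARK documentation.  The class and function
names, the host names, and the NAAN "12345" are all made up for
illustration; this is not an official ARK library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ark:
    nma: Optional[str]  # Name Mapping Authority host; may change over time
    naan: str           # Name Assigning Authority Number; fixed at assignment
    name: str           # name chosen by the assigning authority

    @property
    def core(self) -> str:
        """The host-independent part of the identifier."""
        return f"ark:/{self.naan}/{self.name}"

    def url(self, nma: Optional[str] = None, inflection: str = "") -> str:
        """Build an actionable URL, optionally through a different NMA.

        inflection: "" for the object, "?" for metadata,
        "??" for the commitment statement (assumed convention).
        """
        host = nma or self.nma
        if host is None:
            raise ValueError("no Name Mapping Authority known for this ARK")
        return f"http://{host}/{self.core}{inflection}"


def parse_ark(text: str) -> Ark:
    """Split an ARK, with or without a leading NMA host, into its parts."""
    prefix, sep, rest = text.partition("ark:/")
    if not sep:
        raise ValueError(f"not an ARK: {text!r}")
    naan, slash, name = rest.partition("/")
    if not slash or not naan or not name:
        raise ValueError(f"ARK needs both a NAAN and a name: {text!r}")
    # Drop the scheme and surrounding slashes to leave just the host.
    host = prefix.split("://", 1)[-1].strip("/") or None
    return Ark(nma=host, naan=naan, name=name)


# Hypothetical identifier, for illustration only.
ark = parse_ark("http://example.org/ark:/12345/specimen42")
object_url = ark.url()
metadata_url = ark.url(inflection="?")
commitment_url = ark.url(inflection="??")
moved_url = ark.url(nma="mirror.example.net")
```

The point of the sketch is that the "ark:/NAAN/Name" part stays the same
when the object moves to another provider; only the host in front of it
changes, and the same identifier leads to the object, its metadata, and the
commitment statement.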



ARK is a technology that we have not really considered up to this point.  My
question for discussion is: what, if anything, is missing from or wrong with
the problem definition provided in this document?  If we agree that it
provides a crisp definition of what we need, that in itself will be a major
step forward.



Please provide your thoughts.



Donald

---------------------------------------------------------------
Donald Hobern (dhobern at gbif.org)
Programme Officer for Data Access and Database Interoperability
Global Biodiversity Information Facility Secretariat
Universitetsparken 15, DK-2100 Copenhagen, Denmark
Tel: +45-35321483   Mobile: +45-28751483   Fax: +45-35321480
---------------------------------------------------------------





