[tdwg-tag] Recent DiGIR service disruption

Roger Hyam rogerhyam at mac.com
Wed Oct 8 11:42:18 CEST 2008


Thanks to Rod and DaveV for their work.

I'd like to make one observation and concur with David Shorthouse on  
one point.

1) Looking down the list of providers many are exposed as IP addresses  
(which is bad), some are on domain names but not port 80 (which is  
slightly bad) and some are clearly machine names rather than names  
that one would associate with services. It appears to me that a
proportion of people are not using DNS robustly to allow services to be
moved from machine to machine in a sensible way. So, aside from the
current issues DaveV has outlined, there is inherent brittleness in the
way we are implementing these services. This may well be because
people in large or small institutions do not have easy access to DNS
configuration for their institutional domain even if they understand
the importance of it. This has big implications for the roll-out of LSIDs
or any other technology dependent on configuration of DNS.
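
As a rough illustration of the first two checks in point 1, here is a
minimal Python sketch that flags access URLs exposed as bare IP
addresses or on a non-default port. The function name and the example
URLs are hypothetical, and it does not attempt to spot machine names,
which needs a human eye.

```python
import ipaddress
from urllib.parse import urlsplit


def endpoint_issues(url):
    """Return a list of robustness concerns for one provider access URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    issues = []

    # A host that parses as an IP address cannot be moved to another
    # machine by changing DNS alone.
    try:
        ipaddress.ip_address(host)
        issues.append("bare IP address")
    except ValueError:
        pass

    # An explicit port other than the scheme default is a milder concern.
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    if parts.port is not None and parts.port != default_port:
        issues.append("non-default port %d" % parts.port)

    return issues


# Hypothetical access points, using documentation-only addresses.
for url in ("http://192.0.2.10:8080/digir/DiGIR.php",
            "http://digir.example.org/digir/DiGIR.php"):
    print(url, endpoint_issues(url))
```

Running something like this over a provider list would give a quick
count of how many endpoints depend on a fixed address.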

2) To concur with DavidS on the number of providers. Whilst
developing the Biodiversity Collections Index we worked through some
lists of DiGIR providers at various portals to see if we were missing
any 'real' collections. What struck us was the level of overlap and
the fact that we weren't discovering much (if anything) new. Rod's
page lists < 200 DiGIR providers (if I try to weed out the duplicates
by eye). BCI has around 5,000 physical collections listed. This seems
like a small percentage of coverage even if you take into account BioCASe
and TAPIR providers. There is overlap between providers and they do not
very often cover complete collections. On the other hand the more
important collections may be more likely to be covered.

Sorry for the lengthy cross post.

All the best,

Roger



On 8 Oct 2008, at 07:06, Dave Vieglais wrote:

> Apologies for cross posting.
>
> Hopefully, most of you did not notice this, but if you did, here's
> what happened.  A problem with resolving the digir.net domain was
> recently identified by John Wieczorek and possibly others.  This
> problem would have affected most, if not all DiGIR providers currently
> operating.  Once identified, the issue was promptly resolved, and
> ongoing problems are unlikely.
>
> This message provides further information about the problem, how it
> was resolved, and how recurrence will be avoided to ensure continued
> operation of the DiGIR infrastructure.
>
>
> = Problem =
>
> The DNS entry "digir.net" is an A record that must point to an IP
> address.  For several years digir.net resolved to the IP address
> "66.35.250.210", which is a Source Forge machine that is configured as
> a VHOST for digir.net.  At some point in the last 48 hours, 66.35.250.210
> ceased to respond as a VHOST for digir.net.  This meant that any web
> service that was expecting to retrieve content from digir.net (e.g.
> http://digir.net/schema/protocol/2003/1.0/digir.xsd) failed, with
> significant operational repercussions.  For technical reasons, most
> other digir.net DNS entries were unaffected (e.g. www.digir.net).
>
>
> = Resolution =
>
> The new Source Forge VHOST servicing www.digir.net was identified as
> "216.34.181.97", and the DNS entry for digir.net was updated
> appropriately.  Since the TTL on that DNS record was set relatively
> short (600 seconds), the DNS infrastructure picked up the new
> information fairly quickly, and testing indicates that digir.net URLs
> are now being correctly resolved from several locations around the
> world.
>
>
> = What You Need To Do =
>
> Nothing.  You should not notice any ongoing disruption.
>
>
> = Ensuring Continued Operation =
>
> There are three major issues that led to this malfunction (#2 and #3
> are informational only for developers of new services and tools):
>
>  1. We have no control over Source Forge, nor should digir.net
> infrastructure be dependent on them.
>
>  2. Much of the existing installed DiGIR infrastructure references
> "digir.net" directly and so DNS level indirection is not practical.
>
>  3. HTTP redirects need to be supported in service implementations
> and their clients.  I suspect that a lot of DiGIR software services
> and clients do not follow redirects properly.
>
>
> == Issue #1 ==
>
> A DNS monitoring service is now in place that will update the DNS
> entries if the Source Forge machine changes again.
>
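One way such a check could work, purely as a sketch: the actual
monitoring service is not described here, so the function names and the
idea of comparing the apex record against the www record are my
assumptions. The lookup is passed in as a parameter so the logic can be
tried without touching live DNS.

```python
import socket


def addresses(name):
    """IPv4 addresses currently published for name (live DNS lookup)."""
    return set(socket.gethostbyname_ex(name)[2])


def apex_is_stale(apex, reference, lookup=addresses):
    """True when apex no longer shares any address with reference."""
    return lookup(apex).isdisjoint(lookup(reference))


# Offline demonstration with a fake lookup table built from the two
# addresses mentioned in the message.
fake_dns = {
    "digir.net": {"66.35.250.210"},
    "www.digir.net": {"216.34.181.97"},
}
print(apex_is_stale("digir.net", "www.digir.net", lookup=fake_dns.__getitem__))
```

A real monitor would run this on a schedule and push an update to the
DNS provider when the check fires; that part is left out.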
> Also, a replica of the DiGIR web material currently hosted by Source
> Forge is being created on a machine that we have complete control over
> and is located on a fast internet backbone.  The replica machine will
> be monitored with a system that will provide automatic notification to
> appropriate contacts in the event of failure.  This machine will be
> located at the University of Kansas at least initially.  Once
> operational, the DNS entry for digir.net will be modified to point to
> this machine.
>
> DNS services are handled by dyndns.org.  Since their inception they
> have provided 100% uptime for their clients, which include domains
> such as CNN, Mozilla, and Twitter.  Since digir.net was an early
> adopter of their services, the DNS service provided by DynDNS does not
> expire.  DNS registration does expire, however, and this relatively
> minor expense is covered by the University of Kansas with assistance
> from the National Science Foundation.
>
>
> == Issue #2 ==
>
> In retrospect, something like "http://schema.digir.net" rather than
> "http://digir.net" should have been used.  This would have provided a
> low level (DNS) mechanism for indirection that would have been
> unaffected by this unanticipated hardware change.  Altering these
> entries in DiGIR providers and tools is really not practical at this
> stage, so it is unlikely that this issue can be directly addressed.
>
>
> == Issue #3 ==
>
> HTTP offers a mechanism for temporary and permanent redirection.  In
> order to take advantage of this functionality, it is necessary for
> HTTP clients to properly interpret the HTTP 301 and 307 status codes
> and the respective response headers to determine the new location of
> the resource being resolved.  This is the mechanism employed by PURLs
> and TinyURL, for example.  Unfortunately, many developers take
> shortcuts when building tools to retrieve content from URLs and avoid
> the additional small amount of work necessary to handle redirect
> messages.  Wherever possible, developers of services and clients
> should use libraries that at least properly handle HTTP redirects,
> which will allow the use of PURLs and other mechanisms for providing
> an additional level of indirection in URL resolution where necessary.
>
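To make the redirect handling concrete, here is a minimal Python sketch
of the loop a client needs: read the status code, and if it is a
redirect, take the Location header (which may be relative) as the next
URL. The names resolve and fetch are hypothetical, fetch stands in for
whatever HTTP call the client already makes, and the header dictionary
is assumed to use the exact key "Location".

```python
from urllib.parse import urljoin

# 301 and 307 are the codes discussed above; the others are the
# remaining standard redirect codes.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve(url, fetch, max_hops=5):
    """Follow redirects and return (final_url, status).

    fetch(url) is a hypothetical callable returning (status, headers),
    where headers is a dict keyed by header name.
    """
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return url, status
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")


# Offline demonstration with made-up URLs and canned responses.
responses = {
    "http://purl.example.org/digir.xsd":
        (301, {"Location": "http://schema.example.org/a/digir.xsd"}),
    "http://schema.example.org/a/digir.xsd":
        (307, {"Location": "/b/digir.xsd"}),
    "http://schema.example.org/b/digir.xsd": (200, {}),
}
print(resolve("http://purl.example.org/digir.xsd", responses.__getitem__))
```

In practice it is usually simpler to rely on a library that does this
already; Python's urllib.request, for instance, follows 301 and 307
responses to GET requests by default. The hop limit guards against
redirect loops.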
>
> regards,
>   Dave Vieglais
>
> ---
> Biodiversity Research Center
> University of Kansas
>
>
> _______________________________________________
> tdwg-tag mailing list
> tdwg-tag at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-tag

-------------------------------------------------------------
Roger Hyam
Roger at BiodiversityCollectionsIndex.org
http://www.BiodiversityCollectionsIndex.org
-------------------------------------------------------------
Royal Botanic Garden Edinburgh
20A Inverleith Row, Edinburgh, EH3 5LR, UK
Tel: +44 131 552 7171 ext 3015
Fax: +44 131 248 2901
http://www.rbge.org.uk/
-------------------------------------------------------------






