Recent DiGIR service disruption
Apologies for cross posting.
Hopefully, most of you did not notice this, but if you did, here's what happened. A problem with resolving the digir.net domain was recently identified by John Wieczorek and possibly others. This problem would have affected most, if not all, DiGIR providers currently operating. Once identified, the issue was promptly resolved, and ongoing problems are unlikely.
This message provides further information about the problem, how it was resolved, and how recurrence will be avoided to ensure continued operation of the DiGIR infrastructure.
= Problem =
The DNS entry "digir.net" is an A record that must point to an IP address. For several years digir.net resolved to the IP address "66.35.250.210", a Source Forge machine configured as a VHOST for digir.net. At some point in the last 48 hours, 66.35.250.210 ceased to respond as a VHOST for digir.net. This meant that any web service that expected to retrieve content from digir.net (e.g. http://digir.net/schema/protocol/2003/1.0/digir.xsd ) failed, with significant operational repercussions. For technical reasons, most other digir.net DNS entries (e.g. www.digir.net ) were unaffected.
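The failure described above was not a DNS lookup failure as such: the name still resolved, but the machine at that address no longer answered for the digir.net virtual host. A minimal Python sketch of how one might probe for that condition follows. The function name `serves_vhost` and the `connection_factory` parameter are hypothetical, introduced only for illustration, and this is not a description of how the problem was actually diagnosed.

```python
import http.client


def serves_vhost(ip, host, path="/", connection_factory=http.client.HTTPConnection, timeout=10):
    """Return True if the machine at `ip` answers HTTP requests for virtual host `host`."""
    # Connect to the IP address directly, bypassing DNS for the host name.
    conn = connection_factory(ip, 80, timeout=timeout)
    try:
        # Name the virtual host explicitly in the Host header.
        conn.request("HEAD", path, headers={"Host": host})
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed response.
        return False
    finally:
        conn.close()
    # Treat 2xx and 3xx answers as "this machine serves the virtual host".
    return 200 <= status < 400
```

For example, `serves_vhost("216.34.181.97", "digir.net", "/schema/protocol/2003/1.0/digir.xsd")` would ask the new address for the schema file named in this message; the result depends on the live state of that machine.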
= Resolution =
The new Source Forge VHOST servicing www.digir.net was identified as "216.34.181.97", and the DNS entry for digir.net was updated appropriately. Since the TTL on that DNS record was set relatively short (600 seconds), the DNS infrastructure picked up the new information fairly quickly, and testing indicates that digir.net URLs are now being correctly resolved from several locations around the world.
= What You Need To Do =
Nothing. You should not notice any ongoing disruption.
= Ensuring Continued Operation =
There are three major issues that led to this malfunction (#2 and #3 are informational only, for developers of new services and tools):
1. We have no control over Source Forge, nor should digir.net infrastructure be dependent on them.
2. Much of the existing installed DiGIR infrastructure references "digir.net" directly, so DNS-level indirection is not practical.
3. HTTP redirects need to be supported in service implementations and their clients. I suspect that a lot of DiGIR software services and clients do not follow redirects properly.
== Issue #1 ==
A DNS monitoring service is now in place that will update the DNS entries if the Source Forge machine changes again.
Also, a replica of the DiGIR web material currently hosted by Source Forge is being created on a machine that we have complete control over and that is located on a fast internet backbone. The replica machine will be monitored with a system that will automatically notify appropriate contacts in the event of failure. This machine will be located at the University of Kansas, at least initially. Once it is operational, the DNS entry for digir.net will be modified to point to it.
DNS services are handled by dyndns.org. Since their inception they have provided 100% uptime for their clients, which include domains such as CNN, Mozilla, and Twitter. Since digir.net was an early adopter of their services, the DNS service provided by DynDNS does not expire. The domain registration does expire, however, and this relatively minor expense is covered by the University of Kansas with assistance from the National Science Foundation.
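The DNS monitoring mentioned at the start of this section is not described in detail, so the following is only a sketch of one way such a check could work. It assumes that the monitor compares the A record for digir.net against the one for www.digir.net, which is an assumption based on the Resolution section. The helper names `a_records` and `needed_update` are hypothetical, and the step that would actually push a change to the DNS provider is deliberately left out.

```python
import socket


def a_records(hostname, resolver=socket.getaddrinfo):
    """Return the sorted IPv4 addresses that `hostname` currently resolves to."""
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def needed_update(apex, reference, resolver=socket.getaddrinfo):
    """Return the addresses `apex` should be changed to, or None if it is up to date.

    Assumption: `reference` (e.g. www.digir.net) always tracks the correct host.
    """
    current = a_records(apex, resolver)
    wanted = a_records(reference, resolver)
    return None if current == wanted else wanted
```

A scheduled job could call `needed_update("digir.net", "www.digir.net")` and, when the result is not `None`, notify the maintainers or update the record through whatever interface the DNS provider offers.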
== Issue #2 ==
In retrospect, something like "http://schema.digir.net" rather than "http://digir.net" should have been used. This would have provided a low-level (DNS) mechanism for indirection that would have been unaffected by this unanticipated hardware change. Altering these entries in DiGIR providers and tools is really not practical at this stage, so it is unlikely that this issue can be directly addressed.
== Issue #3 ==
HTTP offers a mechanism for temporary and permanent redirection. To take advantage of it, HTTP clients must properly interpret the HTTP 301 and 307 status codes and the accompanying response headers to determine the new location of the resource being resolved. This is the mechanism employed by PURLs and TinyURL, for example. Unfortunately, many developers take shortcuts when building tools to retrieve content from URLs and avoid the small amount of additional work needed to handle redirect messages. Wherever possible, developers of services and clients should use libraries that at least handle HTTP redirects properly. This will allow the use of PURLs and other mechanisms for providing an additional level of indirection in URL resolution where necessary.
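As an illustration of the redirect handling described above, here is a minimal Python sketch of the logic a client needs: check the status code, read the Location header, and repeat with a hop limit. The function name `resolve_redirects` and the `fetch` callback are hypothetical. The message names only 301 and 307; accepting 302, 303 and 308 as well is an addition made here.

```python
from urllib.parse import urljoin

# 301 and 307 are the codes named in the text; the others are added for completeness.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve_redirects(url, fetch, max_hops=5):
    """Follow redirects starting at `url` and return (final_url, status).

    `fetch(url)` is a caller-supplied function that performs one request
    WITHOUT following redirects and returns (status, headers), where
    `headers` is a dict that uses the key "Location".
    """
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return url, status
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    # The hop limit protects against redirect loops.
    raise RuntimeError("too many redirects")
```

In practice it is usually simpler to rely on a library that already does this. Python's `urllib.request.urlopen`, for instance, follows redirects by default.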
regards, Dave Vieglais
--- Biodiversity Research Center University of Kansas
Thanks to Rod and DaveV for their work.
I'd like to make one observation and concur with David Shorthouse on one point.
1) Looking down the list of providers, many are exposed as IP addresses (which is bad), some are on domain names but not port 80 (which is slightly bad), and some are clearly machine names rather than names that one would associate with services. It appears to me that a proportion of people are not using DNS robustly to allow services to be moved from machine to machine in a sensible way. So aside from the current issues DaveV has outlined, there is inherent brittleness in the way we are implementing these services. This may well be because people in large or small institutions do not have easy access to DNS configuration for their institutional domain, even if they understand the importance of it. This has big implications for the roll-out of LSIDs or any other technology dependent on configuration of DNS.
2) To concur with DavidS on the number of providers. Whilst developing the Biodiversity Collections Index we worked through some lists of DiGIR providers at various portals to see if we were missing any 'real' collections. What struck us was the level of overlap and the fact that we weren't discovering much (if anything) new. Rod's page lists < 200 DiGIR providers (if I try to weed out the duplicates by eye). BCI has around 5,000 physical collections listed. This seems like a small percentage of coverage even if you take into account BioCASe and TAPIR providers. There is overlap between providers, and they do not very often cover complete collections. On the other hand, the more important collections may be more likely to be covered.
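The first observation above can be checked mechanically for a list of provider URLs. The sketch below is a minimal Python illustration: the function name `endpoint_issues` is hypothetical, and the URLs in the usage note are made-up examples rather than real providers. It flags bare IP addresses and ports other than 80. It does not try to tell machine names from service names, which needs human judgment.

```python
import ipaddress
from urllib.parse import urlsplit


def endpoint_issues(url):
    """Return a list of the robustness concerns that apply to one provider URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    issues = []
    try:
        # ip_address() raises ValueError for anything that is not an IP literal.
        ipaddress.ip_address(host)
        issues.append("bare IP address")
    except ValueError:
        pass
    # An absent port means the default for the scheme (80 for http).
    if parts.port not in (None, 80):
        issues.append("port other than 80")
    return issues
```

For instance, `endpoint_issues("http://192.0.2.10:8080/digir/DiGIR.php")` reports both concerns, while a URL on a plain domain name with no explicit port returns an empty list.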
Sorry for the lengthy cross post.
All the best,
Roger
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
Roger Hyam
Roger@BiodiversityCollectionsIndex.org
http://www.BiodiversityCollectionsIndex.org
Royal Botanic Garden Edinburgh
20A Inverleith Row, Edinburgh, EH3 5LR, UK
Tel: +44 131 552 7171 ext 3015
Fax: +44 131 248 2901
http://www.rbge.org.uk/
Dear Roger,
I agree that this is a big issue, and mucking with the DNS strikes me as the Achilles heel of easy adoption of LSIDs. That said, I think we desperately need GUIDs for specimens, if only to enable other approaches, such as mirroring data.
I've had a few emails about how to update the list of DiGIR (and other providers), so the list will change at some point (and grow). But you are right, the degree of coverage is pretty low, and I'd suggest heavily biased towards terrestrial vertebrates (perhaps because of the relatively small collection sizes, and the culture of assigning identifiers to individual specimens).
Regards
Rod
Roderic Page
Professor of Taxonomy
DEEB, FBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
participants (3)
- Dave Vieglais
- Roderic Page
- Roger Hyam