XML gateways

Rob Stevenson robert.stevenson at UMB.EDU
Sun Mar 25 21:30:52 CEST 2001


Dave,

I tried looking at http://tsadev.speciesanalyst.net/DarwinCore/Darwin_core.asp

but I was unsuccessful. Is the address wrong? Must I be registered?

Rob Stevenson


Dave Vieglais wrote:

> >-----Original Message-----
> >From: TDWG - Structure of Descriptive Data [mailto:TDWG-SDD at USOBI.ORG]On
> >Behalf Of Robert A. (Bob) Morris
> >Sent: Tuesday, March 20, 2001 10:07 AM
> >To: TDWG-SDD at USOBI.ORG
> >Subject: Re: XML gateways
> >
> >
> >Jim Croft writes:
> > > Date:         Wed, 21 Mar 2001 00:25:09 +1100
> > > From: Jim Croft <jrc at anbg.gov.au>
> > > To: TDWG-SDD at usobi.org
> > > Subject:      Re: XML gateways
> > >
> ...
>
> >
> >Sad, but true. We should kick around other ideas to solve this
> >problem, though a conventional port may be the easiest for data
> >sources to implement. How about:
> >
> >- A convention whereby if
> >    http://<host>/<filePath>/<cgiquery>
> >yields HTML then
> >    http://<host>/<filePath>/xml/<cgiquery>
> >yields XML
> >
> >-A convention whereby people set up virtual hosts---pretty easy in
> >most web servers---so that if
> >    http://<host>/<filePath>/<cgiquery>
> >yields HTML then something like
> >     http://xml-<host>/<filePath>/<cgiquery>
> >yields XML
> >
> >Hopefully all of this is temporary, since a well crafted GBIF should
> >provide for discovery of URL, query syntax, and return schema. My real
> >point is that current versions of Oracle, SQL-Server, FileMaker, and
> >Access(?) can already emit XML without much(?) effort on the part of
> >the data source operators, and doing so would let people proceed to
> >build interesting distributed applications.
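
For what it's worth, the two URL conventions proposed above could be sketched roughly as follows. This is only an illustration: the function names and the example.org address are made up, and it assumes the URL carries no user:password@ part in front of the host.

```python
from urllib.parse import urlsplit, urlunsplit


def xml_url_by_path(html_url: str) -> str:
    """Convention 1: insert an 'xml' path segment before the last path segment."""
    parts = urlsplit(html_url)
    # Split "/filePath/script" into "/filePath" and "script".
    head, _, tail = parts.path.rpartition("/")
    new_path = f"{head}/xml/{tail}"
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


def xml_url_by_host(html_url: str) -> str:
    """Convention 2: prefix the host name with 'xml-' (a virtual host)."""
    parts = urlsplit(html_url)
    return urlunsplit(
        (parts.scheme, "xml-" + parts.netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # Hypothetical HTML query URL, used only for illustration.
    url = "http://example.org/cgi-bin/search?genus=Quercus"
    print(xml_url_by_path(url))
    print(xml_url_by_host(url))
```

Either way the client can derive the XML address mechanically from the HTML one, which is the whole point of having a convention.
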
>
> There are initiatives to resolve each of the issues brought up here that are
> being worked on in several industry sectors:
> 1. Service discovery is addressed by UDDI (Universal Description,
> Discovery and Integration).
> 2. Service description by WSDL (Web Services Description Language), which is
> an XML document format for describing methods and data exposed by a web
> service.
> 3. IGIR (Interface for Generic Information Retrieval) provides a generic
> format for information retrieval via HTTP GET or POST requests that is easy
> to implement.  The queries are sent as a structure easily translated into a
> number of common query syntaxes (SQL for example).  Results are contained
> within a consistently formatted document.  IGIR is optionally stateful (it's
> up to the server to decide)- so a query that generates a very large result
> set need not be sent back to the client all in one hit.  IGIR also supports
> a SOAP interface (that's what it was originally designed for), so
> programmatic access to information resources becomes pretty simple.
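
To make the "structure easily translated into SQL" idea concrete, here is a minimal sketch. The nested-tuple query structure below is invented for illustration and is not the actual IGIR format; the field names are also assumptions. The "?" placeholders follow the style used by, for example, Python's sqlite3 module.

```python
# Hypothetical query structure (NOT the real IGIR format):
#   leaf:   (field, operator, value)
#   branch: ("and" | "or", [child, child, ...])
ALLOWED_OPS = {"=", "<", ">", "<=", ">=", "like"}


def to_sql(node, allowed_fields, params):
    """Translate a query tree into a parameterised SQL WHERE fragment.

    Values are appended to `params` in order, so they are never pasted
    into the SQL text itself.
    """
    if node[0] in ("and", "or"):
        joiner = f" {node[0].upper()} "
        children = [to_sql(child, allowed_fields, params) for child in node[1]]
        return "(" + joiner.join(children) + ")"
    field, op, value = node
    if field not in allowed_fields or op not in ALLOWED_OPS:
        raise ValueError(f"unsupported field or operator: {field!r} {op!r}")
    params.append(value)
    return f"{field} {op.upper()} ?"


if __name__ == "__main__":
    query = ("and", [
        ("Genus", "=", "Quercus"),
        ("or", [("Country", "=", "Australia"), ("Country", "=", "Mexico")]),
    ])
    params = []
    where = to_sql(query, {"Genus", "Country"}, params)
    print("SELECT * FROM specimens WHERE " + where, params)
```

The same tree could just as well be walked to produce some other query syntax; only the leaf and joiner formatting would change.
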
>
> A search and retrieval scenario would work something like this:
> A client hits on the UDDI directory, looking for services that support IGIR
> and are "biologically relevant".  A list of services supporting IGIR is
> returned, and the client broadcasts a query to them and waits for results to
> start coming back.  Since the records are all formatted in a similar manner,
> it is a simple process for the client to merge the results and present to
> the user a nicely formatted set of results.
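
The broadcast-and-merge step of the scenario above might look something like this. It is a sketch under assumptions: the services are stand-in Python functions over in-memory lists rather than real network calls, the directory lookup is skipped entirely, and every name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def broadcast(query, services):
    """Send the same query to every service and merge records as they arrive."""
    merged = []
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {pool.submit(search, query): name for name, search in services.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                records = future.result()
            except Exception:
                # One unreachable provider should not sink the whole search.
                continue
            # Tag each record with the provider it came from.
            merged.extend({**record, "source": name} for record in records)
    return merged


def make_service(records):
    """Build a stand-in provider that filters an in-memory list by Genus."""
    def search(query):
        return [r for r in records if r.get("Genus") == query["Genus"]]
    return search


def broken_service(query):
    raise ConnectionError("provider unreachable")


if __name__ == "__main__":
    services = {
        "herbarium_a": make_service([{"Genus": "Quercus", "Species": "alba"}]),
        "museum_b": make_service([{"Genus": "Quercus", "Species": "robur"},
                                  {"Genus": "Acer", "Species": "rubrum"}]),
        "offline_c": broken_service,
    }
    for record in broadcast({"Genus": "Quercus"}, services):
        print(record)
```

Merging is a plain list extend here only because every provider already returns records with the same keys, which is exactly the agreement the scenario depends on.
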
>
> Anyway, that's how things should work.  To support a broadcast query,
> obviously the targets must understand the context of the query terms- so
> groups within a particular domain (such as biological collections for
> example) will need a set of "well known" access points (searchable fields).
> Similarly, to simplify the merging of results that come back from such a
> query, it would be convenient if there were an agreed record structure that
> was shared between data providers.  The XML output by most database vendors
> is a hack designed to allow easy transport of database records using HTTP.
> Things are getting better with XSD and so forth, but the structure of the
> resulting XML documents will still reflect the structure of the databases
> rather than any agreed upon common format.
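
As a rough illustration of the gap between database-shaped XML and an agreed record structure, a provider could run its vendor output through a small mapping step like the one below. The ROWSET/ROW layout, the column names, and the target element names are all assumptions made up for this sketch; they are not taken from any vendor's actual output or from the DarwinCore document.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one provider's column names to agreed element names.
FIELD_MAP = {
    "GENUS_NM": "Genus",
    "SPECIES_NM": "Species",
    "COLL_NO": "CatalogNumber",
}


def to_common_records(vendor_xml, field_map):
    """Rewrite row-per-record vendor XML into a shared record structure.

    Columns without an entry in `field_map` are dropped.
    """
    root = ET.fromstring(vendor_xml)
    out = ET.Element("records")
    for row in root.iter("ROW"):
        record = ET.SubElement(out, "record")
        for column in row:
            target = field_map.get(column.tag)
            if target is not None:
                ET.SubElement(record, target).text = column.text
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    vendor_xml = (
        "<ROWSET><ROW>"
        "<GENUS_NM>Quercus</GENUS_NM><SPECIES_NM>alba</SPECIES_NM>"
        "<COLL_NO>123</COLL_NO><INTERNAL_ID>9</INTERNAL_ID>"
        "</ROW></ROWSET>"
    )
    print(to_common_records(vendor_xml, FIELD_MAP))
```

Each provider would keep its own mapping table, so the shared format only has to be agreed once.
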
>
> Which brings us back to the "biological collections profile" concept that
> was raised at the last TDWG meeting.  Would anybody like to volunteer a set
> of access points (searchable fields) that are relevant for biological
> collections?  Similarly for the record structure?  The DarwinCore elements
> start to address these issues, but are somewhat deficient in several
> respects.  They might make a reasonable starting place though- The
> description with a comment section (add your gripes etc) is available at:
> http://tsadev.speciesanalyst.net/DarwinCore/Darwin_core.asp
>
> Most of you know that the DarwinCore was originally developed as a Z39.50
> profile, hence some of the language may seem a little odd.  But the main
> thing to take a look at is the list of access points- which are useful from
> a biological collections point of view, and what's missing?
>
> There are obviously other domains of biological databases besides just
> collection databases.  Any suggestions for commonality between them all?
>
> Cheers,
>   Dave V.
>
> =========================
> David A. Vieglais
> University of Kansas
> Natural History Museum &
> Biodiversity Research Center
> http://www.nhm.ukans.edu


