Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir: capabilities

25 Jul 2006

Why make this so hard? We're talking about a portal sending out one message
to n providers and not having to wait for a response. Right now we would
have to send out a message and wait for all of the responses (up to a
timeout). This is a net improvement.

Why do we need metadata from providers to know if they want logging
requests? We don't ask them if they want metadata requests, or how often.
Just send the requests. If they want to log them, they will configure their
provider to do so. Otherwise the provider will ignore them.
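
A minimal sketch of that fire-and-forget pattern, for illustration only. The names `dispatch_log_requests` and `_safe_send` and the injected `send` callable are assumptions made for this example; they are not part of TAPIR or PyWrapper. The portal queues one log request per provider and carries on without waiting for any of them:

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool so the portal never blocks on a slow provider.
_POOL = ThreadPoolExecutor(max_workers=8)

def _safe_send(send, provider_url, query):
    """Deliver one log request; failures are swallowed on purpose."""
    try:
        send(provider_url, query)
        return True
    except Exception:
        # A provider that is down or ignores log requests is not an error
        # from the portal's point of view.
        return False

def dispatch_log_requests(provider_urls, query, send):
    """Queue one log request per provider and return immediately.

    `send` is a hypothetical callable (provider_url, query) that performs
    the actual HTTP call. The returned futures can simply be ignored, or
    inspected later if the portal wants to know which providers answered.
    """
    return [_POOL.submit(_safe_send, send, url, query) for url in provider_urls]
```

Because the transport is passed in, the same function works with any HTTP client, and a provider that never answers costs the portal nothing but a worker thread.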

In the meantime, always log on the portal side. Not only will that give
providers with flaky connections a place to see usage statistics, but it
will also generate information that will be interesting on its own - summary
information about how the portal is used that you wouldn't get from the
providers.

To me, this usage business is important enough that I would even go so far
as to certify portals as being in compliance with this social
contract. That way a provider could release access to certified portals and
disallow access for those who don't abide by the contract. Remember, the
semblance of control is really important to a lot of our providers. If you
don't think so, have someone do a survey of existing providers to see if
they would want it or not. It would be a sample biased against needing
logging (since they are already doing without). If that survey turned up
interest in logging anyway, then it's worth doing. My feeling is that it is
so easy to implement (if you don't try to get unnecessarily fancy) that it
should just be done - it would be easier than conducting a survey about it.

On 7/25/06, "Döring, Markus" <m.doering@bgbm.org> wrote:
...
I agree that logging via GUID doesn't help in the many cases where the
provider wants to know what was searched for.
But searching a portal cache to find data from 20 different providers
in one search and then sending off 20 log requests could also be annoying.
On top of that comes the burden on the portal of checking the registry to
see whether a provider really wants logging.
The most efficient approach is probably portal-specific logging, as Donald
suggests. But then providers would have a hard time aggregating the
logging data from several totally different portals.
To get comparable logging across different portals, though, it seems to me
that Renato's suggestions are worth a try. It would definitely need
guidelines for portal developers so that they know when to use a log
request and how to use it. How to treat paging and map data are good
examples of cases where there is no obviously correct behaviour.
-- Markus
...
-----Original Message-----
From: tdwg-tapir-bounces@lists.tdwg.org
[mailto:tdwg-tapir-bounces@lists.tdwg.org] On Behalf Of
John R. WIECZOREK
Sent: Monday, 24 July 2006 18:28
To: roger@tdwg.org
Cc: PyWrapper Developers mailing list; tdwg-tapir@lists.tdwg.org
Subject: Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir:
capabilities
Logging that a GUID was used isn't sufficient; it doesn't
tell how the data were used. What I'm after is to log the
actual query that would have had to go to the provider to
produce the results used. The case in your second example
shouldn't happen; the query should specify a record limit per
provider.
On 7/24/06, Roger Hyam <roger@tdwg.org> wrote:
I thought this sounded like a good idea but since
reading Renato's message I am now confused.
If a user does a search on a portal, gets 100
results and looks at the first 10, does the provider of the
11th record get notified? Their data has been used because it
was included in the count. Another example would be if a
portal gave a distribution map to 10km squares based on data
from multiple providers. Each data point is made from several
suppliers' data and removal of any one supplier's data may not
change the map. Do we notify them all?
I could just about envisage a GUID-based system. The
call to the log function would basically say "Someone has
accessed the data that I got from you that you tag with this
GUID", but I can't see how this would work on a search-based
system. The log call would mean "Someone searched for
something that made use of data I got from a search I did on
you once".
So really the only service we need is a GUID-based one.
Perhaps extending the LSID resolution spec would be more appropriate?
Roger
Renato De Giovanni wrote:
Hi John,
Implementing the log request was never a problem. We discussed that again
during the Madrid meeting, and only after that a feature freeze was
suggested. It's true that PyWrapper is being adjusted now to conform to the
new specs, and considering that DiGIR2 (or wasabi) postponed implementation
of TAPIR, I suppose it should not be a big problem to make additional
changes if necessary.
The main problem I had with the log request was that it would probably not
solve the issue behind it, which is to track usage by data aggregators. I
still have the same feeling, and I can easily imagine situations when it
would not be easy or even possible to translate searches on top of cached
databases to TAPIR requests.
But maybe I'm wrong, and if you all think it's a good feature then we can
try to include it. However, I do think that providers should be able to
advertise as part of capabilities whether they want to receive log requests
or not.
To me it also sounds like a new operation, especially if it's only related
to search. It could make sense for view, inventory and metadata operations.
Maybe capabilities too. But it doesn't make sense for ping. Well, maybe it
could make sense for ping if the data aggregator monitors provider status
and accepts similar requests on top of its results...
So, yes, it could be a new attribute "logOnly" as part of the
operationRequestGroup with an answer </received> (just after the response
header). And we could add an attribute "acceptLogRequests" in the
<operations> element in capabilities responses. The other option would be
to include a new operation, but maybe it's better to just have it as an
optional attribute for all operations.
Best Regards,
--
Renato
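
To make the two attributes proposed above concrete, here is a rough sketch, not an implementation of the TAPIR schema. The attribute names "logOnly" and "acceptLogRequests" and the element names come from the message above; everything else (the helper names, the namespace-free XML, the surrounding `<request>` and `<capabilities>` wrappers in the usage) is a simplifying assumption for illustration:

```python
import xml.etree.ElementTree as ET

def accepts_log_requests(capabilities_xml):
    """True if the <operations> element carries acceptLogRequests="true".

    Assumes a simplified, namespace-free capabilities document.
    """
    root = ET.fromstring(capabilities_xml)
    ops = root if root.tag == "operations" else root.find(".//operations")
    return ops is not None and ops.get("acceptLogRequests") == "true"

def mark_log_only(request_xml, operation="search"):
    """Return the request with logOnly="true" set on the operation element.

    The request body is otherwise unchanged, so the provider can log
    exactly the query it would have been asked to run.
    """
    root = ET.fromstring(request_xml)
    op = root if root.tag == operation else root.find(f".//{operation}")
    if op is None:
        raise ValueError(f"no <{operation}> element in request")
    op.set("logOnly", "true")
    return ET.tostring(root, encoding="unicode")
```

A portal could call `accepts_log_requests` once per provider when it reads capabilities, and `mark_log_only` on each request it answers from its cache.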
On 19 Jul 2006 at 10:32, John R. WIECZOREK wrote:
I appreciate that you will consider this request. I always thought it
would be trivial to implement. Your simulation mode sounds very much like
what I had in mind. I hadn't thought it necessary to get a response from a
log request, but if there was a simple response, it could be used as a
ping, or it could be used to retry logging until the provider did respond.
So, something like <log request received>.
I think the addition of a logOnly attribute is a good one, and could apply
to every request type.
Javi, I don't disagree that portals SHOULD log the data usage, especially
to cover the situations where a provider doesn't respond. I also think that
having the information logged at the provider is a responsible course of
action, since they will have immediate access to the usage statistics that
way. It will be much easier for a portal builder to send log requests than
it will be to build the infrastructure and interfaces to logs, therefore it
is more likely to actually get done.
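
The retry idea mentioned above (resend the log request until the provider acknowledges it) might look roughly like this. The function name, the `send` callable returning True on acknowledgement, and the exponential back-off are all assumptions chosen for the example; the thread itself does not specify a retry policy:

```python
import time

def log_until_acknowledged(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns True, backing off between attempts.

    `send` is a hypothetical zero-argument callable that posts the log
    request and returns True once the provider's simple acknowledgement
    arrives. Returns the number of attempts used, or None if the
    provider never answered.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if send():
                return attempt
        except OSError:
            pass  # treat network errors like a missing acknowledgement
        if attempt < max_attempts:
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return None
```

Giving up after a bounded number of attempts keeps a permanently offline provider from tying up the portal, which is also the case where portal-side logging has to cover the gap.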
On 7/18/06, Javier de la Torre <jatorre@mncn.csic.es> wrote:
I am not sure about this; I still think that portals should be gathering
this data and making it available for data providers...
But in any case, if you like it then I agree with Markus that the best
option is to include another parameter in the operationRequestGroup.
I haven't checked, but what happens if you do an extension there with an
attribute that is implementation-specific? A qualified attribute. Will this
still validate against our schema? You were discussing qualification of
attributes before, no?
Javi.
On 18/07/2006, at 10:41, Döring, Markus wrote:
> John,
> all changes going on with TAPIR right now are really only changes in
> terminology or removing inconsistencies we did not detect before we
> started the documentation and final implementation.
>
> But nevertheless I would support your request. Especially from the
> implementation point of view this is a trivial change to the code. So why
> not add it? Just some additional thoughts:
>
> - I've added a "simulation" mode already to my code where no SQL gets
> executed but just logged. So you can test configurations without risking
> sending off killer statements. That's similar to logOnly I guess,
> returning nothing but diagnostics. What would you suggest to be returned
> for a logOnly request? Just the empty TAPIR envelope? Nothing? <OK>?
>
> - Would this log-only request not be needed for all requests? At least
> for inventories? So it would be easiest to have a new logOnly parameter in
> the header or "request element" just after the header? Something like
> <search logOnly="true">
>
> -- Markus
>
>> -----Original Message-----
>> From: pywrapper-devel-bounces@lists.sourceforge.net
>> [mailto:pywrapper-devel-bounces@lists.sourceforge.net] On Behalf Of
>> John R. WIECZOREK
>> Sent: Monday, 17 July 2006 23:30
>> To: Renato De Giovanni
>> Cc: pywrapper-devel@lists.sourceforge.net; tdwg-tapir@lists.tdwg.org
>> Subject: Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir:
>> capabilities
                          >>
>> A little off topic, but it occurs to me that a great deal of work is
>> still ongoing with TAPIR, which suggests to me that it may be warranted
>> to re-state my request for a simple message type - a log request. This
>> request would be the same as a search request, except that the caller
>> doesn't need a response. Providers would use this type of request to log
>> data usage if the data were retrieved from a cache elsewhere. I remember
>> talking about this in Berlin, at which time there was supposed to be a
>> feature freeze. Clearly we've gone beyond that, so I'm requesting it
>> again.
                          >>
                          >>
>> On 7/17/06, Renato De Giovanni <renato@cria.org.br> wrote:
>>
>> Hi,
>>
>> If I remember well, the "view" operation was re-included in the protocol
>> just to handle query templates, specifically for TapirLite providers. So
>> if someone wants to query a provider using some external output model
>> that should be dynamically parsed, then the "search" operation must be
>> used instead (using either XML or simple GET request). View operations
>> are really bound to query templates, and they are not allowed to specify
>> "filter" or "partial" parameters.
>> --
>> Renato
                          >>
>> On 17 Jul 2006 at 21:26, "Döring, Markus" wrote:
>>
>>> I was just about to edit the schema and realized that output models
>>> are only specified for searches. But what about views? They use query
>>> templates, yes. But only the ones listed in capabilities? We should
>>> have dynamic ones here as well, I think. And they link back to
>>> static/dynamic models.
>>>
>>> So should models maybe become a separate section not tied to
>>> search/view operations? I am going to modify the schema nevertheless
>>> already to accommodate the changes below - ignoring views for now.
>>>
>>> Markus
_______________________________________________
              tdwg-tapir mailing list
tdwg-tapir@lists.tdwg.org
<mailto:tdwg-tapir@lists.tdwg.org>
              http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
--
-------------------------------------
Roger Hyam
Technical Architect
Taxonomic Databases Working Group
-------------------------------------
http://www.tdwg.org
roger@tdwg.org
+44 1578 722782
-------------------------------------