Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir: capabilities

Tue Jul 25 11:54:29 CEST 2006

I agree that logging via GUID doesn't help in the many cases where the provider wants to know what was searched for.

But searching a portal cache to find data from 20 different providers in one search and then sending 20 log requests could also be annoying. On top of that, the portal would carry the burden of checking the registry to see whether each provider really wants logging.
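
To make the fan-out concrete, here is a minimal Python sketch of what a portal would have to do after one cached search. Everything in it is an assumption for illustration: the function name `plan_log_requests`, the shape of the hits, and the `accepts_logging` lookup (standing in for whatever the registry or capabilities check would return) are not part of TAPIR.

```python
from collections import defaultdict


def plan_log_requests(hits, accepts_logging):
    """Group cached search hits by provider and keep only opted-in providers.

    hits: iterable of (provider_id, record_id) pairs from one portal search
          (hypothetical shape of a cached result set).
    accepts_logging: dict of provider_id -> bool, standing in for the
          registry/capabilities lookup (hypothetical).
    Returns a dict of provider_id -> list of record ids to report.
    """
    by_provider = defaultdict(list)
    for provider_id, record_id in hits:
        by_provider[provider_id].append(record_id)
    # One log request per provider that wants logging; the rest are skipped.
    return {
        provider_id: record_ids
        for provider_id, record_ids in by_provider.items()
        if accepts_logging.get(provider_id, False)
    }
```

With 20 providers in the result set this yields up to 20 entries, i.e. up to 20 outgoing log requests for a single user search.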

The most efficient approach is probably portal-specific logging, as Donald suggests. But then providers would have a hard time aggregating the logging data from several totally different portals.

To get comparable logging across different portals, though, it seems to me that Renato's suggestions are worth a try. It would definitely need guidelines for portal developers so they know when to use a log request and how to use it. How to treat paging and map data are good examples where there is no obvious correct behaviour.
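
As a rough illustration of Renato's suggestion (quoted further down), the following Python sketch checks an "acceptLogRequests" attribute on an <operations> element and builds a <search logOnly="true"> request. Only those attribute and element names come from this thread; the surrounding document structure, the missing namespaces, the <query> child element and both function names are assumptions, not the TAPIR schema.

```python
import xml.etree.ElementTree as ET


def accepts_log_requests(capabilities_xml):
    """Return True if <operations acceptLogRequests="true"> is present.

    The layout of the capabilities document is assumed, and namespaces
    are ignored for simplicity.
    """
    root = ET.fromstring(capabilities_xml)
    ops = root if root.tag == "operations" else root.find(".//operations")
    return ops is not None and ops.get("acceptLogRequests") == "true"


def build_log_only_search(query_text):
    """Build a search element flagged as log-only.

    The <query> child is a placeholder for the real search body.
    """
    search = ET.Element("search", {"logOnly": "true"})
    ET.SubElement(search, "query").text = query_text
    return ET.tostring(search, encoding="unicode")
```

A portal would call accepts_log_requests() once per provider (and could cache the answer) before sending anything built by build_log_only_search().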

-- Markus

> -----Original Message-----
> From: tdwg-tapir-bounces at lists.tdwg.org 
> [mailto:tdwg-tapir-bounces at lists.tdwg.org] On Behalf Of 
> John R. WIECZOREK
> Sent: Monday, 24 July 2006 18:28
> To: roger at tdwg.org
> Cc: PyWrapper Developers mailing list; tdwg-tapir at lists.tdwg.org
> Subject: Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir: 
> capabilities
> 
> Logging that a GUID was used isn't sufficient; it doesn't 
> tell how the data were used. What I'm after is to log the 
> actual query that would have had to go to the provider to 
> produce the results used. The situation in your second example 
> shouldn't happen; the query should specify a record limit per 
> provider.
> 
> 
> On 7/24/06, Roger Hyam <roger at tdwg.org> wrote:
> 
> 
> 	I thought this sounded like a good idea, but since reading
> 	Renato's message I am now confused.
> 	
> 	If a user does a search on a portal, gets 100 results and
> 	looks at the first 10, does the provider of the 11th record get
> 	notified? Their data has been used because it was included in
> 	the count. Another example would be a portal that gives a
> 	distribution map to 10 km squares based on data from multiple
> 	providers. Each data point is made from several suppliers' data,
> 	and removal of any one supplier's data may not change the map.
> 	Do we notify them all?
> 	
> 	I could just about envisage a GUID-based system. The call to
> 	the log function would basically say "Someone has accessed the
> 	data that I got from you that you tag with this GUID", but I
> 	can't see how this would work on a search-based system. The log
> 	call would mean "Someone searched for something that made use
> 	of data I got from a search I did on you once".
> 	
> 	So really the only service we need is a GUID-based one. Perhaps
> 	extending the LSID resolution spec would be more appropriate?
> 	
> 	Roger
> 	
> 	
> 	
> 	Renato De Giovanni wrote: 
> 
> 		Hi John,
> 		
> 		Implementing the log request was never a problem. We
> 		discussed that again during the Madrid meeting, and only
> 		after that was a feature freeze suggested. It's true that
> 		PyWrapper is being adjusted now to conform to the new specs,
> 		and considering that DiGIR2 (or wasabi) postponed
> 		implementation of TAPIR, I suppose it should not be a big
> 		problem to make additional changes if necessary.
> 		
> 		The main problem I had with the log request was that it
> 		would probably not solve the issue behind it, which is to
> 		track usage by data aggregators. I still have the same
> 		feeling, and I can easily imagine situations where it would
> 		not be easy or even possible to translate searches on top of
> 		cached databases into TAPIR requests.
> 		
> 		
> 		But maybe I'm wrong, and if you all think it's a good
> 		feature then we can try to include it. However, I do think
> 		that providers should be able to advertise, as part of
> 		capabilities, whether they want to receive log requests or
> 		not.
> 		
> 		To me it also sounds like a new operation, especially if
> 		it's only related to search. It could make sense for view,
> 		inventory and metadata operations. Maybe capabilities too.
> 		But it doesn't make sense for ping. Well, maybe it could
> 		make sense for ping if the data aggregator monitors provider
> 		status and accepts similar requests on top of its results...
> 		
> 		So, yes, it could be a new attribute "logOnly" as part of
> 		the operationRequestGroup, with an answer </received> (just
> 		after the response header). And we could add an attribute
> 		"acceptLogRequests" in the <operations> element in
> 		capabilities responses. The other option would be to include
> 		a new operation, but maybe it's better to just have it as an
> 		optional attribute for all operations.
> 		
> 		Best Regards,
> 		--
> 		Renato
> 		
> 		On 19 Jul 2006 at 10:32, John R. WIECZOREK wrote:
> 		
> 		  
> 
> 			I appreciate that you will consider this request. I
> 			always thought it would be trivial to implement. Your
> 			simulation mode sounds very much like what I had in mind.
> 			I hadn't thought it necessary to get a response from a log
> 			request, but if there was a simple response, it could be
> 			used as a ping, or it could be used to retry logging until
> 			the provider did respond. So, something like
> 			<log request received>.
> 			
> 			I think the addition of a logOnly attribute is a good
> 			one, and could apply to every request type.
> 			
> 			Javi, I don't disagree that portals SHOULD log the data
> 			usage, especially to cover the situations where a provider
> 			doesn't respond.
> 			
> 			I also think that having the information logged at the
> 			provider is a responsible course of action, since they will
> 			have immediate access to the usage statistics that way. It
> 			will be much easier for a portal builder to send log
> 			requests than it will be to build the infrastructure and
> 			interfaces to logs, therefore it is more likely to actually
> 			get done.
> 			
> 			
> 			On 7/18/06, Javier de la Torre <jatorre at mncn.csic.es> wrote:
> 			    I am not sure about this,
> 			    
> 			    I still think that portals should be gathering this data
> 			    and making it available for data providers...
> 			    
> 			    But in any case, if you like it then I agree with Markus
> 			    that the best option is to include another parameter in
> 			    the operationRequestGroup.
> 			    
> 			    I haven't checked, but what happens if you do an
> 			    extension there with an attribute that is implementation
> 			    specific? A qualified attribute. Will this still validate
> 			    against our schema? You were discussing qualification of
> 			    attributes before, no?
> 			    
> 			    Javi.
> 			    
> 			    On 18/07/2006, at 10:41, Döring, Markus wrote:
> 			
> 			    
> 			    > John,
> 			    > all changes going on with TAPIR right now are really
> 			    > only changes in terminology or removing inconsistencies
> 			    > we did not detect before we started the documentation
> 			    > and final implementation.
> 			
> 			    >
> 			    >
> 			    > But nevertheless I would support your request.
> 			    > Especially from the implementation point of view this
> 			    > is a trivial change to the code. So why not add it?
> 			    > Just some additional thoughts:
> 			
> 			    >
> 			    >
> 			    > - I've already added a "simulation" mode to my code
> 			    > where no SQL gets executed but is just logged. So you
> 			    > can test configurations without risking sending off
> 			    > killer statements. That's similar to logOnly I guess,
> 			    > returning nothing but diagnostics. What would you
> 			    > suggest be returned for a logOnly request? Just the
> 			    > empty TAPIR envelope? Nothing? <OK>?
> 			    >
> 			
> 			    > - Would this log-only request not be needed for all
> 			    > requests? At least for inventories? So it would be
> 			    > easiest to have a new logOnly parameter in the header
> 			    > or "request element" just after the header? Something
> 			    > like <search logOnly="true">
> 			    >
> 			    >
> 			    >
> 			    > -- Markus
> 			    >
> 			    >
> 			    >> -----Original Message-----
> 			    >> From: pywrapper-devel-bounces at lists.sourceforge.net
> 			    >> [mailto:pywrapper-devel-bounces at lists.sourceforge.net]
> 			    >> On Behalf Of John R. WIECZOREK
> 			    >> Sent: Monday, 17 July 2006 23:30
> 			    >> To: Renato De Giovanni
> 			    >> Cc: pywrapper-devel at lists.sourceforge.net;
> 			    >> tdwg-tapir at lists.tdwg.org
> 			    >> Subject: Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir:
> 			    >> capabilities
> 			    >>
> 			    >> A little off topic, but it occurs to me that a great
> 			    >> deal of work is still ongoing with TAPIR, which
> 			    >> suggests to me that it may be warranted to re-state my
> 			    >> request for a simple message type - a log request.
> 			    >> This request would be the same as a search request,
> 			    >> except that the caller doesn't need a response.
> 			    >> Providers would use this type of request to log data
> 			    >> usage if the data were retrieved from a cache
> 			    >> elsewhere. I remember talking about this in Berlin, at
> 			    >> which time there was supposed to be a feature freeze.
> 			    >> Clearly we've gone beyond that, so I'm requesting it
> 			    >> again.
> 			    >>
> 			    >>
> 			    >> On 7/17/06, Renato De Giovanni <renato at cria.org.br> wrote:
> 			    >>
> 			    >>Hi,
> 			    >>
> 			    >>If I remember well, the "view" operation was re-included
> 			    >>in the protocol just to handle query templates,
> 			    >>specifically for TapirLite providers. So if someone
> 			    >>wants to query a provider using some external output
> 			    >>model that should be dynamically parsed, then the
> 			    >>"search" operation must be used instead (using either
> 			    >>XML or a simple GET request). View operations are really
> 			    >>bound to query templates, and they are not allowed to
> 			    >>specify "filter" or "partial" parameters.
> 			    >>--
> 			    >>Renato
> 			    >>
> 			    >>On 17 Jul 2006 at 21:26, "Döring, Markus" wrote:
> 			    >>
> 			    >>> I was just about to edit the schema and realized that
> 			    >>> output models are only specified for searches. But
> 			    >>> what about views? They use query templates, yes. But
> 			    >>> only the ones listed in capabilities? We should have
> 			    >>> dynamic ones here as well, I think. And they link back
> 			    >>> to static/dynamic models.
> 			    >>>
> 			    >>> So should models maybe become a separate section not
> 			    >>> tied to search/view operations? I am going to modify
> 			    >>> the schema nevertheless already to accommodate the
> 			    >>> changes below - ignoring views for now.
> 			    >>>
> 			    >>> Markus
> 			    
> 
> 		_______________________________________________
> 		tdwg-tapir mailing list
> 		
> 		tdwg-tapir at lists.tdwg.org 
> <mailto:tdwg-tapir at lists.tdwg.org> 
> 		http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
> 		
> 		
> 		  
> 
> 
> 
> 	-- 
> 	
> 	-------------------------------------
> 	 Roger Hyam
> 	 Technical Architect
> 	 Taxonomic Databases Working Group
> 	-------------------------------------
> 	 
> 	http://www.tdwg.org <http://www.tdwg.org> 
> 	 roger at tdwg.org
> 	 +44 1578 722782
> 	-------------------------------------
> 