[tdwg-tapir] TAPIRLink and memory

Roger Hyam roger at tdwg.org
Thu Aug 2 12:14:51 CEST 2007


Thanks Peter,

I tried this but it didn't seem to work. The PHP documentation
suggests that memory_limit can be changed anywhere (even in a .htaccess
file), but ini_set() had no effect for me. Renato is correct that most
deployments are likely to have access to php.ini or know a man who does.

I have certainly changed script duration like this in the past and it  
has worked.
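
For anyone who wants to check whether the override is honoured on a
given host, here is a minimal sketch. It assumes nothing beyond the
standard ini_get(), ini_set() and memory_get_peak_usage() functions;
the helper name try_raise_memory_limit() is made up for illustration
and is not part of TapirLink.

```php
<?php
// Hypothetical helper, not part of TapirLink: try to raise memory_limit at
// runtime and report whether the new value was actually applied.
function try_raise_memory_limit($wanted)
{
    $before = ini_get('memory_limit');           // current limit, e.g. "8M"
    $result = ini_set('memory_limit', $wanted);  // old value on success, false on failure
    $after  = ini_get('memory_limit');           // re-read to see what is in force

    return array(
        'before'  => $before,
        'after'   => $after,
        'applied' => ($result !== false && $after === $wanted),
    );
}

$report = try_raise_memory_limit('16M');
echo 'memory_limit before: ' . var_export($report['before'], true) . "\n";
echo 'memory_limit after:  ' . var_export($report['after'], true) . "\n";
echo 'override applied:    ' . ($report['applied'] ? 'yes' : 'no') . "\n";

// memory_get_peak_usage() only exists from PHP 5.2 onwards, so guard the call.
if (function_exists('memory_get_peak_usage')) {
    echo 'peak usage (bytes):  ' . memory_get_peak_usage() . "\n";
}
```

If "override applied" comes back as "no", the limit is probably being
held by the host's configuration, and php.ini (or the ISP) is the only
route left.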

Anyhow I just moved over to a 'real' server instead for now.

All the best,

Roger


On 2 Aug 2007, at 01:18, Peter Neish wrote:

> Hi Renato and Roger,
>
> One thing that might be worth trying is calling ini_set() to override
> the php.ini setting for the memory limit, e.g.
> ini_set('memory_limit', '16M');
> This has worked for me on a shared host to run a content management
> system that choked at the lower default value.
>
> Regards,
>
> Peter
>
>>>> "Renato De Giovanni" <renato at cria.org.br> 2/08/07 2:21 >>>
> Hi Roger,
>
> I do think it's worth adding your changes related to the "fopen"
> workaround. Please don't hesitate.
>
> I should say that I'm actually surprised that TapirLink requires only
> 10M with such a complex RDF output model. I wonder how many records
> were being returned in your request?
>
> I also don't have much experience with profiling, but I'm sure
> there's room for improvements since I didn't pay much attention to
> optimization. By the way, the main new feature for the next version
> will be caching. I'm expecting significant improvements in
> performance since query templates, output models, and response
> structures will all be cached by default as serialized PHP. When
> TapirLink can use cached content, I suppose this will also reduce
> memory use. However, the first time it receives a particular output
> model in a request, it will still require the same amount of memory,
> so we will need additional optimizations if we want it to run below
> the 8M limit.
>
> Anyway, I also wonder how many people and organizations will need to
> run TAPIR provider software under the conditions you described
> (external ISP with such a low memory limit). I have still not heard
> of any case in our community (maybe someone from GBIF could give us a
> better picture?).
>
> Best Regards,
> --
> Renato
>
> On 1 Aug 2007 at 14:11, Roger Hyam wrote:
>
>> Hi All
>>
>> I spent a couple of hours this morning adapting TAPIRLink so that it
>> uses a workaround for fopen(), because many ISPs will not support
>> opening remote files (there is a PHP config option to stop it).
>>
>> Anyhow I got past this and found that my output model still wouldn't
>> run on my ISP account (Easyspace.com) but would run on my local
>> machine. After a while I found that it was running out of memory.
>>
>> My ISP limits memory to 8meg per running script. That seems pretty
>> tight until you imagine having a hundred scripts running
>> simultaneously. It was the default setting prior to PHP 5.2, when it
>> jumped to 128meg! This may explain why ISPs are very slow to migrate
>> to PHP 5.*
>>
>> The TAPIRLink request was using almost 10meg under PHP 5.2 according
>> to memory_get_peak_usage()  to parse the rather complex
>> TaxonOccurrence output model. There is no peak usage method on
>> earlier PHP versions. Ten meg seems quite reasonable considering the
>> cost of RAM these days - but there you have it.
>>
>> Anyhow I am nervous because this means that deployers might need to
>> mess with php.ini to get scripts running which means shared servers
>> may be problematic for deployments that use complex output models.
>> Basically it won't run everywhere PHP is available, but only where
>> PHP is available in a certain configuration. It also means that if
>> you are being crawled by a 10-threaded robot you will be using close
>> to 100meg to service the requests, plus memory allocation and
>> deallocation etc.
>>
>> It all seems pretty trivial if you have a newer machine, and even
>> more so if you install PHP 5.2+ on it, but it does mean that that old
>> departmental webserver with an old install of PHP on it may not run
>> the RDF-based output models out of the box.
>>
>> Never done any profiling with PHP and wouldn't like to get into it. I
>> guess Python will have a similar memory footprint as it is doing a
>> similar job but the install scenario is different for PyWrapper - you
>> really need shell access.
>>
>> It may not be worth adding the fopen() workaround if the deployment
>> environment requires access to php.ini.
>>
>> Would be grateful for your thoughts on this.
>>
>> Roger
>
> _______________________________________________
> tdwg-tapir mailing list
> tdwg-tapir at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
