[tdwg-tapir] TAPIRLink and memory
roger at tdwg.org
Wed Aug 1 15:11:21 CEST 2007
I spent a couple of hours this morning adapting TAPIRLink so that it
uses a workaround for fopen(), because many ISPs will not support
opening remote files (the allow_url_fopen option in php.ini can be
used to stop it).
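
For anyone following along, the idea behind such a workaround might be sketched roughly as below. This is not the actual TAPIRLink code: choose_transport() and fetch_remote() are made-up names for illustration, and the sketch assumes the cURL extension as the fallback, which a given host may or may not have installed.

```php
<?php
// Illustrative sketch only; choose_transport() and fetch_remote() are
// hypothetical names, not functions from TAPIRLink.

// Decide how to fetch a remote file, given the raw allow_url_fopen
// value and whether the cURL extension is loaded.
function choose_transport($allowUrlFopen, $hasCurl)
{
    // ini_get() returns a string such as "1", "", "On" or "Off".
    if (filter_var($allowUrlFopen, FILTER_VALIDATE_BOOLEAN)) {
        return 'fopen';
    }
    return $hasCurl ? 'curl' : 'none';
}

// Fetch a URL as a string, or return null if no transport works.
function fetch_remote($url, $timeout = 10)
{
    $transport = choose_transport(
        ini_get('allow_url_fopen'),
        function_exists('curl_init')
    );
    if ($transport === 'fopen') {
        $data = @file_get_contents($url);
        return $data === false ? null : $data;
    }
    if ($transport === 'curl') {
        $ch = curl_init($url);
        if ($ch === false) {
            return null;
        }
        // Return the body as a string instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        $data = curl_exec($ch);
        return $data === false ? null : $data;
    }
    return null;
}
```

Keeping the decision in its own small function means it can be checked without touching the network. Note that filter_var() itself needs PHP 5.2 or later, so an older install would need a plainer string comparison here.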
Anyhow I got past this and found that my output model still wouldn't
run on my ISP account (Easyspace.com) but would run on my local
machine. After a while I found that it was running out of memory.
My ISP limits memory to 8meg per running script. That seems pretty
tight until you imagine having a hundred scripts running
simultaneously. It was the default setting prior to PHP 5.2, when it
jumped to 128meg! This may explain why ISPs are very slow to migrate.
The TAPIRLink request was using almost 10meg under PHP 5.2 according
to memory_get_peak_usage() to parse the rather complex
TaxonOccurrence output model. There is no peak usage method on
earlier PHP versions. Ten meg seems quite reasonable considering the
cost of RAM these days - but there you have it.
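
A rough sketch of how such a figure could be collected in a way that still runs on older installs. The helper names are hypothetical. Since memory_get_peak_usage() is only there from PHP 5.2, the sketch falls back to memory_get_usage(), which reports current rather than peak usage and so may under-report.

```php
<?php
// Illustrative sketch only; these helper names are hypothetical.

// Best available memory figure in bytes, or null if PHP offers none.
function peak_memory_bytes()
{
    if (function_exists('memory_get_peak_usage')) {
        return memory_get_peak_usage();
    }
    if (function_exists('memory_get_usage')) {
        // Current usage only: a lower bound on the real peak.
        return memory_get_usage();
    }
    return null;
}

// Human-readable form of a byte count, e.g. "10.0 MB".
function format_megabytes($bytes)
{
    if ($bytes === null) {
        return 'unknown';
    }
    return sprintf('%.1f MB', $bytes / 1048576);
}
```

Calling error_log(format_megabytes(peak_memory_bytes())) as the last step of a request would put the figure in the server log without changing the response.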
Anyhow I am nervous because this means that deployers might need to
mess with php.ini to get scripts running which means shared servers
may be problematic for deployments that use complex output models.
Basically it won't run everywhere PHP is available, only where PHP is
available in a certain config. It also means that if you are being
crawled by a 10-threaded robot you will be using close to 100meg to
service the requests, plus memory allocation and deallocation etc.
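
The arithmetic here, and a check a script could make before attempting a heavy output model, might look something like the sketch below. The function names are hypothetical, and the 10meg per request is only the figure measured in this one case, not a guaranteed cost.

```php
<?php
// Illustrative sketch only; these helper names are hypothetical.

// Convert a php.ini size such as "8M" or "128M" to bytes.
// "-1" means no limit and is returned as -1.
function ini_size_to_bytes($value)
{
    $value = trim((string) $value);
    if ($value === '-1') {
        return -1;
    }
    $bytes = (int) $value; // leading digits, e.g. "8M" gives 8
    switch (strtolower(substr($value, -1))) {
        case 'g':
            $bytes *= 1024; // fall through
        case 'm':
            $bytes *= 1024; // fall through
        case 'k':
            $bytes *= 1024;
    }
    return $bytes;
}

// True if the given memory_limit value leaves room for $neededBytes.
function fits_memory_limit($limit, $neededBytes)
{
    $limitBytes = ini_size_to_bytes($limit);
    return $limitBytes === -1 || $limitBytes >= $neededBytes;
}

$perRequest = 10 * 1024 * 1024;  // roughly the measured 10meg
$crawlerThreads = 10;            // the 10-threaded robot
$totalForCrawler = $perRequest * $crawlerThreads; // 104857600 bytes
```

With these numbers fits_memory_limit('8M', $perRequest) is false and fits_memory_limit('128M', $perRequest) is true. In a live script the first argument would be ini_get('memory_limit').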
It all seems pretty trivial if you have a newer machine, and even
more so if you install PHP 5.2+ on it, but it does mean that the old
departmental webserver with an old install of PHP on it may not run
the RDF based output models out of the box.
Never done any profiling with PHP and wouldn't like to get into it. I
guess Python will have a similar memory footprint as it is doing a
similar job but the install scenario is different for PyWrapper - you
really need shell access.
It may not be worth adding the fopen() workaround if the deployment
environment requires access to php.ini.
Would be grateful for your thoughts on this.