Here is the link to our new paper, which has some relevance for the TDWG community:
http://jbi.nhm.ku.edu/index.php/jbi/article/view/36/20
Biodiversity Informatics, 4, 2007, pp. 1-13
A QUANTITATIVE COMPARISON OF XML SCHEMAS FOR TAXONOMIC PUBLICATIONS
GUIDO SAUTTER (1,3), KLEMENS BÖHM (1), AND DONAT AGOSTI (2)
(1) Department of Computer Science, Universität Karlsruhe (TH), 76128 Karlsruhe, Germany
(2) Division of Invertebrate Zoology, American Museum of Natural History, New York NY 10024-5192, and Naturmuseum der Burgergemeinde Bern, 3005 Bern, Switzerland
(3) sautter@ipd.uka.de

Abstract. Large numbers of legacy taxonomic publications are currently being digitized to make them available online and ready for full-text search. The documents are being marked up with XML for two purposes: to preserve the document structure, and to facilitate access via standard query languages like XQuery. With regard to the second aspect, the choice of an appropriate XML schema is crucial. It affects both query performance and the correctness of query results. Over the last few years, several different XML schemas have been proposed as markup standards for taxonomic publications. In this paper, we report on a thorough evaluation and comparison of these schemas. We have examined whether they facilitate the formulation and correct processing of queries that are common for taxonomic literature. We also compare the performance of these queries on documents that are marked up with the different schemas. Finally, we propose extensions to the schemas that enhance the correctness of query results.

Key words. Heritage literature, quantitative analysis, systematics, taxonomy, TaxonX, XML schema

Donat Agosti