1) Seeing data marked up in a standard can make it a lot easier to figure out how to implement that standard yourself. Without examples, it is difficult to tell whether you are interpreting the standard correctly, and you sometimes find mistakes in the examples, like the one listed above.
For instance, I saw an example where the Datum was recorded as "epsg:4326". I understood that it should be interpreted as "WGS84". But, I did not expect that Datum would appear in that form, and I would not have written a parser to recognize that.
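A parser that tolerates this kind of variation could be sketched roughly as follows. This is only an illustration: the alias table and the function name `normalize_datum` are hypothetical, and a real parser would need a much longer list of spellings.

```python
import re

# Hypothetical alias table mapping spellings seen in the wild to one
# canonical datum name. EPSG:4326 is the EPSG code for the WGS84
# geographic coordinate system.
DATUM_ALIASES = {
    "wgs84": "WGS84",
    "wgs 84": "WGS84",
    "epsg:4326": "WGS84",
}

def normalize_datum(raw):
    """Return a canonical datum name, or None if the value is not recognized."""
    # Trim, lowercase, and collapse internal whitespace before the lookup.
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return DATUM_ALIASES.get(key)
```

Returning None for unrecognized values, rather than guessing, makes it easy to log the spellings that sample data actually contains.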
With regard to RDF/OWL, having a sample data set allows you to see if you can ask the questions that you want to.
These were the reasons why I was looking for some sample data a while back.
For data that is going to be stored in triple stores and queried with SPARQL, it seems best to replace literals with URIs as much as possible. The system seems faster and more predictable. It appears that the system prefers a long but standard URI to a string of characters. This makes sense if you think about how the data is stored, but it is the opposite of how most people think.
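To make the difference concrete, here is a small sketch that writes the same statement once with a literal object and once with a URI object, in N-Triples syntax. The subject and predicate URIs are made-up examples, and the OGC-style URI for EPSG:4326 is just one possible choice of identifier, not something this thread has agreed on.

```python
def triple(subject, predicate, obj, is_uri):
    """Format one N-Triples line; obj is written as a URI or a plain literal."""
    if is_uri:
        obj_term = f"<{obj}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj_term} ."

# Hypothetical identifiers, for illustration only.
RECORD = "http://example.org/location/1"
DATUM = "http://example.org/terms/datum"

as_literal = triple(RECORD, DATUM, "epsg:4326", is_uri=False)
as_uri = triple(RECORD, DATUM, "http://www.opengis.net/def/crs/EPSG/0/4326", is_uri=True)
```

With the literal form, "epsg:4326", "EPSG:4326" and "WGS84" are three unrelated strings to the store. With the URI form, every record that uses the same identifier points at the same node.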
I am a little concerned about adopting a standard which people have not actually tried to work with.
2) It might be advantageous to split out some parts to speed this up.
If we could come up with a proposed representation for location, it would not be too difficult to generate some sample data sets that people could then test.
In the testing phase, we should see if there are any problems with the standard, and what kinds of errors people make when trying to produce records that comply with the standards.
Also, we might solicit feedback from some of the other people in the Semantic Web community. This might help improve the location standard and improve the chances that it might get more widely adopted.
Finally, we could produce example code in several different programming languages and environments that demonstrates how to produce the location markup. For instance, code that creates location records in Ruby on Rails, Python, Java, maybe even FileMaker. That code could then be packaged as libraries for each of those languages.
If we have a standard table format for location, then the data can be output in a number of different formats including variations of OWL/RDF based on different ontologies.
Maybe something like:
Simple RDF - with ontology (data mixes easily with other data sets)
OWL/RDF - with DL ontology (may only work with DarwinCore Data)
OWL/RDF - experimental ontology (things that people would like to try without complete ratification)
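As a rough sketch of the "one table, several outputs" idea, the code below renders the same row under two different vocabularies. The column names (`uri`, `lat`, `lon`), the `VOCABULARIES` table and the function name are assumptions made up for this example. The property URIs come from the W3C Basic Geo vocabulary and the Darwin Core terms namespace.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical mapping from table columns to properties in each vocabulary.
VOCABULARIES = {
    "geo": {"lat": GEO + "lat", "lon": GEO + "long"},
    "dwc": {"lat": DWC + "decimalLatitude", "lon": DWC + "decimalLongitude"},
}

def row_to_ntriples(row, vocabulary):
    """Render one location row as N-Triples lines using the chosen vocabulary."""
    terms = VOCABULARIES[vocabulary]
    subject = row["uri"]
    # Coordinates are emitted as plain literals to keep the sketch short.
    return [
        f'<{subject}> <{terms[col]}> "{row[col]}" .'
        for col in ("lat", "lon")
    ]
```

Adding another output variant would then mean adding one more entry to the mapping, without touching the table itself.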
- Pete
On Sun, May 17, 2009 at 4:46 AM, Richard Pyle deepreef@bishopmuseum.org wrote:
I had said:
So, the point is, there are MANY MANY aspects of ZooBank that are known to be non-functional, or mis-functional, and (as in this case) there are many opportunities for users to enter information incorrectly or into the wrong fields (the doi obviously should have been in the identifier field; not the URL field).
I want to retract that last bit! I just now discovered that the problem is with the code behind the user interface, and NOT the user putting information in the wrong field! My apologies for the misleading statement (unintentional though it was). Of course, there is still ample opportunity on the ZooBank website for someone to enter the wrong information in the wrong field, but the specific case pointed out by Rod was not an example of such.
Aloha, Rich
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag