[tdwg-tag] Unresolved technical issues - proposed decisions.

26 Aug 2010

Hi Everyone -

Javier and I chatted yesterday about unresolved technical issues. Our 
primary goal was simplicity. Here are the outcomes. Feel free to suggest 
something different if either i) you're planning on developing around 
your suggestion; or ii) you believe that not following your suggestion 
would be a grave mistake.

NOTE: by "app" we mean anything that is writing to the table - could be a 
smartphone app, could be a web form, could be a screenscraper or 
tweet-parser.

1. All apps will write to the same table. Each app will need a Google 
account, which will be given write-access to the table.

2. The occurrence_id will be the row_id assigned by Fusion Tables. This is 
not seen in the web interface, but is available from the API.

3. To support crowdsourcing of image identification, alternative 
identifications, arguments, etc., there will be a 2nd table with columns 
{occurrence_id, scientificName, vernacularName, Kingdom, identifiedBy, 
identificationResources, identificationRemarks}. Each of these columns 
will also exist in the Occurrences table. When the observation is 
first reported, any identification will be entered in the Occurrences 
table, with subsequent identifications in the Identifications table. 
(The main motivation for keeping the initial identification in the 
Occurrences table is to ease the process for app developers.)

4. Multiple multimedia URLs will be listed in a single column, comma 
separated.

5. The lat and long columns will be WGS84. If app developers wish, they 
are welcome to build support for additional datums into their apps, since 
the apps will have authority to create additional columns. If this doesn't 
happen, anyone planning to use another datum will need to announce 
themselves ahead of time, and we will create appropriate columns for 
them.

6. The table will contain Kingdom, Phylum, Class, Order, and Family 
columns. Javier will write software to resolve identifications to the 
Catalog of Life, and to populate the taxonomy columns. If someone wants to 
resolve observations to a different classification, they are free to do 
so, and are encouraged to publish the results in Fusion Tables.

7. In addition to the identification table, there is scope for creating an 
Annotations table and front-end, should anyone wish to tackle this.
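
For app developers, here is a rough sketch in Python of how items 3 to 5 
might look on the client side. It is only an illustration, not the promised 
sample code, and it does not touch the Fusion Tables API. The column names 
come from item 3; the helper names (join_media_urls, split_media_urls, 
is_valid_wgs84), the choice to hold occurrence_id as a string, and the 
choice to reject URLs containing a literal comma are all assumptions made 
for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Identification:
    """One row of the proposed Identifications table (columns from item 3)."""
    occurrence_id: str  # assumption: the Fusion Tables row_id, held as text
    scientificName: str
    vernacularName: str = ""
    Kingdom: str = ""
    identifiedBy: str = ""
    identificationResources: str = ""
    identificationRemarks: str = ""


def join_media_urls(urls: List[str]) -> str:
    """Pack several multimedia URLs into one comma-separated cell (item 4)."""
    cleaned = [u.strip() for u in urls if u.strip()]
    for u in cleaned:
        if "," in u:
            # Assumption: a literal comma would make the cell ambiguous,
            # so this sketch refuses it rather than guessing an escape rule.
            raise ValueError(f"URL contains a comma: {u}")
    return ",".join(cleaned)


def split_media_urls(cell: str) -> List[str]:
    """Recover the list of URLs from a comma-separated cell."""
    return [u.strip() for u in cell.split(",") if u.strip()]


def is_valid_wgs84(lat: float, lon: float) -> bool:
    """Range check for decimal-degree WGS84 coordinates (item 5)."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```

An app might call is_valid_wgs84 before writing a row, and use 
join_media_urls to build the multimedia cell. A later identification for 
the same observation would be a new Identification carrying the original 
occurrence_id.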

Here's a timetable that would be nice to adhere to:

Sept. 1: Final versions of the Occurrences and Identifications tables 
published to Fusion Tables. Write access will only be granted to 
developers. These will be the test tables. The day before the bioblitz, 
they will become the development tables, and test records will be 
expunged.

Sept. 3: Table documentation and sample code for writing to the tables.

Sept. 8: Taxonomic resolution, validation, assignment of taxon 
GUIDs/LSIDs.

Sept. 15: At least one image identification framework in place.

Sept. 15 - Sept. 28: Testing and work on visualization and other 
data-oriented services (e.g. publication as DwC or Linked Data, etc.)

Best to all -
Joel.

joel sachs