[tdwg-content] TDWG DwC Text schema references git head

godfoder godfoder at acis.ufl.edu
Fri Jun 10 15:50:37 CEST 2016


Recently, iDigBio has seen some errors popping up in our logs related to 
fetching XML schema definitions over HTTPS. In tracing this problem we 
found that the currently published text XSD:

http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd

references the Darwin Core terms schema with the URL:

https://raw.githubusercontent.com/tdwg/dwc/master/xsd/tdwg_dwcterms.xsd

For those of you not familiar with how git and GitHub work, the URL 
being used is unversioned: it pulls straight from the HEAD of the master 
branch on GitHub (rather than from a version tag or a release branch). 
This means that any change committed to the file on GitHub will 
immediately change the behavior of schema validation in client programs, 
potentially invalidating a previously valid XML file (or simply breaking 
things if the commit contains errors).
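
To make the difference concrete, here is a rough sketch of how these raw 
URLs are put together. The helper names (raw_url, is_commit_sha) and the 
release tag name are made up for illustration and are not part of any 
TDWG tooling; only the "master" URL is the one actually in use.

```python
import re

RAW_BASE = "https://raw.githubusercontent.com"


def raw_url(ref, owner="tdwg", repo="dwc", path="xsd/tdwg_dwcterms.xsd"):
    """Build the raw-file URL for `path` as of the git ref `ref`.

    `ref` may be a branch name, a tag name, or a full commit SHA.
    """
    return f"{RAW_BASE}/{owner}/{repo}/{ref}/{path}"


def is_commit_sha(ref):
    """True if `ref` looks like a full 40-character commit SHA.

    Only a commit SHA is guaranteed to always name the same content;
    branches move with every commit, and tags can be moved by hand.
    """
    return re.fullmatch(r"[0-9a-f]{40}", ref) is not None


if __name__ == "__main__":
    # The URL the published XSD references today: follows every commit.
    print(raw_url("master"))
    # A hypothetical release tag: changes only if someone moves the tag.
    print(raw_url("hypothetical-release-tag"))
```

A client that cares about reproducible validation could log or reject 
any schema reference for which is_commit_sha() is false, though that is 
stricter than what is being proposed here.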

This seems to be related to a (potential) issue with the new DwC release 
process. I get that we're now moving towards more of a "living standard" 
model with small incremental updates, but I think we still need some 
form of release process.

Rather than having definitions pulled right from GitHub master, we could 
create a "latest" tag on GitHub that everything references (probably in 
addition to tags for individual releases):

https://raw.githubusercontent.com/tdwg/dwc/latest/xsd/tdwg_dwcterms.xsd

Then, commits could be made to the DwC repo representing the ongoing 
work of improving the standard, and then when everyone is happy with the 
changes, the tag could be moved to indicate that the changes are now 
"published".

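As a sketch of what "moving the tag" would involve, the snippet below 
builds the two git commands a maintainer might run. The function names 
and the "origin" remote are assumptions for illustration; the git 
options used (tag --force, push --force) are standard. Note that clones 
which already fetched the old tag will not pick up the move on an 
ordinary fetch, although the raw URL on GitHub would resolve to the new 
commit.

```python
import subprocess


def publish_commands(commit, tag="latest", remote="origin"):
    """Return the git commands that point `tag` at `commit` and push it."""
    return [
        # Re-create the tag locally, overwriting its previous target.
        ["git", "tag", "--force", tag, commit],
        # Overwrite the tag on the remote as well.
        ["git", "push", "--force", remote, f"refs/tags/{tag}"],
    ]


def publish(commit, repo_dir=".", dry_run=True):
    """Print the commands (dry run) or execute them inside `repo_dir`."""
    for argv in publish_commands(commit):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, cwd=repo_dir, check=True)
```

Calling publish("<commit>") with the default dry_run=True only prints 
the commands, which is a reasonable way to review them before anything 
is pushed.
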
If we want to leave the URL as is, we should document the commit process 
in the contributor documentation. There is a well-known pattern that 
exclusively uses branches, often called git flow:

http://nvie.com/posts/a-successful-git-branching-model/

We could also move to a "no commits to this repo" model, where all 
changes must come in as pull requests from forks of the main repo.

- Alex

