TDWG DwC Text schema references git head
Recently, iDigBio has seen errors in our logs related to fetching XML schema definitions over HTTPS. In tracing the problem, we found that the currently published text XSD:
http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd
references the Darwin Core terms with the URL:
https://raw.githubusercontent.com/tdwg/dwc/master/xsd/tdwg_dwcterms.xsd
For those not familiar with how git and GitHub work: this URL is unversioned and pulls straight from the HEAD of the master branch on GitHub, rather than from a version tag or release branch. This means that any change committed to the file on GitHub immediately changes the behavior of schema validation in client programs, potentially invalidating a previously valid XML file (or simply breaking things if the commit contains errors).
This seems to be related to a (potential) issue with the new DwC release process. I get that we're now moving towards more of a "living standard" model with small incremental updates, but I think we still need some form of release process.
Rather than having definitions pulled straight from GitHub master, we could create a "latest" tag on GitHub that everything references (probably in addition to tags for individual releases):
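To make the problem concrete, here is a minimal sketch of how a client could spot schema references that track a moving branch head. The inline XSD is a hypothetical stand-in, not a copy of the real tdwg_dwc_text.xsd, and the function names and the set of "moving" ref names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical stand-in for a schema that pulls in another schema by URL.
# It is NOT a copy of the published tdwg_dwc_text.xsd.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://rs.tdwg.org/dwc/terms/"
             schemaLocation="https://raw.githubusercontent.com/tdwg/dwc/master/xsd/tdwg_dwcterms.xsd"/>
</xs:schema>
"""

# Assumption: ref names we treat as moving branch heads rather than releases.
MOVING_REFS = {"master", "main", "HEAD"}


def schema_locations(xsd_text):
    """Return every schemaLocation on xs:import, xs:include and xs:redefine."""
    root = ET.fromstring(xsd_text)
    locations = []
    for tag in ("import", "include", "redefine"):
        for element in root.iter(XS + tag):
            location = element.get("schemaLocation")
            if location:
                locations.append(location)
    return locations


def github_raw_ref(url):
    """Return the ref segment of a raw.githubusercontent.com URL, else None.

    Path layout is <owner>/<repo>/<ref>/<path to file>. Refs that contain
    a slash are not handled; this is a simplification.
    """
    parsed = urlparse(url)
    if parsed.netloc != "raw.githubusercontent.com":
        return None
    parts = parsed.path.strip("/").split("/")
    return parts[2] if len(parts) > 3 else None


def moving_references(xsd_text):
    """List schema locations that resolve against a moving branch head."""
    return [
        location
        for location in schema_locations(xsd_text)
        if github_raw_ref(location) in MOVING_REFS
    ]


if __name__ == "__main__":
    for location in moving_references(SAMPLE_XSD):
        print("tracks a moving branch head:", location)
```

A check like this only reports the reference; it does not tell you whether the file behind it has changed since you last validated against it.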
https://raw.githubusercontent.com/tdwg/dwc/latest/xsd/tdwg_dwcterms.xsd
Commits could then be made to the DwC repo as work on improving the standard continues, and once everyone is happy with the changes, the tag could be moved to indicate that they are now "published".
If we want to leave the URL as is, we should document the commit process in the contributor documentation. There is a well-known pattern that relies exclusively on branches, often called git flow:
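As a rough sketch of what referencing a tag instead of master means on the URL side: raw.githubusercontent.com URLs have the layout owner/repo/ref/path, so only the ref segment changes. The helper name pin_raw_url is made up for this example, "latest" is the tag proposed above (it would have to be created first), and "release-x" is just a placeholder for whatever a per-release tag ends up being called.

```python
from urllib.parse import urlsplit, urlunsplit


def pin_raw_url(url, ref):
    """Return the same raw.githubusercontent.com URL with its ref replaced.

    Assumes the path layout <owner>/<repo>/<ref>/<path to file> and a ref
    without slashes; anything else raises ValueError.
    """
    parts = urlsplit(url)
    if parts.netloc != "raw.githubusercontent.com":
        raise ValueError("not a raw.githubusercontent.com URL: " + url)
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 4:
        raise ValueError("expected <owner>/<repo>/<ref>/<path>: " + url)
    owner, repo, _old_ref, *file_path = segments
    new_path = "/" + "/".join([owner, repo, ref, *file_path])
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    current = "https://raw.githubusercontent.com/tdwg/dwc/master/xsd/tdwg_dwcterms.xsd"
    # "latest" would be moved by the maintainers when changes are published.
    print(pin_raw_url(current, "latest"))
    # "release-x" is a placeholder name for a fixed per-release tag.
    print(pin_raw_url(current, "release-x"))
```

With this layout, clients that reference the "latest" tag only see a change when the tag is moved, while clients that reference a per-release tag never see one.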
http://nvie.com/posts/a-successful-git-branching-model/
We could also move to a "no commits to this repo" model, where all changes must come in as pull requests from forks of the main repo.
- Alex
Hi Alex, thanks for pointing that out. The problem is slightly more complex, as rs.tdwg.org is a rewrite to tdwg.github.io/dwc, which is driven by the gh-pages branch. We planned to use this branch for the latest release of DwC, and the schema should be using rs.tdwg.org for includes, or maybe gh-pages directly. Surely not the master branch!
markus
Sent from my iPhone
On 10.06.2016 at 16:50, godfoder godfoder@acis.ufl.edu wrote:
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
I have logged this discussion as an issue in the GitHub repo: https://github.com/tdwg/dwc/issues/124
On 11 June 2016 at 11:49, Markus Döring m.doering@mac.com wrote:
Participants (3)
- godfoder
- Markus Döring
- Peter Desmet