Rich,
No need to apologise... Actually it affects the aggregators in two respects: one is the larger vs. more compact data representation; the other is the present inconsistency in what real-world data providers actually supply in the "scientificName" element. If it were clearer that this element was for sciname + author, and the sciname without author had its own dedicated element, the incoming data might be a lot more consistent.
Basically it is the present "scientificNameAuthorship" element which is clouding the issue: people see this and then think they do not need to add the author to "scientificName" as well, although, as previously stated by Markus, this is technically incorrect according to the DwC spec (and I can see the argument for keeping it that way, so as to capture as much info as possible in that field).
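To make the inconsistency concrete, here is a rough Python sketch of the kind of clean-up an aggregator might apply on ingest. The function name and the sample records are made up for illustration; only the keys "scientificName" and "scientificNameAuthorship" are taken from the DwC term list, and the simple "ends with the author string" test is an assumption that will not cope with every real-world variation (differing punctuation, abbreviated authors, etc.).

```python
def full_scientific_name(record):
    """Return name + authorship, whether or not the provider already combined them.

    Hypothetical helper: 'record' is assumed to be a dict keyed by DwC term names.
    """
    name = (record.get("scientificName") or "").strip()
    author = (record.get("scientificNameAuthorship") or "").strip()
    # Append the authorship only if the provider left it out of scientificName.
    if author and not name.endswith(author):
        name = f"{name} {author}".strip()
    return name


# Two providers supplying the same taxon in the two styles seen in practice.
records = [
    {"scientificName": "Carcharodon carcharias (Linnaeus, 1758)",
     "scientificNameAuthorship": "(Linnaeus, 1758)"},
    {"scientificName": "Carcharodon carcharias",
     "scientificNameAuthorship": "(Linnaeus, 1758)"},
]

names = [full_scientific_name(r) for r in records]
print(names)
```

Both records come out with the same combined string here, but only because the author strings happen to match exactly; a dedicated element for the name without author would remove the need for this sort of guesswork.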
Cheers - Tony
-----Original Message-----
From: Richard Pyle [mailto:deepreef@bishopmuseum.org]
Sent: Wednesday, 24 November 2010 11:27 AM
To: Rees, Tony (CMAR, Hobart); Chuck.Miller@mobot.org; dremsen@gbif.org
Cc: tdwg-content@lists.tdwg.org; dmozzherin@eol.org
Subject: RE: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad?

OK, understood.
But I guess my next question would be: is this really "bloat"? Isn't the cost of the bloat much less than the value of providing fully parsed content?
I now understand what I think is a large part of the basis for our (perhaps non-existent?) disagreement: I'm thinking of DwC terms in the abstract sense, whereas you are thinking in terms of more practical issues such as the MB size of your DwCA files. This also clarifies for me why you keep saying that it's really a question for the big aggregators (which I now understand and agree with).
Sorry if I was misunderstanding where you are coming from on this!
Aloha, Rich