Hi Ricardo,
I certainly agree with the direction you want to take the discussion in, but I do want to make a couple of comments:
I wasn't the one who came up with the LSID spec, but I suppose that those methods were specifically designed to handle sequence data (DNA and protein data). The getDataByRange method in particular was designed to allow clients to refer to very specific subsets of those sequences.
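To make the idea concrete, here is a rough sketch of what a range request amounts to for sequence data. This is only my illustration: the function name `get_data_by_range` and its `start`/`length` parameters are hypothetical and are not taken from the LSID spec, and the sequence is made up.

```python
def get_data_by_range(data: bytes, start: int, length: int) -> bytes:
    """Return `length` bytes of `data` beginning at byte offset `start`.

    Hypothetical helper for illustration only; not the LSID spec's signature.
    """
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    return data[start:start + length]


# A made-up DNA sequence stored as a fixed stream of bytes.
sequence = b"ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Because the bytes never change, offsets 3..5 always name the same codon.
second_codon = get_data_by_range(sequence, 3, 3)
print(second_codon)
```

The range is only a stable reference because the underlying bytes are fixed; if the stored bytes could change, the same offsets would silently point at different content.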
No doubt this is all very useful for the bioinformatics folks, but as we've seen in previous discussions, it is not as useful for us in the biodiversity (and ecological) informatics communities. The main reason is that some of our data is represented in XML, which is not guaranteed to serialize to the very same stream of bytes every time. But it may still be helpful to use the getData call to retrieve such data.
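A small sketch of the XML problem, as I understand it: two serializations of the same record can carry identical information yet differ byte for byte, so any byte-level identity check (a hash, say) treats them as different data. The record content and the `elements_equal` helper are invented for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same (made-up) record: attribute order and
# quote style differ, but the information content is identical.
doc_a = b'<record id="1" rank="species"><name>Aus bus</name></record>'
doc_b = b"<record rank='species' id='1'><name>Aus bus</name></record>"


def elements_equal(a: ET.Element, b: ET.Element) -> bool:
    """Compare two parsed elements by tag, attributes, text, and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if (a.tail or "").strip() != (b.tail or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(elements_equal(x, y) for x, y in zip(a, b))


same_content = elements_equal(ET.fromstring(doc_a), ET.fromstring(doc_b))
same_bytes = hashlib.sha256(doc_a).digest() == hashlib.sha256(doc_b).digest()

print("same content:", same_content)
print("same bytes:  ", same_bytes)
```

The parsed records compare equal while the byte streams do not, which is exactly the situation that makes it hard to promise unchanging bytes for dynamically generated XML.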
I am dubious that we will eventually find much use for the getData() call for any non-digital objects (which, in my understanding, includes many of the things we want to exchange data about). However, I think that the getData() call *does* have value for a non-trivial portion of data objects that *are* of interest to us in the biodiversity informatics community. First of all, the data generated by the bioinformatics folks are of interest to our community, and will become increasingly so as time moves on. But I think the getData() call could also be of value for other objects of interest to us. Examples include cropped regions of image files, individual pages of multi-page scanned paper document files, and specific segments of video or audio files, among others. Obviously, this would depend on the nature of the binary file itself, such that a contiguous block of bytes extracted from within the complete binary file would represent a meaningful, render-able unit of information (possibly not . But the point is, I do believe that getData() does have potential use to us in our data domain for certain tasks.
More fundamentally, however, I want to echo something you said in an earlier post: "We should not try to return something in the LSID getData() call just for the sake of it." In other words, if our LSIDs identify something other than a *static* binary data file (which, in my mind, a database record or a dynamically generated XML file usually does *not* represent, for the reasons that Dave and others have already pointed out), then we should find ways to make use of the information as returned via getMetadata().
I am still a bit confused as to why we need to define all of these "semantically immutable" rules just to allow us to make use of the getData() call for objects that are not static binary files. I can certainly understand a set of rules that our community defines in terms of managing versioning of metadata and dealing with the sorts of use-cases that Matt describes, but I don't see why we can't layer that on top of the existing LSID specs and methods, rather than "bend" the existing LSID spec and/or develop new methods. Like Bryan said:
But I definitely think it would be a mistake to re-define what "data" means in the context of LSIDs (i.e., to allow mutability in certain cases, and in so doing fail to fulfill the contract for serving LSIDs).
Aloha, Rich