[tdwg-content] Fwd: Annotation Ontology best practice relation for linking to image/machine-opaque docs? biomedical use case

Sat Jan 15 00:13:46 CET 2011

There is now an annotations interest group in TDWG, which had a
day-long session in Woods Hole, where I presented a candidate data
annotation ontology.
Subsequent to that, the FilteredPush (FP2) project met several times
with Paolo Ciccarese and discussed whether AO would work for data
annotation as well as document annotation and still meet the
requirements met by the ontology I presented... Paolo thinks yes, with
only the subclassing of what AO calls a Selector. From a lot of
experience with XML, I have a natural skepticism that modeling
documents and modeling data are the same thing, but part of my
ontology work in the FP2 project is to try to use AO, and I have been
working on Paolo's challenge to model our use cases with it.
Curiously(?), one simple one needs no Selector at all and I can't tell
yet whether the extensions I had to make to AO got simpler or more
complex because of that. When I get a little farther, I'll report on
the TDWG Annotations Interest Group Wiki. FWIW, anybody tracking that
Wiki would have noticed that Paul Morris listed the AO there on Dec
22, the first time the wiki was substantively edited.

We are waiting for the Annotation Interest Group (AIG...if we call it
that maybe we'll get a huge bailout!) mailing list to be available. I
expect and hope that discussion of annotations continues in the AIG
resources.

On Fri, Jan 14, 2011 at 5:30 PM, Peter DeVries <pete.devries at gmail.com> wrote:
> This discussion of what specific meanings to assign to various widely used
> predicates reminds me how often this list seems to prefer operating only
> within itself.
> A number of these kinds of issues have been discussed elsewhere.
>
> The argument has always been made that we should use standards but when
> those standards are not those proposed on this list they seem to be ignored.
> Whether it is the appropriate use of rdfs:seeAlso or other existing widely
> used vocabularies. Note below a separate Annotations vocabulary.
> Respectfully,
> - Pete
>
> ---------- Forwarded message ----------
> From: Tim Clark <tim_clark at harvard.edu>
> Date: Mon, Jan 10, 2011 at 1:34 PM
> Subject: Re: best practice relation for linking to image/machine-opaque
> docs? biomedical use case
> To: "M. Scott Marshall" <mscottmarshall at gmail.com>
> Cc: HCLS IG <public-semweb-lifesci at w3.org>, public-lod at w3.org, Daniel Rubin
> <dlrubin at stanford.edu>, "John F. Madden" <john.madden at duke.edu>, Vasiliy
> Faronov <vfaronov at gmail.com>, Toby Inkster <tai at g5n.co.uk>, Peter DeVries
> <pete.devries at gmail.com>, Tim Berners-Lee <timbl at w3.org>, Paolo Ciccarese
> <paolo.ciccarese at gmail.com>, Anita de Waard <A.dewaard at elsevier.com>,
> Maryann Martone <maryann at ncmir.ucsd.edu>
>
>
> Hi Scott,
>
> For referring to a portion of an image, let me point you to work in my group
> done in collaboration with HCLS Scientific Discourse Task, UCSD, Elsevier,
> and one of the major pharmas.  Paolo Ciccarese is the main author, and this
> work is based on the earlier W3C project Annotea.
>
> AO, Annotation ontology, here:
> http://code.google.com/p/annotation-ontology/, presented at Bio Ontologies
> 2010, and full-length paper in press at BMC Bioinformatics.
>
> Bio Ontologies 2010 slides here:
> http://www.slideshare.net/paolociccarese/ao-annotation-ontology-for-science-on-the-web
>
> AO uses a special subclass of Selector to specify the part of the document
> (image) being referred to.
>
> see here for Selectors:
> http://code.google.com/p/annotation-ontology/wiki/Selectors
>
> and here for an example of image annotation:
> http://code.google.com/p/annotation-ontology/wiki/AnnotationTypes
>
> Best
>
> Tim
>
> On Jan 10, 2011, at 11:30 AM, M. Scott Marshall wrote:
>
>> [Scott dusts off old use case and pulls from the shelf. Adjusts
>> subject of thread. Was: best practice for referring to PDF]
>>
>> In Health Care and Life Science domains, image data is a common form
>> of data under discussion so a best practice for referring to an image
>> or to an (extractable) feature *within* an image would cover a
>> fundamental need in biomedicine to point to 'raw' data as evidence (as
>> well as giving meaning to the raw data!).
>>
>> A clinical example from breast cancer:
>> There is a scan that produces an image that contains features referred
>> to by the radiologist as 'microcalcifications', which can be
>> indicative of the presence of a tumor.
>>
>> I can think of a few scenarios that would refer to the image data
>> (mammogram). There are probably more:
>> 1) The radiology report (in RDF) asserts the presence of
>> microcalcifications and refers to the entire image as evidence.
>> 2) The radiology report (in RDF) asserts the presence of
>> microcalcifications and refers to the entire image as evidence, along
>> with an image processing/feature extraction program that will highlight
>> the phenomenon in the image.
>> 3) The radiology report (in RDF) asserts the presence of
>> microcalcifications and refers to a specific region in the image as
>> evidence using some function of a 2D coordinate system such as
>> polyline.
>>
>> The question: How can we refer to the microcalcifications as an
>> indication of a certain type of tumor in each case 1, 2, and 3 in RDF?
>>
>> I am especially interested in the 'structural' aspects: How do we
>> refer to the image document as containingEvidence ? How can we refer
>> to a *region* of the image in the document? How can we refer to the
>> software that will extract the relevant features with statistical
>> confidence, etc.?
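
One way to picture scenario 3 (a report pointing at a polyline region of the image) is as a handful of triples: an annotation node, the image it annotates, the finding it is about, and a selector node carrying the coordinates. The sketch below only illustrates that shape. Every ex: term (Annotation, annotatesImage, hasTopic, hasSelector, PolylineSelector, points), the example.org URLs, and the helper names are hypothetical placeholders, not terms taken from AO or any other published vocabulary; AO's actual Selector classes are described on the wiki pages cited earlier in the thread.

```python
# Hypothetical vocabulary: the ex: terms are illustrative placeholders,
# not terms defined by AO or by any published ontology.
EX = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap a URL in N-Triples angle brackets."""
    return f"<{value}>"


def literal(value):
    """Serialize a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def region_annotation(annotation, image, finding, points):
    """Return N-Triples lines that tie a finding to a polyline image region.

    `points` is a list of (x, y) pixel pairs in the image's 2D coordinates.
    """
    selector = annotation + "/selector"
    coords = " ".join(f"{x},{y}" for x, y in points)
    triples = [
        (iri(annotation), iri(RDF_TYPE), iri(EX + "Annotation")),
        (iri(annotation), iri(EX + "annotatesImage"), iri(image)),
        (iri(annotation), iri(EX + "hasTopic"), iri(finding)),
        (iri(annotation), iri(EX + "hasSelector"), iri(selector)),
        (iri(selector), iri(RDF_TYPE), iri(EX + "PolylineSelector")),
        (iri(selector), iri(EX + "points"), literal(coords)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    for line in region_annotation(
        "http://example.org/anno/1",
        "http://example.org/img/mammogram1",
        "http://example.org/finding/microcalcification",
        [(10, 20), (30, 40), (35, 60)],
    ):
        print(line)
```

Under the same assumptions, scenario 1 would simply omit the selector triples, and scenario 2 would add a triple from the annotation to whatever resource identifies the feature-extraction software.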
>>
>> Any ideas or pointers to existing practices would be appreciated. I'm
>> aware of some related work in multimedia to refer to temporal regions
>> but I am specifically interested in spatial regions.
>>
>> Note that an analogous question of practice exists for textual
>> documents such as literature in PubMed that can be text-mined for
>> (evidence of) assertions.
>>
>> * Note: 2D is a simplification that should come in handy in
>> implementations and is often deemed necessary, as for thumbnails.
>>
>> -Scott
>>
>> --
>> M. Scott Marshall, W3C HCLS IG co-chair, http://www.w3.org/blog/hcls
>> Leiden University Medical Center / University of Amsterdam
>> http://staff.science.uva.nl/~marshall
>>
>> On Mon, Jan 10, 2011 at 4:01 PM, Tim Berners-Lee <timbl at w3.org> wrote:
>>> It is well to look at and make best practices for the things
>>> we have if we don't
>>>
>>> It was the FOAF folks who, initially, instead of using linked data,
>>> used an Inverse Functional Property to uniquely identify
>>> someone and then rdfs:seeAlso to find the data about them.
>>> So any FOAF browser has to look up the seeAlso  or they
>>> don't follow any links.
>>>
>>> So Tabulator, when looking up x and finding x rdfs:seeAlso y, will always
>>> load y.  So must any similar client or any crawler.
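
The client behaviour described here can be sketched as a small crawler that loads a resource and then loads every rdfs:seeAlso target it finds. This is only a sketch under stated assumptions: `crawl`, the `fetch` callback, and the in-memory `web` dictionary are hypothetical stand-ins for a real HTTP client and RDF parser, and none of it is Tabulator's actual code. It also shows why a seeAlso link to a document that carries no data costs the client a lookup that returns nothing.

```python
from collections import deque

RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def crawl(start, fetch):
    """Load `start`, then follow every rdfs:seeAlso link breadth-first.

    `fetch(url)` is a hypothetical callback returning a list of
    (subject, predicate, object) triples found at `url`, or an empty
    list when the document holds no machine-readable data.
    """
    loaded = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in loaded:
            continue  # already dereferenced, do not fetch twice
        triples = fetch(url)
        loaded[url] = triples
        for _subject, predicate, obj in triples:
            if predicate == RDFS_SEE_ALSO and obj not in loaded:
                queue.append(obj)
    return loaded


if __name__ == "__main__":
    # Toy in-memory "web": x points to y, y points to an opaque image.
    x = "http://example.org/x"
    y = "http://example.org/y"
    scan = "http://example.org/scan.jpg"
    web = {x: [(x, RDFS_SEE_ALSO, y)], y: [(y, RDFS_SEE_ALSO, scan)]}
    result = crawl(x, lambda url: web.get(url, []))
    for url, triples in result.items():
        print(url, len(triples))
```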
>>>
>>> So there is a lot of existing use we would throw away if we
>>> allowed rdfs:seeAlso for pointing to things which do not
>>> provide data. (It isn't a question of conneg or MIME type;
>>> that is a red herring. It is whether there is machine-readable
>>> standards-based stuff about x).
>>>
>>> Further, we should not make any weaker properties like seeDocumentation
>>> subproperties of rdfs:seeAlso, or they would imply
>>> We maybe need a very weak top property like
>>>
>>> mayHaveSomeKindOfInfoAboutThis
>>>
>>> to be the superProperty of all the others.
>>>
>>> One thing which could be stronger than seeAlso is definedBy if it
>>> is normally used for data, to point to the definitive ontology.
>>> That would then imply seeAlso.
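
The property hierarchy being discussed can be illustrated with the standard RDFS subproperty rule: if p is a subproperty of q, then every statement made with p also holds with q. In RDFS itself, rdfs:isDefinedBy is declared a subproperty of rdfs:seeAlso. The sketch below applies that rule over a tiny hierarchy. The ex: namespace, the placement of mayHaveSomeKindOfInfoAboutThis and seeDocumentation in it, and the function name `entail` are assumptions made for illustration only.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/vocab#"  # hypothetical namespace

# child property -> parent property
SUBPROPERTY_OF = {
    # Declared by RDFS itself.
    RDFS + "isDefinedBy": RDFS + "seeAlso",
    # Hypothetical: the very weak top property proposed in the message.
    RDFS + "seeAlso": EX + "mayHaveSomeKindOfInfoAboutThis",
    # Hypothetical: a weak property kept OUT from under rdfs:seeAlso.
    EX + "seeDocumentation": EX + "mayHaveSomeKindOfInfoAboutThis",
}


def entail(triples):
    """Return the triples plus everything implied by the subproperty map."""
    inferred = set(triples)
    frontier = list(triples)
    while frontier:
        subject, predicate, obj = frontier.pop()
        parent = SUBPROPERTY_OF.get(predicate)
        if parent is None:
            continue
        implied = (subject, parent, obj)
        if implied not in inferred:
            inferred.add(implied)
            frontier.append(implied)
    return inferred


if __name__ == "__main__":
    facts = {
        ("http://example.org/x", RDFS + "isDefinedBy", "http://example.org/onto"),
        ("http://example.org/x", EX + "seeDocumentation", "http://example.org/manual.pdf"),
    }
    for triple in sorted(entail(facts)):
        print(triple)
```

With this layout a client that follows rdfs:seeAlso would load the ontology but would not be sent to the PDF manual, since seeDocumentation implies only the weak top property.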
>>>
>>> Tim
>>
>>
>
>
>
>
> --
> ---------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
>

-- 
Robert A. Morris
Emeritus Professor  of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: morris.bob at gmail.com
web: http://bdei.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1) 857 222 7992 (mobile)