Fwd: Annotation Ontology best practice relation for linking to image/machine-opaque docs? biomedical use case
This discussion of what specific meanings to assign to various widely used predicates reminds me how often this list seems to prefer operating only within itself. A number of these kinds of issues have been discussed elsewhere.
The argument has always been made that we should use standards but when those standards are not those proposed on this list they seem to be ignored. Whether it is the appropriate use of rdfs:seeAlso or other existing widely used vocabularies. Note below a separate Annotations vocabulary.
Respectfully,
- Pete
---------- Forwarded message ----------
From: Tim Clark <tim_clark@harvard.edu>
Date: Mon, Jan 10, 2011 at 1:34 PM
Subject: Re: best practice relation for linking to image/machine-opaque docs? biomedical use case
To: "M. Scott Marshall" <mscottmarshall@gmail.com>
Cc: HCLS IG <public-semweb-lifesci@w3.org>, public-lod@w3.org, Daniel Rubin <dlrubin@stanford.edu>, "John F. Madden" <john.madden@duke.edu>, Vasiliy Faronov <vfaronov@gmail.com>, Toby Inkster <tai@g5n.co.uk>, Peter DeVries <pete.devries@gmail.com>, Tim Berners-Lee <timbl@w3.org>, Paolo Ciccarese <paolo.ciccarese@gmail.com>, Anita de Waard <A.dewaard@elsevier.com>, Maryann Martone <maryann@ncmir.ucsd.edu>
Hi Scott,
For referring to a portion of an image, let me point you to work in my group done in collaboration with HCLS Scientific Discourse Task, UCSD, Elsevier, and one of the major pharmas. Paolo Ciccarese is the main author, and this work is based on the earlier W3C project Annotea.
AO, Annotation ontology, here: http://code.google.com/p/annotation-ontology/, presented at Bio Ontologies 2010, and full-length paper in press at BMC Bioinformatics.
Bio Ontologies 2010 slides here: http://www.slideshare.net/paolociccarese/ao-annotation-ontology-for-science-...
AO uses a special subclass of Selector to specify the part of the document (image) being referred to.
see here for Selectors: http://code.google.com/p/annotation-ontology/wiki/Selectors
and here for an example of image annotation: http://code.google.com/p/annotation-ontology/wiki/AnnotationTypes
Best
Tim
On Jan 10, 2011, at 11:30 AM, M. Scott Marshall wrote:
[Scott dusts off old use case and pulls from the shelf. Adjusts subject of thread. Was: best practice for referring to PDF]
In Health Care and Life Science domains, image data is a common form of data under discussion so a best practice for referring to an image or to an (extractable) feature *within* an image would cover a fundamental need in biomedicine to point to 'raw' data as evidence (as well as giving meaning to the raw data!).
A clinical example from breast cancer: There is a scan that produces an image that contains features referred to by the radiologist as 'microcalcifications', which can be indicative of the presence of a tumor.
I can think of a few scenarios that would refer to the image data (mammogram). There are probably more:
1) The radiology report (in RDF) asserts the presence of microcalcifications and refers to the entire image as evidence.
2) The radiology report (in RDF) asserts the presence of microcalcifications and refers to the entire image as evidence, along with an image processing/feature extraction program that will highlight the phenomenon in the image.
3) The radiology report (in RDF) asserts the presence of microcalcifications and refers to a specific region in the image as evidence using some function of a 2D coordinate system such as a polyline.
The question: How can we refer to the microcalcifications as an indication of a certain type of tumor in each case 1, 2, and 3 in RDF?
I am especially interested in the 'structural' aspects: How do we refer to the image document as containingEvidence ? How can we refer to a *region* of the image in the document? How can we refer to the software that will extract the relevant features with statistical confidence, etc.?
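As a rough illustration of the three scenarios above, here is a minimal Python sketch that builds the statements as plain (subject, predicate, object) tuples. Every "ex:" term, the blank node labels, and the function name are invented for this example; they are assumptions, not terms from AO or any other published vocabulary.

```python
def evidence_triples(report, image, polyline=None, extractor=None):
    """Return triples linking a report's finding to image evidence.

    All "ex:" predicates and classes are hypothetical placeholders.
    """
    finding = "_:finding"
    triples = [
        (report, "ex:asserts", finding),
        (finding, "rdf:type", "ex:Microcalcification"),
    ]
    if polyline is None:
        # Scenarios 1 and 2: the whole image serves as evidence.
        triples.append((finding, "ex:hasEvidence", image))
    else:
        # Scenario 3: a separate region node selects part of the image.
        region = "_:region"
        points = " ".join(f"{x},{y}" for x, y in polyline)
        triples += [
            (finding, "ex:hasEvidence", region),
            (region, "ex:regionOf", image),
            (region, "ex:polyline", points),
        ]
    if extractor is not None:
        # Scenario 2: name the software that highlights the feature.
        triples.append((finding, "ex:highlightedBy", extractor))
    return triples
```

Calling evidence_triples("ex:report1", "ex:image1", polyline=[(10, 20), (30, 40)]) gives the scenario 3 shape, where the evidence is the region node rather than the image itself. Whether the region should be a node of its own or a fragment of the image URI is exactly the open design question in this thread.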
Any ideas or pointers to existing practices would be appreciated. I'm aware of some related work in multimedia to refer to temporal regions but I am specifically interested in spatial regions.
Note that an analogous question of practice exists for textual documents such as literature in PubMed that can be text-mined for (evidence of) assertions.
* Note: 2D is a simplification that should come in handy in implementations and is often deemed necessary, e.g. for thumbnails.
-Scott
-- M. Scott Marshall, W3C HCLS IG co-chair, http://www.w3.org/blog/hcls Leiden University Medical Center / University of Amsterdam http://staff.science.uva.nl/~marshall
On Mon, Jan 10, 2011 at 4:01 PM, Tim Berners-Lee <timbl@w3.org> wrote:
It is well to look at and make best practices for the things we have if we don't
It was the FOAF folks who, initially, instead of using linked data, used an Inverse Functional Property to uniquely identify someone and then rdfs:seeAlso to find the data about them. So any FOAF browser has to look up the seeAlso or they don't follow any links.
So Tabulator, when looking up x and finding x rdfs:seeAlso y, will always load y. So must any similar client or any crawler.
So there is a lot of existing use we would throw away if we allowed rdfs:seeAlso for pointing to things which do not provide data. (It isn't a question of conneg or MIME type; that is a red herring. It is whether there is machine-readable, standards-based stuff about x.)
Further, we should not make any weaker properties like seeDocumentation subproperties of rdfs:seeAlso, or they would imply seeAlso. We maybe need a very weak top property like
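The client behaviour described here can be sketched as a small crawler. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable supplied by the caller, returning a list of (subject, predicate, object) tuples, or None when the URL yields no machine-readable data. It is not the API of Tabulator or of any real library.

```python
SEE_ALSO = "rdfs:seeAlso"

def crawl(start, fetch):
    """Load `start`, then every document reachable via rdfs:seeAlso.

    Returns (loaded, wasted): the data found per URL, and the URLs
    that were looked up but provided no data.
    """
    loaded, wasted, queue = {}, [], [start]
    while queue:
        url = queue.pop(0)
        if url in loaded or url in wasted:
            continue  # already visited
        triples = fetch(url)
        if triples is None:
            wasted.append(url)  # lookup spent on a non-data document
            continue
        loaded[url] = triples
        # A seeAlso-following client queues every seeAlso target.
        queue.extend(o for _, p, o in triples if p == SEE_ALSO)
    return loaded, wasted
```

With a dictionary standing in for the web, `crawl("ex:alice", web.get)` follows each rdfs:seeAlso link in turn. Any target that is an image or other opaque document ends up in `wasted`, which is the cost the paragraph above warns about.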
mayHaveSomeKindOfInfoAboutThis
to be the superProperty of all the others.
One thing which could be stronger than seeAlso is definedBy, if it is normally used for data, to point to the definitive ontology. That would then imply seeAlso.
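To make the "would imply" reasoning concrete, here is a minimal Python sketch of the RDFS subproperty rule: if p is a subproperty of q, then every statement x p y entails x q y. The function name and the "ex:" terms are assumptions for illustration; only rdfs:subPropertyOf, rdfs:seeAlso and rdfs:isDefinedBy come from RDF Schema.

```python
SUB = "rdfs:subPropertyOf"

def entail_subproperties(triples):
    """Apply the RDFS subproperty rule until no new triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect the (subproperty, superproperty) pairs known so far.
        subs = [(s, o) for s, p, o in inferred if p == SUB]
        for s, p, o in list(inferred):
            for sub, sup in subs:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred
```

Given a statement x rdfs:isDefinedBy y plus the declaration that rdfs:isDefinedBy is a subproperty of rdfs:seeAlso, the result also contains x rdfs:seeAlso y. Declaring rdfs:seeAlso a subproperty of a weak top property works in the other direction only: a seeAlso statement implies the weak one, never the reverse.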
Tim
-- --------------------------------------------------------------- Pete DeVries Department of Entomology University of Wisconsin - Madison 445 Russell Laboratories 1630 Linden Drive Madison, WI 53706 TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies Knowledge Base <http://lod.geospecies.org/> About the GeoSpecies Knowledge Base <http://about.geospecies.org/> ------------------------------------------------------------
There is now an annotations interest group in TDWG, which had a day-long session in Woods Hole, where I presented a candidate data annotation ontology. Subsequent to that, the FilteredPush (FP2) project met several times with Paolo Ciccarese and discussed whether AO would work for data annotation as well as document annotation and still meet the requirements met by the ontology I presented... Paolo thinks yes, with only the subclassing of what AO calls a Selector.
From a lot of experience with XML, I have a natural skepticism that modeling documents and modeling data are the same thing, but part of my ontology work in the FP2 project is to try to use AO, and I have been working on Paolo's challenge to model our use cases with it. Curiously(?), one simple one needs no Selector at all, and I can't tell yet whether the extensions I had to make to AO got simpler or more complex because of that. When I get a little farther, I'll report on the TDWG Annotations Interest Group Wiki.
FWIW, anybody tracking that Wiki would have noticed that Paul Morris listed the AO there on Dec 22, the first time the wiki was substantively edited. We are waiting for the Annotation Interest Group (AIG...if we call it that maybe we'll get a huge bailout!) mailing list to be available. I expect and hope that discussion of annotations continues in the AIG resources.
On Fri, Jan 14, 2011 at 5:30 PM, Peter DeVries <pete.devries@gmail.com> wrote:
This discussion of what specific meanings to assign to various widely used predicates reminds me how often this list seems to prefer operating only within itself. A number of these kinds of issues have been discussed elsewhere.
The argument has always been made that we should use standards but when those standards are not those proposed on this list they seem to be ignored. Whether it is the appropriate use of rdfs:seeAlso or other existing widely used vocabularies. Note below a separate Annotations vocabulary. Respectfully, - Pete
_______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
-- Robert A. Morris Emeritus Professor of Computer Science UMASS-Boston 100 Morrissey Blvd Boston, MA 02125-3390 Associate, Harvard University Herbaria email: morris.bob@gmail.com web: http://bdei.cs.umb.edu/ web: http://etaxonomy.org/mw/FilteredPush http://www.cs.umb.edu/~ram phone (+1) 857 222 7992 (mobile)
Pete,
I think that what you need to remember is that a lot of people on this list just don't know as much about the Linked Data world as you do. For example, I think I was the guilty party for suggesting an incorrect use of rdfs:seeAlso. That was based on a faulty memory of something I read six months ago. So what seems to be resistance in many cases may just be ignorance or misinformation.
The other part of it is that to some extent we are in the process of cramming a square peg in a round hole in trying to make existing biodiversity informatics terms, databases, systems, etc. work in the LOD world. Those things weren't necessarily designed to work with LOD and it's going to take some time and education to figure out how to round off the corners enough to make them fit. A lot of people have invested a lot of time, effort, and money in square pegs and they are not going to be willing or even able to just throw them all away and suddenly buy all new round pegs.
I think that there are a significant number of people in this community who are willing to give Linked Data a shot, but it's going to take some time and education. You might know that certain predicates are widely used, but since I'm not spending much of my time immersed in the LOD world, I won't necessarily know those predicates - to a large extent I'm getting my education from this list. I'm not really following what is being discussed elsewhere - not enough time.
Steve
Peter DeVries wrote:
This discussion of what specific meanings to assign to various widely used predicates reminds me how often this list seems to prefer operating only within itself.
A number of these kinds of issues have been discussed elsewhere.
The argument has always been made that we should use standards but when those standards are not those proposed on this list they seem to be ignored.
Whether it is the appropriate use of rdfs:seeAlso or other existing widely used vocabularies. Note below a separate Annotations vocabulary.
Respectfully,
- Pete
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A. delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235 office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
participants (3)
- Bob Morris
- Peter DeVries
- Steve Baskauf