FW: Wikipedia and Geonames. was: AW: ANN: RDF Book Mashup - Integrating Web 2.0 data sources like Amazon and Google into the Semantic Web

Here is a new thread from the SIMILE general list, a Semantic Web mailing list, which might interest those involved in extracting geographic names.

Donat

-----Original Message-----
From: general-bounces@simile.mit.edu [mailto:general-bounces@simile.mit.edu] On Behalf Of Chris Bizer
Sent: Saturday, December 02, 2006 10:36 AM
To: Richard Cyganiak; Richard Newman
Cc: semantic-web@w3.org; 'Damian Steer'; general@simile.mit.edu; 'Karl Dubost'
Subject: Wikipedia and Geonames. was: AW: ANN: RDF Book Mashup - Integrating Web 2.0 data sources like Amazon and Google into the Semantic Web
Some further ideas along these lines.

What about scraping information about geographic places like countries and cities from Wikipedia and linking the data to geonames (http://www.geonames.org/ontology/)? Something like

http://XXX/wikipedia/Embrun owl:sameAs http://sws.geonames.org/3020251/

The Wikipedia articles about countries and cities all follow relatively similar structures (for instance http://en.wikipedia.org/wiki/Berlin), so it should be easy to scrape them. They already contain links to other places, like the boroughs and localities in Berlin, which could easily be transformed into RDF links. Many places have geo-coordinates which, together with the place name, allow scrapers to automatically create links to localities from geonames.

Wikipedia content is licensed under the GNU Free Documentation License, so there aren't any problems with licensing as there are with the Google and Amazon data.

As most articles follow the same structure, an approach to implementing such an information service could be to:

- Use a crawling/screen-scraping framework that fills a relational database with the information from Wikipedia.
- Use D2R Server (http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/) to publish the database on the Web and to provide a SPARQL endpoint for querying.

I once read about some pretty sophisticated screen-scraping frameworks that fill relational databases with data from websites, but I forgot the exact links. Does anybody know?

Cheers,

Chris

----- Original Message -----
From: "Richard Cyganiak" <richard@cyganiak.de>
To: "Richard Newman" <r.newman@reading.ac.uk>
Cc: "Chris Bizer" <chris@bizer.de>; "'Karl Dubost'" <karl@w3.org>; "'Damian Steer'" <damian.steer@hp.com>; <semantic-web@w3.org>
Sent: Friday, December 01, 2006 7:19 PM
Subject: Re: AW: ANN: RDF Book Mashup - Integrating Web 2.0 data sources like Amazon and Google into the Semantic Web
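
Chris's suggestion above, that a place name together with geo-coordinates should be enough to link a scraped Wikipedia place to a geonames locality, might be sketched roughly as follows. This is only an illustration under stated assumptions, not anything from the thread: the `Place` record, the function names, the 10 km threshold, and the exact-name comparison are all hypothetical choices, and the `http://XXX/wikipedia/` base is the placeholder from the message, left as written.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


@dataclass
class Place:
    """Hypothetical record for a place, scraped or taken from geonames."""
    uri: str
    name: str
    lat: float  # decimal degrees
    lon: float  # decimal degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def find_match(wiki: Place, candidates: Iterable[Place],
               max_km: float = 10.0) -> Optional[Place]:
    """Return the nearest candidate with the same name within max_km, if any.

    The threshold and the case-insensitive exact-name test are assumptions;
    a real matcher would likely need alternate names and fuzzier matching.
    """
    best, best_dist = None, max_km
    for cand in candidates:
        if cand.name.casefold() != wiki.name.casefold():
            continue
        dist = haversine_km(wiki.lat, wiki.lon, cand.lat, cand.lon)
        if dist <= best_dist:
            best, best_dist = cand, dist
    return best


def same_as_triple(wiki: Place, geo: Place) -> str:
    """Serialise the link as a single N-Triples statement."""
    return f"<{wiki.uri}> <{OWL_SAME_AS}> <{geo.uri}> ."
```

Called with a scraped record for Embrun and a list of geonames candidates, `find_match` would pick the nearby entry of the same name and `same_as_triple` would produce the owl:sameAs statement from the example in the message. Requiring both the name and the distance to agree is meant to keep same-named places in other countries from being linked by mistake.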
_______________________________________________
General mailing list
General@simile.mit.edu
http://simile.mit.edu/mailman/listinfo/general
participants (1): Donat Agosti