Shakespeare wasn’t a semantic web guy
Saturday, January 16 11:30am – 12:35pm
B. Shakespeare wasn’t a semantic web guy – Jonathan Rees
Description: That which we call a rose, by any other name, wouldn’t be identified by a computer as a rose. This talk will go through the Shared Names initiative, which promotes community-wide use of shared names for records from public databases. The goal is to have a significant effect on the practice of bioinformatics by making it easier to share and link data sets and tools across projects. Selecting and maintaining names is a serious capacity-building problem in moving the RDF world from the hacker and hobbyist community to the regular user. A growing body of experience also shows that for any solution to be generally adopted, it must not only be technically sound but also serve and empower the community of users.
——-
(Science Commons intro – about reuse)
There are lots of open databases. E.g. see
There are lots of tools and they refer to these databases. E.g. see
Databases refer to one another (GO example)
It’s easy to refer to database entries using URIs, e.g.
http://www.ncbi.nlm.nih.gov/pubmed/16504059 = an article abstract
Happy.
Unfortunately… Data Reference Risks
In particular: Identity Muddle
Try Google Sidewiki (or semantic web) on http://www.ncbi.nlm.nih.gov/pubmed/16504059 and then on http://view.ncbi.nlm.nih.gov/pubmed/16504059
Name instability keeps link maintenance and database reuse costs high; name inconsistency means missed opportunities to recognize coreference
Sad.
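The coreference problem above can be illustrated with a minimal Python sketch. It assumes, as the talk's example suggests, that the two hosts serve the same PubMed record; the host table and the function names `record_key` and `corefer` are hypothetical and exist only for this illustration. Comparing the URIs as strings misses the match, while reducing each to a (database, id) key finds it.

```python
from urllib.parse import urlparse

# Assumption for illustration: both hosts serve the same PubMed records,
# as in the talk's two example URIs.
PUBMED_HOSTS = {"www.ncbi.nlm.nih.gov", "view.ncbi.nlm.nih.gov"}


def record_key(uri):
    """Return a (database, id) key for a recognized URI, else None."""
    parts = urlparse(uri)
    segments = [s for s in parts.path.split("/") if s]
    if (
        parts.hostname in PUBMED_HOSTS
        and len(segments) == 2
        and segments[0] == "pubmed"
    ):
        return ("pubmed", segments[1])
    return None


def corefer(uri_a, uri_b):
    """True when both URIs are recognized and name the same record."""
    key_a, key_b = record_key(uri_a), record_key(uri_b)
    return key_a is not None and key_a == key_b


a = "http://www.ncbi.nlm.nih.gov/pubmed/16504059"
b = "http://view.ncbi.nlm.nih.gov/pubmed/16504059"
print(a == b)         # naive string comparison misses the coreference
print(corefer(a, b))  # key comparison detects it
```

Every tool that wants this behavior has to maintain its own version of the host table, which is the independent-maintenance cost the talk goes on to describe.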
Applications and databases protect themselves with a level of indirection – GO example again
Each has its own indirection mechanism, independently maintained
Shared Names is a collective effort by application and database builders to implement that level of indirection in one common way, using URIs, and to pool their curation efforts
e.g. http://sharedname.org/pubmed/16504059, http://sharedname.org/pdb/3H7B (details TBD)
Value added at indirection point: stability, annotations, errata, mirrors, related resources, protocols
Novelty: a consumer cooperative, not publisher-originated (steering committee)
Cautious optimism.
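A shared indirection point might be sketched as a lookup from a shared name to the record's current locations. This is only an illustration under stated assumptions: the talk marks the URI scheme as "details TBD", and the `LOCATIONS` table and `resolve` function here are hypothetical, not part of any published Shared Names service. Only the PubMed locations that appear in the talk are listed.

```python
from urllib.parse import urlparse

SHARED_HOST = "sharedname.org"

# Hypothetical curated table: database prefix -> URL templates for the
# places where a record can currently be found. In the scheme described
# in the talk, this table would be maintained collectively.
LOCATIONS = {
    "pubmed": [
        "http://www.ncbi.nlm.nih.gov/pubmed/{id}",
        "http://view.ncbi.nlm.nih.gov/pubmed/{id}",
    ],
}


def resolve(shared_uri):
    """Map a shared name to the list of current locations for its record."""
    parts = urlparse(shared_uri)
    segments = [s for s in parts.path.split("/") if s]
    if parts.hostname != SHARED_HOST or len(segments) != 2:
        raise ValueError(f"not a shared name: {shared_uri}")
    database, record_id = segments
    if database not in LOCATIONS:
        raise KeyError(f"no curated locations for database: {database}")
    return [template.format(id=record_id) for template in LOCATIONS[database]]


for url in resolve("http://sharedname.org/pubmed/16504059"):
    print(url)
```

If a database moves, only the table changes; links that use the shared name keep working, which is the stability benefit listed above.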
Stability scorecard
- Preserving content is not the same as preserving name stability
- Human-oriented content is preserved pretty well (repositories such as NCBI, LOCKSS, IA), but less so for blogs, wikis, etc.
- Computer-oriented content is preserved sometimes (but publishers are not clear on the concept)
- Mobility helps, and open access and public domain promote replication
- Name stability is especially a problem for computing on the web (BUT: sidewiki, bookmarks, 404s)
Bottom line: Scholarship and repeatability are the foundations of scientific practice. Computation has become another essential component. To really bring science online, we have to make web references work. We can draw lessons from the pre-web world (libraries).
Ack: Ben Hyde, Alan Ruttenberg, John Wilbanks, Shared Names Steering Committee
