SPARQL is the query language for RDF and linked data, and such data has been a focus of our work for quite a few years. In this talk, I shall start by summarizing some of our older work in the scalable semantics and reasoning space. The most basic is our work on scaling reasoning using refinement techniques. I will discuss how refinement works, bring out its relation to program analysis concepts, and show some results. Built on that is work applying our reasoning to the medical domain, matching patients to clinical trials. I shall describe the problem and present our approach and results. Next, I shall discuss our work on scaling SPARQL queries in an RDF store. I shall present our schema and how it was inspired by ideas from graph theory. I then move on to extending SPARQL to conveniently query across both RDF and non-RDF data, a setting in which the language currently has two chief shortcomings. First, SPARQL has no notion of modularity, and modularity is a key feature in assembling complex queries of the kind needed when one integrates very different datasets. Second, its support for query federation over different endpoints is limited: the endpoints all need to be SPARQL endpoints, and the language does not allow for posting data to an endpoint. To rectify these limitations, we propose two simple extensions to the language: functions and generalized service. In designing these extensions, we were careful to keep them minimal, so as to preserve SPARQL’s declarative semantics. Finally, I shall conclude with ongoing work to implement an RDF store on Google BigQuery.
Wed 25 Oct
|10:30 - 11:00|
Ravi Chugh, University of Chicago
|11:00 - 11:30|
Julian Dolby, IBM Thomas J. Watson Research Center
|11:30 - 12:00|