Toward Scalable Semantic Big Data (SPLASH 2017 - SPLASH-I)

Sun 22 - Fri 27 October 2017 Vancouver, Canada

Track

SPLASH 2017 SPLASH-I

Time Zone

The program is currently displayed in (GMT-07:00) Tijuana, Baja California.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 25 Oct 2017 11:00 - 11:30 at Regency D - Programming Models Chair(s): Gail Murphy

Abstract

SPARQL is the query language for RDF and linked data, and such data has been a focus of our work for quite a few years. In this talk, I shall start by summarizing some of our older work in the scalable semantics and reasoning space. The most basic is our work on scaling reasoning using refinement techniques. I will discuss how refinement works, bring out its relation to program analysis concepts, and show some results. Built on that is work applying our reasoning to the medical domain, matching patients to clinical trials. I shall describe the problem and present our approach and results. Next, I shall discuss our work on scaling SPARQL queries in an RDF store. I shall present our schema and explain how it was inspired by ideas from graph theory. I then move on to extending SPARQL to conveniently query across both RDF and non-RDF data. At present, SPARQL has two chief shortcomings in such contexts. First, SPARQL has no notion of modularity, and modularity is a key feature in assembling complex queries of the kind that are needed when one integrates very different datasets. Second, its support for query federation over different endpoints is limited: the endpoints must all be SPARQL endpoints, and the language does not allow for posting data to an endpoint. To rectify these limitations, we propose two simple extensions to the language: functions and generalized service. In designing these extensions, we were careful to keep them minimal and to preserve SPARQL’s declarative semantics. Finally, I shall conclude with ongoing work to implement an RDF store on Google BigQuery.

Session Program

Wed 25 Oct
Displayed time zone: Tijuana, Baja California

10:30 - 12:00	Programming Models (SPLASH-I) at Regency D. Chair(s): Gail Murphy, University of British Columbia

10:30 30m Talk		Direct Manipulation Programming Systems (SPLASH-I). Ravi Chugh, University of Chicago
11:00 30m Talk		Toward Scalable Semantic Big Data (SPLASH-I). Julian Dolby, IBM Thomas J. Watson Research Center
11:30 30m Talk		Towards PPL: Extending Coroutines in Hack to Enable Probabilistic Programming (SPLASH-I). Michael Tingley
