Exploiting Implicit Beliefs to Resolve Sparse Usage Problem in Usage-Based Specification Mining (SPLASH 2017 - OOPSLA)

Sun 22 - Fri 27 October 2017 Vancouver, Canada

Who

Samantha Syeda Khairunnesa, Hoan Anh Nguyen, Tien N. Nguyen, Hridesh Rajan

Track

SPLASH 2017 OOPSLA

When

Thu 26 Oct 2017 13:30 - 13:52 at Regency A - Mining Software Repositories and Parsing Chair(s): Wolfgang De Meuter

Abstract

Frameworks and libraries provide application programming interfaces (APIs) that serve as building blocks in modern software development. While APIs offer the opportunity for increased productivity, they also call for correct use to avoid buggy code. Usage-based specification mining techniques have shown great promise in solving this problem through a data-driven approach. These techniques leverage the use of APIs in large corpora to understand their recurring usages and infer behavioral specifications (preconditions and postconditions) from such usages. A challenge for such techniques is thus inference in the presence of insufficient usages, in terms of both frequency and richness. We refer to this as the "sparse usage problem." This paper presents the first technique to solve the sparse usage problem in usage-based precondition mining. Our key insight is to leverage implicit beliefs to overcome sparse usage. An implicit belief (IB) is knowledge implicitly derived from facts about the code. An IB about a program is known implicitly to a programmer via the language's constructs and semantics, and is thus not explicitly written or specified in the code. The technical underpinnings of our new precondition mining approach include a technique to analyze the data and control flow in the program leading to API calls to infer preconditions that are implicitly present in the code corpus, a catalog of 35 code elements in total that can be used to derive implicit beliefs from a program, and an empirical evaluation of all of these ideas. We have analyzed over 350 million lines of code and 7 libraries that suffer from the sparse usage problem. Our approach realizes 6 implicit beliefs, and we have observed that the addition of single-level context sensitivity can further improve the results of usage-based precondition mining. The results show that we achieve overall 60% precision and 69% recall, and that accuracy is relatively improved by 32% in precision and 78% in recall compared to the base usage-based mining approach for these libraries.
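
To make the idea in the abstract concrete, the following is a minimal sketch of frequency-based precondition mining with and without implicit beliefs. It is not the authors' implementation: the toy corpus, the predicate strings, the `mine_preconditions` function, and the 50% support threshold are all hypothetical, chosen only to show how facts implied by language semantics (for example, an earlier dereference of `s` implying `s != null`) might raise the support of a precondition that explicit guards alone leave too sparse.

```python
from collections import Counter

# Hypothetical toy corpus: four call sites of some API method taking (s, i).
# "explicit" holds guard conditions the programmer wrote before the call.
# "implicit" holds facts assumed to follow from language semantics on the
# path to the call (e.g. `s` was already dereferenced, or `i` is a loop
# counter that starts at 0). These labels are illustrative assumptions.
call_sites = [
    {"explicit": {"s != null", "i >= 0"}, "implicit": set()},
    {"explicit": {"i >= 0"}, "implicit": {"s != null"}},
    {"explicit": set(), "implicit": {"s != null", "i >= 0"}},
    {"explicit": {"i >= 0"}, "implicit": set()},
]


def mine_preconditions(sites, threshold, use_implicit):
    """Return predicates that hold at a fraction >= threshold of call sites."""
    counts = Counter()
    for site in sites:
        facts = set(site["explicit"])
        if use_implicit:
            # Treat implicit beliefs as if they were written guards.
            facts |= site["implicit"]
        counts.update(facts)
    total = len(sites)
    return {pred for pred, n in counts.items() if n / total >= threshold}


# Explicit guards only: "s != null" appears at 1 of 4 sites and is dropped.
base = mine_preconditions(call_sites, 0.5, use_implicit=False)

# With implicit beliefs: "s != null" is supported at 3 of 4 sites and is kept.
with_ib = mine_preconditions(call_sites, 0.5, use_implicit=True)

print(sorted(base))     # ['i >= 0']
print(sorted(with_ib))  # ['i >= 0', 's != null']
```

In this toy setting the explicit-only run misses `s != null` because few programmers wrote the check, while counting implicit facts recovers it. The paper's actual analysis works on data and control flow in real programs rather than on hand-labeled sets.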

DOI

https://doi.org/10.1145/3133907

Samantha Syeda Khairunnesa

Iowa State University

United States

Hoan Anh Nguyen

Iowa State University, USA

Tien N. Nguyen

University of Texas at Dallas

United States

Hridesh Rajan

Iowa State University

United States

Session Program

Thu 26 Oct
Displayed time zone: (GMT-07:00) Tijuana, Baja California

13:30 - 15:00	Mining Software Repositories and Parsing (OOPSLA) at Regency A. Chair(s): Wolfgang De Meuter, Vrije Universiteit Brussel

13:30 22m Talk		Exploiting Implicit Beliefs to Resolve Sparse Usage Problem in Usage-Based Specification Mining OOPSLA Samantha Syeda Khairunnesa Iowa State University, Hoan Anh Nguyen Iowa State University, USA, Tien N. Nguyen University of Texas at Dallas, Hridesh Rajan Iowa State University
13:52 22m Talk		DéjàVu: A Map of Code Duplicates on GitHub OOPSLA Crista Lopes University of California, Irvine, Petr Maj ReactorLabs, Pedro Martins University of California at Irvine, USA, Vaibhav Saini University of California at Irvine, USA, Di Yang University of California at Irvine, USA, Jakub Zitny Czech Technical University, Czechia, Hitesh Sajnani Microsoft, Jan Vitek Northeastern University, USA
14:15 22m Talk		Understanding the Use of Lambda Expressions in Java OOPSLA Davood Mazinanian Concordia University, Canada, Ameya Ketkar Oregon State University, USA, Nikolaos Tsantalis Concordia University, Canada, Danny Dig School of EECS at Oregon State University
14:37 22m Talk		Restricting Grammars with Tree Automata OOPSLA Michael D. Adams University of Utah, USA, Matthew Might University of Utah, USA