Presentation Open Access

A Large-scale Study on API Misuses in the Wild

Li, Xia; Jiang, Jiajun; Benton, Samuel; Xiong, Yingfei; Zhang, Lingming

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.4661089</identifier>
      <creatorName>Li, Xia</creatorName>
      <affiliation>Kennesaw State University</affiliation>
      <creatorName>Jiang, Jiajun</creatorName>
      <affiliation>Tianjin University</affiliation>
      <creatorName>Benton, Samuel</creatorName>
      <affiliation>The University of Texas at Dallas</affiliation>
      <creatorName>Xiong, Yingfei</creatorName>
      <affiliation>Peking University</affiliation>
      <creatorName>Zhang, Lingming</creatorName>
      <affiliation>University of Illinois at Urbana-Champaign</affiliation>
    <title>A Large-scale Study on API Misuses in the Wild</title>
    <date dateType="Issued">2021-04-12</date>
  <resourceType resourceTypeGeneral="Text">Presentation</resourceType>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.4661088</relatedIdentifier>
    <rights rightsURI="">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
    <description descriptionType="Abstract">&lt;p&gt;API misuses are prevalent and extremely harmful.&lt;br&gt;
Although various techniques have been proposed for API-misuse&lt;br&gt;
detection, it is not even clear how different types of API misuses&lt;br&gt;
are distributed and whether existing techniques have covered all major&lt;br&gt;
types of API misuses. Therefore, in this paper, we conduct the&lt;br&gt;
first large-scale empirical study on API misuses based on 528,546&lt;br&gt;
historical bug-fixing commits from GitHub (from 2011 to 2018).&lt;br&gt;
By leveraging a state-of-the-art fine-grained AST differencing&lt;br&gt;
tool, GumTree, we extract more than one million bug-fixing&lt;br&gt;
edit operations, 51.7% of which are API misuses. We further&lt;br&gt;
systematically classify API misuses into nine different categories&lt;br&gt;
according to the edit operations and context. We also extract&lt;br&gt;
various frequent API-misuse patterns based on the categories&lt;br&gt;
and corresponding operations, which can be complementary to&lt;br&gt;
existing API-misuse detection tools. Our study reveals various&lt;br&gt;
practical guidelines regarding the importance of different types&lt;br&gt;
of API misuses. Furthermore, based on our dataset, we perform&lt;br&gt;
a user study to manually analyze the usage constraints of 10&lt;br&gt;
patterns to explore whether the mined patterns can guide the&lt;br&gt;
design of future API-misuse detection tools. Specifically, we find&lt;br&gt;
that 7,541 potential misuses still exist in the latest Apache projects&lt;br&gt;
and 149 of them have been reported to developers. To date, 57&lt;br&gt;
have already been confirmed and fixed, while 15 have been&lt;br&gt;
rejected. The results indicate the importance of studying&lt;br&gt;
historical API misuses and the promising future of employing our&lt;br&gt;
mined patterns for detecting unknown API misuses.&lt;/p&gt;</description>
</resource>