On the Impact of Using Trivial Packages: An Empirical Case Study on npm and PyPI
Description
PAPER ABSTRACT
Code reuse has traditionally been encouraged since it enables one to avoid re-inventing the wheel. After the npm left-pad incident, in which a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix, some questioned such reuse. Reuse of trivial packages is particularly prevalent on platforms such as npm. To date, no study has examined why developers reuse trivial packages on platforms other than npm. Therefore, in this paper, we study two large package management platforms, npm and PyPI. We mine more than 500,000 npm packages and 38,000 JavaScript applications, as well as more than 63,000 PyPI packages and 14,000 Python applications, to study the prevalence of trivial packages. We found that trivial packages are common, making up between 10.5% and 16% of the packages on the studied platforms. We surveyed 125 developers who use trivial packages to understand the reasons for and drawbacks of their use. Our survey revealed that trivial packages are used because they are perceived to be well-implemented and well-tested pieces of code. However, developers are concerned about maintenance and the risk of breakages due to the extra dependencies that trivial packages introduce. To objectively verify the survey results, we empirically validated the most cited reason and drawback and found that, contrary to developers' beliefs, only around 28.4% of npm and 51% of PyPI trivial packages even have tests. However, trivial packages appear to be 'deployment tested' and to have test, usage, and community interest levels similar to those of non-trivial packages. On the other hand, we found that 11.5% and 2.9% of the studied trivial packages have more than 20 dependencies in npm and PyPI, respectively. We conclude that developers should be careful about which trivial packages they decide to use.
DATA COLLECTION
The datasets used in this study were collected mainly from five sources.
- Node Package Manager (npm): We collected the latest version of all the packages published on npm as of September 30, 2017.
- Python Package Index (PyPI): We also collected and analyzed the latest version of all the packages published on PyPI as of September 30, 2017.
- npms: We also collected and analyzed data from npms (https://api-docs.npms.io/), which is the official search engine used by npm. We used the npms API to collect the data on April 15, 2019.
- GitHub Repositories: We cloned the collected JavaScript and Python applications from GitHub.
- Libraries.io: We used Libraries.io to calculate the number of Python applications that use trivial PyPI packages. We used the Libraries.io dataset released on December 22, 2018.
We also used Google Forms to create the developer surveys.
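As an illustration of the npms step above, the following is a minimal sketch of how package data could be requested from the npms API. It is not the authors' collection script. The base URL `https://api.npms.io/v2`, the `/package/{name}` endpoint, and the `score` fields are assumptions taken from the public npms API documentation, and the helper names (`package_url`, `fetch_package`, `summarize`) are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public npms API (see https://api-docs.npms.io/).
NPMS_API = "https://api.npms.io/v2"


def package_url(name):
    """Build the package-info URL; scoped names (@scope/pkg) are percent-encoded."""
    return f"{NPMS_API}/package/{quote(name, safe='')}"


def fetch_package(name, timeout=30):
    """Download and decode the JSON document npms returns for one package."""
    with urlopen(package_url(name), timeout=timeout) as resp:
        return json.load(resp)


def summarize(info):
    """Extract the score fields from a package document (assumed layout).

    Missing fields are returned as None rather than raising an error.
    """
    score = info.get("score", {})
    detail = score.get("detail", {})
    return {
        "final": score.get("final"),
        "quality": detail.get("quality"),
        "popularity": detail.get("popularity"),
        "maintenance": detail.get("maintenance"),
    }


if __name__ == "__main__":
    # Requires network access; "left-pad" is the package named in the abstract.
    print(summarize(fetch_package("left-pad")))
```

Values returned today will differ from those in this dataset, which reflects the state of npms on April 15, 2019.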
SOFTWARE USED FOR ANALYZING THE DATA
We used the following software to collect and analyze our datasets:
- Linux-based operating system.
- Python interpreter (2.7, with Python 3 compatibility).
- R language (3.4.4), along with the R packages that are compatible with this version.
- Understand Tool (version 4.0)
HARDWARE
We ran our data collection on a cluster of four Linux-based systems, each with a 2.10 GHz CPU (Intel Xeon) and 16 GB of RAM.
Files (1.5 MB)
NPM_Drawback_Of_Using_Trivial_Packages_Developers_Responses.csv
MD5 checksum | Size |
---|---|
md5:581d40853568b3556238adbad5b0cc2d | 23.2 kB |
md5:bfb074d5109885037cd257a2617cfc6b | 551.0 kB |
md5:62524878fd5505d5daf0556567250d4c | 7.9 kB |
md5:a91ea14490fa2736ad7a4b81520f65d1 | 26.3 kB |
md5:44be4077166c16597cb95849ed2d85a6 | 7.2 kB |
md5:1960f62ea388c05b1e69ec1112e5742a | 102.1 kB |
md5:571a00dfa6916b6f65ad9eff0f7ba0ba | 94.6 kB |
md5:2305293f6293438bf6781ee4aa83125b | 6.3 kB |
md5:2a0e6344832789267fdb6defb4513a76 | 324.1 kB |
md5:5c2db8637374ebdcac1b820c2cfd2eb0 | 3.3 kB |
md5:fc4dc276d8e2452a6ed37eb0757bb57b | 5.8 kB |
md5:68f81e3f2ac3d896a9e595edbe3ce84d | 9.1 kB |
md5:b446c29f2881a4ed49d43803b4169946 | 146.5 kB |
md5:c5ab1f7d5a56a8b80d0f8ac880cee36d | 160.8 kB |
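The MD5 checksums listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the example path are hypothetical, and the listing does not show which file each checksum belongs to, so the pairing of file and checksum has to be taken from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed[4:] if listed.startswith("md5:") else listed
    return md5_of_file(path) == expected.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; replace the checksum with the one shown
    # for this file on the record page.
    path = "NPM_Drawback_Of_Using_Trivial_Packages_Developers_Responses.csv"
    print(matches_listing(path, "md5:<checksum from the record page>"))
```

MD5 is adequate here for detecting corrupted or truncated downloads; it is not a safeguard against deliberate tampering.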