An Empirical Study of npm Ecosystem Related Tweets by Core Software Developers: A Case Study
Creators
- 1. Noakhali Science and Technology University
- 2. Universitas Muhammadiyah Surakarta, Surakarta
- 3. Universitas Jenderal Soedirman, Purwokerto
- 4. Nara Institute of Science and Technology
Description
Summary:
The npm ecosystem is crucial for the JavaScript community, and its development is significantly influenced by the opinions and feedback of core software developers. Many software developers use social media, such as Twitter, to share community-related information and their views. However, the communication between npm developers via Twitter has not been analyzed in terms of topics, nature, and sentiment. This study conducts an empirical analysis of tweets by core software developers related to the npm ecosystem to better understand their perceptions and opinions. A dataset of tweets was collected and analyzed using qualitative analysis techniques to identify the topics of the tweets, their nature, and their sentiments. Our study shows that most tweets belong to the package management category, followed by notifications and community-related information. Within the package management category, the topic most frequently discussed by core software developers is usage scenarios. Developers use a variety of approaches, with information tweets dominating, followed by questions and answers. Additionally, the sentiment analysis revealed that developers express more positive sentiments towards notification and community-related discussions, while expressing more neutral opinions towards the package ecosystem discussion category. This case study provides valuable insights into the perceptions and opinions of core software developers regarding the npm ecosystem and can inform future development and decision-making.
Data Collection Method:
The data collection process consists of three steps: (i) building the npm core developers' Twitter ID dataset, (ii) extracting the core developers' tweets, and (iii) preparing a sample tweet dataset. Details of the data collection procedure are explained below:
Step-1: Building the npm core developers' Twitter ID dataset- For an in-depth examination of npm ecosystem-related issues, we primarily focused on core developers' tweets. To identify core developers, we selected only those who submitted at least one pull request to software packages published on the npm ecosystem. To do so, we first extracted all npm pull requests through the GitHub API and obtained the GitHub IDs of 123,647 core developers. We subsequently extracted user profile information for each GitHub ID. The output of this step is 14,330 Twitter IDs of npm core developers linked to their GitHub IDs.
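The linking in Step-1 can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the pull-request and user-profile payloads have already been downloaded as JSON from the GitHub REST API, where a pull request carries its author under `user.login` and a user profile exposes an optional `twitter_username` field. The function names and the sample records are hypothetical, and the record does not state how Twitter usernames were resolved to Twitter IDs.

```python
from typing import Dict, Iterable, Mapping, Set


def pr_author_logins(pulls: Iterable[Mapping]) -> Set[str]:
    """Collect the GitHub logins of pull-request authors."""
    logins: Set[str] = set()
    for pr in pulls:
        user = pr.get("user") or {}  # the author can be null for deleted accounts
        login = user.get("login")
        if login:
            logins.add(login)
    return logins


def link_twitter_accounts(profiles: Iterable[Mapping]) -> Dict[str, str]:
    """Map GitHub login -> Twitter username, skipping profiles without one."""
    links: Dict[str, str] = {}
    for profile in profiles:
        handle = profile.get("twitter_username")
        if handle:
            links[profile["login"]] = handle
    return links


# Hypothetical sample payloads, shaped like GitHub REST API responses.
sample_pulls = [
    {"number": 1, "user": {"login": "alice"}},
    {"number": 2, "user": {"login": "bob"}},
    {"number": 3, "user": {"login": "alice"}},
    {"number": 4, "user": None},
]
sample_profiles = [
    {"login": "alice", "twitter_username": "alice_dev"},
    {"login": "bob", "twitter_username": None},
]

authors = pr_author_logins(sample_pulls)
twitter_links = link_twitter_accounts(sample_profiles)
```

Developers who list no Twitter account on their profile drop out at the second function, which is consistent with the linked set in Step-1 being much smaller than the set of pull-request authors.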
Step-2: Extracting core developers' tweets- To extract the npm core developers' tweets using the output of Step-1, we used the official Twitter search API. Afterwards, to filter npm ecosystem-related tweets, we limited our data collection to tweets that match specific keywords such as npm, pnpm, npm-install, npm-scripts, npmignore, and npm-shrinkwrap, obtained from our previous work (i.e., Islam, Syful, et al. "An Empirical Study of Package Management Issues via Stack Overflow." IEICE TRANSACTIONS on Information and Systems 106.2 (2023): 138-147). The output of this step is a dataset (D1) of 39,425 tweets related to the npm ecosystem by core developers.
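The keyword filter in Step-2 could look roughly like the sketch below. It assumes the tweets are already available as plain strings and uses only the six keywords named above, although the full keyword list from the cited work may be longer. The whole-token, case-insensitive matching rule and the names `KEYWORDS`, `is_npm_related`, and `filter_tweets` are assumptions made for illustration; the record does not describe the exact matching rule.

```python
import re
from typing import Iterable, List

# Keywords named in Step-2; the complete list may contain more entries.
KEYWORDS = ["npm", "pnpm", "npm-install", "npm-scripts", "npmignore", "npm-shrinkwrap"]

# Longest keywords first so "npm-install" is tried before plain "npm".
# The lookarounds require the keyword to stand alone as a token.
_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def is_npm_related(text: str) -> bool:
    """Return True if the tweet text mentions any keyword as a whole token."""
    return _PATTERN.search(text) is not None


def filter_tweets(tweets: Iterable[str]) -> List[str]:
    """Keep only the tweets that mention at least one keyword."""
    return [t for t in tweets if is_npm_related(t)]


# Hypothetical tweets for illustration.
sample_tweets = [
    "Just published a new release to NPM!",
    "Switched the monorepo to pnpm this week",
    "Remember to check your .npmignore before publishing",
    "Great weather for a conference today",
]
related = filter_tweets(sample_tweets)
```

Under this rule the first three sample tweets are kept and the last one is discarded. A looser substring match would also catch tokens such as domain names that merely contain "npm", so the choice of rule affects the size of the resulting dataset.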
Step-3: Preparing a sample tweet dataset- After obtaining the npm ecosystem-related tweets (i.e., dataset D1) posted by core developers, we prepared a sample dataset at a 99% confidence level with a confidence interval of 3. The output of this step is a dataset (D2) of 1,176 tweets.
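For readers unfamiliar with how a confidence level and interval translate into a sample size, one common approach is Cochran's formula with a finite population correction, sketched below. This is an assumption about the general technique, not a statement of the calculation the authors performed: the record does not name the formula or calculator used, so the output of this sketch should not be expected to match the reported size of D2. The function name and the default proportion of 0.5 are illustrative choices.

```python
import math
from statistics import NormalDist


def sample_size(population: int, confidence: float, margin: float, p: float = 0.5) -> int:
    """Cochran's sample size with finite population correction.

    population: size of the full dataset
    confidence: confidence level as a fraction, e.g. 0.99
    margin:     confidence interval (margin of error) as a fraction, e.g. 0.03
    p:          assumed proportion; 0.5 is the most conservative choice
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Correct for the finite population and round up to whole items.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Example with a population the size of D1 and the stated parameters.
d1_size = 39425
suggested = sample_size(d1_size, confidence=0.99, margin=0.03)
```

Once a sample size is chosen, a reproducible sample can be drawn with `random.Random(seed).sample(tweets, k)`, where the seed value is arbitrary and would need to be recorded for replication.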
Files
Replication Package.zip (4.2 MB)
md5:696e3fd4a887975bb3dca177eaa16ee3