Artifacts of the TOSEM Submission Mario
Description
This is the online repository of Mario, a journal-first paper under review at ACM TOSEM.
Dataset: The datasets used in our study are all open source. We provide links to them below.
- Empirical dataset -- collected by Liu et al. (here)
- Evaluation dataset -- collected by Alon et al. (here)
Source code: We release three files here.
- Transformer.py: This is the Transformer model used in Mario, implemented in PyTorch.
- evaluation.py: This file shows the workflow of Mario and calculates its overall performance.
- prior_knowledge.json: This file stores the prior knowledge extracted from our empirical dataset, which is used to predict field-relevant method names for unique fields.
We will build a homepage for Mario on GitHub and release the whole project upon acceptance.
Explanation of Figure 3b:
In our experiment, we find that \(\overline{\mathbb{S}_{M}}\) is slightly higher than \(\overline{\mathbb{S}_{T}}\). Manual inspection shows that this happens because the tokens composing the investigated method names are repetitive. We give a concrete example below.
The ErrorsTag.java class of the Apache Struts 1 project contains 12 field-relevant method names, obtained by pairing each verb in Verbs = {get, set} with each field in Fields = {bundle, footer, locale, name, property, header}. When \(\alpha\) = 0.5, its proximate classes contain 14 field-relevant method names in total: the same 12 plus prepareName and createLocale. Under this condition, the Jaccard similarity at the method name level is 0.857 (12/14), higher than that at the token level, which is 0.8 (8/10), because the two extra names contribute only two new tokens (prepare and create) to the eight already present.
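The arithmetic in this example can be reproduced with a short sketch. The helper names `jaccard` and `tokens` and the camel-case splitting rule are assumptions made for illustration; they are not taken from evaluation.py, and Mario's actual tokenization may differ.

```python
import re


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def tokens(method_names):
    """Split camelCase method names into a set of lowercase tokens.

    Assumption: tokens are split at camel-case boundaries.
    """
    pattern = r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])"
    result = set()
    for name in method_names:
        result.update(t.lower() for t in re.findall(pattern, name))
    return result


verbs = ["get", "set"]
fields = ["bundle", "footer", "locale", "name", "property", "header"]

# 12 field-relevant method names of ErrorsTag.java (getBundle, setBundle, ...)
target = {v + f.capitalize() for v in verbs for f in fields}

# 14 names in the proximate classes: the same 12 plus two extra ones
proximate = target | {"prepareName", "createLocale"}

name_sim = jaccard(target, proximate)                   # 12 / 14
token_sim = jaccard(tokens(target), tokens(proximate))  # 8 / 10

print(f"method-name level: {name_sim:.3f}")
print(f"token level:       {token_sim:.3f}")
```

The two extra method names enlarge the name-level union by two out of 14 elements, and the token-level union by two out of only 10, which is why the token-level similarity ends up lower.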
User Study:
We release the queries from the developers and the predictions of Mario in our user study. Note that, due to the confidentiality policy, we only show the last two words of the fully qualified class name for each query.
***.impl.MarketService.java: {Init, set config file name, parse config, get config, get config file name}
***.user.UserController.java: {Login, register, logout, delete role, find roles, create admin user, is login, get user by username, find permissions expired}
***.db.Bot.java: {Respond, create kernel handler, init}
***.controller.LoginController.java: {On click, show login form, login, logout}
***.controller.UserController.java: {Delete user, save user, get user by id, get customer users, serve user, send activation email, get activation link, set passward}
***.login.LoginService.java: {Login, logout, login with scm, get authentication}
***.dto.ClientLoginDto.java: {To string, get topology}
***.impl.IdaasServiceImpl.java: {Get mapper id, execute, get job id}
***.router.DBRouterJoinPoint.java: {Materialize string, to string infix, get id}
***.aspect.LogAspect.java: {Log to db, do before, do around, do after returning, do after in service layer, web service}
***.service.DocService.java: {Get project docs, add enum doc strings, search docs, service added, add exception doc strings, doc values string, find supported services, copy of docs}
***.dataclean.convertMessageStructureService.java: {Start up, delete, add module, list types, get type, shut down, delete all, list modules}
***.dataclean.messageRouteAndSendService.java: {On bind, on create, on destroy, send message, on start command, handle message, configure, get error message}
***.addresssimilarity.addressSimilarityService.java: {Delete, get hosted connection, start limited on connection, notify, opened change, on disconnected list changed, create, on initialize, on map changed}