Published June 17, 2025 | Version jss26
Software Open

MENTOR: Fixing Introductory Programming Assignments With Formula-Based Fault Localization and LLM-Driven Program Repair

  • 1. University of Oxford
  • 2. Department of Computer Science, University of Oxford
  • 3. Czech Technical University in Prague
  • 4. Instituto Superior Técnico

Description

Overview

This is the official Zenodo artifact for **“MENTOR: Fixing Introductory Programming Assignments With Formula-Based Fault Localization and LLM-Driven Program Repair”**, published in the *Journal of Systems and Software (JSS)*, 2026.

MENTOR [1] is a semantic automated program repair (APR) framework designed to provide automated feedback on introductory programming assignments (IPAs). It leverages previous student submissions and integrates program clustering [2], variable alignment [3], fault localization [4], and Large Language Models (LLMs) [5] to guide program repair.

MENTOR can provide feedback in two ways: by highlighting the faulty statements in a student's program, or by repairing the incorrect program and presenting the student with a fixed version.
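To make the first feedback mode concrete, here is a minimal sketch of how a set of line numbers reported as faulty could be rendered as annotations on a student's C program. It is an illustration only: the function `highlight_faulty_lines`, the marker text, and the example line numbers are hypothetical and are not part of MENTOR's code or output format.

```python
from typing import Iterable


def highlight_faulty_lines(
    source: str,
    faulty_lines: Iterable[int],
    marker: str = "  /* <-- possibly faulty */",
) -> str:
    """Append `marker` to every 1-indexed line of `source` listed in `faulty_lines`.

    Hypothetical helper for illustration; MENTOR's real feedback format may differ.
    """
    flagged = set(faulty_lines)
    annotated = []
    for number, line in enumerate(source.splitlines(), start=1):
        annotated.append(line + marker if number in flagged else line)
    return "\n".join(annotated)


if __name__ == "__main__":
    # A buggy student attempt at `max`: the comparison is inverted.
    program = (
        "int max(int a, int b) {\n"
        "    if (a < b)\n"
        "        return a;\n"
        "    return b;\n"
        "}"
    )
    # Assume a fault localizer flagged line 2 (made-up result for this example).
    print(highlight_faulty_lines(program, {2}))
```

The second feedback mode would instead return a repaired program, which cannot be sketched without the repair pipeline itself.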

Over 70% of students found MENTOR's feedback helpful in understanding and correcting their programming mistakes [6].

![MENTOR Overview](MENTOR-overview.jpg)

Repository Structure

This repository contains several directories, scripts, and submodules:

Git submodules:

- **program-clustering/**: Clustering-based program repair module.
- **variable-mapping/**: Submodule for variable alignment in programs.
- **fault-localization/**: Fault localization submodule.
- **C-Pack-IPAs/**: Submodule with a benchmark of introductory programming assignments (IPAs) [7].

Codebase:

- **mentor/**: Core implementation of MENTOR.
- **LLMs/**: Contains components for LLM-based repair processes.
- **code_metrics/**: Provides code evaluation metrics.
- **database/**: Stores and manages relevant data for program repair.
- **utils/**: Utility scripts and helper functions.
- **LLM-CEGIS-Repair.md**: [README](LLM-CEGIS-Repair.md) for the LLM-based repair approach.
- **how_to_run_LLMs.sh**: Script explaining how to run LLM-based repair.
- **how_to_run_RepairAgents.sh**: Script explaining how to run repair agents.
- **repair.py**: The main script to run MENTOR.
- **repair_CPackIPAs.sh**: Script to run MENTOR on the entire C-Pack-IPAs benchmark using different prompt configurations.
- **requirements.txt**: Lists required dependencies for MENTOR.

Installation

MENTOR relies on multiple submodules, each containing its own requirements and implementation instructions. To set up the project, follow these steps:

```bash
git clone --recurse-submodules git@github.com:pmorvalho/MENTOR.git
cd MENTOR
pip install -r requirements.txt
```

For additional dependencies, check the requirements files inside each submodule.
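As a convenience, the sketch below lists the requirements files found one level below the repository root and prints a matching `pip install` command for each. It assumes that each submodule keeps a file matching `requirements*.txt` at its top level. That layout is a guess, so check each submodule's own README if nothing is found. The function name `find_submodule_requirements` is hypothetical.

```python
from pathlib import Path
from typing import List


def find_submodule_requirements(repo_root: str) -> List[Path]:
    """Return requirements files located directly inside subdirectories of `repo_root`.

    Assumption: submodules store their dependencies in `requirements*.txt`
    at their top level. The root-level requirements.txt is not included.
    """
    root = Path(repo_root)
    return sorted(root.glob("*/requirements*.txt"))


if __name__ == "__main__":
    # Run from the MENTOR checkout; prints commands instead of executing them.
    for req in find_submodule_requirements("."):
        print(f"pip install -r {req}")
```

Printing the commands rather than running them lets you review each file before installing anything.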

Usage

To learn how to run MENTOR on an individual program repair task, run:

```bash
python repair.py -h
```

To learn how to run MENTOR on the entire C-Pack-IPAs benchmark with different prompt configurations, run:

```bash
./repair_CPackIPAs.sh -h
```

For LLM-based repair, refer to the `how_to_run_LLMs.sh` script. For running repair agents, use `how_to_run_RepairAgents.sh`.

Citation

If you use MENTOR in your research, please cite the following paper:

```
@article{OrvalhoJM26,
  author    = {Pedro Orvalho and
               Mikol{\'{a}}s Janota and
               Vasco Manquinho},
  title     = {{MENTOR: Fixing Introductory Programming Assignments with Formula-Based Fault Localization and LLM-Driven Program Repair}},
  journal   = {Journal of Systems and Software},
  year      = {2026},
  publisher = {Elsevier},
  issn      = {0164-1212},
  doi       = {10.1016/j.jss.2025.112690},
  url       = {https://www.sciencedirect.com/science/article/pii/S0164121225003590}
}
```

Contributing

Contributions are welcome! Please follow the standard GitHub workflow:

1. Fork the repository.
2. Create a feature branch.
3. Commit your changes.
4. Open a pull request.

Maintenance, Support & Collaborations

MENTOR is **actively maintained** and used in ongoing research. We continue to develop the tool and build upon it, and we are **open to collaborations**.

If you run into a problem, please **open an issue** on our GitHub repository. You can also **email us** so we do not miss it.

License

This project is licensed under the terms of the MIT License.

References

[1] P. Orvalho, M. Janota, and V. Manquinho. MENTOR: Fixing Introductory Programming Assignments with Formula-Based Fault Localization and LLM-Driven Program Repair. Journal of Systems and Software (JSS), 2026. [PDF](https://www.sciencedirect.com/science/article/pii/S0164121225003590). [GitHub](https://github.com/pmorvalho/MENTOR).

[2] P. Orvalho, M. Janota, and V. Manquinho. InvAASTCluster: On Applying Invariant-Based Program Clustering to Introductory Programming Assignments. arXiv 2022. [PDF](https://arxiv.org/pdf/2206.14175). [GitHub](https://github.com/pmorvalho/InvAASTCluster).

[3] P. Orvalho, J. Piepenbrock, M. Janota, and V. Manquinho. Graph Neural Networks For Mapping Variables Between Programs. ECAI 2023. [PDF](https://arxiv.org/pdf/2307.13014.pdf). [GitHub](https://github.com/pmorvalho/ecai23-GNNs-for-mapping-variables-between-programs).

[4] P. Orvalho, M. Janota, and V. Manquinho. CFaults: Model-Based Diagnosis for Fault Localization in C with Multiple Test Cases. The 26th International Symposium on Formal Methods, FM 2024. [PDF](https://arxiv.org/pdf/2407.09337). [GitHub](https://github.com/pmorvalho/CFaults).

[5] P. Orvalho, M. Janota, and V. Manquinho. Counterexample Guided Program Repair Using Zero-Shot Learning and MaxSAT-based Fault Localization. In the 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025. [PDF](https://arxiv.org/pdf/2502.07786). [GitHub](https://github.com/pmorvalho/LLM-CEGIS-Repair).

[6] P. Orvalho, M. Janota, and V. Manquinho. GitSEED: A Git-backed Automated Assessment Tool for Software Engineering and Programming Education. The 1st ACM Virtual Global Computing Education Conference, SIGCSE Virtual 2024. [PDF](https://arxiv.org/pdf/2409.07362). [GitLab](https://gitlab.inesc-id.pt/u020557/GitSEED).

[7] P. Orvalho, M. Janota, and V. Manquinho. C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments. In the 5th International Workshop on Automated Program Repair, APR 2024, co-located with ICSE 2024. [PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10653136). [GitHub](https://github.com/pmorvalho/C-Pack-IPAs).

Files

MENTOR.zip (6.6 MB)
md5:407b75bbd36858042c12573dc3835848

Additional details

Dates

Available
2025-11-12

Software

Repository URL
https://github.com/pmorvalho/MENTOR
Programming language
C
Development Status
Active