AI on the Edge: An Automated Pipeline for PyTorch-to-Android Deployment and Benchmarking
Authors/Creators
- 1. Computer Vision Lab, CAIDAS & IFI, University of Wurzburg, Germany
Description
The deployment of deep learning models on mobile devices is a cornerstone of modern AI applications. However, performance benchmarking in this domain remains a predominantly manual, time-consuming, and non-scalable process. This paper introduces NN Lite (https://github.com/ABrain-One/nn-lite), a fully automated, end-to-end pipeline that bridges the gap between model development in PyTorch and rigorous performance evaluation on the Android platform. The system comprises a Python-based orchestration framework that manages model conversion, emulator control, and data collection, working in tandem with a lightweight Android application for on-device benchmarking. The orchestrator systematically converts PyTorch models to TensorFlow Lite format, deploys the benchmark application, executes inference tests, and retrieves detailed performance reports. The output is a collection of structured JSON reports containing inference latency metrics enriched with device-specific hardware analytics. The framework eliminates manual intervention, ensures reproducibility, and provides a scalable solution for evaluating the on-device performance of diverse neural network architectures. In a large-scale evaluation, the system processed over 7,500 models and ran unattended for more than 48 hours of continuous operation, establishing a new standard for automated mobile ML testing infrastructure.
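The deploy, run, and retrieve steps described above can be illustrated with a minimal Python sketch. It is not taken from the nn-lite repository: the package name, activity name, device directory, intent extra key, and the `BenchmarkJob` and `adb_plan` names are all hypothetical placeholders. Only the `adb` subcommands themselves (`install -r`, `push`, `shell am start -n`, `pull`) are standard Android tooling. The sketch only builds the command lists and does not execute them.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical placeholders for illustration only (not taken from nn-lite).
APP_PACKAGE = "com.example.benchmark"
APP_ACTIVITY = ".MainActivity"
DEVICE_DIR = "/data/local/tmp"


@dataclass(frozen=True)
class BenchmarkJob:
    model_name: str   # e.g. "resnet18"
    tflite_path: str  # local path of the already converted .tflite model
    apk_path: str     # local path of the benchmark application
    report_dir: str   # local directory where JSON reports are collected


def adb_plan(job: BenchmarkJob, serial: str = "emulator-5554") -> List[List[str]]:
    """Return the adb commands for one benchmark run, in execution order."""
    adb = ["adb", "-s", serial]
    remote_model = f"{DEVICE_DIR}/{job.model_name}.tflite"
    remote_report = f"{DEVICE_DIR}/{job.model_name}.json"
    return [
        # 1. Install (or reinstall) the benchmark application.
        adb + ["install", "-r", job.apk_path],
        # 2. Copy the converted model to the device.
        adb + ["push", job.tflite_path, remote_model],
        # 3. Start the app; the "model" string extra is an assumed convention.
        adb + ["shell", "am", "start", "-n", f"{APP_PACKAGE}/{APP_ACTIVITY}",
               "--es", "model", remote_model],
        # 4. Retrieve the JSON report. A real orchestrator would first wait
        #    until the app has finished writing it.
        adb + ["pull", remote_report, f"{job.report_dir}/{job.model_name}.json"],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. Building the plan separately from executing it keeps the sequence easy to log and to test without a running emulator.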
Files
AI_on_the_edge_automatic_pipeline-1.pdf
Additional details
Related works
- Cites
- Dataset: https://github.com/ABrain-One/nn-dataset/ (URL)
- Software: https://github.com/ABrain-One/nn-gpt (URL)
- Is supplement to
- Software: https://github.com/ABrain-One/nn-lite (URL)
Dates
- Created
- 2015-10-15: Date when paper was completed
Software
- Repository URL
- https://github.com/ABrain-One/nn-lite
- Programming language
- Python, Kotlin
- Development Status
- Active
References
- Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. Once-for-all: Train one network and specialize it for efficient deployment. In International Conference on Learning Representations (ICLR), 2020.
- Arash Torabi Goodarzi, Roman Kochnev, Waleed Khalid, Furui Qin, Tolgay Atinc Uzun, Yashkumar Sanjaybhai Dhameliya, Yash Kanubhai Kathiriya, Zofia Antonina Bentyn, Dmitry Ignatov, and Radu Timofte. Lemur neural network dataset: Towards seamless automl. arXiv preprint arXiv:2504.10552, 2025.
- Google LLC. Firebase Test Lab: Cloud-Based App Testing Infrastructure, 2016. Cloud-based infrastructure for automated mobile app testing.
- Google LLC. TensorFlow Lite: On-device machine learning framework, 2017. Open-source deep learning framework for mobile and edge devices.
- Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. In International Conference on Learning Representations (ICLR), 2016.
- Andrey Ignatov, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, and Luc Van Gool. AI benchmark: Running deep neural networks on android smartphones. In European Conference on Computer Vision (ECCV), pages 288–314, 2018.
- Roman Kochnev, Arash Torabi Goodarzi, Zofia Antonina Bentyn, Dmitry Ignatov, and Radu Timofte. Optuna vs code llama: Are llms a new paradigm for hyperparameter tuning? In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2025.
- Roman Kochnev, Waleed Khalid, Tolgay Atinc Uzun, Xi Zhang, Yashkumar Sanjaybhai Dhameliya, Furui Qin, Dmitry Ignatov, and Radu Timofte. Nngpt: Rethinking automl with large language models. arXiv preprint, 2025.
- Dianshu Liao, Shidong Pan, Siyuan Yang, Yanjie Zhao, Zhenchang Xing, and Xiaoyu Sun. A comparative study of android performance issues in real-world applications and literature. arXiv preprint arXiv:2401.07849, 2024.
- Meta AI. PyTorch Mobile: End-to-end deployment solution for mobile and embedded devices. Meta AI Research, 2019. Official documentation and framework release.
- Meta AI. AI Edge Torch: PyTorch Library for Edge Device Deployment, 2024. Official PyTorch extension for edge computing and mobile deployment.
- Vijay Janapa Reddi, David Kanter, Peter Mattson, Jared Duke, Thai Nguyen, Ramesh Chukka, Ken Shiring, Koan-Sin Tan, Mark Charlebois, William Chou, Mostafa El-Khamy, Jungwook Hong, Tom St. John, Cindy Trinh, Michael Buch, Mark Mazumder, Relia Markovic, Thomas Atta, Fatih Cakir, Masoud Charkhabi, Xiaodong Chen, Cheng-Ming Chiang, Dave Dexter, Terry Heo, Gunther Schmuelling, Maryam Shabani, and Dylan Zika. MLPerf mobile inference benchmark. In MLPerf. MLCommons, 2020.
- Md Ziaul Haque Zim. TinyML: Analysis of Xtensa LX6 microprocessor for neural network inference. Future Internet, 15(11):350, 2023.