AI on the Edge: An Automated Pipeline for PyTorch-to-Android Deployment and Benchmarking

Saif U Din; Hussain, Muhammad Ahsan; Ikram, Mohsin; Ignatov, Dmitry; Timofte, Radu

doi:10.5281/zenodo.17684575

Published November 23, 2025 | Version v1.0

Publication Open

1. Computer Vision Lab, CAIDAS & IFI, University of Würzburg, Germany

The deployment of deep learning models on mobile devices is a cornerstone of modern AI applications. However, performance benchmarking in this domain remains a predominantly manual, time-consuming, and non-scalable process. This paper introduces NN Lite (https://github.com/ABrain-One/nn-lite), a fully automated, end-to-end pipeline that bridges the gap between model development in PyTorch and rigorous performance evaluation on the Android platform. Our system comprises a Python-based orchestration framework that manages model conversion, emulator control, and data collection, working in tandem with a lightweight Android application for on-device benchmarking. The orchestrator systematically converts PyTorch models to TensorFlow Lite format, deploys the benchmark application, executes inference tests, and retrieves detailed performance reports. The output is a collection of structured JSON reports containing precise inference latency metrics enriched with device-specific hardware analytics. This framework eliminates manual intervention, ensures reproducibility, and provides a scalable solution for evaluating the on-device performance of diverse neural network architectures. In a large-scale evaluation, the system successfully processed over 7,500 models and sustained 48+ hours of continuous unattended operation, thereby establishing a new standard for automated mobile ML testing infrastructure.
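The deploy, run, and retrieve steps described above can be sketched as a short plan of `adb` commands. This is a minimal illustration under stated assumptions, not the NN Lite implementation: the package name, activity name, intent extra, and device paths are hypothetical placeholders, and model conversion is assumed to have already produced a `.tflite` file. The sketch runs dry by default, so it only records the commands it would issue.

```python
"""Dry-run sketch of one benchmark cycle driven through adb.

Every name below (package, activity, intent extra, device paths) is a
hypothetical placeholder and is not taken from the NN Lite repository.
"""
import shlex
import subprocess
from pathlib import Path

APP_PACKAGE = "com.example.benchmark"   # hypothetical package name
APP_ACTIVITY = ".MainActivity"          # hypothetical launcher activity
DEVICE_MODEL_DIR = "/data/local/tmp"    # assumed location for pushed models
DEVICE_REPORT_DIR = "/sdcard/Download"  # assumed location of the app's report


def build_plan(apk: str, model: str, report_dir: str) -> list[list[str]]:
    """Return the adb commands for one model: install, push, start, pull."""
    name = Path(model).name   # e.g. "resnet18.tflite"
    stem = Path(model).stem   # e.g. "resnet18"
    remote_model = f"{DEVICE_MODEL_DIR}/{name}"
    remote_report = f"{DEVICE_REPORT_DIR}/{stem}.json"
    return [
        # Install (or reinstall) the benchmark application.
        ["adb", "install", "-r", apk],
        # Copy the converted model onto the device or emulator.
        ["adb", "push", model, remote_model],
        # Launch the app and pass the model path as a string extra.
        ["adb", "shell", "am", "start", "-n", f"{APP_PACKAGE}/{APP_ACTIVITY}",
         "--es", "model_path", remote_model],
        # Fetch the JSON report. "am start" returns immediately, so a real
        # orchestrator would have to wait for the report before pulling it.
        ["adb", "pull", remote_report, f"{report_dir}/{stem}.json"],
    ]


def run_plan(plan: list[list[str]], dry_run: bool = True) -> list[str]:
    """Execute the plan in order; in dry-run mode only collect the commands."""
    issued = []
    for cmd in plan:
        issued.append(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if adb reports failure
    return issued


if __name__ == "__main__":
    for line in run_plan(build_plan("app.apk", "models/resnet18.tflite", "reports")):
        print(line)
```

Keeping the plan as data, separate from its execution, makes it possible to log or inspect a cycle without a connected device, which is one plausible way to support long unattended runs.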

Files (318.5 kB)

Name	MD5	Size
AI_on_the_edge_automatic_pipeline-1.pdf	eb5a85dbcc04e329833522e6f7544361	260.6 kB
android_Medium_Phone_API_36.1.json	204004e8d6149350b65012577f2add96	2.9 kB
processing_state.json	8c1f6b22a87f1d17a0587dacdc6948f7	55.0 kB

Additional details

Related works

Cites: Dataset: https://github.com/ABrain-One/nn-dataset/ (URL); Software: https://github.com/ABrain-One/nn-gpt (URL)
Is supplement to: Software: https://github.com/ABrain-One/nn-lite (URL)

Dates

Created: 2015-10-15 (date when paper was completed)

Software

Repository URL: https://github.com/ABrain-One/nn-lite
Programming language: Python, Kotlin
Development Status: Active

References

Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. Once-for-all: Train one network and specialize it for efficient deployment. In International Conference on Learning Representations (ICLR), 2020.
Arash Torabi Goodarzi, Roman Kochnev, Waleed Khalid, Furui Qin, Tolgay Atinc Uzun, Yashkumar Sanjaybhai Dhameliya, Yash Kanubhai Kathiriya, Zofia Antonina Bentyn, Dmitry Ignatov, and Radu Timofte. Lemur neural network dataset: Towards seamless automl. arXiv preprint arXiv:2504.10552, 2025.
Google LLC. Firebase Test Lab: Cloud-Based App Testing Infrastructure, 2016. Cloud-based infrastructure for automated mobile app testing.
Google LLC. TensorFlow Lite: On-device machine learning framework, 2017. Open-source deep learning framework for mobile and edge devices.
Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. In International Conference on Learning Representations (ICLR), 2016.
Andrey Ignatov, Radu Timofte, William Chou, Ke Wang, Max Wu, Tim Hartley, and Luc Van Gool. AI benchmark: Running deep neural networks on android smartphones. In European Conference on Computer Vision (ECCV), pages 288–314, 2018.
Roman Kochnev, Arash Torabi Goodarzi, Zofia Antonina Bentyn, Dmitry Ignatov, and Radu Timofte. Optuna vs code llama: Are llms a new paradigm for hyperparameter tuning? In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2025.
Roman Kochnev, Waleed Khalid, Tolgay Atinc Uzun, Xi Zhang, Yashkumar Sanjaybhai Dhameliya, Furui Qin, Dmitry Ignatov, and Radu Timofte. Nngpt: Rethinking automl with large language models. arXiv preprint, 2025.
Dianshu Liao, Shidong Pan, Siyuan Yang, Yanjie Zhao, Zhenchang Xing, and Xiaoyu Sun. A comparative study of android performance issues in real-world applications and literature. arXiv preprint arXiv:2401.07849, 2024.
Meta AI. PyTorch Mobile: End-to-end deployment solution for mobile and embedded devices. Meta AI Research, 2019. Official documentation and framework release.
Meta AI. AI Edge Torch: PyTorch Library for Edge Device Deployment, 2024. Official PyTorch extension for edge computing and mobile deployment.
Vijay Janapa Reddi, David Kanter, Peter Mattson, Jared Duke, Thai Nguyen, Ramesh Chukka, Ken Shiring, Koan-Sin Tan, Mark Charlebois, William Chou, Mostafa El-Khamy, Jungwook Hong, Tom St. John, Cindy Trinh, Michael Buch, Mark Mazumder, Relia Markovic, Thomas Atta, Fatih Cakir, Masoud Charkhabi, Xiaodong Chen, Cheng-Ming Chiang, Dave Dexter, Terry Heo, Gunther Schmuelling, Maryam Shabani, and Dylan Zika. MLPerf mobile inference benchmark. In MLPerf. MLCommons, 2020.
Md Ziaul Haque Zim. TinyML: Analysis of Xtensa LX6 microprocessor for neural network inference. Future Internet, 15(11):350, 2023.

	All versions	This version
Views	104	104
Downloads	74	74
Data volume	21.7 MB	21.7 MB
