Phi-3-Vision on Apple Silicon: MLX Porting Guide

Albers, Josef

doi:10.5281/zenodo.13254550

Published August 1, 2024 | Version v4

Resource type: Book | Access: Open

This tutorial series is a comprehensive guide to porting Microsoft's Phi-3-Vision, a compact yet powerful vision-language model, to Apple's MLX framework and optimizing it for efficient execution on Apple Silicon. The series covers a range of advanced techniques for model adaptation and performance enhancement, including: 1) a basic implementation of Phi-3-Vision in MLX, 2) integration of Su-scaled Rotary Position Embeddings (SuRoPE) for handling long contexts, 3) efficient batching techniques, 4) caching mechanisms for accelerated text generation, 5) advanced decoding strategies for guided outputs, 6) Low-Rank Adaptation (LoRA) for efficient fine-tuning, and 7) an Agent class with a flexible toolchain system for complex AI workflows. The series also demonstrates the broader applicability of these techniques by using them to port Google's PaliGemma model. This work contributes to the growing effort to optimize large language models for consumer-grade hardware, potentially broadening access to sophisticated AI capabilities.

Files

Name	Size
mlx_porting_guide.pdf md5:0e512b611c66e65647c98dbc63cd1c53	501.9 kB

Additional details

Created: 2024-08-01

Repository URL: https://github.com/JosefAlbers/Phi-3-Vision-MLX
Programming language: Python
Development Status: Active

	All versions	This version
Views	169	111
Downloads	121	45
Data volume	63.0 MB	25.1 MB
