Published February 10, 2026 | Version v1
Preprint Open

Machine Learning as a Tool (MLAT): A Framework for Integrating Statistical ML Models as Callable Tools within LLM Agent Workflows

  • 1. Legacy AI LLC Company Limited

Description

We introduce Machine Learning as a Tool (MLAT), a design pattern in which pretrained statistical ML models are exposed as callable tools within LLM agent workflows, enabling the orchestrating agent to invoke real-time predictions and reason about their outputs contextually. Unlike conventional pipelines that treat ML inference as a static preprocessing step, MLAT positions the ML model as a first-class tool alongside web search, database queries, and API calls, allowing the LLM to decide when and how to invoke the model based on conversational context. Despite the naturalness of this pattern, it appears to be underexplored in both the academic literature on agentic AI and in production system architectures.
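
The pattern can be illustrated with a minimal sketch. It assumes a hypothetical tool registry and a stand-in linear pricing function in place of a trained model; the function names, feature names, and coefficients are invented for illustration and do not come from PitchCraft. The LLM's tool-call request is simulated as a plain dictionary rather than produced by a real model API.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a pretrained model. A real system would load a
# trained estimator and call its predict method here instead.
def predict_price(features: Dict[str, float]) -> Dict[str, Any]:
    base = 5000.0
    price = (
        base
        + 1200.0 * features["num_deliverables"]
        + 150.0 * features["estimated_hours"]
    )
    return {"predicted_price_usd": round(price, 2), "model": "stand-in-linear"}

# Placeholder for a conventional tool, to show the ML model sits beside it.
def web_search(query: str) -> Dict[str, Any]:
    return {"results": [f"placeholder result for {query!r}"]}

# The ML model is registered exactly like any other tool.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "predict_price": predict_price,
    "web_search": web_search,
}

def dispatch(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Route a tool call (name plus keyword arguments) to its implementation."""
    name = tool_call["name"]
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**tool_call["args"])

# Simulated request, shaped the way an agent might emit it after deciding
# that a price estimate is needed for the current conversation.
call = {
    "name": "predict_price",
    "args": {"features": {"num_deliverables": 3, "estimated_hours": 40}},
}
result = dispatch(call)
```

In an agent loop, `result` would be returned to the LLM as the tool response, so the model can reason about the prediction before writing its answer.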

To validate MLAT, we present PitchCraft, a pilot production system we built for the Google Gemini Hackathon that transforms discovery call recordings into professional proposals with ML-predicted pricing. PitchCraft implements MLAT through a single LLM workflow containing two Gemini-powered agents: a Research Agent that performs prospect intelligence gathering via parallel tool calls, and a Draft Agent that invokes an XGBoost pricing model as a tool call, reasons about the prediction, and generates a complete proposal via structured output parsing. The XGBoost model, trained on 70 examples (40 real agency deals augmented with 30 human-verified synthetic records), achieves R^2 = 0.807 on held-out test data with MAE of $3,688. The complete system reduces proposal generation from 3+ hours to under 10 minutes.
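
For readers unfamiliar with the two reported metrics, the sketch below computes them from their standard definitions: MAE is the mean of the absolute errors, and R^2 is one minus the ratio of the residual sum of squares to the total sum of squares. The prices in the example are invented for illustration and are not PitchCraft data.

```python
from typing import Sequence

def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented held-out prices in USD, for illustration only.
actual = [10000.0, 20000.0, 30000.0]
predicted = [12000.0, 19000.0, 27000.0]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
```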

We detail the MLAT framework formally, the structured output parsing architecture using Gemini's JSON schema capabilities, the ML methodology under extreme data scarcity (10:1 sample-to-feature ratio), group-aware cross-validation to prevent data leakage, and a sensitivity analysis demonstrating that the model has learned economically meaningful feature relationships. We argue that MLAT's applicability extends to any domain requiring quantitative estimation combined with contextual reasoning.
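
Group-aware cross-validation can be sketched as follows. The description does not state how groups are defined, so this example assumes each synthetic record shares a group label with the real deal it was derived from; the labels and the greedy balancing strategy are illustrative assumptions, not the authors' procedure. The key property is that no group appears in both the training and test side of a split.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

def group_k_fold(
    groups: Sequence[Hashable], n_splits: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs with whole groups per fold."""
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of distinct groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds: List[List[int]] = [[] for _ in range(n_splits)]
    for label in sorted(by_group, key=lambda g: len(by_group[g]), reverse=True):
        smallest = min(range(n_splits), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_group[label])

    splits = []
    for i in range(n_splits):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for k, fold in enumerate(folds) if k != i for j in fold)
        splits.append((train_idx, test_idx))
    return splits

# Hypothetical group labels: records sharing a label stay in the same fold.
groups = ["deal_a", "deal_a", "deal_b", "deal_c", "deal_c", "deal_d"]
splits = group_k_fold(groups, n_splits=2)
```

In practice a library implementation such as scikit-learn's `GroupKFold` serves the same purpose; the hand-rolled version above only makes the leakage constraint explicit.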

Files

PitchCraft_MLAT_Paper_v3.pdf (454.9 kB)
md5:cfe1b9b6e8b728b2287e09fb1d597db4

Additional details

Software

Repository URL
https://github.com/LegacyAIDev/pitch-craft
Programming language
Python, TypeScript
Development Status
Active