Measuring Cloud API Usage in a Local-First Hermes Agent Deployment: A Log-Based Comparison of Remote Inference vs. On-Premises ds4

Tanaka, Hideki

doi:10.5281/zenodo.20914810

Published June 26, 2026 | Version v1

Preprint | Open access

Tanaka, Hideki¹

1. Elvez, Inc.

Organizations that run Hermes Agent with a local inference backend still need to know whether auxiliary tasks silently route conversation data to cloud LLM providers. We quantify that exposure on a production deployment in which main chat inference uses a local ds4-server endpoint (DeepSeek V4 Flash), while several Hermes auxiliary slots were configured for a commercial remote API. Parsing the retained Hermes agent logs (2026-04-12 to 2026-06-26), we find 227 logged remote API calls totaling approximately 18 million input tokens and 0.17 million output tokens. All successful remote chat completions in this window were tagged platform=curator (background skill maintenance). Context compression was configured to prefer the remote provider (32 route attempts, 32 compression events). After the main model was bound locally on 2026-06-17, successful remote API calls dropped to 0, while local ds4 calls reached 694. We conclude that cloud APIs were not on the critical path of day-to-day operations and that on-premises ds4-server with DeepSeek V4 Flash is viable for production use. We also outline configuration hygiene and a future local-only path for analyzing customer-confidential documents.
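The kind of log tally described above can be sketched as follows. This is a minimal illustration, not the procedure used in the preprint: the key=value line layout and the field names `backend`, `input_tokens`, and `output_tokens` are hypothetical, and real Hermes agent logs may be structured differently. Only the `platform=curator` tag and the 2026-06-17 cutover date come from the abstract.

```python
import re
from collections import Counter

# Hypothetical key=value log layout; the real Hermes log format may differ.
FIELD = re.compile(r"(\w+)=(\S+)")


def tally(lines, cutover="2026-06-17"):
    """Count calls and tokens per (period, backend) pair."""
    calls = Counter()
    tokens_in = Counter()
    tokens_out = Counter()
    for line in lines:
        fields = dict(FIELD.findall(line))
        if "backend" not in fields:
            continue  # not an inference-call record
        # Assumes each line starts with an ISO date, so string comparison
        # orders dates correctly. The cutover day itself counts as "after".
        period = "before" if line[:10] < cutover else "after"
        key = (period, fields["backend"])
        calls[key] += 1
        tokens_in[key] += int(fields.get("input_tokens", 0))
        tokens_out[key] += int(fields.get("output_tokens", 0))
    return calls, tokens_in, tokens_out


# Made-up sample lines for illustration only; not data from the deployment.
sample = [
    "2026-06-10T08:00:00 backend=remote platform=curator input_tokens=1200 output_tokens=40",
    "2026-06-18T09:30:00 backend=local platform=chat input_tokens=300 output_tokens=120",
    "2026-06-18T09:31:00 backend=local platform=chat input_tokens=500 output_tokens=80",
    "2026-06-18T09:32:00 INFO scheduler tick",
]
calls, tokens_in, tokens_out = tally(sample)
print(calls[("after", "local")])
```

Splitting the counts at the cutover date is what allows a before/after comparison of remote versus local calls; a real analysis would also need to separate successful calls from failed route attempts, which this sketch does not do.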

Files

Name	Size
cloud-api-traffic_20260626.pdf md5:3ceaba43eba0a1e39ba0d77fecad9d54	80.1 kB

Additional details

Is supplement to: Preprint: 10.5281/zenodo.20519019 (DOI)

	All versions	This version
Views	9	9
Downloads	1	1
Data volume	160.2 kB	160.2 kB
