Chocolate Cloud Object Storage Transfer Speeds Dataset
Description
Overview
This dataset measures upload and download performance between Fly.io gateway regions (origins) and commercial object storage backends (targets). Each row is one measurement for a specific data size, initiated from a Fly.io region and recorded against a particular backend, and is intended for studying network performance, latency-sensitive placement, and cross-region transfer behavior.
Some records use 1-byte uploads/downloads to approximate latency by activating the target service's data path with a minimal payload. For each timestamp, measurements cover the standard sizes (1 byte, 1 MB, 10 MB, 50 MB) plus a few random sizes up to 50 MB. The dataset includes ~900,000 measurements spanning 86 days between 2024-10-31 and 2025-01-24, with a pause from 2024-11-18 to 2024-12-18. Each measurement is uniquely identified by (timestamp, origin_fly_region, target_backend_id, size_bytes).
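The composite key and the 1-byte latency probes described above can be checked with a few lines of standard-library Python. This is a minimal sketch: the helper names `duplicate_keys` and `latency_probes` are hypothetical, and the sample rows use made-up values rather than real measurements.

```python
from collections import Counter

# Columns that, per the description, uniquely identify a measurement.
KEY_COLUMNS = ("timestamp", "origin_fly_region", "target_backend_id", "size_bytes")

def duplicate_keys(rows):
    """Return composite keys that occur more than once in an iterable of dict rows."""
    counts = Counter(tuple(row[col] for col in KEY_COLUMNS) for row in rows)
    return [key for key, n in counts.items() if n > 1]

def latency_probes(rows):
    """Keep only the 1-byte measurements, which approximate latency."""
    return [row for row in rows if int(row["size_bytes"]) == 1]

# Illustrative rows with made-up values (not taken from the dataset).
sample_rows = [
    {"timestamp": "2024-11-01T00:00:00+00:00", "origin_fly_region": "lhr",
     "target_backend_id": "1", "size_bytes": "1", "upload_time_ms": "40"},
    {"timestamp": "2024-11-01T00:00:00+00:00", "origin_fly_region": "lhr",
     "target_backend_id": "1", "size_bytes": "1000000", "upload_time_ms": "400"},
]

print(duplicate_keys(sample_rows))       # an empty list means every key is unique
print(len(latency_probes(sample_rows)))  # number of 1-byte rows
```

Values are kept as strings here because that is how `csv.DictReader` would deliver them; the key comparison works on the raw strings, while `size_bytes` is converted only where a numeric comparison is needed.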
CSV Columns
- timestamp: UTC datetime string for the measurement (timezone-aware, ISO 8601).
- origin_fly_region: Fly.io gateway region code (3-letter).
- origin_countrycode: ISO 3166-1 alpha-2 country code (lowercase) for the Fly.io gateway.
- origin_city: City of the Fly.io gateway.
- origin_lat: Latitude of the Fly.io gateway.
- origin_lng: Longitude of the Fly.io gateway.
- target_backend_id: Internal storage backend ID.
- target_provider: Cloud provider name.
- target_region: Cloud provider region.
- target_countrycode: ISO 3166-1 alpha-2 country code (lowercase) for the backend location.
- target_city: City of the storage backend.
- target_timezone: Time zone name for the backend.
- target_lat: Latitude of the storage backend.
- target_lng: Longitude of the storage backend.
- target_local_time: Local time at the target backend for the same instant as timestamp.
- distance_km: Great-circle distance between origin and target, in kilometers (rounded int).
- size_bytes: Data size in bytes for the measurement.
- upload_time_ms: Upload time in milliseconds.
- download_time_ms: Download time in milliseconds.
- upload_speed_mbps: Upload speed in megabits per second (2 decimal places).
- download_speed_mbps: Download speed in megabits per second (2 decimal places).
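The derived columns can plausibly be reproduced from the raw ones. The sketch below shows one way to do so, under two assumptions that the description does not confirm: that distance_km follows the haversine formula with a mean Earth radius of 6371.0088 km, and that 1 megabit equals 10^6 bits. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius; the dataset's value is not stated

def great_circle_km(lat1, lng1, lat2, lng2):
    """Haversine great-circle distance in kilometers between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def speed_mbps(size_bytes, time_ms):
    """Throughput in megabits per second, assuming 1 megabit = 10**6 bits."""
    bits = size_bytes * 8
    seconds = time_ms / 1000
    return round(bits / seconds / 1_000_000, 2)

# Rounded to an int, as the distance_km column is described.
print(round(great_circle_km(51.5074, -0.1278, 48.8566, 2.3522)))  # London to Paris
print(speed_mbps(1_000_000, 400))  # 1 MB transferred in 400 ms
```

If the recomputed values differ slightly from the published columns, the likely causes are a different Earth radius, a different definition of a megabit, or rounding applied at a different step.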
Intended Use Examples
- Compare upload/download performance across cloud providers and regions for a fixed data size.
- Identify nearest or best-performing storage backends for a given Fly.io region.
- Analyze how geographic distance correlates with throughput.
- Build placement or replication strategies based on observed network performance.
- Use as input for predictive models of transfer time or throughput.
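The first three use cases above can be sketched with the standard library alone. The example below is illustrative only: the helper names, the provider names, and all row values are made up, and the median is just one reasonable summary statistic among several.

```python
from collections import defaultdict
from statistics import median

def median_speed_by(rows, group_col, size_bytes, speed_col="download_speed_mbps"):
    """Median speed per group (e.g. provider or backend) for one fixed data size."""
    groups = defaultdict(list)
    for row in rows:
        if int(row["size_bytes"]) == size_bytes:
            groups[row[group_col]].append(float(row[speed_col]))
    return {group: median(speeds) for group, speeds in groups.items()}

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Illustrative rows with made-up values and hypothetical provider names.
sample_rows = [
    {"target_provider": "provider_a", "size_bytes": "10000000",
     "distance_km": "300", "download_speed_mbps": "100.00"},
    {"target_provider": "provider_a", "size_bytes": "10000000",
     "distance_km": "500", "download_speed_mbps": "120.00"},
    {"target_provider": "provider_b", "size_bytes": "10000000",
     "distance_km": "6000", "download_speed_mbps": "40.00"},
]

by_provider = median_speed_by(sample_rows, "target_provider", 10_000_000)
best_provider = max(by_provider, key=by_provider.get)
distances = [float(r["distance_km"]) for r in sample_rows]
speeds = [float(r["download_speed_mbps"]) for r in sample_rows]

print(by_provider)
print(best_provider)
print(pearson(distances, speeds))  # a negative value means speed falls with distance
```

Passing `"target_backend_id"` as `group_col` and pre-filtering the rows to one `origin_fly_region` gives a per-origin ranking of backends instead of a per-provider comparison.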
Notes
- Rows are sorted by timestamp ascending.
- City names may contain commas and are properly quoted in the CSV.
- There are no missing values.
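Because city names can contain commas, the file should be read with a real CSV parser rather than split on commas. The sketch below uses Python's `csv` module on a small inline sample and checks the three properties noted above. The sample rows are made up, only a subset of columns is shown, and the `+00:00` timestamp form is an assumption about how the ISO 8601 strings are written.

```python
import csv
import io
from datetime import datetime

# Made-up sample with a quoted, comma-containing city name (subset of columns).
sample_csv = '''timestamp,origin_fly_region,origin_city,size_bytes
2024-11-01T00:00:00+00:00,iad,"Ashburn, Virginia (US)",1
2024-11-01T00:05:00+00:00,lhr,London,1
'''

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Timezone-aware parsing; assumes an explicit UTC offset such as +00:00.
timestamps = [datetime.fromisoformat(row["timestamp"]) for row in rows]

is_sorted = all(a <= b for a, b in zip(timestamps, timestamps[1:]))
has_missing = any(value == "" for row in rows for value in row.values())

print(rows[0]["origin_city"])  # the quoted field is kept as one value
print(is_sorted, has_missing)
```

For the real file, replace `io.StringIO(sample_csv)` with `open(path, newline="", encoding="utf-8")`, where `path` points at the downloaded CSV.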
Related ML Models
Models trained on this dataset are published at:
https://zenodo.org/records/18288840
These models predict the transfer time for a specific route from a Fly.io region to a storage backend at a given time and data size. A separate model is provided for each of six backends, with Fly.io London (lhr) as the origin region.
The target_backend_id column holds the internal unique ID of a commercial cloud storage provider's region, and it is consistent with the backend identifiers used in the published models.
Files
_SAMPLE_chocolate_cloud_object_storage_transfer_speeds copy.csv
Additional details
Related works
- Is referenced by
- Journal article: 10.1109/TCC.2023.3287653 (DOI)