There is a newer version of the record available.

Published February 6, 2026 | Version 1.0.0
Dataset Open

Instrumetriq — Crypto Market Activity & Sentiment Context Dataset (Weekly Samples)

Authors/Creators

  • Instrumetriq

Description

This dataset provides time-aligned observational snapshots of crypto market activity and social sentiment across 270+ crypto assets, designed to contextualize market structure, liquidity, and attention dynamics rather than produce forecasts or signals.

The archive contains weekly Sunday samples drawn from Instrumetriq's continuous monitoring pipeline.

Data Collection

Spot market data sourced from Binance

  • Mid prices, bid–ask spreads, liquidity percentiles
  • Aggregated per observation window
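For reference, mid price and bid–ask spread are conventionally derived from the best bid and best ask. The sketch below uses those standard definitions. The function name and the basis-point convention are assumptions for illustration only; the schema documentation in the archive remains the authority on how the dataset's columns are actually computed.

```python
def mid_and_spread_bps(bid: float, ask: float) -> tuple[float, float]:
    """Return (mid price, quoted spread in basis points) for one quote."""
    if bid <= 0 or ask < bid:
        raise ValueError("expected 0 < bid <= ask")
    mid = (bid + ask) / 2.0                    # midpoint of best bid and best ask
    spread_bps = (ask - bid) / mid * 10_000    # spread relative to mid, in bps
    return mid, spread_bps


# Illustrative quote, not taken from the dataset
mid, spread = mid_and_spread_bps(bid=99.95, ask=100.05)
```

Expressing the spread relative to the mid makes values comparable across assets with very different price levels.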

Social sentiment data sourced from X (Twitter)

  • Posts are continuously collected and classified using a hybrid transformer-based sentiment system
  • Sentiment is exposed only in aggregated form (counts and averages)
  • Each asset is monitored in ~2-hour observation cycles, producing one row per asset per session
  • Approximately 2,500 observations per day, regardless of tier

Archive Contents

This archive contains:

  • 7 weekly Sunday samples (2025-12-21 through 2026-02-01)
  • Three dataset tiers per week, sharing the same observations but differing in schema depth
  • Apache Parquet format with Snappy compression
  • Schema documentation for all tiers
  • Methodology overview
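A minimal sketch of how the Parquet files might be located and loaded after extracting the archive. The archive's internal directory layout and file names are not described here, so the code makes no assumption about them and simply searches for `*.parquet` files. The helper names are hypothetical, and `load_parquet` assumes pandas with a Parquet engine such as pyarrow is installed (Snappy-compressed files are decompressed transparently by the engine).

```python
from pathlib import Path


def find_parquet_files(archive_root: str) -> list[Path]:
    """Return every .parquet file under the extracted archive, sorted by path."""
    return sorted(Path(archive_root).rglob("*.parquet"))


def load_parquet(path: Path, columns=None):
    """Read one file into a DataFrame, optionally restricted to some columns."""
    import pandas as pd  # imported lazily; requires pandas plus pyarrow
    return pd.read_parquet(path, columns=columns)
```

Calling `find_parquet_files` on the directory where the archive was extracted lists the available files. Passing `columns=[...]` to `load_parquet` can keep memory use down for the deeper tiers; consult the bundled schema documentation for the actual column names.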

Dataset Tiers

All tiers contain the same number of observations. They differ only in column structure and depth.

Tier 1 — Explorer

  • 19 flat columns
  • Aggregated sentiment counts and averages
  • Spot prices, spreads, liquidity, and quality scores
  • Designed for lightweight inspection, dashboards, and general analysis

Tier 2 — Analyst

  • Extends Tier 1 with nested columns
  • Detailed sentiment aggregates, author statistics, and engagement metrics
  • Designed for deeper behavioral and cross-sectional analysis

Tier 3 — Researcher

  • Extends Tier 2 with nested futures and microstructure data
  • Includes 700+ spot price samples per observation window (10-second resolution)
  • Multi-window sentiment, diagnostics, and futures positioning data
  • Designed for research, validation, and archival analysis

Note: High-frequency (10-second) spot price samples are available only in Tier 3.
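A quick arithmetic check relates the Tier 3 sample count to the length of an observation window. The sketch takes 700 samples as the lower bound implied by "700+" and assumes evenly spaced 10-second samples; both are simplifications.

```python
SAMPLE_INTERVAL_S = 10   # stated sampling resolution, in seconds
MIN_SAMPLES = 700        # lower bound implied by "700+"

covered_seconds = SAMPLE_INTERVAL_S * MIN_SAMPLES
covered_hours = covered_seconds / 3600
print(f"{covered_seconds} s = {covered_hours:.2f} h")  # prints "7000 s = 1.94 h"
```

A span of about 1.94 hours is consistent with the ~2-hour observation cycle described for each asset.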

Intended Use

This dataset is intended for:

  • Market structure research
  • Behavioral and sentiment analysis
  • Liquidity and execution context studies
  • Exploratory and descriptive analytics

Limitations & Ethics

  • Observational data only
  • No trading advice, predictions, or signal generation
  • No individual social media posts or personal data are included
  • All sentiment data is aggregated and anonymized

Access

Full access via subscription at instrumetriq.com/access. Interactive demo: Open in Colab.

Notes

Sample data is provided for evaluation and research purposes. Full daily dataset access requires a subscription. See LICENSE.md in the archive for terms.

Methods

Market data is sourced from the Binance spot market via the public REST API.  
Spot prices, bid–ask spreads, and liquidity-related metrics are sampled internally at high frequency and aggregated into fixed observation windows.
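To illustrate what aggregating high-frequency samples into a fixed window can look like, the sketch below reduces a list of mid-price samples to a few summary statistics. The choice of statistics and the `summarize_window` name are assumptions made for illustration, not a description of the pipeline's actual aggregation.

```python
from statistics import fmean


def summarize_window(mid_prices: list[float]) -> dict:
    """Collapse one window of mid-price samples into summary statistics."""
    if not mid_prices:
        raise ValueError("window contains no samples")
    return {
        "n_samples": len(mid_prices),
        "first": mid_prices[0],
        "last": mid_prices[-1],
        "mean": fmean(mid_prices),
        "low": min(mid_prices),
        "high": max(mid_prices),
    }


# Illustrative values, not taken from the dataset
summary = summarize_window([100.0, 101.0, 99.0, 102.0])
```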

Social sentiment data is sourced from publicly available X (Twitter) posts.  
Posts are collected continuously and classified using a hybrid transformer-based sentiment system. Sentiment outputs are aggregated into per-window counts and summary statistics.
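The aggregation step can be pictured as follows. This is a hedged sketch only: the label names, the numeric score, and the `aggregate_sentiment` helper are hypothetical and are not taken from the dataset's schema or classifier.

```python
from collections import Counter
from statistics import fmean


def aggregate_sentiment(posts: list[tuple[str, float]]) -> dict:
    """Aggregate (label, score) pairs for one window into counts and a mean.

    Both the labels and the scores are hypothetical stand-ins for whatever
    the classifier emits; only aggregates leave this function.
    """
    counts = Counter(label for label, _ in posts)
    scores = [score for _, score in posts]
    return {
        "n_posts": len(posts),
        "counts": dict(counts),
        "mean_score": fmean(scores) if scores else None,
    }


# Illustrative input, not taken from the dataset
result = aggregate_sentiment([("positive", 0.5), ("negative", -0.5), ("positive", 1.0)])
```

Because only counts and averages are returned, no individual post text or author identifier survives the aggregation, which matches the aggregated-only exposure described above.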

Each tracked asset is monitored in rolling observation cycles of approximately **2 hours**, producing one observation per asset per cycle.  
All tiers share the same observation timing and coverage.

High-frequency spot price samples (10-second resolution) are retained **only in the highest dataset tier**.  
Lower tiers expose aggregated spot and sentiment statistics only.

No raw social media content, user identifiers, or personally identifiable information is included.  
The dataset is strictly observational and descriptive in nature.

Files

Files (76.2 MB)

  • Checksum: md5:fd75e86b18ea23de4caab253427d86fc
  • Size: 76.2 MB
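The listed MD5 checksum can be used to confirm that a download is intact. A minimal sketch using Python's standard `hashlib` follows; the helper name is hypothetical, and the local file path is whatever name the archive was saved under.

```python
import hashlib

EXPECTED_MD5 = "fd75e86b18ea23de4caab253427d86fc"  # from the file listing above


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `md5_of_file(<downloaded file>)` with `EXPECTED_MD5` should yield `True` for an uncorrupted download. MD5 is suitable here as an integrity check, not as a security guarantee.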

Additional details

Related works

Is documented by
Other: https://instrumetriq.com/dataset (URL)
Is supplemented by
Dataset: https://github.com/SiCkGFX/instrumetriq-public (URL)

Dates

Collected
2025-12-21/2026-02-01
Data collection period covered by the samples in this archive
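Read as weekly samples, the collection interval above can be enumerated with the standard library. The sketch assumes one sample every seven days starting on the first date; the `weekly_dates` helper is hypothetical.

```python
from datetime import date, timedelta


def weekly_dates(start: date, end: date) -> list[date]:
    """Return dates from start to end (inclusive) in 7-day steps."""
    days = []
    current = start
    while current <= end:
        days.append(current)
        current += timedelta(weeks=1)
    return days


samples = weekly_dates(date(2025, 12, 21), date(2026, 2, 1))
```

This yields seven dates, all Sundays, matching the seven weekly samples listed under Archive Contents.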