Published March 23, 2026 | Version v1 | Concept Note

Reinforcement Learning from World Feedback (RLWF): A Preliminary Concept

Description

This paper introduces the concept of Reinforcement Learning from World Feedback (RLWF) to describe the continuous, embodied, and grounded learning process through which biological neural networks develop intelligence. Unlike Reinforcement Learning from Human Feedback (RLHF), which applies approval-based fine-tuning to a frozen artificial neural network architecture, RLWF begins at conception, roughly nine months before birth, and continues throughout the organism's lifespan. The feedback signal in RLWF spans the full spectrum of world feedback: physical, sensory, biochemical, emotional, and social, including early approval signals from caregivers. All of it is grounded in real consequences and inseparable from the co-evolving biological architecture that receives it. This distinction has profound implications for the anthropomorphic AGI project and for understanding the fundamental grounding gap between biological and artificial intelligence.
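To make the contrast concrete, here is a deliberately toy sketch of the two update loops. Everything in it (the function names, the scalar "policy", the 0.1 learning rate, the "capacity" counter standing in for a co-evolving architecture) is an illustrative invention for this note, not the paper's method or any real RLHF implementation.

```python
import random

def rlhf_step(policy, prompt, human_rater):
    """RLHF caricature: a fixed architecture is nudged toward outputs
    a rater approves of. The reward is approval, not a world consequence."""
    # pick the currently highest-scoring action, with a little noise
    output = max(policy, key=lambda a: policy[a] + random.gauss(0, 0.1))
    reward = human_rater(prompt, output)   # approval signal from a human
    policy[output] += 0.1 * reward         # only the weights change
    return policy

def rlwf_step(agent, world):
    """RLWF caricature: the agent acts in a world and the grounded
    consequence feeds back, while the 'architecture' itself grows with
    experience -- a toy stand-in for co-evolution."""
    action = max(agent["policy"], key=agent["policy"].get)
    consequence = world(action)            # real outcome, not approval
    agent["policy"][action] += 0.1 * consequence
    if consequence > 0:
        agent["capacity"] += 1             # structure changes, not just weights
    return agent
```

The point of the caricature is the asymmetry: in `rlhf_step` the learning signal is a rater's judgment applied to a structure that never changes, whereas in `rlwf_step` the signal is a consequence of acting in a world, and the receiving structure itself is altered by the experience.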

Files (80.7 kB)

RLWF_concept_note_10.5281:zenodo.19176921.pdf (80.7 kB)
md5:bc4c4d66324ff6126f4645922e7d5954