Published October 8, 2025 | Version v54 | Preprint | Open Access
Isotropic Deep Learning: You Should Consider Your (Foundational) Biases
Description
This work was originally written to the requirements of the 2025 NeurIPS Position Paper track, which called for bold meta-level arguments about what the field is doing right and wrong. The paper's rhetoric therefore presents a strong position, and its title reflects this.
Abstract:
This position paper explores an alternative mathematical formulation, 'Isotropic Deep Learning', by analysing the implications of the functional forms currently used in deep learning. Modern networks almost universally rely on foundational forms that respect a discrete permutation symmetry. This is an underappreciated design choice, argued here to introduce unrecognised biases, and one made without suitable alternatives to compare against. The paper first promotes this discrete symmetry to a continuous, rotation-defined framework, then broadens it to primitive sets defined by various other symmetries. This constitutes a new symmetry-led design axis: rather than enforcing symmetry through model design, which transfers symmetry into the structure, it studies how the symmetries of foundational forms inherently act on, and interact within, general architectures. One objective is a systematic account of the consequences of network symmetry breaking, in addition to symmetry making and symmetry emerging from the primitive level; another is to determine whether non-trivial expressibility is contingent on which function symmetries are preserved versus broken. The goal is to expose and leverage unintended biases by deducing principles applicable in broader contexts for beneficial computation. Proposed is a systematic reformulation of all foundational primitives into classes that respect particular groups, and a determination of the resulting implications. This constitutes an inverted ontology, in which general symmetries are situated definitionally prior to neurons, rather than a permutation symmetry being deduced from them. This design axis motivates reselection of compositions upwards, since primitives underpin current constructions and alternative foundations may enable new models. Hence, the paper advocates a distinctly bottom-up reformulation, aiming to deduce general principles for broad leverage.
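To make the symmetry distinction concrete, the following is a minimal sketch (not drawn from the paper itself; the radial activation here is a hypothetical illustration) contrasting a standard elementwise activation, which commutes with coordinate permutations, with an isotropic one that acts only on a vector's norm and therefore commutes with arbitrary rotations:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Standard elementwise activation: commutes with any permutation P of
    # the coordinates, i.e. relu(P @ x) == P @ relu(x).
    return np.maximum(x, 0.0)

def radial_activation(x, eps=1e-8):
    # Hypothetical isotropic activation: rescales the vector's norm while
    # leaving its direction unchanged, so for any rotation Q it satisfies
    # radial_activation(Q @ x) == Q @ radial_activation(x).
    r = np.linalg.norm(x)
    return x * (np.tanh(r) / (r + eps))

x = rng.normal(size=5)

# Permutation equivariance of the elementwise form.
P = np.eye(5)[rng.permutation(5)]
assert np.allclose(relu(P @ x), P @ relu(x))

# A random orthogonal matrix (via QR decomposition of a Gaussian matrix).
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))

# The elementwise form is NOT rotation-equivariant in general...
print(np.allclose(relu(Q @ x), Q @ relu(x)))  # almost surely False

# ...whereas the radial form is.
assert np.allclose(radial_activation(Q @ x), Q @ radial_activation(x))
```

Under this reading, 'promoting' the symmetry means replacing the finite permutation group with the continuous rotation group as the invariance a primitive is required to respect.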
This is motivated by prior work demonstrating that current functional forms influence activation distributions: discrete symmetries in functions induce similarly discrete structure in embedded representations through training. Geometric artefacts can therefore arise in learned representations solely from human-imposed design choices rather than task-driven necessity, showing that the prevailing choice carries unappreciated and unintended task-agnostic biases. Moreover, there appears to be no compelling a priori justification for why such representations or functional forms are universally desirable; this paper hypothesises three testable pathologies of the current formulation, with significant connections to mechanistic interpretability. This motivates the construction and analysis of alternative foundational primitives, aiming to alter the geometric constraints on representations and improve performance. The underlying inductive biases of the isotropic approach may constitute a preferable default, which could be adopted if a wide array of suitable, well-performing functions is developed. A variety of preliminary functions are proposed, including new activation functions, normalisers, and operations, and an audit is provided across the various primitives in use. The symmetry-principled construction is then generalised, enabling a broad class of group-defined reformulations across primitives and positing a new foundational design axis with distinct inductive biases. Isotropic Deep Learning thus becomes just one case study among such parallel implementations for all models.
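As one plausible reading of what an isotropic normaliser might look like (a hedged sketch; the paper's actual proposals may differ), note that standard LayerNorm's mean subtraction and per-coordinate affine parameters single out the coordinate axes, whereas a norm-only rescaling does not:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Standard LayerNorm: mean/variance over coordinates plus a per-coordinate
    # affine map. Both the mean subtraction and the elementwise (gamma, beta)
    # parameters pick out the coordinate axes, breaking rotation symmetry.
    mu, var = x.mean(), x.var()
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def isotropic_norm(x, eps=1e-8):
    # Hypothetical isotropic normaliser: rescale onto the radius-sqrt(n)
    # hypersphere. Rotations preserve the norm, so
    # isotropic_norm(Q @ x) == Q @ isotropic_norm(x) for orthogonal Q.
    return x * (np.sqrt(x.size) / (np.linalg.norm(x) + eps))

rng = np.random.default_rng(1)
x = rng.normal(size=8)
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
assert np.allclose(isotropic_norm(Q @ x), Q @ isotropic_norm(x))
```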
This initial group-theoretic generalisation of primitives is then systematically extended upwards to encompass their hierarchical compositions, motivating its applicability across all scales of an architecture. This yields an initial three generations of symmetry strength for categorisation within the framework. The extension recovers Geometric Deep Learning as the strongest generation: composing functions so that the whole model complies with a symmetry constraint derived from the data, for specialist applications. That substantially contrasts with this paper's enquiry, which diverges through a bottom-up philosophy starting from primitives rather than working recursively top-down from model constraints, and it establishes a further role for symmetry within deep learning. The taxonomic formalism also encompasses the Parameter Symmetry approach as a distinct compositional case, which studies the consequences of computational equivalences under reparameterisations deduced from the current permutation-like primitives. In contrast, this work redefines the primitives themselves through a symmetry-led design axis and investigates the ramifications more broadly, not restricted to discrete parameter degeneracies. Hence, this 'Taxonomic Deep Learning' approach reveals all three to be distinct special cases, characteristic of different compositional scales and strengths: a unification of contemporary approaches to symmetry in an intuitive, hierarchical, and complementary formalism. This may facilitate better, more comprehensive comparison, exploration of their interplay, and clarification of further regimes that remain to be considered. Encouraged is a systematic audit of the influence of symmetry generally, and particularly the reformulation and comparison of various group-defined primitive sets. Once a primitive algebra is fixed, the study of downstream phenomena can proceed, spanning representation biases, the reassessment of theorems contingent on the prior primitives, optimisation, performance, and diverse new model architectures.
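The hierarchy described above can be summarised by the standard equivariance condition; the following LaTeX fragment (an interpretive summary, not an equation taken from the paper) records it together with the group inclusion underlying the promotion from permutation to rotation symmetry:

```latex
% G-equivariance of a primitive f under a group G acting on its input space:
\[
  f(g \cdot x) \;=\; g \cdot f(x), \qquad \forall\, g \in G .
\]
% The permutation matrices form a finite subgroup of the orthogonal group,
\[
  S_n \;\hookrightarrow\; O(n),
\]
% so requiring rotation (orthogonal) equivariance of a primitive imposes a
% strictly stronger constraint than the permutation equivariance satisfied by
% elementwise forms; composing such constraints up to the model scale would
% correspond to the Geometric Deep Learning regime as the strongest generation.
```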
Zenodo will be used to maintain a continuously updated copy of this work as it evolves. Please ensure that any shared link points to this general page rather than to a specific version of the paper.
Changelog: Changed abstract and aspects of introduction.
This is now considered the finalised version of the paper's content. Formatting or grammatical errors may still be corrected, but the content will remain stable moving forward.
Files
IsotropicDeepLearning_PositionPaper.pdf (2.6 MB)
md5:796e9e25d1f5e001e585eb2fa07920e4
Additional details
Dates
- Submitted: 2025-05-20 (submitted to Zenodo)