3 Pilgrim LLC
Version 1.0 · February 5, 2026
Preface
Why This Paper Exists, in Human Terms
Modern machine learning talks constantly about “high-dimensional spaces.”
Most of the time, that phrase does not mean what it sounds like.
In mathematics and physics, a dimension is an independent direction of freedom. Adding one changes the structure of a space. It creates new volume, new paths, and new invariants. A higher-dimensional space is not just a more crowded version of a lower-dimensional one—it can do things the lower-dimensional space fundamentally cannot.
In much of machine learning, the same word is used differently. “High-dimensional” often means many features, wide embeddings, or large parameter counts. These additions increase size, but they usually do not introduce new independent directions. The underlying geometry stays the same. The space becomes denser, not broader.
For a long time, this distinction did not matter. Early models were small, and borrowing the term “dimension” as shorthand was convenient and mostly harmless. As models scaled into the billions and trillions of parameters—and as machine learning began to intersect seriously with geometry, topology, and physics—the shortcut hardened into a category error.
Today, the field routinely explains observed behaviors using “high-dimensional effects,” even when measurements show that effective dimensionality remains low. Flat minima, low-rank curvature, thin-shell concentration, distance collapse, and scaling plateaus are treated as mysterious consequences of vast dimensionality—despite arising precisely because true dimensional expansion has not occurred.
This paper exists to fix that mismatch.
It does not propose new algorithms.
It does not dispute existing empirical findings.
It does not claim that current models are “wrong.”
Instead, it performs a corrective act of language.
By restoring “dimension” to its invariant meaning—independent axes of freedom—and naming the other mechanism actually at work—vector aggregation within a fixed topology—the paper makes a long-standing paradox analytically tractable. Phenomena that appear contradictory under the current vocabulary become coherent once the mechanisms are separated.
The result is not a new theory of machine learning, but a clearer map. A way to see which kinds of scaling increase density, which kinds saturate, and which kinds might plausibly create new degrees of freedom in the future.
If the argument is correct, then many current limits are not optimization failures or empirical surprises. They are structural consequences of containment. Managing that containment can improve efficiency—but it cannot substitute for genuine expansion.
This paper is offered as a foundation: a small set of distinctions meant to align language, geometry, and mechanism as machine learning increasingly converges with the disciplines from which its mathematics was borrowed.
Semiotic Frustration in Machine Learning (v1.0)
1) Why This Paper Exists
This paper exists to fix a language failure that has quietly become a thinking failure.
In mathematics and physics, dimension has an invariant meaning: an independent axis of freedom that expands configuration space and enables new structural properties. In machine learning, the same word is routinely used to mean something else—feature count, embedding width, or parameter volume—even when those additions do not introduce independence or alter topology.
For years, this semantic drift was mostly harmless. As models scaled and machine learning began to intersect more directly with geometry, topology, and physics, the mismatch hardened into a category error. The field now routinely explains observed behaviors—flat minima, low intrinsic dimension, thin-shell concentration, scaling plateaus—as “high-dimensional effects,” even when those behaviors arise precisely because true dimensional expansion has not occurred.
This paper exists to resolve that paradox by repairing the taxonomy. It separates two mechanisms that are currently conflated under the single word “dimensionality,” restoring analytical clarity without disputing existing empirical results.
2) What the Paper Says (Plain Language)
Dimension means independence.
A true dimension is an independent direction of variation. Adding one changes the structure of a space, multiplies its volume, and enables new paths, separations, and invariants. More coordinates alone do not accomplish this.
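As a toy illustration of this point (our sketch, not from the paper; numpy assumed), matrix rank makes the distinction concrete: a genuinely new axis raises the rank, while an extra correlated coordinate does not:

```python
import numpy as np

# Three columns, three independent directions: each coordinate is a
# genuine axis of freedom, so the rank equals the column count.
independent = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])

# Also three columns, but the third is a linear mix of the first two:
# more coordinates, no new direction.
aggregated = np.array([[1.0, 0.0, 1.0],
                       [0.0, 1.0, 1.0],
                       [0.0, 0.0, 0.0]])

print(np.linalg.matrix_rank(independent))  # 3
print(np.linalg.matrix_rank(aggregated))   # 2
```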
Most ML “dimensionality” is aggregation, not expansion.
When models grow wider or larger, they typically add correlated coordinates inside a fixed topology. This increases density and redundancy, not independent degrees of freedom.
Name the real mechanism.
The paper introduces vector aggregation (also called vectoral containment) as the correct primitive for most contemporary ML growth. To make this precise, it defines the VNode (Vector-Node): a correlated, non-orthogonal vector element added without changing topology.
The paradox dissolves once the mechanisms are separated.
Low intrinsic dimension, low-rank Fisher information, flat minima, curvature collapse, and thin-shell sparsity are not contradictory “high-D effects.” They are coherent consequences of aggregation within a fixed space.
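A small numerical sketch of the thin-shell point (our own construction, numpy assumed): when D nominal coordinates are driven by only k independent sources, the shell radius is set by k, not D:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, D = 10_000, 10, 1000

# k independent latent axes, embedded into D coordinates through a
# length-preserving (orthonormal-column) mixing map.
latent = rng.standard_normal((n, k))
mixing, _ = np.linalg.qr(rng.standard_normal((D, k)))  # D x k, orthonormal columns
data = latent @ mixing.T                               # n x D

norms = np.linalg.norm(data, axis=1)
# The shell radius follows the k independent axes (sqrt(10) ≈ 3.2),
# not the D nominal coordinates (sqrt(1000) ≈ 31.6).
print(f"mean norm: {norms.mean():.2f}, std: {norms.std():.2f}")
```

The orthonormal embedding preserves lengths, so the 1000 coordinates inherit the geometry of the 10 underlying axes exactly.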
The contribution is semantic, not algorithmic.
The paper does not propose new models or benchmarks. It supplies a minimal, reductionist taxonomy that aligns language with geometry so existing observations can be interpreted correctly.
3) What Distinguishes This Framework
It restores an invariant definition.
Dimensionality is treated as an independence primitive, consistent with mathematics and physics, rather than a proxy for size or count.
It separates two distinct growth regimes.
True Dimensional Expansion (TDE): adding independent axes → exponential volume + new invariants
Vector Aggregation / Containment: adding correlated vectors → redundancy, degeneracy, low effective dimension
It unifies fragmented observations without new assumptions.
The framework explains why phenomena like flat minima, low-rank curvature, and scaling plateaus reliably co-occur—without invoking ad hoc explanations or new forces.
It is deliberately reductionist.
Two primitives—independence and aggregation—are sufficient. The paper aims to clarify, not to proliferate concepts.
4) Why This Distinction Matters
Scaling limits become structural, not mysterious.
If additional parameters mostly increase vector aggregation, then diminishing returns and early plateaus are expected outcomes, not anomalies or optimization failures.
“High-dimensional effects” are often misattributed.
Many behaviors blamed on vast dimensionality arise from containment in crowded, correlated spaces. The difficulty is not navigating exponential volume, but operating inside a dense, degenerate one.
Interdisciplinary synthesis stops stalling.
When ML intersects with geometry or physics, shared words currently denote different primitives. Fixing the taxonomy removes that friction and enables genuine synthesis.
Innovation paths become clearer.
Managing aggregation (pruning, sparsity, routing) improves efficiency but cannot substitute for true expansion. Genuine breakthroughs likely require new independent axes—temporal, causal, or modal—not further densification.
5) The Core Primitive: Vector Aggregation (VNode)
To make the distinction operational, the paper introduces VNode (Vector-Node) as a naming anchor—not a new object, but a precise label for what ML has been calling “dimensions.”
Definition (informal):
A VNode is a correlated, non-orthogonal vector element added within a fixed topology.
Key properties:
VNode count can grow arbitrarily while effective dimensionality remains low
Measurable signatures include low intrinsic dimension, low-rank Fisher information, flat minima, and thin-shell concentration
Adding VNodes increases density, not independence
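These signatures are measurable. One common proxy for effective dimensionality is the participation ratio of the covariance spectrum (our choice of estimator, not a definition from the paper; numpy assumed). In this sketch, 200 correlated coordinates driven by 5 independent sources report an effective dimension of at most 5:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, D = 5_000, 5, 200   # 5 independent sources behind 200 coordinates

latent = rng.standard_normal((n, k))
mixing = rng.standard_normal((k, D))
data = latent @ mixing    # 200 VNode-style coordinates, rank-5 geometry

# Participation ratio of the covariance eigenvalues:
# (sum λ)^2 / (sum λ^2), bounded above by the number of independent axes.
eigvals = np.linalg.eigvalsh(np.cov(data.T))
eigvals = np.clip(eigvals, 0.0, None)   # clip tiny negative numerical noise
pr = eigvals.sum() ** 2 / (eigvals ** 2).sum()
print(f"nominal coordinates: {D}, effective dimension: {pr:.1f}")
```

Growing D further adds VNodes but cannot push the participation ratio past k: the nominal count rises while the measured independence stays put.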
Contrast with TDE:
True dimensional expansion introduces new independent axes and changes what the space can support. VNode accumulation does not.
This naming repair allows existing literature to be reinterpreted cleanly: many references to “high dimensionality” are more accurately references to high VNode count.
6) Implications (Interpretive, Not Prescriptive)
Model analysis:
Evaluate effective independence (intrinsic dimension, Fisher rank) rather than nominal size when diagnosing capacity and redundancy.
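As one concrete diagnostic (a sketch under assumptions of ours: a logistic model evaluated at zero weights, numpy assumed), the rank of the empirical Fisher information exposes redundant coordinates that nominal width hides:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000
base = rng.standard_normal((n, 3))

# Nominal width 6, but three columns are mixes of the other three.
X = np.column_stack([base,
                     base[:, 0],
                     base[:, 1] + base[:, 2],
                     2.0 * base[:, 2]])

# Fisher information of a logistic model at w = 0, where every
# predicted probability is 0.5, reduces to p(1-p) * X^T X / n.
p = 0.5
fisher = (p * (1 - p)) * (X.T @ X) / n

print(X.shape[1], np.linalg.matrix_rank(fisher))  # nominal 6, effective 3
```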
Scaling expectations:
Anticipate earlier saturation when growth is dominated by aggregation. Plateaus signal containment, not failure.
Research direction:
Distinguish efforts that manage aggregation from those that plausibly introduce new degrees of freedom. The latter, not the former, correspond to true dimensional gains.
Interpretability and safety:
Understanding where independence actually lives in a model improves reasoning about failure modes, generalization, and out-of-distribution behavior.
7) Scope and Intent (Re-Emphasized)
This paper:
introduces no new algorithms
disputes no empirical findings
makes no performance claims
Its purpose is to repair a category error that has accumulated as machine learning scaled faster than its language. By separating containment from expansion, it renders a long-standing paradox tractable and reorients discussion toward structure rather than size.