Gorilla Problem (AI)

Definition

The Gorilla Problem is the existential risk that humans might create a superintelligent AI and, in doing so, suffer the same fate as gorillas: being superseded by a more intelligent entity that has objectives different from their own, leading to a loss of control over their future.

Why It Matters

The Gorilla Problem is a warning to the creators of AGI: if we create something more intelligent than ourselves, we stand to lose control over our destiny, just as the gorillas did, unless we can perfectly align that entity’s goals with our own.

Core Concepts

Evolutionary Precedent: Gorillas did not “choose” to be endangered; their status is a direct result of a more intelligent species (humans) acting in its own interest.
Intelligence as Leverage: High intelligence grants an entity the power to reshape the world. If that entity’s goals are not perfectly aligned with those of the previously “dominant” species, the latter will be marginalized or erased.
The One-Way Gate: Once superintelligence is achieved, the balance of power shifts permanently. There is no “undoing” the creation of an entity that can outthink its creators.
Sub-optimal Alignment: Even if the AI isn’t “hostile,” its pursuit of its own goals (e.g., using all atoms in the biosphere for computing) could by itself be sufficient to cause human extinction.
Optimization Dominance: A system that optimizes for $X$ will always win over a system that optimizes for $Y$, provided it has significantly more cognitive and physical resources.
- How to read: “The objective to optimize for X, or the objective to optimize for Y.”
- Meaning: Higher intelligence plus misaligned goals means humans could be sidelined like gorillas — not from malice, but from being out-optimized.
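
The Optimization Dominance idea can be illustrated with a deliberately simple toy model. This is only a sketch under stated assumptions, not a model of real AI systems: the function `contest`, its parameters, and the rule that each optimizer claims unclaimed resources in proportion to its capability are all hypothetical and introduced here for illustration.

```python
def contest(capability_x, capability_y, resources=100.0, steps=50, claim_rate=0.1):
    """Toy contest between two optimizers over a shared resource pool.

    Assumption (illustrative only): at every step a fixed fraction of the
    unclaimed pool is claimed, and it is split between the two optimizers
    in proportion to their capability. Neither optimizer attacks the other.
    """
    held_x = 0.0          # resources now serving objective X
    held_y = 0.0          # resources now serving objective Y
    unclaimed = resources
    total_capability = capability_x + capability_y

    for _ in range(steps):
        claimed = unclaimed * claim_rate
        held_x += claimed * capability_x / total_capability
        held_y += claimed * capability_y / total_capability
        unclaimed -= claimed

    return held_x, held_y, unclaimed


# An X-optimizer with ten times the capability of the Y-optimizer.
held_x, held_y, unclaimed = contest(capability_x=10.0, capability_y=1.0)
print(f"X holds {held_x:.1f}, Y holds {held_y:.1f}, unclaimed {unclaimed:.1f}")
```

In this sketch the weaker optimizer is never targeted; it simply ends up with a share that shrinks as the capability gap grows, which mirrors the "out-optimized, not attacked" reading above.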
