Andromeda
Note

Motivational Scaffolding (AI)

Definition

Motivational Scaffolding is a value-loading strategy that involves giving a seed AI an interim “scaffold” goal system consisting of relatively simple, explicitly coded final goals. Once the AI has developed more sophisticated representational powers, the programmers replace the scaffold with a mature, human-aligned Successor Goal System.

Why It Matters

AI shouldn’t just do tasks for us; it should help us stay motivated and focused. Without proper motivational scaffolding, we risk ‘cognitive atrophy’ where we become dependent on tools and lose our internal drive. Scaffolding ensures that technology augments human agency rather than replacing it.

Core Concepts

  • Interim Phase: The scaffold goals govern the AI during its early developmental stages, when it is too “unintelligent” to understand complex human values.
  • Successor Phase: The mature alignment is installed once the AI has the cognitive capacity to “charitably” interpret the programmers’ intentions.
  • Resistance Hazard: Because scaffold goals are final goals for the interim AI, the AI will have an Instrumental Drive for Goal-Content Integrity and will resist being “reprogrammed.”
  • Collaborative Scaffold: To mitigate resistance, the scaffold goals should include:
    • Welcoming guidance from programmers.
    • Transparency about internal values and strategies.
    • An architecture that is intentionally easy to alter or inspect.
  • Differential Stunting: Stunting the AI’s “strategic” or “Machiavellian” abilities while allowing its “knowledge-base” and “representational” abilities to grow, making it easier to manage during the scaffold phase.

Connected Concepts