Browser Automation Deep Dive

The Use Case That Started It All

This is the browser automation extension that led to the discovery of the Fetch framework.

See the Origin Story for the full context.

The Problem

Building an AI browser automation agent that needs to:

Parse DOM structure
Detect user commands
Remember context across page navigations
Decide when to execute actions
Avoid infinite loops

How Fetch Solved It

The Agent Loop

1. SENSE (3D)
   - Chirp: Detect command urgency
   - Perch: Parse DOM structure
   - Wake: Remember previous actions

2. MEASURE (DRIFT)
   - Current state: Where are we?
   - Goal state: Where should we be?
   - Gap: DRIFT = Goal - Current

3. ACT (Fetch)
   - Calculate: Fetch = Chirp × |DRIFT| × Confidence
   - Decide: Execute/Confirm/Queue/Wait
   - Execute if threshold met

4. LOOP
   - Re-measure DRIFT
   - Continue until DRIFT ≈ 0
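
The loop above reduces to one scoring function and one decision function. The following is a minimal sketch, not the extension's actual code: the names `computeFetchScore` and `decide` are hypothetical, the confidence formula (the lower of Perch and Wake, divided by 100) and the thresholds (1000, 500, 0.6) are assumed from the agent code shown under Real Agent Code, and the tier labels are illustrative.

```javascript
// Hypothetical helper: Fetch = Chirp × |DRIFT| × Confidence,
// with Confidence assumed to be min(Perch, Wake) / 100.
function computeFetchScore({ chirp, perch, wake, goal, current }) {
  const drift = goal - current;                    // DRIFT = Goal - Current
  const confidence = Math.min(perch, wake) / 100;  // weakest dimension gates
  const score = chirp * Math.abs(drift) * confidence;
  return { drift, confidence, score };
}

// Hypothetical decision tiers; thresholds are assumptions, not a fixed API.
function decide({ score, confidence }) {
  if (score > 1000 && confidence > 0.6) return 'execute';
  if (score > 1000) return 'review';
  if (score > 500) return 'confirm';
  return 'wait';
}

const result = computeFetchScore({ chirp: 90, perch: 70, wake: 80, goal: 90, current: 20 });
console.log(result.score, decide(result));
```

Keeping scoring and deciding separate makes it possible to tune the thresholds without touching how the score is measured.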

Example: Reddit Search

User command: "Search for cormorant foraging on Reddit"

Step 1: Initial State

javascript

// Dimension scores
let chirp = 90;  // Clear "search" command detected
let perch = 70;  // On Reddit homepage, search bar found
let wake = 80;   // Remember we're on Reddit

// DRIFT calculation
const methodology = 90;  // Goal: view search results
let performance = 20;    // Current: on homepage
let drift = methodology - performance;  // 90 - 20 = 70

// Fetch calculation
let confidence = Math.min(perch, wake) / 100;  // min(70, 80) / 100 = 0.70
let fetchScore = chirp * drift * confidence;   // 90 × 70 × 0.70 = 4,410

Decision: Execute (fetch > 1000) → Type query in search bar

Step 2: After Typing

javascript

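// New state
perch = 75;        // Search box now has text
performance = 40;  // Partially toward goal

// Recalculate
drift = methodology - performance;         // 90 - 40 = 50
confidence = Math.min(perch, wake) / 100;  // min(75, 80) / 100 = 0.75
fetchScore = chirp * drift * confidence;   // 90 × 50 × 0.75 = 3,375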

Decision: Execute → Click search button or press Enter

Step 3: After Search

javascript

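// New state
perch = 85;        // Results page loaded
performance = 90;  // Goal reached
drift = methodology - performance;         // 90 - 90 = 0

// Recalculate
confidence = Math.min(perch, wake) / 100;  // min(85, 80) / 100 = 0.80
fetchScore = chirp * drift * confidence;   // 90 × 0 × 0.80 = 0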

Decision: Wait (DRIFT ≈ 0, goal achieved)
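
The three steps can be replayed as data to see the loop terminate. This sketch is illustrative only: the `observations` array hard-codes the Perch and performance values quoted in the steps, `replay` is a hypothetical name, and the decision is simplified to execute or wait. It assumes confidence is min(Perch, Wake) / 100 and DRIFT is goal minus current, as in Step 1.

```javascript
// Values quoted in Steps 1-3; chirp, wake and the goal stay fixed.
const CHIRP = 90, WAKE = 80, GOAL = 90;
const observations = [
  { perch: 70, current: 20 },  // Step 1: on homepage
  { perch: 75, current: 40 },  // Step 2: query typed
  { perch: 85, current: 90 },  // Step 3: results loaded
];

// Hypothetical replay: walk the observations until DRIFT reaches 0.
function replay(steps) {
  const trace = [];
  for (const { perch, current } of steps) {
    const drift = GOAL - current;
    const confidence = Math.min(perch, WAKE) / 100;
    const score = CHIRP * Math.abs(drift) * confidence;
    // Simplified two-way decision (assumed thresholds: 1000 and 0.6).
    const decision = score > 1000 && confidence > 0.6 ? 'execute' : 'wait';
    trace.push({ drift, decision });
    if (drift === 0) break;  // goal reached, stop looping
  }
  return trace;
}

console.log(replay(observations));
```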

Preventing Infinite Loops

The original problem: the agent kept navigating to Reddit, repeating the same action 11 times.

Solution using Fetch:

javascript

const actionHistory = [];

// Returns true when the same action on the same URL already appears
// at least 3 times among the last 10 recorded actions.
function checkForLoop(action, url) {
  const hash = `${action}-${url}`;
  const recentActions = actionHistory.slice(-10);
  const repeatCount = recentActions.filter(h => h === hash).length;

  if (repeatCount >= 3) {
    // Same action repeated too many times; it is not recorded again
    return true;  // Loop detected
  }

  actionHistory.push(hash);
  return false;
}

// Before executing
if (checkForLoop(decision.action, currentUrl)) {
  console.log('Loop detected. DRIFT not decreasing. Stopping.');
  // Named fetchScore rather than fetch so the browser's global fetch() is not overwritten
  fetchScore = 0;  // Force stop
}

The Semantic Anchoring Pattern

From the browser extension's governance document:

Rule 1: Observable DOM Over Inferred State

javascript

// ❌ WRONG: Inferring state
const isLoginPage = url.includes('/auth');

// ✅ RIGHT: Observable semantics
const isLoginPage = document.querySelector('input[type="password"]') !== null;

This maps to PerchIQX (Space/Structure) - always anchor to observable structure.

Rule 2: Intent Preservation

Never override the user's original goal during multi-step execution.

This maps to WakeIQX (Time/Memory) - remember the original intent.
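
One way to enforce this rule in code is to make the original goal immutable and let each step carry only its own sub-goal. A minimal sketch, assuming hypothetical names (`createIntent`, `planStep`) that do not come from the extension:

```javascript
// Hypothetical: capture the user's goal once and freeze it.
function createIntent(command) {
  return Object.freeze({ command, createdAt: Date.now() });
}

// Hypothetical: each step references the intent instead of replacing it.
function planStep(intent, subGoal) {
  return { intent, subGoal };
}

const intent = createIntent('Search for cormorant foraging on Reddit');
const step = planStep(intent, 'Type query in search bar');

try {
  // Attempted override: throws a TypeError in strict mode or ES modules,
  // and is silently ignored in sloppy mode. Either way the goal is unchanged.
  intent.command = 'Open Reddit settings';
} catch (err) {
  console.log('Override rejected:', err.name);
}
console.log(step.intent.command);
```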

Rule 3: Semantic Element Selection

Use meaningful attributes:

ID attributes (most reliable)
Name attributes
ARIA labels
Text content

This ensures high PerchIQX scores → higher confidence → safer execution.
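
The priority order above can be expressed as a small selection function. This is a sketch under assumptions: `pickSelector` and the descriptor shape (`id`, `name`, `ariaLabel`, `text`) are hypothetical, and attribute values are assumed to be simple strings without quotes or backslashes, so no CSS escaping is performed.

```javascript
// Hypothetical: choose the most reliable locator available for an element,
// in the order ID > name > ARIA label > text content.
function pickSelector({ id, name, ariaLabel, text } = {}) {
  if (id) return { strategy: 'id', selector: `[id="${id}"]` };
  if (name) return { strategy: 'name', selector: `[name="${name}"]` };
  if (ariaLabel) return { strategy: 'aria-label', selector: `[aria-label="${ariaLabel}"]` };
  // Text has no CSS selector; the caller must match on textContent instead.
  if (text) return { strategy: 'text', text: text.trim() };
  return null;  // nothing semantic to anchor on
}

console.log(pickSelector({ name: 'q', ariaLabel: 'Search' }));
console.log(pickSelector({ text: '  Log in ' }));
```

In a page context, a returned `selector` could be passed to `document.querySelector`; a `null` result signals that no semantic anchor exists and the action should probably not run unattended.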

Real Agent Code

javascript

// Assumes userCommand, currentState, goalState, aiDecision and actionHistory
// are defined in the enclosing scope, along with the helper functions called below.
async function executeStep() {
  // 1. SENSE: Get DOM
  const dom = await extractSemanticDOM();

  // 2. SENSE: Calculate dimension scores
  const chirp = detectCommandUrgency(userCommand);
  const perch = calculateStructureScore(dom);
  const wake = getContextScore(actionHistory);

  // 3. MEASURE: Calculate DRIFT
  const drift = assessGapToGoal(currentState, goalState);

  // 4. ACT: Calculate Fetch
  // (named fetchScore so it does not shadow the browser's global fetch())
  const confidence = Math.min(perch, wake) / 100;
  const fetchScore = chirp * Math.abs(drift) * confidence;

  // 5. DECIDE
  if (fetchScore > 1000 && confidence > 0.6) {
    await executeAction(aiDecision);
  } else if (fetchScore > 1000) {
    // High score but confidence <= 0.6: execute only with review
    await executeWithReview(aiDecision);
  } else if (fetchScore > 500) {
    const approved = await confirmWithUser(aiDecision);
    if (approved) await executeAction(aiDecision);
  } else {
    console.log('Waiting - insufficient Fetch score');
  }

  // 6. LOOP: Re-measure after 2 seconds
  setTimeout(executeStep, 2000);
}

The GitHub Repository

The actual browser extension:

Semantic Fetch Intelligence Extension

Features:

Gemini 3.0 Pro integration
Semantic DOM extraction
Agentic loop (analyze → think → act)
Loop detection
Action validation

This is where Fetch was discovered.

Key Insights

1. The DRIFT Must Decrease

After each action, re-measure DRIFT. If it's not decreasing, you're stuck.
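
A loop guard based on this insight can compare successive DRIFT measurements directly, which complements counting repeated actions. A minimal sketch, where `isStalled` and the `patience` parameter are hypothetical names and not part of the extension:

```javascript
// Hypothetical: report a stall when |DRIFT| has not decreased across the
// last `patience` consecutive comparisons.
function isStalled(driftHistory, patience = 3) {
  if (driftHistory.length <= patience) return false;  // not enough data yet
  const recent = driftHistory.slice(-(patience + 1)).map(d => Math.abs(d));
  for (let i = 1; i < recent.length; i++) {
    if (recent[i] < recent[i - 1]) return false;  // progress was made
  }
  return true;
}

console.log(isStalled([70, 50, 0]));
console.log(isStalled([70, 70, 70, 70]));
```

A larger `patience` tolerates slow pages at the cost of detecting a stuck agent later.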

2. Confidence Gates Action

Even with high urgency (Chirp) and a large gap (DRIFT), low confidence (the lower of the Perch and Wake scores) prevents unreviewed execution.

3. Multiplicative is Critical

javascript

fetchScore = chirp * Math.abs(drift) * confidence;

If any component is zero, Fetch is zero. This prevents:

Acting without urgency (Chirp = 0)
Acting when goal is met (DRIFT = 0)
Acting when not ready (Confidence = 0)
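
To see why multiplication matters, compare it with a hypothetical additive score. The additive variant below is not part of the Fetch framework; it is included only as a contrast, and both function names are illustrative.

```javascript
// Multiplicative form from the text: any zero factor yields zero.
function fetchMultiplicative(chirp, drift, confidence) {
  return chirp * Math.abs(drift) * confidence;
}

// Hypothetical additive alternative, for contrast only.
function fetchAdditive(chirp, drift, confidence) {
  return chirp + Math.abs(drift) + confidence * 100;
}

// Goal already met (DRIFT = 0), but the command is urgent and confidence is high.
console.log(fetchMultiplicative(90, 0, 0.8));
console.log(fetchAdditive(90, 0, 0.8));
```

The additive score stays positive even though there is nothing left to do, so a threshold on it could still trigger an action. The multiplicative score cannot.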

Lessons Learned

Problem	Fetch Solution
Infinite loops	DRIFT should decrease; track history
Wrong element clicked	High Perch score requires semantic selectors
Premature action	Confidence multiplier gates execution
Stuck on loading page	Wake (context) should track "page loaded"
Ignoring user commands	Chirp detects urgency; don't act without it

Try It Yourself

Clone the repository
Install dependencies
Load the extension in Chrome
Try: "Search for artificial intelligence on Reddit"
Watch the Fetch framework in action

View the complete extension →

