Browser Automation Deep Dive
The Use Case That Started It All
This is the browser automation extension that led to the discovery of the Fetch framework.
See the Origin Story for the full context.
The Problem
Building an AI browser automation agent that needs to:
- Parse DOM structure
- Detect user commands
- Remember context across page navigations
- Decide when to execute actions
- Avoid infinite loops
How Fetch Solved It
The Agent Loop
1. SENSE (3D)
- Chirp: Detect command urgency
- Perch: Parse DOM structure
- Wake: Remember previous actions
2. MEASURE (DRIFT)
- Current state: Where are we?
- Goal state: Where should we be?
- Gap: DRIFT = Goal - Current
3. ACT (Fetch)
- Calculate: Fetch = Chirp × |DRIFT| × Confidence
- Decide: Execute/Confirm/Queue/Wait
- Execute if threshold met
4. LOOP
- Re-measure DRIFT
- Continue until DRIFT ≈ 0
Example: Reddit Search
User command: "Search for cormorant foraging on Reddit"
Step 1: Initial State
// Dimension scores
chirp = 90; // Clear "search" command detected
perch = 70; // On Reddit homepage, search bar found
wake = 80; // Remember we're on Reddit
// DRIFT calculation
methodology = 90; // Goal: view search results
performance = 20; // Current: on homepage
drift = 70;
// Fetch calculation
confidence = min(70, 80) / 100 = 0.70;
fetch = 90 × 70 × 0.70 = 4,410;
Decision: Execute (fetch > 1000) → Type query in search bar
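The Step 1 arithmetic can be reproduced with a small helper. This is a minimal sketch, not the extension's actual code: the name `computeFetch` and the shape of its input object are assumptions made for illustration. The formulas themselves (DRIFT = goal - current, confidence = min(perch, wake) / 100, Fetch = chirp × |DRIFT| × confidence) come from the steps above.

```javascript
// Hypothetical helper: the name and input shape are illustrative assumptions.
function computeFetch({ chirp, perch, wake, goal, current }) {
  const drift = goal - current;                    // DRIFT = Goal - Current
  const confidence = Math.min(perch, wake) / 100;  // weaker of Perch/Wake, scaled to 0..1
  const fetchScore = chirp * Math.abs(drift) * confidence;
  return { drift, confidence, fetchScore };
}

// Step 1 values from the Reddit example
const step1 = computeFetch({ chirp: 90, perch: 70, wake: 80, goal: 90, current: 20 });
console.log(step1.drift, step1.confidence, Math.round(step1.fetchScore)); // 70 0.7 4410
```

The score is rounded only for display, since the multiplication uses floating-point numbers.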
Step 2: After Typing
// New state
perch = 75; // Search box now has text
performance = 40; // Partially toward goal
// Recalculate
drift = 50;
confidence = min(75, 80) / 100 = 0.75;
fetch = 90 × 50 × 0.75 = 3,375;
Decision: Execute → Click search button or press Enter
Step 3: After Search
// New state
perch = 85; // Results page loaded
performance = 90; // Goal reached (methodology = 90)
drift = 0;
// Recalculate
confidence = min(85, 80) / 100 = 0.80;
fetch = 90 × 0 × 0.80 = 0;
Decision: Wait (DRIFT ≈ 0, goal achieved)
Preventing Infinite Loops
The original problem: the agent kept navigating to Reddit, repeating the same action 11 times.
Solution using Fetch:
const actionHistory = [];
// Returns true when the same action/URL pair already appears
// 3 or more times among the last 10 recorded actions.
function checkForLoop(action, url) {
  const hash = `${action}-${url}`;
  const recentActions = actionHistory.slice(-10);
  const repeatCount = recentActions.filter(h => h === hash).length;
  if (repeatCount >= 3) {
    // Same action repeated too many times
    return true; // Loop detected
  }
  actionHistory.push(hash);
  return false;
}
// Before executing
if (checkForLoop(decision.action, currentUrl)) {
  console.log('Loop detected. DRIFT not decreasing. Stopping.');
  // Force stop. Assumes `fetch` is the agent's local score variable,
  // not the browser's global fetch() API.
  fetch = 0;
}
The Semantic Anchoring Pattern
From the browser extension's governance document:
Rule 1: Observable DOM Over Inferred State
// ❌ WRONG: Inferring state
const isLoginPage = url.includes('/auth');
// ✅ RIGHT: Observable semantics
const isLoginPage = document.querySelector('input[type="password"]') !== null;
This maps to PerchIQX (Space/Structure) - always anchor to observable structure.
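Rule 1 can be wrapped in a function that takes the DOM root as a parameter, which makes the check easy to exercise outside a browser. The sketch below is illustrative only: the function signature and the stub objects standing in for a real `document` are assumptions, not code from the extension.

```javascript
// Sketch: `root` is anything exposing querySelector (for example `document` in a browser).
function isLoginPage(root) {
  // Observable fact: a password field is present in the DOM.
  return root.querySelector('input[type="password"]') !== null;
}

// Stubs standing in for a DOM so the check can run outside a browser (illustration only).
const fakeLoginDom = {
  querySelector: (selector) =>
    selector === 'input[type="password"]' ? { tagName: 'INPUT' } : null,
};
const fakeHomeDom = { querySelector: () => null };

console.log(isLoginPage(fakeLoginDom)); // true
console.log(isLoginPage(fakeHomeDom));  // false
```

In the extension itself the call would presumably be `isLoginPage(document)`.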
Rule 2: Intent Preservation
Never override the user's original goal during multi-step execution.
This maps to WakeIQX (Time/Memory) - remember the original intent.
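One way to enforce Rule 2 is to freeze the original command when a task starts, so later steps can read it but not replace it. This is a sketch under assumptions: `createTask`, `recordStep`, and the task shape are hypothetical names, not taken from the extension.

```javascript
// Hypothetical task record: the original intent is frozen when the task starts.
function createTask(userCommand) {
  return Object.freeze({
    intent: Object.freeze({ command: userCommand }), // cannot be mutated or replaced
    steps: [],                                        // the array itself can still grow
  });
}

// Records a step and returns the original command each step should be judged against.
function recordStep(task, description) {
  task.steps.push(description);
  return task.intent.command;
}

const task = createTask('Search for cormorant foraging on Reddit');
recordStep(task, 'Typed query in search bar');
recordStep(task, 'Pressed Enter');
console.log(task.intent.command, task.steps.length);
```

`Object.freeze` is shallow, which is why the `steps` array stays appendable while `intent` stays fixed.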
Rule 3: Semantic Element Selection
Use meaningful attributes:
- ID attributes (most reliable)
- Name attributes
- ARIA labels
- Text content
This ensures high PerchIQX scores → higher confidence → safer execution.
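The attribute list above can be read as a priority order. The sketch below assumes that reading and assumes a hypothetical element descriptor of the form `{ id, name, ariaLabel, text }`; `pickLocator` is an illustrative name, and the quoting is deliberately simple rather than a full CSS escaper.

```javascript
// Hypothetical descriptor: { id, name, ariaLabel, text } extracted from an element.
function pickLocator(el) {
  // Simple quoting for attribute values; a sketch, not a complete CSS escaper.
  const quote = (value) => JSON.stringify(String(value));
  if (el.id) return { strategy: 'id', selector: `[id=${quote(el.id)}]` };
  if (el.name) return { strategy: 'name', selector: `[name=${quote(el.name)}]` };
  if (el.ariaLabel) return { strategy: 'aria-label', selector: `[aria-label=${quote(el.ariaLabel)}]` };
  if (el.text) return { strategy: 'text', text: el.text.trim() }; // no CSS selector for text
  return null; // nothing semantic to anchor on
}

console.log(pickLocator({ id: 'search-input', name: 'q' }));
console.log(pickLocator({ ariaLabel: 'Search' }));
console.log(pickLocator({}));
```

Returning `null` when no semantic attribute exists lets the caller lower the Perch score instead of guessing at a positional selector.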
Real Agent Code
async function executeStep() {
  // userCommand, actionHistory, currentState, goalState, aiDecision and the
  // helper functions are assumed to be available from the enclosing scope.
  // 1. SENSE: Get DOM
  const dom = await extractSemanticDOM();
  // 2. SENSE: Calculate dimension scores
  const chirp = detectCommandUrgency(userCommand);
  const perch = calculateStructureScore(dom);
  const wake = getContextScore(actionHistory);
  // 3. MEASURE: Calculate DRIFT
  const drift = assessGapToGoal(currentState, goalState);
  // 4. ACT: Calculate Fetch
  const confidence = Math.min(perch, wake) / 100;
  const fetch = chirp * Math.abs(drift) * confidence;
  // 5. DECIDE
  if (fetch > 1000 && confidence > 0.6) {
    await executeAction(aiDecision);
  } else if (fetch > 1000 && confidence <= 0.6) {
    await executeWithReview(aiDecision);
  } else if (fetch > 500) {
    const approved = await confirmWithUser(aiDecision);
    if (approved) await executeAction(aiDecision);
  } else {
    console.log('Waiting - insufficient Fetch score');
  }
  // 6. LOOP: Re-measure after 2 seconds
  setTimeout(executeStep, 2000);
}
The GitHub Repository
The actual browser extension:
Semantic Fetch Intelligence Extension
Features:
- Gemini 3.0 Pro integration
- Semantic DOM extraction
- Agentic loop (analyze → think → act)
- Loop detection
- Action validation
This is where Fetch was discovered.
Key Insights
1. The DRIFT Must Decrease
After each action, re-measure DRIFT. If it's not decreasing, you're stuck.
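A stall check over recent DRIFT measurements could look like the sketch below. The helper name `isDriftStalled` and the `patience` parameter are assumptions for illustration; the sample histories reuse the DRIFT values from the Reddit example (70, 50, 0).

```javascript
// Hypothetical helper: true when DRIFT has failed to shrink across the
// last `patience` consecutive re-measurements.
function isDriftStalled(driftHistory, patience = 2) {
  if (driftHistory.length <= patience) return false; // not enough data yet
  const recent = driftHistory.slice(-(patience + 1)).map((d) => Math.abs(d));
  // Stalled if no measurement in the window is smaller than the one before it.
  return recent.every((value, i) => i === 0 || value >= recent[i - 1]);
}

console.log(isDriftStalled([70, 50, 0]));  // false
console.log(isDriftStalled([70, 70, 70])); // true
```

A caller would check for DRIFT ≈ 0 first, since a history of zeros also counts as "not decreasing" here.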
2. Confidence Gates Action
Even with high urgency (Chirp) and a large gap (DRIFT), low confidence blocks automatic execution. Confidence is min(Perch, Wake) / 100, so a weak score in either dimension is enough.
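The gating can be isolated as a pure function. This sketch reuses the thresholds from the agent code earlier on this page (1000, 500, and 0.6); the function name `decide` and the returned labels are illustrative assumptions.

```javascript
// Thresholds taken from the agent code on this page (1000, 500, 0.6).
function decide(fetchScore, confidence) {
  if (fetchScore > 1000 && confidence > 0.6) return 'execute';
  if (fetchScore > 1000) return 'review';  // urgent and far from goal, but low confidence
  if (fetchScore > 500) return 'confirm';  // ask the user first
  return 'wait';
}

console.log(decide(4410, 0.7)); // execute
console.log(decide(3150, 0.5)); // review
console.log(decide(0, 0.8));    // wait
```

With confidence at 0.5 the same urgency and gap that would otherwise execute are routed to review instead.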
3. Multiplicative is Critical
fetch = chirp × |drift| × confidence
If any component is zero, Fetch is zero. This prevents:
- Acting without urgency (Chirp = 0)
- Acting when goal is met (DRIFT = 0)
- Acting when not ready (Confidence = 0)
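The effect is easiest to see next to an alternative. In the sketch below, `additive` is a hypothetical scoring rule included only for contrast; it is not part of Fetch, and the input values are borrowed from the Reddit example.

```javascript
const multiplicative = (chirp, drift, confidence) => chirp * Math.abs(drift) * confidence;
// Hypothetical additive alternative, shown only for contrast (not part of Fetch).
const additive = (chirp, drift, confidence) => chirp + Math.abs(drift) + confidence * 100;

const cases = [
  { label: 'no urgency', chirp: 0, drift: 70, confidence: 0.7 },
  { label: 'goal met', chirp: 90, drift: 0, confidence: 0.7 },
  { label: 'not ready', chirp: 90, drift: 70, confidence: 0 },
];

for (const c of cases) {
  console.log(
    c.label,
    multiplicative(c.chirp, c.drift, c.confidence),
    additive(c.chirp, c.drift, c.confidence)
  );
}
```

In every case the multiplicative score is zero, while the additive score stays well above zero and could still cross an execution threshold.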
Lessons Learned
| Problem | Fetch Solution |
|---|---|
| Infinite loops | DRIFT should decrease; track history |
| Wrong element clicked | High Perch score requires semantic selectors |
| Premature action | Confidence multiplier gates execution |
| Stuck on loading page | Wake (context) should track "page loaded" |
| Ignoring user commands | Chirp detects urgency; don't act without it |
Try It Yourself
- Clone the repository
- Install dependencies
- Load in Chrome
- Try: "Search for artificial intelligence on Reddit"
- Watch the Fetch framework in action