

Why We Built Taskmosis on Chrome DevTools Protocol

A technical look at how we achieve 85% accuracy and sub-2-minute execution times by combining AI with full browser control.

January 2025 · 10 min read · Engineering
  • Success Rate: 85%
  • Avg Execution: 1.2 min
  • Cost/Task: $0.10
  • Reliability: 98%+

When we set out to build Taskmosis, we had a clear goal: create a browser automation tool that works reliably on real websites. After evaluating the main approaches on the market, we chose to build on the Chrome DevTools Protocol (CDP).

This wasn't an obvious choice. Some competitors avoid CDP entirely, claiming it's "detectable" or "insecure." Others go the opposite direction with screenshot-based vision models. We believe both extremes miss the mark.

Here's why CDP, when used correctly through a Chrome Extension, provides the best foundation for AI-powered browser automation.

Full Browser Control

Access every browser capability through the official Chrome debugging interface

  • DOM manipulation
  • Network control
  • Input simulation
  • File handling

Native Integration

Runs within your browser using your existing sessions and credentials

  • Your cookies
  • Your logins
  • Your extensions
  • Your settings

AI-Powered Decisions

Intelligent task planning with semantic understanding of web pages

  • Natural language
  • Context awareness
  • Error recovery
  • Smart retries

How Taskmosis Works

1. You describe your task
   "Fill out job applications on LinkedIn"
2. Minimized DOM sent to AI
   Parsed accessibility tree (90%+ smaller) + optional screenshot sent to backend AI
3. AI returns CDP commands
   Backend AI analyzes and returns specific CDP function calls
4. CDP executes locally
   Native clicks, typing, scrolling in YOUR browser - indistinguishable from you

Task completed: applications submitted, forms filled, data extracted
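
Steps 3 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the Taskmosis implementation: the `CdpCommand` shape, the `parsePlan` and `executePlan` helpers, and the domain allowlist are all assumptions made for the example, since the actual wire format is not described here.

```typescript
// Hypothetical shape of one command in the AI's response; the article does
// not document the real wire format.
interface CdpCommand {
  method: string; // for example "Input.dispatchMouseEvent"
  params: Record<string, unknown>;
}

// Anything that can send one CDP command. In an extension this could wrap
// chrome.debugger.sendCommand({ tabId }, method, params).
type SendFn = (method: string, params: Record<string, unknown>) => Promise<unknown>;

// Illustrative allowlist built from the domains this article mentions.
const ALLOWED_DOMAINS = ["DOM", "Input", "Network", "Page", "Runtime", "Accessibility"];

// Step 3: parse the backend response and reject malformed or unexpected commands.
function parsePlan(json: string): CdpCommand[] {
  const raw: unknown = JSON.parse(json);
  if (!Array.isArray(raw)) throw new Error("plan must be a JSON array");
  return raw.map((item: unknown, index: number): CdpCommand => {
    const obj = (typeof item === "object" && item !== null ? item : {}) as Record<string, unknown>;
    const method = obj["method"];
    if (typeof method !== "string" || !ALLOWED_DOMAINS.some((d) => method.startsWith(d + "."))) {
      throw new Error(`command ${index}: unsupported method`);
    }
    const params = obj["params"];
    const isObject = typeof params === "object" && params !== null && !Array.isArray(params);
    return { method, params: isObject ? (params as Record<string, unknown>) : {} };
  });
}

// Step 4: run the commands one at a time so each sees the previous one's effect.
async function executePlan(send: SendFn, plan: CdpCommand[]): Promise<number> {
  let executed = 0;
  for (const command of plan) {
    await send(command.method, command.params);
    executed += 1;
  }
  return executed;
}
```

Passing the sender in as a function keeps the planning logic testable outside a browser; validating before executing means a malformed response fails before anything is clicked.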

Comparing Automation Approaches

There are three main ways to build browser automation. Here's how they stack up:

CDP-Based (Taskmosis)

RECOMMENDED

The Best of Both Worlds

Combines extension convenience with full browser control. A minimized accessibility tree is sent to the AI for planning, with screenshots added only when visual context is needed. CDP commands execute locally.

Advantages

  • Complete DOM access
  • Native input events
  • 90%+ smaller payload than raw DOM
  • Works with your sessions
  • Credentials never transmitted

Limitations

  • Requires debugger permission
Accuracy: 85% · Speed: Fast

Extension-Only

Limited by Design

Uses only Chrome Extension APIs without debugger access. Simpler but significantly restricted.

Advantages

  • No special permissions
  • Simple architecture

Limitations

  • Cannot access Shadow DOM
  • Synthetic events detectable
  • No file upload support
  • No network control
  • CSP restrictions
Accuracy: ~70-80% · Speed: Fast

CUA (Cloud Screenshot Agents)

NOT RECOMMENDED

Expensive, Slow, Privacy Risk

Takes screenshots and sends them to external AI providers for interpretation. Fundamentally inefficient and exposes your screen content.

Advantages

  • Works on any visual content

Limitations

  • Up to 10x slower execution
  • At least 5x higher cost
  • Screenshots sent to cloud
  • Privacy concerns
  • Hallucination prone
Accuracy: 40-66% · Speed: Very Slow

What CDP Enables

The Chrome DevTools Protocol provides access to powerful browser internals that extension-only approaches simply cannot reach.

DOM Domain

Full access to the document structure, including Shadow DOM and iframes

  • Query any element
  • Modify attributes
  • Traverse the tree
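
As a rough illustration of what "including Shadow DOM and iframes" means in practice: CDP's `DOM.getDocument` accepts `depth` and `pierce` parameters, and the returned nodes carry `shadowRoots` and `contentDocument` fields alongside `children`. The sketch below walks such a tree; the `DomNode` subset and the `findByTag` helper are names invented for this example.

```typescript
// Subset of the CDP DOM.Node fields used in this sketch.
interface DomNode {
  nodeId: number;
  nodeName: string;
  children?: DomNode[];
  shadowRoots?: DomNode[];   // present when the element hosts a shadow root
  contentDocument?: DomNode; // present on frame owner elements such as IFRAME
}

// Parameters for DOM.getDocument: depth -1 asks for the whole subtree and
// pierce: true includes shadow roots and iframe documents in the result.
const GET_DOCUMENT_PARAMS = { depth: -1, pierce: true };

// Depth-first search that also descends into shadow roots and iframes.
function findByTag(node: DomNode, tag: string, found: DomNode[] = []): DomNode[] {
  if (node.nodeName.toLowerCase() === tag.toLowerCase()) found.push(node);
  const nested: DomNode[] = [
    ...(node.children ?? []),
    ...(node.shadowRoots ?? []),
    ...(node.contentDocument ? [node.contentDocument] : []),
  ];
  for (const child of nested) findByTag(child, tag, found);
  return found;
}
```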

Input Domain

Native mouse and keyboard events that are indistinguishable from real user input

  • Click anywhere
  • Type naturally
  • Drag and drop
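
A native click is typically a short sequence of `Input.dispatchMouseEvent` calls at viewport coordinates. The sketch below builds that sequence from an element's content quad (the eight-number array that `DOM.getBoxModel` returns). `quadCenter` and `clickSequence` are hypothetical helper names, and the move-press-release ordering is one reasonable choice rather than a documented Taskmosis behavior.

```typescript
// Four corner points as [x1, y1, x2, y2, x3, y3, x4, y4], as in a CDP Quad.
type Quad = [number, number, number, number, number, number, number, number];

interface Point { x: number; y: number }

// Subset of the Input.dispatchMouseEvent parameters used here.
interface MouseEventParams {
  type: "mouseMoved" | "mousePressed" | "mouseReleased";
  x: number;
  y: number;
  button: "none" | "left";
  clickCount: number;
}

// Average the four corners to get a point inside the element.
function quadCenter([x1, y1, x2, y2, x3, y3, x4, y4]: Quad): Point {
  return { x: (x1 + x2 + x3 + x4) / 4, y: (y1 + y2 + y3 + y4) / 4 };
}

// Move to the point, then press and release the left button there.
function clickSequence(point: Point): MouseEventParams[] {
  const { x, y } = point;
  return [
    { type: "mouseMoved", x, y, button: "none", clickCount: 0 },
    { type: "mousePressed", x, y, button: "left", clickCount: 1 },
    { type: "mouseReleased", x, y, button: "left", clickCount: 1 },
  ];
}
```

Each entry would be sent as the parameters of one `Input.dispatchMouseEvent` command, in order.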

Network Domain

Monitor and intercept network requests for advanced automation scenarios

  • Track API calls
  • Handle auth flows
  • Manage cookies
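
One common use of network monitoring is waiting until a page has stopped loading data before acting. After `Network.enable`, CDP emits `Network.requestWillBeSent`, `Network.loadingFinished`, and `Network.loadingFailed` events, each carrying a `requestId`. The `InFlightTracker` class below is a hypothetical sketch of counting in-flight requests from those events; whether Taskmosis does this is not stated here.

```typescript
// Tracks which requests have started but not yet finished or failed.
class InFlightTracker {
  private pending = new Set<string>();

  // Feed every CDP event here, for example from chrome.debugger.onEvent.
  handleEvent(method: string, params: { requestId?: string }): void {
    const id = params.requestId;
    if (id === undefined) return;
    if (method === "Network.requestWillBeSent") {
      this.pending.add(id); // redirects reuse the same id, so a Set is enough
    } else if (method === "Network.loadingFinished" || method === "Network.loadingFailed") {
      this.pending.delete(id);
    }
  }

  // True when no tracked request is still in flight.
  isIdle(): boolean {
    return this.pending.size === 0;
  }
}
```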

Page Domain

Control page lifecycle, capture screenshots, and handle navigation

  • Navigate pages
  • Capture state
  • Handle dialogs
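
Dialog handling is a small example of the Page domain at work: the browser emits `Page.javascriptDialogOpening` with a dialog `type`, and the automation answers with `Page.handleJavaScriptDialog`. The policy below (accept alerts and unload prompts, decline anything that asks for a decision) is purely an assumption for illustration, as is the `dialogResponse` name.

```typescript
// Subset of the Page.javascriptDialogOpening event parameters.
interface DialogOpening {
  type: "alert" | "confirm" | "prompt" | "beforeunload";
  message: string;
}

// Parameters for Page.handleJavaScriptDialog.
interface DialogResponse {
  accept: boolean;
  promptText?: string;
}

// Hypothetical policy: dismiss informational dialogs, decline real choices
// so the agent never confirms something the user did not ask for.
function dialogResponse(dialog: DialogOpening): DialogResponse {
  const safeToAccept = dialog.type === "alert" || dialog.type === "beforeunload";
  return { accept: safeToAccept };
}
```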

Runtime Domain

Execute JavaScript in the page context for complex interactions

  • Run scripts
  • Access variables
  • Call functions
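
Running a script through CDP means sending `Runtime.evaluate` and then checking the response for either a value or an exception. A minimal sketch, assuming hypothetical `evaluateParams` and `unwrapEvaluate` helpers and only the response fields shown:

```typescript
// Subset of the Runtime.evaluate response used in this sketch.
interface EvaluateResponse {
  result: { type: string; value?: unknown };
  exceptionDetails?: { text: string; exception?: { description?: string } };
}

// returnByValue asks for a JSON value instead of a remote object handle;
// awaitPromise resolves promises before the response is returned.
function evaluateParams(expression: string) {
  return { expression, returnByValue: true, awaitPromise: true };
}

// Turn a page-side exception into a thrown Error, otherwise return the value.
function unwrapEvaluate(response: EvaluateResponse): unknown {
  const details = response.exceptionDetails;
  if (details) {
    throw new Error(details.exception?.description ?? details.text);
  }
  return response.result.value;
}
```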

Accessibility

Read the accessibility tree for semantic understanding of page structure

  • ARIA roles
  • Element labels
  • Focus order
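
To give a feel for how an accessibility tree can shrink a page down to what an AI planner needs, here is one possible reduction over the nodes returned by `Accessibility.getFullAXTree`. The role list and the one-line-per-element output format are assumptions for this sketch, not the actual Taskmosis minimizer.

```typescript
// Subset of the CDP AXNode fields used in this sketch.
interface AXValue { type: string; value?: unknown }
interface AXNode {
  nodeId: string;
  ignored: boolean;
  role?: AXValue;
  name?: AXValue;
  backendDOMNodeId?: number;
}

// Illustrative set of roles a planner is likely to act on.
const INTERACTIVE_ROLES = ["button", "link", "textbox", "checkbox", "radio", "combobox"];

// Keep only visible, interactive nodes and render each as one compact line.
function minimizeAxTree(nodes: AXNode[]): string[] {
  const lines: string[] = [];
  for (const node of nodes) {
    if (node.ignored || node.backendDOMNodeId === undefined) continue;
    const role = String(node.role?.value ?? "");
    if (INTERACTIVE_ROLES.indexOf(role) === -1) continue;
    const label = String(node.name?.value ?? "").trim();
    lines.push(`[${node.backendDOMNodeId}] ${role} "${label}"`);
  }
  return lines;
}
```

Keeping `backendDOMNodeId` in each line gives the planner a handle it can refer back to when it asks for an action on that element.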

Why Cloud Screenshot Agents Struggle

CUA (Cloud Screenshot) Approach

1. Capture screenshot for EVERY action
   ~500ms latency, large file size each time
2. Send to cloud AI provider
   Expensive API call, 2-5 seconds, privacy risk
3. Interpret pixels remotely
   OCR errors, hallucinations, missed elements
4. Guess coordinates
   Click wrong element, retry needed

Result: 6-12 minutes per task, $0.50-3.00 cost

The Taskmosis Approach

1. Parse accessibility tree
   Minimized DOM (90%+ smaller) sent to AI
2. AI plans actions
   Fast, cheap, accurate - screenshots only when visual context needed
3. AI returns CDP commands
   Specific function calls for precise actions
4. Execute via CDP locally
   Native events in YOUR browser, precise targeting

Result: ~1.2 minutes per task, $0.10 cost
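
Taking the two result lines above at face value, the gap works out as follows. This is simple division of the quoted figures, not an independent measurement:

```latex
\[
\text{time ratio} = \frac{6\ \text{min}}{1.2\ \text{min}} = 5
\quad\text{to}\quad
\frac{12\ \text{min}}{1.2\ \text{min}} = 10,
\qquad
\text{cost ratio} = \frac{\$0.50}{\$0.10} = 5
\quad\text{to}\quad
\frac{\$3.00}{\$0.10} = 30.
\]
```

On these numbers, a screenshot agent takes 5 to 10 times as long and costs 5 to 30 times as much per task.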

Addressing the "CDP is Detectable" Myth

You may have heard that CDP-based automation is easily detected by websites. This is only true for headless browsers launched with remote debugging enabled.

When CDP is accessed through Chrome's extension APIs (using chrome.debugger), the situation is completely different:

  • No navigator.webdriver flag - This flag is only set for automated browser launches, not extension debugger access
  • No ChromeDriver globals - The window.cdc_* objects are injected by ChromeDriver, not by extension debugger access
  • Identical browser fingerprint - Your browser looks exactly the same as manual browsing
  • Your existing session - Uses your cookies, logins, and browser state

Bottom line: Taskmosis works on LinkedIn, banking sites, and other platforms with aggressive bot detection because it's indistinguishable from you browsing manually.
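
The first two signals in the list above are easy to check for yourself. The sketch below is a simplified, hypothetical version of what a detection script might look for; real bot-detection systems use many more signals, and `automationSignals` is a name invented for this example.

```typescript
// Report the two automation markers discussed above.
// nav: the page's navigator object (only the webdriver flag is read).
// globalKeys: property names found on window and document.
function automationSignals(nav: { webdriver?: boolean }, globalKeys: string[]): string[] {
  const signals: string[] = [];
  if (nav.webdriver === true) signals.push("navigator.webdriver");
  for (const key of globalKeys) {
    // ChromeDriver-injected names contain "cdc_", sometimes after a "$".
    if (key.indexOf("cdc_") !== -1) signals.push(key);
  }
  return signals;
}
```

In a page, this could be called as `automationSignals(navigator, Object.keys(window).concat(Object.keys(document)))`; an empty result means neither marker is present.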

Frequently Asked Questions

When used through a Chrome Extension (like Taskmosis), CDP operates within the browser's normal security model. The navigator.webdriver flag and other detection vectors only apply to headless browser automation - not to extension-based CDP usage. Your browser fingerprint remains identical to normal browsing.

Ready to Automate Your Browser Tasks?

Experience the power of CDP-based automation. 85% accuracy, ~1.2 minute execution, $0.10 per task.