Autonomous web research agent. Runs overnight. Handles Cloudflare, dynamic JS, login walls, and rate limits. Built because every other option failed the moment it touched anything real.
Every autonomous web agent I tried had the same issue. It worked on the demo site and fell apart on anything real. Cloudflare caught it immediately. JavaScript that rendered table content after a two second delay was invisible to it. Rate limit responses came back as 200 OK with an error page body and the agent reported success, saved garbage, and moved on.
I needed something that could run actual research tasks overnight. Full academic database downloads, multi-step navigation, sites that actively resist automation. Nothing I found was built for that. So I built Blackreach.
At the core is a ReAct loop. The agent receives a task, thinks through what action to take, executes it, observes the result, and repeats. The hard part is the observation space. Most agents dump raw HTML into the context. A typical page is 50k to 500k tokens of noise. Blackreach uses a DOM walker instead.
Thought: I need the inscription table on this page
Action: navigate("https://sigla.phis.me/")
Observation: Page loaded. Nav: [About, Database, Signs].
Main: table, 847 rows, columns [ID, Site, Text, Image].
Interactive: pagination controls, export button.
Thought: extract all rows and handle pagination
Action: extract_table(selector=".inscription-table", paginate=True)
...
The DOM walker extracts semantic structure instead of raw markup. Visible text, interactive elements, navigation landmarks, ARIA roles. A 200k token page becomes a 2k token observation the model can actually reason about.
Standard Playwright gets caught immediately. The tells are well documented:
navigator.webdriver = true, missing browser extensions, CDP artifacts,
headless viewport signatures, unnatural input timing.
Blackreach patches these before any page loads. Human-like timing on mouse moves and keypresses. Viewport sizes pulled from real device distributions. User-agent rotation with matching headers. JS injection to clear the automation fingerprint.
Task input
|
v
[ ReAct Loop ]
[ Think (LLM) <---> DOM Walker ]
[ | ]
[ v ]
[ Act ------> Stealth Playwright ]
|
v
Output + task log
Autonomous agents fail silently. The agent says it succeeded, the file is there, you don't find out until hours later when you open it and it's a 403 error page saved as HTML.
Every one of the 2,904 tests came from a real failure. Rate limits returning 200 OK. JavaScript rendering table content after a two second delay. Session tokens expiring mid-task. CAPTCHAs showing up on page 3 but not pages 1 or 2. Login walls that only trigger from non-residential IPs.
When it runs at 3am collecting data, I need to know it fails loud. Not silent. That's what the test suite is for.
Open source at github.com/Null-Phnix/Blackreach. Production-grade for research tasks. Issues and PRs welcome.