Documentation Index Fetch the complete documentation index at: https://docs.webrun.ai/llms.txt
Use this file to discover all available pages before exploring further.
Manual interaction lets you take direct control of the browser session. Use this when the AI agent gets stuck, when handling sensitive credentials, or when you need precise control over specific actions.
Coordinate System
The browser viewport is 1024×600 pixels. The origin (0,0) is at the top-left. Add 155 pixels to your Y coordinate to account for browser chrome (address bar, tabs).
Clickable area: 1024×445 pixels (after accounting for 155px chrome offset)
If you’re displaying the stream at a different size, scale coordinates proportionally:
function scaleCoords ( clientX , clientY , displayWidth , displayHeight ) {
const serverWidth = 1024 ;
const serverHeight = 600 ;
const yOffset = 155 ; // Browser chrome
return {
x: Math . round (( clientX / displayWidth ) * serverWidth ),
y: Math . round (( clientY / displayHeight ) * serverHeight ) + yOffset
};
}
Taking Control
takeOverControl
Pause the AI agent and enable manual mode:
curl -X POST https://connect.webrun.ai/start/send-message \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"sessionId": "SESSION_ID",
"message": {
"actionType": "interaction",
"action": { "type": "takeOverControl" }
}
}'
The current task pauses automatically. Perform manual actions, then release control to let the agent continue.
releaseControl
Return control to the AI agent:
curl -X POST https://connect.webrun.ai/start/send-message \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"sessionId": "SESSION_ID",
"message": {
"actionType": "interaction",
"action": { "type": "releaseControl" }
}
}'
Interaction Actions
Click
Click at specific coordinates:
{
"actionType" : "interaction" ,
"action" : {
"type" : "CLICK" ,
"x" : 500 , // 0-1024
"y" : 300 // 155-755 (with chrome offset)
}
}
Type
Type text into the currently focused element:
{
"actionType" : "interaction" ,
"action" : {
"type" : "TYPE" ,
"text" : "Hello world" ,
"humanLike" : true // Optional: simulate human typing speed
}
}
Set humanLike: true to simulate realistic typing speed and patterns. This helps avoid detection on sites with anti-bot measures.
Key Press
Press a single key or key combination:
{
"actionType" : "interaction" ,
"action" : {
"type" : "KEY_PRESS" ,
"key" : "Enter"
}
}
Common keys:
Enter, Escape, Tab, Backspace, Delete
ArrowUp, ArrowDown, ArrowLeft, ArrowRight
PageUp, PageDown, Home, End
Single characters: a, b, 1, 2, etc.
Common Patterns
Manual Login with 2FA
// Take control
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "takeOverControl" }
});
// Click username field
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "CLICK" , x: 400 , y: 300 }
});
// Type username
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "TYPE" , text: "user@example.com" , humanLike: true }
});
// Tab to password field
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "KEY_PRESS" , key: "Tab" }
});
// Type password
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "TYPE" , text: "secretPassword123" , humanLike: true }
});
// Submit
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "KEY_PRESS" , key: "Enter" }
});
// Wait for 2FA prompt, user solves it manually via video stream
// Release control after 2FA is complete
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "releaseControl" }
});
Dropdown Navigation
// Open dropdown
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "CLICK" , x: 500 , y: 350 }
});
// Navigate with arrow keys
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "KEY_PRESS" , key: "ArrowDown" }
});
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "KEY_PRESS" , key: "ArrowDown" }
});
// Select
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "KEY_PRESS" , key: "Enter" }
});
Responding to Guardrails
socket . on ( "message" , ( data ) => {
if ( data . type === "guardrail_trigger" &&
data . data . value . includes ( "CAPTCHA" )) {
// Take control for manual solving
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "takeOverControl" }
});
// User solves CAPTCHA via video stream
// Once done, release control and resume
document . getElementById ( "captcha-solved" ). onclick = () => {
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "releaseControl" }
});
socket . emit ( "message" , {
actionType: "guardrail" ,
prompt: "CAPTCHA solved, continue" ,
newState: "resume"
});
};
}
});
Complete Example with Video Stream
import { io } from "socket.io-client" ;
// Create session
const session = await fetch ( "https://connect.webrun.ai/start/start-session" , {
method: "POST" ,
headers: {
"Content-Type" : "application/json" ,
"Authorization" : `Bearer ${ API_KEY } `
},
body: JSON . stringify ({
task: {
prompt: "Go to login page" ,
startingUrl: "https://example.com/login"
}
})
}). then ( r => r . json ());
// Setup video stream
const iframe = document . createElement ( "iframe" );
iframe . src = session . streaming . webViewURL ;
iframe . width = 1024 ;
iframe . height = 600 ;
document . body . appendChild ( iframe );
// Connect WebSocket
const socket = io ( "https://connect.webrun.ai" , {
auth: { sessionId: session . sessionId },
transports: [ "websocket" ]
});
// Take control when ready
socket . on ( "connect" , () => {
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "takeOverControl" }
});
});
// Setup click handler
const canvas = document . getElementById ( "clickable-overlay" );
canvas . addEventListener ( "click" , ( e ) => {
const rect = canvas . getBoundingClientRect ();
const coords = scaleCoords (
e . clientX - rect . left ,
e . clientY - rect . top ,
rect . width ,
rect . height
);
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "CLICK" , x: coords . x , y: coords . y }
});
});
// Release control when done
document . getElementById ( "done-button" ). addEventListener ( "click" , () => {
socket . emit ( "message" , {
actionType: "interaction" ,
action: { type: "releaseControl" }
});
// Continue with next task
socket . emit ( "message" , {
actionType: "newTask" ,
newState: "start" ,
prompt: "Complete the checkout process"
});
});
function scaleCoords ( clientX , clientY , displayWidth , displayHeight ) {
return {
x: Math . round (( clientX / displayWidth ) * 1024 ),
y: Math . round (( clientY / displayHeight ) * 600 ) + 155
};
}
Troubleshooting
Issue Solution Clicks not registering Verify Y offset (155px) is added to coordinates Clicking wrong location Check coordinate scaling function Typing not working Click element first to ensure it’s focused Actions happening too fast Add delays between actions (500ms recommended)
Video Streaming Setup video streaming for visual feedback
Handling Guardrails Respond when the agent needs help