Skip to main content
Manual interaction lets you take direct control of the browser session. Use this when the AI agent gets stuck, when handling sensitive credentials, or when you need precise control over specific actions.

Coordinate System

The browser viewport is 1024×600 pixels. The origin (0,0) is at the top-left. Add 155 pixels to your Y coordinate to account for browser chrome (address bar, tabs). Clickable area: 1024×445 pixels (after accounting for 155px chrome offset) If you’re displaying the stream at a different size, scale coordinates proportionally:
function scaleCoords(clientX, clientY, displayWidth, displayHeight) {
  const serverWidth = 1024;
  const serverHeight = 600;
  const yOffset = 155; // Browser chrome

  return {
    x: Math.round((clientX / displayWidth) * serverWidth),
    y: Math.round((clientY / displayHeight) * serverHeight) + yOffset
  };
}

Taking Control

takeOverControl

Pause the AI agent and enable manual mode:
curl -X POST https://connect.webrun.ai/start/send-message \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "sessionId": "SESSION_ID",
    "message": {
      "actionType": "interaction",
      "action": { "type": "takeOverControl" }
    }
  }'
The current task pauses automatically. Perform manual actions, then release control to let the agent continue.

releaseControl

Return control to the AI agent:
curl -X POST https://connect.webrun.ai/start/send-message \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "sessionId": "SESSION_ID",
    "message": {
      "actionType": "interaction",
      "action": { "type": "releaseControl" }
    }
  }'

Interaction Actions

Click

Click at specific coordinates:
{
  "actionType": "interaction",
  "action": {
    "type": "CLICK",
    "x": 500,      // 0-1024
    "y": 300       // 155-755 (with chrome offset)
  }
}

Type

Type text into the currently focused element:
{
  "actionType": "interaction",
  "action": {
    "type": "TYPE",
    "text": "Hello world",
    "humanLike": true  // Optional: simulate human typing speed
  }
}
Set humanLike: true to simulate realistic typing speed and patterns. This helps avoid detection on sites with anti-bot measures.

Key Press

Press a single key or key combination:
{
  "actionType": "interaction",
  "action": {
    "type": "KEY_PRESS",
    "key": "Enter"
  }
}
Common keys:
  • Enter, Escape, Tab, Backspace, Delete
  • ArrowUp, ArrowDown, ArrowLeft, ArrowRight
  • PageUp, PageDown, Home, End
  • Single characters: a, b, 1, 2, etc.

Common Patterns

Manual Login with 2FA

// Take control
socket.emit("message", {
  actionType: "interaction",
  action: { type: "takeOverControl" }
});

// Click username field
socket.emit("message", {
  actionType: "interaction",
  action: { type: "CLICK", x: 400, y: 300 }
});

// Type username
socket.emit("message", {
  actionType: "interaction",
  action: { type: "TYPE", text: "[email protected]", humanLike: true }
});

// Tab to password field
socket.emit("message", {
  actionType: "interaction",
  action: { type: "KEY_PRESS", key: "Tab" }
});

// Type password
socket.emit("message", {
  actionType: "interaction",
  action: { type: "TYPE", text: "secretPassword123", humanLike: true }
});

// Submit
socket.emit("message", {
  actionType: "interaction",
  action: { type: "KEY_PRESS", key: "Enter" }
});

// Wait for 2FA prompt, user solves it manually via video stream

// Release control after 2FA is complete
socket.emit("message", {
  actionType: "interaction",
  action: { type: "releaseControl" }
});

// Open dropdown
socket.emit("message", {
  actionType: "interaction",
  action: { type: "CLICK", x: 500, y: 350 }
});

// Navigate with arrow keys
socket.emit("message", {
  actionType: "interaction",
  action: { type: "KEY_PRESS", key: "ArrowDown" }
});

socket.emit("message", {
  actionType: "interaction",
  action: { type: "KEY_PRESS", key: "ArrowDown" }
});

// Select
socket.emit("message", {
  actionType: "interaction",
  action: { type: "KEY_PRESS", key: "Enter" }
});

Responding to Guardrails

socket.on("message", (data) => {
  if (data.type === "guardrail_trigger" &&
      data.data.value.includes("CAPTCHA")) {

    // Take control for manual solving
    socket.emit("message", {
      actionType: "interaction",
      action: { type: "takeOverControl" }
    });

    // User solves CAPTCHA via video stream
    // Once done, release control and resume
    document.getElementById("captcha-solved").onclick = () => {
      socket.emit("message", {
        actionType: "interaction",
        action: { type: "releaseControl" }
      });

      socket.emit("message", {
        actionType: "guardrail",
        taskDetails: "CAPTCHA solved, continue",
        newState: "resume"
      });
    };
  }
});

Complete Example with Video Stream

import { io } from "socket.io-client";

// Create session
const session = await fetch("https://connect.webrun.ai/start/start-session", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${API_KEY}`
  },
  body: JSON.stringify({
    taskDetails: "Go to login page",
    startingUrl: "https://example.com/login"
  })
}).then(r => r.json());

// Setup video stream
const iframe = document.createElement("iframe");
iframe.src = session.streaming.webViewURL;
iframe.width = 1024;
iframe.height = 600;
document.body.appendChild(iframe);

// Connect WebSocket
const socket = io("https://connect.webrun.ai", {
  auth: { sessionId: session.sessionId },
  transports: ["websocket"]
});

// Take control when ready
socket.on("connect", () => {
  socket.emit("message", {
    actionType: "interaction",
    action: { type: "takeOverControl" }
  });
});

// Setup click handler
const canvas = document.getElementById("clickable-overlay");
canvas.addEventListener("click", (e) => {
  const rect = canvas.getBoundingClientRect();
  const coords = scaleCoords(
    e.clientX - rect.left,
    e.clientY - rect.top,
    rect.width,
    rect.height
  );

  socket.emit("message", {
    actionType: "interaction",
    action: { type: "CLICK", x: coords.x, y: coords.y }
  });
});

// Release control when done
document.getElementById("done-button").addEventListener("click", () => {
  socket.emit("message", {
    actionType: "interaction",
    action: { type: "releaseControl" }
  });

  // Continue with next task
  socket.emit("message", {
    actionType: "newTask",
    newState: "start",
    taskDetails: "Complete the checkout process"
  });
});

function scaleCoords(clientX, clientY, displayWidth, displayHeight) {
  return {
    x: Math.round((clientX / displayWidth) * 1024),
    y: Math.round((clientY / displayHeight) * 600) + 155
  };
}

Troubleshooting

IssueSolution
Clicks not registeringVerify Y offset (155px) is added to coordinates
Clicking wrong locationCheck coordinate scaling function
Typing not workingClick element first to ensure it’s focused
Actions happening too fastAdd delays between actions (500ms recommended)