What are Guardrails?

Guardrails are WebRun’s safety mechanism for human-in-the-loop control. When the AI agent encounters a situation requiring human judgment or sensitive information, it pauses the task and triggers a guardrail. This ensures the agent never:
  • Submits credentials without explicit permission
  • Makes purchases or financial transactions autonomously
  • Proceeds when instructions are ambiguous
  • Bypasses security challenges like CAPTCHAs

When Guardrails Trigger

Trigger Type | Example Scenario | Agent Response
Credentials Needed | Login form encountered | “I need login credentials to proceed”
Purchase Confirmation | Checkout page reached | “Do you want me to complete this purchase?”
CAPTCHA Detected | Security challenge appears | “A CAPTCHA is blocking me. Please solve it.”
Ambiguous Instruction | Multiple valid interpretations | “Which item should I click? There are several options.”
Security Warning | SSL error or warning page | “I encountered a security warning. Should I proceed?”

Guardrail Flow

When a guardrail is triggered, the workflow pauses and waits for human input:
Task Running
↓
Guardrail Triggered (task pauses)
↓
Human Provides Input
↓
Agent Resumes (task continues)
↓
Task Completes

Detection Methods

REST API (Polling):
{
  "type": "guardrail_trigger",
  "data": {
    "type": "human_input_needed",
    "value": "I need login credentials to proceed"
  }
}
WebSocket (Real-time):
socket.on("message", (data) => {
  if (data.type === "guardrail_trigger") {
    console.log("Guardrail:", data.data.value);
    // Handle guardrail response
  }
});
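
Both detection methods above inspect the same fields, so the check can live in one place. The following is a minimal sketch, not part of the WebRun API: `isGuardrailTrigger` and `guardrailPrompt` are hypothetical helper names, and the code relies only on the `type` and `data.value` fields shown in the payloads above.

```javascript
// Hypothetical helpers; only the field names come from the payloads shown above.
function isGuardrailTrigger(payload) {
  return Boolean(
    payload &&
    payload.type === "guardrail_trigger" &&
    payload.data &&
    typeof payload.data.value === "string"
  );
}

// Returns the agent's question, or null when the payload is not a guardrail.
function guardrailPrompt(payload) {
  return isGuardrailTrigger(payload) ? payload.data.value : null;
}
```

The same function can then be called on a polled REST result or inside a `socket.on("message", ...)` callback.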

Guardrail Response Format

Request from Agent

{
  "success": true,
  "type": "guardrail_trigger",
  "data": {
    "type": "human_input_needed",
    "value": "I need login credentials for this website"
  }
}

Response to Agent

Provide the requested information and resume the task.
REST:
POST /start/send-message
{
  "sessionId": "SESSION_ID",
  "message": {
    "actionType": "guardrail",
    "taskDetails": "Username: demo@example.com, Password: demo123",
    "newState": "resume"
  }
}
WebSocket:
socket.emit("message", {
  actionType: "guardrail",
  taskDetails: "Username: demo@example.com, Password: demo123",
  newState: "resume"
});
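
The REST body and the WebSocket payload above carry the same three message fields; REST additionally wraps them with the `sessionId`. A small sketch of building both from one place follows. The function names `buildGuardrailReply` and `buildRestBody` are assumptions made for illustration, not WebRun identifiers.

```javascript
// Hypothetical helper: builds the reply message documented above.
function buildGuardrailReply(taskDetails) {
  return { actionType: "guardrail", taskDetails, newState: "resume" };
}

// REST wraps the same message together with the session ID.
function buildRestBody(sessionId, taskDetails) {
  return { sessionId, message: buildGuardrailReply(taskDetails) };
}
```

With these, a WebSocket reply might be sent as `socket.emit("message", buildGuardrailReply("Yes, complete the purchase"))`, and a REST reply as `JSON.stringify(buildRestBody(sessionId, "Yes, complete the purchase"))`.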

Common Guardrail Types

1. Credentials Request

Trigger: Login form detected
Agent Message: “I need login credentials to proceed”
Response: Provide username and password
Example:
taskDetails: "Username: user@example.com, Password: secretpass123"

2. Purchase Confirmation

Trigger: Checkout or payment page reached
Agent Message: “Do you want me to complete this purchase? The total is $49.99”
Response: Confirm or deny
Example:
taskDetails: "Yes, complete the purchase"
// or
taskDetails: "No, stop and return to cart"

3. Ambiguous Choice

Trigger: Multiple valid options exist
Agent Message: “Which product should I select? I see 3 options with similar names”
Response: Clarify the choice
Example:
taskDetails: "Select the first one (Logitech K380)"

4. CAPTCHA or Security Challenge

Trigger: CAPTCHA appears
Agent Message: “A CAPTCHA is blocking me. Please solve it.”
Response: Either solve it manually or instruct the agent to skip it
Example:
taskDetails: "I've solved the CAPTCHA, please continue"
// or use manual takeover to solve it yourself

Policy-Triggered Guardrails

When a session has an automation policy attached, the policy can trigger guardrails automatically. If a policy rule is set to guardrail enforcement (rather than block), the session pauses and waits for your approval — exactly like a credential or CAPTCHA guardrail. Handle policy guardrails the same way you handle any other guardrail in your integration.

How Guardrails Behave

  • The agent asks for clarification rather than guessing.
  • Financial transactions, account changes, and data submissions always trigger a guardrail; the agent won’t act on these autonomously.
  • WebRun doesn’t store credentials. You provide them on demand when a guardrail triggers, or upfront using secrets.
  • You can also intervene manually at any point, even if no guardrail has triggered.

Handling Guardrails in Code

Basic Pattern (REST)

async function handleTask(sessionId, taskId, apiKey) {
  // Poll for result (pollForResult and promptUser are helpers you supply)
  const result = await pollForResult(sessionId, taskId, apiKey);

  if (result.type === "guardrail_trigger") {
    console.log("Guardrail:", result.data.value);

    // Get user input
    const userInput = await promptUser(result.data.value);

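    // Respond to guardrail
    const response = await fetch("https://connect.webrun.ai/start/send-message", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${apiKey}`
      },
      body: JSON.stringify({
        sessionId,
        message: {
          actionType: "guardrail",
          taskDetails: userInput,
          newState: "resume"
        }
      })
    });

    // Stop here if the reply was rejected, instead of polling a task that may still be paused
    if (!response.ok) {
      throw new Error(`Guardrail reply failed with status ${response.status}`);
    }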

    // Continue polling for completion
    return await pollForResult(sessionId, taskId, apiKey);
  }

  return result;
}
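
The basic pattern above answers one guardrail and then returns whatever the next poll yields. A task could plausibly pause more than once (for example, a login followed by a CAPTCHA), so a loop may be a better fit. The sketch below assumes three caller-supplied async functions, `poll`, `prompt`, and `reply`; these names and the `maxGuardrails` cap are illustrative and not part of WebRun.

```javascript
// Sketch only: poll, prompt, and reply are caller-supplied async functions (assumed names).
async function runUntilDone({ poll, prompt, reply, maxGuardrails = 10 }) {
  let result = await poll();
  let handled = 0;

  // Keep answering guardrails until the task returns something else.
  while (result.type === "guardrail_trigger") {
    if (handled >= maxGuardrails) {
      throw new Error(`Stopped after ${maxGuardrails} guardrails`);
    }
    const answer = await prompt(result.data.value);
    await reply(answer);
    handled += 1;
    result = await poll();
  }

  return result;
}
```

Here `poll` could wrap `pollForResult(sessionId, taskId, apiKey)` and `reply` could wrap the `fetch` call to `/start/send-message` shown above. The cap keeps a misbehaving automated handler from answering the same guardrail forever.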

Advanced Pattern (WebSocket)

function setupGuardrailHandler(socket) {
  socket.on("message", async (data) => {
    if (data.type === "guardrail_trigger") {
      const userInput = await promptUser(data.data.value);

      socket.emit("message", {
        actionType: "guardrail",
        taskDetails: userInput,
        newState: "resume"
      });
    }
  });
}
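
In the handler above, a failure inside `promptUser` (a closed dialog, a timeout in your own UI) would surface as an unhandled promise rejection. One possible way to contain that is sketched below. `createGuardrailListener`, `prompt`, `send`, and `onError` are hypothetical names chosen for this example; only the message fields come from the documentation above.

```javascript
// Sketch: prompt and send are caller-supplied (assumed names), e.g.
// send = (msg) => socket.emit("message", msg).
function createGuardrailListener({ prompt, send, onError = console.error }) {
  return async function onMessage(data) {
    if (!data || data.type !== "guardrail_trigger") return;
    try {
      const userInput = await prompt(data.data.value);
      send({ actionType: "guardrail", taskDetails: userInput, newState: "resume" });
    } catch (err) {
      // Report the failure and send nothing, rather than crash on an unhandled rejection.
      onError(err);
    }
  };
}
```

It could be attached with `socket.on("message", createGuardrailListener({ prompt: promptUser, send: (msg) => socket.emit("message", msg) }))`. Keeping the socket out of the listener also makes it easy to exercise with fake `prompt` and `send` functions.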

Automated Guardrail Handling

For common scenarios, you can build automated guardrail handlers:
const guardrailHandlers = {
  "login credentials": () => {
    return `Username: ${process.env.USERNAME}, Password: ${process.env.PASSWORD}`;
  },
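  "purchase": (message) => {
    // extractAmount is a helper you supply; decline if it finds no valid amount
    const amount = extractAmount(message);
    return Number.isFinite(amount) && amount < 100 ? "Yes, proceed" : "No, cancel";
  },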
  "captcha": () => {
    return "Skip this task, CAPTCHA detected";
  }
};

function handleGuardrail(message) {
  for (const [trigger, handler] of Object.entries(guardrailHandlers)) {
    if (message.toLowerCase().includes(trigger)) {
      return handler(message);
    }
  }
  // Default: ask a human. promptUser may be async, so callers should await handleGuardrail()
  return promptUser(message);
}
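
The `extractAmount` helper used in the purchase handler is not defined in this guide. One possible implementation is sketched below. It assumes the agent reports totals in a form like "$49.99" or "$1,299.00", as in the purchase example earlier; other currencies or formats would need their own parsing. `decidePurchase` is likewise a hypothetical name.

```javascript
// Hypothetical implementation of the extractAmount helper used above.
// Assumes the agent reports totals in the form "$49.99" or "$1,299.00".
function extractAmount(message) {
  const match = /\$\s*([0-9][0-9,]*(?:\.[0-9]+)?)/.exec(message);
  if (!match) return null;
  const amount = Number.parseFloat(match[1].replace(/,/g, ""));
  return Number.isFinite(amount) ? amount : null;
}

// Decline whenever no amount can be read: in JavaScript, null < 100 is true.
function decidePurchase(message, limit = 100) {
  const amount = extractAmount(message);
  return amount !== null && amount < limit ? "Yes, proceed" : "No, cancel";
}
```

The explicit `null` check matters because a bare `amount < limit` comparison would approve a purchase whose total could not be parsed.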

Avoiding Guardrails with Secrets

If you know the credentials the agent will need ahead of time, you can provide them upfront using the secrets parameter. This lets the agent authenticate automatically without triggering a guardrail or pausing the task.
{
  "initialTask": {
    "taskDetails": "Log in and export the report",
    "secrets": [
      {
        "match": "*.example.com",
        "fields": { "email": "user@example.com", "password": "pass123" }
      }
    ]
  }
}
Secrets are never stored: they exist only in the session’s memory and are discarded when the session ends. See the Secrets guide.
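
To keep credentials out of source code, the request body above can be assembled at runtime. The sketch below is an illustration only: `buildInitialTask` is a hypothetical helper, and it produces just the JSON body shown above, using the `match` and `fields` keys from that example.

```javascript
// Sketch: builds the request body shown above from values supplied at runtime.
function buildInitialTask(taskDetails, { match, email, password }) {
  if (!email || !password) {
    // Fail early instead of starting a task that will pause on a login guardrail.
    throw new Error("Missing credentials; refusing to build secrets");
  }
  return {
    initialTask: {
      taskDetails,
      secrets: [{ match, fields: { email, password } }]
    }
  };
}
```

A caller might pass `process.env.REPORT_EMAIL` and `process.env.REPORT_PASSWORD` (assumed variable names) for `email` and `password`, with `match` set to `"*.example.com"`.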
