# File I/O

> Downloads, uploads, and manipulating the browser's filesystem

## Downloads

Kernel browsers run in fully sandboxed environments with writable filesystems. When your automation downloads a file, it's saved inside the browser's filesystem and can be retrieved using Kernel's File I/O APIs.

<Warning>
  Files can only be retrieved while the browser session is still active. Once the browser session is destroyed or times out, all files from that session are permanently deleted and no longer accessible.
</Warning>
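
Because deleting the session also deletes its files, it can help to scope every file read inside the session's lifetime and to delete the browser in exactly one place. The following is a minimal Python sketch, assuming the same `kernel.browsers.create()` and `kernel.browsers.delete_by_id()` calls used in the examples on this page. `browser_session` is a hypothetical helper name, not part of the SDK.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(kernel):
    """Yield a Kernel browser and delete it on exit, even if the body raises.

    `kernel` is assumed to be a `Kernel()` client as in the examples on this
    page. Any file you need must be read inside the `with` block, because
    deleting the session also deletes its files.
    """
    kernel_browser = kernel.browsers.create()
    try:
        yield kernel_browser
    finally:
        # Hypothetical helper: cleanup runs on success and on failure alike
        kernel.browsers.delete_by_id(kernel_browser.session_id)
```

Used as `with browser_session(kernel) as kernel_browser: ...`, each `read_file` call sits inside the block, so the session cannot be deleted before the file has been saved locally.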

### Playwright

Playwright performs downloads via the browser itself, so there are a few steps:

* Create a browser session
* Configure browser download behavior using CDP
* Perform the download
* Retrieve the file from the browser's filesystem

<Info>
  With `behavior: 'default'`, user-initiated downloads (e.g., clicking a download link) are saved to the browser's default download directory. The CDP `downloadProgress` event includes a `filePath` field when the download completes, indicating exactly where the file was saved. This path can then be used with Kernel's File I/O APIs to retrieve the file.
</Info>

<Info>
  If the download is **programmatically initiated** (e.g., triggered via JavaScript rather than a user click), `behavior` must be set to `"allow"` with a specified `downloadPath`. Downloads triggered programmatically are blocked by default in Chromium, so `"default"` won't work.

  ```javascript theme={null}
  await client.send("Browser.setDownloadBehavior", {
    behavior: "allow",
    downloadPath: "/tmp/downloads",
    eventsEnabled: true, // keep Browser.downloadProgress events firing
  });
  ```
</Info>
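
The two notes above come down to one decision: user-initiated downloads can keep `"default"`, while programmatic downloads need `"allow"` plus an explicit directory. A small Python sketch of that decision follows, assuming the CDP `Browser.setDownloadBehavior` command and its `behavior`, `downloadPath`, and `eventsEnabled` parameters. `download_behavior_params` is a hypothetical helper name introduced here for illustration.

```python
def download_behavior_params(programmatic, download_path=None):
    """Build parameters for the CDP `Browser.setDownloadBehavior` command.

    programmatic=False: user-initiated downloads, saved to the browser's
    default download directory (behavior "default").
    programmatic=True: JavaScript-triggered downloads, which need behavior
    "allow" together with an explicit download directory.
    """
    if not programmatic:
        return {"behavior": "default", "eventsEnabled": True}
    if not download_path:
        raise ValueError("download_path is required for programmatic downloads")
    return {
        "behavior": "allow",
        "downloadPath": download_path,
        # Keep download events enabled so completion can still be observed
        "eventsEnabled": True,
    }
```

With a Playwright CDP session this could be passed as `await cdp_session.send("Browser.setDownloadBehavior", download_behavior_params(True, "/tmp/downloads"))`.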

<CodeGroup>
  ```typescript Typescript/Javascript theme={null}
  import Kernel from '@onkernel/sdk';
  import { chromium } from 'playwright';
  import fs from 'fs';
  import path from 'path';
  import pTimeout from 'p-timeout';

  const kernel = new Kernel();

  // Poll listFiles until the expected file appears in the directory
  async function waitForFile(
    sessionId: string,
    filePath: string,
    timeoutMs = 30_000
  ) {
    const dir = path.dirname(filePath);
    const filename = path.basename(filePath);
    const start = Date.now();
    while (Date.now() - start < timeoutMs) {
      const files = await kernel.browsers.fs.listFiles(sessionId, { path: dir });
      if (files.some((f) => f.name === filename)) {
        return;
      }
      await new Promise((r) => setTimeout(r, 500));
    }
    throw new Error(`File ${filePath} not found after ${timeoutMs}ms`);
  }

  async function main() {
    const kernelBrowser = await kernel.browsers.create();
    console.log('live view:', kernelBrowser.browser_live_view_url);

    const browser = await chromium.connectOverCDP(kernelBrowser.cdp_ws_url);
    const context = browser.contexts()[0] || (await browser.newContext());
    const page = context.pages()[0] || (await context.newPage());

    const client = await context.newCDPSession(page);
    await client.send('Browser.setDownloadBehavior', {
      behavior: 'default',
      eventsEnabled: true,
    });

    // Set up CDP listeners to capture download path and completion
    let downloadFilePath: string | undefined;
    let downloadState: string | undefined;
    let downloadCompletedResolve!: () => void;
    const downloadCompleted = new Promise<void>((resolve) => {
      downloadCompletedResolve = resolve;
    });

    client.on('Browser.downloadWillBegin', (event) => {
      console.log('Download started:', event.suggestedFilename);
    });

    client.on('Browser.downloadProgress', (event) => {
      if (event.state === 'completed' || event.state === 'canceled') {
        downloadState = event.state;
        downloadFilePath = event.filePath;
        downloadCompletedResolve();
      }
    });

    console.log('Navigating to download test page');
    await page.goto('https://browser-tests-alpha.vercel.app/api/download-test');
    await page.getByRole('link', { name: 'Download File' }).click();

    try {
      await pTimeout(downloadCompleted, {
        milliseconds: 10_000,
        message: new Error('Download timed out after 10 seconds'),
      });
      console.log('Download completed');
    } catch (err) {
      console.error(err);
      throw err;
    }

    if (downloadState === 'canceled') {
      throw new Error('Download was canceled');
    }

    if (!downloadFilePath) {
      throw new Error('Unable to determine download file path');
    }

    // Wait for the file to be available via Kernel's File I/O APIs
    console.log(`Waiting for file: ${downloadFilePath}`);
    await waitForFile(kernelBrowser.session_id, downloadFilePath);

    console.log(`Reading file: ${downloadFilePath}`);

    const resp = await kernel.browsers.fs.readFile(kernelBrowser.session_id, {
      path: downloadFilePath,
    });

    const bytes = await resp.bytes();
    fs.mkdirSync('downloads', { recursive: true });
    const localPath = `downloads/${path.basename(downloadFilePath)}`;
    fs.writeFileSync(localPath, bytes);
    console.log(`Saved to ${localPath}`);

    await kernel.browsers.deleteByID(kernelBrowser.session_id);
    console.log('Kernel browser deleted successfully.');
  }

  main().catch((err) => {
    console.error(err);
    process.exit(1);
  });
  ```

  ```python Python theme={null}
  import asyncio
  import os
  from pathlib import Path
  import time
  from kernel import Kernel
  from playwright.async_api import async_playwright

  kernel = Kernel()


  # Poll list_files until the expected file appears in the directory
  async def wait_for_file(
      session_id: str, file_path: str, timeout_sec: float = 30
  ):
      dir_path = str(Path(file_path).parent)
      filename = Path(file_path).name
      start = time.monotonic()  # Monotonic clock is unaffected by system clock changes
      while time.monotonic() - start < timeout_sec:
          files = kernel.browsers.fs.list_files(session_id, path=dir_path)
          if any(f.name == filename for f in files):
              return
          await asyncio.sleep(0.5)
      raise TimeoutError(f"File {file_path} not found after {timeout_sec}s")


  async def main():
      kernel_browser = kernel.browsers.create()
      print("Kernel browser live view url:", kernel_browser.browser_live_view_url)

      async with async_playwright() as playwright:
          browser = await playwright.chromium.connect_over_cdp(kernel_browser.cdp_ws_url)
          context = browser.contexts[0] if browser.contexts else await browser.new_context()
          page = context.pages[0] if len(context.pages) > 0 else await context.new_page()

          cdp_session = await context.new_cdp_session(page)
          await cdp_session.send(
              "Browser.setDownloadBehavior",
              {
                  "behavior": "default",
                  "eventsEnabled": True,
              },
          )

          download_completed = asyncio.Event()
          download_file_path: str | None = None
          download_state: str | None = None

          def _on_download_begin(event):
              print(f"Download started: {event.get('suggestedFilename', 'unknown')}")

          def _on_download_progress(event):
              nonlocal download_state, download_file_path
              if event.get("state") in ["completed", "canceled"]:
                  download_state = event.get("state")
                  download_file_path = event.get("filePath")
                  download_completed.set()

          cdp_session.on("Browser.downloadWillBegin", _on_download_begin)
          cdp_session.on("Browser.downloadProgress", _on_download_progress)

          print("Navigating to download test page")
          await page.goto("https://browser-tests-alpha.vercel.app/api/download-test")
          await page.get_by_role("link", name="Download File").click()

          try:
              await asyncio.wait_for(download_completed.wait(), timeout=10)
              print("Download completed")
          except asyncio.TimeoutError:
              print("Download timed out after 10 seconds")
              raise

          if download_state == "canceled":
              raise RuntimeError("Download was canceled")

          if not download_file_path:
              raise RuntimeError("Unable to determine download file path")

          # Wait for the file to be available via Kernel's File I/O APIs
          print(f"Waiting for file: {download_file_path}")
          await wait_for_file(kernel_browser.session_id, download_file_path)

          resp = kernel.browsers.fs.read_file(
              kernel_browser.session_id, path=download_file_path
          )
          local_path = f"./downloads/{Path(download_file_path).name}"
          os.makedirs("./downloads", exist_ok=True)
          resp.write_to_file(local_path)
          print(f"Saved to {local_path}")

          kernel.browsers.delete_by_id(kernel_browser.session_id)
          print("Kernel browser deleted successfully.")


  if __name__ == "__main__":
      asyncio.run(main())
  ```
</CodeGroup>

### Stagehand v3

When using Stagehand with Kernel browsers, you need to configure the download behavior in the `localBrowserLaunchOptions`:

```typescript theme={null}
const stagehand = new Stagehand({
  env: "LOCAL",
  verbose: 1,
  localBrowserLaunchOptions: {
    cdpUrl: kernelBrowser.cdp_ws_url,
    downloadsPath: DOWNLOAD_DIR, // Specify where downloads should be saved
    acceptDownloads: true, // Enable downloads
  },
});
```

Here's a complete example:

```typescript theme={null}
import { Stagehand } from "@browserbasehq/stagehand";
import Kernel from "@onkernel/sdk";
import fs from "fs";

const DOWNLOAD_DIR = "/tmp/downloads";

// Poll listFiles until a finished download appears in the directory
async function waitForFile(
    kernel: Kernel,
    sessionId: string,
    dir: string,
    timeoutMs = 30_000
) {
    const start = Date.now();
    while (Date.now() - start < timeoutMs) {
        const files = await kernel.browsers.fs.listFiles(sessionId, { path: dir });
        // Skip Chromium's in-progress temp files (*.crdownload)
        const finished = files.find((f) => !f.name.endsWith(".crdownload"));
        if (finished) {
            return finished;
        }
        await new Promise((r) => setTimeout(r, 500));
    }
    throw new Error(`No files found in ${dir} after ${timeoutMs}ms`);
}

async function main() {
    const kernel = new Kernel();

    console.log("Creating browser via Kernel...");
    const kernelBrowser = await kernel.browsers.create({
        stealth: true,
    });

    console.log(`Kernel Browser Session Started`);
    console.log(`Session ID: ${kernelBrowser.session_id}`);
    console.log(`Watch live: ${kernelBrowser.browser_live_view_url}`);

    // Initialize Stagehand with Kernel's CDP URL and download configuration
    const stagehand = new Stagehand({
        env: "LOCAL",
        verbose: 1,
        localBrowserLaunchOptions: {
            cdpUrl: kernelBrowser.cdp_ws_url,
            downloadsPath: DOWNLOAD_DIR,
            acceptDownloads: true,
        },
    });

    await stagehand.init();

    const page = stagehand.context.pages()[0];

    await page.goto("https://browser-tests-alpha.vercel.app/api/download-test");

    // Use Stagehand to click the download button
    await stagehand.act("Click the download file link");
    console.log("Download triggered");

    // Wait for the file to be fully available via Kernel's File I/O APIs
    console.log("Waiting for file to appear...");
    const downloadedFile = await waitForFile(
        kernel,
        kernelBrowser.session_id,
        DOWNLOAD_DIR
    );
    console.log(`File found: ${downloadedFile.name}`);

    const remotePath = `${DOWNLOAD_DIR}/${downloadedFile.name}`;
    console.log(`Reading file from: ${remotePath}`);

    // Read the file from Kernel browser's filesystem
    const resp = await kernel.browsers.fs.readFile(kernelBrowser.session_id, {
        path: remotePath,
    });

    // Save to local filesystem
    const bytes = await resp.bytes();
    fs.mkdirSync("downloads", { recursive: true });
    const localPath = `downloads/${downloadedFile.name}`;
    fs.writeFileSync(localPath, bytes);
    console.log(`Saved to ${localPath}`);

    // Clean up
    await stagehand.close();
    await kernel.browsers.deleteByID(kernelBrowser.session_id);
    console.log("Browser session closed");
}

main().catch((err) => {
    console.error(err);
    process.exit(1);
});
```

### Browser Use

Browser Use handles downloads automatically when configured properly.

## Uploads

Playwright's `setInputFiles()` method allows you to upload files directly to file input elements. You can fetch a file from a URL and pass the buffer directly to `setInputFiles()`.

<CodeGroup>
  ```typescript Typescript/Javascript theme={null}
  import Kernel from '@onkernel/sdk';
  import { chromium } from 'playwright';

  const IMAGE_URL = 'https://www.kernel.sh/brand_assets/Kernel-Logo_Accent.png';
  const kernel = new Kernel();

  async function main() {
      // Create Kernel browser session
      const kernelBrowser = await kernel.browsers.create();
      console.log('Live view:', kernelBrowser.browser_live_view_url);

      // Connect Playwright
      const browser = await chromium.connectOverCDP(kernelBrowser.cdp_ws_url);
      const context = browser.contexts()[0] || (await browser.newContext());
      const page = context.pages()[0] || (await context.newPage());

      // Navigate to a page with a file input
      await page.goto('https://browser-tests-alpha.vercel.app/api/upload-test');

      // Fetch file and pass buffer directly to setInputFiles
      const response = await fetch(IMAGE_URL);
      if (!response.ok) {
          throw new Error(`Failed to fetch ${IMAGE_URL}: ${response.status}`);
      }
      const buffer = Buffer.from(await response.arrayBuffer());

      await page.locator('input[type="file"]').setInputFiles([{
          name: 'Kernel-Logo_Accent.png',
          mimeType: 'image/png',
          buffer: buffer,
      }]);
      console.log('File uploaded');

      await kernel.browsers.deleteByID(kernelBrowser.session_id);
      console.log('Browser deleted');
  }

  main().catch((err) => {
      console.error(err);
      process.exit(1);
  });
  ```

  ```python Python theme={null}
  import asyncio
  import httpx
  from kernel import Kernel
  from playwright.async_api import async_playwright

  IMAGE_URL = 'https://www.kernel.sh/brand_assets/Kernel-Logo_Accent.png'
  kernel = Kernel()


  async def main():
      # Create Kernel browser session
      kernel_browser = kernel.browsers.create()
      print(f'Live view: {kernel_browser.browser_live_view_url}')

      async with async_playwright() as playwright:
          # Connect Playwright
          browser = await playwright.chromium.connect_over_cdp(kernel_browser.cdp_ws_url)
          context = browser.contexts[0] if browser.contexts else await browser.new_context()
          page = context.pages[0] if context.pages else await context.new_page()

          # Navigate to a page with a file input
          await page.goto('https://browser-tests-alpha.vercel.app/api/upload-test')

          # Fetch file and pass buffer directly to set_input_files
          async with httpx.AsyncClient() as client:
              response = await client.get(IMAGE_URL)
              response.raise_for_status()  # Fail fast on a non-2xx response
              buffer = response.content

          await page.locator('input[type="file"]').set_input_files([{
              'name': 'Kernel-Logo_Accent.png',
              'mimeType': 'image/png',
              'buffer': buffer,
          }])
          print('File uploaded')

          await browser.close()

      kernel.browsers.delete_by_id(kernel_browser.session_id)
      print('Browser deleted')


  if __name__ == '__main__':
      asyncio.run(main())
  ```
</CodeGroup>
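
The examples above hard-code the file name and MIME type. When the source URL varies, both can be derived from it. The sketch below uses only the Python standard library and produces the same `name` / `mimeType` / `buffer` payload shape passed to `set_input_files` above. `build_file_payload` is a hypothetical helper name, and the `application/octet-stream` fallback is an assumption rather than anything Kernel or Playwright requires.

```python
import mimetypes
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_file_payload(url, data):
    """Build a Playwright file payload from a source URL and its bytes.

    The file name is taken from the last segment of the URL path, and the
    MIME type is guessed from the extension, falling back to a generic
    binary type when the extension is unknown.
    """
    name = PurePosixPath(urlparse(url).path).name
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    mime_type, _ = mimetypes.guess_type(name)
    return {
        "name": name,
        "mimeType": mime_type or "application/octet-stream",
        "buffer": data,
    }
```

In the Python upload example, `set_input_files([build_file_payload(IMAGE_URL, buffer)])` would then replace the hand-written dictionary.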

## Considerations

* The CDP `downloadProgress` event signals when the browser finishes writing a file, but there may be a brief delay before the file becomes available through Kernel's File I/O APIs. This is especially true for larger downloads. We recommend polling `listFiles` to confirm the file exists before attempting to read it.
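
The polling recommendation above can be sketched independently of any SDK call. The helper below takes a zero-argument callable that returns the file names currently visible in the target directory, so it runs without a Kernel session. `wait_for_name` and its parameters are hypothetical names chosen for this illustration.

```python
import time


def wait_for_name(list_names, filename, timeout_sec=30.0, interval_sec=0.5):
    """Poll `list_names()` until `filename` appears or the timeout elapses.

    `list_names` is any zero-argument callable returning the file names
    currently visible in the target directory. A monotonic clock is used so
    system clock adjustments cannot shorten or extend the wait.
    Returns the number of polls that were needed.
    """
    deadline = time.monotonic() + timeout_sec
    polls = 0
    while True:
        polls += 1
        if filename in list_names():
            return polls
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{filename} not found after {timeout_sec}s")
        time.sleep(interval_sec)
```

With the Python SDK calls shown earlier on this page, the callable could be `lambda: [f.name for f in kernel.browsers.fs.list_files(session_id, path=dir_path)]`. Note that `time.sleep` blocks the thread, so inside `asyncio` code an `await asyncio.sleep(...)` variant, as in the download example, is the better fit.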
