π Physical Intelligence × Cloudflare

The infrastructure layer for physical AI

Physical Intelligence is building the GPT moment for robotics — general-purpose foundation models that control any robot to do any task. This brief maps their infrastructure requirements to where Cloudflare adds concrete value across model distribution, inference APIs, robot connectivity, and partner security.

Active research + commercial partnerships
Founded 2023 · San Francisco
Backed by Bezos, OpenAI, Sequoia, CapitalG

Physical Intelligence, explained

PI's bet: the robotics industry needs a foundation model layer the way the software industry needed LLMs. Build it once, and every robotics company builds on top of it.

Core Thesis

The Physical Intelligence Layer

Today, building a robotic system means constructing the entire stack from scratch — controllers, data pipelines, training infrastructure, and the model itself. PI is building the equivalent of GPT-as-an-API for robots: a reusable physical intelligence layer that any robotics company can access, fine-tune, and deploy against their own hardware and tasks.

Their vision explicitly mirrors the LLM API model: "You wouldn't start by training foundation models… you would just make API calls to existing foundation models." That same sentence, applied to robotics, describes PI's product roadmap.

Vision-Language-Action Models · Foundation Models · API-first Roadmap · Cross-Embodiment
Model Lineage

π0 → π0.7

π0
Oct 2024 · First generalist policy

Multi-task, multi-robot architecture. Open-sourced Feb 2025.

π0.5
Apr 2025 · Open-world generalization

Controls mobile manipulators in entirely unseen environments.

π0.6 / π*0.6
Nov 2025 · RL-optimized specialists

RL fine-tuning (Recap) for throughput and robustness on real tasks.

π0.7
Apr 2026 · Steerable generalist

Compositional generalization. Matches specialist fine-tunes out of the box. Cross-embodiment transfer.

Live Deployments

Partners in production

Weave Robotics

Laundry folding in San Francisco laundromats. π0.6 reduced missed grasps by 42% and interventions by 50% vs. π0.5.

Ultra

E-commerce order packaging in live US warehouses, scaling toward hundreds of deployments. 96.4% autonomy demonstrated in production.

Models run on real customer infrastructure, packing real orders every day — this is production, not demo.

Three distinct workloads

Understanding what PI actually runs — from GPU training pipelines to on-robot inference to partner-facing APIs — is the foundation for mapping where Cloudflare fits.

01

Research & Training

Massively parallel GPU training runs for VLA models at scale. Multi-robot data collection pipelines — camera feeds, proprioception logs, teleoperation video — ingested and stored continuously. Model weights published publicly (π0 open-sourced) and shared privately with partners as fine-tuned checkpoints.

Multi-GB checkpoint artifacts · Multi-robot sensor data · Public + private weight distribution · RL training pipelines
02

Edge / On-Robot Inference

VLA model inference must generate motor actions within the tight latency budget of physical control loops. PI's published "Real-Time Action Chunking" research directly addresses running large models at robot reaction speeds. Inference may run on-robot, on nearby edge compute, or via cloud API calls — each with different connectivity and latency requirements.

Real-time latency constraints · On-robot or edge compute · Cloud API fallback paths · Venue / warehouse networks
03

Partner-Facing Platform

Partners submit proprietary robot data for pre-training inclusion. They receive fine-tuned model checkpoints back. They operate those models in live production environments on real customer hardware. This implies: model artifact storage, per-partner authentication, data ingestion pipelines, human-in-the-loop intervention systems, and eventually a public API surface.

Per-partner access control · Checkpoint distribution · Human-in-the-loop remote ops · Future: inference API

Four high-value areas

PI's three infrastructure workloads map to four distinct Cloudflare product clusters. Each solves a real, current problem — not a hypothetical future one.

01

Model & Dataset Distribution

R2 + Workers


PI publishes multi-gigabyte model weights publicly (π0 open-sourced) and distributes fine-tuned partner checkpoints privately. On AWS S3 or GCS, distributing a 10GB checkpoint file to partners globally generates substantial egress charges per download — and those charges scale linearly with how many partners are pulling, how often they update, and how many robots each partner runs. R2 eliminates that entirely: $0 egress regardless of how many times a checkpoint is downloaded globally.

A Worker sits in front of each download: it validates a per-partner JWT, confirms the partner is authorized to access that specific model version, generates a time-limited signed R2 URL, and logs the access event — all in a single edge request with no separate auth service required.
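The authorization step of that flow can be sketched in plain TypeScript. This is a model of the decision logic, not a full Worker: the partner IDs, model-version names, and grant table are all invented, and in production the signed URL itself would be generated via R2's S3-compatible presigning rather than computed here.

```typescript
// Illustrative authorization check a Worker could run before issuing a
// short-lived signed R2 URL. Partner IDs, model versions, and the grant
// table are invented for this sketch.

interface PartnerClaims {
  sub: string; // partner id from the already-validated JWT
  exp: number; // token expiry (unix seconds)
}

// Hypothetical grant table: which model versions each partner may pull.
const MODEL_GRANTS: Record<string, string[]> = {
  weave: ["pi-0.5", "pi-0.6"],
  ultra: ["pi-0.6", "pi-0.7"],
};

const SIGNED_URL_TTL_SECONDS = 600; // e.g. a 10-minute download window

function authorizeDownload(
  claims: PartnerClaims,
  modelVersion: string,
  nowSeconds: number,
): { allowed: boolean; signedUrlExpiry?: number } {
  if (claims.exp <= nowSeconds) return { allowed: false }; // expired token
  const grants = MODEL_GRANTS[claims.sub] ?? [];
  if (!grants.includes(modelVersion)) return { allowed: false }; // no grant
  return { allowed: true, signedUrlExpiry: nowSeconds + SIGNED_URL_TTL_SECONDS };
}
```

The same shape extends naturally to the logging half of the flow: on every allowed request, the Worker can append the (partner, version, timestamp) tuple as an access event.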

$0 egress on multi-GB checkpoint distribution globally
Per-partner access control with time-limited signed URLs — no auth service
S3-compatible API — drop-in for existing training pipeline tooling
Checkpoint distribution cost
AWS S3 egress (10 GB × 100 partner downloads/mo): ~$90/mo, scaling with model size and partner count
Cloudflare R2 egress: $0, regardless of download volume
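The arithmetic behind that comparison, under the commonly cited assumption of roughly $0.09/GB for S3 internet egress (actual pricing is tiered and region-dependent):

```typescript
// Back-of-envelope version of the cost comparison. The S3 rate is an
// assumption (~$0.09/GB internet egress); real pricing is tiered.
const GB_PER_CHECKPOINT = 10;
const DOWNLOADS_PER_MONTH = 100; // partners × update frequency
const S3_EGRESS_USD_PER_GB = 0.09;

const s3MonthlyUsd = GB_PER_CHECKPOINT * DOWNLOADS_PER_MONTH * S3_EGRESS_USD_PER_GB;
const r2MonthlyUsd = 0; // R2 charges no egress at any volume

console.log(s3MonthlyUsd.toFixed(0), "vs", r2MonthlyUsd); // ≈ 90 vs 0
```

Every new partner, every extra model refresh, and every additional robot pulling weights multiplies the first number; the second stays at zero.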
Access control flow
Partner robot → (JWT) → Worker (validates, signs URL) → R2 (checkpoint)
02

Inference API Gateway

AI Gateway + Workers


PI's stated long-term product is exactly the LLM API model applied to robotics — partners call an API, the physical intelligence is already there. As that API surface materializes, every inference request from a partner's robot needs: rate limiting per partner tier, per-request logging and observability, cost accounting by API consumer, and fallback routing if an inference backend is degraded.

Cloudflare AI Gateway provides all of this as a managed proxy layer — sitting in front of PI's inference backends (whether self-hosted GPU clusters or third-party inference services). Caching identical prompts reduces redundant inference costs. Workers handle per-partner authentication, route to the correct model version, and enforce usage quotas — all at the edge before a request ever reaches PI's GPU infrastructure.
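AI Gateway exposes rate limiting as configuration rather than code, but the quota behavior it enforces per partner tier can be modeled as a token bucket. A sketch with invented tier names and limits:

```typescript
// Token-bucket model of per-partner-tier rate limiting. Tier names and
// limits are invented; AI Gateway provides this behavior as configuration,
// so this is a model of the idea, not the product's API.

type Tier = "pilot" | "production";

const TIER_LIMITS: Record<Tier, number> = {
  pilot: 60, // requests per minute
  production: 600,
};

class PartnerBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private tier: Tier, nowMs: number) {
    this.tokens = TIER_LIMITS[tier]; // start with a full burst allowance
    this.lastRefillMs = nowMs;
  }

  // Admit the request if a token is available; refill continuously.
  tryAcquire(nowMs: number): boolean {
    const perMinute = TIER_LIMITS[this.tier];
    const refill = ((nowMs - this.lastRefillMs) / 60_000) * perMinute;
    this.tokens = Math.min(perMinute, this.tokens + refill);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A "pilot" partner bursting 100 requests in the same instant would see 60 admitted and 40 rejected, with capacity refilling over the following minute; none of those rejected requests ever touches a GPU.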

Rate limiting, logging, and cost accounting per partner tier
Model fallback if inference backend is unavailable
Response caching for identical prompts — reduces GPU utilization
API gateway layer
Weave robots · Ultra robots · Partner N → AI Gateway (rate limit · log · cache · fallback) → π0.7 inference
03

Robot-to-Cloud Connectivity

Cloudflare Tunnel + Zero Trust + Realtime


Deployed robots at customer sites — warehouses like Ultra's, laundromats like Weave's — need to communicate with PI's cloud infrastructure to pull updated model weights, send telemetry, and receive intervention from a human operator. Exposing each robot or edge server with a public IP is a security and operational liability.

Cloudflare Tunnel gives each robot a secure outbound-only connection to Cloudflare's network — no public IP, no firewall rules to manage at the customer site. Zero Trust Access authenticates each robot as a device identity, ensuring only enrolled robots can reach PI's infrastructure. For Ultra's explicit human-in-the-loop intervention system — where remote operators step in when the model stumbles — Cloudflare Realtime SFU + TURN provides the WebRTC infrastructure for low-latency video and control through any network, including warehouse and laundromat WiFi that would block direct UDP.
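On requests arriving through the tunnel, Zero Trust Access attaches a signed JWT (the Cf-Access-Jwt-Assertion header) that PI's origin can verify. A sketch of the enrollment check that would run after signature verification (the audience tag, robot identities, and enrollment set are invented for illustration):

```typescript
// Enrollment check on Access JWT claims, assumed to run after the token's
// signature has already been verified. The audience tag and robot IDs
// below are invented for this sketch.

interface AccessClaims {
  sub: string; // per-robot service identity
  aud: string; // Access application audience tag
  exp: number; // unix seconds
}

const EXPECTED_AUD = "pi-robot-fleet"; // assumption: one Access app for the fleet
const ENROLLED_ROBOTS = new Set(["ultra-wh3-robot-012", "weave-sf-robot-004"]);

function isEnrolledRobot(claims: AccessClaims, nowSeconds: number): boolean {
  return (
    claims.exp > nowSeconds &&
    claims.aud === EXPECTED_AUD &&
    ENROLLED_ROBOTS.has(claims.sub)
  );
}
```

Revoking a robot is then a one-line change to the enrollment set (or, in practice, to the Access policy) rather than a firewall change at a customer site.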

No public IPs on deployed robots — outbound-only Tunnel connections
Per-robot device identity via Zero Trust — no network-level trust
Human-in-the-loop WebRTC through any venue network via TURN over TLS on port 443
Robot secure connectivity
Customer-site robot → Tunnel (outbound only) → Cloudflare (Zero Trust · Tunnel · Realtime) → PI Cloud
The human-in-the-loop operator also connects via the Realtime SFU.
04

Research Site & Partner Portal

Pages + Workers AI + Images + WAF


pi.website is a research publication site with embedded video, high-resolution robot footage, and dense figures from papers. Model launch announcements — especially the π0 open-source release and the π0.7 compositional generalization post — generate significant traffic spikes when shared on academic Twitter/X and HackerNews. Cloudflare Pages with global CDN delivers all static assets from the edge; traffic spikes are absorbed with no origin scaling required.

Cloudflare Images handles on-the-fly resizing and WebP conversion of research figures for different devices. WAF + bot management protects model download endpoints from scraper abuse and prevents hotlinking of research video assets. As PI moves to a partner portal — where Weave and Ultra log in to view model performance metrics, submit data, and download checkpoints — Workers + D1 provide the backend for authenticated session state and access logs without running a dedicated application server.
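As a sketch of that portal's audit trail, here is a hypothetical D1 access-log row and the parameterized insert a Worker might build for it. The schema and field names are invented; in a Worker the statement would execute via env.DB.prepare(sql).bind(...params).run().

```typescript
// Hypothetical audit-log row for the partner portal, plus the
// parameterized D1 insert a Worker might build for it. Schema invented.

interface AccessLogRow {
  partnerId: string;
  modelVersion: string;
  action: "view_metrics" | "submit_data" | "download_checkpoint";
  atSeconds: number; // unix seconds
}

function buildLogInsert(row: AccessLogRow): { sql: string; params: (string | number)[] } {
  return {
    sql:
      "INSERT INTO access_log (partner_id, model_version, action, at) " +
      "VALUES (?1, ?2, ?3, ?4)",
    params: [row.partnerId, row.modelVersion, row.action, row.atSeconds],
  };
}
```

Keeping the log in D1 gives PI a queryable, per-partner audit trail with no dedicated application server behind it.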

Traffic spike resilience on model launch days — no origin scaling needed
WAF + bot management protects download endpoints from scraper abuse
Partner portal: Workers + D1 for serverless auth and access logs
pi.website traffic pattern
Spikes at the π0 open-source release and the π0.7 launch; Pages CDN absorbs them, so the origin never overloads.

PI need → Cloudflare product

Every PI infrastructure requirement mapped to the Cloudflare product that addresses it — and the specific mechanism of value.

PI Requirement Cloudflare Product Specific Value Priority
Model weight & checkpoint distribution
R2 + Workers
$0 egress on multi-GB artifacts; signed per-partner time-limited URLs
Highest
Inference API gateway (partner model calls)
AI Gateway
Rate limiting, logging, caching, model fallback per partner tier
Highest
Per-partner authentication & routing
Workers + Access
JWT validation, per-tenant model version routing at the edge
High
Deployed robot cloud connectivity
Tunnel + Zero Trust
Outbound-only connection; no public IPs; per-robot device identity
High
Human-in-the-loop remote intervention
Realtime SFU + TURN
WebRTC through any venue network; TURN over TLS on port 443 traverses restrictive firewalls
High
Research site CDN & media delivery
Pages + Images
Global CDN; traffic spike absorption; on-the-fly image optimization
Medium
Download endpoint & scraper protection
WAF + Bot Management
Block abusive scraping of model weights; protect research video assets
Medium
Partner portal (metrics, data submit, checkpoint access)
Pages + Workers + D1
Git-deployed portal; serverless auth state; no dedicated app server
Medium
Robotics data pipeline security
WAF + Zero Trust
Protect sensor data ingest endpoints; authenticate partner upload sessions
Emerging
Recommended Wedge

Start with R2 + AI Gateway

The highest-priority entry point is the combination that solves the two most immediate, concrete problems PI faces as they scale from research to multi-partner commercial deployment:

R2 for model distribution

The moment PI starts distributing fine-tuned checkpoints to Weave, Ultra, and additional partners, S3 egress becomes a real line item that grows with every new partner and every model update. R2 removes it entirely, and Workers provide the per-partner access control layer with no additional auth infrastructure.

Immediate outcome: Eliminate egress charges on checkpoint distribution and enforce partner-level access control with a single Worker.

AI Gateway for inference APIs

PI's entire product thesis is converging on "call our model like an LLM API." When that API surface opens up — even internally to partners — every request needs rate limiting, logging, and cost accounting. AI Gateway provides this immediately as a managed proxy with no additional infrastructure, and it's already designed for the multi-model, multi-provider patterns PI will use.

Immediate outcome: Full observability and per-partner rate limiting on inference traffic from day one of the partner API program.