π Physical Intelligence × Cloudflare

The infrastructure layer for physical AI

Physical Intelligence is building the GPT moment for robotics — general-purpose foundation models that control any robot to do any task. This brief maps their infrastructure requirements to where Cloudflare adds concrete value across model distribution, inference APIs, robot connectivity, and partner security.

Active research + commercial partnerships
Founded 2023 · San Francisco
Backed by Bezos, OpenAI, Sequoia, CapitalG

Physical Intelligence, explained

PI's bet: the robotics industry needs a foundation model layer the way the software industry needed LLMs. Build it once, and every robotics company builds on top of it.

Core Thesis

The Physical Intelligence Layer

Today, building a robotic system means constructing the entire stack from scratch — controllers, data pipelines, training infrastructure, and the model itself. PI is building the equivalent of GPT-as-an-API for robots: a reusable physical intelligence layer that any robotics company can access, fine-tune, and deploy against their own hardware and tasks.

Their vision explicitly mirrors the LLM API model: "You wouldn't start by training foundation models… you would just make API calls to existing foundation models." That same sentence, applied to robotics, describes PI's product roadmap.

Vision-Language-Action Models · Foundation Models · API-first Roadmap · Cross-Embodiment
Model Lineage

π0 → π0.7

π0
Oct 2024 · First generalist policy

Multi-task, multi-robot architecture. Open-sourced Feb 2025.

π0.5
Apr 2025 · Open-world generalization

Controls mobile manipulators in entirely unseen environments.

π0.6 / π*0.6
Nov 2025 · RL-optimized specialists

RL fine-tuning (Recap) for throughput and robustness on real tasks.

π0.7
Apr 2026 · Steerable generalist

Compositional generalization. Matches specialist fine-tunes out of the box. Cross-embodiment transfer.

Live Deployments

Partners in production

Weave Robotics

Laundry folding in San Francisco laundromats. π0.6 reduced missed grasps by 42% and interventions by 50% vs. π0.5.

Ultra

E-commerce order packaging in live US warehouses, scaling toward hundreds of deployments. 96.4% autonomy demonstrated in production.

Models run on real customer infrastructure, packing real orders every day — this is production, not demo.

Three distinct workloads

Understanding what PI actually runs — from GPU training pipelines to on-robot inference to partner-facing APIs — is the foundation for mapping where Cloudflare fits.

01

Research & Training

Massively parallel GPU training runs for VLA models at scale. Multi-robot data collection pipelines — camera feeds, proprioception logs, teleoperation video — ingested and stored continuously. Model weights published publicly (π0 open-sourced) and shared privately with partners as fine-tuned checkpoints.

Multi-GB checkpoint artifacts · Multi-robot sensor data · Public + private weight distribution · RL training pipelines
02

Edge / On-Robot Inference

VLA model inference must generate motor actions within the tight latency budget of physical control loops. PI's published "Real-Time Action Chunking" research directly addresses running large models at robot reaction speeds. Inference may run on-robot, on nearby edge compute, or via cloud API calls — each with different connectivity and latency requirements.

Real-time latency constraints · On-robot or edge compute · Cloud API fallback paths · Venue / warehouse networks
03

Partner-Facing Platform

Partners submit proprietary robot data for pre-training inclusion. They receive fine-tuned model checkpoints back. They operate those models in live production environments on real customer hardware. This implies: model artifact storage, per-partner authentication, data ingestion pipelines, human-in-the-loop intervention systems, and eventually a public API surface.

Per-partner access control · Checkpoint distribution · Human-in-the-loop remote ops · Future: inference API

Four high-value areas

PI's three infrastructure workloads map to four distinct Cloudflare product clusters. Each solves a real, current problem — not a hypothetical future one.

01

Model & Dataset Distribution

R2 + Workers


PI publishes multi-gigabyte model weights publicly (π0 open-sourced) and distributes fine-tuned partner checkpoints privately. On AWS S3 or GCS, distributing a 10GB checkpoint file to partners globally generates substantial egress charges per download — and those charges scale linearly with how many partners are pulling, how often they update, and how many robots each partner runs. R2 eliminates that entirely: $0 egress regardless of how many times a checkpoint is downloaded globally.

A Worker sits in front of each download: it validates a per-partner JWT, confirms the partner is authorized to access that specific model version, generates a time-limited signed R2 URL, and logs the access event — all in a single edge request with no separate auth service required.
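The authorization step of that flow can be sketched in plain TypeScript. This is a model of the decision logic, not a full Worker: the partner IDs, model-version names, and grant table are all invented, and in production the signed URL itself would be generated via R2's S3-compatible presigning rather than computed here.

```typescript
// Illustrative authorization check a Worker could run before issuing a
// short-lived signed R2 URL. Partner IDs, model versions, and the grant
// table are invented for this sketch.

interface PartnerClaims {
  sub: string; // partner id from the already-validated JWT
  exp: number; // token expiry (unix seconds)
}

// Hypothetical grant table: which model versions each partner may pull.
const MODEL_GRANTS: Record<string, string[]> = {
  weave: ["pi-0.5", "pi-0.6"],
  ultra: ["pi-0.6", "pi-0.7"],
};

const SIGNED_URL_TTL_SECONDS = 600; // e.g. a 10-minute download window

function authorizeDownload(
  claims: PartnerClaims,
  modelVersion: string,
  nowSeconds: number,
): { allowed: boolean; signedUrlExpiry?: number } {
  if (claims.exp <= nowSeconds) return { allowed: false }; // expired token
  const grants = MODEL_GRANTS[claims.sub] ?? [];
  if (!grants.includes(modelVersion)) return { allowed: false }; // no grant
  return { allowed: true, signedUrlExpiry: nowSeconds + SIGNED_URL_TTL_SECONDS };
}
```

The same shape extends naturally to the logging half of the flow: on every allowed request, the Worker can append the (partner, version, timestamp) tuple as an access event.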

$0 egress on multi-GB checkpoint distribution globally
Per-partner access control with time-limited signed URLs — no auth service
S3-compatible API — drop-in for existing training pipeline tooling
Checkpoint distribution cost
AWS S3 egress (10 GB × 100 partner downloads/mo): ~$90/mo, scaling with model size and partner count
Cloudflare R2 egress: $0, regardless of download volume
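The arithmetic behind that comparison, under the commonly cited assumption of roughly $0.09/GB for S3 internet egress (actual pricing is tiered and region-dependent):

```typescript
// Back-of-envelope version of the cost comparison. The S3 rate is an
// assumption (~$0.09/GB internet egress); real pricing is tiered.
const GB_PER_CHECKPOINT = 10;
const DOWNLOADS_PER_MONTH = 100; // partners × update frequency
const S3_EGRESS_USD_PER_GB = 0.09;

const s3MonthlyUsd = GB_PER_CHECKPOINT * DOWNLOADS_PER_MONTH * S3_EGRESS_USD_PER_GB;
const r2MonthlyUsd = 0; // R2 charges no egress at any volume

console.log(s3MonthlyUsd.toFixed(0), "vs", r2MonthlyUsd); // ≈ 90 vs 0
```

Every new partner, every extra model refresh, and every additional robot pulling weights multiplies the first number; the second stays at zero.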
Access control flow
Partner robot → (JWT) → Worker (validates, signs URL) → R2 (checkpoint)
02

Inference API Gateway

AI Gateway + Workers


PI's stated long-term product is exactly the LLM API model applied to robotics — partners call an API, the physical intelligence is already there. As that API surface materializes, every inference request from a partner's robot needs: rate limiting per partner tier, per-request logging and observability, cost accounting by API consumer, and fallback routing if an inference backend is degraded.

Cloudflare AI Gateway provides all of this as a managed proxy layer — sitting in front of PI's inference backends (whether self-hosted GPU clusters or third-party inference services). Caching identical prompts reduces redundant inference costs. Workers handle per-partner authentication, route to the correct model version, and enforce usage quotas — all at the edge before a request ever reaches PI's GPU infrastructure.
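AI Gateway exposes rate limiting as configuration rather than code, but the quota behavior it enforces per partner tier can be modeled as a token bucket. A sketch with invented tier names and limits:

```typescript
// Token-bucket model of per-partner-tier rate limiting. Tier names and
// limits are invented; AI Gateway provides this behavior as configuration,
// so this is a model of the idea, not the product's API.

type Tier = "pilot" | "production";

const TIER_LIMITS: Record<Tier, number> = {
  pilot: 60, // requests per minute
  production: 600,
};

class PartnerBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private tier: Tier, nowMs: number) {
    this.tokens = TIER_LIMITS[tier]; // start with a full burst allowance
    this.lastRefillMs = nowMs;
  }

  // Admit the request if a token is available; refill continuously.
  tryAcquire(nowMs: number): boolean {
    const perMinute = TIER_LIMITS[this.tier];
    const refill = ((nowMs - this.lastRefillMs) / 60_000) * perMinute;
    this.tokens = Math.min(perMinute, this.tokens + refill);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A "pilot" partner bursting 100 requests in the same instant would see 60 admitted and 40 rejected, with capacity refilling over the following minute; none of those rejected requests ever touches a GPU.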

Rate limiting, logging, and cost accounting per partner tier
Model fallback if inference backend is unavailable
Response caching for identical prompts — reduces GPU utilization
API gateway layer
Weave robots · Ultra robots · Partner N → AI Gateway (rate limit · log · cache · fallback) → π0.7 inference
03

Robot-to-Cloud Connectivity

Cloudflare Tunnel + Zero Trust + Realtime


Deployed robots at customer sites — warehouses like Ultra's, laundromats like Weave's — need to communicate with PI's cloud infrastructure to pull updated model weights, send telemetry, and receive intervention from a human operator. Exposing each robot or edge server with a public IP is a security and operational liability.

Cloudflare Tunnel gives each robot a secure outbound-only connection to Cloudflare's network — no public IP, no firewall rules to manage at the customer site. Zero Trust Access authenticates each robot as a device identity, ensuring only enrolled robots can reach PI's infrastructure. For Ultra's explicit human-in-the-loop intervention system — where remote operators step in when the model stumbles — Cloudflare Realtime SFU + TURN provides the WebRTC infrastructure for low-latency video and control through any network, including warehouse and laundromat WiFi that would block direct UDP.
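On requests arriving through the tunnel, Zero Trust Access attaches a signed JWT (the Cf-Access-Jwt-Assertion header) that PI's origin can verify. A sketch of the enrollment check that would run after signature verification (the audience tag, robot identities, and enrollment set are invented for illustration):

```typescript
// Enrollment check on Access JWT claims, assumed to run after the token's
// signature has already been verified. The audience tag and robot IDs
// below are invented for this sketch.

interface AccessClaims {
  sub: string; // per-robot service identity
  aud: string; // Access application audience tag
  exp: number; // unix seconds
}

const EXPECTED_AUD = "pi-robot-fleet"; // assumption: one Access app for the fleet
const ENROLLED_ROBOTS = new Set(["ultra-wh3-robot-012", "weave-sf-robot-004"]);

function isEnrolledRobot(claims: AccessClaims, nowSeconds: number): boolean {
  return (
    claims.exp > nowSeconds &&
    claims.aud === EXPECTED_AUD &&
    ENROLLED_ROBOTS.has(claims.sub)
  );
}
```

Revoking a robot is then a one-line change to the enrollment set (or, in practice, to the Access policy) rather than a firewall change at a customer site.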

No public IPs on deployed robots — outbound-only Tunnel connections
Per-robot device identity via Zero Trust — no network-level trust
Human-in-the-loop WebRTC through any venue network via TURN over TLS on port 443
Robot secure connectivity
Customer-site robot → Tunnel (outbound only) → Cloudflare (Zero Trust · Tunnel · Realtime) → PI Cloud
The human-in-the-loop operator also connects via the Realtime SFU.
04

Research Site & Partner Portal

Pages + Workers AI + Images + WAF


pi.website is a research publication site with embedded video, high-resolution robot footage, and dense figures from papers. Model launch announcements — especially the π0 open-source release and the π0.7 compositional generalization post — generate significant traffic spikes when shared on academic Twitter/X and HackerNews. Cloudflare Pages with global CDN delivers all static assets from the edge; traffic spikes are absorbed with no origin scaling required.

Cloudflare Images handles on-the-fly resizing and WebP conversion of research figures for different devices. WAF + bot management protects model download endpoints from scraper abuse and prevents hotlinking of research video assets. As PI moves to a partner portal — where Weave and Ultra log in to view model performance metrics, submit data, and download checkpoints — Workers + D1 provide the backend for authenticated session state and access logs without running a dedicated application server.
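As a sketch of that portal's audit trail, here is a hypothetical D1 access-log row and the parameterized insert a Worker might build for it. The schema and field names are invented; in a Worker the statement would execute via env.DB.prepare(sql).bind(...params).run().

```typescript
// Hypothetical audit-log row for the partner portal, plus the
// parameterized D1 insert a Worker might build for it. Schema invented.

interface AccessLogRow {
  partnerId: string;
  modelVersion: string;
  action: "view_metrics" | "submit_data" | "download_checkpoint";
  atSeconds: number; // unix seconds
}

function buildLogInsert(row: AccessLogRow): { sql: string; params: (string | number)[] } {
  return {
    sql:
      "INSERT INTO access_log (partner_id, model_version, action, at) " +
      "VALUES (?1, ?2, ?3, ?4)",
    params: [row.partnerId, row.modelVersion, row.action, row.atSeconds],
  };
}
```

Keeping the log in D1 gives PI a queryable, per-partner audit trail with no dedicated application server behind it.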

Traffic spike resilience on model launch days — no origin scaling needed
WAF + bot management protects download endpoints from scraper abuse
Partner portal: Workers + D1 for serverless auth and access logs
pi.website traffic pattern
Spikes at the π0 open-source release and the π0.7 launch; Pages CDN absorbs them, so the origin never overloads.

PI need → Cloudflare product

Every PI infrastructure requirement mapped to the Cloudflare product that addresses it — and the specific mechanism of value.

PI Requirement Cloudflare Product Specific Value Priority
Model weight & checkpoint distribution
R2 + Workers
$0 egress on multi-GB artifacts; signed per-partner time-limited URLs
Highest
Inference API gateway (partner model calls)
AI Gateway
Rate limiting, logging, caching, model fallback per partner tier
Highest
Per-partner authentication & routing
Workers + Access
JWT validation, per-tenant model version routing at the edge
High
Deployed robot cloud connectivity
Tunnel + Zero Trust
Outbound-only connection; no public IPs; per-robot device identity
High
Human-in-the-loop remote intervention
Realtime SFU + TURN
WebRTC through any venue network; TURN over TLS on port 443 traverses restrictive firewalls
High
Research site CDN & media delivery
Pages + Images
Global CDN; traffic spike absorption; on-the-fly image optimization
Medium
Download endpoint & scraper protection
WAF + Bot Management
Block abusive scraping of model weights; protect research video assets
Medium
Partner portal (metrics, data submit, checkpoint access)
Pages + Workers + D1
Git-deployed portal; serverless auth state; no dedicated app server
Medium
Robotics data pipeline security
WAF + Zero Trust
Protect sensor data ingest endpoints; authenticate partner upload sessions
Emerging
Recommended Wedge

Start with R2 + AI Gateway

The highest-priority entry point is the combination that solves the two most immediate, concrete problems PI faces as they scale from research to multi-partner commercial deployment:

R2 for model distribution

The moment PI starts distributing fine-tuned checkpoints to Weave, Ultra, and additional partners, S3 egress becomes a real line item that grows with every new partner and every model update. R2 removes it entirely, and Workers provide the per-partner access control layer with no additional auth infrastructure.

Immediate outcome: Eliminate egress charges on checkpoint distribution and enforce partner-level access control with a single Worker.

AI Gateway for inference APIs

PI's entire product thesis is converging on "call our model like an LLM API." When that API surface opens up — even internally to partners — every request needs rate limiting, logging, and cost accounting. AI Gateway provides this immediately as a managed proxy with no additional infrastructure, and it's already designed for the multi-model, multi-provider patterns PI will use.

Immediate outcome: Full observability and per-partner rate limiting on inference traffic from day one of the partner API program.