The Real-Time Routing Stack: How Big-Box Retailers Engineer Rapid Delivery at Scale
Apr 23, 2026
11 mins read

Key Takeaways
- Big-box retailers are winning rapid delivery in North America by running real-time routing engines on top of existing store and MFC networks that sit structurally close to customers.
- Rapid delivery is a different computational class than next-day fulfillment: distributed inventory state, variable driver supply, customer-exposed promise engine, and per-order batch-vs-dedicated decisions.
- The production architecture is four layers: Signal Ingestion (streaming, synchronized), Decisioning Engine (constraint-based, continuous), Execution (multi-network carrier orchestration), and Learning (outcome-based model retraining).
- The architectural bar is continuous versus batch. Batch systems cannot run 2-hour delivery at big-box scale. Continuous re-optimization is the foundational design commitment.
- Five evaluation questions matter most for a tech buyer: continuous vs. batch, real-time state synchronization, multi-network carrier orchestration, simultaneous-constraint optimization, and outcome-based learning.
A Head of Logistics Technology at a national retailer is evaluating their rapid-delivery stack. The e-commerce front-end promises “2-hour delivery” on 40,000 SKUs from 600+ stores. By Thursday afternoon, promise-kept rate is 81%. The question isn’t whether the promise is aspirational — it’s which layer of the stack is failing, and whether any routing engine on the market can solve the problem they’re describing.
Most NA rapid-delivery programs are wrestling with this architectural question right now, and it deserves a clean-eyed answer — because the market has already rendered a verdict on the alternative.
Pure-play 15-minute q-commerce collapsed in North America between 2022 and 2024. The ventures that spent billions building dedicated dark-store networks discovered the unit economics didn’t work. The companies that actually delivered profitable rapid delivery at scale — Walmart, Target, Amazon, Kroger, Costco — did it by running real-time routing engines on top of existing store and micro-fulfillment-center (MFC) footprints.
A production real-time routing engine for big-box rapid delivery is a four-layer architecture — signal ingestion, decisioning, execution, and learning — that continuously re-optimizes across inventory, staff, drivers, traffic, and SLA tiers in real time. Static batch-optimization systems cannot run this model. The architectural shift from batch to continuous is what separates rapid-delivery winners from economic casualties.
According to Walmart, approximately 90% of the US population lives within 10 miles of a Walmart store. That structural advantage — inventory already positioned near the customer — is what makes store-based rapid delivery economically viable. The routing engine is what converts structural advantage into operational reality.
Why Rapid Delivery at Big-Box Scale Is a Different Computational Problem
For Heads of Logistics Technology evaluating rapid-delivery infrastructure, the first conceptual shift is this: rapid delivery is not “faster last-mile.” It’s a different class of routing problem, and systems designed for next-day or two-day fulfillment cannot be tuned into it.
Four properties make it different:
1. Distributed inventory state changes every second. 40,000 SKUs across 600+ nodes, with live reservations, pick-queue locks, and restock events. Node selection isn’t “which store is closest” — it’s “which store has the item, has pick capacity, has driver supply, and clears the SLA.”
2. Driver supply is continuously variable. Mixed W2 / gig / 3PL fleets mean availability changes minute-to-minute. Static capacity planning doesn’t fit; the engine has to reason about driver-supply state as a live input.
3. The promise engine is exposed to the customer. The shopper saw “arrives by 4:47pm” at checkout. That ETA must be trustworthy when emitted and still trustworthy at 3:30pm when the pick queue shifts and a new order enters the batch.
4. Batch-vs-dedicated decisions happen per order. Some orders are profitable batched with three others on a multi-stop route; some demand a dedicated trip. The engine makes this decision per order, per moment — not as a scheduled policy.
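The fourth property can be made concrete with a minimal sketch. The names, costs, and thresholds below are illustrative assumptions, not any vendor's actual logic: for each order, the engine compares the marginal detour cost of inserting it into an existing batch against the cost of a dedicated trip, with the SLA as a hard constraint.

```python
from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    deadline_min: float       # minutes until the SLA deadline

@dataclass
class CandidateBatch:
    batch_id: str
    detour_cost: float        # added drive cost of inserting this order
    batched_eta_min: float    # this order's ETA if inserted into the batch

DEDICATED_TRIP_COST = 9.00    # illustrative flat cost of a solo trip

def batch_or_dedicated(order: Order, batches: list[CandidateBatch]):
    """Per-order decision: cheapest feasible batch insertion, else dedicated.

    A batch is feasible only if the batched ETA still clears the deadline.
    """
    feasible = [b for b in batches if b.batched_eta_min <= order.deadline_min]
    if not feasible:
        return ("dedicated", None)
    best = min(feasible, key=lambda b: b.detour_cost)
    # Batch only when the marginal detour undercuts a solo trip.
    if best.detour_cost < DEDICATED_TRIP_COST:
        return ("batch", best.batch_id)
    return ("dedicated", None)
```

The point of the sketch is the call frequency, not the arithmetic: this function runs per order, per moment, against whatever batches and driver supply exist right now.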
According to McKinsey & Company, same-day and next-day delivery have moved from premium feature to baseline consumer expectation across major NA retail categories — a shift that makes real-time routing a core platform, not a point feature.
The Four-Layer Architecture
Layer 1: Signal Ingestion
The foundation is streaming, not scheduled. The engine ingests:
- Order stream from the OMS and e-commerce platform
- Real-time inventory state per node (store-level availability plus reservation state)
- Staff and pick-capacity state (pick queue depth, labor availability, SKU-category throughput)
- Driver state across networks — W2, Spark, DoorDash Drive, Uber Direct, Instacart Connect, Shipt, 3PL
- Traffic (live and historical), weather, and customer-at-home signals where available
The hard part isn’t ingestion — it’s synchronization. A signal that’s 90 seconds stale poisons every downstream decision. A Chicago operator with 40 stores across Cook County needs inventory-state refresh on the order of seconds, not minutes. Five-minute ERP sync loops that worked for overnight fulfillment do not work for 2-hour delivery.
Layer 2: Decisioning Engine
This is the center of gravity. Five things happen here continuously:
Node selection. Which store or MFC fulfills this order? Inputs: proximity, inventory, pick capacity, historical cycle time, current queue depth.
Batching decision. Multi-order batch on a shared route versus dedicated trip — decided per order, per moment, against current supply and SLA state.
Constraint-based optimization. The optimizer treats node, driver, SLA tier, batching, margin, and ETA as simultaneous constraints — not sequential filters. Sequential filters make rapid-delivery economics fail, because each filter discards options that a simultaneous solve would have combined profitably.
Dynamic promise engine. The ETA the customer sees at checkout is the output of the full routing decision, not a marketing-defined promise. This is a meaningful architectural commitment: it means routing logic is exposed to a customer-facing system in real time.
Continuous re-optimization. Every inbound signal re-evaluates in-flight orders. This is the architectural break from legacy routing systems.
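The difference between sequential filters and a simultaneous solve is easiest to see in a toy example. All numbers and node names below are invented for illustration: a "closest node first, then pick a mode" filter would lock in node A and then discover no profitable option, while the joint solve finds that node B batched is the only assignment that clears the SLA at positive margin.

```python
from itertools import product

# Illustrative candidate state: per-node ETA (minutes) and cost by mode.
NODES = {
    "A": {"batch": {"eta": 130, "cost": 4.0}, "dedicated": {"eta": 95, "cost": 11.0}},
    "B": {"batch": {"eta": 105, "cost": 5.0}, "dedicated": {"eta": 80, "cost": 12.0}},
}
SLA_MIN = 120
ORDER_REVENUE = 10.0

def best_assignment(nodes=NODES, sla=SLA_MIN, revenue=ORDER_REVENUE):
    """Treat node, mode, SLA, and margin as one joint solve over all combos."""
    candidates = []
    for node, mode in product(nodes, ("batch", "dedicated")):
        option = nodes[node][mode]
        if option["eta"] > sla:
            continue  # SLA is a hard constraint, not a tiebreaker
        margin = revenue - option["cost"]
        candidates.append((margin, node, mode))
    if not candidates:
        return None
    margin, node, mode = max(candidates)
    return {"node": node, "mode": mode, "margin": margin}
```

A real optimizer searches millions of such combinations with far richer constraints, but the structural claim is the same: filtering before scoring discards exactly the combinations that make the economics work.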
According to Gartner, supply chain leaders applying AI to real-time decisioning consistently outperform peers on cycle time, cost, and customer experience — with rapid-delivery orchestration cited as one of the highest-leverage use cases for continuous optimization.
Layer 3: Execution and Dispatch
Execution at big-box rapid-delivery scale is an orchestration problem, not an assignment problem. The routing engine isn’t just picking a driver — it’s selecting which carrier network to route through.
- Real-time API orchestration across Spark Driver, DoorDash Drive, Uber Direct, Instacart Connect, Shipt Delivery-as-a-Service, and proprietary W2 fleets.
- Dynamic re-routing on exception — driver delayed, order modified, customer not home, pick shortfall discovered at pack.
- Fallback logic when any single network saturates. A Dallas-Fort Worth operator running 12,000 daily rapid-delivery orders across suburban sprawl will routinely have one carrier network saturate during peak demand; the routing engine needs to reallocate across three or four alternatives without human intervention.
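The fallback behavior above can be sketched in a few lines. The carrier names and capacity model are assumptions for illustration; a real orchestrator would query each network's live availability API rather than a local counter.

```python
CARRIER_PREFERENCE = ["spark", "doordash_drive", "uber_direct", "w2_fleet"]

def dispatch_with_fallback(order_id: str, capacity: dict[str, int],
                           preference=CARRIER_PREFERENCE):
    """Walk networks in preference order; a saturated network (zero
    remaining capacity) is skipped without human intervention."""
    for carrier in preference:
        if capacity.get(carrier, 0) > 0:
            capacity[carrier] -= 1  # reserve a slot on the chosen network
            return carrier
    return None  # every network saturated: escalate to exception handling
```

In practice the preference order itself is dynamic, fed by the carrier performance scores that Layer 4 maintains.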
Layer 4: Feedback and Learning
The architecture closes with continuous learning:
- ETA accuracy models retrained on actual-vs-predicted outcomes, segmented by route type, store, and SKU category.
- Pick-time estimates refined per store per category (produce pick time ≠ electronics pick time ≠ pharmacy pick time).
- Carrier performance scoring feeds future allocation decisions — reliability, cost, cycle time, exception rate.
- Demand forecasting for staff and driver capacity planning cycles back to Layer 1.
The compounding effect matters: every delivered order makes the next routing decision more accurate. Architectures without a Layer 4 degrade as SKU mix, store mix, and carrier mix shift. Architectures with one get sharper every quarter.
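Carrier performance scoring, for instance, can be as simple as an exponentially weighted on-time rate per network, which weights recent deliveries more heavily so the score tracks drift. The smoothing weight below is an illustrative assumption.

```python
ALPHA = 0.2  # illustrative smoothing weight for recent outcomes

class CarrierScore:
    """Exponentially weighted on-time rate per carrier network."""

    def __init__(self, alpha: float = ALPHA):
        self.alpha = alpha
        self.scores: dict[str, float] = {}

    def record(self, carrier: str, on_time: bool):
        outcome = 1.0 if on_time else 0.0
        prev = self.scores.get(carrier, outcome)  # seed with first outcome
        self.scores[carrier] = (1 - self.alpha) * prev + self.alpha * outcome

    def rank(self) -> list[str]:
        """Carriers best-first; feeds the allocation preference order."""
        return sorted(self.scores, key=self.scores.get, reverse=True)
```

Production systems score on cost, cycle time, and exception rate as well, but the feedback shape is the same: delivered outcomes update the state that the next allocation decision reads.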
Why Batch Optimization Cannot Run Rapid Delivery
For tech-buyer due diligence, the architectural distinction that matters is not “AI versus no AI.” It is continuous versus batch.
Batch systems optimize routes at fixed intervals — every 5, 15, or 60 minutes. Between runs, new orders queue. For next-day and two-day delivery, this works fine. For 2-hour delivery at big-box scale, it is structurally incompatible: the batch interval becomes the SLA ceiling, and the scheduler accumulates stale state between runs.
Continuous re-optimization treats the routing plan as live state, not a scheduled output. Every inbound signal triggers a partial re-solve of the affected orders and routes. The plan is never “finished” — it is continuously improved.
The implementation reality: continuous re-optimization at big-box scale requires an optimizer that produces high-quality solutions in sub-second latency across millions of constraint combinations, on a data stream that never pauses. This is the architectural bar.
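"Partial re-solve of the affected orders" has a simple skeleton: scope each event to the orders it actually touches and re-plan only those, leaving the rest of the live plan untouched. The event shape and the `solve` callable here are stand-ins for a real optimizer, assumed for illustration.

```python
def on_event(event: dict, plan: dict, order_route: dict, solve):
    """Event-scoped re-optimization: only orders on the event's route are
    re-planned; the live plan for everything else is left as-is."""
    affected = {o for o, route in order_route.items()
                if route == event["route"]}
    for order_id in affected:
        plan[order_id] = solve(order_id)  # partial re-solve, not a full run
    return affected
```

The latency budget lives inside `solve`: the architectural bar is that this per-event re-solve returns a high-quality answer in sub-second time, because the event stream never pauses to wait for it.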
According to Bain & Company, rapid-delivery economics only hold together when fulfillment density and routing efficiency cross specific thresholds — thresholds that static systems rarely reach and continuous systems reach as a matter of design.
The Head of Logistics Technology Evaluation Framework
Before signing off on the next rapid-delivery routing platform, five questions separate production-grade architectures from repackaged batch systems:
- Does the platform continuously re-optimize, or does it batch? If the vendor conversation includes “run intervals” or “optimization cycles,” it is batch. Rapid delivery requires continuous.
- How does it ingest and synchronize real-time inventory, staff, and driver state? The answer should describe streaming data patterns and sub-minute freshness — not scheduled polling.
- Does it orchestrate across external carrier networks via API, or does it assume a single fleet? Big-box rapid delivery requires the former. A platform built for proprietary W2 fleets alone will not scale.
- Can the optimizer handle node selection, batching, SLA tier, margin, and ETA as simultaneous constraints — or as sequential filters? Sequential filters leave margin on the table on every order.
- Does it learn from delivery outcomes, or does it require manual model retraining? Static systems degrade as the operating environment shifts. Learning systems compound.
The Strategic Reframe
For Heads of Logistics Technology evaluating rapid-delivery infrastructure, the decision isn’t “which vendor is fastest” or “which has the best UI.” It is: which vendor built a system that treats real-time signal synchronization, continuous re-optimization, multi-network carrier orchestration, and outcome-based learning as foundational — not as features added to a batch architecture.
The retailers winning NA rapid delivery haven’t won because they had faster trucks. They won because their routing architecture was engineered for a computational problem that looks nothing like the next-day-delivery problem legacy systems were designed to solve.
To learn more, visit locus.sh
Frequently Asked Questions (FAQs)
What is a real-time routing engine?
A real-time routing engine is a logistics platform that continuously optimizes delivery assignments and routes as new signals arrive — orders, inventory updates, driver availability, traffic, weather. Unlike batch route optimizers, which run at fixed intervals and produce a scheduled plan, real-time routing engines treat the plan as a live state that is continuously updated. They are the architectural foundation for rapid-delivery operations that commit to sub-hour or sub-2-hour service-level agreements.
How does a real-time routing engine differ from batch route optimization?
Batch route optimization runs at fixed intervals (every 5, 15, or 60 minutes), producing a scheduled route plan that remains static until the next run. Real-time routing engines continuously re-optimize — every new order, driver status change, or inventory update triggers a partial re-solve of affected orders and routes. For next-day delivery, batch systems work well. For rapid delivery (same-day, 2-hour, 1-hour), batch systems cannot keep pace with signal velocity; continuous re-optimization is the architectural requirement.
What is the architecture of a modern rapid-delivery routing platform?
A modern rapid-delivery routing platform is structured as four integrated layers: Signal Ingestion (streaming, synchronized feeds from OMS, inventory, staff, driver networks, traffic, and weather), Decisioning Engine (continuous constraint-based optimization handling node selection, batching, SLA tier, margin, and ETA simultaneously), Execution and Dispatch (multi-network carrier orchestration via API), and Feedback and Learning (outcome-based model retraining for ETA, pick-time, and carrier performance). The layers are distinct in purpose but continuous in data — each feeds the next, and the last feeds back to the first.
How do big-box retailers power same-day delivery technically?
Big-box retailers like Walmart, Target, Amazon, Kroger, and Costco power same-day delivery by running real-time routing engines on top of their existing store and micro-fulfillment-center networks. The routing engine performs node selection (which store fulfills), staff and pick-capacity reasoning, driver allocation across multiple carrier networks (including proprietary W2 fleets like Spark Driver and 3P networks like DoorDash Drive, Uber Direct, Instacart Connect, and Shipt), continuous re-optimization as conditions change, and outcome-based learning. The competitive advantage is structural: existing store footprints place inventory within 10 miles of most US customers, which routing software converts into viable rapid-delivery economics.
What should a Head of Logistics Technology evaluate in a routing platform?
A Head of Logistics Technology evaluating a routing platform for rapid delivery should assess five architectural criteria: whether the platform re-optimizes continuously or in batches; how it ingests and synchronizes real-time inventory, staff, and driver state; whether it orchestrates across external carrier networks via API or assumes a single fleet; whether the optimizer handles node selection, batching, SLA tier, margin, and ETA as simultaneous constraints rather than sequential filters; and whether it learns from delivery outcomes continuously or requires manual model retraining. Platforms that treat any of these as optional features rather than foundational design commitments will struggle to sustain rapid-delivery economics at scale.
Anas is a product marketer at Locus who enjoys turning complex logistics problems into simple, clear stories. Outside of work, he’s usually unwinding with a book or catching a good movie or series.