How Instacart Scaled Through the Pandemic


Good morning! Welcome back to this week's edition of Full Stack Express, your go-to newsletter for web development, software architecture, and system design.
Headlines
Microsoft's AI Team Exposes 38TB of Private Data on GitHub
Google Unveils Major Bard Update: Now with Extensions for Maps, YouTube, Flights, and Hotels
Vercel Releases v0.dev: A Natural Language-Powered Web UI Generator for React
Remix Rolls Out Version 2 of Its Full Stack Web Framework
Introducing Nue: The Newest, Fastest Thing in Frontend Development
Featured Deep Dives
How Instacart Scaled Through the Pandemic Managing 80,000+ Retail Locations and Predicting Real-Time Item Availability
Figma’s Performance Testing Journey from a Single MacBook to a Dual-System Infrastructure
Quick Bytes
How Instagram Scaled to 14 Million Users with Just 3 Engineers
Understanding Backpressure in Software Systems
Bard’s Latest AI Updates and Improvements
Runtime Comparison Between Node, Deno, and Bun
AWS's IPv4 Estate Now Worth $4.5 Billion
Community Spotlight
Kiesel, Theatre.js, nanoGPT, and More
Tip of the Week
Boost Your React App's Performance with Lazy Loading and Suspense
Meme of the Week
JavaScript Frameworks: The Never-Ending Story
HOW INSTACART SCALED THROUGH THE PANDEMIC MANAGING 80k+ RETAIL LOCATIONS

The Challenge
Instacart, an online grocery delivery service, faced a complex challenge in predicting real-time item availability across 80,000+ retail locations, especially during the pandemic.
The company needed to ensure that its machine learning (ML) models could scale to predict the availability of millions of items while maintaining low latency and high consistency.
The Turning Point
The pandemic led to a surge in customer demand and fluctuating in-store inventories.
Instacart needed to evolve its Real-Time Availability (RTA) infrastructure to keep pace with these changes and maintain customer trust.
Objectives and Requirements
Low Latency: Fast and bulk fetching of availability scores at the retrieval stage.
High Consistency: Consistent availability information across all user interfaces.
Scalability: Ability to handle predictions for hundreds of millions of items.
Experimentation: A framework to test multiple ML models efficiently.
The Solution
Instacart implemented two methods for ingesting ML-generated scores into their database (DB) storage:
Full Sync: ML Availability Scoring Service updates a Snowflake table multiple times a day. DB ingestion workers read this table and update the availability scores.
Lazy Score Refresh: Scores are updated on-demand when an item appears in search results and exceeds the allowable age.

Full Sync and Lazy Score Refresh Architecture
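The lazy score refresh can be sketched in a few lines: when an item surfaces in search results, its cached score is reused only while it is fresh enough, and the ML scoring service is called on demand otherwise. This is an illustrative sketch only; the record shape, cache interface, and one-hour age limit are assumptions, not Instacart's actual implementation.

```javascript
// Assumed allowable age for a cached availability score (illustrative: 1 hour).
const MAX_SCORE_AGE_MS = 60 * 60 * 1000;

// A cached score needs a refresh if it is missing or exceeds the allowable age.
function needsRefresh(record, now = Date.now()) {
  if (!record) return true;
  return now - record.updatedAt > MAX_SCORE_AGE_MS;
}

// Read path: serve the cached score when fresh; otherwise call the scoring
// service on demand and write the result back for subsequent reads.
function getAvailabilityScore(itemId, cache, scoreService, now = Date.now()) {
  const record = cache.get(itemId);
  if (needsRefresh(record, now)) {
    const fresh = { score: scoreService.score(itemId), updatedAt: now };
    cache.set(itemId, fresh);
    return fresh.score;
  }
  return record.score;
}
```

Because most items are never surfaced between full syncs, refreshing only on demand is what cuts the steady-state ingestion load.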
To foster faster experimentation, Instacart developed a Multi-Model Experimentation Framework with three key components:
DB Column per Model: Each model's score is stored in a dedicated DB column.
Model-Column Mapping: A service-level configuration system maps the model version to its corresponding unique column.
Experiment-Column Mapping: A/B experiments are easily conducted with a unique feature flag associated with each column.

Multi-Model Experimentation Framework
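The mapping layers above can be pictured as a small configuration object plus a lookup: each model version points at its dedicated column, and a feature flag decides which column a given request reads. Column names, flag names, and the fallback logic here are invented for illustration; they are not Instacart's real configuration.

```javascript
// Hypothetical service-level config: one DB column and one A/B feature flag
// per model version. Names are illustrative.
const MODEL_COLUMN_CONFIG = {
  availability_v1: { column: 'score_col_1', featureFlag: 'exp_availability_v1' },
  availability_v2: { column: 'score_col_2', featureFlag: 'exp_availability_v2' },
};

// Resolve which score column to read: if the request is enrolled in an
// experiment's flag, use that model's column; otherwise use the fallback model.
function resolveScoreColumn(enabledFlags, fallbackModel = 'availability_v1') {
  for (const cfg of Object.values(MODEL_COLUMN_CONFIG)) {
    if (enabledFlags.has(cfg.featureFlag)) return cfg.column;
  }
  return MODEL_COLUMN_CONFIG[fallbackModel].column;
}
```

Launching a new model then means adding one config entry and one flag, with no schema migration beyond the new column.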
To manage the complexity of different thresholds for various segments, Instacart introduced the Deltas Framework, which allows for the application of fixed deltas to base thresholds, computed at runtime.

Deltas Framework
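The core idea can be shown in a few lines: instead of storing a separate threshold per segment, a fixed delta is added to a base threshold at runtime. The segment names and numeric values below are made up for illustration, not Instacart's actual tuning.

```javascript
// Assumed base threshold and per-segment deltas (illustrative values).
const BASE_THRESHOLD = 0.5;
const SEGMENT_DELTAS = {
  perishables: 0.1,  // stricter: require higher confidence before showing
  pantry: -0.05,     // looser: shelf-stable items go out of stock less often
};

// The effective threshold is computed at runtime from base + delta.
function effectiveThreshold(segment) {
  return BASE_THRESHOLD + (SEGMENT_DELTAS[segment] ?? 0);
}

function isLikelyAvailable(score, segment) {
  return score >= effectiveThreshold(segment);
}
```

Tuning then reduces to adjusting a handful of deltas rather than re-deriving a full threshold per segment.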
Real-world Impact
Scalability: The lazy score refresh reduced the ingestion load by two-thirds.
Experimentation: A 6X increase in experiments run using the new framework.
Customer Trust: Improved "good found rate," crucial for customer retention.
Key Takeaways
Scalability vs. Consistency: Balancing these two can be challenging but is crucial for maintaining customer trust.
Data Ingestion Strategies: Combining full sync and lazy refresh can optimize both latency and consistency.
Modular Experimentation: A well-designed experimentation framework can significantly reduce engineering work and speed up ML testing.
Threshold Management: Dynamic thresholding can be a powerful tool for handling complex, multi-segment optimization problems.
Instacart's innovative approach to RTA infrastructure demonstrates how engineering, machine learning, and product design can come together to solve complex, real-world challenges at scale.
FIGMA’S PERFORMANCE TESTING JOURNEY FROM A SINGLE MACBOOK TO A DUAL-SYSTEM INFRASTRUCTURE

The Challenge
In 2018, Figma's entire in-house performance testing system ran on a single MacBook, looping through a series of key test scenarios.

Stress Test with 5,000 Comment Pins
Fast forward to 2023, and the landscape has changed dramatically. Figma's codebase has grown in complexity, with new features, products, and a team distributed globally.
The single MacBook approach was no longer sustainable, especially with the team expanding to over 400 engineers and managers.
The Turning Point
The MacBook that had been running tests overheated in October 2020.
Attempts to replicate the tests on another laptop failed, signaling the need for a more scalable, sophisticated approach to performance testing.

Early Stress Tests Created in FigJam
Objectives and Requirements
Figma aimed to build a system that could:
Test every proposed code change in the main monorepo.
Complete performance guardrail checks in under 10 minutes.
Provide detailed performance metrics and comparisons.
The Solution
Figma deployed two systems connected by the same Continuous Integration (CI) system:
Cloud-based System: Runs in GPU-enabled virtual machines (VMs) for every pull request. Due to the noise levels in VMs, a 20% pass margin was set to catch only the most significant regressions.
Hardware System: Runs on an array of test laptops, including older machines, allowing for custom test scenarios and more precise performance metrics.
Both systems shared features like detailed HTML reports and CPU profiles for each run.
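The noisy-VM guardrail amounts to a simple comparison with a pass margin: a pull request only fails the cloud check if it regresses the baseline by more than the margin (20% per the article). The function name and metric shape below are illustrative, not Figma's actual CI code.

```javascript
// Pass margin for the cloud-based VM system: only regressions larger than
// this fraction of the baseline fail the guardrail (20% per the article).
const VM_PASS_MARGIN = 0.2;

// Returns true if the candidate run regresses enough to fail the check.
// Metrics are assumed to be durations (e.g. milliseconds), lower is better.
function isSignificantRegression(baselineMs, candidateMs, margin = VM_PASS_MARGIN) {
  return candidateMs > baselineMs * (1 + margin);
}
```

The wide margin trades sensitivity for stability: VM noise stops flaking pull requests, while the hardware system catches the smaller regressions with tighter, per-machine baselines.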
Real-world Impact
The new system went live in October 2022 and has been instrumental in identifying performance bottlenecks and regressions early in the development cycle.
It has also empowered engineers to collaborate on performance-sensitive code across teams and time zones.
Key Takeaways
Start Lean, Plan for Scale: While starting with a minimal setup is good, always have a plan for scaling your testing infrastructure.
Proactive vs. Reactive: Shifting from a reactive to a proactive approach in performance testing can save time and improve product quality.
Parallelization and CI: Utilizing parallel runs in CI can significantly speed up the testing process.
Hardware Matters: For graphics-heavy applications, real hardware testing can provide insights that VMs cannot.
Continuous Monitoring: Automated tests and detailed reporting can catch regressions before they impact users, maintaining development velocity.
By embracing a scalable, dual-system approach, Figma has set itself up for sustainable growth, ensuring that performance doesn't become a bottleneck as the product evolves.
BYTE-SIZED TOPICS
INTERESTING PRODUCTS, TOOLS & PACKAGES
TIP OF THE WEEK
React’s lazy() and Suspense allow you to perform code splitting, which can significantly improve your app’s performance by reducing the initial bundle size.
import React, { Suspense, lazy } from 'react';

const HeavyComponent = lazy(() => import('./HeavyComponent'));

function App() {
  return (
    <div>
      <main>
        <Suspense fallback={<div>Loading...</div>}>
          <HeavyComponent />
        </Suspense>
      </main>
    </div>
  );
}

export default App;
In this example, HeavyComponent will only be loaded when it's about to be rendered, reducing the initial load time for your application.
The fallback prop in Suspense allows you to display a loading indicator or some other placeholder while the component is being loaded.
This approach is especially useful for components that are not immediately visible, like modals or tabs that the user might never click, or for splitting routes in a single-page application.
MEME OF THE WEEK
