How Instacart Scaled Through the Pandemic

Good morning! Welcome back to this week's edition of Full Stack Express, your go-to newsletter for web development, software architecture, and system design.

Headlines

  • Microsoft's AI Team Exposes 38TB of Private Data on GitHub

  • Google Unveils Major Bard Update: Now with Extensions for Maps, YouTube, Flights, and Hotels

  • Vercel Releases v0.dev: A Natural Language-Powered Web UI Generator for React

  • Remix Rolls Out Version 2 of Its Full Stack Web Framework

  • Introducing Nue: The Newest, Fastest Thing in Frontend Development

Featured Deep Dives

  • How Instacart Scaled Through the Pandemic Managing 80,000+ Retail Locations and Predicting Real-Time Item Availability

  • Figma’s Performance Testing Journey from a Single MacBook to a Dual-System Infrastructure

Quick Bytes

  • How Instagram Scaled to 14 Million Users with Just 3 Engineers

  • Understanding Backpressure in Software Systems

  • Bard’s Latest AI Updates and Improvements

  • Runtime Comparison Between Node, Deno, and Bun

  • AWS's IPv4 Estate Now Worth $4.5 Billion

Community Spotlight

Kiesel, Theatre.js, nanoGPT, and More

Tip of the Week

Boost Your React App's Performance with Lazy Loading and Suspense

Meme of the Week

JavaScript Frameworks: The Never-Ending Story

HOW INSTACART SCALED THROUGH THE PANDEMIC MANAGING 80k+ RETAIL LOCATIONS

The Challenge

Instacart, an online grocery delivery service, faced a complex challenge in predicting real-time item availability across 80,000+ retail locations, especially during the pandemic.

The company needed to ensure that its machine learning (ML) models could scale to predict the availability of millions of items while maintaining low latency and high consistency.

The Turning Point

The pandemic led to a surge in customer demand and fluctuating in-store inventories.

Instacart needed to evolve its Real-Time Availability (RTA) infrastructure to keep pace with these changes and maintain customer trust.

Objectives and Requirements

  1. Low Latency: Fast and bulk fetching of availability scores at the retrieval stage.

  2. High Consistency: Consistent availability information across all user interfaces.

  3. Scalability: Ability to handle predictions for hundreds of millions of items.

  4. Experimentation: A framework to test multiple ML models efficiently.

The Solution

Instacart implemented two methods for ingesting ML-generated scores into their database (DB) storage:

  1. Full Sync: The ML Availability Scoring Service updates a Snowflake table multiple times a day. DB ingestion workers then read this table and update the stored availability scores.

  2. Lazy Score Refresh: Scores are updated on-demand when an item appears in search results and exceeds the allowable age.

Full Sync and Lazy Score Refresh Architecture
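The lazy-refresh idea above can be sketched in a few lines of JavaScript. This is an illustrative approximation only, not Instacart's actual code: the field names (`scoredAtMs`, `availabilityScore`) and the allowable age are invented for the example.

```javascript
// Hypothetical sketch of lazy score refresh: when an item surfaces in
// search results, re-score it only if its cached score has exceeded the
// allowable age. Names and values are illustrative.
const MAX_SCORE_AGE_MS = 60 * 60 * 1000; // assumed allowable age: 1 hour

function needsRefresh(item, nowMs) {
  return nowMs - item.scoredAtMs > MAX_SCORE_AGE_MS;
}

function refreshStaleScores(searchResults, scoreFn, nowMs = Date.now()) {
  return searchResults.map((item) =>
    needsRefresh(item, nowMs)
      ? { ...item, availabilityScore: scoreFn(item), scoredAtMs: nowMs }
      : item
  );
}
```

Because only items that actually appear in results (and are stale) get re-scored, the steady-state ingestion load stays far below a full re-sync of every item.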

To foster faster experimentation, Instacart developed a Multi-Model Experimentation Framework with three key components:

  1. DB Column per Model: Each model's score is stored in a dedicated DB column.

  2. Model-Column Mapping: A service-level configuration system maps the model version to its corresponding unique column.

  3. Experiment-Column Mapping: A/B experiments are easily conducted with a unique feature flag associated with each column.

Multi-Model Experimentation Framework
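A minimal sketch of how model-column and experiment-column mappings might fit together, assuming a feature-flag system that reports which flags are enabled for a user. The model names, column names, and flag names here are all invented for illustration:

```javascript
// Hypothetical service-level configuration: each model version maps to a
// dedicated DB column, and each column has an associated experiment flag.
const MODEL_COLUMNS = {
  availability_v3: "score_col_a",
  availability_v4: "score_col_b",
};

const EXPERIMENT_FLAGS = {
  score_col_a: "exp_availability_v3",
  score_col_b: "exp_availability_v4",
};

// Resolve which score column to read for a user based on the experiment
// flags enabled for them; fall back to the current production column.
function columnForUser(enabledFlags, defaultColumn = "score_col_a") {
  for (const [column, flag] of Object.entries(EXPERIMENT_FLAGS)) {
    if (enabledFlags.has(flag)) return column;
  }
  return defaultColumn;
}
```

With this shape, launching a new model is mostly configuration: add a column, register the mapping, and flip a flag for the experiment cohort.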

To manage the complexity of different thresholds for various segments, Instacart introduced the Deltas Framework, which allows for the application of fixed deltas to base thresholds, computed at runtime.

Deltas Framework
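The Deltas idea can be sketched as a base threshold plus fixed, per-segment adjustments summed at runtime. The segment names and numeric values below are assumptions for illustration, not Instacart's real configuration:

```javascript
// Hypothetical Deltas sketch: a base availability threshold with fixed
// per-segment deltas applied at request time. Values are illustrative.
const BASE_THRESHOLD = 0.5;

const SEGMENT_DELTAS = {
  newUser: -0.05,      // e.g. surface more items to new users
  strictRetailer: 0.1, // e.g. be more conservative for some retailers
};

function effectiveThreshold(segments) {
  return segments.reduce(
    (threshold, segment) => threshold + (SEGMENT_DELTAS[segment] ?? 0),
    BASE_THRESHOLD
  );
}

function isAvailable(score, segments) {
  return score >= effectiveThreshold(segments);
}
```

Because the deltas are composed at runtime, tuning a single segment means changing one number rather than maintaining a separate threshold for every combination of segments.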

Real-world Impact

  1. Scalability: The lazy score refresh reduced the ingestion load by two-thirds.

  2. Experimentation: A 6X increase in experiments run using the new framework.

  3. Customer Trust: Improved "good found rate," crucial for customer retention.

Key Takeaways

  1. Scalability vs. Consistency: Balancing these two can be challenging but is crucial for maintaining customer trust.

  2. Data Ingestion Strategies: Combining full sync and lazy refresh can optimize both latency and consistency.

  3. Modular Experimentation: A well-designed experimentation framework can significantly reduce engineering work and speed up ML testing.

  4. Threshold Management: Dynamic thresholding can be a powerful tool for handling complex, multi-segment optimization problems.

Instacart's innovative approach to RTA infrastructure demonstrates how engineering, machine learning, and product design can come together to solve complex, real-world challenges at scale.

FIGMA’S PERFORMANCE TESTING JOURNEY FROM A SINGLE MACBOOK TO A DUAL-SYSTEM INFRASTRUCTURE

The Challenge

In 2018, Figma's entire in-house performance testing system ran on a single MacBook, looping through a series of key test scenarios.

Stress Test with 5,000 Comment Pins

Fast forward to 2023, and the landscape has changed dramatically. Figma's codebase has grown in complexity, with new features, products, and a team distributed globally.

The single MacBook approach was no longer sustainable, especially with the team expanding to over 400 engineers and managers.

The Turning Point

The MacBook that had been running tests overheated in October 2020.

Attempts to replicate the tests on another laptop failed, signaling the need for a more scalable, sophisticated approach to performance testing.

Early Stress Tests Created in FigJam

Objectives and Requirements

Figma aimed to build a system that could:

  1. Test every proposed code change in the main monorepo.

  2. Complete performance guardrail checks in under 10 minutes.

  3. Provide detailed performance metrics and comparisons.

The Solution

Figma deployed two systems connected by the same Continuous Integration (CI) system:

  1. Cloud-based System: Runs in GPU-enabled virtual machines (VMs) for every pull request. Due to the noise levels in VMs, a 20% pass margin was set to catch only the most significant regressions.

  2. Hardware System: Runs on an array of test laptops, including older machines, allowing for custom test scenarios and more precise performance metrics.

Both systems shared features like detailed HTML reports and CPU profiles for each run.
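A guardrail check with a pass margin like the one described above could look something like this. The metric names and baseline numbers are made up for the example; only the 20% margin comes from the article:

```javascript
// Rough sketch of a CI performance guardrail: a change fails only if a
// metric regresses by more than the pass margin versus the baseline,
// so that VM noise doesn't produce spurious failures.
const PASS_MARGIN = 0.2; // 20%, per the cloud-based system

function checkGuardrails(baseline, candidate, margin = PASS_MARGIN) {
  const failures = [];
  for (const [metric, baseMs] of Object.entries(baseline)) {
    if (candidate[metric] > baseMs * (1 + margin)) {
      failures.push(metric);
    }
  }
  return { passed: failures.length === 0, failures };
}
```

The trade-off is explicit: a wide margin absorbs VM noise but only catches the most significant regressions, which is why the hardware system exists for more precise measurements.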

Real-world Impact

The new system went live in October 2022 and has been instrumental in identifying performance bottlenecks and regressions early in the development cycle.

It has also empowered engineers to collaborate on performance-sensitive code across teams and time zones.

Key Takeaways

  1. Start Lean, Plan for Scale: While starting with a minimal setup is good, always have a plan for scaling your testing infrastructure.

  2. Proactive vs. Reactive: Shifting from a reactive to a proactive approach in performance testing can save time and improve product quality.

  3. Parallelization and CI: Utilizing parallel runs in CI can significantly speed up the testing process.

  4. Hardware Matters: For graphics-heavy applications, real hardware testing can provide insights that VMs cannot.

  5. Continuous Monitoring: Automated tests and detailed reporting can catch regressions before they impact users, maintaining development velocity.

By embracing a scalable, dual-system approach, Figma has set itself up for sustainable growth, ensuring that performance doesn't become a bottleneck as the product evolves.

BYTE-SIZED TOPICS

INTERESTING PRODUCTS, TOOLS & PACKAGES

TIP OF THE WEEK

React’s lazy() and Suspense allow you to perform code splitting, which can significantly improve your app's performance by reducing the initial bundle size.

import React, { Suspense, lazy } from 'react';

const HeavyComponent = lazy(() => import('./HeavyComponent'));

function App() {
  return (
    <div>
      <main>
        <Suspense fallback={<div>Loading...</div>}>
          <HeavyComponent />
        </Suspense>
      </main>
    </div>
  );
}

export default App;

In this example, HeavyComponent will only be loaded when it's about to be rendered, reducing the initial load time for your application.

The fallback prop in Suspense allows you to display a loading indicator or some other placeholder while the component is being loaded.

This approach is especially useful for components that are not immediately visible, like modals or tabs that the user might never click, or for splitting routes in a single-page application.

MEME OF THE WEEK
