How Canva Saves $3.6 Million Annually in S3 Costs

Hello world.
In the ever-evolving world of software, where Bun vs Node.js debates seem to be the new coffee-break chatter, there's a fresh update that might just overshadow our favorite arguments.
ChatGPT, once confined to the realm of text, has now expanded its horizons to see, hear, and speak. As we navigate these rapid advancements, let's ensure we're not just debating, but also adapting.
In this week’s email:
Architecture: Expedia introduces the Juggler model to optimize hotel rankings.
Infrastructure: Canva saves $3.6M annually by optimizing AWS S3 storage.
Node.js: v20 showcases enhanced performance, as analyzed by Rafael Gonzaga.
React: Josh Comeau explores the nuances of React's Server Components as React celebrates its 10th anniversary.
JavaScript: Dan Abramov discusses the evolving complexities of the JavaScript ecosystem.
Development Practices: Emphasizing the importance of addressing technical debt for software efficiency.
There are only two hard things in computer science: cache invalidation and naming things.

Expedia has developed the Juggler model, an algorithm designed to balance the objectives of travelers, hotel properties, and Expedia's own platform.
How Expedia Ranks Hotels
When a user initiates a hotel search on Expedia, properties are ranked based on a set of algorithms.

Expedia’s Lodging Ranking Search
This system considers not only raw data but also user interactions, such as clicks and bookings, which are fed back into it for continuous learning.
System Architecture and Stakeholders
The platform's architecture is designed to cater to three primary entities:
Hotel properties, which are keen on maximizing their visibility.
Travelers, who demand optimal choices based on their preferences and budget.
Expedia's platform, which aims to maximize revenue while ensuring user retention.
Algorithmic Ranking
Rather than relying solely on direct relevance (like matching a user's desired location or price range), the Juggler model integrates multiple factors to determine the ranking.
This is achieved through a composite scoring system that combines direct relevance with other business-driven adjustments.
The weights for these adjustments are determined dynamically, ensuring a balance between immediate user satisfaction and long-term platform health.
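As a rough illustration of the composite scoring idea described above, here is a minimal Python sketch. It is not Expedia's implementation: the `Property` class, the `margin` and `retention` signals, and the weight values are all hypothetical, and the scoring formula (relevance plus a weighted sum of adjustments) is just one plausible reading of "composite score".

```python
from dataclasses import dataclass


@dataclass
class Property:
    name: str
    relevance: float   # how well the property matches the traveler's query, 0..1
    adjustments: dict  # hypothetical business-driven signals, each 0..1


def composite_score(prop, weights):
    """Direct relevance plus a weighted sum of the adjustment signals."""
    bonus = sum(weights.get(key, 0.0) * value for key, value in prop.adjustments.items())
    return prop.relevance + bonus


def rank(properties, weights):
    """Return properties ordered from highest to lowest composite score."""
    return sorted(properties, key=lambda p: composite_score(p, weights), reverse=True)


hotels = [
    Property("Harbor Inn", relevance=0.80, adjustments={"margin": 0.2, "retention": 0.5}),
    Property("City Suites", relevance=0.70, adjustments={"margin": 0.9, "retention": 0.4}),
]

# With no weights supplied, the ranking falls back to pure relevance.
by_relevance = [p.name for p in rank(hotels, {})]

# Hypothetical fixed weights; the article says the real ones are chosen dynamically.
weights = {"margin": 0.3, "retention": 0.1}
by_composite = [p.name for p in rank(hotels, weights)]
```

With these made-up numbers, the less relevant property overtakes the more relevant one once the adjustments are weighted in, which is the trade-off the dynamic weights are meant to keep in check.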
The Juggler Model's Core
At its heart, the Juggler model is a meta-learning system.

Expedia’s Juggler Framework
It analyzes patterns from historical searches to predict optimal weight parameterizations for future queries.
This is similar to recommendation systems but tailored for the complexities of hotel bookings.
Future Tech Stack Enhancements
The success of the Juggler model has spurred further R&D. Expedia is exploring:
Enhanced personalization techniques, potentially integrating neural networks for deeper user profiling.
Advanced deep learning architectures to refine the search context extraction.
Reinforcement learning to allow the system to adapt in real-time based on user interactions.
P.S. If you’re enjoying the content of this newsletter, please share it with your network: https://www.fullstackexpress.io/subscribe
INFRASTRUCTURE
How Canva Saves $3.6 Million Annually in S3 Costs

Canva, a globally recognized online design platform, relies heavily on AWS for its production workloads.
Services used: Amazon S3, ECS, RDS, DynamoDB.
User growth: Scaled to over 100 million monthly active users since 2013.
Challenge: Efficiently storing a vast library and user-generated content.
Amazon S3: The Backbone
Amazon S3 offers a robust, cost-effective storage solution. Canva's data footprint on S3 is immense, with over 230 petabytes of data.
The challenge?
Understanding the various S3 storage classes and optimizing them for cost and performance.

Amazon S3 Storage Classes
Storage Classes and Canva's Usage
For frequently accessed content like Canva's template libraries, the S3 Standard storage class is ideal.
However, user-generated content, which is often accessed briefly and then seldom revisited, was stored in the S3 Standard-Infrequent Access (S3 Standard-IA) class.
This class is cost-effective but still offers quick retrieval times. For specific cases like logs and backups, Canva uses S3 Glacier Flexible Retrieval.
The introduction of S3 Glacier Instant Retrieval in 2021, which combines low-cost storage with millisecond retrieval times, prompted Canva to reevaluate its storage strategy.
Deep Dive into Data Patterns
To optimize storage costs, Canva needed a granular understanding of its data access patterns.
Canva's primary content: Templates, stock photos, graphics.
User activity: Majority access content shortly after creation.
Previous strategy: Transition data from S3 Standard to S3 Standard-IA after 30 days.
New tool: S3 Storage Class Analysis for better data understanding.
Observation: Retrieval rate drops significantly after the first 15 days.

Canva’s S3 Access Patterns Over Time
Cost Analysis and Transitioning
Transitioning between S3 storage classes incurs costs based on the number of requests.
With over 300 billion objects in Canva's inventory, a blind transition would be costly.
However, the potential savings from using S3 Glacier Instant Retrieval, given its lower storage cost, are recurring.
By analyzing the average object size in each bucket, Canva could calculate the breakeven point for transitioning data and prioritize buckets for migration.
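The breakeven reasoning above can be sketched in a few lines of Python. The prices below are assumptions based on typical us-east-1 list prices, not figures from the article, and the bucket names and object sizes are hypothetical. The sketch also ignores retrieval fees and any minimum-size or minimum-duration charges, so treat it as a first approximation only.

```python
# Assumed prices (USD); check the current AWS price list before relying on them.
TRANSITION_COST_PER_1000_REQUESTS = 0.02  # lifecycle transition into Glacier Instant Retrieval
STANDARD_IA_PER_GB_MONTH = 0.0125
GLACIER_IR_PER_GB_MONTH = 0.004


def breakeven_months(avg_object_size_kb):
    """Months of storage savings needed to repay the one-time transition fee for one object."""
    size_gb = avg_object_size_kb / 1_000_000  # KB -> GB (decimal units)
    transition_cost = TRANSITION_COST_PER_1000_REQUESTS / 1000
    monthly_saving = size_gb * (STANDARD_IA_PER_GB_MONTH - GLACIER_IR_PER_GB_MONTH)
    return transition_cost / monthly_saving


# Hypothetical buckets mapped to their average object size in KB.
buckets = {"thumbnails": 150, "user-uploads": 400, "exports": 2000}

# Buckets with larger objects pay back sooner, so they are migrated first.
priority = sorted(buckets, key=lambda name: breakeven_months(buckets[name]))
```

Because the transition fee is charged per object while the saving scales with object size, buckets full of small objects take much longer to break even than buckets of large ones.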

Canva’s S3 Breakeven Comparison
Results and Takeaways
Canva’s migration was successful, and the results are clear:
Migration simplicity: Applied a lifecycle policy to buckets.
Speed: Migrated nearly 80 billion objects in about two days.
Current data: 130 petabytes of Canva’s 230 petabytes in S3 now in S3 Glacier Instant Retrieval.
Cost savings: Approximately $300,000 per month ($3.6 million annually).
ROI: Positive return seen just a few months after the transition.
Initial cost: Over $1.6 million to transition roughly 80 billion objects.
AWS partnership: Continuous support and tailored storage solutions.
Key lesson: Understand data access patterns before making transitions.
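The figures in the list above are rounded ("approximately", "over", "roughly"), but a quick back-of-the-envelope check in Python shows how they fit together. The variable names are illustrative, and the results are only as precise as the rounded inputs.

```python
# Rounded figures quoted in the list above.
transition_cost_usd = 1_600_000
monthly_savings_usd = 300_000
objects_migrated = 80_000_000_000

# Months of savings needed to recover the one-time transition cost.
payback_months = transition_cost_usd / monthly_savings_usd

# Yearly savings and the net result for the first year after migrating.
annual_savings_usd = monthly_savings_usd * 12
first_year_net_usd = annual_savings_usd - transition_cost_usd

# Implied transition fee per 1,000 objects.
cost_per_1000_objects = transition_cost_usd / (objects_migrated / 1000)
```

The payback works out to a little over five months, in line with the "few months" quoted for ROI, and the implied fee is about $0.02 per 1,000 objects.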

Rafael Gonzaga's analysis of Node.js v20, based on three distinct benchmark suites, reveals significant performance enhancements compared to earlier versions. Strategic initiatives such as 'Performance' and 'Startup Snapshot', alongside impactful pull requests, are driving these improvements. Despite some regressions, the Node.js performance team's ongoing efforts suggest a bright future for the platform's efficiency.

Josh Comeau highlights React's 10th anniversary and its latest innovation, React Server Components, which allow components to run exclusively on the server. Amidst the online confusion about this new feature, Comeau dives deep into how it works, clarifying its benefits and how it integrates with Server Side Rendering (SSR). For context, he provides a primer on SSR, explaining how it gives users a fully formed HTML document, improving initial load times, and how client-side React then adds interactivity on top.
JAVASCRIPT
The Melting Pot of JavaScript

In this article, Dan Abramov discusses the complexities and evolution of the JavaScript ecosystem. He highlights the inventive and ubiquitous nature of JavaScript, describing it as a reflection of human culture. He advocates for making JavaScript tools more user-friendly, with sensible defaults, minimal configuration, and clear output, so that the experience is more approachable for newcomers. An older article, but still very relevant today.
DEVELOPMENT PRACTICES
Measuring Technical Debt to Avoid Boiling Frog Syndrome

Technical debt represents the gap between the current and ideal state of software, impacting the ease of future modifications. It can accumulate from quick solutions, intentional decisions, or changing requirements. Addressing this debt early, through visibility and proactive management, is crucial to maintain software agility and efficiency.
AROUND THE WEB
What Else is Trending
✅ Learn: Sir Tim Berners-Lee launched the first website in 1991 at CERN on a NeXT computer. While there are now over 1.8 billion websites, the first one can still be visited at its original address.
✅ Watch: Dive into TypeScript's captivating documentary, exploring its origins, purpose, and insights from its creators and Microsoft insiders.
✅ Listen: Explore fixes for glitchy animations in the latest episode of The CSS Podcast.
✅ Contribute: Contributing to open source is a rewarding way to collaborate and innovate. If you're wondering where to start, look no further.
✅ Join: Project IDX is Google's new cloud-based initiative to streamline full-stack, multiplatform app development through a familiar yet innovative web-based workspace. Join the waitlist.
COMMUNITY SPOTLIGHT
Hot Picks: Cool Tools in the Dev Community
Want to showcase something that you’re working on? Let us know!
SOCIAL MEDIA
Hall of Fame or Wall of Shame?
The MGM Grand is on a quest for a Red Hat Linux System Admin superhero, ready to forsake sunlight and weekends, all to resurrect the almighty slot machines from their digital slumber. Who needs a social life when you can rebuild IT infrastructure?
The MGM Grand is looking to hire a Red Hat Linux System Admin willing to work 10 hours per day 7 days a week to completely rebuild its IT environment from the ground up and get the slot machines working again. (h/t @zero04013437)
— Las Vegas Locally 🌴 (@LasVegasLocally)
10:30 PM • Sep 21, 2023
Over to you: What would you do?
MEMES
Battle of the Runtimes
What men are thinking about: Roman Empire
What JavaScript developers are thinking about: Bun
