JetStream 3.0: Redefining Browser Performance Benchmarks for the Modern Web

Introduction

In a collaborative effort with Google and Mozilla, the WebKit team has unveiled JetStream 3.0, a major update to the widely used cross-browser benchmark suite. While the joint announcement highlights the suite's breadth and the partnership behind it, this article delves into the specific challenges the WebKit team tackled and the engineering advancements made in JavaScriptCore to drive performance improvements.

Source: webkit.org

The Need for a Refresh

Benchmarks are essential tools for browser engine developers to gauge and optimize performance. However, the web evolves rapidly, and any benchmark can become outdated as new best practices emerge. Moreover, once the most obvious optimizations are exhausted in a benchmark, subsequent improvements tend to become increasingly workload-specific and less generalizable. JetStream 3 addresses these issues by offering both a content refresh and a fundamental shift in how performance is measured, particularly concerning WebAssembly and the scale of modern web applications.

A New Approach to WebAssembly Benchmarking

One of the most significant changes in JetStream 3 is how it measures WebAssembly (Wasm) workloads. To appreciate this change, it's important to understand Wasm's origins. When JetStream 2 was released, Wasm was still emerging. Early adopters were large C/C++ projects that had previously targeted asm.js. The expectation was that users would tolerate a long, one-time startup cost in exchange for high runtime throughput, especially in applications like video games. Consequently, JetStream 2 scored Wasm performance in two separate phases: Startup and Runtime.

The Evolution of WebAssembly

As browser engines matured, they became remarkably efficient at instantiating WebAssembly modules. This success, however, introduced a new challenge. When startup times improved from hundreds of milliseconds to just a few milliseconds, even micro-optimizations began to have disproportionate effects on the score. In WebKit, for example, optimizations reduced startup times so dramatically that for certain smaller workloads, the measured instantiation time effectively dropped to zero.

Overcoming the Infinity Problem

In JetStream 2, each iteration's time was measured using Date.now(), which rounds down to the nearest millisecond. As a result, any sub-1 ms time was recorded as 0 ms. This interacted badly with the scoring formula, Score = 5000 / Time: when the time hit zero, the score became infinite, rendering the other scores meaningless. The team eventually patched JetStream 2.2 to clamp the score at 5000, but this was only a stopgap.
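The arithmetic behind this can be sketched in a few lines of TypeScript. This is an illustration rather than JetStream's actual harness code: the function names are hypothetical, and the millisecond truncation is modeled with Math.floor on fractional timestamps instead of calling Date.now() directly, so the example is repeatable.

```typescript
// Scale constant taken from the formula quoted in the text: Score = 5000 / Time.
const SCALE = 5000;

// Models a timer with whole-millisecond resolution, as the text describes for
// Date.now(): both endpoints are truncated before they are subtracted.
// (Hypothetical helper, not part of JetStream.)
function measuredMs(startMs: number, endMs: number): number {
  return Math.floor(endMs) - Math.floor(startMs);
}

// Unpatched formula: a 0 ms measurement divides by zero, and in JavaScript
// a positive number divided by zero evaluates to Infinity.
function rawScore(timeMs: number): number {
  return SCALE / timeMs;
}

// The clamp described for JetStream 2.2 (function name is hypothetical).
function clampedScore(timeMs: number): number {
  return Math.min(rawScore(timeMs), SCALE);
}

// A startup that really took 0.7 ms but falls inside one millisecond tick.
const fastStartup = measuredMs(100.2, 100.9);
console.log(fastStartup);               // 0
console.log(rawScore(fastStartup));     // Infinity
console.log(clampedScore(fastStartup)); // 5000

// A startup of roughly 250 ms still scores sensibly.
console.log(rawScore(measuredMs(100.2, 350.9))); // 20
```

Note that the clamp also gives every measurement of 1 ms or less the same score of 5000, so it hides the infinity without restoring any resolution at the fast end. That is one way to read why the fix could only be temporary.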

While an infinite score might seem like a victory, it signaled that browser engines had outgrown JetStream 2's Wasm subtests. On today's web, Wasm is often on the critical path for page loads—used in libraries, image decoders, and UI frameworks. A zero startup time in a microbenchmark does not reflect real-world scenarios where Wasm modules must be loaded and executed efficiently in the context of larger applications.

Broader Implications for Performance Measurement

JetStream 3 introduces a more nuanced benchmarking methodology that accounts for modern Wasm usage patterns. Instead of treating startup and runtime as separate phases, the new suite integrates them into a cohesive performance metric. This shift ensures that benchmarks better reflect the actual user experience, where fast instantiation and sustained throughput are both crucial.
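The article does not spell out how JetStream 3 combines the two phases, so the following TypeScript is only a sketch of the general idea under stated assumptions. It contrasts timing startup and runtime as separate spans with timing one span that covers both. All names here (timePhases, timeCombined, scriptedClock) are hypothetical, and the clock is passed in as a parameter so the demo can replay fixed whole-millisecond readings.

```typescript
// A clock is any function returning a timestamp in milliseconds.
type Clock = () => number;

// Replays fixed readings in order so the demo is repeatable (illustrative only).
function scriptedClock(readings: number[]): Clock {
  let i = 0;
  return () => readings[i++];
}

// Two-phase timing in the style the text attributes to JetStream 2:
// startup and runtime are measured as separate spans.
function timePhases(instantiate: () => void, run: () => void, now: Clock) {
  const t0 = now();
  instantiate();
  const t1 = now();
  run();
  const t2 = now();
  return { startupMs: t1 - t0, runtimeMs: t2 - t1 };
}

// Assumed alternative: one span covering instantiation and execution together.
function timeCombined(instantiate: () => void, run: () => void, now: Clock): number {
  const t0 = now();
  instantiate();
  run();
  return now() - t0;
}

// Placeholder workload; a real harness would instantiate and run a Wasm module.
const noop = (): void => {};

// With whole-millisecond readings, a very fast startup can measure as 0 ms...
const phases = timePhases(noop, noop, scriptedClock([100, 100, 113]));
console.log(phases.startupMs, phases.runtimeMs);

// ...while a single span over the same work stays non-zero.
const combinedMs = timeCombined(noop, noop, scriptedClock([100, 113]));
console.log(combinedMs);
```

The point of the contrast is that a span long enough to include real execution is far less likely to round to zero than a startup-only span, whatever scoring formula is then applied to it.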

Beyond Wasm, JetStream 3 also scales up its workloads to match the complexity of modern web applications. It includes workloads that mirror real-world tasks such as data visualization, image processing, and interactive graphics. This broader scope helps developers identify optimizations that benefit a wide range of user scenarios, rather than just narrowly defined microbenchmarks.

Conclusion

JetStream 3.0 represents a thoughtful evolution of browser benchmarking. By addressing the limitations of its predecessor—particularly the infinity problem and the outdated Wasm model—the suite provides a more accurate and actionable tool for performance engineering. WebKit's contributions to this release, from optimizing Wasm instantiation to rethinking scoring methodologies, demonstrate the team's commitment to pushing the boundaries of web performance. As the web continues to evolve, benchmarks like JetStream will remain indispensable for ensuring that browsers deliver fast, reliable experiences to users everywhere.

