Taking Air Hockey to production
Building a real-time multiplayer game
A few years back I built an air hockey game from scratch. I even deployed it on an AWS EC2 instance and let my children play it. It never went further than that though, and it was what you could call an MVP: the polish wasn't there. But there was something about this game. It didn't really fit the mold of anything else I had played. You had real control of your paddle and could move it as fast as real-world physics would allow, and the puck behaved very much like it does on the arcade air hockey tables I've played so many times. The nostalgia was there. What if I gave this another go? Polish it up, optimize performance and deploy it globally. Could it be something that people would actually like playing, and might even pay the cost of a coffee for?
What had I built?
The game was a React frontend around an HTML canvas: a game board, two paddles and a puck, drawn as plain primitives (circles, lines, rectangles and so on). The player has control over their paddle; an event listener tracks touch/click-and-hold events and moves the paddle to wherever the pointer goes on the board. A requestAnimationFrame loop ticks at the device's screen refresh rate and sends the paddle coordinates over a WebSocket to a backend written in Java, where the actual game engine runs at 60Hz. Each tick runs a collision check based on the current coordinates and velocities of the puck and the player paddles, and then broadcasts the new puck state and the opponent paddle positions back to the players.
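The client-side input loop boiled down to something like this (a minimal sketch in TypeScript; the original is plain JavaScript and the names and URL are illustrative):

```ts
// Sketch of the input loop: track the pointer, send the paddle position
// to the game server once per animation frame. Names/URL are placeholders.
const canvas = document.querySelector("canvas")!;
const socket = new WebSocket("wss://game.example/ws");

let paddle = { x: 0.5, y: 0.9 }; // fractions of the board, see the next section

canvas.addEventListener("pointermove", (e: PointerEvent) => {
  const rect = canvas.getBoundingClientRect();
  paddle = {
    x: (e.clientX - rect.left) / rect.width,
    y: (e.clientY - rect.top) / rect.height,
  };
});

// requestAnimationFrame fires at the screen refresh rate (60Hz, 120Hz, ...).
function tick() {
  if (socket.readyState === WebSocket.OPEN) {
    socket.send(JSON.stringify(paddle));
  }
  requestAnimationFrame(tick);
}
requestAnimationFrame(tick);
```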
Aspect ratio
A neat thing I had done was to fix the game board aspect ratio and convert the positions of the game objects to percentages of the game board width and height. That way it wouldn't matter what the screen resolution was on the two players' different screens: one could be playing on a large desktop monitor and the other on a small mobile screen.
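In code that amounts to two tiny conversion helpers plus a function that fits the largest board with the fixed aspect ratio into the current window (a sketch; the aspect ratio value is an assumption):

```ts
// Sketch of resolution-independent coordinates. Positions are stored as
// fractions of the board, so both players can use any screen size as long
// as the board keeps the same aspect ratio. BOARD_ASPECT is an assumed value.
const BOARD_ASPECT = 9 / 16; // width / height

function toNormalized(px: number, py: number, boardWidth: number, boardHeight: number) {
  return { x: px / boardWidth, y: py / boardHeight };
}

function toPixels(nx: number, ny: number, boardWidth: number, boardHeight: number) {
  return { x: nx * boardWidth, y: ny * boardHeight };
}

// Fit the largest board with the fixed aspect ratio into the available window.
function fitBoard(windowWidth: number, windowHeight: number) {
  const width = Math.min(windowWidth, windowHeight * BOARD_ASPECT);
  return { width, height: width / BOARD_ASPECT };
}
```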
Mirroring
In addition, I made sure to mirror the positions of the opponent paddle and the puck whenever state was broadcast down to the respective players. That way, each player sees their own paddle at the bottom of the screen, which is optimized for playing on a smartphone where you control the paddle with your thumb while holding the device in one hand.
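Conceptually the mirroring is just a flip around the board center for one of the two players (a sketch; the original does this on the game server, shown here in TypeScript with illustrative names):

```ts
// Sketch of the per-player mirroring applied when broadcasting state.
interface GameState {
  puck: { x: number; y: number };
  ownPaddle: { x: number; y: number };
  opponentPaddle: { x: number; y: number };
}

// Flip both axes around the board center (0.5, 0.5) for the player whose goal
// sits at the "top" in absolute coordinates, so that each player renders their
// own paddle at the bottom of the screen.
function mirrorForPlayer(state: GameState, isTopPlayer: boolean): GameState {
  if (!isTopPlayer) return state;
  const flip = (p: { x: number; y: number }) => ({ x: 1 - p.x, y: 1 - p.y });
  return {
    puck: flip(state.puck),
    ownPaddle: flip(state.ownPaddle),
    opponentPaddle: flip(state.opponentPaddle),
  };
}
```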
Physics
Looking back, the physics engine is probably what I am most proud of. I had not researched anything about game design or how to build a multiplayer game. I used the lowest-level primitives for everything and relied on my own knowledge of physics, vector math and linear algebra. A circle bouncing off walls was of course no problem, but two circles hitting each other at different speeds and from arbitrary directions: how do you calculate the bounce for that? This was before LLMs, so I had to go old school and look up the formula for vector projection in my linear algebra book from university. I remember spending a weekend thinking and experimenting around this until my brain hurt before I got it right.
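The core of it is the classic two-body elastic collision: project the relative velocity onto the line between the two centers and exchange that component, weighted by mass, while leaving the tangential component alone. A sketch of that math (in TypeScript rather than the Java/Rust of the actual engine, with illustrative names):

```ts
// Sketch of the circle-vs-circle bounce using vector projection.
interface Body {
  x: number; y: number;   // position
  vx: number; vy: number; // velocity
  mass: number;
  radius: number;
}

// Elastic collision: project the relative velocity onto the collision normal
// (the line between the two centers) and exchange that component, weighted by
// mass. The tangential component is left untouched.
function resolveCollision(a: Body, b: Body): void {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const distSq = dx * dx + dy * dy;
  const minDist = a.radius + b.radius;
  if (distSq === 0 || distSq > minDist * minDist) return; // not touching

  const relAlongNormal = (a.vx - b.vx) * dx + (a.vy - b.vy) * dy;
  if (relAlongNormal < 0) return; // already separating

  // Scalar from the vector-projection form of the elastic collision formula.
  const factor = (2 * relAlongNormal) / ((a.mass + b.mass) * distSq);
  a.vx -= factor * b.mass * dx;
  a.vy -= factor * b.mass * dy;
  b.vx += factor * a.mass * dx;
  b.vy += factor * a.mass * dy;
}
```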
Dusting it off
So here I was with the intention of finishing what I started. The game was almost done, right? How hard could this be?
Cleanup and optimization
I got started on the low-hanging fruit. Why had I used a heavy JavaScript library like React? The frontend was basically an HTML canvas with some JavaScript. And why TailwindCSS? I threw them both out, opting instead for plain JavaScript with Vite for DX, and plain CSS for styling. At least 50 KB gone from the production bundle, and less complexity.
The next thing to tackle was GC pressure in the Java backend. In order to serve many concurrent games and minimize stop-the-world garbage collection (a potential cause of stutter in the gameplay), I did a major refactor of the game engine to remove the immutable objects that were created on every tick and only lived for those 16 ms (60Hz game loop). At the time I was into functional programming and immutability, and perhaps I'd been too influenced by our tech lead at work, who was a huge proponent of that paradigm. Returning a few years later and a lot wiser, I realised there is a time and place for everything. This game needed long-lived mutable objects to keep allocation pressure, and thus stop-the-world pauses, down. And so it was.
Lastly, I switched to raw WebSockets (instead of STOMP) to reduce network overhead, and upgraded to Java 25 and Spring Boot 4.
Sound
One thing I'd for some reason omitted from the MVP was game sound. That was glaringly obvious when returning to the game a few years later. The solution was simple: on the game server, whenever a wall hit, puck collision or goal is detected, send it to the client as a bit mask. On the client, experiment with the Web Audio API until something fits each event, and trigger it. All of it lives in an appropriately named client class called SoundEngine.
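Roughly like this (a minimal sketch; the bit layout and the actual sounds are assumptions):

```ts
// Sketch of the client SoundEngine: the server packs detected events into a
// bit mask each tick. The exact bit positions here are assumed.
const WALL_HIT = 1 << 0;
const PUCK_HIT = 1 << 1;
const GOAL = 1 << 2;

class SoundEngine {
  private ctx = new AudioContext();

  handleEvents(mask: number): void {
    if (mask & WALL_HIT) this.beep(220, 0.03);
    if (mask & PUCK_HIT) this.beep(440, 0.05);
    if (mask & GOAL) this.beep(880, 0.3);
  }

  // Minimal Web Audio beep: an oscillator routed through a gain node.
  private beep(frequency: number, duration: number): void {
    const osc = this.ctx.createOscillator();
    const gain = this.ctx.createGain();
    osc.frequency.value = frequency;
    gain.gain.value = 0.2;
    osc.connect(gain).connect(this.ctx.destination);
    osc.start();
    osc.stop(this.ctx.currentTime + duration);
  }
}
```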
Get it deployed
So I had something here and could keep iterating, but in order not to waste time I wanted to get it deployed somewhere so that testing would actually account for network latency. This is a real-time multiplayer game, and playing against yourself locally might obscure issues that you would only notice once you deploy to production. For this reason I wanted to set up a deployment pipeline early.
Once you start thinking along these lines your mind naturally wanders towards scalability. If this game went viral, wouldn't I want to be able to scale out my game servers? And what about people playing from different parts of the world? The experience would suck for a US player on the west coast connecting to servers in central Europe. It was clear I needed a control plane. Users would connect to the lobby via a gateway (a separate server application which I developed with the Java Helidon framework), which in turn would know about the active game servers and be able to start and shut down servers based on traffic.
Fly.io made this very simple. Their Machines API lets you easily start a Docker container and get the FQDN of the new machine, which can be relayed to the client for the WebSocket connection. The gateway would also be the server connecting to the database for persisting users, game statistics and the leaderboard. I decided on a good ol' self-managed Postgres database, a shared-CPU instance for the gateway and dedicated instances for the game servers. If there was no traffic I would scale the game servers to zero to reduce cost.
Websockets and TCP not good enough
So now I could try the game on an actual server deployed in Amsterdam. Right away I noticed the bane of real-time multiplayer games: TCP. The overhead was too high. Every packet sent has to be acked, and if a packet gets lost somewhere along the way, no ack comes back and the whole stream stalls waiting for the retransmission of that one packet (which hopefully gets acked this time). Sometimes the game ran pretty smoothly, but regularly and at random there would be network jitter. I went down a rabbit hole trying to fix this with client-side interpolation, snapshotting and jitter buffers (essentially adding deterministic latency on purpose so that lost packets can be absorbed). Nothing made even the slightest dent in the performance of the game. I was basically ready to give up at this point. But one thought kept running through my mind: UDP?
UDP and move to Hetzner
In order to get rid of head-of-line blocking inherent to TCP, I decided to try out UDP for communicating game state between browser and game server. Now, that was easier said than done when it comes to browsers. I had two options:
WebRTC: Peer-to-peer communication. Use the gateway as a signaling server to exchange SDP (Session Description Protocol) offers and answers, declaring the intent to communicate over UDP between the browser and the game server. Then use STUN (Session Traversal Utilities for NAT), in practice pinging a public Google STUN server, to find out the public IP address of the device the browser runs on. Finally, transition to the ICE phase (Interactive Connectivity Establishment), where the game server (whose IP address is known) and the client (the browser) do UDP hole punching through NAT gateways and routers to find a working network path between the two peers. After that we are good to go.
WebTransport: The modern replacement for WebRTC for this use case. It runs on top of HTTP/3's QUIC, which may eventually carry most web traffic (and aptly uses UDP instead of TCP). It supports multiplexing and both unidirectional and bidirectional streams. Reliable, retransmitted streams are of course supported, and implemented more efficiently than in TCP, but if you want you can use fire-and-forget datagrams instead (which I do). However, when I started this WebSocket-replacement research, my browser of choice on my Mac was Safari, which did not support it yet. Weirdly, Apple released new versions of both iOS and macOS while I was developing this which do support WebTransport (what are the chances of that?).
In the end I made WebTransport the standard with WebRTC fallback on devices that run older browsers.
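On browsers that support it, the WebTransport path is pleasantly small. A minimal connection sketch (the URL and the helpers handleStateUpdate/connectWebRtcFallback are placeholders, not the actual client code):

```ts
// Placeholders for pieces defined elsewhere in the client.
declare function handleStateUpdate(data: Uint8Array): void;
declare function connectWebRtcFallback(url: string): Promise<(x: number, y: number) => void>;

// Prefer WebTransport datagrams; fall back to a WebRTC data channel otherwise.
async function connectTransport(url: string) {
  if ("WebTransport" in window) {
    const transport = new WebTransport(url);
    await transport.ready; // QUIC + TLS handshake complete

    // Fire-and-forget datagrams: no retransmission, no head-of-line blocking.
    const writer = transport.datagrams.writable.getWriter();
    const reader = transport.datagrams.readable.getReader();

    // Read loop for incoming state updates.
    (async () => {
      for (;;) {
        const { value, done } = await reader.read();
        if (done) break;
        handleStateUpdate(value);
      }
    })();

    // Return a sender for paddle coordinates.
    return (x: number, y: number) => {
      writer.write(new Float32Array([x, y]));
    };
  }
  return connectWebRtcFallback(url);
}
```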
This whole thing took so much more time than I would have wanted. But to make a long story short: implementing UDP communication between the client and game server required me to move off Fly.io, because I would have had to provision static IP addresses for the game servers. These servers ran on performance instances which were already quite expensive, and pre-provisioning reserved IP addresses at a fixed cost per unit did not seem sustainable. So I tore it all down and deployed everything to a Hetzner box instead (gateway, game server, Postgres database and a Valkey cache). A CPX31 for €17.99 a month. The frontend I deployed to Cloudflare, where I also bought a domain: airhockey.app.
A brief note on Caddy
QUIC requires TLS, and of course I need this for my gateway too. Caddy is a
reverse proxy written in Go which handles the routing to my different servers
and fetches free SSL certificates for my domains using Let's Encrypt. It also
handles CORS for the game server — the browser connects from
play.airhockey.app to game.airhockey.app for the WebRTC/WebTransport
handshake, so the right headers need to be in place. Finally, it acts as a
security boundary by only exposing the specific paths that should be publicly
routable and returning 404 for everything else. Insanely convenient for
something configured in 30 lines.
Java and its garbage collector
Even with UDP I was not satisfied with the "smoothness" of the puck movement
across the board. I forgot to mention above that when I was deploying the Java
game server on fly.io I decided to use GraalVM to compile a native image instead
of running the server on a JVM. The reason was that I wanted fast startup times
on fly.io, where I was spinning up game servers based on demand. When I now
started to revisit performance on my pre-provisioned Hetzner box, I was advised
to use Generational ZGC because it optimizes for sub-millisecond GC pauses
(kind of important in a real-time multiplayer game — or any game for that
matter). This, it turns out, was not available for native images. So I moved back to deploying on a standard JVM and opted into ZGC.
Port to Rust
While I was fine-tuning things and looking into different garbage collectors, it dawned on me: why do I even use Java here in the first place? I remembered having read an article about Microsoft using LLM agents to port their legacy C++ code to Rust. Here I had a game server which at this point only contained the game logic. It seemed like the perfect candidate for a port. Would I dare? Turns out I did, and now I never have to worry about GC pauses again (with the added benefit of consuming a lot less RAM).
The true cause of less than optimal smoothness
Even with the brand new Rust server, I was annoyed by the smoothness of the puck movement not being perfect. The issue was actually something else: the JavaScript requestAnimationFrame (RAF) tick was out of phase with the game server tick. Essentially, the RAF loop would draw the newly arrived puck coordinate whether it was 1 ms old or 15 ms old (within our 16.66 ms window). This non-deterministic timing caused the puck's perceived speed to vary between frames, which was noticeable even at 60Hz. The solution was to record the exact client-side timestamp the moment a state update arrives, intentionally delay the visual rendering by one tick, and use that elapsed time to smoothly and deterministically interpolate the puck's position between frames.
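A sketch of that interpolation (illustrative names, assuming a small buffer of time-stamped snapshots):

```ts
// Sketch of the fix: time-stamp each incoming snapshot, render one server
// tick (~16.7 ms) behind, and interpolate between the two snapshots that
// straddle the render time.
const SERVER_TICK_MS = 1000 / 60;

interface Snapshot {
  receivedAt: number; // performance.now() when the update arrived
  puck: { x: number; y: number };
}

const snapshots: Snapshot[] = [];

function onStateUpdate(puck: { x: number; y: number }): void {
  snapshots.push({ receivedAt: performance.now(), puck });
  if (snapshots.length > 4) snapshots.shift(); // keep a small buffer
}

// Called from the requestAnimationFrame loop with performance.now().
function interpolatedPuck(now: number): { x: number; y: number } | null {
  const renderTime = now - SERVER_TICK_MS; // deliberate one-tick delay
  for (let i = snapshots.length - 1; i > 0; i--) {
    const a = snapshots[i - 1];
    const b = snapshots[i];
    if (a.receivedAt <= renderTime && renderTime <= b.receivedAt) {
      const span = b.receivedAt - a.receivedAt;
      const t = span > 0 ? (renderTime - a.receivedAt) / span : 1;
      return {
        x: a.puck.x + (b.puck.x - a.puck.x) * t,
        y: a.puck.y + (b.puck.y - a.puck.y) * t,
      };
    }
  }
  // Fall back to the newest snapshot if there is nothing to interpolate between.
  return snapshots.length ? snapshots[snapshots.length - 1].puck : null;
}
```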
Auth
When I built my initial game I intentionally stayed away from authentication. It is a turn-off to have to create an account, and I wanted the barrier for playing the game to be minimal. So I only required the user to provide a game handle when they first entered the page. Under the hood I appended a UUID to that handle and persisted it in the browser's local storage. That was all I needed to be able to have a unique name for the players when they started communicating with the game server.
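Something along these lines (a sketch; the key names are illustrative):

```ts
// Sketch of the original frictionless identity: a handle plus a generated
// UUID, persisted in localStorage so it survives page reloads.
function getOrCreatePlayerId(handle: string): string {
  const existing = localStorage.getItem("playerId");
  if (existing) return existing;
  const playerId = `${handle}-${crypto.randomUUID()}`;
  localStorage.setItem("playerId", playerId);
  return playerId;
}
```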
Now, however, I wanted to add leaderboards (with an Elo-style rating, the classic chess rating algorithm) and potentially have the ability to monetize the game in the future. So I would have to add proper user accounts. In the end I settled on Google and Apple OAuth only. The reasoning is that it would still be minimal effort for people to sign up and play, and I wouldn't have to worry about account creation abuse and email verification. I might lose some players because of this, but the tradeoff was worth it (at least for the time being).
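For reference, the standard Elo update is only a couple of lines (a minimal sketch; the K-factor of 32 is an arbitrary choice, not necessarily what runs in production):

```ts
// Sketch of an Elo-style rating update for the leaderboard.
const K = 32; // assumed K-factor

function updateElo(winner: number, loser: number): [number, number] {
  // Expected score of the winner given the rating difference.
  const expectedWin = 1 / (1 + 10 ** ((loser - winner) / 400));
  const delta = K * (1 - expectedWin);
  return [winner + delta, loser - delta];
}
```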
Load testing
I wanted to know how many concurrent games I could handle on my server. I temporarily added dev endpoints to bypass the Google and Apple auth layer, provisioned bot users in the database, and wrote a script that would spin up a successively increasing number of concurrent games. Cloudflare ended up throttling me at around 200 connections since they all originated from the same IP. There is probably a way around this, but I decided it was good enough for now — the chances of me having that many concurrent games any time soon are slim.
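The ramp-up itself was nothing fancy, roughly this shape (a sketch with hypothetical endpoints, not the actual script):

```ts
// Sketch of the ramp-up: open an increasing number of bot games, hold the
// load, then tear it down. The /dev/bot endpoint and URL are hypothetical.
async function rampUp(maxGames: number): Promise<void> {
  for (let games = 10; games <= maxGames; games += 10) {
    const sockets: WebSocket[] = [];
    for (let i = 0; i < games * 2; i++) {
      // Two bot players per game, joining through the temporary dev endpoint.
      sockets.push(new WebSocket(`wss://game.example/dev/bot?player=${i}`));
    }
    console.log(`Holding ${games} concurrent games...`);
    await new Promise((resolve) => setTimeout(resolve, 30_000));
    sockets.forEach((s) => s.close());
  }
}
```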
Observability
I needed a way of monitoring errors and performance metrics of my running servers, so I added a server-rendered admin dashboard directly in the gateway, protected by an HMAC session cookie. It displays real-time gauges (concurrent games, online users, matchmaking queue depth), game server tick latencies with hourly history charts, and gateway JVM metrics (heap, threads, GC pauses, uptime). A sliding-window error counter tracks exceptions by category, and when the count crosses a threshold a Discord webhook fires so that I'm alerted in real time.
On the client side, the browser fires telemetry beacons for transport negotiation outcomes (WebTransport vs WebRTC fallback), player latency samples (p50/p95/p99), and JavaScript errors. The dashboard aggregates all of this alongside a device and browser breakdown.
DB backup
The Postgres database runs on the same box as the gateway and game server, which might not be sustainable in the long run. However, it is performant and cost-effective. I considered a managed offering like Neon, but I couldn't sleep well knowing it would be usage-based billing rather than a fixed monthly cost. I might revisit this at some point in the future though.
But I would at least have to be responsible enough to take snapshots of the DB and upload them to an S3 bucket. In a previous project I already had a bucket configured for this, with the limited AWS credentials required. I wrote a homemade bash script, run by cron, which makes a pg_dump at 3 AM UTC every day and uses the AWS CLI installed on the Hetzner box to upload the snapshot to the S3 bucket. I also added retention by deleting the oldest snapshot once more than ten exist.
Deployment
Similarly, when it comes to deployment I have stayed away from any off-the-shelf
offerings or SaaS services. I'm a solo developer and don't need a CI/CD platform
with a million features that I won't use. Instead I wrote a single deploy.sh
bash script. It first runs all test suites (frontend Vitest, gateway Maven, Rust
game server), validates API contracts between the components, and only then
builds the production artifacts. The backend is rsync'd to the Hetzner box where
docker-compose rebuilds and restarts the containers. The frontend is deployed to
Cloudflare Pages via wrangler. One command, a few minutes, and everything is
live.
Try it out here: play.airhockey.app.
What's next
I will see if I can get a limited number of people in Europe to play this now, to gather some metrics and learn. Then, if it gains some traction, I will introduce a US region just for game servers (keeping the control plane and DB on the box in Europe), adopting a hub-and-spoke model. But we'll see.
