The True Code Of Production Systems

Silence Is a Design Decision

What users see when your system fails, slows, or goes offline is not polish. It is production design.

Gaurav Sharma

30 Mar 2026 • 5 min read

This is part of The True Code of Production Systems. The series is about the decisions that only become visible when something breaks in production.

Most teams build UI for one scenario: everything works. The API responds fast, the data is there, the user is happy. That UI gets designed, reviewed, and shipped.

What never gets designed is what happens when things do not work. And in production, things do not work all the time. Networks drop. Services slow down. Requests fail. Responses never arrive.

This article is not about why that matters. It is about exactly what your UI should do in each of those moments. Every case, a specific behavior. No ambiguity.

When the User Takes an Action

The moment a user clicks a button, submits a form, or triggers anything that touches your backend, your UI has one job: confirm that the action was received.

Do not wait for the response. Do not wait to see if it succeeds. The moment the user acts, something must change on screen. Disable the button. Change its label to "Saving..." or "Submitting..." or whatever is truthful for that action. Show a spinner if the response is not effectively instant.

This is not optional polish. It is the baseline. Without it, users click again. And again. You get duplicate submissions, broken state, and confused users who had no signal that the system heard them.

The rule: action happens, UI responds immediately, before you know the outcome.
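One way to enforce that rule is to wrap the action in a guard that updates the UI first and ignores repeat clicks while a request is in flight. This is a minimal sketch, not a prescribed implementation: createSubmitGuard, the render callback, and the labels are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper; the names and labels are illustrative, not from a library.
type SubmitState = { pending: boolean; label: string };

function createSubmitGuard<T>(
  action: () => Promise<T>,
  render: (state: SubmitState) => void,
  labels = { idle: "Save", pending: "Saving..." },
) {
  let pending = false;

  return async function submit(): Promise<T | undefined> {
    // Ignore repeat clicks while a request is already in flight.
    if (pending) return undefined;

    // Update the UI first, before the outcome is known.
    pending = true;
    render({ pending: true, label: labels.pending });

    try {
      return await action();
    } finally {
      // Re-enable the control whether the action succeeded or failed.
      // A failure still propagates to the caller, which shows the error state.
      pending = false;
      render({ pending: false, label: labels.idle });
    }
  };
}
```

The render callback is where a real UI would disable the button and swap its label. The guard itself has no DOM dependency, so it can be exercised in a unit test.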

When the Request Is Taking Time

Different wait durations demand different responses from your UI.

Under 1 second: A spinner is enough. Keep it subtle.

1 to 10 seconds: A spinner plus words. "Processing your request..." or "Uploading your file..." Tell them what is happening.

Over 10 seconds: A progress indicator if you can calculate one, an estimated time if you have it, and always a Cancel button. Never trap a user in a wait with no escape.

The words matter more than most teams think. "Processing..." is not the same as "Analyzing your document..." The second tells the user what the system is doing with their time. It transforms anxious waiting into informed waiting. The experience of those ten seconds changes entirely.
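The three tiers above can be expressed as a small pure function that picks the indicator from the elapsed time. This is a sketch under assumptions: waitIndicator and the WaitIndicator type are hypothetical names, and the thresholds simply mirror the ones in this article.

```typescript
// Hypothetical types and function; thresholds follow the tiers described above.
type WaitIndicator =
  | { kind: "spinner" }
  | { kind: "spinner-with-message"; message: string }
  | { kind: "progress"; message: string; percent?: number; showCancel: true };

function waitIndicator(elapsedMs: number, message: string, percent?: number): WaitIndicator {
  // Under 1 second: a subtle spinner is enough.
  if (elapsedMs < 1000) return { kind: "spinner" };

  // 1 to 10 seconds: spinner plus words that say what is happening.
  if (elapsedMs <= 10000) return { kind: "spinner-with-message", message };

  // Over 10 seconds: progress if it can be calculated, and always a Cancel button.
  return percent === undefined
    ? { kind: "progress", message, showCancel: true }
    : { kind: "progress", message, percent, showCancel: true };
}
```

A component would call this on a timer while the request is pending and render whichever variant comes back, for example waitIndicator(4000, "Analyzing your document...").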

When the Request Fails

When something goes wrong, your UI must answer three questions clearly:

What happened. In plain language. Not "Error 503." Not "An unexpected error occurred." Something a human being would say. "We couldn't save your changes." "This file couldn't be uploaded." Name the actual thing that failed.

Whose fault it is. This matters enormously to the user. If they made a mistake, tell them what to fix. If the system failed, tell them explicitly that this is not their fault. Those two situations feel completely different and should be communicated completely differently.

What they can do next. A visible retry button if retrying is safe. A way to save their work locally if they have unsaved input. A path to contact support if the problem persists. Never end on the failure itself. Always give a next step.

✗  "Error 503: Service Unavailable.
    An unexpected error occurred. Please try again."


✓  "We couldn't save your changes right now.

    Our servers are having a moment. This is not your fault.
    Your work is safe locally and you can retry without losing anything.

    [ Try Again ]   [ Save Draft ]   [ Contact Support ]"

The first gives the user nothing. The second gives them clarity, reassurance, and three paths forward. That is what an error state is for.
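The three questions can be built into the shape of the error state itself, so no failure reaches the screen without an answer to each. The sketch below is illustrative only: describeFailure and its types are hypothetical, and it assumes the backend uses HTTP 400 and 422 for input the user can correct, which may not match your API.

```typescript
// Hypothetical shape for an error state: what happened, whose fault, what next.
type Fault = "user" | "system";

interface FailureMessage {
  what: string;
  fault: Fault;
  detail: string;
  actions: string[];
}

interface FailureContext {
  what: string;            // plain-language description of what failed
  retrySafe: boolean;      // true only if the operation is idempotent
  hasUnsavedInput: boolean;
}

// status is the HTTP status code, or null when no response arrived at all.
function describeFailure(status: number | null, context: FailureContext): FailureMessage {
  // Assumption: 400 and 422 mean the user can fix the input; everything else is on us.
  const fault: Fault = status === 400 || status === 422 ? "user" : "system";

  if (fault === "user") {
    return {
      what: context.what,
      fault,
      detail: "Please check the highlighted fields and try again.",
      actions: ["Review Input"],
    };
  }

  // Never end on the failure itself: offer every next step that is honest.
  const actions: string[] = [];
  if (context.retrySafe) actions.push("Try Again");
  if (context.hasUnsavedInput) actions.push("Save Draft");
  actions.push("Contact Support");

  return {
    what: context.what,
    fault,
    detail: "Something went wrong on our side. This is not your fault.",
    actions,
  };
}
```

Calling describeFailure(503, { what: "We couldn't save your changes right now.", retrySafe: true, hasUnsavedInput: true }) yields the three actions from the second example above.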

When You Retry Automatically

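If your system retries failed requests automatically, do it with exponential backoff. Double the wait between each attempt, and stop after a fixed number of attempts so a persistent failure reaches the error state instead of retrying forever. Never hammer a struggling service with back-to-back retries.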

Attempt 1  ──▶  fails       t = 0s
               wait 1s
Attempt 2  ──▶  fails       t = 1s
               wait 2s
Attempt 3  ──▶  fails       t = 3s
               wait 4s
Attempt 4  ──▶  succeeds    t = 7s
               User sees the result without ever having to click retry.

While retrying, show the user something honest. Not a frozen spinner with no explanation. Something like "Having trouble connecting, trying again..." so they know the system is still working for them and that they have not been abandoned.

One hard requirement underneath all of this: retries must be safe. If clicking retry could create a duplicate order, send a duplicate email, or charge a card twice, the retry is a trap. Your backend operations need to be idempotent before your frontend can offer honest retry behavior.
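A retry loop with exponential backoff can be fairly small. The following is a sketch, assuming the wrapped operation is idempotent; retryWithBackoff, RetryOptions, and the onRetry hook are hypothetical names, and the sleep function is injectable only so the schedule can be tested without real waiting.

```typescript
// Hypothetical options; none of these names come from a specific library.
interface RetryOptions {
  maxAttempts: number;   // total attempts, including the first one
  baseDelayMs: number;   // wait before the second attempt
  onRetry?: (nextAttempt: number, delayMs: number) => void; // hook for the "trying again..." message
  sleep?: (ms: number) => Promise<void>;                    // injectable for tests
}

const defaultSleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumes `operation` is safe to repeat (idempotent on the backend).
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const { maxAttempts, baseDelayMs, onRetry, sleep = defaultSleep } = options;
  let lastError: unknown;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;

      // Double the wait each time: base, 2x base, 4x base, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      onRetry?.(attempt + 1, delayMs);
      await sleep(delayMs);
    }
  }

  // Out of attempts: surface the last error so the UI can show its error state.
  throw lastError;
}
```

With maxAttempts set to 4 and baseDelayMs set to 1000, the waits are 1s, 2s, and 4s, matching the timeline above. The onRetry hook is where the UI would show "Having trouble connecting, trying again...".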

When Only Part of the Page Fails

In any system with multiple services, some parts of a page will fail while others work fine. The wrong response is to fail the entire page. The other wrong response is to silently show a blank section.

Each section of your UI should handle its own failure independently. If recommendations fail to load, that section shows a clear message and a retry option. The rest of the page keeps working. If the notification count cannot be fetched, that widget shows a fallback. Nothing else is affected.

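In React, error boundaries give you this isolation for errors thrown while a section renders. They do not catch errors from event handlers or asynchronous code, so a failed fetch still has to be surfaced through that section's own state. But the decision is not technical first. It is a design decision: every section of your UI is responsible for its own failure state. Build it that way from the start.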
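At the data layer, the same idea can be sketched by giving each section its own result type, so one failed request never rejects the whole page load. The names here (SectionResult, loadSection, loadDashboard, and the two loaders) are hypothetical and stand in for whatever your page actually fetches.

```typescript
// Each section resolves to its own outcome instead of throwing.
type SectionResult<T> =
  | { status: "ready"; data: T }
  | { status: "failed"; message: string };

async function loadSection<T>(
  load: () => Promise<T>,
  failureMessage: string,
): Promise<SectionResult<T>> {
  try {
    return { status: "ready", data: await load() };
  } catch {
    // The failure stays inside this section; the caller receives a value, not an exception.
    return { status: "failed", message: failureMessage };
  }
}

// Hypothetical page with two independent sections.
async function loadDashboard(loaders: {
  recommendations: () => Promise<string[]>;
  notificationCount: () => Promise<number>;
}) {
  // Promise.all cannot reject here, because loadSection never throws.
  const [recommendations, notificationCount] = await Promise.all([
    loadSection(loaders.recommendations, "We couldn't load recommendations."),
    loadSection(loaders.notificationCount, "Notifications are unavailable right now."),
  ]);
  return { recommendations, notificationCount };
}
```

Each section then renders from its own status: a "failed" section shows its message and a retry option while the "ready" sections render normally.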

When the User Is Offline

The moment the user loses connectivity, show a banner. Do not wait for a request to fail. Do not let them discover they are offline when they hit submit on a form they just spent ten minutes filling out.

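The browser can tell you when connectivity is lost: navigator.onLine reports the current state, and the window fires offline and online events when it changes. Use them, but treat them as a hint, because an online reading only means the device has a network connection, not that your backend is reachable. Show something calm and clear: "You're offline. Changes will be saved when you reconnect." Then make that true. Queue their actions locally and replay them when the connection returns.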

When they come back online, tell them that too. "You're back online. Syncing your changes..." Close the loop. The user should never have to wonder what happened to what they did while offline.
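The queue-and-replay behavior can be sketched without any browser APIs by injecting the connectivity check and the send function. Everything named here (createOfflineQueue, QueuedAction, send, isOnline) is hypothetical, the queue lives only in memory, and replay is only safe if the backend treats each action idempotently.

```typescript
// Hypothetical in-memory queue; a real one would likely persist to local storage.
type QueuedAction = { id: string; payload: unknown };

function createOfflineQueue(
  send: (action: QueuedAction) => Promise<void>,
  isOnline: () => boolean,
) {
  const pending: QueuedAction[] = [];

  return {
    // Send now if online, otherwise hold the action for later.
    async submit(action: QueuedAction): Promise<"sent" | "queued"> {
      if (!isOnline()) {
        pending.push(action);
        return "queued";
      }
      await send(action);
      return "sent";
    },

    // Replay queued actions in order. If a send throws, that action and
    // everything after it stay queued for the next flush.
    async flush(): Promise<number> {
      let sent = 0;
      while (pending.length > 0) {
        const next = pending[0];
        if (next === undefined) break;
        await send(next);
        pending.shift();
        sent += 1;
      }
      return sent;
    },

    size(): number {
      return pending.length;
    },
  };
}

// In a browser, one possible wiring (not executed here):
//   const queue = createOfflineQueue(sendToServer, () => navigator.onLine);
//   window.addEventListener("online", () => { void queue.flush(); });
```

The return value of submit tells the UI which message to show, and the count returned by flush can drive the "Syncing your changes..." confirmation.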

When There Is Nothing to Show

Empty states are not edge cases. They happen on first use, after filters return no results, when a user has not created anything yet. They need to be designed with the same care as any other state.

An empty state should never be a blank section with nothing in it. It should tell the user what belongs here, why it is empty right now, and what they can do to change that. Every empty state is an opportunity to guide the next action. Use it.
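One way to keep empty states from being forgotten is to make them data that every list view must supply. The sketch below is illustrative: the EmptyReason values, the EmptyState fields, and the copy are assumptions, not a fixed taxonomy.

```typescript
// Hypothetical model: every empty state answers what belongs here, why it is empty, and what to do.
type EmptyReason = "nothing-created" | "no-matches";

interface EmptyState {
  belongsHere: string;
  whyEmpty: string;
  nextAction: string;
}

function emptyState(reason: EmptyReason, itemName: string): EmptyState {
  switch (reason) {
    case "nothing-created":
      return {
        belongsHere: `Your ${itemName} will appear here.`,
        whyEmpty: `You have not created any ${itemName} yet.`,
        nextAction: "Create your first one",
      };
    case "no-matches":
      return {
        belongsHere: `Matching ${itemName} will appear here.`,
        whyEmpty: `No ${itemName} match the current filters.`,
        nextAction: "Clear filters",
      };
  }
}
```

Because the switch covers every EmptyReason, adding a new reason to the union forces a decision about its copy at compile time.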

The Question to Ask Before Every Ship

Before any feature goes to production, go through every external dependency it touches and ask:

"What does the user see, and what can they do, when this fails completely?"

If you cannot answer that clearly, the feature is not ready. It may be working. But working on a good day and ready for production are not the same thing.

A feature without designed failure states is not a finished feature. It is a finished feature for the best possible conditions. Production is not the best possible conditions. Design for where your software actually lives.

In Summary

Your UI should never go silent. For every situation where something can go wrong or take time, there is a specific behavior that keeps the user informed, in control, and able to act:

Action taken: respond immediately, before the outcome is known
Request is slow: show progress with words, not just a spinner
Request fails: say what, say whose fault, say what to do next
Retrying: use exponential backoff, stay visible, ensure idempotency
Partial failure: fail at the section level, never the page level
Offline: catch it early, queue actions, confirm the sync
Empty: guide the next action, never leave a void

Each of these is a design decision. If you do not make it deliberately, you have made it by default. And the default is silence. Silence is the one behavior your UI can never afford.

Design every failure case. Everything else is just a demo.
