Many architecture and design decisions are based on unstated assumptions about how a system is going to be used. Stating those assumptions explicitly at the start of a project is key to avoiding painting yourself into a corner at the end. Here are the three questions you need to start with.
Starting a project invariably includes a certain amount of fumbling around. It’s easy to have a good idea, but the step from good idea to functioning project is a long one. With a room full of programmers, the initial discussions almost always devolve into arguments over which technologies to use, often before even the most basic requirements are well understood. So I let the argument go on for a while – over questions like powerful, feature-rich Apache versus lightweight, high-capacity nginx, a debate nearly as amusing as Emacs versus vi – before I asked a question.
“So, how many pages do you expect to serve?”
The argument stopped, and both sides looked at me, puzzled. “How do we know? We haven’t built it yet.”
“Then how do you know which choice is better? The nginx server is great for high capacity, but Apache is easier to configure and manage. How do you know you need the extra capacity?”
These three questions – How many users? How many transactions? How much storage? – are at the heart of system or software architecture design.
If you don’t believe it, consider the earliest web applications, which were often built with a single big program in C or Perl invoked as a CGI. Those original applications meant you needed a fork/exec or the equivalent every time a URL with dynamic content was serviced. Display a customer’s account page? Fork a process. Change a password? Fork a process.
The web application itself could be perfectly correct, in the sense that it always provided the expected result – eventually. But performance for a lot of these applications quickly became unacceptable. Some companies tried simply buying bigger servers, but soon people started making architectural changes, first with HTTP keepalive and FastCGI, then by using application servers and load-balanced multiple-server systems. None of those design decisions changed the functional requirements (you still needed to present account pages and password forms), but the web architecture changed to meet the “How many? How much? How fast?” requirements.
Of course, my colleagues were correct that we wouldn’t know the answers to these questions until the system was complete. That doesn’t excuse us from thinking about them, and making our architectural decisions based on them.
In fact, we can’t avoid it. When my colleagues were arguing over the Apache vs. nginx choice, they were in fact arguing based on their own assumptions. The question is only whether we make our reasons for these decisions clear and whether we think them out well.
If you look in a conventional, non-technical dictionary, you find “workload” defined as something like “the amount and type of work done by someone or something.” Technically, we refine that a bit for software or systems engineering and say a workload is the sequence of tasks performed by the system over time. Describing that workload is called workload characterization.
Of course, if the system were complete, workload characterization would be easy: We just instrument the system so that we record everything it does over its lifespan. Since we haven’t built the system yet, that option isn’t available. Instead, we make assumptions and we characterize the expected workload with a workload model.
This workload model can be more or less complicated. In “Capacity Planning on a Cocktail Napkin,” I described a very simple, back-of-the-envelope workload model: establishing the number of transactions a system needed to break even, and approximating the number of transactions at highest load.
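A back-of-the-envelope model of that kind can be sketched in a few lines of Python. This is only an illustration, not necessarily the method from that article: the function names, the cost and margin figures, the eight busy hours a day, and the peak factor of 3 are all assumptions made up for the example.

```python
import math


def break_even_transactions(monthly_cost, margin_per_transaction):
    """Transactions per month needed for the margin to cover the cost."""
    return math.ceil(monthly_cost / margin_per_transaction)


def peak_rate_per_second(transactions_per_month, busy_hours_per_day=8,
                         peak_factor=3.0, days_per_month=30):
    """Average rate over the busy hours, scaled by an assumed peak factor."""
    busy_seconds = days_per_month * busy_hours_per_day * 3600
    return transactions_per_month / busy_seconds * peak_factor


# Hypothetical figures, chosen only to show the arithmetic.
needed = break_even_transactions(monthly_cost=20_000,
                                 margin_per_transaction=0.50)
peak = peak_rate_per_second(needed)
print(f"break-even: {needed} transactions/month")
print(f"estimated peak: {peak:.3f} transactions/second")
```

With these made-up inputs the system needs 40,000 transactions a month to break even, which works out to well under one transaction per second even at the assumed peak.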
The application that spawned this article is different: It’s a hardware appliance with a web management interface. Its software is likely to see short bursts of a few dozen page loads separated by long idle intervals, but those pages must be served directly from our appliance; you can’t put a configuration dialogue on a third-party service.
You can substitute your own project as you think through this process. At the other end of the spectrum is something like Amazon, which serves thousands of pages a second.
These three cases call for very different architectural solutions, from a few pages served by practically any lightweight HTTP service to giant, geographically distributed data centers.
Choose wrong, and you are in trouble. Go too small, and service drops off and customers go away. Go too big, and you spend more money than you had to, and profits go away. Or worse, you never get the okay to build the system at all.
“Look,” I said, back at the whiteboard, “we can do this. We’ve got a basic set of user stories now.” I started writing them on the whiteboard: “User logs in, user sets initial configuration, user checks status, user gets statistics, user displays performance charts, user logs out.”
General nodding ensued around the conference table.
“All we need for our workload model to get started is some idea of how often these happen. And we know a couple of things: Every time we have a user session, there is one log in and one log out. We only do the initial configuration once in a blue moon; we can more or less ignore that, it’s low probability. Same thing with configuration changes; we expect this thing to not need a lot of management. The main things we expect to do often are to display the status, download statistics, or update the charts. Those happen how often?”
“The spec says no faster than once a second, for the updates, and we download statistics once a day usually.”
“So there we have the basics. Every session, we’ve got one log in, one log out, and almost always we have one-second updates. We’ve got a basic workload model.”
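The model on the whiteboard can be written down as a small Python sketch. The per-session counts (one log in, one log out, updates no faster than once a second, one statistics download a day) come from the discussion above; the number of sessions per day, the session length, and the number of concurrent sessions are assumptions picked only for illustration.

```python
def daily_requests(sessions_per_day, session_seconds,
                   update_interval_s=1.0, stats_downloads_per_day=1):
    """Expected requests per day, broken down by user story."""
    per_session = {
        "log in": 1,
        "log out": 1,
        # The spec allows updates no faster than once a second.
        "status or chart update": session_seconds / update_interval_s,
    }
    totals = {story: count * sessions_per_day
              for story, count in per_session.items()}
    # Initial configuration and configuration changes are rare enough
    # to leave out; statistics are downloaded about once a day.
    totals["statistics download"] = stats_downloads_per_day
    return totals


def peak_requests_per_second(concurrent_sessions, update_interval_s=1.0):
    """Worst case: every open session polls at the fastest allowed rate."""
    return concurrent_sessions / update_interval_s


# Assumed usage: five ten-minute sessions a day, at most two at once.
model = daily_requests(sessions_per_day=5, session_seconds=600)
for story, count in model.items():
    print(f"{story}: {count:g} per day")
print("peak:", peak_requests_per_second(concurrent_sessions=2), "requests/second")
```

Under those assumptions the appliance handles roughly 3,000 requests a day, almost all of them one-second updates, and peaks at about two requests a second.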
“Now, one more thing: how much work do we do for each of these? We need one more column….”
“And now we have it. We don’t expect a lot of independent sessions, and most of the time a session is just serving one of a few pages. We’re worrying about nginx or Apache, when what we really need is probably the famous Python One Line Web Server.”
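The one-liner presumably refers to Python's built-in web server, which in Python 3 is started with `python -m http.server` (in Python 2, `python -m SimpleHTTPServer`). For a sense of how little machinery a workload this small needs, here is a minimal sketch built on the same standard-library module. The `StatusHandler` class, the `make_server` helper, the port, and the response text are all hypothetical, and a real management interface would still need authentication and TLS.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Answers every GET with a tiny plain-text status page."""

    def do_GET(self):
        body = b"status: ok\n"  # placeholder for real appliance status
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real appliance would log requests


def make_server(port=8000):
    """Build a single-threaded server bound to the loopback interface."""
    return HTTPServer(("127.0.0.1", port), StatusHandler)
```

Calling `make_server().serve_forever()` starts it. The server handles one request at a time, which is acceptable only because the workload model predicts a handful of requests per second at most.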
The next day, one of the guys came into my office.
“After all that argument about the right web server, we just got a new requirement: Instead of using a web server, we need to use a message queue so another application can do the management,” he laughed. “Only now, we have to pick a messaging product. We’re having another meeting this afternoon.”
“A messaging product? How many messages do you need to transmit?”