The first step in Performance Load testing of a Web Application is to Know Your Load.
If a person is asked to count the number of people wearing red caps, the person’s eye is tuned to focus only on things that are red. He makes no note of blue caps or violet caps; because he is told to look for red caps, the focus is on red caps. Likewise, a performance testing harness provides results only for the kind of load it is set up to test. If the test harness is designed for lightly loaded conditions, the response times obtained during testing are applicable ONLY for lightly loaded conditions. It would be a disaster to assume that the same response times would be obtained under heavy loads. Thus it is crucial to understand the load conditions for which the testing needs to be done.
To draw an analogy from Electrical Engineering, the performance characteristics of motors are usually specified for two conditions: no load and full load. The core losses, hysteresis losses and various other parameters are measured under these two conditions. This is supposed to provide the two ends of the spectrum, and all other load conditions are likely to fall in between (there are a few exceptions, such as surges). Likewise, performance characteristics under various load conditions need to be obtained to understand how the web application would perform. In web applications, load patterns play a key role in designing an efficient load testing harness.
Let us look at some load patterns. These are by no means comprehensive or authoritative, but should provide some idea regarding knowing your load.
1. Uniform Load : This is a nice load condition to have. There are no spikes in the load. The load increases and decreases gradually, with no major jerks. Such a load distribution is possible only in controlled environments, and more often than not the environment is not controlled. However, in some cases, the load might settle down to something close to a uniform load.
2. Predictable Spikes : The expected load in terms of number of users is known and the spike times are predictable. Such loads are relatively easy to deal with. In some web applications that are login driven, the number of potential users who could log in can be determined (at least statistically). The test harness can then be designed to test such conditions.
Predictable Spikes are usually followed by one of two kinds of patterns.
In the first, there is a rush of logins at a given time and, once the onrush is over, the users stay connected and use the application at their own comfortable pace. In such a case, the predictable spike is followed by a sustained load which is more or less uniform.
In the other, users log in at a given time, look for some information and log out. Maybe exam results are put up on a website at 10 am. There is a spike at around 10 am, and by 11 am the load is drastically reduced. Here, the predictable spike is not followed by a sustained load.
As mentioned above, Predictable Spike means that the load is predictable in terms of the number of users and the time when such a spike would occur. But the time of occurrence is not significant for the performance testing. The harness only needs to be set up for the predictable spike load and the load condition that follows the spike.
3. Unpredictable Loads : This is the most difficult case to deal with. The number of users is unknown, the size of the spike is unknown, and the time of the spike is also unknown. With so many unknowns, hardly anything can be done directly. One approach to such situations is to measure performance characteristics for different load conditions, i.e., break the problem down into a series of predictable spikes and build a kind of lookup table that provides reference values for performance.
Loads, in reality, are not truly unpredictable. Statistical methods do exist (like the ones used to estimate the population of an area). Historical data is a valuable help here. On 9/11, many news web sites faced outages due to severe load. But now, many of these companies are well braced to handle even such unexpected mega surges in load. I am no expert at handling such kinds of load, so I had better not say much about them.
Armed with a rough idea about what kinds of loads a web application is subjected to, it is time to look at some terminology that enables us to communicate more precisely about performance load testing.
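The load patterns described above can be sketched as simple functions that return the number of active users at a given minute of the test. This is only an illustration: the function names, user counts, spike length and decay rate are all made-up assumptions, not values from any real application or tool.

```python
# Hypothetical load profiles: each returns the number of active users
# at minute t_min of a test. All numbers are illustrative assumptions.

def uniform_load(t_min, users=100):
    """Uniform load: roughly the same number of users at any minute."""
    return users

def spike_then_sustained(t_min, peak=500, sustained=300, spike_len=10):
    """Login rush for the first spike_len minutes, then a steady plateau."""
    return peak if t_min < spike_len else sustained

def spike_then_dropoff(t_min, peak=500, spike_len=10, half_life=15):
    """Rush at the start (e.g. exam results), then the load halves
    every half_life minutes once the spike is over."""
    if t_min < spike_len:
        return peak
    return int(peak * 0.5 ** ((t_min - spike_len) / half_life))

if __name__ == "__main__":
    for t in range(0, 61, 10):
        print(t, uniform_load(t), spike_then_sustained(t), spike_then_dropoff(t))
```

A test harness could step through such a profile minute by minute to decide how many virtual users to run; for an unpredictable load, several profiles with different peaks could be run to fill in the lookup table mentioned above.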
Concurrency – In layman’s terms, this means the number of users using the web application simultaneously. But this definition is highly dangerous. A better (still layman’s) definition would be the number of simultaneous requests handled by the server. If 50 users are logged into a web application but no one is clicking on anything, the concurrency at that instant is ZERO and not fifty: since all users are idle, the web server receives no requests. On the other hand, let us say a web application’s login page has 10 images and 5 users request the login page at the very same moment. The concurrency experienced by the web server need not be 5 but MORE than that. This is because each request for the page in turn spawns 10 more (one for each image), and so the server needs to serve (1+10)*5 = 55 requests. Even so, the number of concurrent requests would still be less than 55, because the image requests are issued only after the page arrives and not all of them are in flight at the same instant. It may be easier to define concurrency as the number of requests per second, though not everyone agrees with this definition.
Throughput – It is the number of requests per minute. Variations of this definition also exist. In general, it means the number of transactions per unit time. In web load testing parlance, transactions = requests.
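To make the login page example concrete, here is a minimal sketch that computes the peak number of requests in flight from a list of (start, end) times. The timings are invented for illustration, and so is the assumption that each browser fetches its 10 images in two batches of 5; real browsers and servers behave differently.

```python
def peak_concurrency(intervals):
    """Return the largest number of requests in flight at the same moment.

    intervals is a list of (start, end) times in seconds.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # request arrives
        events.append((end, -1))    # request completes
    # Tuples sort by time first; at equal times -1 sorts before +1, so a
    # request ending at t is not counted as overlapping one starting at t.
    events.sort()
    in_flight = peak = 0
    for _, delta in events:
        in_flight += delta
        peak = max(peak, in_flight)
    return peak

# 5 users request the login page at the same moment (hypothetical timings).
page = [(0.0, 0.2)] * 5
# Each page triggers 10 image requests, assumed to go out in two batches of 5.
images = [(0.2, 0.3)] * 25 + [(0.3, 0.4)] * 25
requests = page + images

if __name__ == "__main__":
    print(len(requests), "requests in total")
    print(peak_concurrency(requests), "in flight at the peak")
    print(peak_concurrency([]), "in flight when every user is idle")
```

Under these assumptions the server serves 55 requests in total, yet never has more than 25 in flight at once, and 50 idle users produce a concurrency of zero.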
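As a small worked example of the definition, throughput is just a count divided by a duration. The function name and the numbers below are illustrative assumptions.

```python
def throughput(n_requests, duration_seconds, per_seconds=60):
    """Requests completed per unit time (per minute by default)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return n_requests / duration_seconds * per_seconds

if __name__ == "__main__":
    # Hypothetical run: 1200 requests served during a 2 minute test.
    print(throughput(1200, 120))                  # requests per minute
    print(throughput(1200, 120, per_seconds=1))   # requests per second
```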
Response Time – This is the time taken to serve a request. This parameter is critical to ascertaining the performance of a web application. A web application’s response time is made up of three parts:
Response time = (a) time taken for the request to reach the server from the client + (b) server side processing of the request + (c) time taken for the response to reach the client from the server
Out of these, (a) and (c) are not fully under our control. (b) can be improved by good coding practices.
Usually, Load testing tools provide Max and Min response times for a given run.
Median Response time – Half of the requests are served within this time, and the other half take longer.
90% Line – 90% of the requests are served within this time. If the Min Response time is 3 secs and the Max Response time is 25 secs, that still gives us no clue as to how the response times are distributed within this range. The Median and 90% Line give us a fair idea. Let us assume that a Max Response time of 25 secs is not acceptable per se. If the 90% Line is 8 secs for web application A and 15 secs for web application B, then application A performs better under load than application B.
Error % – If all the requests receive their corresponding responses correctly, the error is zero. If a few requests fail, the percentage is calculated accordingly. Usually load testing tools report the failure code. Looking at the requests that fail helps us know whether they are critical failures or not. An example of a non critical failure would be a small % of requests for an image, such as a logo, failing to load. It all depends on the application.
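The sketch below shows how the Median and 90% Line could be computed from a list of response times. The sample values are made up (chosen so that Min is 3 secs and Max is 25 secs, as in the example above), and the nearest-rank percentile method used here is only one of several conventions; a given load testing tool may compute its 90% Line slightly differently.

```python
import math
import statistics

def percentile_line(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical response times in seconds for ten requests.
times = [3, 4, 4, 5, 5, 6, 7, 8, 12, 25]

if __name__ == "__main__":
    print("min   ", min(times))
    print("max   ", max(times))
    print("median", statistics.median(times))
    print("90%   ", percentile_line(times, 90))
```

With these samples the single 25 sec outlier sets the Max, while the Median and the 90% Line show that most requests were served far more quickly.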
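A minimal sketch of the Error % calculation follows. It assumes the responses are identified by HTTP status codes and that any code of 400 or above counts as a failure; that threshold and the sample data are assumptions for illustration, and what counts as a failure really depends on the application.

```python
from collections import Counter

def error_summary(status_codes, failure_from=400):
    """Return (error percentage, count of each failure code).

    Any status code >= failure_from is treated as a failure (an assumption).
    """
    failures = [code for code in status_codes if code >= failure_from]
    if not status_codes:
        return 0.0, Counter()
    pct = 100.0 * len(failures) / len(status_codes)
    return pct, Counter(failures)

if __name__ == "__main__":
    # Hypothetical run: 95 successes, 3 missing images, 2 server errors.
    codes = [200] * 95 + [404] * 3 + [500] * 2
    pct, by_code = error_summary(codes)
    print(f"Error %: {pct}")
    for code, count in sorted(by_code.items()):
        print(code, count)
```

Breaking the failures down by code is what lets us tell a few missing logo images apart from genuine server errors.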
With this understanding of load patterns and some jargon, we are well on our way to more meaningful performance load testing. In the next part, I will write about how to choose the appropriate tool and how to set up the test harness.