Saturday, January 27, 2007

Default Configuration

By default, most Microsoft Windows installations have a mysterious little parameter called maxconnections set to 2.

This little bugger has already caused me pain twice. For applications consuming web services, this parameter is crucial. Of course, there is much more complexity to IIS server tuning. A good article on this is here. Specifically, read the section on threading. This is just a tickler.

MaxConnections controls the number of connections that are allowed to be open simultaneously. So, if you have an app consuming web services on more than 2 parallel threads, you have a choke point: every thread beyond the second waits for a free connection.
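To make the choke point concrete, here is a minimal sketch in Python. It does not touch any Windows setting; it only simulates a connection cap with a semaphore. The names `run_calls`, `max_connections` and `call_seconds` are made up for illustration, and the sleep is a stand-in for a real web service call.

```python
import threading
import time


def run_calls(num_calls, max_connections, call_seconds=0.02):
    """Run num_calls fake web service calls, one thread each, through a
    pool that allows at most max_connections calls in flight at once.

    Returns (elapsed_seconds, peak_calls_in_flight).
    """
    pool = threading.BoundedSemaphore(max_connections)  # the simulated cap
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0}

    def call():
        with pool:  # threads queue here once the cap is reached
            with lock:
                state["in_flight"] += 1
                state["peak"] = max(state["peak"], state["in_flight"])
            time.sleep(call_seconds)  # stand-in for the downstream call
            with lock:
                state["in_flight"] -= 1

    threads = [threading.Thread(target=call) for _ in range(num_calls)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, state["peak"]


if __name__ == "__main__":
    for cap in (2, 40):
        elapsed, peak = run_calls(num_calls=20, max_connections=cap)
        print(f"cap={cap}: elapsed={elapsed:.2f}s, peak in flight={peak}")
```

With a cap of 2, the 20 calls have to go through in roughly ten rounds, so the total time is about ten times that of a single call. With a cap of 40 they all run together. The CPU sits mostly idle in both cases, which is why this kind of bottleneck does not look like a CPU or I/O problem.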

In my world, I was seeing high variances in transaction performance at a particular server. This server was in turn consuming web services from a downstream system. The box was not CPU or I/O bound, but it had the typical CPU profile of a badly scaled application (flattish, capped at 40%, rather than highly spiky between 0 and 100%). Tweaking this from the default of 2 to 40 changed the transaction profile from an 80th percentile of 40 secs, a 90th percentile of 150 secs and a 99th percentile of 300 secs (the timeout) to a 99th percentile of 5 secs.
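For anyone who wants to build the same kind of percentile profile from raw transaction timings, a small sketch follows. It uses the nearest-rank definition of a percentile, which is one of several in common use. The function names `percentile` and `profile` are hypothetical, and the sample timings are synthetic, not the measurements from the server above.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def profile(samples, pcts=(80, 90, 99)):
    """Return a {percentile: value} summary of the timings."""
    return {p: percentile(samples, p) for p in pcts}


if __name__ == "__main__":
    # Synthetic response times in seconds, for illustration only.
    timings = [1, 2, 2, 3, 3, 4, 5, 40, 150, 300]
    print(profile(timings))
```

Looking at the 80th, 90th and 99th percentiles together, rather than at the average, is what exposes the long tail of transactions stuck waiting for a connection.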

Wow !!! Imagine the surprise the next day for the users ..

Too many people make the mistake of not tuning the Windows box before production. I wonder why Windows Server is configured by default (from a scalability perspective in this regard) the same as my laptop running Windows XP?

In fact, this fundamentally also applies to an end user PC. Increasing this opens up the pipeline for a web browser. The largest impact is on loading web pages that have plenty of little sub-pages etc., each of which can be loaded concurrently. Web admins hate this because it causes 'bursty' traffic conditions on the back end web servers.

This whole issue relates to the need to pay attention to the 'default' configuration of the infrastructure you deploy. By default, things are not 'tuned', and this applies to nearly all products. In fact, some are 'mis-tuned', making it mandatory to 'tweak'.

Windows OS, IIS, SQL Server, the UNIX kernel, BEA and Oracle are some of the standard culprits, with varying degrees of guilt. For some, tuning is more an art than a science. Things are so application dependent that template configurations are impossible.

Milan Gupta

Pain points

So what keeps me busy? While problems that start out as IT blame are usually a combination of Business Operations and IT factors (people, process & technology), it is incumbent upon IT to take the lead in resolving these end-to-end. On the systems/software side, common themes emerge:

1> Performance
This is probably the greatest area of weakness in the IT discipline, and it appears to be in direct conflict with our need to meet timelines. The art of performance testing an application requires the greatest skill level .. a thorough understanding of not only the software design but also the business use of that software. A comprehensive black box test that simulates the real world is usually an impossible challenge for complex, high transaction systems. So what constitutes a barely sufficient approach? Does it boil down to having the right person do the job vs. a set formula?

2> Software quality / engineering issues
Bugs Bugs Bugs !! When will we ever figure out the discipline of paying attention to detail and fully understanding the subtle behavioral side effects of the software we write? Cost, Quality & Speed seem to conflict; however, that isn't really so. Agile is a step towards really representing what developers feel; however, as usual, it is more a management buzzword than reality. More books on what it means, written by people who have never written a line of code. This symptom, however, represents something more fundamental. It is about engineering discipline .. instilling a sense of pride within teams about what we produce, all the way down to the individual contributing developer.

3> Solutions where complexity has overtaken team skills
This is a good one .. with lower cost sourcing strategies, we are not always sticking with the highest calibre talent. We are seduced too often by the promise of technology eliminating the critical dependence on the developer. As an example, buying BEA doesn't mean you are relieved of the duty of understanding middleware concepts and, more importantly, the proper use of BEA.

Milan Gupta

Recurring patterns

I have an interesting job. Being in the center of the top crisis situations facing a telecom company (that translate to IT blame), the ability to execute without barriers, and the ability to traverse up and down the food chain, from C level to end worker bee, make my job fun. It is a good day when I get to go home feeling like I (we) fixed something that had a positive impact on our customers .. and I get a lot of that thanks to our infinite ability to create problems. So, feeling self-destructive, I want out of this job, i.e. my goal is shifting to prevention rather than reaction.