Some weeks ago one of my
X-mas gifts to myself arrived. (One of those I'm always allowed to buy, as none of my relatives would ever imagine buying me such books for Christmas):
Guerrilla Capacity Planning by
Neil J. Gunther. In fact, I was only interested in his
Universal Scalability Law, which served me well in a performance review some weeks ago.
I really like this book and I'm sure it will stay with me for some time to come; at least on my daily commute by public transport, but also in many other situations.
I read it as another iteration of
KISS, applied to Capacity Planning. The main goal is to achieve fast, just-good-enough predictions with minimal resources and time. The book goes one step further and advises being well prepared, with lean tools and flexible methods, to reach that goal when needed. In my current situation, I also read it as a hint to do Capacity Planning in every project, even if it's not accounted for in the project plan. But keep it small enough to hide it within the jitter every project contains. The following chapters try to provide some simple weapons, so one is prepared for all tactical situations that can occur.
Chapter 3 is about significant digits, rounding rules and errors.
These 13 pages were worth reading for me, because I never had a fully qualified mathematical or statistical education. It might be just enough to estimate the errors I carry through all my work and to present them (and their implications for the results) in any discussion, presentation and so on.
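To make the idea concrete, here is a minimal Python sketch (my own illustration, not taken from the book) of rounding a value to a fixed number of significant digits, the kind of discipline chapter 3 is about:

```python
import math

def round_sig(x, digits=3):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimal places survive.
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)

print(round_sig(0.0123456))  # 0.0123
print(round_sig(98765.4))    # 98800.0
```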
Chapters 4 to 6 are the real reason why I ordered this book: the
Universal Scalability Model (or Law?). Neil shows in a very consistent way why scalability is not only limited by contention, as Amdahl's law implies, but also by coherency, which leads to retrograde scalability beyond p*.
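For reference, in my own words rather than as a quote from the book: the USL expresses relative capacity C(p) at load p in terms of a contention parameter σ and a coherency parameter κ. Amdahl's law is the special case κ = 0; for κ > 0 the curve peaks and then turns retrograde:

```latex
% Universal Scalability Law: relative capacity at load p
C(p) = \frac{p}{1 + \sigma\,(p-1) + \kappa\,p\,(p-1)}

% Amdahl's law is the special case \kappa = 0; for \kappa > 0,
% capacity becomes retrograde beyond the peak at
p^{*} = \sqrt{\frac{1-\sigma}{\kappa}}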
Not only is the Model itself described, but - important for all Guerrilleros - an easy method to derive the parameters σ and κ from some measured data with Excel is provided. Not mentioned directly in the book, but easy to find, is the
spreadsheet which implements exactly the method Neil describes.
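The core of that method is a simple transformation that turns the USL fit into a quadratic regression. Here is a minimal Python sketch of that idea; the measurements are made up for illustration, and this is my reading of the approach, not code from the book or the spreadsheet:

```python
import numpy as np

# Measured load points p and throughput X(p); the numbers are invented.
p = np.array([1, 4, 8, 16, 32, 64])
X = np.array([20.0, 78.0, 130.0, 170.0, 190.0, 180.0])

C = X / X[0]   # relative capacity C(p) = X(p) / X(1)
x = p - 1      # transform: x = p - 1
y = p / C - 1  # transform: y = p/C(p) - 1

# The USL rearranges to y = kappa*x^2 + (sigma + kappa)*x, so a quadratic
# fit through the origin (design matrix [x^2, x]) yields both parameters.
A = np.column_stack([x**2, x])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

kappa = a       # coherency parameter
sigma = b - a   # contention parameter
p_star = np.sqrt((1 - sigma) / kappa)  # load where throughput turns retrograde
print(f"sigma={sigma:.4f} kappa={kappa:.6f} p*={p_star:.1f}")
```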
I would have liked the book to also provide other methods, to circumvent the problem of Excel's numerical precision, but Neil has provided (and discussed the difficulties of) a glimpse of an implementation in
R elsewhere.
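For completeness: a nonlinear fit avoids the quadratic transform (and Excel) entirely. A sketch using scipy's curve_fit on the same made-up data as above; again my own illustration, not Neil's R code:

```python
import numpy as np
from scipy.optimize import curve_fit

def usl(p, sigma, kappa):
    # Relative capacity C(p) under the Universal Scalability Law.
    return p / (1 + sigma * (p - 1) + kappa * p * (p - 1))

# Same illustrative measurements as above, normalized so that C(1) = 1.
p = np.array([1, 4, 8, 16, 32, 64])
C = np.array([1.0, 3.9, 6.5, 8.5, 9.5, 9.0])

# Bounds keep both parameters non-negative, as the model requires.
(sigma, kappa), _ = curve_fit(usl, p, C, p0=[0.1, 0.001], bounds=(0, np.inf))
print(f"sigma={sigma:.4f} kappa={kappa:.6f}")
```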
In chapter 7 the main focus is on virtualization across all scales, from in-CPU (Hyperthreading) up to Grids and P2P at the other end of the scale. Here it's more about queues, schedules and polling cycles. I had the feeling I should have read
Analyzing Computer System Performance with Perl::PDQ beforehand. But I have not. (I will, after the 2nd edition is published.) There is also not too much ammunition for my underground resistance here; it's more a rough description of that area and some major snares.
Currently I cannot say much about chapters 8 to 11, as I'm in the middle of chapter 8 at the moment. They will also be worth writing about - in the future.
There is also some criticism
out there which laments that the USL provides only limited practical use for forecasting until a set of data has been measured and σ and κ have been derived from it. In comparison, Amdahl's
s can be measured much more easily, as the single-threaded phase of a program. This criticism carries its own error:
if s can be measured, this means the setup is well instrumented and known, and the target hardware is chosen (otherwise the measurement would somehow have to be translated to the target hardware), at least for a single process. But with a well instrumented and known setup, all the code parts which will account for coherency can be spotted as well, and therefore not only does Amdahl's
s lead to σ, but the coherency part also leads to κ.
Unfortunately, even in well instrumented software like Oracle, nobody has sorted all the wait events by what they account for.