Donnerstag, 31. Dezember 2009

USL in excel - without the need to read

Neil Gunther created a nice small excel spreadsheet to calculate the key values sigma and kappa for his Universal Scalability Law - primarily for the use in his GCaP class. Beside some other problems with excels numerical precision and Negative Scalability Coefficients in Excel I dislike the ugly 'create a graphic, let it show a trendline and it's coefficients, add these into some cells and continue' step. Fortunately, Scott Roberts has created some google spreadsheets where he implemented a real formula (as google spreadsheets does not support trendlines and their coefficients). These are mentioned in Neils blog entry Scalability in a Spreadsheet - google style.
As this method is also available in excel, I decided to extend Neils excel by some functions to avoud the read and insert part.
sscalc-class_berx1 It is not just easier to apply, it's also more accurate as the coefficients shown in the graph are shown with small rounding.

With the formula shown here:

Don't be confused by the semicolons in the formula, depending on your language settings, excel sometimes use colon or semicolon to separate fields of the formula.
Here the detailed fields I've added:

Field H11:
Field I8:=INDEX(LINEST($G8:$G14;$F8:$F14^{1,2}; FALSE); 1)
Field I9: =INDEX(INDEX(LINEST($G8:$G14;$F8:$F14^{1,2}; FALSE;TRUE); 1);2)
Field I10: =INDEX(INDEX(LINEST($G8:$G14;$F8:$F14^{1,2}; FALSE;TRUE); 1);3)
Field I11: =INDEX(INDEX(LINEST($G8:$G14;$F8:$F14^{1,2}; FALSE;TRUE); 3);1)

I hope i used the LINEST function correct, and this might help others to reduce one step while playing with USL.

Mittwoch, 30. Dezember 2009

MOS can't count

MOS (the flash version) can not even count up to 3:

Can someone show me the hidden element?

Guerrilla Capacity Planning arrived - let's prepare the underground resistance!

Some weeks ago one of my X-mas gifts to myself arrived. (One of those I'm always allowed to buy as none of my relatives would ever imagine to buy me such books for Christmas): Guerrilla Capacity Planning by Neil J. Gunther. In fact, I was only interested in his Universal Scalability Law, which did me a good service in a performance review some weeks ago.
I really like this book and I'm sure it will go with me for some more time; at least every day I go to work by public transport, but also in many other situations.

The idea which gave the book it's name is the Guerrilla Capacity Planning. This is described in the first 2 chapters.
I read it as another iteration of KISS, applied to Capacity Planning. The main goal is to achieve fast and just good enough predictions by the usage of minimal resources and time. It just leads one step further and advises to be well prepared in lean tools and flexible methods to reach the goal when needed. In my current situation, I also see it as a hint to do Capacity Planning in every project, even if it's not calculated within the project plan. But keep it small enough to hide it within the jitter every project contains. The following chapters try to provide some simple weapons to be prepared for all tactical situations which can occur.

Chapter 3 is about significant digits, rounding rules and errors.
These 13 pages where worth reading for me, because I never had a full qualified mathematical or statistical education. It might be just enough to estimate the errors I carry through all my work and provide them (and their meanings to the results) in any discussion, presentation and so on.

Chapters 4 to 6 are the real reason why I ordered this book: the Universal Scalability Model (or Law?). Neil leads in a very consistent way, why scalability is not only limited by contention, as Amdahls law implies, but also by coherency, which leads to a retrograde in scalability beyond p*.
Not only the Model itself is described, but - important for all Guerrilleros - an easy method to gain the parameters for σ and κ out of some measured data with excel is provided. Not mentioned in the book directly but easily to find is the spreadsheet which contains exactly the method Neil provides.
I would like to have also other methods provided within the book to circumvent the problem with excels numerical precision, but Neil provided (and discussed it's difficulties) a glimpse of a implementation in R somewhere else.

In chapter 7 the main focus is on virtualization across all scales, from in-CPU (Hyperthreading) up to Grids and P2P on the other end of the scale. Here it's more about queues, schedules and polling cycles. I had the feeling I should have read Analyzing Computer System Performance with Perl::PDQ before. But I have not. (I will, after the 2nd ed. is published). Also there is not too much ammunition for my underground resistance. More a rough description of that area and some major snares.

Currently I cannot say much about chapters 8 to 11 as I'm in the middle of 8 at the moment. they will also be worth to write about them - in the future.

There is also some criticism outside there which laments, USL does only provide limited practical use for the forecast, until a set of data is measured and σ and κ are derived from these. In comparison, Amdahls s can be measured so much easier as the single threaded phase of a program. This criticism provides its own error inside: If s can be measured, this mean the setup is well instrumented and known and the target hardware is chosen (otherwise, the measurement must be somehow translated to the target hardware) at least for a single process. But, with a well instrumented and know setup, also all code parts which will account to coherency can be spotted and therefore not only Amdahls s leads to σ, but the coherency part leads to κ.
Unfortunately, even in a well instrumented Software like Oracle, nobody sorted all the wait events where they account to.