# Weighted Average Limitations

This originally appeared at “http://www.arlingsoft.com”, but it is no longer available on the Web. Also see: United States Patent 6151565.

The Weighted Average and Its Limitations in Decision Support
What is a weighted average and why is it used in decision support? This is a key question to answer if we are to rely upon the weighted average for assisting us in making a decision, particularly as most decision systems are based on this fundamental value, calculated in one way or another. Perhaps the best starting point is to understand the concept of the average. An average is something we have all calculated at one time in our school lives, and probably many times since. We know almost instinctively, for instance, that the average of the three numbers (3, 4, 5) is 4. It is the middle number, but we can check the result by adding 3+4+5=12 and dividing by the number of values, in this case three. The result is of course 12/3=4. However, taking the average of 3, 4 and 5 actually implies that we have given the same weight to each of the three numbers: we value them equally to reach the average. We have in fact performed the following operation:

(3+4+5)/3 = [(1/3)*(3)] + [(1/3)*(4)] + [(1/3)*(5)] = 4

This is a special case of the weighted average where all the weights are the same. If we suppose that the number 3 is twice as important as each of the other two numbers, we can represent this by changing the weight applied to it, making the ratios of the weights 2:1:1. Hence our “weighted” average becomes:

(1/2)*(3) + (1/4)*(4) + (1/4)*(5) = 3.75

Another series of values can also create the same weighted average:

(1/2)*(4) + (1/4)*(3) + (1/4)*(4) = 3.75

or

(1/2)*(2) + (1/4)*(5) + (1/4)*(6) = 3.75
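The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `weighted_average` and the variable names are chosen here for illustration and do not come from the original article.

```python
def weighted_average(values, weights):
    """Return sum(w * v) / sum(w); the weights need not add up to 1."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for w, v in zip(weights, values)) / total_weight


# Equal weights reproduce the ordinary average of 3, 4 and 5.
equal = weighted_average([3, 4, 5], [1, 1, 1])

# With weights in the ratio 2:1:1, three different sets of values
# all collapse onto the same weighted average.
value_sets = ([3, 4, 5], [4, 3, 4], [2, 5, 6])
ratio_211 = [weighted_average(values, [2, 1, 1]) for values in value_sets]

print(equal)      # 4.0
print(ratio_211)  # [3.75, 3.75, 3.75]
```

The three sets differ considerably in their individual values, yet the single number that comes out cannot tell them apart.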

If the weighted average is used for decision purposes, the preceding examples are clearly problematic: we have no way of distinguishing which is the ‘best’ among the sets of values. Something is wrong. When faced with these circumstances, many decision makers revisit the data and re-examine how the decision was reached. What Arlington has done is, in essence, to perform this revisiting through a numerical measure. However, it goes deeper than that. The “pattern” of thinking of an evaluator is reflected in the weights, which is often referred to as the utility function. In terms of Which & Why, the utility function is not a function but a utility pattern. This fundamental difference in view implies that we do not require a function to link disparate criteria, nor do we need to compare the closeness of one function to another through statistical means (in effect, one is comparing two series of numbers). Methods such as linear regression assume that some respectable degree of normality (a truly random distribution) exists in both scores and weights, and often in their combination. This assumption is questionable at best, because scores and weights are deliberately biased by human evaluators, and comparing them to a random distribution is no more valid than comparing them to any other distribution. Not only that, but the combination of even normally distributed scores and weights does not result in a normally distributed combination. The implications are clear. Traditional statistical analysis does not work, and should not be expected to work, for decision systems.

The weighted average is used in many sciences, from econometrics to biology, medical analysis to particle physics, and of course in Decision Sciences. Let us be clear, however, that the question asked by the decision process is not necessarily the same as the weighted average “answer” in quantitative sciences. In the latter, data is often measured and weights determined from physical processes having quantifiable parameters. In decision processes, these values are at best imprecise, and heuristically arrived at with parameters which are likely to have little or no relationship to each other, since they are not united through any physical process. Quantifiable sciences ask what the end value of a particular measurement is, given a set of quantifiable nuisance and shape parameters. In decision processes we are asking which alternative is closest to the pattern of requirements of the evaluator. These questions are different. Lotfi Zadeh of the University of California at Berkeley, the father of modern fuzzy logic applications to artificial intelligence, declared that no single unique value can represent human thought processes, and normal statistics does not apply. If we assume that there is a particular pattern to human thinking for the purpose of making a decision, then the thinking behind a decision must reflect the pattern of thought of the evaluator. The weighted average alone cannot reflect a pattern of thinking, and any method that does not deal with the question of matching the pattern of thought with the evaluation pattern of an alternative is missing a fundamental point.

The Three Houses Problem

Here is a further simple example. Let us assume that one wishes to purchase a house. The decision rests on a weight distribution of 40% for the house, 30% for the neighborhood, 20% for the property, and 10% for the garage and driveway. Three houses at equal selling prices were assessed on a 0 to 10 point scale for the four factors, and the following results were presented:

```
Factor:          Weight      House A     House B      House C
------------------------------------------------------------
House              40           6           0            8
Neighborhood       30           6          10            6
Property           20           6          10            4
Garage+Driveway    10           6          10            2
============================================================
Weighted Average:               6           6            6
```
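The bottom row of the table can be reproduced directly. The sketch below simply re-enters the table's weights and scores; the dictionary layout and the helper name `weighted_average` are illustrative choices, not part of the original article.

```python
# Weights (in percent) and 0-10 scores copied from the table above.
weights = {"House": 40, "Neighborhood": 30, "Property": 20, "Garage+Driveway": 10}
scores = {
    "House A": {"House": 6, "Neighborhood": 6, "Property": 6, "Garage+Driveway": 6},
    "House B": {"House": 0, "Neighborhood": 10, "Property": 10, "Garage+Driveway": 10},
    "House C": {"House": 8, "Neighborhood": 6, "Property": 4, "Garage+Driveway": 2},
}


def weighted_average(house_scores, factor_weights):
    """Weighted average of one house's scores over all factors."""
    total_weight = sum(factor_weights.values())
    weighted_sum = sum(factor_weights[f] * house_scores[f] for f in factor_weights)
    return weighted_sum / total_weight


averages = {name: weighted_average(s, weights) for name, s in scores.items()}
print(averages)  # {'House A': 6.0, 'House B': 6.0, 'House C': 6.0}
```

All three houses tie at 6, even though their score profiles are very different.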

The figure on the right illustrates the build-up-pattern (the Which & Why Overall Chart) as to how the values cumulatively add up to the weighted average. The order of the houses has been changed to show the area plots more clearly.

Obviously, if the prices are the same, House B would not be chosen: with a score of 0 for the house itself, it appears the house does not even exist. Yet if we blindly accepted the weighted average, we could wind up with a plot without a house! Of course one can set minimum requirements which would eliminate House B.

However, we are left with trying to choose between Houses A and C. There is no guidance here except from the “revisitation” of the data, and reviewing what our priorities are in choosing a house. However, we have already, in theory, expressed this in the weight distribution. Consequently we may consider comparing the utility pattern of weights and the resulting score pattern that was obtained. We do this by looking at how each weighted average is composed, and comparing the composition pattern to the pattern of weights, factor by factor. A method common in pattern analysis that has been utilized for this measure relates to a linear measure of difference between two patterns, and is referred to in terms of a “loss” or “cost” from a benchmark pattern. Using this method, and a twist added by Arlington’s research, a value called the matching index, related to the pattern loss, is calculated. The following table gives the results for the matching index:

```
Factor:          Weight      House A     House B      House C
------------------------------------------------------------
House              40           6           0            8
Neighborhood       30           6          10            6
Property           20           6          10            4
Garage+Driveway    10           6          10            2
============================================================
Weighted Average:             6.00        6.00         6.00
Matching Index                1.00        0.56         0.85
Adjusted Weighted Average:    6.00        3.33         5.11
```
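The article does not give the formula for the matching index, so it cannot be reproduced here. What can be sketched is the factor-by-factor comparison it builds on: how each weighted average is composed, set against the pattern of weights. In the sketch below, `composition` and `pattern_gap` are hypothetical helpers, and `pattern_gap` (half the sum of absolute differences between the two patterns) is a generic linear difference measure used as a stand-in. It is not Arlington's matching index and does not yield the values in the table.

```python
weights = [40, 30, 20, 10]  # House, Neighborhood, Property, Garage+Driveway
houses = {
    "House A": [6, 6, 6, 6],
    "House B": [0, 10, 10, 10],
    "House C": [8, 6, 4, 2],
}


def composition(scores, weights):
    """Share of the weighted sum contributed by each factor."""
    contributions = [w * s for w, s in zip(weights, scores)]
    total = sum(contributions)
    if total == 0:
        raise ValueError("all contributions are zero")
    return [c / total for c in contributions]


def pattern_gap(scores, weights):
    """Illustrative stand-in, NOT the matching index: half the summed
    absolute difference between the composition pattern and the weight
    pattern (0 means identical patterns, 1 means no overlap)."""
    weight_shares = [w / sum(weights) for w in weights]
    shares = composition(scores, weights)
    return 0.5 * sum(abs(c - w) for c, w in zip(shares, weight_shares))


gaps = {name: pattern_gap(s, weights) for name, s in houses.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.3f}")
```

Under this stand-in measure House A has no gap at all, House C a small one and House B the largest, which is the same ordering the matching index gives in the table, although the numbers themselves differ.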

The combined result is given in the adjusted weighted average, and it is patently obvious House B does not conform to our initial desires, even without minimum conditions. This is further illustrated in the factor-by-factor analysis chart given in the figure on the right. The House A line plot covers the benchmark pattern perfectly.

We could ask at what point does House B become attractive? In other words, at what point are we willing to trade House A for House B in order to live in the obviously better neighborhood? There are four ways to deal with this problem. These are:

1. Price equivalency: what is House B really worth when compared to the recommended house?
2. At what point does House B become interesting: i.e. suppose there is another house in the same area, at what point does it become competitive with the better houses in the not-so-good neighborhood (amongst other factors)?
3. What trade-offs are we ready to make? Would we feel comfortable changing the weight distribution to meet our goal? By how much do we feel comfortable in changing the weights?
4. A combination of the above three.

At every point in this discussion, it is evident that we are making trade-offs. The matching index makes us aware of the incompatibilities of the house evaluations with our own sense of the relative importance of the factors, and how much ‘trading’ needs to be done to get to a particular option. In terms of price equivalency, for instance, the value of House B to us is (3.33/6) of House A. This works out to a 44% reduction in the price if the houses were equally priced. Alternatively, we may find a house in the same area that scores a four or more. In this case, the trade-off of a lesser house against neighborhood and so forth is evident, as the adjusted weighted average of House B will exceed 6. As for the weight distribution, it turns out one would have to reduce the emphasis on the house by roughly 20 percentage points in order for House B to be selected; in other words, its emphasis must be reduced from 40% to 21% in our decision. Are we willing to make that kind of reduction? Is there a mixture of all three of these that we can live with? These are scenarios that need to be considered in making our decision.
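The price-equivalency figure is a simple ratio of adjusted weighted averages, and can be worked through as follows. The helper name `price_equivalency` is made up for this sketch; the two input values are taken from the table of adjusted weighted averages.

```python
def price_equivalency(adjusted_candidate, adjusted_reference):
    """Value of the candidate as a fraction of the reference alternative."""
    if adjusted_reference == 0:
        raise ValueError("reference value must be non-zero")
    return adjusted_candidate / adjusted_reference


# Adjusted weighted averages of House B and House A from the table.
ratio = price_equivalency(3.33, 6.00)
discount = 1 - ratio

print(f"House B is worth {ratio:.3f} of House A")
print(f"implied price reduction: {discount:.1%}")
```

The ratio comes to about 0.555, so House B would need to be priced roughly 44 to 45% below House A to be equivalent on this measure.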

Complex Models and the Weighted Average

In more complex models, the deficiencies in a weighted average may not be so obvious. This could be particularly true where some alternatives are deficient where other alternatives are strong, and the weighted averages are close. The trade-offs become important, but can be completely masked by the wholesale reliance on the weighted average. As mentioned earlier, the matching index is an automated measure of the degree of difference from the evaluator’s utility pattern. Wise decision makers know they have to question where a weighted average came from before finalizing any decision. Yet in many organizations the weighted average is relied upon to indicate the solution in complex situations, to give, in other words, the “best numerical guess.” This may be a poor assumption. Of course, this means we rely upon the utility pattern first set by the evaluator. Again, in complex models one requires a significantly larger feedback mechanism than just slide bars and tables of numbers to get a good ‘feel’ for the utility pattern. Objective assessment and feedback are essential, and without them the process can become unreliable. We know also that it is difficult to pinpoint scores: often there is a spread in values, which leads to uncertainty in the final values. We know also that a decision maker must look from several or many perspectives, in other words, at the various scenarios. In the process of decision making, the weighted average’s significance can be changed in unexpected ways, leading to false results, as each scenario is considered. For not only do the scores change, so can the weights; hence the evaluation of House B above could lose all significance as the weight of the house is reduced with respect to the other parameters. The trading of weights may lead to more exaggeration, and a reduced minimum score for House B.

The following table indicates the change in the weight of the house factor (exchanged with the next most heavily weighted factor, the neighborhood) needed to switch ranks between House A and House B. Using the matching index adjusted weighted average, House A leads, so reductions in the house weight are required for House B to overtake it. With the weighted average alone, we must instead increase the importance of the house in our decision in order for House A to lead House B in ranking.

```
Score House B   Reduction in Weight of       Increase In House Weight
                House Factor using the       to bring House B below A
                adjusted weighted average    using weighted average only
-------------------------------------------------------------------------
0               -19%                         +0%
1               -16%                         +5%
2               -13%                         +10%
3                -8%                         +17.5%
```
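The right-hand column, which uses the plain weighted average only, can be approximated from the data in the Three Houses example. The sketch below computes the exact tie point: how many weight points must move from the neighborhood to the house factor before House A and House B have equal weighted averages. The function `break_even_shift` is a hypothetical helper written for this illustration. The left-hand column depends on the matching index, whose formula is not given in the text, so it is not reproduced.

```python
def break_even_shift(weights, scores_a, scores_b, src, dst):
    """Weight points to move from factor `src` to factor `dst` so that
    alternatives A and B tie on the plain weighted average."""
    # Current lead of B over A, scaled by the total weight.
    gap = sum(w * (b - a) for w, a, b in zip(weights, scores_a, scores_b))
    # Change in that lead for each weight point moved from src to dst.
    slope = (scores_b[dst] - scores_a[dst]) - (scores_b[src] - scores_a[src])
    if slope == 0:
        raise ValueError("shifting weight between these factors changes nothing")
    return gap / -slope


weights = [40, 30, 20, 10]  # House, Neighborhood, Property, Garage+Driveway
house_a = [6, 6, 6, 6]

shifts = []
for house_score in range(4):  # House B's score on the house factor: 0..3
    house_b = [house_score, 10, 10, 10]
    # Move weight from Neighborhood (index 1) to House (index 0).
    shifts.append(break_even_shift(weights, house_a, house_b, src=1, dst=0))

print([round(s, 2) for s in shifts])  # [0.0, 4.44, 10.0, 17.14]
```

These tie points sit at or just below the table's +0%, +5%, +10% and +17.5%. That would be consistent with the table having been produced in coarser weight steps, though this is a guess rather than something the article states.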

What the above scores are telling us is that large changes in weights are required to cause a rank reversal when the matching index is used. If we took the weighted average only, then smaller changes in weight are required to reverse rank between House A and House B. The Matching Index method appears from this simple example to be more stable. It is difficult to extend this to more complex problems, and more study is obviously required. The robustness of a decision, however, is important, and again the addition of the matching index appears to improve this aspect.

Originally written by Dr. Edward Robins, Arlington Software. No copyright infringement intended.