Cowles Foundation Paper 98
THE QUARTERLY JOURNAL OF ECONOMICS
Vol. LXIX, February, 1955
A BEHAVIORAL MODEL OF RATIONAL CHOICE
By HERBERT A. SIMON*
Introduction. I. Some general features of rational choice. II. The essential simplifications. III. Existence and uniqueness of solutions. IV. Further comments on dynamics. V. Conclusion. Appendix.
Traditional economic theory postulates an "economic man," who, in the course of being "economic" is also "rational." This man is assumed to have knowledge of the relevant aspects of his environment which, if not absolutely complete, is at least impressively clear and voluminous. He is assumed also to have a well-organized and stable system of preferences, and a skill in computation that enables him to calculate, for the alternative courses of action that are available to him, which of these will permit him to reach the highest attainable point on his preference scale.
Recent developments in economics, and particularly in the theory of the business firm, have raised great doubts as to whether this schematized model of economic man provides a suitable foundation on which to erect a theory - whether it be a theory of how firms do behave, or of how they "should" rationally behave. It is not the purpose of this paper to discuss these doubts, or to determine whether they are justified. Rather, I shall assume that the concept of "economic man" (and, I might add, of his brother "administrative man") is in need of fairly drastic revision, and shall put forth some suggestions as to the direction the revision might take.
Broadly stated, the task is to replace the global rationality of economic man with a kind of rational behavior that is compatible with the access to information and the computational capacities that are actually possessed by organisms, including man, in the kinds of environments in which such organisms exist. One is tempted to turn to the literature of psychology for the answer. Psychologists have certainly been concerned with rational behavior, particularly in their interest in learning phenomena. But the distance is so great between our present psychological knowledge of the learning and choice processes and the kinds of knowledge needed for economic and administrative theory that a marking stone placed halfway between might help travellers from both directions to keep to their courses.
*The ideas embodied in this paper were initially developed in a series of discussions with Herbert Bohnert, Norman Dalkey, Gerald Thompson, and Robert Wolfson during the summer of 1952. These collaborators deserve a large share of the credit for whatever merit this approach to rational choice may possess. A first draft of this paper was prepared in my capacity as a consultant to the RAND Corporation. It has been developed further (including the Appendix) in work with the Cowles Commission for Research in Economics on "Decision Making Under Uncertainty," under contract with the Office of Naval Research, and has been completed with the aid of a grant from the Ford Foundation.
Lacking the kinds of empirical knowledge of the decisional processes that will be required for a definitive theory, the hard facts of the actual world can, at the present stage, enter the theory only in a relatively unsystematic and unrigorous way. But none of us is completely innocent of acquaintance with the gross characteristics of human choice, or of the broad features of the environment in which this choice takes place. I shall feel free to call on this common experience as a source of the hypotheses needed for the theory about the nature of man and his world.
The problem can be approached initially either by inquiring into the properties of the choosing organism, or by inquiring into the environment of choice. In this paper, I shall take the former approach. I propose, in a sequel, to deal with the characteristics of the environment and the interrelations of environment and organism.
The present paper, then, attempts to include explicitly some of the properties of the choosing organism as elements in defining what is meant by rational behavior in specific situations and in selecting a rational behavior in terms of such a definition. In part, this involves making more explicit what is already implicit in some of the recent work on the problem - that the state of information may as well be regarded as a characteristic of the decision-maker as a characteristic of his environment. In part, it involves some new considerations - in particular taking into account the simplifications the choosing organism may deliberately introduce into its model of the situation in order to bring the model within the range of its computing capacity.
I. SOME GENERAL FEATURES OF RATIONAL CHOICE
The "flavor" of various models of rational choice stems primarily from the specific kinds of assumptions that are introduced as to the "givens" or constraints within which rational adaptation must take place. Among the common constraints - which are not themselves the objects of rational calculation - are (1) the set of alternatives open to choice, (2) the relationships that determine the pay-offs ("satisfactions," "goal attainment") as a function of the alternative that is chosen, and (3) the preference-orderings among pay-offs. The selection of particular constraints and the rejection of others for incorporation in the model of rational behavior involves implicit assumptions as to what variables the rational organism "controls" - and hence can "optimize" as a means to rational adaptation - and what variables it must take as fixed. It also involves assumptions as to the character of the variables that are fixed. For example, by making different assumptions about the amount of information the organism has with respect to the relations between alternatives and pay-offs, optimization might involve selection of a certain maximum, of an expected value, or a minimax.
Another way of characterizing the givens and the behavior variables is to say that the latter refer to the organism itself, the former to its environment. But if we adopt this viewpoint, we must be prepared to accept the possibility that what we call "the environment" may lie, in part, within the skin of the biological organism. That is, some of the constraints that must be taken as givens in an optimization problem may be physiological and psychological limitations of the organism (biologically defined) itself. For example, the maximum speed at which an organism can move establishes a boundary on the set of its available behavior alternatives. Similarly, limits on computational capacity may be important constraints entering into the definition of rational choice under particular circumstances. We shall explore possible ways of formulating the process of rational choice in situations where we wish to take explicit account of the "internal" as well as the "external" constraints that define the problem of rationality for the organism.
Whether our interests lie in the normative or in the descriptive aspects of rational choice, the construction of models of this kind should prove instructive. Because of the psychological limits of the organism (particularly with respect to computational and predictive ability), actual human rationality-striving can at best be an extremely crude and simplified approximation to the kind of global rationality that is implied, for example, by game-theoretical models. While the approximations that organisms employ may not be the best - even at the levels of computational complexity they are able to handle - it is probable that a great deal can be learned about possible mechanisms from an examination of the schemes of approximation that are actually employed by human and other organisms.
In describing the proposed model, we shall begin with elements it has in common with the more global models, and then proceed to introduce simplifying assumptions and (what is the same thing) approximating procedures.
1.1 Primitive Terms and Definitions
Models of rational behavior - both the global kinds usually constructed, and the more limited kinds to be discussed here - generally require some or all of the following elements:
1. A set of behavior alternatives (alternatives of choice or decision). In a mathematical model, these can be represented by a point set, A.
2. The subset of behavior alternatives that the organism "considers" or "perceives." That is, the organism may make its choice within a set of alternatives more limited than the whole range objectively available to it. The "considered" subset can be represented by a point set Å, with Å included in A (Å ⊂ A).
3. The possible future states of affairs, or outcomes of choice, represented by a point set, S. (For the moment it is not necessary to distinguish between actual and perceived outcomes.)
4. A "pay-off" function, representing the "value" or "utility" placed by the organism upon each of the possible outcomes of choice. The pay-off may be represented by a real function, V(s), defined for all elements, s, of S. For many purposes there is needed only an ordering relation on pairs of elements of S - i.e., a relation that states that s1 is preferred to s2 or vice versa - but to avoid unnecessary complications in the present discussion, we will assume that a cardinal utility, V(s), has been defined.
5. Information as to which outcomes in S will actually occur if a particular alternative, a, in A (or in Å) is chosen. This information may be incomplete - that is, there may be more than one possible outcome, s, for each behavior alternative, a. We represent the information, then, by a mapping of each element, a, in A upon a subset, Sa - the set of outcomes that may ensue if a is the chosen behavior alternative.
6. Information as to the probability that a particular outcome will ensue if a particular behavior alternative is chosen. This is a more precise kind of information than that postulated in (5), for it associates with each element, s, in the set Sa, a probability, Pa(s) - the probability that s will occur if a is chosen. The probability Pa(s) is a real, non-negative function with $\sum_{s \in S_a} P_a(s) = 1$.
Attention is directed to the threefold distinction drawn by the definitions among the set of behavior alternatives, A, the set of outcomes or future states of affairs, S, and the pay-off, V. In the ordinary representation of a game, in reduced form, by its pay-off matrix, the set S corresponds to the cells of the matrix, the set A to the strategies of the first player, and the function V to the values in the cells. The set Sa is then the set of cells in the ath row. By keeping in mind this interpretation, the reader may compare the present formulation with "classical" game theory.
1.2 "Classical" Concepts of Rationality
With these elements, we can define procedures of rational choice corresponding to the ordinary game-theoretical and probabilistic models.1
A. Max-min Rule. Assume that whatever alternative is chosen, the worst possible outcome will ensue - the smallest V(s) for s in Sa will be realized. Then select that alternative, a, for which this worst pay-off is as large as possible.

$$\hat{V}(\hat{a}) = \min_{s \in S_{\hat{a}}} V(s) = \max_{a \in A} \min_{s \in S_a} V(s)$$

Instead of the maximum with respect to the set, A, of actual alternatives, we can substitute the maximum with respect to the set, Å, of "considered" alternatives. The probability distribution of outcomes, (6), does not play any role in the max-min rule.
B. Probabilistic Rule. Maximize the expected value of V(s) for the (assumed known) probability distribution, Pa(s).

$$\hat{V}(\hat{a}) = \sum_{s \in S_{\hat{a}}} V(s) P_{\hat{a}}(s) = \max_{a \in A} \sum_{s \in S_a} V(s) P_a(s)$$
C. Certainty Rule. Given the information that each a in A (or in Å) maps upon a specified sa in S, select the behavior alternative whose outcome has the largest pay-off.

$$\hat{V}(\hat{a}) = V(s_{\hat{a}}) = \max_{a \in A} V(s_a)$$
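The three rules can be made concrete with a small numerical sketch. The alternatives, outcomes, pay-offs, and probabilities below are invented purely for illustration; only the selection rules themselves follow the definitions above.

```python
# Hypothetical data: pay-off V(s) for each outcome, and for each
# alternative a the probabilities P_a(s) over its outcome set S_a.
V = {"s1": 5.0, "s2": 1.0, "s3": 3.0, "s4": 2.0}
P = {
    "a1": {"s1": 0.5, "s2": 0.5},
    "a2": {"s3": 0.5, "s4": 0.5},
}

def max_min(P, V):
    """Rule A: choose the alternative whose worst outcome is best."""
    return max(P, key=lambda a: min(V[s] for s in P[a]))

def probabilistic(P, V):
    """Rule B: choose the alternative with the largest expected pay-off."""
    return max(P, key=lambda a: sum(V[s] * p for s, p in P[a].items()))

def certainty(outcome_of, V):
    """Rule C: each alternative maps on exactly one known outcome."""
    return max(outcome_of, key=lambda a: V[outcome_of[a]])

print(max_min(P, V))          # a2: its worst pay-off (2) beats a1's worst (1)
print(probabilistic(P, V))    # a1: expected pay-off 3.0 against 2.5
print(certainty({"a1": "s2", "a2": "s3"}, V))  # a2: pay-off 3 against 1
```

Note that the max-min rule never consults the probabilities, and that all three rules require the complete mapping and a complete scan of the alternatives.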
II. THE ESSENTIAL SIMPLIFICATIONS
If we examine closely the "classical" concepts of rationality outlined above, we see immediately what severe demands they make upon the choosing organism. The organism must be able to attach definite pay-offs (or at least a definite range of pay-offs) to each possible outcome. This, of course, involves also the ability to specify the exact nature of the outcomes - there is no room in the scheme for "unanticipated consequences." The pay-offs must be completely ordered - it must always be possible to specify, in a consistent way, that one outcome is better than, as good as, or worse than any other. And, if the certainty or probabilistic rules are employed, either the outcomes of particular alternatives must be known with certainty, or at least it must be possible to attach definite probabilities to outcomes.
My first empirical proposition is that there is a complete lack of evidence that, in actual human choice situations of any complexity, these computations can be, or are in fact, performed. The introspective evidence is certainly clear enough, but we cannot, of course, rule out the possibility that the unconscious is a better decision-maker than the conscious. Nevertheless, in the absence of evidence that the classical concepts do describe the decision-making process, it seems reasonable to examine the possibility that the actual process is quite different from the ones the rules describe.
1. See Kenneth J. Arrow, "Alternative Approaches to the Theory of Choice in Risk-Taking Situations," Econometrica, XIX, 404-37 (Oct. 1951).
Our procedure will be to introduce some modifications that appear (on the basis of casual empiricism) to correspond to observed behavior processes in humans, and that lead to substantial computational simplifications in the making of a choice. There is no implication that human beings use all of these modifications and simplifications all the time. Nor is this the place to attempt the formidable empirical task of determining the extent to which, and the circumstances under which humans actually employ these simplifications. The point is rather that these are procedures which appear often to be employed by human beings in complex choice situations to find an approximate model of manageable proportions.
2.1 "Simple" Pay-off Functions
One route to simplification is to assume that V(s) necessarily assumes one of two values, (1, 0), or of three values, (1, 0, -1), for all s in S. Depending on the circumstances, we might want to interpret these values as (a) (satisfactory or unsatisfactory), or (b) (win, draw or lose).
As an example of (b), let S represent the possible positions in a chess game at White's 20th move. Then a (+1) position is one in which White possesses a strategy leading to a win whatever Black does. A (0) position is one in which White can enforce a draw, but not a win. A (-1) position is one in which Black can force a win.
As an example of (a), let S represent possible prices for a house an individual is selling. He may regard $15,000 as an "acceptable" price, anything over this amount as "satisfactory," anything less as "unsatisfactory." In psychological theory we would fix the boundary at the "aspiration level"; in economic theory we would fix the boundary at the price which evokes indifference between selling and not selling (an opportunity cost concept).
The objection may be raised that, although $16,000 and $25,000 are both "very satisfactory" prices for the house, a rational individual would prefer to sell at the higher price, and hence, that the simple pay-off function is an inadequate representation of the choice situation. The objection may be answered in several different ways, each answer corresponding to a class of situations in which the simple function might be appropriate.
First, the individual may not be confronted simultaneously with a number of buyers offering to purchase the house at different prices, but may receive a sequence of offers, and may have to decide to accept or reject each one before he receives the next. (Or, more generally, he may receive a sequence of pairs or triplets or n-tuples of offers, and may have to decide whether to accept the highest of an n-tuple before the next n-tuple is received.) Then, if the elements of S correspond to n-tuples of offers, V(s) would be 1 whenever the highest offer in the n-tuple exceeded the "acceptance price" the seller had determined upon at that time. We can then raise the further question of what would be a rational process for determining the acceptance price.2
2. See the Appendix. It might be remarked here that the simple risk function, introduced by Wald to bring problems in statistical decision theory within the bounds of computability, is an example of a simple pay-off function as that term is defined here.
Second, even if there were a more general pay-off function, W(s), capable of assuming more than two different values, the simplified V(s) might be a satisfactory approximation to W(s). Suppose, for example, that there were some way of introducing a cardinal utility function, defined over S, say U(s). Suppose further that U(W) is a monotonic increasing function with a strongly negative second derivative (decreasing marginal utility). Then V(s) = V{W(s)} might be the approximation as shown on page 107.
When a simple V(s), assuming only the values (+1, 0), is admissible, under the circumstances just discussed or under other circumstances, then a (fourth) rational decision-process could be defined as follows:
D. (i) Search for a set of possible outcomes (a subset, S' in S) such that the pay-off is satisfactory (V(s) = 1) for all these possible outcomes (for all s in S').
(ii) Search for a behavior alternative (an a in A) whose possible outcomes all are in S' (such that a maps upon a set, Sa, that is contained in S').
If a behavior alternative can be found by this procedure, then a
satisfactory outcome is assured. The procedure does not, of course, guarantee the existence or uniqueness of an a with the desired properties.
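A minimal sketch of procedure D follows, under the assumption that the mapping of alternatives on outcome sets is already known and that alternatives are examined in the order given. The function name and the data are hypothetical.

```python
def rule_d(outcomes_of, V):
    """Procedure D for a simple pay-off function with values 1 and 0.

    Step (i): S' is the set of outcomes with V(s) = 1.
    Step (ii): return the first alternative a whose outcome set S_a
    is contained in S', or None if no such alternative exists.
    """
    satisfactory = {s for s, v in V.items() if v == 1}  # S'
    for a, S_a in outcomes_of.items():
        if S_a and S_a <= satisfactory:  # S_a is a non-empty subset of S'
            return a
    return None

# Hypothetical example: a1 can lead to the unsatisfactory outcome s2.
V = {"s1": 1, "s2": 0, "s3": 1, "s4": 1}
outcomes_of = {"a1": {"s1", "s2"}, "a2": {"s3", "s4"}, "a3": {"s1"}}
print(rule_d(outcomes_of, V))  # a2
```

Both a2 and a3 have the desired property here, which illustrates the lack of uniqueness; the sketch resolves it only by the order of examination.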
2.2 Information Gathering
One element of realism we may wish to introduce is that, while V(s) may be known in advance, the mapping of A on subsets of S may not. In the extreme case, at the outset each element, a, may be mapped on the whole set, S. We may then introduce into the decision-making process information-gathering steps that produce a more precise mapping of the various elements of A on nonidentical subsets of S. If the information-gathering process is not costless, then one element in the decision will be the determination of how far the mapping is to be refined.
Now in the case of the simple pay-off functions, (+1, 0), the information-gathering process can be streamlined in an important respect. First, we suppose that the individual has initially a very coarse mapping of A on S. Second, he looks for an S' in S such that V(s) = 1 for s in S'. Third, he gathers information to refine that part of the mapping of A on S in which elements of S' are involved. Fourth, having refined the mapping, he looks for an a that maps on to a subset of S'.
Under favorable circumstances, this procedure may require the individual to gather only a small amount of information - an insignificant part of the whole mapping of elements of A on individual elements of S. If the search for an a having the desirable properties is successful, he is certain that he cannot better his choice by securing additional information.3
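The four steps can be sketched as follows. The `refine` callable stands in for a costly information-gathering step that returns the exact outcome set of one alternative; it, the function name, and the data are assumptions made for illustration only.

```python
def search_with_refinement(coarse, refine, satisfactory):
    """Refine the mapping only where S' is involved; stop at first success.

    coarse:       initial (coarse) mapping of each alternative on outcomes
    refine:       hypothetical costly step giving the exact set S_a
    satisfactory: the set S' of outcomes with V(s) = 1
    Returns the chosen alternative (or None) and the number of refinements.
    """
    queries = 0
    for a, S_a in coarse.items():
        if not (S_a & satisfactory):
            continue              # a cannot reach S'; not worth refining
        exact = refine(a)         # information-gathering step
        queries += 1
        if exact and exact <= satisfactory:
            return a, queries     # every outcome of a is satisfactory
    return None, queries

# Extreme case: at the outset every alternative maps on the whole set S.
S = {"s1", "s2", "s3", "s4"}
true_map = {"a1": {"s1", "s3"}, "a2": {"s3", "s4"}, "a3": {"s2"}}
coarse = {a: set(S) for a in true_map}
print(search_with_refinement(coarse, true_map.__getitem__, {"s3", "s4"}))
# ('a2', 2): a3 is never examined
```

Once a2 is found the search stops, so the part of the mapping concerning a3 is never gathered, in line with the remark that only a small part of the whole mapping may be needed.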
It appears that the decision process just described is one of the important means employed by chess players to select a move in the middle and end game. Let A be the set of moves available to White on his 20th move. Let S be a set of positions that might be reached, say, by the 30th move. Let S' be some subset of S that consists of clearly "won" positions. From a very rough knowledge of the mapping of A on S, White tentatively selects a move, a, that (if Black plays in a certain way) maps on S'. By then considering alternative replies for Black, White "explores" the whole mapping of a. His exploration may lead to points, s, that are not in S', but which are now recognized also as winning positions. These can be adjoined to S'. On the other hand, a sequence may be discovered that permits Black to bring about a position that is clearly not "won" for White. Then White may reject the original point, a, and try another.
Whether this procedure leads to any essential simplification of the computation depends on certain empirical facts about the game. Clearly all positions can be categorized as "won," "lost," or "drawn" in an objective sense. But from the standpoint of the player, positions may be categorized as "clearly won," "clearly lost," "clearly drawn," "won or drawn," "drawn or lost," and so forth - depending on the adequacy of this mapping. If the "clearly won" positions represent a significant subset of the objectively "won" positions, then the combinatorics involved in seeing whether a position can be transformed into a clearly won position, for all possible replies by Black, may not be unmanageable.4 The advantage of this procedure over the more common notion (which may, however, be applicable in the opening) of a general valuation function for positions, taking on values from -1 to 1, is that it implies much less complex and subtle evaluation criteria. All that is required is that the evaluation function be reasonably sensitive in detecting when a position in one of the three states - won, lost, or drawn - has been transformed into a position in another state. The player, instead of seeking for a "best" move, needs only to look for a "good" move.
3. This procedure also dispenses with the necessity of estimating explicitly the cost of obtaining additional information. For further discussion of this point see ...
4. I have estimated roughly the actual degree of simplification that might be realized in the middle game in chess by experimentation with two middle-game positions. A sequence of sixteen moves, eight by each player, might be expected to yield a total of about 10^24 (one septillion) legally permissible variations. By following the general kind of program just described, it was possible to reduce the number of lines of play examined in each of these positions to fewer than 100 variations - a rather spectacular simplification of the choice problem.
We see that, by the introduction of a simple pay-off function and of a process for gradually improving the mapping of behavior alternatives upon possible outcomes, the process of reaching a rational decision may be drastically simplified from a computational standpoint.
In the theory and practice of linear programming, the distinction is commonly drawn between computations to determine the feasibility of a program, and computations to discover the optimal program. Feasibility testing consists in determining whether a program satisfies certain linear inequalities that are given at the outset. For example, a mobilization plan may take as given the maximum work force and the steel-making capacity of the economy. Then a feasible program is one that does not require a work force or steel-making facilities exceeding the given limits.
An optimal program is that one of the feasible programs which maximizes a given pay-off function. If, instead of requiring that the pay-off be maximized, we require only that the pay-off exceed some given amount, then we can find a program that satisfies this requirement by the usual methods of feasibility testing. The pay-off requirement is represented simply by an additional linear inequality that must be satisfied. Once this requirement is met, it is not necessary to determine whether there exists an alternative plan with a still higher pay-off.
For all practical purposes, this procedure may represent a sufficient approach to optimization, provided the minimum required pay-off can be set "reasonably." In later sections of this paper we will discuss how this might be done, and we shall show also how the scheme can be extended to vector pay-off functions with multiple components (Optimization requires, of course, a complete ordering of pay-offs).
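The device of treating the pay-off requirement as one more linear inequality can be sketched as follows. This is not a linear-programming solver: it only tests candidate programs for feasibility, and the resource limits, pay-off coefficients, and candidate programs are invented for illustration.

```python
def satisfies(x, inequalities):
    """True if sum(coef_i * x_i) <= bound for every (coef, bound) pair."""
    return all(
        sum(c * xi for c, xi in zip(coef, x)) <= bound
        for coef, bound in inequalities
    )

def with_payoff_floor(inequalities, payoff, k):
    """Add the requirement payoff . x >= k, rewritten as (-payoff) . x <= -k."""
    return inequalities + [([-c for c in payoff], -k)]

def first_acceptable(candidates, inequalities):
    """Return the first candidate program satisfying all inequalities."""
    return next((x for x in candidates if satisfies(x, inequalities)), None)

# Hypothetical limits: work force 2*x1 + 1*x2 <= 10, steel 1*x1 + 3*x2 <= 15.
base = [([2, 1], 10), ([1, 3], 15)]
payoff = [3, 2]                       # pay-off of a program is 3*x1 + 2*x2
candidates = [(1, 1), (5, 1), (3, 3), (4, 2)]

print(first_acceptable(candidates, with_payoff_floor(base, payoff, 12)))
# (3, 3): feasible, with pay-off 15 >= 12
```

The program (4, 2) is also feasible and has a higher pay-off of 16, but it is never examined, since the search ends once the required pay-off is met.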
2.3 Partial Ordering of Pay-Offs
The classical theory does not tolerate the incomparability of oranges and apples. It requires a scalar pay-off function, that is, a complete ordering of pay-offs. Instead of a scalar pay-off function, V(s), we might have a vector function, V(s), where V has the components V1, V2, .... A vector pay-off function may be introduced to handle a number of situations:
1. In the case of a decision to be made by a group of persons, components may represent the pay-off functions of the individual members of the group. What is preferred by one may not be preferred by the others.
2. In the case of an individual, he may be trying to implement a number of values that do not have a common denominator-e.g., he compares two jobs in terms of salary, climate, pleasantness of work, prestige, etc.;
3. Where each behavior alternative, a, maps on a set of n possible consequences, Sa, we may replace the model by one in which each alternative maps on a single consequence, but each consequence has as its pay-off the n-dimensional vector whose components are the pay-offs of the elements of Sa.

[Figure II. Partial Ordering of Pay-Offs]
This representation exhibits a striking similarity among these three important cases where the traditional maximizing model breaks down for lack of a complete ordering of the pay-offs. The first case has never been satisfactorily treated - the theory of the n-person game is the most ambitious attempt to deal with it, and the so-called "weak welfare principles" of economic theory are attempts to avoid it. The second case is usually handled by superimposing a complete ordering on the points in the vector space ("indifference curves"). The third case has been handled by introducing probabilities as weights for summing the vector components, or by using principles like minimaxing satisfaction or regret.
An extension of the notion of a simplified pay-off function permits us to treat all three cases in much the same fashion. Suppose we regard a pay-off as satisfactory provided that Vi ≥ ki for all i. Then a reasonable decision rule is the following:
E. Search for a subset S' in S such that V(s) is satisfactory for all s in S' (i.e., V(s) ≥ k).
Then search for an a in A such that Sa lies in S'.
Again existence and uniqueness of solutions are not guaranteed. Rule E is illustrated in Figure II for the case of a 2-component pay-off vector.
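Rule E can be sketched for a 2-component pay-off vector as below. The outcomes, pay-off vectors, and aspiration vector k are hypothetical; the componentwise test is the only part taken from the rule itself.

```python
def rule_e(outcomes_of, V, k):
    """Rule E: satisficing with a vector pay-off.

    An outcome s is satisfactory when V_i(s) >= k_i for every component i.
    Returns the first alternative all of whose outcomes are satisfactory,
    or None when no such alternative exists.
    """
    satisfactory = {
        s for s, v in V.items() if all(vi >= ki for vi, ki in zip(v, k))
    }
    for a, S_a in outcomes_of.items():
        if S_a and S_a <= satisfactory:
            return a
    return None

# Hypothetical pay-off vectors (V1, V2) and aspiration levels k = (3, 2).
V = {"s1": (5, 1), "s2": (3, 3), "s3": (4, 2)}
outcomes_of = {"a1": {"s1", "s2"}, "a2": {"s2", "s3"}}
print(rule_e(outcomes_of, V, (3, 2)))  # a2
```

The outcome s1 is excluded even though its first component is the largest of all, because no weights are available for trading its second component against the first.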
In the first of the three cases mentioned above, the satisfactory pay-off corresponds to what I have called a viable solution in "A Formal Theory of the Employment Relation" and "A Comparison of Organization Theories."5 In the second case, the components of V define the aspiration levels with respect to several components of pay-off. In the third case (in this case it is most plausible to assume that all the components of k are equal), ki may be interpreted as the minimum guaranteed pay-off - also an aspiration level concept.
III. EXISTENCE AND UNIQUENESS OF SOLUTIONS
Throughout our discussion we have admitted decision procedures that do not guarantee the existence or uniqueness of solutions. This was done in order to construct a model that parallels as nearly as possible the decision procedures that appear to be used by humans in complex decision-making settings. We now proceed to add supplementary rules to fill this gap.
3.1 Obtaining a Unique Solution
In most global models of rational choice, all alternatives are evaluated before a choice is made. In actual human decision-making, alternatives are often examined sequentially. We may, or may not, know the mechanism that determines the order of procedure. When alternatives are examined sequentially, we may regard the first satisfactory alternative that is evaluated as such as the one actually selected.
If a chess player finds an alternative that leads to a forced mate for his opponent, he generally adopts this alternative without worrying about whether another alternative also leads to a forced mate. In this case we would find it very hard to predict which alternative would be chosen, for we have no theory that predicts the order in which alternatives will be examined. But in another case discussed above - the sale of a house - the environment presents the seller with alternatives in a definite sequence, and the selection of the first satisfactory alternative has precise meaning.
5. Econometrica, XIX (July 1951), 293-305 and Review of Economic Studies, XX (1952-53, No. 1), 40-49.
However, there are certain dynamic considerations, having a good psychological foundation, that we should introduce at this point. Let us consider, instead of a single static choice situation, a sequence of such situations. The aspiration level, which defines a satisfactory alternative, may change from point to point in this sequence of trials. A vague principle would be that as the individual, in his exploration of alternatives, finds it easy to discover satisfactory alternatives, his aspiration level rises; as he finds it difficult to discover satisfactory alternatives, his aspiration level falls. Perhaps it would be possible to express the ease or difficulty of exploration in terms of the cost of obtaining better information about the mapping of A on S, or the combinatorial magnitude of the task of refining this mapping. There are a number of ways in which this process could be defined formally.
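One of the many possible formalizations is sketched below. It assumes, purely for illustration, that the aspiration level moves by a fixed step: up after a trial in which a satisfactory alternative was found, down after a trial in which none was found. Success on a trial is used here as a crude proxy for "ease of discovery"; neither the step rule nor the data come from the text.

```python
def run_trials(trials, aspiration, step):
    """Sequence of choice situations with an adjusting aspiration level.

    trials:     list of trials, each a list of pay-offs examined in order
    aspiration: initial aspiration level
    step:       hypothetical fixed adjustment applied after each trial
    Returns a list of (aspiration used, pay-off chosen or None) per trial.
    """
    history = []
    for payoffs in trials:
        # Select the first alternative that meets the current aspiration.
        choice = next((p for p in payoffs if p >= aspiration), None)
        history.append((aspiration, choice))
        # Assumed rule: raise after success, lower after failure.
        aspiration += step if choice is not None else -step
    return history

trials = [[3, 4], [6, 2], [5, 7], [9]]
print(run_trials(trials, aspiration=5, step=1))
# [(5, None), (4, 6), (5, 5), (6, 9)]
```

The failure in the first trial lowers the aspiration level from 5 to 4, after which every later trial yields a satisfactory alternative.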
Such changes in aspiration level would tend to bring about a
"near-uniqueness" of the satisfactory solutions and would also tend to guarantee the existence of satisfactory solutions. For the failure to discover a solution would depress the aspiration level and bring satisfactory solutions into existence.
3.2 Existence of Solutions: Further Possibilities
We have already discussed one mechanism by which the existence of solutions, in the long run, is assured. There is another way of representing the processes already described. Up to this point little use has been made of the distinction between A, the set of behavior alternatives, and Å, the set of behavior alternatives that the organism considers. Suppose now that the latter is a proper subset of the former. Then, the failure to find a satisfactory alternative in Å may lead to a search for additional alternatives in A that can be adjoined to Å.⁶ This procedure is simply an elaboration of the information-gathering process previously described. (We can regard the elements of A that are not in Å as elements that are initially mapped on the whole set, S.)
6. I might mention that, in the spirit of crude empiricism, I have presented a number of students and friends with a problem involving a multiple pay-off, in which the pay-off depends violently upon a very contingent and uncertain event, and have found them extremely reluctant to restrict themselves to a set of behavior alternatives allowed by the problem. They were averse to an alternative that promised very large profit or ruin, where the relevant probability could not be computed, and tried to invent new alternatives whose pay-offs were less sensitive to the contingent event. The problem in question is Modigliani's "hot-dog stand" problem described
In one organism, dynamic adjustment over a sequence of choices may depend primarily upon adjustments of the aspiration level. In another organism, the adjustments may be primarily in the set Å: if satisfactory alternatives are discovered easily, Å narrows; if it becomes difficult to find satisfactory alternatives, Å broadens. The more persistent the organism, the greater the role played by the adjustment of Å, relative to the role played by the adjustment of the aspiration level. (It is possible, of course, and even probable, that there is an asymmetry between adjustments upward and downward.)
If the pay-off were measurable in money or utility terms, and if the cost of discovering alternatives were similarly measurable, we could replace the partial ordering of alternatives exhibited in Figure II by a complete ordering (an ordering in terms of a weighted sum of the pay-off and the cost of discovering alternatives). Then we could speak of the optimal degree of persistence in behavior: we could say that the more persistent organism was more rational than the other, or vice versa. But the central argument of the present paper is that the behaving organism does not in general know these costs, nor does it have a set of weights for comparing the components of a multiple pay-off. It is precisely because of these limitations on its knowledge and capabilities that the less global models of rationality described here are significant and useful. The question of how it is to behave "rationally," given these limitations, is distinct from the question of how its capabilities could be increased to permit action that would be more "rational" judged from the mountain-top of a more complete model.⁷
7. One might add: "or judged in terms of the survival value of its choice
The two viewpoints are not, of course, completely different, much less antithetical. We have already pointed out that the organism may possess a whole hierarchy of rational mechanisms: that, for example, the aspiration level itself may be subject to an adjustment process that is rational in some dynamic sense. Moreover, in many situations we may be interested in the precise question of whether one decision-making procedure is more rational than another, and to answer this question we will usually have to construct a broader criterion of rationality that encompasses both procedures as approximations. Our whole point is that it is important to make explicit what level we are considering in such a hierarchy of models, and that for many purposes we are interested in models of "limited" rationality rather than models of relatively "global" rationality.
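The two modes of adjustment discussed in this section, lowering the aspiration level versus enlarging the set of alternatives considered, can be illustrated with a small sketch. The Python below is not part of the original paper. The function name `satisfice`, the `persistence` and `step` parameters, and the fixed order in which new alternatives are discovered are all assumptions made for illustration.

```python
def satisfice(all_alternatives, considered, payoff, aspiration, persistence, step=1.0):
    """Return (choice, final aspiration level, final considered set).

    all_alternatives: the full set A, in an assumed order of discovery
    considered:       the initially considered subset (must be non-empty)
    payoff:           function mapping an alternative to a number
    persistence:      how many new alternatives are adjoined per round
                      before the aspiration level is lowered (hypothetical)
    step:             amount by which the aspiration level is lowered
    """
    if not considered:
        raise ValueError("the considered set must start non-empty")
    if step <= 0:
        raise ValueError("step must be positive")
    considered = list(considered)
    undiscovered = [a for a in all_alternatives if a not in considered]
    while True:
        # First look for a satisfactory alternative among those already considered.
        for a in considered:
            if payoff(a) >= aspiration:
                return a, aspiration, considered
        # None found: search A for up to `persistence` additional alternatives.
        adjoined = 0
        while undiscovered and adjoined < persistence:
            a = undiscovered.pop(0)
            considered.append(a)
            adjoined += 1
            if payoff(a) >= aspiration:
                return a, aspiration, considered
        # Search failed this round: lower the aspiration level and try again.
        aspiration -= step
```

With pay-offs of 3, 5 and 9 for three alternatives, only the first initially considered, and an aspiration level of 8, a call with `persistence=2` ends on the third alternative with the aspiration level unchanged, while `persistence=0` keeps the first alternative and lets the aspiration level fall to 3.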
IV. FURTHER COMMENTS ON DYNAMICS
The models thus far discussed are dynamic only in a very special sense: the aspiration level at time t depends upon the previous history of the system (previous aspiration levels and previous levels of attainment). Another kind of dynamic linkage might be very important. The pay-offs in a particular trial might depend not only on the alternative chosen in that trial but also on the alternatives chosen in previous trials.
The most direct representation of this situation is to include, as components of a vector pay-off function, the pay-offs for the whole sequence of trials. But then optimization would require the selection, at the beginning of the sequence, of a strategy for the whole sequence (see the Appendix). Such a procedure would again rapidly complicate the problem beyond the computational capacity of the organism. A possible middle ground is to define for each trial a pay-off function with two components. One would be the "immediate" pay-off (consumption), the other, the "position" in which the organism is left for future trials (saving, liquidity).
Let us consider a chess game in which the players are paid off at
the end of each ten moves in proportion to arbitrarily assigned values of their pieces left on the board (say, queen, 1; rook, 10; etc.). Then a player could adopt some kind of planning horizon and include in his estimated pay-off the "goodness" of his position at the planning horizon. A comparable notion in economics is that of the depreciated value of an asset at the planning horizon. To compute such a value precisely would require the player actually to carry his strategy beyond the horizon. If there is time-discounting of pay-offs, this has the advantage of reducing the importance of errors in estimating these depreciated values. (Time-discounting may sometimes be essential in order to assure convergence of the summed pay-offs.)
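The effect of time-discounting on errors in the estimated position can be made concrete with a short sketch. It is an illustration only. The function name `horizon_value`, its parameters, and the use of a single constant discount factor are assumptions not found in the paper.

```python
def horizon_value(immediate_payoffs, position_estimate, discount):
    """Discounted pay-off up to the planning horizon plus the discounted
    estimate of the position held at the horizon.

    immediate_payoffs: pay-offs for trials 0 .. H-1 (H is the horizon)
    position_estimate: rough value of the position at the horizon
    discount:          per-trial discount factor, 0 < discount <= 1
    """
    if not 0 < discount <= 1:
        raise ValueError("discount must lie in (0, 1]")
    value = 0.0
    weight = 1.0
    for payoff in immediate_payoffs:
        value += weight * payoff
        weight *= discount
    # `weight` is now discount ** H, so an error in the position estimate
    # enters the total scaled down by that factor.
    return value + weight * position_estimate
```

For three trials with a discount factor of 0.5, an error of 8 units in the position estimate changes the total by only 1 unit, since 0.5 ** 3 = 0.125.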
It is easy to conjure up other dynamic complications, which may be of considerable practical importance. Two more may be mentioned, without attempting to incorporate them formally. The consequences that the organism experiences may change the pay-off function: it doesn't know how well it likes cheese until it has eaten cheese. Likewise, one method for refining the mapping of A on S may be to select a particular alternative and experience its consequences. In these cases, one of the elements of the pay-off associated with a particular alternative is the information that is gathered about the mapping or about the pay-off function.
V. CONCLUSION
The aim of this paper has been to construct definitions of "rational choice" that are modeled more closely upon the actual decision processes in the behavior of organisms than definitions heretofore proposed. We have outlined a fairly complete model for the static case, and have described one extension of this model into dynamics. As has been indicated in the last section, a great deal remains to be done before we can handle realistically a more completely dynamic system.
In the introduction, it was suggested that definitions of this kind might have normative as well as descriptive value. In particular, they may suggest approaches to rational choice in areas that appear to be far beyond the capacities of existing or prospective computing equipment. The comparison of the I.Q. of a computer with that of a human being is very difficult. If one were to factor the scores made by each on a comprehensive intelligence test, one would undoubtedly find that in those factors on which the one scored as a genius the other would appear a moron, and conversely. A survey of possible definitions of rationality might suggest directions for the design and use of computing equipment with reasonably good scores on some of the factors of intelligence in which present computers are moronic.
The broader aim, however, in constructing these definitions of "approximate" rationality is to provide some materials for the construction of a theory of the behavior of a human individual or of groups of individuals who are making decisions in an organizational context. The apparent paradox to be faced is that the economic theory of the firm and the theory of administration attempt to deal with human behavior in situations in which that behavior is at least "intendedly" rational; while, at the same time, it can be shown that if we assume the global kinds of rationality of the classical theory the problems of internal structure of the firm or other organization largely disappear.⁸ The paradox vanishes, and the outlines of theory begin to emerge when we substitute for "economic man" or "administrative man" a choosing organism of limited knowledge and ability. This organism's simplifications of the real world for purposes of choice introduce discrepancies between the simplified model and the reality; and these discrepancies, in turn, serve to explain many of the phenomena of organizational behavior.
8. See Herbert A. Simon, Administrative Behavior (Macmillan, 1947), pp. 39-41, 80-84, 96-102, 240-44.
APPENDIX
EXAMPLE OF RATIONAL DETERMINATION OF AN ACCEPTABLE PAY-OFF
In the body of this paper, the notion is introduced that rational adjustment may operate at various "levels." That is, the organism may choose rationally within a given set of limits postulated by the model, but it may also undertake to set these limits rationally. The house-selling illustration of Section 2.1 provides an example of this.
We suppose that an individual is selling a house. Each day (or
other unit of time) he sets an acceptance price: d(k), say, for the kth day. If he receives one or more offers above this price on the day in question, he accepts the highest offer; if he does not receive an offer above d(k), he retains the house until the next day, and sets a new
acceptance price, d(k + 1).
Now, if he has certain information about the probability distribution of offers on each day, he can set the acceptance price so that it will be optimal in the sense that it will maximize the expected value, V[d(k)], of the sales price.
To show this, we proceed as follows. Let $p_k(y)$ be the probability that $y$ will be the highest price offered on the kth day. Then:

$$P_k(d) = \int_{d(k)}^{\infty} p_k(y)\,dy \tag{A.1}$$

is the probability that the house will be sold on the kth day if it has not been sold earlier.

$$\varepsilon_k(d) = \int_{d(k)}^{\infty} y\,p_k(y)\,dy \tag{A.2}$$

will be the expected value received by the seller on the kth day if the house has not been sold earlier. Taking into account the probability that the house will be sold before the kth day,

$$E_k(d) = \varepsilon_k(d) \prod_{j=1}^{k-1} \bigl(1 - P_j(d)\bigr) \tag{A.3}$$

will be the unconditional expected value of the seller's receipts on the kth day; and

$$V[d(k)] = \sum_{k=1}^{\infty} E_k(d) \tag{A.4}$$

will be the expected value of the sales price.
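The construction leading to (A.4) can be checked numerically in a discrete, finite-horizon setting. The sketch below is an illustration rather than the paper's own procedure. It assumes that the highest offer on each day takes one of finitely many values, that an offer at or above the acceptance price is accepted, and that the horizon is finite. The name `expected_sale_price` and the dictionary representation of each day's distribution are hypothetical.

```python
def expected_sale_price(offer_dists, prices):
    """Expected sale price for a given schedule of acceptance prices.

    offer_dists: list of dicts, one per day, mapping the highest offer y
                 on that day to its probability
    prices:      list of acceptance prices d(k), one per day
    """
    value = 0.0
    prob_unsold = 1.0  # probability the house is still unsold at the start of the day
    for dist, d in zip(offer_dists, prices):
        # Probability of a sale on this day, given that it was not sold earlier.
        p_sell = sum(p for y, p in dist.items() if y >= d)
        # Expected receipts on this day, given that it was not sold earlier.
        e_day = sum(y * p for y, p in dist.items() if y >= d)
        # Unconditional expected receipts for the day, accumulated as in (A.4).
        value += e_day * prob_unsold
        prob_unsold *= 1.0 - p_sell
    return value
```

For two days on which the highest offer is 100 or 200 with equal probability, holding out for 150 on the first day and accepting anything on the second gives an expected price of 175, against 150 for accepting the first offer whatever it is.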
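Now we wish to set d(k), for each k, at the level that will maximize (A.4). The k components of the function d(k) are independent. Differentiating V partially with respect to each component, we get:

$$\frac{\partial V}{\partial d(i)} = \sum_{k=1}^{\infty} \frac{\partial E_k(d)}{\partial d(i)} \qquad (i = 1, \ldots, n). \tag{A.5}$$

$$\frac{\partial E_i(d)}{\partial d(i)} = \frac{\partial \varepsilon_i(d)}{\partial d(i)} \prod_{j=1}^{i-1} \bigl(1 - P_j(d)\bigr), \quad \text{and} \tag{A.6}$$

$$\frac{\partial E_k(d)}{\partial d(i)} = \varepsilon_k(d) \prod_{\substack{j=1 \\ j \neq i}}^{k-1} \bigl(1 - P_j(d)\bigr) \left(-\frac{\partial P_i(d)}{\partial d(i)}\right) \quad \text{for } i < k, \text{ and} \tag{A.7}$$

$$\frac{\partial E_k(d)}{\partial d(i)} = 0 \quad \text{for } i > k. \tag{A.8}$$

Since $\partial \varepsilon_i(d)/\partial d(i) = -d(i)\,p_i(d)$ and $\partial P_i(d)/\partial d(i) = -p_i(d)$, where $p_i(d)$ denotes $p_i$ evaluated at $d(i)$, we have for a maximum:

$$-d(i)\,p_i(d) \prod_{j=1}^{i-1} \bigl(1 - P_j(d)\bigr) + \sum_{k=i+1}^{\infty} \varepsilon_k(d) \prod_{\substack{j=1 \\ j \neq i}}^{k-1} \bigl(1 - P_j(d)\bigr)\,p_i(d) = 0. \tag{A.9}$$

Factoring out $p_i(d)$, we obtain, finally:

$$d(i) = \frac{\displaystyle\sum_{k=i+1}^{\infty} \varepsilon_k(d) \prod_{\substack{j=1 \\ j \neq i}}^{k-1} \bigl(1 - P_j(d)\bigr)}{\displaystyle\prod_{j=1}^{i-1} \bigl(1 - P_j(d)\bigr)} = \sum_{k=i+1}^{\infty} \varepsilon_k(d) \prod_{j=i+1}^{k-1} \bigl(1 - P_j(d)\bigr). \tag{A.10}$$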
For the answer to be meaningful, the infinite sum in (A.10) must converge. If we look at the definition (A.2) for $\varepsilon_k(d)$ we see this would come about if the probability distribution of offers shifts downward through time with sufficient rapidity. Such a shift might correspond to (a) expectations of falling prices, or (b) interpretation of y as the present value of the future price, discounted at a sufficiently high interest rate.
Alternatively, we can avoid the question of convergence by assuming a reservation price d(n), for the nth day, which is low enough so that $P_n(d)$ is unity. We shall take this last alternative, but before proceeding, we wish to interpret the equation (A.10). Equation (A.10) says that the rational acceptance price on the ith day, d(i), is equal to the expected value of the sales price if the house is not sold on the ith day and acceptance prices are set optimally for subsequent days. This can be seen by observing that the right-hand side of (A.10) is the same as the right-hand side of (A.4) but with the summation extending from k = (i + 1) instead of from k = 1.⁹
Hence, in the case where the summation is terminated at period n (that is, the house will be sold with certainty in period n if it has not been sold previously), we can compute the optimal d(i) by working backward from the terminal period, and without the necessity of solving simultaneously the equations (A.10).
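The backward computation can be sketched for a discrete, finite-horizon case. Splitting off the first term of the sum in (A.10) gives a one-step recursion: the acceptance price for day i equals the expected receipts on day i + 1 plus the probability of no sale on day i + 1 times the acceptance price for day i + 1. The code below applies that recursion. It is an illustration under stated assumptions, not the paper's own procedure: offers take finitely many values, an offer at or above the acceptance price is accepted, and the final-day reservation price lies at or below every possible final-day offer. The name `optimal_acceptance_prices` and the dictionary representation are hypothetical.

```python
def optimal_acceptance_prices(offer_dists, reservation=0.0):
    """Acceptance prices d(1), ..., d(n) found by working backward from day n.

    offer_dists: list of dicts, one per day, mapping the highest offer y
                 on that day to its probability
    reservation: acceptance price for the last day; assumed low enough
                 that a sale on the last day is certain
    """
    n = len(offer_dists)
    if n == 0:
        raise ValueError("at least one day is required")
    prices = [0.0] * n
    prices[n - 1] = reservation
    for i in range(n - 2, -1, -1):
        nxt = offer_dists[i + 1]
        d_next = prices[i + 1]
        # Probability of a sale and expected receipts on the following day.
        p_sell = sum(p for y, p in nxt.items() if y >= d_next)
        e_next = sum(y * p for y, p in nxt.items() if y >= d_next)
        # One-step form of (A.10): value of waiting one more day.
        prices[i] = e_next + (1.0 - p_sell) * d_next
    return prices
```

For three days on which the highest offer is 100 or 200 with equal probability, the recursion gives acceptance prices of 175, 150 and 0: the seller asks less as the terminal day approaches.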
It is interesting to observe what additional information the seller needs in order to determine the rational acceptance price, over and above the information he needs once the acceptance price is set. He needs, in fact, virtually complete information as to the probability distribution of offers for all relevant subsequent time periods.
Now the seller who does not have this information, and who will be satisfied with a more bumbling kind of rationality, will make approximations to avoid using the information he doesn't have. First, he will probably limit the planning horizon by assuming a price at which he can certainly sell and will be willing to sell in the nth time period. Second, he will set his initial acceptance price quite high, watch the distribution of offers he receives, and gradually and approximately adjust his acceptance price downward or upward until he receives an offer he accepts, without ever making probability calculations. This, I submit, is the kind of rational adjustment that humans find "good enough" and are capable of exercising in a wide range of practical circumstances.
9. Equation (A.10) appears to have been arrived at independently by D. A. Darling and W. M. Kincaid. See their abstract, "An Inventory Problem," in the Journal of Operations Research Society of America, I, 80 (Feb. 1953).
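The approximate procedure just described can be sketched without any probability calculation. The code below is a deliberately crude illustration. The name `heuristic_sale`, the fixed proportional cut after each unsold day, and the downward-only adjustment are assumptions. The text allows adjustment in either direction and specifies no particular rule.

```python
def heuristic_sale(offers, initial_price, sure_price, cut=0.1):
    """Return (day of sale, price received) for a seller who starts high
    and lowers the acceptance price after each day without a sale.

    offers:        highest offer observed on each successive day
    initial_price: the deliberately high opening acceptance price
    sure_price:    price at which the seller assumes he can certainly
                   sell once the observed days run out
    cut:           fraction by which the acceptance price is lowered
                   after each unsold day (hypothetical rule)
    """
    price = initial_price
    for day, offer in enumerate(offers, start=1):
        if offer >= price:
            return day, offer
        # No acceptable offer today: lower the price, but never below
        # the price at which a sale is assumed to be certain.
        price = max(sure_price, price * (1.0 - cut))
    # Planning horizon reached: sell at the assumed sure price.
    return len(offers) + 1, sure_price
```

With offers of 90, 95 and 100, an opening price of 120 and a 10 per cent cut per day, the acceptance price falls to about 97 by the third day and the offer of 100 is accepted.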
HERBERT A. SIMON.
CARNEGIE INSTITUTE OF TECHNOLOGY