0011 Learning
7/31/2019
1. Introduction
This paper deals with consumer learning from the point of view of the relationship between
the theory of the consumer's optimal choice and the cognitive sciences. The approach is
consistent with Simon's theory of bounded rationality, and particularly with the aspect of said
theory that goes under the name of "procedural rationality", used as a concise term to mean the
determination of a heuristic with a view to achieving satisficing decisions. In the light of this
conception of rationality, the concept of consumer learning is developed by means of a critical
discussion both of the Bayesian approach and of the neural approach, the aim being to identify its
potential for application and its limits.
In this setting, a subject, j, assumed to represent the generic consumer, uses acts of
consumption or, in more general terms, the acquisition of information to assess the capacity of a
given good to satisfy specific needs, because he initially has some doubts as to said assessment.
In fact, the problem contains aspects of a cognitive order that inevitably influence the way in
which j makes his decisions. Simon's contribution lies in considering the question of the
formation of knowledge as being part of the decisional process and therefore as being capable of
influencing said process, as if it were a system of constraints.
This work deals with the following aspects: a) the statement of a practical problem which
prompts the need to derive an adequate representation of the learning process; b) a discussion of
the relationship of learning in economics and cognitive science; c) a discussion of the concept of
bounded rationality applied to consumer theory; d) the derivation of a specific "cognitive"
representation of products; e) a critical discussion of the Bayesian representation of the learning
process; f) an evaluation of the neural representation of said learning process.
2. The problem
Let's take T to be the whole time horizon of a generic agent, j, and let's assume that during the
initial period, t1, he decides to purchase one unit of a given good or service, a, reserving any
decision to purchase another unit at a later date, t2. The reason why j implements such a
strategy lies in the existence of some doubt concerning the adequacy of the good or service being
provided in terms of satisfying the need for which it is purchased, so for j the consumption made
at t1 serves to ascertain whether there is an acceptable degree of correspondence between the
predicted adequacy and the ascertained adequacy of the good or service, a.
On the matter of whether the consumer is doubtful concerning the ability of a to satisfy
specific needs, I shall go into more depth in a subsequent section of this paper. For the time
being, suffice it to note that the existence of any such uncertainty does not necessarily involve
contractual failings on the part of the supplier, such as those at the root of the Akerlof
model (1970), for instance, but can be traced back to difficulties of a cognitive order that every
consumer faces. In this context, it is not j's uncertainty concerning his own utility function, as
hypothesized by Cyert and De Groot (1975), or as described in one of my previous papers
(Mistri, 1996), that is the problem here. The question considered here poses the need to derive a
consumer learning process and requires said process to be set within a consistent theoretical
framework (Brenner, 1999). Intuitively, we might suppose that the most suitable theoretical
framework for the above-illustrated problem lies in the experimental consumer approach. What
remains to be filled with an operational content is the concept of the experimental consumer, who
uses the goods not to maximize his utility function, but to ascertain their adequacy in satisfying
specific needs. We are thus considering a consumer who expresses cardinal preferences on
classes of goods; all goods are described by their observable features, which can be represented
in vectorial terms. For each class of goods, our consumer j derives a value function - in the sense
used by Luce and Raiffa (1976, p. 220) - which, as the two above-mentioned authors themselves
point out, is not necessarily a utility function.
3. Learning and cognitivism
The introduction mentions a concept of learning that departs from those used in many works
dedicated specifically to consumer learning, though even a rough approximation cannot
disregard the latter. It is also worth emphasizing that there is no single, unequivocal learning
model in economics. The reason for this must be sought in the multiplicity of the theorizations
existing in the cognitive sciences. The term "learning" is therefore used to indicate a very ample
class of phenomena that differ from each other and that only have in common the fact that they
any risk of error.
4. The consumer and bounded rationality
From the standpoint taken here, learning is defined both as a process by means of which the
subject creates classes of goods, and as a process by means of which the consumer j refines his
classification of the goods. Inevitably this poses the problem of defining a good. In the standard
sense adopted by Debreu (1959), goods are defined on the basis of their physical nature and are
distinguished according to their features and their territorial and temporal location. The inclusion
of the time factor involves introducing a dimension of uncertainty. As a first approximation, let's
assume that j has a definite order of preferences, ≽j, concerning a set of goods, {ai}, where i =
1,2,...,l, which can be represented in the space ℝ+l. At the same time, still following the standard
scheme, the information needed by j can be said to be restricted to the system of
relative prices, which can be indicated by the vector pi, with i = 1,2,...,l.
Purchases are defined at an initial moment t1; then an instantaneous equilibrium is
determined for j according to the rules of bounded maximization. In the standard definition of
goods, having different features makes the goods objectively differ from each other and a suitable
utility function can be derived for them as a set. Conversely, following the interpretative line
prevailing in marketing studies, we can define goods on the strength of a set of characteristics
using multi-attribute analysis, according to which the goods can be considered as equivalent if
they are found so on the basis of a comparison of their synthetic attribute indexes, or attributes
which sum up their characteristics as a whole (Lancaster, 1966).
In an approach that considers multi-attribute goods, it can be assumed that j defines his order
of preferences for a set of abstract goods, which represent categories or classes of goods against
which every real good can be compared. This way of defining the goods has cognitive
foundations, in the sense that a person generally tends to conceptualize and categorize.
Conceptualization and categorization are the outcome of people's natural tendency to contain the
amount of information to remember, thereby achieving a substantial cognitive economy.
Conceptualization helps to facilitate inference; from the consumer's point of view, categorization helps to
facilitate inferences concerning the ability of a certain class of goods to satisfy a specific need.
A goods-purchasing action always has an inferential nature, especially if we consider the
sequence of stages by means of which the purchase/consumption process takes place, beginning
with a decisional phase. The various phases do not necessarily coincide; it has been said that a
decision to purchase is an inference on the features of a good that may be consumed in the future.
This immediately poses the problem of establishing how, in practical terms, j makes these
conjectures and how this cognitive process can be set in a typically economic conceptual scheme.
A purchasing project is based, first of all, on a heritage of information accumulated prior to
such a decision being reached; above all, it goes through the way in which said information is
classified and represented in the person's memory. What j classifies is the coupling between
the goods' features and needs, as they became apparent in the past and as j believes they may be
manifest in the future. Said coupling gives rise to mental images, which are interior
representations of the outside reality (Marucci, 1997). Cognition of the mental images appears as
a useful medium between the activity of perceiving the sensorial input and the knowledge
systems stored in the "semantic memory", i.e. the memory where the concepts are categorized.
Recourse to the theory of mental images appears useful for an understanding of the link between
plans and actions - a link that lies, for instance, at the foundations of the theory of sequential
decisional processes.
The mental image becomes a logical pivot between perception, categorization and
memorization, and seems necessary to explain the behavior of a subject (such as our consumer),
who is not merely a classifier of goods, but also an elaborator of consumption schemes. In
essence, we can assume that, in deciding on a consumption plan, the consumer has in mind a
certain image of the good and of the pleasure that he can gain from it. To simplify the
description of the decisional process involved, we can say that this is implemented exclusively on
the basis of the structural features of the goods, as codified in the person's memory. At the same
time, it is feasible to imagine that the subject tends to simplify the images through categorization,
reducing the goods to prototypes which become representations of abstract goods (Macchi,
1989). In fact, we can assume that j draws from various different training processes (e.g. an
exogenous education towards consumption, imitation, experimentation, the collection of
information, the opinions of experts and of opinion leaders, etc.) and is thus capable of
building himself a grid of typical features that the good must possess.
A previous paper (1998) introduced the distinction between genotypical goods and
phenotypical goods: genotypical goods represent a class of goods with specific general or abstract
features, while phenotypical goods are variations of the abstract type. Economic theory
implicitly uses the concept of phenotypical good when it deals with the differentiation between
similar goods as part of the monopolistic competition approach. At the same time, economic
theory implicitly uses the concept of genotypical good as part of its standard theory on consumer
behavior.
In the present context, the genotypical good can be considered as the prototype that emerges
from an adequate process of categorization on the part of j. Moreover, j's deriving of a prototype is
consistent with the principle of satisficing behavior. The prototype theory can be linked to a
simple representation of multi-attribute goods, reminiscent of the representation processes used
by the neural schemes, i.e. with vectors (Paul Churchland, 1995; Patricia Churchland and T.
Sejnowski, 1992); on this basis every attribute, or feature, can be considered as a dimension in
the abstract space of the features; thus a multi-attribute good can be represented by a vector of the
features,
[1] x = (x1, x2,...,xm)
where the components xi, i = 1,2,...,m, are coordinates in the space of the features.
The agent j has a definite order of preferences ≽j on a set of abstract goods that can be
represented in vectorial form {x1, x2,...,xl}, which represent specific classes of goods with which
the real goods {x1*, x2*,...,xl*} and their features can be compared. So the problem consists in
establishing whether a given real product, xk*, has such features as will make it belong to the
specific class typified by the product xk. This is a typical recognition problem in the sense of
pattern recognition logic. In mathematical terms, the problem involves assigning the specific
product xk* to its own class, C.
Let's consider a real good xk*; this will be equivalent to a typical good, xk, provided it
belongs to the same class, Ck; so we can say that two products, represented vectorially, x1* and x2*, are
equivalent in terms of features when they both belong to Ck, i.e.
[2] x1* ≡ x2* ⇔ {x1* ∈ Ck and x2* ∈ Ck}
Note that the class of equivalence is determined on the basis of the features of the products
and of their functions.
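This recognition problem can be sketched as a nearest-prototype assignment, in Python; the prototypes, the feature values, and the use of Euclidean distance as the comparison criterion are illustrative assumptions, not taken from the text.

```python
# Sketch of the recognition problem: assign a real good x_k* (a feature
# vector) to the class C_k of the nearest prototype (typical good x_k).
# Prototypes, feature values and the distance measure are assumptions.

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

prototypes = {
    "C1": (1.0, 0.0),   # prototype x1 for class C1 (hypothetical)
    "C2": (0.0, 1.0),   # prototype x2 for class C2 (hypothetical)
}

def classify(x_star):
    """Assign the real good x* to the class of its nearest prototype."""
    return min(prototypes, key=lambda c: distance(x_star, prototypes[c]))

print(classify((0.9, 0.2)))  # C1: closest to the prototype of class C1
```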
5. Classes and multi-attribute value functions
The definition of class of equivalence, as mentioned above, is entirely generic; it can be
specified by associating each class of equivalence, Ci (where i = 1,2,...,k,...,l, defined on the space
of the goods classes), with an index of value that correlates the value functions, v, with the
structures of the perceived features of the single products (Luce and Raiffa, 1976, p. 68). A value
function, v, associates a real number with every point on the space of the features and can
represent a cardinally-ordered structure of preferences.
Assuming that a product is assessed on the basis of the set of its perceived features, it follows
that a criterion has to be identified with which to obtain a concise representation of v. Decisional
theory uses the multi-attribute value functions, v, which are linearly additive in their arguments
(Keeney and Raiffa, 1976; Marshall and Oliver, 1995), e.g.
[3] v (x1, x2) = v (x1) + v (x2)
where v (x1) and v (x2) are single-attribute functions (Keeney and Raiffa, 1976, p. 105).
According to [3], the linear form of v enables the space of the features to be broken down into
subspaces, each with a single feature, dealing with v (x1) and v (x2) as single-attribute value
functions, each of which is defined on a specific space of the classes. This operation can prove
useful in practice, as we shall see.
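Equation [3] can be illustrated with a minimal sketch; the single-attribute functions v1 and v2 below are invented for the example and are not given in the text.

```python
# A minimal sketch of the linearly additive value function in [3].
# The single-attribute functions v1 and v2 are hypothetical.

def v1(x1):
    """Hypothetical single-attribute value function for feature x1."""
    return 2.0 * x1

def v2(x2):
    """Hypothetical single-attribute value function for feature x2."""
    return 0.5 * x2

def v(x1, x2):
    """Equation [3]: v(x1, x2) = v(x1) + v(x2), additive in its arguments."""
    return v1(x1) + v2(x2)

print(v(1.0, 4.0))  # 4.0 = 2.0 + 2.0
```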
Conversely, we are aware that non-additive multi-attribute value functions would be better
able to grasp the complexity of the process of categorizing the various products, though for the
purposes of the present work it is probably advisable to restrict ourselves to considering v as
linearly additive. A linearly additive v can be considered as a linear approximation of a
corresponding non-linearly additive v. From the cognitive viewpoint, the difference between the
two functions expresses a different conceptualizing and categorizing method. The cognitive
sciences themselves are not unequivocal in giving an adequate interpretation of categorization
processes, because in the simplest cases they are implemented by means of a linear breakdown of
basic components, while gestaltic phenomena cannot be eliminated in the more complex cases.
This means that the single parts of the entity that we want to categorize interact with each other,
emphasizing the role of the structure as a whole, so that goods with a different attributive
structure may belong to the same class.
In practice, the consumer is required to solve a problem of "pattern recognition" that involves
recognizing the relationship between the set of features and the utility that can be gained from
them. As a result, any two goods can be considered as belonging to the same class, even if their
features have vectorial structures that are not the same, if their respective v have the same value -
or values that fall close enough to a "shadow" value. Let's consider [3] and assume that va(x1a, x2a)
indicates the value function of the product a and that vb(x1b, x2b) indicates the value function
of the product b; assuming also that x1a ≠ x1b and x2a ≠ x2b, but that they are such that

va(x1a, x2a) = vb(x1b, x2b)

In this case the two products will belong to the same class, just as
they will in the obvious case in which x1a = x1b and x2a = x2b.
From a cognitive point of view, the consumer will recognize the patterns by breaking them
down into essential parts, according to "features analysis" criteria (Anderson, 1980), assessing
their fundamental distinctive features. Using this model, the stimuli are considered as
combinations of elementary distinctive features. The consumer is therefore required to classify
first the simple attributes, by determining their mono-attributive classes, then the combination of
said attributes, by determining their pluri-attributive classes. He will memorize these features by
means of a specific coding procedure. In a subsequent phase, when he must recall the features
and the sensations they gave him from memory, the consumer must adopt a synthetic assessment.
A loss of information is implicit in this process of memorizing and recalling from memory, which
also explains the difficulty that many people have, according to Marshall and Oliver (1995, p.
291), in comparing objectives with multiple attributes. It follows that using an additive v,
inasmuch as it is a linear approximation of a non-additive v, represents a satisficing heuristic, the
use of which can generate uncertainty in the determination of the choices made by j.
Given a linearly additive v, which takes its values on the space of the classes of equivalence, C, and assuming
that x = (x1, x2,...,xm) is a vector of attributes and ki is the weight assigned to the generic attribute
xi, then

[4] V(x) = Σi=1,...,m ki v(xi)
In t1 the consumer memorizes the vector x, which becomes x1; in t2 he recalls to mind the
same vector, called x2, so that if
[5] x2 ≠ x1
there is a loss of information in t2 due to the effect of the memorization process that took
place in t1. A taxonomy of the consumer's learning processes can be charted that identifies the
objectives that are met by these processes; three fundamental approaches to consumer learning
can be identified, i.e.
a) j has a defined order of preferences ≽j on the actual goods {xi*} in ℝ+l and reaches his
decisions in a series of periods, ti, where i = 1,2,...,T, and for each period a probability
distribution can be deduced on the expected conditions of his world. This means that j finds
himself in a situation of environmental uncertainty and the learning only concerns a refinement of
his knowledge of the conditions of his world;
b) j's order of preferences changes with time in a sequence of periods ti. This assumption
lies, for instance, at the basis of several works by Cyert and De Groot (1975), which assume that
it is through a process of acquiring new information that the consumer can modify his own utility
function. This has an important fallout on the inter-temporal consistency of the multi-period
plans within which j's preferences can be modified from one period to another, consequently
inducing him to make sub-optimal choices (Woo, 1992);
c) j has a defined order of preferences ≽j on a set of goods prototypes {xi}, where i = 1,2,...,l,
but he has difficulty in adequately assessing the suitability of (i.e. in classifying) any real
products, xj*. The experimentation of xj*, consisting in the acquisition of information (also
through acts of consumption) will enable him to refine his judgement.
Hypothesis (c) is the only one considered in the present context, where it is assumed that j has
no difficulty in arranging goods types according to ≽j, defining them on the basis of their
representative features, whereas he may have difficulty in classifying the actual good xk*. After
the first period, t1 , when j has had the opportunity to verify whether the product xk* has
exhibited the expected suitability, he will be able to assess whether the actual product comes
within a given class of representative features, if any.
Defining the goods' class of equivalence enables the matter of learning to be considered in
terms of pattern recognition; j's learning process concerning the good's ability to satisfy a need
can consequently be defined as his capacity to classify said good correctly. In operational terms,
we can say that a classification process is correct if the expected level of the good's value
function in t1, v(x), coincides with the one ascertained in t2, v(x)*, i.e.
[6] v(x) = v(x)*
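The correctness criterion in [6] can be sketched as follows; the tolerance parameter is an assumption added for numerical comparison (the text itself requires exact coincidence).

```python
# Sketch of the correctness criterion [6]: a classification is correct
# when the value expected in t1, v(x), coincides with the value
# ascertained in t2, v(x)*. The tolerance is an added assumption.

def correctly_classified(v_expected, v_ascertained, tol=1e-9):
    """True when v(x) = v(x)* up to the tolerance, as in [6]."""
    return abs(v_expected - v_ascertained) <= tol

print(correctly_classified(4.0, 4.0))  # True
print(correctly_classified(4.0, 3.2))  # False
```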
The idea of class of equivalence contains two specific categorization modalities: one relates to
the creation of the classes of equivalence concerned, the other involves attributing the goods to
their single respective classes of equivalence. Both modalities belong to the more general
learning process and it is on the representation of said process that, as mentioned earlier, the two
great families of models are divided, one inspired by the cognitivist approach and the other by the
connectionist or neural approach.
6. The consumer as a Bayesian classifier: critical considerations
The cognitivist approach - which is based on the assumption that the subject is a data
processor - finds formal expression in the Bayesian modeling method. Models of this type have
been applied to the theory of consumer behavior by Cyert and De Groot (1975) and by
Kihlstrom, Mirman and Postlewaite (KMP) (1984). In the light of what has been said so far, it is
assumed that the concept of class has a fundamental role in the consumer's decisions. j's decision
to consume a good x* depends on his evaluation of the "level" of the good's value function,
v(x*). So the problem for j consists in refining his assessment, by acquiring information, of
whether the good or service belongs to one class or another. In Bayesian logic, in t1, j
estimates the level of v(x*), and he does so on the basis of the information that he possesses at the
time. As mentioned earlier, the features of a product are represented vectorially and j doesn't
necessarily know the structure of the vector x* before his act of consumption; j can establish a
probability distribution of said structure. The approach adopted by KMP consists in deriving a
consumer's utility function that incorporates a process of Bayesian learning defined on a space
that is given by a coupling of the space of the goods with that of their features, which are not
necessarily all known to j in advance.
Following KMP, we assume that the consumer j will obtain certain services from the product
represented by the vector of the features x*, which can be indicated as a, so that
[7] a = x* + ε

In [7], ε represents a random variable with a known density function. Note that the parameter
θ is non-random, but is not known in advance. Let's assume that j estimates that θ can only take
on two values, θ1 and θ2, that stand for two different classes. In t1, j has the (subjective)
probability that x* falls into θ1 or θ2, i.e. p(θ1), p(θ2). These are a priori probabilities. j's
estimate may change if he acquires information synthesized by the likelihood function, p(x* | θi),
where i = 1,2. It is then easy to complete the Bayesian formula

[8] p(θi | x*) = p(x* | θi) p(θi) / p(x*)

where p(x*) is the probability density function of x* and p(θi | x*) is the a posteriori
probability. The Bayes classification rule states that:

[9] if p(θ1 | x*) > p(θ2 | x*) then x* belongs to θ1;
    if p(θ1 | x*) < p(θ2 | x*) then x* belongs to θ2
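The classification rule can be sketched numerically; the prior and likelihood values below are illustrative assumptions, not taken from the text.

```python
# Sketch of the Bayes classification rule in [8]-[9].
# The prior and likelihood values are hypothetical.

priors = {"theta1": 0.6, "theta2": 0.4}        # a priori probabilities p(theta_i)
likelihoods = {"theta1": 0.2, "theta2": 0.7}   # p(x* | theta_i) for one observed x*

# Evidence p(x*) = sum_i p(x* | theta_i) p(theta_i)
evidence = sum(likelihoods[c] * priors[c] for c in priors)

# A posteriori probabilities p(theta_i | x*), equation [8]
posteriors = {c: likelihoods[c] * priors[c] / evidence for c in priors}

# Rule [9]: x* belongs to the class with the larger posterior
assigned = max(posteriors, key=posteriors.get)
print(assigned)  # theta2, since 0.7 * 0.4 > 0.2 * 0.6
```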
In this arrangement we have to assume that the probability function is known, which may be
scarcely realistic. In fact, j will alter his estimate of the probability levels of θ1 and θ2 on the
basis of the information he receives, and he should estimate its reliability in probabilistic terms,
which does not always satisfy the condition of realism for the hypotheses by which the Bayesian
models would like to be inspired (Salmon, 1995).
7. Towards a neural representation
Generally speaking, two consumer learning modalities have been identified: one in which j builds
the classes of equivalence on the basis of which he defines his own stable order of preferences, ≽j,
and a second one with which he assigns each actual product to its class of equivalence. While the
Bayesian approach seems unsuitable for representing these two modalities, the neural approach,
which is a formalized expression of connectionism, seems capable of responding better to the
need to formalize the consumer learning process thus described. Equation [1] concisely
represents the vector of the features of any given product; note that [1] is, in a nutshell, consistent
with the neural modeling method. In [1], the representation of a typical product can be
considered as isomorphic to the structure of its vectorially-expressed features. It follows that
learning in [1] can be represented as a transformation of the vector x into a new vector x′
according to the rule Tx = x′, where T is a suitable transformation. This formula easily explains
the interest of certain scholars (Salmon, 1995; Fabbri and Orsini, 1983; Beltratti, Margarita and
Terna, 1996) in opportunities for using neural networks in the field of learning in economics. If
[1] leads us to think of an isomorphism between the structure of a product's features and its
vectorial representation, in neural networks this isomorphism is strengthened, as it were, in the
sense that it can be traced in the formal equivalence between the vectorial representation of the
product's features and the vectorial representation that is given of the cognitive structures by
several neural models, and by the PDP (parallel distributed processing) models in particular
(Rumelhart and McClelland, 1986).
The vectorial representation of the goods really consists in a vectorial representation of their
features, since working on the features makes the coding process easier (Floreano, 1996, p. 41).
Opting to codify the features enables certain difficulties relating to the so-called "local code"
(consisting in the fact that each input unit, xj, where j = 1,2,...,n, corresponds to a specific object)
to be overcome. Generally speaking, neural networks with a minimal complexity use the so-
called "distributed coding", in which many units contribute towards representing each object. If
we assume that the input units codify the objects features, then every input unit on the network
will codify the presence or the value of a certain feature. Thus each object activates one or more
units and each unit is used for one or more objects, so each object is defined by the combinations
of active units in the network (Floreano, 1996, p. 43).
Taking the most straightforward hypothesis, i.e. that every input unit represents a feature, the
weight attributable to each input unit depends on the relative importance assigned to the feature
with respect to the others, according to the value function logic. Returning to [1], the components
of the vector of the features can be identified - again on the extremely simplified assumption
adopted here - with the input signals. Bearing in mind the significance of the weights, the net
input of a neuron, Ai, is usually represented by

[10] Ai = Σj=1,...,N wij xj

Note that, while wj stands for the weight of the jth input of a unit, wij represents the strength of
the interconnection between the unit j and a unit i.
The net input Ai of an ith neuron is the algebraic sum of the products of all the j input
signals xj and the values of the corresponding synapses wij, from which the threshold value θi of
the neuron is subtracted. Thus the net input of the neuron will be given by

[11] Ai = Σj=1,...,N wij xj − θi

The response of the neuron, yi, is established by passing the net input through an activation
function Φ(x) (Floreano, 1996, p. 35):

[12] yi = Φ(Σj=1,...,N wij xj − θi)
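Equations [10]-[12] can be sketched as follows; the weights, inputs, threshold, and the choice of a step function for Φ are illustrative assumptions.

```python
# Sketch of equations [10]-[12]: the net input of neuron i, minus its
# threshold, passed through an activation function. All values are
# hypothetical, as is the choice of a step function for Phi.

def net_input(weights, inputs, threshold):
    """Equation [11]: A_i = sum_j w_ij * x_j - theta_i."""
    return sum(w * x for w, x in zip(weights, inputs)) - threshold

def step(a):
    """A simple activation function Phi (a threshold unit)."""
    return 1 if a > 0 else 0

w_i = [0.5, -0.3, 0.8]   # synaptic weights w_ij of neuron i (assumed)
x = [1.0, 1.0, 1.0]      # input signals x_j (assumed)
theta_i = 0.4            # threshold of neuron i (assumed)

y_i = step(net_input(w_i, x, theta_i))  # equation [12]
print(y_i)  # 1, since 0.5 - 0.3 + 0.8 - 0.4 = 0.6 > 0
```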
The general principle is that learning (intended as the organized acquisition of knowledge) in
the model, i.e. in the neural network, mimics what is thought to happen in the brain when
something is learnt, i.e. connections are created between neurons and the cortical areas via the
synapses. The connections may have a "variable geometry", in the sense that the same stimulus
can give rise to different connections in different people. Knowledge (and consequently also
recall) of events, situations, objects, etc., is represented in the brain by means of relatively
durable configurations of synaptic connections and is distributed through said synaptic
connections. Knowledge is not stored in single units, but is distributed among many different
units, each of which contributes to the representations of many different elements of knowledge
(Mazzoni, 1998, p. 324). Rather than storing what they learn in a sort of "private" memory, the
neural networks store information in the connections between the nodes. In the neural scheme,
learning thus consists in reinforcing certain connections and extinguishing others.
The processing of the information takes place in the layers of which the neural network is
composed. The most straightforward neural network models are those which, like the
Perceptron, are composed of a layer of input units, which receive stimuli and information
from the outside world, and a layer of nodes that process the information and then give a
representation of it as output. In this latter case, we speak of a layer of outgoing units or outputs.
At a slightly more complex level, there are models including a layer of hidden units that do some
essential preliminary information-processing work. The input units represent incoming
information elements and are activated by the stimulation deriving from information coming
from the surrounding environment.
This information makes the units trigger a signal; each input is attributed a relative weight,
which takes into account the importance of the input signal. The distribution of the weights on
the connections is due to the fact that some inputs are more important than others in the way in
which they combine to produce an impulse, and thus have a greater weight. The weight can thus
be seen as a measure of the strength, or intensity, of the connection. The hidden units then
receive signals from the input units and the weights of the synaptic connections that define them
are modified on the basis of said signals, which release more signals to the output units; here
again, these signals can modify both the weight of the output units and the strength of the
connections between the hidden units and the output units. The role of learning in the logic of
the Hebbian networks (which are used in the classification processes) can be expressed as follows
(Floreano, 1996, p. 66):
- given a neural network with N input nodes and P training pairs, each composed
of an input vector, xp, and a required response ("target"), tp, the output from the network for
each input pattern is given by:

[13] y = 1 if Σj wij xj > 0, and y = 0 otherwise
This value is compared with the required response, tp, for a given input pattern. If the net's
response is the same as the required response, the synaptic values are not changed; if, on the other
hand, there is a difference between output and required response, i.e. an error in the logic of the
neural networks, the synaptic weights are modified on the basis of the correct response, where
Δwij represents the correction to attribute to the synaptic weight in question:

[14] Δwij = η t xj

where η is a proportionality constant; the value thus obtained is added to the preceding values
of the synapses.
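A single application of the update step in [13]-[14] can be sketched as follows; the learning rate, the initial weights and the training pair are illustrative assumptions, and the correction is applied only when output and target differ, as the text specifies.

```python
# One application of the update step in [13]-[14]. The learning rate eta,
# initial weights and training pair are hypothetical values.

eta = 0.1  # proportionality constant eta in [14]

def output(w, x):
    """Equation [13]: y = 1 if the weighted input sum is positive, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

def update(w, x, t):
    """If the net's response differs from the target t, add the
    correction Delta w_j = eta * t * x_j of [14] to each weight."""
    if output(w, x) != t:
        return [wj + eta * t * xj for wj, xj in zip(w, x)]
    return w

w = [-0.2, -0.1]          # initial synaptic weights (assumed)
x_p, t_p = [1.0, 1.0], 1  # one training pair (input vector, target)

w = update(w, x_p, t_p)   # response was 0, target 1: weights corrected
print(w)                  # [-0.1, 0.0]
```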
Note that, according to Churchland and Sejnowski (1992, ch. IV), the representation and
classification of inputs and outputs in a PDP system takes place vectorially. Each neuron can
take part in the representation of many different elements and no single neuron represents an
one of two possible classes, so that C** corresponds to the output 1 and C* to the output 0.
Assuming (x1,x2,...,xn) as the input values and (w1,w2,...,wn) as the synaptic weights, without
any interconnections between the units, so that wij ≡ wi, then

[15] y = 1 if Σi=1,...,n wi xi > x0, and y = 0 otherwise

where x0 is a threshold value.
As mentioned before, the j-Perceptron has to verify whether or not a product belongs to a certain
class. Training takes place by submitting pairs of input/output examples to him in sequence until
the network is capable of calculating the function exactly. In other words, let's imagine that there
are only two product classes, (C*, C**), that divide the space of the goods, C, into two specific
subspaces. j must assign a generic good, b, to one of the two goods classes. It is also assumed
that there are only two inputs, x1, x2. Given the values of x1 and x2, the j-Perceptron must assign the
value 1 to the output if he "believes" that product b belongs to C**, or the value 0 if he
"believes" that b belongs to C*.
Let

[16]   g(x) = Σi wi xi

be the overall input value; for the output value we shall have

[17]   y(x) = 1   if g(x) > x0
       y(x) = 0   if g(x) < x0
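As a minimal sketch (function names are ours, not the source's), the threshold unit of [16] and [17] can be written:

```python
def g(weights, x):
    """Eq. [16]: the overall input value, sum_i w_i x_i."""
    return sum(w * xi for w, xi in zip(weights, x))

def classify(weights, x, x0):
    """Eq. [17]: output 1 if g(x) exceeds the threshold x0, 0 otherwise."""
    return 1 if g(weights, x) > x0 else 0
```

With two inputs this is exactly the case treated next, where the threshold enters the output equation as an ad hoc weight.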
Assuming that we have only two inputs, then

[18]   y(x) = w0 + w1x1 + w2x2

where w0 is an ad hoc weight. Setting y(x) = 0 and solving, we shall have

[19]   x2 = -(w1/w2)x1 - (w0/w2)

which gives rise to a straight line that divides the C region into two sub-regions. For certain values of (w0, w1, w2) the output will fall in the C** region and will thus be equal to 1; for other values it will fall in the C* region and will equate to 0. The "decision surface" is found on the set C. If there are numerous inputs, the decision surface will be composed of a hyperplane; we can
say that "the problem relating to the learning of a Perceptron can be brought down to the correct determination of a decision surface" (Carrella, 1995, p. 189). So the learning strategy of a j-Perceptron consists in progressively modifying the synaptic weights so as to enable the network to proceed with a correct classification, assigning b to the right class. By way of an (extremely simple) example, let's imagine that we have a consumer j who has the features of the Perceptron, in that his function will be to learn to classify certain goods in their respective classes. The neural network he uses will be a network with no hidden levels, with linear outputs from the nodes. Errors will be corrected by means of a manual application of the "delta rule", also called the least-mean-square error rule, which is based on the principle of modifying the weights of the connections in sequence to reduce the difference (or "delta") between the required output and the value found at the output neuron. Let's assume that j-Perceptron has to classify a good, x, with two features, x1, x2, and that y(x) is the output indicating the product classes, 1 → C**, 0 → C*; and let's say that w1, w2 are the corresponding synaptic weights. j-Perceptron's learning process will consist in changing the synaptic weights if for a given output node the calculated value is not the same as the required value. The model presented here is an adaptation of the model illustrated by Carrella (1995, p. 163 et seq.).
Take the following truth table:

Table 1

Good    Features        Output
        x1      x2      y(x)
x       0       0       1
Each feature can be associated with the value of 1 or 0, which indicate the previously-
mentioned mono-attributive classes. The following parameters are also needed for the
application of the learning rule: T (threshold value), arbitrarily assumed as corresponding to 0.1; e (error), assumed as corresponding to 0.1; d (percentage of weight correction), assumed as
corresponding to 0.5.
In the initial phase, the synaptic weights are assigned arbitrarily, in the sense that j-Perceptron
is uncertain as to how to classify the goods; let's take these weights to be
[20] w1 = -0.1 ; w2 = 0.2
Putting w0 = -T in [18], the output neuron is calculated according to the equation

[21]   y(x) = w1x1 + w2x2 - T

If y(x) acquires a positive value, the output node will indicate a value of 1; if not, it will indicate a value of 0; and, as we know, these values indicate the product's classes.
For x1 = 0, x2 = 0, T = 0.1, the output value required would be 1. Taking the weights indicated in [20] and inserting them in [21], and making the necessary simple calculations, we find that the result equates to -0.1 and is therefore negative, so the output value assigned to y(x) will be 0.
Table 2

Good    Required y    Calculated y
x       1             0
If we compare the two columns, we find that j-Perceptron has failed to classify the good
correctly, so he must modify the synaptic weights by a proportional amount; if the output value is
0, when it should be 1, he must increase the weights; in the opposite case, he must reduce them.
In order to calculate the error, it is best to treat the threshold value as an input, x3 = 1, having a
weight w3 = -T. Then the equation for determining the weights becomes
[22] w1x1 + w2x2 + w3x3 > 0
The new weights are obtained from the old weights plus the correction factor, Fc, calculated
on the old weights. The correction factor will be as follows
[23]   Fc = (E + e)d

where E is the error; as we know, e is the value assigned to the error, and d is the percentage of weight correction. The error, E, is defined as

[24]   E = 0 - (w1x1 + w2x2 + w3x3)

For the values assigned before, E = 0.1. Hence

[25]   Fc = (E + e)d = (0.1 + 0.1) × 0.5 = 0.1
We can now modify the weights in proportion to the calculated value until we find a system of
weights capable of representing all the input/output pairs, through an iterative process which, in
the specific case of our example, can lead to the solution of the problem in a number of cycles.
Each cycle can be considered as an act of experimental consumption.
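The first of these cycles can be sketched in a few lines of Python. This is an illustrative reconstruction of the worked example, not the author's code; in particular, the update convention (add Fc to each weight when the required output was 1, subtract it otherwise) is our assumption, suggested by the surrounding text but not written out in the source.

```python
T, e, d = 0.1, 0.1, 0.5          # threshold, error constant, correction percentage
w1, w2 = -0.1, 0.2               # arbitrary initial weights, as in [20]
x1, x2, target = 0, 0, 1         # the input/output pair of Table 1

# Eq. [21]: value at the output neuron
y = w1 * x1 + w2 * x2 - T        # = -0.1, negative, so the output node shows 0
out = 1 if y > 0 else 0

# Treat the threshold as a third input x3 = 1 with weight w3 = -T
w3, x3 = -T, 1
E = 0 - (w1 * x1 + w2 * x2 + w3 * x3)   # eq. [24]: E = 0.1
Fc = (E + e) * d                        # eq. [25]: Fc = (0.1 + 0.1) * 0.5 = 0.1

# The required output was 1 but 0 was produced, so the weights are increased
# (decreased in the opposite case -- our assumption, see lead-in above).
if out != target:
    step = Fc if target == 1 else -Fc
    w1, w2, w3 = w1 + step, w2 + step, w3 + step
```

After this first correction the weights become w1 = 0 and w2 = 0.3, and the cycle is repeated with the new weights until the calculated output coincides with the required one.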
9. Conclusions
Given the simplified structure of the model used here, we can represent every feature of a
product as an input and assume that every training cycle will correspond to the acquisition of a
new item of information, so that j is capable of modifying the structure representing the features
of a product in his memory.
In practice, categorization processes, even for a single attribute, can be described by highly
complex neural networks, so that the coupling of different attributes necessarily leads to the
construction of neural networks that are far more complex than the example considered here.
Nonetheless, this is a useful exercise for the purpose of understanding the fact that product
classification, through the acquisition of suitable information, involves "internal" modeling
processes on the consumer's cognitive structure. Learning can thus be represented by said
processes.
Clearly, the use of neural modeling can hardly cover all learning processes. It does grasp a
part of said processes, however, i.e. the ones characterized by the need to classify certain
patterns, as in the case of consumer products.
REFERENCES
AKERLOF G., "The Market for 'Lemons': Quality Uncertainty and the Market Mechanism", Quarterly Journal of Economics, 1970, 84, pp. 488-500
ALLAIS M., "Determination of Cardinal Utility according to an Intrinsic Invariant Model", in L. Daboni, A. Montesano and M. Lines, eds., Recent Developments in the Foundations of Utility and Risk Theory, Dordrecht: D. Reidel Publishing Company, 1986, pp. 83-120
ANDERSON J.R., Cognitive Psychology and its Implications, New York: Freeman & Co., 1980
BELTRATTI A., MARGARITA S. and TERNA P., Neural Networks for Economic and Financial Models, London: International Thompson Computer Press, 1996
BRENNER T., Modelling Learning in Economics, Cheltenham (UK): Edward Elgar, 1999
CARRELLA G., L'Officina Neurale, Milano: Franco Angeli Editore, 1995
CHURCHLAND P.S. and SEJNOWSKI T.J., The Computational Brain, Cambridge, MA: The MIT Press, 1992
CHURCHLAND P., The Engine of Reason, the Seat of the Soul, Cambridge, MA: The MIT Press, 1995
CYERT R. and DE GROOT M.H., "Adaptive Utility", in R.H. Day, T. Groves, eds., Adaptive Economic Models, London: Academic Press, 1975, pp. 223-46
DEBREU G., Theory of Value, New York: Wiley, 1959
FABBRI G. and ORSINI R., Reti Neurali per le Scienze Economiche, Padova: Franco Muzzio Editore, 1993
FLOREANO D., Manuale sulle Reti Neurali, Bologna: Il Mulino, 1996
von HAYEK F., The Sensory Order. An Inquiry into the Foundations of Theoretical Psychology, London: Routledge & Kegan Paul, 1952
KIHLSTROM R., MIRMAN L. and POSTLEWAITE A., "Experimental Consumption and the Rothschild Effect", in M. Boyer, R. Kihlstrom, eds., Bayesian Models in Economic Theory, Amsterdam: North Holland, 1984, pp. 279-302
KEENEY R.L. and RAIFFA H., Decisions with Multiple Objectives: Preferences and Value Tradeoffs, New York: Wiley, 1976
PATTERN RECOGNITION
AND
CONSUMER LEARNING
Maurizio Mistri
(Department of Economics, University of Padua)
ABSTRACT
This paper deals with the topic of consumer learning as an extension of the experimental consumer approach. With respect to said approach, however, learning is dealt with as a process of product categorization and classification. For this purpose, the goods are described on the basis of their features and are represented vectorially by means of value functions. It is easily demonstrated how said methodology enables the use of neural networks as an analytical and logical instrument. In the last part of the paper, a simple example is given of the application of the neural network layout to describe a consumer called upon to classify certain products.
JEL classification: D12, D83
Keywords: consumption, consumer behavior, consumer learning