Operational research techniques
Name: Ayenigba Sogo Emmanuel
Department: Civil Engineering
Matric Number: 08/30GB096
Course: CVE657 (Operational research techniques) Assignment
1.0 What is Operations research?
Operations research (OR), or operational research in British usage, is a discipline that deals with the
application of advanced analytical methods to help make better decisions. It employs techniques
from other mathematical sciences, such as mathematical modelling, statistical analysis and
mathematical optimization. It arrives at optimal or near-optimal solutions to complex decision-
making problems. Operations research attempts to provide those who manage organized systems
with an objective and quantitative basis for decisions; it is normally carried out by teams of scientists
and engineers drawn from a variety of disciplines. Thus, operations research is not a science itself
but rather the application of science to the solution of managerial and administrative problems, and
it focuses on the performance of organized systems taken as a whole rather than on their parts taken
separately.
1.1 Terminologies
Operations The activities carried out in an organization related to attaining its goals and
objectives.
Research The process of observation and testing characterized by the scientific method.
The steps of the process include observing the situation and formulating a
problem statement, constructing a mathematical model, hypothesizing that the
model represents the important aspects of the situation, and validating the
model through experimentation.
Organization The society in which the problem arises or for which the solution is important.
The organization may be a corporation, a branch of government, a department
within a firm, a group of employees, or perhaps even a household or individual.
Decision maker An individual or group in the organization capable of proposing and
implementing necessary actions.
Analyst An individual called upon to aid the decision maker in the problem solving
process. The analyst typically has special skills in modeling, mathematics, data
gathering, and computer implementation.
Team A group of individuals bringing various skills and viewpoints to a problem.
Historically, operations research has used the team approach in order that the
solution not be limited by past experience or too narrow a focus. A team also
provides the collection of specialized skills that are rarely found in a single
individual.
Model An abstract representation of reality. As used here, a representation of a
decision problem related to the operations of the organization. The model is
usually presented in mathematical terms and includes a statement of the
assumptions used in the functional relationships. Models can also be physical,
narrative, or a set of rules embodied in a computer program.
Systems
approach
An approach to analysis that attempts to ascertain and include the broad
implications of decisions for the organization. Both quantitative and qualitative
factors are included in the analysis.
Optimal solution A solution to the model that optimizes (maximizes or minimizes) some
objective measure of merit over all feasible solutions -- the best solution
amongst all alternatives given the organizational, physical and technological
constraints.
Operations
research
techniques
A collection of general mathematical models, analytical procedures, and
optimization algorithms that have been found useful in quantitative studies.
These include linear programming, integer programming, network
programming, nonlinear programming, dynamic programming, statistical
analysis, probability theory, queuing theory, stochastic processes, simulation,
inventory theory, reliability, decision analysis, and others. Operations research
professionals have created some of these fields while others derive from allied
disciplines.
1.2 The Operations research process
The goal of operations research is to provide a framework for constructing models of decision-
making problems, finding the best solutions with respect to a given measure of merit, and
implementing the solutions in an attempt to solve the problems. The problem is a situation arising in
an organization that requires some solution. The decision maker is the individual or group
responsible for making decisions regarding the solution. The individual or group called upon to aid
the decision maker in the problem solving process is the analyst. The steps taken in the OR
processes are as follows:
1.2.1 Recognize the problem
Decision making begins with a situation in which a problem is recognized. The problem may be
actual or abstract; it may involve current operations or proposed expansions or contractions due to
expected market shifts; it may become apparent through consumer complaints or through employee
suggestions; and it may be a conscious effort to improve efficiency or a response to an unexpected
crisis. It is impossible to circumscribe the breadth of circumstances that might be appropriate for
this discussion, for indeed problem situations that are amenable to objective analysis arise in every
area of human activity.
1.2.2 Formulate the problem
The first analytical step of the solution process is to formulate the problem in more precise terms.
At the formulation stage, statements of objectives, constraints on solutions, appropriate
assumptions, descriptions of processes, data requirements, alternatives for action and metrics for
measuring progress are introduced. Because of the ambiguity of the perceived situation, the process
of formulating the problem is extremely important. The analyst is usually not the decision maker
and may not be part of the organization, so care must be taken to get agreement on the exact
character of the problem to be solved from those who perceive it. There is little value to either a
poor solution to a correctly formulated problem or a good solution to one that has been incorrectly
formulated.
1.2.3 Construct a model
A mathematical model is a collection of functional relationships by which allowable actions are
delimited and evaluated. Although the analyst would hope to study the broad implications of the
problem using a systems approach, a model cannot include every aspect of a situation. A model is
always an abstraction that is, by necessity, simpler than the reality. Elements that are irrelevant or
unimportant to the problem are to be ignored, hopefully leaving sufficient detail so that the solution
obtained with the model has value with regard to the original problem. The statements of the
abstractions introduced in the construction of the model are called the assumptions. It is important
to observe that assumptions are not necessarily statements of belief, but are descriptions of the
abstractions used to arrive at a model. The appropriateness of the assumptions can be determined
only by subsequent testing of the model's validity. Models must be both tractable -- capable of
being solved, and valid -- representative of the true situation. These dual goals are often
contradictory and are not always attainable.
1.2.4 Find a solution
Here tools available to the analyst are used to obtain a solution to the mathematical model. Some
methods can prescribe optimal solutions while others only evaluate candidates, thus requiring a trial
and error approach to finding an acceptable course of action. To carry out this task the analyst must
have a broad knowledge of available solution methodologies. It may be necessary to develop new
techniques specifically tailored to the problem at hand. A model that is impossible to solve may
have been formulated incorrectly or burdened with too much detail. Such a case signals the return to
the previous step for simplification or perhaps the postponement of the study if no acceptable,
tractable model can be found. Of course, the solution provided by the computer is only a proposal.
An analysis does not promise a solution but only guidance to the decision maker. Choosing a
solution to implement is the responsibility of the decision maker and not the analyst. The decision
maker may modify the solution to incorporate practical or intangible considerations not reflected in
the model.
1.2.5 Establish the procedure
Once a solution is accepted a procedure must be designed to retain control of the implementation
effort. Problems are usually on-going rather than unique. Solutions are implemented as procedures
to be used repeatedly in an almost automatic fashion under perhaps changing conditions. Control
may be achieved with a set of operating rules, a job description, laws or regulations promulgated by
a government body, or computer programs that accept current data and prescribe actions. Once a
procedure is established (and implemented), the analyst and perhaps the decision maker are ready to
tackle new problems, leaving the procedure to handle the required tasks. But what if the situation
changes? An unfortunate result of many analyses is a remnant procedure designed to solve a
problem that no longer exists or which places restrictions on an organization that are limiting and no
longer appropriate. Therefore, it is important to establish controls that recognize a changing
situation and signal the need to modify or update the solution.
1.2.6 Implement the solution
A solution to a problem usually implies changes for some individuals in the organization. Because
resistance to change is common, the implementation of solutions is perhaps the most difficult part
of a problem solving exercise. Some say it is the most important part. Although not strictly the
responsibility of the analyst, the solution process itself can be designed to smooth the way for
implementation. The persons who are likely to be affected by the changes brought about by a
solution should take part, or at least be consulted, during the various stages involving problem
formulation, solution testing, and the establishment of the procedure.
1.2.7 The OR process
Combining the steps we obtain the complete OR process. In practice, the process may not be well
defined and the steps may not be executed in a strict order. Rather, there are many loops in the
process, with experimentation and observation at each step suggesting modifications to decisions
made earlier. The process rarely terminates with all the loose ends tied up. Work continues after a
solution is proposed and implemented. Parameters and conditions change over time requiring a
constant review of the solution and a continuing repetition of portions of the process. It is
particularly important to test the validity of the model and the solution obtained. Are the
computations being performed correctly? Does the model have relevance to the original problem?
Do the assumptions used to obtain a tractable model render the solution useless? These questions
must be answered before the solution is implemented in the field. There are a number of ways to
test a solution. The simplest determines whether the solution makes sense to the decision maker.
Solutions obtained by quantitative studies may not be predictable but they are often not too
surprising. Other testing procedures include sensitivity analysis, the use of the model under a
variety of conjectured conditions including a range of parameter values, and the use of the model
with historical data.
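The sensitivity-analysis idea described above -- re-solving the model over a range of parameter values -- can be sketched in a few lines. The model below is purely illustrative (it is not a model from this assignment), and SciPy's linprog solver is assumed to be available:

```python
# Sensitivity analysis sketch: re-solve a small (hypothetical) LP while one
# right-hand-side value ranges over conjectured conditions.
# Illustrative model: maximize 3x + 2y  subject to  x + y <= b,  x <= 4,  x, y >= 0.
from scipy.optimize import linprog

def solve(b):
    # linprog minimizes, so the objective coefficients are negated to maximize.
    res = linprog(c=[-3, -2],
                  A_ub=[[1, 1], [1, 0]],
                  b_ub=[b, 4],
                  bounds=[(0, None)] * 2)
    return -res.fun  # maximum profit for this value of b

# Watching the optimum change as b varies reveals how sensitive the
# solution is to that parameter (the marginal value of the capacity b).
for b in range(2, 7):
    print(b, solve(b))
```

Plotting or tabulating these values shows the decision maker how robust the proposed solution is before it is implemented in the field.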
Fig. 1.1 The OR process
2.0 Operations research models
Most operations research studies involve the construction of a mathematical model. The model is a
collection of logical and mathematical relationships that represents aspects of the situation under
study. Models describe important relationships between variables, include an objective function
with which alternative solutions are evaluated, and constraints that restrict solutions to feasible
values. OR models include: Linear Programming, Network Flow Programming, Integer
Programming, Nonlinear Programming, Dynamic Programming, Stochastic Programming,
Combinatorial Optimization, Stochastic Processes, Discrete Time Markov Chains, Continuous
Time Markov Chains, Queuing, and Simulation.
2.1 Linear programming
Linear programming (LP) is a widely used model type that can solve decision problems with many
thousands of variables. Generally, the feasible values of the decisions are delimited by a set of
constraints that are described by mathematical functions of the decision variables. The feasible
decisions are compared using an objective function that depends on the decision variables. For a
linear program the objective function and constraints are required to be linearly related to the
variables of the problem.
2.1.1 Terminologies
Decision
Variables
Decision variables describe the quantities that the decision makers would like to
determine. They are the unknowns of a mathematical programming model.
Typically we will determine their optimum values with an optimization method. In
a general model, decision variables are given algebraic designations such
as x_1, x_2, ..., x_n. The number of decision variables is n, and x_j is the name of
the jth variable. In a specific situation, it is often convenient to use other names
such as x, y or z. In computer models we use names such as FLOW1 or
AB_5 to represent specific problem-related quantities. An assignment of values to
all variables in a problem is called a solution.
Objective
function
The objective function evaluates some quantitative criterion of immediate
importance such as cost, profit, utility, or yield. The general linear objective
function can be written as
Maximize or Minimize Z = c_1 x_1 + c_2 x_2 + ... + c_n x_n.
Here c_j is the coefficient of the jth decision variable. The criterion selected can be
either maximized or minimized.
Constraints A constraint is an inequality or equality defining limitations on decisions.
Constraints arise from a variety of sources such as limited resources, contractual
obligations, or physical laws. In general, an LP is said to have m linear constraints
that can be stated as
a_i1 x_1 + a_i2 x_2 + ... + a_in x_n {≤, =, ≥} b_i, i = 1, ..., m.
One of the three relations shown in the braces must be chosen for each
constraint. The number a_ij is called a "technological coefficient," and the
number b_i is called the "right-hand side" value of the ith constraint. Strict
inequalities (< and >) are not permitted. When formulating a model, it is good
practice to give a name to each constraint that reflects its purpose.
Simple upper
bound Associated with each variable, x_j, may be a specified quantity, u_j, that limits its
value from above:
x_j ≤ u_j, j = 1, ..., n.
When a simple upper bound is not specified for a variable, the variable is said to be
unbounded from above.
Nonnegativity
restrictions
In most practical problems the variables are required to be nonnegative:
x_j ≥ 0, j = 1, ..., n.
This special kind of constraint is called a nonnegativity restriction. Sometimes
variables are required to be nonpositive or, in fact, may be unrestricted (allowing
any real value).
Complete
linear
programming
model
Combining the aforementioned components into a single statement gives:
Maximize or Minimize Z = c_1 x_1 + c_2 x_2 + ... + c_n x_n
subject to a_i1 x_1 + a_i2 x_2 + ... + a_in x_n {≤, =, ≥} b_i, i = 1, ..., m,
0 ≤ x_j ≤ u_j, j = 1, ..., n.
The constraints, including nonnegativity and simple upper bounds, define the
feasible region of a problem.
Parameters The collection of coefficients c_j, a_ij, and b_i for all values of the indices i and j are
called the parameters of the model. For the model to be completely determined all
parameter values must be known.
2.1.2 Properties of a linear programming model
Any linear programming model (problem) must have the following properties:
(a) The relationship between variables and constraints must be linear.
(b) The model must have an objective function.
(c) The model must have structural constraints.
(d) The model must have a non-negativity constraint.
Let us consider a product mix problem and see the applicability of the above properties.
A company manufactures two products X and Y, which require the following resources. The
resources are the capacities of machines M1, M2, and M3. The available capacities are 50, 25, and 15
hours respectively in the planning period. Product X requires 1 hour of machine M2 and 1 hour of
machine M3. Product Y requires 2 hours of machine M1, 2 hours of machine M2 and 1 hour of
machine M3. The profit contributions of products X and Y are Rs.5/- and Rs.4/- respectively.
In the above problem, Products X and Y are competing candidates or variables. Machine capacities
are available resources. Profit contribution of products X and Y are given. Now let us formulate the
model. Let the company manufacture x units of X and y units of Y. Since the profit contributions of X
and Y are Rs.5/- and Rs.4/- respectively, and the objective of the problem is to maximize the profit Z,
the objective function is:
Maximize Z = 5x + 4y ----------------------------- Objective function
Production must be planned so that the utilization of machine hours by products x and y does not
exceed the available capacities. This can be shown as follows:
For Machine M1: 0x + 2y ≤ 50
For Machine M2: 1x + 2y ≤ 25 and ----------- Linear structural constraints
For Machine M3: 1x + 1y ≤ 15
But the company can stop production of x and y or can manufacture any amount of x and y. It
cannot manufacture negative quantities of x and y. Hence,
Both x and y are ≥ 0 ---------------- non-negativity constraint
As the problem has an objective function, structural constraints, and non-negativity constraints,
and there exists a linear relationship between the variables and the constraints in the form of
inequalities, the problem satisfies the properties of a linear programming problem.
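The product-mix model formulated above can be checked numerically. The sketch below (not part of the original assignment) assumes SciPy is available and uses scipy.optimize.linprog:

```python
# The product-mix model above, solved with linprog.
# linprog minimizes, so the profit coefficients of Z = 5x + 4y are negated.
from scipy.optimize import linprog

c = [-5, -4]          # negated profit contributions of X and Y
A_ub = [[0, 2],       # machine M1: 0x + 2y <= 50
        [1, 2],       # machine M2: 1x + 2y <= 25
        [1, 1]]       # machine M3: 1x + 1y <= 15
b_ub = [50, 25, 15]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
x, y = res.x
print(f"x = {x:.1f}, y = {y:.1f}, Z = {-res.fun:.1f}")  # -> x = 15.0, y = 0.0, Z = 75.0
```

The solver reports that only product X should be made: machine M3 is the binding constraint (15 of its 15 hours are used), while M1 and M2 have slack.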
2.1.3 Basic assumptions
The following are some important assumptions made in formulating a linear programming model:
1. It is assumed that the decision maker here is completely certain (i.e., deterministic conditions)
regarding all aspects of the situation, i.e., availability of resources, profit contribution of the
products, technology, courses of action and their consequences etc.
2. It is assumed that the relationships between the variables in the problem and the resources available,
i.e., the constraints of the problem, exhibit linearity. Here the term linearity implies proportionality and
additivity. This assumption is very useful as it simplifies modelling of the problem.
3. We assume here fixed technology. Fixed technology refers to the fact that the production
requirements are fixed during the planning period and will not change in the period.
4. It is assumed that the profit contribution of a product remains constant, irrespective of level of
production and sales.
5. It is assumed that the decision variables are continuous. This means that the company may
manufacture products in fractional units, for example 2.5 vehicles or 3.2 barrels of oil.
This is referred to as the assumption of divisibility.
6. It is assumed that only one decision is required for the planning period. This condition shows that
the linear programming model is a static model, which implies that the linear programming problem
is a single stage decision problem. (Note: Dynamic Programming problem is a multistage decision
problem).
7. All variables are restricted to nonnegative values (i.e., their numerical value will be ≥ 0).
2.1.4 Steps in formulating linear programming
The steps for formulating the linear programming are:
1. Identify the unknown decision variables to be determined and assign symbols to them.
2. Identify all the restrictions or constraints in the problem and express them as linear equations or
inequalities of decision variables.
3. Identify the objective or aim and represent it also as a linear function of decision variables.
2.1.5 Methods for the solution of a linear programming problem
Linear programming is a method of solving the type of problem in which two or more candidates
or activities are competing to utilize the available limited resources, with a view to optimizing the
objective function of the problem. The objective may be to maximize the returns or to minimize the
costs. The various methods available to solve the problem are:
1. The Graphical Method: used when we have two decision variables in the problem. (Dealing with
more decision variables by the graphical method becomes complicated, because we have to deal
with planes instead of straight lines; hence in the graphical method it is better to work within the
limits of two-variable problems.)
2. The Systematic Trial and Error Method: where we go on giving various values to the variables
until we get an optimal solution. This method is laborious and takes too much time.
3. The Vector Method: in this method each decision variable is considered as a vector and the
principles of vector algebra are used to get the optimal solution. This method is also time consuming.
4. The Simplex Method: when the problem has more than two decision variables, the simplex
method is the most powerful method of solution. It follows a systematic procedure
which can be used to solve the problem.
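For two-variable problems, the graphical method amounts to evaluating the objective function at the corner points of the feasible region. A minimal pure-Python sketch of that corner-point search, reusing the product-mix data from Section 2.1.2:

```python
# Corner-point search underlying the graphical method (two-variable case).
# Product-mix data from Section 2.1.2: maximize Z = 5x + 4y.
from itertools import combinations

# Each constraint as (a, b, rhs) meaning a*x + b*y <= rhs;
# the axes x >= 0 and y >= 0 are included as -x <= 0 and -y <= 0.
constraints = [(0, 2, 50), (1, 2, 25), (1, 1, 15), (-1, 0, 0), (0, -1, 0)]

def feasible(x, y, tol=1e-9):
    return all(a * x + b * y <= r + tol for a, b, r in constraints)

corners = []
for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        continue  # parallel boundary lines: no intersection point
    # Cramer's rule for the intersection of the two boundary lines.
    x = (r1 * b2 - r2 * b1) / det
    y = (a1 * r2 - a2 * r1) / det
    if feasible(x, y):
        corners.append((x, y))

best = max(corners, key=lambda p: 5 * p[0] + 4 * p[1])
print(best, 5 * best[0] + 4 * best[1])  # best corner point and its profit
```

This works because an optimal solution of a linear program, when one exists, always occurs at a corner point of the feasible region; the simplex method exploits the same fact without enumerating every corner.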
Fig 2.1 Network flow
2.2 Network flow programming
The term network flow program describes a type of model that is a
special case of the more general linear program. The class of network
flow programs includes such problems as the transportation problem,
the assignment problem, the shortest path problem, the maximum flow
problem, the pure minimum cost flow problem, and the generalized
minimum cost flow problem. It is an important class because many
aspects of actual situations are readily recognized as networks and the
representation of the model is much more compact than the general linear program. When a
situation can be entirely modelled as a network, very efficient algorithms exist for the solution of
the optimization problem, many times more efficient than linear programming in the utilization of
computer time and space resources.
2.2.1 Terminologies
Nodes and arcs The network flow model consists of nodes and arcs. In the context of
modeling a problem, each node, shown as a circle, represents some aspect of
the problem such as a physical location, an individual worker, or a point in
time. For modeling purposes it is often convenient to assign names to the
nodes. Arcs are directed line segments; an arc is identified by the nodes at its ends.
The arc passes from its origin node to its terminal node. We use m as the
number of nodes and n as the number of arcs.
Arc flow Flow is associated with the network, entering and leaving at the nodes and
passing through the arcs. The flow in arc k is x_k. Flow is conserved at the
nodes, implying that the total flow entering a node must equal the total flow
leaving the node. The arc flows are the decision variables for the network flow
programming model.
Upper and lower
bounds on flow
Flow is limited in an arc by the lower and upper bounds on flow. Sometimes
the term capacity refers to the upper bound on flow. We use l_k and u_k for the
lower and upper bounds of arc k.
Cost The criterion for optimality is cost. Associated with each arc k is the cost per
unit of flow, c_k. Negative values of c_k model revenues.
Gain The arc gain, g_k, multiplies the flow at the beginning of the arc to obtain the
flow at the end of the arc. When a flow x_k is assigned to an arc, this flow
leaves the origin node of the arc. The flow entering the terminal node of the
arc is g_k x_k.
The arc lower bound, upper bound, and cost all refer to the flow at the
beginning of the arc. Gains less than 1 model losses such as evaporation or
spoilage. Gains greater than 1 model growth in flow.
A network in which all arcs have unit gains is called a pure network. The
optimum solution for a pure network with integer parameters always has
integer flows. If some gains have values other than 1 the network is a
generalized network, and the solution is not usually all integer.
Arc parameters The set of arc parameters is shown adjacent to each arc, enclosed in parentheses:
(lower bound, upper bound, cost, gain).
When a parameter is not shown, it assumes its default value. Default values
are: 0 for lower bound, infinity for upper bound, 0 for cost and 1 for gain.
External flows The external flow at node i, b_i, is the flow that must enter or leave node i. A
positive external flow enters the node, and a negative external flow leaves the
node. We show the external flow adjacent to the node with square brackets.
Feasible flow A feasible flow is an assignment of flow to the arcs that satisfies conservation
of flow for each node and the bounds on flow for each arc.
Side constraints Side constraints are constraints on arc flows that cannot be modeled using the
network structure, arc parameters or external flows.
Optimal flow The feasible flow that minimizes total arc cost is the optimal flow.
The linear
programming
model
Every network flow programming model has an equivalent linear
programming model.
2.2.2 Special cases
When learning to use network models, it is helpful to recognize several special cases of network
flow programming. These are the transportation, assignment, shortest path, and maximum flow
models. The problems differ primarily in the set of arc parameters that are relevant or the
arrangement of nodes and arcs. For example, for the transportation model only the arc costs are
relevant and all arcs originate at one set of nodes and terminate at another. We describe in this
section the classes in terms of the network model defined above and note that for each only a subset
of the parameters is relevant. All irrelevant parameters will take on the default values. For the
special network models of this section we set the parameter default values as: 0 for the lower bound,
M (a large number) for the upper bound, 0 for the cost, and 1 for the gain.
Using the default values for parameters, all of the classes can be solved using algorithms defined for
the more general case. There are, however, a number of algorithms for solving each class that do
not use the irrelevant parameters at all. In this way the special case algorithms can be more efficient
than the more general ones. Fig. 2.2 shows the relationships between the various network flow
programming models and linear programming. The models on the left are the least general. As we
move to the right, the problems become more general. Thus all the problems to the left of the
generalized minimum cost flow problem can be solved with an algorithm designed for the
generalized problem. The generalized problem is itself a special case of the linear program.
2.2.3 Transportation problem
A typical transportation problem is shown in Fig. 2.3. It deals with sources where a supply of some
commodity is available and destinations where the commodity is demanded. The classic statement
of the transportation problem uses a matrix with the rows representing sources and columns
representing destinations. The algorithms for solving the problem are based on this matrix
representation. The costs of shipping from sources to destinations are indicated by the entries in the
matrix. If shipment is impossible between a given source and destination, a large cost of M is
entered. This discourages the solution from using such cells. Supplies and demands are shown
along the margins of the matrix. As in the example, the classic transportation problem has total
supply equal to total demand.
The network model of the transportation problem is shown in Fig. 2.4. Sources are identified as the
nodes on the left and destinations on the right. Allowable shipping links are shown as arcs, while
disallowed links are not included.
Fig. 2.2 Relationship between network models
Fig. 2.3 Matrix model of a transportation problem.
Only arc costs are shown in the network model, as these are the only relevant parameters. All other
parameters are set to the default values. The network has a special form important in graph theory;
it is called a bipartite network since the nodes can be divided into two parts with all arcs going from
one part to the other.
On each supply node the positive external flow indicates supply flow entering the network. On each
destination node a demand is a negative fixed external flow indicating that this amount must leave
the network. The optimum solution for the example is shown in Fig. 2.5.
Variations of the classical transportation problem are easily handled by modifications of the
network model. If links have finite capacity, the arc upper bounds can be made finite. If supplies
represent raw materials that are transformed into products at the sources and the demands are in
units of product, the gain factors can be used to represent transformation efficiency at each source.
If some minimal flow is required in certain links, arc lower bounds can be set to nonzero values.
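As a sketch of how such a transportation model is solved as a linear program, the following uses a small hypothetical instance (the supplies, demands, and unit costs are illustrative, not the data of Fig. 2.3) and assumes SciPy is available:

```python
# A small (hypothetical) transportation problem solved as an LP.
# Two sources with supplies [10, 15], two destinations with demands [12, 13];
# total supply equals total demand, as in the classic statement.
from scipy.optimize import linprog

cost = [2, 4, 5, 1]    # arc costs: s1->d1, s1->d2, s2->d1, s2->d2
A_eq = [[1, 1, 0, 0],  # flow out of source 1 equals its supply
        [0, 0, 1, 1],  # flow out of source 2 equals its supply
        [1, 0, 1, 0],  # flow into destination 1 equals its demand
        [0, 1, 0, 1]]  # flow into destination 2 equals its demand
b_eq = [10, 15, 12, 13]

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 4)
print(res.x, res.fun)  # optimal arc flows and total shipping cost
```

The equality rows are exactly the conservation-of-flow conditions at the source and destination nodes; for larger instances, specialized network algorithms solve the same model far faster than general LP.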
2.3 Integer programming (IP)
Integer programming is concerned with optimization problems in which some of the variables are
required to take on discrete values. Rather than allow a variable to assume all real values in a given
range, only predetermined discrete values within the range are permitted. In most cases, these
values are the integers, giving rise to the name of this class of models.
The integrality requirement underlies a wide variety of applications. There are many situations,
such as distributing goods from warehouses to factories or finding the shortest path through a
network, where the flow variables are logically required to be integer valued. In manufacturing,
products are often indivisible, so a production plan that calls for fractional output is not acceptable.
Fig. 2.4 Network flow model of the transportation problem.
Fig. 2.5 Optimum solution, z = 46.
There are also many situations that require logical decisions of the form yes/no, go/no go,
assign/don't assign. Clearly these are discrete decisions that, when quantified, allow only two values.
They can be modelled with binary variables that assume values of zero or one. Designers faced with
selecting from a finite set of alternatives, schedulers seeking the optimal sequence of activities, or
transportation planners searching for the minimum cost routing of vehicles all face discrete decision
problems.
When optimization models contain both integer and continuous variables they are referred to as
mixed-integer programs. The power and usefulness of these models to represent real-world
situations cannot be overstated, but along with modelling convenience comes substantial
computational difficulty. Only relatively small problems containing integer variables can be solved
to optimality in most cases. At first glance this might seem counterintuitive given our ability to
solve huge linear programs. However, the discrete nature of the variables gives rise to a
combinatorial explosion of possible solutions. In the worst case, a majority of these solutions must
be enumerated before optimality can be confirmed. Consequently, when the number of integer
variables in a problem gets large, solving a model to optimality becomes very difficult, if not
impossible. Rather, heuristic methods that do not guarantee optimality must be used to find
solutions.
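A minimal sketch of such yes/no decisions as binary variables, assuming scipy.optimize.milp (SciPy 1.9 or later) is available; the project values, costs, and budget below are hypothetical:

```python
# Yes/no decisions as binary variables: a tiny (hypothetical) selection problem.
# Choose projects to maximize total value subject to a budget; each variable is 0 or 1.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

values = np.array([10, 13, 7, 8])  # value of each candidate project
costs = np.array([[4, 6, 3, 5]])   # cost of each project (one constraint row); budget = 10

res = milp(c=-values,                                      # milp minimizes, so negate
           constraints=LinearConstraint(costs, -np.inf, 10),  # total cost <= 10
           integrality=np.ones(4),                         # all variables integer...
           bounds=Bounds(0, 1))                            # ...and restricted to {0, 1}
print(res.x, -res.fun)  # selected projects and the best total value
```

Even this toy model has 2^4 candidate selections; doubling the number of projects squares the search space, which is the combinatorial explosion described above.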
2.3.1 Terminologies
Integer programming
model (IP)
A linear programming model with the added requirement that some of the
variables, say x_1, ..., x_p out of the n variables, must take integer values.
Mixed integer
programming model
(MIP)
IP model with p strictly less than n.
Pure integer
programming model (PIP)
IP with all variables required to be integer (p = n).
Binary programming
model (BP)
IP with all integer variables restricted to 0 or 1. The model can be
classified as a pure-binary programming model or mixed-binary
programming model.
Logical constraint A linear constraint involving integer variables that models some
logical condition on the decision variables.
Concave function of a
single variable
A function of a single variable with a decreasing derivative. The
figure below shows a concave function of a decision variable. A
piecewise linear approximation is shown by the dotted lines.
A concave function in the objective function of a maximization
problem can be represented by the sum of several linear expressions
with a piecewise linear approximation (3 for the figure). No binary
variables are required.
For a minimization problem, a piecewise linear approximation
requires binary variables to force the pieces to enter the solution in
the proper order. The number of binary variables is one fewer than
the number of pieces.
Convex function of a single variable: A function of a single variable with an increasing derivative.
The figure below shows a convex function of a decision variable; a piecewise linear approximation
is shown by the dotted lines. A convex function in the objective function of a minimization problem
can be represented by the sum of several linear expressions with a piecewise linear approximation
(three for the figure). No binary variables are required. For a maximization problem, a piecewise
linear approximation requires binary variables to force the pieces to enter the solution in the proper
order. The number of binary variables is one fewer than the number of pieces.
Fixed charge function: A nonlinear function that represents the cost of a decision. The cost is 0 if
the variable is 0; otherwise it is a fixed value plus a cost that is linear in the amount of the variable.
The fixed charge is modelled with a binary variable: the fixed charge multiplies the binary variable
and the unit cost multiplies the decision variable. The model must include an implication constraint
that forces the decision variable to 0 when the binary variable is 0; when the binary variable is 1,
the constraint limits the decision variable to its upper bound. The model also includes nonnegativity
restrictions and the upper bound on the binary variable.
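To make the fixed-charge construction concrete, the following Python sketch evaluates the cost and the implication constraint directly; the fixed charge, unit cost, and upper bound values are hypothetical, chosen only for illustration.

```python
# Fixed-charge cost: zero at zero, otherwise a fixed charge plus a linear term.
# The values below are hypothetical. In an IP model a binary variable y carries
# the fixed charge and an implication constraint x <= U * y links x to y.
FIXED, UNIT, UPPER = 50.0, 3.0, 100.0

def cost(x, y):
    """Objective contribution: fixed charge times binary y plus unit cost times x."""
    return FIXED * y + UNIT * x

def feasible(x, y):
    """Implication constraint: x must be 0 when y = 0, and at most UPPER when y = 1."""
    return y in (0, 1) and 0 <= x <= UPPER * y

print(feasible(0, 0), cost(0, 0))     # not producing: feasible, cost 0
print(feasible(20, 1), cost(20, 1))   # producing 20 units: feasible, cost 110
print(feasible(20, 0))                # x > 0 with y = 0 violates the implication
```

The binary variable thus switches the fixed charge on and off, while the constraint x ≤ U·y prevents a positive x from escaping the charge.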
2.3.2 Site selection problem
A builder is planning to construct new buildings at four local sites designated 1, 2, 3 and 4. At each
site there are three possible building designs labelled A, B and C. There is also the option of not
using a site. The problem is to select the building locations and accompanying designs. Preliminary
studies have determined the required investment and net annual income for each of the 12 options.
These are shown in the table below, with A1, for example, denoting design A at site 1. The
company has an investment budget of $100M. The goal is to maximize total annual income without
exceeding the investment budget. As the OR analyst, you are given the job of finding the optimum
plan.
This example introduces one of the major differences between linear and integer programming: the
indivisibility of decisions. It is an obvious requirement here that only whole buildings may be built
and only whole designs may be selected.
Linear/Integer programming model:
To begin creating a model, variables must be defined to represent each decision. In many cases
models can be more succinctly stated using algebraic expressions with subscripted variables. In this
section, we use variable names like A1, A2, C3, and C4 to represent the decisions. This makes it
easier to present the model in a browser readable format. The notation is also similar to that used for
the computer model. We will write several models in this section, but the simplest is:
Objective: Max. z = 6*A1 + 7*A2 + 9*A3 ... + 19*C3 + 20*C4
subject to:
Budget: 13*A1 + 20*A2 + 24*A3 ... + 48*C3 + 55*C4 ≤ 100
Simple Bounds and Integrality:
0 ≤ A1 ≤ 1, 0 ≤ A2 ≤ 1, ... , 0 ≤ C4 ≤ 1 and integer
We use * to indicate multiplication and ... to indicate a continuation in a similar fashion.
Note that since the variables are restricted between 0 and 1 and required to be integer, there are
actually only two feasible values, 0 and 1, for each variable. A design/location combination either
adds its contributions to the net income and budget (=1) or does not (=0). The simple phrase "and
integer" specifies that all the variables must be integer. Thus the model describes the problem of
selecting the set of design/location combinations that maximizes net income while not exceeding
the budget constraint.
The model has the linear form required for linear programming, but it is not a linear programming
model because the variables are not allowed to assume all values within a continuum. Often the
phrase integer programming is used for the linear model with some or all of the variables required
to be integer, leaving out the term linear. Although one can express models that are both integer
and nonlinear, these are generally much more difficult to solve. For such a model we use the phrase
nonlinear-integer program.
2.4 Nonlinear programming
The principal abstraction of the linear programming model is that all functions are linear. This leads
to a number of powerful results that greatly facilitate our ability to find solutions. The first is that all
local optima are global optima; the second is that if the optimal value of the objective function is
finite, at least one of the extreme points in the set of feasible solutions will be an
optimum. Furthermore, starting at any extreme point in the feasible region, it is possible to reach an
optimal extreme point in a finite number of steps by moving only to an adjacent extreme point in an
improving direction. The simplex method embodies these ideas and has proven to be extremely
efficient.
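The extreme-point property can be demonstrated on a tiny example: enumerate the intersections of constraint boundaries of a small, hypothetical two-variable LP, keep the feasible intersections, and pick the best. This is a sketch of the idea only, not of how the simplex method actually proceeds.

```python
from itertools import combinations

# Hypothetical 2-variable LP: maximize z = 3x + 2y subject to
#   x + y <= 4,  x + 3y <= 6,  x >= 0,  y >= 0.
# Each constraint is stored as (a, b, c) meaning a*x + b*y <= c.
constraints = [(1, 1, 4), (1, 3, 6), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    """Solve the 2x2 system where both constraints hold with equality."""
    a1, b1, d1 = c1
    a2, b2, d2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:          # parallel boundaries: no vertex
        return None
    x = (d1 * b2 - d2 * b1) / det
    y = (a1 * d2 - a2 * d1) / det
    return x, y

def feasible(pt):
    return all(a * pt[0] + b * pt[1] <= c + 1e-9 for a, b, c in constraints)

# Candidate extreme points are intersections of pairs of constraint boundaries.
vertices = [p for c1, c2 in combinations(constraints, 2)
            if (p := intersect(c1, c2)) is not None and feasible(p)]

best = max(vertices, key=lambda p: 3 * p[0] + 2 * p[1])
print(best)  # the optimal extreme point
```

Only the feasible vertices need to be compared, which is exactly why the simplex method can restrict its attention to extreme points.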
Nevertheless, much of the world is nonlinear so it is natural to ask if it is possible to achieve the
same efficiency with nonlinear models. In many contexts, the elements of a linear model are really
approximations of more complex relationships. Economies of scale in manufacturing, for example,
lead to decreasing costs, while biological systems commonly exhibit exponential growth. In the
design of a simple hatch cover, the shearing stress, bending stress and degree of deflection are each
polynomial functions of flange thickness and beam height. Similar relationships abound in
engineering design, economics, and distribution systems, to name a few.
The appeal of nonlinear programming (NLP) is strong because of the modelling richness it affords.
Unfortunately, NLP solvers have not yet achieved the same level of performance and reliability
associated with LP solvers. For all but the most structured problems, the solution obtained from an
NLP solver may not be globally optimal. This argues for caution. Before taking any action, the
decision maker should have a full understanding of the nonlinearities governing the system under
study.
2.4.1 Terminologies
Model components: The objective and constraint functions are scalar quantities that vary with the
decision vector. The functions may be nonlinear.
General model: Every mathematical programming model can be put in this form. General
discussions assume this standard format.
Global minimum: A solution that has an objective value less than or equal to all other solutions.
Unique global minimum: A solution that has an objective value less than all other solutions. Not all
problems have a strict global minimum.
Weak local minimum: A solution that has an objective value less than or equal to all other solutions
in a small neighbourhood of the solution.
Strong local minimum: A solution that has an objective value less than all other solutions in a small
neighbourhood of the solution.
Global and local maximum: The definitions are the same as those for the global and local minimum,
with greater than replacing less than.
Convex function: When a straight line is drawn between any two points on a convex function, the
line lies on or above the function. The figure shows a one-dimensional convex function. In multiple
dimensions a convex function f satisfies f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2) for all x1, x2
and 0 ≤ λ ≤ 1.
Concave function: When a straight line is drawn between any two points on a concave function, the
line lies on or below the function. The figure shows a one-dimensional concave function. In
multiple dimensions a concave function f satisfies f(λx1 + (1 − λ)x2) ≥ λf(x1) + (1 − λ)f(x2) for
all x1, x2 and 0 ≤ λ ≤ 1.
Feasible region: The feasible region is the set of all solutions that satisfy all of the constraints.
Convex set: A set S is convex if any point on the line segment connecting any two points in the set
is also in S. The figure shows examples of convex sets in two dimensions. An important issue in
nonlinear programming is whether the feasible region is convex. When all the constraints of a
problem are linear or convex, the feasible region is a convex set. If the objective function is convex
and the feasible region is a convex set, every local minimum is a global minimum. If the objective
function is not convex, a local minimum may or may not be a global minimum, and a nonlinear
programming algorithm may terminate at a solution that is not a global minimum.
Nonconvex set: If a set does not satisfy the requirements of a convex set, it is a nonconvex set. The
figure shows examples of nonconvex sets in two dimensions. If the feasible region of a problem is a
nonconvex set, a local minimum may or may not be a global minimum, and a nonlinear
programming algorithm may terminate at a solution that is not a global minimum.
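The chord definition of a convex function can be checked numerically. The sketch below samples random chords of a function of a single variable; the two functions tested are hypothetical examples chosen to show one pass and one failure.

```python
import random

def is_convex_on_sample(f, lo, hi, trials=1000, tol=1e-9):
    """Numerically test the chord definition of convexity on [lo, hi]:
    for sampled points a, b and weight lam, the chord must lie on or above
    the function, i.e. f(lam*a + (1-lam)*b) <= lam*f(a) + (1-lam)*f(b)."""
    random.seed(0)
    for _ in range(trials):
        a = random.uniform(lo, hi)
        b = random.uniform(lo, hi)
        lam = random.random()
        chord = lam * f(a) + (1 - lam) * f(b)
        if f(lam * a + (1 - lam) * b) > chord + tol:
            return False          # found a chord that dips below the function
    return True

print(is_convex_on_sample(lambda x: x * x, -5, 5))    # x^2 is convex
print(is_convex_on_sample(lambda x: -x * x, -5, 5))   # -x^2 is concave, not convex
```

A sampling test of this kind can only refute convexity, never prove it, but it makes the chord definition concrete.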
2.4.2 Manufacturing example: Linear objective
Problem
We consider a manufacturing problem with three products, P, Q and R, being manufactured on four
machines, A through D. The products require four raw materials, M1 through M4. Tables 1, 2 and 3
provide revenue and cost data, as well as information relating production to machine capacity and
raw material usage. In this section we show a linear programming model of the problem. After
examining its solution, we introduce several nonlinearities and discuss their implications.
Linear Programming Model
For the linear programming model, we select the goal of maximizing net operating income, revenue
minus raw material cost.
Decision Variables:
P: number of units of product P to produce during the week
Q: number of units of product Q to produce during the week
R: number of units of product R to produce during the week
Mj: number of units of raw material j to purchase, j = 1, ..., 4
Objective: Maximize operating income = Revenue - Raw Material Cost
Subject to:
Machine Time Limit Constraints
Raw Material Constraints
Nonnegativity and Upper Bounds
Solution:
The figure below shows the linear model created and solved in Excel.
The solution to the LP model is a basic solution. We note that machines A and B are bottlenecks,
with their available capacity entirely used. Production of R is limited by the market.
2.5 Dynamic programming
Many planning and control problems involve a sequence of decisions that are made over time. The
initial decision is followed by a second, the second by a third, and so on. The process continues
perhaps infinitely. Because the word dynamic describes situations that occur over time and
programming is a synonym for planning, the original definition of dynamic programming was
"planning over time." In a limited sense, our concern is with decisions that relate to and affect
phenomena that are functions of time. This is in contrast to other forms of mathematical
programming that often, but not always, describe static decision problems. As is true in many fields,
the original definition has been broadened somewhat over the years to connote an analytic approach
to problems involving decisions that are not necessarily sequential but can be viewed as such. In
this expanded sense, dynamic programming (DP) has come to embrace a solution methodology in
addition to a class of planning problems. It is put to the best advantage when the decision set is
bounded and discrete, and the objective function is nonlinear.
To model with dynamic programming, an abstraction or reduction of complexity from the real
world problem is necessary. There are two reasons for this, one practical and the other
computational. From a practical point of view, it is rarely possible to identify and evaluate all the
factors that are relevant to a realistic decision problem. Thus the analyst will inevitably leave out
some more or less important descriptors of the situation. From a computational point of view, only
problems with relatively simple state descriptions will be solvable by dynamic programming
algorithms. Thus abstraction is necessary in order to arrive at a formulation that is computationally
tractable. Often a particular problem may have several alternative representations in terms of state
and decision variables. It is important that the analyst realize that some formulations require more
computation than others do, and hence choose the most manageable representation.
Dynamic programming has been described as the most general of the optimization approaches
because conceivably it can solve the broadest class of problems. In many instances, this promise is
unfulfilled because of the attending computational requirements. Certain problems, however, are
particularly adaptable to the model structure and lend themselves to efficient computational
procedures; in cases involving discontinuous functions or discrete variables, dynamic programming
may be the only practical solution methodology.
2.5.1 Terminologies
State: s = (s1, ..., sm), where si is the value of state variable i and m is the number of state
variables.
Initial state set: I = {s : states where the decision process begins}
Final state set: F = {s : states where the decision process ends}
State space: S = {s : s is feasible}
Decision: d = (d1, ..., dp), where dj is the value of the jth decision variable and p is the number of
decision variables.
Feasible decision: D(s) = {d : d leads to a feasible state from state s}
Transition function: s' = T(s, d), a function that determines the next state, s', reached when
decision d is taken from state s.
Decision objective: z(s, d), the measure of effectiveness associated with decision d taken in state s.
Path objective: z(P), the measure of effectiveness defined for path P. This function describes how
the objective terms for each state on the path and the final value function are combined to obtain a
measure for the entire path.
Final value function: f(s), a value that is specified for all final states.
2.5.2 Investment problem
Problem statement:
A portfolio manager with a fixed budget of $100 million is considering the eight investment
opportunities shown in Table 1. The manager must choose an investment level for each alternative
ranging from $0 to $40 million. Although an acceptable investment may assume any value within
the range, we discretize the permissible allocations to intervals of $10 million to facilitate the
modelling. This restriction is important to what follows. For convenience we define a unit of
investment to be $10 million. In these terms, the budget is 10 and the amounts to invest are the
integers in the range from 0 to 4.
Table 1 provides the net annual returns from the investment opportunities expressed in millions of
dollars. A ninth opportunity, not shown in the table, is available for funds left over from the first
eight investments. The return is 5% per year for the amount invested, or equivalently, $0.5 million
for each $10 million invested. The manager's goal is to maximize the total annual return without
exceeding the budget.
The investment problem has a general mathematical programming formulation.
The notation used in the general model is defined below.
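The staged structure of the problem suits a dynamic-programming recursion in which the state is the number of budget units still available. Since Table 1 is not reproduced here, the return values in the sketch below are hypothetical.

```python
# Hypothetical return tables (the assignment's Table 1 is not reproduced here):
# returns[i][x] = annual return ($M) when x units ($10M each) go to opportunity i.
returns = [
    [0, 4.1, 6.0, 7.0, 7.5],
    [0, 3.0, 5.5, 7.2, 8.0],
    [0, 2.5, 4.0, 5.0, 5.5],
    [0, 3.8, 6.5, 8.0, 8.5],
    [0, 2.0, 3.5, 4.5, 5.0],
    [0, 4.5, 7.0, 8.5, 9.0],
    [0, 3.2, 5.0, 6.2, 7.0],
    [0, 2.8, 4.8, 6.0, 6.5],
]
BUDGET = 10          # ten $10M units
LEFTOVER_RATE = 0.5  # $0.5M return per unused unit (the ninth opportunity)

# f[s] = best return obtainable with s units still available. Work backward
# from the final stage, where leftover units earn LEFTOVER_RATE each.
f = [s * LEFTOVER_RATE for s in range(BUDGET + 1)]
for table in reversed(returns):
    f = [max(table[x] + f[s - x] for x in range(min(s, 4) + 1))
         for s in range(BUDGET + 1)]

print(f[BUDGET])     # maximum total annual return for this data
```

The recursion evaluates each opportunity once per budget state, so the work grows linearly with the number of stages rather than exponentially with the number of allocation combinations.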
2.6 Stochastic programming
There is no question that decision making in the face of uncertainty is an important topic. It is one
of the prime activities of man, arising in every field of purposeful work and in most of the games
that we invent for our amusement.
The mathematical programming models, such as linear programming, network flow programming
and integer programming generally neglect the effects of uncertainty and assume that the results of
decisions are predictable and deterministic. This abstraction of reality allows large and complex
decision problems to be modelled and solved using powerful computational methods.
Stochastic programming explicitly recognizes uncertainty by using random variables for some
aspects of the problem. With probability distributions assigned to the random variables, an
expression can be written for the expected value of the objective to be optimized. Then a variety of
computational methods can be used to maximize or minimize the expected value. This page
provides a brief introduction to the modelling process.
When we recognize the possibility of uncertainty or risk within the context of mathematical
programming, the decision problem might be written as below. The model is shown in the typical
matrix format for an LP except tildes appear over all the parameter matrices. This implies that all
may be affected by a vector of random variables. In practical instances the model could be
complicated by nonlinear terms and integrality restrictions. In any event, the model as stated is not
well defined because: (1) the timing of the decisions and observations of the randomness are
ambiguous and (2) the concepts of optimal solutions and feasible solutions are unclear.
Stochastic programming addresses the first issue by explicitly defining the sequence of decisions in
relation to the realization of the random variables. Given the sequence, an objective function is
defined that reflects a rational criterion for evaluating the decisions at the time they must be made.
Feasibility conditions must be adapted to the fact that decisions made before the realization of
randomness may have feasibility consequences after the realization. How the issues are resolved
leads to the several different problems listed below. No single problem formulation is sufficient.
For stochastic programming the probability distribution of the random variables must be known.
More properly we should say that stochastic programming is decision making under risk, reserving
the phrase decision making under uncertainty for those situations in which probability distributions
are unavailable.
For stochastic programming, some variables are set by the decision maker; these are the decision
variables. Other model parameters are determined by chance; these are the random variables. There
are a variety of situations one might consider in this context. One differentiation is
based on when the decision maker must make decisions relative to the time when the random
variables are realized. There are several possibilities.
- Wait and See: The decision maker makes no decisions until all random variables are
realized.
- No Recourse: The decision maker must choose values for the decision variables before any
of the random variables are realized. There is a risk of violating the constraints.
- Chance Constraints: This is the no-recourse situation when only the RHS vector is random.
Solutions are found with specified risks of constraint infeasibility.
- Simple Recourse: This topic considers the problem with a random RHS vector. The decision
maker must choose values for the decision variables before the RHS values are known, but
variables adjusting to the RHS variation are set after the realization. Penalties for constraint
violation are specified.
- Recourse: Some of the decision variables must be set before the random variables are
realized, while others may wait until after they are realized. Models explicitly represent the
initial decisions and all recourse decisions. Although the models can be very large, optimum
solutions solve for all possible circumstances.
- Multistage Recourse: Decisions and random variables are determined in stages. The first set
of decision variables must be fixed before all realizations. Then the first set of random
variables are realized and the second set of decision variables are fixed. The process
continues through a series of decisions and realizations. Typically the stages represent
intervals of time. We do not consider this situation.
2.6.1 Stochastic programming example: wait and see
To illustrate stochastic programming, consider the linear programming model below. The solution
algorithm is the Jensen LP/IP add-in. The problem was generated randomly using the Math
Programming add-in. The solution shown is the optimum solution when all parameters of the model
are deterministic.
We assume that all parameters are deterministic except the right-hand-side vector, shown in the
range F15:F19. The numbers given are the expected values but we add a random variation for each
value that is distributed as a Normal variant with mean 0 and a standard deviation of 10. We
investigate the wait-and-see policy in that the random variables are realized before the decision
maker sets the solution vector x. The solution x is determined by the solution of the LP. This is not
really a stochastic programming problem because there is no uncertainty when the decisions must
be made. We might be interested, however, in the stochastic features of the optimum solution value
and on the feasibility of the model when viewed before the random variables are realized. We will
use the Monte Carlo simulation features of the Random Variables add-in to perform the analysis.
The stochastic model:
To create a stochastic model that describes the random features of the situation, we
choose Function from the Random Variables menu and fill in the Add Function dialog as below.
We have specified LP/IP as the algorithm in the field at the bottom of the dialog. This means that
the LP will be solved for each point of the sample space that is generated by the enumeration or
simulation procedure. For the example we use the Simulate method.
The form is placed below the LP model as shown below. The information in rows 22 and 23 gives
the function name, F_9, the type of analysis, Simulate, the number of simulation iterations, 100, and
the algorithm to be used with each iteration, LP/IP.
The random variable definitions are in rows 25 through 27. All have been specified as Normal
distributions with mean values of 0 and standard deviations of 10. The parameters can be changed
on the form. The math programming model is linked to the random variables via the RHS entries
shown in column F of the LP model. They have been replaced by the equations in column G (the
equations are placed in column F, but illustrated in G). Each RHS is the sum of the original value
stored in column C and a Normal random variable defined on the function form. As the simulation
is performed the RHS values will change randomly.
Rows 28 through 30 generate the simulated values. The numbers in row 29 (now 0.5) will be
replaced by uniformly distributed random numbers. The numbers in row 30 compute the simulated
values based on the Monte Carlo method.
Rows 31, 32 and 33 describe the functions to be observed during the simulation. Cell G31 is the
name. The function value in G32 is an equation that links to the LP objective in F4. The feasibility
value in G33 is a logical statement that points to cell O4 (shown with the LP model above). That
cell, filled by the LP/IP add-in, holds the word "Optimal" only when the model has a feasible
solution. For each simulation these cells will hold the optimum LP value and an indication of
feasibility. The particular solution shown has all random variables set to 0, their expected values.
The solution for this sample point is the same as the deterministic solution.
The green cells in the range G34 through G38 hold the results of the simulation. These are filled in
by the add-in. Cell G34 returns the proportion of the observations that are feasible. Cells G36 and
G37 hold the mean and variance of the simulated function, the LP optimum value in this case. Only
the results for feasible solutions are reported. Cell G38 holds the number of feasible solutions, the
sample size of the mean and variance statistics immediately above. Cell G39 holds a confidence
level entered by the user, and cell G40 holds a confidence interval computed from the statistics.
To simulate the model choose Moments from the Random Variables menu and set the number of
simulation observations to 100. Larger values give more accurate results, but take more time. Each
simulation iteration draws a random sample from each distribution. The values are reflected in the
RHS vector, and the LP/IP algorithm solves the problem. The results of 100 observations are
combined and placed on the form. The results begin in row 34 of the figure below. The numbers
shown in rows 29 through 33 are for the final simulated observation of the random variables.
We have simulated 100 observations of the wait-and-see option. Even though the solution is
optimum for each simulated RHS, the mean objective function (estimated at 122.7 in cell G36) is
lower than the value when the RHS values were set to the expected values (125.6). This is a
consequence of "Jensen's Inequality" (named after a different person than the author). When
maximizing, the expected-value problem always yields an objective value at least as large as the
expected objective of a solution process that explicitly represents uncertainty.
All the observations in the sample of 100 were feasible for the example and the 90% confidence
interval for the LP objective value is almost 2. The values of the solution variables, X1 through X10
are not reported on this form.
It is interesting to increase the standard deviation of the random variables to 30, thus increasing the
variability of the RHS values. The simulated results are below. For this model, the solution is
infeasible only if one of the RHS values is negative. With the data provided, the proportion of
feasible solutions is 0.73. The mean objective value is decreased with significantly greater variance
over the case when the standard deviation values were 10. The 90% confidence interval is now
about 8.5. There is a price to pay for variability, even with the wait and see policy.
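The effect of Jensen's inequality can be reproduced with a few lines of Monte Carlo sampling on a deliberately tiny model (not the add-in example above): maximize x subject to x ≤ b and x ≤ 4, whose optimal value min(b, 4) is a concave function of the random right-hand side b.

```python
import random

# Toy wait-and-see model: maximize x subject to x <= b and x <= 4,
# so the optimal value is z(b) = min(b, 4), concave in the RHS b.
# b is random: Normal with mean 3.5 and standard deviation 1 (hypothetical).
random.seed(1)
N = 10_000
samples = [min(random.gauss(3.5, 1.0), 4.0) for _ in range(N)]

wait_and_see_mean = sum(samples) / N
expected_value_solution = min(3.5, 4.0)   # solve once with b fixed at its mean

print(wait_and_see_mean)        # strictly below 3.5
print(expected_value_solution)  # 3.5, illustrating Jensen's inequality
```

Even though each sampled problem is solved to optimality, averaging over the randomness pulls the expected objective below the optimistic expected-value figure, just as in the spreadsheet experiment.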
2.7 Combinatorial optimization
The most general type of optimization problem and one that is applicable to most spreadsheet
models is the combinatorial optimization problem. Many spreadsheet models contain variables and
compute measures of effectiveness. The spreadsheet user often changes the variables in an
unstructured way to look for the solution that obtains the greatest or least of the measure. In the
words of OR, the analyst is searching for the solution that optimizes an objective function, the
measure of effectiveness. Combinatorial optimization provides tools for automating the search for
good solutions and can be of great value for spreadsheet applications.
Combinatorics is the branch of mathematics studying the enumeration, combination, and
permutation of sets of elements and the mathematical relations that characterize their properties.
Combinatorial optimization does not merely enumerate sets; its goal is to find the member of the
set that optimizes an objective function. For OR, combinatorial optimization has come to mean
methods for finding or searching for the optimum of problems with discrete solution spaces. This is
in contrast to problems with continuous solution spaces, such as nonlinear programs. Continuous
solution spaces allow the use of differentiation to aid in finding optimum solutions. For
combinatorial optimization, differentiation is not commonly used. The combinatorial optimization
problem is the most general of the optimization problems considered by OR and has been the
subject of a great deal of research.
2.7.1 Combinatorial optimization problem (COP)
We use the general model given below for the combinatorial optimization problem.
This model places very few limitations on the model components. Since the principal methods of
solution involve enumeration there is no need to restrict the functions involved. Special cases of the
model will add restrictions, and the solution methods for the special cases depend on these
restrictions.
A solution x is a vector of integers. For a spreadsheet application, the individual solution
components may be cells anywhere on the worksheet, usually arranged for convenience related to
the application. A solution is a member of the solution space X. The discrete solution space X is
defined using mathematical relations that members of the set must satisfy. The number of solutions
in the solution space is finite. An example is below where the solution vectors are restricted to
integers between lower and upper limits. This is the restriction for IP.
The definition of X is very important for our models, because it will be different for different
classes of problems. For a permutation problem, X is the set of all permutations of n integers. For
the traveling salesman problem it is the set of all tours through n cities. The definition of X might
also include combinatorial constraints that limit individual components of x or combinations of
components. The delineation of the set X determines the number of solutions that must be
enumerated or partially enumerated in the search for the optimum.
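For the traveling salesman problem, the solution space X can be enumerated directly when n is small. The sketch below uses a hypothetical five-city distance matrix.

```python
from itertools import permutations

# Toy symmetric distance matrix for 5 hypothetical cities.
dist = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]
n = len(dist)

def tour_length(tour):
    """Total length of a closed tour that returns to its starting city."""
    legs = zip(tour, tour[1:] + tour[:1])
    return sum(dist[i][j] for i, j in legs)

# The solution space X: all tours through n cities. Fixing city 0 as the
# start removes rotational duplicates, leaving (n-1)! permutations.
best = min(((0,) + p for p in permutations(range(1, n))), key=tour_length)
print(best, tour_length(best))
```

With (n-1)! tours to examine, exhaustive enumeration stops being practical at around a dozen cities, which is precisely why the TSP is treated as a hard problem below.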
The objective function, f(x), is a single real number and the measure by which alternative solutions
are compared.
This function will be evaluated in a single worksheet cell, but it may be the result of computations
performed in many cells and even on several worksheets in a workbook. For the general model, no
specific form is specified for f(x), but it is important that its value be unique for a given x. The
function must also be computable for all x in the solution space. By computable we mean that it
contains no errors such as dividing by zero, taking the square root of a negative number, or using
circular cell references. These errors are easy to make on a spreadsheet.
For the IP, the objective function is a linear function.
G is the set of solutions that are feasible. Many applications have several separate logical relations
that must all be satisfied (logically TRUE) if the solution is feasible. Logical relations are easy to
express on a spreadsheet. The expressions involved in G must be computable for all solutions in X.
Considering again an IP, the components of G come from linear constraints. The following shows
the development for "less than or equal to" constraints. A similar development follows for
"equality" and "greater than or equal to" constraints.
The optimum solution is the feasible solution that minimizes (or maximizes) the objective over all
feasible solutions in the solution space.
Although X and G both limit the set over which the optimum is to be found, in our solution
methods X is the most important. We will search the solutions delimited by X, but only consider for
optimality those solutions that are feasible, members of G. The process of generating solutions from
the set X is embodied in the solution algorithm for each special type of problem, so only members
of X are generated.
Problem types:
We identify a problem type by its decisions and the space from which the decisions are to be drawn.
For each case a solution is given by a vector of integer variables.
We consider four problem types identified by their solution spaces. More detail and examples are
on the following pages.
- Range
- Permutation
- TSP
- Tree
Finding solutions:
In general our goal is to find the feasible solution in X that maximizes or minimizes the objective
function. Much OR research and practice and many commercial products related to OR have been
dedicated to this task for special classes of problems.
Some problems such as minimal spanning tree, shortest path tree and others have very efficient
algorithms for finding optimum solutions. There are many papers describing faster and faster
algorithms for solving these problems. We call these the easy problems. Other problems such as the
traveling salesman and most sequencing problems are very difficult in that there are problems of
moderate size that cannot be solved on today's high-speed computers and probably never will be
solved. We call these the hard problems.
Hard problems are usually solved by enumeration of the solution space. In the most general case,
exhaustive enumeration of all solutions is necessary to find the optimum. Although practical for
small problems, this method becomes impossible for problems of only moderate size. For some
kinds of problems, implicit enumeration methods may provide optimum solutions for larger
problems. These include the branch and bound methods of integer programming and discrete
dynamic programming. Implicit enumeration considers all solutions, but large subsets are
eliminated by numerical tests that assure that the optimum cannot lie in the eliminated subsets.
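As a hedged sketch of exhaustive enumeration, the Python fragment below walks a small 0-1 solution space, discards solutions outside G (here a weight limit), and keeps the best objective value. All of the item data are invented for illustration:

```python
from itertools import product

# Sketch of exhaustive enumeration: generate every solution in X (all
# 0/1 vectors of length n), keep only members of G (the weight limit),
# and track the best objective value. Data are invented.
values  = [10, 13, 7, 8]
weights = [ 4,  5, 3, 2]
limit   = 9

best_value, best_x = None, None
for x in product((0, 1), repeat=len(values)):      # every solution in X
    weight = sum(w * xi for w, xi in zip(weights, x))
    if weight > limit:                              # not in G: skip it
        continue
    value = sum(v * xi for v, xi in zip(values, x))
    if best_value is None or value > best_value:
        best_value, best_x = value, x

print(best_x, best_value)   # -> (1, 0, 1, 1) 25
```

With four variables only 16 solutions exist, but the count doubles with each added variable, which is exactly why exhaustive enumeration fails for problems of moderate size.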
2.8 Stochastic process
In many practical situations the attributes of a system randomly change over time. Examples
include the number of customers in a checkout line, congestion on a highway, the number of items
in a warehouse, and the price of a financial security, to name a few. In certain instances, it is
possible to describe an underlying process that explains how the variability occurs. When aspects
of the process are governed by probability theory, we have a stochastic process.
The first step in modelling a dynamic process is to define the set of states over which it can range
and to characterize the mechanisms that govern its transitions. The state is like a snapshot of the
system at a point in time. It is an abstraction of reality that describes the attributes of the system of
interest. Time is the linear measure through which the system moves, and can be thought of as a
parameter. Because of time there is a past, present, and future. We usually know the trajectory a
system has followed to arrive at its present state. Using this information, our goal is to predict the
future behaviour of the system in terms of a basic set of attributes. As we shall see, a variety of
analytic techniques are available for this purpose.
From a modelling point of view, state and time can be treated as either continuous or discrete. Both
theoretical and computational considerations, however, argue in favour of the discrete state case so
this will be our focus. We consider both discrete time and continuous time models. To obtain
computational tractability we assume that the stochastic process satisfies the Markov condition. That
is, the path the process takes in the future depends only on the current state, and not the sequence of
states visited prior to the current state. For the discrete time system this leads to the Markov Chain
model. For the continuous time system the model is called a Markov Process.
The model of a stochastic process describes activities that culminate in events. The events cause a
transition from one state to another. Because activity durations are assumed to be continuous
random variables, events occur in the continuum of time.
2.8.1 Terminologies
Stochastic
process
A collection of random variables, {X(t)}, where t is a time index that takes values from a
given set T. T may be discrete or continuous. X(t) is a scalar that may take
discrete or continuous values. We consider here only finite-discrete stochastic
processes.
Time The parameter of a stochastic process.
State A vector that describes attributes of a system at any point in time. The state
vector has m components.
X(t) describes some feature of the state.
State space Collection of all possible states.
Activity An activity begins at some point in time, has a duration and culminates in an
event. Generally the duration of the activity is a random variable with a known
probability distribution.
Event The culmination of an activity. The event has the potential to change the state
of the process.
Calendar The set of events that can occur in a specified state, Y(s).
Next event While in some state where one or more events can occur, the one that occurs
next is called the next event. Measured from the current time, the time of the
next event is t* = min { t(x) : x in Y(s) }, where t(x) is the time until
event x would occur.
The next event is the value of x that attains the minimum. When the durations
of events are random variables, both the next event and the time of the next
event are random variables.
Transition A function that determines the next state, s', based on the current state, s, and
the event, x: s' = T(s, x). The number of elements of the transition function is
the same as the number of elements in the state vector.
State-transition
network
A graphical representation of the states, represented by nodes, and events,
represented by arcs. A transition is shown as a directed arc going from one
node to another.
Markovian
property
Given that the current state is known, the conditional probability of the next
state is independent of the states prior to the current state.
Discrete-time
Markov chain
A stochastic process that satisfies the Markovian property and has a discrete
time parameter. Sometimes such a process is simply called a Markov chain.
Continuous-time
Markov chain
A stochastic process that satisfies the Markovian property and has a continuous
time parameter. Sometimes such a process is called a Markov process.
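The next-event mechanism defined in the table above can be sketched in a few lines of Python. The event names and rates below are assumptions chosen only for illustration; each event's duration is sampled from an exponential distribution and the smallest sample determines the next event:

```python
import random

# Sketch of next-event selection: while in some state s, each event x on
# the calendar Y(s) has a random exponential duration; the next event is
# the one whose sampled time is smallest. Names and rates are invented.

def next_event(calendar, rng):
    """calendar maps event name -> rate of its exponential duration."""
    times = {x: rng.expovariate(rate) for x, rate in calendar.items()}
    x_next = min(times, key=times.get)
    return x_next, times[x_next]

rng = random.Random(0)
calendar = {"arrival": 2.0, "departure": 2.5}   # hypothetical rates per minute
event, t = next_event(calendar, rng)
print(event, round(t, 3))
```

Both the winning event and its time change from run to run, matching the observation that they are random variables.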
2.9 Discrete Time Markov Chains
We now investigate a finite-state stochastic process in which the defining random variables are
observed at discrete points in time. When the future probabilistic behaviour of the process depends
only on the present state regardless of when the present state is measured, the resultant model is
called a Markov chain. When time is measured in discrete intervals the model is called the Discrete
Time Markov Chain (DTMC). The term Markov chain is more general than DTMC because it also
includes continuous time processes called Continuous Time Markov Chains (CTMC). We consider
CTMC in a later section. We use the term Markov chain when a comment refers to both CTMC and
DTMC.
Unlike most stochastic processes, Markov chains have very agreeable properties allowing for easy
study. Often they are used to approximate quite complex physical systems, even when it is clear that
the actual behavior of the system being analyzed may depend on more than just the present state, or
when the number of states is not really finite.
To develop a model of a DTMC we need to define the system states S and specify the one-step
transition matrix P. Given this information, computational procedures are available to answer
questions related to the steady-state behavior of the DTMC. In particular we can compute: the t-step
transition matrix, transient and steady-state probability vectors, absorbing state probabilities,
and first passage probability distributions. Integrating these results with economic data leads
directly to a framework for systems design and optimal decision making under uncertainty.
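As a sketch of the first of these computations, the t-step transition matrix is simply the one-step matrix P multiplied by itself t times. The two-state matrix below is invented for illustration:

```python
# Sketch: the t-step transition matrix is P^t. Plain nested lists keep
# the example self-contained; the 2x2 matrix is invented data.

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, t):
    """Raise P to the t-th power by repeated multiplication."""
    n = len(P)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(t):
        result = mat_mul(result, P)
    return result

P = [[0.9, 0.1],   # hypothetical one-step transition matrix
     [0.5, 0.5]]
P2 = mat_pow(P, 2)
print(P2[0])   # two-step probabilities out of state 0 (about 0.86 and 0.14)
```

Each row of P^t still sums to 1, since it is again a probability distribution over the states.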
2.9.1 Example problem: Computer repair
A real estate office that relies on two aging computers for word processing is experiencing high
costs and great inconvenience due to chronic machine failures. It has been observed that when both
computers are working in the morning, there is a 30% chance that one will fail by evening and a
10% chance that both will fail. If it happens that only one computer is working at the beginning of
the day, there is a 20% chance that it will fail by the close of business. If neither is working in the
morning, the office sends all work to a typing service. In this case, of course, no machines will fail
during the day. Service orders are placed with a local repair shop. The computers are picked up
during the day and returned the next morning in operating condition. The one-day delay is
experienced when either one or both machines are under repair.
Time:
For a DTMC, the system is observed at discrete points in time that are indexed with the nonnegative
integers. Time t = 0 is the initial point of observation. It is important to identify carefully the exact
moment when the system is to be observed in relation to the events described by the problem
statement. For the example, the system is to be observed in the morning after any repaired
computers are returned and before any failures have occurred during the current day.
State:
The state describes the situation at a point in time. Because the states are required to be discrete
they can be identified with nonnegative integers 0, 1, 2, 3... and so on. There may be a finite or an
infinite number of states. For this introductory discussion, it is easier to concentrate on the finite
case and use m - 1 as the maximum state index. The sequence of random variables {X(0), X(1),
X(2), ...} is the stochastic process that describes the state at time t. Each X(t) can take one of m
values.
Depending on the situation, the state may be very complex or very simple. We use a v-dimensional
vector of state variables to define the state.
In constructing a model, it must be made clear what the one-to-one relationships are between the
possible state vectors and the nonnegative integers used to identify a state in a DTMC. We call the
state associated with index i, s(i). Depending on the context, i typically ranges from 0 to m - 1 or from
1 to m. The state definition must encompass all possible states; however, the system can reside in
only one state at any given time.
The computer repair example allows a simple state definition with only one component, s =
(n), where n is the number of computers that have failed when the system is observed in the
morning. Note that the system is observed after the repaired units are delivered but before any
failures occur. The value of m is 3. In this case, the state index is conveniently identical to the
variable defining the state; however, this relationship will not always be true. We assign a cost of
operating the system for one day that depends on how many computers have failed. The cost is $50
per failed computer. This cost is a function of the state.
Index s   State definition                                            Cost
0         No computers have failed. The office starts the day            0
          with both computers functioning properly.
1         One computer has failed. The office starts the day            50
          with one working computer and the other in the shop
          until the next morning.
2         Both computers have failed. All work must be sent out        100
          for the day.
Events:
To understand the behaviour of a DTMC, it is necessary to identify the events that might occur
during a single time period and to describe their probabilities of occurrence. Generally, the set of
possible events and their probabilities depend on the state s.
Given some current state at the beginning of a period and one or more events occurring during the
period, the system will be in some new (next) state at the beginning of the next period. This
occurrence is called a transition. One or more events may occur within the period, and by observing
them, we must identify the resulting new state at the beginning of the next period.
For the computer example we list the current states together with the set of possible events that
might occur during the day. Given the current state and the problem description, one must be able to
determine the probability of every possible transition for the upcoming period. We use coloured
bands to distinguish the states. Note that each state has one or more associated events and that the
sum of the probabilities for each state must equal 1. The cost column represents the repair cost of
the computers, assumed to be $40 per computer repaired.
Index s   Event                                      Probability   Next state   Cost
0         Neither computer fails.                        0.6           (0)         0
          One computer fails.                            0.3           (1)        40
          Both computers fail.                           0.1           (2)        80
1         The remaining computer does not fail           0.8           (0)         0
          and the other is returned.
          The remaining computer fails and the           0.2           (1)        40
          other is returned.
2         Both computers have failed. No failures        1.0           (0)         0
          can occur.
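A sample path of this chain can be simulated directly from the rows of the event table. The sketch below is illustrative only; the ten-day horizon and the random seed are arbitrary choices:

```python
import random

# Sketch: simulate one sample path of the computer repair chain using
# the transition rows from the event table. Horizon and seed are arbitrary.

P = {0: [(0, 0.6), (1, 0.3), (2, 0.1)],   # from state 0
     1: [(0, 0.8), (1, 0.2)],             # from state 1
     2: [(0, 1.0)]}                       # from state 2: both repaired

def step(state, rng):
    """Sample the next state from the transition row of the current state."""
    u, acc = rng.random(), 0.0
    for nxt, prob in P[state]:
        acc += prob
        if u < acc:
            return nxt
    return P[state][-1][0]   # guard against floating-point round-off

rng = random.Random(1)
path = [0]                   # start the first morning in state 0
for _ in range(10):          # ten observed mornings
    path.append(step(path[-1], rng))
print(path)
```

Note how every visit to state 2 is followed by state 0, exactly as the last table row dictates.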
State-transition matrix:
The major properties of a DTMC model can be described with the m x m state-transition
matrix P whose rows and columns are labelled 0 through m - 1. An element of the matrix, p(i,j), is
the probability that, given the system is in state i at some time t, the system will be in state j at
time t + 1. A requirement of a DTMC is that the transition probability depends only on the current
state i and not on the particular path that the process takes to reach state i.
For the computer repair example, the one-step transition matrix assembled from the event
probabilities is

         0    1    2
    0 | 0.6  0.3  0.1 |
    1 | 0.8  0.2  0.0 |
    2 | 1.0  0.0  0.0 |

The state indices are shown to the left of and above the matrix.
Some general characteristics of the transition matrix are as follows.
- The elements of a row must sum to 1. This follows from the logical requirement that the
states define every possible condition for the system.
- All elements are between 0 and 1. This follows from the definition of probability.
State-transition network:
The information in the transition matrix can also be displayed in a directed network which has a
node for each state and an arc passing from state i to state j if p(i,j) is nonzero. The figure depicts
the network for the computer repair example. Transition probabilities are adjacent to the arcs. A
requirement is that the sum of all probabilities leaving a node must be 1.
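One computation these structures support is the steady-state probability vector. The sketch below enters the computer repair matrix directly and approximates the steady state by power iteration; the 200-step cutoff is an arbitrary but comfortable choice:

```python
# Sketch: steady-state probabilities for the computer repair chain,
# approximated by repeatedly applying q <- q P until it stops changing.

P = [[0.6, 0.3, 0.1],   # from state 0: no failure / one fails / both fail
     [0.8, 0.2, 0.0],   # from state 1: survivor works / survivor fails
     [1.0, 0.0, 0.0]]   # from state 2: both repaired units returned

q = [1.0, 0.0, 0.0]     # start with both computers working
for _ in range(200):    # arbitrary but ample iteration count
    q = [sum(q[i] * P[i][j] for i in range(3)) for j in range(3)]

print([round(p, 4) for p in q])
```

Solving q P = q algebraically gives 40/59, 15/59 and 4/59 (about 0.678, 0.254 and 0.068), so in the long run the office begins roughly 68% of its days with both computers working.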
Complete model:
A DTMC model requires the specification of the following:
- The times when the system is to be observed.
- The discrete states in which the system may be found. The list of states must be exhaustive.
In addition, a one-to-one correspondence must be prescribed between the states and the
nonnegative integers.
- The state-transition matrix showing the transition probabilities in one time interval.
Transition probabilities must satisfy the Markovian property, and every row must sum to
one.
Although the model structure is easily stated, it is not always easy to realize. For example, one might
propose time and state definitions (the first two items above) for which the Markovian property is not
satisfied. This may sometimes be remedied by a more complex state definition.
Because a DTMC model is very general it can be used to describe many interesting stochastic
systems. In many cases, however, the number of states required to adequately define the model is
very large. As with dynamic programming, this "curse of dimensionality" frequently arises when
we try to identify all possible states of the system.
2.10 Continuous Time Markov Chains
A natural extension of a DTMC occurs when time is treated as a continuous parameter. In this
section, we consider continuous-time, discrete-state stochastic processes but limit our attention to
the case where the Markovian property holds; that is, the future realization of a system depends
only on the current state and the random events that proceed from it. This is called a Continuous
Time Markov Chain (CTMC). Sometimes we use the term Markov Process for this kind of system.
It happens that the Markov property is only satisfied in a continuous-time stochastic process if all
activity durations are exponentially distributed. Although this may sound somewhat restrictive,
many practical situations can be modelled as CTMC and many powerful analytical results can be
obtained. A primary example is an M/M/s queuing system in which customer arrivals and service
times follow an exponential distribution. Because it is possible to compute the steady-state
probabilities for such systems, it is also possible to compute many performance-related statistics
such as the average wait and the average number of customers in the queue. In addition, many
critical design and operational questions can be answered with little computational effort.
2.10.1 Example problem: ATM machine
To illustrate the elements of the stochastic process model, we use the example of a single
Automated Teller Machine (ATM) located in the foyer of a bank. The ATM performs banking
operations for people arriving for service. The machine is used by only one person at a time, and
that person is said to be in service. Others arriving when the machine is busy must wait in a single
queue, and these people are said to be in the queue. Following the rule of first-come-first-served, a
person in the queue will eventually enter service and will ultimately leave the system. The number
in the system is the total of the number in service plus the number in the queue. The foyer is limited
in size so that it can hold only five people. Since the weather is generally bad in this part of the
country, when the foyer is full, arriving people do not enter. We have gathered statistics on ATM
usage that show the time between arrivals averages 30 seconds (or 0.5 minutes). The time for
service averages 24 seconds (or 0.4 minutes). Although the ATM has sufficient capacity to meet all
demand, we frequently observe queues at the machine and occasionally customers are lost.
We want to perform an analysis to determine statistical measures that describe the number of people
in the system, the waiting time for customers, the efficiency of the ATM, and the number
of customers not served because there is no room in the foyer. We intend to use these statistics to
guide managers in design questions such as whether another ATM should be installed, or whether
the size of the foyer should be expanded.
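One standard way to answer such questions is to model the ATM as an M/M/1/K queue, a single exponential server with room for K = 5 customers; that modelling choice is an assumption here, though it is consistent with the data above. The sketch below evaluates the closed-form steady-state probabilities for the stated rates:

```python
# Sketch: the ATM as an M/M/1/K queue with K = 5 spaces, using the data
# above: arrivals every 0.5 min on average, service taking 0.4 min.

lam, mu, K = 2.0, 2.5, 5        # arrival and service rates per minute
rho = lam / mu                  # traffic intensity, 0.8

# closed-form steady-state probability of n customers in the foyer
p = [(1 - rho) * rho**n / (1 - rho**(K + 1)) for n in range(K + 1)]

L = sum(n * p[n] for n in range(K + 1))   # mean number in the system
loss = p[K]                               # fraction of arrivals turned away

print(round(L, 3), round(loss, 3))
```

With rho = 0.8 the computation gives a mean of roughly 1.87 people in the foyer and about 8.9% of arrivals turned away, which matches the observation that queues form and customers are occasionally lost even though the ATM has enough capacity on average.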
One way to describe the process associated with this situation is in the state-transition network. The
state is the number of customers in the foyer. A change in state occurs when we have an arrival (a)
or a departure (d). When the foyer opens in the morning, the system is empty -- in state 0. As
customers arrive the state increases. Since there is only one machine, customers must wait in a
queue when the state is greater than one. The state index increase