(Virtual) Agents for running electricity markets


Simulation Modelling Practice and Theory 18 (2010) 1442–1452



Paulo Trigo a,*, Paulo Marques b, Helder Coelho c

a LabMAg, GuIAA, DEETC, ISEL – Instituto Superior de Engenharia de Lisboa, Portugal
b GuIAA, DEETC, ISEL – Instituto Superior de Engenharia de Lisboa, Portugal
c LabMAg, DI, FCUL – Faculdade de Ciências da Universidade de Lisboa, Portugal

Article info

Article history:
Received 27 October 2009
Received in revised form 13 March 2010
Accepted 2 April 2010
Available online 10 April 2010

Keywords:
Multi-agent based simulation
Electricity market modeling
Sequential decision process
Adaptation and learning

Abstract

This paper describes a multi-agent based simulation (MABS) framework to construct an artificial electric power market populated with learning agents. The artificial market, named TEMMAS (The Electricity Market Multi-Agent Simulator), explores the integration of two design constructs: (i) the specification of the environmental physical market properties and (ii) the specification of the decision-making (deliberative) and reactive agents. TEMMAS is materialized in an experimental setup involving distinct power generator companies that operate in the market and search for the trading strategies that best exploit their generating units' resources. The experimental results show a coherent market behavior that emerges from the overall simulated environment.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

The organizational structure of the electrical power industry has been completely altered over the last two decades. The traditional organization used to follow a heavily regulated perspective with a single nation-wide electric power company owning the whole infrastructure from generating stations to transmission and distribution facilities. Such a monopolistic approach has been gradually deregulated and the modern organization is that of an energy market where competition should be achieved in a fair and transparent way (according to the European Directives 2003/54/EC and 2003/55/EC [1]).

The start-up of nation-wide electric markets, along with their recent expansion to intercountry markets, aims at providing a competitive electricity service to consumers. This newly deregulated electricity market organization calls for an increasing (human) decision-making responsibility in order to settle the energy assets' trading strategies. The growing number of interactions among market participants and their mutual influence are usually described by game theoretic approaches that are based on the determination of equilibrium points against which to compare the actual market performance [2,3]. However, those approaches find it difficult to incorporate the ability of market participants to repeatedly probe markets and adapt their strategies. Usually, the problem of finding the equilibrium strategies is relaxed (simplified) both in terms of: (i) the human agents' bidding policies and (ii) the technical and economical operation of the power system.

As an alternative to the equilibrium approaches, multi-agent based simulation (MABS) comes forth as being particularly well fitted to analyze dynamic and adaptive systems with complex interactions among constituents [4,5].

In this paper we describe a MABS modeling framework that provides constructs for the (human) designer to specify a dynamic environment, its resources, observable properties and its inhabitant decision-making agents. We used the framework to capture the behavior of the electricity market and to build a simulator, named TEMMAS (The Electricity Market Multi-Agent Simulator), which incorporates the operation of multiple generator company (GenCo) operators, each with distinct power generating units (GenUnit), and a market operator (Pool) which computes the hourly market price (driven by the electricity demand).

* Corresponding author. Tel.: +351 91 887 2938.
E-mail addresses: [email protected] (P. Trigo), [email protected] (P. Marques), [email protected] (H. Coelho).

1569-190X/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.simpat.2010.04.003


The TEMMAS agency exhibits bounded rationality, i.e., each agent makes decisions based on local information (partial knowledge) of the system and of other agents while learning and adapting its strategies during a simulation. The purpose of TEMMAS is not to explicitly search for equilibrium points, but rather to reveal, and help to understand, the complex and aggregate system behaviors that emerge from the interactions of the market agents.

We used TEMMAS to construct an agency for the Iberian (Portugal and Spain) Electricity Market (MIBEL – "Mercado Ibérico de Electricidade"). The agency follows the market organizational structure and each agent acts (bids, buys, sells) according to its role in the market. The simulation explores the relation between the production capacity and the search for competitive bidding strategies. Initial results reveal a coherent market behavior, as price reductions from competition favour consumers and harm producers that do not adapt their electricity bids.

2. The MABS modeling framework

We describe the structural MABS constituents by means of two concepts: (i) the environmental entity, which has a distinct existence in the real environment, e.g. a resource such as an electricity producer, or a decision-making agent such as a market bidder generator company, and (ii) the environmental property, which is a measurable aspect of the real environment, e.g. the price of a bid or the demand for electricity. Hence, we define the environmental entity set, $E_T = \{e_1, \ldots, e_n\}$, and the environmental property set, $E_Y = \{p_1, \ldots, p_m\}$. The whole environment is the union of its entities and properties: $E = E_T \cup E_Y$.

The environmental entities, $E_T$, are often clustered in different classes, or types, thus partitioning $E_T$ into a set, $P_{E_T}$, of disjoint subsets, $P^i_{E_T}$, each containing entities that belong to the same class. Formally, $P_{E_T} = \{P^1_{E_T}, \ldots, P^k_{E_T}\}$ defines a full partition of $E_T$, such that $P^i_{E_T} \subseteq E_T$ and $E_T = \bigcup_{i=1,\ldots,k} P^i_{E_T}$ and $P^i_{E_T} \cap P^j_{E_T} = \emptyset \;\; \forall i \neq j$. The partitioning may be used to distinguish between decision-making agents and available resources, e.g. a company that decides the bidding strategy to pursue or a plant that provides the demanded power.

The environmental properties, $E_Y$, can also be clustered, in the same way as the environmental entities, thus grouping properties that are related. The partitioning may be used to express distinct categories, e.g. economical, electrical, ecological or social aspects. Another, more technical, usage is to separate constant parameters from dynamic state variables.

The factored state space representation. The state of the simulated environment is implicitly defined by the state of all its environmental entities and properties. We follow a factored representation, which describes the state space as a set, $V$, of discrete state variables [6]. Each state variable, $v_i \in V$, takes on values in its domain $D(v_i)$ and the global (i.e., over $E$) state space, $S \subseteq \times_{v_i \in V} D(v_i)$, is a subset of the Cartesian product of the state variable domains. A state $s \in S$ is an assignment of values to the set of state variables $V$. We define $f_C$, $C \subseteq V$, as a projection such that if $s$ is an assignment to $V$, $f_C(s)$ is the assignment of $s$ to $C$; we define a context $c$ as an assignment to the subset $C \subseteq V$; the initial state variables of each entity and property are defined, respectively, by the functions $init_{E_T} : E_T \to C$ and $init_{E_Y} : E_Y \to C$.
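As a minimal sketch of the factored representation, a state can be held as a mapping from state variables to values, and the projection $f_C$ as the restriction of that mapping to a subset $C$. The variable names and domains below are illustrative assumptions, not TEMMAS definitions.

```python
from itertools import product

# Hypothetical state variables and their discrete domains D(v_i).
domains = {
    "mShare": (0, 50, 100),  # coarse market-share levels (assumed)
    "price": (10, 20),       # coarse price levels (assumed)
}

def state_space(domains):
    """Enumerate the Cartesian product of the variable domains."""
    names = sorted(domains)
    return [dict(zip(names, values))
            for values in product(*(domains[n] for n in names))]

def project(state, subset):
    """f_C(s): restrict the assignment s to the variables in C."""
    return {v: state[v] for v in subset}

states = state_space(domains)
context = project(states[0], {"mShare"})
```

Here the full product is enumerated; in general $S$ may be a strict subset of it, in which case infeasible assignments would be filtered out.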

From environmental entities to resources and agents. Embodiment is central in describing the relation between the entities and the environment [7]. Each environmental entity can be seen as a body, possibly with the capability to influence the environmental properties. Based on this idea of embodiment, two higher-level concepts (decoupled from the characterization of the environment, $E$) are introduced: (i) agent, owning reasoning and decision-making capabilities and (ii) resource, without any reasoning capability. Thus, given a set of agents, $\Upsilon$, we define an association function $embody : \Upsilon \to E_T$, which connects an agent to its physical entity. In a similar way, given a set of resources, $\Phi$, we define the mapping function $identity : \Phi \to E_Y$. We consider that $|E| = |\Upsilon| + |\Phi|$, thus each entity is either mapped to an agent or to a resource; there is no third category.

The decision-making approach. Each agent perceives (the market) and acts (sells or buys) and there are two main approaches to develop the reasoning and decision-making capabilities: (i) the qualitative mental-state based reasoning, such as the belief-desire-intention (BDI) architecture [8], which is founded on logic theories, and (ii) the quantitative, decision-theoretic, evaluation of causal effects, such as the Markov decision process (MDP) support for sequential decision-making in stochastic environments. There are also hybrid approaches that combine the qualitative and quantitative formulations [9,10].

The qualitative mental-state approaches capture the relation between high level components (e.g. beliefs, desires, intentions) and tend to follow heuristic (or rule-based) decision-making strategies, thus being better suited to tackle large-scale problems and worse suited to deal with stochastic environments.

The quantitative decision-theoretic approaches deal with low level components (e.g., primitive actions and immediate rewards) and search for long-term policies that maximize some utility function, thus being worse suited to tackle large-scale problems and better suited to deal with stochastic environments.

The electric power market is a stochastic environment and we currently formulate medium-scale problems that can fit a decision-theoretic agent model. Therefore, TEMMAS adaptive agents (e.g., market bidders) follow an MDP based approach and resort to experience (sampled sequences of states, actions and rewards from simulated interaction) to search for optimal, or near-optimal, policies using reinforcement learning methods such as Q-learning [11] or SARSA [12].
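For reference, the two methods differ only in how they bootstrap from the next state. The updates below are the standard textbook forms, written with a learning rate $\alpha$ and a discount factor $\gamma$; the time-indexed notation is an assumption of this sketch rather than notation taken from the TEMMAS design.

```latex
% Q-learning (off-policy): bootstrap from the greedy action in s_{t+1}
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]

% SARSA (on-policy): bootstrap from the action a_{t+1} actually taken
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]
```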

3. TEMMAS agency design

Within the current design model of TEMMAS the electricity asset is traded through a spot market (no bilateral agreements), which is operated via a Pool institutional power entity. Each generator company, GenCo, submits (to Pool) how much energy each of its generating units, $GenUnit_{GenCo}$, is willing to produce and at what price. Thus, we have:

• the power supply system comprises a set, $E_{GenCo}$, of generator companies,
• each generator company, GenCo, contains its own set, $E_{GenUnit_{GenCo}}$, of generating units,
• each generating unit, $GenUnit_{GenCo}$, of a GenCo, is responsible for calculating its marginal costs, and
• the market operator, Pool, computes the bidding procedure that settles the market price.

The bidding procedure conforms to the so-called "block bids" approach [13], where a block represents a quantity of energy being bid for a certain price; also, GenCos are not allowed to bid higher than a predefined price ceiling. Thus, the essential measurable aspects of the market supply are the energy price, quantity and marginal production cost. The consumer side of the market is mainly described by the quantity of demanded energy; we assume that there is no price elasticity of demand (i.e., no demand-side market bidding). Therefore, we have: $E_T = \{Pool\} \cup E_{GenCo} \cup \bigcup_{g \in E_{GenCo}} E_{GenUnit_g}$ and $E_Y = \{quantity, price, marginalProductionCost\}$. The quantity refers both to the supply and demand sides of the market. The price refers both to the supply bid values and to the market value settled by Pool. The marginalProductionCost refers to the marginal cost as defined by each $GenUnit_{GenCo}$ (according to Eqs. (1) and (2)). The set $E_{GenCo}$ contains the decision-making agents. The Pool is a reactive agent that always applies the same predefined auction rules in order to determine the market price and hence the block bids that clear the market. Each $E_{GenUnit_{GenCo}}$ represents the GenCo's set of available resources.

The resources' specification. Each generating unit, $GenUnit_{GenCo}$, defines its marginal costs and constructs the block bids according to the strategy indicated by its generator company, GenCo. The marginal cost valuations are usually kept as a "private business" affair that incorporates the company's econometric and technical knowledge. Although undisclosed, these formulations usually follow one of two main perspectives: one that relates marginal costs to the heat rate gradient, "WithHeatRate" [14], and another that considers productivity indexes, "WithProductivity" [15]. TEMMAS implements both formulations and allows other perspectives to be included. The "WithHeatRate" formulation estimates the marginal cost, $MC$, by combining the variable operations and maintenance costs, $vO\&M$, the number of heat rate intervals, $nPat$, each interval's capacity, $cap_i$, and the corresponding heat rate value, $hr_i$, and the price of the fuel, $fPrice$, being used; the marginal cost for a given $i \in [1, nPat]$ interval is given by

$$MC_{i+1} = vO\&M + \frac{(cap_{i+1} \times hr_{i+1}) - (cap_i \times hr_i)}{blockCap_{i+1}} \times fPrice \qquad (1)$$

where each block's capacity is $blockCap_{i+1} = cap_{i+1} - cap_i$. The "WithProductivity" marginal cost, $MC$, combines the variable operations and maintenance costs, $vO\&M$, the fuel price, $fPrice$, the CO2 cost, $CO_2cost$, and the unit's productivity, $\eta$, as

$$MC = vO\&M + \frac{fPrice}{\eta} \times K + CO_2cost \qquad (2)$$

where $K$ is a fuel-dependent constant factor, $CO_2cost = CO_2price \times \frac{CO_2emit}{\eta} \times K$ and $CO_2emit$ is the fuel's CO2 emissions. Here all blocks have the same capacity; given a unit's maximum capacity, $maxCap$, and a number of blocks, $nBlocks$, to sell, each block's capacity is given by $blockCap = \frac{maxCap}{nBlocks}$.
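The "WithHeatRate" formulation of Eq. (1) can be sketched in a few lines. The function name is hypothetical, and the division by 1000 is a unit-conversion assumption (Btu/kWh times €/MMBtu gives thousandths of €/MWh) that the text does not state explicitly; with it, the sketch reproduces the marginal costs listed for the coal unit in Table 2 to one decimal place.

```python
def marginal_costs(caps, heat_rates, v_om, fuel_price):
    """Marginal cost (EUR/MWh) of each block above the first capacity point.

    caps in MW, heat_rates in Btu/kWh, v_om in EUR/MWh,
    fuel_price in EUR/MMBtu.
    """
    costs = []
    for i in range(len(caps) - 1):
        block_cap = caps[i + 1] - caps[i]
        # Incremental heat rate of the block, in Btu/kWh.
        incr_hr = (caps[i + 1] * heat_rates[i + 1]
                   - caps[i] * heat_rates[i]) / block_cap
        # Assumed unit conversion: Btu/kWh * EUR/MMBtu = 1e-3 EUR/MWh.
        costs.append(v_om + incr_hr * fuel_price / 1000.0)
    return costs

# Coal (CO) unit data as listed in Tables 1 and 2.
co_costs = marginal_costs(
    caps=[250, 350, 400, 450, 500],
    heat_rates=[12000, 10500, 10080, 9770, 9550],
    v_om=1.75,
    fuel_price=1.5,
)
print([round(c, 1) for c in co_costs])
```

The first capacity point has no marginal cost of its own, which matches the "–" entries in the first row of each heat rate table.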

The decision-making strategies. Each generator company defines the bidding strategy for each of its generating units. We designed two types of strategies: (a) the basic-adjustment, which chooses among a set of basic rigid options and (b) the heuristic-adjustment, which selects and follows a predefined well-known heuristic. There are several basic-adjustment strategies already defined in TEMMAS. Here we outline seven of those strategies, $sttg_i$ where $i \in \{1, \ldots, 7\}$, available for a GenCo to apply: (i) $sttg_1$, bid according to the marginal production cost of each $GenUnit_{GenCo}$ (follow heat rate curves, e.g., cf. Tables 2 and 3), (ii) $sttg_2$, make a "small" increment in the prices of all the previous day's block bids, (iii) $sttg_3$, similar to $sttg_2$, but make a "large" increment, (iv) $sttg_4$, make a "small" decrement in the prices of all the previous day's block bids, (v) $sttg_5$, similar to $sttg_4$, but make a "large" decrement, (vi) $sttg_6$, hold the prices of all the previous day's block bids, (vii) $sttg_7$, set the price to zero. There are two heuristic-adjustment strategies defined: (a) the "Fixed Increment Price Probing" (FIPP), which uses a percentage to increment the price of the previous day's transacted energy blocks and to decrement the price of the non-transacted blocks, and (b) the "Physical Withholding based on System Reserve" (PWSR), which reduces the block's capacity, so as to decrement the next day's estimated system reserve (difference between total capacity and total demand), and then bids the remaining energy at the maximum market price. These strategies were identified with insight from human decision-makers, but the "real-world" strategies are more complex as they incorporate the estimation of economical indexes, the foresight of others' behavior and the company's historical knowledge and business best practices.
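The FIPP rule described above can be sketched as follows. The function name, the representation of a block as a (quantity, price) pair, and the optional clipping to the price ceiling are assumptions made for illustration; the text only specifies the percentage increment and decrement.

```python
def fipp_adjust(blocks, transacted, pct, ceiling=None):
    """Next-day block bids under a fixed-increment price probing rule.

    blocks: list of (quantity_mw, price) tuples from the previous day.
    transacted: list of booleans, True where the block was sold.
    pct: adjustment as a fraction, e.g. 0.1 for 10% (assumed parameter).
    ceiling: optional market price ceiling applied to the new prices.
    """
    adjusted = []
    for (quantity, price), sold in zip(blocks, transacted):
        # Sold blocks probe a higher price; unsold blocks back off.
        new_price = price * (1 + pct) if sold else price * (1 - pct)
        if ceiling is not None:
            new_price = min(new_price, ceiling)
        adjusted.append((quantity, new_price))
    return adjusted

yesterday = [(100, 10.0), (50, 20.0)]
tomorrow = fipp_adjust(yesterday, transacted=[True, False], pct=0.1)
```

In this example the transacted 100 MW block moves from 10 to about 11, while the non-transacted 50 MW block drops from 20 to about 18.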

The agents' decision process. The above strategies correspond to the GenCo agent's primary actions. The GenCo has a set, $E_{GenUnit_{GenCo}}$, of generating units and, at each decision-epoch, it decides the strategy to apply to each generating unit, thus choosing a vector of strategies, $\vec{sttg}$, where the $i$th component of the vector refers to the $GenUnit^i_{GenCo}$ generating unit; thus, its action space is given by: $A = \times_{i=1}^{|E_{GenUnit_{GenCo}}|} \{sttg_1, \ldots, sttg_7\}_i \cup \{FIPP, PWSR\}$. The GenCo's perceived market share, $mShare$, is used to characterize the agent's internal memory, so its state space is given by $mShare \in [0..100]$. Each GenCo is an MDP decision-making agent such that the decision process period represents a daily market. At each decision-epoch each agent computes its daily profit (which is regarded as an internal reward function) and the Pool agent receives all the GenCos' block bids for the 24 daily hours and settles the hourly market price by matching offers in a classic supply and demand equilibrium price (we assume an hourly constant demand).
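The Pool's hourly settlement can be illustrated with a uniform-price, merit-order clearing sketch. This is one common reading of "classic supply and demand equilibrium" under inelastic demand, not the TEMMAS implementation; the function name, the bid tuple layout and the partial acceptance of the marginal block are assumptions.

```python
def clear_market(bids, demand):
    """Uniform-price clearing for one hour with inelastic demand.

    bids: list of (owner, quantity_mw, price) block bids.
    demand: demanded quantity in MW.
    Returns (market_price, accepted), where accepted lists
    (owner, dispatched_mw, bid_price) in merit order.
    """
    accepted, remaining, price = [], demand, None
    # Cheapest blocks are dispatched first (merit order).
    for owner, quantity, bid_price in sorted(bids, key=lambda b: b[2]):
        if remaining <= 0:
            break
        dispatched = min(quantity, remaining)
        accepted.append((owner, dispatched, bid_price))
        remaining -= dispatched
        price = bid_price  # the marginal accepted block sets the price
    if remaining > 0:
        raise ValueError("offered capacity does not cover demand")
    return price, accepted

bids = [("A", 100, 12.0), ("B", 80, 30.0), ("A", 50, 13.0)]
price, accepted = clear_market(bids, demand=170)
```

With these illustrative bids, owner A sells both of its blocks in full, owner B sells 20 MW of its 80 MW block, and every accepted block is paid B's price of 30.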


4. TEMMAS architecture and construction

The TEMMAS agents, along with the major inter-agent communication paths, are represented in the bottom region of Fig. 1; the top region shows the user interface through which the configurable parameters of each resource and agent are specified. The implementation of the TEMMAS architecture followed the INGENIAS [16] methodology and used its supporting development platform. The methodology provides (five) different viewpoint descriptions of an agent [17] and the tasks-goals viewpoint depicts how to satisfy a goal, via the execution of the goal's related tasks, given the dependencies between tasks and resources; e.g., the resources consumed, modified or used during the execution of a task along with the resources produced upon the task's completion. Fig. 2 illustrates a tasks-goals model with one goal, "update-goal", and two tasks: (a) the "init-task" that fires when the agent starts and (b) the "update-task" that is triggered on the arrival of an "update-event". Those tasks initialize and update the agent's configurable parameters (via the user interface); a syntactic label is provided for each model component (cf. bottom of Fig. 2) but we refer to the INGENIAS documentation [18] for a comprehensive description of the model's semantics (and syntax).

At each decision epoch the agent consumes perceptions, activates a task (to achieve a goal) and decides the next action to perform. Fig. 3 shows the "bid" goal to be achieved via the "bidding" task, whenever the "private-data" and "public-data" become available, thus generating the "bids" action (a communicative act to be sent to the Pool representative agent). Each agent's "private-data" includes its (hourly) energy transaction, revenue, market share percentage and the total revenue of all other agents; the "public-data" includes the market price, the total amount of energy transactions and the system reserve. Both private and public data are used to compute the bidding strategies (e.g., FIPP and PWSR, cf. Section 3). The "QStub" element (cf. Fig. 3) connects to a Q-learning module coded as an R-Project server [19]; the state-action space is represented as a (persistent) sparse matrix. Thus, the "bidding" task uses the "QStub" resource and gets the MDP policy to achieve the "bid" goal.

5. TEMMAS illustrative setup

We used TEMMAS to build a specific electric market simulation model. We took inspiration from the Iberian Electricity Market (MIBEL – "Mercado Ibérico de Electricidade") with Portuguese (e.g., EDP – "Electricidade de Portugal", "Turbogás", "Tejo Energia") and Spanish (e.g., "Endesa", "Iberdrola", "Union Fenosa", "Hidro Cantábrico", "Viesgo", "Gas Natural", "Elcogás") generator companies. Regarding the total installed electricity capacity, the Iberian market is composed of a major player (Spain) and a minor player (Portugal). Our experiments exploit the combined market behavior of a major and a minor electricity market player. We abstracted intra-nation market details and modeled each country as a single generator company (with several generating units). Fig. 4 uses INGENIAS notation to depict the hierarchical structure of the electricity market; the Pool (OMEL – "Operador do Mercado Ibérico de Electricidade") settles the market price (and the coupled bids) from the bids submitted by each GenCo (PT – "Portugal" and ES – "Spain"), each of which bids according to a strategy that depends on the marginal production costs of each GenUnit. We considered three types of generating units:

(i) coal plant, CO, to provide the base load demand,
(ii) combined cycle plant, CC, to cover intermediate load, and
(iii) gas turbine, GT, to cover peaking loads.

Fig. 1. TEMMAS architecture and configurable parameters (via interface).

Fig. 2. TEMMAS tasks-goals viewpoint (INGENIAS notation).

Fig. 3. The TEMMAS bidding goal, task and related resources.

Table 1 shows the properties of each plant type; Tables 2 and 3 show the heat rate curves used to define the bidding blocks. The marginal cost (cf. Tables 2 and 3) was computed using expression (1); the bidding block's quantity is the capacity increment, e.g. for CO, the quantity of the bidding block with marginal cost 11.9 is 350 − 250 = 100 MW (cf. Table 2, CO, second and first rows).

6. Experiments and results

Our experiments have two main purposes: (i) illustrate the TEMMAS functionality and (ii) analyze the agents' resulting behavior, e.g. the learnt bidding policies, in light of the market specific dynamics. We designed three experimental scenarios; Table 4 shows each GenCo's name along with its production capacity, computed from the respective GenUnits (cf. Table 1). The "active" suffix (cf. Table 4, name column) means that the GenCo searches for its GenUnits' best bidding strategies; i.e. an "active" GenCo is a policy learning agent. The state-action space is represented as a sparse matrix (zeros discarded), an action is a vector of strategies (one per GenUnit) and the state space is the GenCo's market share (cf. Section 3, $A$ and $mShare$).

Fig. 4. Illustrative TEMMAS formulation (INGENIAS organizational viewpoint); PT and ES are Portugal and Spain; OMEL is the Iberian market operator.

Table 1. Properties of generating units: coal (CO), combined cycle (CC) and gas turbine (GT); O&M indicates "operation and maintenance" cost.

Property      Unit     CO          CC        GT
Fuel          –        Coal (BIT)  Nat. gas  Nat. gas
Capacity      MW       500         250       125
Fuel price    €/MMBtu  1.5         5         5
Variable O&M  €/MWh    1.75        2.8       8

Table 2. CO and CC units' capacity blocks (MW), heat rates (Btu/kWh) and the corresponding marginal costs (€/MWh).

CO generating unit               CC generating unit
Cap.  Heat rate  Marg. cost      Cap.  Heat rate  Marg. cost
250   12,000     –               100   9000       –
350   10,500     11.9            150   7800       29.8
400   10,080     12.5            200   7200       29.8
450   9770       12.7            225   7010       30.3
500   9550       13.1            250   6880       31.4

Table 3. GT unit's capacity blocks (MW), heat rates (Btu/kWh) and the corresponding marginal costs (€/MWh).

GT generating unit
Cap.  Heat rate  Marg. cost
50    14,000     –
100   10,600     44.0
110   10,330     46.2
120   10,150     48.9
125   10,100     52.5

Experiment #1. The experiment sets a constant, 600 MW, hourly demand for electricity. Fig. 5 shows the GenCo_active process of learning the bidding policy with the highest long-term profit. We used Q-learning, with an ε-greedy exploration strategy, which picks a random action with probability ε and behaves greedily otherwise (i.e., picks the action with the highest estimated action value); we defined ε = 0.2. The learning rate of Q-learning was defined as α = 0.01 and the discount factor (which measures the present value of future rewards) was set to γ = 0.5. Fig. 6 shows the bid blocks that cleared the market (at the first hour of the last simulated day). As there is no market competition, the cheapest unit, CO, bids zero and provides 500 MW, the GT sets the market price (to its ceiling) and the remaining 100 MW are distributed among the most expensive (highest marginal cost) GenUnits (CC, GT). So, the GenCo_active agent found, for each perceived market share, $mShare$, the best strategy vector, $\vec{sttg}$, to bid its GenUnits' energy blocks.
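The learning loop used in this experiment can be sketched as a tabular Q-learning step with ε-greedy action selection, using the reported values ε = 0.2, α = 0.01 and γ = 0.5. The TEMMAS learner is described as an R-Project server; the Python code below is only an illustrative stand-in, and the dictionary-based sparse table, the function names and the three-action subset are assumptions.

```python
import random

EPSILON, ALPHA, GAMMA = 0.2, 0.01, 0.5  # values reported for Experiment #1

def choose_action(q, state, actions, rng):
    """Epsilon-greedy: explore with probability EPSILON, else act greedily."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

def q_update(q, state, action, reward, next_state, actions):
    """One tabular Q-learning backup on a sparse (dict-based) table."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

# Sparse table: only updated (state, action) pairs are stored.
q = {}
actions = ["sttg1", "sttg2", "sttg4"]  # illustrative subset of strategies
rng = random.Random(0)
first_action = choose_action(q, 50, actions, rng)
q_update(q, state=50, action="sttg2", reward=100.0,
         next_state=60, actions=actions)
```

Here the state is the perceived market share and the reward stands for the daily profit; a full action would be a vector with one strategy per GenUnit rather than a single label.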

Table 4. The experiments' GenCos and GenUnits.

Exp.  GenCo name            Prod. capac. (MW)  GenUnits
#1    GenCo_active          875                CO & CC & GT
#2    GenCo_major           2000               2 × CO & 4 × CC
      GenCo_minor & active  875                3 × CC & 1 × GT
#3    GenCo_major & active  2000               2 × CO & 4 × CC
      GenCo_minor & active  875                3 × CC & 1 × GT

Fig. 5. The process of learning a bid policy to maximize profit [Exp. #1].


Experiment #2. The experiment sets a constant, 2000 MW, hourly demand for electricity. Fig. 7 shows the market share evolution while GenCo_minor&active learns to play in the market with GenCo_major, which is a larger company with a fixed strategy: "bid each block 5€ higher than its marginal cost".

GenCo_major has a production capacity of 2000 MW (cf. Table 4), thus being able to cover the whole electricity demand (2000 MW); GenCo_minor has a production capacity of 875 MW. Hence, the market has 2875 MW available to offer and a demand of 2000 MW. GenCo_major will always sell at least 2000 − 875 = 1125 MW, and GenCo_minor must search for a bidding policy that pushes some (at most 875 MW) of GenCo_major's electricity out of the market.

We see, from Fig. 7, that GenCo_minor&active takes around 18% (75 − 57) of the market from GenCo_major. To earn that share, GenCo_minor&active learnt to lower its prices in order to exploit the "5€ space" offered by GenCo_major's fixed strategy.
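The capacity arithmetic behind this scenario can be checked with a short sketch that derives each GenCo's capacity from the unit capacities of Table 1 and the portfolios of Table 4. The helper name and the dictionary layout are assumptions made for illustration.

```python
UNIT_CAPACITY_MW = {"CO": 500, "CC": 250, "GT": 125}  # from Table 1

def capacity(units):
    """Total capacity (MW) of a portfolio given as {unit_type: count}."""
    return sum(UNIT_CAPACITY_MW[kind] * count for kind, count in units.items())

major = capacity({"CO": 2, "CC": 4})   # GenCo_major portfolio (Table 4)
minor = capacity({"CC": 3, "GT": 1})   # GenCo_minor portfolio (Table 4)
demand = 2000

# Demand that only GenCo_major can serve, whatever GenCo_minor bids.
residual_for_major = max(0, demand - minor)
# System reserve: total capacity minus total demand.
system_reserve = major + minor - demand
```

The residual of 1125 MW is the quantity that GenCo_major sells under any bidding policy of its competitor, which is why it keeps the ability to set the market price.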

Fig. 6. The bid policy that maximizes profit (price ceiling is 180) [Exp. #1].


In this scenario the size of GenCo_major is enough for it to settle the market price that corresponds to the most expensive 1125 MW (as no other company can provide that electricity). GenCo_minor needs to bid its 875 MW as low as possible in order to displace, from the market, the corresponding bidding blocks of GenCo_major.

Fig. 8 shows that the profit of GenCo_major decreases (from around 600 K€ to 550 K€) while GenCo_minor learns how to bid in order to get a stable profit (around 60 K€). In conclusion, the fixed bidding strategy of GenCo_major (to "bid each block 5€ higher than its marginal cost") enabled GenCo_minor to acquire a valuable (60 K€) market share.

Experiment #3. In this experiment both GenCos are "active"; the remaining setup is the same as in experiment #2. Fig. 9 shows the market share oscillation while each company reacts to the other's strategy to win the market. Despite the competition, each company learns to secure its own fringe of the market. Fig. 10 shows that the profit of both companies follows the market share pattern and the bidding rationale is identical to that of the previous experiment (i.e., that of GenCo_minor in experiment #2).

The experiments illustrate how a one-company market (cf. experiment #1) will find its way to sell all its electricity at the same price as its most expensive energy (e.g., sell the cheapest CO energy at the price of the most expensive CC and GT energy). The results for the two-company market (cf. experiments #2 and #3) show that a big company, capable of supplying the whole demand, with a strategy to always settle the market price, will probably lose some of its market share to a smaller but aggressive price-cutting competitor. This may also suggest that smaller companies could explore alternative ways to increase their market share, such as attractive bilateral contracts with final consumers.

Fig. 7. Market share evolution induced by GenCo_minor & active [Exp. #2].

Fig. 8. Profit evolution induced by GenCo_minor & active [Exp. #2].

The experiments illustrate the behavior of one and two GenCo companies with a limited number of GenUnits. The experiments were specified using the TEMMAS simulator graphical interface, which enables the user to add/remove a GenCo, attach/detach a set of GenUnits to it, specify whether to use the learning capability and enable/disable the predefined bidding strategies. In this paper we explore the two-company market scenario because each GenCo always perceives (from the Pool) not only its private data but also the public data that aggregates the set of all other GenCos (cf. Section 4); each GenCo knows about itself and about “the rest of the world”, thus reducing (for simulation purposes) to a two-company market scenario. The TEMMAS interface also enables the user to add/remove a GenUnit and to specify its technical parameters (e.g., heat rate and marginal cost). At the current stage of TEMMAS, the number of GenUnits must be increased carefully because the GenCo state-action space grows exponentially with the number of GenUnits (the GenCo action space includes all possible combinations of strategies). Currently, the state-action space is explored with Q-learning and represented as a sparse matrix (in a file); scaling to a large number of GenUnits opens future TEMMAS research on techniques to explore and represent the GenCos' state-action space.
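The growth of the action space and the sparse representation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the TEMMAS code: the three per-unit strategy names, the class `SparseQ`, the string-valued states and the learning parameters are all hypothetical; only the Q-learning update rule itself is the standard one [11].

```python
from collections import defaultdict
from itertools import product
import random

STRATEGIES = ("hold", "raise", "lower")  # hypothetical per-unit strategies


def joint_actions(n_units, strategies=STRATEGIES):
    """All combinations of one strategy per generating unit."""
    return list(product(strategies, repeat=n_units))


class SparseQ:
    """Q-table that stores only the (state, action) pairs actually updated."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)      # unvisited pairs read as 0.0
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy selection over the joint actions."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update; .get() avoids creating new entries."""
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# With 3 strategies per unit the joint action space has 3**n elements.
print(len(joint_actions(1)), len(joint_actions(5)))  # 3 243

agent = SparseQ(joint_actions(2))
agent.update("s0", ("raise", "hold"), reward=10.0, next_state="s1")
print(len(agent.q))  # only the updated pair is stored
```

Note that the dictionary only keeps the storage sparse; the maximization in `choose` and `update` still enumerates every joint action, which is why a larger number of GenUnits calls for other exploration and representation techniques.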

7. Related work

The research on MABS for the electricity markets can be grouped in three main categories: (i) the “market design analysis”, (ii) the “modeling of the agents' decision-making processes”, and (iii) a mixture of the two previous categories.

The ultimate goal of the “market design analysis” research is to characterize a market phenomenon by means of simple behavioral correlations among the market (human) agents. A pioneering work on the market design analysis, by Day and Bunn [20], simulates a uniform price market clearing model where generation companies are daily profit maximizers who assume that competitors bid the same supply function as they did on the previous day. The model is explored in a case study, by Bunn and Oliveira [21,22], to analyze whether two generation companies (in the United Kingdom) were capable of increasing their profits by manipulating market prices above marginal cost; the simulation showed that in order to profitably manipulate prices both companies would have to act together. Another approach, by Visudhiphan and Ilic [23], describes the electricity market as a dynamic bidding game, comprising different time scales, where the agent-based simulation would make it possible to distinguish situations in which market power has been exerted from situations where effects of technical constraints might have raised market prices.

Fig. 9. Market share evolution induced by both GenCos [Exp. #3].

Fig. 10. Profit evolution induced by both GenCos [Exp. #3].

The early research on the “modeling of the agents' decision-making processes” from Richter and Sheblé [24] describes the use of a genetic algorithm to optimize multiple bidding rounds for a one-time period of electricity deliveries where all generators are characterized by the same generation cost curve. The auction takes place and is repeated as long as there is electricity to buy or to sell; the genetic method selects, at each cycle, the most profitable parents and replaces the least successful half of the population. Another case study, by Koesrindartoto [25], simulates buyers and sellers bidding on a double-auction market to analyze the impact, on the market result, of agent learning of market outcomes for bidding at marginal cost or revenue. The work from Scheidt and Sebastian [26], on the German electricity sector, simulates a spot market as a double-auction with a uniform price settlement market clearing; the approach includes non-agent tools for importing data (from external providers) into the simulation in order for agents to generate load and price forecasts.
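The selection scheme described for the genetic method (keep the most profitable parents, replace the least successful half) can be sketched as below. This is only a generic illustration and not the algorithm of [24]: encoding each individual as a single bid price, the averaging crossover, the Gaussian mutation and the toy `profit` function (with its cost, clearing price and quantity) are all assumptions made for the example.

```python
import random


def profit(bid_price, cost=30.0, clearing_price=50.0, quantity=100.0):
    """Toy fitness: a bid above the clearing price sells nothing."""
    if bid_price > clearing_price:
        return 0.0
    return (bid_price - cost) * quantity


def next_generation(population, fitness, rng, mutation_scale=1.0):
    """Keep the more profitable half and refill with mutated offspring.

    Needs at least four individuals so that two distinct parents
    can be drawn from the surviving half.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: len(ranked) // 2]
    children = []
    while len(survivors) + len(children) < len(population):
        mother, father = rng.sample(survivors, 2)
        # Averaging crossover followed by a small Gaussian mutation.
        children.append(0.5 * (mother + father) + rng.gauss(0.0, mutation_scale))
    return survivors + children


rng = random.Random(1)
population = [20.0, 35.0, 45.0, 60.0, 48.0, 10.0]   # hypothetical bid prices
new_population = next_generation(population, profit, rng)
print(new_population[:3])   # the surviving, most profitable half
```

With these assumed values the survivors are the three bids closest to, but not above, the clearing price, and the three least profitable bids are replaced by offspring of those survivors.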

The approaches that embrace both the “market design analysis” and the “modeling of the agents' decision-making processes” are usually described as market simulation frameworks. The AMES (Agent-based Modeling of Electricity Systems) [27] is targeted at small and medium markets and is able to use learning agents (with the Variant Roth-Erev algorithm [28]). The (human) user can choose whether to use the learning agents but cannot choose (nor define) any bidding strategies. The AMES is implemented in Repast (Recursive Porous Agent Simulation Toolkit) [29], which does not support the mapping from concrete institutional relations into virtual agents with goals, communicative needs and behavioral capabilities. Another simulator is the PowerWeb [30], which assumes a fixed demand and in which each bidding company can only own a single generating unit. The bidder agent can either define a price-quantity energy block or choose from a set of predefined available strategies (e.g., bid all at zero cost, bid all at marginal cost, increase the cost of previous day sold blocks, decrease the cost of previous day non-sold blocks, bid randomly); the system provides Web-based interaction but only supports human sellers, i.e., there are no autonomous bidder agents. The most widely adopted simulator is the EMCAS (Electricity Market Complex Adaptive System) [31] commercial system, implemented in Repast, which incorporates spot and bilateral markets and simulates different levels of reserve for the grid regulation. The demand is represented by consumer agents who are supplied by demand companies; consumer agents can switch their supplier or decrease their electricity demand. The supply side is represented by generation companies, which own generators representing power plants and decide on bidding strategies. Agents are modeled as maximizers of a multi-objective utility function which includes risk preferences and other aspects such as profit and market share; the goals are represented by a minimum and maximum expected value, and a risk preference. The EMCAS generates a price forecast based on data imported from external providers (electric system and historical prices); this information is used to calculate the expected utility of a given strategy.

We remark that the commercial EMCAS system is currently being used by EDP (“Electricidade de Portugal”) to analyze the Iberian Electricity Market (MIBEL) [32].

Our TEMMAS simulator is designed to embrace both the “market design analysis” and the “modeling of the agents' decision-making processes”. TEMMAS supports multiple generator companies, multiple power plants (generating units) with different technologies and distinct ways of computing marginal costs. The user is given the capability to configure the decision-making agents (specify the bidding strategies or choose from a set of predefined ones); the decision-making agents may also be configured as learning or non-learning agents. The system is being specified (from the beginning) as a multi-agent environment [33], modeled with INGENIAS, implemented with JADE [34] (using “INGENIAS to JADE” transformations and additional Java coding) and with R-Project [19] for the statistical computations and the generation of graphics. Although TEMMAS is currently in a preliminary stage, its architectural and design decisions are firmly grounded in the MABS field and the initial results are an incentive to further extend the already implemented system. We intend to extend TEMMAS taking into account the particularities of the MIBEL market (e.g., the network congestion and its relation with the market-splitting procedure).

8. Conclusions and future work

This paper describes our preliminary work in the construction of a MABS framework to describe and analyze the macro-scale dynamics of the electric power market. Although both research fields (MABS and market simulation) have achieved considerable progress, there is a lack of cross-cutting approaches. We used the proposed MABS framework to support the construction of the TEMMAS agent-based electricity market simulator following the INGENIAS methodology and its high level design constructs. Hence, our contribution is twofold: (i) a comprehensive formulation of MABS, including the simulated environment and the inhabiting decision-making and learning agents, and (ii) a simulation model (TEMMAS) of the electric power market framed in the proposed formulation. TEMMAS was built in the course of a cooperation project with EDP, “Electricidade de Portugal” (Portuguese Electricity Company), and our initial results reveal an emerging and coherent market behavior, thus encouraging us to further extend the system with additional bidding strategies and to incorporate specific market rules, such as congestion management, pricing regulation and the Iberian network congestion constraints.

References

[1] EC, Eur-lex.europa.eu. Available from: <http://eur-lex.europa.eu/>.
[2] C. Berry, B. Hobbs, W. Meroney, R. O’Neill, W. Stewart Jr., Understanding how market power can arise in network competition: a game theoretic approach, Utilities Policy 8 (3) (1999) 139–158.
[3] S. Gabriel, J. Zhuang, S. Kiet, A Nash-Cournot model for the North American natural gas market, in: Proceedings of the 6th IAEE European Conference: Modelling in Energy Economics and Policy, 2004.
[4] S. Schuster, N. Gilbert, Simulating online business models, in: Proceedings of the 5th Workshop on Agent-Based Simulation (ABS-04), 2004, pp. 55–61.
[5] A. Helleboogh, G. Vizzari, A. Uhrmacher, F. Michel, Modeling dynamic environments in multi-agent simulation, JAAMAS 14 (1) (2007) 87–116.
[6] C. Boutilier, R. Dearden, M. Goldszmidt, Exploiting structure in policy construction, in: Proceedings of the IJCAI-95, 1995, pp. 1104–1111.
[7] A. Clark, Being there: putting brain, body, and world together again, MIT Press, Cambridge, MA, 1998.
[8] A. Rao, M. Georgeff, BDI agents: from theory to practice, in: Proc. of the First International Conference on Multiagent Systems, S, 1995, pp. 312–319.
[9] G. Simari, S. Parsons, On the relationship between MDPs and the BDI architecture, in: Proceedings of the AAMAS-06, 2006, pp. 1041–1048.
[10] P. Trigo, H. Coelho, Decision making with hybrid models: the case of collective and individual motivations, International Journal of Reasoning-Based Intelligent Systems (IJRIS) 2 (1) (2010) 60–72.
[11] C. Watkins, P. Dayan, Q-learning, Mach. Learning 8 (1992) 279–292.
[12] R. Sutton, A. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
[13] OMIP, Iberian Electricity Market Operator. Available from: <http://www.omip.pt>.
[14] A. Botterud, P. Thimmapuram, M. Yamakado, Simulating GenCo bidding strategies in electricity markets with an agent-based model, in: Proceedings of the 7th Annual IAEE European Energy Conf. (IAEE-05), 2005.
[15] J. Sousa, J. Lagarto, How market players adjusted their strategic behaviour taking into account the CO2 emission costs – an application to the Spanish electricity market, in: Proceedings of the 4th International Conference on the European Electricity Market (EEM-07), Cracow, Poland, 2007.
[16] J. Gómez-Sanz, R. Fuentes-Fernández, J. Pavón, I. García-Magariño, INGENIAS development kit: a visual multi-agent system development environment (BEST ACADEMIC DEMO OF AAMAS’08), in: Proceedings of the 7th AAMAS, Estoril, Portugal, 2008, pp. 1675–1676.
[17] J. Gómez-Sanz, J. Pavon, Methodologies for developing multi-agent systems, Journal of Universal Computer Science 10 (4) (2004) 359–374.
[18] J. Pavon, J.J. Gomez-Sanz, R. Fuentes, The INGENIAS methodology and tools, in: Agent-Oriented Methodologies, Idea Group Publishing, 2000, pp. 236–276.
[19] W.N. Venables, D.M. Smith, An Introduction to R, Network Theory Ltd. – publishing free software manuals, 2009. Available from: <http://www.network-theory.co.uk/R/intro/>.
[20] C.J. Day, D.W. Bunn, Divestiture of generation assets in the electricity pool of England and Wales: a computational approach to analyzing market power, Journal of Regulatory Economics 19 (2) (2001) 123–141.
[21] D.W. Bunn, F.S. Oliveira, Agent-based simulation: an application to the new electricity arrangements on England and Wales, in: TEC on Evolutionary Computing – special issue: Agent Based Computational Economics, vol. 5, IEEE, 2001, pp. 493–503.
[22] D.W. Bunn, F.S. Oliveira, Evaluating individual market power in electricity markets via agent-based simulation, Annals of Operations Research 121 (2003) 57–78.
[23] P. Visudhiphan, M. Ilic, On the necessity of an agent-based approach to assessing market power in the electricity markets, in: Proceedings of the International Symposium on Dynamic Games and Applications, Saint-Petersburg, Russia, 2002.
[24] C.W. Richter, G.B. Sheblé, Genetic algorithm evolution of utility bidding strategies for the competitive market price, in: IEEE Transactions on Power Systems, vol. 13, IEEE, 1998, pp. 256–261.
[25] D. Koesrindartoto, Discrete double auctions with artificial adaptive agents: a case study of an electricity market using a double auction simulator, Tech. Rep. W02005, Iowa Univ., Dept. of Economics, 2002.
[26] M. Scheidt, H.J. Sebastian, Simulating day-ahead trading in electricity markets with agents, in: Proceedings of the 2nd Asia-Pacific Conference of International Agent Technology (IAT-2001).
[27] D. Koesrindartoto, J. Sun, L. Tesfatsion, An agent-based computational laboratory for testing the economic reliability of wholesale power market designs, in: Proceedings of the IEEE Power Engineering Conference, San Francisco, 2005, pp. 931–936.
[28] A.E. Roth, I. Erev, Learning in extensive-form games: experimental data and simple dynamic models in the intermediate term, Games and Economic Behavior 8 (1995) 164–212.
[29] M. North, N.T. Collier, J.R. Vos, Experiences creating three implementations of the repast agent modeling toolkit, in: ACM Transactions on Modeling and Computer Simulation, vol. 16, ACM, 2006, pp. 1–25.
[30] R.D. Zimmerman, R.J. Thomas, D. Gan, C. Murillo-Sánchez, A Web-based platform for experimental investigation of electric power auctions, in: Decision Support Systems, vol. 24 of special issue on Restructuring the Electric Power Business – A New Paradigm for Reducing Regulation, Elsevier Science, Amsterdam, 1999, pp. 193–205.
[31] G. Conzelmann, M. North, G. Boyd, R. Cirillo, V. Koritarov, C. Macal, P. Thimmapuram, T. Veselka, Agent-based power market modeling: simulating strategic market behavior using an agent-based modeling approach, in: Proceedings of the 6th IAEE European Conference on Modeling in Energy Economics and Policy, Zurich, 2004.
[32] P. Thimmapuram, T. Veselka, S. Vilela, R. Pereira, R. Silva, Modeling hydro power plants in deregulated electricity markets: integration and application of EMCAS to VALORAGUA, in: Proceedings of the 4th International Conference on the European Electricity Market (EEM-08), 2008.
[33] P. Trigo, H. Coelho, Simulating a multi-agent electricity market, in: Proceedings of the 1st Brazilian Workshop on Social Simulation (BWSS-08/SBIA-08), Bahia, Brazil, 2008.
[34] F.L. Bellifemine, G. Caire, D. Greenwood, Developing Multi-Agent Systems with JADE, Wiley Series in Agent Technology, Wiley, New York, 2007.