
Knowledge Discovery in the Stock Market

Supervised and Unsupervised Learning with BayesiaLab

Stefan Conrady, [email protected]

Dr. Lionel Jouffe, [email protected]

June 29, 2011

Conrady Applied Science, LLC - Bayesia’s North American Partner for Sales and Consulting

Table of Contents

Tutorial

Highlights
Background & Objective
Notation
Dataset
Data Preparation and Transformation
Data Import
Determining Discretization Intervals
Modeling Mode
Unsupervised Learning
Bayesian Network versus Correlation Matrix
Inference with Bayesian Networks
Inference with Hard Evidence
Inference with Soft Evidence
Bayesian Network Metrics
Arc Force
Mutual Information
Correlation
Summary - Unsupervised Learning
Supervised Learning
Inference with Supervised Learning
Adaptive Questionnaire
Summary - Supervised Learning

Appendix

Markov Blanket
Bayes' Theorem
About the Authors
Stefan Conrady
Lionel Jouffe
Contact Information
Conrady Applied Science, LLC
Bayesia S.A.S.
Copyright

Tutorial

Highlights

• Unsupervised Learning with BayesiaLab can rapidly generate plausible structures of unfamiliar problem domains, as illustrated in this paper with examples from the U.S. stock market.

• Supervised Learning with BayesiaLab delivers reliable models in high-dimensional domains, providing both powerful predictive performance and a platform for simulating domain dynamics.

• Knowledge representation with Bayesian networks is highly intuitive and effectively provides computable knowledge that allows inference and reasoning under uncertainty.

Background & Objective

Perhaps more than any other kind of time series data, financial markets have been scrutinized by countless mathematicians, economists, investors and speculators over hundreds of years. Even in modern times, despite all scientific advances, the effort of predicting future movements of the stock market sometimes still bears resemblance to the ancient alchemistic aspirations of turning base metals into gold. That is not to say that there is no genuine scientific effort in studying financial markets, but distinguishing serious research from charlatanism (or even fraud) remains remarkably difficult.

We neither aspire to develop a crystal ball for investors nor do we expect to contribute to the economic and econometric literature. However, we find the wealth of data in the financial markets to be fertile ground for experimenting with knowledge discovery algorithms and for generating knowledge representations in the form of Bayesian networks. This area can perhaps serve as a very practical proof of the powerful properties of Bayesian networks, as we can quickly compare machine-learned findings with our own understanding of market dynamics. For instance, the prevailing opinions among investors regarding the relationships between major stocks should be reflected in any structure that is to be discovered by our algorithms.

More specifically, we will utilize the unsupervised and supervised learning algorithms of the BayesiaLab software package to automatically generate Bayesian networks from daily stock returns over a six-year period. We will examine 459 stocks from the S&P 500 index, for which observations are available over the entire timeframe. We selected the S&P 500 as the basis for our study, as the companies listed on this index are presumably among the best-known corporations worldwide, so even a casual observer should be able to critically review the machine-learned findings. In other words, we are trying to machine-learn the obvious, as any mistakes in this process would automatically become self-evident.

Quite often experts' reaction to such machine-learned findings is, "well, we already knew that." That is the very point we want to make, as machine learning can — within seconds — catch up with human expertise accumulated over years, and then rapidly expand beyond what is already known.

The power of such algorithmic learning will be still more apparent in entirely unknown domains. However, if we were to machine-learn the structure of a foreign equity market for expository purposes in this paper, chances are that many readers would not immediately be able to judge the resulting structure as plausible or not.


In addition to generating human-readable and interpretable structures, we want to illustrate how we can immediately use machine-learned Bayesian networks as "computable knowledge" for automated inference and prediction. Our objective is to gain both a qualitative and quantitative understanding of the stock market by using Bayesian networks. In the quantitative context, we will also show how BayesiaLab can reliably carry out inference with multiple pieces of uncertain and even conflicting evidence. The inherent ability of Bayesian networks to perform computations under uncertainty makes them highly suitable for a wide range of real-world applications.

Continuing the practice established in our previous white papers, we attempt to present the proposed approach in the style of a tutorial, so that each step can be immediately replicated (and scrutinized) by any reader equipped with the BayesiaLab software.1 This reflects our desire to establish a high degree of transparency regarding all proposed methods and to minimize the risk of Bayesian networks being perceived as a black-box technology.

Notation

To clearly distinguish between natural language, software-specific functions and example-specific variable names, the following notation is used:

• Bayesian network and BayesiaLab-specific functions, keywords, commands, etc., are capitalized and shown in bold type.

• Names of attributes, variables, and nodes are italicized.


1 The preprocessed dataset with daily return data is available for download from our website: www.conradyscience.com/white_papers/financial/SP500_v6_dlog_b.csv

Page 6: Knowledge Discovery in Stock Market

Dataset

The S&P 500 is a free-float capitalization-weighted index of the prices of 500 large-cap common stocks actively traded in the United States, which has been published since 1957. The stocks included in the S&P 500 are those of large publicly held companies that trade on either of the two largest American stock market exchanges: the New York Stock Exchange and the NASDAQ. For our case study we have tracked the daily closing prices of all stocks included in the S&P 500 index from January 3, 2005 through December 30, 2010, only excluding those stocks which were not traded continuously over the entire study period. This leaves a total of 459 stock prices with 1,510 observations each.

[Figure: daily closing prices of the first 36 stocks in alphabetical order, A through APH, over the study period.]

Data Preparation and Transformation

Rather than treating the time series in levels, we will difference the stock prices and compute the daily returns. More specifically, we will take differences of the logarithms of the levels, which is a good approximation of the daily stock return in percentage terms. After this transformation, 1,509 observations remain and a selection of the first 36 stocks (in alphabetical order) is shown below.

[Figure: daily log returns of the first 36 stocks, A through APH, after the transformation.]
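For readers who want to reproduce this step outside of BayesiaLab, the transformation is a one-liner in most environments. Below is a minimal sketch in Python, assuming a hypothetical input file of closing prices with one column per ticker; the output corresponds to the preprocessed dataset referenced in footnote 1.

```python
import numpy as np
import pandas as pd

# Hypothetical input file: daily closing prices, one column per ticker.
prices = pd.read_csv("SP500_closing_prices.csv")

# Difference of the logarithms of the levels, log(p_t) - log(p_t-1),
# which approximates the daily return in percentage terms.
returns = np.log(prices).diff().dropna()

# 1,510 price observations per stock yield 1,509 daily returns.
returns.to_csv("SP500_v6_dlog_b.csv", index=False)
```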

Data Import

We use BayesiaLab's Data Import Wizard to load all 459 time series2 into memory from a comma-separated file. BayesiaLab automatically detects the column headers, which contain the ticker symbols3 as variable names.

The next step identifies the data types contained in the dataset and, as expected, BayesiaLab finds 459 continuous variables.


2 Although the dataset has a temporal ordering, for expository simplicity we will treat each time interval as an independent observation.

3 A ticker symbol is a short abbreviation used to uniquely identify publicly traded stocks.


There are no missing values in the dataset and we do not want to filter out any observations, so the next screen of the Data Import Wizard can be skipped entirely.

The next step, however, is critical. As part of every data import process into BayesiaLab we must discretize any continuous variables, which means all 459 variables in our particular case.

BayesiaLab offers a number of algorithms to automatically discretize the continuous variables and one of the most practical ones, for subsequent Unsupervised Learning, is the K-Means algorithm. It provides a very quick way to capture the salient characteristics of probability density curves and creates suitable thresholds for binning purposes.
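To make the idea concrete, the sketch below derives five-state bin thresholds from a one-dimensional K-Means fit with scikit-learn. Placing the boundaries midway between the sorted cluster centers is one plausible convention; it is meant only as an illustration of the principle, not as BayesiaLab's actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_thresholds(x, k=5, seed=0):
    """Illustrative 1-D K-Means discretization: fit k clusters to a
    return series and place bin boundaries midway between the sorted
    cluster centers (one plausible convention)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    km.fit(x.reshape(-1, 1))
    centers = np.sort(km.cluster_centers_.ravel())
    return (centers[:-1] + centers[1:]) / 2  # k-1 interior thresholds

# Stand-in for one stock's 1,509 daily returns.
x = np.random.default_rng(0).normal(0.0, 0.02, 1509)
print(kmeans_thresholds(x, k=5))
```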

Determining Discretization Intervals

Analyst judgement is required, though, for choosing an appropriate number of intervals. A common heuristic found in the statistical literature is five observations per parameter. We adapt this as a guide for the minimum number of observations required for each cell in any of the yet-to-be-learned Conditional Probability Tables (CPT).

In our particular case we already know that we will initially perform Unsupervised Learning with the Maximum Weight Spanning Tree algorithm. This tree structure implies that each Node will have only one parent, which, in turn, means that each CPT will have a size determined by the number of parent states times the number of child states. Choosing five intervals for the discretization process would thus mean a CPT size of 25 cells.4

With a uniform distribution of the states this would suggest that we have approximately 60 observations per cell, which would clearly be more than enough. However, upon visual inspection of the actual distributions of the variables, the uniform distribution assumption definitely does not hold. The graph below shows the distribution of variable AA:


4 Other learning algorithms do not have this one-parent constraint and, for instance, a five-interval discretization with three parents per node would generate CPTs consisting of 625 cells. Even when assuming uniform distributions, the available observations would be insufficient for estimation purposes.


Rather, looking at this graph, it may be more appropriate to assume a normal distribution.5 Given that each Node will have one parent, we would perhaps further assume a bivariate normal distribution for the joint distribution of each pair of Nodes. We need to emphasize that we are not attempting to fit distributions per se, but rather that we are trying to find a heuristic that allows us to establish the minimum number of observations needed to characterize the tail ends of the distributions.

An assumed bivariate normal distribution would yield a discrete probability density function similar to what is shown in the table below. In other words, this is what we would expect the Conditional Probability Table (CPT) to approximately look like, once we have discretized the states and learned the CPT from the actual occurrences. However, we have not yet discretized the states, let alone estimated the CPT. Actually, we have not really determined how many discretization levels are correct. So, it is a catch-22 and hence the need for a heuristic.

Our heuristic is that we use our qualitative understanding of the distributions to determine a reasonable number of intervals that provides a minimum number of samples for the tails. More formally, the "thinnest tail" is the minimal local joint probability (MLJP). Assuming 5 states for parent and child each, and with a total of 1,509 observations, this would translate into approximately 4 observations for the MLJP (highlighted in red).

!" !# $ # "!" !"#$% &"'&% #"&(% &"'&% !"#$%!# &"'&% (")(% $"*(% (")(% &"'&%$ #"&(% $"*(% &("$#% $"*(% #"&(%# &"'&% (")(% $"*(% (")(% &"'&%" !"#$% &"'&% #"&(% &"'&% !"#$%

&(!$ +,-./012345-

!" !# $ # "!" 6 #! '' #! 6!# #! )) &6* )) #!$ '' &6* #6! &6* ''# #! )) &6* )) #!" 6 #! '' #! 6

789 :212.-;4<;7=3>?;@4?.-

:212.-;4<;81/.52;@4?.-

:212.-;4<;7=3>?;@4?.-

:212.-;4<;81/.52;@4?.-

789
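Because the original table did not survive extraction, a table of this kind can be regenerated under explicit assumptions. The sketch below computes the joint cell probabilities of a standard bivariate normal for hypothetical cut points and a hypothetical correlation, then scales them to expected counts out of 1,509 observations; the actual cut points and correlation in the study may differ.

```python
import numpy as np
from scipy.stats import multivariate_normal

def cell_probs(cuts, rho):
    """Joint probability of each (parent, child) cell for a standard
    bivariate normal with correlation rho, given shared cut points."""
    edges = np.concatenate(([-30.0], cuts, [30.0]))  # +/-30 stands in for infinity
    mvn = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])
    F = lambda a, b: mvn.cdf([a, b])
    k = len(edges) - 1
    P = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            # Rectangle probability via inclusion-exclusion on the CDF.
            P[i, j] = (F(edges[i + 1], edges[j + 1]) - F(edges[i], edges[j + 1])
                       - F(edges[i + 1], edges[j]) + F(edges[i], edges[j]))
    return P

# Hypothetical cut points (in standardized units) and correlation.
P = cell_probs(cuts=[-1.5, -0.5, 0.5, 1.5], rho=0.5)
print(np.round(P * 1509))  # expected observation counts per CPT cell
```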

Although the number of expected samples for the MLJP appears to be below the recommended minimum, we will for now proceed on this basis and set the number of intervals to 5. Only upon completion of the discretization, and after learning the network including the CPTs, will we know for sure whether this was indeed a reasonable assumption or not.


5 We omit plotting the distributions of all variables, but all the variables' distributions do indeed resemble the normal distribution.


Clicking Finish will now perform the discretization. A progress bar will be shown to track the state of this process.

Modeling Mode

Upon conclusion, the variables are delivered as blue Nodes into the Graph Panel of BayesiaLab and by default we are now in the Modeling Mode. The original variable names, which were stored in the first line of the database, become our Node Names.


At this point it is practical to add Node Comments to the Node Names. Node Comments are typically used in BayesiaLab for longer and more descriptive titles, which can be turned on or off, depending on the desired view of the graph. Here, we associate a dictionary of the complete company names with the Node Comments, while the more compact ticker symbols remain as Node Names.6

The syntax for this association is rather straightforward: we simply define a text file which includes one Node Name per line. Each Node Name is followed by the equal sign ("="), or alternatively TAB or SPACE, and then by the full company name, which will serve as the Node Comment.
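For example, the first few lines of such a dictionary file might read as follows (entries shown for illustration):

```
PG=Procter & Gamble Co.
JNJ=Johnson & Johnson
KMB=Kimberly-Clark Corp.
```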

This file can then be loaded into BayesiaLab via Data>Associate Dictionary>Node>Comments.

Once the comments are loaded, a small call-out symbol will appear next to each Node Name. This indicates that Node Comments are available for display.


6 To maintain a compact presentation, we will typically use the ticker symbol when referencing a particular stock rather than the full company name.


As the name implies, selecting View>Display Node Comments (or alternatively the keyboard shortcut "M") will reveal the company names.


Node Comments can be displayed for either all Nodes or only for selected ones.

Before proceeding with the first learning step, it is also recommended to briefly switch into the Validation Mode (F5) and to check the distributions of the states of the Nodes. The Monitors of the first nine Nodes are shown below. At first glance, the distributions appear to be plausible representations of the historical return distributions.


Unsupervised Learning

To perform the first Unsupervised Learning algorithm on our dataset, we switch back into Modeling Mode (F4) and select Learning>Association Discovering>Maximum Spanning Tree.7 This starts the Maximum Weight Spanning Tree algorithm, which is the fastest of the Unsupervised Learning algorithms and thus recommended at the beginning of most studies.8 As the name implies, this algorithm generates a tree structure, i.e. it permits only one parent per Node. This constraint is one of the reasons for the extreme learning speed of this algorithm.9 Performing the algorithm with a file of this size should only take a few seconds.


7 In BayesiaLab nomenclature, Unsupervised Learning is listed in the Learning menu as "Association Discovering".

8 Several other Unsupervised Learning algorithms are available in BayesiaLab, including Taboo, EQ, SopLEQ and Taboo Order.

9 It goes beyond the scope of this tutorial to discuss the different types of learning algorithms and their specific properties.


At first glance, however, the resulting network does not appear simple and tree-like at all.

This can be quickly resolved with BayesiaLab's built-in layout algorithms. Selecting View>Automatic Layout (shortcut "P") rearranges the network instantly to reveal a much more intuitive structure.


The resulting, reformatted Bayesian network representing the stock returns can now be read and interpreted immediately.10 11

For instance, we can zoom into the branch of the Bayesian network which contains Procter & Gamble (PG). BayesiaLab offers a search function (shortcut Ctrl-F or ⌘-F), which helps find individual nodes very easily.


10 A separate, high-resolution PDF of this Bayesian network can be downloaded here: www.conradyscience.com/white_papers/financial/SP500_V13.pdf. This allows those readers without an active BayesiaLab installation to explore the network graph in much greater detail.

11 For expositional clarity we have only learned contemporaneous relationships and, as a result, potential lag structures will not appear in this network. However, in BayesiaLab, Unsupervised Learning can be generalized to a temporal application. A white paper specifically focusing on learning temporal (or dynamic) Bayesian networks is planned for the near future.


The neighborhood of Procter & Gamble contains many familiar company names, mostly from the CPG industry.12 Perhaps these companies appear all-too-obvious and the reader may wonder what insight is gained at this point. Chances are that even a casual observer of the industry would have mentioned Kimberly-Clark, Colgate-Palmolive and Johnson & Johnson as businesses operating in the same field as Procter & Gamble, which would therefore presumably have somewhat related stock price movements.

The key point is that without any prior knowledge of this domain a computer algorithm automatically extracted this structure, i.e. a Bayesian network, which intuitively matches the understanding that we have established over years as consumers of these companies' products.

Clearly, if this were an unfamiliar domain, the knowledge gain for the reader would be far greater. However, a lesser-known domain would presumably prevent the reader's intuitive verification of the machine-discovered structure here.


12 CPG stands for Consumer Packaged Goods.


Bayesian Network versus Correlation Matrix

The benefit of the concise representation as a Bayesian network is further demonstrated by juxtaposing it to a correlation matrix, which would perhaps be the first step in a traditional statistical analysis of this domain. Even when using heat map-style color-coding, the sheer number of relationships13 makes an immediate visual interpretation of the correlation matrix very difficult (see the subset of 25 by 25 cells from the correlation matrix below).

[Table: subset of 25 × 25 cells from the correlation matrix of daily returns, tickers A through AMZN, color-coded as a heat map.]

Admittedly, there are a number of statistical techniques available which can help in this situation, but the point is that generating a Bayesian network (e.g. with the Maximum Weight Spanning Tree algorithm we used) takes the practitioner about the same amount of time as computing a correlation matrix, yet the former yields a much richer picture.

Beyond visual interpretability, there is another key distinction between these two representations. Whereas the correlation matrix is merely descriptive, the Bayesian network is actually computable. By its very nature, any Bayesian network is a functioning model. On the other hand, with the correlation matrix one could not predict the value of one stock given the observation of several others. For this purpose, we would have to fit and estimate specific models, e.g. a regression. In a Bayesian network, however, we can use the graph of the Bayesian network itself for computing inference. For instance, given that we observe the values of JNJ and CL, we immediately obtain an updated value for PG and, at the same time, also updated values for all other Nodes in the network. We refer to this property as omnidirectional inference, which reflects the updating of beliefs given evidence according to Bayes' Rule.14 We shall illustrate carrying out omnidirectional inference in the next section.

Inference with Bayesian Networks

We have shown that the Maximum Weight Spanning Tree algorithm can generate a readily interpretable and fully computable Bayesian network from daily stock return data. However, we have not yet explained in detail what this structure represents specifically.

Each Arc in this structure represents a probabilistic relationship between a pair of Nodes. The parameters15 of these relationships are encoded in Conditional Probability Tables. In the example of the PG and JNJ relationship shown below, the table defines the probabilities of the states of PG, given the states of JNJ. This table can be accessed in the Modeling Mode by simply double-clicking on the desired Node, which opens up the Node Editor.


13 (459² − 459) / 2 = 105,111

14 See appendix for a brief summary of Bayes' Theorem.

15 We use the term "parameter" rather loosely in this context, as Bayesian networks are entirely nonparametric models in BayesiaLab.


For clarity, we show the relevant portion of the network for JNJ and PG below plus an enlarged version of the conditional probability table from the Node Editor:

This says, among other things, given that we observe a JNJ return greater than 1.2%, there would be a 50.9% probability that we would observe a PG return of greater than 1.2% (see bottom right cell in the above table). More formally we can also write, P(PG > 0.012 | JNJ > 0.012) = 50.9%.

The upper left cell says, given that we observe a JNJ return smaller than -0.9%, there is a 46.5% probability that we will observe a PG return smaller than -1.3%, i.e. P(PG <= -0.013 | JNJ <= -0.009) = 46.5%.16

If we follow the network "downstream," i.e. from PG to KMB, we see that their relationship is quantified in yet another conditional probability table.


16 As the discretization intervals were generated by the K-Means algorithm, the bins do not necessarily have the same interval size, which we see in this example.


This can be interpreted in the same way: given that we observe a return of PG greater than 1.2%, there is a 42.4% probability that we would also observe a KMB return higher than 1.2%. This kind of inference is perhaps the simplest type, as we can directly read the table, i.e. "given this, then that."

Inference with Hard Evidence

Beyond reviewing the conditional probability tables directly in Modeling Mode in the Node Editor, as above, we can carry out inference conveniently in the Validation Mode (shortcut F5) of BayesiaLab.

This allows setting evidence and observing inference directly via the Monitors in the Monitor Panel (right side of screenshot). We will now highlight JNJ and PG and focus on their Monitors only. Prior to setting any evidence, we will simply see their marginal distributions in the Monitors. As we would expect, we see the returns distributed around 0 and the expected value of the returns is 0.


Observing a specific state of a Node is equivalent to setting evidence and we can do that directly on the histograms inside the Monitors. For instance, we can double-click on the state JNJ > 0.012, which sets it to a 100% probability, as indicated by the green bar. Setting such evidence will automatically propagate this evidence throughout the network and we can immediately observe the new distribution of PG. The gray arrows indicate how the distributions have changed compared to before setting evidence.

So far, this provides no more insight than what we could read from the Conditional Probability Table in the Node Editor of the PG Node. What is not readily accessible from the CPT, however, is the inverse probability, obtained by carrying out inference in the opposite direction of the Arc, i.e. setting evidence on PG and computing JNJ. Bayes' Rule specifies the necessary computation in this case.17


17 See appendix for more details about Bayes' Rule. Although this calculation is straightforward, application errors are unfortunately commonplace. The error is so common that it is now widely known as the Prosecutor's Fallacy. In a recent white paper, Paradoxes and Fallacies, we dedicated a chapter to this problem: www.conradyscience.com/index.php/paradoxes


In BayesiaLab the inference computation of JNJ is automatic once we set evidence to PG. To illustrate this, we arbitrarily set the PG return to <= -1.3% and we can immediately see the updated distribution of JNJ.

So far, this could have been computed quite easily by directly applying Bayes' Rule. It becomes a bit more challenging when we look at more than two Nodes at the same time. This time we will examine JNJ, PG and KMB (their relevant subnetwork is shown for reference below).

Once again, prior to setting any evidence, the Monitors show the marginal distributions of JNJ, PG and KMB.


Upon setting JNJ > 0.012, we can now see how the evidence not only propagates to PG, but also further "downstream" to KMB:

We can also invert the chain of inference by simply setting evidence at the other end of the network, e.g. KMB > 0.012:


Or, we can set evidence on both ends, i.e. on JNJ and KMB, and then read the inference in the middle, for PG.

This inference will probably not surprise us: we now have an 80% probability that PG will have a return greater than 1.2%, given that we set both JNJ and KMB to > 0.012.
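Readers who wish to verify this kind of propagation by hand can do so with a few lines of code. The sketch below performs exact inference on a miniature two-state version of the JNJ → PG → KMB chain by enumerating the joint distribution; the marginal and the two CPTs are made-up stand-ins, not the values learned in this study.

```python
import numpy as np

# Two-state stand-ins (0 = "down", 1 = "up"); numbers are illustrative.
p_jnj = np.array([0.5, 0.5])              # P(JNJ)
cpt_pg = np.array([[0.7, 0.3],            # P(PG | JNJ), rows = JNJ states
                   [0.3, 0.7]])
cpt_kmb = np.array([[0.65, 0.35],         # P(KMB | PG), rows = PG states
                    [0.35, 0.65]])

# Full joint P(JNJ, PG, KMB) from the chain factorization.
joint = p_jnj[:, None, None] * cpt_pg[:, :, None] * cpt_kmb[None, :, :]

# Evidence on both ends (JNJ = up, KMB = up); read off the middle node.
s = joint[1, :, 1]
print("P(PG | JNJ=up, KMB=up) =", s / s.sum())

# Backward inference works identically: evidence on KMB only.
s = joint[:, :, 1].sum(axis=1)
print("P(JNJ | KMB=up) =", s / s.sum())
```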

Inference with Soft Evidence

We are not limited to only setting "hard evidence," as we did above. In the real world, observations often provide "soft evidence" only. So, instead of setting any of these variables to a state with a 100% probability and thus making them "hard evidence," we can use BayesiaLab to set any evidence according to its nature, even when it is uncertain.

For illustration purposes, we will now generate two kinds of “soft evidence,” one for JNJ and one for KMB.

1. We set the evidence directly by right-clicking on the JNJ Monitor and selecting Enter Probabilities. We can now adjust the histogram by dragging the bars to the desired probability levels which reflect our subjective belief.


Clicking the light-green button confirms our choice of probabilities.

In addition, we right-click on the Monitor again to Fix Probabilities, meaning that we want to hold these values regardless of any subsequent evidence we enter.

2. Assuming that we have a more general expectation regarding the KMB return, without having any beliefs regarding the probabilities of specific states, we can set the expected mean of the entire KMB distribution. For instance, we set the expected mean of the states of KMB to -1% by right-clicking the KMB Monitor and selecting Distribution for Target Value/Mean.


We type "-0.01" into the dialog box, which generates a new KMB distribution with the desired mean value of -0.01 or -1%.

It is obvious that an infinite number of combinations could generate a mean value of -1%. However, as an aid to the analyst, BayesiaLab computes which distribution with a mean value of -1% would be "closest" to the a-priori distribution.

Not only are these observations "soft," in this example they are also of opposite signs, i.e. JNJ has a positive mean return and KMB has a negative mean return.

As a result, carrying out inference generates a more uniform probability distribution for PG (rather than a narrower distribution), effectively increasing our uncertainty about the state of PG compared to the marginal distribution. The knowledge gain for the analyst is that greater volatility for PG must be expected.

We have limited our example to inference within a small subnetwork of only three Nodes, but we could have applied the same approach over the entire Bayesian network of 459 Nodes. With this, the analyst has complete freedom to set any number of different kinds of evidence, both hard and soft, and to carry out inference "backwards" and "forwards" within the network. For users of the BayesiaLab software, the automatic computation of inference and the instant visual updating of the Monitors is comparable to recalculating all cells in a large spreadsheet.


Bayesian Network Metrics

As shown in these examples, the Arcs represent the probabilistic relationships between Nodes. In addition to visually interpreting the network structure, and beyond carrying out inference, we can also review the "summary statistics" of the network and its components with several metrics.

It is important to point out that we use the information theory-based concepts of Entropy, Arc Force and Mutual Information as central metrics in generating and analyzing Bayesian networks. This is a clear departure from commonly used metrics in traditional statistics, such as covariance and correlation. While these information theory-based metrics may appear novel to end-users of research, they have many advantages. Most importantly, we can entirely discard the (often incorrect) assumption regarding linearity and normal distributions. As a result, highly nonlinear dynamics can be easily captured in a Bayesian network.

Arc Force

For instance, the importance of each Arc can be highlighted by displaying the associated Arc Force and its contribution with respect to the overall network. From within the Validation Mode, the Arc Force can be displayed by selecting Analysis>Graphic>Arc Force (or with the shortcut "F").


Mutual Information

A perhaps more accessible interpretation is possible by displaying the Mutual Information, which can be obtained by selecting Analysis>Graphic>Arcs' Mutual Information.18

The Mutual Information I(X,Y) measures how much (on average) the observation of random variable Y tells us about the uncertainty of X, i.e. by how much the entropy of X is reduced if we have information on Y. Mutual Information is a symmetric metric, which reflects the uncertainty reduction of X by knowing Y as well as of Y by knowing X.

In our example, knowing the value of PG on average reduces the uncertainty of the value of KMB by 0.2843 bits, which means that it reduces its uncertainty by 13.27% (shown in blue, in the direction of the arc). Conversely, knowing KMB reduces the uncertainty of PG by 13.09% (shown in red, in the opposite direction of the arc).
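For reference, the sketch below shows how Mutual Information and the relative entropy reductions quoted above can be computed from a joint probability table; the two-state joint used here is hypothetical, not the learned PG/KMB table.

```python
import numpy as np

def mutual_information_bits(joint):
    """I(X;Y) in bits from a joint probability table (rows: X, cols: Y)."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    m = joint > 0
    return np.sum(joint[m] * np.log2(joint[m] / (px * py)[m]))

def entropy_bits(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint distribution of two 2-state nodes.
joint = np.array([[0.4, 0.1],
                  [0.2, 0.3]])
I = mutual_information_bits(joint)
print(I)                                    # bits of shared information
print(I / entropy_bits(joint.sum(axis=1)))  # relative reduction in H(X)
print(I / entropy_bits(joint.sum(axis=0)))  # relative reduction in H(Y)
```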


18 Although interpreting Mutual Information is somewhat more intuitive, in the case of a network tree, Mutual Information is identical to Arc Force. For Bayesian networks that are not trees, this distinction becomes very important.


Correlation

While we emphasize the importance of Arc Force and Mutual Information as measures capable of capturing nonlinear relationships, BayesiaLab can also display Pearson's R for the network (select Analysis>Graphic>Pearson's Correlation or shortcut "G").

By displaying Pearson's correlation coefficient, we implicitly make the assumption of linear relationships between the connected Nodes, which may often not hold in practice. Special care must thus be taken when interpreting low values of R, as they may reflect nonlinearity rather than independence. On the other hand, R values close to 1 do indeed suggest the presence of a linear relationship. Furthermore, Pearson's R can be very helpful for determining the sign of the relationship between variables. BayesiaLab will color-code positive and negative correlations by highlighting the associated Arcs in blue and red respectively. Finally, correlation is typically a much more familiar metric to audiences who are not acquainted with Mutual Information.
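A quick numerical experiment illustrates this warning. In the hypothetical sketch below, y depends strongly but symmetrically on x, so Pearson's R is near zero while a coarse discretized Mutual Information estimate remains clearly positive.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x**2 + 0.1 * rng.normal(size=100_000)   # strong but nonlinear dependence

# Pearson's R is near zero despite the near-deterministic link.
print(np.corrcoef(x, y)[0, 1])

# A coarse 5-state discretization reveals substantial Mutual Information.
cx = np.digitize(x, np.quantile(x, [0.2, 0.4, 0.6, 0.8]))
cy = np.digitize(y, np.quantile(y, [0.2, 0.4, 0.6, 0.8]))
joint = np.histogram2d(cx, cy, bins=5)[0] / len(x)
px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
m = joint > 0
print(np.sum(joint[m] * np.log2(joint[m] / (px * py)[m])))  # in bits
```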

Summary - Unsupervised Learning

In summary, Unsupervised Learning is an excellent approach for obtaining a general understanding of the simultaneous relationships between many variables in a dataset. The learned Bayesian network allows immediate visual interpretation plus immediate computation of omnidirectional inference based on any type of evidence, including uncertain and conflicting observations. Given these properties, Unsupervised Learning with Bayesian networks becomes a universal and robust tool for knowledge discovery and modeling in unknown problem domains.


Supervised Learning

Upon gaining a general understanding of a domain, questions typically arise regarding individual variables and how to predict them specifically. Even though we can use Unsupervised Learning to discover a network structure and use it for prediction, Supervised Learning is often a more appropriate method when studying a specific target variable. With a single target variable defined, BayesiaLab's learning algorithms focus on fitting a (generative) model to that target rather than fitting a model that balances the fit in terms of all variables.

To remain consistent with the example we started earlier, we will once again use PG for illustration purposes. More specifically, we will characterize PG as the Target Node. We can do so by right-clicking on the node and then selecting Set as Target Node from the contextual menu (or by double-clicking the Node while holding "T").

Now that we have defined a Target Node, we can perform a range of Supervised Learning algorithms implemented in BayesiaLab.19

The Markov Blanket20 algorithm is suitable for this kind of application and its speed is particularly helpful when dealing with hundreds or even thousands of variables. Furthermore, BayesiaLab offers the Augmented Markov Blanket, which starts with the Markov Blanket structure and then uses an unsupervised search to find the probabilistic relations that hold among the variables belonging to the Markov Blanket.21 This unsupervised search requires additional computation time but generally results in an improved predictive performance of the model.

The learning process can be started by selecting Learning>Target Node Characterization>Augmented Markov Blanket from the menu.22


19 For expositional clarity we will only learn contemporaneous relationships and, as a result, potential lag structures will not appear in the resulting networks. However, in BayesiaLab, Supervised Learning can be generalized to a temporal application.

20 See appendix for a definition of the Markov Blanket.

21 Intuitively, the "augmented" part of the network plays the same role as the interaction terms between independent variables in a regression.

22 In BayesiaLab nomenclature, Supervised Learning is listed in the Learning menu as "Target Node Characterization".


As we still have our previous network that was generated through Unsupervised Learning, we need to confirm the deletion of that original network before proceeding with Supervised Learning.

After a few seconds, we will see the result of the Supervised Learning process. Our Target Node PG is now connected to all variables in its Markov Blanket. This means that, given the knowledge of the Nodes in the Markov Blanket, PG is independent of the remaining network. This effectively identifies the subset of variables which are most important for predicting the value of the Target Node, PG.

As stated in the introduction, it is not our intention to forecast stock prices per se, but rather to identify meaningful and relevant structures in the market. This Augmented Markov Blanket is such a structure, and a stock market analyst can use it to identify a relevant subset of stocks for an in-depth analysis, perhaps with the objective of establishing a buy/sell recommendation or of trading directly on such knowledge.

Once we have this network, we can use it to analyze these Nodes’ relationships in a number of ways within BayesiaLab. For instance, we can select Analysis>Graphic>Target Mean Analysis, which graphs PG as a function of the other Nodes in the network.


Alternatively, by selecting Analysis>Report>Target Analysis>Correlation with the Target Node, we obtain a table displaying the Mutual Information between the Nodes in the network and the Target Variable, PG:


By clicking Quadrants, these values can be displayed as a graph:

Inference with Supervised Learning

To illustrate potential applications of Supervised Learning, beyond interpretation, we have created a simple simulation of possible stock market conditions. Despite the hypothetical nature of these scenarios, the underlying Bayesian network was learned from actual market data (as is the case for this entire white paper) and, as a result, the computed inference based on these assumed conditions is "real."

One could imagine this purely hypothetical scenario: Colgate-Palmolive and Johnson & Johnson are involved in a patent lawsuit and an investment analyst speculates about the impact of the imminent verdict in this court case. It is fairly easy to imagine that a verdict in favor of Johnson & Johnson would result in a boost to its stock price and simultaneously cause a sharp drop for Colgate-Palmolive's stock. Conversely, a win for Colgate-Palmolive would result in just the opposite. However, our question is how either outcome would affect Procter & Gamble's return, PG. We can best answer this question by simulating either outcome within the Bayesian network we learned.

Prior to setting any evidence, our marginal distributions of returns would be as follows, i.e. this is what we would expect any given day without any other knowledge:

If we were now to believe in a Johnson & Johnson win in combination with a Colgate-Palmolive loss and the corresponding stock price movement for both of them, we could create the following scenario:

The gray arrows now highlight the impact on all other stocks in this model, including our target variable, PG. The model suggests that the new distribution for PG would now be distinctly bimodal as opposed to the normal marginal distribution.


Now considering the opposite verdict, i.e. a Colgate-Palmolive win and a Johnson & Johnson defeat, we can once again assume their resulting stock price movements and then infer the impact on PG.

This time, a gain for PG would be much more probable.

So, if an analyst had a deep understanding of the subject matter (or insider knowledge23) and hence could anticipate the patent trial's outcome, he should, everything else being equal, update his beliefs regarding the Procter & Gamble stock return according to the computed inference of our model.

It is important to stress that this doesn't mean we have discovered a causal pathway, but rather that we are taking advantage of historically observed associations between returns, which have generated a model in the form of a Bayesian network. The Bayesian network simply allows us to systematically exploit our learned knowledge.

Adaptive Questionnaire

The Bayesian network from above can perhaps also serve to illustrate how evidence-gathering can be optimized in BayesiaLab. Once again, this is purely hypothetical, but let's assume that a stock trader seeks to predict tomorrow's return of PG. Tomorrow, as it turns out, earnings will also be released for numerous other stocks in the CPG industry, excluding PG. With limited time, our stock trader needs to prioritize his research resources on those stocks that will be most informative of the PG return. BayesiaLab has a convenient function, the Adaptive Questionnaire, which allows the analyst to adapt his evidence-seeking process based on the most recent information obtained and the previously learned Bayesian network (shown again below for reference).


23 It should be noted that insider trading can refer to both legal and illegal conduct. See http://www.sec.gov/answers/insider.htm


The function can be called by selecting Inference>Adaptive Questionnaire. The following pop-up window then prompts us to select and confirm the Target.

Initially, the analyst's research should begin with CL as the most informative Node, which is listed at the top of all Monitors, right below the Target, PG.


Let's now assume he receives a tip suggesting that CL earnings are coming in much higher than expected. He translates these updated, subjective beliefs into "soft" evidence and thus sets P(CL>0.017)=60%, P(CL<=0.017)=30%, P(CL<=0.05)=10%, plus the remaining states to zero.

Upon entering this probability distribution, the Adaptive Questionnaire will move CL to the bottom (green bars with gray background) and scroll up the next most important Node to study, in this case KMB.

Upon setting this evidence, the probabilities need to be fixed by right-clicking the Monitor and selecting Fix Probabilities. This is important as other simultaneous beliefs have yet to be set. Without fixing the probabilities of CL, subsequent evidence could inadvertently update the probabilities that were just defined.

Next, the analyst may obtain inconclusive views from his sources on KMB and thus he cannot set any new evidence on this particular Node, although it would be the most informative evidence at this point. Rather, he moves on to CLX, which is widely believed to meet the expected earnings without any surprises. As a result, our analyst sets hard negative evidence on either end of the return distribution, meaning that he anticipates no major swings either way: P(CLX<=-0.11)=0 and P(CLX>0.13)=0. Upon setting this evidence, and once again fixing it, the Adaptive Questionnaire presents a new order of Nodes. Interestingly, given the evidence set on CLX, KMB has declined in importance with respect to PG.

In the new order, JNJ is next, and our analyst determines that the stock will definitely gain based on insider rumors he heard. He translates this insight into a certain JNJ return greater than 1.2% and sets it as "hard" evidence accordingly.

Given all the evidence he gathered, although some of it may be vague, the analyst concludes that there is now a 90% probability of a PG return greater than 0.3%. Perhaps more importantly, the chance of a decline of -1.3% or below has diminished to virtually zero. This translates into an expected mean return of 1.5% versus the a-priori expectation of 0%.

With the Bayesian network generated through Supervised Learning and the subsequent application of the Adaptive Questionnaire, the analyst has optimized his information-seeking process and thus spent the least amount of resources for a maximum reduction of uncertainty regarding the variable of interest.
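Conceptually, each step of the Adaptive Questionnaire amounts to a greedy choice: among the not-yet-observed Nodes, select the one sharing the most Mutual Information with the Target given the evidence entered so far. The sketch below implements that greedy step for a small discrete joint distribution; it is an illustrative reading of the feature, not BayesiaLab's actual algorithm.

```python
import numpy as np

def mi_bits(joint2d):
    """Mutual Information in bits from a 2-D joint probability table."""
    px = joint2d.sum(axis=1, keepdims=True)
    py = joint2d.sum(axis=0, keepdims=True)
    m = joint2d > 0
    return np.sum(joint2d[m] * np.log2(joint2d[m] / (px * py)[m]))

def next_best_question(joint, target, evidence):
    """Greedy questionnaire step: condition the full joint on the evidence,
    then return the unobserved axis with maximal MI with the target."""
    cond = joint
    for axis, state in evidence.items():
        cond = np.take(cond, [state], axis=axis)  # keep axes for bookkeeping
    cond = cond / cond.sum()
    scores = {}
    for axis in range(joint.ndim):
        if axis == target or axis in evidence:
            continue
        others = tuple(a for a in range(joint.ndim) if a not in (axis, target))
        scores[axis] = mi_bits(cond.sum(axis=others))  # MI is symmetric
    return max(scores, key=scores.get)

# Tiny hypothetical joint over three 2-state nodes (0: PG, 1: CL, 2: KMB).
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2))
joint /= joint.sum()
print(next_best_question(joint, target=0, evidence={}))      # first question
print(next_best_question(joint, target=0, evidence={1: 1}))  # after observing CL
```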


Summary - Supervised Learning

In many ways, Supervised Learning with BayesiaLab resembles traditional modeling and can thus be benchmarked against a wide range of statistical techniques. In addition to its predictive performance, BayesiaLab offers an array of analysis tools which can provide the analyst with a deeper understanding of the domain's underlying dynamics. The Bayesian network also provides the basis for a wide range of scenario simulation and optimization algorithms implemented in BayesiaLab. Beyond mere one-time predictions, BayesiaLab allows dealing with evidence interactively and incrementally, which makes it a highly adaptive tool for real-time inference.


Appendix

Markov Blanket

In many cases, the Markov Blanket algorithm is a good starting point for any predictive model, whether used for scoring or classification. This algorithm is extremely fast and can even be applied to databases with thousands of variables and millions of records.

The Markov Blanket for a node A is the set of nodes composed of A's parents, its children, and its children's other parents ("spouses").

The Markov Blanket of the node A contains all the variables which, if we know their states, will shield the node A from the rest of the network. This means that the Markov Blanket of a node is the only knowledge needed to predict the behavior of that node A. Learning a Markov Blanket selects relevant predictor variables, which is particularly helpful when there is a large number of variables in the database. (In fact, this can also serve as a highly efficient variable selection method in preparation for other types of modeling outside the Bayesian network framework.)
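The definition translates directly into code. The sketch below computes the Markov Blanket from a DAG given as a set of (parent, child) edges; the toy graph is hypothetical.

```python
def markov_blanket(node, edges):
    """Markov Blanket of `node` in a DAG given as (parent, child) edges:
    the node's parents, its children, and its children's other parents."""
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    spouses = {p for p, c in edges if c in children and p != node}
    return parents | children | spouses

# Toy DAG: B -> A, A -> C, D -> C, so D is a "spouse" of A.
edges = {("B", "A"), ("A", "C"), ("D", "C")}
print(markov_blanket("A", edges))  # {'B', 'C', 'D'}
```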

Bayes' Theorem

Bayes' theorem relates the conditional and marginal probabilities of discrete events A and B, provided that the probability of B does not equal zero:

P(A|B) = P(B|A) · P(A) / P(B)

In Bayes' theorem, each probability has a conventional name:

• P(A) is the prior probability (or "unconditional" or "marginal" probability) of A. It is "prior" in the sense that it does not take into account any information about B. The unconditional probability P(A) was called "a priori" by Ronald A. Fisher.

• P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.


• P(B|A) is the conditional probability of B given A. It is also called the likelihood.

• P(B) is the prior or marginal probability of B.

Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.
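As a quick numeric illustration with made-up values: if P(A) = 0.3, P(B|A) = 0.6 and P(B|¬A) = 0.2, then P(B) = 0.6 × 0.3 + 0.2 × 0.7 = 0.32, and Bayes' theorem yields P(A|B) = (0.6 × 0.3) / 0.32 ≈ 0.56, so observing B nearly doubles the prior probability of A.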

About the Authors

Stefan Conrady

Stefan Conrady is the cofounder and managing partner of Conrady Applied Science, LLC, a privately held consulting firm specializing in knowledge discovery and probabilistic reasoning with Bayesian networks. In 2010, Conrady Applied Science was appointed the authorized sales and consulting partner of Bayesia S.A.S. for North America.

Stefan Conrady studied Electrical Engineering and has extensive management experience in the fields of product planning, marketing and analytics, working at Daimler and BMW Group in Europe, North America and Asia. Prior to establishing his own firm, he was heading the Analytics & Forecasting group at Nissan North America.

Lionel Jouffe

Dr. Lionel Jouffe is cofounder and CEO of France-based Bayesia S.A.S. Lionel Jouffe holds a Ph.D. in Computer Science and has been working in the field of Artificial Intelligence since the early 1990s. He and his team have been developing BayesiaLab since 1999 and it has emerged as the leading software package for knowledge discovery, data mining and knowledge modeling using Bayesian networks. BayesiaLab enjoys broad acceptance in academic communities as well as in business and industry. The relevance of Bayesian networks, especially in the context of consumer research, is highlighted by Bayesia's strategic partnership with Procter & Gamble, which has deployed BayesiaLab globally since 2007.


Contact Information

Conrady Applied Science, LLC
312 Hamlet's End Way
Franklin, TN 37067
USA
+1 888-386-8383
[email protected]
www.conradyscience.com

Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
+33(0)2 43 49 75 69
[email protected]
www.bayesia.com

Copyright

© 2011 Conrady Applied Science, LLC and Bayesia S.A.S. All rights reserved.

Any redistribution or reproduction of part or all of the contents in any form is prohibited other than the following:

• You may print or download this document for your personal and noncommercial use only.

• You may copy the content to individual third parties for their personal use, but only if you acknowledge Conrady Applied Science, LLC and Bayesia S.A.S as the source of the material.

• You may not, except with our express written permission, distribute or commercially exploit the content. Nor may you transmit it or store it in any other website or other form of electronic retrieval system.
