alex hawala - ib math exploration - benford's law

Upload: alex-hawala

Post on 08-Mar-2016


Benford's Law

Maths Exploration
Candidate Name: Alex Evat Lineekela Hawala
Candidate Number: 0015
School Number: 001179
Windhoek International School
Word count: 1179 words

Rationale

I wrote this Math Exploration on Benford's Law because I wanted to find a statistical concept that could be applied to different areas of mathematics. I also chose the topic to explore what kinds of data sets the law applies to and where it can be used in the real world. I used the Fibonacci sequence as a test case because I wanted to find another mathematical concept to which Benford's Law could be applied. I then used data from the Namibia Statistics Agency so that I would be able to test the Law on a random set of data. As for real-world uses, the case of the Arizona Treasury manager, an example from financial forensics, was useful in my investigation. In this investigation I was able to use mathematical concepts such as logarithms and statistics.

Introduction

Statistics is the part of mathematics that uses numerical data to identify trends and patterns. These trends and patterns can be used to make predictions and to solve problems. The aspect of statistics discussed in this mathematical investigation is Benford's Law, stated by Frank Benford in 1938. Benford's Law concerns the frequency of the first digit of the numbers in many sets of data: it states that numbers whose first digit is 1 occur the most often. In this investigation I will first explain the concept of Benford's Law. Afterwards, I will use data from the Namibia Statistics Agency on Buildings Completed, and the Fibonacci sequence, to test Benford's Law. I will then give an example of how the law is used in the real world in financial forensics, particularly in the detection of fraud. From what I have discovered in the investigation I will draw a conclusion regarding Benford's Law.

Background

Benford's Law was first stated by Simon Newcomb in 1881, but was popularized by Frank Benford, who restated the law in 1938. Benford stated the Law after using data sets from numerous sources, including the surface areas of rivers, death rates, and telephone numbers. Benford found that numbers beginning with the digit one occurred around 30% of the time, while numbers beginning with the digit two occurred around 17% of the time. As the first digit increases, the frequency of numbers starting with that digit decreases, so numbers starting with the digit nine occur less often than numbers starting with any other digit. This contradicts the expectation that every digit from 1 to 9 has an equal chance of being the first digit in a set of data. Figure 1 represents this distribution.

Figure 1

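Simon Newcomb had calculated this distribution with the formula:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9$$

where $P(d)$ is the probability that the first digit of a number is $d$.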
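As a quick check of the percentages quoted in the Background, the expected first-digit frequencies can be computed with a few lines of Python. This is only a sketch: it assumes the standard logarithmic form of Benford's Law, log10(1 + 1/d), and the function name `benford_probability` is illustrative.

```python
import math


def benford_probability(d):
    """Expected share of numbers whose first digit is d (d = 1 to 9)."""
    return math.log10(1 + 1 / d)


# Expected frequency for every possible first digit.
expected = {d: benford_probability(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

The first two lines printed are `1: 30.1%` and `2: 17.6%`, in line with the roughly 30% and 17% reported above, and the nine probabilities add up to 1.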

Benford's Law also has uses in the real world, not only as a statistical phenomenon but also as a method of detecting fraud in financial forensics. For example, an Arizona Treasury manager was accused of committing cheque fraud in 1993. Figure 2 is a list of the transactions made by the manager. On inspection, one notices that most of the values begin with an 8 or a 9. This was the manager's first mistake, as his list of transactions then correlates poorly with Benford's Law. Most of the values in the data lie just below US$ 100 000, which acts as a threshold for the data. This was probably because the perpetrator avoided any transaction at or above that value, as it would have required a human signature instead of the automated transfer through the Treasury's computer system.

However, genuine data is not distributed in this manner. People do not expect some first digits to occur more frequently than others, and assume that digits occur with roughly random frequencies, so invented figures tend to stand out. Figure 3 displays the frequency of the values in the data set.

Figure 2

As shown in the example, Benford's Law can detect a fraudulent distribution of data in financial statements. In the analysis, I will investigate whether it also applies to other sets of data.

Analysis

The Fibonacci sequence will be used as an example of how Benford's Law applies to a naturally occurring mathematical sequence. The Fibonacci sequence begins with the number zero, followed by one. The third term is the sum of the two previous terms, 0 and 1, which equals 1. Each subsequent term follows the same pattern, giving the sequence 0, 1, 1, 2, 3, 5, 8, 13, ... To test Benford's Law on this sequence, I calculated the first 200 numbers in the sequence using Wolfram Alpha, a computational knowledge engine that can be found on the internet. I then counted how many of the numbers in the sequence began with each of the digits 1-9 respectively. I calculated the frequencies using my Graphical Display Calculator. Table 1 displays the frequencies of the first digits in the data set.

First digit    Number of occurrences    Frequency
1              60                       30.0%
2              36                       18.0%
3              25                       12.5%
4              18                       9.0%
5              17                       8.5%
6              12                       6.0%
7              11                       5.5%
8              12                       6.0%
9              9                        4.5%
Total          200                      100%

Table 1

Figure 3 plots the frequencies together with the graph of $P(d) = \log_{10}(1 + 1/d)$ to examine the correlation between the two.

Figure 3
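The digit count described above can be reproduced with a short Python sketch. It assumes the sequence is taken to start 1, 1, 2, ... as listed in the Appendix (rather than with 0), and it compares the tally with the standard logarithmic form of Benford's Law, log10(1 + 1/d). The function names are illustrative.

```python
import math
from collections import Counter


def first_fibonacci_numbers(n):
    """Return the first n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    numbers = []
    a, b = 1, 1
    for _ in range(n):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def first_digit_counts(values):
    """Count how many positive integers start with each digit 1 to 9."""
    tally = Counter(int(str(v)[0]) for v in values)
    return {d: tally.get(d, 0) for d in range(1, 10)}


fib = first_fibonacci_numbers(200)
counts = first_digit_counts(fib)

# Print the observed frequency next to the Benford expectation.
for d in range(1, 10):
    observed = counts[d] / len(fib)
    expected = math.log10(1 + 1 / d)
    print(f"{d}: {counts[d]:3d}  observed {observed:6.1%}  Benford {expected:6.1%}")
```

Python integers have arbitrary precision, so the 200th term is computed exactly and no rounding affects the leading digits.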

The first 200 numbers of the Fibonacci sequence followed Benford's Law very closely, which supports its usefulness in describing the distribution of first digits in natural mathematical sequences. I then continued the investigation by using a set of data that has not been tested against Benford's Law. The data comes from the Namibia Statistics Agency, in a report named Monthly Building Report: January 2015. The report contains the indices of Buildings Completed in Windhoek, Swakopmund, Walvis Bay, and Ongwediva in Namibia for the period January 2010 to December 2014. It is not disclosed what values were used to calculate the indices. The report also contains a composite index calculated from the four towns. The values used in this section of the mathematical investigation are those of the composite index.

The composite index values are listed in Table 2.

Month    2010     2011     2012     2013     2014
Jan      47.9     63.6     31.6     102.9    69.7
Feb      123.5    68.4     162.7    119.5    145.1
Mar      83.7     85.8     61.2     176.8    96.2
Apr      85       179.5    221.6    124.5    61
May      99.5     76.3     154.6    159.6    135.5
Jun      125.6    160.7    259.1    139.7    104.7
Jul      165.3    125.7    120.1    87.5     316.4
Aug      72.7     171.6    122.1    137.3    153.7
Sep      120.7    237.7    120.1    70.2     134.8
Oct      97.4     53.2     253.9    108.4    277.2
Nov      95.2     102.8    131.3    90.6     108.8
Dec      83.6     100.1    83.9     80.3     107.5

Table 2

The tally was then performed once again on the composite indices. Table 3 shows the frequency of each of the digits 1-9 as the first digit of the indices.

First digit    Number of occurrences    Frequency
1              31                       51.7%
2              5                        8.3%
3              1                        1.7%
4              1                        1.7%
5              2                        3.3%
6              5                        8.3%
7              3                        5.0%
8              7                        11.7%
9              5                        8.3%
Total          60                       100%

Table 3
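The same tally can be sketched in Python for the composite indices. The values below are the indices as transcribed in Table 2, so the result depends on that transcription being exact, and individual counts may differ slightly from the hand tally in Table 3. The helper name `first_digit` is illustrative, and the expected column again assumes the standard form log10(1 + 1/d).

```python
import math
from collections import Counter

# Composite index values as transcribed in Table 2, January to December.
composite_index = [
    47.9, 123.5, 83.7, 85, 99.5, 125.6, 165.3, 72.7, 120.7, 97.4, 95.2, 83.6,        # 2010
    63.6, 68.4, 85.8, 179.5, 76.3, 160.7, 125.7, 171.6, 237.7, 53.2, 102.8, 100.1,   # 2011
    31.6, 162.7, 61.2, 221.6, 154.6, 259.1, 120.1, 122.1, 120.1, 253.9, 131.3, 83.9, # 2012
    102.9, 119.5, 176.8, 124.5, 159.6, 139.7, 87.5, 137.3, 70.2, 108.4, 90.6, 80.3,  # 2013
    69.7, 145.1, 96.2, 61, 135.5, 104.7, 316.4, 153.7, 134.8, 277.2, 108.8, 107.5,   # 2014
]


def first_digit(value):
    """First significant digit of a positive number, via scientific notation."""
    return int(f"{value:e}"[0])


counts = Counter(first_digit(v) for v in composite_index)
total = len(composite_index)

# Print the observed frequency next to the Benford expectation.
for d in range(1, 10):
    observed = counts[d] / total
    expected = math.log10(1 + 1 / d)
    print(f"{d}: {counts[d]:2d}  observed {observed:6.1%}  Benford {expected:6.1%}")
```

Formatting each value in scientific notation puts the first significant digit at the front of the string, so the helper also works for values below 1.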

Figure 4 displays the frequencies together with the graph of $P(d) = \log_{10}(1 + 1/d)$ to test the relationship between the two.

Figure 4

It was found that the values from the NSA did not follow Benford's Law as closely as the values from the Fibonacci sequence. This could be because the values were shaped by human intervention: they were not naturally recorded, and were compiled from different underlying values. This may imply that Benford's Law only works with data sets that contain naturally occurring or naturally recorded numbers. In addition, the data used in the investigation was given an index threshold of 300, which creates a statistical bias, since any values over 300 are neglected.

Conclusion

In conclusion, I found that Benford's Law does not apply to every set of data, particularly data over which humans had a great influence, such as the Namibia Statistics Agency Building Indices. In this case, a maximum threshold was imposed, which created a bias in the data set. Another possible cause was that the values in the data set were compiled from different values, the sources of which were not disclosed in the report. However, Benford's Law did apply to a data set composed of a natural sequence, the Fibonacci sequence. In addition, as the example of the Arizona Treasury manager's case of fraud in 1993 shows, Benford's Law can also be used to detect fraud in financial data, as human interference leaves a detectable pattern in the first digits.

List of Sources

Benford, Frank. 1938. "The law of anomalous numbers." Proceedings of the American Philosophical Society 551-572.
Namibia Statistics Agency. 2015. "http://www.nsa.org.na/files/downloads/187_Building%20Plans.pdf." Namibia Statistics Agency. February 16. http://www.nsa.org.na/files/downloads/187_Building%20Plans.pdf.
Newcomb, Simon. 1881. "Note on the frequency of use of the different digits in natural numbers." American Journal of Mathematics 39-40.
Nigrini, Mark J. 1999. I've Got Your Number. May 1. http://www.journalofaccountancy.com/issues/1999/may/nigrini.
Weisstein, Eric W. n.d. "Benford's Law" -- from Wolfram MathWorld. http://mathworld.wolfram.com/BenfordsLaw.html.
Wolfram Alpha. 2015. first 200 fibonacci numbers - Wolfram|Alpha. February 20. http://www.wolframalpha.com/input/?i=first+200+fibonacci+numbers.

Appendix

Position  Number in sequence
1         1
2         1
3         2
4         3
5         5
6         8
7         13
8         21
9         34
10        55
11        89
12        144
13        233
14        377
15        610
16        987
17        1597
18        2584
19        4181
20        6765
21        10946
22        17711
23        28657
24        46368
25        75025
26        121393
27        196418
28        317811
29        514229
30        832040
31        1346269
32        2178309
33        3524578
34        5702887
35        9227465
36        14930352
37        24157817
38        39088169
39        63245986
40        102334155
41        165580141
42        267914296
43        433494437
44        701408733
45        1134903170
46        1836311903
47        2971215073
48        4807526976
49        7778742049
50        12586269025
51        20365011074
52        32951280099
53        53316291173
54        86267571272
55        139583862445
56        225851433717
57        365435296162
58        591286729879
59        956722026041
60        1548008755920
61        2504730781961
62        4052739537881
63        6557470319842
64        10610209857723
65        17167680177565
66        27777890035288
67        44945570212853
68        72723460248141
69        117669030460994
70        190392490709135
71        308061521170129
72        498454011879264
73        806515533049393
74        1304969544928660
75        2111485077978050
76        3416454622906710
77        5527939700884760
78        8944394323791460
79        14472334024676200
80        23416728348467700
81        37889062373143900
82        61305790721611600
83        99194853094755500
84        160500643816367000
85        259695496911123000
86        420196140727490000
87        679891637638612000
88        1100087778366100000
89        1779979416004710000
90        2880067194370820000
91        4660046610375530000
92        7540113804746350000
93        12200160415121900000
94        19740274219868200000
95        31940434634990100000
96        51680708854858300000
97        83621143489848400000
98        135301852344707000000
99        218922995834555000000
100       354224848179262000000
101       573147844013817000000
102       927372692193079000000
103       1500520536206900000000
104       2427893228399980000000
105       3928413764606870000000
106       6356306993006850000000
107       10284720757613700000000
108       16641027750620600000000
109       26925748508234300000000
110       43566776258854900000000
111       70492524767089100000000
112       114059301025944000000000
113       184551825793033000000000
114       298611126818977000000000
115       483162952612010000000000
116       781774079430987000000000
117       1264937032043000000000000
118       2046711111473990000000000
119       3311648143516980000000000
120       5358359254990970000000000
121       8670007398507950000000000
122       14028366653498900000000000
123       22698374052006900000000000
124       36726740705505800000000000
125       59425114757512700000000000
126       96151855463018400000000000
127       155576970220531000000000000
128       251728825683550000000000000
129       407305795904081000000000000
130       659034621587630000000000000
131       1066340417491710000000000000
132       1725375039079340000000000000
133       2791715456571050000000000000
134       4517090495650390000000000000
135       7308805952221450000000000000
136       11825896447871800000000000000
137       19134702400093300000000000000
138       30960598847965100000000000000
139       50095301248058400000000000000
140       81055900096023500000000000000
141       131151201344082000000000000000
142       212207101440105000000000000000
143       343358302784187000000000000000
144       555565404224293000000000000000
145       898923707008480000000000000000
146       1454489111232770000000000000000
147       2353412818241250000000000000000
148       3807901929474030000000000000000
149       6161314747715280000000000000000
150       9969216677189300000000000000000
151       16130531424904600000000000000000
152       26099748102093900000000000000000
153       42230279526998500000000000000000
154       68330027629092400000000000000000
155       110560307156091000000000000000000
156       178890334785183000000000000000000
157       289450641941274000000000000000000
158       468340976726457000000000000000000
159       757791618667731000000000000000000
160       1226132595394190000000000000000000
161       1983924214061920000000000000000000
162       3210056809456110000000000000000000
163       5193981023518030000000000000000000
164       8404037832974140000000000000000000
165       13598018856492200000000000000000000
166       22002056689466300000000000000000000
167       35600075545958500000000000000000000
168       57602132235424800000000000000000000
169       93202207781383200000000000000000000
170       150804340016808000000000000000000000
171       244006547798191000000000000000000000
172       394810887814999000000000000000000000
173       638817435613191000000000000000000000
174       1033628323428190000000000000000000000
175       1672445759041380000000000000000000000
176       2706074082469570000000000000000000000
177       4378519841510950000000000000000000000
178       7084593923980520000000000000000000000
179       11463113765491500000000000000000000000
180       18547707689472000000000000000000000000
181       30010821454963500000000000000000000000
182       48558529144435400000000000000000000000
183       78569350599398900000000000000000000000
184       127127879743834000000000000000000000000
185       205697230343233000000000000000000000000
186       332825110087068000000000000000000000000
187       538522340430301000000000000000000000000
188       871347450517368000000000000000000000000
189       1409869790947670000000000000000000000000
190       2281217241465040000000000000000000000000
191       3691087032412710000000000000000000000000
192       5972304273877740000000000000000000000000
193       9663391306290450000000000000000000000000
194       15635695580168200000000000000000000000000
195       25299086886458700000000000000000000000000
196       40934782466626800000000000000000000000000
197       66233869353085500000000000000000000000000
198       107168651819712000000000000000000000000000
199       173402521172798000000000000000000000000000
200       280571172992510000000000000000000000000000
