
John Walker

http://www.fourmilab.ch/

An Attempt to Replicate the Shnoll et al. Effect

with Algorithmic Classification of Histogram Similarity


Principal Goal

Development of an easily-replicated stochastic source and an accompanying computer-based toolkit for exploring time-dependence in histogram structure and automated techniques for histogram similarity ranking.


Stochastic Source

Oxford Nuclear 5.0 µCi ¹³⁷Cs 661.6 keV gamma source (US$40)

Aware Electronics RM-80 Geiger-Müller detector with serial port interface (US$319)

Generic PC with MS-DOS and a serial port

Modified HotBits generator software (public domain)

Event rate: 200,000 counts/min

Background: 60 counts/min


Generator Software

MS-DOS (not Windows) 16-bit program

Direct port access from assembly language

Interval timing from PC ROM BIOS clock

Time of day synchronised with Network Time Protocol

Design Goals:

Small footprint: “retired” PCs suitable as generators

Consistent hardware-based interval timing

Accurate detection and accumulation of counts

Measurements precisely labeled with date and time


Raw Data Format

One Measurement Record per minute, beginning at start of the minute

100 consecutive Count Windows per minute, each consisting of nine ticks of the 18.2 Hz PC hardware clock

Mean counts per Count Window: approximately 1900

CSV output record written in “housekeeping time” between end of 100th Count Window and start of next minute: Unix time() at start of minute followed by 100 comma-separated count values

File size: 735 KB/day

964310400,1890,1964,1898,1902,1840,1901,1842,1916,1886,1901,1838,1932,1880,1985,1910,1883,1919,1903,1895,1913,1899,1902,1870,1914,1897,1858,1854,1855,1893,1860,1948,1837,1887,1865,1888,1882,1914,1914,1905,1903,1898,1930,1892,1883,1926,1903,1861,1899,1951,1900,1856,1877,1861,1861,1865,1882,1850,1882,1910,1874,1870,1893,1926,1923,1880,1889,1911,1885,1913,1863,1883,1918,1910,1933,1945,1891,1873,1910,1861,1850,1888,1948,1902,1881,1939,1948,1861,1870,1897,1938,1895,1896,1889,1912,1919,1867,1847,1899,1937,1890
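A record in this format can be read with a few lines of Python. This is only a sketch of a reader, not the toolkit's own code: the function name `parse_record` and its validation rules are assumptions made for illustration, and the example line uses made-up count values.

```python
from datetime import datetime, timezone

WINDOWS_PER_RECORD = 100  # Count Windows per one-minute Measurement Record


def parse_record(line):
    """Split one raw-data CSV line into (start_of_minute, counts).

    Hypothetical helper: checks only what the format description states,
    namely a Unix time at the start of a minute followed by 100 counts.
    """
    fields = line.strip().split(",")
    timestamp = int(fields[0])
    counts = [int(f) for f in fields[1:]]
    if len(counts) != WINDOWS_PER_RECORD:
        raise ValueError(f"expected {WINDOWS_PER_RECORD} counts, got {len(counts)}")
    if timestamp % 60 != 0:
        raise ValueError("record does not begin at the start of a minute")
    return datetime.fromtimestamp(timestamp, tz=timezone.utc), counts


# Synthetic record: the timestamp of the sample record with invented counts.
example = "964310400," + ",".join(["1900"] * WINDOWS_PER_RECORD)
when, counts = parse_record(example)
print(when.isoformat(), len(counts))
```

Nine ticks at 18.2 Hz last about 0.49 s, so the 100 Count Windows occupy roughly 49.5 s of each minute and the remainder is available as housekeeping time.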


Analysis Software

Reads one or more days’ Raw Data CSV files

Assembles count histograms into Experiments of 10 minutes each

Processing stages: Raw Data → Histogram Compilation → Transformation Modules → Histogram Pair Assembly → Matching Modules → Closeness Sort → Ranking Table


Histogram Compilation

Arranges raw data into 10 minute Experiments, each beginning at a round 10 minute interval. (Intervals with missing data are discarded.)

Builds in-memory raw histograms (number of occurrences of a given count in interval)

Computes exponentially smoothed moving average (P = 0.2) of histogram, symmetrically from the mean

Creates histogram CSV files (raw and smoothed) for each experiment for subsequent analysis

Plots each experiment’s histogram as a GIF file
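The compilation and smoothing steps might look roughly like the following sketch. The slides do not spell out what "symmetrically from the mean" means; the code assumes the smoothing starts at the bin nearest the mean and runs outward in both directions, with P used as the weight of the new value. Both the interpretation and the function names are assumptions.

```python
from collections import Counter

EXPERIMENT_SECONDS = 600  # ten-minute Experiments


def experiment_start(timestamp):
    """Round a record's Unix time down to the enclosing 10-minute boundary."""
    return timestamp - timestamp % EXPERIMENT_SECONDS


def raw_histogram(counts):
    """Number of occurrences of each count value in the interval."""
    return Counter(counts)


def smooth_from_mean(hist, p=0.2):
    """Exponentially smooth a histogram outward from the bin nearest its mean.

    Assumed reading of the slide: s[i] = p * x[i] + (1 - p) * s[previous],
    where 'previous' is the neighbouring bin on the side toward the mean.
    Returns (lowest_bin, smoothed_values) with one value per consecutive bin.
    """
    lo, hi = min(hist), max(hist)
    values = [hist.get(k, 0) for k in range(lo, hi + 1)]
    total = sum(values)
    mean = sum(k * v for k, v in zip(range(lo, hi + 1), values)) / total
    centre = round(mean) - lo
    smoothed = [0.0] * len(values)
    smoothed[centre] = float(values[centre])
    for i in range(centre + 1, len(values)):   # walk right from the mean
        smoothed[i] = p * values[i] + (1 - p) * smoothed[i - 1]
    for i in range(centre - 1, -1, -1):        # walk left from the mean
        smoothed[i] = p * values[i] + (1 - p) * smoothed[i + 1]
    return lo, smoothed


lowest, smoothed = smooth_from_mean(raw_histogram([5, 5, 5, 6, 4]))
print(lowest, smoothed)
```

With a small P the smoothed curve changes slowly from bin to bin, so fine structure is damped more strongly than with a large P.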


Transformation Modules

Open-ended plug-in modules transform experiment histograms in place:

NORMALISE: Scales histogram values so that maximum value is 1

FOURIER: Replaces histogram with its Fourier transform

WAVELET: Replaces histogram with its discrete wavelet transform using the Daubechies 4-coefficient filter coefficients

Multiple transforms can be enabled; new transforms can be added.

Transformed histograms and their inverses can be plotted for debugging.
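A plug-in arrangement of this kind can be sketched as a table of named functions that each modify a histogram in place. The registry, the function names and the key `WAVELET_ONE_LEVEL` are hypothetical. The wavelet function performs only a single level of the Daubechies 4-coefficient transform with periodic wrap-around, which is an assumption about boundary handling; a full discrete wavelet transform would repeat the step on the smooth half.

```python
import math


def normalise(hist):
    """Scale values in place so that the maximum value is 1."""
    peak = max(hist)
    if peak > 0:
        hist[:] = [v / peak for v in hist]


_S3 = math.sqrt(3.0)
_D = 4.0 * math.sqrt(2.0)
# Daubechies 4-coefficient filter
D4 = [(1 + _S3) / _D, (3 + _S3) / _D, (3 - _S3) / _D, (1 - _S3) / _D]


def daub4_step(hist):
    """One level of the Daubechies-4 transform, in place.

    Smooth coefficients land in the first half of the list and detail
    coefficients in the second half. Needs an even length of at least 4.
    """
    n = len(hist)
    if n < 4 or n % 2:
        raise ValueError("length must be even and at least 4")
    half = n // 2
    out = [0.0] * n
    for i in range(half):
        a = [hist[(2 * i + k) % n] for k in range(4)]  # periodic wrap-around
        out[i] = D4[0] * a[0] + D4[1] * a[1] + D4[2] * a[2] + D4[3] * a[3]
        out[half + i] = D4[3] * a[0] - D4[2] * a[1] + D4[1] * a[2] - D4[0] * a[3]
    hist[:] = out


# Hypothetical registry: adding a transform means adding one entry here.
TRANSFORMS = {"NORMALISE": normalise, "WAVELET_ONE_LEVEL": daub4_step}


def apply_transforms(hist, names):
    """Apply the enabled transforms, in the order given, to hist in place."""
    for name in names:
        TRANSFORMS[name](hist)


h = [1.0, 4.0, 2.0, 1.0]
apply_transforms(h, ["NORMALISE"])
print(h)
```

A constant histogram is a convenient sanity check for the wavelet step: every detail coefficient should vanish and every smooth coefficient should equal the constant times the square root of 2.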


Histogram Pair Assembly

All pairs of histograms are tabulated in memory

Assumes matching algorithm is commutative (but this can be changed)
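As an illustration, pair assembly reduces to enumerating combinations when the matching algorithm is commutative, and ordered pairs when it is not. The function name and the `commutative` flag are assumptions for this sketch.

```python
from itertools import combinations, permutations


def assemble_pairs(experiment_ids, commutative=True):
    """List every pair of distinct experiments.

    With a commutative matching algorithm (a, b) and (b, a) score the same,
    so only one of them is kept; otherwise both orderings are listed.
    """
    ids = list(experiment_ids)
    if commutative:
        return list(combinations(ids, 2))
    return list(permutations(ids, 2))


print(assemble_pairs(["A", "B", "C"]))
```

For n experiments this gives n(n - 1)/2 pairs in the commutative case and n(n - 1) otherwise, so the table grows quadratically with the length of the data set.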


Matching Modules (1)

Plug-in modules which, given a pair of experiment histograms, return a floating-point metric of how “close” they are in morphology.

MEAN-ALIGNED χ²: Histograms are shifted so mean values align, then χ² distance between the curves is computed.

SLIDING, MIRRORED χ²: Histograms are initially aligned at their mean value, then the histogram pair and pair with one mirrored about the mean are shifted along the X axis and the minimum χ² distance is reported.
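The two matching modules described above might be sketched as follows. The slides do not give the distance formula, so the code assumes the common symmetric form, the sum over bins of (a - b)² / (a + b), and aligns means to the nearest whole bin. The function names, the `max_shift` parameter and the returned tuple are illustrative assumptions.

```python
def mean_of(hist):
    """Mean count value of a histogram given as {count_value: occurrences}."""
    total = sum(hist.values())
    return sum(k * v for k, v in hist.items()) / total


def centred(hist):
    """Re-index bins relative to the rounded mean, so that means align at 0."""
    m = round(mean_of(hist))
    return {k - m: v for k, v in hist.items()}


def chi2(a, b, shift=0):
    """Assumed symmetric chi-squared distance, with b displaced by shift bins."""
    total = 0.0
    for k in set(a) | {j + shift for j in b}:
        x = a.get(k, 0)
        y = b.get(k - shift, 0)
        if x + y:
            total += (x - y) ** 2 / (x + y)
    return total


def mean_aligned_chi2(h1, h2):
    return chi2(centred(h1), centred(h2))


def sliding_mirrored_chi2(h1, h2, max_shift=5):
    """Return (minimum distance, mirror flag, shift) over all trial positions.

    mirror flag is 1 for the pair as given and -1 when h2 is mirrored
    about its mean.
    """
    a, b = centred(h1), centred(h2)
    mirrored = {-k: v for k, v in b.items()}
    best = None
    for mirror, candidate in ((1, b), (-1, mirrored)):
        for shift in range(-max_shift, max_shift + 1):
            d = chi2(a, candidate, shift)
            if best is None or d < best[0]:
                best = (d, mirror, shift)
    return best


left_heavy = {10: 1, 11: 2, 12: 5, 13: 1}
right_heavy = {20: 1, 21: 5, 22: 2, 23: 1}   # mirror image of left_heavy
print(sliding_mirrored_chi2(left_heavy, right_heavy))
```

Returning the mirror flag and shift along with the distance keeps the free matching parameters available to later stages.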


Matching Modules (2)

SLIDING, MIRRORED, STRETCHED χ²: Histograms are initially aligned at their mean value, then the histogram pair and pair with one mirrored about the mean, and histograms scaled along the X axis within a defined range, are shifted along the X axis and the minimum χ² distance is reported. (Work in progress.)

HUMAN-DIRECTED: It would be possible to input the ranking table from similarity measures made by human judges.


Closeness Sort

Sorts histogram pairs by closeness as determined by the Matching Module.

Produces aligned plots of best and worst matches to evaluate effectiveness of Matching Module.


Ranking Table

CSV format file lists histogram pairs in descending order of closeness evaluated by the Matching Module.

Closeness metric and free matching parameters included for downstream analysis programs.

0.5603,964031400,2000-07-19-18-30,964112400,2000-07-20-17-00,1,2
0.589,964181400,2000-07-21-12-10,964343400,2000-07-23-09-10,-1,-83
0.5926,964311000,2000-07-23-00-10,964402200,2000-07-24-01-30,1,0
0.5943,964224000,2000-07-22-00-00,964413000,2000-07-24-04-30,-1,-85
0.5943,963837000,2000-07-17-12-30,963957000,2000-07-18-21-50,1,1
. . .
31.45,963907800,2000-07-18-08-10,964073400,2000-07-20-06-10,1,4
31.8,964226400,2000-07-22-00-40,964245000,2000-07-22-05-50,1,-2
31.92,963907800,2000-07-18-08-10,964158600,2000-07-21-05-50,1,4
31.92,964226400,2000-07-22-00-40,964270800,2000-07-22-13-00,-1,-84
33.9,963907800,2000-07-18-08-10,963950400,2000-07-18-20-00,1,2
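A downstream program could read ranking table rows along these lines. The field names are assumptions inferred from the sample rows (closeness metric, then Unix time and date label for each histogram of the pair, then the free matching parameters); the slides do not name the columns.

```python
from typing import NamedTuple, Tuple


class RankingRow(NamedTuple):
    closeness: float          # metric reported by the Matching Module
    t1: int                   # Unix time of the first experiment
    label1: str               # date and time label of the first experiment
    t2: int                   # Unix time of the second experiment
    label2: str               # date and time label of the second experiment
    params: Tuple[int, ...]   # free matching parameters (meaning assumed)


def parse_ranking_row(line):
    """Parse one CSV line of the ranking table into a RankingRow."""
    f = line.strip().split(",")
    return RankingRow(float(f[0]), int(f[1]), f[2], int(f[3]), f[4],
                      tuple(int(x) for x in f[5:]))


row = parse_ranking_row(
    "0.5603,964031400,2000-07-19-18-30,964112400,2000-07-20-17-00,1,2")
print(row.closeness, (row.t2 - row.t1) / 3600, row.params)
```

The difference between the two Unix times gives the separation of the pair directly; for this row it is 22.5 hours, in agreement with the two date labels.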


Time Binning

Reads Ranking Table and bins into deciles by closeness metric, creating a histogram of time difference between histograms for each decile.

Creates expectation value table for null hypothesis.

Normalises decile histograms vs. null hypothesis and plots results by decile.
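The binning and normalisation can be sketched as below. How the toolkit builds its expectation table is not stated; this sketch assumes a gap-free series of N equally spaced experiments, for which exactly N - d pairs are separated by d experiment lengths, and scales that distribution to the size of each decile. All function names are hypothetical.

```python
from collections import Counter


def deciles(rows):
    """Split rows, already sorted by closeness, into ten rank-ordered groups."""
    n = len(rows)
    return [rows[i * n // 10:(i + 1) * n // 10] for i in range(10)]


def lag_histogram(rows, step=600):
    """Count pairs by time separation, in units of step seconds.

    Each row is a (closeness, t1, t2) tuple.
    """
    return Counter(abs(t2 - t1) // step for _, t1, t2 in rows)


def null_expectation(n_experiments, n_pairs_in_group):
    """Expected pairs per lag if closeness is unrelated to time (assumed model)."""
    total = n_experiments * (n_experiments - 1) // 2
    return {d: n_pairs_in_group * (n_experiments - d) / total
            for d in range(1, n_experiments)}


def normalised(rows, n_experiments, step=600):
    """Observed lag counts divided by the null-hypothesis expectation."""
    observed = lag_histogram(rows, step)
    expected = null_expectation(n_experiments, len(rows))
    return {d: observed.get(d, 0) / expected[d] for d in expected}


# Four experiments ten minutes apart, every pair given the same closeness.
times = [0, 600, 1200, 1800]
all_pairs = [(1.0, a, b) for i, a in enumerate(times) for b in times[i + 1:]]
print(normalised(all_pairs, n_experiments=4))
```

Under this normalisation a value near 1 at every time difference is what the null hypothesis predicts; a time-dependent effect would show up as values consistently above 1 at particular lags in the closest deciles.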


Ranking Table Randomiser

Shuffles lines in the ranking table produced by the Closeness Sort.

Time binning randomised ranking provides null hypothesis control for closeness matching.
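A randomiser of this kind needs very little code. The sketch below is an assumption about its shape, not the toolkit's implementation; the optional `seed` argument is added here only so that a control run can be repeated.

```python
import random


def randomise_ranking(lines, seed=None):
    """Return the ranking table lines in random order, leaving the input intact."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    return shuffled


table = ["0.5603,a,b", "0.589,c,d", "0.5926,e,f", "0.5943,g,h"]
control = randomise_ranking(table, seed=1)
print(control)
```

Because the shuffled table contains exactly the same pairs, any structure that survives time binning of the control must come from the distribution of time differences itself rather than from the closeness ordering.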


Pilot Experiment

Data collected continuously from 2000-07-16 through 2000-07-24; no gaps in data set.

Data set contains:

12,960 one minute measurement records

1,296,000 equal duration count windows

1,296 ten-minute experiments

839,160 histogram pairs, excluding self/self and assuming commutative comparison
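These totals follow from the dates and the record format, as a short check shows. The variable names are illustrative only.

```python
from datetime import date
from math import comb

days = (date(2000, 7, 24) - date(2000, 7, 16)).days + 1  # inclusive date range
records = days * 24 * 60           # one Measurement Record per minute
windows = records * 100            # 100 Count Windows per record
experiments = records // 10        # ten-minute Experiments
pairs = comb(experiments, 2)       # unordered pairs, excluding self/self

print(days, records, windows, experiments, pairs)
# prints: 9 12960 1296000 1296 839160
```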


Complete Data Set Histogram

µ = 1889.35, σ = 26.4

[Figure: Occurrences (0 to 20000) against Counts (1769 to 2009), with a Normal Distribution curve for comparison.]


Representative Experiment Histograms

Closely Matching Histograms

[Figures: four slides of plots]

Poorly Matching Histograms

[Figures: two slides of plots]


Null Hypothesis Time Distribution Expectation


Closeness Ranking: Closest 2000


Closeness Ranking: Decile 1


Closeness Ranking: Decile 2


Closeness Ranking: Decile 9


Closeness Ranking: Decile 10


Control Ranking: Decile 1


Control Ranking: Decile 2


Control Ranking: Decile 9


Control Ranking: Decile 10


Conclusions From Pilot Experiment

No evidence found for time dependence in fine structure of smoothed histograms.

Not a refutation due to very small data set, single generator at one location, limitations in automated histogram similarity scoring, and inability to correlate automated scoring vs. human judging reported by Shnoll et al.


Toolkit Availability

All software developed for this project is in the public domain and all ancillary software is free software included in a standard Linux distribution.

Hardware cost for the stochastic generator is less than US$500, plus a generic MS-DOS PC.

Analysis source code and pilot experiment data set available to all investigators.

Open framework for exploring automated histogram similarity ranking.


References

Shnoll, S. et al., “Realization of discrete states during fluctuations in macroscopic processes”, Physics–Uspekhi 41 (10) 1025–1035 (1998).

Shnoll, S. et al., “Regular variation of the fine structure of statistical distributions as a consequence of cosmophysical agents”, Physics–Uspekhi 43 (2) 205–209 (2000).
