![Page 1: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/1.jpg)
BIG DATA - AS OPPOSED TO SMALL DATA
Mark Whitehorn
![Page 2: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/2.jpg)
What is Big data?
Is it really just a marketing campaign?
http://www.perceptualedge.com/articles/visual_business_intelligence/big_data_big_ruse.pdf
“If you’re like me, the mere mention of Big Data now turns your stomach… Why all the fuss? Why, indeed. Essentially, Big Data is a marketing campaign, pure and simple.” (Stephen Few)
![Page 3: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/3.jpg)
Big data
Clearly I am not like Stephen Few.
I don’t believe I have a particular axe to grind; I simply find this interesting.
This talk is designed to try to explain:
• what Big Data is
• what characteristics we have found useful
• why it may be of interest to you
• a paradox
![Page 4: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/4.jpg)
Data
All computer applications manipulate data
![Page 5: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/5.jpg)
Data
So, in the ’60s and ’70s we rapidly learnt to separate the data, and its manipulation, from the application
![Page 6: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/6.jpg)
Data
This led directly to the development of database engines and, ultimately, relational ones (DB2, Oracle, SQL Server)
![Page 7: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/7.jpg)
Data
Data has always existed in two, very broad, flavours:
• Data that is treated as small, discrete packages and is a good fit with the relational way of storing and querying data
• Data that is not as above
![Page 8: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/8.jpg)
Data is stored in tables
| LicenseNo | Make | Model | Year | Colour |
| --- | --- | --- | --- | --- |
| CER 162 C | Triumph | Spitfire | 1965 | Green |
| EF 8972 | Bentley | Mk. VI | 1946 | Black |
| YSK 114 | Bentley | Mk. VI | 1949 | Red |
![Page 9: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/9.jpg)
Each table has a name: Car
![Page 10: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/10.jpg)
Data is atomic
![Page 11: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/11.jpg)
The Car table has columns: LicenseNo, Make, Model, Year, Colour
![Page 12: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/12.jpg)
The Car table is made up of columns and rows
![Page 13: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/13.jpg)
Each row represents a unique entity in the ‘real’ world
![Page 14: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/14.jpg)
![Page 15: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/15.jpg)
Data
The manipulation consists typically of sub-setting the data by rows and columns and then doing some sums
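As a minimal sketch of what "sub-setting by rows and columns and then doing some sums" means, the Car table from the earlier slides can be held as a list of rows in Python. The helper name `subset` and the choice of query (Bentleys only, average year) are illustrative assumptions, not something taken from the talk.

```python
# The Car table from the earlier slides, one dict per row.
cars = [
    {"LicenseNo": "CER 162 C", "Make": "Triumph", "Model": "Spitfire", "Year": 1965, "Colour": "Green"},
    {"LicenseNo": "EF 8972", "Make": "Bentley", "Model": "Mk. VI", "Year": 1946, "Colour": "Black"},
    {"LicenseNo": "YSK 114", "Make": "Bentley", "Model": "Mk. VI", "Year": 1949, "Colour": "Red"},
]


def subset(rows, predicate, columns):
    """Keep the rows that satisfy predicate, then keep only the named columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


# Sub-set by rows (Bentleys only) and by columns (LicenseNo and Year).
bentleys = subset(cars, lambda r: r["Make"] == "Bentley", ["LicenseNo", "Year"])

# "Doing some sums": count the selected rows and average one column.
bentley_count = len(bentleys)
average_year = sum(r["Year"] for r in bentleys) / bentley_count
```

Every value is treated as an indivisible unit here: the code compares and adds whole values and never looks inside one.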
![Page 16: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/16.jpg)
Data
Note that this kind of manipulation is treating the data as atomic, which is fine, because the relational model assumes atomicity of data
Note also that the rows are unordered
![Page 17: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/17.jpg)
Data
Data has always existed in two, very broad, flavours:
• Data that is inherently atomic and is a good fit with the relational way of storing and querying data
• Data that is not as above
![Page 18: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/18.jpg)
Examples
Examples of ‘other’ data:
• Images
• Music
• Word docs
• Sensor data
• Web logs
• Twitter
• Machines
  • Point of Sale
  • Mass spectrometers
![Page 19: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/19.jpg)
What’s in a name?
So, what do we call the ‘rest’?
• Un-structured?
• Semi-structured?
• Multi-structured?
• Non-relational?
• Non-tabular?
![Page 20: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/20.jpg)
What’s in a name?
• What about:
  • Big data?
![Page 21: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/21.jpg)
Other definitions?
• V V V v v v v
  • Volume
  • Variety
  • Velocity
  • Value
  • Very interesting
  • Various other words beginning with V…
![Page 22: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/22.jpg)
Big Data – not new?
• So why have we focused, for the last 30 years, almost exclusively on the first flavour?
• Because it:
  • is easy (relatively easy – Jim Gray*)
  • represents a significant proportion of the available data
*Jim Gray and Andreas Reuter, Transaction Processing: Concepts and Techniques (1993). Turing Award 1998.
![Page 23: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/23.jpg)
Big Data has come of age
• Two factors have changed
  • Rise of the Machines
  • Increase in computational power
• There is a great synergy here
  • We are acquiring far more big data and we have the computational power to extract the information it contains
![Page 24: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/24.jpg)
Big Data is hard
• 3 Vs
• It is highly variable
• We often want to look inside the data
  • Frequently non-atomic
• Need custom functions for virtually every operation
  • Find the rotating-wing aircraft in the image
  • Identify the best customer
  • What does the blogosphere think of our company?
![Page 25: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/25.jpg)
Big Data
• Examples
  • Log file
  • Mass spec.
  • Images
![Page 28: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/28.jpg)
![Page 29: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/29.jpg)
What is Big Data?
• Examples
  • Log file
  • Mass spec.
  • Images
BIG DATA
![Page 30: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/30.jpg)
Summary so far
• Just as you can always fit an aircraft engine into a car chassis, you can always put Big Data in a table, but you probably don’t want to
• The analysis is not sub-setting the data by rows and columns
• So each class of big data usually requires a (lovingly hand-crafted) custom analysis
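To make the point about custom analysis concrete, here is a small sketch for one class of big data, the log file. It assumes web-server lines in Common Log Format; the sample lines, the pattern and the function name `parse_line` are invented for illustration and do not come from the talk.

```python
import re
from collections import Counter

# Hypothetical web-server log lines (Common Log Format is an assumption).
LOG_LINES = [
    '192.0.2.1 - - [10/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2013:13:55:41 +0000] "GET /missing.html HTTP/1.1" 404 512',
    '192.0.2.1 - - [10/Oct/2013:13:56:02 +0000] "POST /login HTTP/1.1" 200 128',
    'this line is corrupt and does not match the pattern',
]

# Each line is not atomic: we have to look inside it to find the fields.
LINE_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not parse."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Only after the custom parsing step can we do the familiar "sums".
records = [r for r in map(parse_line, LOG_LINES) if r is not None]
status_counts = Counter(r["status"] for r in records)
bytes_by_host = Counter()
for r in records:
    bytes_by_host[r["host"]] += r["size"]
```

The parsing step is specific to this one format; an image or a mass spectrum would need an entirely different function before any counting could start.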
![Page 31: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/31.jpg)
Case Study
Big Data in the Life Sciences World
The massed spectrometers
Why would anyone do that?
![Page 32: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/32.jpg)
Human Genome Project
$3 billion – 13 Years
Sequencing completed (2003).
![Page 33: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/33.jpg)
Human Genome Project
Our genes define us.
Errr…. how does that work exactly?
![Page 34: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/34.jpg)
What is a protein?
DNA: blueprint
Protein: product
![Page 35: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/35.jpg)
Why study proteins?
GENOME: genes contain instructions for creating proteins
PROTEOME: proteins carry out functions within a cell
![Page 36: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/36.jpg)
Example Proteins
• Protein: Actin. Function: Contracts muscles
• Protein: Insulin. Function: Controls blood sugar
• Protein: Hemoglobin. Function: Carries oxygen
• Protein: Keratin. Function: Forms hair and nails
• Protein: Antibody. Function: Fights viruses
![Page 37: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/37.jpg)
20-25,000 genes in the human genome.
Every nucleated cell in the same human has the same genome.
But not all genes are active at the same time.
Perhaps 15-18,000 active proteins in any one cell at any one time.
![Page 38: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/38.jpg)
slowly changing: millions of years
rapidly changing: over a day
![Page 39: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/39.jpg)
Studying Proteins
Proteins are chopped up using an enzyme to make them easier to measure.
A specialised instrument (Mass Spectrometer) is used to measure (‘weigh’) the small protein fragments.
We can use the mass of the small fragments to carry out intelligent database searches to identify which protein was detected.
![Page 40: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/40.jpg)
Protein
MKLNISFPATGCQKLIEVDDERKLRTFYEKRMATEVAADALGEEWKGYVVRISGGNDKQGFPMKQGVLTHGRVRLLLSKGHSCYRPRRTGERKRKSVRGCIVDANLSVLNLVIVKKGEKDIPGLTDTTVPRRLGPKRASRIRKLFNLSKEDDVRQYVVRKPLNKEGKKPRTKAPKIQRLVTPRVLQHKRRRIALKKQRTKKNKEEAAEYAKLLAKRMKEAKEKRQEQIAKRRRLSSLRASTSKSESSQK
Amino Acids
Peptides
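The chop-and-weigh idea from the "Studying Proteins" slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not name the enzyme, so a simplified trypsin rule is assumed (cut after K or R unless the next residue is P); the function names are hypothetical; and the mass lookup is a toy stand-in for the far more elaborate searches real tools perform. The residue masses are the standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # a free peptide carries one extra water (H and OH at its ends)

# First 25 residues of the example protein sequence on this slide.
SEQUENCE = "MKLNISFPATGCQKLIEVDDERKLR"


def digest(sequence):
    """Cut after K or R unless followed by P (assumed, simplified trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a peptide: sum of its residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def find_matches(observed_mass, peptides, tolerance=0.01):
    """Toy 'database search': peptides whose mass is within tolerance of the observation."""
    return [p for p in peptides if abs(peptide_mass(p) - observed_mass) <= tolerance]


peptides = digest(SEQUENCE)
masses = {p: round(peptide_mass(p), 4) for p in peptides}
```

Calling `find_matches(277.146, peptides)` picks out the fragment `MK`, which is the sense in which a measured mass points back to a peptide and, from there, to a protein.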
![Page 41: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/41.jpg)
Mass Spectrometry
An analytical technique for the determination of the elemental composition of a sample.
![Page 42: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/42.jpg)
Spectra (figure panels: P1, P2, P3)
![Page 43: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/43.jpg)
Mass Spectra
File sizes: typically several gigabytes per MS run.
Identifications: range from 500 to 8,000 protein identifications.
![Page 44: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/44.jpg)
PepTracker
TRACK. VISUALISE. DISCOVER.
![Page 45: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/45.jpg)
![Page 46: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/46.jpg)
• Protein Peptide Alignment Map
• Normalised Profiles for Synthesis, Degradation and Turnover
• Localisation
• Comparison Between Compartments
![Page 47: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/47.jpg)
Custom analysis and custom visualisation – vital tools in understanding big data
![Page 48: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/48.jpg)
Proteomics, Volume 3, Issue 8. Article first published online: 12 August 2003
• Deisotoping
• Baseline correction
• Peak detection
Bioconductor PROcess R package
Intensive data processing is required to derive information from the raw data
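Two of the processing steps named on this slide, baseline correction and peak detection, can be illustrated with a deliberately simple sketch. It is not the algorithm used by the PROcess package: the rolling-minimum baseline, the threshold rule, the function names and the intensity values are all assumptions chosen to keep the example short.

```python
def subtract_baseline(intensities, window=2):
    """Estimate the baseline as a rolling minimum and subtract it from each point."""
    n = len(intensities)
    corrected = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        corrected.append(intensities[i] - min(intensities[lo:hi]))
    return corrected


def detect_peaks(intensities, threshold):
    """Return indices of points above threshold that exceed both neighbours."""
    return [
        i
        for i in range(1, len(intensities) - 1)
        if intensities[i] > threshold
        and intensities[i] > intensities[i - 1]
        and intensities[i] > intensities[i + 1]
    ]


# Hypothetical raw intensities: two real peaks sitting on a slowly rising baseline.
raw = [10, 11, 30, 12, 13, 14, 55, 15, 16, 17]

corrected = subtract_baseline(raw, window=2)
peaks = detect_peaks(corrected, threshold=5)
```

Even this toy version shows why the processing is intensive: every point in a multi-gigabyte run has to be compared with its neighbours before a single peak can be reported.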
![Page 49: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/49.jpg)
“proteomics is much more complicated than genomics . . . while an organism's genome is more or less constant, the proteome differs from cell to cell and over time”
Computationally, perhaps three orders of magnitude more complex than the Human Genome Project (HGP)
![Page 50: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/50.jpg)
Why bother trying to quantify it?
Because this is payback time.
Documenting the proteome opens the door to a whole new world.
![Page 51: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/51.jpg)
So, what is a data scientist?
My favourite description comes from Twitter:
“Yeah, so I'm actually a data scientist. I just do this barista thing in between gigs.”
More cynically:
“A data scientist is just an analyst who lives in California.”
![Page 52: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/52.jpg)
Possibly more accurate is that a data scientist (DS) is “a better software engineer than any statistician and a better statistician than any software engineer”.
![Page 53: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/53.jpg)
DSs are also part artist and part engineer. They need a toolbox of techniques, skills, processes and abilities from which to construct novel solutions. They also need the ability to create a UI that turns their abstract findings into something that the users of the system can understand, so DSs need the skills to create elegant visualisations that turn raw data into information.
![Page 54: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/54.jpg)
And (yes, there’s more) they need to be able to communicate well with people. There is little use in creating a superb analytical process if you can’t communicate how and why it works to the board members.
![Page 55: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/55.jpg)
And then there is the curiosity. Duncan Ross (Director of Data Sciences at Teradata) characterised data scientists well:
“The first and most important trait is curiosity. Insane curiosity. In many walks of life evolution selects against the kind of person who decides to find out what happens ‘if I push that button’. Data Science selects for it.”
![Page 56: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/56.jpg)
So, what are the general characteristics of a DS? They include:
• insatiable curiosity (see above)
• interdisciplinary interests
• excellent communication skills
• excellent analytical capabilities
![Page 57: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/57.jpg)
DSs also need a good working knowledge of:
• machine learning techniques
• data mining
• statistics
• maths
• algorithm development
• code development
• data visualisation
• multi-dimensional database design and implementation
![Page 58: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/58.jpg)
Specific skills include the technologies to handle big data:
• NoSQL databases
• Hadoop and related technologies
• MapReduce and its implementation on differing software platforms
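For readers who have not met MapReduce, the pattern can be sketched in a single Python process. This is an illustration of the idea only: real platforms such as Hadoop distribute the map and reduce phases across many machines, and the function names and sample documents here are invented for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input; on a real cluster each document could sit on a different node.
documents = ["big data is hard", "small data is easy", "big data is big"]

pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the work can be split across machines without changing the result, which is what makes the pattern attractive for large volumes.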
![Page 59: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/59.jpg)
DSs also have an intimate knowledge of languages such as:
• SQL
• MDX
• R
• Functional and OOP languages such as Erlang and Java
![Page 60: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/60.jpg)
Most of all, no matter what they are called, all true data scientists have started playing with some data at 8:00PM and suddenly found it is 3:00AM.
![Page 61: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/61.jpg)
Case Study
Twitter
Who loves you?
Social/text/sentiment
![Page 62: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/62.jpg)
Consider the humble tweet…
![Page 63: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/63.jpg)
As, indeed, Sally Bercow should have done
![Page 64: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/64.jpg)
*Innocent Face*
![Page 65: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/65.jpg)
I’d just like to apologise for that last slide but I would point out that it “contained no accusation whatsoever … Mischievous but not libellous.”
![Page 66: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/66.jpg)
Case Study
Oil Rig data
Gone fishing
Sensor data
![Page 67: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/67.jpg)
Lessons learned
• Engagement
• Choose your battles – look for an area where you can gain competitive advantage
• Choose your platform carefully
• Programming – algorithm development
• Data scientists
  • Custom algorithms
  • Custom visualisations
![Page 69: Big Data as Opposed to Small Data Mark Whitehorn](https://reader038.vdocuments.net/reader038/viewer/2022102901/556307fbd8b42a4b1d8b500d/html5/thumbnails/69.jpg)
BIG DATA - AS OPPOSED TO SMALL DATA
60 minutes
Mark Whitehorn