Big Data Challenges

> Small vs. Big Data < What the heck? What does it all mean and how does it help me?

Post on 14-Sep-2014

Category: Technology


DESCRIPTION

The presentation discusses the significance and challenges of working with large data.

TRANSCRIPT

Page 1: Big Data Challenges

> Small vs. Big Data < What the heck? What does it all mean and how does it help me?

Page 2: Big Data Challenges

> Smart data-driven marketing

June 2012 © Datalicious Pty Ltd 2

Media Attribution & Modeling: Optimise channel mix, predict sales

Testing & Optimisation: Remove barriers, drive sales

Targeting & Merchandising: Increase relevance, reduce churn

Boosting ROI

“Using data to widen the funnel”

Page 3: Big Data Challenges

Twitter @datalicious

Page 4: Big Data Challenges

> Wikipedia: Big data

In information technology, big data consists of datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analytics, and visualizing.

This trend continues because of the benefits of working with larger and larger datasets allowing analysts to spot business trends, prevent diseases, combat crime.

Though a moving target, current limits are on the order of terabytes, exabytes and zettabytes of data.

Page 5: Big Data Challenges

Big data = bottlenecks

Page 6: Big Data Challenges

> Big data analytics bottlenecks

Fast laptops now have up to 8GB of RAM, which means you can process up to 6GB of raw data very quickly in memory, bypassing the biggest bottleneck: I/O.
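
The slide's 8GB-to-6GB arithmetic can be sketched as a quick check. This is only an illustration: the 2GB reserved for the operating system and other programs is an assumption inferred from the slide's two figures, and `usable_memory` and `fits_in_memory` are hypothetical helpers, not part of any library.

```python
GB = 1024 ** 3  # bytes per gigabyte (binary convention, an assumption)

def usable_memory(total_ram_bytes, reserved_bytes=2 * GB):
    """Memory left for data after reserving room for the OS and other programs."""
    return max(total_ram_bytes - reserved_bytes, 0)

def fits_in_memory(dataset_bytes, total_ram_bytes=8 * GB, reserved_bytes=2 * GB):
    """True if the dataset can be held in RAM, avoiding repeated disk I/O."""
    return dataset_bytes <= usable_memory(total_ram_bytes, reserved_bytes)

print(usable_memory(8 * GB) // GB)  # 6
print(fits_in_memory(5 * GB))       # True
print(fits_in_memory(7 * GB))       # False
```

A dataset that fails this check has to be read from disk in pieces, which is where the I/O bottleneck returns.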

Page 7: Big Data Challenges

> Power vs. distributed computing

Adding more supercomputers is difficult because they are complex and expensive, but adding machines to a distributed computing network is fairly cheap and ‘easy’.
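
A minimal sketch of why adding machines is 'easy' in the distributed model: the work is split into independent chunks, each chunk is processed separately, and the partial results are merged. Here threads stand in for machines, and the sample records and function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(records, n_workers):
    """Deal records round-robin into one chunk per worker."""
    return [records[i::n_workers] for i in range(n_workers)]

def count_channels(chunk):
    """Work done independently by one worker: count records per channel."""
    return Counter(channel for channel, _ in chunk)

def distributed_count(records, n_workers):
    """Split the work, process the chunks concurrently, then merge the partial counts."""
    chunks = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_channels, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Hypothetical sample data: (channel, visitor id) pairs.
records = [("search", 1), ("email", 2), ("search", 3),
           ("display", 4), ("email", 5), ("search", 6)]
result = distributed_count(records, n_workers=3)
print(result["search"], result["email"], result["display"])  # 3 2 1
```

Scaling out only changes `n_workers`; the per-chunk function and the merge step stay the same, which is the property the slide contrasts with buying a bigger single machine.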

Page 8: Big Data Challenges

Big data = hype?

Page 9: Big Data Challenges

> Importance of research experience

The consumer decision process is changing from linear to circular.

The consideration set now grows during the (online) research phase, which increases the importance of user experience during that phase.

(Online) Research

Page 10: Big Data Challenges

> The consumer data journey

Over time:

From suspect to prospect to customer

From awareness messages to retention messages

From behavioural data to transactional data

Page 11: Big Data Challenges

> Single customer view is key

Campaign response data

Customer profile data

Website behavioural data

The whole is greater than the sum of its parts
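
As a rough sketch of what combining the three sources into a single customer view could look like, the example below merges them on a shared customer key. The sample records, field names, and the `customer_id` key are assumptions for illustration; the slides do not specify how the sources are linked.

```python
from collections import defaultdict

# Hypothetical sample records; field names are assumptions for illustration.
campaign_responses = [{"customer_id": "c1", "campaign": "spring_email", "clicked": True}]
customer_profiles = [
    {"customer_id": "c1", "segment": "loyal"},
    {"customer_id": "c2", "segment": "new"},
]
website_behaviour = [
    {"customer_id": "c1", "page": "/pricing"},
    {"customer_id": "c2", "page": "/home"},
    {"customer_id": "c1", "page": "/checkout"},
]

def build_single_customer_view(responses, profiles, behaviour):
    """Combine the three sources into one record per customer."""
    view = defaultdict(lambda: {"profile": {}, "responses": [], "pages": []})
    for row in profiles:
        view[row["customer_id"]]["profile"] = {
            k: v for k, v in row.items() if k != "customer_id"
        }
    for row in responses:
        view[row["customer_id"]]["responses"].append(row["campaign"])
    for row in behaviour:
        view[row["customer_id"]]["pages"].append(row["page"])
    return dict(view)

view = build_single_customer_view(campaign_responses, customer_profiles, website_behaviour)
print(view["c1"])
# {'profile': {'segment': 'loyal'}, 'responses': ['spring_email'], 'pages': ['/pricing', '/checkout']}
```

Only the merged record shows that a loyal customer who clicked the email also reached checkout, which no single source reveals on its own.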

Page 12: Big Data Challenges

> Maximise identification points

[Chart: probability of identification through cookies, plotted against weeks (0 to 48); vertical axis from 20% to 160%]

Page 13: Big Data Challenges

> Traditional single customer view

[Diagram components]

Data sources: Website data, Call center data, Customer data, Vendor data feeds #1, #2 and #3

Processing and storage: Data import (ETL) process, Transaction data warehouse, Reporting data warehouse

Outputs: Reports and dashboards, Targeted campaigns

Page 14: Big Data Challenges

> Traditional single customer view

Challenge #1: Rigid database schema requires extensive planning and maintenance

Challenge #2: Data feeds require constant updates and maintenance

Challenge #3: Increasing number of (unstructured) data sources

Page 15: Big Data Challenges

> Splunk single customer view

[Diagram components]

Data sources: Website data, Call center data, Customer data

Data import: Splunk Forwarder for data import, SuperTag integration for real-time data

Platform: Splunk instance on dedicated AWS server

Outputs: Splunk saved searches and dashboards, Splunk regex builder and data exports, 3rd party data mining and reporting, 3rd party campaign execution
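
The 'regex builder' in this diagram suggests that fields are pulled out of raw text with regular expressions rather than loaded into a fixed schema. The sketch below shows that general idea in plain Python. It does not use Splunk or reproduce its syntax, and the log format, field names, and `extract_fields` helper are hypothetical.

```python
import re

# Hypothetical raw events from two different sources with different fields.
raw_events = [
    "2012-06-01 10:15:02 web customer_id=c1 page=/pricing",
    "2012-06-01 10:17:45 callcenter customer_id=c2 reason=billing duration=240",
]

# Any key=value pair, whatever the key is called.
KEY_VALUE = re.compile(r"(\w+)=(\S+)")
# Leading date, time and source name shared by both event types.
HEADER = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<source>\w+)"
)

def extract_fields(line):
    """Extract whatever fields a line contains, without a predefined schema."""
    fields = {}
    header = HEADER.match(line)
    if header:
        fields.update(header.groupdict())
    fields.update(KEY_VALUE.findall(line))
    return fields

for event in raw_events:
    print(extract_fields(event))
```

Because fields are discovered when a line is read, a new source with extra keys (such as `duration` above) needs no schema change, which relates to the rigid-schema and unstructured-data challenges listed for the traditional approach.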

Page 16: Big Data Challenges
Page 17: Big Data Challenges
Page 18: Big Data Challenges
Page 19: Big Data Challenges
Page 20: Big Data Challenges
Page 21: Big Data Challenges
Page 22: Big Data Challenges
Page 23: Big Data Challenges
Page 24: Big Data Challenges
Page 25: Big Data Challenges

> Key Splunk advantages

§ Powerful data mining – Structured and unstructured data

§ Easy sharing of insights – Online dashboards and reports

§ Short project duration – Quick implementation and 1st insights

§ Integration with other platforms – Regex builder and data extracts

§ Low technology and resource costs – Implementation and maintenance

Page 26: Big Data Challenges

Contact us [email protected]

Learn more blog.datalicious.com

Follow us twitter.com/datalicious

Page 27: Big Data Challenges

Data > Insights > Action