Big Data: A Technological Revolution

OM DAYAL GROUP OF INSTITUTIONS SUBMITTED BY M SHYAM SUNDER ASHISH MISHRA

Upload: ashish-mishra

Post on 05-Dec-2014


Category:

Data & Analytics



TRANSCRIPT

  • 1. OM DAYAL GROUP OF INSTITUTIONS. SUBMITTED BY: M SHYAM SUNDER, ASHISH MISHRA
  • 2. A BRIEF OF TODAY'S DISCUSSION What is Big Data? Origin. Now or Never! The more the merrier! Messiness: a positive feature, not a shortcoming. Relationships are important. Datafication: Quantifying the World. Valuing the priceless. Dark side of Big Data. Taming the Bull. The future starts today.
  • 3. WHAT IS BIG DATA? The word "data" comes from the Latin for "given", in the sense of a fact. Big Data is data that is too large, complex and dynamic for conventional data tools to capture, store, manage and analyze. The right use of Big Data allows analysts to spot trends and gives niche insights that help create value and innovation much faster than conventional methods. The four Vs that drive Big Data are Volume, Variety, Veracity and Velocity.
  • 4. ORIGIN OF BIG DATA Increase in storage capacity: the world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s, while costs have come down as well. High processing speed is readily available: Gordon Moore, co-founder of Intel, observed that the number of transistors on integrated circuits doubles approximately every two years, a trend often paraphrased as processing performance doubling every 18 months. Traditional data analysis uses statistical models built on samples, which introduce sampling error and do not always reveal the true picture. Big Data analysis does not rely on samples; it covers the whole data set and hence can provide a better analysis. Google Flu Trends was a milestone for Big Data analysis.
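The doubling rates quoted on slide 4 can be turned into growth factors with a one-line formula, factor = 2^(t / T), where T is the doubling period. The sketch below is only an illustration of that arithmetic; the function name `growth_factor` and the 10-year horizon are assumptions chosen for the example, not something taken from the slides.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Return how many times a quantity multiplies after `months`
    if it doubles once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


# Per-capita storage capacity: doubles every 40 months (figure from the slide).
storage_10y = growth_factor(120, 40)      # three doublings in 10 years

# Transistor count: doubles every 24 months (Moore's law as stated on the slide).
transistors_10y = growth_factor(120, 24)  # five doublings in 10 years

print(f"Storage capacity after 10 years: x{storage_10y:.0f}")
print(f"Transistor count after 10 years: x{transistors_10y:.0f}")
```

Under these rates, a decade corresponds to three doublings of storage capacity and five doublings of transistor count.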
  • 5. PEOPLE BEHIND THE REVOLUTION! STORAGE: Stuart Parkin is an IBM Fellow and manager of the magnetoelectronics group at the IBM Almaden Research Center in San Jose, California. In April 2014, Parkin was awarded the Millennium Technology Prize for his work on spintronic materials, "leading to a prodigious growth in the capacity to store digital information". PROCESSOR PERFORMANCE: In July 1968, Gordon Moore co-founded NM Electronics, which later became Intel Corporation, with Robert Noyce. Moore was awarded the 2008 IEEE Medal of Honor for "pioneering technical roles in integrated-circuit processing, and leadership in the development of MOS memory, the microprocessor computer and the semiconductor industry".
  • 6. DATA GROWTH RATE One zettabyte (ZB) = 1,000 exabytes (EB) = 1 billion terabytes (TB)
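The unit relationships on slide 6 can be checked with a small conversion helper. This is a minimal sketch that assumes decimal (SI) prefixes, where each step up is a factor of 1,000; the table `BYTES_PER_UNIT` and the function `convert` are illustrative names, not part of the presentation.

```python
# Decimal (SI) storage units, expressed in bytes. Assumption: the slide uses
# powers of 1,000 rather than binary units such as the zebibyte.
BYTES_PER_UNIT = {
    "TB": 10**12,  # terabyte
    "PB": 10**15,  # petabyte
    "EB": 10**18,  # exabyte
    "ZB": 10**21,  # zettabyte
}


def convert(value: float, src: str, dst: str) -> float:
    """Convert `value` from unit `src` to unit `dst`."""
    return value * BYTES_PER_UNIT[src] / BYTES_PER_UNIT[dst]


print(convert(1, "ZB", "EB"))  # exabytes in one zettabyte
print(convert(1, "ZB", "TB"))  # terabytes in one zettabyte
```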
  • 7. NOW OR NEVER! The Large Hadron Collider uses about 150 million sensors delivering data 40 million times per second. There are 600 million collisions per second. As a result, the data flow from all four LHC experiments amounts to an annual rate of 25 petabytes. Big data analysis played a large role in Barack Obama's successful 2012 re-election campaign. Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data. Google processes over 24 petabytes of data per day. Snapchat users upload 16 million pictures per hour. Facebook sees 10 million photos uploaded every hour, and a Like button is clicked or a comment posted nearly 3 billion times per day.
  • 8. NOW OR NEVER! (CONTINUED) The 800 million monthly users of the YouTube service upload over an hour of video every second. The number of messages on Twitter grows at around 200% a year, and in 2012 it exceeded 400 million tweets a day. From the sciences to healthcare, from banking to the Internet, the sectors may be diverse, yet together they tell a similar story. The amount of data in the world is growing fast, outstripping not just our machines but our imaginations as well. If we do not process and analyse this huge amount of data now, we will never be able to harness its true potential. So it is a question of now or never.
  • 9. THE MORE THE MERRIER The challenge of processing large piles of data accurately has been with us for a while. For most of history we worked with only a little data because our tools to collect, organise, store and analyse large amounts of data were poor. At first, statistical methods crunched a sample drawn from the total data, which was error-prone. Then came the practice of picking random samples from the available data set, which brought the margin of error down to around 3%. Big Data works on a sample space where N = All, i.e. it uses all the available data, and hence we get more accurate results. As data sets become larger and we gain access to greater amounts of data, the results become more and more accurate. Hence, the more the merrier.
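The link between sample size and the roughly 3% error mentioned on slide 9 can be sketched with the standard margin-of-error formula for a proportion estimated from a simple random sample, z * sqrt(p(1 - p) / n). Reading the slide's 3% as a 95% margin of error (z = 1.96, worst case p = 0.5) is an assumption made here for illustration; the slide does not state which error measure it means, and the function names are hypothetical.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion estimated from a simple random
    sample of size n (z = 1.96 corresponds to a 95% confidence level)."""
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for(margin: float, p: float = 0.5, z: float = 1.96) -> int:
    """Smallest simple random sample whose margin of error is <= margin."""
    return math.ceil((z / margin) ** 2 * p * (1 - p))


n_needed = sample_size_for(0.03)
print(f"Sample size for a 3% margin: {n_needed}")

# Quadrupling the sample only halves the margin of error.
for n in (n_needed, 4 * n_needed, 16 * n_needed):
    print(f"n = {n:>6}: margin = {margin_of_error(n):.4f}")
```

The loop shows why sampling gives diminishing returns: each halving of the margin needs four times as many observations. When the analysis covers all the data (N = All), sampling error of this kind no longer applies, although measurement errors in the data itself can remain.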
  • 10. THE MORE THE MERRIER! (CONTINUED) FareCast, a company that predicted flight ticket prices, initially used a sample of 12,000 data points to predict ticket prices according to the date of journey, and it performed well. But as the company went on adding more data, the quality of its predictions improved significantly. Steve Jobs reportedly added 4-5 years to his life by having his whole genome sequenced; with the information available from other patients' DNA sequencing, doctors could devise his treatment. Steve Jobs called it "jumping from one lily pad to another". He also said, "I'm either going to be one of the first to be able to outrun a cancer like this or I'm going to be one of the last to die from it."
  • 11. MESSINESS: A POSITIVE FEATURE. Messiness refers to the simple fact that the likelihood of errors increases as you add more data points. It can also refer to inconsistent formatting, for which the data needs to be cleaned before being processed. Big Data deals with information at the macro level, where scale is a huge factor, hence we can accept some messiness. We can sacrifice a bit of accuracy in return for knowing the general trend. Examples include natural language processing (Google Translate) and the Billion Prices Project by MIT scientists and analysts.
  • 12. CORRELATIONS Finding the relationships within the available data can give us greater insight into the behavior of the entity generating the data. At its core, correlation quantifies the statistical relationship between data values. A strong correlation means that when one of the data values changes, the other is highly likely to change as well. Correlations help us to capture the present and predict the future. The ability to predict with a certain likelihood is extremely valuable. Amazon's recommendation system used machine learning to find correlations between its customers' purchases and their future buys. Walmart's innovative sales practices helped make it the world's largest retail chain.
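As an illustration of how correlation "quantifies the statistical relationship between data values", the sketch below computes the Pearson correlation coefficient, one common measure of linear association. The slides do not say which measure the cited systems use, so choosing Pearson is an assumption, and the weekly sales figures are made-up numbers used only to exercise the function.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series.
    Returns a value in [-1, 1]; values near +1 or -1 indicate a strong
    linear relationship, values near 0 a weak one."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical weekly unit sales of two products (illustrative data only).
product_a = [10, 12, 15, 19, 24, 30]
product_b = [21, 25, 29, 40, 47, 61]

print(f"r = {pearson(product_a, product_b):.3f}")
```

A coefficient close to +1 here would say the two products tend to sell together, which is the kind of signal a recommendation system can act on. It does not by itself explain why they sell together.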
  • 13. AN EXPERT VIEW ON RELATIONAL INSIGHTS.
  • 14. DATAFICATION Datafication refers to putting data into a quantified format so that it can be tabulated and analyzed. Digitizing data in all its various forms and collecting it in structured formats helps us in Big Data analysis. Analyzing this data gives us useful insights into the behavioral shifts of customers. Videos and photographs, when digitized, can help us gain insights into the behavior patterns of users. The Google Translate service digitized 95 billion lines of text from every book it could access and created a robust and freely accessible database for searching.
  • 15. DATA IS PRICELESS The value of data is immense. The earlier preconception was that once data has been used it loses its value and becomes redundant, but with Big Data analysis tools this data can be reused and can be worth billions of dollars, as will be evident from the next example. Examples: Amazon's partnership with AOL to improve its e-commerce website. Facebook's behavioral analysis. The concept of reCAPTCHA by Luis von Ahn (CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart). The commercial valuation of a company like Facebook or WhatsApp rests largely on the data it can acquire.
  • 16. DATA IS PRICELESS (CONTINUED) By typing the text in a reCAPTCHA image you not only identify yourself as a human but also decipher scanned words from The New York Times and from books on Google Books. Unknowingly, you are helping Google to digitise its library, saving Google around 1 billion USD every year.
  • 17. THE DARK SIDE OF BIG DATA Big Data undermines the privacy of online users and their data. With more and more people expressing themselves on the internet, their data is subject to misuse. A world dominated by Big Data can lead to a situation where data acts as a dictator and real human insight is compromised. As Big Data only gives us the answer to "What" and not "How", the real reasons behind events might be wrongly interpreted, which can jeopardize the very purpose of the analysis. Companies that are sitting on a treasure trove of data, like Google and Facebook, can manipulate and monopolize the use of data and in the process harm consumers.
  • 18. TAMING THE BULL Technology is changing at an incredible pace, and governments and netizens have not been able to anticipate this and act accordingly. Governments need to respond to the changing cyber demographics and keep updating their cyber laws so that they stay in sync with the prevalent technological loopholes. Freedom of speech is a constitutional guarantee, but this right comes with a responsibility. People share information trusting that nobody will misuse it, but in the age of Big Data the decision to use that data for analysis lies with the companies. Changing the rules so that users are empowered to consent to information sharing is a step in the right direction.
  • 19. TAMING THE BULL (CONTINUED) In every field, from nuclear technology to bioengineering, we first build tools that we discover can harm us and only later set out to devise the safety mechanisms to protect us from those new tools. In the case of Big Data as well, these issues need to be addressed. Our task is to appreciate the hazards of this powerful technology, support its development and seize its rewards.
  • 20. THE FUTURE STARTS TODAY! E-commerce and all business domains are on the verge of a big shift driven by Big Data and intelligent technologies. This shift is towards a more efficient, personalized, even automated customer journey. Emerging personalization tools are designed to mimic the brain, leveraging neural networks and deep learning.
  • 21. THE FUTURE STARTS TODAY! (CONTINUED) Video surveillance can gain much wider application with the addition of behavioral analysis algorithms, which can help retail stores to step up sales. Data recorded from sensors can be analysed and used in systems such as anti-theft systems for cars and floor pressure mapping systems. Adding Artificial Intelligence to Big Data analysis can answer not only the "What" but also the "How" of things.
  • 22. HOW DOES IT HELP THE STUDENTS? The last decade of the IT industry was mostly driven by technology, but this decade is expected to rise on the back of information in the form of Big Data. Thus the demand for Data Analysts and Data Scientists is on the rise. It is estimated that around 4.4 million Data Analysts will be required by 2020. Skills required to be a Big Data Professional
  • 23. WANT TO EXPLORE? Follow this YouTube channel: https://www.youtube.com/user/ibmbigdata Read the book: Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier.
  • 24. ANY QUESTIONS??
  • 25. BIBLIOGRAPHY https://www.youtube.com/user/ibmbigdata; Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier; www.google.com; www.amazon.com; www.en.wikipedia.org/Big_data; www.techcrunch.com; www.mckinsey.com/insights/big_data_the_next_frontier_for_innovation; articles from IEEE Magazine; www.nytimes.com; www.mit.edu; and many more.
  • 26. A NOTE OF THANKS! We express our heartfelt gratitude to our faculty members for giving us this opportunity to explore a new subject and delve deeper into it. Thank you for your patience and time. If you want to download this presentation, follow the link www.slideshare.net/ashishmishraoders/big-data-a-technological-revolution