BIG DATA: The next frontier for emerging market

Post on 23-Mar-2016

DESCRIPTION

BIG DATA: The next frontier for emerging market. USC CSSE Annual Research Review, March 14, 2013. Rachchabhorn Wongsaroj, Bank of Thailand, Visiting Scholar @ USC. Outline: Current situation; What is big data?; Why big data is important?; Big data cases; Research challenges.

TRANSCRIPT

Slide 1

BIG DATA
The next frontier for emerging market
USC CSSE Annual Research Review
March 14, 2013
Rachchabhorn Wongsaroj
Bank of Thailand, Visiting Scholar @ USC

Outline
Current situation
What is big data?
Why big data is important?
Big data cases
Research challenges
Big data in Thailand
Future research

Current Situation
Data quantity, data quality, data variety, data timeliness
Lots of data is being created & collected
Global data
Problems

What is big data?
Big Data = Volume, Variety and Velocity
Volume
People to People, People to Machine, Machine to Machine
Variety
Velocity
8 billion messages/day, 845M active users
340 million Tweets/day, 140M active users
20 hours of video uploaded every minute
Source: Gartner & IBM

Why big data is important?
Emerging Technologies Hype Cycle 2011 (Gartner)

Why big data is important?
Emerging Technologies Hype Cycle 2012 (Gartner)

Why big data is important?
Source: McKinsey Global Institute Analysis

Why big data is important?
Big data can generate significant financial value across sectors
US Health Care: $300 billion value/year; 0.7% annual productivity growth
Europe Public Sector Administration: €250 billion value/year; 0.5% annual productivity growth
Global Personal Location Data: $100 billion+ revenue for service providers; up to $700 billion value to end users
US Retail: 60+% increase in net margin possible; 0.5-1.0% annual productivity growth
Manufacturing: up to 50% decrease in product development; up to 7% reduction in working capital
Source: McKinsey Global Institute Analysis

Why big data is important?
Health Care sector has potential to invest $300B
Clinical: transparency in clinical data and clinical decision support; 49%, $165B
R&D: personalized medicine, clinical trial design; 32%, $108B
Accounts: advanced fraud detection, performance-based drug pricing; 14%, $47B
Public health: surveillance and response systems; 3%, $9B
Business model: aggregation of patient records, online platform and communities; 2%, $5B
Source: US Department of Labor

Big data cases
Data sources / Techniques | Output
Google patient search data, predictive model, etc. | Hospitalization pattern, customized insurance
Advanced analytic solutions | Process time reduction
Customer transactions | Customer defection prediction
Trading transactions & IP address | Possible frauds, financial bubble, money laundering
Real time people & location data | Crime and terrorist prevention
Product search pattern, social media | Website outage/peak time support, travel trend and pattern

Function | Big data retail lever
Marketing | Cross-selling; Location based marketing; In-store behavior analysis; Customer micro-segmentation; Sentiment analysis; Enhancing the multichannel consumer experience
Merchandising | Assortment optimization; Pricing optimization; Placement and design optimization
Operations | Performance transparency; Labor inputs optimization
Supply Chain | Inventory management; Distribution and logistic optimization; Informing supplier negotiations
New Business Model | Price comparison services; Web-based markets
Source: McKinsey Global Institute Analysis

Research Challenges
Customer micro-segmentation
Sentiment analysis
Performance transparency
Labor inputs optimization
Price comparison services

Responsive and more Personalized Government
Extract public opinion on policies
Large-scale technology enabled policy, e.g. Smart City, personalized healthcare service, preventive/intelligence surveillance
Personalized policies
Mostly as tool for policy implementation
[Chart: sector]

Big data in Thailand
Challenges
Language
Cost of implementation
Magnitude of data
Demographic data generator
Data type

Big data in Thailand
Language (natural language processing)
No space between words
Combination of Thai and foreign languages
Lack of Thai text analytic components
Example

September 2012 infographic: Thailand Digital Statistic Sep 2012
Thai population 64,076,033 (female 32,546,885; male 31,529,148); 8-10M live in BKK (registration record 5,674,843)
People who use the Internet: 25M; search engine use is Google 99%; frequency 19.2M per day
Browser use: Internet Explorer 44%, Google Chrome 31%, Mozilla Firefox 14% and Safari 9%
Local bandwidth 1,006,140 Mbps; overseas bandwidth 40,860 Mbps
Top websites: Sanook.com, Kapook.com, Mthai.com, alexa, Facebook.com, google.co.th
Facebook Facebook 16 1 8,682,940 16,403,280 Facebook Pages iPhone 5 twitter 909,631 ( 15 2555) @khunnie0624 twitter WoodyTalk
Source: http://www.it24hrs.com/2012/thailand-digital-statistic-internet-user/
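
The "no space between words" challenge on the language slide means a text-analytics pipeline has to find word boundaries before it can count or classify anything. The sketch below shows one classic baseline, greedy longest-match segmentation against a dictionary. It is not a method taken from the talk; the function name `segment` and the three-word toy dictionary are assumptions made for illustration, and practical Thai tokenizers rely on much larger dictionaries or statistical models.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation of unspaced text.

    Characters that do not start any dictionary word are emitted
    one at a time as their own tokens.
    """
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
        else:
            # No dictionary word starts here: keep the single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Toy dictionary (hypothetical, for illustration only): "I", "eat", "rice".
TOY_WORDS = ["ฉัน", "กิน", "ข้าว"]
sentence = "".join(TOY_WORDS)  # written without spaces, as in Thai text
print(segment(sentence, set(TOY_WORDS)))
```

Greedy matching is easy to follow but can pick the wrong boundary when one dictionary word is a prefix of another, which is one reason dedicated Thai text analytic components are needed.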

Big data in Thailand
Cost of implementation
13 big data vendors in 2013
Hadoop:
Requires: ~$1 million for between 125 and 250 nodes
Distribution annual costs: ~$4,000 per node
-> A small fraction of the cost of an enterprise data warehouse ($10s-$100s of millions)
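
To make the per-node figure concrete, the short calculation below multiplies the quoted ~$4,000 annual distribution cost by the two cluster sizes on the slide. It is only an arithmetic sketch: the slide's approximate figures are treated as exact, and the function name `annual_distribution_cost` is an assumption introduced here.

```python
# Figures quoted on the slide, treated as exact for illustration.
ANNUAL_COST_PER_NODE_USD = 4_000
CLUSTER_SIZES = (125, 250)


def annual_distribution_cost(nodes, cost_per_node=ANNUAL_COST_PER_NODE_USD):
    """Annual distribution cost in USD for a cluster of `nodes` nodes."""
    return nodes * cost_per_node


for nodes in CLUSTER_SIZES:
    print(f"{nodes} nodes: ${annual_distribution_cost(nodes):,} per year")
```

Under these assumptions the annual distribution cost runs from $500,000 to $1,000,000, well below the tens to hundreds of millions of dollars quoted for an enterprise data warehouse.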

Big data in Thailand
Magnitude of data
As of September 2012
Browser use: Internet Explorer 44%, Google Chrome 31%, Mozilla Firefox 14%, Safari 9%
Overseas bandwidth: 405,860 Mbps
Local bandwidth (.th, or.th, etc.): 1,006,140 Mbps
25% use smart phone
8% use tablet
60% use local bandwidth

Big data in Thailand
Demographic data generator
Population: 65M
Internet users: 25M
39% of the population use the Internet
85.9% of data is created by Internet users aged 6-24
Most data are from young generations

Big data in Thailand
Types of data: limited application of big data techniques
Only 2.12% focus on education
Source: http://www.prd.go.th/ewt_news.php?nid=23168

Bank of Thailand (BOT) website: As is
Diagram elements: Financial institution, BOT data (Internet/Extranet), DB 1, DB 2, DB 3, Manual checking, Template input, Manual submit, BTWS working, BOT website, Auto submit
Source: Bank of Thailand
Problems
Too many steps
Once due: act first, fix later
Too many stakeholders
Bureaucracy management style

BOT data website: As is
Diagram elements: Input data, Complex validation, Cross validation, Manual check, Query data (BO), Input template, Manual submit, Website approve, Manual checking
Timeliness, Revision policy, Accuracy & reliability
Volume, Variety, Velocity
Source: Bank of Thailand

BOT data website: To be
Diagram elements: Input data, Complex validation, Cross validation, System checking, System warning, System approve, Website approve, Manual checking
Accuracy & reliability
Source: Bank of Thailand

Future research
Data quality management
Tools
Template
Checklist
Process
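
The "To be" flow replaces manual checks with system checking, system warning and system approval. As a rough picture of what such automated checks could look like, the sketch below runs one single-record rule and one cross-field rule and then either approves or warns. Everything in it is hypothetical: the `Submission` fields, both rules and the tolerance are assumptions chosen for illustration and are not taken from the Bank of Thailand's systems.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """One hypothetical report row submitted by a financial institution."""
    institution: str
    total_assets: float
    total_liabilities: float
    equity: float
    warnings: list = field(default_factory=list)


def complex_validation(sub):
    # Hypothetical single-record rule: amounts must not be negative.
    if min(sub.total_assets, sub.total_liabilities) < 0:
        sub.warnings.append("negative amount")


def cross_validation(sub, tolerance=0.01):
    # Hypothetical cross-field rule: assets = liabilities + equity.
    gap = abs(sub.total_assets - (sub.total_liabilities + sub.equity))
    if gap > tolerance:
        sub.warnings.append("balance sheet does not balance")


def system_check(sub):
    """Run all checks and return 'approve' or 'warn'."""
    complex_validation(sub)
    cross_validation(sub)
    return "warn" if sub.warnings else "approve"


print(system_check(Submission("Bank A", 100.0, 80.0, 20.0)))
print(system_check(Submission("Bank B", 100.0, 80.0, 10.0)))
```

In a flow like this, only submissions that come back as "warn" would need to reach the remaining manual checking step.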

Reference
Big Data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute Analysis
Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, IBM
Gartner Report
Thailand National Statistic Office
Thailand Digital Statistic Source
Bank of Thailand (www.bot.or.th)

BIG DATA
The next frontier for emerging market
Rachchabhorn Wongsaroj
Bank of Thailand
Visiting Scholar @ USC
Thank you
Q & A

Big data technology
Google File System (GFS), Map Reduce (MR) programming model
Use Google big data infrastructure from papers
GFS -> Hadoop Distributed File System (HDFS)
MapReduce -> Hadoop MapReduce
etc.
Higher-level languages
Pig (Yahoo!), Jaql (IBM), Hive (Facebook)
Mostly use MR technique (Pig >60%, Hive QL >90%)
Cosmos, Dryad, DryadLINQ, SCOPE (Bing)

Big data technology
Relational DBMSs Versus Map Reduce/Hadoop
Relational DBMSs | Map Reduce/Hadoop
Proprietary, mostly | Open source
Expensive | Less expensive
Data requires structuring | Data does not require structuring
Great for speedy indexed lookups | Great for massive full data scans
Deep support for relational semantics | Indirect support for relational semantics, e.g., Hive
Indirect support for complex data structures | Deep support for complex data structures
Indirect support for iteration, complex branching | Deep support for iteration, complex branching
Deep support for transaction processing | Little or no support for transaction processing
Source: Kimball Group, April 2011

What is big data?
Big data exceeds the reach of commonly used hardware development and software tools to capture, manage, and process it within a tolerable elapsed time for its user populations (Teradata Magazine article, 2011)
Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze (The McKinsey Global Institute, 2011)
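
The Map Reduce programming model named on the big data technology slide can be pictured with the usual word-count example. The sketch below imitates the map, shuffle and reduce phases in a single Python process; it does not use Hadoop, and the function names are assumptions chosen for readability rather than any framework's API.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # In a real cluster each document would be mapped on a different node.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data", "Big data in Thailand"]))
```

Because each map call looks at one document and each reduce call at one key, the work can be spread across many machines, which is why the model suits the massive full data scans listed in the comparison table.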
