TRANSCRIPT
How PepsiCo’s Big Data Strategy is Disrupting CPG Retail Analytics
Mike Riegling, Analyst, PepsiCo
presented by:
Will Davis, Trifacta
Jeff Huckaby, Tableau
Camilo Silva, Hortonworks
Your Presenter
Mike Riegling
Analyst, Customer Supply Chain
Q&A Session with your hosts:
Will Davis
Director of Product Marketing
Jeff Huckaby
Market Segment Director, Retail & Consumer Goods
Camilo Silva
Enterprise Account Manager
The Best-of-Breed Analytics Stack
Leading solutions for data processing, wrangling & visualization
Industry-leading data wrangling solution for data analysts
• Self-service data exploration & preparation
• Supporting desktop, cloud and big data deployments
Industry-leading Enterprise Analytics Platform
• Governance & self-service analytics at scale
• Deploy on premise, in the cloud, or fully hosted
Future-proof scalable data platform to enable storage and growth of expanding data
• Allows faster business decisions based on more actionable insight
• Enables corporate success in consumer markets
Agenda
Analytics Infrastructure at PepsiCo – Will Davis
• History of Big Data at PepsiCo
• IT/Business Collaboration for Analytics
• Analytics Stack: Hortonworks + Trifacta + Tableau
CPFR Data Wrangling & Analytics at PepsiCo – Mike Riegling
• CPFR Process at PepsiCo
• Challenges Managing Diverse Internal & External Data
• Walkthrough of Trifacta + Tableau
Question & Answer
• Will Davis - Trifacta
• Jeff Huckaby - Tableau
• Camilo Silva - Hortonworks
Analytics Infrastructure at PepsiCo
Analytics Journey at PepsiCo
• PepsiCo’s journey with Big Data started over 4 years ago to respond to ever-increasing data requirements across PepsiCo
• Focus on providing technology infrastructure and applications that bring shared success to Business & IT
• Eliminating traditional processes where IT was a bottleneck to the business
• Unified Data Architecture has 3 main pillars:
– Enterprise Data Warehouse
– Hortonworks Data Lake Environment
– Data Discovery, Analytics & Business Intelligence tools (Trifacta & Tableau)
Data Platform - Hortonworks
• Selected Hortonworks Data Platform (HDP) as foundational technology to extend PepsiCo’s Unified Data Architecture
• Leveraging HDP to acquire, understand and incorporate new forms of internal/external business and consumer data
• HDP provides the platform capable of scaling up to effectively leverage the rapid growth of more granular consumer data
• Still early days on Hadoop at PepsiCo – only managing hundreds of TBs of data in HDP
• Use cases on Hadoop include CPFR (first use case), Consumer & Marketing Analytics
• Need only standard services to support use cases – Hive, YARN, Pig, etc.
• CPFR use case with Trifacta consumes approximately 25-50% of HDP resources
Data Wrangling - Trifacta
• Trifacta was selected as the standard self-service data wrangling tool within our data discovery infrastructure.
• Provides PepsiCo users with a familiar, yet powerful portal for data discovery and process development.
• By empowering business users, Trifacta helps bridge the time and resource boundaries between business and IT
• Enables more rapid deployment of solutions that fit business needs precisely
• Collaborative effort, with both sides open to driving innovation and experimentation, delivers greater speed to shared success
Data Visualization & Business Intelligence - Tableau
• Tableau is the data visualization & business intelligence standard at PepsiCo
• Over 2000 users, 59 projects & 541 workbooks across PepsiCo
• 7+ Tableau servers in production environment (each server has 8 cores & 64GB RAM)
• Tableau serves as the corporate standard for Business Intelligence throughout PepsiCo on top of the EDW, as well as for self-service analytics by departments and individual analysts
• The CPFR use case is a completely self-service process: end users discover and prep diverse data in Trifacta and build dashboards in Tableau (without the help of IT)
Hortonworks + Trifacta + Tableau in the PepsiCo Data Architecture
[Diagram: Unified Data Architecture]
Data Sources: ERP, SCM, CRM, Social Media, Sensor Data, Machine Logs
Tools and Apps: Marketing, Planning, Data Mining, Analytics, Language
Users: Business Analyst, Data Analyst, Data Scientist, Customer Partners, Frontline Workers
Components shown: Enterprise Data Warehouse, Data Discovery/Analytics, Business Intelligence, ETL, Data Quality
PepsiCo CPFR Analysis Process
Collaborations Team Vision
“Expand Collaboration with Customers by Leveraging Shared Data to Enhance Processes, Provide Best in Class Service and Create a Competitive Advantage for PepsiCo”
CPFR Pillars
Planning
• Promotions
• New item introductions
• Transition execution
Forecasting
• Demand planning visibility
• Promotional lifts and pipeline timing
• Seasonal planning
Replenishment
• Store level inventory management
• Right sized inventory position
• Markdown Reduction
Collaboration
Managing Retail Partner Relationships
PepsiCo CPFR Team
Additional Retail Partners
Improving Business with Each Retailer
[Diagram: data feeds around the PepsiCo CPFR Team]
POS Data, Shipment History, Promotions, Forecast, Orders
Production, Inventory, Promotions, Forecast, Orders, Shipments
Forecasting Collaboration Process
Why combine this data together?
• Combining the data into a single master report gives a more accurate overall picture of performance
• Promotes collaboration between PepsiCo and the customer
• Traditionally the vendor–retailer relationship was contentious
• Combining PepsiCo data and retailer data helps promote shared success goals
• Through this process PepsiCo’s forecast accuracy increased, which reduced spoilage for retailers
Original Process for Building CPFR Forecasts
Last-mile structuring, enriching and cleansing
Initial structuring, enriching, and cleansing
Business
What the Process Looked Like in Access
Challenges Leading to Hortonworks + Trifacta + Tableau Solution
• Data Outgrowing Tools: Existing infrastructure pushed to the limits by the size of the source datasets
• Technical Skills Required: Datasets were connected through a large series of elaborate queries and macros.
• Data Quality Issues: Errors difficult to locate.
• Slow, Manual Process: Build time for one CPFR tool could take months.
PepsiCo’s Hortonworks + Trifacta Solution
Business
All structuring, enriching, and cleansing
Hortonworks + Trifacta + Tableau Solution Benefits at PepsiCo
• Business Benefits:
– Reporting time has been reduced by 70%
– Build time has been reduced by as much as 90%
• Technical Benefits:
– Can easily work with large quantities of non-standard data
– Self-service prep for analysts reduces technical dependencies on IT
– Trifacta surfaces errors and data problems immediately to analysts
• PepsiCo CPFR teams can now respond more quickly to sales trends and adjust forecasting and inventory distribution accordingly
DEMO Intro - Trifacta Wrangling Process for Retailer Data
• Structure the third party data
– BOH: Balance on Hand or Inventory Data
• Cleanse mismatched values and delimiters
– Remove the ‘,’ from values that exceed 1,000
• Extract embedded text/numbers
– Split the Customer Item Code and Item Description into two separate columns
• Convert the Customer Item Code to the PepsiCo UPC
– Join the BOH dataset with the Item Reference dataset and build a new master report
• Run the job at scale and profile the results
– Publish to Tableau
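For readers without access to the demo, the cleanse, split and join steps above can be roughly sketched in plain Python. This is not Trifacta's own recipe language and not PepsiCo's actual workflow: the field names, the "code - description" layout, the sample rows and the UPC values are all hypothetical, chosen only to illustrate the kind of transformation described.

```python
import re

# Hypothetical raw BOH (balance on hand) rows as a retailer might send them.
# Field names and values are invented for illustration only.
raw_boh = [
    {"store": "0001", "item": "12345 - COLA 12PK", "boh": "1,250"},
    {"store": "0002", "item": "67890 - CHIPS 10OZ", "boh": "84"},
]

# Hypothetical item reference: customer item code -> PepsiCo UPC (made-up UPCs).
item_reference = {
    "12345": "000000000011",
    "67890": "000000000022",
}


def parse_quantity(text):
    """Remove thousands separators so '1,250' becomes the integer 1250."""
    return int(text.replace(",", ""))


def split_item(text):
    """Split an assumed '<code> - <description>' field into its two parts."""
    match = re.match(r"\s*(\S+)\s*-\s*(.+?)\s*$", text)
    if match is None:
        raise ValueError(f"unrecognized item field: {text!r}")
    return match.group(1), match.group(2)


def wrangle(rows, reference):
    """Cleanse, split and join each row; unmatched item codes get upc=None."""
    report = []
    for row in rows:
        code, description = split_item(row["item"])
        report.append({
            "store": row["store"],
            "customer_item_code": code,
            "description": description,
            "upc": reference.get(code),  # join against the item reference
            "boh": parse_quantity(row["boh"]),
        })
    return report


master_report = wrangle(raw_boh, item_reference)
for line in master_report:
    print(line)
```

Rows whose item code is missing from the reference keep a `upc` of `None` rather than being dropped, so data problems stay visible in the master report instead of disappearing silently.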
Trifacta Sample Workflow
CPFR Dashboard in Tableau
Thanks!
Q&A Session with your hosts:
Will Davis
Director of Product Marketing
Jeff Huckaby
Market Segment Director, Retail & Consumer Goods
Camilo Silva
Enterprise Account Manager
Supporting Resources
Trifacta Wrangler Enterprise for Hadoop
https://www.trifacta.com/gated-form/bringing-hadoop-to-an-analysts-fingertips/
Empowering CPG to Drive Innovation with Data
https://www.trifacta.com/resources/empowering-consumer-packaged-goods-organizations-to-drive-innovation-with-data/
About the Hortonworks Solution
http://hortonworks.com/solutions/
Try Hortonworks Sandbox
http://hortonworks.com/products/sandbox/
Big Data Analytics for Retail with Hadoop
http://hortonworks.com/info/big-data-analytics-for-retail-with-apache-hadoop/
Tableau for Big Data Analysis
http://www.tableau.com/resource/big-data-analysis
Faster, Smarter Retail Analytics with Tableau
http://www.tableau.com/resource/big-data-analysis
Thanks for joining!