data science and engineering for marketers

@MicahHerstand INNOVATION IN MARKETING: DATA SCIENCE & ENGINEERING

Upload: micah-herstand

Post on 21-Apr-2017

INNOVATION IN MARKETING: DATA SCIENCE & ENGINEERING

Software Engineer, User Advocate, Writer, Actor, Singer-Songwriter

@MICAHHERSTAND

ˈmikə

“Marketing has become a technology-powered discipline, and therefore, marketing organizations must infuse technical capabilities into their DNA.”

~Scott Brinker, MarTech Conference Program Chair

LESSON OBJECTIVES: Theory

Discover how data science enables marketing innovation
Measure, Metric, CSF, KPI
Customer segmentation
Big Data, Open Data, Linked Data
Growth Hacking

Ensure your org’s data engineering empowers marketers
Database
SQL (Relational), NoSQL (Document, Graph)
Data warehouse

LESSON OBJECTIVES: Setup

Install a database manager
Sequel Pro for Mac
MySQL Workbench for Windows & Linux

Connect to your org’s database
Standard, SSH, SSL

Bookmark SQL helpers
SQLZoo (run SQL online!), Tutorials Point, Khan Academy
Google queries: “site:docs.oracle.com UNKNOWN TERM”

LESSON OBJECTIVES: Practice SQL

Create a mental model for what it’s like to query SQL using English first
Acquire the vocabulary to understand a SQL query
Encounter example SQL queries and see their results
Practice your knowledge through exercises

DATA SCIENCE

Measure, Metric, CSF, KPI
Customer segmentation
Big Data, Open Data, Linked Data
Growth Hacking

DATA SCIENCE: Measure, Metric, CSF, KPI

It’s the metrics, stupid!

“The price of light is less than the cost of darkness.”
~Arthur C. Nielsen, namesake of Nielsen TV ratings

“What gets measured, gets managed.”
“There is nothing so useless as doing efficiently that which should not be done at all.”
“Management is doing things right; leadership is doing the right things.”
~Peter Drucker, the founder of modern management

DATA SCIENCE: Measure

Definition: Anything that can be measured
Caveat: Must be a single-variable measure
E.g. № of users
E.g. № of active users

Challenges:
Definition of terms
E.g. Does account creation make someone a customer?
Measurement process
E.g. How frequently should data be collected?

DATA SCIENCE: Metric

Definition: Value derived from 2+ measures

Metric selection: Efficiency vs effectiveness
E.g. cost of customer acquisition vs customer lifetime value

Analysis: Information vs insights
E.g. customer value vs value of customers acquired through LinkedIn

Optimization: Source vs campaign
E.g. customers w/ expired CC vs customer bounce rate when CC expired

Caution: Vanity, engagement, and benchmark metrics
E.g. Facebook Likes, Time on Page, DVD sales
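
A metric combines two or more measures. As a minimal sketch of that idea, the snippet below derives cost of customer acquisition (an efficiency metric) and a simplified customer lifetime value (an effectiveness metric) from raw measures. Every number, variable name, and formula here is an illustrative assumption, not data or a method from the talk.

```python
# Hypothetical measures for one quarter (illustrative numbers only).
marketing_spend = 12_000.00   # measure: dollars spent on acquisition
new_customers = 150           # measure: customers acquired

avg_order_value = 40.00       # measure: dollars per order
orders_per_year = 6           # measure: orders per customer per year
years_retained = 2            # measure: average customer lifespan in years


def customer_acquisition_cost(spend: float, customers: int) -> float:
    """Efficiency metric: dollars spent to win one customer."""
    return spend / customers


def customer_lifetime_value(order_value: float, orders: int, years: int) -> float:
    """Effectiveness metric: simplified revenue expected from one customer."""
    return order_value * orders * years


cac = customer_acquisition_cost(marketing_spend, new_customers)
clv = customer_lifetime_value(avg_order_value, orders_per_year, years_retained)
print(f"CAC = ${cac:.2f}, CLV = ${clv:.2f}, CLV:CAC = {clv / cac:.1f}")
```

Comparing the two metrics side by side is one way to see whether acquisition spend is paying for itself.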

DATA SCIENCE: CSF (Critical Success Factor)

Definition: What is required to achieve business objectives.
E.g. acquire new customers

Prerequisites: Business objectives
E.g. to obtain 10% market share (BO), must acquire new customers (CSF)

More on CSFs: bit.ly/sidata-csf

DATA SCIENCE: KPI (Key Performance Indicator)

Definition: A measurable value that demonstrates how effectively a company is achieving key business objectives
E.g. cost per lead, customer lifetime value, traffic-to-lead ratio, retweets of last ten tweets, landing page conversion rates

Prerequisites: Critical Success Factors
E.g. to acquire new customers (CSF), track those acquired per week (KPI)

Requisites: SMART (Specific, Measurable, Achievable, Relevant, Time-bound)
E.g. weekly rate of customer acquisition

Caution: Perverse incentives and unintended consequences
E.g. referral programs to increase customer acquisition

More on KPIs: bit.ly/sidata-kpi
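
The "weekly rate of customer acquisition" KPI can be sketched by grouping signup dates into ISO weeks. The dates below are made up for illustration, and the name `acquired_per_week` is hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical signup dates (illustrative only).
signup_dates = [
    date(2017, 3, 6), date(2017, 3, 7), date(2017, 3, 9),
    date(2017, 3, 13), date(2017, 3, 17),
    date(2017, 3, 20),
]

# Group signups by ISO (year, week number) to get a time-bound, measurable KPI.
acquired_per_week = Counter(d.isocalendar()[:2] for d in signup_dates)

for (year, week), count in sorted(acquired_per_week.items()):
    print(f"{year}-W{week:02d}: {count} new customers")
```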

DATA SCIENCE: CSFs vs KPIs

Graphic origin: bit.ly/sidata-kpi-vs-csf

DATA SCIENCE: Prioritization

“Never confuse motion with action.” ~Benjamin Franklin

Graphic Origin: bit.ly/sidata-metrics-graphic

DATA SCIENCE: Customer Segmentation

DATA SCIENCE: Customer Segmentation, 2.0

DATA SCIENCE: Customer Segmentation, 3.0

DATA SCIENCE: Customer Segmentation

Definition: The practice of dividing a customer base into groups of individuals that are similar in specific ways relevant to marketing
E.g. SI grads, New Yorkers, users who have yet to purchase

Utility: One size does not fit all. Allows for novel KPIs.

Prerequisites: Business Objectives, Metrics
E.g. You want to gain 10% of the salon market (Biz Objective) and only 25% of total customers are men (metric), so target men as an under-saturated market

Types: A priori, Needs-based, and Value-based

Caution: Don’t break the law by targeting protected classes
E.g. Airbnb cannot offer Iranian-Americans discounts for Nowruz

More on KPIs: bit.ly/sidata-kpi
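
As a rough sketch of segmentation in practice, the snippet below splits a tiny made-up customer table into an a priori segment (by city) and a "yet to purchase" segment. The `customers` table, its columns, and its rows are hypothetical, and SQLite (bundled with Python) stands in for whatever database your org uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, total_spent REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", "New York", 0.0), ("Ben", "New York", 120.0),
     ("Cy", "Chicago", 45.0), ("Di", "Chicago", 0.0), ("Eli", "Boston", 300.0)],
)

# A priori segment: group by a known attribute (city).
by_city = conn.execute(
    "SELECT city, COUNT(*) FROM customers GROUP BY city ORDER BY city"
).fetchall()

# Behavioral segment: users who have yet to purchase.
no_purchase = conn.execute(
    "SELECT name FROM customers WHERE total_spent = 0 ORDER BY name"
).fetchall()

print(by_city)
print(no_purchase)
```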

DATA SCIENCE: Big Data, Open Data, Linked Data

“Big Data will spell the death of customer segmentation and force the marketer to understand each customer as an individual.”
~Ginni Rometty, CEO, IBM

“Google only gives you answers for questions people have asked before.”
“A mark of a good site is realizing you're not the only site in the world.”
~Tim Berners-Lee, inventor of the World Wide Web

DATA SCIENCE: Big Data

Definition: Data sets that are so large or complex that traditional data processing applications are inadequate to deal with them.

Technical Challenges:
Volume (amount of data)
Velocity (speed of data in and out)
Variety (range of data types and sources)

Human Challenges:
No magic bullets, easy to overstate current capabilities

Novel Opportunities:
Real-time pricing, Sentiment analysis, Optimized offers

Designed by Forrester Research, accessed at bit.ly/sidata-bigdata

DATA SCIENCE: Open Data

Definition: The idea that data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. “Free as in speech, not beer.”
E.g. data.gov, census.gov

Alternate Definition: Public or private data stores available for integration into one’s own data system.
E.g. developer.nytimes.com, Thomson Reuters

Challenges:
Low cost, high quality, and large quantity: pick two
Data normalization (e.g. gender and sex, China bowls vs China country)

DATA SCIENCE: Linked Data

Definition: A method of publishing structured data so that it can be interlinked and become more useful through semantic queries.
E.g. Facebook’s Open Graph, Google Rich Snippets, Twitter Cards

Novelty: Data sources share schema so no middleware necessary

Challenges:
Comparatively few data sources
Data analysis tools less mature
Fewer trained developers

“Marketing department might want to dominate the Linked Data web.”
~Ralph Swick, COO of the W3C, organization responsible for World Wide Web standards

DATA SCIENCE: Linked Data

“When companies post data as Linked Data they can be held accountable. Regex has [fuzzy] responsibility.”
~Ralph Swick, COO of the W3C, organization responsible for World Wide Web’s technology standards

Accessed March 8th, 2017

DATA SCIENCE: Growth Hacking

Graphic origin: bit.ly/sidata-gh-cartoon-2

DATA SCIENCE: Growth Hacking

Graphic origin: bit.ly/sidata-gh-cartoon-3

DATA SCIENCE: Growth Hacking

“Growth hackers are a hybrid of marketer and coder.”
“[Growth hacking] requires a blurring of lines between marketing, product, and engineering, so that they work together to make the product market itself.”
~Andrew Chen, Head of Rider Growth at Uber

“The true unicorns are those who can go end-to-end designing, building, measuring, analyzing, and iterating with a combination of user intuition and deep analytics.”
~Matt Humphrey, sold his startup HomeRun for $100M+ after 18 months

DATA SCIENCE: Growth Hacking

Definition: A process of rapid experimentation across marketing channels and product development to identify the most effective, efficient ways to grow a business.
E.g. Airbnb cross-listing on Craigslist

Novelty: Interdisciplinary skills and knowledge

Prerequisites: Interdisciplinary teams, acceptance of failure, outside-the-box thinking

Requisites: Measurable, metric-based

DATA SCIENCE: Growth Hacking Example

DATA ENGINEERING

Database
SQL (Relational)
NoSQL (Document, Graph)
Data warehouse

DATA ENGINEERING: Database

Definition: A collection of structured data, organized for rapid search by an automated computer program.

Novelty: List or calculate data from various sources
E.g. How much revenue has been made by sales from customers whose first visit was referred by a Facebook ad?
E.g. How many customers (who have made at least $100 in purchases total) have used our referral program?
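
To make the first question concrete, here is a minimal sketch that combines two sources (a customer table and a sales table) to total the revenue from customers first referred by a Facebook ad. The table names, column names, and rows are all hypothetical, and SQLite stands in for a production database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_referrer TEXT);
CREATE TABLE sales (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'facebook_ad'), (2, 'search'), (3, 'facebook_ad');
INSERT INTO sales VALUES (1, 50.0), (1, 25.0), (2, 40.0), (3, 10.0);
""")

# Join each sale to its customer, keep Facebook-referred customers, sum revenue.
(facebook_revenue,) = conn.execute("""
    SELECT SUM(sales.amount)
    FROM sales
    INNER JOIN customers ON customers.id = sales.customer_id
    WHERE customers.first_referrer = 'facebook_ad'
""").fetchone()

print(facebook_revenue)
```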

DATA ENGINEERING: Database

Structured data
Primary key

DATA ENGINEERING: Levels of structure

Graphic Origin: http://5stardata.info/

DATA ENGINEERING: Database Keys

DATA ENGINEERING: Database Security

DATA ENGINEERING: Relational Database

Definition: A type of database that organizes data into tables (think spreadsheet) and creates clearly defined relationships between those tables.
E.g. SQL (MySQL, PostgreSQL, SQLite, Oracle Database, MS SQL)

SQL is a programming language that lets people set up relational databases as well as add, update, delete, and look up data within them.

Novelty: Up-front schema, data integrity checks, transactions.
E.g. ensure a movie cannot be added without an associated director

Challenges: Large datasets and an evolving schema are difficult to manage.
E.g. you want to track customers’ age, then decide not to, then decide to track gender as a binary, then decide to make gender a free-text option…

bit.ly/sidata-sql-vs-nosql

DATA ENGINEERING: Relational Database

Foreign key
Movies

DATA ENGINEERING: Relational Database

Movies
Directors
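
The Movies and Directors relationship can be sketched with a foreign key that rejects a movie whose director does not exist. The column names (`id`, `title`, `director_id`, `name`) are assumptions for illustration, and SQLite is used as a stand-in; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, whereas other relational databases may enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE directors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE movies (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    director_id INTEGER NOT NULL REFERENCES directors(id)  -- the foreign key
);
INSERT INTO directors VALUES (1, 'Jordan Peele');
INSERT INTO movies VALUES (1, 'Get Out', 1);
""")

# Director 99 does not exist, so the database should refuse this row.
try:
    conn.execute("INSERT INTO movies VALUES (2, 'Orphan Movie', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

movie_count = conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
print(rejected, movie_count)
```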

DATA ENGINEERING: NoSQL Databases

Definition: A database that is not a relational database. (NoSQL is colloquial jargon, not a standard)
E.g. MongoDB, Redis, Couchbase, Neo4j

Novelty: No schema required to store data. Easily scalable. Super fast lookups.
E.g. easy to track customers’ age, then decide not to, then decide to track gender as a binary, then decide to make gender a free-text option…

Challenges: Data integrity, stable transactions.
E.g. cannot ensure a director is always included when adding a movie

bit.ly/sidata-sql-vs-nosql
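
The trade-off can be illustrated without installing a NoSQL database: plain Python dictionaries behave much like schemaless documents. This is only an analogy under that assumption, not the API of MongoDB or any other product, and all records are made up.

```python
# Each "document" is a free-form dict, a stand-in for a document-store record.
customers = [
    {"name": "Ana", "age": 31},              # early record: age tracked
    {"name": "Ben"},                         # later: age no longer tracked
    {"name": "Cy", "gender": "F"},           # later: gender tracked as a binary
    {"name": "Di", "gender": "non-binary"},  # later: gender as free text
]

movies = [{"title": "Get Out", "director": "Jordan Peele"}]
movies.append({"title": "Orphan Movie"})  # nothing stops a movie without a director

# With no schema, integrity checks become the application's job.
missing_director = [m["title"] for m in movies if "director" not in m]
with_age = [c["name"] for c in customers if "age" in c]

print(missing_director, with_age)
```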

DATA ENGINEERING: Data warehouse

Definition: A computer system optimized for analytical and informational processing that is filled with data copied from both inside and outside the enterprise
E.g. a database with a sales table, a Google Analytics table, and a census table.

Novelty: Analyze business data without affecting day-to-day operations
E.g. you want to see employee clock-in times without preventing them from simultaneously clocking out.

Challenges: Large overhead and maintenance costs, which may not be worth it if a warehouse is not actually necessary

DATABASE SETUP

Database manager application
Database Connections
SQL Helpers to Bookmark

DATABASE SETUP: DB Manager Application

Definition: A graphical user interface that simplifies database interactions for developers

Examples:
Sequel Pro for Mac: bit.ly/sidata-mac
MySQL Workbench for Windows & Linux: bit.ly/sidata-not-mac
phpMyAdmin for web access

DATABASE SETUP: Database connections

Unsecured Connections are often called “standard” and require no setup besides the application you just downloaded

Secured Connections can use SSH or SSL and require additional encryption technology to be installed on your computer.

Your company should have documentation on how to use these.

DATABASE SETUP: DB Connection Info

Server: www.herstand.com
User: sistudents
Password: Hf68S9CpK67RUDV3
Database: simovies
Port: 3306 (default MySQL port)

DATABASE SETUP: SQL Helpers to Bookmark

Learn: TutorialsPoint.com, KhanAcademy.com
Play: SQLZoo.net (run SQL online!)
Cheatsheet: bit.ly/sidata-sql-cheat-sheet
Cheatsheet with examples: bit.ly/sidata-cheat-with-examples
RTFM: bit.ly/sidata-mysql-rtfm

PRACTICE SQL

English queries
Vocabulary
Stock SQL queries
Exercises

PRACTICE SQL: English queries

Questions SQL can answer: Who, What, Which, Where, When, How Many
E.g. Who directed the film Get Out?
E.g. Who acted in the film Get Out?
E.g. What films were released before Jan 1, 2000?
E.g. Where did the director of Get Out go to college?
E.g. Which colleges had the most graduates direct films since Jan 1, 2000?
E.g. When was Get Out released?
E.g. How many actors were in both Get Out and The West Wing?
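
Three of these English questions translate almost word for word into SQL. The sketch below assumes a hypothetical single `movies` table with `title`, `release_date`, and `director` columns (the class database may be organized differently) and runs on SQLite rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT, director TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", [
    ("Get Out", "2017-02-24", "Jordan Peele"),
    ("The Matrix", "1999-03-31", "The Wachowskis"),
])

# "Who directed the film Get Out?"
who = conn.execute("SELECT director FROM movies WHERE title = 'Get Out'").fetchall()
# "When was Get Out released?"
when = conn.execute("SELECT release_date FROM movies WHERE title = 'Get Out'").fetchall()
# "What films were released before Jan 1, 2000?"
what = conn.execute("SELECT title FROM movies WHERE release_date < '2000-01-01'").fetchall()

print(who, when, what)
```

The pattern is the same each time: the thing you want goes after SELECT, where it lives goes after FROM, and the condition goes after WHERE.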

PRACTICE SQL: Vocabulary

Syntax: , . ; ( ) " " *

Verbs: SELECT, INSERT, UPDATE, DELETE

Query Parts: AS, FROM, WHERE, HAVING, ORDER BY, GROUP BY

Filters: LIKE, NOT, >, <, =, !=, >=, <=, AND, OR, IN, %

Sort: ASC, DESC

Aggregate Functions: MIN, MAX, SUM, AVG, COUNT

Advanced Functions: INNER JOIN, OUTER JOIN, REGEXP

PRACTICE SQL: Anatomy of a Query

SELECT * FROM movies;

Result

PRACTICE SQL: Anatomy of a Query

SELECT ___ FROM movies WHERE ___;

SELECT options:
*
title, release_date
title
COUNT(title) AS num_of_titles
title, MIN(release_date)

WHERE options:
title LIKE "%Star Wars%"
release_date > '2000-01-01'
release_date > '2000-01-01' AND title LIKE "%Star Wars%"
title = "Get Out"

Result
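
The following sketch runs a few SELECT and WHERE combinations against a tiny made-up `movies` table so the results can be inspected. One detail worth seeing in action: `%` acts as a wildcard only with LIKE, while `=` compares the whole string literally. The rows and the helper name `run` are illustrative assumptions, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?)", [
    ("Star Wars", "1977-05-25"),
    ("Star Wars: The Force Awakens", "2015-12-18"),
    ("Get Out", "2017-02-24"),
])


def run(sql):
    """Hypothetical helper: execute a query and return all rows."""
    return conn.execute(sql).fetchall()


# '=' looks for a title that is literally "%Star Wars%", so nothing matches.
exact = run("SELECT title FROM movies WHERE title = '%Star Wars%'")
# LIKE treats % as "any characters", so both Star Wars titles match.
pattern = run("SELECT title FROM movies WHERE title LIKE '%Star Wars%' "
              "ORDER BY release_date")
# AND requires both conditions to hold.
recent = run("SELECT title FROM movies "
             "WHERE release_date > '2000-01-01' AND title LIKE '%Star Wars%'")
# AS renames the output column of an aggregate.
(num_of_titles,) = conn.execute(
    "SELECT COUNT(title) AS num_of_titles FROM movies").fetchone()

print(exact, pattern, recent, num_of_titles)
```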

PRACTICE SQL: Anatomy of a Query

SELECT * FROM movies ___ ___ ___;

Clause options: GROUP BY, ORDER BY
Column options: release_date, title
Sort options: ASC, DESC

Result
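
A short sketch of the difference between the two clauses: ORDER BY only rearranges rows, while GROUP BY collapses rows that share a value so an aggregate such as COUNT can summarize them. The placeholder titles and dates are made up, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, release_date TEXT)")
conn.executemany("INSERT INTO movies VALUES (?, ?)", [
    ("Movie B", "2017-02-24"),
    ("Movie A", "2017-02-24"),
    ("Movie C", "1999-03-31"),
])

# ORDER BY sorts rows: newest release first, ties broken alphabetically by title.
newest_first = conn.execute(
    "SELECT title FROM movies ORDER BY release_date DESC, title ASC"
).fetchall()

# GROUP BY merges rows sharing a release_date; COUNT(*) reports how many were merged.
per_date = conn.execute(
    "SELECT release_date, COUNT(*) AS num_of_titles FROM movies "
    "GROUP BY release_date ORDER BY release_date ASC"
).fetchall()

print(newest_first)
print(per_date)
```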

PRACTICE SQL: Exercises

List all movies and their average rating; the average column should be called 'average'
List only the top-rated movie
List only the bottom-rated movie
Which user gives the highest rating on average?
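
For local practice, here is one possible approach to the first exercise. It assumes hypothetical `movies` and `ratings` tables with the columns shown; the real class database may use different names, so treat this as a sketch to adapt rather than the expected answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating REAL);
INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B');
INSERT INTO ratings VALUES (1, 1, 4), (2, 1, 5), (1, 2, 2), (2, 2, 3);
""")

# Join ratings to movies, average per movie, and name the column 'average'.
averages = conn.execute("""
    SELECT movies.title, AVG(ratings.rating) AS average
    FROM movies
    INNER JOIN ratings ON ratings.movie_id = movies.id
    GROUP BY movies.id, movies.title
    ORDER BY average DESC
""").fetchall()

print(averages)
```

Because the rows are already sorted by `average`, the top-rated and bottom-rated exercises likely need only a change of sort direction plus a LIMIT 1.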