How to evolve your analytics stack with your business using Snowplow


Post on 12-Apr-2017






EVOLVING YOUR ANALYTICS STACK WITH YOUR BUSINESS
SNOWPLOW - LONDON MEETUP #4

BUSINESSES ARE CONSTANTLY EVOLVING
- Your products (apps & platforms) change
- Your questions should change too
- It's critical that the analytics stack can evolve with your business

HOW?
SELF-DESCRIBING DATA + EVENT DATA MODELING
EVOLVING EVENT DATA PIPELINE

PART 1: SELF-DESCRIBING DATA

NO TWO COMPANIES ARE ALIKE

DEFINE YOUR OWN EVENTS AND ENTITIES
Events: Build castle, Form alliance, Declare war
Entities: Player, Game, Level, Castle

Events: View product, Buy product, Deliver product
Entities: Product, Customer, Basket, Vehicle

YOU THEN DEFINE A SCHEMA FOR EACH EVENT AND ENTITY
    {
      "description": "Schema for a fighter context",
      "vendor": "com.ufc",
      "name": "fighter",
      "version": "1-0-2",
      "properties": {
        "FirstName": {"type": "string"},
        "LastName": {"type": "string"},
        "Nickname": {"type": "string"},
        "FacebookProfile": {"type": "string"},
        "WeightLbs": {"type": ["integer", "null"]},
        "Record": {"type": "string", "pattern": "^[0-9]+-[0-9]+-[0-9]+$"}
      }
    }

I DON'T DO EVENTS THAT AREN'T SCHEMAED

YOU THEN DEFINE A SCHEMA FOR EACH EVENT AND ENTITY
    {
      "schema": "iglu:com.ufc/fighter/jsonschema/1-0-2",
      "data": {
        "FirstName": "Daniel",
        "LastName": "Cormier",
        "Nickname": "DC",
        "FacebookProfile": "Daniel-Cormier",
        "TwitterName": "dc_mma",
        "WeightLbs": 205
      }
    }

THE SCHEMAS CAN THEN BE USED IN A NUMBER OF WAYS
- Validate the data (important for data quality)
- Load the data into tidy tables in your data warehouse
- Make it easy / safe to write downstream data processing applications (e.g. for real-time users)

PART 2: EVENT DATA MODELING

WHAT IS EVENT DATA MODELING?
Event data modeling is the process of using business logic to aggregate over event-level data to produce 'modeled' data that is simpler for querying.

MODELED VS UNMODELED DATA
Unmodeled data (event 1 ... event n): Immutable. Unopinionated. Hard to consume. Not ...
Modeled data (users, sessions, funnels): Mutable and opinionated. Easy to consume.

IN GENERAL, EVENT DATA MODELING IS PERFORMED ON THE COMPLETE EVENT STREAM
- Late arriving events can change the way you understand earlier arriving events
- If we change our data models, this gives us the flexibility to recompute historical data based on the new model

PART 3: EVOLVING THE DATA PIPELINE

HOW DO WE HANDLE PIPELINE EVOLUTION?
- Businesses change over time
- The events that occur are going to change
- Use of the data will change
- Insight -> more questions -> more insight -> more questions
- Two types of evolution: push and pull

BUSINESSES ARE NOT STATIC, SO EVENT PIPELINES SHOULD NOT BE EITHER

PUSH EXAMPLE:
- If data is self-describing, it is easy to add additional sources
- Self-describing data is good for managing bad data and pipeline evolution

I'M AN EMAIL SEND EVENT AND I HAVE INFORMATION ABOUT THE RECIPIENT (EMAIL ...

ANSWERING THE QUESTION:
1. Existing data model supports answer
2. Need to update data model
3. Need to update data model and data collection

SELF-DESCRIBING DATA AND THE ABILITY TO RECOMPUTE DATA MODELS ARE ESSENTIAL TO ENABLE PIPELINE EVOLUTION

SELF-DESCRIBING DATA
- Update existing events and entities in a backward compatible way, e.g. add optional new fields
- Update existing events and entities in a backward incompatible way, e.g. change field types, remove fields, add compulsory fields
- Add new event and entity types

RECOMPUTE DATA MODELS ON ENTIRE DATA SET
- Add new columns to existing derived tables, e.g. add new audience segmentation
- Change the way existing derived tables are generated, e.g. change sessionization logic
- Create new derived tables

QUESTIONS?
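
The slides distinguish backward compatible schema updates (adding optional fields) from incompatible ones (changing field types, removing fields, adding compulsory fields). A minimal Python sketch of that distinction follows. It is an illustration only, not Snowplow or Iglu tooling: the function name classify_schema_change, the simplified schema layout (a "properties" mapping plus an optional "required" list), and the sample schemas v1 to v3 are all assumptions made for this example, and the check ignores nested properties and pattern changes.

```python
def classify_schema_change(old, new):
    """Classify a schema revision using the rules listed on the slide.

    Both arguments are plain dicts with a "properties" mapping and an
    optional "required" list (a hypothetical, simplified layout).
    """
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    reasons = []

    # Removing a field or changing its type breaks existing data.
    for name, spec in old_props.items():
        if name not in new_props:
            reasons.append(f"removed field: {name}")
        elif spec.get("type") != new_props[name].get("type"):
            reasons.append(f"changed type: {name}")

    # Making any field compulsory breaks data that omits it.
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(newly_required):
        reasons.append(f"new compulsory field: {name}")

    verdict = "incompatible" if reasons else "compatible"
    return verdict, reasons


# Made-up schemas loosely based on the fighter example.
v1 = {"properties": {"FirstName": {"type": "string"},
                     "WeightLbs": {"type": ["integer", "null"]}}}
# v2 only adds an optional field.
v2 = {"properties": {**v1["properties"], "TwitterName": {"type": "string"}}}
# v3 changes a field type and makes a field compulsory.
v3 = {"properties": {"FirstName": {"type": "string"},
                     "WeightLbs": {"type": "string"}},
      "required": ["FirstName"]}

print(classify_schema_change(v1, v2))
print(classify_schema_change(v1, v3))
```

A check like this could decide whether a schema revision needs only a minor version bump or a new major version, though the exact versioning policy is outside what the slides state.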
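
Two points from Parts 2 and 3 can be shown together: late arriving events can change how earlier events are understood, and changing sessionization logic means recomputing the derived table over the entire data set. The sketch below assumes a simple inactivity-timeout definition of a session, which the slides do not specify. The function name sessionize, the (user, timestamp) event shape, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def sessionize(events, timeout):
    """Group (user, timestamp) events into sessions.

    A new session starts when the gap since the user's previous event
    exceeds `timeout`. The whole event list is re-sorted on every call,
    so late arriving events are placed where they belong.
    """
    sessions = []
    for user, ts in sorted(events):
        current = sessions[-1] if sessions else None
        if current and current["user"] == user and ts - current["end"] <= timeout:
            current["end"] = ts
            current["events"] += 1
        else:
            sessions.append({"user": user, "start": ts, "end": ts, "events": 1})
    return sessions


def at(hour, minute):
    # Made-up timestamps on an arbitrary day, for illustration only.
    return datetime(2017, 4, 12, hour, minute)


early = [("u1", at(10, 0)), ("u1", at(10, 20)), ("u1", at(11, 0))]
# A late arriving event fills the 40-minute gap between 10:20 and 11:00.
full = early + [("u1", at(10, 40))]

thirty = timedelta(minutes=30)
ten = timedelta(minutes=10)

print(len(sessionize(early, thirty)))  # before the late event arrives
print(len(sessionize(full, thirty)))   # recomputed over the complete stream
print(len(sessionize(full, ten)))      # recomputed with new session logic
```

Because the raw events are kept unchanged and the sessions are always derived from the full list, both the late event and the changed timeout are handled by simply running the computation again.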