© Stefano Grazioli - Ask for permission for using/quoting: [email protected]
Business Intelligence
Stefano Grazioli
Critical Thinking
Easy Meter
Business Intelligence
The processes, technologies, and people to turn data into information in order to drive profitable business action.
- Wayne Eckerson, TDWI
Source: B. Wixom
BI and Analytics
Analytics is “the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions” (Davenport and Harris – Competing on Analytics)
“BI refers to the general ability to organize, access and analyze information in order to learn and understand the business.” (Gartner)
Analysts cannot find what they need 50% of the time
10-25% of the records have inaccuracies or missing elements
Data is frequently misinterpreted
Known data loss and theft
Most databases implement inconsistent definitions
GIGO: data quality affects the quality of your decisions
Source: T. Redman, Data Driven, 2008
Why is Data Bad?
No one gets up in the morning and says
“I’m going to make lots of errors today”
Find the Data Quality issues

Cust ID  Name            Addr1                  Addr2   City             State     Zip    Phone
0345     Daniel Steeper  765 Spider Cove                New York         NY        10012  875-3253
0346     Mr. Bigg        Mr. Bigg’s Wigs, Inc.          Cville           Virginia  22901  434-567-3455
0467     MJ Watson       753 45th St            Apt 45  New York         New York  10024  999-9999
0488     Carl Zeithaml   34 Sprigg Lane                 Charlottesville  VA        22904  (434)-453-3556
0499     Danny Steeper   765 Spider Cove                New York         NY        10012  #875-3253
0722     Ben Grimm       Broad and Main                 Staunton         VA        24403  null
0834     Sue Storm       8564 Carver Dr.                NYC              NY        null   212-450-3556
0853     Daniel Steeper  2345 Benson Rd                 Los Angeles      CA        90210  #875-3253

StateID  State
VA       Virginia
NY       New York
WY       null
null     null
Approaches to Data Quality
1. Find and Fix
2. Prevent at the source
3. Do nothing (3M)
WINIT
What Is New In Technology?
Homework
Business Scenario: Google’s Daily CAGR
Realistic task: You are a financial analyst at a brokerage firm.
Many of our customers invest in Google for short periods of time. They sell their shares within a few weeks… I wonder: do they make any money out of it?
Daily CAGR for Google
A file with ~800 customers who bought and sold GOOG within the last two months.
Three steps (and two homework assignments)
1. Clean data: phones, dates
2. Compute Daily CAGR = (final price / initial price)^(1/days) - 1
3. Report the average Daily CAGR across all customers.
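To make steps 2 and 3 concrete, here is a minimal sketch of the Daily CAGR formula and the average across customers. It is written in Python for brevity rather than in the VB used elsewhere in these slides. The function name `daily_cagr` and the three sample trades are hypothetical, not taken from the homework file.

```python
from statistics import mean


def daily_cagr(initial_price: float, final_price: float, days: int) -> float:
    """Daily compound growth rate: (final / initial) ** (1 / days) - 1."""
    if initial_price <= 0 or days <= 0:
        raise ValueError("initial_price and days must be positive")
    return (final_price / initial_price) ** (1.0 / days) - 1.0


# Hypothetical trades (NOT from the homework file): (buy price, sell price, days held)
trades = [
    (100.0, 121.0, 2),   # gained over two days
    (100.0, 100.0, 10),  # flat
    (200.0, 180.5, 2),   # lost over two days
]

# Step 2: one Daily CAGR per customer trade
rates = [daily_cagr(buy, sell, days) for buy, sell, days in trades]

# Step 3: average Daily CAGR across all customers
average_daily_cagr = mean(rates)
print(f"Average Daily CAGR: {average_daily_cagr:.4%}")
```

Note that the exponent is 1/days, so a trade held for longer is spread over more compounding periods; rows with a non-positive holding period are rejected rather than silently computed.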
Cleaning Phone Numbers
From: #2345348565
To: (234)-534-8565
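The transformation above could be sketched as follows, in Python rather than the VB used in the homework. The helper name `clean_phone` is hypothetical. The sketch assumes that a value counts as cleanable only when it contains exactly 10 digits, which is the rule in the deck's activity diagram; anything else returns `None` so that the caller can flag the cell (the diagram highlights it in red).

```python
import re
from typing import Optional


def clean_phone(raw: str) -> Optional[str]:
    """Return the phone number as (xxx)-xxx-xxxx, or None if it cannot be cleaned."""
    # Keep only the digits; drops '#', spaces, dashes, parentheses, etc.
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) != 10:
        return None  # caller should flag this cell for manual review
    return f"({digits[:3]})-{digits[3:6]}-{digits[6:]}"


if __name__ == "__main__":
    for value in ["#2345348565", "(434)-453-3556", "875-3253", "null"]:
        print(repr(value), "->", clean_phone(value))
```

Stripping every non-digit first means the same function handles values that are already formatted as well as raw ones.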
When the user presses a button labeled “Start”, a file selection window pops up. The user selects a .csv file. The file is shown starting at “A1”. The Start button becomes invisible. Three more buttons appear: “Clean phone numbers”, “Format Dates”, and “Compute Daily CAGR”.
UML Activity Diagram - Daily Compound Average Growth of a Security (part I)
[Clean ph.no]
Select the next phone no. Count its digits.
[Exactly 10 digits] Format as (xxx)-xxx-xxxx & clear highlight if any.
Otherwise, highlight the cell in red.
Repeat until [No More Ph.No].
[Format Dates]
Select the next column, then select the next item.
[is a date] Format as mm/dd/yyyy & clear highlight if any.
Otherwise, highlight the cell in yellow.
Repeat until [No More items in this column], then until [No more columns].
[Compute]
Next homework.
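The [Format Dates] branch of the diagram might look like the sketch below, written in Python rather than the VB used in the homework. The helper name `format_date` and the list `CANDIDATE_FORMATS` are assumptions: the slides do not say which input formats appear in the file, so the formats tried here are only examples. A value that matches none of them returns `None`, which corresponds to highlighting the cell in yellow.

```python
from datetime import datetime
from typing import Optional

# Assumed input formats; adjust to whatever actually appears in the data file.
CANDIDATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%m-%d-%Y")


def format_date(raw: str) -> Optional[str]:
    """Return the value as mm/dd/yyyy, or None if it is not a recognizable date."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # not this format (or an impossible date); try the next one
        return parsed.strftime("%m/%d/%Y")
    return None  # caller should flag this cell for manual review


if __name__ == "__main__":
    for value in ["2/5/2013", "2013-02-05", "02-30-2013", "not a date"]:
        print(repr(value), "->", format_date(value))
```

Because `strptime` validates the calendar, impossible dates such as February 30 are rejected rather than reformatted.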
Reading a File into EXCEL
' store the address of the current active sheet, i.e., the 'target'
Dim myActiveS As Excel.Worksheet = Application.ActiveSheet
' select a file
Dim myFile As String = Application.GetOpenFilename()
' get the data in a new temporary workbook
Application.Workbooks.OpenText(myFile, , , Excel.XlTextParsingType.xlDelimited, , , , , True)
' store the address of the temporary workbook
Dim myActiveWB As Excel.Workbook = Application.ActiveWorkbook
' copy the content from the temporary to the 'target' sheet
myActiveS.Range("A1:J1000").Value = Application.ActiveSheet.Range("A1:J1000").Value
' close the temp workbook
myActiveWB.Close()
Finding the last non-empty row
Dim lastRow As Integer
lastRow = Cells(Rows.Count, 1).End(Excel.XlDirection.xlUp).Row
Suggestions
Video available
Give yourself plenty of time
Ask questions in class if you do not understand what is going on