stastic data flows
TRANSCRIPT
-
8/13/2019 Stastic Data Flows
Data flows from sources like:
Village Level
Mandal Level
District Level
State Level
Region Level
Country Level
Global Level
A database is a collection of records (or data files) combined and treated as a unit for information retrieval.
Data flows like flood water!
A database is like a check dam!
-
Statistical Databases!
We have statistical databases on various aspects like:
Food grains
Blood Banks
Tax Payers
Agricultural Output
Share Market
and many more.
The DATA should be converted into information (reports) by applying Data Analysis Tools
-
What is Data Analysis?
Making Figures Speak (the truth!)
Examining data for its relevance
Preparation of tables
Graphic display of information
Estimating the unknown
Example: Agricultural output by crop-cutting experiments
Establishing functional relationships between causes and effects
Computing growth rates
Understanding trends and making forecasts, and many more!
Preparing a document stating the methodology and interpreting the results
-
How to do?
The Common and Old Method
Physical counting of cases from data sheets
Hand calculations
Reference to statistical books for formulae
Bypassing complex calculations and reporting only the easy-to-do things!
The Contemporary Method
Get data into the computer
Use a statistical software package
Prepare the document using a word processor
-
A survey on health insurance
A new health insurance scheme is introduced by a company for its employees
The management wishes to know the reaction of its employees to the new scheme
Opinions were collected from 50 employees on several aspects like Age, Gender, Marital Status, Education level, Present arrangements for health check-up, Monthly income and Concept Rating.
-
Collection of data with suitable coding
A questionnaire has been designed and used for collecting data
Opinions were sought on a five-point scale (multiple choice, tick one only)
Coding of responses is as follows:
Extremely interested 5
Interested 4
Indifferent 3
Not interested 2
Not at all interested 1
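The five-point coding can be written down as a lookup table before any data is keyed in. What follows is a minimal sketch in Python, assuming the response labels exactly as listed on the slide; the names RESPONSE_CODES and code_response are illustrative and not part of the original survey.

```python
# Five-point coding of responses, as listed on the slide.
RESPONSE_CODES = {
    "Extremely interested": 5,
    "Interested": 4,
    "Indifferent": 3,
    "Not interested": 2,
    "Not at all interested": 1,
}


def code_response(label):
    """Return the numeric code for a ticked response (hypothetical helper)."""
    key = label.strip()
    if key not in RESPONSE_CODES:
        raise ValueError(f"Unknown response: {label!r}")
    return RESPONSE_CODES[key]


print(code_response("Interested"))  # prints 4
```

Rejecting unknown labels at entry time is one way to keep uncoded answers out of the data file.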
-
Coding for personal factors
Age (initially no coding): actual years
Gender
Male M
Female F
Marital Status
Married M
Single S
Monthly income
Less than Rs.1000 1
Rs.1000 to Rs.2999 2
Rs.3000 to Rs.4999 3
Rs.5000 & above 4
-
Coding for personal factors
Education
Below Higher Secondary 1
Higher Secondary 2
Graduation 3
Post-graduation 4
Present Arrangement
Private doctor (own expenses) 1
Government/Corporate Hospitals 2
Partial reimbursement 3
Full reimbursement 4
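The codes for personal factors can be applied to one filled-in sheet as sketched below. This is only an illustration in Python: the function names, the record layout and the sample respondent are assumptions made for the example, while the code values follow the slides.

```python
# Category codes as given on the slides.
EDUCATION_CODES = {
    "Below Higher Secondary": 1,
    "Higher Secondary": 2,
    "Graduation": 3,
    "Post-graduation": 4,
}
ARRANGEMENT_CODES = {
    "Private doctor-own expenses": 1,
    "Government/Corporate Hospitals": 2,
    "Partial reimbursement": 3,
    "Full reimbursement": 4,
}


def income_code(rupees):
    """Map a monthly income in rupees to the income classes 1 to 4."""
    if rupees < 1000:
        return 1
    if rupees < 3000:
        return 2
    if rupees < 5000:
        return 3
    return 4


def encode_respondent(sno, age, gender, marital, income, education, arrangement):
    """Build one coded record (hypothetical layout: one dict per respondent)."""
    return {
        "Sno": sno,
        "Age": age,              # actual years, no coding
        "Gender": gender,        # "M" or "F"
        "Marital": marital,      # "M" (married) or "S" (single)
        "Income": income_code(income),
        "Education": EDUCATION_CODES[education],
        "Arrangement": ARRANGEMENT_CODES[arrangement],
    }


# A made-up respondent, for illustration only.
record = encode_respondent(1, 34, "M", "S", 3500, "Graduation", "Partial reimbursement")
print(record["Income"], record["Education"], record["Arrangement"])  # prints 3 3 3
```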
-
Analyze the Data!
Analysis is based on the questions for which the data is expected to provide answers
Some questions
Identify how many are interested in the new scheme and how many are either indifferent or not interested
Cross-tabulate them by Gender, Age, Education, Marital Status, etc.
Is there any relationship between the income level and the type of response?
Identify the factors influencing the adoption of the new scheme
What else does the data say?
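Two of these questions can be sketched in a few lines of Python once the responses are coded. The records below are made-up values for illustration, not the survey's data, and treating ratings 4 and 5 as "interested" is an assumption based on the coding scale.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical (income class, concept rating) pairs for eight employees.
# These are made-up values for illustration, NOT the survey's data.
records = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5), (4, 4)]


def interest_group(rating):
    """Ratings 4 and 5 count as interested; 1 to 3 as indifferent or not."""
    return "interested" if rating >= 4 else "indifferent or not interested"


# Question 1: how many are interested, how many are not?
groups = Counter(interest_group(rating) for _, rating in records)

# Question 2: does the average rating change with the income class?
ratings_by_income = defaultdict(list)
for income_class, rating in records:
    ratings_by_income[income_class].append(rating)
mean_rating = {k: mean(v) for k, v in sorted(ratings_by_income.items())}

print(dict(groups))
print(mean_rating)
```

Comparing mean ratings across income classes is only a first look; a formal test of association would be needed before drawing conclusions.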
-
Data Entry: The First Step
Analysis with Software: The Second Step
-
The physical structure of data
The data collected from the field consists of filled-in questionnaires or sheets
Each sheet must have a serial number
The sheets should be converted into a data file for use in the computer
We can divide the work, make more than one file, and assign the work to Data Entry Operators
The Data Entry Design should be well planned and be common for all operators
These data files can be pooled, if necessary, to make a project data file
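Pooling works only if every operator used the same design and no sheet was keyed in twice. A minimal sketch of such a check in Python follows; the function name pool_files, the (header, rows) layout and the sample rows are assumptions made for the example.

```python
def pool_files(files):
    """Combine several operators' files into one project data file.

    Each file is a (header, rows) pair; each row is a list whose first
    item is the serial number of the sheet (hypothetical layout).
    """
    if not files:
        return [], []
    header = files[0][0]
    pooled, seen = [], set()
    for file_header, rows in files:
        if file_header != header:
            raise ValueError("Data entry design differs between operators")
        for row in rows:
            if row[0] in seen:
                raise ValueError(f"Duplicate serial number: {row[0]}")
            seen.add(row[0])
            pooled.append(row)
    pooled.sort(key=lambda row: row[0])  # restore serial-number order
    return header, pooled


# Made-up files from two operators, for illustration only.
operator_a = (["Sno", "Age", "Gender"], [[1, 34, "M"], [3, 45, "M"]])
operator_b = (["Sno", "Age", "Gender"], [[2, 28, "F"], [4, 51, "F"]])

header, project = pool_files([operator_a, operator_b])
print(len(project))  # prints 4
```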
-
TAKING DATA FROM BOOK TO COMPUTER
Data should be arranged as separate records, one for each individual (entity)
The data should be numeric for carrying out any analysis
Names and other labels will not go in for analysis but can be used for reporting
Suitable coding should be defined before entering data in the computer
-
Software for data entry and data analysis
There are many packages for data entry, like:
FoxPro
Lotus
MS-Excel
MS-Access
Oracle
On-line formats
Packages for Statistical Analysis
SPSS
SAS
MINITAB
SYSTAT
-
MAKING A DATA FILE
Open Excel
On the title bar of the Excel window, the file name appears as Microsoft Excel Book1
It usually contains three sheets named Sheet1, Sheet2 and Sheet3
In Sheet1, start entering the data from cell A1
Reserve the first row for column headings like Sno, Age, Gender, etc.
Key in the data row-wise or column-wise (press the ENTER key after each entry)
Save the file with a suitable name in a folder meant for this project
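The same layout, headings in the first row and one record per row after it, carries over to any plain data file. As a rough sketch, the Python below writes and reads back such a file in CSV form; the three records are made-up values, and an in-memory buffer stands in for a real file so that the example is self-contained.

```python
import csv
import io

HEADINGS = ["Sno", "Age", "Gender"]
# Made-up records for illustration only.
rows = [[1, 34, "M"], [2, 28, "F"], [3, 45, "M"]]

# Write: first row holds the column headings, then one record per row.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(HEADINGS)
writer.writerows(rows)
text = buffer.getvalue()

# Read back: the heading row becomes the field names of each record.
records = list(csv.DictReader(io.StringIO(text)))
print(records[0]["Age"])  # prints 34
```

Values read back from CSV are text, so numeric columns need converting before analysis.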
-
A SAMPLE DATA SHEET
File Name: Food
Folder: D:\Statman
-
NOT THE CORRECT STYLE OF DATA ENTRY
-
THE RIGHT WAY!
-
DATA SHEET OF HEALTH INSURANCE
-
ANALYTICAL FEATURES IN EXCEL
Finding sums
Data sorting and Filtering
Making one dimension tables
Cross tabulations
Creating different types of graphs
Making abstracts from worksheets
Changing the styles of presenting data
Linking Excel report to a document
-
SOME TIPS IN DATA HANDLING
Selecting a part of data
Sorting
Filtering
Column width
Cut, Copy & Paste
Auto Fill
Paste Special
Freeze Panes
Exporting Excel data to Word
-
DATA ANALYSIS PAK
A free package of simple statistical tools is available in Excel
It is called the Data Analysis Pak (the Analysis ToolPak add-in)
It provides for analyses like
Summary statistics
Comparison of groups
Correlations
Regression analysis
Statistical tests of hypothesis
..and many more
-
Data Prepared In Word Table
SNO  NAME           GENDER  CASTE  ENGLISH  MATHS  SCIENCE
1    RAJA. M        B       SC     60       27     45
2    ANITHA. R      G       SC     55       44     36
3    NEELIMA. K     G       ST     46       54     65
4    SIVARAJAN. A   B       OC     35       47     28
5    MUTHU. B       G       OC     20       46     35
6    GOPAL. R       B       OC     54       50     45
7    BEENA. A       G       BC     63       46     64
8    ACHUTAN. S     B       BC     54       52     65
9    PRADEEP. M     B       BC     35       40     54
10   PERUMAL. S     B       OC     25       36     45
11   VARADAN. D     B       OC     28       40     38
12   DIVYA. T       G       BC     64       56     37
13   VASUMATHI. D   G       BC     37       45     54
14   ANDAL. B       B       SC     63       44     36
15   JAYA. L        G       ST     56       52     63
16   RAMAN. N       B       BC     45       48     54
17   MUREGESH. M    B       ST     50       46     68
18   GANESH. L      B       ST     35       38     65
19   SASIKALA. R    G       BC     52       50     54
20   VALLI. M       G       SC     41       55     58
It is enough to copy the Word table and paste it in Excel!
-
We have got it in Excel!
-
Can we make a table of counts (frequencies) from this data?
Soft Skill
WHY NOT?
USE THE PIVOT TABLES OPTION
-
Make Frequency Tables!
Skill
You can make one-way and two-way frequency tables from an Excel sheet
Use the Data menu and select the Pivot Table and Chart submenu
Follow the Wizard steps
You will get the required tables
-
Frequency distribution of students by caste (one-way table)
Count of SNO
CASTE        Total
BC           7
OC           5
SC           4
ST           4
Grand Total  20
-
Frequency distribution of students by Caste and Gender (two-way table)
Count of SNO  GENDER
CASTE         B    G    Grand Total
BC            3    4    7
OC            4    1    5
SC            2    2    4
ST            2    2    4
Grand Total   11   9    20
Can we do this with hand calculations if there are thousands of cases?
Not impossible, but difficult to do!
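The same one-way and two-way counts can be reproduced outside Excel. The sketch below uses Python's collections.Counter in place of the Pivot Table wizard (a swapped-in technique, not the method shown in the deck); the GENDER and CASTE values are copied from the 20-student Word table earlier in the deck.

```python
from collections import Counter

# (GENDER, CASTE) for the 20 students, in SNO order.
students = [
    ("B", "SC"), ("G", "SC"), ("G", "ST"), ("B", "OC"), ("G", "OC"),
    ("B", "OC"), ("G", "BC"), ("B", "BC"), ("B", "BC"), ("B", "OC"),
    ("B", "OC"), ("G", "BC"), ("G", "BC"), ("B", "SC"), ("G", "ST"),
    ("B", "BC"), ("B", "ST"), ("B", "ST"), ("G", "BC"), ("G", "SC"),
]

# One-way table: count of students by caste.
one_way = Counter(caste for _, caste in students)

# Two-way table: count of students by (gender, caste).
two_way = Counter(students)

print("CASTE  B  G  Total")
for caste in sorted(one_way):
    boys, girls = two_way[("B", caste)], two_way[("G", caste)]
    print(f"{caste:<6} {boys}  {girls}  {one_way[caste]}")
```

Counting this way scales to thousands of cases with no change to the code.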
-
Can we make a frequency table with given class intervals?
Soft Skill
CERTAINLY!
USE STATISTICAL FUNCTIONS
-
Built-in Functions in Excel
ENGINEERING FUNCTIONS
-
Built-in Functions in Excel
STATISTICAL FUNCTIONS
-
ACQUIRE SKILL BY DOING
DEMO FOLLOWS..
-
Making a Frequency Table
Body length (cm) of 120 fish
16.7 12.6 15.1 13.4 16.7 17.7 14.6 18.0 15.8 14.8
16.9 13.7 16.0 14.4 15.3 16.4 12.8 11.5 13.4 16.0
14.3 18.3 18.3 16.6 13.2 17.5 16.9 15.2 14.0 17.7
13.8 13.2 13.7 18.4 17.1 13.9 20.5 13.2 14.9 17.4
16.9 15.0 17.2 14.5 13.6 16.6 13.0 17.9 18.8 17.9
15.3 18.9 14.8 16.0 18.5 13.3 19.2 16.2 14.4 17.8
15.6 18.0 15.8 15.7 20.6 13.5 16.3 15.1 14.3 10.7
15.6 15.4 12.6 15.4 17.2 15.1 14.1 13.1 15.4 13.5
12.7 14.1 12.2 16.6 17.0 15.6 14.7 18.7 18.3 13.2
19.5 14.3 16.2 15.9 16.8 15.3 17.3 13.1 12.3 17.0
16.9 12.4 15.4 17.6 16.2 14.4 18.8 13.5 14.2 14.8
12.9 13.5 15.1 14.2 15.3 14.8 15.2 14.4 16.1 18.2
Prepare a frequency table using Excel
-
We use the Paste Function FREQUENCY
min       10.7
max       20.6
range     9.9
interval  2
lower limit  upper limit  upper bound (BIN)  freq
10           12.0         11.9               2
12           14.0         13.9               26
14           16.0         15.9               43
16           18.0         17.9               31
18           20.0         19.9               16
20           22.0         21.9               2
Class    freq
10 - 12  2
12 - 14  26
14 - 16  43
16 - 18  31
18 - 20  16
20 - 22  2
Total    120
Learn more by doing it yourself
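The binning rule behind FREQUENCY can be tried on a small sample. The Python sketch below uses only the first ten body lengths from the fish data, with the same upper bounds as the slide; a value is counted in the first bin whose upper bound is greater than or equal to it, and an extra last slot collects anything above the top bound. The variable names are illustrative.

```python
from bisect import bisect_left

# First ten body lengths (cm) from the fish data; the full set has 120 values.
lengths = [16.7, 12.6, 15.1, 13.4, 16.7, 17.7, 14.6, 18.0, 15.8, 14.8]

# Upper bounds (bins) as on the slide.
upper_bounds = [11.9, 13.9, 15.9, 17.9, 19.9, 21.9]
labels = ["10 - 12", "12 - 14", "14 - 16", "16 - 18", "18 - 20", "20 - 22"]

# One slot per bin, plus one overflow slot for values above the last bound.
freq = [0] * (len(upper_bounds) + 1)
for x in lengths:
    # Index of the first upper bound that is >= x.
    freq[bisect_left(upper_bounds, x)] += 1

for label, count in zip(labels, freq):
    print(label, count)
```

For these ten values the counts are 0, 2, 4, 3, 1 and 0, which can be checked by hand against the first row of the data.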
-
You can also construct a Bar Chart
Class    freq
10 - 12  2
12 - 14  26
14 - 16  43
16 - 18  31
18 - 20  16
20 - 22  2
TOTAL    120
-
ADVANCED FEATURES
-
Data Analysis Pak
-
The t-test
Body Mass Index of Tribal Groups
Is the average BMI the same for the two groups?
Sugali  Yanadi
20.43   17.7
22.51   21.4
18.99   20.7
20.49   19.3
23.12   21
25.63   17.9
18.08   18.6
20.63   18.5
22.55   18.2
22.43   20.3
22.77
23.23
-
t-test output
                              Sugali    Yanadi
Mean                          21.73833  19.36
Variance                      4.319215  1.898222
Observations                  12        10
Pooled Variance               3.229768
Hypothesized Mean Difference  0
df                            20
t Stat                        3.090767
P(T
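The figures in this output can be recomputed from the BMI values of the two groups. The sketch below does so in plain Python, assuming the output comes from a two-sample t-test with equal variances (which the "Pooled Variance" row suggests); it stops at the t statistic and does not compute the p-value, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Body Mass Index values for the two tribal groups, from the slide.
sugali = [20.43, 22.51, 18.99, 20.49, 23.12, 25.63,
          18.08, 20.63, 22.55, 22.43, 22.77, 23.23]
yanadi = [17.7, 21.4, 20.7, 19.3, 21.0, 17.9, 18.6, 18.5, 18.2, 20.3]

n1, n2 = len(sugali), len(yanadi)
m1, m2 = mean(sugali), mean(yanadi)
v1, v2 = variance(sugali), variance(yanadi)  # sample variances (n - 1 divisor)

# Pooled variance: the two sample variances weighted by their degrees of freedom.
df = n1 + n2 - 2
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df

# t statistic for a hypothesized mean difference of 0.
t_stat = (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))

print(f"means: {m1:.5f} {m2:.2f}")
print(f"pooled variance: {pooled:.6f}, df: {df}, t: {t_stat:.6f}")
```

The means, variances, pooled variance, df and t Stat obtained this way agree with the output above.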
-
p-p Plot
-
WIDE RANGE OF APPLICATIONS
Control charts
Forecasting
Curve fitting
Solver for optimization
College Admissions
Evaluation of test scores & ranking
and many more!
-
The best way of learning Excel is to work with Excel
-
Statistics Made Simple -
Do it yourself on PC
By
K.V.S.Sarma
Prentice Hall India
-
Thank you