20140211 02 Descriptive Statistics
TRANSCRIPT
-
8/11/2019 20140211 02 Descriptive Statisctics
1/29
Statistika Rekayasa (Engineering Statistics): Tabular & Graphical
Types of data
Categorical data (qualitative data)
Categorical data
Use labels or names to identify an attribute of each element.
Use either the nominal scale or the ordinal scale of measurement, and may be nonnumeric or numeric.
The statistical analysis for qualitative data is rather limited.
A categorical variable is a variable with categorical data.
Example:
Car type (sedan, sports car, SUV, minivan, MPV, and so on)
Method of payment (cash, credit card, check)
Data set for 25 mutual funds
Quantitative data
Quantitative data
Quantitative data are always numeric.
Use either the interval or ratio scale of measurement.
Ordinary arithmetic operations are meaningful only with quantitative data.
A quantitative variable is a variable with quantitative data.
Discrete if they are countable data and are collected by counting. Ex: the number of items
Continuous if they are collected by measuring and are expressed on a continuous scale. Ex: time to failure of a pump component
Question: What's the type of these data?
Types of ships
Categorical
ITS students collecting data on the number of ships entering the inner channel of Surabaya West Access Channel on a particular day.
Quantitative discrete
Time until a fuel oil simplex filter gets clogged up.
Quantitative continuous
Tabular and graphical methods for summarizing data
Summarizing categorical (qualitative) data
Frequency Distribution
Relative Frequency
Percent Frequency Distribution
Bar Graph
Pie Chart
Frequency distribution
A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several nonoverlapping classes.
The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.
Example: Marada Inn
Example: Marada Inn
Relative frequency distribution
Percent frequency distribution
The percent frequency of a class is the relative frequency multiplied by 100.
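The three distributions described so far can be sketched in a few lines of Python. This is only an illustration: the `ratings` list is hypothetical sample data (not the Marada Inn survey), and `frequency_table` is a helper name introduced here.

```python
from collections import Counter

# Hypothetical quality ratings (illustrative only, not the Marada Inn data)
ratings = ["Good", "Excellent", "Good", "Poor", "Average",
           "Good", "Excellent", "Average", "Good", "Poor"]

def frequency_table(values):
    """Return {class: (frequency, relative frequency, percent frequency)}."""
    counts = Counter(values)          # frequency of each class
    n = len(values)                   # total number of items
    return {label: (count, count / n, 100 * count / n)
            for label, count in counts.items()}

table = frequency_table(ratings)
for label, (freq, rel, pct) in table.items():
    print(f"{label:<10} {freq:>3} {rel:>6.2f} {pct:>6.1f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.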
Bar graph
A bar graph is a graphical device for depicting qualitative data that have been summarized in a frequency, relative frequency, or percent frequency distribution.
On the horizontal axis we specify the labels that are used for each of the classes.
A frequency, relative frequency, or percent frequency scale can be used for the vertical axis.
Using a bar of fixed width drawn above each class label, we extend the height appropriately.
The bars are separated to emphasize the fact that each class is a separate category.
Bar graph: Marada Inn
Pie chart
The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
Since there are 360 degrees in a circle, a class with a relative frequency of r is drawn as a sector of 360 × r degrees.
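As a small worked illustration of the sector rule, the sketch below converts relative frequencies into sector angles. The relative frequencies are hypothetical values chosen to sum to 1, and `pie_sector_degrees` is a name introduced here for illustration.

```python
def pie_sector_degrees(relative_frequencies):
    """Map each class to its sector angle: 360 degrees times relative frequency."""
    return {label: 360 * rel for label, rel in relative_frequencies.items()}

# Hypothetical relative frequencies (they must sum to 1)
relative = {"Poor": 0.10, "Average": 0.25, "Good": 0.45, "Excellent": 0.20}

sectors = pie_sector_degrees(relative)
for label, degrees in sectors.items():
    print(f"{label:<10} {degrees:6.1f} degrees")
```

A class holding one quarter of the data (relative frequency 0.25) takes one quarter of the circle, and the sector angles together add up to the full 360 degrees.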
Example: Marada Inn
Insights Gained from the Preceding Pie Chart
One-half of the customers surveyed gave Marada a quality rating of "above average" or "excellent" (looking at the left side of the pie). This might please the manager.
For each customer who gave an "excellent" rating, there were two customers who gave a "poor" rating (looking at the top of the pie). This should displease the manager.
Example: Five popular soft drinks
Summarizing quantitative data
Frequency Distribution
Relative Frequency and Percent Frequency Distributions
Dot Plot
Histogram
Cumulative Distributions
Ogive
Example: Hudson Auto Repair
The manager of Hudson Auto would like to get a better picture of the distribution of costs for engine tune-up parts. A sample of 50 customer invoices has been taken and the costs of parts, rounded to the nearest dollar, are listed below.
Frequency distribution
Guidelines for Selecting Number of Classes
Use between 5 and 20 classes.
An approximate formula to calculate the number of classes may also be introduced as:
range = largest data value - smallest data value
number of classes = k = 1 + 3.3 log N, where N is the number of samples (log is base 10)
class interval = range / number of classes
Data sets with a larger number of elements usually require a larger number of classes.
Smaller data sets usually require fewer classes.
Frequency distribution
Guidelines for selecting width of classes
Use classes of equal width.
Approximate class width = (largest data value - smallest data value) / number of classes
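The guidelines above can be tried out with a short sketch. The `costs` list is hypothetical data (not the Hudson Auto sample), `suggest_classes` is a helper name introduced here, and rounding k to the nearest whole number is an assumption, since the slide does not say how to round.

```python
import math

def suggest_classes(data):
    """Apply k = 1 + 3.3 log10(N) and class width = range / k."""
    n = len(data)
    data_range = max(data) - min(data)        # largest minus smallest value
    k = round(1 + 3.3 * math.log10(n))        # assumption: round to nearest integer
    width = data_range / k                    # approximate class width
    return k, data_range, width

# Hypothetical parts costs in dollars (illustrative only)
costs = [52, 57, 62, 66, 68, 69, 71, 72, 74, 75,
         75, 77, 79, 80, 82, 85, 88, 91, 97, 104]

k, data_range, width = suggest_classes(costs)
print(f"classes: {k}, range: {data_range}, approximate width: {width:.1f}")
```

For these 20 values the formula suggests 5 classes with an approximate width of 10.4. In practice the width would usually be adjusted to a convenient value such as 10, which may change the number of classes needed to cover the data.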
Example: Hudson Auto Repair
Frequency distribution
Example: Hudson Auto Repair
Insights gained from the percent frequency distribution:
Only 4% of the parts costs are in the $...
Example: Hudson Auto Repair
Histogram
Another common graphical presentation of quantitative data is a histogram.
The variable of interest is placed on the horizontal axis and the frequency, relative frequency, or percent frequency is placed on the vertical axis.
A rectangle is drawn above each class interval with its height corresponding to the interval's frequency, relative frequency, or percent frequency.
Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.
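A rough sketch of the counting step behind a histogram is shown below. The data are hypothetical integer costs, the class boundaries (starting at 50 with width 10) are assumptions, and the text rendering is turned on its side, so classes run down the page instead of along the horizontal axis.

```python
def class_frequencies(data, lower, width, num_classes):
    """Count items in classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * num_classes
    for x in data:
        index = int((x - lower) // width)     # which class the value falls in
        if 0 <= index < num_classes:
            counts[index] += 1
    return counts

def text_histogram(counts, lower, width):
    """Draw one row of '#' marks per class (integer data assumed)."""
    lines = []
    for i, count in enumerate(counts):
        lo = lower + i * width
        hi = lo + width - 1                   # upper class limit for integer data
        lines.append(f"{lo:>3}-{hi:<3} | {'#' * count}")
    return "\n".join(lines)

# Hypothetical parts costs in dollars (illustrative only)
costs = [52, 57, 62, 66, 68, 69, 71, 72, 74, 75,
         75, 77, 79, 80, 82, 85, 88, 91, 97, 104]

counts = class_frequencies(costs, lower=50, width=10, num_classes=6)
print(text_histogram(counts, lower=50, width=10))
```

Because the classes are adjacent intervals of one numeric variable, the rows follow one another with no gap, which mirrors the lack of separation between histogram rectangles.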
Example: Hudson Auto Repair
Cumulative distribution
The cumulative frequency distribution shows the number of items with values less than or equal to the upper limit of each class.
The cumulative relative frequency distribution shows the proportion of items with values less than or equal to the upper limit of each class.
The cumulative percent frequency distribution shows the percentage of items with values less than or equal to the upper limit of each class.
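The three cumulative distributions are running totals of the class frequencies, as the following sketch shows. The class limits and frequencies are hypothetical values used only for illustration.

```python
from itertools import accumulate

upper_limits = [59, 69, 79, 89, 99, 109]   # hypothetical upper class limits
frequencies = [2, 4, 7, 4, 2, 1]           # hypothetical class frequencies

n = sum(frequencies)                                      # total number of items
cumulative = list(accumulate(frequencies))                # running totals
cumulative_relative = [c / n for c in cumulative]         # proportions
cumulative_percent = [100 * c / n for c in cumulative]    # percentages

for limit, c, rel, pct in zip(upper_limits, cumulative,
                              cumulative_relative, cumulative_percent):
    print(f"<= {limit:>3}  {c:>3}  {rel:>5.2f}  {pct:>6.1f}%")
```

The last class always accumulates to the full sample: the total count, a proportion of 1, and 100 percent.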
Example: Hudson Auto Repair
Ogive
An ogive is a graph of a cumulative distribution.
The data values are shown on the horizontal axis.
Shown on the vertical axis are the:
cumulative frequencies, or
cumulative relative frequencies, or
cumulative percent frequencies
The frequency (one of the above) of each class is plotted as a point.
The plotted points are connected by straight lines.
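Since the plotted points are joined by straight lines, reading a value off an ogive amounts to linear interpolation between neighbouring points. The sketch below assumes hypothetical points of the form (upper class limit, cumulative frequency); the starting point (49, 0) is an added assumption so that the curve begins at zero, and `ogive_value` is a helper name introduced here.

```python
def ogive_value(points, x):
    """Read the ogive at x by linear interpolation between plotted points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the plotted range")

# Hypothetical ogive points: (upper class limit, cumulative frequency).
# The first point (49, 0) is an assumption so the curve starts at zero.
points = [(49, 0), (59, 2), (69, 6), (79, 13), (89, 17), (99, 19), (109, 20)]

print(ogive_value(points, 74))   # estimated number of items with value <= 74
```

The interpolated value is only an estimate, because the grouped data no longer record where inside a class each item falls.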
Example: Hudson Auto Repair
Ogive
Because the class limits for the parts-cost data are ...
Exploratory data analysis
The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
One such technique is the stem-and-leaf display.
Stem-and-leaf display
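A stem-and-leaf display can be produced with very little arithmetic, as this sketch suggests. It assumes non-negative integer data and an integer leaf unit such as 1 or 10 (digits below the leaf unit are dropped); the data are hypothetical, and `stem_and_leaf` is a helper name introduced here.

```python
from collections import defaultdict

def stem_and_leaf(data, leaf_unit=1):
    """Return display lines; the last kept digit of each value is its leaf."""
    stems = defaultdict(list)
    for x in sorted(data):
        scaled = x // leaf_unit               # drop digits below the leaf unit
        stems[scaled // 10].append(scaled % 10)
    lines = []
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}".rstrip())
    return lines

# Hypothetical parts costs in dollars (illustrative only)
costs = [52, 57, 62, 66, 68, 69, 71, 72, 74, 75,
         75, 77, 79, 80, 82, 85, 88, 91, 97, 104]

for line in stem_and_leaf(costs):
    print(line)
```

With `leaf_unit=10`, a value such as 1565 is shown as stem 15 and leaf 6. Stems with no data are still printed so that gaps in the distribution remain visible.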
Example: Hudson Auto Repair
Stretched stem-and-leaf display
Example: Hudson Auto Repair
Stem-and-leaf display
Example: Leaf unit = 0.1
Example: Leaf unit = 10
Crosstabulations and scatter diagrams
Thus far we have focused on methods that are used to summarize the data for one variable at a time.
Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.
Crosstabulation
Crosstabulation is a tabular method for summarizing the data for two variables simultaneously.
Crosstabulation can be used when:
One variable is qualitative and the other is quantitative
Both variables are qualitative
Both variables are quantitative
The left and top margin labels define the classes for the two variables.
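The idea can be sketched for the case of one quantitative and one qualitative variable. Everything in this example is hypothetical: the `homes` records, the two price classes, and the helper names `price_class` and `crosstab` are introduced only for illustration.

```python
from collections import Counter

def crosstab(pairs):
    """Count how often each (row class, column class) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})      # left margin labels
    cols = sorted({c for _, c in pairs})      # top margin labels
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table

def price_class(price):
    """Place a quantitative price into one of two hypothetical classes."""
    return "<= 200,000" if price <= 200_000 else "> 200,000"

# Hypothetical (price, style) records, illustrative only
homes = [(185_000, "Ranch"), (240_000, "Colonial"), (199_000, "Colonial"),
         (310_000, "Ranch"), (175_000, "Ranch"), (260_000, "Colonial")]

pairs = [(price_class(price), style) for price, style in homes]
rows, cols, table = crosstab(pairs)

print(f"{'':<12}" + "".join(f"{c:>10}" for c in cols))
for r in rows:
    print(f"{r:<12}" + "".join(f"{table[r][c]:>10}" for c in cols))
```

The quantitative variable has to be grouped into classes first, after which both variables are treated the same way: each cell counts the items that fall in one row class and one column class.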
Example: Finger Lakes Homes
Crosstabulation
The number of Finger Lakes homes sold for each style and price for the past two years is shown below.
Example: Finger Lakes Homes