statistic concepts

Upload: saurav-gupta

Post on 04-Apr-2018


  • 7/29/2019 Statistic Concepts

    1/30

    Page | 1

    Table of Contents

    About the Data
    Frequency Distribution
    Histogram
    Frequency Polygon
    OGIVE (LESS THAN)
    OGIVE (MORE THAN)
    Pareto Charts
    Pie Charts
    Stem and Leaf Charts and Mode
    Scatter Plots
    Geometric Mean
    Arithmetic Mean
    Weighted Arithmetic Mean
    Median
    Quartiles
    Range
    Mean Absolute Deviation
    Variance
    Standard Deviation
    Coefficient of Variation
    Skewness
    Kurtosis
    Percentile
    Decile

    About the Data

    TAX REVENUE OF CENTRE AND THE STATES: 1950-51 to 2009-10 (Rs. Crore)

    Columns, left to right: Year; Total Tax Revenue (A+C): Direct, Indirect, Total; Central Taxes Gross (A): Direct, Indirect, Total; States' Own Taxes (C): Direct, Indirect, Total

    1950-51 231 396 627 176 229 405 55 167 222

    1951-52 244 495 739 190 322 512 54 173 227

    1952-53 252 426 678 186 259 445 66 167 233

    1953-54 242 430 672 166 254 420 76 176 252

    1954-55 240 480 720 161 294 455 79 186 265

    1955-56 259 509 768 171 314 485 88 195 283

    1956-57 288 602 890 194 376 570 94 226 320

    1957-58 327 718 1045 230 462 692 97 256 353

    1958-59 344 745 1089 238 463 701 106 282 388

    1959-60 378 838 1216 269 525 794 109 313 422

    1960-61 402 948 1350 292 603 895 110 345 455

    1961-62 449 1094 1543 337 717 1054 112 377 489

    1962-63 560 1305 1865 423 862 1285 137 443 580

    1963-64 693 1632 2325 550 1084 1634 143 548 691

    1964-65 743 1856 2599 600 1221 1821 143 635 778

    1965-66 734 2188 2922 598 1463 2061 136 725 861

    1966-67 767 2494 3261 657 1650 2307 110 844 954

    1967-68 780 2676 3456 655 1698 2353 125 978 1103

    1968-69 840 2919 3759 698 1812 2510 142 1107 1249

    1969-70 963 3237 4200 826 1996 2822 137 1241 1378

    1970-71 1009 3743 4752 869 2337 3206 140 1406 1546

    1971-72 1171 4404 5575 1047 2826 3873 124 1578 1702

    1972-73 1346 5090 6436 1233 3272 4505 113 1818 1931

    1973-74 1552 5837 7389 1375 3695 5070 177 2142 2319

    1974-75 1834 7389 9223 1650 4672 6322 184 2717 2901

    1975-76 2493 8689 11182 2205 5404 7609 288 3285 3573

    1976-77 2585 9747 12332 2328 5943 8271 257 3804 4061

    1977-78 2680 10557 13237 2405 6453 8858 275 4104 4379

    1978-79 2851 12677 15528 2528 7997 10525 323 4680 5003

    1979-80 3096 14587 17683 2818 9156 11974 278 5431 5709

    1980-81 3268 16576 19844 2997 10182 13179 271 6394 6665

    1981-82 4133 20009 24142 3786 12061 15847 347 7948 8295

    1982-83 4492 22750 27242 4139 13557 17696 353 9193 9546

    1983-84 4907 26618 31525 4498 16223 20721 409 10395 10804

    1984-85 5330 30484 35814 4798 18673 23471 532 11811 12343

    1985-86 6252 37015 43267 5620 23050 28670 632 13965 14597

    1986-87 6889 42650 49539 6236 26602 32838 653 16048 16701

    1987-88 7483 49493 56976 6752 30913 37665 731 18580 19311

    1988-89 9758 57168 66926 8830 35644 44474 928 21524 22452

    1989-90 11165 66528 77693 10003 41633 51636 1162 24895 26057

    1990-91 12260 75462 87722 11030 46547 57577 1230 28915 30145

    1991-92 16657 86541 103198 15353 52008 67361 1304 34533 35837

    1992-93 19387 94779 114166 18140 56496 74636 1247 38283 39530

    1993-94 21713 100248 121961 20299 55443 75742 1414 44805 46219

    1994-95 28878 118971 147849 26973 65324 92297 1905 53647 55552

    1995-96 35777 139482 175259 33564 77660 111224 2213 61822 64035

    1996-97 41061 159995 201056 38898 90864 129762 2163 69131 71294

    1997-98 50538 170121 220659 48282 90938 139220 2256 79183 81439

    1998-99 49119 183898 233017 46601 97196 143797 2518 86702 89220

    1999-2000 60864 213719 274583 57960 113792 171752 2904 99927 102831

    2000-01 71762 233558 305322 68305 120298 188605 3457 113260 116717

    2001-02 73109 241426 314535 69198 117862 187060 3911 123564 127475

    2002-03 87365 268912 356277 83363 132542 215905 4002 136370 140372

    2003-04 109546 304538 414084 105091 149257 254348 4455 155281 159736

    2004-05 137093 357277 494370 132183 172774 304957 4910 184503 189413

    2005-06 167635 420053 587688 162337 203814 366151 5298 216239 221537

    2006-07 231376 505331 736708 225045 248467 473513 6331 256864 263195

    2007-08 318839 551490 870329 312220 280927 593147 6619 270563 277182

    2008-09(R.E.) 346390 601270 947660 338906 289043 627949 7484 312227 319711

    2009-10(B.E.) 372061 624823 996885 363956 277123 641080 8105 347700 355805

    Data collected from: http://www.finmin.nic.in/reports/IPFStat200910.pdf

    Importance of data: The table shows the tax revenue of India (Centre and States) over the last sixty years and its pattern of growth.

    Type of data: Continuous numerical data

    Rs. Crore

    Raw data in arranged array

    627

    672

    678

    720

    739

    768

    890

    1045

    1089

    1216

    1350

    1543

    1865

    2325

    2599

    2922

    3261

    3456

    3759

    4200

    4752

    5575

    6436

    7389

    9223

    Contd..

    Frequency Distribution

    Max No 996885

    Min No 627

    Range 996258

    No of Classes taken 10

    Class interval 100000 (Range / No of Classes = 99625.8, rounded up)

    Rs. Crore

    Raw data in arranged array

    11182

    12332

    13237

    15528

    17683

    19844

    24142

    27242

    31525

    35814

    43267

    49539

    56976

    66926

    77693

    87722

    103198

    114166

    121961

    147849

    175259

    201056

    220659

    233017

    274583

    305322

    314535

    356277

    414084

    494370

    587688

    736708

    870329

    947660

    996885

    Class boundary  Class            Mid-point  Frequency  Relative Freq (%)  Cumulative Frequency (less than)  Cumulative Frequency (more than)
    100000          0-100000         50000      41         68.33              41                                60
    200000          100001-200000    150000     5          8.33               46                                19
    300000          200001-300000    250000     4          6.67               50                                14
    400000          300001-400000    350000     3          5.00               53                                10
    500000          400001-500000    450000     2          3.33               55                                7
    600000          500001-600000    550000     1          1.67               56                                5
    700000          600001-700000    650000     0          0.00               56                                4
    800000          700001-800000    750000     1          1.67               57                                4
    900000          800001-900000    850000     1          1.67               58                                3
    1000000         900001-1000000   950000     2          3.33               60                                2

    Here we can see that the class interval 0 to 100000 has the highest frequency: 41 of the 60 observations, about 68.33% of the data, fall in this class interval.
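
The tabulation above can be reproduced with a short script. The following is a minimal sketch in Python, not the tool the report itself used; the names `REVENUE`, `frequency_table`, and `TABLE` are illustrative, and the data are the sixty total tax revenue values from the table in "About the Data".

```python
# Total tax revenue (Rs. crore), 1950-51 to 2009-10, copied from the data table.
REVENUE = [
    627, 739, 678, 672, 720, 768, 890, 1045, 1089, 1216,
    1350, 1543, 1865, 2325, 2599, 2922, 3261, 3456, 3759, 4200,
    4752, 5575, 6436, 7389, 9223, 11182, 12332, 13237, 15528, 17683,
    19844, 24142, 27242, 31525, 35814, 43267, 49539, 56976, 66926, 77693,
    87722, 103198, 114166, 121961, 147849, 175259, 201056, 220659, 233017, 274583,
    305322, 314535, 356277, 414084, 494370, 587688, 736708, 870329, 947660, 996885,
]

CLASS_WIDTH = 100_000  # class interval chosen in the text
NUM_CLASSES = 10       # number of classes chosen in the text


def frequency_table(values, width, num_classes):
    """Return rows of (lower, upper, midpoint, freq, rel_freq_pct, cum_freq)."""
    counts = [0] * num_classes
    for v in values:
        # Classes are 0-100000, 100001-200000, ...: a value sitting on an
        # upper boundary belongs to the lower class, hence (v - 1) // width.
        counts[max(v - 1, 0) // width] += 1

    rows, running = [], 0
    for i, freq in enumerate(counts):
        running += freq  # "less than" cumulative frequency
        lower = 0 if i == 0 else i * width + 1
        upper = (i + 1) * width
        midpoint = i * width + width // 2
        rows.append((lower, upper, midpoint, freq, 100 * freq / len(values), running))
    return rows


TABLE = frequency_table(REVENUE, CLASS_WIDTH, NUM_CLASSES)
for lower, upper, mid, freq, rel, cum in TABLE:
    print(f"{lower:>7}-{upper:<8} {mid:>7} {freq:>3} {rel:6.2f} {cum:>3}")
```

Changing `CLASS_WIDTH` or `NUM_CLASSES` shows how strongly the shape of the distribution depends on the class interval chosen.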

    Histogram

    Type of Data: Interval

    Concept Name: Histogram

    Selection of variable: Tax revenue collection in Rs. Crore from 1950-51 to 2009-10

    Formula and calculation steps:

    A graph of the data in a frequency distribution is called a histogram. The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis is either frequency, relative frequency, or percentage. Bars of the appropriate heights are used to represent the number of observations within each class.

    Findings and Interpretation of results: The class interval 0 to 100000 has the highest frequency.

    [Figure: Histogram of the frequency distribution. Horizontal axis: Tax Revenue Collection in Rs. Crore, by class (0-100000, 100001-200000, ...); vertical axis: Frequency.]

    Frequency Polygon

    Type of Data: Interval

    Concept Name: Frequency Polygon

    Selection of variable: Data taken from TAX REVENUE OF CENTRE AND THE STATES: 1950-51 to 2009-10 (Rs. Crore)

    Formula and calculation steps: The midpoints of the tops of the rectangles in a histogram are joined together by straight lines. This gives a polygon, i.e. a figure with many angles.

    Findings and Interpretation of results: It shows the class interval in which the largest number of observations fall.

    Mid point Frequency

    0 0

    50000 41

    150000 5

    250000 4

    350000 3

    450000 2

    550000 1

    650000 0

    750000 1

    850000 1

    950000 2

    1050000 0

    [Figure: Frequency Polygon. Horizontal axis: Tax Revenue Collection in Rs. Crore; vertical axis: Frequency.]

    OGIVE(LESS THAN)

    Type of Data: Interval

    Concept Name: Ogive (Less Than)

    Selection of variable: Cumulative Frequency (less than) vs Tax Revenue collection

    Formula and calculation steps: Class boundaries are plotted on the X axis and "less than" cumulative frequencies on the Y axis.

    Findings and Interpretation of results: It shows the number of observations that fall below each class boundary.

    [Figure: Ogive (Less Than). Horizontal axis: Tax Revenue Collection in Rs. Crore; vertical axis: Cumulative Frequency (less than).]

    OGIVE (MORE THAN)

    Type of Data: Interval

    Concept Name: Ogive (More Than)

    Selection of variable: Cumulative Frequency (more than) vs Tax Revenue collection

    Formula and calculation steps: Class boundaries are plotted on the X axis and "more than" cumulative frequencies on the Y axis.

    Findings and Interpretation of results: It shows the number of observations that fall beyond each class boundary.

    [Figure: Ogive (More Than). Horizontal axis: Tax Revenue Collection in Rs. Crore; vertical axis: Cumulative Frequency (more than).]
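
Both ogives are built from the same class frequencies. The sketch below, in Python, shows one way the two cumulative series could be derived; the variable names are illustrative, and the frequencies are those of the frequency distribution in this report.

```python
from itertools import accumulate

# Upper class boundaries (Rs. crore) and class frequencies from the
# frequency distribution of total tax revenue.
UPPER_BOUNDARIES = [100_000 * k for k in range(1, 11)]
FREQUENCIES = [41, 5, 4, 3, 2, 1, 0, 1, 1, 2]

# "Less than" ogive: observations at or below each upper class boundary.
LESS_THAN = list(accumulate(FREQUENCIES))

# "More than" ogive: observations in the class itself and in all higher
# classes, i.e. everything not counted before the class starts.
TOTAL = sum(FREQUENCIES)
MORE_THAN = [TOTAL - cum + freq for cum, freq in zip(LESS_THAN, FREQUENCIES)]

for boundary, lt, mt in zip(UPPER_BOUNDARIES, LESS_THAN, MORE_THAN):
    print(f"{boundary:>8} {lt:>3} {mt:>3}")
```

Plotting `LESS_THAN` against the boundaries gives a rising curve and `MORE_THAN` a falling one; where the two cross gives a graphical estimate of the median.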

    Pareto Charts

    Source: http://www.who.int/ retrieved on 24-07-2011.

    Type of Data: Numerical

    Concept Name: Pareto Charts

    Selection of variable: Top ten causes of death in India according to WHO

    Formula and calculation steps: Calculated the relative frequency (%) of each cause and the cumulative relative frequency, and plotted them.

    Findings and Interpretation of results: The chart shows that about 80% of the deaths from the top ten causes are due to five of them: Coronary Heart Disease, Diarrhoeal diseases, Lung Disease, Stroke, and Influenza & Pneumonia (cumulative 80.82%); the first four alone account for 69.59%. So the initial focus of the Government should be on taking the requisite steps to reduce the deaths due to these diseases.

    Relative Frequency (%) = (No. of Deaths / 6767) * 100

    Cause of Death            No. of Deaths in '000  Relative Frequency (%)  Cumulative Relative Frequency (%)
    Coronary Heart Disease    1416                   20.93                   20.93
    Diarrhoeal diseases       1231                   18.19                   39.12
    Lung Disease              1122                   16.58                   55.70
    Stroke                    940                    13.89                   69.59
    Influenza & Pneumonia     760                    11.23                   80.82
    Tuberculosis              317                    4.68                    85.50
    Low Birth Weight          279                    4.12                    89.63
    Suicide                   243                    3.59                    93.22
    Liver diseases            236                    3.49                    96.70
    Road Traffic Accidents    223                    3.30                    100.00
    Total                     6767                   100.00
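
The two percentage columns follow mechanically from the death counts. As a hedged illustration (Python, with the hypothetical helper names `pareto_table` and `causes_to_reach`), the ranking and the point at which the cumulative share passes a chosen threshold could be computed like this:

```python
# Deaths in thousands for the top ten causes, as listed in the table.
DEATHS = {
    "Coronary Heart Disease": 1416,
    "Diarrhoeal diseases": 1231,
    "Lung Disease": 1122,
    "Stroke": 940,
    "Influenza & Pneumonia": 760,
    "Tuberculosis": 317,
    "Low Birth Weight": 279,
    "Suicide": 243,
    "Liver diseases": 236,
    "Road Traffic Accidents": 223,
}


def pareto_table(counts):
    """Rows of (cause, count, relative %, cumulative %), largest cause first."""
    total = sum(counts.values())
    rows, cumulative = [], 0
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += n
        rows.append((cause, n, 100 * n / total, 100 * cumulative / total))
    return rows


def causes_to_reach(rows, threshold_pct):
    """Number of leading causes needed for the cumulative % to reach the threshold."""
    for position, row in enumerate(rows, start=1):
        if row[3] >= threshold_pct:
            return position
    return len(rows)


ROWS = pareto_table(DEATHS)
for cause, n, rel, cum in ROWS:
    print(f"{cause:<24} {n:>5} {rel:6.2f} {cum:7.2f}")
print("Causes needed to reach 80%:", causes_to_reach(ROWS, 80))
```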

    [Figure: Pareto Chart. Bars: Relative Frequency %; line: Cumulative Relative Frequency.]

    Stem and Leaf Charts and Mode

    Source: http://loksabha.nic.in/ retrieved on 23-07-2011.

    Type of Data: Numerical

    Concept Name: Stem and Leaf Charts and Mode

    Selection of variable: No. of Members of Parliament of Lok Sabha from each State and Union Territory

    Formula and calculation steps: The stem is the digit in the tens place and the leaf is the digit in the units place.

    Findings and Interpretation of results: We can see from the stem and leaf plot that most States and Union Territories (18 of 35) have fewer than 10 Members of Parliament. The mode of the data set is 1, i.e. the most common case is a State or Union Territory with a single representative.

    Sl. No.  Name of State/Union Territory   No. of Members of Parliament for Lok Sabha
    1        Andhra Pradesh                  42
    2        Andaman and Nicobar Islands     1
    3        Arunachal Pradesh               2
    4        Assam                           14
    5        Bihar                           40
    6        Chandigarh                      1
    7        Chhattisgarh                    11
    8        Dadra and Nagar Haveli          1
    9        Daman and Diu                   1
    10       Delhi                           7
    11       Goa                             2
    12       Gujarat                         26
    13       Haryana                         10
    14       Himachal Pradesh                4
    15       Jammu and Kashmir               6
    16       Jharkhand                       14
    17       Karnataka                       28
    18       Kerala                          20
    19       Lakshadweep                     1
    20       Madhya Pradesh                  29
    21       Maharashtra                     48
    22       Manipur                         2
    23       Meghalaya                       2
    24       Mizoram                         1
    25       Nagaland                        1
    26       Orissa                          21
    27       Puducherry                      1
    28       Punjab                          13
    29       Rajasthan                       25
    30       Sikkim                          1
    31       Tamil Nadu                      39
    32       Tripura                         2
    33       Uttar Pradesh                   80
    34       Uttarakhand                     5
    35       West Bengal                     42

    Stem  Leaf
    0     1 1 1 1 1 1 1 1 1 2 2 2 2 2 4 5 6 7
    1     0 1 3 4 4
    2     0 1 5 6 8 9
    3     9
    4     0 2 2 8
    5
    6
    7
    8     0

    Mode  1
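
A stem and leaf display is easy to generate programmatically. The following Python sketch is an illustration only (the names `SEATS`, `stem_and_leaf`, and `PLOT` are assumptions, not part of the report); it splits each seat count into a tens digit and a units digit and also finds the mode.

```python
from collections import Counter, defaultdict

# Lok Sabha seats per State/Union Territory, in the order of the table.
SEATS = [
    42, 1, 2, 14, 40, 1, 11, 1, 1, 7, 2, 26, 10, 4, 6, 14, 28, 20,
    1, 29, 48, 2, 2, 1, 1, 21, 1, 13, 25, 1, 39, 2, 80, 5, 42,
]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    # Include empty stems so that gaps in the data stay visible.
    return {stem: plot.get(stem, []) for stem in range(min(plot), max(plot) + 1)}


PLOT = stem_and_leaf(SEATS)
for stem, leaves in PLOT.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

# The mode is the most frequent value in the data set.
MODE, MODE_COUNT = Counter(SEATS).most_common(1)[0]
print("Mode:", MODE, "occurring", MODE_COUNT, "times")
```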

    Scatter Plots

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of Data: Numerical

    Concept Name: Scatter Plot

    Selection of variable: Tax revenue collected by the Centre and the States, 1950-51 to 2009-10, for both direct and indirect tax

    Formula and calculation steps: For the scatter plot we have plotted direct tax collection against indirect tax collection over the period to see the correlation between them.

    Findings and Interpretation of results: From the plot we can see that there is a strong positive correlation between the two variables: years with higher direct tax revenue also have higher indirect tax revenue.

    Total Tax Revenue(All India) in Rs. Crore

    Year Direct Indirect Total

    1950-51 231 396 627

    1951-52 244 495 739

    1952-53 252 426 678

    1953-54 242 430 672

    1954-55 240 480 720

    1955-56 259 509 768

    1956-57 288 602 890

    1957-58 327 718 1045

    1958-59 344 745 1089

    1959-60 378 838 1216

    1960-61 402 948 1350

    1961-62 449 1094 1543

    1962-63 560 1305 1865

    1963-64 693 1632 2325

    1964-65 743 1856 2599

    1965-66 734 2188 2922

    1966-67 767 2494 3261

    1967-68 780 2676 3456

    1968-69 840 2919 3759

    1969-70 963 3237 4200

    1970-71 1009 3743 4752

    1971-72 1171 4404 5575

    1972-73 1346 5090 6436

    1973-74 1552 5837 7389

    1974-75 1834 7389 9223

    1975-76 2493 8689 11182

    1976-77 2585 9747 12332

    1977-78 2680 10557 13237

    1978-79 2851 12677 15528

    1979-80 3096 14587 17683

    1980-81 3268 16576 19844

    1981-82 4133 20009 24142

    1982-83 4492 22750 27242

    1983-84 4907 26618 31525

    1984-85 5330 30484 35814

    1985-86 6252 37015 43267

    1986-87 6889 42650 49539

    1987-88 7483 49493 56976

    1988-89 9758 57168 66926

    1989-90 11165 66528 77693

    1990-91 12260 75462 87722

    1991-92 16657 86541 103198

    1992-93 19387 94779 114166

    1993-94 21713 100248 121961

    1994-95 28878 118971 147849

    1995-96 35777 139482 175259

    1996-97 41061 159995 201056

    1997-98 50538 170121 220659

    1998-99 49119 183898 233017

    1999-2000 60864 213719 274583

    2000-01 71762 233558 305322

    2001-02 73109 241426 314535

    2002-03 87365 268912 356277

    2003-04 109546 304538 414084

    2004-05 137093 357277 494370

    2005-06 167635 420053 587688

    2006-07 231376 505331 736708

    2007-08 318839 551490 870329

    2008-09(R.E.) 346390 601270 947660

    2009-10(B.E.) 372061 624823 996885

    [Figure: Scatter Plot, Direct Tax vs Indirect Tax Revenue. Horizontal axis: Direct Tax Revenue; vertical axis: Indirect Tax Revenue.]
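
The strength of the relationship seen in a scatter plot can be quantified with Pearson's correlation coefficient. The report itself only reads the correlation off the plot, so the following Python sketch is an added illustration. To keep it short it uses an assumed subset of the table (one year per decade plus the final year) rather than all sixty pairs, and the helper name `pearson_r` is hypothetical.

```python
from math import sqrt

# Illustrative subset of the table (Rs. crore): 1950-51, 1960-61, 1970-71,
# 1980-81, 1990-91, 2000-01 and 2009-10.
DIRECT = [231, 402, 1009, 3268, 12260, 71762, 372061]
INDIRECT = [396, 948, 3743, 16576, 75462, 233558, 624823]


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


R = pearson_r(DIRECT, INDIRECT)
print(f"Pearson r on the subset: {R:.3f}")
```

A value of r close to +1 corresponds to the tight upward-sloping band of points described above. Running the same function on all sixty pairs would give the coefficient for the full plot.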

    Geometric Mean

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of Data: Numerical

    Concept Name: Geometric Mean

    Selection of variable: Total tax revenue collected by the Centre and the States, 1950-51 to 2009-10

    Formula and calculation steps: Calculated the growth of revenue over the previous year and the corresponding growth factor, then multiplied the growth factors together and took the nth root to find the geometric mean.

    Findings and Interpretation of results: The average growth factor comes out to 1.13308, so the average annual rate of increase in tax revenue collection is about 13.31%.

    Columns, left to right: Year; Total Tax Revenue (All India) in Rs. Crore; Growth over the previous year in % (g); Growth Factor (x) = (g/100) + 1

    1950-51 627

    1951-52 739 17.86 1.179

    1952-53 678 -8.25 0.917

    1953-54 672 -0.88 0.991

    1954-55 720 7.14 1.071

    1955-56 768 6.67 1.067

    1956-57 890 15.89 1.159

    1957-58 1045 17.42 1.174

    1958-59 1089 4.21 1.042

    1959-60 1216 11.66 1.117

    1960-61 1350 11.02 1.110

    1961-62 1543 14.30 1.143

    1962-63 1865 20.87 1.209

    1963-64 2325 24.66 1.247

    1964-65 2599 11.78 1.118

    1965-66 2922 12.43 1.124

    1966-67 3261 11.60 1.116

    1967-68 3456 5.98 1.060

    1968-69 3759 8.77 1.088

    1969-70 4200 11.73 1.117

    1970-71 4752 13.14 1.131

    1971-72 5575 17.32 1.173

    1972-73 6436 15.44 1.154

    1973-74 7389 14.81 1.148

    1974-75 9223 24.82 1.248

    1975-76 11182 21.24 1.212

    1976-77 12332 10.28 1.103

    1977-78 13237 7.34 1.073

    1978-79 15528 17.31 1.173

    1979-80 17683 13.88 1.139

    1980-81 19844 12.22 1.122

    1981-82 24142 21.66 1.217

    1982-83 27242 12.84 1.128

    1983-84 31525 15.72 1.157

    1984-85 35814 13.61 1.136

    1985-86 43267 20.81 1.208

    1986-87 49539 14.50 1.145

    1987-88 56976 15.01 1.150

    1988-89 66926 17.46 1.175

    1989-90 77693 16.09 1.161

    1990-91 87722 12.91 1.129

    1991-92 103198 17.64 1.176

    1992-93 114166 10.63 1.106

    1993-94 121961 6.83 1.068

    1994-95 147849 21.23 1.212

    1995-96 175259 18.54 1.185

    1996-97 201056 14.72 1.147

    1997-98 220659 9.75 1.098

    1998-99 233017 5.60 1.056

    1999-2000 274583 17.84 1.178

    2000-01 305322 11.19 1.112

    2001-02 314535 3.02 1.030

    2002-03 356277 13.27 1.133

    2003-04 414084 16.23 1.162

    2004-05 494370 19.39 1.194

    2005-06 587688 18.88 1.189

    2006-07 736708 25.36 1.254

    2007-08 870329 18.14 1.181

    2008-09(R.E.) 947660 8.89 1.089

    2009-10(B.E.) 996885 5.19 1.052

    The Geometric Mean is given by

    $G = \left(\prod_{i=1}^{n} x_i\right)^{1/n} = \sqrt[n]{x_1 x_2 \cdots x_n}$

    where $x_i$ are the values for which the mean is required and n is the number of values.

    In the above case n = 59 and $x_i$ are the growth factor values for the different years.

    Therefore

    Geometric Mean (G) = 1.13308

    The geometric mean is more appropriate than the arithmetic mean for describing proportional growth,

    both exponential growth (constant proportional growth) and varying growth; in business the geometric

    mean of growth rates is known as the compound annual growth rate (CAGR). The geometric mean of

    growth over periods yields the equivalent constant growth rate that would yield the same final amount.
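
The calculation can be sketched in Python as follows. This is an illustration under the assumption that the growth factors are recomputed from the revenue column rather than taken from the rounded table values; the variable names are not from the report. Logarithms are used instead of a direct product, which is a common way to keep the intermediate numbers small.

```python
from math import exp, log

# Total tax revenue (Rs. crore), 1950-51 to 2009-10, in chronological order.
REVENUE = [
    627, 739, 678, 672, 720, 768, 890, 1045, 1089, 1216,
    1350, 1543, 1865, 2325, 2599, 2922, 3261, 3456, 3759, 4200,
    4752, 5575, 6436, 7389, 9223, 11182, 12332, 13237, 15528, 17683,
    19844, 24142, 27242, 31525, 35814, 43267, 49539, 56976, 66926, 77693,
    87722, 103198, 114166, 121961, 147849, 175259, 201056, 220659, 233017, 274583,
    305322, 314535, 356277, 414084, 494370, 587688, 736708, 870329, 947660, 996885,
]

# Growth factor x = (g / 100) + 1, which equals this year's revenue divided
# by the previous year's revenue.
FACTORS = [current / previous for previous, current in zip(REVENUE, REVENUE[1:])]

# Geometric mean: nth root of the product, computed as exp(mean of logs).
G = exp(sum(log(x) for x in FACTORS) / len(FACTORS))

# The product of successive growth factors telescopes to last / first, so the
# same figure follows from the end points alone (the CAGR formula).
CAGR_FACTOR = (REVENUE[-1] / REVENUE[0]) ** (1 / len(FACTORS))

print(f"n = {len(FACTORS)}, geometric mean = {G:.5f}, CAGR factor = {CAGR_FACTOR:.5f}")
print(f"average annual growth = {(G - 1) * 100:.2f}%")
```

The second calculation makes explicit why the geometric mean of growth factors is the compound annual growth rate: only the first and last values matter.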

    Year Rs. Crore

    1993-94 121961

    1994-95 147849

    1995-96 175259

    1996-97 201056

    1997-98 220659

    1998-99 233017

    1999-2000 274583

    2000-01 305322

    2001-02 314535

    2002-03 356277

    2003-04 414084

    2004-05 494370

    2005-06 587688

    2006-07 736708

    2007-08 870329

    2008-09(R.E.) 947660

    2009-10(B.E.) 996885

    Total 8275357

    The Arithmetic Mean is given by

    $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

    where $x_i$ are the values for which the mean is required and n is the number of values.

    Here

    n = 60

    $\sum x_i$ = 8275357

    Therefore

    Arithmetic Mean = $\frac{\sum x_i}{n} = \frac{8275357}{60}$ = Rs. 137922.62 crore
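
As a cross-check, the total and the mean can be recomputed directly from the sixty yearly values. This is a minimal Python sketch with illustrative names; it divides the column total by the number of observations, n = 60.

```python
# Total tax revenue (Rs. crore), 1950-51 to 2009-10.
REVENUE = [
    627, 739, 678, 672, 720, 768, 890, 1045, 1089, 1216,
    1350, 1543, 1865, 2325, 2599, 2922, 3261, 3456, 3759, 4200,
    4752, 5575, 6436, 7389, 9223, 11182, 12332, 13237, 15528, 17683,
    19844, 24142, 27242, 31525, 35814, 43267, 49539, 56976, 66926, 77693,
    87722, 103198, 114166, 121961, 147849, 175259, 201056, 220659, 233017, 274583,
    305322, 314535, 356277, 414084, 494370, 587688, 736708, 870329, 947660, 996885,
]


def arithmetic_mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)


TOTAL = sum(REVENUE)
MEAN = arithmetic_mean(REVENUE)
print(f"n = {len(REVENUE)}, total = {TOTAL}, mean = {MEAN:.4f}")
```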

    Weighted Arithmetic Mean

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of Data: Numerical

    Concept Name: Weighted Arithmetic Mean

    Selection of variable: Total tax revenue collected by the Centre and the States, 1950-51 to 2009-10

    Formula and calculation steps: Calculated the sum of the products of frequency and mid-point for each class and then divided it by the total frequency. Here the frequencies are taken as the weights.

    Findings and Interpretation of results: The weighted average revenue collected from 1950-51 to 2009-10 is Rs. 163333.33 crore. This grouped estimate uses class mid-points in place of the actual values, so it is only an approximation of the mean of the raw data.

    Class (Rs. Crore) Mid-point (x) Frequency (f) f * x

    0-100000 50000 41 2050000

    100001-200000 150000 5 750000

    200001-300000 250000 4 1000000

    300001-400000 350000 3 1050000

    400001-500000 450000 2 900000

    500001-600000 550000 1 550000

    600001-700000 650000 0 0

    700001-800000 750000 1 750000

    800001-900000 850000 1 850000

    900001-1000000 950000 2 1900000

    Total 60 9800000

    Here

    $\sum f$ = n = 60

    $\sum (f \cdot x)$ = 9800000

    Therefore

    Weighted Arithmetic Mean of Grouped data = $\frac{\sum (f \cdot x)}{\sum f} = \frac{9800000}{60}$ = Rs. 163333.33
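
The grouped-data calculation is a weighted average with the class frequencies as weights. A short Python sketch, with illustrative names and the mid-points and frequencies from the table above:

```python
# Class mid-points (Rs. crore) and class frequencies from the table.
MIDPOINTS = [50_000, 150_000, 250_000, 350_000, 450_000,
             550_000, 650_000, 750_000, 850_000, 950_000]
FREQUENCIES = [41, 5, 4, 3, 2, 1, 0, 1, 1, 2]


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


SUM_FX = sum(f * x for x, f in zip(MIDPOINTS, FREQUENCIES))
GROUPED_MEAN = weighted_mean(MIDPOINTS, FREQUENCIES)
print(f"sum(f) = {sum(FREQUENCIES)}, sum(f*x) = {SUM_FX}, mean = {GROUPED_MEAN:.2f}")
```

Because 41 of the 60 observations are represented by the single mid-point 50000, the result is sensitive to the class width chosen for the first class.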

    Median

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of Data: Numerical

    Concept Name: Median

    Selection of variable: Total tax revenue collected by the Centre and the States, 1950-51 to 2009-10

    Formula and calculation steps: Arranged the data in ascending order and found the mean of the 30th and 31st items to find the median.

    Findings and Interpretation of results: The median of the collected data is Rs. 18763.50 crore, which means half of the items lie above this point and the other half lie below it.

    Columns, left to right: Total Revenue Collected (raw data); Total Revenue Collected (raw data in arranged array)

    627 627

    739 672

    678 678

    672 720

    720 739

    768 768

    890 890

    1045 1045

    1089 1089

    1216 1216

    1350 1350

    1543 1543

    1865 1865

    2325 2325

    2599 2599

    2922 2922

    3261 3261

    3456 3456

    3759 3759

    4200 4200

    4752 4752

    5575 5575

    6436 6436

    7389 7389

    9223 9223

    11182 11182

    12332 12332

    13237 13237

    15528 15528

    17683 17683 30th item

    19844 19844 31st item

    24142 24142

    27242 27242

    31525 31525

    35814 35814

    43267 43267

    49539 49539

    56976 56976

    66926 66926

    77693 77693

    87722 87722

    103198 103198

    114166 114166

    121961 121961

    147849 147849

    175259 175259

    201056 201056

    220659 220659

    233017 233017

    274583 274583

    305322 305322

    314535 314535

    356277 356277

    414084 414084

    494370 494370

    587688 587688

    736708 736708

    870329 870329

    947660 947660

    996885 996885

    The Median is given by

    Median = $\left(\frac{n+1}{2}\right)$th item in the arranged data array

    In our case n = 60, therefore the Median is the (60+1)/2 = 30.5th item, i.e. we take the mean of the 30th item and the 31st item when the data is arranged in ascending or descending order.

    In our case the 30th item is 17683 and the 31st item is 19844.

    Therefore

    Median = $\frac{17683 + 19844}{2}$ = 18763.50
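
The same steps (sort, locate the middle, average the two central items when n is even) can be written as a small Python function. The function name `median` and the variable names are illustrative only.

```python
# Total tax revenue (Rs. crore), 1950-51 to 2009-10, in chronological order.
REVENUE = [
    627, 739, 678, 672, 720, 768, 890, 1045, 1089, 1216,
    1350, 1543, 1865, 2325, 2599, 2922, 3261, 3456, 3759, 4200,
    4752, 5575, 6436, 7389, 9223, 11182, 12332, 13237, 15528, 17683,
    19844, 24142, 27242, 31525, 35814, 43267, 49539, 56976, 66926, 77693,
    87722, 103198, 114166, 121961, 147849, 175259, 201056, 220659, 233017, 274583,
    305322, 314535, 356277, 414084, 494370, 587688, 736708, 870329, 947660, 996885,
]


def median(values):
    """Middle item of the sorted data; mean of the two middle items if n is even."""
    data = sorted(values)  # the "arranged array"
    n = len(data)
    middle = n // 2
    if n % 2 == 1:
        return data[middle]
    # For even n the (n+1)/2-th position falls between two items.
    return (data[middle - 1] + data[middle]) / 2


ARRANGED = sorted(REVENUE)
MEDIAN = median(REVENUE)
print(f"30th item = {ARRANGED[29]}, 31st item = {ARRANGED[30]}, median = {MEDIAN}")
```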

    Total Tax Revenue in Rs. Crore

    43267

    49539

    56976

    66926

    77693

    87722

    103198

    114166

    121961

    147849

    175259

    201056

    220659

    233017

    274583

    305322

    314535

    356277

    414084

    494370

    587688

    736708

    870329

    947660

    996885

    The first quartile is calculated as the $\left(\frac{n+1}{4}\right)$th item. Here, with n = 60, it is the 15.25th item. To find it, we calculate

    First Quartile Q1 = 15.25th item = 15th item + (1/4) (16th item - 15th item) = 2599 + (1/4) (2922 - 2599) = 2679.75

    Similarly, the values of the other quartiles can be found:

    Quartile  Value
    Q1        2679.75
    Q2        18763.5
    Q3        168406.5
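
The positional rule used here, the k(n+1)/4-th item with linear interpolation between neighbouring items, can be sketched in Python as below. The helper names `item_at` and `quartiles` are hypothetical. Other software may use a different quartile convention and return slightly different values.

```python
# Total tax revenue (Rs. crore), 1950-51 to 2009-10.
REVENUE = [
    627, 739, 678, 672, 720, 768, 890, 1045, 1089, 1216,
    1350, 1543, 1865, 2325, 2599, 2922, 3261, 3456, 3759, 4200,
    4752, 5575, 6436, 7389, 9223, 11182, 12332, 13237, 15528, 17683,
    19844, 24142, 27242, 31525, 35814, 43267, 49539, 56976, 66926, 77693,
    87722, 103198, 114166, 121961, 147849, 175259, 201056, 220659, 233017, 274583,
    305322, 314535, 356277, 414084, 494370, 587688, 736708, 870329, 947660, 996885,
]


def item_at(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation).

    Assumes 1 <= position <= len(sorted_values); no bounds checks are made.
    """
    whole = int(position)
    fraction = position - whole
    lower = sorted_values[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Q1, Q2, Q3 taken as the k(n+1)/4-th items of the arranged data."""
    data = sorted(values)
    n = len(data)
    return tuple(item_at(data, k * (n + 1) / 4) for k in (1, 2, 3))


Q1, Q2, Q3 = quartiles(REVENUE)
print(f"Q1 = {Q1}, Q2 = {Q2}, Q3 = {Q3}")
```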

    Range

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Range

    Variable selected: Total tax revenues of Centre and States

    Formula and Calculation: The range of a data set is calculated as the difference between the maximum value and the minimum value.

    Findings and Interpretation of results: For the above data set, the range is 996885 - 627 = 996258.

    Interquartile range

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Interquartile range

    Variable Selected: Total tax revenues of Centre and States

    Formula and Calculation: The interquartile range is the difference between the third quartile and the first quartile.

    Interquartile range = Q3 - Q1

    = 168406.5 - 2679.75

    = 165726.75

    Findings and Interpretation: The range value suggests a very high dispersion. But on a closer look at the data, the year-on-year increase in tax collected has itself generally grown over time, and in the last twenty years it has sometimes more than doubled from one year to the next. So the range does not represent the spread of the data set well, whereas the quartiles and the interquartile range are fairer representations of the dispersion. We infer that 50% of the values lie between 2679.75 (Q1) and 168406.5 (Q3), and that 75% lie below 168406.5 (Q3). The fact that only 25% of the data lie between 168406.5 (Q3) and the maximum value shows how poorly the range describes the spread.

    Minimum 627

    Maximum 996885

    Range 996258
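
To make the comparison between the two measures concrete, the short Python sketch below recomputes both from the figures reported in this section and expresses the interquartile range as a share of the full range. The share is an added illustration, not a statistic from the report, and the variable names are assumptions.

```python
# Figures reported in this section (Rs. crore).
MINIMUM = 627
MAXIMUM = 996_885
Q1 = 2_679.75
Q3 = 168_406.5

DATA_RANGE = MAXIMUM - MINIMUM  # maximum minus minimum
IQR = Q3 - Q1                   # spread of the middle 50% of the values
IQR_SHARE = IQR / DATA_RANGE    # how much of the range the middle half spans

print(f"range = {DATA_RANGE}, IQR = {IQR}, IQR / range = {IQR_SHARE:.3f}")
```

The middle half of the observations spans only about a sixth of the range, which is another way of seeing that a few very large recent values dominate the range.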

    Mean Absolute Deviation

    Source:http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of Data:Numerical

    Concept Name:Mean Absolute Deviation

    Variable Selected:Total Tax revenues of Centre and State

    Formula and Calculation:

    For the above data set, the Mean Absolute Deviation is found as Rs 168983.9711

    Findings and Interpretation:

    In the above data set, the tax collected over the years deviates from the average tax collected of Rs 137922.6167 by Rs 168983.9711 on average.
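    The formula itself is not shown in this section. A minimal sketch of the usual definition, the average of the absolute differences between each value and the arithmetic mean, is given here on the assumption that this is what the calculation used. The function name and the small sample are hypothetical and are not the tax data.

```python
from statistics import fmean


def mean_absolute_deviation(values):
    """Average absolute distance of the values from their arithmetic mean."""
    centre = fmean(values)
    return fmean([abs(v - centre) for v in values])


# Hypothetical sample: mean is 5, absolute deviations are 3, 1, 1, 3.
sample = [2, 4, 6, 8]
print(mean_absolute_deviation(sample))
```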

    Variance

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Variance

    Variable Selected: Total Tax revenues of Centre and State

    Formula and Calculation:

    For the above data set, Variance is found to be 58848680358.4099.

    Findings and Interpretation:

    Since variance is a measure of how much the values in the data set are likely to differ from their mean, we can see that in the above data set the data are very widely dispersed from the mean.
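    The section does not state whether the variance divides by N (population) or by N - 1 (sample). The sketch below shows both with Python's standard `statistics` module on a small hypothetical sample, not the tax data.

```python
from statistics import pvariance, variance

# Hypothetical sample: mean is 5, squared deviations sum to 20.
sample = [2.0, 4.0, 6.0, 8.0]

population_var = pvariance(sample)  # sum of squared deviations / N
sample_var = variance(sample)       # sum of squared deviations / (N - 1)

print(f"population variance = {population_var}")
print(f"sample variance     = {sample_var}")
```

    With 60 observations the two versions differ by a factor of 60/59, so the choice changes the reported figure by under 2%.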


    Standard Deviation

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Standard Deviation

    Variable Selected: Total Tax revenues of Centre and State

    Formula and Calculation:

    For the given data set, Standard Deviation is 242587.4695

    Findings and Interpretations:

    The average tax collected is Rs 137922.6167, and the tax collected over all the years has a standard deviation of Rs 242587.4695 about this mean.
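    Since the standard deviation is the square root of the variance, the two reported figures can be checked against each other. A minimal sketch using only values quoted in this document (variable names are illustrative):

```python
import math

reported_variance = 58848680358.4099  # from the Variance section
reported_mean = 137922.6167           # average tax collected (Rs. crore)

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(reported_variance)
print(f"standard deviation = {std_dev:.4f}")

# A standard deviation larger than the mean signals very wide dispersion.
print("larger than the mean:", std_dev > reported_mean)
```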

    Coefficient of Variation

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Coefficient of Variation

    Variable Selected: Central Taxes Gross and States own taxes

    Formula and Calculation:

    Findings and Interpretation:

    From the coefficients of variation of the two data sets (about 172.6% for States' taxes and 178.2% for Centre's taxes), we can infer that the relative dispersion is more or less the same.

    Rs. Crore

    States Taxes Centres Taxes

    222 405

    227 512

    233 445

    252 420

    265 455

    283 485

    320 570

    353 692

    388 701

    422 794


    455 895

    489 1054

    580 1285

    691 1634

    778 1821

    861 2061

    954 2307

    1103 2353

    1249 2510

    1378 2822

    1546 3206

    1702 3873

    1931 4505

    2319 5070

    2901 6322

    3573 7609

    4061 8271

    4379 8858

    5003 10525

    5709 11974

    6665 13179

    8295 15847

    9546 17696

    10804 20721

    12343 23471

    14597 28670

    16701 32838

    19311 37665

    22452 44474

    26057 51636

    30145 57577

    35837 67361

    39530 74636

    46219 75742

    55552 92297

    64035 111224

    71294 129762

    81439 139220

    89220 143797

    102831 171752

    116717 188605

    127475 187060

    140372 215905

    159736 254348

    189413 304957

    221537 366151

    263195 473513

    277182 593147

    319711 627949

    355805 641080

    For the data set on the left, (i.e. States taxes)

    Mean 49644.05

    S.D 85675.55

    CV % 172.5797


    CV = (85675.55 / 49644.05) * 100 = 172.5797

    For the data set on the right, (i.e. Centres taxes)

    Mean 88278.57

    S.D 157277.2

    CV % 178.1602

    CV = (157277.2 / 88278.57) * 100 = 178.1602
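    The two calculations can be expressed as one small function. This is a sketch using the means and standard deviations reported here; the function name is illustrative, and the result is a percentage (standard deviation divided by mean, times 100).

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


# Means and standard deviations as reported for the two data sets.
states_cv = coefficient_of_variation(85675.55, 49644.05)
centre_cv = coefficient_of_variation(157277.2, 88278.57)

print(f"States: {states_cv:.4f}%  Centre: {centre_cv:.4f}%")
```

    Because the coefficient of variation is unit-free, it allows the two series to be compared even though the Centre's collections are roughly twice the States' in absolute terms.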

    Skewness

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Skewness

    Variable Selected: Total Tax revenues of Centre and State

    Formula and Calculation:

    For the above data set, the skewness is 2.30112980468625.

    Findings and Interpretations:

    It is positive, which indicates that more values lie to the left of (i.e. below) the mean and that the distribution has a long tail to the right, so it is skewed to the right (positively skewed). This supports what we inferred from the quartiles and the interquartile range, namely that 75% of the values lie below Q3. Therefore, the tax collected by the government is mostly below the mean value.
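    The formula used for the skewness value is not shown. The sketch below implements one common choice, the adjusted Fisher-Pearson sample skewness; that it matches the calculation in this report is an assumption. The function name and both samples are hypothetical.

```python
from statistics import fmean, median, stdev


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (needs at least 3 values)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = fmean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    cubed = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed


symmetric = [1, 2, 3, 4, 5]      # hypothetical: no tail on either side
right_tailed = [1, 1, 1, 2, 10]  # hypothetical: one large value on the right

print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
print(fmean(right_tailed) > median(right_tailed))
```

    In the right-tailed sample the single large value pulls the mean above the median, which is the same pattern the tax data show: most years fall below the mean.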

    Kurtosis

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Kurtosis

    Variable Selected: Total Tax revenues of Centre and State

    Formula and Calculation:

    For the data set, the kurtosis value is 4.79782333743203.


    Findings and Interpretation:

    Kurtosis is a measure of the peakedness of the distribution curve. Taking into account that the kurtosis of a normal distribution is 3, our data set has a higher value (if the reported figure is an excess kurtosis, as some spreadsheet functions return, the baseline is 0 and the conclusion is the same).

    From this we can conclude that our data, if organized as a distribution, will be more peaked than a normal distribution. High kurtosis values indicate a sharp peak near the mean. This can be seen from the histogram: there is a sharp peak near the mean value of 137922.6167 and then a long tail.
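    The formula behind the kurtosis value is not shown. The sketch below uses the plain moment definition (fourth central moment divided by the squared second central moment), for which a normal distribution gives 3. Spreadsheet functions often report a bias-adjusted excess kurtosis instead, so this sketch is not claimed to reproduce the figure above. Function names and samples are hypothetical.

```python
from statistics import fmean


def moment_kurtosis(values):
    """Fourth central moment divided by the squared second central moment."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    return m4 / m2 ** 2


def excess_kurtosis(values):
    """Moment kurtosis relative to the normal-distribution baseline of 3."""
    return moment_kurtosis(values) - 3.0


flat = [-1, 1, -1, 1]                # hypothetical: no peak, no tails
heavy_tailed = [0] * 18 + [-10, 10]  # hypothetical: sharp peak, two outliers

print(moment_kurtosis(flat))
print(moment_kurtosis(heavy_tailed))
print(excess_kurtosis(heavy_tailed))
```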

    Percentile

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Percentile

    Variable Selected: Total Tax revenues of Centre and State

    Formula and Calculation:

    P = (L / N) * 100

    where L is the number of items less than the given value, N is the total number of items (here 60) and P is the percentile.

    So for a tax revenue of Rs 43267 in the year 1985-86, the percentile is found as:

    L = number of items less than 43267 = 35

    N = 60

    P = (35 / 60) * 100 = 58.33

    Findings and Interpretation:

    Therefore a value of 43267 is at a percentile of 58.33 which means it is greater than approximately 58%

    of the items in the data set.
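    The percentile formula translates directly into code. A minimal sketch follows; the function name is illustrative, and the list is a hypothetical stand-in with 60 items arranged so that 35 of them fall below the chosen value, mirroring L = 35 and N = 60.

```python
def percentile_rank(values, x):
    """Percentage of items in `values` that are strictly less than `x`."""
    if not values:
        raise ValueError("values must not be empty")
    below = sum(1 for v in values if v < x)
    return below / len(values) * 100


# Hypothetical stand-in for the 60 yearly totals: 0, 1, ..., 59.
stand_in = list(range(60))

# Exactly 35 items (0 through 34) are less than 35.
p = percentile_rank(stand_in, 35)
print(f"{p:.2f}")
```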


    Decile

    Source: http://www.finmin.nic.in/reports/IPFStat200910.pdf retrieved on 23-07-2011.

    Type of data: Numerical

    Concept Name: Decile

    Variable Selected: Total Tax revenues of Centre and State

    Formula and Calculation:

    Deciles are the points in a data set which divide the set into 10 equal parts.

    For the above data set, the deciles are:

    Decile Value

    1 768

    2 1543

    3 3456

    4 7389

    5 17683

    6 43267

    7 103198

    8 220659

    9 414084

    Findings and Interpretation:

    The decile values divide the data into 10 equal parts. For example, a value of Rs 672 is less than the first decile (768) and therefore lies in the lowest one-tenth of the data set.
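    The decile table can be reproduced from the States and Centre columns listed in the Coefficient of Variation section. The sketch makes two assumptions that the source does not state: that the total is the row-wise sum of the two columns (this agrees with the reported minimum of 627 and maximum of 996885), and that the k-th decile is the (k * N / 10)-th ordered value. Other conventions interpolate between neighbouring values and give slightly different results.

```python
# Yearly tax revenues in Rs. crore, transcribed from the table in the
# Coefficient of Variation section (60 rows, in the order listed).
states = [
    222, 227, 233, 252, 265, 283, 320, 353, 388, 422,
    455, 489, 580, 691, 778, 861, 954, 1103, 1249, 1378,
    1546, 1702, 1931, 2319, 2901, 3573, 4061, 4379, 5003, 5709,
    6665, 8295, 9546, 10804, 12343, 14597, 16701, 19311, 22452, 26057,
    30145, 35837, 39530, 46219, 55552, 64035, 71294, 81439, 89220, 102831,
    116717, 127475, 140372, 159736, 189413, 221537, 263195, 277182, 319711,
    355805,
]
centre = [
    405, 512, 445, 420, 455, 485, 570, 692, 701, 794,
    895, 1054, 1285, 1634, 1821, 2061, 2307, 2353, 2510, 2822,
    3206, 3873, 4505, 5070, 6322, 7609, 8271, 8858, 10525, 11974,
    13179, 15847, 17696, 20721, 23471, 28670, 32838, 37665, 44474, 51636,
    57577, 67361, 74636, 75742, 92297, 111224, 129762, 139220, 143797, 171752,
    188605, 187060, 215905, 254348, 304957, 366151, 473513, 593147, 627949,
    641080,
]

# Assumption: total tax revenue is the row-wise sum of the two columns.
totals = sorted(s + c for s, c in zip(states, centre))
n = len(totals)

# Assumption: k-th decile = the (k * n / 10)-th ordered value (1-based).
deciles = [totals[k * n // 10 - 1] for k in range(1, 10)]

for k, value in enumerate(deciles, start=1):
    print(k, value)
```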
