Post on 24-Mar-2020


STATISTICAL DESCRIPTION OF DATA

INTRODUCTION

The word ‘Statistics’ has been derived from the Latin word ‘Status’, which means a political state. It also has its roots in either the Italian word ‘Statista’ or the German word ‘Statistik’, each of which means a political state. For several decades, the word ‘statistics’ was associated solely with the display of facts and figures pertaining to the economic, demographic and political situations prevailing in a country, usually collected and brought out by the local governments.

Statistics is a tool in the hands of mankind to translate complex facts into simple and understandable statements of facts.

MEANING AND DEFINITION OF STATISTICS

Meaning of Statistics. The word Statistics is used in two different senses - Plural and Singular. In its plural form, it refers to the numerical data collected in a systematic manner with some definite aim or object in view, such as the number of persons suffering from malaria in different colonies of Delhi or the number of unemployed girls in different states of India, and so on. In singular form, the word statistics means the science of statistics that deals with the principles, devices or statistical methods of collecting, analysing and interpreting numerical data.

Thus, ‘statistics’ when used in singular refers to that branch of knowledge which implies Applied Mathematics.

The science of statistics is an old science and it has developed through ages. This science has been defined in different ways by different authors and even the same author has defined it in different ways on different occasions.


It is impossible to enumerate all the definitions given to statistics both as “Numerical Data, i.e., Plural Form” and “Statistical Methods, i.e., Singular Form”. However, we give below some selected definitions of both the forms.

Definitions of “Statistics in Plural Form or Numerical Data”

Different authors have given different definitions of statistics. Some of the definitions of statistics describing it quantitatively or in plural form are:

“Statistics are the classified facts representing the conditions of the people in a state, especially those facts which can be stated in number or in a table of numbers or in any tabular or classified arrangement.” - Webster.

This definition is narrow as it is confined only to facts about the conditions of the people in a state.

But the following definition given by Secrist is modern and convincing. It also brings out the major characteristics of statistical data.

“By statistics we mean the aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a pre-determined purpose and placed in relation to each other.” - Secrist.

This definition makes it clear that statistics (in plural form or numerical data) should possess the following characteristics.

I. Statistics are aggregate of facts
II. Statistics are affected by a large number of causes
III. Statistics are always numerically expressed
IV. Statistics should be enumerated or estimated
V. Statistics should be collected in a systematic manner
VI. Statistics should be collected for a pre-determined purpose
VII. Statistics should be placed in relation to each other

Statistics as Statistical Methods or Statistics in Singular Sense

We give below the definitions of statistics used in singular sense, i.e., statistics as statistical methods.

Statistical methods provide a set of tools which can be profitably used by different sciences in the manner they deem fit. The term Statistics in this context has been defined differently by different authors. A few definitions are given below:

“Statistics may be called the science of counting.” - A.L. Bowley.

This definition covers only one aspect, i.e., counting, but the other aspects such as classification, tabulation, etc., have been ignored. As such, the definition is inadequate and incomplete.

“Statistics may be defined as the collection, presentation, analysis and interpretation of numerical data.” - Croxton and Cowden.

2 STATISTICS (CPT)


This definition given by Croxton and Cowden is simple, clear and concise.

According to this definition, statistics covers the collection of data, presentation of data, analysis of data, and interpretation. However, one more stage may be added, and that is the organisation of data. Taking organisation together with collection, there are four stages:

1. Collection and Organisation of data. There are various methods for collecting the data, such as census, sampling, primary and secondary data, etc.

2. Presentation of data. The mass data collected should be presented in a suitable, concise form, as the mass data collected is difficult to understand and analyse.

3. Analysis of data. The mass data collected should be presented in a suitable, concise form for further analysis. Analysis includes condensation, summarisation, conclusion, etc., through the means of measures of central tendencies, dispersion, skewness, kurtosis, correlation, regression, etc.

4. Interpretation of data. The last step is the drawing of conclusions from the data collected, as the figures do not speak for themselves.
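The analysis stage can be illustrated with a small calculation. The following is a minimal sketch, assuming Python and its standard `statistics` module; the marks are hypothetical figures invented for illustration, not data from this chapter.

```python
import statistics

# Hypothetical marks obtained by seven students (illustrative only).
marks = [42, 55, 55, 61, 68, 70, 83]

# Measures of central tendency.
mean_marks = statistics.mean(marks)
median_marks = statistics.median(marks)
mode_marks = statistics.mode(marks)

# Measures of dispersion.
range_marks = max(marks) - min(marks)
std_dev = statistics.pstdev(marks)  # population standard deviation

print(f"Mean: {mean_marks}, Median: {median_marks}, Mode: {mode_marks}")
print(f"Range: {range_marks}, Standard deviation: {std_dev:.2f}")
```

The summary figures condense the raw marks; interpreting them (for example, judging whether the spread is large for this class) remains a separate step.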

Having briefly discussed some of the definitions of the term statistics and having seen their drawbacks, we are now in a position to give a complete definition of ‘statistics’ in the following words:

Statistics (as used in the sense of data) are statements of facts capable of analysis and interpretation, and the science of statistics is a study of the principles and methods used in the collection, presentation, analysis and interpretation of numerical data in any sphere of enquiry.

IMPORTANCE AND SCOPE OF STATISTICS

I. Statistics and Economics

According to Prof. Alfred Marshall, “Statistics are the straws out of which I, like every other economist, have to make bricks.” The following are some of the fields of economics where statistics is extensively used.

(a) Consumption. Statistical data of consumption enable us to find out the ways in which people in different strata of society spend their incomes.

(b) Production. The statistics of production describe the total productivity in the country. This enables us to compare ourselves with other countries of the world.

(c) Exchange. In the field of exchange, an economist studies markets, laws of prices as determined by the forces of demand and supply, cost of production, monopoly, competition, banking, etc. A systematic study of all these can be made only with the help of statistics.

(d) Econometrics. With the help of econometrics, economics has become an exact science. Econometrics is the combination of economics, mathematics and statistics.

(e) Public Finance. Public finance studies the revenue and expenditure activities of a country. Budget (a statistical document), fiscal policy, deficit financing, etc., are the concepts of economics which are based on statistics.


(f) Input-Output Analysis. The input-output analysis is based on statistical data which explain the relationship between the input and the output. Sampling, Time series, Index numbers, Probability, Correlation and Regression are some other concepts which are used in economic analysis.

II. Statistics and Commerce

Statistical methods are widely applied in the solution of most of the business and trade activities such as production, financial analysis, costing, manpower, planning, business, market research, distribution and forecasting, etc. A shrewd businessman always makes a proper and scientific analysis of the past records in order to predict the future course of the business conditions. Index numbers help in predicting the future course of business and economic events. Statistics or statistical methods help the business establishments in analysing the business activities such as:

(a) Organisation of Business. A businessman makes extensive use of statistical data to arrive at the conclusion which guides him in establishing a new firm or business house.

(b) Production. The production department of an organisation prepares the forecast regarding the production of the commodity with the help of statistical tools.

(c) Scientific Management and Business Forecasting. Better and efficient control of a business can be achieved by scientific management with the help of statistical data. “The success of a businessman lies in the accuracy of the forecast made.” “The successful businessman is one whose estimate most closely approaches accuracy,” said Prof. Boddington.

(d) Purchase. The price statistics of different markets help the businessman in arriving at the correct decisions. Raw material is purchased from those markets only where the prices are low.

III. Statistics and ‘Auditing and Accounting’

Statistics is widely used in accounting and auditing.

IV. Statistics and Economic Planning

According to Prof. Dickinson,

“Economic Planning is making of major decisions - what and how much is to be produced, and to whom it is to be allocated - by the conscious decisions of a determinate authority on the basis of a comprehensive survey of economy as a whole.” The various documents accompanying, preceding and following each of the eight Five Year Plans of India are a standing testimony to the fact that statistics is an indispensable tool in economic planning.

V. Statistics and Astronomy

Statistics were first collected by astronomers for the study of the movement of stars and planets. As there are a few things which are common between physical sciences and statistical methods, astronomers apply statistical methods to go deep in their study. Astronomers generally take a large number of measurements, and in most cases there is some difference between these several observations. In order to have the best possible measurement, they have to make use of the technique of the law of errors in the form of the method of least squares.
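The idea behind the method of least squares can be sketched as follows. For repeated measurements of a single quantity, the estimate that minimises the sum of squared errors is the arithmetic mean. The readings below are hypothetical, and Python is used only for illustration.

```python
# Hypothetical repeated measurements of the same quantity (illustrative only).
measurements = [10.2, 9.8, 10.1, 9.9, 10.0]


def sum_of_squared_errors(estimate):
    """Sum of squared differences between each measurement and the estimate."""
    return sum((m - estimate) ** 2 for m in measurements)


# For a single unknown quantity, the least-squares estimate is the mean.
best_estimate = sum(measurements) / len(measurements)

# Candidates on either side of the mean give a larger sum of squared errors.
for candidate in (best_estimate - 0.1, best_estimate, best_estimate + 0.1):
    error = sum_of_squared_errors(candidate)
    print(f"estimate={candidate:.2f}  sum of squared errors={error:.4f}")
```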

VI. Statistics and Meteorology

Statistics is related to meteorology. To compare the present with the past or to forecast for the future either temperature or humidity of air or barometrical pressures, etc., it becomes necessary to average these figures and thus to study their trends and fluctuations. All this cannot be done without the use of statistical methods. Thus, the science of statistics helps meteorology in a large number of ways.
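One common way of averaging such figures to study a trend is a moving average. The sketch below assumes a three-day window and uses hypothetical daily temperatures; neither the window nor the readings come from this chapter.

```python
# Hypothetical daily maximum temperatures in degrees Celsius (illustrative only).
temperatures = [30, 32, 31, 35, 36, 34, 38]


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


trend = moving_average(temperatures, window=3)
print(trend)
```

Averaging smooths out the day-to-day fluctuations, so the rising trend in the readings is easier to see than in the raw figures.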

VII. Statistics and Biology

The development of biological theories has been found to be closely associated with statistical methods. Professor Karl Pearson in his Grammar of Science has written, “The whole doctrine of heredity rests on statistical basis.”

VIII. Statistics and Mathematics

Mathematics and Statistics have been closely in touch with each other ever since the 17th century, when the theory of probability was found to have influence on various statistical methods.

Bowley was right when he said, “A knowledge of Statistics is like a knowledge of foreign language or of algebra; it may prove of use at any time under any circumstances.”

Thus, we observe that:

“Sciences without statistics bear no fruit, statistics without sciences have no root.”

IX. Statistics and Research

Statistical techniques are indispensable in research work. Most of the advancement in knowledge has taken place because of experiments conducted with the help of statistical methods.

X. Statistics and Natural Sciences

Statistics finds an extensive application in physical sciences, especially in engineering, physics, chemistry, geology, mathematics, astronomy, medicine, botany, meteorology, zoology, etc.

XI. Statistics and Education

There is an extensive application of statistics in Education. Statistics is necessary for formulation of policies to start new courses, consideration of facilities available for new courses, etc.

XII. Statistics and Business

Statistics is an indispensable tool in all aspects of business. When a man enters business he enters the profession of forecasting, because success in business is always the result of precision in forecasting, and failure in business is very often due to wrong expectations, which arise in turn due to faulty reasoning and inaccurate analysis of various causes affecting a particular phenomenon.


Boddington observes, “The successful businessman is the one whose estimate most closely approaches accuracy.”

LIMITATIONS OF STATISTICS

Statistics and its techniques are widely used in every branch of knowledge. W.I. King rightly says: “Science of statistics is the most useful servant, but only of great value to those who understand its proper use.” The scope of statistics is very wide and it has great utility, but these are restricted by its limitations. Following are the important limitations of statistics:

1. Statistics does not deal with individual items. King says, “Statistics from the very nature of the subject cannot and never will be able to take into account individual cases.” Statistics proves inadequate where one wants to study individual cases. Thus, it fails to reveal the true position.

2. Statistics deals with quantitative data. According to Prof. Horace Secrist, “Some phenomena cannot be quantitatively measured; honesty, resourcefulness, integrity, goodwill, all important in industry as well as in life, are generally not susceptible to direct statistical measurement.”

3. Statistical laws are true only on averages. According to W.I. King, “Statistics largely deals with averages and these may be made up of individual items radically different from each other.” Statistics are the means and not a solution to a problem.

4. Statistics does not reveal the entire story. According to Marshall, “Statistics are the straws out of which I, like every other economist, have to make bricks.” Croxton says: “It must not be assumed that statistical method is the only method for use in research; neither should this method be considered the best attack for every problem.”

5. Statistics is liable to be misused. According to Bowley, “Statistics only furnishes a tool though imperfect, which is dangerous in the hands of those who do not know its use and deficiencies.” W.I. King states, “Statistics are like clay of which you can make a God or Devil as you please.” He remarks, “Science of Statistics is the useful servant, but only of great value to those who understand its proper use.”

6. Statistical data should be uniform and homogeneous.
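Limitation 3 above can be seen in a small calculation: two groups may share the same average while being made up of radically different items. The sketch below uses Python's standard `statistics` module and hypothetical incomes chosen only to illustrate the point.

```python
import statistics

# Hypothetical incomes of two groups of four people (illustrative only).
group_a = [50, 50, 50, 50]
group_b = [5, 5, 5, 185]

mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# The averages agree even though the individual items differ widely.
print(f"Group A: mean={mean_a}, largest={max(group_a)}, smallest={min(group_a)}")
print(f"Group B: mean={mean_b}, largest={max(group_b)}, smallest={min(group_b)}")
```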

STATISTICAL TOOLS USED IN ECONOMIC ANALYSIS

The following are some of the important statistical techniques which are applied in economic analysis.

(a) Collection of data; (b) Tabulation;

(c) Measures of Central Tendency; (d) Measures of Dispersion

(e) Time Series; (f) Probability;

(g) Index Numbers; (h) Sampling and its uses;

(i) Business Forecasting;

(j) Tests of Significance and Analysis of Variance; (k) Statistical Quality Control.


COLLECTION OF DATA

Data. The information collected through surveys, in a routine manner or from other sources is called raw data. The word data means information (its literal meaning is given as facts). The adjective raw attached to data indicates that the information thus collected and recorded cannot be put to any use immediately and directly. It has to be converted into a more suitable form or processed before it begins to make sense and can be utilised gainfully. Raw data are statistical data in original form, before any statistical techniques are used to refine, process or summarise them.

There are two types of statistical data:

(i) Primary data (ii) Secondary data

1. Primary data. It is the data collected by a particular person or organisation for his own use from the primary source.

2. Secondary data. It is the data collected by some other person or organisation for their own use, but the investigator also gets it for his use.

In other words, the primary data are those data which are collected by you to meet your own specific purpose, whereas the secondary data are those data which are collected by somebody else.

The same data can be primary for one person and secondary for another.

Methods of Collecting Primary Data

The primary data can be collected by the following methods

1. Direct personal observation. In this method, the investigator collects the data personally and, therefore, it gives reliable and correct information.

2. Indirect oral investigation. In this method, a third person is contacted who is expected to know the necessary details about the persons for whom the enquiry is meant.

3. Estimates from the local sources and correspondence. Here the investigator appoints agents and correspondents to collect the data.

4. Data through questionnaires. The data can be collected by preparing a questionnaire and getting it filled by the persons concerned.

5. Investigations through enumerators. This method is generally employed by the Government for population census, etc.

Methods of Collecting Secondary Data

The secondary data can be collected from the following sources

1. Information collected through newspapers and periodicals.

2. Information obtained from the publications of trade associations.

3. Information obtained from the research papers published by University departments or research bureaus or U. C. C.


4. Information obtained from the official publications of the central, state, and the local governments dealing with crop statistics, industrial statistics, trade and transport statistics, etc.

5. Information obtained from the official publications of the foreign governments or international organisations.

CLASSIFICATION OF DATA

The process of arranging things in groups or classes according to their common characteristics and affinities is called the classification of data.

“Classification is the process of arranging data into sequences and groups according to their common characteristics or separating them into different but related parts.” - Secrist.

Thus classification is the process of arranging the available data into various homogeneous classes and sub-classes according to some common characteristic or attribute or objective of investigation.

REQUISITES OF A GOOD CLASSIFICATION

The main characteristics of a good classification are

1. It should be exhaustive. 2. It should be unambiguous.

3. It should be mutually exclusive. 4. It should be stable.

5. It should be flexible. 6. It should have suitability.

7. It should be homogeneous. 8. It should be a revealing classification.

ADVANTAGES OF CLASSIFICATION OF DATA

(i) It condenses the data and ignores unnecessary details.

(ii) It facilitates comparison of data.

(iii) It helps in studying the relationships between several characteristics.

(iv) It facilitates further statistical treatments.

TYPES OF CLASSIFICATION OF DATA

There are four types of classification of data:

(i) Quantitative Classification; (ii) Temporal Classification;

(iii) Spatial Classification and (iv) Qualitative Classification.

(i) Quantitative Classification: When the basis of classification is according to differences in quantity, the classification is called quantitative.

A quantitative classification refers to classification that is based on figures. In other words, it is a classification which is based on such characteristics which are capable of quantitative measurement, such as height, weight, number of marks obtained by students of a class.

(ii) Temporal Classification: When the basis of classification is according to differences in time, the classification is called temporal or chronological classification.


(iii) Spatial or Geographical Classification: When the basis of classification is according to geographical location or place, the classification is called spatial or geographical.

(iv) Qualitative Classification: When the basis of classification is according to characteristics or attributes, like social status, etc., the classification is called qualitative classification.

Classification according to attributes is a method in which the data are divided on the basis of qualities (e.g., married or single; honest or dishonest; beautiful or ugly; or on the basis of religion, viz., Hindu, Muslim, Sikh, Christian, etc.), known as attributes, which cannot be measured quantitatively.

Classification of this nature is of two types:

(i) Simple Classification or Two-fold Classification.

(ii) Manifold Classification.

1. Simple Classification or Two-fold Classification

If the data are classified only into two categories according to the presence or absence of only one attribute, the classification is known as simple, two-fold or dichotomous classification. For example, the population of India may be divided into males and females, literate and illiterate, etc.

Population

Male Female

Moreover, if the classification is done according to a single attribute, it is also known as One-way Classification.

2. Manifold Classification: It is a classification where more than one attribute is involved.
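The quantitative, two-fold and manifold classifications described above can be sketched with a few records. The example below assumes Python and its standard `collections.Counter`; the records, the field names and the class width of 20 marks are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical records: sex, literacy and marks for six persons.
people = [
    {"sex": "male", "literate": True, "marks": 45},
    {"sex": "female", "literate": True, "marks": 72},
    {"sex": "male", "literate": False, "marks": 38},
    {"sex": "female", "literate": False, "marks": 55},
    {"sex": "male", "literate": True, "marks": 81},
    {"sex": "female", "literate": True, "marks": 47},
]


def class_interval(marks, width=20):
    """Return the class interval (e.g. '40-59') into which the marks fall."""
    lower = marks // width * width
    return f"{lower}-{lower + width - 1}"


# Quantitative classification: grouping by a measurable characteristic.
by_marks = Counter(class_interval(p["marks"]) for p in people)

# Simple (two-fold) classification: one attribute, two categories.
by_sex = Counter(p["sex"] for p in people)

# Manifold classification: more than one attribute considered together.
by_sex_and_literacy = Counter((p["sex"], p["literate"]) for p in people)

print(by_marks)
print(by_sex)
print(by_sex_and_literacy)
```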

MODE OF PRESENTATION OF DATA

In this section we shall consider the following three modes of presentation of data.

(a) Textual presentation;

(b) Tabular presentation or Tabulation;

(c) Diagrammatic presentation.

TEXTUAL PRESENTATION

This method comprises presenting data with the help of a paragraph or a number of paragraphs. The official report of an enquiry commission is usually made by textual presentation. Following are the examples of textual presentation.

Example 1. In 1995, out of total of 2,000 students in a college, 1,400 were for graduation and the rest for post-graduation (P.G.). Out of 1,400 Graduate students 100 were girls; however, in all there were 600 girls in the college. In 2000, number of graduate students increased to 1,700 out of which 250 were girls, but the number of P.G. students fell to 500 of which only 50 were boys. In 2005, out of 800 girls 650 were for graduation, whereas the total number of graduates was 2,200. The number of boys and girls in P.G. classes was equal.


Example 2. Out of total number of 2,807 women, who were interviewed for employment in a textile factory, 912 were from textile areas and the rest from non-textile areas. Amongst the married women who belonged to textile areas, 347 were having some work experience and 173 did not have work experience, while for non-textile areas the corresponding figures were 199 and 670 respectively. The total number of women having no experience was 1,841, of whom 311 resided in textile areas. Of the total number of women, 1,418 were unmarried, and of these the number of women having experience in the textile and non-textile areas was 254 and 166 respectively.

Merits and Demerits of Textual Presentation

The merit of this mode of presentation lies in its simplicity, and even a layman can present and understand the data by this method. The observations with exact magnitude can be presented with the help of textual presentation. This type of presentation can be taken as the first step towards the other methods of presentation.

Textual presentation, however, is not preferred by a statistician simply because it is dull and monotonous, and comparison between different observations is not possible in this method. For manifold classification, this method cannot be recommended.

TABULAR PRESENTATION OR TABULATION OF DATA

Tabulation is a scientific process used in setting out the collected data in an understandable form.

Tabulation may be defined as logical and systematic arrangement of statistical data in rows and columns. It is designed to simplify the presentation of data for the purposes of analysis and statistical inferences.

Secrist has defined tabulation in the following words:

“Tables are a means of recording in permanent form the analysis, that is, made through classification and by placing in juxtaposition things that are similar and should be compared.”

- Secrist

The above definition clearly points out that tabulation is a process which gives classification of data in a systematic form and is meant for the purpose of making comparative studies.

Professor Bowley refers to tabulation as

“The intermediate process between the accumulation of data in whatsoever form they are obtained, and the final reasoned account of the result shown by the statistics”.

- Bowley.

“Tabulation is the process of condensing classified data in the form of a table so that it may be more easily understood and so that any comparison involved may be more readily made.”

- D. Gregory and H. Ward.

Thus tabulation is one of the most important and ingenious devices of presenting the data in a condensed and readily comprehensible form. It attempts to furnish the maximum information in the minimum possible space, without sacrificing the quality and usefulness of the data.


Objectives of Tabulation

The purpose of tabulation is to summarise lots of information in such a simple manner that it can be easily analysed and interpreted.

The main objectives of the Tabulation are

1. To simplify the complex data.

2. To clarify the objective of investigation.

3. To economise space.

4. To facilitate comparison.

5. To depict trend and pattern of data.

6. To help reference for future studies.

7. To facilitate statistical analysis.

8. To detect errors and omissions in the data.

9. To clarify the characteristics of data.

Essential Parts of a Statistical Table

A good statistical table should invariably have the following parts:

1. Table Number. A table should be numbered for identification, especially when there are a large number of tables in a study. The number may be put at the centre, above the title or at the bottom of the table.

2. Title of the table. Every table should have a title. It should be clear, brief and self-explanatory. The title should be set in bold type so as to give it prominence.

3. Date. The date of preparation of table should always be written on the table. It enables one to recollect the chronological order of the tables prepared.

4. Stubs or Row designations. Each row of the table must have a heading. The designations of the rows are called stubs or stub items. Stubs clarify the figures in the rows. As far as possible, the items should be condensed so that they can be included in a single row.

5. Captions or Column headings. A table has many columns. Sub-headings of the columns are called captions or headings. They should be well-defined and brief.

6. Body of the Table. It is the most vital part of the table. It contains the numerical information. It should be made as comprehensive as possible. The actual data should be arranged in such a manner that any figure may be readily located. Different categories of numerical variables should be set out in an ascending order, from left to right in rows and in the same fashion in the columns, from top downwards.

7. Unit of Measurements. The unit of measurement should always be stated along with the title, if this is uniform throughout. If different units have been adopted, then they should be stated along the stubs or captions.

8. Source Notes. A note at the bottom of the table should always be given to indicate the primary source as well as the secondary source from where the data has been taken, particularly when there is more than one source.

9. Foot Notes and References. These are always placed at the bottom of the table. A footnote is a statement which contains explanation of some specific items which cannot be understood by the reader from the title, captions or stubs.
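The parts listed above can be pictured as the inputs to a small table-building routine. The following is only a sketch in Python: the function `render_table`, its parameter names, the column width and all figures are hypothetical, chosen to show where each part of a table sits.

```python
def render_table(number, date, title, unit, captions, rows, source, footnote=None):
    """Return a plain-text table assembled from the essential parts."""
    width = 18
    # Table number, date, title and unit of measurement go at the top.
    lines = [f"Table {number} (dated {date})", title, f"(Unit: {unit})"]
    # Captions: one heading per column; the first heads the stub column.
    lines.append("".join(caption.ljust(width) for caption in captions))
    # Each row starts with its stub, followed by the figures of the body.
    for stub, *figures in rows:
        lines.append(stub.ljust(width) + "".join(str(f).ljust(width) for f in figures))
    # Source note and footnote are placed at the bottom.
    lines.append(f"Source: {source}")
    if footnote:
        lines.append(f"Note: {footnote}")
    return "\n".join(lines)


table_text = render_table(
    number=1,
    date="1 January 2006",
    title="Students in a college by course (hypothetical figures)",
    unit="number of students",
    captions=["Year", "Graduation", "Post-graduation"],
    rows=[("Year A", 120, 40), ("Year B", 150, 35)],
    source="Illustrative data, not from a real survey",
    footnote="Figures are invented for this sketch.",
)
print(table_text)
```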

Format of a Table

The following is the specimen or format of a table indicating the above parts.

Number and Date : No.DC 1172/25/2/2006

Example 3. In 1995, out of a total of 2,000 students in a college, 1,400 were for graduation and the rest for Post-graduation (P.G.). Out of 1,400 Graduate students 100 were girls; however, in all there were 600 girls in the college. In 2000, the number of graduate students increased to 1,700 out of which 250 were girls, but the number of P.G. students fell to 500 of which only 50 were boys. In 2005, out of 800 girls 650 were for graduation, whereas the total number of graduates was 2,200. The number of boys and girls in P.G. classes was equal.

Represent the above information in tabular form. Also calculate the percentage increase in the number of graduate students in 2005 as compared to 1995.

Solution.

Source: University Paper. No. PC 293 / 5 /3 / 2006.

Percentage increase in the number of graduate students in 2005 as compared to 1995

= (2200 - 1400) / 1400 x 100 = 57.14%
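The table for Example 3 can be rebuilt from the figures in the problem, since every missing cell follows by subtraction. The following is a minimal Python sketch of that reasoning; the dictionary layout and the names `table` and `graduates` are illustrative assumptions, not part of the original solution.

```python
# Figures stated in Example 3; every other cell is derived by subtraction.
table = {}

# 1995: 2,000 students, 1,400 graduates, 100 graduate girls, 600 girls in all.
table[1995] = {
    "grad_boys": 1400 - 100, "grad_girls": 100,
    "pg_boys": (2000 - 1400) - (600 - 100), "pg_girls": 600 - 100,
}
# 2000: 1,700 graduates (250 girls); 500 P.G. students (50 boys).
table[2000] = {
    "grad_boys": 1700 - 250, "grad_girls": 250,
    "pg_boys": 50, "pg_girls": 500 - 50,
}
# 2005: 800 girls (650 graduates); 2,200 graduates; P.G. boys = P.G. girls.
table[2005] = {
    "grad_boys": 2200 - 650, "grad_girls": 650,
    "pg_boys": 800 - 650, "pg_girls": 800 - 650,
}


def graduates(year):
    """Total number of graduate students in the given year."""
    row = table[year]
    return row["grad_boys"] + row["grad_girls"]


for year, row in table.items():
    print(year, row, "total =", sum(row.values()))

increase = (graduates(2005) - graduates(1995)) / graduates(1995) * 100
print(f"Increase in graduate students, 1995 to 2005: {increase:.2f}%")
```

The derived totals (2,000, 2,200 and 2,500 students) serve as a check that the cells are mutually consistent.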

Example 4. Out of a total number of 2,807 women, who were interviewed for employment in a textile factory, 912 were from textile areas and the rest from non-textile areas. Amongst the married women, who belonged to textile areas, 347 were having some work experience and 173 did not have work experience, while for non-textile areas the corresponding figures were 199 and 670 respectively. The total number of women having no experience was 1,841 of whom 311 resided in textile areas. Of the total number of women, 1,418 were unmarried and of these the number of women having experience in the textile and non-textile areas was 254 and 166 respectively.

Tabulate the above information.

Solution.
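The cells of the required table can be worked out from the stated totals. The sketch below is one possible way to do this in Python; the `cells` dictionary and the variable names are assumptions made for illustration. The two figures not given directly (unmarried women without experience in each area) are obtained by subtracting the married figures from the "no experience" totals.

```python
TOTAL, TEXTILE = 2807, 912
NO_EXP_TOTAL, NO_EXP_TEXTILE = 1841, 311
UNMARRIED = 1418

# cells[(marital status, area)] = (with experience, without experience)
cells = {
    ("married", "textile"): (347, 173),
    ("married", "non-textile"): (199, 670),
    # Unmarried without experience = all without experience - married ones.
    ("unmarried", "textile"): (254, NO_EXP_TEXTILE - 173),
    ("unmarried", "non-textile"): (166, (NO_EXP_TOTAL - NO_EXP_TEXTILE) - 670),
}

print(f"{'Status':<10}{'Area':<13}{'Exp.':>6}{'No exp.':>9}{'Total':>7}")
for (status, area), (exp, no_exp) in cells.items():
    print(f"{status:<10}{area:<13}{exp:>6}{no_exp:>9}{exp + no_exp:>7}")

# Cross-checks against the totals stated in the problem.
grand_total = sum(e + n for e, n in cells.values())
textile_total = sum(e + n for (_, a), (e, n) in cells.items() if a == "textile")
unmarried_total = sum(e + n for (s, _), (e, n) in cells.items() if s == "unmarried")
print(grand_total == TOTAL, textile_total == TEXTILE, unmarried_total == UNMARRIED)
```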

Source: Institute of Chartered Accountants of India. No. CA/6/5/1998.

Example 5. In a trip organized by a college, there were 100 persons and the average cost works out to Rs. 15.60 per head. There were 80 students each of whom paid Rs. 16. Members of the teaching staff were charged at a higher rate. The number of servants was 6 (all males) and they were not charged. The number of ladies was 20% of the total, of which two were lady staff members.

Tabulate the above information in proper tabular form.

Solution.

Source: Institute of Chartered Accountants of India. No. CA/7/11/1998
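The figures needed for the table of Example 5 are not all given directly; the number of staff members, the rate charged to them and the split by sex must be derived. A short Python sketch of that arithmetic follows. The variable names and the layout of `rows` are illustrative assumptions.

```python
persons, average_cost = 100, 15.60
students, student_fee = 80, 16
servants = 6                     # all male, not charged
ladies = 20 * persons // 100     # 20% of the total
lady_staff = 2

staff = persons - students - servants
total_collected = round(persons * average_cost)   # Rs. 1,560 in all
staff_fee = (total_collected - students * student_fee) / staff

lady_students = ladies - lady_staff   # no lady servants, so the rest are students

# rows[participant] = (males, females, rate per head in Rs.)
rows = {
    "Students": (students - lady_students, lady_students, student_fee),
    "Teaching staff": (staff - lady_staff, lady_staff, staff_fee),
    "Servants": (servants, 0, 0),
}
for name, (males, females, fee) in rows.items():
    head_count = males + females
    print(f"{name:<15}{males:>4}{females:>4}{head_count:>5}{fee:>8.2f}{head_count * fee:>10.2f}")
```

Under these figures there are 14 staff members, each charged Rs. 20, and the students comprise 62 males and 18 females.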

DIFFERENCE BETWEEN TEXTUAL AND TABULAR PRESENTATION

The tabulation method is usually preferred to textual presentation as

(i) It facilitates comparison between rows and columns.

(ii) Complicated data can be represented using tabulation.

(iii) Without tabulation, statistical analysis of data is not possible.

(iv) It is a must for diagrammatic representation.

DIAGRAMMATIC PRESENTATION OF DATA

The representation of statistical data through charts, diagrams and pictures is another attractive alternative method. Unlike the first two methods of representation of data, diagrammatic representation can be used for both the educated and the uneducated sections of society. Furthermore, any hidden trend present in the given data can be noticed only in this mode of representation. However, compared to tabulation, this is less accurate. So if accuracy is the priority, tabulation is to be recommended.

In this chapter we shall consider the following three types of diagrams

I. Line diagram or Historigram;

II. Bar diagram;

III. Pie chart.

Rules For Construction of Diagrams

1. Title: Each diagram should be given a suitable title, either at the top or at the bottom. It should clearly convey the main theme which the diagram intends to portray. The title should be brief, self-explanatory, clear and unambiguous.

2. Foot Notes: Foot notes are given at the bottom of each diagram in order to clarify some points relating to the diagram.

3. Source Note: Source note, wherever possible, should be appended at the bottom of the diagram. It is given to indicate the source from where the data have been taken.

4. Neat and Attractive: Diagrams should be made very neat, clean and attractive by proper size and lettering.

5. Size of diagram: The size of diagram would depend on the magnitude of data which is to be shown. The size should be such that all the important characteristics of the data are properly emphasised and can be understood by a look at the diagram.

6. Selection of scale: Scale of presentation should be consistent with the size of the paper and the size of the observations to be displayed, so that the diagram obtained is neither too small nor too big. The lettering of the scale must be very clear and legible.

7. Index: A brief index explaining various types of shades, colours, lines and designs used in the diagram should be given for clear understanding of the diagram.

8. Simplicity: The diagrams should be as simple as possible so that they are easily understood even by a layman who does not have any mathematical background.

9. Proportion between length and breadth: The length and breadth of the diagram should be consistent with the space available.

10. Choice of a diagram: The choice of a diagram for the given data (or data under consideration) should be made with utmost caution and care.

LINE DIAGRAM OR HISTORIGRAM

It is the simplest of all the diagrams. Line graphs are simple mathematical graphs that are drawn on graph paper by plotting the data concerning one variable on the horizontal axis, known as the x-axis, and the other variable on the vertical axis, known as the y-axis. With the help of such graphs, the effect of one variable upon another variable during an experimental or normative study may be clearly demonstrated. The x values are presented along the x-axis and the corresponding frequencies are presented along the y-axis. This diagram facilitates comparison. Even time series data can be presented by a line diagram. The distance between the lines is, generally, kept uniform.

When we observe the values of a variable at different periods of time, the series so formed is known as a Time series. We take the time on the X-axis and the value of the variable on the Y-axis, and join the points by straight lines. This is known as a Line graph or the graph of a Time series. A time series graph is also known as a Historigram.

This is the simplest and easiest graph, and no technical skill is needed in its construction. Moreover, it enables an individual to present more information in a more precise form than any other kind of chart can do. Two or more variables can be shown on the same graph and thus comparison becomes easy. Graphs of Time series can be constructed either on a natural scale or on a ratio scale. On a natural scale, absolute changes from one period to another are shown, whereas on a ratio scale the relative changes are shown.

Example 6. A word-nonsense syllables association test was administered to the students of class X to demonstrate the effect of practice on learning. The data so obtained may be studied from the following table:

Trial No. : 1 2 3 4 5 6 7 8 9 10 11 12

Score : 4 5 8 8 10 13 12 12 14 16 16 16

Draw a line graph for the representation and interpretation of the above data.

Solution. Plot the points (1, 4), (2, 5), (3, 8), (4, 8), (5, 10), (6, 13), (7, 12), (8, 12), (9, 14), (10, 16), (11, 16) and (12, 16). Join them to get the required line graph as given by figure 1.1 on page 1.14.
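As an illustration of the plotting step, the points of Example 6 can be generated and rendered roughly in text form. This is only a sketch using the Python standard library; the crude character plot stands in for graph paper, and the names `points` and `changes` are assumptions made here.

```python
trials = list(range(1, 13))
scores = [4, 5, 8, 8, 10, 13, 12, 12, 14, 16, 16, 16]

# The points to be plotted: trial number on the x-axis, score on the y-axis.
points = list(zip(trials, scores))

# Change in score between successive trials, i.e. the rise of each line segment.
changes = [later - earlier for earlier, later in zip(scores, scores[1:])]

# Crude text rendering: one row per score level, '*' where a point falls.
for level in range(max(scores), min(scores) - 1, -1):
    row = "".join(" * " if s == level else "   " for s in scores)
    print(f"{level:>2} |{row}")
print("   +" + "---" * len(trials))
print("    " + "".join(f"{t:^3}" for t in trials))
```

The list `changes` shows the interpretation at a glance: the score rises over most trials, dips once after trial 6 and levels off at 16 from trial 10 onwards.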

BAR DIAGRAM

The simplest type of graph is the bar diagram. It is especially useful in comparing qualitative data or quantitative data of discrete type. A bar diagram is a graph on which the data are represented in the form of bars. It consists of a number of bars or rectangles which are of uniform width with equal space between them on the x-axis. The length of the bar is proportional to the value it represents. It should be seen that the bars are neither too short nor too long. The scale should be clearly indicated and the base line be clearly shown.

Bars may be drawn either horizontally or vertically. A good rule to use in determining the direction is that if the legend describing the bar can be written under the bars when drawn vertically, vertical bars should be used; when it cannot be, horizontal ones must be used. In this way, the legends can be read without turning the graph. The descriptive legend should not be written at the ends of the bars or within the bars, since such writing may distort the comparison. Usually the diagram will be more attractive if the bars are wider than the spaces between them.

The width of bars is not governed by any set rules. It is an arbitrary factor. Regarding the space between two bars, it is conventional to have a space about one half of the width of a bar.

The data capable of representation through bar diagrams may be in the form of raw scores, or total scores, or frequencies, or computed statistics and summarised figures like percentages and averages etc.

The bar diagram is generally used for comparison of quantitative data. It is also used in presenting data involving a time factor. When two or more sets of data over a certain period of time are to be compared, a group bar diagram is prepared by placing the related data side by side in the shape of bars. The bars may be vertical or horizontal in a bar diagram. If the bars are placed horizontally, it is called a Horizontal Bar Diagram. When the bars are placed vertically, it is called a Vertical Bar Diagram.

There are six types of Bar diagram: (i) Simple Bar Diagram; (ii) Multiple or Grouped Bar Diagram; (iii) Subdivided or Component Bar Diagram; (iv) Percentage Subdivided Bar Diagram; (v) Deviation or Bilateral Bar Diagram; (vi) Broken Bars.

Simple Bar Diagram

It is used to compare two or more items related to a variable. In this case, the data are presented with the help of bars. These bars are usually arranged according to relative magnitude of bars. The length of a bar is determined by the value or the amount of the variable. A limitation of the Simple Bar Diagram is that only one variable can be represented on it.

Example 7. Draw a Bar chart of the procurement of rice (in tons) in an Indian state:

Year : 1998 1999 2000 2001 2002 2003

Rice (in tons) : 4500 5700 6100 6500 4300 7800

Solution. The given data is represented by the Simple Bar Diagram as only one variable is to be represented.

Fig. 1.2. Bar diagram of the procurement of rice.

Here we represent years on the x-axis and procurement of rice on the y-axis.
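The rule that the length of each bar is proportional to the value it represents can be sketched with a simple text chart. In the Python snippet below the scale of one character per 100 tons is an assumption chosen for illustration, and the bars are drawn horizontally only because that suits plain text; the helper name `bar` is likewise hypothetical.

```python
years = [1998, 1999, 2000, 2001, 2002, 2003]
rice_tons = [4500, 5700, 6100, 6500, 4300, 7800]

UNIT = 100  # assumed scale: one '#' character stands for 100 tons


def bar(value, unit=UNIT):
    """Return a bar whose length is proportional to value."""
    return "#" * (value // unit)


for year, tons in zip(years, rice_tons):
    print(f"{year} | {bar(tons):<80} {tons}")
```

All bars share the same width (one text line) and the same base line, so only their lengths differ, which is what makes the comparison valid.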

Multiple or Grouped Bar Diagram

A multiple or grouped bar diagram is used when a number of items are to be compared in respect of two, three or more values. In this case, the numerical values of major categories are arranged in ascending or descending order so that the categories can be readily distinguished. Different shades or colours are used for each category.

Example 8. Represent the following data by a suitable diagram showing the difference between proceeds and costs:

Proceeds and costs of a firm (in thousands of rupees).

Year    Total Proceeds    Total costs
1999    22.0              19.5
2000    27.3              21.7
2001    28.2              30.0
2002    30.3              25.6
2003    32.7              26.1
2004    33.3              34.2

Solution. In this case, the two types of information, proceeds and costs, are shown on a diagram indicating the difference between them. The two types of information for each year are placed in such a way that a comparison may be made between them. So a grouped bar diagram is drawn in order to represent the given data.

Fig. 1.3. Grouped bar diagram showing proceeds and costs.
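The difference between proceeds and costs that the grouped bars are meant to bring out can also be computed directly. The following Python sketch pairs the two bars for each year in text form; the names `data`, `difference` and `loss_years`, and the scale of one character per thousand rupees, are illustrative assumptions.

```python
# year: (total proceeds, total costs), in thousands of rupees
data = {
    1999: (22.0, 19.5), 2000: (27.3, 21.7), 2001: (28.2, 30.0),
    2002: (30.3, 25.6), 2003: (32.7, 26.1), 2004: (33.3, 34.2),
}

# Difference = proceeds - costs; a negative value means costs exceeded proceeds.
difference = {year: round(p - c, 1) for year, (p, c) in data.items()}
loss_years = [year for year, d in difference.items() if d < 0]

# Two adjacent bars per year, one shade ('#') for proceeds and one ('=') for costs.
for year, (proceeds, costs) in data.items():
    print(f"{year} Proceeds {'#' * round(proceeds):<35} {proceeds}")
    print(f"     Costs    {'=' * round(costs):<35} {costs}")
    print(f"     Difference: {difference[year]:+.1f}")
```

In this data the costs bar is the longer of the pair only in 2001 and 2004.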

Sub-divided or Component Bar Diagram

A component bar diagram is one which is formed by dividing a single bar into several component parts. A single bar represents the aggregate value whereas the component parts represent the component values of the aggregate value. It shows the relationship among the different parts and also between the different parts and the main bar.

Example 9. Represent the following data of the development expenditure of Central and State Governments in India during 1997-98, 1998-99 and 1999-2000 by a bar diagram.

Year         Loans and advances    Capital    Revenue    Total
1997-98      8,601                 3,787      3,477      15,865
1998-99      10,335                4,456      4,036      18,827
1999-2000    11,549                4,803      3,709      20,061

Solution. In this case, we use a subdivided bar diagram. Here the data for each year are represented by the various parts of the single bar for that year. The different parts indicate loans and advances, capital and revenue for each year as shown in the figure.

Fig. 1.4. Development expenditure of central and state governments in India
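To draw a subdivided bar, one needs the height at which each component starts and ends within the bar for its year. A minimal Python sketch of that calculation is given below, using running totals; the names `expenditure` and `segments` are assumptions made for illustration.

```python
from itertools import accumulate

components = ["Loans and advances", "Capital", "Revenue"]
expenditure = {
    "1997-98": [8601, 3787, 3477],
    "1998-99": [10335, 4456, 4036],
    "1999-2000": [11549, 4803, 3709],
}

# For each year, stack the components: each segment runs from the running
# total before it to the running total after it.
segments = {}
for year, parts in expenditure.items():
    tops = list(accumulate(parts))
    segments[year] = list(zip([0] + tops[:-1], tops))

for year, spans in segments.items():
    print(year)
    for name, (start, end) in zip(components, spans):
        print(f"  {name:<20} from {start:>6} to {end:>6}")
```

The top of the last segment in each bar equals the total for that year, which gives a quick check on the figures in the table.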

Percentage Sub-divided Bar Diagram

It consists of one or more bars where each bar totals 100%. Its construction is similar to the sub-divided bar diagram with the only difference that whereas in the sub-divided bar diagram segments are used in absolute quantities, in the percentage bar diagram the quantities are transformed into percentages.

Example 10. Following are the heads of income of Railways during 2004 and 2005.

Heads of income    2004 (in crores of rupees)    2005 (in crores of rupees)
Coaching           26                            31
Goods              40                            39
Others             4.50                          3.50

Represent the above data by a bar chart.

Solution. We are given three types of information: coaching, goods and others for two years. In order to facilitate comparison among them and also between the two years, a percentage subdivided bar chart is drawn to represent the given data. The graph has been drawn on percentage figures. The percentage of each item has been expressed on the total income of the respective year by the formula:

Percentage = (Income of the item / Total income of the respective year) x 100

Calculation of Percentage

Items       2004                      2005
            Income    Percentage      Income    Percentage
Coaching    26        36.88           31        42.18
Goods       40        56.74           39        53.06
Others      4.50      6.38            3.50      4.76
Total       70.50     100.00          73.50     100.00

Percentage sub-divided bar chart showing incomes of Railways.
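The percentage figures used for the chart can be reproduced by applying the formula to each head of income. The following Python sketch does so; the names `income`, `totals` and `percent` are illustrative, and rounding to two decimal places is assumed to match the table.

```python
# head of income: (2004, 2005), in crores of rupees
income = {"Coaching": (26, 31), "Goods": (40, 39), "Others": (4.50, 3.50)}

# Total income of each year (column totals).
totals = [sum(values[i] for values in income.values()) for i in range(2)]

# Percentage = income of the item / total income of the respective year x 100
percent = {
    head: tuple(round(values[i] / totals[i] * 100, 2) for i in range(2))
    for head, values in income.items()
}

for head, (p2004, p2005) in percent.items():
    print(f"{head:<10}{p2004:>8.2f}{p2005:>8.2f}")
```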

Deviation Bars

Deviation bars are popularly used for graphic presentation of net deviations in different values which have both positive and negative values. The positive values are shown above the baseline (or x-axis) and the negative values below it.

Broken Bars

In certain cases we may come across data which contain very wide variations in values, i.e., very small or very large. In order to provide adequate and reasonable shape to the smaller bars, the larger bars may be broken at the top. The value of each bar is written at the top of the bar.

PIE DIAGRAM OR ANGULAR DIAGRAM

A pie diagram is a circular graph which represents the total value with its components. The area of a circle represents the total value and the different sectors of the circle represent the different parts. The circle is divided into sectors by radii and the areas of the sectors are proportional to the angles at the centre. It is generally used for comparing the relation between various components of a value and between components and the total value. In a pie diagram, the data are expressed as percentages. Each component is expressed as a percentage of the total value. A pie diagram is also known as an angular diagram.

The name pie diagram is given to a circle diagram because in determining the circumference of a circle we have to take into consideration a quantity known as ‘pi’ (written as π).

Method of construction. A full circle is known to cover 2π radians or 360 degrees. The data to be represented through a circle diagram may therefore be presented through 360 degrees, parts or sections of a circle. The total frequency or value is equated to 360° and then the angles corresponding to component parts are calculated (or the component parts are expressed as percentages of the total and then multiplied by 360/100 or 3.6). After determining these angles, the required sectors in the circle are drawn. Different shades, colours, designs or types of cross-hatching are used to distinguish the various sectors of the circle.

Example 11. 120 students of a college were asked to opt for different work experiences. The details of these options are as under.

Areas of work experience No. of students

Photography 6

Clay modelling 30

Kitchen gardening 48

Doll making 12

Book binding 24

Represent the above data through a pie diagram.

Solution. The numerical data may be converted into the angles of the circle as given below:

Areas of work experience No. of students Angle of the circle

Photography 6 (6/120) x 360 = 18°

Clay modelling 30 (30/120) x 360 = 90°

Kitchen gardening 48 (48/120) x 360 = 144°

Doll making 12 (12/120) x 360 = 36°

Book binding 24 (24/120) x 360 = 72°

Total 120 360°

Pie diagram: Areas of work experience opted by the students.
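The conversion of frequencies into angles follows one rule: each component's share of the total is multiplied by 360°. A brief Python sketch of the calculation for Example 11 is shown here; the names `options` and `angles` are chosen only for illustration.

```python
options = {
    "Photography": 6,
    "Clay modelling": 30,
    "Kitchen gardening": 48,
    "Doll making": 12,
    "Book binding": 24,
}
total = sum(options.values())

# Angle of a sector = (frequency / total frequency) x 360 degrees
angles = {area: count * 360 / total for area, count in options.items()}

for area, angle in angles.items():
    print(f"{area:<20}{options[area]:>4}{angle:>8.0f}°")
print(f"{'Total':<20}{total:>4}{sum(angles.values()):>8.0f}°")
```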

Example 12: The following figures relate to the cost of construction of a house in Delhi:

Item : Cement Steel Bricks Timber Labour Miscellaneous

Expenditure : 20% 18% 10% 15% 25% 12%

Represent the data by a suitable diagram.

Solution.
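Since the data of Example 12 are already percentages that add up to 100, a pie diagram is a suitable choice, and each percentage only needs to be multiplied by 360/100 to obtain the angle of its sector. The Python sketch below shows this conversion; it is an illustrative calculation under that assumption, not a reproduction of the original figure.

```python
cost_share = {  # percentage of the total cost of construction
    "Cement": 20, "Steel": 18, "Bricks": 10,
    "Timber": 15, "Labour": 25, "Miscellaneous": 12,
}

# Angle of a sector = percentage x 360 / 100 (i.e. percentage x 3.6)
angles = {item: pct * 360 / 100 for item, pct in cost_share.items()}

for item, angle in angles.items():
    print(f"{item:<15}{cost_share[item]:>4}%{angle:>8.1f}°")
```

This gives 72° for cement, 64.8° for steel, 36° for bricks, 54° for timber, 90° for labour and 43.2° for miscellaneous items.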

Example 13: Draw a suitable diagram for the following:

Expenditure on item Percentage of Total expenditure

Food 65

Clothing 10

Housing 12

Fuel and lighting 5

Miscellaneous 8

Solution.

1.19 PRESENTATION OF DATA BY FREQUENCY DISTRIBUTION

Statistical data are collected in order to serve a purpose. They should be presented in such a manner that they may be easily understood and grasped, and conclusions may be drawn promptly from the data presented. The data can be presented in a tabular, graphical or diagrammatic form. When the data are presented in tables or charts in order to bring out their salient features, it is called the presentation of data.

In other words, the method of condensing the data (collected by an investigator) in tabular form to study their salient features is called the presentation of data.

The raw data can be arranged in any one of the following ways:

(i) Serial order or Alphabetical order

(ii) Ascending order or Descending order

(iii) Tables or Charts

(iv) Groups or Class intervals.

Illustration 1. The marks obtained by 30 students in a class test, out of 50 marks, according to their roll numbers are

41, 25, 5, 33, 12, 21, 19, 39, 19, 21, 12, 1, 19, 12, 19, 17, 12, 17, 17, 41, 41, 19, 41, 33, 12, 21, 33, 5, 1, 21.

Data in this form are called raw data.

The above data can be arranged in serial order.

Data in Serial Order

The above raw data can be arranged in serial order as follows

Roll No. Marks Roll No. Marks Roll No. Marks

1 41 11 12 21 41

2 25 12 1 22 19

3 5 13 19 23 41

4 33 14 12 24 33

5 12 15 19 25 12

6 21 16 17 26 21

7 19 17 12 27 33

8 39 18 17 28 5

9 19 19 17 29 1

10 21 20 41 30 21

In the above arrangement, the data are said to be arranged in ‘Serial Order’.

The data in this form do not give us a clear picture of the class.

Data in Ascending or Descending Order

Now suppose we wish to judge the standard of achievement of the students. The data in this form do not give us a clear picture of the group. If we arrange them in ascending or descending order, it gives us a slightly better picture. In ascending order, the data of Illustration 1 will look as follows:

1, 1, 5, 5, 12, 12, 12, 12, 12, 17, 17, 17, 19, 19, 19, 19, 19, 21, 21, 21, 21, 25, 33, 33, 33, 39, 41, 41, 41, 41.

In descending order, the data look as follows

41, 41, 41, 41, 39, 33, 33, 33, 25, 21, 21, 21, 21, 19, 19, 19, 19, 19, 17, 17, 17, 12, 12, 12, 12, 12, 5, 5, 1, 1.

The raw data when put in ascending or descending order of magnitude is called an array or arrayed data.

Array. An array is an arrangement of data in order of magnitude either in ascending order or in descending order.

The arrayed data do not tell us much except the maximum or minimum of the data.

Moreover, if the number of observations is large, then arranging the data in ascending, descending or serial order is a difficult job.

TALLY BARS AND FREQUENCY

In order to make the data easily understandable, we tabulate the data in the form of tables or charts. A table has three columns.

(i) Variable; (ii) Tally marks; (iii) Frequency.

Variable

Any character which can vary from one individual to another is called a variable or a variate. For example, age, income, height, intelligence, colour etc. are variates. Some variates are measurable and others are not directly measurable. The examples of measurable variates are age, height, temperature, etc., whereas colour and intelligence are the examples of those variates which cannot be measured numerically. Variables or observations with numbers as possible values are called quantitative variables, whereas those with names of places, quality, things etc., as possible values are called qualitative variables or attributes.

Variables are of two types: (i) Continuous; (ii) Discontinuous or Discrete. Quantities which can take all numerical values within a certain interval are called continuous variables. But those variables which can take only a finite set of values are called discrete variables, for example, the number of students in a particular class, the number of sections in a school, etc.

Tally

It is a method of keeping count in blocks of five. For example: I = 1; II = 2; III = 3; IIII = 4; IIII crossed by a fifth stroke = 5; a crossed bunch followed by I = 6; and so on.

Tally Bars. These are the straight bars used in the Tally.

For each item falling in a class interval, a stroke (vertical bar) is marked against it. This stroke is called the Tally Bar. Usually, after every four strokes in a class, the fifth item is marked by a horizontal or slanted line across the four Tally Bars. For example, the frequencies 5, 6 and 7 are represented by one crossed bunch, a crossed bunch followed by I, and a crossed bunch followed by II, respectively.

The above method of presentation of data is known as ‘Frequency Distribution’. Marks are called variates. The number of students who have secured a particular number of marks is called the Frequency of that variate.

In the first column of the table, we write all marks from lowest to highest. We now look at the first mark or value in the given raw data and put a bar (vertical line) in the second column opposite to it. We then see the second mark or value in the given raw data and put a bar opposite to it in the second column. This process is repeated till all the observations in the given raw data are exhausted. The bars drawn in the second column are known as tally marks and, to facilitate counting, we record tally marks in bunches of five, the fifth tally mark being drawn diagonally across the first four. For example, one crossed bunch followed by III = 8. We finally count the number of tally marks corresponding to each observation and write it in the third column headed by frequency or number of students.

Frequency

The number of times an observation occurs in the given data is called the frequency of the observation.

Frequency Distribution. A frequency distribution is the arrangement of the given data in the form of a table showing the frequency with which each value of the variable occurs. In other words, the frequency distribution of a variable x is the ordered set {x, f}, where f is the frequency. It shows all scores in a set of data together with the frequency of each score.

The data of Illustration 1 can be put in the tabular form as given below:
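The frequency of each mark in Illustration 1 can be counted mechanically. The sketch below uses `collections.Counter` from the Python standard library to build the ordered set {x, f}; the variable names `marks` and `frequency` are illustrative.

```python
from collections import Counter

marks = [41, 25, 5, 33, 12, 21, 19, 39, 19, 21, 12, 1, 19, 12, 19,
         17, 12, 17, 17, 41, 41, 19, 41, 33, 12, 21, 33, 5, 1, 21]

# frequency[x] = number of times the mark x occurs in the raw data
frequency = Counter(marks)

print(f"{'Marks':>5}{'Frequency':>11}")
for value in sorted(frequency):
    print(f"{value:>5}{frequency[value]:>11}")
print(f"{'Total':>5}{sum(frequency.values()):>11}")
```

The marks 12 and 19 each occur five times, the highest frequency in the data, and the frequencies add up to the 30 students.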

The presentation of data can be further condensed into groups or classes. In this presentation all observations are divided into groups. These groups are called classes or class intervals.

We can arrange the above data into classes as follows

The class 11 - 20 means the marks obtained between 11 and 20 including both 11 and 20. The number of observations falling in a particular class is called the frequency of that class or class frequency. The frequency of the class 1 - 10 is 4, 11 - 20 is 13, 21 - 30 is 5 and so on.

In the class 11 - 20, 11 is called the lower limit and 20 is called the upper limit of the class. The presentation of data in the form of the above table is called Grouped Frequency Distribution.
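The class frequencies quoted above can be checked by counting how many marks of Illustration 1 fall between the lower and upper limit of each class, both limits included. In the Python sketch below, the classes 31 - 40 and 41 - 50 are assumed to continue the pattern of the first three classes; the name `class_frequency` is illustrative.

```python
marks = [41, 25, 5, 33, 12, 21, 19, 39, 19, 21, 12, 1, 19, 12, 19,
         17, 12, 17, 17, 41, 41, 19, 41, 33, 12, 21, 33, 5, 1, 21]

# (lower limit, upper limit) of each class; both limits are inclusive.
classes = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]

class_frequency = {
    (lower, upper): sum(lower <= m <= upper for m in marks)
    for lower, upper in classes
}

for (lower, upper), f in class_frequency.items():
    print(f"{lower:>2} - {upper:<3}{f:>4}")
```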

TYPES OF FREQUENCY DISTRIBUTIONS

Frequency distributions are of two types

(i) Discrete Frequency Distribution

(ii) Grouped (or Continuous) Frequency Distribution

Discrete Frequency Distribution

The construction of a discrete frequency distribution from the given raw data is done by the method of tally marks as explained earlier.

Construction of Discrete Frequency Distribution Table

The frequency distribution table has three columns headed by

1. Variables (or Classes) 2. Tally Marks or Bars 3. Frequency.

The table is constructed by the following steps

Step 1. Prepare three columns, viz., one for the variable (or classes), another for tally marks and the third for the frequency corresponding to the variable (or class).

Step 2. Arrange the given data (or values) from the lowest to the highest in the first column under the heading variable (or classes).

Step 3. Take the first observation in the raw data and put a bar (or vertical line I) in the second column under Tally Marks opposite to it. Then take the second observation and put a tally mark opposite to it. Continue this process till all the observations of the given raw data are exhausted. For the sake of convenience, record the tally marks in bunches of five: the fifth bar is placed diagonally, crossing the other four. Leave some space between each block of bars.

Step 4. Count the tally marks of column 2 and place this number opposite to the value of the variable in the third column headed by Frequency.

Step 5. Give a suitable title to the frequency distribution table so that it exactly conveys the information contained in the table.

Example 14. Form a discrete frequency distribution from the following scores

15, 18, 16, 20, 25, 24, 25, 20, 16, 15, 18, 18, 16, 24, 15, 20, 28, 30, 27, 16, 24, 25, 20, 18, 28, 27, 25, 24, 24, 18, 18, 25, 20, 16, 15, 20, 27, 28, 29, 16.

Solution.
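The steps for constructing the table can be followed mechanically for the 40 scores of Example 14. The Python sketch below is one possible rendering; the helper `tally` and the notation `||||/` for a crossed bunch of five are assumptions made here, since the crossed stroke cannot be typed directly.

```python
from collections import Counter

scores = [15, 18, 16, 20, 25, 24, 25, 20, 16, 15, 18, 18, 16, 24, 15, 20,
          28, 30, 27, 16, 24, 25, 20, 18, 28, 27, 25, 24, 24, 18, 18, 25,
          20, 16, 15, 20, 27, 28, 29, 16]


def tally(n):
    """Render n as tally marks; '||||/' stands for a crossed bunch of five."""
    bunches, rest = divmod(n, 5)
    return " ".join(["||||/"] * bunches + (["|" * rest] if rest else []))


frequency = Counter(scores)

print(f"{'Score':<8}{'Tally marks':<16}{'Frequency':>9}")
for value in sorted(frequency):                # Step 2: lowest to highest
    f = frequency[value]                       # Step 4: count of tally marks
    print(f"{value:<8}{tally(f):<16}{f:>9}")   # Step 3: marks in bunches of five
print(f"{'Total':<24}{sum(frequency.values()):>9}")
```

The column of frequencies should add up to 40, the number of scores given.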

Continuous or Grouped Frequency Distribution

The method of presenting the raw data in Discrete Frequency Distribution form is convenient only when the values in the raw data are largely repeating and the difference between the greatest and the smallest observation is not very large. If the number of observations in the raw data is large and the difference between the greatest and the smallest observation is large, we condense the data into classes or groups.

We now define certain statistical terms required for continuous frequency distribution.

SOME STATISTICAL TERMS

Raw Data or Data. Raw data are statistical data in original form before any statistical technique is applied to redefine, process or summarize them.

Variable or Variate. Any character which can vary from one individual to another is called a variable or variate. For example, age, income, height, intelligence, colour, etc., are variates. Some variates are measurable and others are not directly measurable. The examples of measurable variates are: age, height, temperature, etc., whereas colour and intelligence are the examples of those variates which cannot be measured numerically. Variables or observations with numbers as possible values are called quantitative variables, whereas those with names of places, attributes, things etc., as possible values are called qualitative variables.

Variables are of two types.

(i) Continuous; (ii) Discontinuous or Discrete

Continuous Variable. A continuous variable is capable of assuming any value within a certain range or interval. The height, weight, age and temperature of any person can be expressed not only in integral parts but also in fractions of any part. For example, the weight of a boy may be 44.0 kg or 44.6 kg or 44.65 kg; similarly, his height may be 56 inches or 56.4 inches and his age may be 10 years or 10.5 years. Thus, height, weight, age, temperature etc. are continuous variables.

Discrete Variable. A discrete variable can assume only integral values and is capable of exact measurement. In other words, those variables which can take only a finite set of values are called discrete variables. For example, the number of students in a particular class, or the number of sections in a school, etc. are discrete variables. Discrete variables are also known as discontinuous variables.

Continuous Series. When the continuous variables are arranged in the form of a series, it is called a continuous series or exclusive series.

Discrete or Discontinuous Series. When the discrete variables are arranged in the form of a series, it is called a discrete or discontinuous series.

Array. An array is an arrangement of data in order of magnitude either in descending or ascending order.

Descending Order. When data is arranged from the highest value to the lowest value, the array so formed is in descending order.

Ascending order. When the data is arranged from the lowest value to the highest value, the array so formed is in ascending order.

Illustration 2. If the given data are 17, 7, 11, 5, 13, 9, then

Array in Ascending order : 5, 7, 9, 11, 13, 17.

Array in Descending order : 17, 13, 11, 9, 7, 5.

Range. It is the difference between the largest and the smallest number in the given data.

The range of the data given in Illustration 2 is 17 - 5 = 12.
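The array and range of Illustration 2 can be checked with a minimal Python sketch. The variable names are illustrative only and are not part of the text.

```python
# Data from Illustration 2.
data = [17, 7, 11, 5, 13, 9]

# An array is the data arranged in order of magnitude.
ascending = sorted(data)
descending = sorted(data, reverse=True)

# Range = largest value - smallest value.
data_range = max(data) - min(data)

print("Ascending :", ascending)
print("Descending:", descending)
print("Range     :", data_range)
```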


Class, Class Interval and Class Limits. If the observations of a series are divided into groups and the groups are bounded by limits, then each group is called a class. The end values of a class are called the class limits. In other words, the two end limits of each class are called class limits. The smaller value of the two limits is called the lower limit and the higher value is called the upper limit of the class. These two class limits are sometimes called the stated class limits.

Class Interval. The difference between the lower limit (L) and the upper limit (U) of the class is known as the class interval (I). Thus I = U - L.

In other words, the range of a class is called its Class interval.

Illustration 3. The given data is

In the above data the classes are 1 - 10, 11 - 20, 21 - 30, 31- 40.

Class Interval. The range of the marks from 1 to 40 is grouped into four classes or groups, viz., 1 - 10, 11 - 20, 21 - 30, 31 - 40. Each group is known as a class interval. The class interval of each class is 9

[as: 10 - 1 = 9, 20 - 11 = 9, 30 - 21 = 9, etc.]

Class Limits. In the first class 1 - 10, the lower limit is 1 and the upper limit is 10. Similarly, 31 is the lower limit and 40 is the upper limit of the class interval 31 - 40.

Actual Class Limit or Class Boundaries. In Illustration 3, there is a gap of 1 mark between the limits of any two adjacent classes. This gap may be filled up by extending the two limits of each class by half of the value of the gap. Thus

Lower class boundary = Lower class limit - 1/2 of the gap

Upper class boundary = Upper class limit + 1/2 of the gap


The class boundaries of the class 11 - 20 are

Lower class boundary = 11 - 1/2 of 1 = 11 - 0.5 = 10.5

Upper class boundary = 20 + 1/2 of 1 = 20 + 0.5 = 20.5

In other words, the class boundaries are the limits up to which the two limits of each class may be extended to fill up the gap that exists between the classes. The class boundaries of each class so obtained are called the actual class limits or true class limits.

True lower class limit = Lower class limit - 1/2 of the gap

True upper class limit = Upper class limit + 1/2 of the gap

Note: 1. In the case of exclusive series True class limits are the same as class limits.

Illustration 4.

Class Interval Class Boundaries

11 - 20 10.5 - 20.5

21 - 30 20.5 - 30.5

31 - 40 30.5 - 40.5

Class-mark or Mid-point or Mid-value. The central value of the class interval is called the mid-point or mid-value or class mark. It is the arithmetic mean of the lower class limit and the upper class limit of the same class.

Mid-value of class = (Lower class limit + Upper class limit) / 2

or

Class mark = (True upper class limit + True lower class limit) / 2

The class mark of the class 11 - 20 is (11 + 20) / 2 = 15.5.

Class Magnitude. It is the difference between the upper class boundary and the lower class boundary of the class. In Illustration 4, the class magnitude of the class 20.5 - 30.5 is (30.5 - 20.5) = 10.

Inclusive and Exclusive Series. In the above illustration, all the marks we considered were integers. Hence, it was possible for us to choose classes 11 to 20, 21 to 30, etc. There is a gap of 1 between the upper limit of a class and the lower limit of the next class, which has not created any difficulty. But there can be situations where the raw data is not in integers. For example, if information regarding the maximum temperature of a city or the time required to solve a statistical problem is recorded in the data, it may contain fractions as well. In such cases, the consecutive classes necessarily have to be continuous. We have the following.

Inclusive Series. When the class intervals are so fixed that the upper limit of the class is included in that class, it is known as the inclusive method of classification, e.g., 0 - 5, 6 - 10, 11 - 15, 16 - 20.

In the inclusive series, the upper limit and lower limit are both included in that class interval. For example, in Illustration 3, the marks 11 and 20 are included in the class 11 - 20. It is a discontinuous series or inclusive series. In order to make it a continuous one, some adjustment of the class limits is necessary. The class limits are extended to class boundaries by the adjustment factor, which is equal to half of the difference between the upper limit of one class and the lower limit of the next class. The series so obtained is continuous and is known as an exclusive series.

Exclusive or Continuous Series. In this series the upper limit of one class is the lower limit of the next class, and the common point of the two classes is included in the higher class. For example, 10 - 15, 15 - 20, 20 - 25 represent a continuous series or exclusive series. In this series, 15 is included in the class 15 - 20 and 20 is included in 20 - 25. Here the stated limits of adjacent classes coincide; the upper limit of each class is treated as "less than" that limit, while the lower limit of each class represents the exact value.

Thus, when the class intervals are so fixed that the upper limit of one class is the lower limit of the next class, it is known as the exclusive method of classification.

CONVERSION OF A DISCONTINUOUS SERIES INTO CONTINUOUS SERIES (INCLUSIVE SERIES TO EXCLUSIVE SERIES)

Step 1. Find the adjustment factor by the following formula:

Adjustment factor = 1/2 x [Lower limit of second class - Upper limit of first class]

Step 2. Subtract the adjustment factor from the lower limit and add it to the upper limit of each class. We will get the required exclusive series.

For example, to convert the discontinuous series (or inclusive series) 11 - 20, 21 - 30, 31 - 40, we have the following method.

Step 1. Adjustment factor. The difference between the upper limit of the first class and the lower limit of the second class is 1.

Adjustment factor = 1/2 x [21 - 20] = 1/2 = 0.5


Step 2. The continuous series (or exclusive series) of the given discontinuous series is

11 - 0.5 to 20 + 0.5, i.e., 10.5 - 20.5

21 - 0.5 to 30 + 0.5, i.e., 20.5 - 30.5

31 - 0.5 to 40 + 0.5, i.e. 30.5 - 40.5

Inclusive Form Exclusive Form

11 - 20 10.5 - 20.5

21 - 30 20.5 - 30.5

31 - 40 30.5 - 40.5
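The two conversion steps can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function name `to_exclusive` is hypothetical, and the sketch assumes at least two classes with the same gap between every pair of adjacent classes.

```python
def to_exclusive(classes):
    """Convert inclusive class limits into class boundaries (exclusive form).

    `classes` is a list of (lower limit, upper limit) pairs.
    Assumes the gap between adjacent classes is the same throughout.
    """
    # Step 1: adjustment factor = 1/2 x (lower limit of second class
    #         - upper limit of first class).
    adjustment = (classes[1][0] - classes[0][1]) / 2
    # Step 2: subtract it from each lower limit, add it to each upper limit.
    return [(lower - adjustment, upper + adjustment) for lower, upper in classes]


inclusive = [(11, 20), (21, 30), (31, 40)]
exclusive = to_exclusive(inclusive)

# Class mark = (lower limit + upper limit) / 2; it is the same in both forms.
class_marks = [(lower + upper) / 2 for lower, upper in inclusive]

for (lo, up), (lb, ub), mark in zip(inclusive, exclusive, class_marks):
    print(f"{lo} - {up}   ->   {lb} - {ub}   (class mark {mark})")
```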

Example 15. Construct a frequency distribution table (inclusive type) of the following marks scored by a class of 40 students.

20, 11, 11, 37, 15, 40, 31, 29, 38, 27, 13, 7, 29, 25, 37, 42, 30, 10, 9, 27, 25, 18, 2, 9, 47, 17, 11, 32, 41, 6, 29, 15, 13, 39, 21, 40, 10, 15, 3, 4.

Solution. Here the minimum marks scored by a student = 2.

Maximum marks scored = 47.

Range = (Maximum marks - Minimum marks) = 47 - 2 = 45.

Suppose we want to represent the given data by 5 classes; then each class interval should be 45/5 = 9.

Thus, the classes are 0 - 9, 10 - 19, 20 - 29, 30 - 39, 40 - 49. We have the following Table:
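The tallying step of Example 15 can be sketched in Python. This is only an illustration of counting observations into the inclusive classes listed above; the variable names are assumptions, not part of the text.

```python
# Marks of the 40 students in Example 15.
marks = [20, 11, 11, 37, 15, 40, 31, 29, 38, 27, 13, 7, 29, 25, 37, 42, 30,
         10, 9, 27, 25, 18, 2, 9, 47, 17, 11, 32, 41, 6, 29, 15, 13, 39, 21,
         40, 10, 15, 3, 4]

# Inclusive classes: both limits belong to the class.
classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49)]

frequency = {c: 0 for c in classes}
for m in marks:
    for lower, upper in classes:
        if lower <= m <= upper:      # inclusive on both ends
            frequency[(lower, upper)] += 1
            break

for (lower, upper), f in frequency.items():
    print(f"{lower:>2} - {upper:<2}  {f}")
print("Total  ", sum(frequency.values()))
```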

Example 16. The class marks of a frequency distribution are 6, 10, 14, 18, 22, 26, 30. Find the class size and the class intervals.

Solution. Class size = difference between two successive class marks = (10 - 6) = 4


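Let the lower limit of the first class interval be a.

Then its upper limit = (a + 4). Since the class mark of the first class is 6, [a + (a + 4)] / 2 = 6, so 2a + 4 = 12, 2a = 8 and a = 4.

The first class interval is: 4 - 8.

Let the lower limit of the last class interval be b. Then, its upper limit = (b + 4).

Since the class mark of the last class is 30, [b + (b + 4)] / 2 = 30, so 2b + 4 = 60, 2b = 56 and b = 28.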

The last class interval is: 28 - 32.

Hence, the required class intervals are : 4 - 8, 8 - 12, 12 - 16, 16 - 20, 20 - 24, 24 - 28 and 28 - 32.

Example 17. The class marks of a distribution are: 26, 31, 36, 41, 46, 51, 56, 61, 66, 71. Find the true class limits.

Solution. As the class marks are uniformly spaced, the class size is the difference between any two consecutive class marks.

Class size = 31 - 26 = 5.

If a is the class mark of a class interval of size h, then the lower and upper limits of the class interval are (a - h/2) and (a + h/2) respectively. Here h = 5.

Lower limit of first class interval = 26 - 5/2 = 23.5.

And, upper limit of first class interval = 26 + 5/2 = 28.5.

First class interval is: 23.5 - 28.5.

Thus, the class intervals are:

23.5 - 28.5, 28.5 - 33.5, 33.5 - 38.5, 38.5 - 43.5, 43.5 - 48.5, 48.5 - 53.5, 53.5 - 58.5, 58.5 - 63.5, 63.5 - 68.5, 68.5 - 73.5

Since the classes are formed by the exclusive method, these class limits are true class limits.
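The rule used in Examples 16 and 17 (limits = class mark minus and plus half the class size) can be sketched in Python. The function name `limits_from_marks` is hypothetical, and the sketch assumes the class marks are equally spaced.

```python
def limits_from_marks(class_marks):
    """Return (lower, upper) limits for each class mark.

    Assumes equally spaced class marks, so the class size h is the
    difference between the first two marks.
    """
    h = class_marks[1] - class_marks[0]
    return [(a - h / 2, a + h / 2) for a in class_marks]


# Example 16: class marks 6, 10, ..., 30 (class size 4).
example_16 = limits_from_marks([6, 10, 14, 18, 22, 26, 30])

# Example 17: class marks 26, 31, ..., 71 (class size 5).
example_17 = limits_from_marks([26, 31, 36, 41, 46, 51, 56, 61, 66, 71])

print(example_16[0], example_16[-1])
print(example_17[0], example_17[-1])
```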

CUMULATIVE FREQUENCY DISTRIBUTION

Cumulative frequency corresponding to a class is the sum of all the frequencies up to and including that class. In a cumulative frequency distribution (or series), the cumulative frequency of a particular class is obtained by adding to the frequency of that class all the frequencies of its previous classes. Thus, the cumulative frequency table is obtained from the ordinary frequency table by successively adding the several frequencies.

Cumulative frequency series are of two types.

(i) Less than Series (ii) More than Series.

Suppose we are given the following series of marks obtained by 100 students. With the help of this series, we shall form the 'Less than' and 'More than' series.

Marks obtained: 30 - 40 40 - 50 50 - 60 60 - 70 70 - 80 80 - 90

No. of Students : 8 12 20 25 18 17


Less than Cumulative Series                More than Cumulative Series

Marks obtained   No. of Students           Marks obtained   No. of Students

Less than 40     8                         More than 30     100

Less than 50     20 (= 8 + 12)             More than 40     92 (= 100 - 8)

Less than 60     40 (= 20 + 20)            More than 50     80 (= 92 - 12)

Less than 70     65 (= 40 + 25)            More than 60     60 (= 80 - 20)

Less than 80     83 (= 65 + 18)            More than 70     35 (= 60 - 25)

Less than 90     100 (= 83 + 17)           More than 80     17 (= 35 - 18)

Example 18. The distances (in km) covered by 24 cars in 2 hours are given below:

125, 140, 128, 108, 96, 149, 136, 112, 84, 123, 130, 120, 103, 89, 65, 103, 145, 97, 102, 87, 67, 78,98, 126

Represent them as a cumulative frequency table using 60 as the lower limit of the first group and all the classes having the class size of 15.

Solution. Class size = 15.

Maximum distance covered = 149 km.

Minimum distance covered = 65 km. Range = (maximum - minimum) = (149 - 65) km = 84 km.

Number of classes = Range / Class size = 84 / 15 = 5.6, i.e., 6 (rounded up).

Thus, the class intervals are 60 - 75, 75 - 90, 90 - 105, 105 - 120, 120 - 135, 135 - 150.
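A minimal Python sketch of the cumulative frequency table for Example 18 follows. It assumes exclusive classes (the upper limit belongs to the next class), consistent with the intervals listed above; the variable names are illustrative.

```python
from itertools import accumulate

# Distances (in km) covered by the 24 cars in Example 18.
distances = [125, 140, 128, 108, 96, 149, 136, 112, 84, 123, 130, 120,
             103, 89, 65, 103, 145, 97, 102, 87, 67, 78, 98, 126]

lower_limit, class_size = 60, 15

# Enough classes to reach the maximum value, starting from the lower limit.
n_classes = (max(distances) - lower_limit) // class_size + 1

# Exclusive classes: a value equal to an upper limit falls in the next class.
freq = [0] * n_classes
for d in distances:
    freq[(d - lower_limit) // class_size] += 1

cumulative = list(accumulate(freq))

for i, (f, cf) in enumerate(zip(freq, cumulative)):
    lo = lower_limit + i * class_size
    print(f"{lo:>3} - {lo + class_size:<3}  frequency {f}  cumulative {cf}")
```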

Example 19. The heights (in centimetres) of 40 persons are

110, 112, 125, 135, 150, 155, 152, 150, 159, 130, 128, 138, 133, 143, 147, 151, 154, 156, 112, 116,119, 111, 113, 115, 118, 121, 123, 120, 125, 121, 110, 113, 114, 149, 153, 155, 150, 155, 152, 111.


Array the data and form a cumulative frequency table with class interval of 10.

Solution. Arranging the given data in ascending order of magnitude, we get

110, 110, 111, 111, 112, 112, 113, 113, 114, 115, 116, 118, 119, 120, 121, 121, 123, 125, 125, 128,130, 133, 135, 138, 143, 147, 149, 150, 150, 150, 151, 152, 152, 153, 154, 155, 155, 155, 156, 159.
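Once the data is arrayed, cumulative frequencies can be read off by counting how many values lie below each upper limit. The Python sketch below does this with `bisect_left` from the standard library. The classes 110 - 120, 120 - 130, ..., 150 - 160 (exclusive form) are an assumption, since the worked table is not reproduced here.

```python
from bisect import bisect_left

# Heights (in cm) of the 40 persons in Example 19.
heights = [110, 112, 125, 135, 150, 155, 152, 150, 159, 130, 128, 138, 133,
           143, 147, 151, 154, 156, 112, 116, 119, 111, 113, 115, 118, 121,
           123, 120, 125, 121, 110, 113, 114, 149, 153, 155, 150, 155, 152,
           111]

array = sorted(heights)                       # ascending array
upper_limits = [120, 130, 140, 150, 160]      # assumed classes of width 10

# bisect_left gives the number of values strictly below each upper limit.
cumulative = [bisect_left(array, u) for u in upper_limits]

# Class frequency = difference of successive cumulative frequencies.
freq = [c - p for c, p in zip(cumulative, [0] + cumulative[:-1])]

for u, f, cf in zip(upper_limits, freq, cumulative):
    print(f"{u - 10} - {u}  frequency {f}  cumulative {cf}")
```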

Example 20. Form a frequency table from the following data

Age (in years) No. of Persons Age (in years) No. of Persons

under 5 0 under 40 22

under 10 7 under 45 29

under 15 10 under 50 35

under 20 14 under 55 41

under 25 16 under 60 47

under 30 17 under 65 50

under 35 20

Solution. Here cumulative frequencies of the variates have been given. So to get the frequency of a particular class interval, subtract the cumulative frequency of the lower limit from that of the upper limit.

Age (in years) No. of Persons Age (in years) No. of Persons

5 -10 7 35 - 40 2

10 -15 3 40 - 45 7

15 - 20 4 45 - 50 6

20 - 25 2 50 - 55 6

25 - 30 1 55 - 60 6

30 - 35 3 60 - 65 3
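The subtraction rule of Example 20 can be sketched in Python: each class frequency is the difference between two successive "under" figures. Variable names are illustrative only.

```python
# "Under" limits and cumulative frequencies from Example 20.
limits = list(range(5, 70, 5))                 # 5, 10, ..., 65
cumulative = [0, 7, 10, 14, 16, 17, 20, 22, 29, 35, 41, 47, 50]

# Frequency of class (a - b) = cumulative under b - cumulative under a.
frequency = {}
for i in range(1, len(limits)):
    frequency[(limits[i - 1], limits[i])] = cumulative[i] - cumulative[i - 1]

for (lower, upper), f in frequency.items():
    print(f"{lower:>2} - {upper:<2}  {f}")
```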


PRACTICAL PROBLEMS IN FORMING A FREQUENCY DISTRIBUTION

The following practical problems are, generally, faced while forming a frequency distribution:

1. The number of classes should neither be too small nor too large. It should preferably lie between 5 and 15. As far as possible, open-ended classes should be avoided as these create problems in analysis and interpretation.

2. Class interval is obtained by dividing the range by the number of classes. Again, this may create a problem unless the classes are of equal width. Sometimes, however, unequal class intervals become necessary.

3. While recording the data in a frequency tally, one has to be careful in putting the tally marks, especially when the number of observations is large.

4. Once tally marks are obtained, the frequencies are obtained by counting the tallies. One should be careful while counting the tallies.

RELATIVE FREQUENCY AND PERCENTAGE FREQUENCY OF A CLASS INTERVAL

Relative Frequency: The frequency of each class can also be expressed as a fraction or in percentage terms. These are known as relative frequencies. In other words, a relative frequency is the class frequency expressed as a ratio of the total frequency, i.e.,

Relative frequency = Class frequency / Total frequency

Percentage Frequency: Percentage frequency of a class interval may be defined as the ratio of the class frequency to the total frequency expressed as a percentage.

Percentage frequency = (Class frequency / Total frequency) x 100.

Relative frequency distribution

Class Frequency Relative frequency

(Marks) (No. of Students)

20 - 40 6 6/25 = 0.24 or 24%

40 - 60 12 12/25 = 0.48 or 48%

60 - 80 4 4/25 = 0.16 or 16%

80 - 100 3 3/25 = 0.12 or 12%

Total 25 1.00 or 100 %
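The relative and percentage frequencies in the table can be reproduced with a short Python sketch. The variable names are illustrative assumptions.

```python
# Marks distribution from the relative frequency table.
classes = ["20 - 40", "40 - 60", "60 - 80", "80 - 100"]
freq = [6, 12, 4, 3]

total = sum(freq)

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in freq]

# Percentage frequency = (class frequency / total frequency) x 100.
percentage = [100 * f / total for f in freq]

for c, f, r, p in zip(classes, freq, relative, percentage):
    print(f"{c:<9} {f:>3}  {r:.2f}  {p:.0f}%")
print(f"{'Total':<9} {total:>3}  {sum(relative):.2f}  {sum(percentage):.0f}%")
```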


It may be observed that the sum of all relative frequencies is 1.00 or 100 per cent, because the frequency of each class has been expressed as a fraction or percentage of the total frequency.

The relative frequencies express the frequency of any class as a fraction or percentage of the total frequency. It is obtained by the formula:

Relative frequency = Frequency of an item in a group / Total frequency of the group

The relative frequencies are used to compare two or more frequency distributions or two or more items in the same frequency distribution.

In the above example, we can compare the number of students securing marks in the classes 20 - 40 and 80 - 100: 24% of the students secured marks between 20 and 40, whereas 12% secured marks between 80 and 100. In the same manner, we can compare the production of rice in Punjab and U.P. in 2006.

It is important to note that for comparing two frequency distributions by means of relative frequencies, the variates as well as the division of the values of the variate or variable into classes must be the same for both the distributions. Relative frequencies are used in percentage and sub-divided bar diagrams and in pie diagrams.

BIVARIATE OR TWO-WAY FREQUENCY DISTRIBUTION

So far we have studied the frequency distribution of one variable only. Such a frequency distribution is called a univariate frequency distribution. In many situations simultaneous study of two variables becomes necessary. For example, we may study the heights and weights of a group of students, or the marks obtained by a group of students in Statistics and Economics, or the income and expenditure of a group of individuals, etc. The frequency distribution obtained as a result of this cross classification of two variables gives rise to a bivariate frequency distribution, and this can be summarised in the form of a two-way table called a Bivariate Frequency Table.

While preparing a bivariate frequency distribution, the same considerations of classification are applied as for a univariate frequency distribution. If the data corresponding to one variable, say X, is grouped into m classes and the data corresponding to the other variable, say Y, is grouped into n classes, then the bivariate table will consist of m x n cells. By going through the different pairs of values (X, Y) of the variables and using tally marks, we can find the frequency of each cell and thus form the bivariate frequency distribution.

The frequency distribution of the values of the variable X together with their frequency total is called the marginal frequency distribution of X. The frequency distribution of the values of the variable Y together with the total frequencies is known as the marginal frequency distribution of Y.


Example 21. Prepare a bivariate frequency distribution for the following data for 20 students

Marks in Law (X) : 10 11 10 11 11 14 12 12 13 10

Marks in Statistics (Y) : 20 21 22 21 23 23 22 21 24 23

Marks in Law (X) : 13 12 11 12 10 14 14 12 13 10

Marks in Statistics (Y) : 24 23 22 23 22 22 24 20 24 23

Solution.

The bivariate table will have 25 cells (5 x 5).

Let the marks in Law be denoted by X (values 10 to 14, i.e., 5 values).

Let the marks in Statistics be denoted by Y (values 20 to 24 i.e., 5 values).

Note : The figures within brackets show the frequency corresponding to each cell.
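The cell and marginal frequencies for Example 21 can be tallied with a short Python sketch using `collections.Counter`. The variable names are illustrative and the printed layout is only one possible arrangement of the two-way table.

```python
from collections import Counter

# Paired marks of the 20 students in Example 21.
law = [10, 11, 10, 11, 11, 14, 12, 12, 13, 10,
       13, 12, 11, 12, 10, 14, 14, 12, 13, 10]
statistics = [20, 21, 22, 21, 23, 23, 22, 21, 24, 23,
              24, 23, 22, 23, 22, 22, 24, 20, 24, 23]

# Frequency of each (X, Y) cell of the bivariate table.
cells = Counter(zip(law, statistics))

# Marginal distributions of X and Y.
marginal_x = Counter(law)
marginal_y = Counter(statistics)

x_values = range(10, 15)
y_values = range(20, 25)

# Rows are Y values, columns are X values, last column is the row total.
print("Y\\X  " + "".join(f"{x:>4}" for x in x_values) + "  Total")
for y in y_values:
    row = "".join(f"{cells[(x, y)]:>4}" for x in x_values)
    print(f"{y:<5}{row}  {marginal_y[y]:>5}")
print("Total" + "".join(f"{marginal_x[x]:>4}" for x in x_values)
      + f"  {len(law):>5}")
```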

GRAPH OF FREQUENCY DISTRIBUTION

The graphs of frequency distribution are designed to present the characteristic features of frequency data. They facilitate comparative study of two or more frequency distributions regarding their shape and pattern.

The most commonly used graphs are:

1. Histogram 2. Frequency Polygon

3. Frequency Curve 4. Cumulative Frequency Curve or Ogive

Histogram

A histogram is a graph containing a set of rectangles, each being constructed to represent the size of the class interval by its width and the frequency in each class interval by its height. The area of each rectangle is proportional to the frequency in the respective class interval and the total area of the histogram is proportional to the total frequency. A histogram is used to depict a frequency distribution.

In constructing a histogram, the class boundaries are considered to be very important, and they are located on the horizontal axis of the graph. The vertical rectangles are erected side by side on the basis of the frequencies over the class intervals, which are extended to their class boundaries.

When the class intervals are unequal, a difficulty arises because the rectangles so formed will also be of unequal width. In order to remove this difficulty, the heights of the rectangles are made proportional not to the class frequencies but to the frequency densities (class frequency divided by class width).

This type of diagram is used exclusively for showing frequency distributions of quantitative data that are continuous in nature. It is essentially an area diagram composed of a series of adjacent rectangles. Hence, the areas used to represent the frequency in groups or class intervals, when added together, will give the composite area for the entire group.
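For unequal class intervals, the height of each rectangle is the frequency density, so that area (height x width) recovers the class frequency. The Python sketch below illustrates this. The distribution is hypothetical and invented purely for illustration; it does not come from the text.

```python
# Hypothetical distribution with unequal class widths (illustration only).
classes = [(0, 10), (10, 20), (20, 40), (40, 50), (50, 80)]
freq = [5, 8, 12, 6, 9]

# Frequency density = class frequency / class width.
density = [f / (upper - lower) for (lower, upper), f in zip(classes, freq)]

# Area of each rectangle = height (density) x width, which equals the frequency.
areas = [d * (upper - lower) for d, (lower, upper) in zip(density, classes)]

for (lower, upper), f, d in zip(classes, freq, density):
    print(f"{lower:>2} - {upper:<2}  width {upper - lower:>2}  "
          f"frequency {f:>2}  height {d:.2f}")
```

Note that the classes 10 - 20 and 20 - 40 have frequencies 8 and 12, yet the wider class gets the shorter rectangle (0.6 against 0.8), which is exactly the correction the frequency density provides.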

REMARKS:

1. It is not necessary that the scale on the x-axis and y-axis be the same. Different scales may be taken on the two axes.

2. The position of origin on the y-axis is according to scale, which may not be so on the x-axis. If class intervals do not start from 0, then this is indicated by drawing a kink mark on the x-axis near the origin. If necessary, the kink mark may be made on the y-axis or on both the axes.

3. A histogram should not be confused with a bar diagram. In a bar diagram, the height of each bar matters and not its width, whereas in a histogram the height as well as the width of each rectangle matters.

Types of Histograms

There are two types of Histograms

(a) Histograms with equal class intervals (b) Histograms with unequal class intervals.

TYPE I. HISTOGRAM WITH EQUAL CLASS INTERVALS

The sizes of class intervals are drawn on the x-axis with equal distances and their respective frequencies on the y-axis. A class and its frequency taken together form a rectangle. The graph of rectangles is known as a histogram.

Each class has lower and upper values. This gives us two equal lines representing the frequencies. The upper ends of the lines are joined together. This process will give us rectangles. The heights of the rectangles will be proportional to their frequencies.

Example 22. Draw the histogram for the following data

Marks 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50 50 - 60 60 - 70

No. of Students 20 30 70 50 40 50 40

Solution. We represent the class limits along the x-axis and frequencies along the y-axis. Taking class intervals as bases and the corresponding frequencies (No. of students) as heights, we construct the rectangles to get the histogram of the given frequency distribution as shown in Fig. 3.42.

TYPE II. HISTOGRAM WHEN CLASS INTERVALS ARE GIVEN IN INCLUSIVE FORM, i.e., WHEN CLASS INTERVALS ARE NOT CONTINUOUS

In a histogram, it is necessary that the adjacent rectangles be attached to each other. If we try to represent the given data (in inclusive form) as such, we shall get a bar diagram, as there will be gaps in between the classes. But in a histogram the bars are continuous without any gaps. Therefore, we shall convert the inclusive series into an exclusive series, i.e., make the classes continuous by taking class boundaries.


STEPS FOR CONVERTING AN INCLUSIVE SERIES INTO AN EXCLUSIVE SERIES

Step I. Find the difference between the lower limit of the second class interval and the upper limit of the first class interval. Let the difference be h.

Step II. Subtract h/2 from the lower limit of every class, i.e., find (lower limit - h/2), and add h/2 to the upper limit of every class, i.e., find (upper limit + h/2).

Step III. Now write down the frequency distribution with continuous class intervals.

Step IV. Draw the histogram of the new distribution.

The above method is illustrated by the following examples:

Example 23. Draw the histogram of the following frequency distribution

Class interval : 12 - 16 17 - 21 22 - 26 27 - 31 32 - 36

Frequency : 2 6 7 5 3

Solution. The given distribution is not continuous. If we represent the given data by a graph as such, we shall not get a histogram but a bar diagram, as there will be gaps in between the classes, i.e., we shall get isolated rectangles. But in a histogram the bars or rectangles are continuous without gaps. Therefore, we shall have to make the classes continuous by taking actual class limits.

Here: Lower limit of 2nd class interval = 17; Upper limit of 1st class interval = 16.

h = difference between the lower limit of the second class interval and the upper limit of the first class interval = 17 - 16 = 1.

To convert the given frequency distribution, we subtract h/2 = 1/2 = 0.5 from each lower limit and add h/2 = 0.5 to each upper limit. The new distribution obtained is given below:

Class Interval : 11.5 - 16.5 16.5 - 21.5 21.5 - 26.5 26.5 - 31.5 31.5 - 36.5

Frequency : 2 6 7 5 3


Take class intervals along the x-axis. The kink mark on the x-axis means that, owing to lack of space, the small distance represents 0 - 11.5. Take frequencies along the y-axis.

Construct rectangles having bases equal to the class intervals and the corresponding frequencies as heights. Join these rectangles to get the histogram given by Fig. 3.43.

TYPE III. HISTOGRAM WHEN MID-POINTS OF CLASS INTERVALS ARE GIVEN

Working Rule

Step I. Find the difference between the second and first mid-point. Let it be h.

Step II. To get the lower limit of the first class, subtract (h/2) from the first mid-point.

To get the upper limit of the first class, add (h/2) to the first mid-point,

i.e., First Class Interval = (mid-point - h/2) to (mid-point + h/2).

Step III. Find all other class intervals by repeating Step II for each of the remaining mid-points.

Now write the given data in class interval form with frequencies and draw the histogram of this frequency distribution.


Example 24. Construct a histogram from the following distribution of total marks of 30 students of a class:

Marks (mid-points) : 100 120 140 160 180 200

No. of Students : 5 6 4 7 5 3

Solution. We shall first convert the given data of mid-points into class interval form.

Here h = difference between the second and first mid-point = 120 - 100 = 20, so h/2 = 20/2 = 10.

First class interval = (100 - h/2) to (100 + h/2) = (100 - 10) to (100 + 10), i.e., 90 - 110.

Using the above procedure, we get the other class intervals as under:

Marks : 90 - 110 110 - 130 130 - 150 150 - 170 170 - 190 190 - 210

No. of Students : 5 6 4 7 5 3

The histogram of the above frequency distribution is given by Fig. 1.11.

Note: The crank mark or kink on the horizontal axis means that, owing to lack of space, this small distance represents 0 to 90.


FREQUENCY POLYGON

A frequency polygon is a line graph for the graphical representation of a frequency distribution. If we mark the mid-points of the top horizontal sides of the rectangles in a histogram and join them by straight lines, the figure so formed is called a frequency polygon. It is assumed that the frequencies in a class interval are evenly distributed throughout the class and consequently the mid-points are representative. A frequency polygon is useful in comparing two or more frequency distributions. It is called a polygon as it consists of a number of lines forming the sides of a polygon. It has no special significance but it gives a fair idea about the shape of the frequency distribution. In order to complete the drawing of a frequency polygon, its two end points are joined to the base line at the mid-points of the empty classes at both ends of the frequency distribution. A frequency polygon for a grouped frequency distribution with equal class intervals may be constructed in two ways.

CASE I. WHEN A HISTOGRAM AND A FREQUENCY POLYGON ARE DRAWN ON THE SAME SCALE

Working Rule

Step I. Draw a histogram from the given data.

Step II. Join the consecutive mid-points of the upper sides (tops) of the adjacent rectangles of the histogram by means of line segments.

Step III. It is assumed that the class preceding the first class and the class succeeding the last class in the classification exist and that the frequency of each of these extreme classes is zero. Join the mid-points of these extreme classes with the mid-points of the upper horizontal sides of the extreme (first and last) rectangles of the histogram.

Example 25. The following is the distribution of total household expenditure (in ₹) of 202 workers in a city.

Expenditure Number of Expenditure Number of

(in ₹) Workers (in ₹) Workers

100 - 150 25 300 - 350 30

150 - 200 40 350 - 400 22

200 - 250 33 400 - 450 16

250 - 300 28 450 - 500 8


Draw a histogram and a frequency polygon of the above data.

Solution. SCALE. Take expenditure along the x-axis to represent 1 cm = ₹ 50.

Take workers along the y-axis to represent 1 cm = 5 workers.

Also, the two extreme classes are 50 - 100 and 500 - 550, whose frequencies are taken as zero in each case.

The histogram and a frequency polygon of the given data are given by Fig. 1.12.

Example 26. Draw, on the same diagram, a histogram and a frequency polygon to represent the following data, which shows the monthly cost of living index of a city over a period of 2 years.

Cost of living index Number of months Cost of living index Number of months

440 - 460 2 520 - 540 3

460 - 480 4 540 - 560 2

480 - 500 3 560 - 580 1

500 - 520 5 580 - 600 4

Total 24


Solution. Take the following scale

SCALE: 1 cm = 20 units along x-axis. 1 cm = 1 month along y-axis.

Also take the extreme classes as 420 - 440 and 600 - 620 with zero frequency in each case.

The required histogram and frequency polygon are given by Fig. 1.13.

CASE II. WHEN ONLY THE FREQUENCY POLYGON IS TO BE DRAWN.

Working Rule

Step I. Represent the mid-points of the class intervals along the x-axis and the frequencies along the y-axis.

Step II. Let xi and fi respectively be the mid-points and corresponding frequencies of the class intervals. Find all the points (xi, fi).

Step III. Plot the points (xi, fi).

Step IV. Join all these points (xi, fi), taken in order, by straight lines to get a frequency polygon.

Step V. The polygon obtained in Step IV is enclosed by joining the first and last points to the corresponding points on the x-axis representing the mid-values of the hypothetical classes, i.e., the class just before the first class and the class just after the last class.


Example 27. The distribution of monthly wages (in ₹) of 40 workers in a factory is given below:

Monthly wages No. of workers Monthly wages No. of workers

210 - 230 4 290 - 310 5

230 - 250 7 310 - 330 6

250 - 270 5 330 - 350 4

270 - 290 9

Draw a frequency polygon.

Solution. The two hypothetical intervals with zero frequency, one at the beginning and the other at the end, are 190 - 210 and 350 - 370 respectively.

Step I. Represent monthly wages along the x-axis with scale 1 cm = Rs. 20 and the number of workers along the y-axis with scale 1 cm = 1 worker.

Step II. We prepare the following frequency table:

Monthly wages Class marks No. of workers

190 - 210 200 0

210 - 230 220 4

230 - 250 240 7

250 - 270 260 5

270 - 290 280 9

290 - 310 300 5

310 - 330 320 6

330 - 350 340 4

350 - 370 360 0

Step III. Plot the points (220, 4), (240, 7), (260, 5), (280, 9), (300, 5), (320, 6) and (340, 4). Also plot the two hypothetical points (200, 0) and (360, 0).

Step IV. Join all these points to get the required frequency polygon.

The frequency polygon of the given data is given by Figure 1.14.

FREQUENCY CURVE

A frequency curve is drawn by smoothing the frequency polygon in such a way that sharp turns are avoided.

The curve is drawn free hand so that the area under the curve is approximately the same as that under the frequency polygon.

A frequency polygon, if smoothed further so as to minimise sudden changes, results in a continuous smooth curve known as a frequency curve or smooth frequency curve. The curve should begin and end at the base line. To draw the frequency curve, first draw a histogram, then draw a frequency polygon, and lastly smooth it to get the frequency curve. The basic object of drawing a frequency curve is to present graphically the area covered by the histogram in a more symmetrical manner.
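A free hand curve cannot be computed exactly, but the area it is meant to preserve can be checked. The sketch below, which assumes equal class widths and uses the wage data of Example 27, compares the area of the histogram with the area under the closed frequency polygon. The helper name `polygon_area` is hypothetical.

```python
def polygon_area(points):
    """Area between a frequency polygon and the x-axis (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


width = 20                               # common class width
workers = [4, 7, 5, 9, 5, 6, 4]          # frequencies of 210 - 230, ..., 330 - 350
mids = [220 + width * k for k in range(len(workers))]

# Close the polygon with the two hypothetical zero-frequency classes.
polygon = [(mids[0] - width, 0)] + list(zip(mids, workers)) + [(mids[-1] + width, 0)]

hist_area = width * sum(workers)         # total area of the histogram rectangles
print(hist_area, polygon_area(polygon))
```

With equal class widths the two areas coincide, so a smoothed curve that keeps roughly the polygon's area also keeps roughly the histogram's area.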

Example 28. Draw a histogram, a frequency polygon and a smoothed frequency curve from the following data relating to marks secured in Economics by students of a school in a house-test.

Marks      No. of students
0 - 5      3
5 - 10     7
10 - 15    13
15 - 20    25
20 - 25    40
25 - 30    14
30 - 35    10
35 - 40    7
40 - 45    4
45 - 50    2

Solution: The required curves are shown in

CUMULATIVE FREQUENCY CURVE OR OGIVE

It is a graph which represents the data of a cumulative frequency distribution. An ogive, when drawn, is in the form of a curve. The curve drawn from the data cumulated downward is known as the less than ogive, and the curve drawn from the data cumulated upward as the more than ogive. An ogive is used to find the median, quartiles, deciles, percentiles, etc. It is also used to find the number of observations which are expected to lie between two given values.

A cumulative frequency curve or ogive is obtained by plotting the successive cumulative frequencies on graph paper and joining them by a free hand curve. It represents the cumulative frequency distribution by plotting the actual upper limits of the class intervals on the x-axis and the cumulative frequencies of these class intervals on the y-axis. The successive points are then joined together to get an ogive curve. There are two types of ogives:

(i) Less Than Ogive and
(ii) More Than Ogive.

In a less than ogive, the cumulative frequencies are plotted at the upper limits of the class intervals. In a more than ogive, the cumulative frequencies are plotted against the lower class boundaries of the respective class intervals.

Less Than Ogive or Cumulative Frequency Curve Less than the Upper Limit

Working Rule

Step I. Represent the upper limits of the classes along the x-axis.

Step II. Represent the cumulative frequencies of the respective classes along the y-axis.

Step III. Plot the points corresponding to the upper class limits and the cumulative frequencies less than the respective upper limits.

Step IV. Join the points plotted in Step III by a free hand curve.

Step V. It is assumed that the class preceding the first class in the classification exists and its frequency is zero. Plot the point corresponding to this hypothetical class, i.e., the point (lower limit of the first class, 0), and join it to the point of the first class.

The curve so obtained is the Less than Ogive or cumulative frequency curve less than the upper class limit. It is illustrated by the following example.

Example 29. Draw a less than ogive for the following frequency distribution:

I.Q.            : 60 - 70  70 - 80  80 - 90  90 - 100  100 - 110  110 - 120  120 - 130
No. of students : 2        5        12       31        39         10         4

Solution. Let us prepare the following table showing the cumulative frequencies less than the upper limit.

Class interval (I.Q.)   Frequency   Cumulative frequency (c.f.)   Point to be plotted
50 - 60                 0           0                             (60, 0)
60 - 70                 2           2                             (70, 2)
70 - 80                 5           2 + 5 = 7                     (80, 7)
80 - 90                 12          7 + 12 = 19                   (90, 19)
90 - 100                31          19 + 31 = 50                  (100, 50)
100 - 110               39          50 + 39 = 89                  (110, 89)
110 - 120               10          89 + 10 = 99                  (120, 99)
120 - 130               4           99 + 4 = 103                  (130, 103)

From this table, we get the pairs of numbers corresponding to the upper class limits and their cumulative frequencies as (70, 2), (80, 7), (90, 19), (100, 50), (110, 89), (120, 99) and (130, 103). Plot these points and join them by a free hand curve to get the ogive as shown in Fig. 1.16. To complete it, we join the curve to the hypothetical point (lower limit of the first interval, 0), i.e., (60, 0), by means of a broken line.
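The cumulative frequencies and plotting points of Example 29 can be reproduced with a short Python sketch. The function name `less_than_ogive_points` is hypothetical; the class boundaries are assumed to be given in increasing order.

```python
from itertools import accumulate


def less_than_ogive_points(boundaries, frequencies):
    """Points of a less than ogive.

    boundaries: class boundaries b0 < b1 < ... < bk for k contiguous classes.
    frequencies: the k class frequencies.
    """
    cumulative = list(accumulate(frequencies))   # running totals
    # Hypothetical starting point (lower limit of the first class, 0),
    # then one point per upper class limit.
    return [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))


boundaries = [60, 70, 80, 90, 100, 110, 120, 130]
students = [2, 5, 12, 31, 39, 10, 4]
points = less_than_ogive_points(boundaries, students)
print(points)
```

The last point carries the total frequency, 103, which is a quick check on the additions in the table.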

Example 30. Draw the ogive for the following distribution:

Age (in years) : 0 - 9 10 - 19 20 - 29 30 - 39 40 - 49 50 - 59 60 - 69

No. of persons : 5 15 20 25 15 12 8

Solution. The given frequency distribution is not continuous. We first make it continuous and prepare the cumulative frequency distribution given by the following table.

Age (in years)   Actual limits   No. of persons (frequency)   Cumulative frequency   Point to be plotted
0 - 9            -0.5 - 9.5      5                            5                      (9.5, 5)
10 - 19          9.5 - 19.5      15                           20                     (19.5, 20)
20 - 29          19.5 - 29.5     20                           40                     (29.5, 40)
30 - 39          29.5 - 39.5     25                           65                     (39.5, 65)
40 - 49          39.5 - 49.5     15                           80                     (49.5, 80)
50 - 59          49.5 - 59.5     12                           92                     (59.5, 92)
60 - 69          59.5 - 69.5     8                            100                    (69.5, 100)

Plot the points (9.5, 5), (19.5, 20), (29.5, 40), (39.5, 65), (49.5, 80), (59.5, 92) and (69.5, 100) and join them by a free hand smooth curve. To complete it, the curve is joined to the hypothetical point (lower limit of the first class interval, 0), i.e., the point (-0.5, 0). This portion is always shown by a dotted line.
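The conversion of inclusive class limits to actual limits used in Example 30 can be sketched as follows. The helper `to_class_boundaries` is a hypothetical name; it assumes that the gap between consecutive classes is the same throughout (here 1 year) and shares it equally between the two neighbouring classes.

```python
from itertools import accumulate


def to_class_boundaries(limits):
    """Convert inclusive limits such as (0, 9), (10, 19) into actual limits."""
    gap = limits[1][0] - limits[0][1]    # distance between consecutive classes
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]


ages = [(10 * k, 10 * k + 9) for k in range(7)]   # 0 - 9, 10 - 19, ..., 60 - 69
persons = [5, 15, 20, 25, 15, 12, 8]

bounds = to_class_boundaries(ages)
cumulative = list(accumulate(persons))

# Hypothetical starting point, then one point per actual upper limit.
ogive = [(bounds[0][0], 0)] + [(hi, c) for (_, hi), c in zip(bounds, cumulative)]
print(ogive)
```

The first point is the hypothetical point at the actual lower limit of the first class, and the last cumulative frequency equals the total number of persons.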

Class      Frequency   Cumulative frequency           Point to be plotted
                       (more than the lower limit)
40 - 44    7           128 + 7 = 135                  (40, 135)
44 - 48    12          116 + 12 = 128                 (44, 128)
48 - 52    33          83 + 33 = 116                  (48, 116)
52 - 56    47          36 + 47 = 83                   (52, 83)
56 - 60    20          16 + 20 = 36                   (56, 36)
60 - 64    11          5 + 11 = 16                    (60, 16)
64 - 68    5           5                              (64, 5)

From the table we get the pairs of numbers corresponding to the lower class limits and the respective cumulative frequencies as: (40, 135), (44, 128), (48, 116), (52, 83), (56, 36), (60, 16) and (64, 5).

Plot these points and join them by a free hand curve to get the More Than Ogive. In order to complete it, we join the curve to the hypothetical point (upper limit of the last interval, 0), i.e., (68, 0) (shown by a dotted line).

The More Than Ogive of the given data is given by Fig. 1.17.
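The 'more than' cumulative frequencies in the table above are obtained by adding the frequencies from the last class backwards. A minimal Python sketch, with the hypothetical function name `more_than_ogive_points`:

```python
from itertools import accumulate


def more_than_ogive_points(boundaries, frequencies):
    """Points of a more than ogive.

    boundaries: class boundaries b0 < b1 < ... < bk for k contiguous classes.
    frequencies: the k class frequencies.
    """
    # Cumulate from the last class backwards, then restore the class order.
    cumulative = list(accumulate(reversed(frequencies)))[::-1]
    # One point per lower class limit, then the hypothetical closing
    # point (upper limit of the last class, 0).
    return list(zip(boundaries[:-1], cumulative)) + [(boundaries[-1], 0)]


boundaries = [40, 44, 48, 52, 56, 60, 64, 68]
frequencies = [7, 12, 33, 47, 20, 11, 5]
points = more_than_ogive_points(boundaries, frequencies)
print(points)
```

The first point carries the total frequency, 135, and the curve falls to zero at the upper limit of the last class.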

Example 31. Plot Less than Ogive and More than Ogive for the following data:

Cost of production 4 - 6 6 - 8 8 - 10 10 - 12 12 - 14 14 - 16

No. of farms 13 111 182 105 19 7

Solution. We prepare a cumulative frequency distribution table showing the cumulative frequencies for the ‘less than’ ogive and the ‘more than’ ogive, as shown below:

Class interval         Class boundary   Frequency        Cumulative frequency
(Cost of production)                    (No. of farms)   Less than ogive   More than ogive
2 - 4                  4                0                0                 437
4 - 6                  6                13               13                424
6 - 8                  8                111              124               313
8 - 10                 10               182              306               131
10 - 12                12               105              411               26
12 - 14                14               19               430               7
14 - 16                16               7                437               0

By plotting the points (4, 0), (6, 13), (8, 124), (10, 306), (12, 411), (14, 430) and (16, 437) and joining them by a free hand curve, we get the less than ogive. Again, by plotting the points (16, 0), (14, 7), (12, 26), (10, 131), (8, 313), (6, 424) and (4, 437) and joining them by a free hand curve, we get the more than ogive. Both the curves are given by Figure 1.18.
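Both columns of cumulative frequencies in Example 31 can be generated together, and the median can then be estimated from the less than ogive. The sketch below is an approximation: it joins successive points by straight lines instead of a free hand curve, and the function name `median_from_ogive` is hypothetical.

```python
from itertools import accumulate

boundaries = [4, 6, 8, 10, 12, 14, 16]
farms = [13, 111, 182, 105, 19, 7]

less_than = [0] + list(accumulate(farms))      # c.f. below each boundary
total = less_than[-1]
more_than = [total - c for c in less_than]     # c.f. at or above each boundary


def median_from_ogive(boundaries, less_than):
    """Value at which the less than ogive reaches half the total frequency."""
    half = less_than[-1] / 2
    for i in range(1, len(boundaries)):
        if less_than[i] >= half:
            lo, hi = boundaries[i - 1], boundaries[i]
            below, upto = less_than[i - 1], less_than[i]
            # Linear interpolation inside the class containing the median.
            return lo + (half - below) / (upto - below) * (hi - lo)


median = median_from_ogive(boundaries, less_than)
print(less_than, more_than, round(median, 2))
```

At every boundary the two cumulative frequencies add up to the total, 437, so the two ogives cross where each equals half the total. The x-coordinate of that crossing is the median, which falls in the class 8 - 10 here.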

Example 32. Convert the following distribution into a more than frequency distribution.

Weekly wages less than (Rs.) : 20  40  60   80   100
No. of workers               : 41  92  156  194  201

For the data given above, draw the ‘less than’ and ‘more than’ ogives.

Solution. Table: Computation of ‘less than’ and ‘more than’ ogives

Weekly wages   No. of workers (f)   Cumulative frequencies
                                    Less than   More than
0 - 20         41                   41          201
20 - 40        92 - 41 = 51         92          160
40 - 60        156 - 92 = 64        156         109
60 - 80        194 - 156 = 38       194         45
80 - 100       201 - 194 = 7        201         7
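The computation in the table of Example 32 can be sketched in Python as below. The first class is assumed to start at 0, as in the table; the variable names are illustrative only.

```python
upper_limits = [20, 40, 60, 80, 100]
less_than = [41, 92, 156, 194, 201]     # given 'less than' cumulative frequencies

# Class frequencies are the successive differences of the cumulative totals.
frequencies = [less_than[0]] + [b - a for a, b in zip(less_than, less_than[1:])]

# 'More than' the lower limit of each class: total minus everything below it.
total = less_than[-1]
more_than = [total] + [total - c for c in less_than[:-1]]

lower_limits = [0] + upper_limits[:-1]
for lo, hi, f, lt, mt in zip(lower_limits, upper_limits, frequencies, less_than, more_than):
    print(f"{lo} - {hi}: f = {f}, less than = {lt}, more than = {mt}")
```

The class frequencies add up to 201, the last 'less than' value, which confirms the subtraction step.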

***************

CLASS WORK

1. Numerical data presented in descriptive form are called

(a) Classified presentation (b) tabular presentation

(c) Graphical presentation (d) textual presentation

2. Whether tabulation is done first or classification?

(a) Both are done simultaneously (b) Classification precedes tabulation

(c) Classification follows tabulation (d) No criterion

3. Charts and graphs are the presentation of numerical facts by means of

(a) symbols (b) points and lines

(c) areas and other geometrical forms (d) all the above

4. Graphs and charts facilitate

(a) comparison of data (b) to know the trend

(c) to know relationship (d) all the above

5. If there is an increase in a series at constant rate, the graph will be a :

(a) convex curve (b) concave curve

(c) a straight line from left top to right bottom (d) a straight line from left bottom to right top

6. If there is a decrease in a series at constant rate, the graph will be a :

(a) hyperbola (b) a straight line from left top to right bottom

(c) a straight line from left bottom to right top (d) none of them

7. For the mid-values 4,8,12,16,20 the first class of the distribution is:

(a) 0-8 (b) 1-7 (c) 2-6 (d) none of them

8. In an exclusive type distribution, the limits excluded are

(a) upper limits (b) lower limits

(c) either of the lower or upper limit (d) lower limit and upper limits both

9. A series showing the sets of all distinct values individually with their frequencies is known as

(a) continuous frequency distribution (b) simple frequency distribution

(c) cumulative frequency distribution (d) none of them

10. A series showing the sets of all values in classes with their corresponding frequencies is known as

(a) grouped frequency distribution (b) simple frequency distribution

(c) cumulative frequency distribution (d) none of them

11. If the lower and upper limits of a class are 20 and 30 respectively, the midpoint of the class is

(a) 25.0 (b) 25.5 (c) 24.5 (d) none of them

12. In a grouped data, the number of classes preferred are

(a) maximum possible (b) adequate

(c) minimum possible (d) any arbitrarily chosen numbers.

13. Class interval is measured as :

(a) the difference between upper and lower limits (b) half of the sum of lower & upper limits

(c) half of the difference between upper and lower limits (d) The sum of the upper and lower limits

14. The class interval of the following continuous grouped data is

0-9 10-19 20-29 30-39 40-49

5 8 12 15 4

(a) 10 (b) 9

(c) 24.5 (d) none of them

15. A grouped frequency distribution with uncertain first or last classes is known as :

(a) discrete frequency distribution (b) inclusive class distribution

(c) open end distribution (d) exclusive class distribution

16. The distribution,

Values 20 25 30 35 40 45 50

frequency 5 12 15 20 15 10 5

is of the type

(a) exclusive class type (b) discrete type

(c) inclusive class type (d) none of them

17. Data can be well displayed or presented by way of

(a) Graph (b) Tabulation

(c) cross classification (d) all the above

18. A simple table represents

(a) only one factor or variable (b) always two factors or variables

(c) two or more number of factors or variables (d) all the above

19. Ogive curve can be drawn for

(a) more than type distribution (b) less than type distribution

(c) both (a) and (b) (d) none of (a) and (b)

20. In an Ogive curve, the points are plotted for

(a) The values and frequencies (b) Frequencies and cumulative frequencies

(c) The values and cumulative frequencies (d) None of the above

21. Ogives for more than type and less than type distributions intersect at

(a) mean (b) median

(c) mode (d) none of them

22. A complex table represents

(a) only one factor or variable (b) always two factors or variables

(c) two or more factors or variables (d) all the above

23. The headings of the rows given in the first column of a table are called

(a) stubs (b) captions

(c) sub titles (d) prefatory notes

24. The column headings of a table are known as

(a) sub-titles (b) stubs (c) reference notes (d) captions

25. The following series is of the type of

Place Population

A 15,00,000

B 20,00,000

C 30,00,000

D 40,00,000

(a) Discrete (b) Industrial (c) Geographical (d) Time series

26. The following series is of the type of

Year Population in a particular city

1995 19,00,000

1996 19,80,000

1997 25,00,000

1998 25,50,000

(a) individual series (b) Geographical (c) discrete series (d) time series

27. The following series is of the type of

Person Age (in years)

A 35

B 23

C 26

D 32

E 30

(a) Individual (b) Discrete (c) Geographical (d) Time series.

28. The following series is of the type of

Marks No. of students

10-15 20

15-20 21

20-25 26

25-30 15

30-35 7

35-40 3

(a) Discrete (b) Continuous (c) Individual (d) Time series

29. The purpose served by diagrams and charts is

(a) to avoid textual form (b) to avoid tabulation

(c) simple presentation of data (d) all the above

30. Choice of a particular chart depends on

(a) the purpose of study (b) the type of data

(c) the nature of data (d) all the above

31. Which of the following is a one-dimensional diagram?

(a) Bar (b) Square (c) Rectangle (d) A graph

32. Which of the following is not a two-dimensional diagram?

(a) Pie-chart (b) Multiple bar diagram

(c) Rectangular diagram (d) Square diagram

33. Which of the following is a two-dimensional diagram?

(a) Pie-chart (b) Multiple bar

(c) Bar (d) None of them

34. Non-dimensional diagrams are also known as :

(a) cubes (b) spheres

(c) pictograms (d) all the above

35. A frequency distribution can be

(a) discrete (b) continuous

(c) both (a) and (b) (d) none of (a) & (b)

36. Which of the following statements is true ?

(a) An individual series is a special case of continuous series

(b) An individual series is a particular case of discrete and continuous series

(c) An individual series is a particular case of discrete series

(d) There is nothing like individual series

37. Frequency of a variable is always

(a) in ratio (b) a fraction (c) an integer (d) in percentage

38. The data given as 22,27,28, 29,35,37, 55, 62 will be called as

(a) a continuous series (b) a discrete series

(c) an individual series (d) time series

39. The following frequency distribution is classified as

x: 0 1 2 3 4 5 6

f: 2 5 8 10 9 6 3

(a) continuous distribution (b) discrete distribution

(c) cumulative frequency distribution (d) none of the above

40. In an ordered series, the data are :

(a) in descending order (b) in ascending order

(c) either (a) or (b) (d) neither (a) nor (b)

41. The following frequency distribution is known as

Classes Frequency

0-100 3

0-200 8

0-300 12

0-400 19

0-500 30

(a) continuous frequency distribution (b) discrete frequency distribution

(c) cumulative distribution in less than type (d) cumulative distribution in more than type

42. It is necessary to find cumulative frequencies in order to draw a/an :

(a) pie chart (b) frequency polygon

(c) Ogive curve (d) column chart

43. If we plot the points of a less than type or more than type frequency distribution, the shape of the graph is :

(a) Ogive curve (b) scatter diagram

(c) zig-zag curve (d) parabola

44. With the help of ogive curve, one can determine

(a) median (b) quartile

(c) percentiles (d) all the above

45. Histogram is suitable for the data presented as :

(a) continuous grouped frequency distribution (b) discrete grouped frequency distribution

(c) individual series (d) all the above

46. In a histogram with equal class intervals; heights of bar are proportional to :

(a) mid-values of the class (b) frequencies of respective classes

(c) either (a) or (b) (d) neither (a) nor (b)

47. Historigram is suitable for

(a) time series data (b) chronological distribution

(c) none of (a) or (b) (d) both (a) and (b)

48. The following frequency distribution

Classes Frequency

0-50 25

0-30 18

0-10 5

is classified as :

(a) cumulative distribution in less than type (b) cumulative distribution in more than type

(c) discrete frequency distribution (d) cumulative frequency distribution

49. Classification is applicable in case of

(a) quantitative data (b) qualitative data (c) both (a) and (b) (d) none of the above

50. Which of the following statements is correct ?

(a) Histograms and historigrams are similar in look (b) Cube and square diagrams are similar in look

(c) Pie-chart and ogives are similar in look (d) none of them

51. In a bar diagram, the base line is

(a) horizontal (b) vertical

(c) false base line (d) any of the above

52. In a bar diagram, the bars are

(a) horizontal (b) vertical

(c) slanting (d) none of the above

53. In case of frequency distribution with classes of unequal widths, the heights of bars of a histogram are proportional to

(a) frequency (b) class intervals

(c) frequency densities (d) frequencies in percentage

54. Year wise production of rice, wheat and maize for the last ten years can be displayed by :

(a) simple column chart (b) multiple column chart

(c) Broken column chart (d) subdivided column chart

55. With the help of histogram we can prepare :

(a) frequency polygon (b) frequency curve

(c) frequency distribution (d) all the above

56. Pictograms are

(a) least accurate (b) very accurate

(c) mostly used (d) scientifically correct

57. Pictograms are shown by

(a) rays (b) lines (c) circles (d) pictures

58. An alternative chart to pie-chart is

(a) sphere (b) rectangular chart

(c) step bar diagram (d) none of the above

59. The graph of the successive points of a distribution joined by straight lines in statistical terminology is known as

(a) frequency polygon (b) frequency curve

(c) trend (d) cumulative distribution curve

60. Pie-chart represents the components of a factor by :

(a) percentages (b) angles (c) sectors (d) ratio

* * *

ANSWER KEYS

HOME WORK-1

1. Which of the following statements is false?

(a) Statistics is derived from the Latin word ‘Status’ (b) Statistics is derived from the Italian word ‘Statista’

(c) Statistics is derived from the French word ‘Statistik’ (d) None of these.

2. Statistics is defined in terms of numerical data in the

(a) Singular sense (b) Plural sense

(c) Either (a) or (b) (d) Both (a) and (b).

3. Statistics is applied in

(a) Economics (b) Business management

(c) Commerce and industry (d) All these.

4. Statistics is concerned with

(a) Qualitative information (b) Quantitative information

(c) (a) or (b) (d) Both (a) and (b).

5. An attribute is

(a) A qualitative characteristic (b) A quantitative characteristic

(c) A measurable characteristic (d) All these.

6. Annual income of a person is

(a) An attribute (b) A discrete variable

(c) A continuous variable (d) (b) or (c).

7. Marks of a student is an example of

(a) An attribute (b) A discrete variable

(c) A continuous variable (d) None of these.

8. Nationality of a student is

(a) An attribute (b) A continuous variable

(c) A discrete variable (d) (a) or (c).

9. Drinking habit of a person is

(a) An attribute (b) A variable

(c) A discrete variable (d) A continuous variable.

10. Age of a person is

(a) An attribute (b) A discrete variable

(c) A continuous variable (d) A variable.

11. Data collected on religion from the census reports are

(a) Primary data (b) Secondary data

(c) Sample data (d) (a) or (b).

12. The data collected on the height of a group of students after recording their heights with a measuring tape are

(a) Primary data (b) Secondary data

(c) Discrete data (d) Continuous data.

13. The primary data are collected by

(a) Interview method (b) Observation method

(c) Questionnaire method (d) All these.

14. The quickest method to collect primary data is

(a) Personal interview (b) Indirect interview

(c) Telephone interview (d) By observation.

15. The best method to collect data, in case of a natural calamity, is

(a) Personal interview (b) Indirect interview

(c) Questionnaire method (d) Direct observation method.

16. In case of a rail accident, the appropriate method of data collection is by

(a) Personal interview (b) Direct interview

(c) Indirect interview (d) All these.

17. Which method of data collection covers the widest area?

(a) Telephone interview method (b) Mailed questionnaire method

(c) Direct interview method (d) All these.

18. The amount of non-responses is maximum in

(a) Mailed questionnaire method (b) Interview method

(c) Observation method (d) All these.

19. Some important sources of secondary data are

(a) International and Government sources

(b) International and primary sources

(c) Private and primary sources

(d) Government sources.

20. Internal consistency of the collected data can be checked when

(a) Internal data are given (b) External data are given

(c) Two or more series are given (d) A number of related series are given.

21. The accuracy and consistency of data can be verified by

(a) Internal checking (b) External checking

(c) Scrutiny (d) Both (a) and (b).

22. The mode of presentation of data are

(a) Textual, tabulation and diagrammatic (b) Tabular, internal and external

(c) Textual, tabular and internal (d) Tabular, textual and external.

23. The best method of presentation of data is

(a) Textual (b) Tabular

(c) Diagrammatic (d) (b) and (c).

24. The most attractive method of data presentation is

(a) Tabular (b) Textual

(c) Diagrammatic (d) (a) or (b).

25. For tabulation, ‘caption’ is

(a) The upper part of the table (b) The lower part of the table

(c) The main part of the table

(d) The upper part of a table that describes the column and sub-column.

26. ‘Stub’ of a table is the

(a) Left part of the table describing the columns

(b) Right part of the table describing the columns

(c) Right part of the table describing the rows

(d) Left part of the table describing the rows.

27. The entire upper part of a table is known as

(a) Caption (b) Stub (c) Box head (d) Body.

28. The unit of measurement in tabulation is shown in

(a) Box head (b) Body (c) Caption (d) Stub.

29. In tabulation source of the data, if any, is shown in the

(a) footnote (b) Body (c) Stub (d) Caption.

30. Which of the following statements is untrue for tabulation?

(a) Statistical analysis of data requires tabulation (b) It facilitates comparison between rows and not column

(c) Complicated data can be presented (d) Diagrammatic representation of data requires tabulation.

31. Hidden trend, if any, in the data can be noticed in

(a) Textual presentation (b) Tabulation

(c) Diagrammatic representation (d) All these.

32. Diagrammatic representation of data is done by

(a) Diagrams (b) Charts (c) Pictures (d) All these.

33. The most accurate mode of data presentation is

(a) Diagrammatic method (b) Tabulation

(c) Textual presentation (d) None of these.

34. The chart that uses logarithm of the variable is known as

(a) Line chart (b) Ratio chart

(c) Multiple line chart (d) Component line chart.

35. Multiple line chart is applied for

(a) Showing multiple charts (b) Two or more related time series when the variables are expressed in the same unit

(c) Two or more related time series when the variables are expressed in different units (d) Multiple variations in the time series.

36. Multiple axis line chart is considered when

(a) There is more than one time series

(b) The units of the variables are different

(c) (a) or (b) (d) (a) and (b).

37. Horizontal bar diagram is used for

(a) Qualitative data (b) Data varying over time

(c) Data varying over space (d) (a) or (c)

38. Vertical bar diagram is applicable when

(a) The data are qualitative (b) The data are quantitative

(c) When the data vary over time (d) (a) or (c)

39. Divided bar chart is considered for

(a) Comparing different components of a variable (b) The relation of different components to the table

(c) (a) or (b) (d) (a) and (b).

40. In order to compare two or more related series, we consider

(a) Multiple bar chart (b) Grouped bar chart

(c) (a) or (b) (d) (a) and (b).

41. Pie-diagram is used for

(a) Comparing different components and their relation to the total (b) Representing qualitative data in a circle

(c) Representing quantitative data in a circle (d) (b) or (c).

42. A frequency distribution

(a) Arranges observations in an increasing order (b) Arranges observations in terms of a number of groups

(c) Relates to a measurable characteristic (d) all these.

43. The frequency distribution of a continuous variable is known as

(a) Grouped frequency distribution (b) Simple frequency distribution

(c) (a) or (b) (d) (a) and (b)

44. The distribution of shares is an example of the frequency distribution of

(a) A discrete variable (b) A continuous variable

(c) An attribute (d) (a) or (c).

45. The distribution of profits of a blue-chip company relates to

(a) Discrete variable (b) Continuous variable

(c) Attributes (d) (a) or (b).

46. Mutually exclusive classification

(a) Excludes both the class limits (b) Excludes the upper class limit but includes the lower class limit

(c) Includes the upper class limit but excludes the lower class limit (d) Either (b) or (c)

47. Mutually inclusive classification is usually meant for

(a) A discrete variable (b) A continuous variable

(c) An attribute (d) All these.

48. Mutually exclusive classification is usually meant for

(a) A discrete variable (b) A continuous variable

(c) An attribute (d) Any of these.

49. The LCB is

(a) An upper limit to LCL (b) A lower limit to LCL

(c) (a) and (b) (d) (a) or (b).

50. The UCB is

(a) An upper limit to UCL (b) A lower limit to LCL

(c) Both (a) and (b) (d) (a) or (b).

51. Length of a class is

(a) The difference between the UCB and LCB of that class (b) The difference between the UCL and LCL of that class

(c) (a) or (b) (d) Both (a) and (b).

52. For a particular class boundary, the less than cumulative frequency and more than cumulative frequency add up to

(a) Total frequency (b) Fifty per cent of the total frequency

(c) (a) or (b) (d) None of these.

53. Frequency density corresponding to a class interval is the ratio of

(a) Class frequency to the total frequency (b) Class frequency to the class length

(c) Class length to the class frequency (d) Class frequency to the cumulative frequency.

54. Relative frequency for a particular class

(a) Lies between 0 and 1 (b) Lies between 0 and 1, both inclusive

(c) Lies between -1 and 0 (d) Lies between -1 to 1.

55. Mode of a distribution can be obtained from

(a) Histogram (b) Less than type Ogives

(c) More than type Ogives (d) Frequency polygon.

56. Median of a distribution can be obtained from

(a) Frequency polygon (b) Histogram

(c) Less than type ogives (d) None of these.

57. A comparison among the class frequencies is possible only in

(a) Frequency polygon (b) Histogram

(c) Ogives (d) (a) or (b).

58. Frequency curve is a limiting form of

(a) Frequency polygon (b) Histogram

(c) (a) or (b) (d) (a) and (b).

59. Most of the commonly used frequency curves are

(a) Mixed (b) Inverted J-shaped

(c) U-shaped (d) Bell-shaped.

60. The distribution of profits of a company follows

(a) J-shaped frequency curve (b) U-shaped frequency curve

(c) Bell-shaped frequency curve (d) Any of these.

61. Graph is a

(a) Line diagram (b) Bar diagram

(c) Pie diagram (d) Pictogram

62. Details are shown by

(a) Charts (b) Tabular presentation

(c) both (d) none

63. The relationship between two variables is shown in

(a) Pictogram (b) Histogram (c) Bar diagram (d) Line diagram

64. In general the number of types of tabulation is

(a) two (b) three (c) one (d) four

65. A table has

(a) four (b) two (c) five (d) none parts.

66. The number of types of errors in Statistics is

(a) one (b) two (c) three (d) four

67. The number of types of “Frequency distribution” is

(a) two (b) one (c) five (d) four

68. (Class frequency)/(Width of the class) is defined as

(a) Frequency density (b) Frequency distribution (c) both (d) none

69. Tally marks determine

(a) class width (b) class boundary

(c) class limit (d) class frequency

70. Cumulative Frequency Distribution is a

(a) graph (b) frequency

(c) Statistical Table (d) distribution

71. To find the number of observations less than any given value

(a) Single frequency distribution (b) Grouped frequency distribution

(c) Cumulative frequency distribution (d) None is used.

72. An area diagram is

(a) Histogram (b) Frequency Polygon

(c) Ogive (d) none

73. When all classes have a common width

(a) Pie Chart (b) Frequency Polygon

(c) both (d) none is used.

74. An approximate idea of the shape of frequency curve is given by

(a) Ogive (b) Frequency Polygon

(c) both (d) none

75. Ogive is a

(a) line diagram (b) Bar diagram (c) both (d) none

76. Unequal widths of classes in the frequency distribution do not cause any difficulty in the construction of

(a) Ogive (b) Frequency Polygon (c) Histogram (d) none

77. The graphical representation of a cumulative frequency distribution is called

(a) Histogram (b) Ogive (c) both (d) none

78. The most common form of diagrammatic representation of a grouped frequency distribution is

(a) Ogive (b) Histogram (c) frequency Polygon (d) none

79. A vertical bar chart may appear somewhat like a

(a) Histogram (b) Frequency Polygon

(c) both (d) none

80. The number of types of cumulative frequency is

(a) one (b) two (c) three (d) four

81. A representative value of the class interval for the calculation of mean, standard deviation, mean deviation etc. is

(a) class interval (b) class limit (c) class mark (d) none

82. The no. of observations falling within a class is called

(a) density (b) frequency (c) both (d) none

83. Classes with zero frequencies are called

(a) nil class (b) empty class

(c) class (d) none

84. For determining the class frequencies it is necessary that these classes are

(a) mutually exclusive (b) not mutually exclusive

(c) independent (d) none

85. Most extreme values which would ever be included in a class interval are called

(a) class limits (b) class interval

(c) class boundaries (d) none

86. The value exactly at the middle of a class interval is called

(a) class mark (b) mid value (c) both (d) none

87. Difference between the lower and the upper class boundaries is

(a) width (b) size (c) both (d) none

88. In the construction of a frequency distribution, it is generally preferable to have classes of

(a) equal width (b) unequal width

(c) maximum (d) none

89. Frequency density is used in the construction of

(a) Histogram (b) Ogive

(c) Frequency Polygon (d) none when the classes are of unequal width.

90. “Cumulative Frequency” only refers to the

(a) less-than type (b) more-than type

(c) both (d) none

91. For the construction of a grouped frequency distribution

(a) class boundaries (b) class limits

(c) both (d) none are used.

92. In all Statistical calculations and diagrams involving end points of classes

(a) class boundaries (b) class value

(c) both (d) none are used.

93. Upper limit of any class is

(a) same (b) different

(c) both (d) none from the lower limit of the next class

94. Upper boundary of any class coincides with the Lower boundary of the next class.

(a) true (b) false (c) both (d) none.

95. Excepting the first and the last, all other class boundaries lie midway between the upper limit of a class and the lower limit of the next higher class.

(a) true (b) false (c) both (d) none

96. The lower extreme point of a class is called

(a) lower class limit (b) lower class boundary

(c) both (d) none

97. For the construction of grouped frequency distribution from ungrouped data

(a) class limits (b) class boundaries

(c) class width (d) none are used.

98. When one end of a class is not specified, the class is called

(a) closed-end class (b) open-end class

(c) both (d) none

99. Class boundaries should be considered to be the real limits for the class interval.

(a) true (b) false (c) both (d) none

100. Difference between the maximum & minimum value of a given data is called

(a) width (b) size (c) range (d) none

101. In Histogram if the classes are of unequal width then the heights of the rectangles must be proportional to the frequency densities.

(a) true (b) false (c) both (d) none

102. When all classes have equal width, the heights of the rectangles in Histogram will be numerically equal to the

(a) class frequencies (b) class boundaries

(c) both (d) none

103. Consecutive rectangles in a Histogram have no space in between

(a) true (b) false (c) both (d) none

104. Histogram emphasizes the widths of rectangles between the class boundaries.

(a) false (b) true (c) both (d) none

105. To find the mode graphically

(a) Ogive (b) Frequency Polygon

(c) Histogram (d) none may be used.

106. When the width of all classes is the same, the frequency polygon does not have the same area as the Histogram.

(a) true (b) false (c) both (d) none

107. For obtaining frequency polygon we join the successive points whose abscissae represent the corresponding class frequency

(a) true (b) false (c) both (d) none

108. In representing simple frequency distributions of a discrete variable

(a) Ogive (b) Histogram

(c) Frequency Polygon (d) both is useful.

109. Diagrammatic representation of the cumulative frequency distribution is

(a) Frequency Polygon (b) Ogive

(c) Histogram (d) none

110. For the overlapping classes 0-10, 10-20, 20-30 etc. the class mark of the class 0-10 is

(a) 5 (b) 0 (c) 10 (d) none.

111. For the non-overlapping classes 0-19, 20-39, 40-59 the class mark of the class 0-19 is

(a) 0 (b) 19 (c) 9.5 (d) none

112. Class : 0-10 10-20 20-30 30-40 40-50

Frequency : 5 8 15 6 4

For the class 20-30, cumulative frequency is

(a) 20 (b) 13 (c) 15 (d) 28
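A minimal sketch of how a cumulative frequency column can be built from the data in question 112. It assumes the less-than type is meant (a running total of class frequencies); the variable names are illustrative and not part of the exercise.

```python
from itertools import accumulate

# Data from question 112: classes and their frequencies.
classes = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [5, 8, 15, 6, 4]

# Less-than type cumulative frequency: running total up to and including each class.
less_than_cf = list(accumulate(frequencies))
cf_by_class = dict(zip(classes, less_than_cf))

print(cf_by_class["20-30"])  # 5 + 8 + 15
```

The last entry of the running total equals the total frequency, which is a quick check that no class was missed.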

113. An Ogive can be prepared in ______ different ways.

(a) 2 (b) 3 (c) 4 (d) none

114. The curve obtained by joining the points, whose x-coordinates are the upper limits of the class-intervals and y-coordinates are corresponding cumulative frequencies is called

(a) Ogive (b) Histogram

(c) Frequency Polygon (d) Frequency Curve

115. The breadth of the rectangle is equal to the length of the class-interval in

(a) Ogive (b) Histogram (c) both (d) none

116. In Histogram, the classes are taken

(a) overlapping (b) non-overlapping (c) both (d) none

117. For overlapping class-intervals the class limit & class boundary are

(a) same (b) not same (c) zero (d) none

118. Classification is of

(a) four (b) three (c) two (d) five kinds.

* * *

ANSWER KEYS

HOME WORK-2

1. When the data are classified in respect of successive time points, they are known as_______.

(a) Chronological data (b) Geographical data

(c) Ordinal data (d) Cardinal data

2. The number of accidents for seven days in a locality are given below :

No. of accidents : 0 1 2 3 4 5 6

Frequency : 15 19 22 31 9 3 2

What is the number of cases when 3 or fewer accidents occurred?

(a) 56 (b) 6 (c) 68 (d) 87
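As a hedged illustration of reading a discrete frequency table, the sketch below adds up the frequencies for every value not exceeding 3, using the data of question 2. The variable names are assumptions made for the example.

```python
# Data from question 2: number of accidents and how often each count occurred.
accidents = [0, 1, 2, 3, 4, 5, 6]
frequency = [15, 19, 22, 31, 9, 3, 2]

# Cases with 3 or fewer accidents: add the frequencies for 0, 1, 2 and 3.
cases_at_most_3 = sum(f for x, f in zip(accidents, frequency) if x <= 3)

print(cases_at_most_3)  # 15 + 19 + 22 + 31
```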

3. The following data relate to the marks of a group of students:

Marks : Below 10 Below 20 Below 30 Below 40 Below 50

No. of students : 15 38 65 84 100

How many students got marks more than 30?

(a) 65 (b) 50 (c) 35 (d) 43
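A small sketch of how a less-than cumulative table answers a "more than" question, using the data of question 3. It assumes that every student not counted as "below 30" is treated as having more than 30 marks, which is how such tables are usually read; the names are illustrative.

```python
# Less-than cumulative frequencies from question 3: upper bound -> students below it.
below = {10: 15, 20: 38, 30: 65, 40: 84, 50: 100}

total_students = below[50]                 # every student scored below 50
above_30 = total_students - below[30]      # students not counted as below 30

print(above_30)  # 100 - 65
```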

4. The following data relate to the incomes of 86 persons :

Income in Rs. : 500–999 1000–1499 1500–1999 2000–2499

No. of persons : 15 28 36 7

What is the percentage of persons earning more than Rs. 1500?

(a) 50 (b) 45 (c) 40 (d) 60
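The percentage asked for in question 4 can be sketched as below. The sketch assumes that the classes starting at Rs. 1500 are the ones counted as "more than Rs. 1500" (grouped data cannot separate a person earning exactly 1500); the dictionary layout is an illustrative choice.

```python
# Data from question 4: (lower limit, upper limit) in Rs. -> number of persons.
income_classes = {
    (500, 999): 15,
    (1000, 1499): 28,
    (1500, 1999): 36,
    (2000, 2499): 7,
}

total_persons = sum(income_classes.values())
# Persons in the classes whose lower limit is 1500 or more.
high_earners = sum(n for (lower, _), n in income_classes.items() if lower >= 1500)

percentage = 100 * high_earners / total_persons
print(percentage)  # 100 * 43 / 86
```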

5. Bivariate Data are the data collected for

(a) Two variables. (b) More than two variables.

(c) Two variables at the same point of time.

(d) Two variables at different points of time.

6. An Ogive can be prepared in ______ different ways.

(a) 2 (b) 3 (c) 4 (d) 5

7. Find the number of observations between 250 and 300 from the following data :

Value : More than 200 More than 250 More than 300 More than 350

No. of observation : 56 38 15 0

(a) 56 (b) 23 (c) 15 (d) 8
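A brief sketch of recovering a class frequency from a more-than cumulative table, using the data of question 7. The count between two bounds is the difference of the two cumulative figures; the variable names are assumptions.

```python
# More-than cumulative frequencies from question 7: lower bound -> observations above it.
more_than = {200: 56, 250: 38, 300: 15, 350: 0}

# Observations between 250 and 300 = (more than 250) - (more than 300).
between_250_and_300 = more_than[250] - more_than[300]

print(between_250_and_300)  # 38 - 15
```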

8. The entire upper part of a table is known as

(a) Caption. (b) Stub (c) Box head. (d) Body.

9. The following data relate to the marks of a group of students:

Marks : Below 10 Below 20 Below 30 Below 40 Below 50

No. of students : 15 38 65 84 100

How many students got marks more than 30?

(a) 65 (b) 50 (c) 35 (d) 43

10. The unit of measurement in tabulation is shown in

(a) Box head. (b) Body. (c) Caption (d) Stub.

11. Most extreme values which would ever be included in a class interval are called

(a) Class limits. (b) Class interval.

(c) Class boundaries. (d) None of these.

12. Number of petals in a flower is an example of

(a) A continuous variable. (b) A discrete variable.

(c) An attribute. (d) All of these.

13. A Qualitative characteristic is known as

(a) An attribute. (b) A variable.

(c) A discrete variable. (d) A continuous variable.

14. Methods that are employed for the collection of primary data –

(a) Interview method. (b) Questionnaire method.

(c) Observation method. (d) All of these.

15. This method presents data with the help of a paragraph or a number of paragraphs.

(a) Tabular presentation. (b) Textual presentation.

(c) Diagrammatic representation. (d) None of these.

16. Data collected on minority from the census reports are

(a) Primary data. (b) Secondary data.

(c) Discrete data. (d) Continuous data.

17. __________ is the upper part of the table, describing the columns and sub columns.

(a) Box head (b) Stub (c) Caption (d) Body

18. Refer following table:

Frequency distribution of weights of 16 students

Weight in kg. No. of students

(Class interval) (Frequency)

44 – 48 4

49 – 53 5

54 – 58 7

Total 16

Find Frequency density of the second class interval.

(a) 0.80 (b) 0.90 (c) 1.00 (d) 1.10
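Frequency density is the class frequency divided by the class length. The sketch below works this out for the second class of question 18, assuming the class boundaries lie midway between adjacent class limits (48.5 and 53.5 here); the variable names are illustrative.

```python
# Second class of question 18: limits 49 - 53, frequency 5.
lower_limit, upper_limit, class_frequency = 49, 53, 5

# Gap between the upper limit of one class (48) and the lower limit of the next (49).
gap = 49 - 48

# Class boundaries sit half a gap beyond the class limits.
lower_boundary = lower_limit - gap / 2   # 48.5
upper_boundary = upper_limit + gap / 2   # 53.5

class_length = upper_boundary - lower_boundary
frequency_density = class_frequency / class_length

print(class_length, frequency_density)
```

Using the class limits instead of the boundaries would give a length of 4 and a different density, so the choice of boundaries matters for non-overlapping classes.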

19. A variable is known to be _______ if it can assume any value from a given interval.

(a) Discrete (b) Continuous (c) Attribute (d) Characteristic

20. ________ is the entire upper part of the table which includes columns and sub-column numbers, unit(s) of measurement.

(a) Stub (b) Box–head (c) Body (d) Caption

21. _______ is the left part of the table providing the description of the rows.

(a) Caption (b) Body (c) Stub (d) Box head

22. Data collected on sex ratio from the census reports are ______ .

(a) Primary data (b) Secondary data

(c) Discrete data (d) Continuous data

23. Refer following table:

Frequency distribution of weights of 16 students

Weight in kg. No. of students

(Class interval) (Frequency)

44 – 48 4

49 – 53 5

54 – 58 7

Total 16

Find Relative frequency for the second class interval.

(a) 1/11 (b) 5/4 (c) 5/16 (d) 1/4
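Relative frequency is the class frequency divided by the total frequency. A minimal sketch for the weights table above, using exact fractions so the result can be compared with the options directly; the dictionary keys are illustrative labels.

```python
from fractions import Fraction

# Frequency distribution of weights of 16 students (class interval -> frequency).
frequencies = {"44-48": 4, "49-53": 5, "54-58": 7}

total = sum(frequencies.values())
relative = {cls: Fraction(f, total) for cls, f in frequencies.items()}

print(relative["49-53"])        # relative frequency of the second class
print(sum(relative.values()))   # relative frequencies add up to 1
```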

24. _______ may be defined as the minimum value and the maximum value the class interval may contain.

(a) Class mark (b) Class limit

(c) Both of the above (d) None of the above

25. The colour of a flower is an example of ______ .

(a) An attribute (b) A variable

(c) A discrete variable (d) A continuous variable

26. A quantitative characteristic is known as ______ .

(a) An attribute (b) A variable

(c) Both of above (d) None of above

27. Refer following table:

Frequency distribution of weights of 16 students

Weight in kg. No. of students

(Class interval) (Frequency)

44 – 48 4

49 – 53 5

54 – 58 7

Total 16

Find Relative frequency for the third class interval.

(a) 7/16 (b) 7/4 (c) 16/7 (d) None of the above.

28. Representation of data is done by

(a) Diagrams (b) Pictures (c) Charts (d) All these

29. A statistic is described as

(a) A function of sample observation (b) A function of population units

(c) A characteristic of a population (d) A part of population

30. Pie diagram is used for

(a) Comparing different components and their relation to the total

(b) Representing qualitative data in a circle

(c) Representing quantitative data in a circle.

(d) (b) or (c)

31. To find the median graphically use

(a) Ogive (b) Frequency Polygon

(c) Histogram (d) None of these

32. Unequal widths of classes in the frequency distribution do not cause any difficulty in the construction of

(a) Ogive (b) Frequency Polygon

(c) Both (d) None of these

33. Frequency density is used in the construction of

(a) Histogram (b) Ogive

(c) Frequency polygon (d) None of these

34. The chart that uses the logarithm of the variable is known as

(a) Line chart (b) Ratio chart

(c) Multiple line chart (d) Component line chart

35. The mean salary for a group of 4 males is Rs. 5200 per month and that for a group of 6 females is Rs. 6800 per month. What is the combined mean salary?

(a) Rs. 6160 (b) Rs. 6610 (c) Rs. 6110 (d) None of these
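The combined mean of two groups is the size-weighted average of the group means. A short sketch with the figures of question 35; the helper name `combined_mean` is hypothetical and introduced only for this illustration.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Size-weighted average of two group means (illustrative helper)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Question 35: 4 males with mean 5200 and 6 females with mean 6800 (Rs. per month).
combined = combined_mean(4, 5200, 6, 6800)

print(combined)  # (4*5200 + 6*6800) / 10
```

The result lies closer to 6800 than to 5200 because the larger group carries more weight.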

36. A frequency distribution can be presented graphically by a

(a) Pie diagram (b) Histogram (c) Pictogram (d) Line diagram.

37. The width of each of ten classes in a frequency distribution is 2.5 and the lower class boundary of the lowest class is 10.6. Which one of the following is the upper class boundary of the highest class?

(a) 35.6 (b) 33.1 (c) 30.6 (d) None of these
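With equal-width classes that share boundaries, the upper boundary of the highest class is the lowest boundary plus the number of classes times the width. A sketch for question 37, using `Decimal` to avoid floating-point rounding; the variable names are assumptions.

```python
from decimal import Decimal

# Question 37: ten classes of width 2.5, lowest class boundary 10.6.
number_of_classes = 10
class_width = Decimal("2.5")
lowest_boundary = Decimal("10.6")

# Each class starts where the previous one ends, so the widths simply add up.
highest_upper_boundary = lowest_boundary + number_of_classes * class_width

print(highest_upper_boundary)
```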

38. Let L be the lower class boundary of a class in a frequency distribution and m be the mid point of the class. Which one of the following is the upper class boundary of the class?

(a) (b) (c) 2m - L (d) m - 2L
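The relation behind question 38 can be worked out in one step. Here U denotes the upper class boundary, a symbol introduced only for this derivation; the mid point is taken to be the average of the two boundaries.

```latex
m = \frac{L + U}{2}
\quad\Longrightarrow\quad
2m = L + U
\quad\Longrightarrow\quad
U = 2m - L
```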

39. Ogive is used to obtain

(a) Mean (b) Mode (c) Quartiles (d) All these

40. In audit test statistical methods are not used

(a) True (b) False (c) Both (d) None of these

ANSWER SHEET

1 (a) 11 (c) 21 (c) 31 (a)

2 (d) 12 (b) 22 (b) 32 (a)

3 (c) 13 (a) 23 (c) 33 (a)

4 (a) 14 (d) 24 (b) 34 (b)

5 (c) 15 (b) 25 (a) 35 (a)

6 (a) 16 (b) 26 (b) 36 (b)

7 (b) 17 (c) 27 (a) 37 (a)

8 (c) 18 (c) 28 (d) 38 (c)

9 (c) 19 (b) 29 (a) 39 (c)

10 (a) 20 (b) 30 (a) 40 (b)

* * *
