  • Digital Reference: Statistics & Data
    LIS 536
    March 4, 2009

  • Outline
      • Quantitative digital content
          • Distinguishing between statistics and data
          • The basics about statistics and data as digital information
      • The reference interview and search strategies
          • Probe to determine if the patron wants statistics or data
          • If statistics, are they likely official statistics or non-official statistics?
          • Strategies for searching for statistics
              • Government publications approach
              • Data approach
          • If data, need to think about the life cycle of data
              • An example from the Canadian National Population Health Survey, 1994
              • Conditions of access can become a barrier
              • Formats matter to your patrons
      • Citations for statistics and data

  • Distinguishing statistics from data

  • How statistics and data differ
      • Statistics
          • numeric summaries known as facts/figures
          • derived from data, i.e., processed from data
          • presentation-ready format
      • Data
          • numeric files created and organized for computer analysis
          • requires computer processing
          • not in a display format
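    The distinction can be sketched in a few lines of Python. This is only an illustration: the records, field names and the summarize function are invented for the example and do not come from any real survey. The list of records stands in for data (organized for computer analysis, not for display), and the dictionary returned by summarize stands in for statistics (figures processed from the data and ready to present).

```python
from statistics import mean

# Hypothetical microdata: one record per survey respondent (invented values).
records = [
    {"province": "AB", "age": 34, "smoker": True},
    {"province": "AB", "age": 51, "smoker": False},
    {"province": "BC", "age": 28, "smoker": False},
    {"province": "BC", "age": 45, "smoker": True},
    {"province": "BC", "age": 62, "smoker": False},
]


def summarize(rows):
    """Reduce raw records (data) to presentation-ready figures (statistics)."""
    n = len(rows)
    smokers = sum(1 for r in rows if r["smoker"])
    return {
        "respondents": n,
        "mean_age": round(mean(r["age"] for r in rows), 1),
        "percent_smokers": round(100 * smokers / n, 1),
    }


summary = summarize(records)
for label, value in summary.items():
    print(f"{label}: {value}")
```

    A patron who wants the three printed figures is asking for statistics; a patron who wants the records themselves, in order to compute something else, is asking for data.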

  • A statistic can't be real without data
    A real statistic requires a data source. If the publisher of a statistic can't tell you the data source behind it, you should question whether the statistic is real. After all, people do make up statistics.
    Classic example: a statistic in a 1986 Newsweek article claimed that a 40-year-old woman had a better chance of being killed by a terrorist than of getting married (2.6 percent). Twenty years later, Newsweek admitted that this comparison wasn't in the study.

  • A statistic can't be real without data
    A statistic may have been derived from poor-quality data and, consequently, may be of questionable value. Nevertheless, it is a real statistic.
    For example, a debate erupted over a Lancet article on the number of civilian deaths in Iraq in the first 18 months following the invasion.
    The desire is to have quality statistics that are derived from quality data.

  • Statistics Canada's criteria
    Statistics Canada uses the following criteria to define quality statistics, or statistics that are "fit for use":
      • Relevance: addresses issues of importance to users
      • Accuracy: the degree to which it describes what it was designed to measure
      • Timeliness: the delay between when the information was collected and when it is made available
      • Accessibility: the ease with which the information can be obtained by users
      • Interpretability: access to metadata that facilitates interpretation and use
      • Coherence: the fit with other statistical information through the use of standard concepts, classifications and target populations

  • Statistics are about definitions

  • Statistics are about definitions!
    You may think of statistics as being just numbers, but these numbers represent summaries of measurements or observations that have a conceptual meaning. Deriving statistics from data depends on the definition of the concept that is being summarized.

  • Statistics are about definitions!
    Consider the following example from the Canadian Census on the data behind statistics about visible minorities. This table displays the size of the visible minority population in Canada from the 2006 Census.

    Visible Minority Groups (15), Generation Status (4), Age Groups (9) and Sex (3) for the Population 15 Years and Over of Canada, Provinces, Territories, Census Metropolitan Areas and Census Agglomerations, 2006 Census - 20% Sample Data

  • Statistics are about definitions!
      • How is visible minority status identified in the Census?
      • Are aboriginals among the visible minority in Canada?
      • What is the definition of visible minority?

  • Statistics involve classifications
    The definitions that shape statistics specify the metric of the data they summarize (for example, Canadian dollars) or, if a statistic represents counts or frequencies, the categories used to classify things. In this latter case, classification systems are used to identify categories of membership in a concept's definition.
    Some classification systems are based on standards while others are based on convention or practice.
    For an example of a standard, see the North American Industry Classification System (NAICS).
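    A minimal sketch of how a classification system shapes a count-based statistic follows. The job titles and both classification schemes are hypothetical, made up for illustration, and are not actual NAICS categories or codes. The same raw observations are counted twice, once under a detailed scheme and once under a coarser one.

```python
from collections import Counter

# Hypothetical detailed scheme: raw job title -> industry category.
# Invented for illustration; these are not actual NAICS categories.
DETAILED = {
    "farmer": "Agriculture",
    "rancher": "Agriculture",
    "carpenter": "Construction",
    "electrician": "Construction",
    "teacher": "Education",
}

# Hypothetical coarser scheme covering the same job titles.
COARSE = {
    "farmer": "Goods-producing",
    "rancher": "Goods-producing",
    "carpenter": "Goods-producing",
    "electrician": "Goods-producing",
    "teacher": "Services-producing",
}


def count_by_category(observations, scheme):
    """Count observations per category; anything the scheme omits is 'Unclassified'."""
    return Counter(scheme.get(obs, "Unclassified") for obs in observations)


# Invented raw observations (the data).
jobs = ["farmer", "teacher", "carpenter", "rancher", "electrician", "teacher", "poet"]

detailed_counts = count_by_category(jobs, DETAILED)
coarse_counts = count_by_category(jobs, COARSE)

print(dict(detailed_counts))
print(dict(coarse_counts))
```

    The data never change between the two runs; only the classification does, and so do the published counts. This is why it matters to know which classification stands behind a table.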

  • Statistics are presentation ready
    Tables and charts (or graphs) are typically used to display many statistics at once. You will also find statistics sprinkled in text as part of a narrative describing some phenomenon, but tables and charts are the primary methods of organizing and presenting statistics.

  • A quick review
    To this point, we have established that:
      • Statistics are real only if they are derived from data;
      • Statistics are dependent on definitions of the concepts they summarize;
      • Statistics that represent counts of things in the data employ classification systems, which are based either on standards or convention; and
      • Statistics are typically organized for display using tables or charts.

  • Characteristics of statistics
    To discover some additional characteristics of statistics, we will examine a table published by Statistics Canada about the average undergraduate tuition fees for full-time students by field of study.
    While this table does not display all of the information that I want to find in a published statistical table, it is fairly comprehensive.
    Refer to the handout entitled "Tips for Reading a Statistical Table" for a full list of the information that I do want to find in a statistical table.

  • What about data?
    It is helpful to understand some basics about the origins of data, especially since statistics are derived from data. As we will see later, having a good understanding of data can greatly help in the search for statistics.
    There are three generic methods by which data are produced. Statistics are generated from the data produced by all of these methods.
      • Observational methods
      • Experimental methods
      • Computational methods

  • Methods producing data

  • Methods producing data
    A particular discipline or field of study will tend to be dominated by one of these three methods, although outputs may also exist from the other two methods. Consequently, the knowledge disseminated within a field is often fairly homogeneous in the way statistical information is used and reported.
    We will see later how knowing the method from which data are derived and the life cycle in which statistics are produced can help in the search for statistics.

  • Reference interview
    When a patron is looking for quantitative information, begin by determining if she/he is looking for statistics or data.
    Never take at face value that a patron asking for data really wants data.
    Probe by asking follow-up questions based on the characteristics that distinguish statistics from data. What are some questions that you might ask to help determine if you should be looking for statistics or data?

  • Chart of numeric information
    Okay, we determined that our patron wants statistics. What next?

  • Chart of numeric information
    We behave like a government publications librarian and begin by filtering on the basis of official or non-official status for the statistics being sought.

  • Official vs. non-official statistics
    Official statistics are those produced by national agencies with a public mandate (such as Statistics Canada or the Office for National Statistics in the UK) and international organizations with mandates from other governments (such as the UN).
    Non-official statistics are produced by all other bodies, including trade associations, professional organizations, banks, consultants, marketing companies, newspapers, research institutes, etc.

  • Official statistics: no one definition
    No single definition of official statistics exists upon which national statistical agencies agree. Consulting some of the national statistical agencies, you will find official statistics characterized as a critical element of open and accountable government (NZ), as different viewpoints of what statistics are (UK) and as a process for establishing a fitness for use (Canada).

  • Official statistics (NZ)
    Official statistics are statistics produced by government agencies to:
      • shed light on economic and social conditions
      • develop, implement and monitor policies
      • inform decision making, debate and discussion both within government and the wider community
    Government and its administrative arms need official statistics for policy development, implementation and evaluation. The public at large have similar information needs in order to evaluate government policy, to ensure public accountability, and to be adequately informed about social and economic conditions.

  • Official statistics (UK)
    Official statistics can mean different things to different people. There are three broad ways of defining it. First, it may be defined in terms of people providing the service (e.g., the Government Statistical Service). Second, it may be defined in terms of activities (e.g., collecting data, publishing statistics, providing statistical advice to support policy work). Third, it may be defined in terms of outputs, or products of statistical work (e.g., the published statistics on the labour market, on crime, on health, etc.).
    Source: Statistics: A Matter of Trust

  • Official statistics (Canada)
    There is no standard definition among statistical agencies for the term official statistics. There is a generally accepted, but evolving, range of quality issues underlying the concept of 'fitness for use'. These elements of quality need to be considered and balanced in the design and implementation of an agency's statistical program.

  Official statistics go through a formal process to be created and released. Definitions of concepts are a critical aspect of the process as well as the methodologies for collecting and producing the statistics.


