Marketing Research (MB MK 02) - MBA, III Sem, UPTU syllabus
Syllabus of Unit I
1. Marketing Research –
   1. Definition,
   2. Scope,
   3. Significance,
   4. Limitations,
   5. Obstacles in acceptance.
2. Ethics in marketing research.
3. Marketing Intelligence system
4. Research process
5. Management dilemma (problem) –
   1. Decision problem
   2. Research problem
6. Hypothesis statement –
   1. Characteristics of a good hypothesis
7. Drafting the research proposal.
8. Various sources of market Information –
   1. Methods of collecting Market Information
   2. Secondary data – sources – problems of fit and accuracy.
9. Syndicated services.
04/10/2023 Kartikeya Singh 2
Unit 1. Marketing Research
a). Definition
• “Marketing research is the systematic gathering, recording and analyzing of data about problems relating to the marketing of goods and services.”
• “Market research will give you the data you need to identify and reach your target market at a price customers are willing to pay.”
b). Scope
• The scope of Marketing Research could cover business problems relating to the following:
– Types of consumers that comprise present and potential markets.
– Buying habits and purchasing habits.
– Size and location of different markets, not only in India but also overseas.
– New mantras for emerging markets.
– Marketing and manufacturing capabilities of competitors.
– Most suitable entry timing.
– Optimum use of promotion tools.
– Chances of improvement in current channels.
– Pricing strategy.
b). Scope
• Market Research
• Product Research
• Sales related Research
• Packaging Research
• Advertising Research
• Business Economic Research
• Promotion Research
• Distribution Research
• Consumer Research
• Pricing Research
c).Significance
• A manager takes decisions.
• His responsibility is to reduce the risk of failure in decision making.
• Risk arises due to lack of relevant information.
• A manager always seeks information to improve the quality of decision making.
• Information can be collected through MR.
• Hence, MR is an important tool for managerial decision making.
d).Disadvantages/Limitations
• Disadvantages of Market Research
– Information is only as good as the methodology used.
– Can be inaccurate or unreliable.
– Results may not be what the business wants to hear!
– May stifle initiative and ‘gut feeling’.
– Always a problem that we may never know enough to be sure!
e).Obstacles in Acceptance
A} A narrow conception of Marketing Research.
B} Uneven Caliber of Marketing Researchers
C} Late and Occasionally Erroneous Findings by Marketing Research
D} Personality and presentation Differences
2. Ethics in Marketing Research
• Relating to Respondents
• Relating to Clients
• Relating to Research Firms
• Relating to Research Professionals
3.Marketing Intelligence System
• “A Market Intelligence System (MkIS) is one that systematically gathers and processes critical business information, transforming it into actionable Management intelligence for marketing decisions”.
3.Marketing Intelligence System
• Market and customer orientation
• Identification of new opportunities
• Early warning of competitor moves
• Minimizing investment risk
• Better customer interaction.
• Better market selection & positioning.
• Quicker, more efficient and cost-effective information
4. Research Process
1. Define the Problem
2. Develop an Approach to the Problem
• Type of Study? Exploratory, Descriptive, Causal?
• Mgmt & Research Questions, Hypotheses
3. Formulate a Research Design
• Methodology
• Questionnaire Design
4. Fieldwork/Data collection.
5. Prepare & Analyze the Data
6. Prepare & Present the Report
4. Research Process-Simplified
1. Identifying and Defining the Research Problem
2. Conducting Survey
3. Formulating hypothesis
4. Creating Research Design
5. Determining the data Need
6. Sample Selection
7. Designing Questionnaire
8. Selection and Training of Field Staff
9. Collection of Data
10. Data Processing
11. Data Analysis and Interpretation
12. Preparation of Research Report
13. Follow up
5. Management Dilemma
a). Decision Problem
• Research Problem: A research problem must contain the following:
i. An individual or an organization which has the problem.
ii. Some objective/goal to be attained.
iii. The researcher should have some doubt regarding the selection among possible alternatives.
iv. There must be some environment/condition to which the difficulty pertains.
v. There should be some alternative courses of action through which the objectives can be attained.
5.(b) Research Problem
• A well-defined research problem keeps the researcher on the right path, whereas an ill-defined problem may create difficulties. In a real sense, formulation of a problem is often more essential than its solution.
7.Hypothesis Statement
The word hypothesis is a compound of two words ‘hypo’ and ‘thesis’. Hypo means under or below and thesis means a reasoned theory or a viewpoint.
• “A hypothesis is an attempt at explanation: a provisional supposition made in order to explain scientifically some fact or phenomenon.” - Coffey
• “Hypothesis is a summary which is temporary and imaginary related to subject of study.” - George Caswell
• “Hypothesis is a proposition which can be put to test to determine its validity”- Good and Hatt
7.(a) Characteristics of Good Hypothesis
• Guidance
• Clarity
• Not in Exaggerative Language
• Temporary Solution
• Connectivity with Main problem
• Specialization
• Scientific and Meaningful
• Related with Theories
7.Research Proposal
• A research proposal is a document written by a researcher that describes in detail the program for a proposed scientific investigation.
• A research proposal is a document written by a researcher that provides a detailed description of the proposed program. It is like an outline of the entire research process that gives a reader a summary of the information discussed in a project.
7. Research Proposal
• The Issue: What problem does the researcher address?
• Benefit: What will the research contribute to the existing knowledge?
• Research Design: How will the research achieve its stated objective?
7.Drafting a Research Proposal
1. Brief paragraph – what the research is about.
2. Background to the topic
3. Why this research is important and necessary, and what is new about it.
4. Detail what the research is about – aims and objectives of the research.
5. If a research design exists, give a glimpse of it (optional).
6. Methodology – how you are going to carry out this research.
7. Limitations, if any.
8. Various Sources of Market Information
• Importance of Market Information.
– Anticipation of consumer demand
– Complexity of Marketing
– Significance of Economic Indicators
– Significance of Competition
– Development of Science and Technology
– Consumerism
– Marketing Planning
– Information explosion.
8.Various Sources of Market Information
• Marketing Information:- “Marketing research is the function which links the consumer, customer and public to the marketer through information.”
• Data – “Recorded experience that is useful for decision making.”
• Characteristics:
– Accurate
– Current
– Sufficient
– Available
– Relevant
8. Various Sources of Market Information.
a) Methods of collecting Market Information
• Questionnaire Method
• In this method, the respondent is questioned directly about his attitudes, opinions etc.
• Observation Method
• In this method the respondent is simply observed and his actions are recorded. This is done by using mechanical devices or by physically watching them.
8. Various Sources of Market Information.
a) Methods of collecting Market Information
Questionnaire Method:-
• A questionnaire is simply a formalized schedule to obtain and record specific and relevant information with accuracy.
• A questionnaire has five functions to perform:
– Give the respondent clear comprehension of the questions.
– Assurance of confidentiality.
– Stimulate responses through introspection, using memory or reference to records.
– Give instructions on what is required and the way it should be answered.
– Identify what needs to be known to classify and verify the interview.
8. Various Sources of Market Information.
a) Methods of collecting Market Information
Questionnaire Method:-
• Eight steps in designing a questionnaire:-
1. Determine the specific data to be collected
2. Determine Interview Process
3. Then evaluation of the questionnaire content
4. Decide on Question Content
   i. Open ended
   ii. Closed ended
      a) Dichotomous
      b) Ranking
      c) Checklist
      d) Multiple Choice Questions
      e) Scales
Cont…….
8. Various Sources of Market Information.
a) Methods of collecting Market Information
• Questionnaire Method:- Contd….
• Eight steps in designing a questionnaire:-
5. Determine the wording of the questions.
   I. Use simple language.
   II. Use familiar words.
   III. Avoid using lengthy questions.
   IV. Be as specific as possible.
6. Questionnaire structure: sequence of questions.
7. Determine the physical characteristics of the form.
8. Pretest, Revision and Final Draft.
8. Various Sources of Market Information.
a) Methods of collecting Market Information
Questionnaire Method:- Contd….
Advantage:-
a. The questionnaire method has the capacity to address or deal with all types/aspects of research problems.
b. This method can maintain confidentiality of answers and hence respondents can freely express themselves.
c. The processing of data can be fast if the questionnaire is well structured.
d. It is less time consuming and less expensive than the observation method.
e. It is very structured and thus leaves little room for manipulation or incorrect recording by the interviewer or respondent.
Disadvantage:-
a. A saturation level has been reached: respondents are now unwilling to fill in questionnaires and can hardly spare the time.
b. If a questionnaire is not well designed, it generates incomplete information.
c. Interviewers who are not well trained can spoil a good questionnaire.
d. The respondent may not fill in a questionnaire properly if he has to tax his memory too much.
e. Lack of Time.
f. Lack of Interest.
8. Various Sources of Market Information.
a) Methods of collecting Market Information
Observation Method:-
• Another method used for gathering research data is observing a respondent's overt behavior. Observation is used to obtain information on both past and present behavior of people.
• Observation may be used either solely or in conjunction with some other method.
• The potential components in this form must be evaluated on the basis of four criteria to determine what exactly is to be observed:
I. Who should be observed?
II. What should be observed?
III. When should the observation take place?
IV. What should be the expected path?
Equipment used for it:- eye camera, pupilometric camera, psychogalvanometer, video camera, CCTV etc.
8. Various Sources of Market Information.
a) Methods of collecting Market Information
Advantage:-
I. It is objective and accurate.
II. It eliminates the subjective element faced in questionnaires.
III. The willingness of the respondents does not matter, as the respondent is not aware of being observed.
IV. It is very useful for respondents who have difficulty communicating.
Disadvantage:-
I. The action observed may not necessarily be the one shown in actual normal circumstances.
II. It is very expensive and time consuming to set up and undertake observation studies.
III. It can’t yield information on state of mind, motives etc.
IV. The observer must be properly selected and trained, as the data collection depends upon the skill of the observer.
V. Time constraint.
VI. Confidentiality.
Survey Method:-
8. Various Sources of Market Information.
b) Secondary data – sources.
a) Secondary Data:- “Data collected by someone else for a purpose other than solving the problem being investigated”. Secondary data can be collected through
I. Internal and
II. External sources.
   a) Government Sources
   b) Business References
   c) Commercial Research Agencies
8. Various Sources of Market Information.
c) Problems of fit and accuracy.
• It is not enough to know the purpose behind the data collection; it is also necessary to know how the data was collected.
• Secondary data suffers from a major limitation of obsolescence. The utility of secondary data diminishes with time.
• Secondary data may be available, but it is not necessarily relevant.
• The classification bases used in the secondary data often do not coincide with those of the present study.
• Locating appropriate sources of secondary data is a time-consuming affair.
• One cannot always be sure of the accuracy of secondary data.
9. Syndicated services.
• Syndicated services may be regarded as an ‘intermediate’ source falling between primary and secondary sources of data. Syndicated services are normally designed to suit the requirements of many individual firms. Such services are particularly useful in the spheres of T.V. viewing, magazine readership and the movement of consumer goods through retail outlets.
• Syndicated services are provided by certain organizations, which collect and tabulate marketing information on a continuous basis. Organizations providing syndicated services may also engage themselves in other types of research work for their clients. However, such organizations usually confine themselves to this activity alone.
Assignment - 1
1. Define Market Research. State its significance and Limitations.
2. What do you understand by the term ethics in Market Research?
3. Elaborate Research Process in detail with suitable example.
4. How is a hypothesis different from a research proposal?
5. What are the various sources of information? Discuss them in detail.
• Date of Submission – 27th September, 2013
Case Study-JD sports
• CASE STUDY-MARKET RESEARCH-JD SPORTS.pdf
End of Unit I
Unit II
Syllabus of Unit II
1. Marketing research techniques:
2. Market development research:
   I. Cool hunting – socio cultural trends,
   II. Demand estimation research,
   III. Test marketing,
   IV. Segmentation Research - Cluster analysis,
   V. Discriminant analysis.
3. Sales forecasting –
   I. objective and
   II. subjective methods
4. Marketing Mix Research:
   I. Concept testing,
   II. Brand Equity Research,
   III. Brand name testing,
5. Commercial eye tracking:
   I. package designs,
   II. Conjoint analysis,
   III. Multidimensional scaling
   IV. positioning research,
   V. Pricing Research,
   VI. Shop and retail audits,
6. Advertising Research
   I. Copy Testing,
   II. Readership surveys and viewership surveys,
   III. Ad tracking,
   IV. Viral marketing research.
1. Research Techniques
• Ad Tracking – periodic or continuous in-market research to monitor a brand’s performance using measures such as brand awareness, brand preference, and product usage. (Young, 2005)
• Advertising Research – used to predict copy testing or track the efficacy of advertisements for any medium, measured by the ad’s ability to get attention (measured with Attention Tracking), communicate the message, build the brand’s image, and motivate the consumer to purchase the product or service. (Young, 2005)
• Brand equity research - how favorably do consumers view the brand?
• Brand association research - what do consumers associate with the brand?
• Brand attribute research - what are the key traits that describe the brand promise?
• Brand name testing - what do consumers feel about the names of the products?
• Commercial eye tracking research - examine advertisements, package designs, websites, etc. by analyzing visual behavior of the consumer
• Concept testing - to test the acceptance of a concept by target consumers
1. Research techniques
• Cool hunting - to make observations and predictions about changes in new or existing cultural trends in areas such as fashion, music, films, television, youth culture and lifestyle
• Buyer decision making process research - to determine what motivates people to buy and what decision-making process they use; over the last decade, neuromarketing emerged from the convergence of neuroscience and marketing, aiming to understand the consumer decision making process
• Copy testing – predicts in-market performance of an ad before it airs by analyzing audience levels of attention, brand linkage, motivation, entertainment, and communication, as well as breaking down the ad’s flow of attention and flow of emotion.
• Customer satisfaction research - quantitative or qualitative studies that yield an understanding of a customer's satisfaction with a transaction
• Demand estimation - to determine the approximate level of demand for the product
• Distribution channel audits - to assess distributors’ and retailers’ attitudes toward a product, brand, or company
• Internet strategic intelligence - searching for customer opinions on the Internet: chats, forums, web pages, blogs... where people express themselves freely about their experiences with products, becoming strong opinion formers.
1. Research Techniques
• Marketing effectiveness and analytics - building models and measuring results to determine the effectiveness of individual marketing activities.
• Mystery consumer or mystery shopping - an employee or representative of the market research firm anonymously contacts a salesperson and indicates he or she is shopping for a product. The shopper then records the entire experience. This method is often used for quality control or for researching competitors' products.
• Positioning research - how does the target market see the brand relative to competitors? What does the brand stand for?
• Price elasticity testing - to determine how sensitive customers are to price changes
• Sales forecasting - to determine the expected level of sales given the level of demand, with respect to other factors like advertising expenditure, sales promotion etc.
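Price elasticity testing, listed above, can be illustrated with a small calculation. The sketch below uses the arc (midpoint) elasticity formula, which is one common way to measure price sensitivity; the function name and the price and sales figures are hypothetical, not taken from the slides.

```python
def arc_price_elasticity(p_old, q_old, p_new, q_new):
    """Arc (midpoint) price elasticity of demand between two price points."""
    # Percentage changes are measured against the midpoint of old and new values.
    pct_change_q = (q_new - q_old) / ((q_new + q_old) / 2)
    pct_change_p = (p_new - p_old) / ((p_new + p_old) / 2)
    return pct_change_q / pct_change_p


# Hypothetical price test: raising the price from 10 to 12
# cuts weekly sales from 100 to 80 units.
elasticity = arc_price_elasticity(10, 100, 12, 80)
print(round(elasticity, 2))  # about -1.22
```

A magnitude above 1 suggests that, in this made-up test, demand is elastic: quantity falls proportionally more than price rises.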
1. Research Techniques
• Segmentation research - to determine the demographic, psychographic, and behavioral characteristics of potential buyers
• Online panel - a group of individuals who have agreed to respond to marketing research
• Store audit - to measure the sales of a product or product line at a statistically selected store sample in order to determine market share, or to determine whether a retail store provides adequate service
• Test marketing - a small-scale product launch used to determine the likely acceptance of the product when it is introduced into a wider market
• Viral Marketing Research - refers to marketing research designed to estimate the probability that specific communications will be transmitted throughout an individual's Social Network. Estimates of Social Networking Potential (SNP) are combined with estimates of selling effectiveness to estimate ROI on specific combinations of messages and media.
2.(a)Cool Hunting.
• The practice of observing current trends and predicting where the youth demographic will shift in trends in the immediate future.
• A term coined in the 90’s referring to marketing firms who looked to design and develop the newest trends.
• The marketing firms then sell these ideas to retail establishments, which use them to earn more profits.
2.(a)Cool Hunting.
• The “hot new designs” influence…
• Art (ex. Magic Poster, Window Pictures, Wall Paint Colors)
• Retail Merchandise (ex. Tights, Baggy pants, Khakis, Knee Length Socks, Not Socks)
• Music (ex. Reggae, Punk, Techno, European, Alternative)
• Shoes (ex. Knee length boots, Low Rise Sneakers, Design your own shoes)
• Gaming (ex. Puzzle Solving games, War Rally Games, Real Life Solutions Gaming)
• Travel (ex. Costa Rica, Thailand, Backpacking)
2.(a)Cool Hunting.
Alpha Consumer: A term used by marketers to define the “cool people” setting trends within their peer group. Usually the alpha consumer is setting this trend a year before it is mainstreamed.
Urban Pioneers: People who are established in music, fashion, film, marketing, and advertising.
2.(b)Demand estimation research
• The decision-making task has become difficult and extremely important
• The need of the hour for a manager is to know the behavior of the market related variables, their interrelationship and future movement
• Demand estimation attempts to quantify the links between the level of demand for a product and the variables which determine it, whereas demand forecasting simply attempts to predict the level of sales at some particular future date.
2.(b) Demand estimation research - Methods of Demand Estimation
Demand Estimation
• Qualitative Methods: Consumer Survey, Market Experiment
• Quantitative Methods: Statistical Method, Model specification, Statistical Models
2.(b)Demand estimation research- Methods of Demand Estimation
Qualitative Method
• Consumer Survey.
• Firms can obtain information regarding their demand functions by using interviews and questionnaires, asking questions about buying habits, motives and intentions.
• These can be quick on-the-street interviews, or in-depth ones.
2.(b)Demand estimation research- Methods of Demand Estimation
Qualitative Method
Advantage
• They give up-to-date information reflecting the current business environment.
• Much useful information can be obtained that would be difficult to uncover in other ways.
• Firms can also establish product characteristics that are important to the buyer.
Disadvantage
• Validity: Consumers often find it difficult to answer hypothetical questions, and sometimes they will deliberately mislead the interviewer to give the answer they think the interviewer wants.
• Reliability: It is difficult to collect precise quantitative data by such means.
• Sample bias: Those responding to questions may not be typical consumers.
2.(b)Demand estimation research- Methods of Demand Estimation
Qualitative Method
• Market experiments:
• Laboratory experiments or consumer clinics seek to test consumer reactions to changes in variables in the demand function in a controlled environment.
• Consumers are normally given small amounts of money and allowed to choose how to spend this on different goods at prices that are varied by the investigator.
• However, such experiments have to be set up very carefully to obtain valid and reliable results; the knowledge of being in an artificial environment can affect consumer behavior.
2.(b) Demand estimation research- Methods of Demand Estimation
Qualitative Method
Advantage
• Gives direct feedback about customer interest.
• Customers are able to act in a simulated atmosphere, so their interest level can be known immediately.
• Direct observation of the consumers takes place, rather than reliance on a hypothetical theoretical model.
Disadvantage
• There is less control in this case.
• The number of variations is larger.
• Experiments may have to be long-lasting.
2.(b) Demand estimation research- Methods of Demand Estimation
Quantitative Method
Model specification
• In order to understand this we must first distinguish a statistical relationship from a deterministic relationship. The latter are relationships known with certainty, for example the relationship among revenue, price and quantity:
• R = P × Q; if P and Q are known, R can be determined exactly.
• Statistical relationships are much more common in economics and involve an element of uncertainty. The deterministic relationship is considered first.
2.(b) Demand estimation research- Methods of Demand Estimation
Quantitative Method
Mathematical models:
• It is assumed to begin with that the relationship is deterministic. With a simple demand curve the relationship would therefore be:
• Q = f(P)
• If we are also interested in how sales are affected by the past price, the model might in general become:
• Q_t = f(P_t, P_{t-1})
2.(b) Demand estimation research- Methods of Demand Estimation
Quantitative Method
• Statistical models
In practice we can very rarely specify an economic relationship exactly. Models by their nature involve simplifications; in the demand situation we cannot hope to include all the relevant variables on the right hand side of the equation, for a number of reasons:
1. We may not know from a theoretical viewpoint what variables are relevant in affecting the demand for a particular product.
2. The information may not be available, or may be impossible to obtain. An example might be the marketing expenditures of rival firms.
3. It may be too costly to obtain the relevant information. For example, it might be possible to obtain information relating to the income of customers, but it would take too much time (and may not be reliable).
2.(b) Demand estimation research- Methods of Demand Estimation
Quantitative Method
Statistical Method
• In a perfect relationship the points would exactly fit a straight line, or some other regular curve. In practice they do not, so we have to specify the relationship in statistical terms, using a residual term to allow for the influence of omitted variables. This is shown for the linear form as follows:
• Q_i = a + b P_i + d_i
• where d_i represents a residual term. Thus, even if P is known, we cannot predict Q with complete accuracy because we do not know for any observation what the size or direction of the residual will be.
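A minimal sketch of how the linear form Q_i = a + b P_i + d_i might be estimated: ordinary least squares is assumed here as the fitting method (the slides do not name one), and the price and quantity observations are hypothetical. The function returns the intercept a, the slope b and the residuals d_i.

```python
def fit_linear_demand(prices, quantities):
    """Estimate a and b in Q = a + b*P by ordinary least squares."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_q = sum(quantities) / n
    # Slope: co-variation of P and Q divided by the variation of P.
    s_pq = sum((p - mean_p) * (q - mean_q) for p, q in zip(prices, quantities))
    s_pp = sum((p - mean_p) ** 2 for p in prices)
    b = s_pq / s_pp
    a = mean_q - b * mean_p
    # Residual d_i: the part of each observation the line does not explain.
    residuals = [q - (a + b * p) for p, q in zip(prices, quantities)]
    return a, b, residuals


# Hypothetical observations of price and quantity sold.
prices = [10, 12, 14, 16, 18]
quantities = [98, 91, 79, 72, 60]

a, b, residuals = fit_linear_demand(prices, quantities)
print(a, b)        # 146.5 -4.75
print(residuals)   # [-1.0, 1.5, -1.0, 1.5, -1.0]
```

The non-zero residuals show why Q cannot be predicted exactly from P alone, even though the fitted line captures the downward-sloping relationship.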
2.(c) Test marketing
• Test marketing is a research technique which is used when the proposed product and the marketing programme for the same is tried out for the first time with a small sample size in the potential market.
• Test marketing is defined as “A controlled experiment done in a limited but carefully selected part of the market place, whose aim is to predict the sales or profit consequences in absolute or relative terms of one or more proposed marketing actions”
2.(c) Test marketing
• Features
– It helps to get information and experience with the marketing programme before finalizing the plans and making a total commitment to it.
– It helps to predict the programme's outcome when it is applied to the total market.
– It is costly/expensive.
– It is time consuming.
– It allows the competitors to view your new product or your test marketing mix.
– The test market should be large enough to provide meaningful results and should demographically represent the actual population.
2.(c) Test marketing
Methods of Test Marketing
• Consumer Goods Test Marketing:
– Purchase frequency
– Trial purchase
– Repeat purchase
• Sales Wave Research:
– The product is offered free for trial and then charged for. The process is repeated 3-4 times.
– When the customer makes the choice to purchase
– How they get an advantage over their competitor.
• Simulated Test Marketing:
– Simulation is an imitation of a real world situation.
– Advertisement shown > money provided > store purchase behavior noticed > compared with the competitor.
• Controlled Test Marketing:
– Shelf position, display method, point of purchase, pricing etc.
2.(d) Segmentation Research – Cluster Analysis
Cluster analysis is used to classify persons or objects into a smaller number of mutually exclusive and exhaustive groups. There should be high internal (within-cluster) homogeneity and high external (between-cluster) heterogeneity. Cluster analysis has been increasingly used in marketing research due to its utility in resolving the problem of classifying consumers, products etc.
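As an illustration of grouping consumers into mutually exclusive clusters, here is a small sketch using k-means, which is one of several clustering techniques and is chosen here only as an assumption (the slides do not name an algorithm). The consumer data (purchases per month, average spend) and the choice of two clusters are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20):
    """Plain k-means: every point ends up in exactly one of k clusters."""
    # Deterministic start for the sketch: the first k points act as centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical consumers: (purchases per month, average spend per purchase).
consumers = [(2, 10), (3, 12), (2, 11), (20, 95), (22, 100), (21, 98)]
labels, centroids = k_means(consumers, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In real segmentation work the variables would usually be standardized first so that no single variable dominates the distance measure.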
2.(e). Discriminant analysis
• A discriminant analysis enables the researcher to classify the person or objects into two or more categories.
• For Example, consumers may be classified as heavy and light users.
• With the help of such techniques, it is possible to predict the mutually exclusive categories or classes in which individuals are likely to be included.
• In recent years, discriminant analysis has been used by marketing researchers for identifying new product buyers, determining brand loyalty among customers etc.
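The heavy/light user example can be sketched in a deliberately simplified form. With a single predictor, equal group sizes assumed in the population and a common spread in both groups, a linear discriminant rule reduces to a cutoff halfway between the two group means. The variable (purchases per month), the data and the function names below are hypothetical, and real studies normally use several predictors.

```python
def fit_cutoff(light_values, heavy_values):
    """Cutoff halfway between the two group means (one-predictor case)."""
    mean_light = sum(light_values) / len(light_values)
    mean_heavy = sum(heavy_values) / len(heavy_values)
    return (mean_light + mean_heavy) / 2


def classify(value, cutoff):
    """Assign a consumer to exactly one category; assumes heavy users score higher."""
    return "heavy" if value > cutoff else "light"


# Hypothetical purchases per month for consumers already known to be light or heavy users.
light_users = [2, 3, 4, 3]
heavy_users = [9, 11, 10, 10]

cutoff = fit_cutoff(light_users, heavy_users)
print(cutoff)               # 6.5
print(classify(8, cutoff))  # heavy
print(classify(5, cutoff))  # light
```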
3. Sales forecasting – a). Objective and Subjective methods
• Sales analysis:- Sales analysis enables a company to identify the areas where its sales performance has been good or mediocre, customers who have bought in bulk, products with high and low sales volume etc.
• A systematic, comprehensive and periodical sales analysis will be helpful to a company to reinforce its sales effort where it is most needed. In this way, it can achieve the best possible results.
3. Sales forecasting – a). Objective and Subjective methods
• Sales analysis by Territory
• Sales analysis by Product
• Sales analysis by Customer
• Sales analysis by Order
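The four breakdowns above amount to totalling the same sales records along different dimensions. A minimal sketch, assuming each order is stored as a dictionary with hypothetical field names and made-up amounts:

```python
from collections import defaultdict


def sales_by(records, key):
    """Total the sales amount for each distinct value of the chosen dimension."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


# Hypothetical order records; field names and figures are illustrative only.
orders = [
    {"territory": "North", "product": "A", "customer": "C1", "amount": 1200.0},
    {"territory": "North", "product": "B", "customer": "C2", "amount": 300.0},
    {"territory": "South", "product": "A", "customer": "C3", "amount": 450.0},
    {"territory": "South", "product": "A", "customer": "C1", "amount": 800.0},
]

by_territory = sales_by(orders, "territory")
by_product = sales_by(orders, "product")
by_customer = sales_by(orders, "customer")
print(by_territory)  # {'North': 1500.0, 'South': 1250.0}
print(by_product)    # {'A': 2450.0, 'B': 300.0}
```

Comparing the totals across territories, products or customers points to where sales effort may need reinforcing.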
3. Sales forecasting – a). Objective and Subjective methods
• The concept of Market Potential.
– Market Potential has been defined as “the maximum demand response possible for a given group of customers within a well-defined geographic area for a given product or service over a specified period of time under well-defined competitive and environmental conditions”
3. Sales forecasting – a). Objective and Subjective methods
• Methods of Estimating Current Demand:
• “Total market potential is the maximum amount of sales that might be available to all the firms in an industry during a given period under a given level of industry marketing effort and given environment conditions”
• Symbolically, total market potential is
• Q = n × q × p
– Q = total market potential
– n = number of buyers in the specific product/market under the given assumptions
– q = quantity purchased by an average buyer
– p = price of an average unit.
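A short worked example of Q = n × q × p; the buyer count, quantity and price below are hypothetical figures chosen only to show the arithmetic.

```python
def total_market_potential(n_buyers, quantity_per_buyer, average_price):
    """Q = n * q * p: total market potential in currency units."""
    return n_buyers * quantity_per_buyer * average_price


# Hypothetical market: 100,000 buyers, each buying 3 units a year at an average price of 150.
Q = total_market_potential(100_000, 3, 150.0)
print(Q)  # 45000000.0
```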
3. Sales forecasting – a). Objective and Subjective methods
• Process of predicting a future event based on historical data
• Educated guessing
• Underlying basis of all business decisions: Production, Inventory, Personnel, Facilities
• What is Sales forecasting
• Importance
• Forecasting Process
3. Sales forecasting – a). Objective and Subjective methods
• Predict the next number in the pattern:
a) 3.7, 3.7, 3.7, 3.7, 3.7,
b) 2.5, 4.5, 6.5, 8.5, 10.5,
c) 5.0, 7.5, 6.0, 4.5, 7.0, 9.5, 8.0, 6.5,
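One way to approach the three patterns is with three simple time-series rules: a naive forecast for a flat series, a trend forecast for a steadily rising series, and a seasonal forecast with trend for a repeating pattern. This is only a sketch; the function names are illustrative, and the season length of 4 for series c) is an assumption read off the data.

```python
def naive_forecast(series):
    """Flat series: the next value equals the last observed value."""
    return series[-1]


def trend_forecast(series):
    """Trending series: last value plus the average period-to-period change."""
    average_change = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + average_change


def seasonal_trend_forecast(series, season_length):
    """Repeating pattern: value one season ago plus the average season-over-season change."""
    changes = [
        series[i] - series[i - season_length]
        for i in range(season_length, len(series))
    ]
    average_change = sum(changes) / len(changes)
    return series[-season_length] + average_change


series_a = [3.7, 3.7, 3.7, 3.7, 3.7]
series_b = [2.5, 4.5, 6.5, 8.5, 10.5]
series_c = [5.0, 7.5, 6.0, 4.5, 7.0, 9.5, 8.0, 6.5]

next_a = naive_forecast(series_a)
next_b = trend_forecast(series_b)
next_c = seasonal_trend_forecast(series_c, season_length=4)
print(next_a, next_b, next_c)  # 3.7 12.5 9.0
```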
3. Sales forecasting – a). Objective and Subjective methods
Importance:
a) It keeps the firm prepared for contingencies.
b) It is a tool that helps budgeting for the entire firm.
c) Forecasting is necessary for planning for an uncertain future in different areas of the economy.
d) It supports effective planning by providing a scientific and reliable basis for anticipating future operations such as production, inventory, supply of capital and so on.
e) It reduces the area of uncertainty that surrounds management decision-making with respect to cost, production, profits, pricing, etc.
f) Making and reviewing forecasts on a continuous basis compels managers to think ahead and to search for the best possible decisions with a dynamic approach.
g) It enables efficient managerial control, as a sales forecast is a must in order to control the costs of production and the productivity of personnel.
3. Sales forecasting – a). Objective and Subjective methods.
• Short-range forecast – usually < 3 months
• Job scheduling, worker assignments
• Medium-range forecast – 3 months to 2 years
• Sales/production planning
• Long-range forecast – > 2 years
• New product planning
[Figure labels: Design of system; Detailed use of system; Quantitative methods; Qualitative methods]
3. Sales forecasting – a). Objective and Subjective methods.
[Figure: product life cycle, Sales plotted against Time through the stages Introduction, Growth, Maturity and Decline]
Quantitative models
- Time series analysis
- Regression analysis
Qualitative models
- Executive judgment
- Market research
- Survey of sales force
- Delphi method
3. Sales forecasting – a). Objective and Subjective methods.
Methods of Forecasting
Subjective or Qualitative
Field Sales Force
Jury of Executives
Users Expectations
The Delphi Method
Objective or Quantitative
Causal
Subjective Methods:
• An important advantage of subjective methods is that they are easily understood.
• Another advantage is that the cost involved in forecasting is quite low.
• One major limitation is the varying perceptions of people involved in forecasting. As a result, wide variance is found in forecasts.
• Subjective methods may be more suitable in the case of highly technical products which have a limited number of customers.
3. Sales forecasting – a). Subjective methods.
Subjective Method
a) Field Sales Force
b) Jury of Executives
c) Users Expectations
d) The Delphi Method
Field Sales Force
• Some companies ask their salesmen to indicate the most likely sales for a specified period in the future.
• Usually each salesman is asked to indicate anticipated sales for each account in his territory. These forecasts are checked by district managers, who forward them to the company’s head office. The territory forecasts are then combined into a composite forecast at the head office. This method is more suitable for short-term forecasts, as major changes affecting the forecast are unlikely within a short period.
• Advantage
– The sales force is directly involved, so feedback comes straight from the field.
• Disadvantage
– The sales force may not take an overall or broad perspective.
– The sales force may give a somewhat low figure.
3. Sales forecasting – a). Subjective methods.
Jury of Executives:
• Some companies prefer to assign the task of sales forecasting to executives instead of the sales force. Given this task, each executive makes his forecast for the next period. Since each has his own assessment of the environment and other relevant factors, one forecast is likely to differ from another.
• To narrow down the differences in the forecasts, a discussion between the executives is sometimes organized so that they can arrive at a common forecast. If this is not possible, the chief executive may have to decide which of these forecasts is acceptable as a representative one.
• Advantage:
– It draws on a large base of executives to reach a final consensus.
• Disadvantage:
– Opinions may be influenced by current market conditions.
3. Sales forecasting – a). Subjective methods.
Users Expectations
• Forecasts can be based on users' expectations or intentions to purchase goods and services.
• It is difficult to use this method when the number of users is large.
• Another limitation of this method is that, though it indicates users' intentions to buy, the actual purchases in a subsequent period may be far less.
• It is most suitable when the number of buyers is small, as in the case of industrial products.
3. Sales forecasting – a). Subjective methods.
The Delphi Method:
• This method is based on expert opinion. Each expert has access to the same information. A feedback system generally keeps them informed of each other's forecasts, but no majority opinion is disclosed to them. The experts are not brought together, to ensure that one or more vocal experts do not dominate the others.
• The experts are given an opportunity to compare their own previous forecasts with those of the others and revise them. After three or four rounds, the group of experts arrives at a final forecast.
• The method may involve a large number of experts, and this may delay the forecast considerably. Generally, it involves a small number of participants.
3. Sales forecasting – b).Objective methods.
• Quantitative or Objective Method
1. Causal or Explanatory Methods
Causal or explanatory methods are regarded as the most sophisticated methods of forecasting. These methods yield realistic forecasts provided relevant data are available on the major variables influencing changes in sales. There are three distinct advantages of causal methods.
First, turning points in sales can be predicted more accurately by these methods than by time-series methods.
Second, the use of these methods reduces the magnitude of the random component far more than may be possible with time-series methods.
Third, the use of such methods provides greater insight into causal relationships. This facilitates management decision-making.
Quantitative or Objective Method
1. Causal or Explanatory Methods
a) Leading Indicators
b) Regression Models
c) Input-Output Analysis
d) Econometric Models
2. Time Series
a) Free hand
b) Trend Projection
c) Exponential Smoothing
d) Autoregressive Model
e) Box Jenkins Model
3. Sales forecasting – b).Objective methods.
a) Leading Indicators:
• Sometimes one finds that changes in sales of a particular product or service are preceded by changes in one or more leading indicators. In such cases, it is necessary to identify leading indicators and to closely observe changes in them.
• One example of leading indicators is the demand for various household appliances, which follows the construction of new houses.
• Likewise, the demand for many durables is preceded by an increase in disposable income.
• Yet another example is the number of births. The demand for baby food and other goods for infants can be ascertained from the number of births in a territory. It may be possible to include leading indicators in regression models.
3. Sales forecasting – b).Objective methods.
b). Regression Models:
• Linear regression analysis is perhaps the most frequently used and the most powerful method among causal methods.
• Regression models indicate linear relationships only within the range of observations and at the times when they were made.
• Sometimes there may be a lagged relationship between the dependent and independent variables.
• The data required to establish the ideal relationship may not exist, may be inaccessible or, if available, may not be useful.
• Finally, a regression model reflects association among variables; association by itself does not establish causation.
3. Sales forecasting – b).Objective methods.
c). Input and Output Method:
• The analyst takes into consideration a large number of factors which affect the outputs he is trying to forecast.
• For this purpose, an input-output table is prepared where the inputs are shown horizontally as the column headings and the outputs vertically as the stubs. It may be mentioned that by themselves input-output flows are of little direct use to the analyst.
• The use of input-output analysis in sales forecasting is appropriate for products sold to governmental, institutional and industrial markets, as they have distinct patterns of usage. It is seldom used for consumer products and services.
• A major constraint in the use of this method is that it needs extensive data for a large number of items, which may not be easily available.
3. Sales forecasting – b).Objective methods.
d). Econometric Models:
• Econometrics is concerned with the use of statistical and mathematical techniques to verify hypotheses emerging from economic theory. An econometric model incorporates functional relationships estimated by these techniques into an internally consistent and logically self-contained framework. The use of econometric models is generally found at the macro level, such as forecasting national income and its components.
• Such models show how the economy or any specific segment operates. As compared to an ordinary regression equation, they bring out the causalities involved more distinctly. This merit of econometric models enables them to predict turning points more accurately. However, their use at the micro level for forecasting has so far been extremely limited.
3. Sales forecasting – b).Objective methods.
• Time Series: Values taken by a variable over time (such as daily sales revenue, weekly orders, monthly overheads, yearly income) and tabulated or plotted as chronologically ordered numbers or data points.
• To yield valid statistical inferences, these values must be repeatedly measured, often over a four to five year period. Time series consist of four components:
I. Seasonal variations that repeat over a specific period such as a day, week, month, season, etc.,
II. Trend variations that move up or down in a reasonably predictable pattern,
III. Cyclical variations that correspond with business or economic 'boom-bust' cycles or follow their own peculiar cycles, and
IV. Random variations that do not fall under any of the above three classifications.
3. Sales forecasting – b).Objective methods.
a). Freehand Method: One of the methods of getting a secular trend is the freehand method.
It may be mentioned that it is the simplest method of finding the trend line, which is simply extended for forecast.
It is a highly subjective method, as the trend line fitted to the same set of data will vary from one person to another. As such, it is the most inappropriate method to use for forecasting.
3. Sales forecasting – b).Objective methods.
b). Trend Projection: The trend is forecast simply by substituting the appropriate value of t (i.e. the year for which the forecast is desired) in the least squares line.
In case the data are monthly or quarterly, this value is to be multiplied by the seasonal index.
Finally we measure the cyclical component and try to ascertain what it is likely to be at the point for which forecast is being made.
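The trend-projection step described above can be sketched as follows. This is a minimal illustration, not the course's own worked solution: it fits a least squares line to the 1995 to 2000 sales figures given later in this unit, coding the years as t = 0, 1, 2, and so on (an assumed convention), and substitutes the next value of t. The helper names `fit_trend` and `project` are hypothetical, and the seasonal index defaults to 1.0 because the data are annual.

```python
def fit_trend(values):
    """Fit the least squares line y = a + b*t, with t = 0, 1, 2, ... per period."""
    n = len(values)
    t = range(n)
    t_mean = sum(t) / n
    y_mean = sum(values) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, values))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    b = sxy / sxx              # slope: average change in sales per period
    a = y_mean - b * t_mean    # intercept: trend value at t = 0
    return a, b


def project(values, periods_ahead=1, seasonal_index=1.0):
    """Substitute the future t into the trend line, then apply a seasonal index."""
    a, b = fit_trend(values)
    t_future = len(values) - 1 + periods_ahead
    return (a + b * t_future) * seasonal_index


sales = [15, 24, 15, 20, 22, 28]   # annual sales, 1995 to 2000
a, b = fit_trend(sales)
forecast_2001 = project(sales)     # t = 6 corresponds to 2001
print(round(a, 2), round(b, 2), round(forecast_2001, 2))
```

For monthly or quarterly data, the `seasonal_index` argument would carry the index for the period being forecast; the cyclical component is not modelled in this sketch.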
3. Sales forecasting – b).Objective methods.
c). Exponential Smoothing: When a large number of forecasts are to be made for a number of items, exponential smoothing is particularly suitable, as it combines the advantages of simplicity of computation and flexibility. It may be used for short-term forecasts (one period into the future), particularly when there is no long-term trend in the time series or when the trend is not clear.
This method applies differential weights to time-series data. The heaviest weight is assigned to the most recent data and the least weight to the most remote data in the time series. It is a type of moving average that “smooths” the sharp variations out of the time series.
3. Sales forecasting – b).Objective methods.
• Exponential Smoothing:
The formula used for exponential smoothing is based on three terms:
i) The present observed value of the time series, Y_i
ii) The previously computed exponentially smoothed value, E_{i-1}
iii) A subjectively assigned weighting factor or smoothing coefficient, W.
Thus, the formula is
E_i = W Y_i + (1 - W) E_{i-1}
E_i = value of the exponentially smoothed series being computed in time period i.
E_{i-1} = value of the exponentially smoothed series computed in the preceding time period i-1.
Y_i = observed value of the time series in period i.
W = subjectively assigned weight whose value is between 0 and 1.
3. Sales forecasting – b).Objective methods.
Sales data of a firm for the years 1995 to 2000 are given below
1995 15
1996 24
1997 15
1998 20
1999 22
2000 28
Exponentially Smoothed Values of Sales of a Business Firm
Year    Sales (million Rs)    W=0.5    W=0.3
1995    15                    15       15
1996    24                    19.50    17.70
1997    15                    17.25    16.89
1998    20                    18.63    17.82
1999    22                    20.32    19.07
2000    28                    24.16    21.75
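The smoothed columns can be recomputed with a short sketch of the recurrence, in which each smoothed value is W times the current observation plus (1 - W) times the previous smoothed value. The function name is hypothetical, and starting the smoothed series at the first observation is an assumption read off the table (both columns begin at 15).

```python
def exponential_smoothing(series, w):
    """Return the exponentially smoothed series for a weight w between 0 and 1."""
    smoothed = [float(series[0])]   # assumption: first smoothed value = first observation
    for y in series[1:]:
        # new value = w * current observation + (1 - w) * previous smoothed value
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


sales = [15, 24, 15, 20, 22, 28]    # million Rs, 1995 to 2000
for w in (0.5, 0.3):
    print(w, [round(e, 2) for e in exponential_smoothing(sales, w)])
```

Because this sketch carries full precision from one period to the next, a value can differ from the table in the last decimal place where the table appears to have rounded each step before continuing.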
3. Sales forecasting – b).Objective methods.
• Autoregressive model:
• Sometimes the values of a time series are highly correlated with the values that precede and succeed them. In such cases an autoregressive model is used for forecasting.
• The first-order autoregressive model may be expressed as
• Ŷ_i = b_0 + b_1 Y_{i-1}
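A first-order autoregressive model can be estimated by regressing each value on the one before it. The sketch below does this with ordinary least squares on the 1995 to 2000 sales figures from this unit; the function name `fit_ar1` is hypothetical, and six observations are far too few for a dependable fit, so the output only illustrates the mechanics.

```python
def fit_ar1(series):
    """Estimate b0 and b1 in Y_i = b0 + b1 * Y_{i-1} by ordinary least squares."""
    x = series[:-1]    # lagged values Y_{i-1}
    y = series[1:]     # current values Y_i
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


sales = [15, 24, 15, 20, 22, 28]
b0, b1 = fit_ar1(sales)
next_value = b0 + b1 * sales[-1]   # one-step-ahead forecast from the last observation
print(round(b0, 2), round(b1, 2), round(next_value, 2))
```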
3. Sales forecasting – b).Objective methods.
• Box-Jenkins Method: The analyst identifies a tentative model considering the nature of the past data. This tentative model and the data are entered into the computer. The Box-Jenkins programme then gives the values of the parameters included in the model. A diagnostic check is then conducted to find out whether the model gives an adequate description of the data. If the model satisfies the analyst in this respect, it is used to make the forecast.
4. Marketing Mix Research – a). Concept Testing
• Concept testing (or market testing) is the process of using quantitative methods and qualitative methods to evaluate consumer response to a product idea prior to the introduction of a product to the market.
• It can also be used to generate communication designed to alter consumer attitudes toward existing products.
• Such methods are commonly referred to as concept testing and have been performed using field surveys, personal interviews and focus groups, in combination with various quantitative methods, to generate and evaluate product concepts.
4. Marketing Mix Research – b). Brand Equity Research:
• A brand is a “name, term, sign, symbol, or design, or a combination of them intended to identify the goods and services of one seller or group of sellers and to differentiate them from those of competition.”
– American Marketing Association
4. Marketing Mix Research – b). Brand Equity Research:
• Brand equity is the added value endowed on products and services. This value may be reflected in how consumers think, feel, and act with respect to the brand, as well as in the prices, market share and profitability that the brand commands for the firm. Brand equity is an important intangible asset that has psychological and financial value to the firm.
4. Marketing Mix Research – b). Brand Equity Research:
• Brand equity research measures your brand value. We use leading edge brand equity research models and quantitative marketing research tools to tailor each client firm's research analysis study.
• Brand equity research studies support branding strategy programs
• Brand Base Research
• Brand Qualitative Research
• Brand Quantitative Research
4. Marketing Mix Research – c). Brand Name Testing
• Develop Your Brand Strategy
• Research the Market, Competitors, and Consumers
• Identify the Message Your Brand Should Communicate
• Brainstorm without Judging
• Create a Short List
• Trademark and Domain Name Availability Search
• Create a Shorter Short List
• Develop Brand Marketing Mock-ups
• Test Your Brand Marketing Mock-ups
• Roll out and Monitor Your Brand
4. Marketing Mix Research – c). Brand Name Testing
• So how should names be researched?
• Here are just a few thoughts; research companies may not respond well to this kind of heresy. Everyone will have their own methodology and you will need to decide if it can adapt easily to names. However, these principles apply whether you are talking to real people or their ‘avatar’.
• Think about what you are testing. This will help to keep research simple.
• Allow the audience to concentrate on the names. Don’t let them be distracted by elements which potentially cloud their judgement.
• Don’t waste your time producing unnecessary stimulus. Instead, you should be testing the strength of prospective names before entering into design (unless of course, you have unlimited budget!).
• Don’t waste time on too many names. If you’ve done your job, you will already have narrowed the list to a manageable size – say, six words. If you can’t be decisive, use a group to screen out then use the subsequent ones to dig deeper.
• Listen out for consumers playing back to you the criteria you’ve used all the way through the development process – then you’ll know that you’ve asked the right questions.
4. Marketing Mix Research – c). Brand Name Testing
• Put your thoughts into context (without getting caught up in the detail of design work). For example, set the scene with what your product is and does – not the price and pack size. You recruited these people because they’re your target, but they don’t have to like the product to tell you that the name isn’t right for it. You could also tell them about the personality of your brand, what its story is, because then they’ve got something to relate the names back to, not just a product or a usage occasion.
• Don’t let the respondents read the words until they have heard them. The consumer should react to the name, not just to words on a page. Get them to say them out loud – to test ease of pronunciation. (You won’t hear this in an on-line test, so how will you know if it works?).
• Remember the core idea? Which of the names best fits the story?
• Don’t worry about how many people like the name – this is a brand, you want stronger emotions than ‘like’. You want and need people to sit up and take notice; a groan, a laugh. That doesn’t mean that they have to like it. And a name can be right for all the wrong reasons.
• Research is for your guidance and reassurance, not necessarily for cut-and-dried decisions.
• And finally, don’t believe them when they say they don’t like it. If you ask the right questions and prompt them the right way, you may find that the first name they heard – and hated – is actually the one they think works best for the brand! Brand names seep into our consciousness, they do not always bang us over the head. Give them time to fall in love.
4. Marketing Mix Research – d). Commercial Eye Tracking
• Determining what a user looks at. Using sophisticated equipment, eye tracking follows the eye movements of a person looking at any visual such as a printed ad, an application's user interface or a page on a Web site. It is used to analyze the usability and effectiveness of the layout.
4. Marketing Mix Research – e). Package Designs
I. Protect
II. Inform
III. Contain
IV. Transport
V. Preserve
VI. Display
I. Captures Attention
II. Offers First Impression
III. Provides Information
IV. Aids Purchasing
V. Addresses Needs in Global Markets
VI. Meets Legal Requirements
4. Marketing Mix Research – f). Conjoint analysis
• Technique that allows a subset of the possible combinations of product features to be used to determine the relative importance of each feature in the purchase decision
• Conjoint analysis is an advanced multivariate technique that helps to identify what consumers value most in making decisions.
4. Marketing Mix Research – f). Conjoint analysis
Attitudes towards dishwashing products
1. Clean: glass/dishes clean
2. Shiny: glass/dishes shiny
3. Smell: non-perfumed/lemon fresh/intensive lemon fresh
4. Quantity: small/medium/x-large
5. Packaging: loose in box/tab in plastic/tab in dissolving plastic
6. Design: single/multi-colored/multi-colored + ball
• Advantage:
• Estimates psychological tradeoffs that consumers make when evaluating several attributes together
• Measures preferences at the individual level
• Uncovers real or hidden drivers which may not be apparent to the respondents themselves
• Realistic choice or shopping task
• Able to use physical objects
• If appropriately designed, the ability to model interactions between attributes can be used to develop needs-based segmentation
4. Marketing Mix Research – g). Multidimensional Analysis
• Multidimensional scaling (MDS) is a class of procedures for representing perceptions and preferences of respondents spatially by means of a visual display.
• Perceived or psychological relationships among stimuli are represented as geometric relationships among points in a multidimensional space.
• These geometric representations are often called spatial maps. The axes of the spatial map are assumed to denote the psychological bases or underlying dimensions respondents use to form perceptions and preferences for stimuli.
4. Marketing Mix Research – g). Multidimensional Analysis
Statistics and Terms Associated with MDS
• Spatial map. Perceived relationships among brands or other stimuli are represented as geometric relationships among points in a multidimensional space called a spatial map.
• Coordinates. Coordinates indicate the positioning of a brand or a stimulus in a spatial map.
• Unfolding. The representation of both brands and respondents as points in the same space is referred to as unfolding.
4. Marketing Mix Research – g). Multidimensional Analysis
Conducting Multidimensional Scaling
Formulate the Problem
Obtain Input Data
Decide on the Number of Dimensions
Select an MDS Procedure
Label the Dimensions and Interpret the Configuration
Assess Reliability and Validity
4. Marketing Mix Research – g). Multidimensional Analysis
i). Formulate the Problem
• Specify the purpose for which the MDS results would be used.
• Select the brands or other stimuli to be included in the analysis. The number of brands or stimuli selected normally varies between 8 and 25.
• The choice of the number and specific brands or stimuli to be included should be based on the statement of the marketing research problem, theory, and the judgment of the researcher.
4. Marketing Mix Research – g). Multidimensional Analysis
ii). Input Data for Multidimensional Scaling
[Figure: MDS Input Data, divided into Perceptions and Preferences; perception data are either Direct (Similarity Judgments) or Derived (Attribute Ratings)]
• Perception Data: Direct Approaches. In direct approaches to gathering perception data, the respondents are asked to judge how similar or dissimilar the various brands or stimuli are, using their own criteria. These data are referred to as similarity judgments.
                          Very                       Very
                          Dissimilar                 Similar
Crest vs. Colgate         1    2    3    4    5    6    7
Aqua-Fresh vs. Crest      1    2    3    4    5    6    7
Crest vs. Aim             1    2    3    4    5    6    7
...
Colgate vs. Aqua-Fresh    1    2    3    4    5    6    7
• The number of pairs to be evaluated is n (n -1)/2, where n is the number of stimuli.
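To see how quickly the respondent's task grows, the count n(n - 1)/2 can be checked by listing every pair. The sketch below uses the ten toothpaste brands from this example and the standard-library function `itertools.combinations`.

```python
from itertools import combinations

brands = [
    "Aqua-Fresh", "Crest", "Colgate", "Aim", "Gleem",
    "Macleans", "Ultra Brite", "Close-Up", "Pepsodent", "Dentagard",
]

# Every unordered pair of brands that a respondent would be asked to rate.
pairs = list(combinations(brands, 2))

n = len(brands)
print(len(pairs), n * (n - 1) // 2)  # 45 45
```

With 25 stimuli, the upper end of the range mentioned earlier, the same formula gives 300 pairs, which is one reason the number of stimuli is kept limited.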
4. Marketing Mix Research – g). Multidimensional Analysis
ii). Conducting Multidimensional Scaling Obtain Input Data
Similarity Rating of Toothpaste Brands
              Aqua-Fresh  Crest  Colgate  Aim  Gleem  Macleans  Ultra Brite  Close-Up  Pepsodent  Dentagard
Aqua-Fresh
Crest         5
Colgate       6           7
Aim           4           6      6
Gleem         2           3      4        5
Macleans      3           3      4        4    5
Ultra Brite   2           2      2        3    5      5
Close-Up      2           2      2        2    6      5         6
Pepsodent     2           2      2        2    6      6         7            6
Dentagard     1           2      4        2    4      3         3            4         3
• Perception Data: Derived Approaches. Derived approaches to collecting perception data are attribute-based approaches requiring the respondents to rate the brands or stimuli on the identified attributes using semantic differential or Likert scales.
Whitens teeth ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ Does not whiten teeth
Prevents tooth decay ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ Does not prevent tooth decay
...
Pleasant tasting ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ Unpleasant tasting
• If attribute ratings are obtained, a similarity measure is derived for each pair of brands.
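One common way to derive such a measure is to treat each brand's ratings as a point and compute the distance between points. The sketch below uses Euclidean distance, which is an assumption since the slides do not name a specific measure. The ratings are hypothetical values on an assumed 1 to 7 scale for three attributes (whitens teeth, prevents tooth decay, pleasant tasting).

```python
from itertools import combinations
from math import sqrt

# Hypothetical attribute ratings (assumed 1-7 scale), one list per brand:
# [whitens teeth, prevents tooth decay, pleasant tasting]
ratings = {
    "Crest": [6, 7, 5],
    "Colgate": [6, 6, 5],
    "Aim": [3, 4, 6],
}


def distance(a, b):
    """Euclidean distance between two rating profiles (smaller = more similar)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# One derived dissimilarity per pair of brands.
dissimilarity = {
    (p, q): distance(ratings[p], ratings[q]) for p, q in combinations(ratings, 2)
}
for pair, d in dissimilarity.items():
    print(pair, round(d, 2))
```

The resulting pairwise values play the same role as directly collected similarity judgments when they are passed to an MDS procedure.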
Conducting Multidimensional Scaling: Obtain Input Data
[Figure: A Spatial Map of Toothpaste Brands. Two axes, each running from -2.0 to 2.0, with points for Aqua-Fresh, Crest, Colgate, Aim, Gleem, Macleans, Ultrabrite, Close Up, Pepsodent and Dentagard]
[Figure: Using Attribute Vectors to Label Dimensions. The spatial map of toothpaste brands (axes from -2.0 to 2.0) with three attribute vectors added: Fights Cavities, Whitens Teeth and Cleans Stains]
Conducting Multidimensional Scaling: Assess Reliability and Validity
• Stimuli can be selectively eliminated from the input data and the solutions determined for the remaining stimuli.
• A random error term could be added to the input data. The resulting data are subjected to MDS analysis and the solutions compared.
• The input data could be collected at two different points in time and the test-retest reliability determined.
[Figure: External Analysis of Preference Data. The spatial map of toothpaste brands (axes from -2.0 to 2.0) with an Ideal Point added]
Assumptions and Limitations of MDS
• It is assumed that the similarity of stimulus A to B is the same as the similarity of stimulus B to A.
• MDS assumes that the distance (similarity) between two stimuli is some function of their partial similarities on each of several perceptual dimensions.
• When a spatial map is obtained, it is assumed that interpoint distances are ratio scaled and that the axes of the map are multidimensional interval scaled.
• A limitation of MDS is that dimension interpretation relating physical changes in brands or stimuli to changes in the perceptual map is difficult at best.
4. Marketing Mix Research – h). Positioning Research
• The first component is the product class or the structure of the market in which a company's brand will compete.
• The second component is consumer segmentation.
• The third component is the consumers' perception of the company’s brand in relation to those of the competitors.
• The fourth component of positioning is the benefit offered by the company’s brand.
4. Marketing Mix Research – i). Pricing Research
Market Segmentation.
Estimate of Demand.
The Market Share.
The Marketing Mix.
Estimate of Costs.
Pricing Strategy.
The price Structure.
4. Marketing Mix Research – i). Pricing Research
• Market Segmentation
– Type of product to be produced or sold
– The kind of service to be rendered
– The costs of operations, to be estimated.
– The type of customers or market segments sought.
4. Marketing Mix Research – i). Pricing Research
• Estimate of Demand:
– Marketers will estimate total demand for the products. It will be based on sales forecast, channel opinions and degree of competition in the market
• Market Share:
– Marketer will choose a brand image and the desired market share on the basis of competitive reaction. Market planners must know exactly what
4. Marketing Mix Research – j). Shop and Retail Audits
• Normally, the retailer would like research studies to cover
• Trade area analysis,
• Store image,
• Customer perception studies,
• In-store traffic pattern,
• Location analysis
• End of Unit II
• Unit III
Syllabus of Unit III
I. Marketing effectiveness and analytics research:
a) Customer Satisfaction Measurement,
b) Mystery shopping,
c) Market and Sales Analysis.
II. Exploratory designs
III. Descriptive designs
I. Longitudinal and cross-sectional analysis.
IV. Qualitative research techniques –
a) Based on questioning: Focus groups, Depth interviews, Projective techniques.
b) Based on observations: ethnography, grounded theory,
c) Participant observation.
V. Causal research –
a) Basic experimental designs
b) Internal and external validity of experiments.
I. Marketing effectiveness and analytics research:
• Marketing expert Tony Lennon believes marketing effectiveness is quintessential to marketing, going so far as to say, “It's not marketing if it's not measured.”
• Dimensions of marketing effectiveness:
– Corporate – Each company operates within different bounds. These are determined by its size, its budget and its ability to make organizational changes.
– Consumer – Groups of consumers act in similar ways, leading to the need to segment them. Based on these segments, they make choices based on how they value the attributes of a product and the brand, in return for the price paid for the product.
– Exogenous Factors – There are many factors outside of our immediate control that can impact the effectiveness of our marketing activities. These can include the weather, interest rates, government regulations and many others.
I. Marketing effectiveness and analytics research:
• Factors driving marketing effectiveness:
– Marketing Strategy – Marketing effectiveness can be improved by employing a superior marketing strategy. By positioning the product or brand correctly, the product/brand will be more successful in the market than competitors’ products/brands.
– Marketing Creative – Even without a change in strategy, better creative work can improve results.
– Marketing Execution – By improving how they go to market, marketers can achieve significantly greater results without changing their strategy or their creative execution. At the marketing mix level, marketers can improve their execution by making small changes in any or all of the 4 Ps (Product, Price, Place and Promotion). Without making changes to the strategic position or the creative execution, marketers can improve their effectiveness and deliver increased revenue.
– Marketing Infrastructure (also known as Marketing Management) – Improving the business of marketing can lead to significant gains for the company. Management of agencies, budgeting, motivation and coordination of marketing activities can lead to improved competitiveness and improved results.
– Exogenous Factors – Generally out of the control of marketers, external or exogenous factors also influence how marketers can improve their results.
Customer satisfaction measurement
“The customer you lose holds information you need to succeed.”
Frederick F. Reichheld
Measures of customer satisfaction
• Overall customer satisfaction with the organization and its products / services
• Rating in the industry on the basis of overall customer satisfaction
• Satisfaction with value for money
• Desire to recommend the product or service to others
• Loyalty in terms of repeat purchases
Means of measuring customer satisfaction
I. Customer feedback after delivery of product or service
II. Customer complaints and suggestions
III. Customer surveys
I. Customer feedback after delivery of product or service
This is one of the simplest, fastest and most effective methods of measuring customer satisfaction. Customers should be asked immediately to evaluate the product or service and to comment on areas of satisfaction and dissatisfaction.
II. Customer complaints and suggestions
The organization must have a formalized system for recording all customer complaints as well as the methods of their disposal. Customer complaints must be taken positively as valuable inputs by the organization and should immediately trigger improvement activities.
III. Customer surveys
Steps in conducting customer surveys: -
A. Identify your customers’ requirements under various segments.
B. Determine your survey methodology
C. Develop survey / interview questions
D. Conduct survey / Interview your customers
A. Identify your customers’ requirement areas.
It is extremely important to know the requirements of your customers before designing a questionnaire or survey. If we do not ask the right questions, the answers we get will be irrelevant, and it will be difficult to find out whether customers are really satisfied with the issues that are important to them.
Ways to identify customer requirements
• Discuss the issue with a sample group of customers.
• Ask your existing customers, “If we have to develop a questionnaire to measure our customers’ satisfaction, what questions should we ask?”
• Brainstorm with employees from various functions within the organization. A cross section of ideas from various people will give a complete picture of the requirements of the customers.
Product requirements
For identifying the customer requirements for a PRODUCT, the survey must cover the following areas:
• Performance
• Timeliness
• Reliability
• Durability
• Serviceability
• Aesthetics
Service requirements
For identifying the customer requirements for a SERVICE, the survey must cover the following areas:
• Security
• Reliability
• Accessibility
• Timeliness
• Responsiveness
• Empathy
• Assurance
B. Determine your survey methodology
This requires the organization to answer the following questions :
• How many customers to survey?
• Whom to survey?
• How to survey?
• When to survey?
• Who should conduct the survey?
How many customers to survey
The basic rule behind sample selection is to choose a cross section of customers which represents your overall customer base. For example if your customer database consists of large, medium & small organizations, your sample must represent the same.
Other criteria for selecting may include percentage of frequent versus infrequent customers, industry sector & geographic area.
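The cross-section rule above can be illustrated with a small proportional (stratified) sampling sketch. The customer records, the `size` field and the function name `proportional_sample` are hypothetical and chosen only for illustration; the allocation rule (segment share times sample size, rounded) is one simple option among several.

```python
import random
from collections import defaultdict

def proportional_sample(customers, key, sample_size, seed=0):
    """Draw a sample whose segment mix mirrors the customer base."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for customer in customers:
        groups[customer[key]].append(customer)
    total = len(customers)
    sample = []
    for members in groups.values():
        # Allocate in proportion to the segment's share, at least one each.
        n = max(1, round(sample_size * len(members) / total))
        sample.extend(rng.sample(members, min(n, len(members))))
    return sample

# Hypothetical customer base: 10 large, 30 medium, 60 small organizations.
customers = (
    [{"id": i, "size": "large"} for i in range(10)]
    + [{"id": i, "size": "medium"} for i in range(10, 40)]
    + [{"id": i, "size": "small"} for i in range(40, 100)]
)
sample = proportional_sample(customers, "size", sample_size=20)
counts = {}
for customer in sample:
    counts[customer["size"]] = counts.get(customer["size"], 0) + 1
print(counts)
```

The same function could be called with another key, such as industry sector or geographic area, if those fields were present in the records.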
Whom to survey?
While conducting the survey, the organization must include the following customers:
• Present customers
• Potential customers
• Past customers
• Competitors’ customers
The customer sample must never be biased. Everyone wants to hear good things from customers and nobody wants to hear negative feedback. There is a natural tendency to include positive feedback and to exclude negative feedback, but this will never reflect the true measure of customer satisfaction. The organization must be willing to hear both the positive and the negative from its customers if it truly wants to improve customer satisfaction.
How to survey?
The following methods can be used for conducting the survey:
• Mail survey
• Telephonic surveys
• Face to face interviews
• Comment cards
The best method will depend on your situation, the number of customers in the sample group and what works best for your customers.
When to survey?
Survey at periodic intervals: Many organizations prefer to conduct customer satisfaction measurement surveys at a certain time of the year. This, however, has certain disadvantages. If the period of the survey is widely known, it can signal the time for enhanced services to customers during that period. Marketing personnel may also distribute questionnaires to customers during these periods. Such conduct is open to all sorts of bias, and this practice should be discouraged and avoided.
Surveying continuously: More and more organizations are moving towards continuous measurement of customer satisfaction due to the turbulent and dynamic marketing environment. Continuous measurement recognizes the ongoing importance of customer satisfaction and is not influenced by momentary events (good or bad). This method keeps the organization completely focused on customer satisfaction and does not allow it to be forgotten between survey waves.
Surveying after “moments of truth”: Moments of truth are any interactions with customers in which an organization’s effectiveness is tested. For example:
• Getting a car loan from the bank
• Settlement of insurance claims
• Receiving money from the cash counter of a bank
Every moment of truth can be followed up with a satisfaction survey to determine how well the organization has performed in this important interaction.
Who should conduct the survey?
The survey can be undertaken by the organization itself or it can be given to an outside agency. Getting the survey done by an outside professional agency has the following advantages:
• They are more objective in formulating questions & analyzing responses.
• Customers are more open when providing information to third parties.
• Professional agencies have the expertise to ensure that the process is productive & effective.
C. Develop survey questions
The organization must develop a pre-determined set of questions which must take into account all the requirements of the customers.
The questionnaire must give customers the impression that you are thorough and organized when gathering customer satisfaction information. The presentation and packaging of the questionnaire should not be shoddy. A good appearance suggests the organization’s high commitment to the customer satisfaction management process, and a poor one suggests the opposite.
Sample questionnaire - Airlines
Each item is rated as: Excellent / Good / Average
A. At the airport
• Waiting time for getting the boarding pass
• Behavior of the front desk executive
• Ready availability of information
• Time taken in identification of luggage
B. In-flight service
• Cabin crew’s welcome at the time of boarding the flight
• Availability of reading material
• Quality and quantity of food & beverages
• Quality of service
• Space in the aircraft to keep your hand baggage
• Responsiveness for special service asked for
• Cleanliness in the toilets
C. In-flight experience
• Timeliness of the flight
• In-flight experience with regard to:
– Noise level
– Temperature
– Ride and landing
– Flight ambience
• Overall ratings
• Your suggestions for improvement
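Responses on the Excellent / Good / Average scale can be summarized per question. The sketch below assumes a numeric mapping of 3, 2 and 1 for the three ratings and uses made-up answers; neither the mapping nor the data comes from the questionnaire itself.

```python
# Assumed scoring: the slides give only the labels, not numeric weights.
SCALE = {"Excellent": 3, "Good": 2, "Average": 1}

def mean_score(ratings):
    """Average numeric score for one questionnaire item."""
    return sum(SCALE[r] for r in ratings) / len(ratings)

# Hypothetical answers from three respondents to two airport items.
answers = {
    "Waiting time for getting the boarding pass": ["Excellent", "Good", "Good"],
    "Behavior of the front desk executive": ["Average", "Good", "Average"],
}
scores = {item: round(mean_score(r), 2) for item, r in answers.items()}
for item, score in scores.items():
    print(f"{item}: {score}")
```

Items with the lowest mean score are natural candidates for improvement activities.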
Customer feedback
Sample survey / feedback forms for consumer durables, consumer non-durables and service industry are given in MS Excel file “Feedback forms” given along with this package.
Advantages of a good survey
A well designed and executed customer satisfaction survey can be a great asset to any organization due to the following reasons:
• It can pinpoint expenditure and resources that are being spent but do not help to satisfy customers.
• It can identify opportunities for product and service innovation.
• It can ensure that quality improvement efforts are correctly focused on the issues that are most important to the customer.
Why do customer surveys fail?
Unfortunately, a well designed and executed survey tends to be the exception rather than the rule. The challenge of conducting a customer survey is to minimize the total amount of error. This error comes from two different sources:
A. Sampling errors
B. Measurement errors
Types of sampling errors
These errors deal with the manner in which people are selected for a survey. They are of the following types:
• Failing to use statistical sampling methods
• Incorrect selection of profile
• Incorrect selection of number of people
• Ignoring non-responses
Types of measurement errors
These errors are related to the content of the survey and the way in which the results are used. These mistakes deal with:
• Drawing incorrect inferences from the responses
• Asking non-specific questions
• Failing to ask all the questions
• Using incorrect or incomplete data analysis methods
• Error in feeding the results
1.b. Mystery shopping
• Mystery Shopping is a highly valuable performance tool that provides a clear, accurate and unbiased account of the interaction between your employees and your customers.
• It is a performance evaluation process that allows the owners and managers of service organisations to really understand how their customers are treated in their shops, offices or practices, on the telephone, in writing or online. It identifies the 'gap' between their service beliefs and the reality of the customer experience.
Process in mystery shopping:
Mystery shopper:
A mystery shopper is someone who is paid by the company to masquerade as a customer and discreetly measure the quality of service in its showrooms and front offices.
Mystery auditors often throw up startling facts and reveal huge room for improvement.
Mystery shoppers identify soft skills and intuitiveness as the key default areas among store staff in the country.
Typically, mystery auditors charge Rs. 1500-2000 for a small retail format store, with fees going up for bigger showrooms.
A mystery shopper spends 30-45 minutes reviewing a small retail shop; it could take 2-3 days to review a hotel.
1.c. Market and Sales Analysis
• Describe the goal of market analysis.
• Enumerate and classify the different dimensions of market analysis.
• Discuss the dimensions of market analysis and relate them to personal experiences and/or observations.
• Illustrate the value chain and experience curve.
1.c. Market and Sales Analysis
Goal of Market Analysis
• To determine the attractiveness of a market and to understand its evolving opportunities and threats as they relate to the strengths and weaknesses of the firm.
1.c. Market and Sales Analysis
Dimensions of Market Analysis
1. Market size (current and future)
2. Market growth rate
3. Market profitability
4. Industry cost structure
5. Distribution channel
6. Market trends
7. Key success factors
1.c. Market and Sales Analysis
Market Size
The size of the market can be evaluated based on:
• Present sales
• Potential sales (if expanded)
Some information sources for determining market size:
• Government data
• Trade associations
• Financial data from major players
• Customer survey
1.c. Market and Sales Analysis
Market Growth Rate
A simple means of forecasting the market growth rate is to extrapolate (infer or estimate) historical data into the future. While this method may provide a first-order estimate, it does not predict important turning points. A better method is to study growth drivers such as demographic information and sales growth in complementary products.
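The extrapolation approach can be sketched with a compound annual growth rate computed from historical sales. The sales figures and the function names `cagr` and `extrapolate` are hypothetical; as the text notes, such a projection is only a first-order estimate and cannot anticipate turning points.

```python
def cagr(first, last, periods):
    """Compound annual growth rate between the first and last observation."""
    return (last / first) ** (1 / periods) - 1

def extrapolate(last, rate, years_ahead):
    """Project the last observed value forward at a constant growth rate."""
    return [last * (1 + rate) ** k for k in range(1, years_ahead + 1)]

# Hypothetical annual market sales for three consecutive years.
history = [100.0, 110.0, 121.0]
rate = cagr(history[0], history[-1], len(history) - 1)
forecast = extrapolate(history[-1], rate, years_ahead=2)
print(f"growth rate: {rate:.1%}")
print([round(value, 2) for value in forecast])
```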
1.c. Market and Sales Analysis
Ultimately, the maturity and decline stages of the product life cycle will be reached. Some leading indicators of the decline phase include:
• Price pressure caused by competition
• Decrease in brand loyalty
• Emergence of substitute products
• Market saturation
• Lack of growth drivers
1.c. Market and Sales Analysis
Market Profitability
While different firms in the market will have different levels of profitability, the average profit potential for a market can be used as a guideline for knowing how difficult it is to make money in the market.
1.c. Market and Sales Analysis
Porter’s Five Competitive Forces
• Rivalry among Competitors: Internet blurs differences among competitors
• Threat of Substitute Products: Internet creates new substitution threats
• Potential New Entrants: Internet reduces barriers to entry
• Bargaining Power of Buyers: Internet shifts greater power to end consumers
• Bargaining Power of Suppliers: Internet tends to increase bargaining power of suppliers
1.c. Market and Sales Analysis
Industry Cost Structure
The cost structure is important for identifying key factors for success. To this end, Porter’s value chain model is useful for determining where value is added and for isolating the costs.
The cost structure also is helpful for formulating strategies to develop a competitive advantage. For example, in some environments the experience curve effect can be used to develop a cost advantage over competitors.
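The experience curve effect is commonly modelled as unit cost falling by a fixed percentage each time cumulative volume doubles. The sketch below uses that standard formulation with an assumed 80% curve and a made-up first-unit cost; the slides do not state a specific rate.

```python
import math

def unit_cost(first_unit_cost, cumulative_units, learning_rate=0.8):
    """Unit cost after a given cumulative volume on an experience curve.

    learning_rate=0.8 means cost drops to 80% with every doubling of volume
    (an assumed value for illustration).
    """
    exponent = math.log2(learning_rate)
    return first_unit_cost * cumulative_units ** exponent

# Hypothetical first-unit cost of 100.
for units in (1, 2, 4, 8):
    print(units, round(unit_cost(100.0, units), 2))
```

A firm with higher cumulative volume than its competitors sits further down this curve, which is the source of the cost advantage mentioned above.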
1.c. Market and Sales Analysis
Porter’s Generic Value Chain
• Support Activities: Infrastructure, Human Resource Management, Technology Development, Procurement
• Primary Activities: Inbound Logistics, Operations, Outbound Logistics, Marketing & Sales, Service
• Elapsed Time - Value added time cost
1.c. Market and Sales Analysis
Primary Value Chain Activities:
• Inbound Logistics: the receiving and warehousing of raw materials, and their distribution to manufacturing as they are required.
• Operations: the processes of transforming inputs into finished products and services.
• Outbound Logistics: the warehousing and distribution of finished goods.
• Marketing and Sales: the identification of customer needs and the generation of sales.
• Service: the support of customers after the products and services are sold to them.
1.c. Market and Sales Analysis
Supports of the Primary Activities:
• The infrastructure of the firm: organizational structure, control systems, company culture, etc.
• Human resource management: employee recruiting, hiring, training, development, and compensation.
• Technology development: technologies to support value-creating activities.
• Procurement: purchasing inputs such as materials, supplies, and equipment.
1.c. Market and Sales Analysis
Distribution Channel
The following aspects of the distribution system are useful in a market analysis:
• Existing distribution channels
– can be described by how direct they are to the customer.
• Trends and emerging channels
– new channels can offer the opportunity to develop a competitive advantage.
• Channel power structure
– for example, in the case of a product having little brand equity, retailers have negotiating power over manufacturers and can capture more margin.
1.c. Market and Sales Analysis
Market Trends
Changes in the market are important because they often are the source of new opportunities and threats. The relevant trends are industry-dependent, but some examples include changes in price sensitivity, demand for variety, and level of emphasis on service and support. Regional trends also may be relevant.
1.c. Market and Sales Analysis
Key Success Factors
Key success factors are elements that are necessary in order for the firm to achieve its marketing objectives. A few examples are:
– Access to essential unique resources
– Ability to achieve economies of scale
– Access to distribution channels
– Technological progress
It is important to consider that key success factors may change over time, especially as the product progresses through its life cycle.
Syllabus
2. Exploratory Research Design
3. Descriptive Research Design
4. Longitudinal and cross-sectional analysis.
• Research Design
Research Design
• A master plan that specifies the methods and procedures for collecting and analyzing needed information.
Research Design
• Exploratory Design
– Survey of Experts
– Pilot Surveys
– Secondary Data Research
• Conclusive Design
– Descriptive Research
Exploratory Research
• Usually conducted during the initial stage of the research process
• Purposes
– To narrow the scope of the research topic, and
– To transform ambiguous problems into well-defined ones
Research Design
• Exploratory
– Secondary Data Research
– Pilot Survey
– Survey of Experts
• Conclusive
– Descriptive
   • Cross Sectional
      – Single Cross Sectional
      – Multiple Cross Sectional
   • Longitudinal
– Causal
Exploratory Research Techniques
• Secondary Data Analysis
– Secondary data are data previously collected and assembled for some project other than the one at hand
• Pilot Studies
– A collective term for any small-scale exploratory research technique that uses sampling but does not apply rigorous standards
– Includes
   • Focus Group Interviews
      – Unstructured, free-flowing interview with a small group of people
   • Projective Techniques
      – Indirect means of questioning that enables a respondent to project beliefs and feelings onto a third party or an inanimate object
      – Word association tests, sentence completion tests, role playing
Exploratory Research Techniques
• Case Studies
– Intensively investigate one or a few situations similar to the problem situation
• Experience Surveys
– Individuals who are knowledgeable about a particular research problem are questioned
Conclusive Research
• Provides specific information that aids the decision maker in evaluating alternative courses of action
• Sound statistical methods and formal research methodologies are used to increase the reliability of the information
• Data sought tends to be specific and decisive
• Also more structured and formal than exploratory data
Types of Conclusive Research
• Descriptive Research:
– Describes attitudes, perceptions, characteristics, activities and situations.
– Examines who, what, when, where, why, and how questions
• Causal Research:
– Provides evidence that a cause-and-effect relationship exists or does not exist.
– Premise is that something (an independent variable) directly influences the behavior of something else (the dependent variable).
Common Characteristics of Descriptive Studies
• Build on previous information
• Show relationships between variables
• Representative samples required
• Structured research plans
• Require substantial resources
• Conclusive findings
Major Types of Descriptive Studies
• Consumer Perception and Behavior Studies
– Image
– Product Usage
– Advertising
– Pricing
• Market Characteristic Studies
– Distribution
– Competitive Analysis
• Sales Studies
– Market Potential
– Market Share
– Sales Analysis
Cross-sectional Designs
• Involve the collection of information from any given sample of population elements only once.
• In single cross-sectional designs, there is only one sample of respondents and information is obtained from this sample only once.
• In multiple cross-sectional designs, there are two or more samples of respondents, and information from each sample is obtained only once. Often, information from different samples is obtained at different times.
• Cohort analysis consists of a series of surveys conducted at appropriate time intervals, where the cohort serves as the basic unit of analysis. A cohort is a group of respondents who experience the same event within the same time interval.
Longitudinal Designs
• A fixed sample (or samples) of population elements is measured repeatedly on the same variables
• A longitudinal design differs from a cross-sectional design in that the sample or samples remain the same over time
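Because a longitudinal design re-measures the same respondents, it can reveal individual-level change that separate cross-sections would hide. The sketch below uses made-up brand choices for a four-person panel; the respondent IDs and brand labels are hypothetical.

```python
# Hypothetical panel: the same four respondents surveyed at T1 and T2.
t1 = {"r1": "A", "r2": "A", "r3": "B", "r4": "B"}
t2 = {"r1": "B", "r2": "A", "r3": "A", "r4": "B"}

def brand_share(wave, brand):
    """Share of respondents choosing a brand in one survey wave."""
    return sum(1 for choice in wave.values() if choice == brand) / len(wave)

def switchers(wave1, wave2):
    """Respondents whose choice changed between waves (needs the same sample)."""
    return sorted(r for r in wave1 if wave1[r] != wave2[r])

share_t1 = brand_share(t1, "A")
share_t2 = brand_share(t2, "A")
moved = switchers(t1, t2)
print(share_t1, share_t2, moved)
```

Brand A's share is identical in both waves, yet half the panel switched brands; only the fixed sample makes that visible.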
Cross Sectional vs. Longitudinal Designs
• Cross Sectional Design: sample surveyed at T1
• Longitudinal Design: sample surveyed at T1; same sample also surveyed at T2
(Figure: time axis running from T1 to T2)
Relative Advantages and Disadvantages of Longitudinal and Cross-Sectional Designs
Evaluation Criteria | Cross-Sectional Design | Longitudinal Design
Detecting change | - | +
Large amount of data collection | - | +
Accuracy | - | +
Representative sampling | + | -
Response bias | + | -
Note: A “+” indicates a relative advantage over the other design, whereas a “-” indicates a relative disadvantage.
Common Characteristics of Causal Studies
• Logical Time Sequence
– For causality to exist, the cause must either precede or occur simultaneously with the effect
• Concomitant Variation
– Extent to which the cause and effect vary together as hypothesized
• Control for Other Possible Causal Factors
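Concomitant variation is often checked with a correlation coefficient. The sketch below computes Pearson's r for made-up advertising and sales figures; the choice of Pearson's r and the data are illustrative assumptions, and a high value on its own does not establish causality without the time-sequence and control conditions listed above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

# Hypothetical data: advertising spend and sales over five periods.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]
r = pearson_r(ad_spend, sales)
print(round(r, 3))
```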
How Descriptive & Causal Designs Differ
• Relationship between the variables
– Descriptive designs determine degree of association
– Causal designs infer whether one or more variables influence another variable
• Degree of environmental control
– Descriptive designs enjoy lesser degrees of control
• Order of the variables
– In descriptive designs, variables are not logically ordered
Uses of Causal Research
• To understand which variables are the cause (independent variables) and which variables are the effect (dependent variables) of a phenomenon
• To determine the nature of the relationship between the causal variables and the effect to be predicted
• METHOD: Experiments
5. Qualitative research techniques –
1. Based on questioning: Focus groups, Depth interviews, Projective techniques.
2. Based on observations: ethnography, grounded theory,
3. Participant observation.
Qualitative Research
Qualitative research is a loosely defined term. It implies that the research findings are not determined by quantification or quantitative analysis.
Qualitative vs. Quantitative Research (1 of 2)
Comparison Dimension | Qualitative Research | Quantitative Research
Types of questions | Probing | Limited probing
Sample size | Small | Large
Information per respondent | Much | Varies
Administration | Requires interviewers with special skills | Fewer specialized skills required
Types of analysis | Subjective, interpretive | Statistical, summarization
Qualitative vs. Quantitative Research (2 of 2)
Comparison Dimension | Qualitative Research | Quantitative Research
Tools | Tape recorders, projection devices, video, pictures | Questionnaires, computers, printouts
Ability to replicate | Low | High
Training needed by the researcher | Psychology, sociology, social psychology, consumer behavior | Statistics, decision models, DSS, computer programming, marketing
Type of research | Exploratory | Descriptive or causal
Qualitative Research Methods
Include
• Depth Interviews
• Projective Techniques
• Focus Groups
• Observation (Ethnography)
… and other methods
Depth Interview
Example: Wide Seats in an Airplane
I: “Why do you like wide seats in an airplane?”
R: “It makes me comfortable.”
I: “Why is it important to be comfortable?”
R: “I can accomplish more.”
I: “Why is it important that you can accomplish more?”
R: “I feel good about myself.”
Implication: Wide seats may relate to self-esteem!
Projective Techniques
Eliciting deep-seated feelings/opinions by enabling the respondents to project themselves into unstructured situations.
• Word Association
• Sentence Completion
• Role playing
• Story telling with pictures
… and several others
Popularity of Focus Group Research
• Most marketing research firms, advertising agencies, and consumer goods manufacturers use focus groups.
• Focus groups tend to be used more extensively by consumer goods companies than by industrial goods organizations.
Focus Group
A group of people who discuss a subject under the direction of a moderator. Focus groups are used to:
• Spot the source of a marketing problem
• Spark new product ideas
• Develop questionnaires for quantitative research
• Identify new advertising themes
• Diagnose competitors’ strengths and weaknesses
Focus Group Research - Overview
The goal of focus group research is to learn and understand what people have to say and why
The emphasis is on getting people to talk at length and in detail about the subject at hand
The intent is to find out how they feel about a product, concept, idea, or organization, how it fits into their lives, and their emotional involvement with it
Benefits of Focus Group Research
• Synergy - together, the group can provide more insights than those obtained individually.
• Snowballing - a comment by one individual sets off a chain reaction.
• Stimulation - group interaction excites people.
• Spontaneity/serendipity - participants may get ideas on the spot and discuss them.
Focus Group Research - Steps
1. Define objectives of study
2. Develop questions for discussion - Moderator Guide
3. Recruit participants
4. Conduct Session with a moderator
5. Analyze and report results to decision makers
Results can be misleading if the focus group is not conducted properly.
Focus Group Issues (1 of 2)
• How many people in a focus group?
• What type of people should be recruited?
• Should participants be …
– Knowledgeable?
– Diverse?
– Representative?
Qualitative Research
Comparing Qualitative and Quantitative Methods
Before discussing the differences between qualitative and quantitative methodologies one must understand the foundational similarities.
Foundational Similarities
• All qualitative data can be measured and coded using quantitative methods.
• Quantitative research can be generated from qualitative inquiries.
• Example: One can code an open-ended interview with numbers that refer to data specific references, or such references could become the origin of a randomized experiment.
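Coding open-ended answers with numbers can be sketched as a simple keyword codebook. The codebook, its keywords and the sample responses below are hypothetical; real coding schemes are usually developed and checked by human coders.

```python
# Hypothetical codebook: numeric code -> keywords that signal the theme.
CODEBOOK = {
    1: ("price", "expensive", "cost"),  # pricing
    2: ("service", "staff"),            # service
    3: ("quality",),                    # product quality
}

def code_response(text, codebook=CODEBOOK):
    """Return the sorted list of numeric codes whose keywords appear in the text."""
    lowered = text.lower()
    return sorted(
        code
        for code, keywords in codebook.items()
        if any(keyword in lowered for keyword in keywords)
    )

responses = [
    "Too expensive for the quality",
    "Staff were friendly",
    "No comment",
]
coded = [code_response(r) for r in responses]
print(coded)
```

Once responses carry numeric codes, they can be counted and cross-tabulated like any other quantitative variable.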
Foundational Differences
• The major difference between qualitative and quantitative research stems from the researcher’s underlying strategies.
• Quantitative research is viewed as confirmatory and deductive in nature.
• Qualitative research is considered to be exploratory and inductive.
Qualitative Research
• Terminology
• Methods
• Strengths and weaknesses
Terminology
• Grounded theory
• Ethnography
• Phenomenology
• Field research
Grounded Theory
• Grounded theory refers to an inductive process of generating theory from data.
• This is considered ground-up or bottom-up processing.
• Grounded theorists argue that theory generated from observations of the empirical world may be more valid and useful than theories generated from deductive inquiries.
Grounded Theory (con’t)
• Grounded theorists criticize deductive reasoning since it relies upon a priori assumptions about the world.
• However, grounded theory incorporates deductive reasoning when using constant comparisons.
• In doing this, researchers detect patterns in their observations and then create working hypotheses that direct the progression of the inquiry.
Ethnography
• Ethnography emphasizes the observation of details of everyday life as they naturally unfold in the real world. This is sometimes called naturalistic research.
• Ethnography is a method of describing a culture or society. This is primarily used in anthropological research.
Phenomenology
• Phenomenology is a school of thought that emphasizes a focus on people’s subjective experiences and interpretations of the world.
• Phenomenological theorists argue that objectivity is virtually impossible to ascertain, so to compensate, one must view all research from the perspective of the researcher.
Phenomenology (con’t)
• Phenomenologists attempt to understand those whom they observe from the subjects’ perspective.
• This outlook is especially pertinent in social work and research where empathy and perspective become the keys to success.
Field Research
• Field research is a general term that refers to a group of methodologies used by researchers in making qualitative inquiries.
• The field researcher goes directly to the social phenomenon under study and observes it as completely as possible.
Field Research (con’t)
• The natural environment is the priority of the field researcher. There are no implemented controls or experimental conditions to speak of.
• Such methodologies are especially useful in observing social phenomena over time.
Methods
• Participant observation
• Direct observation
• Unstructured or intensive interviewing
• Case studies
Participant Observation
• The researcher literally becomes part of the observation.
• Example: One studying the homeless may decide to walk the streets of a given area in an attempt to gain perspective and possibly subjects for future study.
Direct Observation
• Direct observation is where the researcher observes the actual behaviors of the subjects, instead of relying on what the subjects say about themselves or others say about them.
• Example: The observation booth at the CECP in Martha Van may be used for direct observation of behavior where survey or other empirical methodologies may seem inappropriate.
Unstructured or Intensive Interviewing
• This method allows the researcher to ask open-ended questions during an interview.
• Details are more important here than a specific interview procedure.
• Here lies the inductive framework through which theory can be generated.
Case Studies
• A particular case study may be the focus of any of the previously mentioned field strategies.
• The case study is important in qualitative research, especially in areas where exceptions are being studied.
• Example: A patient may have a rare form of cancer that has a set of symptoms and potential treatments that have never before been researched.
Strengths and Weaknesses
• Objectivity
• Reliability
• Validity
• Generalizability
Objectivity
• It is taken as given that objectivity is impossible in qualitative inquiry. Instead, the researcher locates himself or herself in the research.
• Objectivity is replaced by subjective interpretation and mass detail for later analysis.
Reliability
• Since procedure is de-emphasized in qualitative research, replication and other tests of reliability become more difficult.
• However, measures may be taken to make research more reliable within the particular study (such as observer training, or more objective checklists, and so on).
Validity
• Qualitative researchers use greater detail to argue for the presence of construct validity.
• Weak on external validity.
• Content validity can be retained if the researcher implements some sort of criterion settings.
• Having a focused criterion adds to the study’s validity.
Generalizability
• Results, for the most part, do not extend much further than the original subject pool.
• Sampling methods determine the extent of the study’s generalizability.
• Quota and Purposive sampling strategies are used to broaden the generalizability.
Summing Up
• Remember that there are always trade-offs in research.
• Are you willing to trade detail for generalizability?
• Will exploratory research enable you to generate new theories?
• Can you ask such sensitive questions on a questionnaire?
Summing Up (con’t)
• Will the results add any evidence toward any pre-existing theory or hypothesis?
• Is FUNDING available for this research?
• Do you really need to see numbers to support your theories or hypotheses?
• Are there any ethical problems that could be minimized by choosing a particular strategy?
Unit - IV (6 sessions)
• Primary data
– Questionnaire design
– Administration and analysis considerations in design
– Attitude measurement
– Scaling techniques
– Observation method of primary data collection
– Web based primary data collection
– Issues of reach, analysis, accuracy, time and efficiency
• Sampling
– Sampling methods
– Sampling and non sampling errors
– Sample size calculation
– Population and sample size - large and small samples
– Practical considerations in determining sample size
• Compilation and interpretation of primary and secondary sources of information.
• The integration of different sources will consolidate the write up of the report.
DATA COLLECTION
SOURCES OF INFORMATION
Primary Source
• Data is collected by the researcher himself
• Data is gathered through questionnaires, interviews, observations etc.
Secondary Source
• Data collected, compiled or written by other researchers, e.g. books, journals, newspapers
• Any reference must be acknowledged
STEPS TO COLLECT DATA
1. REVIEW & COMPILE SECONDARY SOURCE INFORMATION (Referred to in the BACKGROUND/INTRODUCTION section of report)
2. PLAN & DESIGN DATA COLLECTION INSTRUMENTS TO GATHER PRIMARY INFORMATION (Referred to in the FINDINGS, CONCLUSIONS & RECOMMENDATIONS sections of report)
3. DATA COLLECTION
4. DATA ANALYSIS AND INTERPRETATION
METHODS USED TO COLLECT
PRIMARY SOURCE DATA
1. Interviews
2. Questionnaires
3. Survey
4. Experimentation
5. Case Study
6. Observation
However, for a small-scale study, the most commonly used methods are interviews, survey questionnaires and observations.
INTERVIEW
• Effective way of gathering information
• Involves verbal and non-verbal communications
• Can be conducted face to face, by telephone, online or through mail
Steps To An Effective Interview
Prepare your interview schedule
Select your subjects/ key informants
Conduct the interview
Analyze and interpret data collected from the interview
Survey Questionnaire
• The most common data collection instrument
• Useful to collect quantitative and qualitative information
• Should contain 3 elements:
1. Introduction – to explain the objectives
2. Instructions – must be clear, short and in simple language
3. User-friendly – avoid difficult or ambiguous questions
2 Basic Types of survey questions:
1. Open-ended Questions
– Free-response (Text Open End)
– Fill-in relevant information
2. Close-ended Questions
– Dichotomous question
– Multiple-choice
– Rank
– Scale
– Categorical
– Numerical
Note: For specific examples and students’ activities on each question style, please refer to the notes on Data Collection in the e-learning.
Steps To An Effective Survey Questionnaire
Prepare your survey questions (Formulate & choose types of questions, order them, write instructions, make copies)
Select your respondents/sampling (Random/Selected)
Administer the survey questionnaire (date, venue, time)
Analyze and interpret data collected
Tabulate data collected (Statistical analysis - frequency/mean/correlation/%)
A sample of a complete survey questionnaire: http://www.custominsight.com/demo/form_widgets.rtf
Observations
• Observe verbal & non-verbal communication, surrounding atmosphere, culture & situation
• Need to keep meticulous records of the observations
• Can be done through discussions, observations of habits, rituals, review of documentation, experiments
Steps To An Effective Observation
Determine what needs to be observed (Plan, prepare checklist, how to record data)
Select your participants (Random/Selected)
Conduct the observation (venue, duration, recording materials, take photographs)
Analyze and interpret data collected
Compile data collected
3.Scaling Techniques
In business research, measurement of variables is an indispensable requirement
Problem – Defining what is to be measured, and how it is to be accurately and reliably measured
Some things (or concepts) which are inherently abstract in their nature (e.g. job satisfaction, employee morale, brand loyalty of consumers) are more difficult to measure than concepts which can be assigned numerical values (e.g. sales volume for employees X, Y and Z)
3.Scaling Techniques
A scale is basically a continuous spectrum or series of categories and has been defined as any series of items that are arranged progressively according to value or magnitude, into which an item can be placed according to its quantification
Four popular scales in business research are:
– Nominal scales– Ordinal scales– Interval scales– Ratio scales
3.Scaling Techniques
A nominal scale is the simplest of the four scale types; the numbers or letters assigned to objects serve only as labels for identification or classification
Example:
Males = 1, Females = 2
Sales Zone A = Islamabad, Sales Zone B = Rawalpindi
Drink A = Pepsi Cola, Drink B = 7-Up, Drink C = Mirinda
3.Scaling Techniques
An ordinal scale is one that arranges objects or alternatives according to their magnitude
Examples:
Career Opportunities = Moderate, Good, Excellent
Investment Climate = Bad, inadequate, fair, good, very good
Merit = A grade, B grade, C grade, D grade
A problem with ordinal scales is that the difference between categories on the scale is hard to quantify, i.e., excellent is better than good, but how much better?
3.Scaling Techniques
An interval scale is a scale that not only arranges objects or alternatives according to their respective magnitudes, but also distinguishes this ordered arrangement in units of equal intervals (i.e. interval scales indicate order (as in ordinal scales) and also the distance in the order)
Examples: Consumer Price Index, Temperature Scale in Fahrenheit
Interval scales allow comparisons of the differences of magnitude (e.g. of attitudes) but do not allow determinations of the actual strength of the magnitude.
3.Scaling Techniques
A ratio scale is a scale that possesses absolute rather than relative qualities and has an absolute zero.
Examples: Money, Weight, Distance, Temperature on the Kelvin Scale
Ratio scales allow comparisons of the differences of magnitude (e.g. of attitudes) as well as determinations of the actual strength of the magnitude
Measurement, Scaling, Questionnaire & Form Design
3.Scaling Techniques
Primary Scales of Measurement
[Figure: finishers in a race, illustrating the four primary scales]
Nominal – Numbers assigned (e.g. 7, 38)
Ordinal – Rank order of winners: third place, second place, first place
Interval – Performance rating on a 0 to 10 scale: 8.2, 9.1, 9.6
Ratio – Time to finish in seconds: 15.2, 14.1, 13.4
3.Scaling Techniques
Type of Scale | Numerical Operation | Descriptive Statistics
Nominal | Counting | Frequency in each category, percentage in each category, mode
Ordinal | Rank ordering | Median, range, percentile ranking
Interval | Arithmetic operations on intervals between numbers | Mean, standard deviation, variance
Ratio | Arithmetic operations on actual quantities | Geometric mean, coefficient of variation
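The statistics listed in the table above can be sketched with Python's standard library. This is only an illustration: the function names and the sample data are assumptions made up for the example, not part of the course material.

```python
from collections import Counter
import statistics


def nominal_summary(values):
    """Nominal: counting only (frequency, percentage, mode)."""
    counts = Counter(values)
    percentages = {k: 100 * c / len(values) for k, c in counts.items()}
    mode = counts.most_common(1)[0][0]
    return counts, percentages, mode


def ordinal_summary(ranks):
    """Ordinal: order is meaningful, distances are not (median, range)."""
    return statistics.median(ranks), (min(ranks), max(ranks))


def interval_summary(scores):
    """Interval: equal intervals (mean, standard deviation, variance)."""
    return (statistics.mean(scores),
            statistics.stdev(scores),
            statistics.variance(scores))


def ratio_summary(amounts):
    """Ratio: true zero (geometric mean, coefficient of variation)."""
    cv = statistics.stdev(amounts) / statistics.mean(amounts)
    return statistics.geometric_mean(amounts), cv


# Hypothetical data, invented for illustration only.
print(nominal_summary(["A", "B", "A", "C", "A"]))   # drink labels
print(ordinal_summary([1, 2, 2, 3, 4]))             # coded grades
print(interval_summary([8.2, 9.1, 9.6]))            # 0 to 10 ratings
print(ratio_summary([15.2, 14.1, 13.4]))            # seconds
```

Each higher scale type also permits the statistics of the scale types below it, so the ratings could additionally be summarised with a median or a mode.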
3.Scaling Techniques
Criteria for Good Measurement:
Reliability – Reliability is the degree to which measurements are free of error and therefore yield consistent results, including over repeated attempts over time (ordinal measures always yield the same order, interval measurements always yield the same order and same distance between the measured items)
Validity – Validity is the ability of a scale or measuring instrument to measure what it is intended to measure (e.g. is absenteeism from work a valid measure of job satisfaction or are there other influences like a flu epidemic which is keeping employees from work)
3.Scaling Techniques
Sensitivity – Sensitivity is the ability of a measurement instrument to accurately measure variability in stimuli or responses (e.g. on a scale, the choices very strongly agree, strongly agree, agree, don’t agree offer more choices than a scale with just two choices - agree and don’t agree – and is thus more sensitive)
3.Scaling Techniques
Attitude
Measuring attitude is a frequent undertaking in business research
Attitude may be defined as an enduring disposition to consistently respond in a given manner to various aspects
Attitude has three dimensions:
Affective Component
Cognitive Component
Behavioural Component
3.Scaling Techniques
Components of Attitude
Affective Component – Reflective of a person’s general feelings or emotions towards an object or subject (like, dislike, love, hate)
Cognitive Component – Reflective of a person’s awareness of and knowledge about an object or subject (know, believe)
Behavioral Component – Reflective of a person’s intentions and behavioral expectations, and predisposition to action
3.Scaling Techniques
• Attitude can be difficult to measure directly; therefore, indicators such as verbal expression, physiological measurement techniques and overt behavior are used for this purpose. The three different components of attitude may require different measuring techniques
• Common techniques used in business research to determine attitude include rating, ranking, sorting and the choice technique
3.Scaling Techniques
Rating Scales are frequently employed in business research for measuring attitude, and many scales have been developed for this purpose, including:
Simple Attitude Scales, Category Scales, Likert Scale, Semantic Differential, Numerical Scales, Constant-Sum Scale, Stapel Scale, Graphic Scales
3.Scaling Techniques
Simple Attitude Scales
In attitude scaling, individuals are typically asked whether they agree or disagree with a question (or questions) put to them, or they are asked to respond to a question or questions
Simple attitude scales have the properties of a nominal scale and the disadvantages that go with it. They also do not permit fine distinctions in the respondents’ answers, because the choice of answers is limited, but they can be useful where the respondents’ education level is low and questionnaires are lengthy
3.Scaling Techniques
Category Scale:
A category scale consists of several response categories to provide the respondent with alternative ratings
Category scales are more sensitive than rating scales that allow only two answer categories (because of the larger number of choices), and thus provide more data and information
3.Scaling Techniques
The Likert Scale:
A Likert scale is a measure of attitudes designed to allow respondents to indicate how strongly they agree or disagree with carefully constructed statements that range from very positive to very negative towards an object or subject
The number of alternatives on the Likert scale can vary; five alternatives are often provided
A Likert Scale may include a number of question items, each covering some aspect of the respondent’s attitude, and these items collectively form an index
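How several Likert items combine into one index can be sketched as follows. The 1 to 5 coding and the reverse-scoring of negatively worded statements are assumptions for this illustration; the slides do not specify a scoring rule, and `likert_index` is a hypothetical helper name.

```python
def likert_index(responses, reverse_items=(), points=5):
    """Sum item scores into a single attitude index.

    responses     -- one score per item, coded 1..points
                     (assumed coding: 1 = strongly disagree)
    reverse_items -- positions (0-based) of negatively worded items,
                     which are reverse-scored before summing
    """
    total = 0
    for position, score in enumerate(responses):
        if not 1 <= score <= points:
            raise ValueError(f"score {score} is outside 1..{points}")
        if position in reverse_items:
            score = points + 1 - score
        total += score
    return total


# Hypothetical respondent: three items, the third negatively worded.
print(likert_index([5, 4, 2], reverse_items={2}))
```

A higher index then indicates a more favourable attitude across all items taken together.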
3.Scaling Techniques
The Semantic Differential
The semantic differential is an attitude measuring technique that consists of a series of seven-point bi-polar rating scales which allow response to a concept (e.g. organization, product, service, job)
An advantage of the semantic differential is its versatility; on the other hand, it uses extremes which may influence respondents’ answers
Strong ____:____:____:____:____:____:____ Weak
Decisive ____:____:____:____:____:____:____ Indecisive
Good ____:____:____:____:____:____:____ Bad
Cheap ____:____:____:____:____:____:____ Expensive
Active ____:____:____:____:____:____:____ Passive
Lazy ____:____:____:____:____:____:____ Industrious
SAMPLING METHODS
SAMPLING
• A sample is “a smaller (but hopefully representative) collection of units from a population used to determine truths about that population” (Field, 2005)
• Why sample?
– Resources (time, money) and workload
– Gives results with known accuracy that can be calculated mathematically
• The sampling frame is the list from which the potential respondents are drawn
– Registrar’s office
– Class rosters
– Must assess sampling frame errors
SAMPLING……
• What is your population of interest?
• To whom do you want to generalize your results?
– All doctors
– School children
– Indians
– Women aged 15-45 years
– Other
• Can you sample the entire population?
SAMPLING BREAKDOWN
SAMPLING…….
TARGET POPULATION
STUDY POPULATION
SAMPLE
Types of Samples
• Probability (Random) Samples
– Simple random sample
– Systematic random sample
– Stratified random sample
– Multistage sample
– Multiphase sample
– Cluster sample
• Non-Probability Samples
– Convenience sample
– Purposive sample
– Quota
Process
• The sampling process comprises several stages:
– Defining the population of concern
– Specifying a sampling frame, a set of items or events possible to measure
– Specifying a sampling method for selecting items or events from the frame
– Determining the sample size
– Implementing the sampling plan
– Sampling and data collecting
– Reviewing the sampling process
Population definition
• A population can be defined as including all people or items with the characteristic one wishes to understand.
• Because there is very rarely enough time or money to gather information from everyone or everything in a population, the goal becomes finding a representative sample (or subset) of that population.
SAMPLING FRAME
• In the most straightforward case, such as the sentencing of a batch of material from production (acceptance sampling by lots), it is possible to identify and measure every single item in the population and to include any one of them in our sample. However, in the more general case this is not possible. There is no way to identify all rats in the set of all rats. Where voting is not compulsory, there is no way to identify which people will actually vote at a forthcoming election (in advance of the election)
• As a remedy, we seek a sampling frame which has the property that we can identify every single element and include any of them in our sample.
• The sampling frame must be representative of the population
PROBABILITY SAMPLING
• A probability sampling scheme is one in which every unit in the population has a chance (greater than zero) of being selected in the sample, and this probability can be accurately determined.
• When every element in the population does have the same probability of selection, this is known as an 'equal probability of selection' (EPS) design. Such designs are also referred to as 'self-weighting' because all sampled units are given the same weight.
PROBABILITY SAMPLING…….
• Probability sampling includes:
– Simple Random Sampling
– Systematic Sampling
– Stratified Random Sampling
– Cluster Sampling
– Multistage Sampling
– Multiphase Sampling
• Non-probability sampling includes:
– Accidental Sampling
– Quota Sampling
– Purposive Sampling
NON PROBABILITY SAMPLING
• Any sampling method where some elements of the population have no chance of selection (these are sometimes referred to as 'out of coverage'/'undercovered'), or where the probability of selection can't be accurately determined. It involves the selection of elements based on assumptions regarding the population of interest, which form the criteria for selection. Hence, because the selection of elements is nonrandom, nonprobability sampling does not allow the estimation of sampling errors.
• Example: We visit every household in a given street, and interview the first person to answer the door. In any household with more than one occupant, this is a non probability sample, because some people are more likely to answer the door (e.g. an unemployed person who spends most of their time at home is more likely to answer than an employed housemate who might be at work when the interviewer calls) and it's not practical to calculate these probabilities.
NONPROBABILITY SAMPLING…….
• Non-probability sampling includes:
– Accidental Sampling
– Quota Sampling
– Purposive Sampling
SIMPLE RANDOM SAMPLING
• Applicable when population is small, homogeneous & readily available
• All subsets of the frame are given an equal probability. Each element of the frame thus has an equal probability of selection.
• It provides for greatest number of possible samples. This is done by assigning a number to each unit in the sampling frame.
• A table of random numbers or a lottery system is used to determine which units are to be selected.
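In software, the numbered frame and the lottery draw can be sketched with a pseudo-random generator standing in for the table of random numbers. The function name, the `seed` parameter and the frame of 50 numbered units are assumptions for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(frame, n)


# Hypothetical sampling frame: units numbered 1 to 50.
frame = list(range(1, 51))
print(simple_random_sample(frame, 5, seed=42))
```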
SIMPLE RANDOM SAMPLING……..
• Estimates are easy to calculate.
• Simple random sampling is always an EPS design, but not all EPS designs are simple random sampling.
• Disadvantages
• If the sampling frame is large, this method is impracticable.
• Minority subgroups of interest in the population may not be present in the sample in sufficient numbers for study.
REPLACEMENT OF SELECTED UNITS
• Sampling schemes may be without replacement ('WOR' - no element can be selected more than once in the same sample) or with replacement ('WR' - an element may appear multiple times in the one sample).
• For example, if we catch fish, measure them, and immediately return them to the water before continuing with the sample, this is a WR design, because we might end up catching and measuring the same fish more than once. However, if we do not return the fish to the water (e.g. if we eat the fish), this becomes a WOR design.
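The two designs can be contrasted in a short sketch. The helper names and the pond of ten fish are hypothetical; `random.sample` draws without replacement and `random.choices` draws with replacement.

```python
import random


def sample_wor(frame, n, seed=None):
    """Without replacement: no unit can appear twice."""
    return random.Random(seed).sample(frame, n)


def sample_wr(frame, n, seed=None):
    """With replacement: the same unit may be drawn again."""
    return random.Random(seed).choices(frame, k=n)


# Hypothetical pond of ten tagged fish.
pond = [f"fish-{i}" for i in range(1, 11)]
print(sample_wor(pond, 5, seed=1))   # five different fish
print(sample_wr(pond, 15, seed=1))   # 15 draws from 10 fish, so repeats are certain
```

A WOR sample can never be larger than the frame, whereas a WR sample can be of any size.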
SYSTEMATIC SAMPLING
• Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list.
• Systematic sampling involves a random start and then proceeds with the selection of every kth element from then onwards. In this case, k=(population size/sample size).
• It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list.
• A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10').
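The procedure above (skip interval k, random start within the first k elements, then every kth element) can be sketched as follows. The function name and the directory of 1000 numbered entries are assumptions; integer division is used for k, which is exact when the population size is a multiple of the sample size.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every kth element after a random start, with k = N // n."""
    k = len(frame) // n
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(k)  # position within the first k elements
    return frame[start::k][:n]


# Hypothetical ordered list of 1000 directory entries, sampled with a skip of 10.
directory = list(range(1, 1001))
sample = systematic_sample(directory, 100, seed=7)
print(sample[:5])
```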
SYSTEMATIC SAMPLING……
As described above, systematic sampling is an EPS method, because all elements have the same probability of selection (in the example given, one in ten). It is not 'simple random sampling' because different subsets of the same size have different selection probabilities - e.g. the set {4,14,24,...,994} has a one-in-ten probability of selection, but the set {4,13,24,34,...} has zero probability of selection.
SYSTEMATIC SAMPLING……
• ADVANTAGES:
• Sample easy to select
• Suitable sampling frame can be identified easily
• Sample evenly spread over entire reference population
• DISADVANTAGES:
• Sample may be biased if hidden periodicity in population coincides with that of selection.
• Difficult to assess precision of estimate from one survey.
STRATIFIED SAMPLING
Where population embraces a number of distinct categories, the frame can be organized into separate "strata." Each stratum is then sampled as an independent sub-population, out of which individual elements can be randomly selected.
• Every unit in a stratum has same chance of being selected.
• Using same sampling fraction for all strata ensures proportionate representation in the sample.
• Adequate representation of minority subgroups of interest can be ensured by stratification & varying sampling fraction between strata as required.
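A minimal sketch of the two allocation choices described above: the same sampling fraction in every stratum (proportionate) versus a larger fraction for a minority stratum. The function name, stratum labels, fractions and population sizes are made up for illustration.

```python
import random


def stratified_sample(strata, fractions, seed=None):
    """Draw a simple random sample independently within each stratum.

    strata    -- dict: stratum name -> list of units
    fractions -- dict: stratum name -> sampling fraction for that stratum
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = round(len(units) * fractions[name])
        sample[name] = rng.sample(units, n)
    return sample


# Hypothetical population: 900 majority units and 100 minority units.
strata = {"majority": list(range(900)), "minority": list(range(900, 1000))}

# Same fraction everywhere: proportionate representation (45 and 5 units).
proportionate = stratified_sample(strata, {"majority": 0.05, "minority": 0.05}, seed=1)

# Larger fraction for the minority stratum: 45 and 30 units.
boosted = stratified_sample(strata, {"majority": 0.05, "minority": 0.30}, seed=1)

print({k: len(v) for k, v in proportionate.items()})
print({k: len(v) for k, v in boosted.items()})
```

When fractions differ between strata, estimates for the whole population need to be weighted back to the true stratum sizes.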
STRATIFIED SAMPLING……
• Finally, since each stratum is treated as an independent population, different sampling approaches can be applied to different strata.
• Drawbacks to using stratified sampling.
• First, sampling frame of entire population has to be prepared separately for each stratum
• Second, when examining multiple criteria, stratifying variables may be related to some, but not to others, further complicating the design, and potentially reducing the utility of the strata.
• Finally, in some cases (such as designs with a large number of strata, or those with a specified minimum sample size per group), stratified sampling can potentially require a larger sample than would other methods
STRATIFIED SAMPLING…….
Draw a sample from each stratum
POSTSTRATIFICATION
• Stratification is sometimes introduced after the sampling phase in a process called "poststratification".
• This approach is typically implemented due to a lack of prior knowledge of an appropriate stratifying variable or when the experimenter lacks the necessary information to create a stratifying variable during the sampling phase. Although the method is susceptible to the pitfalls of post hoc approaches, it can provide several benefits in the right situation. Implementation usually follows a simple random sample. In addition to allowing for stratification on an ancillary variable, poststratification can be used to implement weighting, which can improve the precision of a sample's estimates.
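One common way to implement the weighting mentioned above is to give each stratum the weight (population share) / (sample share). The sketch below assumes the population counts per stratum are known from an outside source such as a census; the function names and all figures are hypothetical.

```python
def poststratification_weights(population_counts, sample_counts):
    """Weight per stratum = population share / sample share."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        stratum: (population_counts[stratum] / pop_total)
        / (sample_counts[stratum] / sample_total)
        for stratum in sample_counts
    }


def weighted_mean(values_by_stratum, weights):
    """Mean of all responses, each weighted by its stratum weight."""
    numerator = sum(weights[s] * v
                    for s, values in values_by_stratum.items() for v in values)
    denominator = sum(weights[s] * len(values)
                      for s, values in values_by_stratum.items())
    return numerator / denominator


# Hypothetical case: the population is 50/50 but the sample came out 30/70.
weights = poststratification_weights({"men": 500, "women": 500},
                                     {"men": 30, "women": 70})
print(weights)  # men are weighted up, women are weighted down
```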
OVERSAMPLING
• Choice-based sampling is one of the stratified sampling strategies. In this, data are stratified on the target and a sample is taken from each stratum so that the rare target class will be more represented in the sample. The model is then built on this biased sample. The effects of the input variables on the target are often estimated with more precision with the choice-based sample even when a smaller overall sample size is taken, compared to a random sample. The results usually must be adjusted to correct for the oversampling.
CLUSTER SAMPLING
• Cluster sampling is an example of 'two-stage sampling'.
• First stage: a sample of areas is chosen;
• Second stage: a sample of respondents within those areas is selected.
• Population divided into clusters of homogeneous units, usually based on geographical contiguity.
• Sampling units are groups rather than individuals.
• A sample of such clusters is then selected.
• All units from the selected clusters are studied.
CLUSTER SAMPLING…….
• Advantages:
• Cuts down on the cost of preparing a sampling frame.
• This can reduce travel and other administrative costs.
• Disadvantages: sampling error is higher than for a simple random sample of the same size.
• Often used to evaluate vaccination coverage in EPI
292
CLUSTER SAMPLING…….
• Identification of clusters
– List all cities, towns, villages & wards of cities, with their populations, falling in the target area under study.
– Calculate the cumulative population & divide by 30; this gives the sampling interval.
– Select a random no. less than or equal to the sampling interval, having the same no. of digits. The community whose cumulative population covers this number forms the 1st cluster.
– Random no. + sampling interval locates the 2nd cluster.
– Second cluster + sampling interval = 3rd cluster.
– Last or 30th cluster = 29th cluster + sampling interval.
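The identification steps can be sketched in code. This is an illustrative sketch only: the village populations are invented, the function name `select_clusters` is hypothetical, and 5 clusters are drawn instead of 30 to keep the example small.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def select_clusters(populations, n_clusters=30, rng=random):
    """Return indices of communities picked with probability proportional to size."""
    cumulative = list(accumulate(populations))
    interval = cumulative[-1] // n_clusters      # sampling interval
    start = rng.randint(1, interval)             # random number <= interval
    targets = [start + k * interval for k in range(n_clusters)]
    # Each target falls in the first community whose cumulative population reaches it.
    return [bisect_left(cumulative, t) for t in targets]

# Hypothetical village populations, for illustration only.
villages = [1200, 450, 3000, 800, 150, 2400, 950, 600, 1750, 700]
chosen = select_clusters(villages, n_clusters=5, rng=random.Random(42))
print(chosen)
```

A community larger than the sampling interval can be hit more than once, in which case it contributes more than one cluster.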
CLUSTER SAMPLING…….
Two types of cluster sampling methods.
One-stage sampling. All of the elements within selected clusters are included in the sample.
Two-stage sampling. A subset of the elements within each selected cluster is randomly selected for inclusion in the sample.
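The two designs differ only in what happens inside the selected clusters. A small sketch, assuming a hypothetical population of 20 wards with 50 households each:

```python
import random

rng = random.Random(7)

# Hypothetical population: 20 clusters (e.g. city wards) of 50 households each.
clusters = {c: [f"ward{c}-hh{h}" for h in range(50)] for c in range(20)}

# Both designs start by drawing a random sample of clusters.
selected = rng.sample(sorted(clusters), 4)

# One-stage: every element of each selected cluster enters the sample.
one_stage = [hh for c in selected for hh in clusters[c]]

# Two-stage: a random subset of elements is drawn within each selected cluster.
two_stage = [hh for c in selected for hh in rng.sample(clusters[c], 10)]

print(len(one_stage), len(two_stage))
```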
Difference Between Strata and Clusters
• Although strata and clusters are both non-overlapping subsets of the population, they differ in several ways.
• All strata are represented in the sample; but only a subset of clusters are in the sample.
• With stratified sampling, the best survey results occur when elements within strata are internally homogeneous. However, with cluster sampling, the best results occur when elements within clusters are internally heterogeneous
MULTISTAGE SAMPLING
• Complex form of cluster sampling in which two or more levels of units are embedded one in the other.
• First stage: a random number of districts is chosen in all states.
• Followed by a random number of talukas and villages.
• Then the third-stage units will be houses.
• All ultimate units (houses, for instance) selected at the last step are surveyed.
MULTISTAGE SAMPLING……..
• This technique is essentially the process of taking random samples of preceding random samples.
• Not as effective as true random sampling, but probably solves more of the problems inherent to random sampling.
• An effective strategy because it banks on multiple randomizations. As such, extremely useful.
• Multistage sampling is used frequently when a complete list of all members of the population does not exist or is inappropriate.
• Moreover, by avoiding the use of all sample units in all selected clusters, multistage sampling avoids the large, and perhaps unnecessary, costs associated with traditional cluster sampling.
MULTIPHASE SAMPLING
• Part of the information is collected from the whole sample & part from a subsample.
• In a TB survey:
– MT in all cases: Phase I
– Chest X-ray in MT +ve cases: Phase II
– Sputum examination in X-ray +ve cases: Phase III
• A survey by such a procedure is less costly, less laborious & more purposeful.
MATCHED RANDOM SAMPLING
A method of assigning participants to groups in which pairs of participants are first matched on some characteristic and then individually assigned randomly to groups.
• Matched random sampling arises in the following contexts:
• Two samples in which the members are clearly paired, or are matched explicitly by the researcher. For example, IQ measurements or pairs of identical twins.
• Samples in which the same attribute, or variable, is measured twice on each subject, under different circumstances. Commonly called repeated measures.
• Examples include the times of a group of athletes for 1500m before and after a week of special training; the milk yields of cows before and after being fed a particular diet.
QUOTA SAMPLING
• The population is first segmented into mutually exclusive sub-groups, just as in stratified sampling.
• Then judgment is used to select subjects or units from each segment based on a specified proportion.
• For example, an interviewer may be told to sample 200 females and 300 males between the ages of 45 and 60.
• It is this second step which makes the technique one of non-probability sampling.
• In quota sampling the selection of the sample is non-random.
• For example, interviewers might be tempted to interview those who look most helpful. The problem is that these samples may be biased because not everyone gets a chance of selection. This non-random element is its greatest weakness, and quota versus probability has been a matter of controversy for many years.
CONVENIENCE SAMPLING
• Sometimes known as grab or opportunity sampling or accidental or haphazard sampling.
• A type of nonprobability sampling which involves the sample being drawn from that part of the population which is close to hand. That is, readily available and convenient.
• The researcher using such a sample cannot scientifically make generalizations about the total population from this sample because it would not be representative enough.
• For example, if the interviewer were to conduct a survey at a shopping center early in the morning on a given day, the people that he/she could interview would be limited to those present there at that given time. Their views would not represent the views of other members of society in the area, as they might if the survey were conducted at different times of day and several times per week.
• This type of sampling is most useful for pilot testing.
• In social science research, snowball sampling is a similar technique, where existing study subjects are used to recruit more subjects into the sample.
CONVENIENCE SAMPLING…….
– Use results that are easy to get
Judgmental sampling or Purposive sampling
• The researcher chooses the sample based on who they think would be appropriate for the study. This is used primarily when there is a limited number of people that have expertise in the area being researched.
PANEL SAMPLING
• Method of first selecting a group of participants through a random sampling method and then asking that group for the same information again several times over a period of time.
• Therefore, each participant is given same survey or interview at two or more time points; each period of data collection called a "wave".
• This sampling methodology often chosen for large scale or nation-wide studies in order to gauge changes in the population with regard to any number of variables from chronic illness to job stress to weekly food expenditures.
• Panel sampling can also be used to inform researchers about within-person health changes due to age or help explain changes in continuous dependent variables such as spousal interaction.
• There have been several proposed methods of analyzing panel sample data, including growth curves.
Result from survey is never exactly the same as the actual value in the population. WHY?
Components of total error
[Figure: prevalence scale from 0% to 100% showing the true population value (50%) and the point estimate from the survey (40%); the gap between them is the total error, made up of nonsampling bias, sampling bias and sampling error.]
Nonsampling bias
• Is present even if sampling and analysis done correctly
• Would still be present if survey measured outcome in ENTIRE sampling frame
In sum, you have either sampled the wrong people or screwed up your measurements!
Nonsampling bias
• Types:
– Sampling frame is not equal to population to which you want to generalize (sampling universe)
  • Sampling frame out of date
  • Non-response among sampling units in sampling frame
– Measurement error
  • Tape incorrectly fixed to height board
  • Scale consistently reads low by 0.5 kg
  • Failure to remove heavy clothing before weighing
  • Misleading questions
  • Recall bias
Nonsampling bias
Source of bias: prevention or cure
• Sampling frame out of date
– Use current sampling frame
– Limit generalizations
• Non-response
– Minimize non-response
– Use various statistical methods to weight data
• Measurement error
– Standardize instruments
– Write clear & simple questions
– Train survey workers
– Supervise survey workers
Sampling bias
• Selection of nonrepresentative sample, i.e., the likelihood of selection not equal for each sampling unit
• Failure to weight analysis of unequal probability sample
In sum, you have not sampled people with equal probability and you have not accounted for this in your analysis!
Sampling bias
• Examples
– Nonrepresentative sample
  • Selecting youngest child in household
  • Choosing households close to the road
  • Using a different sampling fraction in different provinces
– Failure to do statistical weighting
Sampling bias
Source of bias: prevention or cure
• Nonrepresentative sampling
– ALWAYS ask yourself "Will this choice enhance representativeness or reduce it?"
• Failure to do weighting
– Calculate the probabilities of selection
– Apply appropriate statistical weights if selection probabilities unequal
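Weighting by the inverse of the selection probability can be sketched as below. The two provinces, their sampling fractions and the outcome counts are hypothetical numbers chosen only to make the effect visible.

```python
# Hypothetical survey: two provinces sampled with different sampling fractions.
# Each record: (province, probability of selection, has_outcome)
sample = (
    [("A", 0.10, 1)] * 30 + [("A", 0.10, 0)] * 70      # 100 of 1,000 sampled
    + [("B", 0.01, 1)] * 10 + [("B", 0.01, 0)] * 90    # 100 of 10,000 sampled
)

# The unweighted estimate treats every respondent alike and is biased here.
unweighted = sum(y for _, _, y in sample) / len(sample)

# Weight each respondent by the inverse of the selection probability.
total_weight = sum(1 / p for _, p, _ in sample)
weighted = sum(y / p for _, p, y in sample) / total_weight

print(round(unweighted, 3), round(weighted, 3))
```

The unweighted figure overstates prevalence because the small, high-prevalence province is sampled ten times more intensively than the large one.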
Sampling error
• Difference between survey result and population value due to random selection of sample
• Influenced by:
– Sample size
– Sampling scheme
Unlike nonsampling bias and sampling bias, it can be predicted, calculated, and accounted for.
Sampling error
• Measures of sampling error:
– Confidence limits
– Standard error
– Coefficient of variation
– P values
– Others
• Use these measures to:
– Calculate sample size prior to sampling
– Determine how sure we are of result after analysis
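Two of these uses can be sketched for a proportion. The formulas below assume a simple random sample and the normal approximation; the sample size of 600 and the function names are illustrative assumptions, and cluster designs would generally need a larger sample than this gives.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Standard error and approximate 95% confidence limits for a proportion
    from a simple random sample (normal approximation)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return se, p_hat - z * se, p_hat + z * se

def sample_size(p_expected, margin, z=1.96):
    """Sample size needed for a given margin of error under simple random sampling."""
    return math.ceil(z**2 * p_expected * (1 - p_expected) / margin**2)

# After analysis: how sure are we of a 40% point estimate from 600 respondents?
se, low, high = proportion_ci(0.40, 600)
print(round(se, 3), round(low, 3), round(high, 3))

# Before sampling: respondents needed for +/- 5 points at an expected 50%.
print(sample_size(0.50, 0.05))
```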
Bias and sampling error
[Figure: total error divided into bias (nonsampling bias and sampling bias) and sampling error.]
In sum…
Bias
• Includes nonsampling bias and sampling bias
• Is due to mistakes which can be avoided
• Cannot be precisely measured
• Control and prevention requires careful attention
Sampling error
• Is unavoidable if sampling < 100% of population
• Can be controlled by selecting appropriate sample size and sampling method
• Can be precisely calculated after-the-fact
Introduction to Data Analysis
• Data Measurement
– Measurement of the data is the first step in the process that ultimately guides the final analysis.
– Consideration of sampling, controls, errors (random and systematic) and the required precision all influence the final analysis.
– Validation: Instruments and methods used to measure the data must be validated for accuracy.
– Precision and accuracy…Determination of error
– Social vs. Physical Sciences
Introduction to Data Analysis
• Types of data
– Univariate/Multivariate
  • Univariate: When we use one variable to describe a person, place, or thing.
  • Multivariate: When we use two or more variables to measure a person, place or thing. Variables may or may not be dependent on each other.
– Cross-sectional data/Time-ordered data (business, social sciences)
  • Cross-Sectional: Measurements taken at one time period
  • Time-Ordered: Measurements taken over time in chronological sequence.
The type of data will dictate (in part) the appropriate data-analysis method.
Introduction to Data Analysis
• Measurement Scales
– Nominal or Categorical Scale
  • Classification of people, places, or things into categories (e.g. age ranges, colors, etc.).
  • Classifications must be mutually exclusive (every element should belong to one category with no ambiguity).
  • Weakest of the four scales. No category is greater than or less (better or worse) than the others. They are just different.
– Ordinal or Ranking Scale
  • Classification of people, places, or things into a ranking such that the data is arranged into a meaningful order (e.g. poor, fair, good, excellent).
  • Qualitative classification only
Introduction to Data Analysis
• Measurement Scales (business, social sciences)
– Interval Scale
  • Data classified by ranking.
  • Quantitative classification (time, temperature, etc).
  • Zero point of scale is arbitrary (differences are meaningful).
– Ratio Scale
  • Data classified as the ratio of two numbers.
  • Quantitative classification (height, weight, distance, etc).
  • Zero point of scale is real (data can be added, subtracted, multiplied, and divided).
Univariate Analysis/Descriptive Statistics
• Descriptive Statistics
– The Range
– Min/Max
– Average
– Median
– Mode
– Variance
– Standard Deviation
– Histograms and Normal Distributions
Univariate Analysis/Histograms
• Distributions
– Descriptive statistics are easier to interpret when graphically illustrated.
– However, charting each data element can lead to very busy and confusing charts that do not help interpret the data.
– Grouping the data elements into categories and charting the frequency within these categories yields a graphical illustration of how the data is distributed throughout its range.
Univariate Analysis/Histograms
[Figure: column chart of the 20 individual data values; x-axis: observations 1 to 20, y-axis: Data Values (0 to 120).]
With just a few columns this chart is difficult to interpret. It tells you very little about the data set. Even finding the Min and Max can be difficult.
The data can be presented such that more statistical parameters can be estimated from the chart (average, standard deviation).
Univariate Analysis/Histograms
• Frequency Table
– The first step is to decide on the categories and group the data appropriately.
(45, 49, 50, 53, 60, 62, 63, 65, 66, 67, 69, 71, 73, 74, 74, 78, 81, 85, 87, 100)
Category Labels    Frequency
0-50               3
51-60              2
61-70              6
71-80              5
81-90              3
>90                1
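The grouping step can be reproduced with a few lines of code. This sketch uses the scores and category labels from the slide; the helper name `categorise` is hypothetical.

```python
scores = [45, 49, 50, 53, 60, 62, 63, 65, 66, 67,
          69, 71, 73, 74, 74, 78, 81, 85, 87, 100]

# Category labels with inclusive upper bounds; None means "no upper bound".
categories = [("0-50", 50), ("51-60", 60), ("61-70", 70),
              ("71-80", 80), ("81-90", 90), (">90", None)]

def categorise(value):
    """Return the label of the first category whose upper bound covers value."""
    for label, upper in categories:
        if upper is None or value <= upper:
            return label

frequency = {label: 0 for label, _ in categories}
for s in scores:
    frequency[categorise(s)] += 1

# Print a rough text histogram of the frequency table.
for label, count in frequency.items():
    print(f"{label:>6} {'#' * count} {count}")
```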
Univariate Analysis/Histograms
• Histogram
– A histogram is simply a column chart of the frequency table.
[Figure: histogram of the frequency table; x-axis: Scores (0-50, 51-60, 61-70, 71-80, 81-90, >90), y-axis: Frequency (0 to 7).]
Univariate Analysis/Histograms
• Histogram
[Figure: the same histogram of Scores against Frequency, annotated with the Average (68.6) and Median (68), the Mode (74), and the -1SD and +1SD marks.]
Univariate Analysis/Normal Distributions
• Distributions that can be described mathematically as Gaussian are also called Normal
• The Bell curve
– Symmetrical
– Mean ≈ Median
[Figure: bell curve labelled "Mean, Median, Mode".]
Univariate Analysis/Skewed Distributions
• When data are skewed, the mean and SD can be misleading
• Skewness: $sk = 3(\text{mean} - \text{median})/\text{SD}$. If $|sk| > 1$ then the distribution is non-symmetrical
• Negatively skewed
– Mean < Median
– sk is negative
• Positively skewed
– Mean > Median
– sk is positive
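As a quick check of the skewness formula, the sketch below applies it to the 20 scores used in the frequency-table example. It assumes the population standard deviation (dividing by N) is the SD intended.

```python
import statistics

scores = [45, 49, 50, 53, 60, 62, 63, 65, 66, 67,
          69, 71, 73, 74, 74, 78, 81, 85, 87, 100]

mean = statistics.mean(scores)
median = statistics.median(scores)
sd = statistics.pstdev(scores)          # population SD (divides by N)

sk = 3 * (mean - median) / sd
label = "positively" if sk > 0 else "negatively"
print(round(sk, 3), label, "skewed;", "non-symmetrical" if abs(sk) > 1 else "roughly symmetrical")
```

Here the mean (68.6) is only slightly above the median (68), so sk is small and positive.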
[Figures: two example curves of skewed distributions.]
Central Limit Theorem
• Regardless of the shape of a distribution, the distribution of the sample mean based on samples of size N approaches a normal curve as N increases.
– N must be less than the entire sample
[Figure: distribution of the sample mean for N=10.]
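The theorem can be illustrated by simulation. This sketch assumes an exponential population (strongly skewed, mean 1, SD 1) purely as an example; the sample sizes and the number of draws are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)

def sample_means(n, draws=2000):
    """Means of `draws` samples of size n from a skewed (exponential) population."""
    return [statistics.mean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(draws)]

# Spread of the sample mean for increasing sample sizes.
spread = {n: statistics.stdev(sample_means(n)) for n in (1, 10, 50)}

for n, sd in spread.items():
    # The spread shrinks roughly as 1/sqrt(n) as the means pile up around 1.
    print(n, round(sd, 3), round(1 / n ** 0.5, 3))
```

Plotting a histogram of `sample_means(10)` would show a shape much closer to a bell curve than the population itself.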
Univariate Analysis/Descriptive Statistics
• The Range
– Difference between minimum and maximum values in a data set
– Larger range usually (but not always) indicates a large spread or deviation in the values of the data set.
(73, 66, 69, 67, 49, 60, 81, 71, 78, 62, 53, 87, 74, 65, 74, 50, 85, 45, 63, 100)
Univariate Analysis/Descriptive Statistics
• The Average (Mean)
– Sum of all values divided by the number of values in the data set.
– One measure of central location in the data set.
Average = $m = \frac{1}{N}\sum_{i=1}^{N} m_i$
Average = (73+66+69+67+49+60+81+71+78+62+53+87+74+65+74+50+85+45+63+100)/20 = 68.6
Excel function: AVERAGE()
Univariate Analysis/Descriptive Statistics
[Figure: two plots on a 0 to 10 axis (ticks at 0, 2.5, 7.5, 10), each with the value 4.8 marked.]
The data may or may not be symmetrical around its average value
Univariate Analysis/Descriptive Statistics
• The Median
– The middle value in a sorted data set. Half the values are greater and half are less than the median.
– Another measure of central location in the data set.
(45, 49, 50, 53, 60, 62, 63, 65, 66, 67, 69, 71, 73, 74, 74, 78, 81, 85, 87, 100)
Median: 68
(1, 2, 4, 7, 8, 9, 9)
Median: 7
– Excel function: MEDIAN()
Univariate Analysis/Descriptive Statistics
• The Median
– May or may not be close to the mean.
– Combination of mean and median are used to define the skewness of a distribution.
[Figure: plot on a 0 to 10 axis (ticks at 0, 2.5, 7.5, 10) with the value 6.25 marked.]
Univariate Analysis/Descriptive Statistics
• The Mode
– Most frequently occurring value.
– Another measure of central location in the data set.
– (45, 49, 50, 53, 60, 62, 63, 65, 66, 67, 69, 71, 73, 74, 74, 78, 81, 85, 87, 100)
– Mode: 74
– Generally not all that meaningful unless a large percentage of the values are the same number.
Univariate Analysis/Descriptive Statistics
• Variance
– One measure of dispersion (deviation from the mean) of a data set. The larger the variance, the greater is the average deviation of each datum from the average value.
Variance = $\frac{1}{N}\sum_{i=1}^{N}(m_i - m)^2$, where $m$ is the average value of the data set
Variance = [(45 – 68.6)² + (49 – 68.6)² + (50 – 68.6)² + (53 – 68.6)² + …]/20 = 181
Excel Functions: VARP(), VAR()
Univariate Analysis/Descriptive Statistics
• Standard Deviation
– Square root of the variance. Can be thought of as the average deviation from the mean of a data set.
– The magnitude of the number is more in line with the values in the data set.
Standard Deviation = ([(45 – 68.6)² + (49 – 68.6)² + (50 – 68.6)² + (53 – 68.6)² + …]/20)^(1/2) = 13.5
Excel Functions: STDEVP(), STDEV()
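The descriptive statistics on these slides can be reproduced with Python's standard `statistics` module, as a rough counterpart to the Excel functions named above. The dictionary keys are illustrative names only.

```python
import statistics

scores = [73, 66, 69, 67, 49, 60, 81, 71, 78, 62,
          53, 87, 74, 65, 74, 50, 85, 45, 63, 100]

summary = {
    "range": max(scores) - min(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    # Population versions divide by N, as in the slide's worked example.
    "variance_population": statistics.pvariance(scores),
    "sd_population": statistics.pstdev(scores),
    # Sample versions divide by N - 1 and are slightly larger.
    "variance_sample": statistics.variance(scores),
    "sd_sample": statistics.stdev(scores),
}

for name, value in summary.items():
    print(f"{name:>20}: {value:.2f}")
```

The population figures match the slide values (variance 181, standard deviation 13.5); the sample versions are what one would normally report when the 20 scores are themselves a sample.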
Multivariate Analysis
• Many statistical techniques focus on just one or two variables
• Multivariate analysis (MVA) techniques allow more than two variables to be analysed at once
– Multiple regression is not typically included under this heading, but can be thought of as a multivariate analysis
Outline of Lectures
• We will cover
– Why MVA is useful and important
  • Simpson’s Paradox
– Some commonly used techniques
  • Principal components
  • Cluster analysis
  • Correspondence analysis
  • Others if time permits
– Market segmentation methods
– An overview of MVA methods and their niches
Simpson’s Paradox
• Example: 44% of male applicants are admitted by a university, but only 33% of female applicants
• Does this mean there is unfair discrimination?
• University investigates and breaks down figures for Engineering and English programmes
                Male    Female
Accept          35      20
Refuse entry    45      40
Total           80      60
Simpson’s Paradox
• No relationship between sex and acceptance for either programme
– So no evidence of discrimination
• Why?
– More females apply for the English programme, but it is hard to get into
– More males applied to Engineering, which has a higher acceptance rate than English
• Must look deeper than single cross-tab to find this out
Engineering     Male    Female
Accept          30      10
Refuse entry    30      10
Total           60      20
English         Male    Female
Accept          5       10
Refuse entry    15      30
Total           20      40
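The admission figures can be checked directly. The sketch below uses the counts from the two programme tables; the data structure and the `rate` helper are illustrative choices.

```python
# Applications by programme and sex: (accepted, refused), from the programme tables.
applications = {
    "Engineering": {"Male": (30, 30), "Female": (10, 10)},
    "English": {"Male": (5, 15), "Female": (10, 30)},
}

def rate(accepted, refused):
    """Acceptance rate for one group."""
    return accepted / (accepted + refused)

# Within each programme the acceptance rate is the same for both sexes.
by_programme = {
    prog: {sex: rate(*counts) for sex, counts in sexes.items()}
    for prog, sexes in applications.items()
}

# Pooling the programmes recreates the misleading overall gap.
overall = {}
for sex in ("Male", "Female"):
    accepted = sum(applications[p][sex][0] for p in applications)
    refused = sum(applications[p][sex][1] for p in applications)
    overall[sex] = rate(accepted, refused)

print(by_programme)
print({sex: round(r, 2) for sex, r in overall.items()})
```

The gap in the pooled rates comes entirely from which programme each sex tends to apply to, not from how applicants are treated within a programme.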
Another Example
• A study of graduates’ salaries showed negative association between economists’ starting salary and the level of the degree
– i.e. PhDs earned less than Masters degree holders, who in turn earned less than those with just a Bachelor’s degree
– Why?
• The data was split into three employment sectors
– Teaching, government and private industry
– Each sector showed a positive relationship
– Employer type was confounded with degree level
Simpson’s Paradox
• In each of these examples, the bivariate analysis (cross-tabulation or correlation) gave misleading results
• Introducing another variable gave a better understanding of the data
– It even reversed the initial conclusions
Many Variables
• Commonly have many relevant variables in market research surveys
– E.g. one not atypical survey had ~2000 variables
– Typically researchers pore over many crosstabs
– However it can be difficult to make sense of these, and the crosstabs may be misleading
• MVA can help summarise the data
– E.g. factor analysis and segmentation based on agreement ratings on 20 attitude statements
• MVA can also reduce the chance of obtaining spurious results
Multivariate Analysis Methods
• Two general types of MVA technique
– Analysis of dependence
  • Where one (or more) variables are dependent variables, to be explained or predicted by others
  • E.g. Multiple regression, PLS, MDA
– Analysis of interdependence
  • No variables thought of as “dependent”
  • Look at the relationships among variables, objects or cases
  • E.g. cluster analysis, factor analysis
Principal Components
• Identify underlying dimensions or principal components of a distribution
• Helps understand the joint or common variation among a set of variables
• Probably the most commonly used method of deriving “factors” in factor analysis (before rotation)
Principal Components
• The first principal component is identified as the vector (or equivalently the linear combination of variables) on which the most data variation can be projected
• The 2nd principal component is a vector perpendicular to the first, chosen so that it contains as much of the remaining variation as possible
• And so on for the 3rd principal component, the 4th, the 5th etc.
Principal Components - Examples
• Ellipse, ellipsoid, sphere
• Rugby ball
• Pen
• Frying pan
• Banana
• CD
• Book
Multivariate Normal Distribution
• Generalisation of the univariate normal
• Determined by the mean (vector) and covariance matrix: $X \sim N(\mu, \Sigma)$
• E.g. Standard bivariate normal: $X \sim N\left((0,0)^T, I\right)$, $p(x, y) = \frac{1}{2\pi}\exp\left(-\frac{x^2 + y^2}{2}\right)$
Example – Crime Rates by State
The PRINCOMP Procedure
Observations    50
Variables       7
Simple Statistics
        Murder        Rape          Robbery       Assault       Burglary      Larceny       Auto_Theft
Mean    7.444000000   25.73400000   124.0920000   211.3000000   1291.904000   2671.288000   377.5260000
StD     3.866768941   10.75962995   88.3485672    100.2530492   432.455711    725.908707    193.3944175
Crime Rates per 100,000 Population by State
Obs  State        Murder  Rape   Robbery  Assault  Burglary  Larceny  Auto_Theft
1    Alabama      14.2    25.2   96.8     278.3    1135.5    1881.9   280.7
2    Alaska       10.8    51.6   96.8     284.0    1331.7    3369.8   753.3
3    Arizona      9.5     34.2   138.2    312.3    2346.1    4467.4   439.5
4    Arkansas     8.8     27.6   83.2     203.4    972.6     1862.1   183.4
5    California   11.5    49.4   287.0    358.0    2139.4    3499.8   663.5
…    …            ...     ...    ...      ...      ...       ...      ...
Correlation Matrix
            Murder  Rape    Robbery  Assault  Burglary  Larceny  Auto_Theft
Murder      1.0000  0.6012  0.4837   0.6486   0.3858    0.1019   0.0688
Rape        0.6012  1.0000  0.5919   0.7403   0.7121    0.6140   0.3489
Robbery     0.4837  0.5919  1.0000   0.5571   0.6372    0.4467   0.5907
Assault     0.6486  0.7403  0.5571   1.0000   0.6229    0.4044   0.2758
Burglary    0.3858  0.7121  0.6372   0.6229   1.0000    0.7921   0.5580
Larceny     0.1019  0.6140  0.4467   0.4044   0.7921    1.0000   0.4442
Auto_Theft  0.0688  0.3489  0.5907   0.2758   0.5580    0.4442   1.0000
Eigenvalues of the Correlation Matrix
     Eigenvalue   Difference   Proportion  Cumulative
1    4.11495951   2.87623768   0.5879      0.5879
2    1.23872183   0.51290521   0.1770      0.7648
3    0.72581663   0.40938458   0.1037      0.8685
4    0.31643205   0.05845759   0.0452      0.9137
5    0.25797446   0.03593499   0.0369      0.9506
6    0.22203947   0.09798342   0.0317      0.9823
7    0.12405606                0.0177      1.0000
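The Proportion and Cumulative columns follow directly from the eigenvalues. A short sketch using the values printed in the table (the variable names are illustrative):

```python
from itertools import accumulate

eigenvalues = [4.11495951, 1.23872183, 0.72581663, 0.31643205,
               0.25797446, 0.22203947, 0.12405606]

# For a correlation matrix the eigenvalues sum to the number of variables (7).
total = sum(eigenvalues)
proportion = [ev / total for ev in eigenvalues]
cumulative = list(accumulate(proportion))

for i, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(i, f"{p:.4f}", f"{c:.4f}")
```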
Eigenvectors
Prin1 Prin2 Prin3 Prin4 Prin5 Prin6 Prin7
Murder 0.300279 -.629174 0.178245 -.232114 0.538123 0.259117 0.267593
Rape 0.431759 -.169435 -.244198 0.062216 0.188471 -.773271 -.296485
Robbery 0.396875 0.042247 0.495861 -.557989 -.519977 -.114385 -.003903
Assault 0.396652 -.343528 -.069510 0.629804 -.506651 0.172363 0.191745
Burglary 0.440157 0.203341 -.209895 -.057555 0.101033 0.535987 -.648117
Larceny 0.357360 0.402319 -.539231 -.234890 0.030099 0.039406 0.601690
Auto_Theft 0.295177 0.502421 0.568384 0.419238 0.369753 -.057298 0.147046
• 2-3 components explain 76%-87% of the variance
• First principal component has uniform variable weights, so is a general crime level indicator
• Second principal component appears to contrast violent versus property crimes
• Third component is harder to interpret
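To see where the first principal component comes from, the sketch below recovers it from the printed correlation matrix by power iteration. This is only an illustration of the idea; it is not how PROC PRINCOMP computes its output, and the function name `first_component` is hypothetical. Because the matrix is rounded to four decimals, the result agrees with the SAS figures only approximately.

```python
import math

names = ["Murder", "Rape", "Robbery", "Assault", "Burglary", "Larceny", "Auto_Theft"]
corr = [
    [1.0000, 0.6012, 0.4837, 0.6486, 0.3858, 0.1019, 0.0688],
    [0.6012, 1.0000, 0.5919, 0.7403, 0.7121, 0.6140, 0.3489],
    [0.4837, 0.5919, 1.0000, 0.5571, 0.6372, 0.4467, 0.5907],
    [0.6486, 0.7403, 0.5571, 1.0000, 0.6229, 0.4044, 0.2758],
    [0.3858, 0.7121, 0.6372, 0.6229, 1.0000, 0.7921, 0.5580],
    [0.1019, 0.6140, 0.4467, 0.4044, 0.7921, 1.0000, 0.4442],
    [0.0688, 0.3489, 0.5907, 0.2758, 0.5580, 0.4442, 1.0000],
]

def first_component(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv gives the eigenvalue because v has unit length.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, v

eigenvalue, weights = first_component(corr)
print(round(eigenvalue, 3))
for name, w in zip(names, weights):
    print(f"{name:>10} {w:.3f}")
```

All seven weights come out positive and of similar size, which is why the first component reads as a general crime level indicator.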
Cluster Analysis
• Techniques for identifying separate groups of similar cases
– Similarity of cases is either specified directly in a distance matrix, or defined in terms of some distance function
• Also used to summarise data by defining segments of similar cases in the data
– This use of cluster analysis is known as “dissection”
Clustering Techniques
• Two main types of cluster analysis methods
– Hierarchical cluster analysis
  • Each cluster (starting with the whole dataset) is divided into two, then divided again, and so on
– Iterative methods
  • k-means clustering (PROC FASTCLUS)
  • Analogous non-parametric density estimation method
– Also other methods
  • Overlapping clusters
  • Fuzzy clusters
Applications
• Market segmentation is usually conducted using some form of cluster analysis to divide people into segments
– Other methods such as latent class models or archetypal analysis are sometimes used instead
• It is also possible to cluster other items such as products/SKUs, image attributes, brands
Tandem Segmentation
• One general method is to conduct a factor analysis, followed by a cluster analysis
• This approach has been criticised for losing information and not yielding as much discrimination as cluster analysis alone
• However it can make it easier to design the distance function, and to interpret the results
Tandem k-means Example
proc factor data=datafile n=6 rotate=varimax round reorder flag=.54 scree out=scores;
  var reasons1-reasons15 usage1-usage10;
run;

proc fastclus data=scores maxc=4 seed=109162319 maxiter=50;
  var factor1-factor6;
run;
• Have used the default unweighted Euclidean distance function, which is not sensible in every context
• Also note that k-means results depend on the initial cluster centroids (determined here by the seed)
• Typically k-means is very prone to local optima
– Run at least 20 times to ensure a reasonable solution
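The advice to run k-means many times can be sketched in plain Python. This is not a translation of PROC FASTCLUS: it is a minimal k-means with unweighted Euclidean distance, run on hypothetical two-dimensional "factor scores", with all function names invented for the example.

```python
import random

def dist2(p, q):
    """Squared unweighted Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(group):
    """Centroid of a non-empty group of points."""
    return tuple(sum(coord) / len(group) for coord in zip(*group))

def kmeans(points, k, seed, max_iter=50):
    """Plain k-means; returns (within-cluster sum of squares, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # initial seeds depend on `seed`
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            groups[nearest].append(p)
        updated = [mean_point(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:                 # converged
            break
        centroids = updated
    wss = sum(min(dist2(p, c) for c in centroids) for p in points)
    return wss, centroids

# Hypothetical factor scores: three loose groups in two dimensions.
data_rng = random.Random(0)
points = [(data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
          for cx, cy in [(0, 0), (4, 0), (2, 4)] for _ in range(40)]

# Run 20 times with different seeds and keep the run with the lowest criterion.
runs = [kmeans(points, k=3, seed=s) for s in range(20)]
best_wss, best_centroids = min(runs, key=lambda run: run[0])
print(round(best_wss, 1), len({round(w, 6) for w, _ in runs}))
```

The second printed number counts how many distinct criterion values the 20 runs produced; more than one means some runs stopped at a poorer local solution.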
Selected Outputs
19th run of 5 segments
Cluster Summary
Cluster  Frequency  RMS Std Deviation  Maximum Distance from Seed to Observation  Nearest Cluster  Distance Between Cluster Centroids
1        433        0.9010             4.5524                                     4                2.0325
2        471        0.8487             4.5902                                     4                1.8959
3        505        0.9080             5.3159                                     4                2.0486
4        870        0.6982             4.2724                                     2                1.8959
5        433        0.9300             4.9425                                     4                2.0308
Selected Outputs
19th run of 5 segments
FASTCLUS Procedure: Replace=RANDOM Radius=0 Maxclusters=5 Maxiter=100 Converge=0.02
Statistics for Variables
Variable  Total STD  Within STD  R-Squared  RSQ/(1-RSQ)
FACTOR1   1.000000   0.788183    0.379684   0.612082
FACTOR2   1.000000   0.893187    0.203395   0.255327
FACTOR3   1.000000   0.809710    0.345337   0.527503
FACTOR4   1.000000   0.733956    0.462104   0.859095
FACTOR5   1.000000   0.948424    0.101820   0.113363
FACTOR6   1.000000   0.838418    0.298092   0.424689
OVER-ALL  1.000000   0.838231    0.298405   0.425324
Pseudo F Statistic = 287.84
Approximate Expected Over-All R-Squared = 0.37027
Cubic Clustering Criterion = -26.135
WARNING: The two above values are invalid for correlated variables.
Selected Outputs
19th run of 5 segments
Cluster Means
Cluster  FACTOR1   FACTOR2   FACTOR3   FACTOR4   FACTOR5   FACTOR6
1        -0.17151  0.86945   -0.06349  0.08168   0.14407   1.17640
2        -0.96441  -0.62497  -0.02967  0.67086   -0.44314  0.05906
3        -0.41435  0.09450   0.15077   -1.34799  -0.23659  -0.35995
4        0.39794   -0.00661  0.56672   0.37168   0.39152   -0.40369
5        0.90424   -0.28657  -1.21874  0.01393   -0.17278  -0.00972
Cluster Standard Deviations
Cluster  FACTOR1   FACTOR2   FACTOR3   FACTOR4   FACTOR5   FACTOR6
1        0.95604   0.79061   0.95515   0.81100   1.08437   0.76555
2        0.79216   0.97414   0.88440   0.71032   0.88449   0.82223
3        0.89084   0.98873   0.90514   0.74950   0.92269   0.97107
4        0.59849   0.74758   0.56576   0.58258   0.89372   0.74160
5        0.80602   1.03771   0.86331   0.91149   1.00476   0.93635
Cluster Analysis Options
• There are several choices of how to form clusters in hierarchical cluster analysis
– Single linkage
– Average linkage
– Density linkage
– Ward’s method
– Many others
• Ward’s method (like k-means) tends to form equal sized, roundish clusters
• Average linkage generally forms roundish clusters with equal variance
• Density linkage can identify clusters of different shapes
[Figures: example clusterings labelled "FASTCLUS" and "Density Linkage".]
Cluster Analysis Issues
• Distance definition
– Weighted Euclidean distance often works well, if weights are chosen intelligently
• Cluster shape
– Shape of clusters found is determined by method, so choose method appropriately
• Hierarchical methods usually take more computation time than k-means
• However multiple runs are more important for k-means, since it can be badly affected by local minima
• Adjusting for response styles can also be worthwhile
– Some people give more positive responses overall than others
– Clusters may simply reflect these response styles unless this is adjusted for, e.g. by standardising responses across attributes for each respondent
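Standardising responses across attributes for each respondent can be sketched as follows. The three respondents and their ratings are hypothetical, and dividing by each respondent's own standard deviation is one possible choice of adjustment, not the only one.

```python
import statistics

# Hypothetical agreement ratings (1-5) of three respondents on five attitude statements.
ratings = {
    "r1": [5, 5, 4, 5, 4],   # tends to agree with everything
    "r2": [2, 2, 1, 2, 1],   # tends to disagree with everything
    "r3": [1, 5, 3, 4, 2],
}

def standardise(row):
    """Centre and scale one respondent's ratings across attributes."""
    mean = statistics.mean(row)
    sd = statistics.pstdev(row)
    return [(x - mean) / sd if sd else 0.0 for x in row]

adjusted = {r: standardise(row) for r, row in ratings.items()}

for r, row in adjusted.items():
    print(r, [round(x, 2) for x in row])
```

After the adjustment r1 and r2 have identical profiles, so a clustering on the adjusted ratings would group them by their pattern of relative agreement instead of by how positively they answer overall.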
MVA - FASTCLUS
• PROC FASTCLUS in SAS tries to minimise the root mean square difference between the data points and their corresponding cluster means
– Iterates until convergence is reached on this criterion
– However it often reaches a local minimum
– Can be useful to run many times with different seeds and choose the best set of clusters based on this RMS criterion
• See http://www.clustan.com/k-means_critique.html for more k-means issues
Iteration History from FASTCLUS
                         Relative Change in Cluster Seeds
Iteration   Criterion        1        2        3        4        5
-------------------------------------------------------------------
        1      0.9645   1.0436   0.7366   0.6440   0.6343   0.5666
        2      0.8596   0.3549   0.1727   0.1227   0.1246   0.0731
        3      0.8499   0.2091   0.1047   0.1047   0.0656   0.0584
        4      0.8454   0.1534   0.0701   0.0785   0.0276   0.0439
        5      0.8430   0.1153   0.0640   0.0727   0.0331   0.0276
        6      0.8414   0.0878   0.0613   0.0488   0.0253   0.0327
        7      0.8402   0.0840   0.0547   0.0522   0.0249   0.0340
        8      0.8392   0.0657   0.0396   0.0440   0.0188   0.0286
        9      0.8386   0.0429   0.0267   0.0324   0.0149   0.0223
       10      0.8383   0.0197   0.0139   0.0170   0.0119   0.0173
Convergence criterion is satisfied.
Criterion Based on Final Seeds = 0.83824
Results from Different Initial Seeds
19th run of 5 segments
Cluster Means
Cluster    FACTOR1    FACTOR2    FACTOR3    FACTOR4    FACTOR5    FACTOR6
-------------------------------------------------------------------------
      1   -0.17151    0.86945   -0.06349    0.08168    0.14407    1.17640
      2   -0.96441   -0.62497   -0.02967    0.67086   -0.44314    0.05906
      3   -0.41435    0.09450    0.15077   -1.34799   -0.23659   -0.35995
      4    0.39794   -0.00661    0.56672    0.37168    0.39152   -0.40369
      5    0.90424   -0.28657   -1.21874    0.01393   -0.17278   -0.00972
20th run of 5 segments
Cluster Means
Cluster    FACTOR1    FACTOR2    FACTOR3    FACTOR4    FACTOR5    FACTOR6
-------------------------------------------------------------------------
      1    0.08281   -0.76563    0.48252   -0.51242   -0.55281    0.64635
      2    0.39409    0.00337    0.54491    0.38299    0.64039   -0.26904
      3   -0.12413    0.30691   -0.36373   -0.85776   -0.31476   -0.94927
      4    0.63249    0.42335   -1.27301    0.18563    0.15973    0.77637
      5   -1.20912    0.21018   -0.07423    0.75704   -0.26377    0.13729
Howard-Harris Approach
• Provides an automatic approach to choosing seeds for k-means clustering
• Chooses initial seeds by a fixed procedure
  – Takes the variable with the highest variance, splits the data at the mean, and calculates centroids of the resulting two groups
  – Applies k-means with these centroids as initial seeds
  – This yields a 2 cluster solution
  – Choose the cluster with the higher within-cluster variance
  – Choose the variable with the highest variance within that cluster, split the cluster as above, and repeat to give a 3 cluster solution
  – Repeat until a set number of clusters has been reached
• I believe this approach is used by the ESPRI software package (after variables are standardised by their range)
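The first step of the procedure above (pick the highest-variance variable, split at its mean, take the two centroids as seeds) might look like this. It is a sketch only: the function name and data are hypothetical, the subsequent k-means and repeated splitting are left out, and it assumes the chosen variable is not constant.

```python
from statistics import mean, pvariance

def centroid(group):
    """Mean of each variable over the rows in a group."""
    return tuple(mean(col) for col in zip(*group))

def first_split_seeds(rows):
    """Return (index of split variable, two initial centroids)."""
    columns = list(zip(*rows))
    # Variable with the highest variance.
    split_var = max(range(len(columns)), key=lambda j: pvariance(columns[j]))
    cut = mean(columns[split_var])
    # Split the data at the mean of that variable.
    low = [r for r in rows if r[split_var] <= cut]
    high = [r for r in rows if r[split_var] > cut]
    return split_var, [centroid(low), centroid(high)]

# Hypothetical data: the second variable has by far the larger variance.
rows = [(1.0, 10.0), (2.0, 30.0), (1.5, 12.0), (2.5, 28.0)]
split_var, seeds = first_split_seeds(rows)
```

The two tuples in `seeds` would then be passed to k-means as the initial centres for the 2 cluster solution.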
Another “Clustering” Method
• One alternative approach to identifying clusters is to fit a finite mixture model
  – Assume the overall distribution is a mixture of several normal distributions
  – Typically this model is fit using some variant of the EM algorithm
    • E.g. weka.clusterers.EM method in WEKA data mining package
    • See WEKA tutorial for an example using Fisher’s iris data
• Advantages of this method include:
  – Probability model allows for statistical tests
  – Handles missing data within model fitting process
  – Can extend this approach to define clusters based on model parameters, e.g. regression coefficients
• Also known as latent class modeling
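To make the EM idea concrete, here is a minimal sketch for the simplest case: one variable and two normal components. It is not the WEKA implementation; the function names, starting values, iteration count and data are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def em_two_normals(data, n_iter=50):
    """Fit a two-component 1-D normal mixture; returns (weights, means, sds, resp)."""
    # Crude starting values (an assumption): extremes as means, a quarter of the range as sd.
    means = [min(data), max(data)]
    spread = (max(data) - min(data)) / 4 or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to component 0.
        resp = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp.append(p0 / (p0 + p1))
        # M-step: weighted proportion, mean and sd for each component.
        for k, r in enumerate((resp, [1 - v for v in resp])):
            total = sum(r)
            weights[k] = total / len(data)
            means[k] = sum(ri * x for ri, x in zip(r, data)) / total
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, data)) / total
            sds[k] = max(math.sqrt(var), 1e-6)   # floor avoids a zero sd
    return weights, means, sds, resp

# Made-up data with two obvious groups, around 1 and around 5.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, sds, resp = em_two_normals(data)
```

Unlike k-means, each respondent receives a membership probability (`resp`) rather than a hard assignment, which is what makes this a probabilistic segmentation.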
Cluster Means
             Cluster 1   Cluster 2   Cluster 3   Cluster 4
Reason 1          4.55        2.65        4.21        4.50
Reason 2          4.32        4.32        4.12        4.02
Reason 3          4.43        3.28        3.90        4.06
Reason 4          3.85        3.89        2.15        3.35
Reason 5          4.10        3.77        2.19        3.80
Reason 6          4.50        4.57        4.09        4.28
Reason 7          3.93        4.10        1.94        3.66
Reason 8          4.09        3.17        2.30        3.77
Reason 9          4.17        4.27        3.51        3.82
Reason 10         4.12        3.75        2.66        3.47
Reason 11         4.58        3.79        3.84        4.37
Reason 12         3.51        2.78        1.86        2.60
Reason 13         4.14        3.95        3.06        3.45
Reason 14         3.96        3.75        2.06        3.83
Reason 15         4.19        2.42        2.93        4.04
(Slide legend: shading marked max. and min. values)
Cluster Means
             Cluster 1   Cluster 2   Cluster 3   Cluster 4
Usage 1           3.43        3.66        3.48        4.00
Usage 2           3.91        3.94        3.86        4.26
Usage 3           3.07        2.95        2.61        3.13
Usage 4           3.85        3.02        2.62        2.50
Usage 5           3.86        3.55        3.52        3.56
Usage 6           3.87        4.25        4.14        4.56
Usage 7           3.88        3.29        2.78        2.59
Usage 8           3.71        2.88        2.58        2.34
Usage 9           4.09        3.38        3.19        2.68
Usage 10          4.58        4.26        4.00        3.91
(Slide legend: shading marked max. and min. values)
Correspondence Analysis
• Provides a graphical summary of the interactions in a table
• Also known as a perceptual map
  – But so are many other charts
• Can be very useful
  – E.g. to provide overview of cluster results
• However the correct interpretation is less than intuitive, and this leads many researchers astray
[Figure: correspondence analysis map titled “Four Clusters (imputed, normalised)”, plotting Reasons 1 to 15, Usages 1 to 10 and Clusters 1 to 4. The two dimensions account for 53.8% and 25.3%; 2D fit = 79.1%. A legend marker flags points with correlation < 0.50.]
Interpretation
• Correspondence analysis plots should be interpreted by looking at points relative to the origin
  – Points that are in similar directions are positively associated
  – Points that are on opposite sides of the origin are negatively associated
  – Points that are far from the origin exhibit the strongest associations
• Also the results reflect relative associations, not just which rows are highest or lowest overall
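The point about relative associations can be illustrated with a small two-way table. The sketch below only computes the ratio of observed to expected values under independence, which is the kind of departure that correspondence analysis summarises; it does not carry out the decomposition that produces the map. The function name and the counts are hypothetical.

```python
def association_ratios(table):
    """Observed / expected (under independence) for each cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[cell * grand / (rt * ct) for cell, ct in zip(row, col_totals)]
            for row, rt in zip(table, row_totals)]

# Hypothetical counts: rows = attributes, columns = segments A and B.
table = [[90, 60],   # popular attribute, endorsed heavily by both segments
         [10, 40]]   # niche attribute, endorsed mainly by segment B
ratios = association_ratios(table)
```

Segment B's largest raw count is on the popular attribute (60 against 40), but its ratio is below 1 there and well above 1 on the niche attribute, so on a map segment B would be pulled towards the niche attribute.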
Software for Correspondence Analysis
• Earlier chart was created using a specialised package called BRANDMAP
• Can also do correspondence analysis in most major statistical packages
• For example, using PROC CORRESP in SAS:
*---Perform Simple Correspondence Analysis---Example 1 in SAS OnlineDoc;
proc corresp all data=Cars outc=Coor;
   tables Marital, Origin;
run;

*---Plot the Simple Correspondence Analysis Results---;
%plotit(data=Coor, datatype=corresp)
[Figure: Cars by Marital Status]
Canonical Discriminant Analysis
• Predicts a discrete response from continuous predictor variables
• Aims to determine which of g groups each respondent belongs to, based on the predictors
• Finds the linear combination of the predictors with the highest correlation with group membership
  – Called the first canonical variate
• Repeat to find further canonical variates that are uncorrelated with the previous ones
  – Produces maximum of g-1 canonical variates
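For the simplest case of g = 2 groups and two predictors there is only one canonical variate, and its direction can be found by solving W w = d, where W is the pooled within-group scatter matrix and d is the difference in group means. The sketch below does this by hand; the function names and data are made up, and the general case needs an eigen-decomposition that is not shown.

```python
from statistics import mean

def pooled_within_scatter(groups):
    """Within-group sums of squares and products (sxx, sxy, syy), pooled over groups."""
    sxx = sxy = syy = 0.0
    for g in groups:
        mx = mean(p[0] for p in g)
        my = mean(p[1] for p in g)
        for x, y in g:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    return sxx, sxy, syy

def first_canonical_direction(group_a, group_b):
    """Direction w solving W w = d for two groups and two predictors."""
    sxx, sxy, syy = pooled_within_scatter([group_a, group_b])
    dx = mean(p[0] for p in group_a) - mean(p[0] for p in group_b)
    dy = mean(p[1] for p in group_a) - mean(p[1] for p in group_b)
    det = sxx * syy - sxy ** 2
    # Explicit inverse of the 2x2 matrix W.
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)

# Made-up respondents measured on two predictors.
group_a = [(1, 2), (2, 1), (2, 3), (3, 2)]
group_b = [(4, 5), (5, 4), (5, 6), (6, 5)]
w = first_canonical_direction(group_a, group_b)
scores_a = [w[0] * x + w[1] * y for x, y in group_a]
scores_b = [w[0] * x + w[1] * y for x, y in group_b]
```

Projecting each respondent onto `w` gives a single score on which the two groups do not overlap in this toy example.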
[Figure: CDA plot, axes Canonical Var 1 and Canonical Var 2]
Discriminant Analysis
• Discriminant analysis also refers to a wider family of techniques
  – Still for discrete response, continuous predictors
  – Produces discriminant functions that classify observations into groups
    • These can be linear or quadratic functions
    • Can also be based on non-parametric techniques
  – Often train on one dataset, then test on another
CHAID
• Chi-squared Automatic Interaction Detection
• For discrete response and many discrete predictors
  – Common situation in market research
• Produces a tree structure
  – Nodes get purer, more different from each other
• Uses a chi-squared test statistic to determine best variable to split on at each node
  – Also tries various ways of merging categories, making a Bonferroni adjustment for multiple tests
  – Stops when no more “statistically significant” splits can be found
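The split-selection step can be sketched by computing the Pearson chi-squared statistic for each candidate predictor at a node and taking the largest. This is a simplification under stated assumptions: both hypothetical predictors have two categories (so the statistics share the same degrees of freedom), and the category merging and Bonferroni-adjusted p-values used by CHAID itself are omitted.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a two-way table of counts."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for row, rt in zip(table, row_totals):
        for obs, ct in zip(row, col_totals):
            expected = rt * ct / grand
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical node of 200 respondents; columns = (bought, did not buy).
candidate_splits = {
    "gender": [[30, 70], [50, 50]],
    "region": [[40, 60], [40, 60]],
}
scores = {name: chi_squared(t) for name, t in candidate_splits.items()}
best = max(scores, key=scores.get)
```

Here the response rate differs by gender but not by region, so gender is the predictor chosen for the split.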
Example of CHAID Output
Titanic Survival Example
                      Adults (20%)
                     /
                    /
                 Men
                /   \
               /     \
              /       Children (45%)
             /
All passengers
             \
              \       3rd class or crew (46%)
               \     /
                \   /
                 Women
                     \
                      \
                       1st or 2nd class passenger (93%)
CHAID Software
• Available in SAS Enterprise Miner (if you have enough money)
  – Was provided as a free macro until SAS decided to market it as a data mining technique
  – TREEDISC.SAS – still available on the web, although apparently not on the SAS web site
• Also implemented in at least one standalone package
• Developed in 1970s
• Other tree-based techniques available
  – Will discuss these later
TREEDISC Macro
%treedisc(data=survey2, depvar=bs,
nominal=c o p q x ae af ag ai: aj al am ao ap aw bf_1 bf_2 ck cn:,
ordinal=lifestag t u v w y ab ah ak,
ordfloat=ac ad an aq ar as av,
options=list noformat read,maxdepth=3,
trace=medium, draw=gr, leaf=50,
outtree=all);
• Need to specify type of each variable
  – Nominal, Ordinal, Ordinal with a floating value
Partial Least Squares (PLS)
• Multivariate generalisation of regression
  – Have model of form Y=XB+E
  – Also extract factors underlying the predictors
  – These are chosen to explain both the response variation and the variation among predictors
• Results are often more powerful than principal components regression
• PLS also refers to a more general technique for fitting general path models, not discussed here
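For a single response, the first PLS factor can be sketched directly: the weight vector is proportional to the covariance of each centred predictor with the centred response, the factor scores are the predictors projected onto that vector, and the response is then regressed on the scores. The function name and data below are hypothetical, and only the first factor is extracted.

```python
import math
from statistics import mean

def first_pls_component(X, y):
    """Weights, scores and y-loading of the first PLS factor (single response)."""
    cols = [list(c) for c in zip(*X)]
    col_means = [mean(c) for c in cols]
    y_mean = mean(y)
    Xc = [[v - m for v, m in zip(row, col_means)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of covariance between the predictors and y.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(len(cols))]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Factor scores, then the regression coefficient of y on the scores.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q

# Made-up data with two highly collinear predictors.
X = [[1, 2], [2, 4.1], [3, 5.9], [4, 8.2]]
y = [1.1, 1.9, 3.2, 3.9]
w, t, q = first_pls_component(X, y)
fitted = [mean(y) + q * ti for ti in t]
```

Because the factor is built from both predictors at once, collinearity between them does not destabilise the fit in the way it can for ordinary regression.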
Structural Equation Modeling (SEM)
• General method for fitting and testing path analysis models, based on covariances
• Also known as LISREL
• Implemented in SAS in PROC CALIS
• Fits specified causal structures (path models) that usually involve factors or latent variables
  – Confirmatory analysis
SEM Example: Relationship between Academic and Job Success
SAS Code
data jobfl (type=cov);
   * missover stops the short lower-triangular rows wrapping onto the next line;
   infile cards missover;
   input _type_ $ _name_ $ act cgpa entry salary promo;
   cards;
n   .      500   500   500   500   500
cov act    1.024
cov cgpa   0.792 1.077
cov entry  0.567 0.537 0.852
cov salary 0.445 0.424 0.518 0.670
cov promo  0.434 0.389 0.475 0.545 0.716
;

proc calis data=jobfl cov stderr;
   lineqs
      act    = 1*F1 + e1,
      cgpa   = p2f1*F1 + e2,
      entry  = p3f1*F1 + e3,
      salary = 1*F2 + e4,
      promo  = p5f1*F2 + e5;
   std
      e1 = vare1,
      e2 = vare2,
      e3 = vare3,
      e4 = vare4,
      e5 = vare5,
      F1 = varF1,
      F2 = varF2;
   cov
      f1 f2 = covf1f2;
   var act cgpa entry salary promo;
run;
Results
• All parameters are statistically significant, with a high correlation being found between the latent traits of academic and job success
• However the overall chi-squared value for the model is 111.3, with 4 d.f., so the model does not fit the observed covariances perfectly
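The 4 d.f. quoted above can be checked by counting: five observed variables give 15 distinct variances and covariances, and the model specified in the SAS code has 11 free parameters. The sketch below does that count and then computes the upper-tail probability for a chi-squared of 111.3, using the closed form that holds when the degrees of freedom are even. The variable and function names are my own.

```python
import math

n_vars = 5
n_moments = n_vars * (n_vars + 1) // 2   # distinct variances and covariances
# Free parameters in the model: 3 loadings (p2f1, p3f1, p5f1),
# 5 error variances, 2 factor variances and 1 factor covariance.
n_free = 3 + 5 + 2 + 1
df = n_moments - n_free

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with even df exceeds x), via the closed-form sum."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

p_value = chi2_upper_tail_even_df(111.3, df)
```

The resulting p-value is far below any conventional threshold, which is what lies behind the statement that the model does not reproduce the observed covariances perfectly.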
Latent Variable Models
• Have seen that both latent trait and latent class models can be useful
  – Latent traits for factor analysis and SEM
  – Latent class for probabilistic segmentation
• Mplus software can now fit combined latent trait and latent class models
  – Appears very powerful
  – Subsumes a wide range of multivariate analyses
Broader MVA Issues
• Preliminaries
  – EDA is usually very worthwhile
    • Univariate summaries, e.g. histograms
    • Scatterplot matrix
    • Multivariate profiles, spider-web plots
  – Missing data
    • Establish amount (by variable, and overall) and pattern (across individuals)
    • Think about reasons for missing data
    • Treat missing data appropriately, e.g. impute, or build into model fitting
MVA Issues
• Preliminaries (continued)
  – Check for outliers
    • Large values of Mahalanobis’ D²
• Testing results
  – Some methods provide statistical tests
  – But others do not
    • Cross-validation gives a useful check on the results
      – Leave-1-out cross-validation
      – Split-sample training and test datasets
        » Sometimes 3 groups needed
        » For model building, training and testing
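The outlier check mentioned above can be sketched for two variables, where the covariance matrix can be inverted by hand. D² measures how far each respondent is from the centroid after allowing for the variances and the correlation. The function name and data are made up for illustration.

```python
from statistics import mean

def mahalanobis_d2(points):
    """Squared Mahalanobis distance of each 2-D point from the sample centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    d2 = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the explicit inverse of the 2x2 covariance matrix.
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2

# Five points follow a rising trend; the last is within range on both
# variables but sits well off the trend.
points = [(1, 1), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8), (2, 4.5)]
d2 = mahalanobis_d2(points)
most_unusual = d2.index(max(d2))
```

The last point has the largest D² even though neither of its values is extreme on its own, which is why univariate summaries alone can miss multivariate outliers.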