Outline Research Methods

Upload: jeanna-hanson

Post on 03-Apr-2018


  • 7/28/2019 Outline Research Methods

    1/31

    [1] 30.03.2012

    Summary:

    Research Methods


Part I: Conceptual and Ethical Foundations

Chapter 1: The Spirit of Behavioral Research

    Science and the Search for Knowledge:

Scientific method is a misleading term because not every finding can be accounted for with the same formulaic strategy; it is one of the four ways of knowing.

Empirical reasoning is the process that underlies the scientific method. The verifiability principle means being able to assess truth by amassing factual observations.

Falsifiable hypotheses are claims through which knowledge evolves via repeated testing of circumstances.

    What do Behavioral Researchers Really Know?:

Why-questions and how-questions: Why addresses the underlying process of something, and how addresses the observable nature of things.

Descriptive research orientation(s) refers to social constructionism, contextualism/perspectivism, and evolutionary epistemology.

    Social Constructionism:

Panpsychistic. The (social) world exists only in the cognition of the individual, which itself is a linguistic construction.

Explaining the world through interpretations and narrative analyses of everyday life. The deductivist approach is enough, which leads to probabilistic assertions (A is more likely than B).

Science should be defined through experiments on things that can be exactly duplicated (Gergen, 1995); by that standard, geology would not be a scientific discipline.

    Contextualism/Perspectism:

Both started as a reaction against a mechanistic model of behavior. Pure knowledge in itself does not exist. Circumstances acquire meaning only through context, such as societal systems and culture, and are true only within those borders (contextualism).


Inductive-statistical explanation: Reasoning from specific to general in a probabilistic way.

Orienting Habits of Good Scientific Practice:

Enthusiasm. Open-mindedness. Common sense. Role-taking ability. Inventiveness. Confidence in one's own judgment. Consistency and care about detail. Ability to communicate. Honesty.


    Chapter 2

    Contexts of Discovery and Justification

    Inspiration and Exploration:

Discovery refers to the origin of ideas or the genesis of theories and hypotheses. Justification refers to the processes by which hypotheses are empirically adjudicated. Null hypothesis significance testing is a dichotomous (either/or) decision-making paradigm.

    Theories and Hypotheses:

Hypotheses are theoretical statements or theoretical propositions. Theories are hypothetical formulations; conjectural. Thinking inductively is thinking theoretically. Testability means that theories and hypotheses are stated in a way that should allow disconfirmation (falsification).

    Using a Case Study for Inspiration:

Using an in-depth analysis of an individual or a group of people with shared characteristics.

    Serendipity in Behavioral Research:

Serendipity is the term for lucky findings.

Novelty, Utility, and Consistency:

Novelty: an idea shouldn't be merely a minor variation on an older idea; it should make a real contribution to the scientific world.

Utility and consistency: Is the idea useful and consistent with what is generally known in the field?

To ensure that you are up to date, consult the newest literature on your topic.

Testability and Refutability:

Popper's idea was that falsifiability, not verifiability, is the essential difference between science and non-science/pseudoscience.

    Clarity and Conciseness:

    Operational definition: The technical name for an empirically based definition.


Theoretical definitions: Do not attempt to force our thinking into a rigidly empirical mold. A typology is a systematic classification of types, used when trying to condense the definition of a psychological concept.

Facet analysis: Formulating a classification system based on assumed structural patterns or dimensions (factor analysis).

Coherence concerns whether the precise statement of the hypothesis fits together logically. Parsimony describes the simplicity of a statement (Occam's razor).

    Positivism:

Embracing the positive, observational stance (as opposed to negativism). Statements authenticated by sensory experience are more likely to be true (see the Vienna Circle).

Hume: All knowledge resolves itself into probability, and thus it is impossible to prove beyond doubt that a generalization is incontrovertibly true.

    Falsificationism:

    Antipositivist view on the basis of inescapable conclusions. It is argued that theories might have boundaries that will never be explored.

    Conventionalism:

The Duhem-Quine thesis plays on the role of language, meaning that theories evolve on the basis of certain linguistic conventions (like simplicity).

    An Amalgamation of Ideas:

The views about the requirements of scientific theories and hypotheses now seem a mixture of falsificationism, conventionalism, and practicality:

i. Finite testability.
ii. Falsifiability.
iii. Theories can add to or replace outmoded models.
iv. If a theory is not supported, it may not be right.
v. Even if there is support, it may not be true.
vi. There are always alternative explanations.

Type I and Type II Decision Errors:

Null hypothesis: H0. Alternative hypothesis: H1.


Type I error is the mistake of rejecting H0 when it is true. Type II error refers to the mistake of not rejecting H0 when it is false. The significance level, denoted alpha, is the maximum tolerated probability of a Type I error; the p-value computed from the data is compared against it.

The probability of a Type II error is symbolized as beta. Confidence, 1 - alpha, is the probability of not making a Type I error. Power, 1 - beta, is the probability of not making a Type II error; sensitivity.
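Not from the outline: a minimal simulation sketch of these definitions. The details are assumptions chosen for illustration (normal data with known sigma = 1, a two-sided z-test at alpha = .05, 25 subjects per group, an arbitrary true effect of 0.8):

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(group1, group2, sigma=1.0, alpha=0.05):
    """Two-sided z-test for a difference in means (sigma assumed known)."""
    n = len(group1)
    z = (mean(group1) - mean(group2)) / (sigma * (2 / n) ** 0.5)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    return abs(z) > crit

def rejection_rate(delta, n=25, trials=5000, rng=random.Random(42)):
    """Proportion of simulated experiments in which H0 (equal means) is rejected."""
    rejections = 0
    for _ in range(trials):
        g1 = [rng.gauss(delta, 1.0) for _ in range(n)]  # true mean differs by delta
        g2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rejections += z_test_rejects(g1, g2)
    return rejections / trials

alpha_hat = rejection_rate(delta=0.0)  # H0 true: estimates the Type I risk (near .05)
power_hat = rejection_rate(delta=0.8)  # H0 false: estimates power, 1 - beta
```

With H0 true, rejections are Type I errors and occur at roughly the alpha rate; with a real effect, the rejection rate estimates power.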

    Statistical Significance and the Effect Size:

Significance test = Size of effect × Size of study.

Any particular test of significance can be obtained by one or more definitions of the effect size multiplied by one or more definitions of the study size.
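One concrete instance of this identity (not worked out in the outline) is the chi-square test for a 2x2 table, where χ²(1) = φ² × N. A sketch verifying it numerically; the counts are invented:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def phi_2x2(a, b, c, d):
    """Phi coefficient (correlation-family effect size) for the same table."""
    return (a * d - b * c) / ((a + b) * (c + d) * (a + c) * (b + d)) ** 0.5

a, b, c, d = 30, 10, 15, 25          # hypothetical cell counts
n = a + b + c + d                    # study size
chi2 = chi_square_2x2(a, b, c, d)
phi = phi_2x2(a, b, c, d)
# significance test = effect size x study size:
assert abs(chi2 - phi ** 2 * n) < 1e-9
```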

    Two Families of Effect Sizes:

Correlation family (phi, r_pb, R, partial eta squared). Difference family (Cohen's d, Hedges's g). (There is also a ratio family.)
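As an illustration beyond the outline, one member of each family can be computed from the same two hypothetical groups; the formulas used are the standard pooled-SD d and the equal-n conversion r = d / sqrt(d² + 4):

```python
from statistics import mean, stdev

def cohens_d(g1, g2):
    """Difference-family effect size: mean difference over the pooled SD."""
    n1, n2 = len(g1), len(g2)
    s_pooled = (((n1 - 1) * stdev(g1) ** 2 + (n2 - 1) * stdev(g2) ** 2)
                / (n1 + n2 - 2)) ** 0.5
    return (mean(g1) - mean(g2)) / s_pooled

def d_to_r(d):
    """Correlation-family equivalent (equal group sizes): r = d / sqrt(d^2 + 4)."""
    return d / (d ** 2 + 4) ** 0.5

g1 = [5, 6, 7, 8, 9]   # made-up scores
g2 = [3, 4, 5, 6, 7]
d = cohens_d(g1, g2)   # 2 / sqrt(2.5), about 1.26
r = d_to_r(d)
```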

    Interval Estimates around Effect Sizes:

The null-counternull interval is an estimate based on the actual p rather than the chosen alpha level.
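A sketch of the counternull computation (Rosenthal and Rubin's 2 × ES rule for symmetric sampling distributions); the observed d here is invented:

```python
def counternull(effect_size, null_value=0.0):
    """Counternull value of an effect size: the non-null effect magnitude
    supported by exactly as much evidence as the null value. Assumes the
    effect-size estimate has a symmetric sampling distribution."""
    return 2 * effect_size - null_value

# Hypothetical example: an observed d of 0.30 that failed to reach significance.
# The null-counternull interval [0.0, 0.6] says d = 0.6 is as consistent with
# the data as d = 0.0, so "no effect" is not the privileged conclusion.
lo, hi = 0.0, counternull(0.30)
```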


    Chapter 3

    Ethical Considerations, Dilemmas, and Guidelines

    Puzzles and Problems:

A controversy between Wittgenstein and Popper concerned their views of philosophy: there are no problems in philosophy, only linguistic puzzles revealing misuse of language (Wittgenstein).

Ethics has to do with the values by which the conduct of individuals is morally evaluated.

A Delicate Balancing Act:

Even if topics seem neutral (and the researcher as well), one must realize that to others these topics are not necessarily value-free.

Institutional Review Board (IRB): This group of independent people oversees the work of scientists.

Passive deception: deception by leaving out or avoiding something. Active deception: deception through intentionally misleading information on behalf of a certain party.

    Historical Context of the APA Code:

The APA created a task force, the Cook Commission, which wrote a code of ethics (1966; adapted in 1972).

The APA proposed a list of 10 ethical guidelines.

The Belmont Report, Federal Regulations, and the IRB:

Belmont Report: Emphasizes respect, maximization of possible benefits (and minimization of harm), and fairness.

If more than minimal risk is involved, then specific safety requirements are necessary. Minimal-risk research includes studies in which the probability of the participant being harmed is no higher than in everyday life.

The emphasis is on five broad principles.

Principle I: Respect for Persons and Their Autonomy:

Informed consent refers to the procedure by which prospective subjects voluntarily agree to participate. It may also impair the validity of the research (subject expectations) or create doubts of its own (paranoid ideation).

Principle II: Beneficence and Nonmaleficence:


Doing good (beneficence) and doing no harm (nonmaleficence). Debriefing is used if deception was involved; also referred to as dehoaxing. It can also be useful for disclosing information that was not revealed before.

Principle III: Justice:

Placebo (represents an issue of fairness); it implies a masquerading of the real thing. Wait-list control group: here the alternative therapy is given to the control group after results have been documented in the experimental group.

How people view ethical questions depends on their orientation. The consequentialist view holds that right or wrong depends on the ultimate consequence, whereas the deontologist view argues that the procedure matters, not the consequence that follows.

For participants who have been in a placebo or control group, a moral cost may be involved simply in publishing the results of the study.

Principle IV: Trust:

Confidentiality is intended to ensure the subject's privacy through procedures for protecting the data (see Certificate of Confidentiality).

Principle V: Fidelity and Scientific Integrity:

Causism is the problem that occurs when someone implies a causal relationship where none is supported by the underlying data.

Omission of data, questionable handling of outliers, and outright fabrication of data undermine the establishment of meaningful associations between variables. There is a discussion about the moral and technical appropriateness of analysis and re-analysis of data.

Plagiarism refers to the stealing of another's ideas or work.

Costs, Utilities, and IRBs:

It is important not only to consider the expenses of research but also to look at the probable costs of doing no research in a specific field (prospective cost/risk analysis).

    Scientific and Societal Responsibilities:

    Research on animals has been the foundation for numerous significant advances.


Part II: Operationalization and Measurement of Dependent Variables

    Chapter 4

    Reliability and Validity of Measurements

    Random and Systematic Error:

Reliability is the degree to which measures give consistent, dependable, and stable results.

Validity refers to the degree to which measures are appropriate or meaningful. Error is the fact that all measurements are subject to fluctuations. Noise is another description of chance error. Random error is presumed to be uncorrelated with the actual value (it pushes measurements up or down, so over many trials the errors should be very close to normal and cancel out).

Systematic error pushes measurements in the same direction, and over many trials the population value won't be reached.
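A small simulation sketch of this distinction (not in the outline; the true score, noise level, and bias are arbitrary):

```python
import random
from statistics import mean

rng = random.Random(7)
true_score = 100.0

# Random error: fluctuations uncorrelated with the true value average out.
random_only = [true_score + rng.gauss(0, 5) for _ in range(10_000)]

# Systematic error: a constant bias pushes every measurement the same way,
# so no amount of averaging recovers the true value.
biased = [true_score + 3.0 + rng.gauss(0, 5) for _ in range(10_000)]

mean_random = mean(random_only)  # lands near 100
mean_biased = mean(biased)       # lands near 103, not 100
```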

The systematic errors in experimental research are the main concern of internal validity.

Assessing Stability and Equivalence:

Three traditional types of reliability exist, and each is quantified by a reliability coefficient.

i. Test-retest reliability: consistency from one measurement to another.

ii. Alternate-form reliability: the equivalence of different versions of a test. The correlation between two such tests is called equivalent-forms reliability; the tests are also expected to have the same variance. Stability coefficients are correlations between scores on the same form administered to the same people at different times. Coefficients of equivalence are correlations between scores on different forms administered at the same time. Cross-lagged correlations are used to indicate both the stability and the equivalence.

iii. Internal-consistency reliability is the degree of relatedness of items that measure the same thing in a test; the reliability of components.

    Internal-Consistency Reliability and Spearman Brown:
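The outline leaves this section empty (the source pages are missing). As a reminder sketch, the Spearman-Brown prophecy formula predicts reliability when a test is lengthened by a factor n, e.g. when stepping up a split-half correlation to full length:

```python
def spearman_brown(r, n):
    """Predicted reliability when test length is multiplied by n,
    given current reliability r (Spearman-Brown prophecy formula)."""
    return n * r / (1 + (n - 1) * r)

# Stepping a split-half correlation of .70 up to the full-length test (n = 2):
r_full = spearman_brown(0.70, 2)   # about .82
```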


i. When the replication is conducted. Early replications are typically more useful.

ii. How the replication is conducted. A precise replication is intended to be as close as possible to the original, whereas a varied replication intentionally changes an aspect of it.

iii. Who conducted the replication, because of the problem of correlated replicators (the same person replicating a finding over and over again). Unfortunately and fortunately, some researchers are precorrelated by virtue of their common interests.

    Validity Criteria in Assessment:

Determining validity typically depends on the accumulation of evidence in three specific areas.

i. Content validity requires that the items represent the material/content they should.

ii. Criterion validity concerns the extent to which the test correlates with criteria it should correlate with. When a criterion is in the immediate present, we speak of concurrent validity. Another type of criterion-related indication is predictive validity.

iii. Construct validity concerns whether the measure actually captures the psychological characteristic (construct) it was designed to assess. There are two types of construct validity: convergence (convergent validity) across different methods or measures of the same trait, and divergence (discriminant validity) between measures of related but conceptually distinct behaviors or traits. The multitrait-multimethod matrix of intercorrelations is used to triangulate the convergent and discriminant validity of a construct.

    Test Validity, Practical Utility, and the Taylor-Russell Tables:

Selection ratio is the proportion of applicants to be selected by a test. Selection accuracy increases as the validity coefficients increase and the selection ratios decrease.

The benefits of increasing validity coefficients are usually greater as selection ratios decrease.

    Relationship of Validity to Reliability:

There is no minimum level of internal-consistency reliability needed for validity, but in practice low reliabilities paired with high validity are uncommon. In practice, the validity of a composite instrument depends on the average validity, number, and intercorrelation of the individual items, subtests, or judges: the larger, the better.


    Chapter 5

    Observations, Judgments, and Composite Variables

    Observing, Classifying, and Evaluating:

Qualitative data refers to data in written form, records, etc. Quantitative data consists of numerical data.

    Observing While Participating:

The preferred strategy of investigation for ethnographers is participant observation, which means interacting while participating as observers in a culture.

    Maximizing Credibility and Serendipity:

Time sampling involves sampling specified periods and recording everything of interest during that time.

Behavioral sampling is used when the behavior itself is periodically sampled. Events are relatively brief occurrences at a specific moment. States are occurrences of longer duration. A condensed account consists of the brief notes made during or immediately after observation.

An expanded account adds details, recalled later, that were not recorded at the time.

Organizing and Sense-Making in Ethnographic Research:

Organize it chronologically and zoom from broad to narrow. Analytic serendipity starts with knowledge of the current literature and proceeds by asking questions about particular phenomena.

    Interpreter and Observer Biases:

These are normally categorized as noninteractional artifacts. Such artifacts are systematic errors that operate in the hands of the scientist but are not due to uncontrolled variables that might interact with the participants' behavior.

Interpreter biases are systematic errors that occur while interpreting the data (e.g. while clinging to a specific theory).

Observer biases refer to systematic errors in the recording phase of research (perception does not equal reality), generally in favor of the researcher's hypothesis.

    Unobtrusive Observations and Nonreactive Measurements:


Obtrusive observations are not controlled for and have a direct impact on the reactions of research participants. Reactive measures affect the behavior that is being measured, whereas nonreactive measures do not.

    Archives:

Two subcategories exist in archival research: running records, such as actuarial data (birth, marriage, a Facebook timeline), and personal documents.

    Archival Data and Content Analysis:

Content analysis is the name given to a set of procedures used to categorize and evaluate pictorial, verbal, or textual material. Use something (a hypothesis, common sense) as a basis for judging the material.

    Physical Traces:

Simple unobtrusive observations are observations that are not apparent to the person being observed.

    Unobtrusive Observation:

Contrived unobtrusive observations are unobtrusive measures in manipulated situations.

Selecting the Most Appropriate Judges:

Judges are used to evaluate or categorize the variables of interest (random judges or expert judges, depending on the corresponding hypothesis).

There is generally a positive correlation between bias toward a category and accuracy in that category.

    Effects of Guessing and Omissions on Accuracy:

Scoring omitted items as zero gives too little credit. It is reasonable to credit these items with the score that would be attained by random guessing.
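The scoring rule just described can be sketched as follows (assumptions: k answer options per item, one point per correct answer, so an omitted item earns the chance expectation of 1/k):

```python
def adjusted_score(num_right, num_omitted, k):
    """Credit each omitted item with the expected score from random
    guessing among k options (1/k points) instead of zero."""
    return num_right + num_omitted / k

# 40 items answered correctly, 10 omitted, 4 options per item:
score = adjusted_score(40, 10, 4)   # 40 + 10/4 = 42.5
```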

    Forced-Choice Judgments:

The forced-choice format is a procedure used to overcome halo error.

Halo error refers to a type of response set in which the person being evaluated is judged in terms of a general impression.

    Categorical Scales and Rating Scales:

Rating scales are responses on some form of continuous scale (numerical, magnitude, and graphic formats).

    Numerical Formats:


The numerical format has numbers as anchors plus accompanying descriptions that give those numbers meaning and a specific context for evaluation.

    Graphic Formats:

Graphic formats are simply straight lines (often implicitly divided into different areas) on which the judge or participant indicates his or her position (attitude) with regard to the construct in question (anchor).

    Scale Points and Labels:

The greatest benefits to the reliability of the measurement instrument accrue as you go from 2 to 7 scale points.

Unipolar scales contain gradual distributions of one aspect. Bipolar scales contain gradual distributions of one aspect and its opposite, anchored with neutrality in the middle.

    Magnitude Scaling:

Magnitude scaling is a concept in which the upper range of the score is not defined but is left open-ended for the judge's interpretation.

    Rating Biases and their Control:

Leniency bias: raters tend to rate someone who is familiar more positively.

When raters are reminded about leniency, they tend to overcorrect and evaluate more negatively (severity).

Central-tendency bias occurs when raters avoid giving extreme ratings.

Logical error in rating refers to the problem that judges rate variables or dimensions in a similar way merely on the basis of perceived logical relatedness.

The difference between halo and logical error is that the latter results from the judges' conscious evaluation of relatedness and not from a subjective feeling.

    Bipolar Versus Unipolar Scales:

It may often be worth the effort to use more unipolar scales in hopes of turning up some surprises, but bipolar scales are more common.

    Forming Composite Variables:

To form composite variables, you first standardize the scores of all constituent variables and then combine the standardized scores into an overall mean for each case.
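The recipe above (standardize, then combine) can be sketched as follows; the variable names and values are invented, and the combination rule assumed here is a per-person mean of z scores:

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a variable: mean 0, (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite(*variables):
    """Per-person composite: the mean of the standardized scores
    of all constituent variables."""
    standardized = [z_scores(v) for v in variables]
    return [mean(person) for person in zip(*standardized)]

anxiety = [10, 12, 14, 16, 18]   # hypothetical raw scores, 5 people
worry   = [3, 5, 4, 8, 10]
neuroticism = composite(anxiety, worry)  # one z-based score per person
```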

    Benefits of Forming Composite Variables:


If variables are highly correlated with each other, it is hard to treat them as being different. Thus, forming composite variables is conceptually beneficial. If you combine variables, you obtain more accurate estimates of the relationships with other composite variables, and you reduce the number of predictors.

    Forming Composites and Increasing Effect Sizes:

Only when the individual variables are perfectly correlated with each other is there no benefit to forming composites. The lower the mean intercorrelation among the individual variables, the greater the increase in r.

    The Intra/Intermatrix:

The intra average is the average correlation between variables within a single composite.

The inter average characterizes the level of relationship between one composite variable and another.

    The r Method:

In this method the point-biserial correlation is computed. This is the Pearson r where one of the variables is continuous and the other is dichotomous. The r_pb is computed between the mean correlations of the intra/intermatrix (continuous) and their dichotomously coded position on versus off the principal diagonal. The more positive the correlation, the higher the intra relative to the inter.

    The g Method:

g is the difference between the mean of the mean r's on the diagonal and the mean of the mean r's off the diagonal, divided by the weighted S combined from the on-diagonal (intra) and off-diagonal (inter) values of r.

    The Range-to-Midrange Ratio:

The range-to-midrange ratio is used when neither the r method nor the g method can be applied.


Part III: The Logic of Research Designs

Chapter 7: Randomized Controlled Experiments and Causal Inference

    Experimentation in Science:

The most important principle is randomization. Experimentation is a systematic study design to examine the consequences of deliberately varying a potential causal agent. Variation, posttreatment measures, and inferential techniques are common features.

    Randomized Experimental Designs:

Between-subjects designs are regarded as the gold standard, e.g. randomly assigning the subjects to a specific treatment group or placebo.

As there is much ethical debate about the use of placebos, they may only be used when there are no other therapies available that are suitable for comparison.

Wait-list control groups are another possibility; combined with repeated measurements, these allow us to gain further information about the temporal effects of the drug.

A within-subjects design, or crossed design, is used when each participant receives more than one treatment or is in more than one condition (e.g. being experimenter and subject at the same time).

To address the problem of systematic differences between successive conditions, the experimenter can use counterbalancing, that is, rotating the sequence of the conditions in a so-called Latin square.
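The rotation can be sketched like this (a simple cyclic Latin square; balanced squares that also control first-order carryover are a refinement not shown):

```python
def latin_square(conditions):
    """Each row is one presentation order; every condition appears exactly
    once in every row and every column (simple cyclic rotation)."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)]
            for row in range(k)]

# Four condition orders for four groups of subjects:
# A B C D / B C D A / C D A B / D A B C
square = latin_square(["A", "B", "C", "D"])
```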

A factorial design includes more than one variable (or factor) and more than one level of each factor (e.g. 2x2). Also known as full factorials.

    Fractional factorial designs only use some combinations of factor levels.

    Mixed factorial designs, consisting of both between- and within-subjects factors.

    Characteristics of Randomization:

The rare instances in which very large differences between conditions existed even before the treatments were administered are sometimes referred to as failures of randomization.

Another variation on the randomized designs noted before is to use pretest measurements to establish baseline scores for all subjects.


Statistically, randomness means that each participant has the same probability of being chosen for a particular group.
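A minimal sketch of such an assignment (shuffle the pool, then deal participants round-robin into equal groups; the group names are placeholders):

```python
import random

def randomly_assign(participants, groups, rng=random):
    """Shuffle the pool, then deal participants round-robin into groups,
    so each person has the same probability of landing in any group."""
    pool = list(participants)
    rng.shuffle(pool)
    return {g: pool[i::len(groups)] for i, g in enumerate(groups)}

assignment = randomly_assign(range(20), ["treatment", "placebo"],
                             rng=random.Random(1))
```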

    The Philosophical Puzzle of Causality:

Puzzling topic, this causality! Causal relations imply an association between cause and effect. There are four types of causation:

i. Material cause refers to the elementary composition of things.
ii. The formal cause is the outline, conception, or vision of the perfected thing.
iii. The efficient cause (!) is the agent or moving force that brings about the change.
iv. The final cause, or teleological explanation, refers to the purpose, goal, or ultimate function of the completed thing.

    Contiguity, Priority, and Constant Conjunction:

According to Hume, the sensation of causation is fictional. He came up with eight rules for judging causes and effects, boiled down to three essentials:

i. Contiguity in space and time.
ii. Priority, which means that the cause must come before the effect.
iii. Constant conjunction, or union, between cause and effect.

As an example, the barometer falls before it rains, but a falling barometer doesn't cause the rain. Likewise, Monday precedes Tuesday, but it is absurd to say that Monday causes Tuesday. There is therefore a missing ingredient for causality.

    Four Types of Experimental Control:

Control conditions, and control generally, imply a check on the treatment condition. Behavior control refers to the shaping of learned behavior based on a particular schedule of reinforcement designed to elicit the behavior in question.

    Mills Method of Agreement and Difference:

The method of agreement states: If X, then Y. X is in this case a sufficient (or adequate) condition to bring about the effect Y.

The method of difference states: If not-X, then not-Y; thus X is not just a sufficient condition of Y but a necessary one. In other words, X is required for Y to occur.

    Between-groups Designs and Mills Joint Method:


The group given the treatment (experimental condition) resembles the method of agreement, whereas the group not given the drug (control condition) resembles the method of difference.

The method described above is referred to as the joint method of agreement and difference.

Independent, Dependent, and Moderator Variables:

The independent variable is the antecedent that evokes a presumable change in an outcome variable (predictor). The dependent variable is the status of a measurable consequence (criterion). Moderators may alter the relationship between cause-and-effect variables. Mediator variables are defined as factors that intervene between the independent variable and the outcome variable in a causal chain.

To avoid the implied meaning of a causal chain, researchers try to focus on functional relations/correlations instead.

    Solomons Extended Control Group Design:

Investigates the possible sensitizing effects of pretests in pre-post designs. It is assumed that pretests might change the subjects' attitudinal set. Therefore, it is argued that a three-group, or preferably a four-group, design is the (Solomon) way to go. (Control II = pretest control.)

The additional group in a four-group design is a control for history, or the effects of uncontrolled events that may be associated with the passage of time. (Control III = history control.)

    Threats to Internal Validity:

Internal validity and causal inference depend on demonstrating a reliable relationship between the presumed cause and its effect (covariation), evidence of temporal precedence, and the ruling out of plausible rival explanations.

i. Regression toward the mean has to do not with actual scores but with predicted ones. It occurs when pre and post variables consist of the same measure taken at two points in time and the correlation between the two is smaller than 1.

The following four threats to internal validity are all diminished by using a Solomon design.

ii. History implies a source of error attributable to an uncontrolled event that occurs between the pre and post measurements and can bias the post measurement.

iii. Maturation describes intrinsic changes in the subjects that can masquerade as a treatment effect and thus threaten internal validity.

iv. Instrumentation refers to intrinsic changes in the measurement instruments.


v. Selection is a potential threat when there are unsuspected differences between the participants in each condition; random allocation is not a guarantee of comparability between groups.
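Regression toward the mean, threat (i) above, is easy to simulate; the assumptions here are true scores plus independent noise at both testings (so the pre-post correlation is below 1), with invented means and SDs:

```python
import random
from statistics import mean

rng = random.Random(3)
true_scores = [rng.gauss(100, 10) for _ in range(20_000)]
pre  = [t + rng.gauss(0, 10) for t in true_scores]   # same measure taken at
post = [t + rng.gauss(0, 10) for t in true_scores]   # two points in time, r < 1

# Select the top 10% of pretest scorers and watch their mean drift back:
cutoff = sorted(pre)[-2000]
top = [i for i, p in enumerate(pre) if p >= cutoff]
pre_mean  = mean(pre[i]  for i in top)   # well above 100
post_mean = mean(post[i] for i in top)   # closer to 100 again, with no treatment at all
```

Any pre-post design that selects extreme scorers will show this drift even without any intervention, which is why it threatens internal validity.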

    Threats to External Validity:

External validity is the validity of inferences about whether findings can be generalized. Three issues are conflated in this broader use of the term.

i. Statistical generalizability refers to the representativeness of the results for a wider population.

ii. Conceptual replicability, or robustness.

iii. Realism, which can be divided into mundane realism (the extent of analogous meaning from laboratory to natural setting) and experimental realism (the psychological impact of the experimental manipulation on the participants).

A representative research design describes an idealized experimental model in which everything is perfect.

Ecologically valid are experiments that satisfy this criterion of representativeness. A single-stimulus design (e.g. the use of experimenters of one sex) has two major limitations.

i. If there are differences in results after using a second stimulus (a female experimenter), we cannot conclude whether the results are still valid or confounded, which resembles a threat to internal validity.

ii. If we fail to find differences, this could be due to the presence of an uncontrolled stimulus variable operating either to counteract an effect or to increase it artificially to a ceiling value.

The use of convenience samples is a topic of major debate (see the convenience samples of mice used by Hull and Tolman).

    Statistical Conclusion and Construct Validity:

Statistical conclusion validity is concerned with inferences about the correlation between treatment and outcome (Hume's contiguity of events). See statistical power, fishing for statistically significant effects, etc.

Construct validity refers to the higher-order constructs that the sampled particulars are meant to represent; thus, whether a test measures the characteristics that it is supposed to measure.

    Subject and Experimenter Artifacts:

Artifacts are factors that are evenly distributed across conditions and result in inaccurate findings.

    Subject and experimenter artifacts are systematic errors due to uncontrolled subject- orexperimenter-related variables.


The Hawthorne effect describes how human subjects behave in a special way because they know they are subjects and under investigation.

Saul Rosenzweig (one badass name!) describes three artifacts:

i. Observational attitude of the experimenter.
ii. Motivational attitude.
iii. Errors of personality influence (e.g. warmth or coolness of the experimenter).

The dustbowl empiricist view emphasizes only observable responses as acceptable data in science, leaving out all cognitive accounts.

    Demand Characteristics and Their Control:

Demand characteristics are subtle, uncontrolled task-orienting cues in an experimental situation.

The good-subject effect is cooperative behavior used to support the view of the authority conducting the research.

    Quasi-control strategy has the idea that some of the participants step out of the good subject role and act as co-investigators in the search for truth. Thus, having participants serve as their own control and perhaps disclose the factors that determined their behavior later on.

    Preinquiry is another use of quasi-control, meaning that some of the prospective participants are sampled and afterwards separated from the pool.

    Evaluation apprehensions are spurious effects resulting from the participants' anxieties about how they would be evaluated.

    In some experimental situations some subjects may feel a conflict between evaluation apprehension and the good subject effect (looking good versus doing good).

    Interactional Experimenter Effects:

    Interactional experimenter effects are at least to some extent attributable to the interaction between experimenters and their subjects. There are five classes of effects.

    i. Biosocial attributes including the biological and social characteristics of experimenters, such as gender, age, and race.

    ii. Psychosocial attributes include factors such as personality and temperament.

    iii. Situational effects refer to the overall setting, including the experience of the researcher.

    iv. Modeling effect. It sometimes happens that before the experimenters conduct their studies, they try out the tasks themselves. If a modeling effect occurs, it is most likely to be patterned on the researcher's opinion.


    The use of replications is one of the most powerful tools available to control for these kinds of artifacts.

    Experimenter expectancy is a virtual constant in science and may lead to a self-fulfilling prophecy.

    Experimenter Expectancy Effects and Their Control:

    Several strategies are available to control for the effects of experimenters' expectancies.

    i. Increasing the number of experimenters.

    ii. Monitoring the behavior of experimenters.

    iii. Analyzing experiments for order effects.

    iv. Maintaining (double-) blind contact.

    v. Minimizing experimenter-subject contact.

    vi. Employing expectancy control groups.


    Chapter 8

    Nonrandomized Research and Functional Relationships

    Nonrandomized and Quasi-Experimental Studies:

    Quasi-experimental refers to experiments that lack the full control over the scheduling of experimental stimuli that makes randomized experiments possible.

    Association implies covariation to some degree. Methodological pluralism is necessary because all research designs are limited in some

    way (it depends).

    There are four types of nonrandomized strategies.

    i. Nonequivalent-groups design: the researchers don't have any control over the assignment to groups (historical control trials).

    ii. Interrupted time-series designs use large numbers of consecutive outcome measures that are interrupted by a critical intervention. The objective is to assess the causal impact of the intervention by comparing before-and-after measurements.

    iii. Single-case studies are primarily used as detection experiments, frequently in neuroscience.

    iv. Correlational design is characterized by the simultaneous observation of interventions and their possible outcomes (retrospective covariation of X and Y).

    Diachronic research is the tracking of the variable of interest over successive periods of time. Synchronic research is the name for studies that take a slice of time and examine behavior only at one point.

    Nonequivalent Groups and Historical Controls:

    Nonequivalent-groups designs, in addition to their resemblance to nonrandomized between-groups experiments, usually include a pre- and post-observation or measurement. It is not always ethically possible to withhold treatment from certain people, and self-selection or assignment biases can introduce problems. One way to overcome these obstacles is to introduce randomization after assignment or to use a wait-list control design.

    Historical controls or literature controls are often uninterpretable and dangerously misleading (see clinical data).

    Net effects may often mask true individual effects; this can lead to spurious conclusions because of a statistical irony named Simpson's paradox. The raw data should not be pooled before the individual results are first examined.
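A minimal demonstration of Simpson's paradox, using the widely cited kidney-stone success counts (illustrative here): treatment A wins within every subgroup, yet loses once the raw data are pooled.

```python
# Simpson's paradox: counts are (successes, cases) per subgroup.
groups = {
    "small_stones": {"A": (81, 87),   "B": (234, 270)},
    "large_stones": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, cases):
    return successes / cases

# Within each subgroup, A has the higher success rate
for g in groups.values():
    assert rate(*g["A"]) > rate(*g["B"])

# Pooling the raw data reverses the conclusion
a_succ  = sum(g["A"][0] for g in groups.values())   # 273
a_total = sum(g["A"][1] for g in groups.values())   # 350
b_succ  = sum(g["B"][0] for g in groups.values())   # 289
b_total = sum(g["B"][1] for g in groups.values())   # 350
assert rate(a_succ, a_total) < rate(b_succ, b_total)
```

The reversal happens because treatment A was given mostly to the harder (large-stone) cases, exactly the kind of imbalance that examining individual results first would reveal.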


    Interrupted Time Series and the Autoregressive Integrated Moving Average:

    Interrupted time-series designs make use of sampled measurements obtained at different time intervals before and after the intervention: time series because there is a single data point for each time, and interrupted because there is a clear cut point for the intervention (choose a sampling interval that will capture the effects of interest).

    The first step is to define the period of observation and obtain the data. The last step is identifying the underlying serial effects (ARIMA) and checking the fitted model.

    Pulse function is an abrupt change that lasts only a short time.

    A series of observations must be stationary, which means that the values of the observations are assumed to fluctuate normally about the mean, as opposed to systematically drifting upward or downward.

    Secular trends are systematic increases or decreases in the level of the series. A secular trend can be made stationary by differencing: subtracting the first observation from the second, the second from the third, and so forth.

    Autocorrelation refers to the extent to which data points or observations are dependent on one another or can be assumed to be independent.

    i. Regular describes the dependency of adjacent observations or data points on one another.

    ii. Seasonal describes the dependency of observations separated by one period or cycle.
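Both ideas can be sketched in a few lines of plain Python (toy series, names mine): first differencing removes a linear secular trend, and a lag-1 coefficient quantifies regular autocorrelation, the dependence of adjacent observations.

```python
# First differencing: a series with a secular (linear) trend becomes
# stationary once each value is subtracted from its successor.
trend = [2.0, 5.0, 8.0, 11.0, 14.0, 17.0]
diffed = [b - a for a, b in zip(trend, trend[1:])]
assert diffed == [3.0, 3.0, 3.0, 3.0, 3.0]   # constant: trend removed

def lag1_autocorrelation(xs):
    """Correlation of the series with itself shifted by one step."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[t] - mean) * (xs[t + 1] - mean) for t in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

# An alternating series flips sign every step: strong negative
# dependence between adjacent observations.
alternating = [1.0, -1.0] * 5
assert lag1_autocorrelation(alternating) < -0.8
```

An ARIMA analysis automates this kind of diagnosis: differencing handles trend, and the autocorrelation structure guides the choice of model terms.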

    Single-Case Experimental Designs:

    Single-case experimental studies involve repeated observations on a single unit (case), or on a few of them. Individual behavior is first assessed under specific baseline conditions, against which any subsequent changes in behavior can be evaluated after an environmental treatment is manipulated.

    A-B-A, also called a reversal design, where A is the no-treatment (baseline) phase and B is the treatment phase; the sequence can be extended or reordered, whichever is most suitable.

    If the treatment is counterproductive or ineffective, the researcher can terminate the environmental manipulation or alter the scheduling of events. These kinds of studies are

    mostly cost-effective but still time consuming and might not be generalizable (thus, the

    need for replication).

    Direct replication means the repetition of the same study. Systematic replication refers to varying an aspect of the previous study.

    Cross-lagged Correlational Designs:


    Correlational designs and cross-lagged panel designs are frequently used in the behavioral sciences. Cross-lagged implies that some data points are treated as temporally lagged values of the outcome measures. Panel design is another name for longitudinal research (increased precision of treatment and an added time component in order to be able to detect temporary changes).

    Observed bivariate correlations can be too high, too low, spurious or accurate (causation [?]) depending on the pattern of relationship among the variables in the structure that actually generated the data. The absence of correlation in cross-lagged designs is not proof of the absence of causation.

    Three sets of paired correlations are represented.

    i. Test-retest correlations (ra1a2, rb1b2)

    ii. Synchronous correlations (ra1b1, ra2b2)

    iii. Cross-lagged correlations (ra1b2, rb1a2)
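With toy panel data (invented numbers), the three sets of correlations can be computed directly; in this contrived example b2 is an exact linear function of a1, so one cross-lagged correlation comes out as 1.0.

```python
import numpy as np

# Two variables A and B, each measured at times 1 and 2 (toy data)
a1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
a2 = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
b1 = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
b2 = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # b2 = 2 * a1 by construction

def r(x, y):
    return float(np.corrcoef(x, y)[0, 1])

test_retest  = (r(a1, a2), r(b1, b2))   # stability of each variable over time
synchronous  = (r(a1, b1), r(a2, b2))   # same-time association
cross_lagged = (r(a1, b2), r(b1, a2))   # earlier variable with later outcome

# Perfect linear dependence of the later b2 on the earlier a1
assert abs(cross_lagged[0] - 1.0) < 1e-9
```

Comparing the two cross-lagged correlations is what motivates causal speculation in these designs, though, as noted above, absence of correlation is not proof of absence of causation.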

    In fact, relationships are seldom stationary but are usually lower over longer lapses of time, described as temporal erosion; the reduction is also called attenuation, and the leftover is referred to as the residual.

    Invisible Variables and the Mediation Problem:

    Path analysis is a necessary (if questionable) concept because one cannot assess causality by simply observing variables. The goal is, by removing the alternate pathways, to settle on the most probable one, which is then used to infer causality.

    Third-variable problem might also be characterized as the invisible variables problem because there could be more than one hidden confounding variable. Any variable that is correlated with both A and B may be the cause of both.

    The possibility that a causal effect of some variable X on outcome variable Y is explained by a mediator variable that is presumed to intervene between X and Y.

    For the estimation of parameters, the bootstrap or jackknife procedure is recommended.
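A minimal bootstrap sketch (synthetic data; names and numbers are mine): resample the observed sample with replacement many times and take percentiles of the resampled statistic as an interval estimate.

```python
import random
import statistics

random.seed(1)
sample = [random.gauss(10.0, 3.0) for _ in range(40)]   # the observed data
obs_mean = statistics.mean(sample)

# Bootstrap: draw resamples of the same size, with replacement,
# and use the spread of the resampled means to estimate variability.
boot_means = []
for _ in range(2000):
    resample = random.choices(sample, k=len(sample))
    boot_means.append(statistics.mean(resample))
boot_means.sort()

# 95% percentile interval: 2.5th and 97.5th percentiles of the resamples
lo, hi = boot_means[49], boot_means[1949]
assert lo < obs_mean < hi
```

The jackknife works analogously but omits one observation at a time instead of resampling; the same machinery applies to mediation (indirect-effect) estimates, where analytic standard errors are unreliable.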

    It is not possible to prove that a causal hypothesis is true, but it might be possible to reject untenable hypotheses and, in this way, narrow down the number of plausible hypotheses.

    The Cohort in Longitudinal Research:

    Longitudinal research has the purpose of examining people's responses over an extended time span.

    Cohort or generation is a group of people born at about the same time and having had similar life experiences.


    Longitudinal data are usually collected prospectively, but the data can also be obtained retrospectively from historical records.

    Different forms of Cohort Studies:

    A cohort table provides basic data and enables the researcher to detect important differences between cross-sectional and cohort designs.

    The concepts of age, cohort, and period may not be operationally defined the same way in different fields.

    Age effect implies changes in average responses due to the natural aging process.

    Time of measurement effect implies some kind of impact of events in chronological time.

    Cohort effect implies past history specific to a particular generation.

    These three effects cannot be estimated simultaneously in any of the following designs.

    i. Simple cross-sectional design: subjects at different ages are observed at the same time.

    ii. Simple longitudinal design: subjects of the same cohort are observed over several periods.

    iii. Cohort-sequential design: several cohorts are studied, the initial measurements being taken in successive years.

    iv. Time-sequential design: subjects at different ages are observed at different times.

    v. Cross-sequential design: several different cohorts, observed over several periods, are initially measured in the same period.

    Subclassification on Propensity Scores:

    Subclassification: it is claimed that increasing the number of subclasses increases the precision of the analysis. Five or six subclasses ordinarily reduce the bias in the raw comparison by some 90%.

    Multiple Confounding Covariates:

    Combine all confounding variables into a single variable. In work with multiple confounded covariates, the propensity score is computed from the prediction of group membership scored one or zero.

    It does not require a particular kind of relationship between the covariate and the outcome within each condition, whereas the regression approach does. The major limitation of the propensity score method is that it can adjust only for observed confounding covariates.
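A sketch of propensity-score subclassification on synthetic data (all names and numbers are mine). To keep it short, the true propensity is used directly; in practice it would be estimated from the covariates, e.g. by predicting the 0/1 group membership with logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                       # confounding covariate
p = 1.0 / (1.0 + np.exp(-x))                 # propensity to receive treatment
z = rng.binomial(1, p)                       # confounded treatment assignment
y = 2.0 * z + 1.5 * x + rng.normal(size=n)   # true treatment effect = 2.0

# Naive comparison ignores the confound and overstates the effect
naive = y[z == 1].mean() - y[z == 0].mean()

# Subclassify on the propensity score into five equal-size strata,
# compare treated vs. control within each stratum, then pool.
edges = np.quantile(p, [0.2, 0.4, 0.6, 0.8])
strata = np.searchsorted(edges, p)           # stratum index 0..4
effects, weights = [], []
for s in range(5):
    m = strata == s
    if (z[m] == 1).any() and (z[m] == 0).any():
        effects.append(y[m & (z == 1)].mean() - y[m & (z == 0)].mean())
        weights.append(m.sum())
adjusted = float(np.average(effects, weights=weights))

assert abs(adjusted - 2.0) < abs(naive - 2.0)   # most of the bias removed
```

Within each stratum the treated and control cases have similar propensities, so the stratum-level comparisons are far less confounded than the raw comparison, consistent with the roughly 90% bias reduction cited above.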


    Chapter 9

    Randomly and Nonrandomly Selected Sampling Units

    Sampling a Small Part of the Whole:

    Probability sampling, here:

    i. Every sampling unit has a known nonzero probability of being selected.

    ii. The units are randomly drawn.

    iii. The probabilities are taken into account in making estimates from the sample.

    Convenience samples raise concerns about generalizability (e.g. student samples).

    The paradox of sampling implies that a sample is of no use when it's not representative of the population. But in order to know that it's representative, you need to know the characteristics of the whole population, and then you don't need the sample in the first place.

    A sampling plan specifies how the respondents will be selected. This is one way to overcome the paradox of sampling: by consideration of the procedure by which the sample is obtained.

    Bias and Instability in Surveys:

    Point estimates are central values of frequency distributions.

    Interval estimates are the corresponding variability of the underlying distribution (margin of error [SE * crit.]).

    True population value is the point value we would obtain based on analyzing all the scores in the population.

    Bias is the difference between the true population value and our estimate of it from a sampling distribution.

    Stability refers to the actual variation in the data; precision increases as the variability of the sampling distribution decreases, whereas bias leads to estimation that is systematically above or below the true value.

    Instability results when the observations within a sample are highly variable and the number of observations is small. The more homogeneous the members of the population, the fewer of them need to be sampled.

    Simple Random-Sampling Plans:

    Simple random sampling means that the sample is selected from an undivided population (simple) and chosen by a process that gives every unit in the population the same chance of being selected (random).
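A minimal sketch with Python's standard library (the labeled population is invented), covering both the without-replacement and with-replacement variants:

```python
import random

random.seed(0)
population = [f"unit_{i:03d}" for i in range(500)]

# Without replacement: 25 distinct units, each with equal chance
srs = random.sample(population, k=25)
assert len(set(srs)) == 25          # no unit drawn twice

# With replacement: the same unit may be drawn more than once
srs_wr = random.choices(population, k=25)
assert all(u in population for u in srs_wr)
```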


    Random sampling without replacement describes the procedure in which a selected unit cannot be reselected and must be disregarded on any later draw.

    Random sampling with replacement refers to the possibility of reselecting a unit.

    Improving Accuracy in Random Sampling:

    In doing stratified random sampling we divide the population into a number of parts and randomly sample in each part independently (an unbiased procedure). It can pay off in improved accuracy by enabling researchers to randomly sample strata that are less variable than the original population.
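A sketch of why this pays off, on synthetic data (stratum names and numbers are mine): when strata are internally homogeneous, proportional stratified sampling gives much more stable estimates of the population mean than simple random sampling.

```python
import random
import statistics

random.seed(42)
# Population with two internally homogeneous strata whose means differ a lot
strata = {
    "freshmen": [random.gauss(50.0, 2.0) for _ in range(800)],
    "seniors":  [random.gauss(80.0, 2.0) for _ in range(200)],
}
population = strata["freshmen"] + strata["seniors"]
N = len(population)

def srs_mean(n=50):
    return statistics.mean(random.sample(population, n))

def stratified_mean(n=50):
    # Proportional allocation: sample each stratum in proportion to its size,
    # then weight each stratum mean by the stratum's population share.
    est = 0.0
    for units in strata.values():
        k = round(n * len(units) / N)
        est += (len(units) / N) * statistics.mean(random.sample(units, k))
    return est

# Repeated estimates: stratifying on a variable related to the outcome
# removes the between-stratum component of the sampling variability.
srs_spread   = statistics.pstdev([srs_mean() for _ in range(200)])
strat_spread = statistics.pstdev([stratified_mean() for _ in range(200)])
assert strat_spread < srs_spread
```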

    Speaking of Confidence Intervals:

    Bayesian: We can be % confident/certain that the population value we are trying to estimate falls between those lower and upper limits.

    Frequentist: With repeated sampling, computing a % confidence interval for each sample, we will be correct in % of our samples when we say that the quantity estimated will fall within the % confidence interval.
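The frequentist reading can be checked by simulation (parameters are mine): each interval either captures the fixed true mean or it does not, and over many samples roughly the nominal fraction of intervals do.

```python
import random
import statistics

random.seed(3)
TRUE_MEAN, SD, N, REPS = 100.0, 15.0, 30, 1000

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    lo, hi = m - 1.96 * se, m + 1.96 * se   # nominal 95% interval (z-based)
    if lo <= TRUE_MEAN <= hi:
        covered += 1

coverage = covered / REPS
assert 0.90 < coverage < 0.99   # close to the nominal 95%
```

(With n = 30 the z multiplier slightly under-covers; a t critical value would bring the coverage closer to exactly 95%.)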

    Other Selection Procedures:

    Area probability sampling: the population is divided into selected units that have the same probability of being chosen as the unselected units in the population cluster, all done in stages (multistage cluster sampling).

    Systematic sampling involves methodical selection of the sampling units in sequences separated on lists by the interval of selection. It starts randomly and is based on the selection of units from the population at particular intervals.
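A minimal sketch (the population and interval are invented): pick a random start within the first interval, then take every k-th unit.

```python
import random

random.seed(7)
population = list(range(1, 101))   # e.g. 100 names on a list
n = 10
k = len(population) // n           # selection interval: every 10th unit
start = random.randrange(k)        # random start within the first interval
sample = population[start::k]

assert len(sample) == n
# Successive selections are exactly k positions apart
assert all(b - a == k for a, b in zip(sample, sample[1:]))
```

Only the starting point is random; a hidden periodicity in the list that matches the interval would bias the sample, which is the classic caveat for this design.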

    Haphazard/fortuitous samples are amongst the most common designs of nonrandom selection (e.g. informal polls).

    Quota sampling involves obtaining specified numbers of respondents to create a sample that is roughly proportional to the population.

    Nonresponse Bias and its Control:

    Increasing the effort to recruit the nonrespondents decreases the bias of point estimates in the sample, independent of what design you are using.

    We are often in a position to suspect bias but unable to give an estimate of its magnitude.

    Prepaid monetary incentives produce higher response rates than promised incentives or gifts offered with the initial encounter.

    A proposed theoretical model was based on Wald's model of nonresponse.

    Studying the Volunteer Subject:


    The volunteer subject problem can be understood as a variant on the problem of nonresponse bias.

    Approaches to comparing volunteers and nonvolunteers range from looking through archives and recruiting volunteers to second-stage volunteer comparisons, etc.

    Volunteering might have both general and specific predictors.

    Characteristics of the Volunteer Subject:

    There are conclusions warranting maximum/considerable/some/minimum confidence for the characteristics of volunteers. The maximum are the following.

    i. Better educated and of higher social class status.

    ii. More intelligent.

    iii. Higher in the need for social approval and more sociable.

    Implications for the Interpretation of Research Findings:

    Merely increasing the size of the sample of volunteers will not reduce the bias, but an effort to recruit more nonvolunteers, to use probability sampling, or to reduce the nonresponse rate would target this problem.

    Situational Correlates and the Reduction of Volunteer Bias:

    There are conclusions warranting maximum/considerable/some/minimum confidence for the situational correlates of volunteering. The maximum and considerable ones are:

    i. More interested in the topic.

    ii. Higher expectation of being favorably evaluated by the investigator.

    iii. Perceiving the investigation as important.

    iv. Feeling states influence participation (e.g. guilt triggers likelihood).

    v. An incentive increases the likelihood, and a stable personality moderates this.

    Explaining to participants the significance of the research results in giving up trivial research and increases the likelihood of authentic participation.

    The Problem of Missing Data:

    The primary problem of missing data is the introduction of bias into the estimates; an additional problem is decreased statistical power.

    MCAR, missingness unrelated to any variables of interest (missing completely at random). MAR, missingness is related, but can be accounted for by other variables (missing at

    random).


    MNAR, missingness related to variables of substantial interest and cannot be fully accounted for by other variables (missing not at random).

    Procedures for Dealing with Missing Data:

    Two approaches, either nonimputational or imputational.

    Nonimputational can be listwise deletion (drop all cases with any missing value) or pairwise deletion (compute each statistic from the available cases). The latter requires the data to be MCAR to yield unbiased estimates.

    Maximum likelihood and Bayesian estimation are two procedures which can yield unbiased results when the data are MAR.

    Imputational can be single or multiple. Single imputational procedures can be further subdivided into four alternative procedures.

    i. Mean substitution procedure: all missing values for any given variable are replaced by the mean value of that variable (only if MCAR).

    ii. Regression substitution procedure: all missing values are replaced by the predicted value of that variable from a regression analysis using only cases with no missing data (only if MCAR).

    iii. Stochastic regression imputation adds a random residual term to the estimates based on regression substitution and often yields more accurate analyses than regression substitution.

    iv. Hot deck imputation is a procedure in which cases without missing data are matched to the cases with the missing data.
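A small sketch of mean substitution on toy data: the imputed mean is unchanged, but the variance is artificially deflated, one reason the procedure is safe only under MCAR and even then distorts variability.

```python
import statistics

data = [4.0, 7.0, None, 5.0, None, 9.0, 6.0]   # None marks a missing value

observed = [x for x in data if x is not None]   # what listwise deletion keeps
mean_obs = statistics.mean(observed)            # 6.2

# Mean substitution: every missing value becomes the observed mean
imputed = [mean_obs if x is None else x for x in data]

assert abs(statistics.mean(imputed) - mean_obs) < 1e-9        # mean preserved
assert statistics.variance(imputed) < statistics.variance(observed)  # deflated
```

The imputed constants add no spread of their own, so every variance-based statistic (and every standard error built on it) comes out too small; stochastic regression and multiple imputation exist largely to repair this.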

    In multiple imputation each missing observation is replaced not by a single estimate, but by a set of m reasonable estimates that will yield m pseudocomplete data sets; these are later combined to obtain a more accurate estimate of variability than is possible with single imputational techniques. These procedures tend to be much simpler computationally than Bayesian or maximum likelihood estimation, and are most useful.