Research methods: reliability and validity
Validity! We need to find out if our research is sound. Do our tests measure what they claim to measure?
Are the techniques used to collect data in tests, questionnaires, interviews and observations measuring what is claimed? For example, was the Strange Situation really measuring attachment style?
Reliability! We need to be able to measure or observe something time after time and produce the same or similar results.
I want to measure intelligence. If the same person sits the test on several occasions and the results change each time, then that test lacks reliability.
The test also arguably lacks validity, because the scores are meaningless.
If I test my participants again several months later and their scores remain consistent, I can say the test is reliable, but it might still lack validity.
Is an A level in Psychology a valid and reliable assessment of your performance in Psychology?
External reliability: this measures consistency from one occasion to another – the same result should be found on different days, in different labs, observations or interviews, by different researchers.
I exposed these teenage brain cells to 1000 PowerPoint slides last Monday and they’re all dead.
I thought that was a fluke but they seem to be shrivelling after only five minutes!
Participants take the same test on different occasions – a high correlation between test scores indicates the test has good external reliability.
Timing is crucial. Why?
[Illustration: the same test taken in January and again in June]
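The test-retest idea can be sketched in a few lines of Python. This is only an illustration: the scores are invented, and the names `pearson_r`, `january` and `june` are assumptions, not part of the original slides. It computes the correlation between the same participants' scores on two occasions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: raises ZeroDivisionError if every score in a list is identical
    return cov / sqrt(var_x * var_y)


# Hypothetical intelligence test scores for the same five participants
january = [98, 105, 110, 121, 130]
june = [101, 103, 112, 119, 133]

test_retest_r = pearson_r(january, june)
print(f"test-retest r = {test_retest_r:.2f}")
```

A value close to +1 would suggest good external reliability. It would say nothing about whether the test is valid.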
I hope that’s the right
answer this time
Researcher reliability: this refers to the consistency of a researcher’s behaviour. A researcher should produce similar test results, or make similar observations or carry out interviews in the same way on more than one occasion.
Thanks for taking part today. Any
problems and I’ll be right over. Take
your time.
Right. Let’s get on. Fast as you can.
How much longer before I can get in the pub and relax
my facial muscles?
In observational studies this is known as inter-observer reliability – observers have to agree on what they see and carry out the same procedure.
Consistency between different researchers working on the same study is very important for reliability.
There should be a high positive correlation between the scores of different observers.
Improving Researcher Reliability
1. Increase reliability by standardising instructions
2. Carry out a pilot study to improve procedures and materials
3. Train researchers thoroughly in the use of materials and procedures before the study takes place
Internal reliability: this measures the extent to which a test or procedure is consistent within itself, i.e., questionnaire items or questions in an interview should all be measuring the same thing.
Do you like to keep to deadlines?
Do you get impatient driving?
Do you like cheese?
Do you like doing several tasks at once?
Do you like chocolate?
Do you get easily irritated?
Are you competitive?
This interviewer seems a little confused about
Type A personality traits
Compares a participant’s performance on two halves of a test or questionnaire – there should be a close correlation between scores on both halves of the test.
Questions in both halves should be of equal quality for good internal reliability.
Ways to split the test: Odds/Evens or Top/Bottom
Would you see this as bullying or horseplay in the playground?
You would see this from your own subjective viewpoint – we’re biased by experience and expectation.
Observers must agree about what they are observing – they need to use standardised behavioural categories.
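One simple way to check whether observers agree when using behavioural categories is percentage agreement. Note that this is a different, simpler measure than the correlation mentioned earlier, offered here only as a sketch. The codes are invented, the "neutral" category is an assumption added for the example, and `percent_agreement` is a hypothetical helper name.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of observation intervals where both observers chose the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty, equal-length lists of codes")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


# Hypothetical codes for ten playground observation intervals
observer_a = ["horseplay", "bullying", "horseplay", "neutral", "bullying",
              "horseplay", "neutral", "bullying", "horseplay", "neutral"]
observer_b = ["horseplay", "bullying", "bullying", "neutral", "bullying",
              "horseplay", "neutral", "horseplay", "horseplay", "neutral"]

agreement = percent_agreement(observer_a, observer_b)
print(f"observers agreed on {agreement:.0f}% of intervals")
```

Where the observers disagree (here, over bullying versus horseplay), the category definitions probably need tightening before the real study.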
Measuring Reliability
Match the method of estimating reliability to the description.
Methods:
• Test-retest reliability
• Split-half reliability
• Inter-rater reliability
Descriptions:
• If the measure depends upon interpretation of behaviour, we can compare the results from two or more raters.
• If the results in the two halves are similar, we can assume the test is reliable.
• Splitting a test into two halves, and comparing the scores in both halves.
• If the results on the two tests are similar, we can assume the test is reliable.
• The measure is administered to the same group of people twice.
• If there is high agreement between the raters, the measure is reliable.
Internal validity = The tool is measuring what it is intended to measure
External validity = The findings can be generalised beyond the context of the research situation
Assessing and Improving Internal Validity
Does our measuring tool appear to be doing what it should?
Face validity: One or more judges assess whether the test seems appropriate and suggest changes if necessary.
Does the content of a test cover everything in the area of interest?
Content validity: More rigorous – experts in the field systematically examine the tool’s components and compare them with set standards.
They have to agree the content is appropriate.
Improving internal validity
• Single blind procedure - reduces demand characteristics
• Double blind procedure ….
Population Validity
Can we generalise findings from our research participants to other population groups?
Ecological Validity
Can we apply our findings to other contexts and situations outside of the research setting?
Improving external validity
• Sample must be representative of target population and be unbiased…..
• Research situation must reflect real life situation e.g. debate over Milgram….Strange Situation