Reliability - blog.metu.edu.tr…

Post on 24-Sep-2019


TEST ANALYSIS

Yağmur Yaman

1929538

[email protected]


Table of Contents

Reliability

Statistical Analyses

Item Analysis

Grading

Reference

Reliability

r = 0.86, rs = 0.92

I think the reliability is high because the correlation between the two halves of the test is 0.86, and a correlation above 0.80 means that reliability is good. We used the split-half technique because there is only one exam, so we divided it into two parts and correlated the scores on the two parts. The corrected split-half coefficient (rs) is normally higher than the half-test correlation (r) because it estimates the reliability of the full-length exam rather than of one half. In our project rs is 0.92, which means that the reliability of the test is very good.
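The step from r to rs can be checked with a short calculation. The report does not name the correction it used, so the sketch below assumes the Spearman-Brown formula, which reproduces the reported value; the function name `spearman_brown` is chosen here for illustration.

```python
def spearman_brown(half_correlation: float) -> float:
    """Estimate full-test reliability from the correlation between two halves.

    Assumption: the report's rs comes from the Spearman-Brown formula
    rs = 2r / (1 + r); the source does not say so explicitly.
    """
    return 2 * half_correlation / (1 + half_correlation)


r = 0.86  # half-test correlation reported in this section
rs = spearman_brown(r)
print(round(rs, 2))  # 0.92
```

Because 2r / (1 + r) is greater than r for any r between 0 and 1, the corrected value is always higher than the half-test correlation.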

We can increase the reliability by adding more items. The number of assessment tasks is especially important: if we add more items, we collect more information about the students. The range of scores is also important. A narrow score range, such as 1-5, gives a less reliable measurement than a wider range, such as 1-100. Objectivity is another important factor. If we evaluate students' exams subjectively, the results will not be sound or correct. That is why assessments such as multiple-choice, short-answer, and true-false questions should be objective, and it is important to score them accurately. If there are open-ended questions or complex performance assessments, we should set standard scoring rules to be more objective and, if possible, share the rubric with the students.

We can do many things in the test and its administration to increase the reliability of the test. First, we should prepare good test questions, so that students can easily understand what each question asks; questions should be as clear as possible. Second, we should plan the test and set its duration carefully. If we do not give students enough time, we cannot get reliable results because they cannot focus on the test. Third, we should write clear directions. If students do not understand what the teacher wants, they cannot give reliable answers. Finally, we should give tests more frequently: giving several tests is more reliable than giving a single test (Jacobs, 1991).

Statistical Analyses

M = 14.17
Mdn = 13.50
Mode = 5.00, 8.00, 10.00, 12.00, 13.00, 15.00, 19.00, 20.00, and 24.00
Range = 23.00
SD = 6.58

There are some calculations we need to do: the mean, median, mode, range, and standard deviation.

The mean shows the average of the students' scores. It is represented as M. In this project the mean is 14.17. I think this is normal, neither very good nor very bad. The mean is close to the middle of the 26-point scale because the high and low scores are nearly equal in number.

The median is the middle score. It is represented as Mdn. To find it, we must first sort the scores in ascending order. If the number of scores is odd, the median is the middle score. If the number of scores is even, we take the two middle scores, add them, and divide by two. In this project, the median is 13.50. I think this is also good because, on a 26-point scale, it is very close to the mean.

The mode is the most frequently repeated score. It is represented as Mode. A distribution with two modes is called bimodal, and one with more than two modes is called multimodal. In this project there are nine modes (5, 8, 10, 12, 13, 15, 19, 20, and 24), each repeated two times. I think this is not good because it does not support a normal distribution. A single mode is better than several.

The range shows the difference between the highest and the lowest score. It is represented as Range. A wider range of scores gives more reliable results. In this project, the highest score is 26 and the lowest score is 3, so the range is 23. I think this range is good for a 26-question test, although it could be wider. To widen it, we should increase the number of questions.

Standard deviation is a measure that quantifies the amount of variation or dispersion of a set of data values (Wikipedia, 2016); in this project the values are the numbers of correct answers. It is represented as SD. It shows how far the scores lie from the mean. A low standard deviation means that the data points tend to be close to the mean of the test. A high standard deviation means that the data points are spread out over a wider range of values. In this project, the standard deviation is 6.58. I think this is good for this project. However, a larger standard deviation would give a wider distribution of scores, and a wider distribution gives more reliable and comprehensive results.
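The five statistics described above can be computed with Python's standard `statistics` module. The individual student scores are not listed in this report, so the score list below is hypothetical and serves only to illustrate the calculations. The use of the sample standard deviation (dividing by n - 1) is also an assumption; `statistics.pstdev` would give the population version.

```python
import statistics

# Hypothetical scores for illustration only; these are NOT the project's data.
scores = [3, 5, 5, 8, 10, 12, 13, 15, 19, 26]

mean = statistics.mean(scores)            # average score (M)
median = statistics.median(scores)        # middle score after sorting (Mdn)
modes = statistics.multimode(scores)      # every most-frequent score (Mode)
score_range = max(scores) - min(scores)   # highest minus lowest (Range)
sd = statistics.stdev(scores)             # sample standard deviation (SD), assumed

print(f"M={mean:.2f} Mdn={median:.2f} Mode={modes} Range={score_range} SD={sd:.2f}")
```

`statistics.multimode` returns a list, so it also handles distributions with several modes, as in this project.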

Item Analysis

In Figure 1, the item is moderate in difficulty. Its discrimination index is on the borderline for discriminating between upper and lower students. To increase it, we should revise option D, because even the lower students eliminated it. This means that option D is an implausible distractor, and we should replace it with a more challenging one.

ITEM 1
            Alternatives (frequencies)          Indices
Students    *A    B    C    D    Omits    Difficulty    Discrimination
Upper 10     8    1    1    0    0        70%           0.20
Lower 10     6    3    1    0    0

Figure 1
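The two indices for Item 1 can be recomputed from the frequencies of the correct alternative. The report does not state its formulas, so the sketch below assumes the conventional upper-lower group definitions, which reproduce the values in Figure 1; the function name `item_indices` is illustrative.

```python
def item_indices(upper_correct: int, lower_correct: int, group_size: int) -> tuple[float, float]:
    """Return (difficulty, discrimination) for one item.

    Assumed definitions (not stated in the report):
      difficulty     = share of upper and lower students who answered correctly
      discrimination = (upper correct - lower correct) / students per group
    """
    difficulty = (upper_correct + lower_correct) / (2 * group_size)
    discrimination = (upper_correct - lower_correct) / group_size
    return difficulty, discrimination


# Item 1: 8 of the 10 upper students and 6 of the 10 lower students chose A.
difficulty, discrimination = item_indices(upper_correct=8, lower_correct=6, group_size=10)
print(f"{difficulty:.0%} {discrimination:.2f}")  # 70% 0.20
```

A discrimination of 0.20 means that only two more upper students than lower students answered correctly, which is why the item sits on the borderline.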

In Figure 2, the item is moderate in difficulty. Its discrimination index is good for discriminating between upper and lower students. Even so, the item has some problems. We should revise options C and D, because even the lower students eliminated them. This means that these options are implausible, and we should replace them with more challenging distractors. Option B, in contrast, is the strongest distractor for lower students, so we should review it as well.

ITEM 17
            Alternatives (frequencies)          Indices
Students    *A    B    C    D    Omits    Difficulty    Discrimination
Upper 10     8    0    0    0    0        60%           0.50
Lower 10     3    7    0    0    0

Figure 2

In Figure 3, the item is difficult. Its discrimination index is good for discriminating between upper and lower students. However, C is the strongest distractor for lower students: they chose C more often than B, which is the correct answer. We should review option C.

ITEM 14
            Alternatives (frequencies)          Indices
Students     A   *B    C    D    Omits    Difficulty    Discrimination
Upper 10     2    6    2    0    0        35%           0.50
Lower 10     2    1    6    1    0

Figure 3

In Figure 4, the item is moderate in difficulty. Its discrimination index is very good for discriminating between upper and lower students. Even so, the item has some problems. We should revise option D, because even the lower students eliminated it. This means that the option is implausible, and we should replace it with a more challenging distractor. Option B is the strongest distractor for lower students: instead of choosing A, most of them chose B. We should review this option as well.

ITEM 18
            Alternatives (frequencies)          Indices
Students    *A    B    C    D    Omits    Difficulty    Discrimination
Upper 10    10    0    0    0    0        60%           0.80
Lower 10     2    7    1    0    0

Figure 4

In general, the item difficulty is good because there are no very easy or very difficult questions in this project. Of the 26 items, 16 have a difficulty index between 20% and 60% (60% not included), which means that 16 questions are difficult. The other 10 items have a difficulty index between 60% and 80%, which means that 10 questions are of moderate difficulty.

In general, the discrimination is very good because every item in this project has a discrimination index of at least 0.20. Of the 26 items, 20 have a discrimination index greater than or equal to 0.50. This means that the questions distinguish between lower and upper students very well.

Grading

I will select the catalog grading system because its criteria are always the same. They do not change according to the students' exam performance, as they do in the curve system. In the curve system, students are evaluated in comparison with other students' performance or exam grades. In the catalog system, students also know how they will be evaluated and what the criteria are, whereas in the curve system they generally do not know which criteria produced their score, and the teacher cannot always be fair. That is why I chose the catalog system.

In the catalog system, I set the criterion that students who answer 12 out of 26 questions correctly will receive a passing grade. This means that students should answer nearly 50 percent of the questions correctly to earn enough points to pass the course. According to the criteria, 76.67 percent of the students in this project will get a grade of DC or higher, so more than half of the students will get a passing grade or better, and 23.33 percent of the students will not get a passing grade in this exam. I think this is fair for the students because they should learn nearly half of the topic to pass the course. The details of the grading policy are shown in Figure 5.

Score interval    Frequency    Grade    Percentage
3-5               4            FD       13.33
6-8               3            DD       10.00
9-11              4            DC       13.33
12-14             5            CC       16.68
15-17             4            CB       13.33
18-20             4            BB       13.33
21-23             3            BA       10.00
24-26             3            AA       10.00

Figure 5
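The fixed-criteria idea behind the catalog system can be sketched as a simple lookup: each score maps to the same grade no matter how other students performed. The intervals, grades, and frequencies below are copied from Figure 5; the names `GRADE_BANDS` and `grade_for` are illustrative. Scores below 3 have no band in Figure 5, so the sketch raises an error for them rather than guessing a grade.

```python
# (lowest score, highest score, grade, number of students), as listed in Figure 5.
GRADE_BANDS = [
    (3, 5, "FD", 4),
    (6, 8, "DD", 3),
    (9, 11, "DC", 4),
    (12, 14, "CC", 5),
    (15, 17, "CB", 4),
    (18, 20, "BB", 4),
    (21, 23, "BA", 3),
    (24, 26, "AA", 3),
]


def grade_for(score: int) -> str:
    """Return the letter grade for a score using the fixed bands above."""
    for low, high, grade, _frequency in GRADE_BANDS:
        if low <= score <= high:
            return grade
    # Figure 5 defines no band outside 3-26, so no grade is invented here.
    raise ValueError(f"no grade band defined for score {score}")


total_students = sum(band[3] for band in GRADE_BANDS)
dc_or_higher = sum(band[3] for band in GRADE_BANDS if band[0] >= 9)
print(total_students, round(100 * dc_or_higher / total_students, 2))  # 30 76.67
```

For example, `grade_for(12)` returns `"CC"` for every student with that score, which is the predictability that the curve system lacks.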


Reference

Jacobs, L. C. (1991). Reliability for teachers activity: How can teachers increase their classroom tests' reliability? Retrieved January 1, 2017, from http://www.k-state.edu/ksde/alp/activities/Activity2-4.pdf

Wikipedia. (2016, December 16). Standard deviation. Retrieved January 1, 2017, from https://en.wikipedia.org/wiki/Standard_deviation