REVISTA AUS 26.4 / Fouad Jameel Ibrahim AlAzzawi et al. / DOI:10.33329/aus.2019.n26.4.10 / www.ausrevista.com / [email protected] / ARTICLE

Fuzzy Analysis Model for Classifying Exams Questions in Learning Quality Management System Based on Bloom’s Taxonomy Verbs

Modelo de análisis difuso para clasificar preguntas de exámenes en el sistema de gestión de calidad de aprendizaje basado en los verbos de taxonomía de Bloom

1* Fouad Jameel Ibrahim AlAzzawi, 2 Boumedyen Shannaq
1 Al-Rafidain University College, Baghdad, Iraq; 2 University of Buraimi, Al-Buraimi, Sultanate of Oman
[email protected] [email protected]

Published / 28-08-2019

ABSTRACT/ In this work, a new fuzzy classification algorithm is developed and evaluated for use in a learning quality management system to classify exam questions according to Bloom’s taxonomy. An experimental evaluation was carried out with several classification algorithms, trained and tested on a dataset of exam questions extracted from the Moodle system of private institutions in the Sultanate of Oman. The proposed fuzzy algorithm was compared with the dominant classification algorithms based on machine learning models. The results show that the meta classifier ‘Bagging’ outperforms the other machine learning classification algorithms considered, correctly classifying 88.7% of instances, although with insignificant confidence, while the developed fuzzy algorithm correctly classifies 96.2% of instances with significant confidence. The proposed fuzzy algorithm thus outperforms the Bagging algorithm by 7.5 percentage points in linking exam questions to the correct Bloom’s verb categories. The outcome of this work is a Smart Bloom’s Analyzer capable of providing smart recommendations that may improve assessment methods in higher education institutions, a target in line with the framework of the Oman Academic Accreditation Authority (OAAA) and the learning quality management system in the Sultanate of Oman.

Keywords: Bloom’s Taxonomy verbs, Fuzzy Analysis model, Classification, Machine Learning.

1. INTRODUCTION

Learning at university level presents a great challenge to educators when it comes to constructing course learning objectives with related exam questions that reflect learning outcomes. Research and development in educational studies have long addressed education questions and produced ideas, methods and systems to manage the learning process. Rague (2005) stated that the quality of the educational process is at the center of interest for education managers and educators. In this regard, the quality of education can be defined as a balanced correspondence between the properties and characteristics of the educational process. In the Sultanate of Oman, Government control and supervision of the quality system aims to ensure a cohesive national policy for improving the quality of management, preparation, and the rational use of national funds allocated to finance the education system (OAAA, 2017). National control and supervision of quality in education is carried out by national education authorities (OAAA, 2018). Such control and supervision mainly requires the institution itself to establish an innovative quality management system that responds to the requirements of the national authority (Shannaq, 2018). The goal of this work is to develop a new classification algorithm that classifies exam questions according to Bloom’s Taxonomy verbs using an updated fuzzy model.

To achieve this goal, the following steps were followed:

1. Study and review current classification work in machine learning.
2. Build an effective machine learning model, based on experimental tests, to find the dominant classification algorithms available in the Knowledge Flow environment of the Weka tools.
3. Propose and implement a new classification algorithm using a fuzzy search method.
4. Conduct experiments to test the algorithm against the best model obtained in step 2.

Figure 1 illustrates the task of the Quality Management System (QMS) in Higher Education Institutions (HEI), which is to audit exam questions against Bloom’s Taxonomy verb categories; the ‘?’ asks whether a question matches the Bloom’s verb.

Figure 1 Task of the QMS in HEI to audit exam questions

One important aspect of evaluating the quality of exam questions against the requirements of the QMS in HEI is to control the presence, intentional or unintentional, of repeated fragments of exam questions that should be matched to Bloom’s verb categories. Such repetitions greatly complicate the preparation and evaluation of exam questions, which becomes difficult and time consuming for educators in the absence of additional tools for developing and maintaining exam questions. If the QMS does not track Bloom’s verb repetitions in exam questions, the questions may be insufficient to improve learning outcomes, which may eventually affect the quality of education. Simplifying and partially automating the process of locating and refactoring such repetitions therefore becomes an important task. Exact text matching, using machine learning and other approaches, has been applied to Bloom’s verbs: when a Bloom’s verb appears in a question, the classifier matches the question to the exact Bloom’s category, as illustrated in Figure 1. The fuzzy algorithm proposed in this work can be considered an innovative tool to improve the application of QMS in HEI. It is distinguished from other approaches by considering a number of parameters, such as the number of clone Bloom’s categories, the average number of clones, and the average number of related keywords in a clone across all Bloom’s categories. In this regard, this work promotes an innovative approach to improving the quality of education through the application of technology and intelligent solutions, creating opportunities for enhanced auditing and traceability.

2. Background and literature review

2.1 The development of Domestic Quality Management Systems

In recent years, interest in the portfolio system has been noted in Oman’s education (OAAA, 2017). The Quality Management System (QMS) community of the Sultanate of Oman, represented by the OAAA, has developed a set of comprehensive documents under the title “National Education Frameworks” (OAAA, 2018). These documents are oriented towards the highest level of quality in education ever achieved in the region and the highest standards of requirements for financial, logistical, intellectual and information technology support for the functioning of the educational system per scholar. In this regard, the OAAA developed nine standards to ensure the quality of education at registered local academic institutions. This work aims to improve quality control for Standard 2, “Student Learning by Coursework Programs”, and in particular Criterion 2.8, “Assessment Methods, Standards and Moderation”. Figure 2, proposed by the OAAA, is reproduced here to show the focus of this work.

Figure 2 Nine standards set by OAAA

Essentially, the assessment of the standard indicators highlighted in Figure 2 should be based on an analysis of the availability and effectiveness of the quality assurance system at the university level, which directly obliges educational institutions to start creating such a system. However, the availability and effectiveness of such indicators have not yet been clearly defined, which makes it difficult to carry out external audits for the certification and national accreditation of academic organizations. Academic institutions in the Sultanate of Oman are working hard to improve the quality of education based on the standards established by the OAAA (Shannaq, 2018).

2.2 Bloom’s Taxonomy

Bloom’s Taxonomy organizes cognitive learning objectives and learning assessment into six integrated levels, classifying educational learning by difficulty and specificity. Six lists of verbs covering the learning objectives were proposed to control course delivery; they are likewise used to classify exams and to check their compatibility with the learning objectives set for students according to Bloom’s Taxonomy models (Scott, 2003). The cognitive domain list has been the primary focus of most education institutions worldwide, including those in the Sultanate of Oman, and is frequently used to structure course learning objectives, assessments and activities. Figure 3 shows the question verbs and samples of question contexts (Huitt, 2011; QTLMS Unizwa 2017).

Figure 3 Bloom taxonomy verbs

Anwar (2017), Siti et al. (2017) and Nafa et al. (2016) argued that planning an assessment strategy based on Bloom’s Taxonomy helps in formulating exams and assignments compatible with the learning objectives set out for the curriculum courses.

The work presented in this paper is a practical software application, called Bloom Analyzer, capable of providing smart recommendations based on the Bloom’s Taxonomy model to improve assessment methods.

3. Classification

Many research papers (Anwar, 2017; Siti et al., 2017; Nafa et al., 2016) have discussed possible approaches to constructing an automatic classification system for exam questions based on Bloom’s Taxonomy verbs. Those studies were carried out within the framework of processing and classifying exam questions to control the assessment and ensure its integrity and compatibility with the six categories of Bloom’s Taxonomy used to construct course learning objectives. Classifiers are a support mechanism for adding new experience to improve the quality of the educational process. Ulum (2016), Kocakaya and Kotluk (2016), Omar et al. (2012) and Zhang and Lee (2003) used the BayesNet, J48, RandomForest, NaiveBayes, RandomTree, Stacking, Bagging and Vote algorithms to build classifiers and predictors with the help of the Java API. The implementations of those algorithms, which belong to the Knowledge Flow tools, were customized to perform all computational experiments.

3.1 Bloom’s Taxonomy verbs learning

The novelty of this approach is the idea of converting the text classification task into a learning task with an automatic educator that builds an attribute description of each exam question under consideration. This description is a Boolean vector recording whether each word occurs in the question, with the words drawn from a pre-built exam question bank extracted from the Moodle system used in the higher education institution. The learning task is defined on a set of input objects (questions) $X$, where each object $x \in X$ is assigned a value $y$, called the output or answer, belonging to the set of valid answers $Y$. An ordered “question-answer” pair $(x, y)$, with $x \in X$ and $y \in Y$, is called a precedent. The finite set of precedents from which the relationship between input and output is learned is called the training sample (Gareth et al., 2018; Shannaq and Adebiaye 2015; Yusof and Hui, 2010):

$$\{(x_i, y_i) \mid x_i \in X,\ y_i \in Y,\ i = 1, \dots, N\}.$$

In other words, the task is to build a model (function) $f$ which, having received $x$ as input, predicts the value of the answer $y$. The process of finding $f$ is called learning, or setting up the model. The main requirement for a solution is a high generalizing ability: a trained model must produce accurate predictions on new precedents that are not included in the training set. Thus, the optimal solution of the inductive learning problem should satisfy the following condition:

$$f^{*} = \arg\min_{F \in K} \sum_{i=1}^{N} L\big(y_i, F(x_i)\big)$$

where:
$f^{*}$ is the optimal solution for inductive learning;
$L(y, f(x))$ is a non-negative loss (penalty) function;
$K$ is a set of models;
$(x_i, y_i)$, $i = 1, \dots, N$, are the precedents that make up the training set.
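The Boolean word-occurrence vector and the minimization over a model set $K$ can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two toy questions, the two candidate models and the choice of a 0-1 loss are assumptions made only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[int]
Model = Callable[[Vector], str]


def build_vocabulary(questions: Sequence[str]) -> List[str]:
    """Collect the distinct lowercase words of the question bank, sorted."""
    return sorted({word for q in questions for word in q.lower().split()})


def to_boolean_vector(question: str, vocabulary: Sequence[str]) -> Vector:
    """Element i is 1 if vocabulary word i occurs in the question, else 0."""
    words = set(question.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


def zero_one_loss(y_true: str, y_pred: str) -> int:
    """Non-negative loss L(y, f(x)): 0 for a correct answer, 1 otherwise."""
    return 0 if y_true == y_pred else 1


def empirical_risk_minimizer(
    models: Sequence[Model], sample: Sequence[Tuple[Vector, str]]
) -> Model:
    """Return the model F in K with the smallest summed loss on the sample."""
    return min(models, key=lambda F: sum(zero_one_loss(y, F(x)) for x, y in sample))


# Hypothetical toy precedents (question, answer).
questions = ["define the term database", "design a new database schema"]
labels = ["Remembering", "Creating"]
vocab = build_vocabulary(questions)
sample = [(to_boolean_vector(q, vocab), y) for q, y in zip(questions, labels)]

DESIGN_INDEX = vocab.index("design")


def always_remembering(x: Vector) -> str:
    """Hypothetical model 1: ignores the input."""
    return "Remembering"


def rule_on_design(x: Vector) -> str:
    """Hypothetical model 2: predicts 'Creating' when 'design' occurs."""
    return "Creating" if x[DESIGN_INDEX] else "Remembering"


best = empirical_risk_minimizer([always_remembering, rule_on_design], sample)
```

On this toy sample the second model has zero total loss, so it is the one selected; with a real question bank the set $K$ would be the hypothesis space of a learning algorithm rather than two hand-written rules.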

4. Experiments

An experiment was conducted on a database of exam questions extracted from the Moodle system. The database included multiple types of exam questions from information systems courses, each question marked and classified according to Bloom’s Taxonomy action verbs into one of six categories: “Remembering”, “Understanding”, “Applying”, “Analyzing”, “Evaluating” and “Creating”.

This work builds a list of keywords based on Bloom’s Taxonomy verbs and their synonyms, after which Visual C# code was developed to read all exam questions and classify them based on Bloom’s Taxonomy verbs and their synonyms. For example, given the exam question “What is Web-based multimedia and how it is used today”, the C# code classifies the question as “Remember”. The same is done for the other exam questions.
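The keyword-based labelling step can be sketched as follows. The authors' tool is written in Visual C#; this Python version is only an illustration, and the keyword lists (including treating “what” as a “Remember” keyword) and the first-match rule are assumptions, since the paper does not list its keywords or its tie-breaking rule.

```python
import re
from typing import Dict, List, Optional

# Hypothetical keyword lists; the paper's actual verbs and synonyms are not given.
BLOOM_KEYWORDS: Dict[str, List[str]] = {
    "Remember": ["what", "define", "list", "name", "recall"],
    "Understand": ["explain", "summarize", "describe"],
    "Apply": ["apply", "use", "demonstrate"],
    "Analyze": ["analyze", "compare", "differentiate"],
    "Evaluate": ["evaluate", "justify", "assess"],
    "Create": ["create", "design", "construct"],
}


def classify_question(question: str) -> Optional[str]:
    """Scan the words left to right and return the category of the first keyword found."""
    for word in re.findall(r"[a-z]+", question.lower()):
        for category, keywords in BLOOM_KEYWORDS.items():
            if word in keywords:
                return category
    return None  # no Bloom keyword in the question


label = classify_question("What is Web-based multimedia and how it is used today")
```

With these assumed lists the example question from the text is labelled “Remember”, because “what” is the first keyword encountered.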

Figure 4 demonstrates the preprocessing steps for the given questions and classes.

[Figure 4 panels: Data; 6 classes distribution; 6 classes distribution after filtering]

Figure 4 Data set and class preprocessing

To address the class imbalance problem, this work used Weka filtering tools to perform unsupervised resampling. A 10-fold distribution into training and test samples was used (with 7000 and 1756 objects respectively). To improve classification performance and to guarantee that all 6 classes appear in all folds, Weka supervised filtering tools were applied.
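The requirement that every class appears in every fold can be illustrated with a small stratified fold assignment. This is a generic sketch, not Weka's filter: the round-robin dealing scheme and the balanced toy label list are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[str], k: int = 10) -> List[List[int]]:
    """Assign sample indices to k folds, dealing each class out round-robin."""
    by_class: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


categories = ["Remembering", "Understanding", "Applying",
              "Analyzing", "Evaluating", "Creating"]
# Hypothetical balanced toy set: 10 questions per category.
labels = [c for c in categories for _ in range(10)]
folds = stratified_folds(labels, k=10)
```

Any class with at least k samples is guaranteed to appear in every fold under this scheme, which is the property the supervised filtering step is meant to secure.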

Experiment 1

Figure 4.1 shows the Knowledge Flow environment used to compare the performance of commonly used classifiers.

Figure 4.1 Experiment configuration

The workflow in Figure 4.1 loads the dataset, assigns the class attribute, creates the cross-validation folds, and produces the training and test sets for 8 classifiers. Each classifier is linked to a performance evaluator, and the model visualizes a performance chart based on the percentage of correct classifications. Figure 4.2 shows the ROC curves for the configuration presented in Figure 4.1.

Figure 4.2 ROC curve comparison over 8 classifiers

The Paired T-Tester was employed to analyze the experiments. The ROC curves presented in Figure 4.2 were used to compare performance across all classifiers, and the percent-correct test was run at a significance level of 0.05, i.e. any difference found among the generated classifiers holds at the 95% confidence level. Figure 4.3 shows the details of the comparison.

Figure 4.3 percent_correct indicator among all classifiers
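The significance test described above can be sketched with a textbook paired t statistic computed over per-fold accuracies. This is an illustrative assumption, not Weka's Paired T-Tester implementation, and the function and constant names are hypothetical.

```python
import math
from typing import Sequence


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t statistic of the mean per-fold difference a[i] - b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    if variance == 0:
        raise ValueError("differences have zero variance")
    return mean / math.sqrt(variance / n)


# Two-sided critical value of Student's t for 9 degrees of freedom
# (10 folds) at a significance level of 0.05.
T_CRITICAL_DF9 = 2.262


def is_significant(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if the difference over 10 folds is significant at the 0.05 level."""
    if len(a) != 10:
        raise ValueError("the critical value above assumes exactly 10 folds")
    return abs(paired_t_statistic(a, b)) > T_CRITICAL_DF9
```

Here `a` and `b` would hold the percent_correct values of two classifiers on the same 10 folds; a difference is reported as significant only when the statistic exceeds the critical value.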

To assess the quality of the generated classifiers, the accuracy (percent_correct) was calculated for each model, as shown in Figure 4.3. In some cases, the accuracy values were also determined. The Bagging classifier outperforms all other classifiers, though with little confidence, while the BayesNet, J48, RandomForest, NaiveBayes and RandomTree algorithms show similar performance. Stacking and Vote give the worst results in all respects. Figure 4.3 shows the total test percent_correct over all classes (the percentage of correctly classified exam questions relative to the total number of test objects), as well as the average and standard deviation for each classifier. The results show that the Bagging, BayesNet, J48, RandomForest, NaiveBayes and RandomTree models significantly exceed Stacking and Vote, with Bagging giving the best percent_correct values. The RandomForest decision-tree algorithm should also be noted: in solving the text classification problem it proved to be a serious competitor to the meta classifier Bagging traditionally used in problems of this kind, making it possible to increase the quality of classification, especially for small categories of exam questions (Kotsiantis, Tsekouras, & Pintelas, 2018; Esposito, & Saitta, 2005).

Experiment 2

In this section we propose and implement a new algorithm using fuzzy search methods. In principle, comparing the received text with itself using search algorithms could reveal inexact repeating fragments. Initially, the size of the clones is unknown and may vary, so a straightforward implementation of this approach would require comparing all fragments of the text of all sizes. Comparing all fragments of the same size with each other has a complexity of $O(n^2 / t^2)$, where $n$ is the number of words in the text (the exam questions in this work) and $t$ is the fragment size in terms, which is quadratic in $n$. Let us calculate the overall complexity of the algorithm, without considering the cost of comparing two fragments of the text with each other:

$$\sum_{t=1}^{n} \frac{n^2}{t^2} = n^2 \left(1 + \frac{1}{2^2} + \dots + \frac{1}{n^2}\right) = O(n^2).$$

Therefore, regardless of the algorithm chosen for comparing the fragments, such an approach would be extremely time consuming and unproductive. Thus, in this work it was decided to build an alphabet from the text (exam questions), whose symbols are the words found in the text, and then store the text in the symbols of the new alphabet. This conversion provides several advantages: it reduces the amount of memory needed to store the text, and it accelerates the subsequent operation of the algorithm, since it is possible to work not with the string values of words but with their positions in the created alphabet, which allows fragments to be compared faster. After obtaining the results of the fuzzy search algorithm, the alphabet is enough to return to the textual representation. However, this conversion alone does not significantly increase the performance of the proposed algorithm. Therefore, it was decided to change the way the fragments are compared; Figure 4.4 shows the steps of the proposed algorithm.

Figure 4.4 steps of the proposed algorithms
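The word-to-alphabet conversion described above can be sketched in a few lines. The function names and the whitespace tokenization are assumptions for illustration; the paper does not specify how the text is normalized.

```python
from typing import Dict, List, Sequence, Tuple


def encode(text: str) -> Tuple[List[int], List[str]]:
    """Replace each word by its position in an alphabet built on the fly."""
    alphabet: List[str] = []      # position -> word
    index: Dict[str, int] = {}    # word -> position
    codes: List[int] = []
    for word in text.lower().split():
        if word not in index:
            index[word] = len(alphabet)
            alphabet.append(word)
        codes.append(index[word])
    return codes, alphabet


def decode(codes: Sequence[int], alphabet: Sequence[str]) -> str:
    """Return to the textual representation using the alphabet."""
    return " ".join(alphabet[c] for c in codes)


codes, alphabet = encode("define the term and define the scope")
```

Fragments of `codes` can then be compared as sequences of small integers instead of strings, and `decode` restores the words once the fuzzy search has finished.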

The first step of the proposed algorithm is to make an assumption about the minimum clone size of interest to the user/auditor. Since we are working with large sets of exam questions, we can assume that fuzzy repetitions of only a few words are unlikely to be the main object of search and refactoring and can easily be found by exact search; therefore, we set a limit on the minimum size of a repeating fragment. With this limit, the idea of comparing the text with itself can be modified. The next step of the algorithm is to split the text into fragments of the given size, which are then compared with each other. Since the fragment size is now fixed, a significant part of the comparisons is discarded. However, fuzzy clones can be much larger than one such fragment, so one of the subsequent stages of the algorithm is the expansion of the found clones. With this approach, the complexity of the algorithm is $O(n^2 / t^2)$, where $n$ is the number of keywords in the text and $t$ is the minimum clone size.
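The fixed-size splitting step and the resulting number of comparisons can be sketched as below. Non-overlapping fragments and the dropping of a shorter trailing piece are assumptions, chosen because about $n/t$ fragments give the stated $O(n^2 / t^2)$ pair count.

```python
from typing import List, Sequence, Tuple


def split_into_fragments(codes: Sequence[int], size: int) -> List[Tuple[int, ...]]:
    """Cut the encoded text into consecutive non-overlapping fragments of `size` symbols."""
    if size < 1:
        raise ValueError("fragment size must be positive")
    return [tuple(codes[i:i + size]) for i in range(0, len(codes) - size + 1, size)]


def candidate_pairs(fragment_count: int) -> List[Tuple[int, int]]:
    """All unordered pairs of fragment indices that have to be compared."""
    return [(i, j) for i in range(fragment_count) for j in range(i + 1, fragment_count)]


fragments = split_into_fragments(list(range(10)), size=3)
pairs = candidate_pairs(len(fragments))
```

For $n$ symbols and fragment size $t$ this yields about $n/t$ fragments and about $(n/t)^2 / 2$ pairs, which matches the complexity stated above.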

The next important step of the algorithm is to use hashing to speed up the comparison of fragments. Since inexact matches between fragments must be found, it is necessary to use perceptual hashing, which produces similar values for similar fragments and allows obviously dissimilar fragments to be discarded. Rochimah et al. (2013) proposed a hash function called Signature, which maps each fragment to a vector of size $m$. Each $i$-th element of this vector corresponds to a set of alphabet characters: if the fragment contains one of these symbols, the $i$-th element equals one, and zero otherwise. In this case, however, such use of the function is impractical, since for large fragments the signatures will always be close in value, even for completely different fragments. Therefore, a modified signature hash function is proposed. The sets of symbols of the documentation language assigned to the vector elements are kept, but the $i$-th element of the vector now equals one only if a character from its set is the first symbol of one of the keywords found in the fragment. This keeps memory costs at the same level without losing accuracy, and the value of the hash function of a fragment is the number

$$H(w) = \sum_{i=0}^{m-1} 2^{i} \cdot \mathrm{sign}(w)_{i+1},$$

where $w$ is a fragment and $\mathrm{sign}(w)$ is the signature vector of size $m$ built from the first characters of the words in the fragment. Figure 4.5 illustrates the signature of the hash function used for English letters; the most suitable value found in this work is $m = 9$.

Figure 4.5 Signature of the hash function
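A minimal sketch of the first-letter signature hash with $m = 9$ follows. The grouping of English letters into nine sets is an assumption, since the actual grouping shown in Figure 4.5 is not available in the text.

```python
from typing import List, Sequence

# Hypothetical partition of the English alphabet into m = 9 letter sets.
LETTER_GROUPS: List[str] = ["abc", "def", "ghi", "jkl", "mno", "pqr", "stu", "vwx", "yz"]


def signature(words: Sequence[str]) -> List[int]:
    """Element i is 1 if some word of the fragment starts with a letter of group i."""
    first_letters = {word[0].lower() for word in words if word}
    return [1 if any(ch in first_letters for ch in group) else 0
            for group in LETTER_GROUPS]


def signature_hash(words: Sequence[str]) -> int:
    """H(w) = sum over i of 2^i * sign(w)_(i+1), with a 0-based Python list."""
    sign = signature(words)
    return sum((2 ** i) * bit for i, bit in enumerate(sign))


h = signature_hash(["define", "the", "term"])
```

For the fragment above only the groups containing “d” and “t” are set, so the hash is $2^1 + 2^6 = 66$. Fragments that share most first letters receive hash values differing in only a few bits.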

Figure 4.6 shows a diagram for quickly comparing the values of the hash function of different fragments.

Figure 4.6 Hashing comparisons of fragments

It was decided to take advantage of the fact that the value was initially represented as a vector of zeros and ones. The hash values of the two fragments are converted to binary representation and combined with the bitwise exclusive OR (XOR) operation. In the binary representation of the resulting number, the number of one bits denotes the difference between the hash values of the two fragments. If this value does not exceed a threshold specified at the start of the algorithm, the second comparison step is performed.
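The XOR-and-count comparison just described amounts to a Hamming distance between two hash values. A short sketch, with hypothetical function names and example threshold values:

```python
def hash_distance(h1: int, h2: int) -> int:
    """Number of one bits in h1 XOR h2, i.e. the count of differing signature bits."""
    return bin(h1 ^ h2).count("1")


def passes_hash_filter(h1: int, h2: int, threshold: int) -> bool:
    """True if the fragments are close enough to go on to the second comparison step."""
    return hash_distance(h1, h2) <= threshold


distance = hash_distance(66, 1)  # 0b1000010 XOR 0b0000001 = 0b1000011
```

Because the filter costs one XOR and one bit count per pair, it can discard obviously dissimilar fragments before any more expensive fragment comparison is attempted.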

Figure 4.7 shows the comparison results for the proposed algorithm and the meta classifier Bagging.

Figure 4.7 comparison results

To verify the correct operation of the algorithm and to find its limitations, several sets of test data were generated. The data set for checking the correctness of the algorithm consists of several small and large texts of exam questions. The validation test data consists of several small pieces of text containing repeating elements. These tests are designed to demonstrate the operation of the new algorithm on various types of clones. Figure 4.7 illustrates the percentage of correct classifications when the algorithm is compared with the Bagging classifier.

5. Implementation

The fuzzy algorithm was used and customized to build the Smart Information System, which successfully categorizes exam questions with 96.2% confidence. Figure 5 below shows a sample of the manual work and Figure 5.1 shows the output of the developed system.

Figure 5 Sample of the manual work (course specification)

Figure 5.1 Sample of the developed Bloom Analyzer

The implementation of the proposed algorithm lets the user configure several parameters: the size of the fragments into which the text is split, the maximum permissible edit distance between two fragments, and the proximity threshold of two hash values for different fragments. The implementation also allows parallelization at several stages of the algorithm: normalizing the text, calculating fragment hashes, and comparing and expanding text fragments. This can further accelerate the algorithm.
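The configurable parameters and one parallel stage can be sketched as follows. This is only an illustration under stated assumptions: the class name, field names, and default values are hypothetical, and SHA-256 stands in for the fragment hash, which this section does not specify. A cryptographic hash does not preserve similarity, so it shows the parallel structure only, not the proximity comparison.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List
import hashlib


@dataclass(frozen=True)
class AnalyzerConfig:
    """User-configurable parameters (names and defaults are hypothetical)."""
    fragment_size: int = 5      # words per fragment
    max_edit_distance: int = 2  # maximum permissible edit distance
    hash_threshold: int = 8     # maximum number of differing hash bits


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def split_fragments(text: str, size: int) -> List[str]:
    """Split the text into consecutive fragments of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def fragment_hash(fragment: str) -> int:
    """Stand-in hash: the first 4 bytes of SHA-256 as an integer."""
    digest = hashlib.sha256(fragment.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def hash_fragments(text: str, config: AnalyzerConfig, workers: int = 4) -> List[int]:
    """Normalize, split, and hash the fragments using a worker pool."""
    fragments = split_fragments(normalize(text), config.fragment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input fragments.
        return list(pool.map(fragment_hash, fragments))


config = AnalyzerConfig(fragment_size=3)
print(len(hash_fragments("Define the term algorithm and explain it", config)))  # 3
```

A thread pool keeps the sketch simple. For CPU-bound hashing in Python, a process pool (`ProcessPoolExecutor`) would be the more likely choice to obtain an actual speed-up.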

Conclusion

This work developed a fuzzy classification algorithm that is used as a catalog for constructing course exams. The results observed with the developed Bloom's Analyzer suggest that the method could be used to construct automatic catalogs in e-learning libraries. We believe the Bloom's Analyzer application will help educators build effective exam questions and comply with the requirements of the quality management system (QMS).

Several key tests were carried out to assess the accuracy of the proposed algorithm. Their successful completion indicates that the algorithm finds both exact and fuzzy repetitions of varying degrees of variability, and that it overcomes the limitations of the machine learning algorithms in the classification task.
