data hiding in arabic text - uotechnology.edu.iq



Data Hiding in Arabic Text

Ministry of Higher Education and Scientific Research

University of Technology

School of Technical Education

Data Hiding in Arabic Text

A Thesis

Submitted to Technical Education Department

University of Technology/ Baghdad

in Partial Fulfillment of the Requirement for the

Degree of Doctor of Philosophy

in Engineering Education Technology/

Electrical Engineering

By

Auday Jamal Fawzi

Supervised By

Ass. Prof. Dr. Saleh M. Al-Karaawy

Prof. Dr. Shawket T. Al-Hiazay

2007


Language Certification

This is to certify that I have read the thesis titled “Data

Hiding in Arabic Text” and corrected any mistake in grammar

and style.

Dr. Moutaz S. Abdul Wahab

Assistant Professor


Supervisor’s Certificate

We certify that this thesis entitled “Data Hiding in Arabic Text” was

prepared by (Auday Jamal Fawzi) under our supervision at the

Department of Technical Education / University of Technology / Baghdad,

in partial fulfillment of requirements for the degree of Doctor of

Philosophy in Engineering Education Technology / Electrical Engineering.

Signature:

Name: Dr. Saleh M. Al-Karaawy

Engineering Supervisor

Date: / 1 / 2007

Signature:

Name: Dr. Prof. Shawkat T. Al-Hiazay

Technical Supervisor

Date: / 1 / 2007


Examination Certificate

We certify that this thesis, entitled “Data Hiding in Arabic Text”, was submitted by the student (Auday Jamal Fawzi), that we, as the Examining Committee, examined the student in its content, and that, in our opinion, it meets the standard of a thesis for the degree of Doctor of Philosophy in Engineering Education Technology.

Signature:

Prof. Dr. Emad Al-Hussani

(member) Date: / 1 / 2007

Signature:

Assist Prof. Dr. Nasser K. Al-Ani

(member) Date: / 1 / 2007

Signature:

Assist Prof. Dr. Adnan Al-Sultani

(member) Date: / 1 / 2007

Signature:

Assist Prof. Dr. Ibtesam R. Karhiy

(member) Date: / 1 / 2007

Signature:

Prof. Dr. Hilal H. Saleh

(Chairman) Date: / 1 / 2007

Signature:

Assist Prof. Dr. Saleh M. Al-Karaawy

(Supervisor) Date: / 1 / 2007

Signature:

Prof. Dr. Shawket T. Al-Hiazay

(Supervisor) Date: / 1 / 2007

Signature:

Dr. Dhari Yousif Mahmood

(Head of the Technical Education Department) Date: / 1 / 2007


Dedication

I’d like to present this work to

My family with love

My teachers with respect

And to the memory of my teacher

Dr. Awatif Barsoum,


Acknowledgment

I’d like to express my deep gratitude to my supervisors, Dr. Saleh M. Al-Karaawy and Dr. Shawkat T. Al-Hiazay, for their willingness to discuss the research, their continual encouragement, and their gentle and valuable comments.


Abstract

Interest in information, and in how it is transferred across a network that links the entire world, has grown rapidly. This growth calls for ways to maintain the privacy and security of information, and it motivates this research, whose goal is to find a new technique to cipher and hide information in Arabic text documents.

To achieve this goal, a program was written to cipher a message and then hide it in an Arabic paragraph. The program handles two file types, the document (DOC) file and the Rich Text Format (RTF) file, both of which are compatible with the Microsoft Word application.

The program ciphers a message and then hides it in an Arabic text document using the white space and word shift methods, which were originally applied to English paragraphs. A further method hides the message by taking advantage of the extension character (ـ) used in Arabic text.

These methods were implemented, and the results show that a more efficient method is still needed to achieve greater security. For this reason, a new technique for hiding in Arabic text, named the “Unicode system method”, is proposed; it uses the Arabic character codes to hide the message. When this technique is applied to Arabic text, the target file has the same size as the source file, and a third party cannot distinguish the source file from the target file by eye, which makes the hidden message difficult to break.

Because hiding is carried out by people who use the World Wide Web, and because this network involves different operating systems, RTF files are used. RTF serves as a standard for data transfer between word processing software, as a document formatting standard, and as a means of migrating content from one operating system to another.

Applying the RTF file technique to Arabic text gives two advantages: first, a third party cannot distinguish the source file from the target file by eye; second, the amount of information that can be hidden in the file increases. One disadvantage is that the file size grows after the hiding process. To limit this, a subroutine was written to compress the file so that the difference between its size and that of the source file is as small as possible.

An educational program was prepared on the basis of instructional design concepts, using the tutorial method to present the concepts and information of the ciphering and hiding processes. To assess the benefit of the program to the target population, a questionnaire was prepared and evaluated by a number of experts and students. Based on the questionnaire results and the feedback process, the program was developed further to link the theoretical and practical sides of the research.

Table of Contents

Abstract
Contents
List of Symbols
List of Abbreviations

Chapter One: Research Foundations
1.1 Introduction
1.2 Information Security Concepts
1.3 Motivations to Use Steganography
1.4 Information Hiding Applications
1.5 Research Problem
1.6 Research Importance
1.7 Research Aims
1.8 Research Limits
1.9 Terminology
1.10 Literature Review
1.10.1 Engineering Literatures
1.10.2 Instructional Technology Literatures
1.11 Thesis Organization

Chapter Two: Theoretical Concepts of Data Hiding
2.1 Introduction
2.2 Cryptography
2.2.1 General Concepts
2.2.2 Stream Cipher
2.2.3 Random Number Generation
2.2.4 Shift Register Based Schemes
2.2.4.1 Linear Feedback Shift Register
2.2.4.2 Combination and Filter Generators
2.2.4.3 Multiplexers
2.2.4.4 Desirable Properties of LFSR-Based Keystream Generators
2.2.4.5 Life Cycle of a Key
2.2.5 Statistical Tests
2.3 Steganography (Data Hiding in Text)
2.3.1 General Concepts
2.3.2 Coding Methods
2.3.2.1 Open Space Methods
2.3.2.2 Syntactic Methods
2.3.2.3 Semantic Methods
2.3.2.4 Shift Coding
2.3.2.5 Feature Coding
2.3.3 Steganographic Protocols
2.3.3.1 Pure Steganography
2.3.3.2 Secret Key Steganography
2.3.3.3 Public Key Steganography
2.4 Unicode System
2.4.1 Characters
2.4.2 Arabic Characters
2.5 Data Compression
2.5.1 Static Huffman Coding
2.5.1.1 Encoding
2.5.1.2 Decoding
2.6 Technical Framework
2.6.1 Introduction
2.6.2 Instructional Design
2.6.3 Instructional Package

Chapter Three: The Proposed Hiding Algorithm
3.1 Introduction
3.2 Specification of the Proposed Software
3.3 The Proposed Software Structure
3.4 Operation of the Proposed Software
3.4.1 Providing a Plain Text and a Password
3.4.2 Load Microsoft Word File
3.4.3 Selecting Hiding Method
3.4.4 Debrief the Time from the Computer Clock
3.4.5 Generate Keystream
3.4.5.1 Labels
3.4.5.2 The Registers
3.4.5.3 Initialization Registers
3.4.5.4 Design Principles
3.4.5.5 Keystream Generation
3.4.5.6 Keystream Testing
3.4.6 Huffman Code
3.4.7 Check Document File
3.4.8 Encryption
3.4.9 Hide Cipher Text
3.4.9.1 Hyphen Method
3.4.9.2 White Space Method
3.4.9.3 Change Word Position Method
3.4.9.4 Unicode System Method
3.4.10 Hiding the Time
3.5 Hiding Data in a Rich Text Format (RTF) File
3.5.1 Contents of an RTF File
3.5.2 Paragraph Formatting Properties
3.5.3 Hiding Algorithm
3.5.4 Unhiding Algorithm
3.5.5 Compression Algorithm
3.6 Design Instructional Package
3.6.1 Analysis
3.6.2 Construction
3.6.3 Evaluation
3.6.4 Statistical Method

Chapter Four: Results and Discussion
4.1 Introduction
4.2 Ciphering and Hiding Data in .DOC Document Files
4.2.1 Open Document File
4.2.2 Select Hiding Method
4.2.3 Write the Message
4.2.4 Write the Password
4.2.5 Start Hiding Process
4.2.6 Hiding Data with Unicode System Method
4.2.7 Hiding Data with White Space Method
4.2.8 Hiding Data with Hyphen Method
4.2.9 Hiding Data with Change Position Method
4.3 Hiding Data in .RTF Document Files
4.3.1 Open Document File
4.3.2 Write the Message
4.3.3 Start Hiding Process
4.4 Discussion
4.5 Instructional Technology Side Results
4.5.1 Opinion List Results of Experts View Point Analysis
4.5.2 Questionnaire Results of Learners View Point Analysis
4.6 Conclusions
4.7 Recommendations
4.8 Suggestions

References

Appendixes
Appendix A Unicode Tables
Appendix B Program Subroutines
Appendix C Expert’s and Learner’s Questionnaire Forms


List of Symbols

X1 Frequency test value

X2 Serial test value

X3 Poker test value

X4 Run test value

X5 Autocorrelation test value

χ² Chi-square

List of Abbreviations

ANSI American National Standards Institute

AppWd Application Word Document

ASCII American Standard Code for Information Interchange

DOC Document file format

LFSR Linear Feedback Shift Register

RTF Rich Text Format

Unicode Universal Character Encoding Standard

XOR Exclusive OR gate


Chapter One

Research Foundations


1.1 Introduction

Steganography is the art of covered or hidden writing. Its purpose is covert communication: hiding a message from a third party. This differs from cryptography, the art of secret writing, which is intended to make a message unreadable by a third party but does not hide the existence of the secret communication. Although steganography is separate and distinct from cryptography, there are many analogies between the two, and some authors categorize steganography as a form of cryptography, since hidden communication is a form of secret writing [1].

Steganography hides the covert message but not the fact that two parties are communicating with each other. The steganography process generally involves placing a hidden message in some transport medium, called the carrier. The secret message is embedded in the carrier to form the steganography medium. A steganography key may be used to encrypt the hidden message and to randomize the steganography scheme. In summary [2]:

steganography_medium = hidden_message + carrier + steganography_key

As an increasing amount of data is stored on computers and transmitted over networks, it is not surprising that steganography has entered the digital age. On computers and networks, steganography applications allow someone to hide any type of binary file in any other binary file [3].

1.2 Information Security Concepts

Information security includes two fields, cryptography and steganography:

1. Cryptography is the science of information security. The word is derived

from the Greek kryptos, meaning hidden. Cryptography is closely


related to the disciplines of cryptology and cryptanalysis. Cryptography

includes techniques such as microdots, merging words with images,

and other ways to hide information in storage or transit. However, in

today's computer-centric world, cryptography is most often associated

with scrambling plaintext (ordinary text, sometimes referred to as

cleartext) into ciphertext (a process called encryption), then back again

(known as decryption). Individuals who practice this field are known as

cryptographers [4].

2. Steganography on the other hand (pronounced stehg-uh-nah-gruhf-ee,

from Greek steganos, or "covered," and graphie, or "writing") is the art

of concealing the existence of information within seemingly innocuous

carriers. Steganography can be viewed as akin to cryptography. Both

have been used throughout recorded history as means to protect

information [4].

Steganography is the art of hiding signals inside other signals. This basically comes down to using unnecessary bits (holes) in an innocent file to store the sensitive data. The techniques used are intended to make it impossible to detect that there is anything inside the innocent file, while the intended recipient can still obtain the hidden data. A further challenge is to fill these holes with data in a way that remains invariant to a large class of host signal transformations [5, 6].

While cryptography is about protecting the content of messages (their meaning), steganography is about concealing their very existence; it is usually interpreted to mean hiding information in other information. Examples include sending a message to a spy by marking certain letters in a newspaper using invisible ink, and adding sub-perceptible echo at certain places in an audio recording. It is often thought that communications may be secured by encrypting the traffic, but this has rarely been adequate in practice [7].

1.3 Motivations to Use Steganography

There has been a rapid growth of interest in this subject over the last

few years, and for many reasons [8, 9]:

1. The publishing and broadcasting industries have become interested in techniques for hiding encrypted copyright marks and serial numbers in digital films, audio recordings, books and multimedia products; an appreciation of new market opportunities created by digital distribution is coupled with a fear that digital works could be too easy to copy.

2. Moves by various governments to restrict the availability of encryption services have motivated people to study methods by which private messages can be embedded in seemingly innocuous cover messages. The ease with which this can be done may be an argument against imposing restrictions.

3. Data needs protection from compromise or disclosure; for example, a design for a new business system is information that should be protected from disclosure.

4. People hide data because they do not want anyone but themselves to see it.

5. People hide data for covert communication, which can be very effective if no one expects anyone to communicate in that way.

6. Someone may not want anyone to see data because it contains a virus or Trojan.

1.4 Information Hiding Applications

Data hidden in text has a variety of applications, including copyright,

verification, authentication, and annotation. Making copyright information


inseparable from the text is one way for publishers to protect their products

in an era of increasing electronic distribution. Annotation can be used for

tamper protection. For example, if a cryptographic hash of the paper is

encoded into the paper, it is a simple matter to determine whether or not the

file has been changed. Verification is among the tasks that could easily be

performed by a server which, in this case, would return the judgment

“authentic” or “unauthentic” as appropriate [5].
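The tamper-protection idea above can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash (the text does not name one) and using the hypothetical helper names `digest` and `is_authentic`; how the digest is hidden inside the document itself is not shown.

```python
import hashlib


def digest(text: str) -> str:
    """Return the SHA-256 digest of the text as a hex string (assumed hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_authentic(text: str, stored_digest: str) -> bool:
    """Judge the text authentic if its digest matches the stored one."""
    return digest(text) == stored_digest


# Hypothetical cover text, written with escapes to avoid encoding surprises.
original = "\u0646\u0635 \u0639\u0631\u0628\u064a"
stored = digest(original)

print(is_authentic(original, stored))        # unchanged text
print(is_authentic(original + " ", stored))  # one added space
```

A verification server of the kind mentioned in the text would only need the stored digest and the received document to return its judgment.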

1.5 Research Problem

The problem of the research can be summarized as follows:

1. The well-known methods for hiding information in a document file do not offer effective resistance to attack, so a new method is needed that is suitable for Internet applications such as e-mail.

2. With the spread of networked computers used for data transfer, there is a need to transfer data securely between these computers.

1.6 Research Importance

The importance of the work comes from the following aspects:

1. The software could be used in governmental and special offices to hide personal and security information on their local computers.

2. There is a proposal to implement a computer network in the University of Technology, and most of the files to be transferred between users are in Arabic. This offers an opportunity to use this medium to hide secured information (for example, to transmit secure data between the university chairmanship and the departments).

3. This work could be used as a practical course in the field of ciphering and hiding data in Arabic text.

4. Previous research hides information in English text only, while the current research hides information in Arabic text.

5. Some previous research studies information hiding techniques from the theoretical side only, while this research studies both the theoretical and practical sides.

1.7 Research Aims

The main aims of this work are:

1. Design software that is capable of ciphering and hiding data in Arabic

text.

2. Design an instructional package to represent or view the scientific

concepts of cryptography and steganography.

1.8 Research Limits

This work is limited to the following:

1. Design and implementation of software that is capable of hiding information in Arabic document files.

2. The document files used to hide data are Microsoft Word documents with the extensions (.DOC, .RTF).

3. The hidden message uses only Arabic characters.

4. Design of an instructional package to describe the concepts of cryptography and steganography techniques in Arabic document files.

5. The software can be used by:

A. Those working in computer engineering, computer science, and communication engineering.

B. Postgraduate students in computer science, computer engineering, and communication engineering departments.

6. The work was carried out in the academic year 2005-2006 at the University of Technology.


1.9 Terminology

Several terms used in this research need to be defined:

1. Network: two or more computers linked together for the purpose of communicating and sharing information and other resources. Most networks are built around a cable connection that links the computers and lets them talk (and listen) through a wire.

2. Encoding: the process of transforming information from one format into another. A character encoding is a code that pairs a set of natural language characters (such as an alphabet) with a set of something else, such as numbers.

3. Decoding: the opposite operation of encoding; it transforms encoded information back into its original format.

4. Package: an instructional package is one of the instructional design programs. It consists of three elements: printed materials, audio materials, and visual materials.
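The encoding and decoding definitions above can be illustrated with Unicode code points, which pair each character with a number. This is a minimal sketch; the function names `encode` and `decode` and the sample word are illustrative assumptions, not part of the thesis software.

```python
def encode(text: str) -> list:
    """Pair each character with its Unicode code point (a number)."""
    return [ord(ch) for ch in text]


def decode(codes: list) -> str:
    """Opposite operation: turn code points back into characters."""
    return "".join(chr(code) for code in codes)


# The Arabic letters seen, lam, alef, meem, written as escapes.
word = "\u0633\u0644\u0627\u0645"
codes = encode(word)

print([hex(code) for code in codes])
print(decode(codes) == word)
```

Decoding the list returned by `encode` restores the original string, which is what makes the pairing usable as a code.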

1.10 Literature Review

1.10.1 Engineering Literatures

1. Brassil, J. T., et al., “Copyright Protection for the Electronic

Distribution of Text Documents”, 1999. The researchers proposed a

watermarking method called word-shift coding. In this method, each

line is first divided into groups of words. Each group has a sufficient

number of characters. Then, each even group is shifted to the left or the

right according to the value of a specific bit in the payload. The odd

groups are used as references for measuring and comparing the

distances between the groups during the decoding stage. A correlation

method has been suggested for detecting the watermark. This method

requires the use of the original document, especially when the inter-

word spacing is variable [10].


2. Shaar, Mahmoud, et al., “A Hybrid Hiding Encryption Algorithm for Data Communication Security”, 2003. The researchers present an encryption algorithm that can be used in hardware-implemented applications to secure data communications. The algorithm is based on hiding a number of bits from the plaintext message in a random vector of bits. The locations of the hidden bits are determined by a key known to the sender and the receiver. The title of the paper reflects the two basic operations of the algorithm, which involve inserting part of the plaintext bits into a cover to hide them from recognition. No conventional operations are applied to the ciphered text; the bits are simply hidden in a random bit string [11].

3. Kim, Young-Won, et al., “A Text Watermarking Algorithm based on Word Classification and Inter-word Space Statistics”, 2003. The researchers propose a text watermarking algorithm that exploits the novel concepts of word classification and inter-word space statistics. The words are classified using some features. Several adjacent words are grouped into a segment, and the segments are also classified using the word class information. The same amount of information is inserted into each of the segment classes. The information is encoded by modifying some statistics of the inter-word spaces of the segments belonging to the same class. Several advantages over the conventional word-shift algorithms come from the concepts of word and segment classification and from using the statistical distributions of inter-word spaces, whereas in the conventional algorithms individual lines or words hide a portion of the total watermarking information independently of other lines or words [12].


4. Sui, Xin-Guang, and Luo, Hui, “A New Steganography Method Based on Hypertext”, 2004. The researchers analyze the structure of hypertext files and propose a new secure steganography method. This method hides secret information in hypertext by modifying the written states of the markup letters. Experiments and analysis indicate that it is a method with high efficiency and security: because it modifies only the markup letters and not the content itself, the stego-hypertext and the cover show no difference in normal display. The algorithm also does not lengthen the file, since it modifies the markup letters rather than adding letters [13].

5. Topkara, M., et al., “Natural Language Watermarking”, 2005. The

researchers discuss natural language watermarking, which uses the

structure of the sentence constituents in natural language text in order

to insert a watermark. This approach is different from techniques,

collectively referred to as “text watermarking,” which embed

information by modifying the appearance of text elements, such as

lines, words, or characters. The goal in this paper is to review the

current state of the art in natural language watermarking. The type of

the text that is being modified for watermarking has an important effect

on the process of evaluation. For example, when watermarking a

magazine article or a novel, the emphasis may be on the preservation of

the author’s style. On the other hand, when watermarking a cooking

recipe or a user manual, preserving the preciseness and jargon would be

more important [14].


6. Voloshynovskiy, S., et al., “Text Data-Hiding for Digital and Printed Documents: Theoretical and Practical Considerations”, 2006. In this paper, the researchers propose a new theoretical framework for the data-hiding problem of digital and printed text documents. The main idea of this interpretation is to consider a text character as a data structure consisting of multiple quantifiable features such as shape, position, orientation, size, color, etc. They also introduce color quantization, a new semi-fragile text data-hiding method that is fully automatable, has a high information embedding rate, and can be applied to both digital and printed text documents. The main idea of this method is to quantize the color or luminance intensity of each character in such a manner that the human visual system is not able to distinguish between the original and quantized characters, while the difference can easily be detected by a specialized reading machine. The implementation of this method in a digital-only environment is straightforward. In the experiments, the researchers implemented a prototype for Microsoft Office Word documents capable of embedding and extracting any arbitrary message. The experimental work confirmed that this method has high perceptual invisibility, a high information embedding rate, and is fully automatable [15].
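As an illustration of the kind of markup-based hiding reviewed in item 4 above, the following sketch hides bits in the letters of hypertext tags. It assumes that the "written state" of a markup letter is its case (upper case for bit 1, lower case for bit 0) and that tags are simple names without attributes; both are assumptions made for this sketch, and the function names `embed` and `extract` are hypothetical. It is not a reconstruction of the cited algorithm.

```python
def embed(html: str, bits: str) -> str:
    """Hide bits in the case of ASCII letters that appear inside <...>."""
    out, inside, used = [], False, 0
    for ch in html:
        if ch == "<":
            inside = True
        elif ch == ">":
            inside = False
        elif inside and ch.isascii() and ch.isalpha() and used < len(bits):
            # Upper case encodes 1, lower case encodes 0.
            ch = ch.upper() if bits[used] == "1" else ch.lower()
            used += 1
        out.append(ch)
    if used < len(bits):
        raise ValueError("cover has too few markup letters for the message")
    return "".join(out)


def extract(html: str, count: int) -> str:
    """Read back the first `count` bits from the case of the markup letters."""
    bits, inside = [], False
    for ch in html:
        if ch == "<":
            inside = True
        elif ch == ">":
            inside = False
        elif inside and ch.isascii() and ch.isalpha() and len(bits) < count:
            bits.append("1" if ch.isupper() else "0")
    return "".join(bits)


# Hypothetical cover: a tiny page whose visible content is Arabic text.
cover = "<html><body><p>\u0646\u0635</p></body></html>"
stego = embed(cover, "1011001")

print(stego)
print(extract(stego, 7))
```

Because only the case of tag letters changes, the stego text has the same length as the cover and the visible content is untouched, which matches the two properties the review attributes to the method.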

1.10.2 Instructional Technology Literatures

1. Uden, L. and Alderson, A., “Teaching and Learning Using Instructional Design”, 2000. The researchers offered an Instructional System Design module to final-year computing science students at Staffordshire University. The aim was to teach the group of students the various instructional design theories and Instructional System Design processes. They wanted to establish whether the instructional design theories and Instructional System Design processes helped students to understand their learning better and improve their work. They concluded that applying instructional design theories and the Instructional System Design processes offers many benefits in helping students learn. It enables students to classify the subject into learning outcomes using a taxonomy such as Gagne’s. The Instructional System Design processes help students to identify the activities involved in learning the subject. Finally, they also help students to assess their learning against the appropriate learning outcomes [16].

2. Tubsree, Chalong, and Tubsree, Nai-Fen Yu, “Designing Effective Instruction for Computer in Education Courses”, 2002. The researchers design and develop effective instruction for a computer in education course. By the end of the study, seven instructional packages had been developed. The researchers then evaluated the packages by considering students’ performance after studying them. It was found that all students performed at the mastery level and reported high satisfaction with the problem-solving and construction tasks. This indicated that the developed instructional packages helped students learn [17].

3. Mushtaq, Rasha F., “Educational Package for Detecting Hidden Information Embedded in an Image”, 2006. The researcher aims to design an educational package covering the scientific concepts of steganalysis by building an instructional computer program that uses the tutorial method to display its content, and to have it evaluated by experts. The study reached the following conclusions:

a. The instructional package helped the learners to develop themselves, because the package provides feedback directly and quickly.

b. The instructional package, as a learning device, has psychological and educational impacts, because the learners depend on themselves [18].

1.11 Thesis organization

This thesis consists of four chapters. Following this chapter, the remaining chapters are organized as follows:

• Chapter two: gives the ideas behind cryptography and the generation of a random keystream, steganography and hiding methods and protocols, the Unicode system, and the data compression technique.

• Chapter three: presents the design of the proposed software, which hides a message using five methods (word space method, word shift method, hyphen method, Unicode method, and hiding in the RTF file format).

• Chapter four: presents the software implementation and results, as well as conclusions, recommendations and suggestions for future work.


Chapter Two

Theoretical Concepts of

Data Hiding


2.1 Introduction

Cryptography and steganography are effective methods used to protect a plaintext message by encrypting and hiding it. The security of the system is based on the difficulty of the inverse computation.

2.2 Cryptography

2.2.1 General Concepts

A method of encryption and decryption is called a cipher. Its goal is to protect information from unauthorized users. Modern algorithms use a key to control encryption and decryption. A message can be decrypted only if the decryption key matches the encryption key.

There are two classes of key-based encryption algorithms,

symmetric (or secret-key) and asymmetric (or public-key) algorithms.

The difference is that symmetric algorithms use the same key for

encryption and decryption, whereas asymmetric algorithms use a different

key for encryption and decryption.

Symmetric algorithms can be divided into stream ciphers and block ciphers. Stream ciphers can encrypt a single bit of plaintext at a time, whereas block ciphers take a number of bits (typically 64 bits, or 128 bits in newer ciphers such as AES) and encrypt them as a single unit.

Asymmetric ciphers (also called public-key cryptography) permit

the encryption key to be public, allowing anyone to encrypt with the key,

whereas only the proper recipient (who knows the decryption key) can

decrypt the message. The encryption key is also called the public key and

the decryption key is the private key or secret key [19].

There are hundreds of types of cipher systems, ranging from very simple paper-and-pencil systems to very complex cipher machines or computer-based enciphering systems. These can be categorized as transposition, substitution, or a combination of the two. In a transposition system, the plaintext characters of a message are systematically rearranged.


After transposing a message, the same characters are still present, but the

order of the letters is changed. In a substitution system, the plaintext

characters of a message are systematically replaced by other characters.

After the substitution takes place, the order of the underlying plaintext is

unchanged, but the same characters are no longer present [19].

2.2.2 Stream Cipher

A stream cipher is a type of symmetric encryption algorithm. It can be designed to be exceptionally fast, much faster than any block cipher. While block ciphers operate on large blocks of data, stream ciphers typically operate on smaller units of plaintext, usually bits. The encryption of any particular plaintext with a block cipher will result in the same ciphertext when the same key is used. With a stream cipher, the transformation of these smaller plaintext units will vary, depending on when they are encountered during the encryption process.

A stream cipher generates what is called a keystream (a sequence of

bits used as a key). Encryption is accomplished by combining the

keystream with the plaintext, usually with the bitwise XOR operation. The

generation of the keystream can be independent of the plaintext and

ciphertext, yielding what is termed a synchronous stream cipher, or it can

depend on the data and its encryption, in which case the stream cipher is

said to be self-synchronizing [20].
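The combination of keystream and plaintext described above can be sketched as follows. This is a minimal illustration only: the function name xor_with_keystream and the sample keystream bytes are assumptions made for the example and are not taken from the proposed software.

```python
def xor_with_keystream(data: bytes, keystream: bytes) -> bytes:
    """Combine each data byte with the matching keystream byte using XOR."""
    if len(keystream) < len(data):
        raise ValueError("keystream must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


plaintext = b"HIDE"
keystream = bytes([0x5A, 0xC3, 0x0F, 0x99])  # illustrative values, not a real key

ciphertext = xor_with_keystream(plaintext, keystream)
# Applying the same keystream a second time restores the plaintext,
# because (p XOR k) XOR k = p.
recovered = xor_with_keystream(ciphertext, keystream)
```

Because XOR is its own inverse, the same routine serves for both encryption and decryption, provided both sides generate the same keystream.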

Stream ciphers are generally faster than block ciphers. They are also more appropriate, and in some cases mandatory, when buffering is limited or when characters must be processed individually as they are received. Because they have limited or no error propagation, stream ciphers may also be advantageous in situations where transmission errors are highly probable [21].


2.2.3 Random Number Generation

A random number generator is an algorithm that outputs a sequence

of 0s and 1s such that at any point, the next bit cannot be predicted based

on the previous bits. However, true random number generation is difficult

to do on a computer, since computers are deterministic devices. Thus, if the

same generator is run twice from the same starting state, identical results are produced. True

random number generators take input from something in the physical

world, for example, the rate of neutron emission from a radioactive

substance.

Because of these difficulties, random number generation on a computer is usually only pseudo-random number generation. A pseudo-random number generator produces a sequence of bits that has a random-looking distribution. Each different seed yields a different pseudo-random sequence, while the same seed always reproduces the same sequence [22].
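The role of the seed can be illustrated with a short sketch. It uses Python's standard random module purely to show the deterministic behavior; that module is not cryptographically secure, and the helper name pseudo_random_bits is an assumption made for this example.

```python
import random


def pseudo_random_bits(seed: int, count: int) -> list[int]:
    """Return `count` pseudo-random bits produced from the given seed."""
    rng = random.Random(seed)  # independent generator with its own state
    return [rng.getrandbits(1) for _ in range(count)]


first_run = pseudo_random_bits(seed=7, count=128)
second_run = pseudo_random_bits(seed=7, count=128)   # same seed, same bits
other_seed = pseudo_random_bits(seed=8, count=128)   # different seed
```

Here first_run and second_run are identical, which is exactly the property that lets a sender and a receiver who share a seed regenerate the same keystream.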

2.2.4 Shift Register Based Schemes

The vast majority of proposed keystream generators are based in some way on linear feedback shift registers, because their behavior is easily analyzed using algebraic techniques [23].

2.2.4.1 Linear Feedback Shift Register

A Linear Feedback Shift Register (LFSR) is a mechanism for

generating a sequence of binary bits (keystream). It consists of a number of

stages numbered from left to right as 0…L-1 with feedback from each to

stage 0, as shown in Figure (2.1). The contents of the L stages of a register

describe its state.

The register is controlled by a clock, and at each clock instant

the contents of stage i are moved to stage i+1. The contents of stage L-1 are

output and form part of the sequence, while the new contents of stage 0 are

calculated as some linear function of the previous contents of


stages 0…L-1; the particular function depends on the feedback used, where

each ci is the control bit that determines whether stage i contributes to the feedback [23, 24].

Figure (2.1) Linear Feedback Shift Register

LFSRs are fast and easy to implement in both hardware and software. With a judicious choice of feedback taps, the sequences that are generated can have a good statistical appearance. However, the sequences generated by a single LFSR are not secure, because a powerful mathematical framework has been developed over the years which allows for their straightforward analysis. Nevertheless, LFSRs are useful as building blocks in more secure systems.
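The clocking rule described above can be sketched in a few lines. The function name lfsr_bits, the 4-stage register, its initial state, and the choice of taps are assumptions for illustration; the taps used here correspond to the recurrence a(t+4) = a(t+1) + a(t) mod 2, which gives the maximal period of 15 for a 4-stage register.

```python
def lfsr_bits(state, taps, count):
    """Clock an LFSR `count` times and return the output bits.

    state[i] is the content of stage i (stage 0 first, stage L-1 last);
    taps[i] is the control bit ci that enables feedback from stage i.
    """
    state = list(state)
    output = []
    for _ in range(count):
        output.append(state[-1])          # stage L-1 is the output
        feedback = 0
        for c, s in zip(taps, state):     # linear (XOR) function of the stages
            feedback ^= c & s
        state = [feedback] + state[:-1]   # stage i moves to stage i+1
    return output


# Illustrative 4-stage register: feedback taken from stages 2 and 3.
sequence = lfsr_bits(state=[1, 0, 0, 0], taps=[0, 0, 1, 1], count=30)
```

With these values the output repeats after 15 bits, the longest period a 4-stage register can produce, since the all-zero state never occurs.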

2.2.4.2 Combination and Filter Generators

When using linear feedback shift registers there are two obvious

ways to generate an alternative output. The first is to use several registers

in parallel and to combine their output in some cryptographically secure

way. A generator like this is conventionally called a combination

generator. Another alternative is to generate the output sequence as some

nonlinear function of the state of a single register; such a register is termed

a filter generator [21].

2.2.4.3 Multiplexers

A multiplexer is a logic device that selects one input from a set of

inputs according to the value of another index input. The keystream


generator is conventionally described using two sequences and the

multiplexer is used to combine these two sequences in a highly nonlinear

way [23]. Figure (2.2) shows a two-to-one multiplexer block diagram.

Figure (2.2) Two by one multiplexer
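A two-to-one multiplexer can be sketched as below. The arrangement shown, with one selector sequence choosing between two data sequences, is an assumption made for illustration and may differ from the exact wiring of the generator in Figure (2.2); the function name mux2 and the sample bits are likewise hypothetical.

```python
def mux2(select_bits, input0, input1):
    """For each clock, output input1's bit if the selector is 1, else input0's."""
    return [b if s else a for s, a, b in zip(select_bits, input0, input1)]


# In a keystream generator the three sequences would typically come from
# shift registers; fixed sample bits are used here.
selector = [0, 1, 1, 0]
data0 = [0, 0, 1, 1]
data1 = [1, 1, 0, 0]
keystream = mux2(selector, data0, data1)
```

The output depends on the selector in a nonlinear way, which is the reason a multiplexer is attractive for combining otherwise linear sequences.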

2.2.4.4 Desirable Properties of LFSR-Based Keystream Generators

For essentially all possible secret keys, the output sequence of an

LFSR-based keystream generator should have the following properties [21]:

1. Large period.

2. Large linear complexity.

3. Good statistical properties.

2.2.4.5 Life Cycle of a Key

Keys have limited lifetimes for a number of reasons. The most

important reason is protection against cryptanalysis. Each time the key is

used, it generates a number of ciphertexts. Using a key repetitively allows

an attacker to build up a store of ciphertexts which may prove sufficient for

a successful cryptanalysis of the key value. Thus keys should have a

limited lifetime [25].


2.2.5 Statistical Tests

A number of statistical tests are designed to measure the quality of a generator purported to be a random bit generator. The tests described here help detect certain kinds of weaknesses the generator may have. This is accomplished by taking a sample output sequence of the generator and subjecting it to various statistical tests. Each statistical test determines whether the sequence possesses a certain attribute that a truly random sequence would be likely to exhibit. If the sequence is deemed to have failed any one of the statistical tests, the generator may be rejected as being non-random. On the other hand, if the sequence passes all of the statistical tests, the generator is accepted as being random. Five tests are discussed below [21].

a. Frequency Test (Mono Bit Test)

The purpose of this test is to determine whether the number of 0’s

and 1’s in a binary sequence (s) are approximately the same, as would be

expected for a random sequence. Let n0 and n1 denote the number of 0’s

and 1’s in s, respectively. The statistic used is [21]:

$X_1 = \frac{(n_0 - n_1)^2}{n}$ … 2.1

where X1: Frequency test value

n : length of the sequence

X1 approximately follows a χ² (chi-square) distribution with one degree of freedom if n ≥ 10.
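The frequency test statistic of equation 2.1 can be computed as in the following sketch. The function name frequency_test and the sample sequence are assumptions for illustration; the decision threshold is left to the caller.

```python
def frequency_test(bits):
    """Return X1 = (n0 - n1)^2 / n for a sequence of 0/1 integers."""
    n = len(bits)
    n1 = sum(bits)      # number of 1's
    n0 = n - n1         # number of 0's
    return (n0 - n1) ** 2 / n


sample = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1]  # 7 zeros, 8 ones
x1 = frequency_test(sample)
```

The value is then compared with a chi-square threshold for one degree of freedom (about 3.84 at the 5% significance level); a larger value suggests that the sequence is biased toward 0's or 1's.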

b. Serial Test (Two-Bit Test)

The purpose of this test is to determine whether the number of

occurrences of 00, 01, 10, and 11 as subsequences of s are approximately

the same, as would be expected for a random sequence. Let n0, n1 denote

the number of 0’s and 1’s in s, respectively, and let n00, n01, n10, and n11

denote the number of occurrences of 00, 01, 10, 11 in s, respectively.


Note that n00 + n01 + n10 + n11 = (n - 1), since the subsequences are allowed to overlap. The statistic used is [21]:

$X_2 = \frac{4}{n-1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1$ … 2.2

where X2: Serial test value

X2 approximately follows a χ² (chi-square) distribution with two degrees of freedom if n ≥ 21.
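A minimal sketch of the serial test statistic of equation 2.2 follows, counting overlapping pairs as described above. The function name serial_test is an assumption for illustration, and the input is assumed to be a list of 0/1 integers.

```python
from collections import Counter


def serial_test(bits):
    """Return X2 for a list of 0/1 integers, using overlapping bit pairs."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    pairs = Counter(zip(bits, bits[1:]))   # the n - 1 overlapping pairs
    pair_squares = sum(pairs[p] ** 2 for p in [(0, 0), (0, 1), (1, 0), (1, 1)])
    return 4 / (n - 1) * pair_squares - 2 / n * (n0 ** 2 + n1 ** 2) + 1


# A strictly alternating sequence never contains 00 or 11, so it scores badly.
x2 = serial_test([0, 1] * 8)
```

For the alternating sequence the statistic is far above what a chi-square distribution with two degrees of freedom would allow, even though the same sequence passes the frequency test.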

c. Poker Test

Let m be a positive integer such that $\lfloor n/m \rfloor \geq 5 \times 2^m$, and let $k = \lfloor n/m \rfloor$. Divide the sequence s into k non-overlapping parts each of length m, and let n_i be the number of occurrences of the ith type of sequence of length m, 1 ≤ i ≤ 2^m. The poker test determines whether the sequences of length m each appear approximately the same number of times in s, as would be expected for a random sequence. The statistic used is [21]:

$X_3 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k$ … 2.3

where X3: Poker test value

X3 approximately follows a χ² distribution with 2^m - 1 degrees of freedom.
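The poker test statistic of equation 2.3 can be sketched as follows. The function name poker_test is an assumption for illustration; the caller is assumed to choose m so that the condition on ⌊n/m⌋ stated above holds.

```python
from collections import Counter


def poker_test(bits, m):
    """Return X3 for a list of 0/1 integers split into blocks of length m."""
    k = len(bits) // m                     # number of non-overlapping blocks
    counts = Counter(tuple(bits[i * m:(i + 1) * m]) for i in range(k))
    sum_of_squares = sum(c * c for c in counts.values())
    return (2 ** m) * sum_of_squares / k - k


# Each of the four 2-bit patterns appears exactly five times (k = 20).
balanced = [0, 0, 0, 1, 1, 0, 1, 1] * 5
x3 = poker_test(balanced, m=2)
```

A perfectly balanced sequence gives a statistic of zero, while a sequence that uses only one pattern gives the largest possible value.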

d. Runs Test

The purpose of the runs test is to determine whether the number of

runs of various lengths in the sequence s is as expected for a random

sequence. The expected number of gaps (or blocks) of length i in a random sequence of length n is $e_i = (n - i + 3)/2^{i+2}$. Let k be equal to the largest integer i for which e_i ≥ 5. Let B_i and G_i be the number of blocks and gaps, respectively, of length i in s for each i, 1 ≤ i ≤ k. The statistic used is [21]:


$X_4 = \sum_{i=1}^{k} \frac{(B_i - e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(G_i - e_i)^2}{e_i}$ … 2.4

where X4: Runs test value

X4 approximately follows a χ² distribution with 2k-2 degrees of freedom.
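The runs test statistic of equation 2.4 can be sketched as below. The sketch assumes the usual convention that a block is a run of 1's and a gap is a run of 0's; the function name runs_test and the returned pair (X4, k) are choices made for this illustration.

```python
from itertools import groupby


def runs_test(bits):
    """Return (X4, k) for a list of 0/1 integers."""
    n = len(bits)

    def expected(i):
        # e_i = (n - i + 3) / 2^(i + 2)
        return (n - i + 3) / 2 ** (i + 2)

    k = 0
    while expected(k + 1) >= 5:            # largest i with e_i >= 5
        k += 1
    if k == 0:
        raise ValueError("sequence is too short for the runs test")

    blocks = [0] * (k + 1)                 # blocks[i]: runs of 1's of length i
    gaps = [0] * (k + 1)                   # gaps[i]: runs of 0's of length i
    for bit, run in groupby(bits):
        length = len(list(run))
        if length <= k:
            (blocks if bit == 1 else gaps)[length] += 1

    x4 = sum(((blocks[i] - expected(i)) ** 2 + (gaps[i] - expected(i)) ** 2)
             / expected(i) for i in range(1, k + 1))
    return x4, k


x4, k = runs_test([0, 1] * 20)             # 20 blocks and 20 gaps of length 1
```

For this alternating sequence of length 40, only runs of length 1 are counted (k = 1), and their number is far above the expected 5.25, so the statistic is large.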

e. Autocorrelation Test

The purpose of this test is to check for correlation between the

sequence s and its (non-cyclic) shifted versions.

Let d be a fixed integer, 1 ≤ d ≤ ⌊n/2⌋. With the bits of s indexed as s_0, …, s_{n-1}, the number of bits in s not equal to their d-shifts is [21]:

$A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$ … 2.5

where ⊕ denotes the XOR operator. The statistic used is

$X_5 = 2\left(A(d) - \frac{n-d}{2}\right) / \sqrt{n-d}$ … 2.6

where X5: Autocorrelation test value

X5 approximately follows a standard normal distribution, N(0, 1), if n - d ≥ 10.
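Equations 2.5 and 2.6 can be evaluated as in the following sketch, which indexes the bits of s from 0. The function name autocorrelation_test and the sample sequence are assumptions for illustration.

```python
import math


def autocorrelation_test(bits, d):
    """Return X5 for a list of 0/1 integers and a shift d."""
    n = len(bits)
    # A(d): number of positions where a bit differs from the bit d places later.
    a = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    return 2 * (a - (n - d) / 2) / math.sqrt(n - d)


alternating = [0, 1] * 10
x5_shift1 = autocorrelation_test(alternating, d=1)  # every bit differs
x5_shift2 = autocorrelation_test(alternating, d=2)  # no bit differs
```

Since the test is two-sided, both a large positive and a large negative value indicate correlation between the sequence and its shifted version.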

2.3 Steganography (Data Hiding in Text)

2.3.1 General Concepts

Soft-copy text is in many ways the most difficult place to hide data. This is due largely to the relative lack of redundant information in a text file as compared with a picture or a sound. While it is often possible to make imperceptible modifications to a picture, even an extra letter or period in text may be noticed by a casual reader. There are many methods of encoding data; some of them are open space methods that encode through manipulation of white space, syntactic methods that utilize


punctuation, and semantic methods that encode using manipulation of the

words themselves [26].

2.3.2 Coding Methods

There are many categories of coding methods; some of them are:

2.3.2.1 Open Space Methods

There are two reasons why the manipulation of white space in

particular yields useful results. First, changing the number of trailing

spaces has little chance of changing the meaning of a phrase or sentence.

Second, a casual reader is unlikely to take notice of slight modifications to

white space. There are three methods of using white space to encode data.

The methods exploit inter-sentence spacing, end-of-line spaces, and inter-

word spacing in justified text.

a. Inter-Sentence Spacing

The first method encodes a binary message into a text by placing either one or two spaces after each terminating character, e.g., a period (.) or comma (,). A single space encodes a “0,” while two spaces encode a “1.” This method has a number of inherent problems. It is inefficient, requiring a great deal of text to encode very few bits: one bit per sentence equates to a data rate of approximately one bit per 160 bytes, assuming sentences are on average two 80-character lines of text. Its ability to encode depends on the structure of the text. Many word processors automatically set the number of spaces after periods to one or two characters. Finally, inconsistent use of white space is not transparent [26]. Figure (2.3) shows an example of data hiding using the inter-sentence spacing method.


تعتبر وحدة المعالجة المركزية في الحاسب من أهم الأجزاء بل أهمها على الإطـلاق لأنها بمثابة العقل في الجهاز، كما أنها تعمل على إنجاز كافة العمليات الحـسابية فـي سرعات مذهلة، بالإضافة إلى معالجة مختلف أنواع البيانات والتنسيق بين جميع أجزاء

ج من أكثر الأجهزة تعقيدا، حيـث يحتـوي علـى ملايـين الحاسب، ويعتبر المعال الترانزستورات والتي تترابط مع بعضها البعض بواسطة شعيرات معدنية من الزجاج

.المصهور والتي لها سمكها أرق مئات المرات من سمك الشعرة الواحدة للإنسان

Figure (2.3) Example of data hidden using Inter-sentence spacing
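The inter-sentence spacing rule (one space for “0”, two spaces for “1”) can be sketched as follows. The function names embed_bits and extract_bits and the English placeholder sentences are assumptions for illustration; the sketch handles only sentences that end with a period.

```python
import re


def embed_bits(sentences, bits):
    """Join sentences, using one space for bit 0 and two spaces for bit 1."""
    if len(bits) > len(sentences) - 1:
        raise ValueError("not enough sentence gaps to carry the message")
    pieces = []
    for i, sentence in enumerate(sentences[:-1]):
        extra = i < len(bits) and bits[i] == 1
        pieces.append(sentence + ("  " if extra else " "))
    pieces.append(sentences[-1])
    return "".join(pieces)


def extract_bits(text):
    """Read one bit per gap that follows a period: two spaces -> 1, one -> 0."""
    return [1 if len(m.group(1)) == 2 else 0
            for m in re.finditer(r"\.( {1,2})(?=\S)", text)]


cover = ["First sentence.", "Second sentence.", "Third sentence.", "Fourth."]
stego = embed_bits(cover, [1, 0, 1])
hidden = extract_bits(stego)
```

Four sentences carry only three bits here, which illustrates the low data rate noted above.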

b. End-of-Line Spaces

A second method of exploiting white space to encode data is to insert spaces at the end of lines. The data are encoded by allowing for a predetermined number of spaces at the end of each line. Two spaces encode one bit per line, four encode two, eight encode three, etc., dramatically increasing the amount of information that can be encoded compared with the previous method. In Figure (2.4), the text has been selectively justified and then had spaces added to the end of lines to encode more data. Further advantages of this method are that it can be applied to any text and that it will go unnoticed by readers, since the additional white space is peripheral to the text. As with the previous method, some programs, e.g., “sendmail,” may inadvertently remove the extra space characters. A problem unique to this method is that the hidden data cannot be retrieved from hard copy [26].

Figure (2.4) Example of data hidden using End-of-line spaces

جزاء بل أهمها على الإطلاق لأنهـا تعتبر وحدة المعالجة المركزية في الحاسب من أهم الأ بمثابة العقل في الجهاز، كما أنها تعمل على إنجاز كافة العمليات الحسابية في سـرعات مذهلة، بالإضافة إلى معالجة مختلف أنواع البيانات والتنـسيق بـين جميـع أجـزاء

لـى ملايـين الحاسب، و يعتبر المعالج من أكثر الأجهزة تعقيدا، حيـث يحتـوي ع الترانزستورات والتي تترابط مع بعضها البعض بواسطة شعيرات معدنيـة مـن الزجـاج

. المصهور والتي لها سمكها أرق مئات المرات من سمك الشعرة الواحدة للإنسان


c. Inter-Word Spacing

A third method of using white space to encode data involves left-justification of text. Data are encoded by controlling where the extra spaces are placed. One space between words is interpreted as a “0”. Two spaces are interpreted as a “1”. This method results in several bits encoded on each line, as shown in Figure (2.5). Because of constraints upon justification, not every inter-word space can be used as data. To determine which of the inter-word spaces represent hidden data bits and which are part of the original text, a Manchester-like encoding method can be used. Manchester encoding groups bits in sets of two, interpreting “01” as a “1” and “10” as a “0.” The bit strings “00” and “11” are null. For example, the encoded message “01100101010001” is reduced to “101111”, while “110011” is a null string [26].

Figure (2.5) Example of data hidden using Inter-word spacing
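The Manchester-like grouping used with inter-word spacing can be sketched as below, using the two examples given in the text. The function name manchester_decode is an assumption for illustration; the raw bits are represented as a string of the characters 0 and 1.

```python
def manchester_decode(raw_bits: str) -> str:
    """Reduce pairs of raw bits: "01" -> "1", "10" -> "0", "00"/"11" -> nothing."""
    decoded = []
    for i in range(0, len(raw_bits) - 1, 2):
        pair = raw_bits[i:i + 2]
        if pair == "01":
            decoded.append("1")
        elif pair == "10":
            decoded.append("0")
        # "00" and "11" are null pairs and carry no data
    return "".join(decoded)


message = manchester_decode("01100101010001")
empty = manchester_decode("110011")
```

The null pairs let the encoder skip inter-word spaces that justification does not allow it to modify.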

2.3.2.2 Syntactic Methods

There are many circumstances where punctuation is ambiguous or where mispunctuation has low impact on the meaning of the text. For example, the phrases “bread, butter, and milk” and “bread, butter and milk” are both considered correct usage of commas in a list. Because the choice of form is arbitrary, alternation between forms can represent binary data, e.g., anytime the first phrase structure (characterized by a comma appearing before the “and”) occurs, a “1” is inferred, and anytime the second phrase structure is found, a “0” is inferred. Other examples include the controlled use of contractions and abbreviations.


While written English affords numerous cases for the application of

syntactic data hiding, these situations occur infrequently in typical prose.

The expected data rate of these methods is on the order of only several bits

per kilobyte of text.

Although many of the rules of punctuation are ambiguous or

redundant, inconsistent use of punctuation is noticeable to even casual

readers. Finally, there are cases where changing the punctuation will

impact the clarity, or even meaning, of the text considerably. This method

should be used with caution. Figure (2.6) shows the data hiding with this

method.

Figure (2.6) Example of data hidden using Syntactic methods

2.3.2.3 Semantic Methods

Semantic methods are similar to syntactic methods. Rather than encoding binary data by exploiting ambiguity of form, these methods assign a primary or secondary value to each of two synonyms. For example, the word “big” could be considered primary and “large” secondary. Whether a word has primary or secondary value bears no relevance to how often it will be used, but, when decoding, primary words will be read as ones and secondary words as zeros.
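The primary/secondary synonym scheme can be sketched as follows. The synonym table, the function names embed and extract, and the sample sentence are hypothetical and serve only to illustrate the idea; a practical system would need a much larger dictionary and care with word senses.

```python
# Hypothetical synonym table: (primary word, secondary word).
PAIRS = [("big", "large"), ("small", "little")]
LOOKUP = {}
for primary, secondary in PAIRS:
    LOOKUP[primary] = (primary, secondary)
    LOOKUP[secondary] = (primary, secondary)


def embed(words, bits):
    """Replace each synonym word by its primary form (bit 1) or secondary form (bit 0)."""
    remaining = list(bits)
    result = []
    for word in words:
        if word in LOOKUP and remaining:
            primary, secondary = LOOKUP[word]
            result.append(primary if remaining.pop(0) == 1 else secondary)
        else:
            result.append(word)
    if remaining:
        raise ValueError("cover text has too few synonym words")
    return result


def extract(words):
    """Read primary words as 1 and secondary words as 0."""
    return [1 if LOOKUP[w][0] == w else 0 for w in words if w in LOOKUP]


cover = "a big dog and a small cat".split()
stego = embed(cover, [0, 1])
hidden = extract(stego)
```

In this sketch every synonym word in the text is read as a bit, so the receiver must know the message length or use an agreed terminator.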


2.3.2.4 Shift Coding

a. Line-Shift Coding

Line-shift coding is very easy to perform and is considered the

most resistant to degradation due to copying. In line-shift coding, the

lines of text are shifted vertically to encode the document, see

Figure(2.7).

Figure (2.7) Example of data hidden using Line-shift coding

By determining which lines have been shifted, the

encoded bits can be discovered. Although this method

withstands copying, the human eye and other measurements can

easily detect it. It can also be easily defeated through respacing

or reformatting of the text.

b. Word-Shift Coding

Word-shift coding can also be easily done. In word-shift coding, code words are coded into a document by shifting the horizontal location of words within lines of text, see Figure (2.8). In doing so, the appearance of natural spacing must be maintained in order not to arouse suspicion. By determining the locations where unnatural spacing has occurred, the encoded bits can be revealed.


Figure (2.8) Example of data hidden using word-shift coding

There are advantages in using word-shift coding instead of line-

shift coding. Word-shift coding is less obvious to the unsuspecting

reader. Readers are used to reading text that has been justified for a

better presentation. However, there are also ways that word-shift coding

can be detected. If an attacker knew the spacing algorithm, the attacker

can calculate the differences in spacing and figure out the encoded data.

Like line-shift coding, word-shift coding can also be easily defeated

through respacing or justification of the text.

2.3.2.5 Feature Coding

Feature coding is another way of embedding data into a text file. In

feature coding, certain text features are altered depending on the embedded

data. For example, one type of feature coding would be extending the

vertical lines of characters such as “l”, “d”, “b”, “h”, “p”, and “q”. In order

for this type of feature coding to work, the text must be altered by

randomizing the lengths of the vertical lines before applying this algorithm.

The randomness will help the text look less suspicious to its readers.

In order to decode this algorithm, the text, after the randomization,

but before the algorithm application, can be compared with the message

containing the embedded data to retrieve the encoded bits. This type of

feature coding can be easily defeated if the vertical line length is adjusted

to a fixed length before the file is opened.


2.3.3 Steganographic Protocols

There are basically three types of steganographic protocols: Pure Steganography, Secret Key Steganography, and Public Key Steganography [27].

2.3.3.1 Pure Steganography

This method of steganography is the least secure means by which to communicate secretly, because the sender and receiver can rely only upon the presumption that no other parties are aware of the secret message. On open systems such as the Internet, this presumption does not hold.

2.3.3.2 Secret Key Steganography

Secret Key Steganography is defined as a steganographic system that requires the exchange of a secret key (stego-key) prior to communication. Here, steganography takes a cover message and embeds the secret message inside it by using the stego-key. Only the parties who know the secret key can reverse the process and read the secret message. Unlike Pure Steganography, where a perceived invisible communication channel is present, Secret Key Steganography exchanges a stego-key, which makes it more susceptible to interception. The benefit of Secret Key Steganography is that even if the communication is intercepted, only parties who know the secret key can extract the secret message.

2.3.3.3 Public Key Steganography

Public Key Steganography takes its concepts from Public Key Cryptography.

Public Key Steganography is defined as a steganographic

system that uses a public key and a private key to secure the

communication between the parties wanting to communicate secretly. The

sender will use the public key during the encoding process and only the

private key, which has a direct mathematical relationship with the public


key, can decipher the secret message. Public Key Steganography provides a

more robust way of implementing a steganographic system because it can

utilize a much more robust and researched technology in Public Key

Cryptography. It also has multiple levels of security in that unwanted

parties must first suspect the use of steganography and then they would

have to find a way to crack the algorithm used by the public key system

before they could intercept the secret message.

2.4 Unicode System

Unicode is a universal character encoding standard, designed to represent text for computer interchange, processing, and display of many modern written languages. It was originally designed as a 16-bit encoding (its code space has since been extended to the range U+0000 to U+10FFFF) and encompasses many characters used in general text interchange throughout the world, including the principal written languages of Europe, America, the Middle East, India, Africa, Asia, and Pacifica. Each Unicode index (code point) refers unambiguously to a given character. Unicode allows a larger range of characters to be addressed than is possible using a single-byte character encoding [28]. Figure (2.9) shows the layout of this encoding system.

Figure (2.9) Unicode's encoding layout

2.4.1 Characters

A character is the smallest component of written language that has semantic value. It refers to the abstract meaning and/or shape, rather than a specific shape, although in code tables some form of visual representation is essential for the reader to understand [28].


2.4.2 Arabic Characters

Arabic script is a cursive writing system used for the Arabic language. The appearance of a letter changes depending on its context/position: isolated, initial (joined on the left), medial (joined on both sides), and final (joined on the right). The Arabic code points in the U+0600 - U+06FF range of the Unicode table (Appendix A) represent all of the letters without regard to their position; it is up to the font to show each letter with the proper appearance. For compatibility with existing standards, Unicode also defines code points with explicit positions for most letters (Arabic Presentation Forms-A and Presentation Forms-B) [29].
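The relationship between a position-independent Arabic letter and its explicit presentation forms can be inspected with Python's standard unicodedata module, as in the sketch below. The letter beh (U+0628) is chosen arbitrarily for illustration; the variable names are assumptions of this example.

```python
import unicodedata

beh = "\u0628"  # ARABIC LETTER BEH, position-independent code point

# Explicit positional forms of beh in the presentation forms range.
positional_forms = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

names = [unicodedata.name(ch) for ch in positional_forms]

# Compatibility normalization (NFKC) maps each positional form back to
# the position-independent letter.
folded = [unicodedata.normalize("NFKC", ch) for ch in positional_forms]
```

The names list contains the isolated, final, initial, and medial forms of beh, and every entry of folded equals the single code point U+0628. Several visually related code points therefore exist for one abstract letter, which is relevant to any hiding method that works at the Unicode level.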

2.5 Data Compression

In computer science, data compression is the process of encoding

information using fewer bits than a more obvious representation would

use [30].

As is the case with any form of communication, compressed data

communication only works when both the sender and receiver of the

information understand the encoding scheme. Compression is important

because it helps to reduce the consumption of expensive resources, such as

disk space or connection bandwidth.

The task of compression consists of two components, an encoding

algorithm that takes a message and generates a “compressed”

representation (hopefully with fewer bits), and a decoding algorithm that

reconstructs the original message or some approximation of it from the

compressed representation [31].

There are lossless and lossy forms of data compression. Lossless data

compression is used when the data has to be uncompressed exactly as it

was before compression. Text files are stored using lossless techniques,

since losing a single character can in the worst case make the text

dangerously misleading. Lossy compression, in contrast, works on the


assumption that the data doesn't have to be stored perfectly. Much

information can be simply thrown away from images, video data, and audio

data, and when uncompressed such data will still be of acceptable

quality [32].

2.5.1 Static Huffman Coding

The basic idea in Huffman coding is to assign short codewords to

those input blocks with high probabilities and long codewords to those with

low probabilities. It is a variable length coding technique that provides a

systematic approach to designing a variable length code which is best for

a given finite-alphabet source.

The Huffman algorithm uses the notion of prefix code. A prefix code

is a set of words containing no word that is a prefix of another word of the

set. The advantage of such a code is that decoding is immediate. Moreover

it can be proved that this type of code does not weaken the compression.

A prefix code on the binary alphabet {0,1} corresponds to a binary

tree in which the links from a node to its left and right children are labeled

by 0 and 1 respectively. Such a tree is called a (digital) tree. Leaves of the

tree are labeled by the original characters and labels of branches are the

words of the code (codewords of characters). Working with prefix code

implies that codewords are identified with leaves only. Moreover, in the present method codes are complete: they correspond to complete trees, i.e., trees in which every internal node has exactly two children.

In the model where characters of the text are given new codewords,

the Huffman algorithm builds a code that is optimal in the sense that the

compression is the best possible. The length of the encoded text is

minimum. The code depends on the input text and more precisely on the

frequencies of characters in the text. The most frequent characters are given


short codewords while the least frequent symbols correspond to the longest

codewords [33].

2.5.1.1 Encoding

The complete compression algorithm is composed of three steps:

count of character frequencies, construction of the prefix code, encoding of

the text. The last two steps use information computed by their preceding

step [33].

The first step consists of counting the number of occurrences of each character in the original text. It is possible to skip this first step if fixed statistics on the alphabet are used. In this case, however, the method is optimal according to the statistics but not necessarily for the specific text.

The second step of the algorithm builds the tree of a prefix code, called a Huffman tree, using the frequency freq(a) of each character a, as in Algorithm (2.1).

Algorithm (2.1) Creating Huffman tree

Create a one-node tree t for each character a,
   setting weight(t) = freq(a) and label(t) = a
Repeat until only one tree remains
   Extract the two least weighted trees t1 and t2
   Create a new tree t3 having left subtree t1, right subtree t2,
      and weight(t3) = weight(t1) + weight(t2)
   Insert t3 into the set of trees

Figure (2.10) shows an example of the Huffman tree.


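Figure (2.10) Example of a Huffman code represented as a binary tree (leaves: p(B)=0.51, p(A)=0.16, p(D)=0.13, p(E)=0.11, p(C)=0.09; internal nodes: p(AD)=0.29, p(CE)=0.20, p(ADCE)=0.49; root: p(ADCEB)=1.00; the two branches below each internal node are labeled 0 and 1)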
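The construction of Algorithm (2.1) can be sketched with a priority queue, here applied to the probabilities of the five-symbol example (p(A)=0.16, p(B)=0.51, p(C)=0.09, p(D)=0.13, p(E)=0.11). The function name huffman_code, the tuple representation of trees, and the tie-breaking counter are assumptions of this sketch; the assignment of 0 and 1 to left and right branches may differ from the labeling in the figure.

```python
import heapq
from itertools import count


def huffman_code(freq):
    """Build a Huffman tree from {symbol: weight} and return {symbol: codeword}."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # the two least weighted trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))
    _, _, tree = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: original character
            codes[node] = prefix or "0"

    walk(tree, "")
    return codes


codes = huffman_code({"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11})
```

The most probable symbol B receives a one-bit codeword, while the four remaining symbols receive three-bit codewords, in agreement with the merge order C+E, A+D, then ADCE, and finally the root.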

2.5.1.2 Decoding

Decoding a file containing a text compressed by Huffman algorithm

is a mere programming exercise. First the coding tree is rebuilt and then the

original text is recovered by parsing the compressed text with the coding

tree. The process begins at the root of the coding tree and follows a left

edge when a 0 is read or a right edge when a 1 is read. When a leaf is

encountered, the corresponding character (in fact, its original codeword)

is produced and the parsing resumes at the root of the tree. The process

ends when the codeword of the end marker is encountered [33].
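The decoding procedure can be sketched as follows. The code table is an assumed example of a complete prefix code for five symbols, the function name huffman_decode is hypothetical, and, as a simplification, the sketch stops at the end of the input instead of looking for an end marker.

```python
def huffman_decode(bits: str, codes: dict) -> str:
    """Decode a string of 0/1 characters using a prefix code table."""
    # Rebuild the coding tree as nested dictionaries keyed by "0" and "1".
    root = {}
    for symbol, word in codes.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol              # a leaf holds the character

    decoded = []
    node = root
    for bit in bits:
        node = node[bit]                     # follow the edge labeled by the bit
        if isinstance(node, str):            # leaf reached: emit and restart
            decoded.append(node)
            node = root
    return "".join(decoded)


# Assumed code table (a complete prefix code) used only for this example.
table = {"C": "000", "E": "001", "D": "010", "A": "011", "B": "1"}
text = huffman_decode("1011010", table)
```

Because no codeword is a prefix of another, the parser never needs to look ahead: each leaf reached identifies exactly one character.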

2.6 Technical Framework

2.6.1 Introduction

Instructional design has developed as a prescriptive science based on

a system approach linking basic research in the psychological processes of

learning with concrete solutions to instructional problems such as optimal

learning retention and transfer.

Many students in higher education find learning difficult, especially

when it comes to understanding the course content and doing their


coursework or assignments. Students are complaining that they do not

know what is expected of them in their subjects. Although they attend

lectures regularly, they often fail to know what they are supposed to know.

This is especially true when it comes to the examinations. Students have

little idea what they should be revising and what types of questions they

would expect to be asked. This results in students becoming demoralized

and frustrated. This problem can be resolved by the adoption of

instructional design research in teaching [16].

2.6.2 Instructional Design

Instructional design is concerned with understanding, improving and

applying methods of instruction. It is a process of deciding what methods

of instruction are best for bringing about desired changes in student

knowledge and skills for a specific student population [34]. Growth of

instructional design has evolved over the past half-century from an initial

narrow focus on programmed instruction to a multidimensional field of study,

integrating psychology, education, measurement and management [35].

Instructional design theory is a set of prescriptions for determining

the appropriate instructional strategies to enable learners to acquire

instructional goals. The theory is prescription-based and founded on

instructional theory and related disciplines. The emphasis is on what works

rather than on the steps to carry out the design and development process [15].

Instructional design [36] is a set of procedures for systematically

designing and developing instructional materials. The emphasis is primarily

on what to do, rather than how to do it or why it works. Instructional design

has many variations, but all involve seven basic phases [37]:

1. Planning.

2. Classification.

3. Analysis.

4. Construction.

5. Implementation.

6. Evaluation.

7. Development.

At the most general level, Instructional design is a process for

determining what to teach and how to teach it. The assumption is made that

there is a target population that should learn something. To determine what

is to be learned, the designer analyses a goal statement to identify

subordinate skills and formulates specific objectives and associated criteria

referenced assessments [16].

2.6.3 Instructional Package

An instructional package is a program that creates instructional events by interacting with the user. This makes the learning sequential and graded in continual steps [38].

The instructional package, in general, is formed from the following items:

1. Title which represents the title of the package.

2. Introduction that shows the idea of the contents.

3. Target community identification.

4. Instructional target which can be measured and observed by the

learner to expect what he will do during his study of the package.

5. Help about using the package.

6. Contents of the package units show the units used by the package.

7. Pre-test to know the skills of the learner.

8. Instructional activities and alternatives which are suitable for learner

characteristics and take into consideration the personal differences.

9. Exercise shows the range of the package benefit, which contains

feedback.

10. Post-test which is the final test used after finishing from all units to

determine that the aims of the package are achieved.

Chapter Three

The Proposed Hiding

Algorithm

3.1 Introduction

The proposed method provides a new technique for encrypting and hiding information in Arabic document files, demonstrating how a secret message can easily be encoded and embedded in a text file format.

First, linear feedback shift registers with a 128-bit key are used to build a stream cipher that generates a binary keystream sequence, which is combined with the plain message by exclusive-or to obtain the ciphered message.

Second, a new method is proposed to hide the ciphered message in an Arabic text file by exploiting the characteristics of the Unicode system.

3.2 Specification of the Proposed Software

The software hides the information in a file with the extension (.DOC), so it is fully compatible with Microsoft Word, which is a part of Microsoft Office. This lets everyone use the program to hide information in Microsoft Word document files.

The software is written in Visual Basic, whose features are used to design an information hiding editor and to manage the Microsoft Word objects that deal with Microsoft Word files.

The process of hiding a stream of information in a file can be

achieved using the following hiding methods:

1. Using hyphens.

2. Using spaces between words.

3. Change the word position (to: right, left, up, or down).

4. Unicode system method.

5. Hiding in RTF file format.

An essential part of creating a useful program is providing a simple and consistent way for the user to interact with it. Menus and toolbars provide a quick, convenient, and widely accessible way to expose simple commands and options to the user. They are easy to customize and are controlled from Visual Basic, and the program is written in a way that makes any future modification easy.

3.3 The proposed Software Structure

The program consists of several tasks and each was designed to

perform specific operation. The tasks of data encrypting and hiding

comprise the following steps:

1. Provide a plain text (to hide) and a password (for encryption).

2. Read the time from the computer clock.

3. Mix password with time.

4. Convert text to binary stream (with Huffman coding).

5. Initialize the linear feedback shift register.

6. Generate keystream.

7. Test keystream.

8. Check document file.

9. Encrypt plain text with keystream to get cipher text.

10. Hide cipher text in the document file.

11. Hide time in the document file.

Figure (3.1) shows the steps of the proposed software.

Figure (3.1) Steps of data encrypting and hiding

[Flowchart: Enter text to hide; Enter password for encryption; Read time from computer clock; Mix password with time to produce encryption key; Convert plain text to binary stream (using Huffman coding); Initialize the linear feedback shift register; Generate keystream; Test keystream (Pass / Fail); Check paragraph (Pass / Fail); Encrypt plain text (binary) with keystream to get cipher text; Hide cipher text in document file; Hide time in document file; End process.]

3.4 Operation of the proposed Software

In order to understand the operation of the proposed software, the

following subsection illustrates each step described in section (3.3).

3.4.1 Providing a Plain Text and a Password

The user provides a text (to be hidden) and a password (for encryption). The password consists of ten characters (letters, digits, or a mixture of both).

3.4.2 Loading Microsoft Word File

A document file can be loaded from the File menu; it must be compatible with the Microsoft Word editor and contain text written in the Arabic language.

Automating Word from Visual Basic allows the programmer to

export, edit, and return data by referencing another application's objects,

properties, and methods. Application objects that are referenced in another

application are called Automation objects. The first step toward making

Word available to Visual Basic for Automation is to create a reference to

the Word type library. A reference to the Word type library can be created,

by clicking References on the Tools menu in the Visual Basic Editor, and

then select the check box next to Microsoft Word 8.0 Object Library.

A Word Application object is then created and assigned to appWd, which gives access to the objects, properties, and methods of the Word Application object. The following example opens an existing Word document.

Set appWd = CreateObject("Word.Application")

appWd.Documents.Open filename

3.4.3 Selecting Hiding Method

One of the hiding methods can be selected from [Stego Menu → Select methods submenu]. The menu offers four methods for hiding the information in the Arabic text: the Unicode system method, the white space method, the hyphen method, and the change position method.

3.4.4 Reading the Time from the Computer Clock

The program first reads the time from the computer clock when the process starts. The process is started by pressing the hide icon on the toolbar, and the program takes the time (in seconds) from the computer clock at that moment. The time consists of eight characters, for example 52170.63: five digits for the whole seconds, a decimal point, and two digits for the fraction of a second. The point is ignored, which leaves seven digits. These seven digits are mixed with the ten characters of the password entered by the user according to the following steps:

• Convert each character of the password to its ASCII code to create an encryption key of twenty digits (each character is represented by its two-digit decimal ASCII code).

• Multiply specific digits of the encryption key by the digits of the time to generate a new encryption key. Even if the same password is used many times, a different encryption key is generated to encode the data, because the computer clock time differs at each moment.

• For example, if the password provided is “D6JU3SHU80”, the encryption key will be “68547485518372855648”. If the time at the start of the process is “72270.03”, the new encryption key will be “68532191356072805192”, through the algorithm shown in figure (3.2).

In the flowchart, the first value between the brackets represents the digit position and the second represents the number of digits. For example, if i=1, old_key is the two digits starting at position four (47) and time is the one digit at position one (7), so the equation

new_key(4) = old_key(4,2) * time(1,1)

becomes

new_key (4) = 47 * 7

The condition (i=6) in the flowchart is used to skip location six in the time string, which holds the dot, for example “72270.03”.
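The key derivation can be sketched in Python as follows. This is one reading of the flowchart, not the thesis code: the function names are hypothetical, and it is an assumption that each product is written back into the key as a decimal string starting at the same position, overwriting as many digits as the product has. Under that assumption the sketch reproduces the worked example given for the password “D6JU3SHU80” and the time “72270.03”.

```python
def password_to_key(password):
    """Concatenate the two-digit decimal ASCII code of each password character.

    Only characters with codes 10..99 (digits, upper-case letters) give
    exactly two digits each.
    """
    return "".join(f"{ord(ch):02d}" for ch in password)


def mix_with_time(old_key, time_str):
    """Multiply selected key digits by the digits of the clock time."""
    key = list(old_key)
    for i in range(1, 9):            # i = 1..8, positions in the time string
        if i == 6:                   # position 6 holds the decimal point
            continue
        pos = 2 + 2 * i              # 1-based position in the key
        two_digits = int("".join(key[pos - 1:pos + 1]))
        digits = str(two_digits * int(time_str[i - 1]))
        # Assumption: the product overwrites the key from `pos` onwards.
        key[pos - 1:pos - 1 + len(digits)] = digits
    return "".join(key)


old_key = password_to_key("D6JU3SHU80")
print(old_key)
print(mix_with_time(old_key, "72270.03"))
```

Because each step reads the key after the previous step has modified it, a change in one time digit can affect several later key digits.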

Figure (3.2) Generation process of the encryption key

[Flowchart: Start; i = 1; decision i = 6 (Yes: skip the multiplication; No: new_key(2+(i*2)) = old_key(2+(i*2),2) * time(i,1)); i = i + 1; decision i <= 8 (Yes: repeat; No: End).]

3.4.5 Generate Keystream

The keystream generator is built from five registers R1, R2, R3, R4, and R5. The number of cells in each register varies with the encryption key (the cells of the five registers sum to 128 to obtain a 128-bit key size).

3.4.5.1 Labels

Assign a label to the registers and to each part of them, with a name that represents its activity.

• Register’s name

Reg(i), where 1 ≤ i ≤ 5

• Register’s length

Reg(i)_length, where Σ Reg(i)_length = 128

• Bit state

Reg(i)_cell(j) = {0, 1}, where 1 ≤ j ≤ Reg(i)_length

• Feedback taps

Reg(i)_tap(j) = {0, 1}, where 1 ≤ j ≤ Reg(i)_length

• Transfer bit locations

(Active cells), sixteen cells ∈ Reg(i)_cell(j), where 1 ≤ j ≤ Reg(i)_length

• Transfer address

(Address_UP), four cells ∈ Reg(i)_cell(j), where 1 ≤ j ≤ Reg(i)_length

(Address_DOWN), four cells ∈ Reg(i)_cell(j), where 1 ≤ j ≤ Reg(i)_length

• Multiplexer selector (MS)

(Mux selector 1), one cell ∈ Reg(2)_cell(j), where 1 ≤ j ≤ Reg(i)_length

(Mux selector 2), one cell ∈ Reg(3)_cell(j), where 1 ≤ j ≤ Reg(i)_length

(Mux selector 3), one cell ∈ Reg(4)_cell(j), where 1 ≤ j ≤ Reg(i)_length

• Keystream output

keystream(j) = {0, 1}, where 1 ≤ j ≤ Text_length

Figure (3.3) LFSRs proposed to encode data

[Block diagram: Register 1 to Register 5, each with its own feedback function, feeding a multiplexer (Mux) whose output is the keystream.]

3.4.5.2 The Registers

Each cell in the registers contains one bit, and each register has a variable set of feedback tap positions that depends on the encryption key. The proposed registers are shown in figure (3.3).

Each register is connected to the following register by a one-bit connection (after each shift operation, a register transfers one bit to the following register to change its bit stream). The connection changes depending on the bits in the transfer bit locations.

3.4.5.3 Initialization Registers

The generated key (encryption key) from the previous stage

initializes LFSRs (linear feedback shift registers) by specifying its

characteristics (register length, initial states, feedback taps, transfer bit

locations, transfer address, and multiplexer selector), which are used to

produce the keystream. Figure (3.4) shows the register characteristics

mapped on encryption key table, (Appendix B/ Subroutine-1 shows a full

initialize registers program).

[Table: the twenty digits of the encryption key (positions 1 to 20) are mapped to the register characteristics: transfer bit, register length, initial state, feedback tap, AD_Up, AD_Down, Ms1, Ms2 and Ms3.]

Figure (3.4) Encryption key table

a. Registers Length

To obtain a 128-bit key size, the cells of the five registers sum to 128. The lengths of the first four registers are generated randomly between 20 - 27 cells by the algorithm explained in the flowchart shown in figure (3.5).

To produce a random integer number in a given range, the following formulas are used:

rnd_key = (key(15+i, 1) + 1) / 10 ….3.1

Int((upperboundary - lowerboundary + 1)*Rnd + lowerboundary) ….3.2

Where,

Upperboundary is the highest number in the range

Lowerboundary is the lowest number in the range

Rnd is the rnd_key generated from eq.(3.1)

To ensure that the total register length is 128 cells, the length of register number five is calculated from the equation

reg(5)_length = 128 - [reg(1)_length + reg(2)_length + reg(3)_length + reg(4)_length] ….3.3

Figure (3.5) Flowchart represents generation of register length

[Flowchart elements: Start; i = 0; i = i + 1; rnd_key = (key(15+i, 1) + 1) / 10; reg(i)_length = Int((30-20+1) * rnd_key + 20); decision i <= 4 (Yes: repeat; No: End).]
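A minimal Python sketch of the length calculation follows. It uses the bounds exactly as printed in the flowchart of figure (3.5) together with equation (3.3). The function name is hypothetical, and it is an assumption that i is the register number (1 to 4) when key(15+i, 1) is read; the order of the flowchart boxes does not settle this.

```python
def register_lengths(key):
    """Derive the five register lengths from a 20-digit encryption key."""
    lengths = []
    for i in range(1, 5):                      # registers 1..4
        digit = int(key[15 + i - 1])           # key(15+i, 1), 1-based position
        # Int((30-20+1) * (digit+1)/10 + 20), done in integer arithmetic
        lengths.append((11 * (digit + 1)) // 10 + 20)
    lengths.append(128 - sum(lengths))         # eq. (3.3): register 5
    return lengths


print(register_lengths("68532191356072805192"))
```

Integer arithmetic is used instead of floating point so that the truncation done by Int is exact; for non-negative values the two give the same result.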

b. Initial State Bits

The registers are initialized from the encryption key as shown in the flowchart of figure (3.6), where every cell of every register is initialized with a binary value {0, 1}. Eq.(3.4) takes two digits from the encryption key, which the following steps of the flowchart use to generate a random number.

r1 = key(4+j , 2) + 1 … 3.4

where 1 <= j <= number of registers

The function right in the algorithm takes the specified number of digits from the right of the variable.

Figure (3.6) Flowchart represents generation of initial state

[Flowchart elements: Start; j = 0; j = j + 1; r1 = key(4+j, 2) + 1; i = 1; r1 = r1 / (key(5+i,1)+1); r2 = right(r1,3); reg(j)_cell(i) = r2 Mod 2; i = i + 1; decision i <= reg(j)_length (Yes / No); decision j <= 5 (Yes / No); End.]

c. Feedback Taps

A random set of feedback taps is generated for all registers, depending on the encryption key, by the algorithm shown in the flowchart of figure (3.7).

Figure (3.7) Flowchart represents generation of feedback tap

[Flowchart elements: Start; j = 0; j = j + 1; r1 = key(12+j, 2) + 1; i = 1; r1 = r1 / (key(13+i,1)+1); r2 = right(r1,3); reg(j)_tap(i) = r2 Mod 2; i = i + 1; decision i <= reg(j)_length (Yes / No); decision j <= 5 (Yes / No); End.]

d. Transfer Bit Locations

Transfer bit locations are sixteen active cells in each register. They are used to transfer a bit from one active cell to the next register. The algorithm used to generate these locations is represented by the flowchart shown in figure (3.8).

Figure (3.8) Flowchart represents generation of active cells

[Flowchart elements: Start; j = 0; j = j + 1; r1 = key(1+j, 2) + 1; i = 1; r1 = r1 / (key(1+i,1)+1); r2 = right(r1,4); reg(j)_cell( r2 Mod reg(j)_length+1 ) = Active; i = i + 1; decision i <= 16 (Yes / No); decision j <= 5 (Yes / No); End.]

e. Transfer Address

A transfer address consists of eight cells in each register: four cells are used to address an active cell with the previous register and the other four cells are used to address an active cell with the next register. The algorithm used to generate these cells is represented by the flowcharts shown in figure (3.9).

Figure (3.9) Flowcharts represent generation of transfer address cells

[Flowchart (a), generation of transfer address cells with previous register, elements: Start; j = 0; j = j + 1; r1 = key(5+j, 2) + 1; i = 1; r1 = r1 / (key(5+i,1)+1); r2 = right(r1,4); reg(j)_cell(r2 Mod reg(j)_length+1) = AD_U; i = i + 1; decision i <= 4 (Yes / No); decision j <= 5 (Yes / No); End.]

[Flowchart (b), generation of transfer address cells with next register, elements: Start; j = 0; j = j + 1; r1 = key(10+j, 2) + 1; i = 1; r1 = r1 / (key(10+i,1)+1); r2 = right(r1,4); reg(j)_cell(r2 Mod reg(j)_length+1) = AD_D; i = i + 1; decision i <= 4 (Yes / No); decision j <= 5 (Yes / No); End.]

f. Multiplexer Selector

The multiplexer selector consists of three cells: one in register 2, one in register 3 and one in register 4. Together they select which register output supplies the next bit of the keystream. The algorithm used to generate these cells is represented in the flowchart shown in figure (3.10).

Figure (3.10) Flowchart represents generation of multiplexer selector cells

[Flowchart elements: Start; r1 = key(20, 2) + 1; r1 = r1 / (key(20,1)+1); r2 = right(r1,4); reg(2)_cell(r2 Mod reg(2)_length+1) = MS1; r1 = key(15, 2) + 1; r1 = r1 / (key(15,1)+1); r2 = right(r1,4); reg(3)_cell(r2 Mod reg(2)_length+1) = MS2; r1 = key(7, 2) + 1; r1 = r1 / (key(7,1)+1); r2 = right(r1,4); reg(4)_cell(r2 Mod reg(2)_length+1) = MS3; End.]

3.4.5.4 Design Principles

The generator uses five LFSRs connected to each other by a random connection. This connection depends on the bits in the transfer address cells (which address the active cells in each register), and the outputs of all registers are connected to a multiplexer. The selected output (the multiplexer selector changes at each clock pulse) represents one bit of the keystream, as described in figure (3.3).

3.4.5.5 Keystream Generation

The combined shift registers perform the following operations, starting from register 1, i.e. i=1 (Appendix B/ Subroutine-2 shows a full stream generation program).

i. The content of reg(i)_cell(j) is shifted to reg(i)_cell(j-1) (one bit to

the right) for each j, 1≤ j ≤ L, where L is reg(i)_length.

ii. The new content of reg(i)_cell(L) is the feedback bit, calculated

from a random feedback function, as shown in algorithm (3.1).

Algorithm (3.1) Feedback bit calculation

Feedback = Reg(i)_cell(L) ,

For 1≤ j ≤ reg(i)_length ,

feedback = feedback XOR [reg(i)_cell(j) * reg(i)_tap(j)] ,

Reg(i)_cell(L) = feedback ,

iii. Calculate the address of transfer bit from reg(i), and the address of

transfer bit to reg(i+1) from address locations assigned in reg(i) and

reg(i+1) , as shown in algorithm (3.2).

Algorithm (3.2) Transfer bit Calculation

address_reg(i) = 0, k = 0

For 1≤ j ≤ reg(i)_length ,

If reg(i)_cell(j) = Address_DOWN = 1 ,

address_reg(i) = address_reg(i) + 2^k ,

k = k + 1 ,

address_reg(i+1) = 0, k = 0

For 1≤ j ≤ reg(i+1)_length ,

If reg(i+1)_cell(j) = Address_UP = 1 ,

address_reg(i+1) = address_reg(i+1) + 2^k ,

k = k + 1 ,

For Active_cell only

reg(i+1)_cell(address_reg(i+1)) = reg(i)_cell(address_reg(i)) ,

iv. Repeat from (i) to (iii) , to the remaining registers.

v. Calculate the address of the multiplexer selector from the three cells

one in each of the registers (2), (3) and (4), as shown in

algorithm(3.3).

Algorithm (3.3) Multiplexer selector address calculation

Mux_selector = ( reg(2)_cell(mux_selector1) ) * 2^0 +

( reg(3)_cell(mux_selector2) ) * 2^1 +

( reg(4)_cell(mux_selector3) ) * 2^2

vi. The output of the multiplexer forms part of the keystream, as shown

in algorithm (3.4).

Algorithm (3.4) Keystream selected from multiplexer

Select case Mux_selector

If 0 or 1, then keystream = keystream + reg(1)_cell(1)

If 2, then keystream = keystream + reg(2)_cell(1)

If 3 or 4, then keystream = keystream + reg(3)_cell(1)

If 5, then keystream = keystream + reg(4)_cell(1)

If 6 or 7, then keystream = keystream + reg(5)_cell(1)

vii. Repeat steps (i) to (vi), until generating the keystream used to

encode the text.
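The shift, feedback and multiplexer steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the program of Appendix B: the function names are hypothetical, registers are plain lists whose index 0 is cell(1), the cell L is assumed to keep its old value during the shift until the feedback bit replaces it, and the inter-register transfer of step (iii) is left out.

```python
def feedback_bit(cells, taps):
    """Algorithm (3.1): start from cell(L), then XOR in every tapped cell."""
    fb = cells[-1]
    for cell, tap in zip(cells, taps):
        fb ^= cell & tap
    return fb


def clock_register(cells, taps):
    """Steps (i) and (ii): shift towards cell(1), then refill cell(L)."""
    shifted = cells[1:] + [cells[-1]]
    shifted[-1] = feedback_bit(shifted, taps)
    return shifted


def mux_output(registers, selector_cells):
    """Algorithms (3.3) and (3.4): three selector bits pick the output register."""
    # selector_cells[k] is the 0-based selector index inside register k+2
    s = sum(registers[r][selector_cells[k]] << k for k, r in enumerate((1, 2, 3)))
    chosen = {0: 0, 1: 0, 2: 1, 3: 2, 4: 2, 5: 3, 6: 4, 7: 4}[s]
    return registers[chosen][0]          # cell(1) of the chosen register


def keystream(registers, taps, selector_cells, n):
    """Clock all five registers n times and collect the multiplexer output."""
    regs = [list(r) for r in registers]
    out = []
    for _ in range(n):
        regs = [clock_register(r, t) for r, t in zip(regs, taps)]
        out.append(mux_output(regs, selector_cells))
    return out
```

The selector table mirrors algorithm (3.4): values 0 or 1 choose register 1, 2 chooses register 2, 3 or 4 choose register 3, 5 chooses register 4, and 6 or 7 choose register 5.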

3.4.5.6 Keystream Testing

A test subroutine is used to determine whether the keystream possesses the specific characteristics expected of a truly random sequence. Five statistical tests are used (frequency test, serial test, poker test, run test, and autocorrelation test). The keystream is checked by all of these tests, and if any one of them fails, the program regenerates a new keystream (Appendix B/ Subroutine-3 shows a full stream test program).

a. Frequency test, using equation (2.1) with threshold value 3.8415 (one degree of freedom and significance level 0.05), as shown in algorithm (3.5).

Algorithm (3.5) Calculate frequency test value

n0 = 0, n1 = 0,

For 1 ≤ i ≤ n,

If keystream(i) = 0, n0 = n0 + 1,

Else n1 = n1 + 1,

f_test = ( n0 - n1 )^2 / n,

If f_test < 3.8415, then PASS, Else FAIL,
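Algorithm (3.5) translates almost directly into Python. The sketch below assumes the keystream is a list of 0/1 integers; the function name and the returned (value, passed) pair are illustrative choices.

```python
def frequency_test(bits, threshold=3.8415):
    """Compare the counts of zeros and ones in the keystream."""
    n0 = bits.count(0)
    n1 = bits.count(1)
    value = (n0 - n1) ** 2 / len(bits)
    return value, value < threshold


print(frequency_test([0, 1] * 50))
print(frequency_test([1] * 100))
```

A perfectly balanced sequence gives the value 0, while a constant sequence of n bits gives the value n and fails.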

b. Serial test, using equation (2.2) with threshold value 5.9915 (two degrees of freedom and significance level 0.05), as shown in algorithm (3.6).

Algorithm (3.6) Calculate serial test value

n0 = 0, n1 = 0,

n00 = 0, n01 = 0, n10 = 0, n11 = 0,

For 1 ≤ i ≤ n,

If keystream(i) = 0, n0 = n0 + 1,

Else n1 = n1 + 1,

bits_check=keystream; get two bits from position i,

If bits_check =00, n00 = n00 + 1,

Else if bits_check =01, n01 = n01 + 1,

Else if bits_check =10, n10 = n10 + 1,

Else if bits_check =11, n11 = n11 + 1,

s_test = [4/(n-1) * (n00^2 + n01^2 + n10^2 + n11^2)] - [(2/n) * (n0^2 + n1^2)] + 1,

If s_test < 5.9915, then PASS,

Else FAIL,
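A Python sketch of algorithm (3.6) is given below, assuming the keystream is a list of 0/1 integers and that the two-bit patterns are counted over overlapping pairs (positions i and i+1 for every i below n). The function name is hypothetical.

```python
def serial_test(bits, threshold=5.9915):
    """Check that the four two-bit patterns occur about equally often."""
    n = len(bits)
    n0, n1 = bits.count(0), bits.count(1)
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(bits, bits[1:]):          # overlapping pairs
        pairs[(a, b)] += 1
    value = (4 / (n - 1)) * sum(c * c for c in pairs.values()) \
        - (2 / n) * (n0 * n0 + n1 * n1) + 1
    return value, value < threshold


print(serial_test([0, 0, 1, 1] * 25))
print(serial_test([0, 1] * 50))
```

The strictly alternating sequence passes the frequency test but fails here, because the pairs 00 and 11 never occur.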

c. Poker test, using equation (2.3) with threshold value 14.0671 (seven degrees of freedom (2^m-1=2^3-1=7), and significance level 0.05), as shown in algorithm (3.7).

Algorithm (3.7) Calculate poker test value

m = 3, ‘ length of block

bno = Int( n / m ) ‘ blocks number

n000=0, n001 =0, n010 =0, n011 =0, n100 =0, n101=0, n110 =0, n111=0,

For 1 ≤ i ≤ n step by m,

bits_check=keystream; get three bits from position i,

If bits_check =000, n000 = n000 + 1,

Else if bits_check =001, n001 = n001 + 1,

Else if bits_check =010, n010 = n010 + 1,

Else if bits_check =011, n011 = n011 + 1,

Else if bits_check =100, n100 = n100 + 1,

Else if bits_check =101, n101 = n101 + 1,

Else if bits_check =110, n110 = n110 + 1,

Else if bits_check =111, n111 = n111 + 1,

p_test = [(2^m / bno) * (n000^2 + n001^2 + n010^2 + n011^2 + n100^2 + n101^2 + n110^2 + n111^2)] - bno

If p_test < 14.0671, then PASS,

Else FAIL,
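The poker test can be sketched in Python with a counter over non-overlapping blocks. The function name is hypothetical, the keystream is assumed to be a list of 0/1 integers, and the default threshold is the tabulated 5% chi-square value for seven degrees of freedom (14.0671).

```python
from collections import Counter


def poker_test(bits, m=3, threshold=14.0671):
    """Check that all 2^m block patterns occur about equally often."""
    k = len(bits) // m                        # number of whole blocks (bno)
    blocks = Counter(tuple(bits[i * m:(i + 1) * m]) for i in range(k))
    value = (2 ** m / k) * sum(c * c for c in blocks.values()) - k
    return value, value < threshold


print(poker_test([0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1,
                  1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1] * 10))
```

Using a Counter instead of eight named variables keeps the same arithmetic while allowing other block lengths m to be tried.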

d. Run test, using equation (2.4) with threshold value 9.4877 (four degrees of freedom (2k-2=2*3-2=4), and significance level 0.05), as shown in algorithm (3.8).

Algorithm (3.8) Calculate run test value

k=3 ‘ largest no. of bits

bl1=0, bl2=0, bl3=0 ‘ blocks

ga1=0, ga2=0, ga3=0 ‘ gaps

For 1 ≤ i ≤ n,

bl1 = calculate no of “1” in sequence,

bl2 = calculate no of “11” in sequence,

bl3 = calculate no of “111” in sequence,

ga1 = calculate no of “0” in sequence,

ga2 = calculate no of “00” in sequence,

ga3 = calculate no of “000” in sequence,

For 1 ≤ i ≤ k,

e(i) = (n - i + 3) / 2^(i+2) ‘ expected no. of gaps or blocks

r_test = Σ(i=1..k) [bl(i) - e(i)]^2 / e(i) + Σ(i=1..k) [ga(i) - e(i)]^2 / e(i)

If r_test < 9.4877, then PASS, Else FAIL,
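A Python sketch of the run test follows. It assumes that bl(i) and ga(i) count maximal runs of exactly i ones (blocks) and exactly i zeros (gaps), that longer runs are ignored, and that the expected counts take the usual form e(i) = (n - i + 3) / 2^(i+2). The function name is hypothetical.

```python
from itertools import groupby


def runs_test(bits, k=3, threshold=9.4877):
    """Compare the number of blocks and gaps of length 1..k with expectation."""
    n = len(bits)
    blocks = [0] * (k + 1)                    # blocks[i]: runs of exactly i ones
    gaps = [0] * (k + 1)                      # gaps[i]: runs of exactly i zeros
    for bit, run in groupby(bits):
        length = len(list(run))
        if length <= k:
            (blocks if bit == 1 else gaps)[length] += 1
    value = 0.0
    for i in range(1, k + 1):
        e = (n - i + 3) / 2 ** (i + 2)        # expected number of runs
        value += (blocks[i] - e) ** 2 / e + (gaps[i] - e) ** 2 / e
    return value, value < threshold


print(runs_test([1] * 64))
```

For a constant sequence no short runs are found at all, so every term contributes its full expected count and the test fails.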

e. Autocorrelation test, using equations (2.5) and (2.6) with threshold value 1.6449 (significance level 0.05), as shown in algorithm (3.9).

Algorithm (3.9) Calculate autocorrelation test value

A=0, ‘ autocorrelation value

d=8, ‘ shift value

For 1 ≤ i ≤ n-d,

A = A + [ keystream(i) XOR keystream(i+d) ]

a_test = 2 * [ A - (n-d) / 2 ] / sqrt(n-d)

If a_test < 1.6449, then PASS,

Else FAIL,
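The autocorrelation test can be sketched in Python as below. The sketch assumes the statistic is normalised by the square root of (n - d) and applies the one-sided comparison exactly as written in algorithm (3.9); the function name is hypothetical.

```python
from math import sqrt


def autocorrelation_test(bits, d=8, threshold=1.6449):
    """Compare the keystream with a copy of itself shifted by d positions."""
    n = len(bits)
    a = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    value = 2 * (a - (n - d) / 2) / sqrt(n - d)
    return value, value < threshold


print(autocorrelation_test(([0] * 8 + [1] * 8) * 5))
```

In this example every bit differs from the bit eight positions later, so A reaches its maximum of n - d and the statistic equals the square root of (n - d), well above the threshold.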

3.4.6 Huffman Code

The text to be hidden is converted from characters (ASCII code) to binary. Each character is normally represented by eight bits; to reduce the total number of message bits, a Huffman code is used. Figure (3.11) shows the binary stream generation scheme.

Figure(3.11) Binary stream generation scheme

[Diagram: Plain text → Huffman coding → Binary stream]

At first, the frequencies (probabilities) of each character are counted in many text files containing Arabic characters. More than 96% of the text to hide consists of only 36 characters (the Arabic letters and the space), which can be used to make an appropriate compression scheme. Figure(3.12) shows a histogram of Arabic characters probability.

To build the Huffman tree of a prefix code using the character probabilities (as in figure(3.12)), the following steps are used:

• Order the characters from highest to lowest probability.

• Select the two least-probable characters, group them logically together, and add their probabilities. This begins the construction of a "binary tree" structure.

• Select again the two elements with the lowest probabilities and combine them into a single element.

• Continue in the same way to select the two elements with the lowest frequency, group them together, and add their frequencies, until running out of elements.

• The result is known as a "Huffman tree". To obtain the Huffman code itself, each branch of the tree is labeled with a 1 or 0.

• Tracing down the tree gives the "Huffman codes", with the shortest codes assigned to the characters with the greatest probability.
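The construction steps above can be sketched in Python with a priority queue. This is a generic Huffman construction shown on five of the probabilities from table (3.1), not a reproduction of the thesis code table; the function name and the 0/1 labelling of the branches are assumptions.

```python
import heapq
from itertools import count


def huffman_codes(probabilities):
    """Build a prefix code by repeatedly merging the two least probable items."""
    tie = count()                             # tie-breaker so dicts are never compared
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)      # least probable element
        p1, _, high = heapq.heappop(heap)     # second least probable element
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


sample = {"SP": 18.11, "ا": 12.00, "ل": 9.93, "ي": 5.84, "م": 5.51}
for symbol, code in huffman_codes(sample).items():
    print(symbol, code)
```

On this sample the three most probable symbols receive two-bit codes and the two least probable receive three-bit codes, which illustrates the last bullet above.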

Figure (3.13) shows the Huffman tree, and table (3.1) shows the Arabic characters listed from highest to lowest probability with their Huffman codes. Each character to be hidden is encoded to its Huffman code to get a binary stream of data labeled as binary_stream(i), where [1 ≤ i ≤ stream_length] (Appendix B/ Subroutine-4 shows a full text to Huffman code conversion program, Subroutine-5 the Huffman code to text conversion program, and Subroutine-8 the Huffman array creation).

Figure (3.12) Histogram of Arabic characters probability

Figure (3.13) Huffman tree and the generated codes

Table (3.1) Huffman codes for Arabic characters

Letter    Probability %    Code (Bit)
SP        18.11            000
ا         12.00            001
ل         9.93             0100
ي         5.84             0101
م         5.51             0110
و         4.37             0111
ت         3.99             10000
ن         3.80             10001
ر         3.71             10010
ف         2.73             10011
ة         2.66             101000
ع         2.46             101001
هـ        2.44             101010
ب         2.39             101011
س         2.25             101100
ق         2.24             101101
ك         2.12             101110
د         1.79             101111
أ         1.66             110000
ح         1.36             110001
ش         1.24             110010
ص         0.98             110011
ى         0.86             110100
ذ         0.84             110101
ج         0.81             110110
خ         0.75             110111
إ         0.75             111000
ط         0.71             111001
ث         0.43             111010
ض         0.26             111011
ظ         0.22             111100
ئ         0.19             111101
غ         0.19             1111100
ز         0.19             1111101
ء         0.18             1111110
ؤ         0.08             1111111

3.4.7 Check Document File

This task checks whether the paragraphs in a document file have enough room to hide the data. The way of checking depends on the hiding method selected by the user. If there is not enough room in the file, the program reports this in a message giving the number of bits to be hidden and the number of bits that can be hidden in the file.

3.4.8 Encryption

The binary_stream is encrypted with a stream cipher: a keystream of the same length, generated in the previous step, is combined with it by a bitwise XOR operation to produce a cipher_stream, where

cipher_stream(i) = binary_stream(i) XOR keystream(i)

where 1 ≤ i ≤ stream_length

Appendix B/ Subroutine-6 shows a full Encipher process program, and

Subroutine-7 for full Decipher process program.
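The XOR step can be sketched in a few lines of Python. The function name is hypothetical and both streams are assumed to be lists of 0/1 integers of equal length. Applying the same function again with the same keystream recovers the binary stream, which is why one routine serves for both enciphering and deciphering in this sketch.

```python
def xor_streams(binary_stream, keystream):
    """cipher_stream(i) = binary_stream(i) XOR keystream(i)."""
    if len(binary_stream) != len(keystream):
        raise ValueError("streams must have the same length")
    return [b ^ k for b, k in zip(binary_stream, keystream)]


plain = [1, 0, 1, 1]
key = [0, 1, 1, 0]
cipher = xor_streams(plain, key)
print(cipher)
print(xor_streams(cipher, key))
```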

3.4.9 Hide cipher text

Four methods are proposed to hide the cipher_stream in an Arabic text document:

3.4.9.1 Hyphen method

A hyphen (ـ), or kashida, is a small line used to connect Arabic characters. It stretches characters to increase the length of words, in order to justify the paragraph to a specific margin.

The hyphen must be added between two linked characters. In this work, a word with no hyphen inserted is interpreted as “0”, and a word with one hyphen is interpreted as “1”, as shown in algorithm (3.10).

Algorithm (3.10) Data hiding using hyphen method

k = 0
For 1 < i <= words_Count
    k = k + 1
    p = no_of_hyphen_can_add_to_word(i)
    If p >= cipher_stream(k) Then
        p = 0
        char_curr = word(i)_char(1)
        new_text = new_text + char_curr
        For 2 < j < word_char_count
            char_prev = word(i)_char(j-1)
            char_curr = word(i)_char(j)
            If char_curr = "س" or "ش" then
                new_text = new_text + char_curr + "ـ" , p = p + 1
                If cipher_stream(k) = p Then
                    new_text = new_text + Cut word(i) from right(word(i)_char_count-i)
                    Go to next
            ElseIf [(char_curr = "ا" or "ة" or "ي") and (char_prev <> "ر" and "ز" and "و" and "ذ" and "د" and "ا" and "أ" and "آ" and "إ" and "ء" and "ل")] then
                new_text = new_text + "ـ" + char_curr , p = p + 1
                If cipher_stream(k) = p Then
                    new_text = new_text + Cut word(i) from right(word(i)_char_count-i)
                    Go to next
            Else
                new_text = new_text + char_curr
            If (i = l) Then
                char_prev = new_text(1 + p - 1)
                If char_prev <> "ر" and "ز" and "و" and "ذ" and "د" and "ا" and "أ" and "آ" and "إ" and "ء" and "ل" then
                    new_text = new_text(1,word_char_count+p-1) + "ـ" + char_curr, p = p + 1
                    If cipher_stream(k) = p then Go to next
next:


3.4.9.2 White spaces method

This method adds an extra white space between words to hide data: one space between words is interpreted as “0”, and two spaces are interpreted as “1”, as shown in algorithm (3.11).

Algorithm (3.11) Data hiding using white space method

k = 0

For 1 < i <= words_Count

k = k + 1

If cipher_stream(k) = 1 Then

Add and justify space between word(i) and word(i+1)
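A minimal Python sketch of this encoding on a plain string is given below. It is illustrative only: the function names are hypothetical, bits are passed as a string of '0' and '1', and the justification step that algorithm (3.11) applies inside the word processor is not modelled.

```python
import re

def hide_in_spaces(cover, bits):
    """Write one space for bit '0' and two spaces for bit '1' between words."""
    words = cover.split()
    if len(bits) > len(words) - 1:
        raise ValueError("cover text has too few word gaps")
    out = [words[0]]
    for k, word in enumerate(words[1:]):
        gap = "  " if k < len(bits) and bits[k] == "1" else " "
        out.append(gap + word)
    return "".join(out)

def unhide_from_spaces(stego, n_bits):
    """Read the first n_bits gaps: a double space is '1', a single space is '0'."""
    gaps = re.findall(r" +", stego)
    return "".join("1" if len(g) >= 2 else "0" for g in gaps[:n_bits])
```

A cover of n words offers n - 1 gaps, so the capacity of this sketch is one bit per gap.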

3.4.9.3 Change word position method

This method hides data in a document by setting the vertical position of each word relative to the text baseline, as shown in algorithm (3.12).

Algorithm (3.12) Data hiding using word position method

k = 0

For 1 < i <= words_Count

k = k + 1

If cipher_stream(k) = 1 Then

word(i)_Position = UP

Else

word(i)_Position = Normal
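The mapping in algorithm (3.12) can be sketched in Python as follows. This is only a model of the bit-to-position rule: the document is represented as a list of (word, position) pairs, and actually raising a word inside a word processor file is outside the scope of the example. The function names are hypothetical.

```python
def assign_positions(words, cipher_stream):
    """Pair each word with "UP" for bit 1 and "Normal" for bit 0."""
    placed = []
    for i, word in enumerate(words):
        bit = cipher_stream[i] if i < len(cipher_stream) else 0
        placed.append((word, "UP" if bit == 1 else "Normal"))
    return placed

def read_positions(placed, n_bits):
    """Recover the first n_bits bits from the stored word positions."""
    return [1 if position == "UP" else 0 for _, position in placed[:n_bits]]
```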

3.4.9.4 Unicode system method

The Arabic Unicode table (range 0600 – 06FF, shown in Appendix A) represents the standard forms of all characters used in the Arabic language. Another Unicode table (range FE70 – FEFF), Arabic Presentation Forms-B, includes the isolated form of every Arabic character.

The idea behind the Unicode system method is to change the code of isolated characters (i.e. any character not connected to others within a word). Each word in the paragraph is checked for isolated characters (a Microsoft Word document saves each character as Unicode with the standard Arabic code, range 0600 – 06FF), and an isolated character can then be replaced by the same glyph under its Form-B Arabic code.

For example, Table (3.2) lists some Arabic characters with their standard codes and Form-B codes.

Table (3.2) Arabic characters with different codes

Description   Standard code (hex)   Form-B code (hex)   Character
Alef          0627                  FE8D                ا
Beh           0628                  FE8F                ب
Teh           062A                  FE95                ت
Theh          062B                  FE99                ث
Jeem          062C                  FE9D                ج
Hah           062D                  FEA1                ح
Khah          062E                  FEA5                خ
Dal           062F                  FEA9                د
Thal          0630                  FEAB                ذ
Reh           0631                  FEAD                ر
Zain          0632                  FEAF                ز

Algorithm (3.13) shows data hiding using the Unicode system method (Appendix B, Subroutine-9 gives the full hiding program, Subroutine-10 the full unhiding program, and Subroutine-11 the creation of the ASCII code with Unicode table).

Algorithm (3.13) Data hiding using Unicode system method

k = 0
For 1 < i <= words_Count
    char_prev = Nothing
    For 1 < j <= word(i)_char_count
        char_curr = word(i)_char(j)
        If
            [j = 1 And (char_curr = "ا" or "أ" or "د" or "ذ" or "ر" or "ز" or "و")]
            ** check first character **
          or
            [(char_prev = "ا" or "أ" or "د" or "ذ" or "ر" or "ز" or "و") And (char_curr = "ا" or "أ" or "د" or "ذ" or "ر" or "ز" or "و")]
            ** check characters in middle **
          or
            [(j = word(i)_char_count) And (char_prev = "ا" or "أ" or "د" or "ذ" or "ر" or "ز" or "و")]
            ** check last character **
        Then
            k = k + 1
            If cipher_stream(k) = 1 Then exchange word(i)_char(j)
        End If
        char_prev = char_curr
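The exchange step can be sketched in Python with the code pairs of Table (3.2). This is a simplified illustration under stated assumptions, not Subroutine-9: only the five non-joining letters from the table (Alef, Dal, Thal, Reh, Zain) are exchanged, a letter is treated as isolated when it is the first letter of a word or follows a non-joining letter, and the cover is assumed to contain standard codes only. All names are hypothetical.

```python
# Standard code -> Form-B isolated code, taken from Table (3.2)
TO_FORM_B = {"\u0627": "\uFE8D", "\u062F": "\uFEA9", "\u0630": "\uFEAB",
             "\u0631": "\uFEAD", "\u0632": "\uFEAF"}
FROM_FORM_B = {v: k for k, v in TO_FORM_B.items()}

# Letters that do not join to the following letter (set used in algorithm 3.13)
NON_JOINING = set("\u0627\u0623\u062F\u0630\u0631\u0632\u0648")

def standard(ch):
    """Map a Form-B code back to its standard code; leave other codes alone."""
    return FROM_FORM_B.get(ch, ch)

def carriers(word):
    """Indices of isolated letters that this sketch is able to exchange."""
    found = []
    for j, ch in enumerate(word):
        prev_ok = j == 0 or standard(word[j - 1]) in NON_JOINING
        if standard(ch) in TO_FORM_B and prev_ok:
            found.append(j)
    return found

def hide(text, bits):
    """Bit '1' swaps a carrier to its Form-B code; bit '0' leaves it unchanged."""
    out, k = [], 0
    for word in text.split(" "):
        chars = list(word)
        for j in carriers(word):
            if k < len(bits):
                if bits[k] == "1":
                    chars[j] = TO_FORM_B[standard(chars[j])]
                k += 1
        out.append("".join(chars))
    return " ".join(out), k

def unhide(stego, n_bits):
    bits = []
    for word in stego.split(" "):
        for j in carriers(word):
            if len(bits) < n_bits:
                bits.append("1" if word[j] in FROM_FORM_B else "0")
    return "".join(bits)
```

Each exchange replaces one code point with another, so the number of characters in the text does not change.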


3.4.10 Hiding the Time

The time is read from the computer clock in the previous step (it consists of seven digits). Each digit is offset by a multiple of ten, (i - 1) * 10 for digit i, to give the location of a word; algorithm (3.14) shows this process.

Algorithm (3.14) Hiding time process

For 1 < i <= 7
    time_loc(i) = time(i, 1) + [ (i - 1) * 10 ]
For 1 < i <= 7
    word(time_loc(i) + 1) = SHIFT POSITION_DOWN
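The location formula of algorithm (3.14) can be checked with a short Python sketch. The function names are hypothetical, and the sample digits are those of the timer value 78538.46 used as an example in Chapter Four; shifting the selected words down in the document is not modelled.

```python
def time_locations(time_digits):
    """time_loc(i) = time(i) + (i - 1) * 10 for the seven time digits."""
    return [d + (i - 1) * 10 for i, d in enumerate(time_digits, start=1)]

def recover_time(locations):
    """Invert the mapping: subtract the block offset from each location."""
    return [loc - (i - 1) * 10 for i, loc in enumerate(locations, start=1)]

locations = time_locations([7, 8, 5, 3, 8, 4, 6])
```

Since every digit lies between 0 and 9, digit i always falls inside its own block of ten words, so the seven locations never collide and the receiver can recover each digit from the position of the shifted word.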

3.5 Hiding Data in a Rich Text Format (RTF) File

The Rich Text Format (RTF) is a method of encoding formatted text

and graphics for easy transfer between applications.

The RTF Specification provides a format for text and graphics

interchange that can be used with different operating environments, and

operating systems. RTF uses the ANSI character set to control the

representation and formatting of a document [40].

3.5.1 Contents of an RTF File

An RTF file has the following syntax:

<File> '{' <header> <document> '}'

It consists of unformatted text, control words, control symbols, and groups. For ease of transport, a standard RTF file can consist of only 7-bit ASCII characters; however, converters that communicate with Microsoft Word for Windows should expect 8-bit characters.

A control word is a specially formatted command that RTF uses to

mark printer control codes and information that applications use to manage

documents.


A control symbol consists of a backslash followed by a single,

nonalphabetic character.

A group consists of text and control words or control symbols

enclosed in braces { }. The opening brace ‘{’ indicates the start of the

group and the closing brace ‘}’ indicates the end of the group. Each group

specifies the text affected by the group and the different attributes of that

text. The RTF file can also include groups for fonts, styles, screen color,

pictures, footnotes, comments (annotations), headers and footers, summary

information, fields, and bookmarks, as well as document, section,

paragraph, and character formatting properties. If the font, file, style, screen

color, and summary-information groups and document-formatting

properties are included, they must precede the first plain-text character in

the document. These groups form the RTF file header.

Document text should be emitted as ANSI characters. If there are Unicode characters that do not have corresponding ANSI characters, they should be output using the \ucN and \uN keywords.

3.5.2 Paragraph Formatting Properties

There are many control words that specify generic paragraph

formatting properties. These control words can appear anywhere in the

body of the paragraph, not just at the beginning. If the \pard control word is

present, the current paragraph resets to default paragraph properties.

3.5.3 Hiding Algorithm

The proposed algorithm exploits the special format of the RTF file to hide data. In an RTF file, the control words, control symbols, and braces constitute control information. All other characters in the file are plain text.

If the RTF reader cannot find a particular control word or control symbol in the lookup table specified for the file format, the control word or control symbol should be ignored. The proposal exploits this characteristic by creating a dummy control symbol and using it as a cover for the hidden data. The following pseudo program opens a source RTF document file and produces a target document file that contains the hidden data; algorithm (3.15) shows the hiding process.

Algorithm (3.15) Hiding data with RTF file format

Text = Convert Arabic text to English characters
Do While Not End of Source File
    pointer1 = pointer1 + 1
    pointer2 = pointer2 + 1
    String_5bytes = Get from Source_file(pointer1)
    String_byte = Get from Source_file(pointer1)
    Put into Target_file(pointer2) = String_byte
    If String_5bytes = "\pard" and then "{" and then "\rtlch", Then
        If String_byte = Space, Then pointer2 = pointer2 + 1
        If coun >= Text_length Then
            Put into Target_file(pointer2) = "\azEND"
            pointer2 = pointer2 + 5
            Exit Loop
        End If
        Put into Target_file(pointer2) = "\az" + Text(coun,3)
        pointer2 = pointer2 + 5
        coun = coun + 3
    End If
    If String_5bytes = "}", Then
    End If
Loop


3.5.4 Unhiding Algorithm

This algorithm is the inverse of the hiding algorithm: the program searches for the dummy control symbol in the target RTF document file and extracts the data from it. Algorithm (3.16) shows the unhiding process.

Algorithm (3.16) Unhiding data from RTF file format

Pointer = 0

Do While Not End of Target File

String_3bytes_1 = Get from Target_file(pointer)

If String_3bytes_1 = "\az" And q <> 1 Then

pointer = pointer + 3

String_3bytes_2 = Get from Target_file(pointer)

If String_3bytes_2 = "END", Then Exit Loop

Text = Text + String_3bytes_2

End If

pointer = pointer + 1

Loop

Convert English characters(Text) to Arabic text
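The hiding and unhiding steps of algorithms (3.15) and (3.16) can be sketched together in Python on an in-memory string. This is a simplified illustration, not the thesis program: it places one dummy `\az` item after every `\pard` (the thesis also checks for the following `{` and `\rtlch`), it assumes the already-converted message contains only ASCII letters and no chunk equal to "END", and it assumes each `\pard` in the cover is followed by a non-letter. The function names are hypothetical.

```python
import re

MARK = "\\az"  # dummy prefix used in algorithm (3.15)

def hide_in_rtf(rtf, text):
    """Insert the message, three characters at a time, after each \\pard."""
    chunks = [text[i:i + 3] for i in range(0, len(text), 3)] + ["END"]
    parts = rtf.split("\\pard")
    if len(parts) - 1 < len(chunks):
        raise ValueError("not enough \\pard control words in the cover")
    out = [parts[0]]
    for n, part in enumerate(parts[1:]):
        out.append("\\pard")
        if n < len(chunks):
            out.append(MARK + chunks[n])
        out.append(part)
    return "".join(out)

def unhide_from_rtf(rtf):
    """Collect the characters after each \\az until the END marker is found."""
    text = []
    for chunk in re.findall(r"\\az([A-Za-z]{1,3})", rtf):
        if chunk == "END":
            break
        text.append(chunk)
    return "".join(text)
```

An RTF reader that does not recognise the dummy items is expected to skip them, which is the property the method relies on; stripping them from the target file gives back the source file.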

3.5.5 Compression Algorithm

Arabic characters are stored in the RTF document file using a control word of the form \'hh, where hh is a hexadecimal value in the specified character set, similar to an ASCII code. The proposed software exploits this characteristic by replacing the four-byte character code \'hh with a single byte that represents the character. Algorithm (3.17) shows the compression process.

Algorithm (3.17) Compress data of RTF file format

Do While Not End of Source File
    pointer1 = pointer1 + 1
    pointer2 = pointer2 + 1
    String_2bytes = Get from Source_file(pointer1)
    String_byte = Get from Source_file(pointer1)
    Put into Target_file(pointer2) = String_byte
    If String_2bytes = "\'" Then
        pointer1 = pointer1 + 2
        String_2bytes = Get from Source_file(pointer1)
        If String_2bytes = Coded Arabic character, Then
            Convert to original Arabic character using lookup table
            Put into Target_file(pointer2) = String_byte
            pointer1 = pointer1 + 1
        End If
    End If
Loop
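The substitution performed by algorithm (3.17) can be sketched in Python with a regular expression instead of explicit pointers. The sketch assumes that a "coded Arabic character" is any \'hh escape whose value is 80 hex or above; the thesis uses a lookup table instead, so this range test and the function name are assumptions.

```python
import re

def compress_hex_escapes(rtf_bytes):
    """Replace each four-byte \\'hh escape (hh >= 0x80) with the single byte hh."""
    return re.sub(rb"\\'([89a-fA-F][0-9a-fA-F])",
                  lambda m: bytes([int(m.group(1), 16)]),
                  rtf_bytes)

data = rb"\rtlch \'e3\'d1 \par"
packed = compress_hex_escapes(data)
```

Every replaced escape saves three bytes. If the character set of the file is the Arabic Windows code page, the packed bytes could presumably be decoded with `packed.decode("cp1256")`, but that depends on the character set declared in the RTF header.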


3.6 Design Instructional Package

The researcher uses the systems approach in designing the package; figure (3.14) represents the design of the package according to this method [41].

[Figure (3.14) Process of designing the learning package. Three stages in sequence, with feedback paths: Analysis (specify needs, specify aims, specify community); Construction (scientific contents, set tests, produce the package); Evaluation (supervisors, experts).]

3.6.1 Analysis

This stage specifies the scientific structure of the package, and it consists of:

1. Specify needs

The first step in analyzing any package is to identify the learning needs, which represent the need to produce the package. The current package is needed to build and develop the learner’s knowledge and capabilities in the field of ciphering and hiding data in an Arabic text document.

2. Specify Aims

The package aims inform the learners of the abilities, practical skills and information that can be achieved after finishing the package. The aims of the current package are to define the concepts of cryptography and steganography, to become familiar with ciphering a stream of data, and to become familiar with hiding data in Arabic text.

To ensure that these aims are achieved, behavioral aims are presented before each unit:

• Describe the major milestones of cryptography.

• Explain the principles of steganography.

• Describe the methods of data hiding.

• Recognize the data compression techniques.

• Recognize the Huffman coding method.

• Identify the Unicode system.

• Recognize the Arabic character’s characteristics.

3. Specify community

To benefit from the package, the designer must pay attention to the properties, knowledge level and background knowledge of the learner, so that the materials of the package can be set up according to the learners’ level. For this reason, the researcher identifies the target community as:

• Computer engineering, computer science, communication

engineering.

• Higher education students in computer science, computer engineering,

and communication engineering.

3.6.2 Construction

In this stage the package is constructed according to the needs and

aims identified above, and it consists of:

1. Scientific contents

Each instructional package has content identified by the aims given above. The researcher specifies the scientific contents of the current package to teach the learners about data hiding in Arabic text. The contents of the current package are divided into five units:

1. Cryptography principles.

2. Steganography principles.

3. Huffman coding.

4. Unicode system.

5. Cipher simulation.

These units are designed as a computer presentation, which benefits from the capabilities of the computer to build the units using backgrounds, colors, photos, etc.

2. Set tests

The researcher uses three types of test. The first is a pre-test, which is used before beginning each unit to determine the knowledge level of the learner. The second is exercises, which are used after finishing each unit to determine the information and concepts acquired by the learner. The last is a post-test, which is used after finishing all units to determine whether the aims of the package have been achieved.

The researcher observes several points when setting the questions:

• The questions are presented simply.

• The questions are a part of the scientific contents of the package.

• The questions must be comprehensive.

• Help is provided on how to answer the questions.

• Feedback is given if any answer is false.

3. Produce the package

To produce the package, the researcher uses the Visual Basic language to program the package and the PhotoShop program to design the pages, because they have facilities to build the presentation screen with full capacity to use colors, fonts, sounds and easy transitions between screens. Figure (3.15) shows the flowchart of the learning package for one unit; the other units take the same structure.

[Figure (3.15) Flowchart of the learning package. Box labels: Title; Introduction, Community, Aims, Help; Contents; Unit aim; Pre-test; Scientific content; Exercise; Go to another unit; Post test; End. Decision labels: Pass, Fail, Yes, No.]


3.6.3 Evaluation

This is the last stage in designing the package; it is used to find the negative and positive points of the package.

Evaluation process is first performed by the supervisors to ensure

that the following points are achieved:

• The quality of the aims and scientific contents.

• Simplicity of the language used.

• The questions and their feedback.

• The help of using the package.

• The background and the colors of the presentation.

For the purpose of measuring the package performance, two

questionnaires are distributed:

1. A questionnaire for the viewpoints of experts, which includes (11) items shown in Appendix (C); the names of the experts are also listed in Appendix (C).

2. A users questionnaire, which contains (12) items shown in Appendix (C).

Each item is answered by placing the symbol (√) in the field that reflects the user’s opinion. The questionnaire provides a scale of five degrees, which reflect the comprehension of information and concepts.

3.6.4 Statistical Method

The researcher used the standard deviation to analyze the results of

the questionnaires, where there are five answers for each item from very

large to very little.


Standard deviation is a measure of statistical dispersion: it measures how widely the values in a data set are spread out. If the data points are all close to the mean, then the standard deviation is close to zero. If many data points are far from the mean, then the standard deviation is far from zero [42].

The standard deviation is calculated from the following equation:

$$S = \sqrt{\frac{\sum_{i} (X_i - \bar{X})^2 \, F_i}{N - 1}} \qquad \ldots (3.5)$$

where $X_i$ : degree of the given item

$\bar{X}$ : average

$F_i$ : repetition number

$N$ : sample number
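A short Python sketch of equation (3.5) for questionnaire data follows. It assumes that the sample number N is the sum of the repetition numbers and that the average is the frequency-weighted mean of the degrees; both are interpretations of the definitions above, and the function name is hypothetical.

```python
import math

def questionnaire_std(degrees, repetitions):
    """Standard deviation of graded answers, each degree X_i repeated F_i times."""
    n = sum(repetitions)                                   # N, assumed = sum of F_i
    mean = sum(x * f for x, f in zip(degrees, repetitions)) / n
    squares = sum(f * (x - mean) ** 2 for x, f in zip(degrees, repetitions))
    return math.sqrt(squares / (n - 1))

# Five degrees (1 = very little ... 5 = very large), each chosen once
spread = questionnaire_std([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
```

When every respondent gives the same degree, the sum of squares is zero and the function returns 0, which matches the remark that tightly clustered answers give a standard deviation close to zero.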


Chapter Four

Results And

Discussion


4.1 Introduction

The proposed software consists of two programs. The first hides Arabic text in a document file with the extension (.DOC) using four methods (Unicode system, white space, add hyphen, and change position). The second hides Arabic text in a document file with the extension (.RTF) by hiding the message in the data part of the file. Both file types are Microsoft Word compatible.

4.2 Ciphering and Hiding Data in .DOC Document Files

The ciphering and hiding process is carried out in several steps, which are discussed briefly below using an Arabic plain text document file as a cover, an Arabic message to hide, and a password for the encryption process. Figure (4.1) shows the main window of the proposed software.

Figure (4.1) Main window of the proposed software


As an example, cover paragraphs shown in figure (4.2) will be taken

to hide the message.

أهمية وحدة المعالجة المركزيةتعتبر وحدة المعالجة المركزية في الحاسب من أهم الأجزاء بل أهمهـا على الإطلاق لأنها بمثابة العقل في الجهاز، كما أنها تعمل على إنجـاز كافـة العمليات الحسابية في سرعات مذهلة، بالإضافة إلى معالجـة مختلـف أنـواع

أجزاء الحاسب، و يعتبر المعالج من أكثر الأجهزة البيانات والتنسيق بين جميع تعقيدا، حيث يحتوي على ملايين الترانزستورات والتي تتـرابط مـع بعـضها

والتي لهـا سـمكها ) من الزجاج المصهور ( البعض بواسطة شعيرات معدنية .أرق مئات المرات من سمك الشعرة الواحدة للإنسان بساعة النظـام، ولكـن لا يوجد بداخل كل حاسب ساعة خاصة تسمى

تستخدم هذه الساعة لمعرفة الوقت، وإنما لإرسال نبضات كهربائية صغيرة إلى وحدة المعالجة والتي بدورها تقوم باستخدام هذه النبضات للتحكم في العمليـات التي تنجزها، ولوجود هذه الساعة علاقة وثيقة بسرعة تردد المعالج، فعلى سبيل

هيرتز يـستطيع أن يـستقبل 300ي يقوم بالعمل على تردد المثال المعالج الذ مليون نبضة في الثانيـة وبمـا أن 300النبضات الكهربائية من الساعة بمعدل

من نبضات ( المعالجات تقوم عادة بإنجاز عملية واحدة فقط لكل نبضة كهربائية و . انيـة مليون عملية لكل ث 300فبالتالي بإمكان المعالج إنجاز ) ساعة النظام

بـشكل أصـغر ) أو الدوائر التـي بـداخلها ( من أهم أسباب جعل المعالجات فأصغر من قبل شركات تصنيع المعالجات هو جعل مسافات انتقال الكهرباء بين الترانزوستورز بداخل وحدة المعالجة أقصر الأمر الذي يعمل على زيادة سرعة

.المعالج أقسام، أهم هذه الأقسام والتـي تتكون وحدة المعالجة المركزية من عدة

يتم من خلالها معالجة البيانات والقيام بمختلف العمليات في الحاسب هما وحـدة .التحكم و وحدة التنفيذ

Figure (4.2) Arabic plain text paragraphs

Following are the steps of Ciphering and hiding process:

4.2.1 Open Document File

Open an Arabic text document file from the <File> menu in the menu bar of the software. The file, which is used to hide the message, must have a “.DOC” extension and must have been saved using the Microsoft Word application. The proposed software opens the file in the Microsoft Word environment, as shown in figure (4.3).


Figure (4.3) Arabic text document file - cover file

4.2.2 Select Hiding Method

This step selects the hiding method, as shown in figure (4.4). There are four hiding methods: Unicode system, white space, change position and hyphen.

Figure (4.4) Stego menu of the proposed software - Select method submenu


4.2.3 Write the Message

In this step, the message to be hidden is written in the designated text box, as shown in figure (4.5); all Arabic characters as well as the space are allowed.

Figure (4.5) Message window

4.2.4 Write the Password

In this step, a password for encryption is written in the designated text box, as shown in figure (4.6); all English characters (upper case and lower case) as well as numbers are allowed.

Figure (4.6) Password window

4.2.5 Start Hiding Process

The cipher and hide process is started by selecting the “Hide” item from the <Stego> menu, as shown in figure (4.7).

Figure (4.7) Menu bar of the proposed software - Stego menu

The cipher and hide processes consist of several steps; each step is a part (subroutine) of the software and is discussed below:


1. Huffman Code Subroutine: converts the message characters (written in the previous step) to binary using the Huffman code, by taking each character of the message and converting it to its equivalent binary code using table (3.1). The result of the conversion is shown in figure (4.8); for the example message shown in figure (4.5), the binary stream is 196 bits long.

0 0 1 0 1 1 1 0 1 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 1 0 1 1 0 1 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 1 1 0 1 0 0 0 1 0 1 1 1 0 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 0 1 0 1 1 0 1 1 0 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 0 0 1 0 1 0 1 1 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 1 1 1 0 1 0 0 0 1 0 1 1 1 0 1 0 0 0 1 1 1 1 1 0 1 1 0 0 1 0 1

Figure (4.8) Binary bits of the converted plain text message

2. Check Subroutine: calculates the number of words and the number of

bits that can be hidden in the document, as well as the number of characters

in the message, number of bits before and after compression, and

compression ratio as shown in figure (4.9).

Document (cover)
    No. of words                      239
    No. of bits can hide              221

Message (to hide)
    No. of characters                 42
    No. of bits-before compression    336
    No. of bits-after compression     196
    Compression ratio                 46%

Figure (4.9) Check subroutine results


The subroutine checks whether the number of bits in the message is larger than the number of bits that can be hidden in the document; if so, a message box like that in figure (4.10) appears. If the answer is “Yes”, the program continues with the hiding process but ignores the last part of the message, for which there is not enough space in the document; if the answer is “No”, the program cancels the hiding process.

Figure (4.10) No enough space message box

3. Initialize Registers Subroutine: builds the registers that are used to generate the keystream. This is achieved in several steps, described as follows:

• Convert each character of the password to its ASCII code to get the

key, as shown below

Char    H    8    3    D    2    F    V    7    S    R
ASCII   72   56   51   68   50   70   86   55   83   82

After converting the characters, the key is 72565168507086558382.

• Read the time from the computer clock; for example, the timer value is (78538.46).

              tim(1)  tim(2)  tim(3)  tim(4)  tim(5)       tim(6)  tim(7)
Timer digit   7       8       5       3       8       .    4       6


• Mix the key with the time to get a key that varies with time, by multiplying specific digits of the key by the corresponding digit of the time, as shown in figure (4.11); the result of each multiplication step changes the value of part of the key.

Key location   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Original       7 2 5 6 5 1 6 8 5 0 7 0 8 6 5 5 8 3 8 2
Tim(1) = 7
Key 1          7 2 5 4 5 5 6 8 5 0 7 0 8 6 5 5 8 3 8 2
Tim(2) = 8
Key 2          7 2 5 4 5 4 4 8 5 0 7 0 8 6 5 5 8 3 8 2
Tim(3) = 5
Key 3          7 2 5 4 5 4 4 4 2 5 7 0 8 6 5 5 8 3 8 2
Tim(4) = 3
Key 4          7 2 5 4 5 4 4 4 2 1 7 1 8 6 5 5 8 3 8 2
Tim(5) = 8
Key 5          7 2 5 4 5 4 4 4 2 1 7 1 4 4 5 5 8 3 8 2
Tim(6) = 4
Key 6          7 2 5 4 5 4 4 4 2 1 7 1 4 4 5 2 3 2 8 2
Tim(7) = 6
Final          7 2 5 4 5 4 4 4 2 1 7 1 4 4 5 2 3 1 6 8

Figure (4.11) Generating the time-varying key

After the last multiplication step, the resulting key is the final encryption key, 72545444217144523168, which is used to generate the register lengths, state bits, feedback taps, transfer bit locations, transfer addresses and multiplexer selector.

• The encryption key generated in the previous step is used to produce the register lengths, using the algorithm in the flowchart shown in figure (3.5). The resulting register lengths are listed in table (4.1).


Table (4.1) Registers length

Register number   Length (cells)
1                 22
2                 23
3                 21
4                 26
5                 36

• Initialize state bits for each register using algorithm in the flowchart

shown in figure (3.6), where each cell has a binary value either ‘0’ or

‘1’. The state bits of all registers are listed in table (4.2).

Table (4.2) State Bits for each register

Register 1 1 0 1 0 0 1 1 0 1 1 1 1 1 0 1 0 1 0 1 0 1 0

Register 2 0 1 1 1 1 1 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 1 0

Register 3 0 1 1 1 1 1 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1

Register 4 0 1 0 1 1 1 1 0 1 0 0 1 1 0 0 0 0 0 1 1 0 1 0 1 0 1

Register 5 1 1 0 1 1 0 1 0 1 0 1 0 0 0 0 1 0 1 0 0 1 0 1 1 1 1 0 0 1 1 1 1 0 1 1 1

• Generate feedback taps for each register using algorithm in the

flowchart shown in figure (3.7), where each tap has a binary

value either ‘0’ or ‘1’. The feedback taps of all registers are

listed in table (4.3).


Table (4.3) Feedback taps bit for each register

Register 1 1 0 1 1 0 0 1 0 0 1 0 0 0 1 0 0 1 1 0 0 1 0

Register 2 1 1 1 0 1 1 1 0 1 0 0 1 1 0 0 1 0 0 1 0 0 1 0

Register 3 0 1 1 1 0 1 0 0 0 1 0 1 0 1 0 1 0 1 1 0 0

Register 4 0 0 0 1 1 0 1 1 0 1 0 1 0 1 0 0 0 0 1 1 1 1 1 1 1 0

Register 5 1 1 1 1 0 1 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 0 1 1 1 0 1 0 1 1 0 1 0 1 0

• Generate transfer bit locations using the algorithm in the flowchart shown in figure (3.8). Each register has sixteen active cells, which are listed in table (4.4).

Table (4.4) Locations for transition for each register

Register no.   Cell no. of active cells
1              1 2 3 4 5 6 7 8 10 12 13 17 19 20 21 22
2              1 2 3 6 8 9 10 11 12 14 16 18 19 20 21 22
3              1 2 3 4 5 9 10 12 13 14 15 16 17 18 19 21
4              2 3 4 5 6 7 8 12 13 14 15 18 20 21 24 26
5              3 6 7 11 13 14 17 18 23 24 26 27 28 33 35 36

• Generate transfer address using algorithm in the flowchart shown in

figure (3.9). Each register has four address cells used to address an

active cell with previous register as listed in table (4.5), and other four

address cells used to address an active cell with next register as listed

in table (4.6).


Table (4.5) Locations of transition for each register

Register no.   Cell no. of address up
1              8 9 18 22
2              7 8 9 20
3              1 2 3 6
4              5 20 21 23
5              2 7 8 24

Table (4.6) Locations of transition for each register

Register no.   Cell no. of address down
1              6 11 17 19
2              1 3 6 21
3              4 5 8 11
4              3 20 22 26
5              2 3 16 35

• Generate the multiplexer selector cells using the algorithm in the flowchart shown in figure (3.10). Table (4.7) lists the cell number for each selector.

Table (4.7) Locations for transition for each register

Register no.   Cell no. for multiplexer selector
2              3
3              12
4              24

After all register parameters are initialized, the five registers are built as shown in figure (4.12).


Figure (4.12) The proposed Registers design as an example


Figure (4.12) shows the five registers. The output of each register is used as an input to the multiplexer, whose selector takes its value from the content of the cells (drawn as dark gray rectangles with two arrows) assigned in the previous step. The active cells of each register are drawn as light gray boxes, and the feedback cells of each register are drawn as small gray boxes above the cell boxes. Cells used for transition address-up are drawn as gray rectangles marked -up-, and cells used for transition address-down are drawn as gray rectangles marked -down-.
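The first two key-derivation steps of the Initialize Registers Subroutine (password to ASCII key, then mixing with the time) can be reproduced with the Python sketch below. The mixing rule is not stated explicitly in the text; it is inferred here from the worked example in figure (4.11): for the n-th character of the timer string, the two key digits at locations 2n + 2 and 2n + 3 are multiplied by the time digit, and the product, written as three digits, overwrites locations 2n + 2 to 2n + 4, while the decimal point is skipped. The function names and the zero padding to three digits are assumptions.

```python
def password_to_key(password):
    """Concatenate the decimal ASCII codes of the password characters."""
    return "".join(str(ord(ch)) for ch in password)

def mix_key_with_time(key, timer):
    """Apply the multiplication steps inferred from figure (4.11)."""
    digits = list(key)
    for n, ch in enumerate(timer, start=1):
        if ch == ".":
            continue                      # the decimal point changes no digits
        start = 2 * n + 1                 # 0-based index of key location 2n + 2
        product = int("".join(digits[start:start + 2])) * int(ch)
        digits[start:start + 3] = f"{product:03d}"
    return "".join(digits)

key = password_to_key("H83D2FV7SR")
final_key = mix_key_with_time(key, "78538.46")
```

With the example password and timer value, the sketch gives the key 72565168507086558382 and the final encryption key 72545444217144523168, the same values as in the text above. It assumes a key of at least 20 digits; how the thesis program handles passwords whose characters have three-digit ASCII codes is not covered here.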

4. Keystream Generation Subroutine: uses the register model shown in figure (4.12) to generate a keystream whose length equals the length of the message bits after compression. The result is shown in figure (4.13); the binary stream is 196 bits long.

0 1 0 1 1 0 1 0 0 1 0 1 1 0 1 0 1 1 0 1 1 0 0 0 1 1 0 0 0 1 1 1 1 0 1 0 0 1 0 1 1 1 0 1 1 0 0 1 1 0 1 1 1 1 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 1 1 1 0 0 0 1 0 1 1 1 1 1 1 0 0 1 0 0 0 0 0 1 0 0 1 1 1 1 1 0 0 1 1 1 0 1 1 0 1 0 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0 1 0 1 1 0 1 1 0 1 0 1 0 0 1 0 0 0 0 1 1 1 1 1 0 0 1 0 1 0 0 0 1 1 1 1 0 1 1 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 1 0

Figure (4.13) Keystream bits generated

5. Test Subroutine: tests the keystream generated in the previous step using statistical tests. The software checks the test values; if one or more of them exceeds the permitted value, the software reads another time value and calculates a new key to generate a new keystream, until all test values are acceptable (in range). Table (4.8) lists the threshold value for each statistical test and the test value for this example.


Table (4.8) Test values of the keystream

Test type              Test value   Threshold value
Frequency test         00.184       03.8415
Serial test            00.447       05.9915
Poker test             02.938       14.0671
Runs test              01.650       09.4877
Autocorrelation test   00.001       01.9600
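This section does not give the test formulas, so the Python sketch below assumes the common form of the frequency (monobit) test, X = (n0 - n1)^2 / n, compared against the threshold 3.8415 listed in table (4.8). The function names are hypothetical. As a consistency check, a 196-bit stream with 95 zeros and 101 ones gives 36 / 196 ≈ 0.184, the frequency test value listed in the table.

```python
FREQUENCY_THRESHOLD = 3.8415  # threshold value from table (4.8)

def frequency_test(bits):
    """Return (n0 - n1)^2 / n for a list of 0/1 values."""
    n0 = bits.count(0)
    n1 = bits.count(1)
    return (n0 - n1) ** 2 / len(bits)

def passes_frequency_test(bits):
    """Accept the keystream only if the statistic stays below the threshold."""
    return frequency_test(bits) < FREQUENCY_THRESHOLD

value = frequency_test([0] * 95 + [1] * 101)
```

A stream that fails this check would, as described in step 5, trigger a new time reading and a new keystream.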

6. Encryption Subroutine: XORs the plain text message bits with the keystream bits to generate the cipher stream, as shown in figure (4.14).

0 1 1 1 0 1 0 0 1 1 0 1 0 0 0 0 1 0 1 1 0 1 0 0 1 0 0 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 0 1 1 0 0 0 1 1 1 1 0 1 0 0 0 0 0 1 1 0 0 0 1 1 1 1 1 1 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 0 1 1 0 0 1 1 1 0 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 1 1 1 0 0 1 0 1 1 1 1 0 0 0 1 1 0 1 1 0 1 0 1 1 0 1 0 0 1 0 1 1 0 0 0 1 1

Figure (4.14) Cipher stream bits

7. Hiding Subroutine: hides the cipher stream produced in the previous step. The hiding process depends on the method selected as shown in figure (4.4); the following sections describe all four methods. Figure (4.15) shows the window of the proposed software after the hiding process is complete.


Figure (4.15) Main window of the proposed software after hiding process

4.2.6 Hiding Data with Unicode System Method

This method relies on the character code tables shown in Appendix A.
For Arabic text there are two code tables, one for standard characters and
the other for isolated characters. The method either converts each isolated
character in a word from one code number to the other or leaves it
unchanged, depending on the 0's and 1's of the cipher stream bits. This does
not affect the appearance of the document when it is viewed in a word
processor such as Microsoft Word or WordPad.
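The idea can be sketched in Python under several assumptions that the text does not state: only the six letters that never connect to the following letter are used as carriers, a letter counts as isolated when it follows a non-letter or another of these six letters, and bit 1 is mapped to the isolated-form code point while bit 0 keeps the standard one. The function names are hypothetical.

```python
# Standard code point -> isolated form in Arabic Presentation Forms-B.
ISOLATED = {
    "\u0627": "\uFE8D",  # ALEF
    "\u062F": "\uFEA9",  # DAL
    "\u0630": "\uFEAB",  # THAL
    "\u0631": "\uFEAD",  # REH
    "\u0632": "\uFEAF",  # ZAIN
    "\u0648": "\uFEED",  # WAW
}
STANDARD = {iso: std for std, iso in ISOLATED.items()}
NON_CONNECTING = set(ISOLATED) | set(STANDARD)


def is_arabic_letter(ch):
    return "\u0621" <= ch <= "\u064A" or ch in STANDARD


def carrier_positions(text):
    """Indexes of listed letters that are displayed in isolated form."""
    positions = []
    for i, ch in enumerate(text):
        if ch not in NON_CONNECTING:
            continue
        prev = text[i - 1] if i > 0 else " "
        if not is_arabic_letter(prev) or prev in NON_CONNECTING:
            positions.append(i)
    return positions


def embed(cover, bits):
    positions = carrier_positions(cover)
    if len(bits) > len(positions):
        raise ValueError("cover text has too few isolated characters")
    chars = list(cover)
    for pos, bit in zip(positions, bits):
        base = STANDARD.get(chars[pos], chars[pos])
        chars[pos] = ISOLATED[base] if bit else base
    return "".join(chars)


def extract(stego, count):
    positions = carrier_positions(stego)[:count]
    return [1 if stego[p] in STANDARD else 0 for p in positions]
```

Both code points of a pair occupy one 16-bit unit, so a text stored as UTF-16 keeps its length, which is consistent with the unchanged file size reported for this method.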

The text document after hiding the message is shown in figure (4.16).
It looks identical to the source document shown in figure (4.3), the two
files have the same size, and the software can hide 221 bits in the source
document file.


Figure (4.16) Document file after hiding the message (using Unicode system method)

Figure (4.17) shows part of the document at 300% zoom before the hiding
process, and figure (4.18) shows the same part after hiding. The two
figures look identical, so a third party cannot detect the change by eye.

Figure (4.17) Part of the document file before hiding process


Figure (4.18) Part of the document file after hiding process

(using Unicode system method)

4.2.7 Hiding Data with White Space Method

This method uses the spaces between words to hide data: one space
between two words represents the bit 0, and two spaces between two
words represent the bit 1.
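The white space rule can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the cover text uses single spaces between words before hiding.

```python
import re


def embed(cover, bits):
    """Join the words with one space for bit 0 and two spaces for bit 1."""
    words = cover.split()
    if not words or len(bits) > len(words) - 1:
        raise ValueError("cover text has too few word gaps")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        bit = bits[i] if i < len(bits) else 0  # unused gaps stay single
        out.append((" " if bit == 0 else "  ") + word)
    return "".join(out)


def extract(stego, count):
    """Read the first `count` gaps back as bits."""
    gaps = re.findall(r" +", stego.strip())
    return [0 if len(gap) == 1 else 1 for gap in gaps[:count]]
```

The capacity is one bit per gap, so a document of N words can carry at most N - 1 bits with this rule.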

Part of the document is shown in figure (4.19), and the text document
after hiding the message is shown in figure (4.20). A third party can
easily detect the difference between the files by eye. The target file and
the source file have the same size, and the software can hide 235 bits in
the source document file.

Figure (4.19) Part of the document file after hiding process

(using white space method)


Figure (4.20) Document file after hiding the message (using white space method)

4.2.8 Hiding Data with Hyphen Method

This method exploits the fact that Arabic characters are connected to
each other, and hides data by adding a hyphen between them: a word with
no hyphen represents the bit 0, and a word with one hyphen represents
the bit 1.
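A rough Python sketch of this rule follows. It assumes that the "hyphen" is the Arabic tatweel character (U+0640), which the text does not state explicitly, that the cover text contains no tatweel before hiding, and that a simplified list of non-connecting letters is enough to find a connected pair. The function names are hypothetical.

```python
TATWEEL = "\u0640"  # assumed to be the "hyphen" used between connected letters

# Letters that do not connect to the letter that follows them (simplified list).
NON_FORWARD = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def is_letter(ch):
    return "\u0621" <= ch <= "\u064A" and ch != TATWEEL


def join_index(word):
    """Index between the first pair of connected letters, or None."""
    for i in range(len(word) - 1):
        if (is_letter(word[i]) and is_letter(word[i + 1])
                and word[i] not in NON_FORWARD):
            return i + 1
    return None


def embed(cover, bits):
    remaining = list(bits)
    out = []
    for word in cover.split(" "):
        idx = join_index(word)
        if idx is not None and remaining:
            if remaining.pop(0) == 1:
                word = word[:idx] + TATWEEL + word[idx:]
        out.append(word)
    if remaining:
        raise ValueError("cover text has too few usable words")
    return " ".join(out)


def extract(stego, count):
    bits = []
    for word in stego.split(" "):
        bare = word.replace(TATWEEL, "")
        if join_index(bare) is not None:  # skip words with no connected pair
            bits.append(1 if TATWEEL in word else 0)
    return bits[:count]
```

Words with no connected pair of letters carry no bit, which is one plausible reason the reported capacity of this method is lower than the number of words in the document.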

The text document after hiding the message is shown in figure (4.21),
and part of the document is shown in figure (4.22). With this method a
third party cannot easily detect the difference, because Arabic paragraphs
already contain hyphens for justification and alignment of the text. The
target file and the source file have the same size, and the software can
hide 218 bits in the source document file.

Figure (4.21) Document file after hiding the message (using hyphen method)

Figure (4.22) Part of the document file after hiding process

(using hyphen method)


4.2.9 Hiding Data with Change Position Method

This method uses the line of a row as a baseline and either shifts a word
up or leaves it in place, depending on the data to hide: an unshifted word
represents the bit 0, and a word shifted up by one pixel represents the
bit 1.

The text document after hiding the message is shown in figure (4.23).
The target file and the source file have the same size, and the software
can hide 239 bits in the source document file.

Figure (4.23) Document file after hiding the message (using change position method)


Part of the text at 500% zoom is shown in figure (4.24). In figure (4.23)
a third party cannot detect the difference by eye, but when the text is
zoomed to 500% the difference becomes visible, as shown in
figure (4.24 a) and figure (4.24 b).

a. Text before hiding (original text)

b. Text after hiding

Figure (4.24) Part of the document file (using change position method)


4.3 Hiding Data in .RTF Document Files

The second proposed software hides data in a Microsoft Word compatible
format, the RTF file. The main window of the software is shown in
figure (4.25).

Figure (4.25) Main window of the proposed software

As an example, the same cover paragraphs shown in figure (4.2) are
used to hide the same message. The steps of the hiding process are
as follows:

4.3.1 Open Document File

Open the Word document file in which the message will be hidden (the
same document file that is shown in figure (4.2)).


4.3.2 Write the Message

Write the message to hide in the text box shown in figure (4.26); all
Arabic characters and the space character may be entered.

Figure (4.26) Message window

4.3.3 Start Hiding Process

Start the hiding process by selecting the <Hide> item of the <Process>
menu from the menu bar, as shown in figure (4.27).

Figure (4.27) Main menu of the software – Process submenu

The hiding process hides the message in the data zone of the RTF file.
Figure (4.28) shows part of the file viewed in hexadecimal code. The
software uses a dummy control symbol such as ( \az ) to carry the message
characters; as figure (4.29) shows, three characters follow each dummy
control symbol as the hiding technique.
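One possible reading of this technique is sketched below in Python. Everything beyond the \az control word is an assumption: the sketch takes the three characters to be a three-digit decimal code of one message byte (Windows-1256 encoding), and it places the dummy controls just before the final closing brace of the file. The function names are hypothetical, and the thesis software may place and encode the data differently.

```python
import re

CONTROL = "\\az"  # dummy control word named in the text


def embed(rtf, message):
    """Append one dummy control per message byte before the last '}'."""
    end = rtf.rstrip().rfind("}")
    if end == -1:
        raise ValueError("not an RTF document: no closing brace found")
    payload = "".join(
        f"{CONTROL}{byte:03d}" for byte in message.encode("cp1256")
    )
    return rtf[:end] + payload + rtf[end:]


def extract(rtf):
    """Collect the three-digit codes that follow each dummy control."""
    codes = re.findall(r"\\az(\d{3})", rtf)
    return bytes(int(code) for code in codes).decode("cp1256")
```

RTF readers are expected to skip control words they do not recognise, which is why the displayed document stays unchanged while the file grows by six bytes per hidden character in this sketch.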

Figure (4.30) shows the document file after hiding the message; it
looks identical to the source document shown in figure (4.3).


Figure (4.28) View of the file before hiding

Figure (4.29) View of the file after hiding


Figure (4.30) Document file after hiding the message (using RTF file format)

Figure (4.31) shows part of the document at 300% zoom after hiding.
It looks identical to figure (4.17), so a third party cannot detect
the change by eye.

Figure (4.31) Part of the document file after hiding process

(using RTF file method)


With this method the file after hiding is larger than the source file.
To address this, a subroutine compresses the file back to its original
size. Figure (4.32) shows the document file after compression; comparing
it with the previous figure shows that the two files look identical.

Figure (4.32) Document file after compression


Figure (4.33) shows the main window of the proposed software after

implementing processes (hide, unhide, compress) on a document file.

Figure (4.33) Main window of the proposed software after hiding and compression processes

4.4 Discussion

From the previous sections, and after examining fifteen different
documents with all the hiding methods used in this thesis, the best way
of hiding data in an Arabic text document is the Unicode system method:
the target file has the same size as the source file, a third party cannot
recognize the difference by eye, the original document is not required to
recover the hidden message, and changes in font name, font size, font
style, or paragraph justification do not affect the hidden message.

The RTF file method is similar to the previous method, except that the
file size increases after hiding; the proposed software compresses the file
to make the difference between its size and the source file size as small
as possible.

The other three methods can also be used for hiding, but with some
risk if the third party has knowledge of data hiding. In all cases,
applying cryptography before hiding improves the security of the
message.

The following points can be summarized from the literature introduced
in the previous chapter:

1. Brassil et al. used a word-shift coding method (shifting words left
or right), while the current study uses the same method but shifts
words up or leaves them in place.

2. Shaar proposed hiding a number of bits of the plaintext message in a
random bit vector, with the locations of the hidden bits determined
by a key. The current study uses the same strategy when hiding the
time (which is part of the password) in the document file.

3. Kim used an inter-word space method, while the current study uses
the same method but in a different way.

4. Sui proposed a method to hide information in a hypertext file. The
final stego-file is similar to that of the current study when the
Unicode system method is used: the stego-file and the cover show no
difference in normal appearance, and the algorithm does not lengthen
the file.

5. Topkara uses a linguistic method to hide information. That study
differs from the current one in that it changes the words of a
sentence while preserving its meaning, whereas the current study
changes features of the characters to hide data.

6. Voloshynovskiy proposed a method to hide data in the characters'
color, using a Microsoft Word document as a cover file. The current
study also uses this facility, with two file types that are compatible
with Microsoft Word: the Document file and the RTF file.

7. The current study agrees with the studies of Alderson, Tubsree, and
Fahim in the educational technology field on preparing an
instructional computer program.

Table (4.9) reviews the hiding methods for two different document
files, giving the file size, the capacity to hide data, and the document
view before and after hiding the message.

Table (4.9) Hiding methods review

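Method type | File index | Size before hiding (Byte) | Size after hiding (Byte) | No. of words in the document | No. of bits can hide in file | Document view before and after
Unicode system | 1.doc | 21,504 | 21,504 | 239 | 221 | Identical
White space | 1.doc | 21,504 | 21,504 | 239 | 235 | Not identical
Hyphenation | 1.doc | 21,504 | 21,504 | 239 | 218 | Not identical
Change position | 1.doc | 21,504 | 21,504 | 239 | 239 | Identical (normal view)
RTF | 1.rtf | 8,784 | 10,303 before compress; 8,784 after compress | 239 | 6072 | Identical
Unicode system | 2.doc | 20,992 | 20,992 | 202 | 208 | Identical
White space | 2.doc | 20,992 | 20,992 | 202 | 180 | Not identical
Hyphenation | 2.doc | 20,992 | 20,992 | 202 | 198 | Not identical
Change position | 2.doc | 20,992 | 20,992 | 202 | 202 | Identical (normal view)
RTF | 2.rtf | 8,439 | 9,892 before compress; 8,440 after compress | 202 | 5808 | Identical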


4.5 Instructional Technology Side Results

4.5.1 Opinion List Results of Experts Viewpoint Analysis

Table (4.10) gives the standard deviation for each item. From the table,
item (6) has the highest standard deviation, while item (3) has the lowest.
This means that the experts agree closely that the information and
concepts displayed in the package are scientifically suitable, while they
differ on the clarity of the package's display style.

Table (4.10) Opinion list result of experts

No. | The items | Large | Medium | Little | Standard deviation
1 | The instructions of using the package are simple. | 6 | 3 | 1 | 0.707
2 | The density of displaying information on computer screen is suitable. | 7 | 3 | - | 0.483
3 | The information that displayed in the package are suitable scientifically. | 8 | 2 | - | 0.421
4 | Designing of the instructional package takes into account the personal differences. | 7 | 2 | 1 | 0.699
5 | Clearness of the item's titles. | 3 | 6 | 1 | 0.632
6 | Displaying package style is limiting. | 4 | 3 | 3 | 0.875
7 | The attached images in the instructional units are participating to understand the concepts. | 1 | 7 | 2 | 0.567
8 | Understanding the producing questions in the program. | 4 | 5 | 1 | 0.674
9 | Suitable of the used color in the package. | 2 | 6 | 2 | 0.666
10 | The questions are including all items in the instructional package. | 3 | 5 | 2 | 0.737
11 | The language style that used to explain the scientific concepts and information is clear. | 3 | 4 | 3 | 0.816
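The standard deviations in the table can be reproduced with the short Python sketch below. It assumes that the answers are scored Large = 3, Medium = 2, Little = 1 and that the sample standard deviation (divisor n - 1) is used. Neither is stated in the text, but the values computed under this assumption agree with the table to about three decimal places. The function name is hypothetical.

```python
import math


def rating_std(large, medium, little):
    """Sample standard deviation of answers scored 3, 2 and 1."""
    scores = [3] * large + [2] * medium + [1] * little
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((score - mean) ** 2 for score in scores) / (n - 1)
    return math.sqrt(variance)
```

For item 1 (6 Large, 3 Medium, 1 Little) the mean score is 2.5 and the sum of squared deviations is 4.5, so the standard deviation is sqrt(4.5 / 9), about 0.707. A lower value means the respondents agree more closely on that item.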


4.5.2 Questionnaire Results of Learners Viewpoint Analysis

Table (4.11) gives the standard deviation for each item. From the table, item (4) has the highest standard deviation, while item (2) has the lowest. This means that the students agree closely that the scientific concepts displayed in the package were simple, while they differ on whether the flowchart helped to increase their understanding of the scientific information.

Table (4.11) Opinion list result of learners

No. | The items | Large | Medium | Little | Standard deviation
1 | The division of the instructional package subject into five typical units participated to increase your understanding of the package. | 10 | 4 | 1 | 0.632
2 | The scientific concepts that displayed in the package were simple. | 12 | 3 | - | 0.414
3 | The information that displayed in the package was clear. | 10 | 4 | 1 | 0.632
4 | The flowchart assisted to increase in understanding the scientific information | 9 | 3 | 3 | 0.828
5 | The harmony between display image and related information was fines. | 6 | 9 | - | 0.507
6 | Moving steps between the instructional package screens were simple. | 7 | 8 | - | 0.516
7 | English language is better that Arabic language in displaying the instructional package materials. | 5 | 6 | 4 | 0.798
8 | The language style to explain the scientific concepts is understand. | 7 | 7 | 1 | 0.632
9 | Using of colors cleared the displayed concepts. | 5 | 8 | 2 | 0.676
10 | Titles of the instructional package items are clear. | 8 | 6 | 1 | 0.639
11 | Immediate support for the answer, increased your desire to continue with the package. | 7 | 6 | 2 | 0.723
12 | The package increases your learning desire. | 11 | 4 | - | 0.457


4.6 Conclusions

In this thesis, a new technique for hiding information in Arabic text is
proposed. The technique takes into consideration parameters that are
used to detect the existence of hidden information in a text file, such as
file size, justification, font size, and font characteristics.

The experiments were carried out on an actual network, where files
containing hidden information were transferred by e-mail.

Even though hiding information in a text file has some limitations, it is
worth considering techniques that can improve the performance of such a
method. At the University of Technology there is a plan to connect the
University departments in a common network, and most of the documents
and files transferred between users across the University are Arabic text
files. Some secret information, possibly between heads, could be
transferred as hidden information in Arabic text files.

Many conclusions can be drawn from this work, the most important

of which are:

1. The tests show that the best method is the Unicode system method,

since:

a. The method modifies only the Unicode letters instead of the

content itself. The stego-file and the cover have no difference in

normal view.

b. The algorithm doesn't lengthen the file size since it just modifies

the Unicode letters instead of adding letters.

2. The RTF file method performs at the same level as the Unicode
system method once the file is compressed, taking into consideration
some parameters that affect the detection methods.

3. The other methods were also considered, but they carry some risk of
detection by a third party.


4. This system deals with document files that are compatible with

Microsoft Word Documents.

5. Statistical tests are used to measure the quality of the generated
keystream, so that the generator behaves as a random bit generator.

6. The instructional package helps to solve the problems that students
face in connecting the theoretical explanation of information hiding
with the practical application of these concepts.

7. The instructional package takes into consideration the cognitive

difference between the learners, where they use the package

according to their learning speed.

8. The instructional package assists the learners to develop their

knowledge, using the feedback provided by the package.

4.7 Recommendations

1. The researcher recommends implementing the proposed system in an
actual University computer network.

2. The instructional package should be used to enhance the ordinary
teaching method for the cryptography and steganography subject.

4.8 Suggestions

Several topics are suggested for further research in the information
hiding process:

1. Develop this system to hide voice data in Arabic text documents.

2. Develop a system to hide information in Adobe Acrobat files.

3. Build an instructional package to teach students about hiding
voice data in Arabic text documents.

4. Develop a system to hide information with online data transfer.


References

1. Bauer, F. L., “Decrypted Secrets: Methods and Maxims of Cryptology”,

3rd ed. Springer-Verlag, New York, 2002.

2. Arnold, M., Schmucker, M., and Wolthusen, S. D., “Techniques and

Applications of Digital Watermarking and Content Protection”, Artech

House, Norwood, Massachusetts, 2003.

3. Kessler, Gary C., “An Overview of Steganography for the Computer

Forensics Examiner”, Forensic Science Communications, No.3-Vol.6-

July 2004.

4. Pawliw, Borys and Neijts, Roberto, “Definitions”, 2002.

www.searchSecurity.com

5. Bender, W., Gruhl, D., Morimoto, N., and Lu, A., “Techniques for Data

Hiding”, IBM Systems Journal 35, Nos. 3&4, 313–336 (1996).

6. Engelfriet, Arnoud, “Steganography”, 2000.

www.stack.nl/galactus/remailers/index-privacy.html

7. Watermarking World web site, 2005.

www.watermarkingworld.org/faq.html

8. Franz, E., Jerichow, A., Möller, S., Pfitzmann, A., and Stierand, I.,

“Computer Based Steganography”, in Information Hiding, Springer

Lecture Notes in Computer Science v.1174, pp.7-21, 1996.

9. Linux, Fu-King, “Basic Data Hiding Tutorial”, 2003.

http://www.antionline.com/showthread.php?threadid=251463

10. Brassil, J. T., Low, S., and Maxemchuk, N.F., “Copyright protection for

the electronic distribution of text documents”, Proceedings of IEEE,

Vol.87, No.7, pp.1181- 1196, July 1999.

11. Shaar, M., Saeb, M. and Badawi, U., “A Hybrid Hiding Encryption

Algorithm for Data Communication Security”, Cairo University,

Faculty of Science Mathematics Dept., Computer Science Division,

1997.


12. Kim, Young-Won, Moon, Kyung-Ae, and Oh, Il-Seok, “A Text

Watermarking Algorithm based on Word Classification and Inter-word

Space Statistics”, Department of Computer Science, Chonbuk National

University, Korea, 2003.

13. Sui, Xin-Giiang, and Lilo, Hui, “A New Steganography method Based

on Hypertext”, National Key Lab of Modern Signal Processing, IEEE,

2004.

14. Topkara, M., Taskiran, C. M., and Delp, E. J., “Natural Language

Watermarking”, Video and Image Processing Laboratory, School of

Electrical and Computer Engineering, Purdue University, Indiana,

2005.

15. Voloshynovskiy, S., Villán, R., and Koval, O., Vila, J., “Text Data-

Hiding for Digital and Printed Documents”, Computer Vision and

Multimedia Laboratory - University of Geneva, Switzerland, 2006.

16. Uden, L. and Alderson, A., “Teaching and Learning Using Instructional

Design”, School of Computing, Staffordshire University, England,

2000.

17. Tubsree, Chalong, and Tubsree, Nai-Fen Yu, “Designing Effective

Instruction for Computer in Education Courses”, International

Conference on Computers in Education, IEEE, 2002.

18. Fahim, Rasha, “Educational package for detecting hidden information

embedded in an image”, Ph.D. thesis, Technical Education Department,

University of Technology, Iraq, 2006.

19. SSH communications security web site, 2004.

http://www.ssh.fi/support/cryptography/introduction/algorithms.html

20. RSA Security web site, 2004.

http://www.rsasecurity.com/rsalabs/node.asp?id=2164

21. Menezes, Alfred J., van Oorschot, Paul C., and Vanstone, Scott A.,

“Handbook of Applied Cryptography”, CRC Press Inc., Fifth Printing,

2001.


22. RSA Security web site, 2004.

http://www.rsasecurity.com/rsalabs/node.asp?id=2209

23. Robshaw, M.J.B., “Stream Ciphers”, RSA Laboratories, a division of

RSA Data Security, Inc., 1995.

24. RSA Security web site, 2004.

http://www.rsasecurity.com/rsalabs/node.asp?id=2174

25. RSA Security web site, 2004.

http://www.rsasecurity.com/rsalabs/node.asp?id=2266

26. Bender, W., Gruhl, D., Morimoto, N., and Lu, A., “Techniques for data

hiding”, IBM Systems Journal, Vol.35, 1996.

27. Dunbar, Bret, “A detailed look at Steganographic Techniques and their

use in an Open-Systems Environment”, SANS Institute, 2002.

28. Unicode Inc. web site, 2005.

http://www.unicode.org

29. McCreedy, David, “Gallery of Unicode Fonts”, 2005.

http://www.travelphrases.info/fonts.html

30. Wikipedia web site, “Data compression”, The free encyclopedia, 2005.

http://en.wikipedia.org

31. Blelloch, Guy E., “Introduction to Data Compression”, Computer

Science Department, Carnegie Mellon University, 2001.

32. Goebel, Greg, “Lossless Data Compression”, public domain, 2005.

http://www.vectorsite.net/ttdcmp1.html

33. Crochemore, M. and Lecroq, T., “Text data compression algorithms, in

(Algorithms and Theory of Computation Handbook)”, Chapter 10,

CRC Press, Boca Raton, 1998.

34. Reigeluth, C. M., “Instructional Design Theories and Models”,

Lawrence Erlbaum Associates. Hillsdale, NJ, 1983.

35. Tennyson, R.D., Schott, F., and Dijkstra, S., “Instructional Design:

international Perspectives, Theory, Research and Models”, Lawrence

Erlbaum Associates, Hillsdale, NJ, 1997.


36. Dick, W. and Carey, L. M., “The Systematic Design of Instruction”,

Scott Foresman, Glenview, IL, 1997.

37. Astin, B.H, “Principles of instructional Design”, University of Chicago

Press, 1997.

38. Al-Hela, M. M., “Design and produce an Instructional materials”,

Jordan, 2000.

39. Instructional Technology Center web site, “Thirteen Steps to Better

Instructional Visuals for Electronic Presentation”, Iowa State

University, 1999.

40. Microsoft web site, “Microsoft Office Word 2003 Rich Text Format

(RTF) Specification”, White Paper, Published: April 2004.

http://www.microsoft.com/office.htm

41. McCulloch, Bob, “Instructional Design”, University of Calgary, 1998.

http://www.ucalgary.ca/~edtech/688/getstart.htm

42. Wikipedia web site, “Standard Deviation”, The free encyclopedia,

2006. http://en.wikipedia.org/wiki/Standard_deviation


Appendixes


Appendix A Unicode Tables


Arabic Characters Standard Form, Range 0600-06FF

The table below is taken from the Unicode Standard, version 4.1, 2005.


Arabic Characters Form-B, Range FE70-FEFF

The table below is taken from the Unicode Standard, version 4.1, 2005.


Appendix B Program Subroutines


Subroutine 1: Initialize registers

Public Sub initialize(key)
    For i = 1 To 10
        a1 = a1 + Mid(Str(Asc(Mid(key, i, 1))), 2, 2)
    Next
    tim = Format(Timer, "00000.00")
    For i = 1 To 8
        c1 = Val(Mid(a1, 2 + (i * 2), 2))
        c2 = Val(Mid(tim, i, 1))
        If i = 6 Then GoTo q
        Mid(a1, 2 + (i * 2)) = c1 * c2
q:
    Next i
    key = a1
    ' *** generate register length ***
    For i = 1 To 4
        r = Mid(key, i + 15, 1) / 10
        b(i) = Int((30 - 20 + 1) * r + 20)
    Next i
    b(5) = 128 - b(1) - b(2) - b(3) - b(4)
    '*** generate locations for transition * c ***
    For j = 1 To 5
        r = Mid(key, 1 + j, 2) + 1
        For i = 1 To 16
l2:
            r = r / (1 + (Val(Mid(key, 1 + i, 1)) + 1) / 1000)
            t1 = Str(r)
            t2 = Right(r, 4)
            t3 = Val(t2) Mod b(j) + 1
            a = t3
            j1 = 0
            For k = 1 To 16
                If c(j, k) = a Then j1 = j1 + 1
            Next k
            If j1 = 0 Then c(j, i) = a Else GoTo l2
        Next i
    Next j
    '*** sort c ***
    For k = 1 To 5
        For i = 1 To 16


            For j = 1 To 16
                If c(k, i) < c(k, j) Then
                    t = c(k, i)
                    c(k, i) = c(k, j)
                    c(k, j) = t
                End If
            Next j
        Next i
    Next k
    '*** generate locations of address * up * d ***
    For j = 1 To 5
        r = Mid(key, 5 + j, 2) + 1
        For i = 1 To 4
l3:
            r = r / (1 + (Val(Mid(key, 5 + i, 1)) + 1) / 1000)
            t1 = Str(r)
            t2 = Right(r, 4)
            t3 = Val(t2) Mod b(j) + 1
            a = t3
            j1 = 0
            For k = 1 To 4
                If d(j, k) = a Then j1 = j1 + 1
            Next k
            If j1 = 0 Then d(j, i) = a Else GoTo l3
        Next i
    Next j
    '*** sort d ***
    For k = 1 To 5
        For i = 1 To 4
            For j = 1 To 4
                If d(k, i) < d(k, j) Then
                    t = d(k, i)
                    d(k, i) = d(k, j)
                    d(k, j) = t
                End If
            Next j
        Next i
    Next k
    '*** generate locations of address * down * e ***
    For j = 1 To 5
        r = Mid(key, 10 + j, 2) + 1
        For i = 1 To 4
l4:


            r = r / (1 + (Val(Mid(key, 10 + i, 1)) + 1) / 1000)
            t1 = Str(r)
            t2 = Right(r, 4)
            t3 = Val(t2) Mod b(j) + 1
            a = t3
            j1 = 0
            For k = 1 To 4
                If e(j, k) = a Then j1 = j1 + 1
            Next k
            If j1 = 0 Then e(j, i) = a Else GoTo l4
        Next i
    Next j
    '*** sort e ***
    For k = 1 To 5
        For i = 1 To 4
            For j = 1 To 4
                If e(k, i) < e(k, j) Then
                    t = e(k, i)
                    e(k, i) = e(k, j)
                    e(k, j) = t
                End If
            Next j
        Next i
    Next k
    '*** generate locations for multiplexer address * f ***
    r = Mid(key, 20, 2) + 1
    r = r / (1 + (Val(Mid(key, 20, 1)) + 1) / 1000)
    t1 = Str(r)
    t2 = Right(r, 4)
    t3 = Val(t2) Mod b(2) + 1
    f(1) = t3
    r = Mid(key, 15, 2) + 1
    r = r / (1 + (Val(Mid(key, 15, 1)) + 1) / 1000)
    t1 = Str(r)
    t2 = Right(r, 4)
    t3 = Val(t2) Mod b(3) + 1
    f(2) = t3
    r = Mid(key, 7, 2) + 1
    r = r / (1 + (Val(Mid(key, 7, 1)) + 1) / 1000)
    t1 = Str(r)
    t2 = Right(r, 4)
    t3 = Val(t2) Mod b(4) + 1


    f(3) = t3
    '*** generate bits for each register ***
    For j = 1 To 5
        r = Mid(key, 12 + j, 2) + 1
        For i = 1 To b(j)
            r = r / (1 + (Val(Mid(key, 13, 1)) + 1) / 1000)
            t1 = Str(r)
            t2 = Right(r, 3)
            t3 = Val(t2) Mod 2
            g(j, i) = t3
        Next
    Next
    For j = 1 To 5
        r = Mid(key, 4 + j, 2) + 1
        For i = 1 To b(j)
            r = r / (1 + (Val(Mid(key, 5, 1)) + 1) / 1000)
            t1 = Str(r)
            t2 = Right(r, 3)
            t3 = Val(t2) Mod 2
            x(j, i) = t3
        Next
    Next
End Sub

Subroutine 2: Shift bits

Public Sub shifter(s1)
    k = 0
    For j = 1 To s1
        k = k + 1
        '*** register 1 ***
        '*** calculate feedback bit ***
        xone1 = x(1, 1)
        For i = 2 To b(1)
            If g(1, i) = 1 Then xone1 = xone1 Xor x(1, i)
        Next
        '*** shift ***
        For i = b(1) To 2 Step -1
            x(1, i) = x(1, i - 1)
        Next
        x(1, 1) = xone1


        '*** put bit in next reg. ***
        h1 = 0: h2 = 0
        For i = 1 To 4
            If x(1, d(1, i)) = 1 Then h1 = h1 + 2 ^ (4 - i)
            If x(2, e(2, i)) = 1 Then h2 = h2 + 2 ^ (4 - i)
        Next
        x(2, c(2, h2)) = x(2, c(2, h2)) Xor x(1, c(1, h1))
        '*** register 2 ***
        '*** calculate feedback bit ***
        xone2 = x(2, 1)
        For i = 2 To b(2)
            If g(2, i) = 1 Then xone2 = xone2 Xor x(2, i)
        Next
        '*** shift ***
        For i = b(2) To 2 Step -1
            x(2, i) = x(2, i - 1)
        Next
        x(2, 1) = xone2
        '*** put bit in next reg. ***
        h1 = 0: h2 = 0
        For i = 1 To 4
            If x(2, d(2, i)) = 1 Then h1 = h1 + 2 ^ (4 - i)
            If x(3, e(3, i)) = 1 Then h2 = h2 + 2 ^ (4 - i)
        Next
        x(3, c(3, h2)) = x(3, c(3, h2)) Xor x(2, c(2, h1))
        '*** register 3 ***
        '*** calculate feedback bit ***
        xone3 = x(3, 1)
        For i = 2 To b(3)
            If g(3, i) = 1 Then xone3 = xone3 Xor x(3, i)
        Next
        '*** shift ***
        For i = b(3) To 2 Step -1
            x(3, i) = x(3, i - 1)
        Next
        x(3, 1) = xone3
        '*** put bit in next reg. ***
        h1 = 0: h2 = 0
        For i = 1 To 4
            If x(3, d(3, i)) = 1 Then h1 = h1 + 2 ^ (4 - i)
            If x(4, e(4, i)) = 1 Then h2 = h2 + 2 ^ (4 - i)
        Next
        x(4, c(4, h2)) = x(4, c(4, h2)) Xor x(3, c(3, h1))


        '*** register 4 ***
        '*** calculate feedback bit ***
        xone4 = x(4, 1)
        For i = 2 To b(4)
            If g(4, i) = 1 Then xone4 = xone4 Xor x(4, i)
        Next
        '*** shift ***
        For i = b(4) To 2 Step -1
            x(4, i) = x(4, i - 1)
        Next
        x(4, 1) = xone4
        '*** put bit in next reg. ***
        h1 = 0: h2 = 0
        For i = 1 To 4
            If x(4, d(4, i)) = 1 Then h1 = h1 + 2 ^ (4 - i)
            If x(5, e(5, i)) = 1 Then h2 = h2 + 2 ^ (4 - i)
        Next
        x(5, c(5, h1)) = x(5, c(5, h1)) Xor x(4, c(4, h2))
        '*** register 5 ***
        '*** calculate feedback bit ***
        xone5 = x(5, 1)
        For i = 2 To b(5)
            If g(5, i) = 1 Then xone5 = xone5 Xor x(5, i)
        Next
        '*** shift ***
        For i = b(5) To 2 Step -1
            x(5, i) = x(5, i - 1)
        Next
        x(5, 1) = xone5
        '*** put bit in next reg. ***
        h1 = 0: h2 = 0
        For i = 1 To 4
            If x(5, d(5, i)) = 1 Then h1 = h1 + 2 ^ (4 - i)
            If x(1, e(1, i)) = 1 Then h2 = h2 + 2 ^ (4 - i)
        Next
        x(1, c(1, h2)) = x(1, c(1, h2)) Xor x(5, c(5, h1))
        '*** calculate output bit ***
        h3 = x(2, f(1)) * 2 ^ 0 + x(3, f(2)) * 2 ^ 1 + x(4, f(3)) * 2 ^ 2
        Select Case h3
            Case 0, 1: ks(k) = x(1, b(1))
            Case 2: ks(k) = x(2, b(2))
            Case 3, 4: ks(k) = x(3, b(3))
            Case 5: ks(k) = x(4, b(4))


            Case 6, 7: ks(k) = x(5, b(5))
        End Select
    Next
End Sub

Subroutine 3: Test stream

Public Sub test(n)
    '*** frequency test ***
    n0 = 0: n1 = 0
    For i = 1 To n
        If ks(i) = 0 Then n0 = n0 + 1
        If ks(i) = 1 Then n1 = n1 + 1
    Next
    test1 = (n0 - n1) ^ 2 / n
    If test1 < 3.8415 Then
        Form1.Text4(0).Text = Form1.Text4(0).Text + " pass"
    Else
        Form1.Text4(0).Text = Form1.Text4(0).Text + " fail"
    End If
    '*** Serial test ***
    n0 = 0: n1 = 0
    n00 = 0: n01 = 0: n10 = 0: n11 = 0: t = ""
    For i = 1 To n
        If ks(i) = 0 Then n0 = n0 + 1
        If ks(i) = 1 Then n1 = n1 + 1
        t = t + Mid(Str(ks(i)), 2, 1)
    Next
    For i = 1 To n - 1
        t1 = Mid(t, i, 2)
        If t1 = "00" Then n00 = n00 + 1
        If t1 = "01" Then n01 = n01 + 1
        If t1 = "10" Then n10 = n10 + 1
        If t1 = "11" Then n11 = n11 + 1
    Next
    test2 = (4 / (n - 1) * (n00 ^ 2 + n01 ^ 2 + n10 ^ 2 + n11 ^ 2)) - (2 / n) * (n0 ^ 2 + n1 ^ 2) + 1
    If test2 < 5.9915 Then


        Form1.Text4(1).Text = Form1.Text4(1).Text + " pass"
    Else
        Form1.Text4(1).Text = Form1.Text4(1).Text + " fail"
    End If
    '*** Poker test ***
    bm = 3 ' length of block
    f1 = Int(n / bm)
    For i = 1 To n
        t = t + Mid(Str(ks(i)), 2, 1)
    Next
    n000 = 0: n001 = 0: n010 = 0: n011 = 0
    n100 = 0: n101 = 0: n110 = 0: n111 = 0
    For i = 1 To n Step bm
        t1 = Mid(t, i, bm)
        If t1 = "000" Then n000 = n000 + 1
        If t1 = "001" Then n001 = n001 + 1
        If t1 = "010" Then n010 = n010 + 1
        If t1 = "011" Then n011 = n011 + 1
        If t1 = "100" Then n100 = n100 + 1
        If t1 = "101" Then n101 = n101 + 1
        If t1 = "110" Then n110 = n110 + 1
        If t1 = "111" Then n111 = n111 + 1
    Next
    test3 = (2 ^ bm / f1) * (n000 ^ 2 + n001 ^ 2 + n010 ^ 2 + n011 ^ 2 + n100 ^ 2 + n101 ^ 2 + n110 ^ 2 + n111 ^ 2) - f1
    If test3 < 14.0671 Then
        Form1.Text4(2).Text = Form1.Text4(2).Text + " pass"
    Else
        Form1.Text4(2).Text = Form1.Text4(2).Text + " fail"
    End If
    '*** Run test ***
    Dim e1(20)
    b1 = 0: b2 = 0: b3 = 0: g1 = 0: g2 = 0: g3 = 0
    k = 3: p = 0: t = ks(1): p = 1
    For i = 2 To n
        If ks(i) = t Then
            p = p + 1
        Else

            ' A run of length p has just ended; ks(i) is the first bit of the next run,
            ' so ks(i) = 0 means the finished run consisted of 1s (counted in b1..b3)
            If ks(i) = 0 And p = 1 Then b1 = b1 + 1
            If ks(i) = 0 And p = 2 Then b2 = b2 + 1
            If ks(i) = 0 And p = 3 Then b3 = b3 + 1
            If ks(i) = 1 And p = 1 Then g1 = g1 + 1
            If ks(i) = 1 And p = 2 Then g2 = g2 + 1
            If ks(i) = 1 And p = 3 Then g3 = g3 + 1
            t = ks(i): p = 1
        End If
    Next
    ' Count the final run. Here i = n + 1, so ks(i - 1) is the value of the run itself;
    ' the comparisons are therefore reversed to match the counting inside the loop
    If ks(i - 1) = 1 And p = 1 Then b1 = b1 + 1
    If ks(i - 1) = 1 And p = 2 Then b2 = b2 + 1
    If ks(i - 1) = 1 And p = 3 Then b3 = b3 + 1
    If ks(i - 1) = 0 And p = 1 Then g1 = g1 + 1
    If ks(i - 1) = 0 And p = 2 Then g2 = g2 + 1
    If ks(i - 1) = 0 And p = 3 Then g3 = g3 + 1
    For i = 1 To k
        e1(i) = (n - i + 3) / 2 ^ (i + 2)   ' expected number of runs of length i
    Next
    test4 = (((b1 - e1(1)) ^ 2 / e1(1)) + ((b2 - e1(2)) ^ 2 / e1(2)) + ((b3 - e1(3)) ^ 2 / e1(3))) + (((g1 - e1(1)) ^ 2 / e1(1)) + ((g2 - e1(2)) ^ 2 / e1(2)) + ((g3 - e1(3)) ^ 2 / e1(3)))
    If test4 < 9.4877 Then          ' chi-square critical value, 4 d.f., 5% level
        Form1.Text4(3).Text = Form1.Text4(3).Text + " pass"
    Else
        Form1.Text4(3).Text = Form1.Text4(3).Text + " fail"
    End If
    '*** Autocorrelation test ***
    d1 = 8: ad = 0
    For i = 1 To n - d1             ' ks() is filled from index 1, so the sum starts at 1
        ad = ad + (ks(i) Xor ks(i + d1))
    Next
    test5 = 2 * (ad - (n - d1) / 2) / Sqr(n - d1)
    If Abs(test5) < 1.96 Then       ' two-sided normal test, 5% level
        Form1.Text4(4).Text = Form1.Text4(4).Text + " pass"
    Else
        Form1.Text4(4).Text = Form1.Text4(4).Text + " fail"
    End If
End Sub
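The five statistics computed by Subroutine 3 can also be written compactly outside Visual Basic. The following is a minimal Python sketch, not the thesis program: the function names and the `run_all` helper are hypothetical, the key stream is assumed to be a plain list of 0/1 integers, the runs test counts runs of 1s and runs of 0s separately up to length 3, and the autocorrelation test is treated as two-sided. The thresholds are the ones used in the listing.

```python
import math


def frequency_test(bits):
    """(n0 - n1)^2 / n, the statistic of the frequency test."""
    n1 = sum(bits)
    n0 = len(bits) - n1
    return (n0 - n1) ** 2 / len(bits)


def serial_test(bits):
    """Statistic built from the counts of overlapping pairs 00, 01, 10, 11."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(bits, bits[1:]):
        pairs[(a, b)] += 1
    squares = sum(c * c for c in pairs.values())
    return 4 / (n - 1) * squares - 2 / n * (n0 ** 2 + n1 ** 2) + 1


def poker_test(bits, m=3):
    """Statistic over k = n // m non-overlapping blocks of m bits."""
    k = len(bits) // m
    counts = {}
    for i in range(k):
        block = tuple(bits[i * m:(i + 1) * m])
        counts[block] = counts.get(block, 0) + 1
    return (2 ** m / k) * sum(c * c for c in counts.values()) - k


def runs_test(bits, k=3):
    """Compares the number of runs of length 1..k with their expected number."""
    n = len(bits)
    ones = [0] * (k + 1)   # runs of 1s, indexed by length
    zeros = [0] * (k + 1)  # runs of 0s, indexed by length

    def close(value, length):
        if length <= k:    # longer runs are ignored, as in the listing
            (ones if value == 1 else zeros)[length] += 1

    value, length = bits[0], 1
    for b in bits[1:]:
        if b == value:
            length += 1
        else:
            close(value, length)
            value, length = b, 1
    close(value, length)   # the final run

    stat = 0.0
    for i in range(1, k + 1):
        e = (n - i + 3) / 2 ** (i + 2)
        stat += (ones[i] - e) ** 2 / e + (zeros[i] - e) ** 2 / e
    return stat


def autocorrelation_test(bits, d=8):
    """Normalised count of positions where the stream differs from its d-shift."""
    n = len(bits)
    a = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    return 2 * (a - (n - d) / 2) / math.sqrt(n - d)


def run_all(bits):
    """Pass/fail per test, using the thresholds that appear in Subroutine 3."""
    return {
        "frequency": frequency_test(bits) < 3.8415,
        "serial": serial_test(bits) < 5.9915,
        "poker": poker_test(bits) < 14.0671,
        "runs": runs_test(bits) < 9.4877,
        "autocorrelation": abs(autocorrelation_test(bits)) < 1.96,
    }
```

Calling `run_all(ks)` on a list of key-stream bits returns a dictionary with one boolean per test; the stream needs more than 8 bits for the autocorrelation lag used here.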

Subroutine 4: Convert text to Huffman code

Public Sub text_to_binary_huffman()
    t5 = Form1.Text1.Text
    t1 = Format(Len(Form1.Text1.Text), "000") + Form1.Text1.Text
    q3 = ""
    Call huffarray
    ' Replace each character by its code word from the huff() table
    For i = 1 To Len(t5)
        xt = Mid(t5, i, 1)
        For j = 1 To 36
            If huff(1, j) = xt Then q3 = q3 + huff(2, j): Exit For
        Next
    Next
    hidelen = Len(q3)
    ' Build a 10-bit binary header that holds the payload length in bits
    s = Len(q3)
    s1 = Space(10)
    For j = 1 To 10
        t4 = s Mod 2
        s = Int(s / 2)
        Mid(s1, 11 - j, 1) = Mid(Str(t4), 2, 1)
    Next j
    q3 = s1 + q3
    hidelen = Len(q3)
    ' Copy the bit string (header + payload) into the plain-text array m()
    k = 0: coun = 0
    For i = 0 To hidelen
        k = k + 1
        m(k) = Val(Mid(q3, i + 1, 1))
    Next
    hidelen = k - 1
End Sub
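The framing used by Subroutine 4 (a 10-bit length header followed by the concatenated code words) can be sketched in Python as follows. This is an illustration under stated assumptions: `HUFF` holds only the first six entries of the huff() table, `text_to_bits` is a hypothetical name, and unlike the listing, which silently skips characters that are not in the table, this sketch raises an error for them.

```python
# First six entries of the huff() table from Subroutine 8 (subset for illustration).
HUFF = {
    " ": "000",
    "\u0627": "001",   # alef
    "\u0644": "0100",  # lam
    "\u064a": "0101",  # yeh
    "\u0645": "0110",  # meem
    "\u0648": "0111",  # waw
}


def text_to_bits(text, table=HUFF):
    """Return [10-bit length header] + [code words] as a list of 0/1 integers."""
    payload = "".join(table[ch] for ch in text)  # KeyError for unknown characters
    if len(payload) >= 2 ** 10:
        raise ValueError("payload does not fit a 10-bit length header")
    header = format(len(payload), "010b")        # most significant bit first
    return [int(b) for b in header + payload]
```

A 10-bit header limits the payload to 1023 bits, which is why the sketch checks the length before framing.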

Subroutine 5: Convert Huffman code to text

Public Sub binary_to_text_huffman()
    ' Read the 10-bit header: payload length in bits
    t1 = 0
    For j = 0 To 9
        t1 = t1 + (2 ^ (9 - j) * m(j + 1))
    Next j
    bitlen = t1
    ' Rebuild the payload as a string of "0"/"1"
    t1 = ""
    For i = 11 To bitlen + 10
        t1 = t1 + Mid(Str(m(i)), 2, 1)
    Next i
    Call huffarray
    ' Try code words of growing length, starting from the shortest (3 bits)
    k = 1: c1 = 3: p = 0
    Do
        xt = Mid(t1, k, c1)
        For j = 1 To 36
            If huff(2, j) = xt Then st = st + huff(1, j): p = 1: Exit For
        Next
        If p = 1 Then
            k = k + c1
            p = 0: c1 = 3
            If k > bitlen Then Exit Do
        Else
            c1 = c1 + 1
        End If
    Loop
End Sub

Subroutine 6: Encipher process

Public Sub encipher()
    ' Stream cipher: each plain-text bit is XORed with the matching key-stream bit
    For i = 1 To hidelen
        ci(i) = m(i) Xor ks(i)
    Next
End Sub
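Subroutines 5 and 6 can be illustrated together with a short Python sketch. It assumes the same six-entry subset of the huff() table as an example, reads the 10-bit header, and decodes by trying code words of growing length, which works because no code word is a prefix of another. The names `bits_to_text` and `xor_stream` are hypothetical; where the listing would loop without end on an undecodable payload, this sketch raises an error instead.

```python
# Subset of the huff() table from Subroutine 8, used only as an example.
HUFF = {
    " ": "000",
    "\u0627": "001",   # alef
    "\u0644": "0100",  # lam
    "\u064a": "0101",  # yeh
    "\u0645": "0110",  # meem
    "\u0648": "0111",  # waw
}
DECODE = {code: ch for ch, code in HUFF.items()}


def bits_to_text(bits, decode=DECODE, min_len=3, max_len=7):
    """Read the 10-bit length header, then decode the payload code by code."""
    length = int("".join(map(str, bits[:10])), 2)
    payload = "".join(map(str, bits[10:10 + length]))
    out, k = [], 0
    while k < length:
        for c in range(min_len, max_len + 1):   # grow the window until a match
            ch = decode.get(payload[k:k + c])
            if ch is not None:
                out.append(ch)
                k += c
                break
        else:
            raise ValueError("no code word matches at bit %d" % k)
    return "".join(out)


def xor_stream(bits, keystream):
    """Bitwise XOR with the key stream; applying it twice restores the input."""
    return [b ^ k for b, k in zip(bits, keystream)]
```

Because XOR is its own inverse, the same `xor_stream` call serves for both enciphering and deciphering, which mirrors the symmetry of the encipher and decipher subroutines.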

Subroutine 7: Decipher process

Public Sub decipher()
    For i = 1 To hidelen
        m(i) = ci(i) Xor ks(i)
    Next
End Sub

Subroutine 8: Create Huffman array

Public Sub huffarray()
    huff(1, 1) = " ": huff(2, 1) = "000"
    huff(1, 2) = "ا": huff(2, 2) = "001"
    huff(1, 3) = "ل": huff(2, 3) = "0100"
    huff(1, 4) = "ي": huff(2, 4) = "0101"
    huff(1, 5) = "م": huff(2, 5) = "0110"
    huff(1, 6) = "و": huff(2, 6) = "0111"
    huff(1, 7) = "ت": huff(2, 7) = "10000"
    huff(1, 8) = "ن": huff(2, 8) = "10001"
    huff(1, 9) = "ر": huff(2, 9) = "10010"
    huff(1, 10) = "ف": huff(2, 10) = "10011"
    huff(1, 11) = "ة": huff(2, 11) = "101000"
    huff(1, 12) = "ع": huff(2, 12) = "101001"
    huff(1, 13) = "ه": huff(2, 13) = "101010"
    huff(1, 14) = "ب": huff(2, 14) = "101011"
    huff(1, 15) = "س": huff(2, 15) = "101100"
    huff(1, 16) = "ق": huff(2, 16) = "101101"
    huff(1, 17) = "ك": huff(2, 17) = "101110"
    huff(1, 18) = "د": huff(2, 18) = "101111"
    huff(1, 19) = "أ": huff(2, 19) = "110000"
    huff(1, 20) = "ح": huff(2, 20) = "110001"
    huff(1, 21) = "ش": huff(2, 21) = "110010"
    huff(1, 22) = "ص": huff(2, 22) = "110011"
    huff(1, 23) = "ى": huff(2, 23) = "110100"
    huff(1, 24) = "ذ": huff(2, 24) = "110101"
    huff(1, 25) = "ج": huff(2, 25) = "110110"
    huff(1, 26) = "خ": huff(2, 26) = "110111"
    huff(1, 27) = "إ": huff(2, 27) = "111000"
    huff(1, 28) = "ط": huff(2, 28) = "111001"
    huff(1, 29) = "ث": huff(2, 29) = "111010"
    huff(1, 30) = "ض": huff(2, 30) = "111011"

    huff(1, 31) = "ظ": huff(2, 31) = "111100"
    huff(1, 32) = "ئ": huff(2, 32) = "111101"
    huff(1, 33) = "غ": huff(2, 33) = "1111100"
    huff(1, 34) = "ز": huff(2, 34) = "1111101"
    huff(1, 35) = "ء": huff(2, 35) = "1111110"
    huff(1, 36) = "ؤ": huff(2, 36) = "1111111"
End Sub

Subroutine 9: Hide process (Unicode method)

Public Sub hide_text1()
    Call asc_to_uni
    k1 = 0
    ' doc is the Word application object (the listing wrote x here, which is the register array)
    For i = 1 To doc.ActiveDocument.Words.Count
        With doc.ActiveDocument.Words(i)
            t = Trim(.Text)
            X1 = ""
            For j = 1 To Len(t)
                X2 = Mid(t, j, 1)
                x3 = Mid(t, j + 1, 1)
                ' A position can carry a bit when the letter is drawn in its isolated form:
                ' a non-connecting letter at the start of the word, a non-connecting letter
                ' after another one, or the last letter after a non-connecting letter
                If (j = 1 And (X2 = "ا" Or X2 = "أ" Or X2 = "د" Or X2 = "ذ" Or X2 = "ر" _
                    Or X2 = "ز" Or X2 = "و")) Or ((X1 = "ا" Or X1 = "أ" Or X1 = "د" Or _
                    X1 = "ذ" Or X1 = "ر" Or X1 = "ز" Or X1 = "و") And (X2 = "ا" Or _
                    X2 = "أ" Or X2 = "د" Or X2 = "ذ" Or X2 = "ر" Or X2 = "ز" Or _
                    X2 = "و")) Or (j = Len(t) And (X1 = "ا" Or X1 = "أ" Or X1 = "د" Or _
                    X1 = "ذ" Or X1 = "ر" Or X1 = "ز" Or X1 = "و")) Then
                    k1 = k1 + 1
                    If ci(k1) = 1 Then
                        ' Bit 1: swap the letter for its presentation-form code
                        h1 = AscW(.Characters(j).Text)
                        For k2 = 1 To 36
                            If h1 = ascii(k2) Then
                                .Characters(j).Text = ChrW(unicode(k2))
                                Exit For
                            End If
                        Next k2
                        GoTo f
                    End If
                End If

                X1 = X2
            Next j
        End With
f:
    Next i
End Sub

Subroutine 10: Unhide process (Unicode method)

Public Sub unhide_text1()
    Call asc_to_uni
    k1 = 0
    For i = 1 To doc.ActiveDocument.Words.Count
        With doc.ActiveDocument.Words(i)
            t = Trim(.Text)
            ' AscW returns a signed 16-bit Integer, so 2 ^ 16 is added to
            ' recover the presentation-form codes, which lie above 32767
            If Len(t) = 1 Then
                For k = 1 To 36
                    If 2 ^ 16 + AscW(t) = unicode(k) Then
                        k1 = k1 + 1: ci(k1) = 1
                        GoTo f
                    End If
                Next k
            End If
            If Len(t) = 2 Then
                For k = 1 To 36
                    If 2 ^ 16 + AscW(Left(t, 1)) = unicode(k) Then
                        k1 = k1 + 1: ci(k1) = 1
                        k1 = k1 + 1: ci(k1) = 1
                        i = i + 1: GoTo f
                    End If
                Next k
            End If
            If Len(t) = 3 Then
                For k = 1 To 36
                    If 2 ^ 16 + AscW(Left(t, 1)) = unicode(k) Then
                        k1 = k1 + 1: ci(k1) = 1
                        k1 = k1 + 1: ci(k1) = 1
                        k1 = k1 + 1: ci(k1) = 1
                        GoTo f

                    End If
                Next k
            End If
            X1 = ""
            For j = 1 To Len(t)
                X2 = Mid(t, j, 1)
                x3 = Mid(t, j + 1, 1)
                ' Same candidate test as in hide_text1; an unchanged letter carries bit 0
                If (j = 1 And (X2 = "ا" Or X2 = "أ" Or X2 = "د" Or X2 = "ذ" Or X2 = "ر" _
                    Or X2 = "ز" Or X2 = "و")) Or ((X1 = "ا" Or X1 = "أ" Or _
                    X1 = "د" Or X1 = "ذ" Or X1 = "ر" Or X1 = "ز" Or _
                    X1 = "و") And (X2 = "ا" Or X2 = "أ" Or X2 = "د" Or _
                    X2 = "ذ" Or X2 = "ر" Or X2 = "ز" Or X2 = "و")) _
                    Or (j = Len(t) And (X1 = "ا" Or X1 = "أ" Or X1 = "د" Or _
                    X1 = "ذ" Or X1 = "ر" Or X1 = "ز" Or X1 = "و")) Then
                    k1 = k1 + 1: ci(k1) = 0
                End If
                X1 = X2
            Next j
        End With
f:
    Next i
End Sub

Subroutine 11: Lookup table between ASCII code and Unicode

Public Sub asc_to_uni()
    ' ascii() holds the basic Arabic letter code points (1569 to 1610, i.e. U+0621 to U+064A);
    ' unicode() holds the matching isolated presentation forms (65152 onwards, i.e. from U+FE80)
    ascii(1) = 1569: unicode(1) = 65152
    ascii(2) = 1570: unicode(2) = 65153
    ascii(3) = 1571: unicode(3) = 65155
    ascii(4) = 1572: unicode(4) = 65157
    ascii(5) = 1573: unicode(5) = 65159
    ascii(6) = 1574: unicode(6) = 65161
    ascii(7) = 1575: unicode(7) = 65165
    ascii(8) = 1576: unicode(8) = 65167
    ascii(9) = 1577: unicode(9) = 65171
    ascii(10) = 1578: unicode(10) = 65173
    ascii(11) = 1579: unicode(11) = 65177
    ascii(12) = 1580: unicode(12) = 65181

    ascii(13) = 1581: unicode(13) = 65185
    ascii(14) = 1582: unicode(14) = 65189
    ascii(15) = 1583: unicode(15) = 65193
    ascii(16) = 1584: unicode(16) = 65195
    ascii(17) = 1585: unicode(17) = 65197
    ascii(18) = 1586: unicode(18) = 65199
    ascii(19) = 1587: unicode(19) = 65201
    ascii(20) = 1588: unicode(20) = 65205
    ascii(21) = 1589: unicode(21) = 65209
    ascii(22) = 1590: unicode(22) = 65213
    ascii(23) = 1591: unicode(23) = 65217
    ascii(24) = 1592: unicode(24) = 65221
    ascii(25) = 1593: unicode(25) = 65225
    ascii(26) = 1594: unicode(26) = 65229
    ascii(27) = 1601: unicode(27) = 65233
    ascii(28) = 1602: unicode(28) = 65237
    ascii(29) = 1603: unicode(29) = 65241
    ascii(30) = 1604: unicode(30) = 65245
    ascii(31) = 1605: unicode(31) = 65249
    ascii(32) = 1606: unicode(32) = 65253
    ascii(33) = 1607: unicode(33) = 65257
    ascii(34) = 1608: unicode(34) = 65261
    ascii(35) = 1609: unicode(35) = 65263
    ascii(36) = 1610: unicode(36) = 65265
End Sub

Variable Definition

Public x(5, 50) As Byte ' Register bits
Public g(5, 50) As Byte ' Feedback bits
Public b(5) As Byte ' Register length
Public c(5, 16) As Byte ' Locations for transition
Public d(5, 4) As Byte ' Locations of address * up
Public e(5, 4) As Byte ' Locations of address * down
Public f(3) As Byte ' Locations for multiplexer selector
Public ks(5000) ' Key stream
Public tim As String ' Timer
Public m(5000) As Byte ' Plain text
Public ci(5000) As Byte ' Cipher text
Public hidelen As Integer ' Length of data to cipher
Public timloc(7) As Byte ' Timer locations
Public huff(2, 50) ' Huffman array
Public ascii(40), unicode(40) ' ASCII code and Unicode tables
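The idea behind Subroutines 9 to 11 can be sketched in Python on a plain string instead of a Word document. This is a simplified illustration, not the thesis procedure: `hide`, `extract` and `is_candidate` are hypothetical names, words are split on single spaces, every candidate position carries exactly one bit (the listing instead jumps to the next word after writing a 1), the receiver is assumed to know the number of hidden bits, and the cover text is assumed to contain no presentation forms beforehand. The two code tables are copied from Subroutine 11.

```python
# Basic Arabic letters and their isolated presentation forms (Subroutine 11).
BASE = list(range(1569, 1595)) + list(range(1601, 1611))
ISOLATED = [
    65152, 65153, 65155, 65157, 65159, 65161, 65165, 65167, 65171,
    65173, 65177, 65181, 65185, 65189, 65193, 65195, 65197, 65199,
    65201, 65205, 65209, 65213, 65217, 65221, 65225, 65229, 65233,
    65237, 65241, 65245, 65249, 65253, 65257, 65261, 65263, 65265,
]
TO_ISOLATED = {chr(b): chr(u) for b, u in zip(BASE, ISOLATED)}
TO_BASE = {iso: base for base, iso in TO_ISOLATED.items()}

# The seven non-connecting letters tested in hide_text1:
# alef, alef with hamza above, dal, thal, reh, zain, waw.
NON_CONNECTING = set("\u0627\u0623\u062f\u0630\u0631\u0632\u0648")


def is_candidate(word, j):
    """Three-part condition of hide_text1, with a 0-based index j."""
    cur = word[j]
    prev = word[j - 1] if j > 0 else ""
    return ((j == 0 and cur in NON_CONNECTING)
            or (prev in NON_CONNECTING and cur in NON_CONNECTING)
            or (j == len(word) - 1 and prev in NON_CONNECTING))


def hide(cover, bits):
    """Bit 1: replace the letter by its isolated form. Bit 0: leave it alone."""
    bits = list(bits)
    k = 0
    out_words = []
    for word in cover.split(" "):
        chars = list(word)
        for j in range(len(word)):
            if k < len(bits) and is_candidate(word, j) and word[j] in TO_ISOLATED:
                if bits[k] == 1:
                    chars[j] = TO_ISOLATED[word[j]]
                k += 1
        out_words.append("".join(chars))
    if k < len(bits):
        raise ValueError("cover text too short for the payload")
    return " ".join(out_words)


def extract(stego, n_bits):
    """Recover n_bits by checking which candidate letters were replaced."""
    bits = []
    for word in stego.split(" "):
        base = "".join(TO_BASE.get(c, c) for c in word)  # undo the substitution
        for j, c in enumerate(word):
            if len(bits) == n_bits:
                return bits
            if is_candidate(base, j) and base[j] in TO_ISOLATED:
                bits.append(1 if c in TO_BASE else 0)
    return bits
```

Each substitution replaces one character by one character, so the stego string has the same length as the cover string, which is consistent with the thesis's observation that the file size does not change.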

Appendix C Expert’s and Learner’s Questionnaire Forms

Expert’s Questionnaire Form

University of Technology
Technical Education Dept.
Electrical Engineering Section

Dear Sir ……………………

Please answer the items of this questionnaire, which concerns (Hiding information in Arabic text), by putting (*) in the field that you consider suitable.

Thank you in advance for your cooperation.

Auday Jamal
The researcher

No. Item Large Medium Little

1 The instructions for using the package are simple.

2 The density of the information displayed on the computer screen is suitable.

3 The information displayed in the package is scientifically suitable.

4 The design of the instructional package takes individual differences into account.

5 The titles of the items are clear.

6 The display style of the package is limiting.

7 The images attached to the instructional units help in understanding the concepts.

8 The questions posed in the program are understandable.

9 The colors used in the package are suitable.

10 The questions cover all items in the instructional package.

11 The language style used to explain the scientific concepts and information is clear.

Learner’s Questionnaire Form

University of Technology
Technical Education Dept.
Electrical Engineering Section

Dear learner ………………….

Please answer the items of this questionnaire, which concerns (Hiding information in Arabic text), by putting (*) in the field that you consider suitable.

Thank you in advance for your cooperation.

Auday Jamal
The researcher

No. Item Large Medium Little

1 Dividing the subject of the instructional package into five typical units helped to increase your understanding of the package.

2 The scientific concepts displayed in the package were simple.

3 The information displayed in the package was clear.

4 The flowchart helped to increase understanding of the scientific information.

5 The harmony between the displayed image and the related information was fine.

6 Moving between the screens of the instructional package was simple.

7 English is better than Arabic for displaying the instructional package materials.

8 The language style used to explain the scientific concepts is understandable.

9 The use of colors clarified the displayed concepts.

10 The titles of the instructional package items are clear.

11 Immediate support for the answer increased your desire to continue with the package.

12 The package increases your desire to learn.

Expert Names

No. Name Place of work

1 Dr. Krikor S. Krikor Technical Education Department

2 Dr. Sameera Abdulla Technical Education Department

3 Dr. Inaam A. Al-Sadik Technical Education Department

4 Dr. Ibtesam Raheem Karhiy Technical Education Department

5 Dr. Hosham Salim Technical Education Department

6 Dr. Sahar Radiy Technical Education Department

7 Dr. Intethar Institute of Instructor Composing

8 Ashuaq Kassem Technical Education Department

9 Asia Mohammad Institute of Technology

10 Nagham Ezat Institute of Technology

Abstract

In a world where information, and the way it travels across a network that spans the whole globe, has advanced, a method is needed to preserve the privacy and protection of information. This need motivated the present research, whose aim is to find a new technique for encrypting and hiding data in Arabic text files.

To achieve this aim, a computer program was prepared that encrypts textual information in order to hide it within files. Two types of files were taken into consideration: the first type comprises Document files and the second comprises Rich Text Format files. Both types are compatible with the Microsoft Word word processor.

This work focuses on building a program that encrypts Arabic text and then hides it using both the White Space Method and Word Shift Coding, two methods that deal with English text and are used here for hiding Arabic text. A further program was prepared to hide Arabic text by drawing on the idea of the extension (Extension) used in Arabic writing for the purpose of hiding data.

The above methods were applied to hide text, and it became clear that a more efficient method is still needed, given the high secrecy that the hiding process requires. A new hiding method that deals with Arabic text was therefore prepared and named the Unicode System method, since its hiding process relies on the code (Code) of the Arabic letters. It was then applied in practice to Arabic text. Its positive results showed that the size of the file after hiding the text is the same as its size before hiding, and that the file after hiding, when viewed in the word processor, matches the file before hiding completely, which makes it difficult for a person who intercepts the message to detect the hidden data.

Encryption and hiding are used by users of the worldwide information network, and this network deals with different operating systems, some of which may have difficulty displaying Document files. To ensure the smooth delivery of text between sender and receiver, it was therefore necessary to find another file type that all operating systems can recognize.

For this reason, Rich Text Format files were adopted for hiding data. The source code (Source Code) of the files was handled by a computer program prepared for hiding within the file.

Among the positive results obtained are that the file after hiding, when viewed in the word processor, matches the file before hiding completely, and that the amount of information that can be hidden can be multiplied. A drawback of this method, however, is that the file size grows as the amount of hidden information grows. To avoid this, a subprogram was prepared that compresses the file and reduces its size so that it is very close to the size of the original file.

Another aim of the research is to design and implement an instructional package based on the principles of instructional design, using the tutorial method to present the concepts and information related to the encryption and hiding processes. To ensure that the target group benefits from the instructional package, a questionnaire was prepared so that the package could be reviewed and evaluated by a group of expert professors and another group of students. Based on the questionnaire results, and with the help of the feedback process, the modifications required by the instructional package were made in order to link the practical and theoretical sides of the research.

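Republic of Iraq

Ministry of Higher Education and Scientific Research

University of Technology

Technical Education Department

Data Hiding in Arabic Text

A thesis submitted to

the Technical Education Department at the University of Technology

in partial fulfillment of the requirements for the degree of Doctor of Philosophy

in

Engineering Education Technology / Electrical Engineering

By

Auday Jamal Fawzi

Supervised by

Asst. Prof. Dr. Saleh M. Al-Karaawy    Prof. Dr. Shawket T. Al-Hiazay

2007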