introduction ns report


Upload: yasser-al-mohammed

Post on 06-Apr-2018


Page 1: Introduction NS Report

8/3/2019 Introduction NS Report

http://slidepdf.com/reader/full/introduction-ns-report 1/9

GROUP F: Break-SubstitutionCipher

Introduction:

Substitution ciphers are among the oldest and simplest methods for encrypting text. Historically, the substitution cipher (sometimes called a replacement cipher) has been used many times, and it provides a good illustration of basic cryptography. The name comes from the fact that each letter of the message we want to encrypt is substituted by another character (e.g. a letter or symbol). Someone who knows which plain-text character has been mapped to which cipher-text character can easily decrypt it.

For example, to encrypt the following:

Substitution Cipher
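
The letter-for-letter mapping described in the introduction can be sketched in C++ as follows. This is only an illustration: the key `kKey` and the function names `encrypt` and `decrypt` are made up for this example and are not taken from our program.

```cpp
#include <cctype>
#include <string>

// Hypothetical key: position i holds the cipher letter for plain letter 'a' + i.
const std::string kKey = "qwertyuiopasdfghjklzxcvbnm";

// Replace each letter by its key entry; non-letters pass through unchanged.
std::string encrypt(const std::string& plain, const std::string& key) {
    std::string out;
    for (unsigned char c : plain) {
        if (std::isalpha(c)) {
            const char mapped = key[std::tolower(c) - 'a'];
            // Keep the case of the original letter.
            out += std::isupper(c)
                       ? static_cast<char>(std::toupper(static_cast<unsigned char>(mapped)))
                       : mapped;
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Invert the key, then reuse encrypt() to map cipher letters back to plain letters.
std::string decrypt(const std::string& cipher, const std::string& key) {
    std::string inverse(26, '?');
    for (int i = 0; i < 26; ++i) inverse[key[i] - 'a'] = static_cast<char>('a' + i);
    return encrypt(cipher, inverse);
}
```

With this example key, `encrypt("Substitution Cipher", kKey)` gives "Lxwlzozxzogf Eohitk", and `decrypt` with the same key restores the original text.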


How it can be cracked:

Brute-Force:

Finding the key of a substitution cipher by brute force is practically impossible and a waste of both time and resources. The reason is simple arithmetic: with English letters only, the key space has 26! ≈ 2^88 keys, a huge number even if the attacker uses hundreds of powerful computers.
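
To make the size of the key space concrete, here is a small sketch that computes log2(26!) and the time needed to try every key. The function names and the rate of 10^12 keys per second are assumptions chosen only for illustration.

```cpp
#include <cmath>

// log2(n!) computed via lgamma(n + 1) = ln(n!), which avoids integer overflow.
double keySpaceBits(int alphabetSize) {
    return std::lgamma(alphabetSize + 1.0) / std::log(2.0);
}

// Years needed to try every key at an assumed rate of keys per second.
double yearsToExhaust(int alphabetSize, double keysPerSecond) {
    const double secondsPerYear = 365.25 * 24 * 3600;
    return std::exp2(keySpaceBits(alphabetSize)) / keysPerSecond / secondsPerYear;
}
```

For the 26-letter alphabet, `keySpaceBits(26)` is about 88.4, and at the assumed rate an exhaustive search would take on the order of ten million years.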

 

Letter Frequency Analysis:

This method exploits a major weakness of substitution ciphers: each plain-text symbol always maps to the same cipher-text symbol, so the statistical properties of the plain text are preserved in the cipher text. In English, 'E' is the most frequent letter (about 12.51%), 'T' comes second (9.25%), then 'A' (8.04%). How does that help? By simple observation we can find the most frequent symbol in the cipher text and replace it with the most frequent letter in English, and so on. The method can be generalized by testing pairs, triples, or even quadruples of letters. For instance, in English the letter Q is almost always followed by U. This behavior can be exploited to detect the substitutions of Q and U.

 

Letter  Frequency    Letter  Frequency
E       12.51%       M       2.53%
T       9.25%        F       2.30%
A       8.04%        P       2.00%
O       7.60%        G       1.96%
I       7.26%        W       1.92%
N       7.09%        Y       1.7
S       6.54%        B       1.54%
R       6.12%        V       0.99%
H       5.49%        K       0.67%
L       4.14%        X       0.19%
D       3.99%        J       0.16%
C       3.06%        Q       0.11%
U       2.71%        Z       0.09%
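
To compare a cipher text against the table above, the letters of the cipher text first have to be counted and ranked. A minimal sketch, assuming plain ASCII text; the names `countLetters` and `rankByFrequency` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Count how often each of the 26 letters occurs (case-insensitive).
std::array<int, 26> countLetters(const std::string& text) {
    std::array<int, 26> counts{};
    for (unsigned char c : text)
        if (std::isalpha(c)) ++counts[std::tolower(c) - 'a'];
    return counts;
}

// Letters that occur at least once, most frequent first.
// Letters with equal counts stay in alphabetical order.
std::string rankByFrequency(const std::string& text) {
    const auto counts = countLetters(text);
    std::string letters;
    for (int i = 0; i < 26; ++i)
        if (counts[i] > 0) letters += static_cast<char>('A' + i);
    std::stable_sort(letters.begin(), letters.end(), [&](char a, char b) {
        return counts[a - 'A'] > counts[b - 'A'];
    });
    return letters;
}
```

The first letter of the ranking is the candidate for 'E', the second for 'T', and so on down the table.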


Implementation:

First, a word on our tools: we implemented the project in C++ with Qt.

When we started planning the steps to follow, we tried not to look too deeply at the Internet or simply copy ideas from it. We tried to work almost 100% with our own way of implementation, so we started with the design of our project and tried to make it simple and easy to use.

Here is a screenshot of our first design:

After that we started working on it. Our plan was to take full advantage of letter frequency analysis ("LFA") first. So the first thing we did was count how often each letter occurs in the cipher text. For example, we pick the first letter, count how many times it is repeated, and store the count; then we do the same for all the other letters and keep the counts in variables. When that was finished, we relied on "LFA" to replace the most repeated letter with "E", the second with "T", the third with "A", the fourth with "O", and the fifth and last with "I". We ignored "N" so as to have only five. At first we thought this ordering would always hold, but we discovered that this is not always the case; we will talk about that later on.
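
The counting and replacing step just described might look roughly like the sketch below. It is not our actual Qt code: the function name `applyTopGuess` and the parameter `guessOrder` are hypothetical, and it assumes the cipher text is treated as upper-case while cracked letters are written in lower-case.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Replace the most frequent cipher letters with the guessed plain letters.
// Assumed convention: cipher letters are upper-case, cracked letters lower-case.
std::string applyTopGuess(std::string cipher, const std::string& guessOrder = "etaoi") {
    std::array<int, 26> counts{};
    for (char& c : cipher) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        if (c >= 'A' && c <= 'Z') ++counts[c - 'A'];
    }
    // Rank all 26 letters by count, most frequent first (ties stay alphabetical).
    std::string ranked = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::stable_sort(ranked.begin(), ranked.end(), [&](char a, char b) {
        return counts[a - 'A'] > counts[b - 'A'];
    });
    // Map the k-th most frequent cipher letter to the k-th guessed plain letter.
    std::array<char, 26> plainFor{};
    for (std::size_t k = 0; k < guessOrder.size() && k < ranked.size(); ++k)
        if (counts[ranked[k] - 'A'] > 0) plainFor[ranked[k] - 'A'] = guessOrder[k];
    for (char& c : cipher)
        if (c >= 'A' && c <= 'Z' && plainFor[c - 'A'] != 0) c = plainFor[c - 'A'];
    return cipher;
}
```

Passing a different `guessOrder`, such as "etoai", tries another ordering of the five letters without changing the rest of the code.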


After we replaced the five most repeated letters, we got a text in which only five letters are readable. Of course that was not enough, so we looked at what we had in order to proceed. We checked which correct words appeared after this operation and found these:

"Eat – ate – tea – tie …"

These words did not help much, so we looked for the most common three-letter words and found:

// the, and, for, are, but, not, you, all, any, can,
// had, her, was, one, our, out, day, get, has, him, his,
// how, man, new, now, old, see, two, way, who, boy,
// did, its, let, put, say, she, too, use

Then we tried to pick a word with exactly one missing letter, such as "The": we already have "T" and "E" and we want the "H", so only one letter is missing, which is perfect for us. We used regular expressions here to make the program find any word starting with "T", followed by an unknown letter, followed by "E", i.e. "T?e", where '?' represents the missing letter.

When we did that we got hits. Say that after replacing the big five letters we got "Tke": our program takes that word since it matches "T?e", and we ask the program for the letter in the second position, which is 'K' in this case, in order to replace EVERY 'K' in our partial plain text. Now we have 6 letters in our dictionary.

Then we took other words like this with one missing letter. But where before we knew only one letter of the word "Her", now we know two, so we are able to find the letter "R", add it to our dictionary, and use it to hunt for other words.
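
The pattern search above can be sketched with the standard `std::regex` class. This is an assumption for illustration, not our Qt code: the function names are hypothetical, and the pattern is written for text in which unknown letters are upper-case and cracked letters are lower-case, so "T?e" becomes `\bt([A-Z])e\b`.

```cpp
#include <algorithm>
#include <regex>
#include <string>

// Look for a word matching the pattern; its single capture group is the
// unknown cipher letter. Returns 0 if no word matches.
char findUnknownLetter(const std::string& text, const std::string& pattern) {
    std::smatch match;
    if (std::regex_search(text, match, std::regex(pattern)))
        return match[1].str()[0];
    return 0;
}

// Replace every occurrence of an unknown (upper-case) letter
// with the cracked (lower-case) letter.
std::string crackLetter(std::string text, char unknown, char plain) {
    std::replace(text.begin(), text.end(), unknown, plain);
    return text;
}
```

For the text "tKe Xat", `findUnknownLetter` with the pattern above returns 'K', and `crackLetter(text, 'K', 'h')` turns the text into "the Xat".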

But with the above technique we have to accept the risk of matching other words with the same regular expression: instead of finding "The" we may get "Tee", "Tie", or "Toe", so instead of finding the letter "H" we get 'e', 'i', or 'o'.

After that we still could not find all the letters using three-letter words, so we tried four-letter words, following the order of the most frequent letters in English, so that each new letter could be used over a wider area.

We did run into a real problem: if the cipher text does not contain the words we picked to check against, we have no chance of finding the letter. For example, if the cipher text comes with no "the", we will never find the letter "H" from it.

So first we tried to use the most common words, so that the probability of these words not occurring is almost ZERO. We also used more than one word to find each letter, to help make sure that at least one of the words is present and the letter gets found.
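
One possible way to combine several words for the same letter is to let every match vote and keep the letter with the most votes. The voting rule is an assumption made for this sketch, not a description of how our program decides; the name `voteForLetter` is hypothetical, and the patterns assume unknown letters are upper-case.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Tally the captured letter of every match of every pattern and return the
// letter with the most votes. Returns 0 when no pattern occurs in the text.
char voteForLetter(const std::string& text, const std::vector<std::string>& patterns) {
    std::map<char, int> votes;
    for (const std::string& p : patterns) {
        const std::regex re(p);
        for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
            ++votes[(*it)[1].str()[0]];
    }
    char best = 0;
    int bestVotes = 0;
    for (const auto& entry : votes) {
        if (entry.second > bestVotes) {
            best = entry.first;
            bestVotes = entry.second;
        }
    }
    return best;
}
```

For example, searching for "H" with the two patterns `\bt([A-Z])e\b` ("the") and `\bt([A-Z])at\b` ("that") still finds the letter when only one of the two words occurs in the text.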

Note: in the cracked text, upper-case letters represent unknown letters, and lower-case letters represent letters that came from us, i.e. letters that have been cracked.


After all that work we faced the problem we mentioned earlier: the order of the most repeated letters is not always the same. Now, should we change everything? If even one of the big five letters is out of order, we are toast. So we cannot bet on that order, which is why we let the user check whether the cracked text is readable, nice, and clean; if it is not, the user can click another button to change the order of the big five letters.

To help the user be sure of the real order, there is the benefit of the accuracy figure:

If the user clicks one of the buttons, the order changes as we mentioned and the accuracy number changes automatically, until you find the real order.

Every one of the buttons above calls a different function. We found no disorder in the "E" and "T" letters, so we only play with the rest.
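
The alternative orders behind the buttons can be thought of as permutations of the last three letters. The sketch below enumerates all of them; the function name `bigFiveOrders` is hypothetical, and the number of orders it produces is not necessarily the number of buttons in our program.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// All orderings of the big five that keep 'e' and 't' first and permute the rest.
std::vector<std::string> bigFiveOrders() {
    std::string rest = "aio";  // sorted, so next_permutation visits every ordering
    std::vector<std::string> orders;
    do {
        orders.push_back("et" + rest);
    } while (std::next_permutation(rest.begin(), rest.end()));
    return orders;
}
```

This yields six candidate orders, including the default "etaoi"; each one can be applied in turn while watching the accuracy number.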


Even so, there is no way to reach a hundred percent in all cases. In our case we found that, in order to recover almost 95% of the plain text, the user must enter at least 8-10 pages on average. We know that seems too long, but after many tries we found that this number of pages has a high probability of containing all the patterns we need for a high accuracy rate.

The last thing we added is a very powerful function that may lead to 99.99%. It may seem crude or old-fashioned, but it works like a charm.

Yes, the Replace function allows you to manually replace any letter with the letter you want. Sometimes the program does its job and gets you to 90%, and when you check the plain text you find the word "eQample" and instantly recognize it as "example". The program made all these changes but got stuck somehow and did not find the letter "X". So why not provide this function to change ALL "Q"s in the plain text to the right letter and make it 99% for you?

Accuracy:

To calculate the accuracy of the plain text we used a simple algorithm. After the program finishes replacing the letters it found, it calls a function that counts every lower-case letter in the plain text box, which indicates the letters we found. Then it counts the whole text, including upper-case and lower-case letters. Then it divides the number of lower-case letters by the number of all letters and multiplies the result by 100.
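
The accuracy calculation above can be sketched as follows. The function name `accuracyPercent` is hypothetical, and the sketch assumes that only letters are counted, so spaces, digits, and punctuation do not affect the result.

```cpp
#include <cctype>
#include <string>

// Percentage of letters that are lower-case (cracked).
// Returns 0 for text that contains no letters at all.
double accuracyPercent(const std::string& text) {
    int cracked = 0;
    int letters = 0;
    for (unsigned char c : text) {
        if (!std::isalpha(c)) continue;  // skip spaces, digits, punctuation
        ++letters;
        if (std::islower(c)) ++cracked;
    }
    return letters == 0 ? 0.0 : 100.0 * cracked / letters;
}
```

For the word "eQample", six of the seven letters are cracked, so the function returns about 85.7.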


The final design:


Screenshots:

We hope for a good mark and good luck, and wish the best for all of us.

Here we tried a cipher text: we put it in its place and then pressed "1", but the result came out at just 82%. Note: "@" means "a". We got a weird error when we tried to use "a", so we did that instead.
