introduction ns report
8/3/2019 Introduction NS Report
http://slidepdf.com/reader/full/introduction-ns-report 1/9
Group F: Breaking the Substitution Cipher
Introduction:
Substitution ciphers are among the oldest and simplest methods for encrypting text. Historically, the substitution cipher (sometimes called a replacement cipher) has been used many times, and it provides a good illustration
of basic cryptography. The name comes from the fact that each letter of the message we want to encrypt is substituted by another character (e.g. a letter or symbol). Someone who knows which plaintext character
has been mapped to which ciphertext character can easily decrypt the message.
For example, to encrypt the following:
Substitution Cipher
How it can be cracked:
Brute-Force:
Finding the key for a substitution cipher by brute force is practically impossible and a waste of both time and
resources. The reason is simple arithmetic: if only the 26 English letters are used, the key space is 26! ≈ 2^88,
which is a huge number even if the attacker uses hundreds of powerful computers.
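The size of the key space can be checked with a few lines of C++. This is only a sketch: the function names and the search rate passed to yearsToExhaust are illustrative assumptions, not part of our program.

```cpp
#include <cassert>
#include <cmath>

// Returns log2(n!) by summing log2(k) for k = 2..n, which avoids
// overflowing an integer type with the factorial itself.
double log2Factorial(int n) {
    assert(n >= 0);
    double bits = 0.0;
    for (int k = 2; k <= n; ++k) {
        bits += std::log2(static_cast<double>(k));
    }
    return bits;
}

// Rough number of years needed to try every key of an alphabet of the
// given size. keysPerSecond is a hypothetical rate chosen by the caller.
double yearsToExhaust(int alphabetSize, double keysPerSecond) {
    const double secondsPerYear = 365.25 * 24 * 3600;
    return std::exp2(log2Factorial(alphabetSize)) / keysPerSecond / secondsPerYear;
}
```

For 26 letters, log2Factorial(26) is a little above 88, which matches the figure quoted above. Even at an assumed rate of 10^12 keys per second, yearsToExhaust(26, 1e12) comes out at more than ten million years.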
Letter Frequency Analysis:
This method exploits a major weakness in substitution ciphers: each plaintext symbol always maps to the same ciphertext symbol. That means the statistical properties of the plaintext are
preserved in the ciphertext. In English, 'E' is the most frequent letter (about 12.51%), 'T' comes second (9.25%), then 'A' (8.04%). How does that help? By simple observation we can find
the most frequent symbol in the ciphertext and replace it with the most frequent letter in English, and so on. The method can be generalized by testing pairs, triples or even quadruples of letters. For instance, in English, a U
almost always follows the letter Q. This behavior can be exploited to detect the substitutions of the letters Q and U.
Letter  Frequency    Letter  Frequency
E       12.51%       M       2.53%
T       9.25%        F       2.30%
A       8.04%        P       2.00%
O       7.60%        G       1.96%
I       7.26%        W       1.92%
N       7.09%        Y       1.7
S       6.54%        B       1.54%
R       6.12%        V       0.99%
H       5.49%        K       0.67%
L       4.14%        X       0.19%
D       3.99%        J       0.16%
C       3.06%        Q       0.11%
U       2.71%        Z       0.09%
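A table like the one above can be produced for any ciphertext by counting its letters. The following is a minimal sketch of that idea. The names countLetters and rankByFrequency are made up for illustration, and the code assumes plain ASCII text.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Counts A-Z occurrences (case-insensitive) and ignores everything else.
std::array<int, 26> countLetters(const std::string& text) {
    std::array<int, 26> counts{};
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            counts[std::toupper(ch) - 'A']++;
        }
    }
    return counts;
}

// Returns (letter, percentage) pairs sorted from most to least frequent.
// Letters that never occur are left out; ties keep alphabetical order.
std::vector<std::pair<char, double>> rankByFrequency(const std::string& text) {
    const std::array<int, 26> counts = countLetters(text);
    int total = 0;
    for (int c : counts) total += c;

    std::vector<std::pair<char, double>> ranking;
    for (int i = 0; i < 26; ++i) {
        if (counts[i] > 0) {
            ranking.emplace_back(static_cast<char>('A' + i), 100.0 * counts[i] / total);
        }
    }
    std::stable_sort(ranking.begin(), ranking.end(),
                     [](const std::pair<char, double>& a, const std::pair<char, double>& b) {
                         return a.second > b.second;
                     });
    return ranking;
}
```

Comparing the head of this ranking with the English table suggests which ciphertext symbols stand for E, T, A and so on.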
Implementation:
First of all, a word about the programming language: we used C++ with Qt.
When we first worked out which steps to follow, we tried not to look too deeply
at the Internet or to copy ideas from it. We tried to work almost 100% in our own way,
so we started with the design of the project and tried to make it simple and easy to use.
Here is a screenshot of our first design:
After that we started on the program itself. Our plan was to take full advantage of
letter frequency analysis ("LFA") first. So the first thing we did was to count how often each letter occurs in the ciphertext. For example, we
pick the first letter, see how many times it is repeated and store the count; then we do the same for all the other letters and keep the results in variables. When that was done, we relied on "LFA" to replace the most
repeated letter with "E", the second with "T", the third with "A", the fourth with "O" and the fifth and last with "I". We
ignored "N" in order to have only five. At the beginning we thought this order would always hold, but we discovered that
this is not always the case; we come back to that later on.
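The counting and replacement step described above could look roughly like the sketch below. It is not our actual Qt code. The function name replaceTopFive is invented for illustration, and the sketch assumes ASCII text and follows the convention that ciphertext letters are upper case while guessed plaintext letters are lower case.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <cstddef>
#include <string>

// Replaces the five most frequent cipher letters with e, t, a, o, i.
// Cipher letters are normalized to upper case; guessed letters are lower case.
std::string replaceTopFive(std::string cipher) {
    std::array<int, 26> counts{};
    for (char& ch : cipher) {
        const unsigned char u = static_cast<unsigned char>(ch);
        if (std::isalpha(u)) {
            ch = static_cast<char>(std::toupper(u));
            counts[ch - 'A']++;
        }
    }

    // Letter indices 0..25 sorted by descending count (ties stay alphabetical).
    std::array<int, 26> order{};
    for (int i = 0; i < 26; ++i) order[i] = i;
    std::stable_sort(order.begin(), order.end(),
                     [&counts](int a, int b) { return counts[a] > counts[b]; });

    const std::string guesses = "etaoi";
    std::array<char, 26> plainFor{};  // zero means "not cracked yet"
    for (std::size_t rank = 0; rank < guesses.size(); ++rank) {
        if (counts[order[rank]] > 0) plainFor[order[rank]] = guesses[rank];
    }

    for (char& ch : cipher) {
        if (ch >= 'A' && ch <= 'Z' && plainFor[ch - 'A'] != 0) ch = plainFor[ch - 'A'];
    }
    return cipher;
}
```

After this step only five letters of the text are readable; every remaining upper-case letter is still encrypted.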
After replacing the five most repeated letters, we had a text in which only five letters were readable. Of course
that was not enough, so we looked at what we had in order to proceed. We examined the text to
see which correct words had appeared after this operation. We found words such as:
"eat, ate, tea, tie…". These words did not help much, so we turned to the most common three-letter words:
the, and, for, are, but, not, you, all, any, can,
had, her, was, one, our, out, day, get, has, him, his,
how, man, new, now, old, see, two, way, who, boy,
did, its, let, put, say, she, too, use
Then we picked a word with exactly one missing letter, such as "The": we already have "T" and "E" and we want "H", so
only one letter is missing, which is ideal for us. We used regular expressions to make the program
find any word that starts with "T", followed by an unknown letter, followed by "E", i.e. "T?e", where '?' represents the missing letter.
This produced hits. Suppose that after replacing the big five letters we get "Tke": the program takes that word because it matches "T?e", and we ask it for the letter in the second position,
which is 'K' in this case, in order to replace EVERY 'K' in our partial plaintext. Now our dictionary contains six letters.
We then took other words with one missing letter in the same way. For the word "Her", for example, we no longer
know only one letter but two, so we can now find the letter "R", add it to our
dictionary and use it to hunt for other words.
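The "T?e" search above can be sketched with std::regex. Our program used Qt, so this is only an illustration. The names findMissingLetter and applyGuess are hypothetical, known letters are assumed to be lower case and unknown ones upper case, and picking the most frequent hit instead of the first one is an added assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <regex>
#include <string>

// Finds whole words that fit a pattern with exactly one unknown letter,
// e.g. "t?e", and returns the cipher letter seen most often in the '?'
// position, or 0 if no word matches. The pattern must contain only
// lower-case letters and a single '?'.
char findMissingLetter(const std::string& text, const std::string& pattern) {
    assert(std::count(pattern.begin(), pattern.end(), '?') == 1);

    // "t?e" becomes \bt([A-Z])e\b : the unknown letter is still upper case.
    std::string expr = "\\b";
    for (char ch : pattern) {
        expr += (ch == '?') ? std::string("([A-Z])") : std::string(1, ch);
    }
    expr += "\\b";
    const std::regex re(expr);

    std::map<char, int> votes;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        votes[(*it)[1].str()[0]]++;
    }

    char best = 0;
    int bestVotes = 0;
    for (const auto& entry : votes) {
        if (entry.second > bestVotes) {
            best = entry.first;
            bestVotes = entry.second;
        }
    }
    return best;
}

// Replaces every occurrence of a cipher letter with its guessed plain letter.
std::string applyGuess(std::string text, char cipherLetter, char plainLetter) {
    std::replace(text.begin(), text.end(), cipherLetter, plainLetter);
    return text;
}
```

For the text "tKe ZeD tKe", findMissingLetter(text, "t?e") returns 'K', and applyGuess(text, 'K', 'h') then yields "the ZeD the".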
With this technique, however, we have to accept the risk of matching other words that fit the same
regular expression: instead of finding "The" we may get "Tee", "Tie" or "Toe", so instead of finding the
letter "H" we get 'e', 'i' or 'o'.
Since we could not find every three-letter word, we also tried four-letter words, chosen with respect to the order of
the most frequent letters in English, so that each newly found letter could be used over a wider area.
We ran into a real problem: if the ciphertext does not contain the words we picked to
check against, we have no chance of finding the corresponding letters. For example, if the ciphertext contains no "the", we will never find
the letter "H" from it. So, first, we used the most common words, so that the probability of these words being absent is almost ZERO. We also
used more than one word to find each letter, to make it more likely that one of the words is present and the
letter is found.
Note: in the cracked text, upper-case letters represent unknown letters, and lower-case letters represent
letters that our program has substituted, i.e. letters that have been cracked.
After all that work we faced the problem mentioned earlier: the order of the most
repeated letters is not always the same. Should we change everything because of that? If even one of the big five letters
is out of order, the whole result is ruined. We cannot bet on the order, so we let the user check whether the cracked text is
readable, nice and clean; if it is not, the user can click another button to change the order of the big five letters.
To help the user find the real order, we show an accuracy figure:
If the user clicks one of the buttons, the order changes as described and the accuracy figure is updated
automatically, until the real order is found.
Every button above calls a different one of these functions. We found no disorder between the "E" and
"T" letters, so we only vary the order of the rest.
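One way to enumerate the alternative orders is to keep "e" and "t" fixed and permute the remaining three letters. The sketch below assumes exactly that. The report does not state how many buttons there are or which order each one applies, so candidateOrders is a hypothetical helper and not our button code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Lists every candidate order of the five guessed letters, keeping "e" and
// "t" in the first two places and permuting only the remaining three.
std::vector<std::string> candidateOrders() {
    std::string rest = "aio";  // sorted, so next_permutation visits all orders
    std::vector<std::string> orders;
    do {
        orders.push_back("et" + rest);
    } while (std::next_permutation(rest.begin(), rest.end()));
    return orders;
}
```

This gives six candidates, including the default "etaoi". Each one can be applied in turn and the resulting accuracy figure compared.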
Even so, we found no way to reach one hundred percent in all cases. In our experience, in
order to recover
almost 95% of the plaintext, the user must enter at least 8-10 pages on average. We know that seems long, but
after many tries we found that this number of pages has a high probability of containing all the patterns we need for a high accuracy rate.
The last thing we added is a very powerful function that may lead to 99.99%. It
may seem crude or old-fashioned, but it works like a charm.
The Replace function allows you to replace any letter manually with the letter you want. Sometimes,
after the program has done its job and brought you to 90%, you check the plaintext and find the word
"eQample", which you instantly recognize as "example". The program made all the other changes but got stuck
somehow and did not find the letter "X". So we provide this function to change ALL "Q"s in the
plaintext to the right letter and bring the result to 99%.
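A manual replacement of this kind needs very little code. The sketch below is an assumption about how it could be written, not our Qt implementation. The name replaceLetter is made up, and the replacement is forced to lower case so that it counts as a cracked letter.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Replaces every occurrence of one (still encrypted) letter with the plain
// letter chosen by the user. The plain letter is written in lower case.
std::string replaceLetter(std::string text, char cipherLetter, char plainLetter) {
    const char lowered =
        static_cast<char>(std::tolower(static_cast<unsigned char>(plainLetter)));
    std::replace(text.begin(), text.end(), cipherLetter, lowered);
    return text;
}
```

For example, replaceLetter("eQample", 'Q', 'X') returns "example".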
Accuracy:
To calculate the accuracy of the plaintext we used a simple algorithm. After the
program has finished replacing the letters it found, it calls a function that counts every lower-case letter in the plaintext
box; these are the letters we have found. It then counts all the letters in the text, upper case and lower case
together, divides the number of lower-case letters by the total number of letters, and multiplies the result by 100.
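The calculation can be written as a short function. This is a sketch under the assumption that only ASCII letters are counted and that an empty text gives 0. The name accuracyPercent is illustrative.

```cpp
#include <cctype>
#include <string>

// Accuracy = lower-case (cracked) letters / all letters * 100.
// Non-letter characters such as spaces and punctuation are not counted.
double accuracyPercent(const std::string& text) {
    int lower = 0;
    int letters = 0;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            ++letters;
            if (std::islower(ch)) ++lower;
        }
    }
    return letters == 0 ? 0.0 : 100.0 * lower / letters;
}
```

For the text "theN", three of the four letters are lower case, so the function returns 75.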
The final design:
Screenshots:
We hope for a good mark and good luck, and wish the best for all of us.
Here we tried a ciphertext: we put it in its place and pressed "1", but the result came back with only 82%.
Note: "@" means "a". We got a strange error when we tried to use "a" itself, so we used "@" instead.