cse 373: data structures and algorithms...the traveling salesperson problem seattle tehran beijing...

31
Instructor: Lilian de Greef Quarter: Summer 2017 CSE 373: Data Structures and Algorithms Lecture 21: Finish Sorting, P vs NP

Upload: others

Post on 17-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Instructor:LiliandeGreefQuarter:Summer2017

CSE373:DataStructuresandAlgorithmsLecture21:FinishSorting,PvsNP

Page 2: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Today

• Announcements• Finishupsorting• RadixSort• Finalcommentsonsorting

• ComplexityTheory:P=?NP

Page 3: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Announcements

• FinalExam:• Nextweek• Duringusuallecturetime(10:50am- 11:50am)• Cumulative(soallmaterialwe’vecoveredinclassisfairgame)• ...butwithemphasisonmaterialcoveredafterthemidterm• Date:Friday

Page 4: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

BacktoSortingAlgorithms

Page 5: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

TheBigPicture

Surprisingamountofjuicycomputerscience:2-3lectures…Simple

algorithms:O(n2)

Fancieralgorithms:O(n log n)

Comparisonlower bound:W(n log n)

Specializedalgorithms:

O(n)

Handlinghuge data

sets

Insertion sortSelection sortShell sort…

Heap sortMerge sortQuick sort (avg)…

Bucket sortRadix sort

Externalsorting

How???• Changethemodel– assumemorethan“compare(a,b)”

Page 6: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Radixsort

• Radix=“thebaseofanumbersystem”• Exampleswilluse10becauseweareusedtothat• Inimplementationsuselargernumbers

• Forexample,forASCIIstrings,mightuse128

• Idea:• Bucketsortononedigitatatime

• Numberofbuckets=radix• Startingwithleast significantdigit• Keepingsortstable

• Doonepassperdigit• Invariant:Afterk passes(digits),thelastk digitsaresorted

• Aside:Originsgobacktothe1890U.S.census

Page 7: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

RadixSort:Example

Input:

4785379

72133814367

0 1 2 3 4 5 6 7 8 9

0 1 2 3 4 5 6 7 8 9

0 1 2 3 4 5 6 7 8 9

Output:

Firstpass:bucketsortbyone’sdigit

Secondpass:stablebucketsortbyten’sdigit

Thirdpass:stablebucketsortbyhundred’sdigit

Page 8: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Analysis

Inputsize:nNumberofbuckets=Radix:BNumberofpasses=“Digits”:P

Workperpassis1bucketsort:

Totalworkis

Comparedtocomparisonsorts,sometimesawin,butoftennot• Example:StringsofEnglishlettersuptolength15

• Run-timeproportionalto:15*(52+n)• Thisislessthann lognonlyifn >33,000• Ofcourse,cross-overpointdependsonconstantfactorsoftheimplementations

• Andradixsortcanhavepoorlocalityproperties

Page 9: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

CommentsonSortingAlgorithms

Page 10: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Sortingmassivedata

• Needsortingalgorithmsthatminimizedisk/tapeaccesstime:• QuicksortandHeapsortbothjumpalloverthearray,leadingtorandomdiskaccesses• Mergesortscanslinearlythrougharrays,leadingto(relatively)sequentialdiskaccess

• Mergesortisthebasisofmassivesorting

• Mergesortcanleveragemultipledisks

Page 11: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

ExternalMergeSort• Sort900MBusing100MBRAM

• Read100MBofdataintomemory• Sortusingconventionalmethod(e.g.quicksort)• Writesorted100MBtotempfile• Repeatuntilalldatainsortedchunks(900/100=9total)

• Readfirst10MBofeachsortedchuck,mergeintoremaining10MB• writingandreadingasnecessary• Singlemergepassinsteadoflogn• Additionalpasshelpfulifdatamuchlargerthanmemory

• Parallelismandbetterhardwarecanimproveperformance• Distributionsorts(similartobucketsort)arealsoused

Page 12: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Wrap-uponSorting• SimpleO(n2)sortscanbefastestforsmalln

• Insertionsort(latterlinearformostly-sorted)• Good“belowacut-off”fordivide-and-conquersorts

• O(nlog n)sorts• Heapsort,in-place,notstable,notparallelizable• Mergesort,notinplacebutstableandworksasexternalsort• Quicksort,inplace,notstableandO(n2)inworst-case

• Oftenfastest,butdependsoncostsofcomparisons/copies

• W (n log n) isworst-caseandaveragelower-boundforsortingbycomparisons

• Non-comparisonsorts• Bucketsortgoodforsmallnumberofpossiblekeyvalues• Radixsortusesfewerbucketsandmorephases

• Bestwaytosort?

Page 13: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

ComplexityTheory:PvsNPJustasmalltasteofComplexityTheory

Page 14: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

“Easy”ProblemsfortheComputer

Sortingalistofnnumbers

Multiplyingtwonxnmatrices

3 5 2 71 6 8 92 4 6 109 3 2 12

1 5 5 45 12 8 67 6 1 59 23 5 8

=n

n n

n

n

Page 15: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

“Easy”ProblemsfortheComputer

ShortestPathAlgorithm MinimumSpanningTreeAlgorithms

A B

CD

F

E

G

2

12 5

11

1

2 65 3

10

A B

DC

F H

E

G

2 2 3

110 23

111

7

1

9

2

4 5

Edsgar Dijkstra

Page 16: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

“Hard”ProblemsfortheComputer

TheKnapsackProblem

Iwanttocarryasmuchmoney’sworthasIcanthatstillfitsinmybag!WhatdoIpack?

Page 17: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

“Hard”ProblemsfortheComputer

TheTravelingSalespersonProblem

Seattle

Tehran

Beijing

Berlin

Moscow

SãoPaulo

54116775

6685

52175058

4583

6356

2185

1004

3608

7325 1533

34887577

10933

I’llleaveSeattletosellgoods,visitingeachcityonlyonce,andreturntoSeattle.What’stheshortestroute?

Page 18: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

“Hard”ProblemsfortheComputer

FindaHamiltonianpath(apaththatvisitseachvertexexactlyonce)

(nevermindweightsorevenreturningtoourstartingpoint!)

Page 19: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Comparingn2 vs2n

Thealien’scomputerperforms109 operations/sec

n=10 n=30 n=50 n=70

n2 100<1sec

900<1sec

2500<1sec

4900<sec

2n 1024<1sec

n! 3628800<1sec

1091sec

101511.6days

102131,688years

1016 years 1048 years 1083 years(105 xageoftheuniverse!)

Page 20: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

“Easy”vs“Hard”ProblemsfortheComputer

“PolynomialTime”=“Efficient”

Isanalgorithm“efficient”with…

O(n)? O(nlogn)?O(n2)? O(n10)? O(nlog n)? O(2n)? O(n!)?

Page 21: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

PolynomialTime?• Soweknowtherearepolynomialtimealgorithmsto

• Sortnumbers• Multiplynxnmatrices• Findtheshortestpathinagraph• Findtheminimumspanningtree• … andmore

• Butthemilliondollarquestionis…aretherepolynomialtimealgorithmstosolve• TheKnapsackProblem?• TheTravelingSalespersonProblem?• FindingHamiltonianPaths?• … andthousandsmore!

Page 22: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533
Page 23: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

P=NP?

AndthereareproblemsevenharderthanNP!

P(PolynomialTime)

sort

search

Dijkstra’s

Prim’s

Kruskal’s

NP(NondeterministicPolynomialTime)

HamiltonianPath

TravelingSalesperson

KnapsacknxnxnSudoku

SAT

Page 24: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

RelevanceofP=NP

NPcontainslotsofproblemswedon’tknowtobeinP• ClassroomScheduling• Packingobjectsintobins• Schedulingjobsonmachines• Findingcheaptoursvisitingasubsetofcities• Findinggoodpacketroutingsinnetworks• Decryption...

Page 25: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Withthisknowledge,wecanavoidsaying…

“Ican’tfindanefficientalgorithm.IguessI’mtoodumb.”

Cartooncourtesyof“ComputersandIntractability:AGuidetotheTheoryofNP-Completeness” byM.Garey andD.Johnson

Page 26: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

Butknowitisn’twisetosay…

“Ican’tfindanefficientalgorithmbecausenosuchalgorithmispossible!”

Cartooncourtesyof“ComputersandIntractability:AGuidetotheTheoryofNP-Completeness” byM.Garey andD.Johnson

Page 27: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

And,instead,proveit’sinNPtothensay…

“Ican’tfindanefficientalgorithm,butneithercanallthesefamouspeople.”

Cartooncourtesyof“ComputersandIntractability:AGuidetotheTheoryofNP-Completeness” byM.Garey andD.Johnson

Page 28: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

PreparingforFinalExam

Page 29: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

FinalExamStudyStrategies

Page 30: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

PracticeProblem

Given a value ‘x’ and an array of integers, determine whether two of the numbers add up to ‘x’

Questionsyoushouldhaveaskedme:

1) Isthearrayinanyparticularorder?2) ShouldIconsiderthecasewhereaddingtwolargenumberscouldcausean

overflow?3) Isspaceafactor,inotherwords,canIuseanadditionalstructure(s)?4) Isthismethodgoingtobecalledfrequentlywithdifferent/thesamevalueof‘x’?5) AbouthowmanyvaluesshouldIexpecttoseeinthearray,oristhatunspecified?6) Will‘x’alwaysbeapositivevalue?7) CanIassumethearraywon’talwaysbeempty,whatifitsnull?

Page 31: CSE 373: Data Structures and Algorithms...The Traveling Salesperson Problem Seattle Tehran Beijing Berlin Moscow São Paulo 5411 6775 6685 5217 5058 4583 6356 2185 1004 3608 7325 1533

PracticeProblem

Given a value ‘x’ and an array of integers, determine whether two of the numbers add up to ‘x’