DSD

Department of Distributed Systems

MTA SZTAKI

KOPI

KOPI Protection Instead of Copy Protection

Máté Pataki

Topics

• Plagiarism
• KOPI Portal
• How KOPI Works
• KOPI Protection
• Future Plans

Problems

• Plagiarism is a huge problem at universities
• There are too many theses, even at a single university, for anyone to be familiar with all of them
• It is not enough to suspect that something is plagiarized; proof is needed

Problems - Existing Systems

• Watermark or checksum
• Authorship attribution
• Open search engines
• Text comparison
• Questionnaire
• Systems with unknown algorithms
• No system for the Hungarian community

What we need

• Detects partial overlap
• Cannot be removed automatically
• Language independent
• Can protect proprietary documents
• One-to-many comparison
• Works without user intervention
• Known algorithm

The KOPI Project

• KOPI Online Plagiarism Search and Information Portal – web-based similarity and plagiarism search service
• Partner: Monash University, Melbourne
• Sponsored by the Hungarian Government
• Developed 2003-2004
• The service is freely available to everybody

The Goal of KOPI

• Protect digital libraries from illegal copying
• Help teachers, professors, and conference organizers easily find copied work and its original source
• Inform students and authors about plagiarism, citation, and the relevant (Hungarian) laws
• Increase the value of papers and theses by certifying their genuineness

Plagiarism Search Services

• Compare uploaded documents to each other
• Find similar documents in the system's database:
  • Within the user's own documents
  • Documents uploaded by others
  • Documents from the Internet
    • Digital libraries (MEK)
    • Universities
    • …

How it works

text → chunk → fingerprint → DB → result

1. Chunking
2. Compress (MD5)
3. Upload to DB
4. Query
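The four steps can be sketched end to end as follows. This is a minimal illustration, not the KOPI implementation: the class `FingerprintDB`, the in-memory dictionary standing in for the database, the overlapping five-word chunker, and the document ids are all assumptions made for the example. Punctuation is left attached to words for brevity.

```python
import hashlib
from collections import defaultdict

def chunks(text, n=5):
    """Step 1: split lowercased text into overlapping n-word chunks (assumed chunker)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 0))]

def fingerprint(chunk):
    """Step 2: compress a chunk to a fixed-length MD5 hex digest."""
    return hashlib.md5(chunk.encode("utf-8")).hexdigest()

class FingerprintDB:
    """Hypothetical in-memory stand-in for the fingerprint database."""

    def __init__(self):
        self._index = defaultdict(set)  # fingerprint -> ids of documents containing it

    def upload(self, doc_id, text):
        """Step 3: store only the fingerprints of a document, not its text."""
        for c in chunks(text):
            self._index[fingerprint(c)].add(doc_id)

    def query(self, text):
        """Step 4: return {doc_id: number of fingerprints shared with `text`}."""
        hits = defaultdict(int)
        for fp in {fingerprint(c) for c in chunks(text)}:
            for doc_id in self._index.get(fp, ()):
                hits[doc_id] += 1
        return dict(hits)

db = FingerprintDB()
db.upload("thesis-1", "The goal of the KOPI portal is to protect documents against plagiarism")
db.upload("thesis-2", "Completely unrelated text about distributed systems and web services")

result = db.query("As stated, the goal of the KOPI portal is to protect documents")
print(result)
```

The query reports which stored documents share chunks with the submitted text, which is the kind of evidence the "Problems" slide asks for.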

Word chunking

• Original
  The goal of the KOPI online Plagiarism Search and Information Portal is to protect documents against plagiarism.

• Word chunking (n=5)
  the goal of the kopi
  online plagiarism search and information
  portal is to protect documents
  …
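Word chunking with n=5 can be reproduced with a few lines of Python. The function name `word_chunks` and the choice to lowercase and split on whitespace are assumptions for this sketch; the slide does not say how KOPI tokenizes text or treats punctuation.

```python
def word_chunks(text, n=5):
    """Split lowercased text into consecutive, non-overlapping chunks of n words."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]

sentence = ("The goal of the KOPI online Plagiarism Search and Information "
            "Portal is to protect documents against plagiarism.")

for chunk in word_chunks(sentence):
    print(chunk)
```

Because the chunk boundaries are fixed, the last chunk may be shorter than n words, and punctuation stays attached to the word it follows.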

Overlapping word chunking

• Original
  The goal of the KOPI online Plagiarism Search and Information Portal is to protect documents against plagiarism.

• Overlapping word chunking (n=5)
  the goal of the kopi
  goal of the kopi online
  of the kopi online plagiarism
  the kopi online plagiarism search
  kopi online plagiarism search and
  …
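A short sketch shows why overlapping chunks help detect partial overlap. The helper names `overlapping_chunks` and `fixed_chunks` and the altered copy with one word inserted in front are hypothetical, chosen only for illustration; the tokenization (lowercase, split on whitespace) is likewise an assumption.

```python
def overlapping_chunks(text, n=5):
    """Every window of n consecutive words, advancing one word at a time."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def fixed_chunks(text, n=5):
    """Non-overlapping chunks of n words, for comparison."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]

original = ("The goal of the KOPI online Plagiarism Search and Information "
            "Portal is to protect documents against plagiarism.")
copied = "Indeed " + original  # hypothetical copy with one word inserted in front

shared_overlapping = set(overlapping_chunks(original)) & set(overlapping_chunks(copied))
shared_fixed = set(fixed_chunks(original)) & set(fixed_chunks(copied))
print(len(shared_overlapping), len(shared_fixed))
```

With fixed boundaries, the inserted word shifts every later chunk, so the copy no longer matches. With overlapping windows, the chunks of the original still appear in the copy.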

Compressing fingerprints

Hash-based algorithm (MD5): chunk → MD5 → fingerprint

• Input length is not limited
• Fast
• The chance of two different texts having the same MD5 code is small
• Irreversible
• Can protect proprietary documents
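The properties listed above can be observed with Python's standard `hashlib` module. The function name `fingerprint` and the sample chunks are assumptions for this sketch; only the use of MD5 comes from the slide.

```python
import hashlib

def fingerprint(chunk: str) -> str:
    """Return the MD5 digest of a chunk as a hexadecimal string."""
    return hashlib.md5(chunk.encode("utf-8")).hexdigest()

short = fingerprint("the goal of the kopi")
long_ = fingerprint("the goal of the kopi " * 1000)  # much longer input

# The digest length does not depend on the input length.
print(len(short), len(long_))

# The same chunk always yields the same fingerprint.
print(short == fingerprint("the goal of the kopi"))

# A one-letter change yields a different fingerprint.
print(short == fingerprint("the goal of the kopl"))
```

Since only the digests need to be stored and compared, a document's owner can make it searchable without handing over the text itself.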

Copy protection

• Pros
  • Harder to copy
  • The path of the work can be tracked (DRM)
  • More income for authors and sellers
• Cons
  • Harder to use
  • Cannot fully prevent copying
  • Legal use sometimes requires circumventing it
  • It is not always legal to use
  • Personal rights problems (DRM)
  • Hinders the spread of the work

Copy Protection for Text Documents

• PDF, DOC… protection
  • Can be easily and automatically circumvented
• Allow only online viewing
  • Strongly restricts use
  • Harder, but can still be circumvented
• Narrow down the number of authorized users
  • Once the document is out of the system…
• Nothing protects against retyping
  • Lock it in a drawer and leave it there

KOPI Protection

• Documents are uploaded into the KOPI system
  • Plagiarism can be easily discovered
  • The sources will also be known
  • The risk of plagiarizing becomes too high
  • Circumventing it is time-consuming and cannot be done automatically
• The work can be freely distributed
  • No need to deal with copy protection
  • Search engines can index it
  • More people read it
  • More people cite it

Future Plans

• Distributed system
  • Each university has its own system, but
  • They are able to search in each other's DBs
  • Secure search with MD5 codes
• Upload databases
  • Online and offline databases
  • Documents found on the Internet
• Recognizing source code and programming languages
• SOAP interface for integrated use of KOPI
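One way to read "secure search with MD5 codes" is that a university sends only fingerprints to its peers, never the text of the document being checked. The sketch below illustrates that reading; it is an assumption, not a description of the planned system. The class `UniversityNode`, the function `distributed_search`, and the sample documents are hypothetical, and direct method calls stand in for whatever network protocol would be used.

```python
import hashlib

def fingerprints(text, n=5):
    """MD5 codes of all overlapping n-word chunks of a text."""
    words = text.lower().split()
    return {hashlib.md5(" ".join(words[i:i + n]).encode("utf-8")).hexdigest()
            for i in range(len(words) - n + 1)}

class UniversityNode:
    """Hypothetical node that stores only the fingerprints of its own documents."""

    def __init__(self, name):
        self.name = name
        self._docs = {}  # doc_id -> set of fingerprints

    def add_document(self, doc_id, text):
        self._docs[doc_id] = fingerprints(text)

    def search(self, query_fps):
        """Answer a remote query that contains MD5 codes only, never plain text."""
        return {doc_id: len(fps & query_fps)
                for doc_id, fps in self._docs.items() if fps & query_fps}

def distributed_search(text, peers):
    """Fingerprint the text locally, then ask every peer for matching documents."""
    query_fps = fingerprints(text)  # only these hashes leave the querying node
    return {peer.name: peer.search(query_fps) for peer in peers}

node_a = UniversityNode("A")
node_a.add_document("a1", "the goal of the kopi portal is to protect documents")
node_b = UniversityNode("B")
node_b.add_document("b1", "an unrelated thesis about something else entirely different here")

matches = distributed_search("the goal of the kopi portal is new", [node_a, node_b])
print(matches)
```

Each node keeps its own database, and a match count per document is all that crosses institutional boundaries.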

KOPI Portal

http://kopi.sztaki.hu

Web: http://dsd.sztaki.hu

Email: Mate.Pataki@sztaki.hu

Thank you for your attention!