INFORMATION AND COMMUNICATIONS
Željko Jeričević, Ph.D.
Department of Computer Science, Faculty of Engineering &
Department of Biology and Medical Genetics, Faculty of Medicine
51000 Rijeka, Croatia
Phone: (+385) 51-651 594 E-mail: [email protected] [email protected]
http://www.riteh.hr/~zeljkoj/Zeljko_Jericevic.html
14 December 2009 Zeljko Jericevic, Ph.D. 2
INFORMATION AND COMMUNICATIONS
Lecturers: Željko Jeričević
2-54 651-594 [email protected]
Damir Arbula
2-08a 651-436 [email protected]
In brief
• Preliminaries: logarithms, probability, etc.
• Historical overview and notable researchers
• Model of a communication system
• What is information?
• Transmission of information, information measures
• Communication channels
• Coding of information
• Compression
• Secure transmission of information
• Information theory in data processing
• Information flow in biological systems
What is expected of students
That you master material relevant to your studies and future professional work:
• Understanding the quantitative approach to information and the significance of digital technology
• The ability to maintain and design information systems
• Understanding the importance of knowing English and of free access to information
Required and recommended literature for information theory
• Igor S. Pandžić i drugi, “Uvod u teoriju informacije i kodiranje”, Element, Zagreb, 2007
• Vjekoslav Sinković, “Informacija, simbolika i semantika”, Školska knjiga, Zagreb, 1997
• Željko Pauše, “Uvod u teoriju informacije”, Školska knjiga, Zagreb, 1980
• Robert M. Gray, “Entropy and Information Theory”, Springer-Verlag, New York, 1990, http://ee.stanford.edu/~gray/it.html
• Claude E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, 27, pp. 379–423 & 623–656, July & October 1948, http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf
• Wikipedia articles
Information theory
“Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Historically, information theory was developed by Claude E. Shannon to find fundamental limits on compressing and reliably storing and communicating data.”
From Wikipedia
Claude E. Shannon (1916-2001)
Information theory
“Since its inception it has broadened to find applications in many other areas, including statistical inference, natural language processing, cryptography generally, networks other than communication networks — as in neurobiology,[1] the evolution[2] and function[3] of molecular codes, model selection[4] in ecology, thermal physics,[5] quantum computing, plagiarism detection[6] and other forms of data analysis.[7]”
From Wikipedia
Information theory history
Running through the history of information technology is the fact that we are constantly searching for faster and better methods of communication, because the hunger for information, and people's willingness to spend money on it, seems to have no end.
It is no accident that the foundations of information theory were developed at Bell Labs (the research arm of the American Telephone & Telegraph Company).
Nor is the interaction between Bell Labs and the Massachusetts Institute of Technology accidental.
Information theory history
Optical telegraph:
1787 Joseph Chudy 1752-1813
1789 Claude Chappe 1763-1805
Napoleonic era:
10 km between stations,
20 s per character,
8 min per 210 km
Information theory history
Claude Chappe's optical telegraph on the Litermont near Nalbach, Germany
Information theory history
Construction schematic of a Prussian optical telegraph (or semaphore) tower, c. 1835
Information theory history
1832 Needle telegraph by Paul Schilling (1786-1837)
Information theory history
1833 Carl Friedrich Gauss (1777-1855) & Wilhelm Weber (1804-1891) telegraph
Information theory history
1837 five-needle telegraph
William Fothergill Cooke (1806-1879)
Charles Wheatstone (1802-1875)
Information theory history
Samuel Morse
Information theory
In 1838 Morse devised a code for transmitting text by telegraph that is still used today in some kinds of signalling.
In a print shop, Morse determined which letters are used most frequently in English text and constructed his codes so that the most frequently used letters have the shortest codes.
He achieved a saving in message length of roughly 15%.
Morse empirically devised a code following principles that David Huffman later defined theoretically.
Information theory history
Samuel F.B. Morse (1791-1872)
The letter e is the most frequently used
Information theory history
Morse's telegraph, patented in 1837, accepted as the standard in Europe in 1851 (except in Great Britain)
"What hath God wrought", a message in American Morse code sent by Samuel F. B. Morse to officially open the Baltimore-Washington telegraph line on May 24, 1844
Information theory history
Information theory history
19th-century map showing the early telegraph cables which connected Britain with the rest of the world.
Information theory history
Major telegraph lines in 1891
Information theory
Thomas Edison (1847-1931)
In 1874 Edison introduced the quadruplex telegraph with four levels of current strength. With such a device it was possible to send two messages simultaneously.
It was established empirically that by using twice as many symbols we can transmit twice as many messages.
Information theory
Quadruplex telegraph with four levels of current strength

Current   Message 1   Message 2
  -3         on          off
  -1         off         off
  +1         off         on
  +3         on          on
Information theory history
1866 transatlantic cable
1902 transpacific cable
On 27 January 2006, Western Union discontinued all telegram and commercial messaging services, though it still offered its money transfer services.
Seminar topics
1) Boolean algebra & hardware
2) DFT/FFT
3) Sampling rate, aliasing & quantization
4) Hartley transform
5) Hartley & transmission of information
6) Nyquist-Shannon sampling theorem
7) Shannon's information entropy
8) Hamming coding
9) Huffman coding
10) Arithmetic coding
11) Sound encoding
12) Picture encoding
13) Video encoding
Information theory history
His early theoretical work on determining the bandwidth requirements for transmitting information laid the foundations for later advances by Claude Shannon, which led to the development of information theory. In 1927 Nyquist determined that the number of independent pulses that could be put through a telegraph channel per unit time is limited to twice the bandwidth of the channel. Nyquist published his results in the paper Certain Topics in Telegraph Transmission Theory (1928). This rule is essentially a dual of what is now known as the Nyquist–Shannon sampling theorem.
From Wikipedia
Harry Nyquist (1889-1976)
Information theory history
In 1917 Harry Nyquist joined the American Telephone and Telegraph Company after earning his doctorate at Yale University.
He studied telegraph signal transmission speeds and in 1924 was the first to give a quantitative definition of the empirically observed relationship governing transmission speed. He showed that by sending k symbols per second, where each symbol can take one of m different values, we can achieve a theoretical transmission rate W
W = k log₂ m  [bit/sec]
Information theory history
Nyquist theoretically explained what Edison had done empirically with his quadruplex telegraph (m is the number of current levels; log₂ m is the factor by which the character-sending rate increases).
m    log₂ m
1      0
2      1
3      1.6
4      2
8      3
16     4
Information theory
Harry Nyquist studied the frequency components of signals (the Fourier transform) and found that to transmit and reconstruct a band-limited signal, the number of samples needed is twice the highest frequency of the signal, which Shannon later proved (the Nyquist-Shannon theorem).
The Nyquist frequency in the Fourier transform is the highest frequency present in a band-limited signal.
Information theory history
Ralph Hartley (1888-1970)
Hartley, R.V.L., "Transmission of Information", Bell System Technical Journal, July 1928, pp. 535–563. http://www.dotrose.com/etext/90_Miscellaneous/transmission_of_information_1928b.pdf
Hartley, R.V.L., "A More Symmetrical Fourier Analysis Applied to Transmission Problems", Proc. IRE 30, pp. 144–150 (1942).
Discrete Hartley transform
From Wikipedia
Information theory
Ralph Hartley studied the communication problem in terms of a source and a receiver. He defined the quantity H, the information contained in a message of n symbols drawn from an alphabet of s symbols
Nyquist and Hartley were Shannon's predecessors and were important for his work
H = n log s
Information theory history
Claude Elwood Shannon (April 30, 1916 – February 24, 2001), an American electronic engineer and mathematician, is known as "the father of information theory".Shannon is famous for having founded information theory with one landmark paper published in 1948.From Wikipediahttp://cm.bell-labs.com/cm/ms/what/shannonday/paper.html
Claude E. Shannon
Information theory history
Claude E. Shannon
Model of a communication channel
Information entropy H

H = -Σ_{i=1}^{I} p_i log₂ p_i

lim_{p→0} p log₂ p = 0

p_i is the probability of state i
Information theory history
Claude E. Shannon ... he is also credited with founding both digital computer and digital circuit design theory in 1937, when, as a 21-year-old master's student at MIT, he wrote a thesis demonstrating that electrical application of Boolean algebra could construct and resolve any logical, numerical relationship. It has been claimed that this was the most important master's thesis of all time.
From Wikipedia
http://dspace.mit.edu/bitstream/handle/1721.1/11173/34541425.pdf?sequence=1
Information theory history
In 1950 Richard Wesley Hamming (1915-1998) was the first to specify a coding scheme that enables automatic correction of some errors
Information theory history
David Huffman is best known for his legendary Huffman code, a compression scheme for lossless variable length encoding. It was the result of a term paper he wrote while a graduate student at the Massachusetts Institute of Technology (MIT), where he earned a D.Sc. degree on a thesis named The Synthesis of Sequential Switching Circuits, advised by Samuel H. Caldwell (1953). "Huffman Codes" are used in nearly every application that involves the compression and transmission of digital data, such as fax machines, modems, computer networks, and high-definition television (HDTV), to name a few.
From Wikipedia
David A. Huffman (1925-1999)
Information theory history
http://www.its.bldrdoc.gov/fs-1037/fs-1037c.htm
Federal Standard 1037C
Information theory
“A key measure of information in the theory is known as entropy, which is usually expressed by the average number of bits needed for storage or communication. Intuitively, entropy quantifies the uncertainty involved when encountering a random variable. For example, a fair coin flip (2 equally likely outcomes) will have less entropy than a roll of a die (6 equally likely outcomes).”
From Wikipedia
Information entropy H

H = -Σ_{i=1}^{I} p_i log₂ p_i

lim_{p→0} p log₂ p = 0

p_i is the probability of state i
Information theory
“Applications of fundamental topics of information theory include lossless data compression (e.g. ZIP files), lossy data compression (e.g. MP3s), and channel coding(e.g. for DSL lines). The field is at the intersection of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering.”
From Wikipedia
Information Entropy
“Shannon's entropy represents an absolute limit on the best possible lossless compression of any communication, under certain constraints: treating messages to be encoded as a sequence of independent and identically-distributed random variables, Shannon's source coding theorem shows that, in the limit, the average length of the shortest possible representation to encode the messages in a given alphabet is their entropy divided by the logarithm of the number of symbols in the target alphabet.”
From Wikipedia
Information Entropy
“A fair coin has an entropy of one bit. However, if the coin is not fair, then the uncertainty is lower (if asked to bet on the next outcome, we would bet preferentially on the most frequent result), and thus the Shannon entropy is lower. ... A long string of repeating characters has an entropy rate of 0, since every character is predictable. The entropy rate of English text is between 1.0 and 1.5 bits per letter,[1] or as low as 0.6 to 1.3 bits per letter, according to estimates by Shannon based on human experiments.”
From Wikipedia
Exercise
If the average entropy of a letter in English is 1.6 bits, and texts are stored in a format where each character occupies 1 byte, what is the theoretical maximum compression?
Exercise
If the average entropy of a letter in English is 1.6 bits, and texts are stored in a format where each character occupies 1 byte, what is the theoretical maximum compression?
8/1.6 = 5
Exercise – histogram program (cleaned up into a complete C program; the bin count ID = 256, one bin per byte value, is an assumption)

#include <stdio.h>

#define ID 256                    /* one histogram bin per 8-bit character */

double his[ID], total;

int main(void) {
    int c, i;
    for (i = 0; i < ID; i++) his[i] = 0.0;
    total = 0.0;
    while ((c = getc(stdin)) != EOF) { his[c] += 1.0; total += 1.0; }
    for (i = 0; i < ID; i++) printf("%03d %e\n", i, his[i]);
    printf("%g\n", total);
    return 0;
}
Exercise – entropy program (a fragment continuing the histogram program; i is the number of bins, and log2 here is a variable holding ln 2)

log2 = log((double) 2.0);
max_entropy = log((double) i) / log2;        /* upper bound: log2 of the bin count */
for (j = 0; j < i; j++) {
    if (his[j] > 0.0) {
        hold = his[j] / total;               /* probability of symbol j */
        entropy -= hold * log(hold) / log2;  /* accumulate -p log2 p */
    }
}
Exercise – histogram program
From the Project Gutenberg website, download the English text of H.G. Wells' "The Time Machine" and analyze it with the histogram program to compute its information entropy.
Convert the text to doc and pdf formats and compute the information entropy again.
Compress the text with several different programs (zip, rar, gzip, ...) and compute the information entropy of the compressed files.
Comment on the results.
http://www.gutenberg.org/files/35/35.txt
Data, information, knowledge, wisdom
Where is the Life we have lost in living?Where is the wisdom we have lost in knowledge?Where is the knowledge we have lost in information?
-- from T.S. Eliot, "Choruses from 'The Rock'"
Data, information, knowledge, wisdom
Information is not knowledge,Knowledge is not wisdom,Wisdom is not truth,Truth is not beauty,Beauty is not love,Love is not music,and Music is THE BEST.
-- from Frank Zappa,
"Packard Goose"
Pyramid presentation

Wisdom
Knowledge
Information
Data
Pyramid presentation

Evaluate any choice
Know how (useful info.)
Who, what, where, how many?
Exists
Graphical presentation
DIKW
From Futurist
Thank you for your attention
Željko Jeričević, Ph.D.
Department of Computer Science, Faculty of Engineering &
Department of Biology and Medical Genetics, Faculty of Medicine
51000 Rijeka, Croatia
Phone: (+385) 51-651 594 E-mail: [email protected] [email protected]
http://www.riteh.hr/~zeljkoj/Zeljko_Jericevic.html