Spectral Composition of Semantic Spaces

Peter Wittek, Sándor Darányi
Swedish School of Library and Information Science, University of Borås
27/06/11

DESCRIPTION

Spectral theory in mathematics is key to the success of application domains as diverse as quantum mechanics and latent semantic indexing, both of which rely on eigenvalue decomposition to localize their respective entities in observation space. This points to some implicit "energy" inherent in semantics that needs quantification. We show how the structure of atomic emission spectra and meaning in concept space go back to the same compositional principle, and propose a tentative solution for computing the "energy" content of terms, documents and collections.

TRANSCRIPT

Page 1: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

Peter Wittek, Sándor Darányi

Swedish School of Library and Information Science, University of Borås

27/06/11

Page 2: Spectral Composition of Semantic Spaces

Outline

1 Spectroscopy

2 Semantic Spaces

3 Spectral Composition of Semantic Spaces

4 Evolving Semantics

Page 3: Spectral Composition of Semantic Spaces

Spectroscopy

Emission Spectra of Individual Atoms

Each element’s emission spectrum is unique.
Spectroscopy identifies the elements in a compound of unknown composition.
Solutions to the time-independent Schrödinger wave equation are commonly used to calculate the energy levels in the emission spectrum.

Figure: The emission spectrum of hydrogen
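As a rough illustration of how energy levels translate into spectral lines, the sketch below uses the textbook closed-form hydrogen levels E_n = -13.6 eV / n^2, which the time-independent Schrödinger equation yields for the Coulomb potential. It is not taken from the slides: the function names are hypothetical, and the constants ignore the reduced-mass correction, so the wavelengths are approximate.

```python
# Closed-form hydrogen energy levels (infinite nuclear mass approximation).
RYDBERG_EV = 13.605693   # Rydberg energy in eV
HC_EV_NM = 1239.841984   # Planck constant times speed of light, in eV*nm


def energy_level(n: int) -> float:
    """Energy of level n in eV (negative because the state is bound)."""
    return -RYDBERG_EV / n**2


def emission_wavelength_nm(n_upper: int, n_lower: int) -> float:
    """Wavelength of the photon emitted in the transition n_upper -> n_lower."""
    photon_energy = energy_level(n_upper) - energy_level(n_lower)
    return HC_EV_NM / photon_energy


# The Balmer series (transitions ending at n = 2) gives the visible lines.
balmer = {n: emission_wavelength_nm(n, 2) for n in range(3, 7)}
for n, wavelength in balmer.items():
    print(f"n = {n} -> 2: {wavelength:.1f} nm")
```

Each transition contributes one discrete line, which is why the spectrum of a single atom is a set of sharp lines rather than a continuum.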

Page 4: Spectral Composition of Semantic Spaces

Spectroscopy

Emission Spectra of Compounds

A spectrogram is a spectral representation of an electromagnetic signal that shows the spectral density of the signal.
A continuum of energy levels is called a “spectral band”.
Band spectra are combinations of many different spectral lines, resulting from rotational, vibrational and electronic transitions.

Figure: The visible spectrogram of the red dwarf EQ Vir

Page 5: Spectral Composition of Semantic Spaces

Semantic Spaces

Examples

Semantic spaces are algebraic models for representing terms as vectors.
HAL (Hyperspace Analogue to Language)
VSM (vector space model)
Latent semantic indexing
Random indexing
Term co-occurrence models

Page 6: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

Semantic Spaces as Observables

To be treated as an observable, the semantic space must be a self-adjoint operator.
Since semantic spaces are typically real-valued, self-adjoint simply means symmetric.
Different solutions exist to make a semantic space self-adjoint:
Padding a term-document (TD) matrix
HAL: H + H^T
Term co-occurrence matrix
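The three routes to a symmetric matrix can be sketched with NumPy as follows. The matrices are made-up toy data, and two choices are assumptions rather than something the slides specify: padding is taken to mean the block form [[0, A], [A^T, 0]], and the term co-occurrence matrix is taken to be A A^T.

```python
import numpy as np

# Hypothetical toy term-document matrix: 3 terms (rows) x 4 documents (columns).
A = np.array([[2., 0., 1., 0.],
              [0., 1., 1., 0.],
              [1., 0., 0., 3.]])

# Option 1 (assumed form of padding): embed A in a symmetric block matrix
# [[0, A], [A^T, 0]] of size (terms + docs) x (terms + docs).
n_terms, n_docs = A.shape
padded = np.block([[np.zeros((n_terms, n_terms)), A],
                   [A.T, np.zeros((n_docs, n_docs))]])

# Option 2: a HAL-style matrix counts directional co-occurrences, so it is
# not symmetric; adding its transpose merges both directions.
H = np.array([[0., 3., 1.],
              [1., 0., 2.],
              [0., 4., 0.]])
hal_symmetric = H + H.T

# Option 3 (assumed definition): A A^T is symmetric by construction.
cooccurrence = A @ A.T

for name, M in [("padded", padded), ("HAL", hal_symmetric),
                ("co-occurrence", cooccurrence)]:
    print(name, M.shape, "symmetric:", np.allclose(M, M.T))
```

A symmetric real matrix has real eigenvalues and orthogonal eigenvectors, which is what lets it play the role of an observable.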

Page 7: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

Semantic Spectrum

Eigendecomposition of a term co-occurrence matrix.
Decomposing a semantic space results in a concept space or a topic model.
We identify this latent topic mixture in LSI with the energy eigenstructure.
More prevalent hidden topics correspond to higher energy states of atoms and molecules.
In a metaphoric sense, words in an eigendecomposition are similar to chemical compounds: as both are composed of doses of latent constituents, the dosimetric view applies to them.
Since the term co-occurrence matrix does not have an underlying physical meaning, we mapped the eigenvalues to the visible spectrum.
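A minimal sketch of this pipeline is given below. The slides do not say how the eigenvalues were mapped to the visible spectrum, so the linear rescaling onto 380 to 750 nm, the choice to send the largest eigenvalue to the shortest wavelength, and the name `to_wavelength_nm` are all assumptions made for illustration; the counts are toy data.

```python
import numpy as np

# Hypothetical toy term-document counts (4 terms x 5 documents).
A = np.array([[2., 1., 0., 0., 1.],
              [1., 2., 0., 1., 0.],
              [0., 0., 3., 1., 0.],
              [0., 1., 1., 2., 1.]])
C = A @ A.T  # symmetric term co-occurrence matrix

# eigh handles symmetric matrices: real eigenvalues in ascending order,
# orthonormal eigenvectors in the columns of `vectors`.
values, vectors = np.linalg.eigh(C)


def to_wavelength_nm(values, lo=380.0, hi=750.0):
    """Assumed linear rescaling onto the visible range.

    The largest eigenvalue (highest 'energy') maps to the shortest
    wavelength, because photon energy falls as wavelength grows.
    """
    span = values.max() - values.min()
    scaled = (values - values.min()) / span if span > 0 else np.zeros_like(values)
    return hi - scaled * (hi - lo)


wavelengths = to_wavelength_nm(values)
for value, wavelength in zip(values, wavelengths):
    print(f"eigenvalue {value:8.3f} -> {wavelength:6.1f} nm")
```

Any monotone mapping would preserve the ordering of the lines; the linear one is simply the easiest to inspect.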

Page 8: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

The Spectrum of the Reuters Collection

The semantic spectrum of the collection is a composite, a sum of spectra of elementary components, which would correspond to individual elements in a chemical compound in spectrophotometry.

Page 9: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

The Spectrum of the Term Japan

We match spectral components to terms based on their proximity to latent variables.
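One way to read this matching is sketched below. The slides do not define the proximity measure, so the sketch assumes the squared loading of a term on each eigenvector as the line intensity and drops lines below an arbitrary threshold; the vocabulary, the counts and the helper `term_spectrum` are hypothetical.

```python
import numpy as np

terms = ["japan", "trade", "courage", "male"]  # hypothetical vocabulary
A = np.array([[3., 2., 0., 0., 1.],
              [2., 3., 0., 1., 0.],
              [0., 0., 2., 1., 0.],
              [0., 1., 1., 2., 1.]])  # hypothetical counts, terms x documents
values, vectors = np.linalg.eigh(A @ A.T)


def term_spectrum(term_index, threshold=0.05):
    """Return (eigenvalue, intensity) pairs for one term.

    Assumption: intensity is the squared loading of the term on each
    eigenvector; lines weaker than `threshold` are dropped.
    """
    intensity = vectors[term_index] ** 2  # row = term, column = latent variable
    keep = intensity >= threshold
    return list(zip(values[keep], intensity[keep]))


for i, term in enumerate(terms):
    lines = ", ".join(f"{v:.2f} ({w:.2f})" for v, w in term_spectrum(i))
    print(f"{term}: {lines}")
```

Because the eigenvector matrix is orthogonal, the intensities of each term sum to one before thresholding, so a term's spectrum can be read as a mixture over latent variables.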

Page 10: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

The Spectrum of the Term Courage

Page 11: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

The Spectrum of the Term Male

Page 12: Spectral Composition of Semantic Spaces

Spectral Composition of Semantic Spaces

Energy Level for a Sentence

Work required to create an utterance.
The less likely one is to encounter a certain sense, the more energy is required to construct a sentence with the term in that particular sense.

Page 13: Spectral Composition of Semantic Spaces

Evolving Semantics

Language Change

An attempt to formalize corpus dynamics.
External forces leading to expansion.
Inherent quality in terms and their agglomerates called their meaning.

Evolving vector spaces of terms and documents follow directly from variable matrix spectra.
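To make the idea of a variable matrix spectrum concrete, the sketch below compares the eigenvalues of a toy co-occurrence matrix before and after new documents arrive. The data are invented, the co-occurrence matrix is assumed to be A A^T, and using the sum of eigenvalues as a stand-in for the total "energy" of the collection is an assumption made for illustration only.

```python
import numpy as np


def spectrum(term_doc):
    """Eigenvalues of the term co-occurrence matrix, in ascending order."""
    return np.linalg.eigvalsh(term_doc @ term_doc.T)


# Hypothetical collection at time t0: 3 terms x 3 documents.
A_t0 = np.array([[1., 0., 2.],
                 [0., 1., 1.],
                 [1., 1., 0.]])
# At time t1 two new documents arrive and expand the collection.
new_docs = np.array([[0., 1.],
                     [2., 0.],
                     [1., 1.]])
A_t1 = np.hstack([A_t0, new_docs])

s0, s1 = spectrum(A_t0), spectrum(A_t1)
print("spectrum at t0:", np.round(s0, 3))
print("spectrum at t1:", np.round(s1, 3))
print("shift per eigenvalue:", np.round(s1 - s0, 3))
# Assumption: the sum of eigenvalues (the trace) stands in for total "energy".
print("total at t0 and t1:", s0.sum(), s1.sum())
```

In this construction, adding documents adds a positive semidefinite term to the co-occurrence matrix, so no eigenvalue can decrease and the spectrum drifts upward as the collection grows.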

Page 14: Spectral Composition of Semantic Spaces

Evolving Semantics

Hamiltonian

The Hamiltonian describes the energy stored in a system.
H = T + V
T is the kinetic energy and V is the potential energy of a system.
The change in T goes back partly to changes in document collection content reflected by different index term occurrence rates.

Page 15: Spectral Composition of Semantic Spaces

Evolving Semantics

Summary

http://www.squalar.org/publications.html

Compositional semantics versus spectral bands: a representation richer than a simple vector space.

Connecting quantum mechanics (QM) and language by the concept of energy: intellectual work is stored in documents, and the total energy of a system changes over time.

Bottlenecks: not every word has a nice intuitive spectrum.