spectral composition of semantic spaces
DESCRIPTION
Spectral theory in mathematics underpins application domains as diverse as quantum mechanics and latent semantic indexing, both of which rely on eigenvalue decomposition to localize their respective entities in observation space. This points to an implicit "energy" inherent in semantics that needs to be quantified. We show how the structure of atomic emission spectra and meaning in concept space go back to the same compositional principle, and propose a tentative solution for computing the "energy" content of terms, documents and collections.
TRANSCRIPT
Spectral Composition of Semantic Spaces
Peter Wittek, Sandor Daranyi
Swedish School of Library and Information Science, University of Boras
27/06/11
Outline
1 Spectroscopy
2 Semantic Spaces
3 Spectral Composition of Semantic Spaces
4 Evolving Semantics
Spectroscopy
Emission Spectra of Individual Atoms
Each element’s emission spectrum is unique.
Spectroscopy identifies the elements in a compound of unknown composition.
Solutions to the time-independent Schrödinger wave equation are commonly used to calculate the energy levels in the emission spectrum.
Figure: The emission spectrum of hydrogen
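As a concrete illustration of how discrete energy levels produce the lines in the hydrogen spectrum, here is a minimal sketch based on the Rydberg formula. The function name `emission_wavelength_nm` and the constant name `RYDBERG_H` are illustrative choices, not something taken from the talk.

```python
# Rydberg constant for hydrogen in 1/m (reduced-mass corrected value).
RYDBERG_H = 1.0967758e7


def emission_wavelength_nm(n_lower: int, n_upper: int) -> float:
    """Vacuum wavelength in nm of the photon emitted when an electron
    drops from level n_upper to level n_lower (Rydberg formula)."""
    if not 0 < n_lower < n_upper:
        raise ValueError("require 0 < n_lower < n_upper")
    inverse_wavelength = RYDBERG_H * (1.0 / n_lower**2 - 1.0 / n_upper**2)
    return 1e9 / inverse_wavelength


# Balmer series (transitions ending at n = 2): the visible hydrogen lines.
balmer_lines = {n: emission_wavelength_nm(2, n) for n in range(3, 7)}
for n, wavelength in balmer_lines.items():
    print(f"n = {n} -> 2: {wavelength:.1f} nm")
```

The four lines land near 656, 486, 434 and 410 nm, the familiar red, blue-green and violet lines of hydrogen.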
Emission Spectra of Compounds
A spectrogram is a spectral representation of an electromagnetic signal that shows the spectral density of the signal.
Continua of energy levels are called “spectral bands”.
Band spectra are combinations of many different spectral lines, resulting from rotational, vibrational and electronic transitions.
Figure: The visible spectrogram of the red dwarf EQ Vir
Semantic Spaces
Examples
Semantic spaces are algebraic models for representing terms as vectors.
HAL
VSM
Latent semantic indexing
Random indexing
Term co-occurrence models
Spectral Composition of Semantic Spaces
Semantic Spaces as Observables
The semantic space must be a self-adjoint operator.
Since semantic spaces are typically real-valued, self-adjoint simply means symmetric.
Different solutions exist to make a semantic space self-adjoint:
Padding a TD matrix
HAL: H + H^T
Term co-occurrence matrix
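The three routes to a symmetric matrix can be sketched with NumPy as follows. This is a toy illustration with made-up counts. The slide does not say how the TD matrix is padded, so the block form [[0, A], [A^T, 0]] is an assumption, and so is forming the term co-occurrence matrix as A A^T.

```python
import numpy as np

# Toy term-document (TD) matrix A: 3 terms x 2 documents (made-up counts).
td = np.array([[2.0, 0.0],
               [1.0, 1.0],
               [0.0, 3.0]])
n_terms, n_docs = td.shape

# Option 1 (assumed form of "padding"): symmetric block matrix [[0, A], [A^T, 0]].
padded = np.block([[np.zeros((n_terms, n_terms)), td],
                   [td.T, np.zeros((n_docs, n_docs))]])

# Option 2: a HAL-style directional co-occurrence matrix H, symmetrized as H + H^T.
hal = np.array([[0.0, 2.0, 1.0],
                [0.0, 0.0, 3.0],
                [1.0, 0.0, 0.0]])
hal_symmetric = hal + hal.T

# Option 3 (assumed construction): term co-occurrence matrix A A^T,
# which is symmetric by construction.
cooccurrence = td @ td.T

for name, matrix in [("padded TD", padded),
                     ("H + H^T", hal_symmetric),
                     ("A A^T", cooccurrence)]:
    print(name, matrix.shape, "symmetric:", np.allclose(matrix, matrix.T))
```

Each result is real and symmetric, so its eigenvalues are real and it can play the role of an observable.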
Semantic Spectrum
Eigendecomposition of a term co-occurrence matrix.
Decomposing a semantic space results in a concept space or a topic model.
We identify this latent topic mixture in LSI with the energy eigenstructure.
More prevalent hidden topics correspond to higher energy states of atoms and molecules.
In a metaphoric sense, words in an eigendecomposition are similar to chemical compounds: as both are composed of doses of latent constituents, the dosimetric view applies to them.
Since the term co-occurrence matrix does not have an underlying physical meaning, we mapped the eigenvalues to the visible spectrum.
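A minimal sketch of this step, using a made-up co-occurrence matrix. The slides do not specify how eigenvalues were mapped to the visible spectrum. The linear rescaling onto 380 to 750 nm, and the choice to put the largest eigenvalue (the highest "energy") at the violet end, are assumptions made only for illustration.

```python
import numpy as np

# Toy symmetric term co-occurrence matrix (made-up counts).
cooccurrence = np.array([[4.0, 2.0, 0.0],
                         [2.0, 2.0, 3.0],
                         [0.0, 3.0, 9.0]])

# eigh is meant for symmetric/Hermitian input: it returns real eigenvalues
# in ascending order and orthonormal eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(cooccurrence)

# Assumed mapping: rescale eigenvalues linearly onto the visible range,
# largest eigenvalue -> shortest wavelength (higher energy).
VIOLET_NM, RED_NM = 380.0, 750.0
span = eigenvalues.max() - eigenvalues.min()
scaled = (eigenvalues - eigenvalues.min()) / span  # 0 = smallest, 1 = largest
wavelengths_nm = RED_NM - scaled * (RED_NM - VIOLET_NM)

for value, wavelength in zip(eigenvalues, wavelengths_nm):
    print(f"eigenvalue {value:8.3f} -> {wavelength:6.1f} nm")
```

Any monotone mapping would serve the same visual purpose. The linear one is simply the easiest to read.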
The Spectrum of the Reuters Collection
The semantic spectrum of the collection is a composite, a sum of spectra of elementary components, which would correspond to individual elements in a chemical compound in spectrophotometry.
The Spectrum of the Term Japan
We match spectral components to terms based on their proximity to latent variables.
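One way to read this matching is sketched below. The slides do not define "proximity", so measuring it by the squared component of a term on each eigenvector is an assumption. The term labels are borrowed from the slide titles, while the counts are made up.

```python
import numpy as np

# Labels borrowed from the slides; the counts below are purely illustrative.
terms = ["japan", "courage", "male"]
cooccurrence = np.array([[4.0, 2.0, 0.0],
                         [2.0, 2.0, 3.0],
                         [0.0, 3.0, 9.0]])

eigenvalues, eigenvectors = np.linalg.eigh(cooccurrence)

# Row i of `eigenvectors` holds the coordinates of term i on each latent
# variable. Assumed proximity measure: the squared coordinate.
loadings = eigenvectors ** 2            # each row sums to 1 (orthogonal matrix)

# Per-term "spectrum": weight every spectral component (eigenvalue) by how
# strongly the term loads on it. Row i sums to cooccurrence[i, i].
intensities = loadings * eigenvalues

for term, row in zip(terms, intensities):
    print(term, np.round(row, 3))
```

Under this reading the collection spectrum is the set of eigenvalues, and each term lights up only those components it loads on, which is one way to obtain a composite spectrum from elementary ones.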
The Spectrum of the Term Courage
The Spectrum of the Term Male
Energy Level for a Sentence
Work required to create an utterance.
The less likely a certain sense is to be encountered, the more energy is required to construct a sentence with the term in that particular sense.
Evolving Semantics
Language Change
An attempt to formalize corpus dynamics.
External forces leading to expansion.
An inherent quality of terms and their agglomerates, called their meaning.
Evolving vector spaces of terms and documents follow directly from variable matrix spectra.
Hamiltonian
The Hamiltonian describes the energy stored in a system.
H = T + V
T is the kinetic energy and V is the potential energy of a system.
The change in T goes back partly to changes in document collection content, reflected by different index term occurrence rates.
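For orientation, the link between the Hamiltonian and the semantic spectrum can be written as a pair of eigenvalue problems. The left-hand side is the standard time-independent Schrödinger equation. The right-hand side uses hypothetical symbols (A for a symmetrized semantic space matrix, v_k and λ_k for its eigenvectors and eigenvalues) that do not appear in the slides.

```latex
\hat{H}\,\psi_n = E_n\,\psi_n
\qquad\longleftrightarrow\qquad
A\,v_k = \lambda_k\,v_k,
\quad A = A^{\mathsf{T}}
```

In this reading the energy levels E_n play the role that the eigenvalues λ_k play for the semantic space, so a change in the matrix over time shows up as a change in its spectrum.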
Summary
http://www.squalar.org/publications.html
Compositional semantics versus spectral bands.
A representation richer than a simple vector space.
Connecting QM and language by the concept of energy
Intellectual work stored in documents.
The total energy of a system changes over time.
Bottlenecks
Not every word has a nice intuitive spectrum.