Transduction and Active Learning in the Quantum Learning of Unitary Transformations
DESCRIPTION
Quantum learning of a unitary transformation estimates a quantum channel in a process similar to quantum process tomography. The classical counterpart of this goal, finding an unknown function, is regression, although the methodology hardly resembles the outline of classical algorithms. To gain a better understanding of what such a methodology means to learning theory, we anchor it to the familiar concepts of active learning and transduction. Learning the unitary transformation translates to optimally storing it in quantum memory, but the quantum learning procedure also requires an optimal, maximally entangled input state. We argue that this is akin to active learning. Two different retrieval strategies apply when we would like to use the learned unitary transformation: a coherent strategy, which stores the unitary in quantum memory, and an incoherent one, which measures the unitary and stores it in classical memory; the latter strategy is considered optimal. We further argue that the incoherent strategy is a blend of inductive and transductive learning, as the optimal input state depends on the number of target states on which the transformation should be applied, yet once it is learned, the transformation can be used an arbitrary number of times. On the other hand, the suboptimal coherent strategy of storing and applying the unitary is a form of transduction with no inductive element.

TRANSCRIPT
Transduction and Active Learning in the Quantum Learning of Unitary Transformations
Peter Wittek
CLASSICAL REGRESSION

Machine learning is a field of artificial intelligence that seeks patterns in empirical data without forcing models on it. Consider a training set {(x_1, y_1), . . . , (x_N, y_N)}, where x_i is a finite-dimensional data point and y_i is an associated outcome. In regression, the task is to approximate an unknown function f with an estimate f̂ such that f̂(x_i) = y_i for all i. Then, encountering a previously unseen x, we calculate f̂(x).
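The regression setup can be sketched in a few lines. This is a minimal illustration, assuming a linear model and noiseless synthetic data; neither is prescribed by the text.

```python
import numpy as np

# Fit an estimate f_hat to a training set {(x_i, y_i)} and evaluate it on a
# previously unseen x. The linear model f_hat(x) = a*x + b and the synthetic
# data are illustrative assumptions.

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=50)
y_train = 2.0 * x_train - 0.5            # the unknown f, here linear for simplicity

# Least-squares fit of the two coefficients a and b
A = np.stack([x_train, np.ones_like(x_train)], axis=1)
coeffs, *_ = np.linalg.lstsq(A, y_train, rcond=None)

def f_hat(x):
    """Evaluate the learned approximation on a new point x."""
    return coeffs[0] * x + coeffs[1]

print(f_hat(0.25))   # close to 2*0.25 - 0.5 = 0.0
```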
LEARNING A UNITARY

If we denote the unitary transformation by U, the goal of quantum process tomography is to derive an estimate of U while probing it with input states ρ_in and observing the corresponding output states ρ_out. To make it analogous to regression, we have the input and corresponding output states fixed, and estimate the unitary such that U(ρ_in) = ρ_out. Then, just as in the classical case, we would like to calculate U(ρ) for a new state.

Unlike in the classical setting, learning a unitary requires a double maximization: we need an optimal measuring strategy that optimally approximates the unitary, and we also need an optimal input state that best captures the information of the unitary.
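The regression analogy can be made concrete for a single qubit. The sketch below reconstructs a unitary channel from its action on probe inputs alone, via the Pauli transfer matrix; probing with the Pauli operator basis (rather than physical eigenstates) and the choice of U are simplifying assumptions for illustration.

```python
import numpy as np

# Estimate the channel rho_in -> U rho_in U† purely from input/output pairs,
# via the Pauli transfer matrix R_ij = tr(sigma_i U(sigma_j)) / 2.

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I, X, Y, Z]

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]], dtype=complex)
channel = lambda rho: U @ rho @ U.conj().T   # the "unknown" process we probe

# Transfer matrix built only from the channel's action on the probes
R = np.array([[np.trace(p_i @ channel(p_j)).real / 2
               for p_j in paulis] for p_i in paulis])

# Apply the estimate to a new state, as in the classical regression case
rho = np.array([[0.8, 0.3], [0.3, 0.2]], dtype=complex)
bloch_in = [np.trace(p @ rho).real for p in paulis]
rho_est = sum(b * p for b, p in zip(R @ bloch_in, paulis)) / 2

print(np.allclose(rho_est, channel(rho)))   # True: the channel is recovered
```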
Outline of learning a unitary
In indirect process tomography, we use the Choi-Jamiołkowski isomorphism to imprint the unitary on a state. The key steps are [1]:
• Learning the unitary translates to the optimal storage and parallel application of the unitary on a suitable input state.

• It requires an optimal input state, a superposition of maximally entangled states. This resembles active learning.

• The learned unitary is applied either with a coherent strategy, that is, retrieving it from quantum memory, or with an incoherent strategy, that is, measuring it and retrieving it from classical memory. The optimal incoherent strategy is a mixture of inductive and transductive learning, whereas the suboptimal coherent strategy is purely transduction.
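The storage step rests on the Choi-Jamiołkowski isomorphism: applying the unknown unitary to one half of a maximally entangled pair imprints it on a state. A minimal sketch, with an arbitrary single-qubit rotation standing in for the unknown U:

```python
import numpy as np

# Imprint U on a state: |phi_U> = (U ⊗ I)|phi+>, where |phi+> is maximally
# entangled. The rotation U is an arbitrary choice for illustration.

d = 2
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]], dtype=complex)

# Maximally entangled input |phi+> = (1/sqrt(d)) sum_m |m>|m>
phi_plus = np.eye(d).reshape(-1).astype(complex) / np.sqrt(d)

# "Storage": the unitary acts on the first subsystem only
phi_U = np.kron(U, np.eye(d)) @ phi_plus

# Reshaping the state recovers U: state and unitary carry the same information
print(np.allclose(phi_U.reshape(d, d) * np.sqrt(d), U))   # True
```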
INDUCTION VERSUS TRANSDUCTION
[Figure: schematic comparison. Inductive learning infers a decision surface separating Class 1 from Class 2; transductive learning labels the given unlabeled instances directly.]

Inductive learning:
• N labelled or unlabelled points
• A function is inferred
• The learned function is applied on unseen data

Transductive learning:
• N labelled or unlabelled points
• No inference
• K additional data points are known in advance: {x_{N+1}, . . . , x_{N+K}}
Transduction avoids inference to the more general: it infers from particular instances to particular instances, asking for less, since an inductive function implies a transductive one. Transduction is similar to instance-based learning, a family of algorithms that compares new problem instances with training instances. The goal in transductive learning is actually to minimize the test error, instead of the more abstract goal of maximizing generalization performance [3].

The optimal retrieval of U from the memory state |φ_U⟩ is achieved by measuring the ancilla with the optimal covariant POVM of the form Ξ = |η⟩⟨η| [1], namely P_U = |η_U⟩⟨η_U|, where

|η_U⟩ = ⊕_j √d_j |U_j⟩,

and, conditionally on the outcome U, by performing the unitary U on the new input system. The optimal probability coefficients can be written as

p_j = d_j m_j / d^N,

where m_j is the multiplicity of the corresponding subspace [2]. This is true for the K = 1 case. The more generic case, for arbitrary K, takes the global fidelity between the estimated channel C_U and U^⊗K; that is, the objective function averages over (|tr(U† C_U)|/d)². Hence the exact values of p_j always depend on K, tying the optimal input state to the number of applications and making it a case of transduction.
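The coefficients p_j can be checked numerically. Below is a sketch for U ∈ SU(2) acting on N qubits (so d = 2), where the spin-j irrep dimensions and multiplicities are known in closed form; the Catalan-triangle multiplicity formula is standard representation theory, not taken from the text above.

```python
from math import comb

# Check that p_j = d_j * m_j / d^N is a probability distribution for SU(2),
# d = 2: the spin-j irrep has dimension d_j = 2j + 1 and multiplicity
# m_j = C(N, N/2 - j) - C(N, N/2 - j - 1).

def irrep_table(N):
    """Return a list of (d_j, m_j) for the decomposition of U^{⊗N}, U in SU(2)."""
    table = []
    for two_j in range(N % 2, N + 1, 2):   # 2j has the parity of N
        k = (N - two_j) // 2
        m_j = comb(N, k) - (comb(N, k - 1) if k >= 1 else 0)
        table.append((two_j + 1, m_j))      # d_j = 2j + 1
    return table

for N in range(1, 9):
    probs = [d_j * m_j / 2**N for d_j, m_j in irrep_table(N)]
    assert abs(sum(probs) - 1.0) < 1e-12    # the d_j * m_j sum to d^N

print(irrep_table(3))   # [(2, 2), (4, 1)]: spin-1/2 twice, spin-3/2 once
```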
ACTIVE LEARNING

In active learning, the learning algorithm is able to solicit labels for problematic unlabeled instances from an appropriate information source, for instance, from a human annotator [5]. Some typical classical strategies:
• Uncertainty sampling: the selected set corresponds to those data instances where the confidence is low.

• Expected error reduction: select those data instances where the model performs poorly, that is, where the generalization error is most likely to be reduced.

• Variance reduction: generalization performance is hard to measure, whereas minimizing output variance is far more feasible; select those data instances which minimize output variance.
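The first strategy above can be sketched in a few lines: from a pool of unlabeled instances, query the ones where the current model is least confident, here measured by the entropy of its predicted class probabilities. The probabilities are synthetic, purely for illustration.

```python
import numpy as np

# Uncertainty sampling: pick the unlabeled instances whose predicted class
# distribution has the highest entropy, i.e. where confidence is lowest.

# Predicted class probabilities for 6 unlabeled instances (2 classes)
probs = np.array([[0.95, 0.05],
                  [0.55, 0.45],
                  [0.80, 0.20],
                  [0.50, 0.50],
                  [0.99, 0.01],
                  [0.60, 0.40]])

entropy = -np.sum(probs * np.log(probs), axis=1)   # uncertainty measure

# Solicit labels for the 2 most uncertain instances, most uncertain first
query = np.argsort(entropy)[-2:][::-1]
print(query)   # the instances with probabilities closest to 50/50
```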
Optimal quantum learning of unitaries is similar to active learning in a sense: it requires an optimal input state. Since the learner has access to U, by calling the transformation on an optimal input state, the learner ensures that the most important characteristics are imprinted on the approximation state of the unitary. The optimal input state for storage can be taken to be of the form
|φ〉 = ⊕j∈Irr(U⊗N )
√pjdj|Ij〉 ∈ H,
where pj are probabilities, the index j runs over the set Irr(U⊗N ) of all irreducible representations{Uj} contained in the decomposition of {U⊗N}, dj is the dimension of the corresponding subspaceHj , and H = ⊕j∈Irr(U⊗N )(Hj ⊗ Hj) is a subspace of Ho ⊗ Hi carrying the representation U =⊕j∈Irr(U⊗N )(Uj ⊗ Ij), Ij being the identity in Hj .Binary classification scheme based on the quantum state tomography of state classes through Hel-strom measurements also requires an optimal input state [4].
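The optimal input state can be constructed explicitly in the smallest nontrivial case, U ∈ SU(2) with N = 2: the irreps are the singlet (d_0 = 1) and the triplet (d_1 = 3), and p_j = d_j m_j / d^N gives p_0 = 1/4, p_1 = 3/4. The sketch reads |I_j⟩ as the unnormalized maximally entangled state Σ_m |m⟩|m⟩ on H_j ⊗ H_j and √(p_j/d_j) as the block weight; this reading of the formula is our assumption.

```python
import numpy as np

# Build |phi> = ⊕_j sqrt(p_j/d_j) |I_j> for SU(2), N = 2:
# singlet (d_0 = 1, p_0 = 1/4) and triplet (d_1 = 3, p_1 = 3/4).
blocks = [(1, 0.25), (3, 0.75)]   # (d_j, p_j)

parts = []
for d_j, p_j in blocks:
    I_j = np.eye(d_j).reshape(-1)         # |I_j> = sum_m |m>|m>, norm sqrt(d_j)
    parts.append(np.sqrt(p_j / d_j) * I_j)
phi = np.concatenate(parts)               # direct sum over the irreps

# Each |I_j> has squared norm d_j, so ||phi||^2 = sum_j p_j = 1
print(np.linalg.norm(phi))   # 1.0
```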
REFERENCES
[1] A. Bisio, G. M. D'Ariano, P. Perinotti, and M. Sedlák. Quantum learning algorithms for quantum measurements. Physics Letters A, 375:3425–3434, 2011.
[2] G. Chiribella, G. M. D'Ariano, and M. F. Sacchi. Optimal estimation of group transformations using entanglement. Physical Review A, 72(4):042338, 2005.
[3] R. El-Yaniv and D. Pechyony. Transductive Rademacher complexity and its applications. In Proceedings of COLT-07, 20th Annual Conference on Learning Theory, 2007.
[4] M. Guta and W. Kotłowski. Quantum learning: Asymptotically optimal classification of qubit states. New Journal of Physics, 12(12):123032, 2010.
[5] B. Settles. Active learning literature survey. Technical Report 1648, University of Wisconsin–Madison, 2009.