POLITECNICO DI MILANO - Faculty of Systems Engineering
Master of Science in Biomedical Engineering
THESIS
Interaction between dynamic Rubber Hand Illusion
and viewpoint changes in Virtual Reality
Advisors: Prof. Giuseppe BASELLI
Prof. Olaf BLANKE
Co-advisor: Danilo REZENDE
Maria Elena GRISOSTOLO
Student ID: 740159
Academic Year 2010-2011
INDEX
INDEX OF FIGURES
INDEX OF TABLES
SOMMARIO
SUMMARY
1. INTRODUCTION
2. BACKGROUND AND LITERATURE REVIEW
2.1 MULTISENSORY PERCEPTION
2.2 THE RUBBER HAND ILLUSION
2.3 VIRTUAL RUBBER HAND ILLUSION
3. EXPERIMENTAL AND DATA ANALYSIS METHODS
3.1 RATIONALE OF THE CURRENT EXPERIMENT
3.2 DATA ANALYSIS METHODS
3.3 DATA MODEL
3.4 ESTIMATE METHODS
4. MATERIALS & METHODS
4.1 SUBJECTS
4.2 TOOLS
4.3 SETUP
4.4 PROCEDURE
4.5 PILOT STUDY
4.6 DATA MODEL
4.7 DATA ANALYSIS
5. RESULTS
6. DISCUSSION
7. CONCLUSIONS and FURTHER DEVELOPMENTS
8. BIBLIOGRAPHY
INDEX OF FIGURES
Figure 1: results of study 1. Probability of perceiving the scene from a first-person perspective, as a function of camera displacements along x, y, z.
Figure 2: results of study 2. Probability of perceiving the virtual hand as one's own, as a function of camera displacements along x, y, z and using different degrees of congruency between the visual and the tactile stimulus. Blue lines: maximum degree of visuo-tactile congruency. Green line: minimum degree of congruency.
Figure 3: results for study 1 - probability of perceiving a 1PP scene by displacing the camera along x, y and z.
Figure 4: results for study 2 - probability of ownership feeling of the virtual hand, varying the camera position and the congruency level. Blue line: maximum visuo-tactile congruency. Red line: minimum visuo-tactile congruency.
Figure 5: Posterior of an object position, given the visual input (dotted line), the auditory input (dashed line) and the visual and auditory inputs combined (solid line). The values along the x axis represent the MAP estimates of the object position.
Figure 6: Iterative basis function network performing multisensory predictions.
Figure 7: An example of the rubber hand used to perform the experiments.
Figure 8: stimulation of both the rubber and the real hand.
Figure 9: the typical questionnaire used to evaluate the illusion.
Figure 10: relationship between the reach displacement towards the rubber hand and the illusion duration, during 30' stimulation.
Figure 11: drift towards the rubber hand for the synchronously brushed group (Tsakiris & Haggard, 2005).
Figure 12: 1PP vs. 3PP.
Figure 13: Virtual RHI set-up. Left panel: the participant wears passive stereo glasses and a head-tracker, and the virtual image is determined as a function of his head direction. The experimenter taps and strokes the participant's real hand with a special device, whose position is tracked and ...
Figure 14: Bayesian analysis combines prior beliefs about the quantities of interest and information (or evidence) contained in an observed set of data, to make inference on unknown quantities.
Figure 15: psychometric curve, showing how parameters a and b affect its centre and its shape; random values of a and b are used just to provide a qualitative example.
Figure 16: spatial distribution of the collected answers for each camera position (indicated on the axes of the plot). Different colours discriminate different answers: blue points = "YES", red points = "NO". The big blue ball in the middle represents the subject's head.
Figure 17: the fusionCORE™ software interface.
Figure 18: subject seated in the middle of the capture area while performing the experiment.
Figure 19: Head Mounted Display (HMD).
Figure 20: the virtual scene seen without wearing the HMD.
Figure 21: the virtual scene, with a table, a ball and a hand.
Figure 22: the virtual scene viewed by the operator. Green sphere: subject's head. Blue cone: camera. Blue grid: volume inside which the camera randomly moves. Red line: bias between camera and head position. Yellow ball: light.
Figure 23: virtual scene viewed by the operator while the subject was performing the experiment.
Figure 24: 3 different body axes for 3 different camera movements.
Figure 25: the experimental flow. The experiment was divided into 3 main parts: black screen, virtual scene and question.
Figure 26: one subject during the experiment.
Figure 27: question for study 1: "Do you feel that the point of view of this trial is the same as yours?"
Figure 28: set-up for the subject's hand; two markers placed on the back of the hand and one vibrating motor on the tip of the finger.
Figure 29: question for study 2: "Do you feel the virtual hand as your own hand?"
Figure 30: the virtual hand with the forearm as it appeared in the scene. It lacked physiological parts such as the elbow, the upper arm and the shoulder.
Figure 31: study 1 - results. Probability of an answer equal to "yes", displacing the camera along x (left/right of the subject).
Figure 32: study 1 - results. Probability of an answer equal to "yes", displacing the camera along y (up/down relative to the subject).
Figure 33: study 1 - results. Probability of an answer equal to "yes", displacing the camera along z (near/far from the subject).
Figure 34: isoprobability curves. The blue area represents the maximum probability of a "yes" answer (80%) and the related range of positions along the x and y axes.
Figure 35: study 2 - results. Probability of an answer equal to "yes", displacing the camera along x (left/right of the subject), using 3 different congruency levels (λ=0%, 50%, 100%).
Figure 36: study 2 - results. Probability of an answer equal to "yes", displacing the camera along y (up/down relative to the subject), using 3 different congruency levels (λ=0%, 50%, 100%).
Figure 37: study 2 - results. Probability of an answer equal to "yes", displacing the camera along z (near/far from the subject), using 3 different congruency levels (λ=0%, 50%, 100%).
Figure 38: Comparison between the ranges of positions for λ=0% (panels a and b) and λ=100% (panels c and d), for x (top panels) and z (bottom panels).
Figure 39: results – study 1. Probability of a "YES" answer, given a camera displacement along the x, y, z axes.
Figure 40: possible explanation for the y bias, which seems to be related to the set-up.
Figure 41: isoprobability curves. The light blue area represents the maximum probability of a "yes" answer (~80%) and the related range of positions along the x and y axes. The head of the subject is also plotted for easier comprehension of the figure.
Figure 42: results – study 2. Probability of experiencing the ownership feeling, given a camera displacement along the 3 axes (x, y, z) and using 3 different congruency levels.
Figure 43: comparison between the ranges of positions along x (left panel) and z (right panel).
INDEX OF TABLES
Table 1: Optimum model for study 1
Table 2: Values for the optimum model for study 1
Table 3: Optimum model for study 2
Table 4: Values of the optimum model for study 2
Table 5: Output of the MCMC estimation, study 1
Table 6: Maximum likelihood estimation, study 1
Table 7: Output of the MCMC estimation, study 2
Table 8: Maximum likelihood estimation, study 2
Table 9: Comparison between the two estimation methods
Table 10: Summary of main findings
Table 11: Summary of the predicted values and estimated standard errors
Table 12: Estimated standard errors and predicted values
Table 13: Comparison of the standard deviation for the two studies presented here
SOMMARIO
This project was carried out in the Laboratory of Cognitive Neuroscience (LNCO) at the École Polytechnique Fédérale de Lausanne (Switzerland). The work concerns the study of the cognitive process underlying the perception of one's own body, known as the sense of "ownership".
In particular, we investigate to what extent, within a virtual reality environment, the perceived ownership of a fake hand (a "rubber hand") that acts and produces sensory feedback in synchrony with the real one is affected by displacements of the virtual viewpoint that are incongruent with the natural viewing position.
In general, research on this topic plays an important role in the study of disorders affecting bodily self-awareness, somatosensory (proprioceptive and tactile) perception and the perception of the body image, which can be altered in patients suffering from psychiatric disorders such as schizophrenia, or from phantom limb syndrome.
Recently, psychologists and researchers have been investigating perceptual processes by means of cognitive illusions, which provide a useful approach to the study of the sense of "ownership" and the perception of oneself and one's own body.
In particular, this project belongs to the large body of research based on a cognitive illusion called the "rubber hand illusion" (RHI). The RHI occurs when a subject comes to perceive as his/her own a rubber hand placed in front of him/her in a position congruent with the rest of the body, i.e. at an angle similar to the one his/her own hand could assume. The real hand is hidden from the subject's view, and an operator strokes both the rubber hand and the real hand with a paintbrush. Watching the stimulation applied to the fake hand while perceiving the tactile stimulation of the real hand, the subject falls into a remarkable perceptual illusion: he/she is deceived about the ownership of the hand seen in front of him/her, perceiving the rubber hand as his/her own. This happens, however, only if the stimulation of the two hands is performed synchronously (Botvinick M., Cohen J. (1998)). The interest for virtual reality is evident: there, the subject's body, a portion of it, or a tool acting under the subject's command is normally rendered and made to interact with the virtual scenario, with the aim of creating a perception of ownership.
All the experiments of this project were carried out in a virtual reality environment, designed to recreate a set-up similar to that of the experiment described above, with the rubber hand replaced by a virtual hand. In this way we could test the classical RHI while exploiting the advantages of a virtual environment: the systematic manipulation of the parameters under investigation, the reproducibility of the same conditions across different subjects, and a general automation of data collection and analysis, thus allowing us to study this illusion with a more rigorous approach than the psychophysical studies reported in the literature.
Materials & Methods
AIM OF THE STUDY
The aim of this project is to study how a virtual hand comes to be perceived as their own by the subjects performing the experiment. In particular, we analysed how the perspective from which the virtual scene was observed modified this illusion. Indeed, it is known from the literature that viewing the scene from a first-person visual perspective (1PP) is a critical factor for eliciting this illusion (Ehrsson H.H. (2007); Petkova V.I., Khoshnevis M. and Ehrsson H.H. (2011)).
SET-UP and TOOLS
A set-up was purposely designed to recreate a virtual scene and simulate the conditions of the classical RHI. The virtual scene consisted of a table with a small ball placed on it and a virtual arm. During the experiments the subjects had to interact with the objects in the scene, their task being to touch the virtual ball on the table.
We used a tracking system (ACT ReActor2) to reproduce the subjects' real movements within the virtual scenario. Thanks to markers applied to the subject's hand, the virtual hand moved following the movements of the real one. During the experiments the participants also wore a head-mounted display (HMD), equipped with lenses that allowed the observation of a 3D scene. Markers were applied to this device as well, so that the movements of the subjects' head were also recorded, allowing the projection of a scene consistent with the head inclination and gaze direction. In this way, using both the motion-tracking device and the HMD, the subjects were able to look around the room, move in space and interact with the objects. In addition, we applied a small vibration generator under the fingertip of the right index finger of each subject. This device delivered the tactile stimulation required by the experiment: the subject received tactile feedback every time he/she touched the virtual ball in the scene.
However, to obtain a set-up similar to that of the classical RHI, the subject had to receive a visual stimulus as well as a tactile one. We therefore designed the scene so that the ball changed colour and moved every time it was touched by the subject; this constituted the visual stimulus needed to elicit the illusion.
At the same time, by touching the ball the subject also received the tactile stimulus from the small motor attached to the real hand. In this way, by modifying the congruency between the visual and the tactile stimulus (for example, the tactile stimulus was not always delivered at every touch of the ball), we could analyse how this congruency factor affected the illusion.
We also wanted to test a second critical factor: the perspective of the scene observed by the subject, evaluating how a scene viewed from a first-person perspective could influence the illusion of perceiving the virtual hand as one's own. To this end, the subjects were shown different perspectives of the same scene and were then asked to judge whether or not the scene had been rendered in 1PP. This is based on the idea that the subjects observed the scene through a virtual camera that generated the virtual image. The camera was designed to move randomly in space, rendering the scene from different points of view depending on its position. Assuming that the camera produced a 1PP scene whenever it was positioned at the centre of the subject's head, we were able to identify the range of positions that allowed the rendering of a 1PP scene, by correlating the position of the virtual camera with the perspective of the scene shown.
EXPERIMENTAL PROCEDURE
Two separate experiments were designed and carried out for this work. A first, preliminary study (study 1) examined the mechanisms of pure perception in a virtual environment, in order to understand under which conditions, in terms of camera position, the subject perceived the virtual scene from a first-person perspective. Once this range of positions had been identified, we ran a second study (study 2) to specifically test the illusion of perceiving the virtual hand as one's own.
The subjects were asked to repeatedly touch the ball, moving the virtual hand towards it. Thanks to the arm motion-tracking system, they could drive the virtual hand simply by moving their own arm. They were then asked to indicate whether they had perceived the virtual hand as their own.
Exploiting virtual reality, we were thus able to test the subjects under different conditions, obtained both by changing the camera position, and hence the perspective of the scene, and by varying the level of congruency between the visual stimulus and the tactile feedback at the moment of contact with the virtual ball. In this way we analysed how these two factors could influence the sense of "ownership", eliciting the illusion or not.
DATA ANALYSIS
To analyse the data, we built a data model based on the idea of a psychometric curve, i.e. a sigmoid curve linking the intensity of a stimulus to the response to it. In our set-up, the stimulus varied as a function of the camera position, of the direction of its movements in space and of the congruency level of the tactile feedback.
To estimate the model parameters we employed two methods, one based on maximum likelihood and the other on Bayesian analysis techniques; the latter, in particular, relies on the Markov chain Monte Carlo (MCMC) method for parameter estimation.
RESULTS
The results of study 1 show that camera movements along the x and y axes (lateral and vertical movements with respect to the subject) affected the subjects' response: the subjects perceived a 1PP scene when the camera was displaced between -10 and 10 cm to the right or left, combined with a vertical displacement between -3 cm and -20 cm below the subject's head. By contrast, movements along the depth axis z (moving the camera away from or towards the subject's eyes, showing the scene from far away or from close up) did not significantly affect the subjects' response: they noticed no difference in the perspective of the perceived scene. Moreover, for displacements along the y axis, the curve was not centred at zero (the position of the subject's head) but showed a bias of about -10 cm, probably due to the initial settings of the set-up (Figure 1).
Figure 1: results of study 1. Probability of perceiving the scene from a first-person perspective, as a function of camera displacements along x, y, z.
The results of study 2 showed instead that the onset of the illusion depended primarily on the degree of congruency between the visual and the tactile stimulus. When the tactile feedback was 100% synchronous with the visual one, the probability of perceiving the virtual hand as one's own was about 25% higher than the probability obtained with completely asynchronous feedback [P(illusion) max ~ 80% with λ = 100% and P(illusion) max < 60% with λ = 0%]. This result is consistent with those obtained in the classical RHI experiment (Tsakiris & Haggard, 2005).
We also obtained a further result concerning the effect of a change in the perspective of the scene. Regardless of the applied congruency level, the illusion reached a maximum when the virtual scene was shown from a first-person perspective (comparing with the results of study 1 on the range of camera positions) and decreased when the subject was presented with scenes from points of view different from his/her own. This second result is also in line with the results reported in the literature on the importance of a first-person view of the scene for eliciting the illusion (Petkova, Khoshnevis and Ehrsson (2011); Ehrsson (2007)). The bias along the y axis was present in this study as well, further supporting the hypothesis that it originated in the initial set-up (Figure 2).
Moreover, analysing the combined effect of the two factors, we noticed that the tactile feedback was able to reinforce the illusion, which was present even when the scene was not shown exactly from a first-person perspective. We could therefore conclude that, in general, the tactile stimulus plays an important role in bodily perception in virtual reality experiments, strengthening the sensation of being inside a real environment and consequently producing a more vivid feeling that the virtual hand is real.
Figure 2: results of study 2. Probability of perceiving the virtual hand as one's own, as a function of camera displacements along x, y, z and using different degrees of congruency between the visual and the tactile stimulus. Blue lines: maximum degree of visuo-tactile congruency. Green line: minimum degree of congruency.
Conclusions
This virtual set-up proved suitable for reproducing the conditions of the classical RHI experiment. We also showed that a scene viewed from a first-person perspective (1PP scene), together with a tactile stimulation synchronous with the visual one, are necessary conditions for perceiving the virtual hand as one's own (in general, for inducing a sense of "ownership"), although the ownership condition turned out to tolerate a broadened range of displacements of the virtual viewpoint. This notable result, which could not be predicted a priori, seems related to the greater sensori-motor adaptation capability that arises when moving from purely passive viewing to active interaction with the scene and its feedback.
However, this is only a first attempt at creating a dynamic virtual set-up, and further analyses will be needed to obtain complete results, such as a more accurate error analysis and an analysis across different subjects. Moreover, appropriate modifications to the set-up may be needed to improve the virtual scene, in terms of image quality and likeness to a real scene. The appearance of the virtual hand could also be further improved, especially as regards the faithful reproduction of the real movements in the virtual environment.
SUMMARY
The current project was carried out in the Laboratory of Cognitive Neuroscience (LNCO) at the École Polytechnique Fédérale de Lausanne (Switzerland). This work focuses on the feeling of body ownership, a cognitive process involving the experience of being the owner of one's body. In particular, we investigated how changes in perspective affected the feeling of ownership of a virtual hand in a virtual environment.
Since psychiatric disorders such as schizophrenia, as well as phantom limb syndrome, are related to impaired self-perception, many neurological and psychological studies are being developed around this topic in order to better understand these conditions. In recent years researchers have also been trying to learn more about perceptual processes using cognitive illusions, a useful approach to investigating bodily self-perception and the sense of 'self'. In particular, this project is related to a cognitive illusion called the "Rubber Hand Illusion" (RHI), in which people experience a fake rubber hand as actually being their own hand (Botvinick M., Cohen J. (1998)). The relevance for virtual reality studies, where the subject, a part of his/her body, or a device operated by him/her interacts with the virtual scene, is evident.
Materials & Methods
GOAL OF THE CURRENT WORK
The present work was carried out to explore the mechanisms responsible for the feeling of body ownership, exploiting a recently developed technology: Virtual Reality (VR). The goal of this study was to test an aspect of body ownership, the first-person visual perspective (1PP), known to be a critical factor for triggering the illusion of body ownership (Ehrsson H.H. (2007); Petkova V.I., Khoshnevis M. and Ehrsson H.H. (2011)).
Operating in a virtual environment, we managed to build a robust virtual set-up that could faithfully reproduce the classical RHI conditions. Moreover, VR allowed a systematic change of variables and a precise reproduction of the conditions across different subjects, supporting a more controlled approach to investigating the classical RHI.
SET-UP and TOOLS
A virtual set-up was designed to recreate and simulate the RHI conditions. We built a virtual scene with a virtual table and a virtual ball on it. A virtual arm was also designed, to mimic the fake rubber hand used in the classical RHI. Subjects, sitting on a real chair, felt as if they were seated at the virtual table with their arm placed on it. During the experiments they had to interact with this virtual scenario: they were asked to touch the virtual ball on the table by moving their real arm.
A tracking system (ACT ReActor2) was used to reproduce the subjects' movements in the virtual scenario: when subjects moved their own arm, the virtual one moved as well, following the real movements.
During the experiments, participants wore a head-mounted display (HMD), which allows the observation of a 3D scene simulated by the computer. This device was also tracked with the ACT ReActor2 system, so that the positions and movements of the subjects' head could be recorded; the graphics hardware then used the information about the head's position to update the image to match the motion of the participant's head. Thus, combining the tracking system and the HMD, participants were fully able to "look around" the virtual scene and interact with it. In particular, during the experiments they saw the virtual arm touching the ball, in accordance with their real movements.
A vibrating motor was also attached to the tip of the right index finger. This device conveyed vibration feedback when the subject touched the ball.
EXPERIMENTAL PROCEDURE
During the experiment, subjects were asked to touch the virtual ball several times. The main idea of this set-up was that the virtual ball, when touched, changed its position and colour, providing a visual stimulus to the subject. Simultaneously, the subject also received a tactile stimulus (provided by the vibrator) on the real hand. By changing the congruency between the tactile and visual stimuli, we could analyse how this factor affected the ownership feeling; a minimal sketch of this trial logic is given below. Moreover, we wanted to analyse the effect of a 1PP scene: we presented virtual scenes to the subjects from different points of view, and they were then asked to evaluate whether or not the scene was shown in 1PP.
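The following Python fragment is a minimal illustration of that trial logic, not the code actually used in the experiments; the function names are hypothetical, and the choice to realise partial congruency by skipping the vibration on a random subset of touches (rather than, say, delaying it) is an assumption made for illustration.

```python
import random

def change_ball_colour_and_position():
    # Stand-in for the VR engine call that recolours and moves the ball.
    print("visual stimulus: ball changes colour and jumps to a new spot")

def deliver_vibration():
    # Stand-in for driving the vibrating motor on the index fingertip.
    print("tactile stimulus: fingertip vibration")

def on_ball_touched(lambda_congruency):
    """Handle one ball-touch event (hypothetical trial logic).

    The visual stimulus occurs on every touch; the tactile stimulus is
    delivered only on a fraction lambda_congruency of touches, giving
    congruency levels such as 0%, 50% and 100%.
    """
    change_ball_colour_and_position()         # visual stimulus, every touch
    if random.random() < lambda_congruency:   # tactile stimulus, only on a
        deliver_vibration()                   # fraction lambda of touches

# Example: a half-congruent condition (lambda = 50%)
for _ in range(4):
    on_ball_touched(lambda_congruency=0.5)
```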
Two different studies were performed to achieve this goal. A preliminary study (study 1) investigated the pure perceptual mechanisms in the virtual environment, to find out under which conditions subjects perceived a virtual scene in 1PP. By collecting and analysing the subjects' answers, we were able to reproduce the conditions necessary to show a 1PP virtual scene in a second study. This experiment was based on the idea that subjects observed the scene through a virtual camera instead of looking at it directly with their own eyes. We assumed that when the camera was centred on the subject's head, the scene had exactly the same point of view as a scene viewed by the subject from his/her own position (a 1PP scene). The camera was designed to move randomly in the space around the subject's head, rendering the scene from different points of view according to its position. Thus, since the perspective of the scene was related to the position of the virtual camera, we wanted to identify the range of positions that allowed the creation of a 1PP virtual scene. Exploiting VR, we could systematically modify the perspective by displacing the virtual camera and collect the related answers.
Study 2 focused on the investigation of the ownership feeling itself. Subjects were asked to indicate whether they felt the virtual hand as their own after performing the task. Operating in VR, we managed to systematically change both the camera position and the congruency level of the visuo-tactile stimulation, and collected the related answers. Thus, with study 2 we found out how the two main factors involved in the RHI (1PP and congruency) affected the ownership feeling.
DATA ANALYSIS
To analyse the data, we built a data model based on the idea of a psychometric curve, a sigmoid curve linking the stimulus to the probability of the response. The stimulus depended on the camera position, the direction of the camera movements in the virtual space, and the congruency level of the tactile feedback. To estimate the model parameters, two different estimation methods were employed. First, an iteratively reweighted least squares method was used for maximum likelihood estimation of the model parameters. Second, Bayesian inference was performed on the model's parameters, which were estimated by the Markov chain Monte Carlo (MCMC) method. A toy version of both estimation steps is sketched below.
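The sketch below fits a logistic psychometric curve to simulated yes/no answers, first by maximum likelihood and then with a minimal Metropolis MCMC sampler over the same likelihood. It is a toy reconstruction: the parameter names (a for the curve centre, b for the slope), the simulated data and the sampler settings are assumptions, not the actual model or fitting code of the thesis (which used iteratively reweighted least squares for the ML step).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def psychometric(s, a, b):
    """Sigmoid linking stimulus intensity s to P('yes' answer)."""
    return 1.0 / (1.0 + np.exp((s - a) / b))

# Simulated stimulus intensities (e.g. absolute camera displacement, in cm)
s = rng.uniform(0.0, 30.0, size=500)
y = rng.random(500) < psychometric(s, a=12.0, b=4.0)   # simulated answers

def neg_log_lik(theta):
    a, b = theta
    p = np.clip(psychometric(s, a, b), 1e-9, 1 - 1e-9)
    return -np.sum(y * np.log(p) + (~y) * np.log(1 - p))

# 1) Maximum likelihood estimate (generic optimiser instead of IRLS)
mle = minimize(neg_log_lik, x0=[5.0, 1.0], method="Nelder-Mead")
print("ML estimate (a, b):", mle.x)

# 2) Minimal Metropolis MCMC with a flat prior on a and on b > 0
theta, samples = np.array([5.0, 1.0]), []
for _ in range(20000):
    proposal = theta + rng.normal(scale=0.3, size=2)
    if proposal[1] > 0:                       # stay in the prior's support
        log_ratio = neg_log_lik(theta) - neg_log_lik(proposal)
        if np.log(rng.random()) < log_ratio:  # Metropolis accept step
            theta = proposal
    samples.append(theta)
samples = np.array(samples[5000:])            # discard burn-in
print("Posterior mean (a, b):", samples.mean(axis=0))
print("Posterior std  (a, b):", samples.std(axis=0))
```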
RESULTS
Results related to study 1 showed that a 1PP scene could be reproduced by moving the camera between -10 cm and 10 cm right/left of the subject's head, combined with a vertical displacement between -3 cm and -20 cm below the subject. Movements along the depth axis (moving the camera nearer to or farther from the subject) did not greatly affect the perceived perspective of the scene. Moreover, an evident bias of -10 cm in the maximum probability of perceiving the scene as a 1PP scene was found for vertical (y axis) camera movements. This bias was probably due to a systematic problem with the set-up rather than to a cognitive effect (Figure 3).
Figure 3: results for study 1 - probability of perceiving a 1PP scene by displacing the camera along x, y and z.
Results from study 2 showed that the probability of feeling the virtual hand as one's own with a fully synchronous feedback was ~25% higher than with an asynchronous feedback [P(ownership) max ~ 80% with λ=100% and P(ownership) max < 60% with λ=0%].
These findings are consistent with the results obtained in the classical RHI experiment (see Figure 11) (Tsakiris & Haggard, 2005). Furthermore, independently of the λ value, the ownership feeling reached a maximum when the virtual scene was shown in 1PP (placing the camera according to the results of study 1) and decreased when a virtual scenario was presented from a point of view different from the subject's own. This result too is consistent with the literature concerning the importance of a 1PP for the ownership mechanisms (Petkova, Khoshnevis and Ehrsson (2011); Ehrsson (2007)). The bias along the vertical axis was still present, reinforcing the hypothesis of a set-up problem rather than a cognitive effect (Figure 4).
Furthermore, we found that providing tactile stimulation to the subject increased the feeling of ownership, even when the scene was not viewed exactly in 1PP. Thus, we can also conclude that tactile feedback plays an important role in the ownership feeling, giving a more realistic sense of being inside the virtual environment.
Figure 4: results for study 2 - probability of ownership feeling of the virtual hand, varying the camera position and the congruency level. Blue line: maximum visuo-tactile congruency. Red line: minimum visuo-tactile congruency.
Conclusions
In summary, we conclude that a 1PP virtual scene combined with congruent visuo-tactile feedback are necessary conditions to induce the ownership feeling of a virtual hand when performing the classical RHI experiment in VR. Furthermore, this project shows that our virtual set-up was suitable for reproducing the classical RHI conditions. It was also shown that the ownership condition tolerates viewpoint displacements over a broadened region around the subject's natural point of view. This significant result, which was not foreseen a priori, seems linked to an improved capacity for sensori-motor adaptation, thanks to the active interaction with the virtual environment and to the tactile feedback.
This was only a first attempt at creating a dynamic virtual set-up, and many aspects still need to be completed. First, an accurate error analysis and an inter-subject variability analysis should be performed in order to obtain more quantitative results supporting these findings. Further work could also make the virtual scene more realistic, in terms of image quality and the ability to reproduce physiological arm movements.
1. INTRODUCTION
One of the biggest challenges of present-day research is to fully understand the
human brain and its internal mechanisms. In order to achieve this goal, an
increasing number of studies are being carried out on this topic.
The current study pertains to a body of research on the feeling of body ownership, a cognitive process involving the experience of being the owner of one's body. Every day we all feel our limbs as something that belongs to our
own body. When we look at our hands, we immediately know that they are part of
our own body. But how and where does this feeling arise?
This question has been discussed by philosophers and psychologists for centuries
(James W. ed. (1890), Jeannerod M. (2003), Merleau-Ponty M., ed. (1945),
Metzinger T., ed. (2003)). It is commonly acknowledged that the experience of
being the owner of one’s body relates to the problem of correctly identifying
oneself in the sensory environment (Graziano M., Botvinick M. (2002), Prinz W.,
Hommel B., eds (2002), Makin T.R., Holmes N.P., Ehrsson H.H. (2008)), involving
the central nervous system (Churchland, P. S. (2002)).
Perception of the body is due to a multisensory integration process in which the brain has to integrate tactile, proprioceptive, visual and vestibular information in order to create a coherent representation of the body.
and touch may be essential for developing and maintaining this sense of bodily
self.
We know from neurology that people suffering from pathological conditions affecting the frontal and parietal lobes can sometimes fail to recognise their limbs as belonging to themselves. For example, a schizophrenic person might try to throw his/her left leg out of the bed every morning, thinking that the leg belongs to someone else. Moreover, damage to the premotor cortex, such as that caused by a stroke, can induce a similar misidentification or unawareness of a limb. In phantom limb syndrome the patient has the sensation that an amputated or missing limb, often felt in a distorted and painful position, is still attached to the body. These neurological observations suggest that certain brain regions might be responsible for generating the feeling of being located within one's body and the experience of belonging to it. But the processes regulating the ownership feeling and self-body perception cannot be exhaustively explained by neurological observations alone, and alternative techniques are needed.
For this reason, researchers try to learn more about perceptual processes using cognitive illusions, whose study represents a classic psychological approach to investigating bodily self-perception and the sense of 'self'. Among these cognitive illusions is the so-called 'Rubber Hand Illusion' (RHI), in which people experience a fake rubber hand as actually being their own hand (Botvinick M., Cohen J. (1998)). This effect is due to a multisensory conflict between visual and tactile information when synchronous visuo-tactile stimulation is applied to both the real hand and the fake one. Owing to this incorrect integration of visual and tactile information, the brain is no longer able to maintain an accurate representation of the current body position.
The present work, based on the classical RHI just introduced, was carried out to explore the mechanisms responsible for the feeling of body ownership, exploiting a recently developed technology: Virtual Reality (VR).
VR, a term coined by Jaron Lanier in the mid-1980s, refers to the use of specific computer technologies to provide users with a computer-based environment in such a way that the user actually believes and feels that he/she is immersed in that environment (Monnet J., 1995). This definition implies that, besides being a technology, VR also refers to an "experience". To achieve such an effect and make a VR experience as close as possible to a primary world experience, VR technology aims to provide users with two essential features: immersion and interactivity. The former reflects the sensation of being inside the alternative environment, while the latter means that the environment actually changes in response to the user's activity.
VR was not born as a research tool in neuroscience, but mainly as a means of interfacing human senses and actions with machines for different purposes. For example, VR offers great advantages in many medical applications. Surgery is one of the most promising fields in which this technique has been successfully applied: VR can be useful for surgeons, who may have a complete simulated view of the operation, and practicing on virtual patients would be a practical way for medical students to learn how to perform surgery.
Nonetheless, its capability to deliver complex scenarios and to arbitrarily govern the internal relationships of sensory-motor loops makes it an invaluable tool in neuroscience as well, for the investigation of human external and self-perception (Slater et al., 2005).
The goal of this study was to test an aspect of body ownership, the first-person visual perspective (1PP), known to be a critical factor for triggering the illusion of body ownership (Ehrsson H.H. (2007); Petkova V.I., Khoshnevis M. and Ehrsson H.H. (2011)). We assessed the adaptation capabilities of the visuo-motor system in perceiving ownership of a virtual hand acting in a virtual scenario, in the presence of mismatches between the subject's real viewpoint and the virtual one delivered by the VR system.
Operating in a virtual environment, we managed to build a robust virtual set-up that could faithfully reproduce the classical RHI conditions. Moreover, VR allowed a systematic change of variables and a precise reproduction of the conditions across different subjects, supporting a more controlled approach to investigating the classical RHI.
2. BACKGROUND AND LITERATURE REVIEW
2.1 MULTISENSORY PERCEPTION
Our perception of the world is the result of the multisensory integration of all the different information coming from our senses, such as vision, audition and proprioception. The nervous system combines all this information, giving us a coherent representation of external objects and a perceptual experience of them.
This multisensory integration is difficult to perform for two main reasons. First, the
reliability of sensory modalities varies widely according to the context. For
example, in daylight, visual cues are more reliable than auditory cues to localize
objects, while the contrary is true at night. Thus, the brain should rely more on
auditory cues at night and more on visual cues during the day to estimate object
positions.
Second, each sensory modality uses a different format to encode the same properties of an object. Thus, the position of an object is encoded in different frames of reference, depending on the sensory modality. For example, visual
stimuli are represented by neurons with receptive fields on the retina (eye-centred
frame of reference), auditory stimuli by neurons with receptive fields around the
head (head-centred frame of reference) and tactile stimuli by neurons with
receptive fields anchored on the skin. A change in the eye position or body posture
will result in a change in the correspondence between visual, auditory and tactile
neural responses encoding the same object.
Thus, multisensory integration cannot be a simple averaging between converging
sensory inputs. More elaborate computations are required to interpret neural
responses corresponding to the same object in different sensory areas.
From the literature we know that the probabilistic Bayesian approach provides a solution to the first issue, combining cues that are not equally reliable (D.C. Knill, W. Richards (1996)). Localizing an object that is seen and heard at the same time consists of computing the probability that the object is at a given location, given its image and sound, for each location in space.
The Bayesian approach allows the optimal combination of multiple sources of information about a quantity $x$. Let $x$ be the position of an object that can be seen and heard at the same time, $r_{vis}$ the noisy neural responses in the visual cortex, and $r_{aud}$ the noisy responses of the auditory neurons.
The position of the object is most probably near the receptive fields of the most active cells, but this position cannot be determined with infinite precision due to the presence of neural noise. Thus, a good strategy is to compute the posterior probability that the object is at position $x$ given the visual neural responses, $P(x \mid r_{vis})$.
Using Bayes' rule, equation (1), $P(x \mid r_{vis})$ can be obtained by combining the distribution of neural noise $P(r_{vis} \mid x)$ with prior knowledge of the distribution of object positions $P(x)$ and the prior probability of neural responses $P(r_{vis})$:

$$P(x \mid r_{vis}) = \frac{P(r_{vis} \mid x)\,P(x)}{P(r_{vis})} \qquad (1)$$
The distribution $P(x \mid r_{vis})$ is called the posterior probability. $P(r_{vis} \mid x)$ represents the noise distribution: it corresponds to the variability in neural responses for a fixed stimulus, and it can be measured experimentally by repeatedly presenting an object at the same position $x$ and measuring the variability in $r_{vis}$ (which is why this term is called the "noise" distribution).
We can ignore the denominator $P(r_{vis})$, because it is independent of the variable $x$ that we want to estimate. $P(x)$ is called the prior distribution and encodes the knowledge that an object is more likely to appear at some locations than at others. If we can assume that all positions are equally likely, then $P(x)$ is constant: it does not depend on $x$ and can also be ignored. Therefore, Eq. (1) reduces to:

$$P(x \mid r_{vis}) \propto P(r_{vis} \mid x) \qquad (2)$$
Once the posterior distribution has been computed, an estimate of the position of the object can be obtained by recovering the value of $x$ that maximizes that distribution:

$$\hat{x}_{vis} = \arg\max_x \, P(x \mid r_{vis}) \qquad (3)$$

This is known as the maximum a posteriori (MAP) estimate. When we hear the object, a similar posterior distribution $P(x \mid r_{aud})$ and its corresponding estimate $\hat{x}_{aud}$ can be computed from the noisy responses of the auditory neurons $r_{aud}$.
When the object is heard and seen at the same time, the bimodal estimate $\hat{x}_{bim}$ ("bim" stands for bimodal) is computed with the same approach, by maximizing the posterior distribution $P(x \mid r_{vis}, r_{aud})$:

$$\hat{x}_{bim} = \arg\max_x \, P(x \mid r_{vis}, r_{aud}) \qquad (4)$$
To compute this posterior distribution, we use Bayes' law, which under the assumption of a flat prior distribution reduces to:

$$P(x \mid r_{vis}, r_{aud}) \propto P(r_{vis}, r_{aud} \mid x) \qquad (5)$$

Assuming that the noise corrupting the visual neurons is independent of the noise corrupting the auditory neurons, we can rewrite (5) as:

$$P(x \mid r_{vis}, r_{aud}) \propto P(r_{vis} \mid x)\,P(r_{aud} \mid x) \qquad (6)$$

Finally, using Eq. (2), we can rewrite (6) as:

$$P(x \mid r_{vis}, r_{aud}) \propto P(x \mid r_{vis})\,P(x \mid r_{aud}) \qquad (7)$$

According to Eq. (7), the bimodal posterior distribution can be obtained by simply taking the product of the unimodal distributions. An example of this operation is illustrated in Figure 5.
Figure 5: Posterior of an object position, given the visual input (dotted line), the auditory input (dashed line) and the visual and auditory inputs combined (solid line). The values along the x axis represent the MAP estimates of the object position.
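As a numerical check of Eq. (7), the short sketch below builds two Gaussian unimodal posteriors on a grid (the means and variances are arbitrary illustration values, not data from the thesis), multiplies them pointwise, and reads off the MAP estimates as in Eqs. (3)-(4).

```python
import numpy as np

# Grid of candidate object positions (arbitrary units)
x = np.linspace(-20.0, 20.0, 2001)

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Unimodal posteriors P(x|r_vis) and P(x|r_aud); illustrative values only
post_vis = gaussian(x, mu=-2.0, sigma=1.0)   # vision: precise
post_aud = gaussian(x, mu=4.0, sigma=3.0)    # audition: less reliable

# Eq. (7): the bimodal posterior is the pointwise product (up to a constant)
post_bim = post_vis * post_aud
post_bim /= post_bim.sum() * (x[1] - x[0])   # normalise; irrelevant to argmax

# Eqs. (3)-(4): MAP estimates are the argmax of each posterior
print("x_vis =", x[post_vis.argmax()])       # -2.0
print("x_aud =", x[post_aud.argmax()])       #  4.0
print("x_bim =", x[post_bim.argmax()])       # ~ -1.4, pulled towards vision
```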
When $P(x \mid r_{vis})$ and $P(x \mid r_{aud})$ are Gaussian probability distributions, as is the case in Figure 5, the bimodal estimate $\hat{x}_{bim}$ can be obtained by taking a linear combination of the unimodal estimates $\hat{x}_{vis}$ and $\hat{x}_{aud}$, weighted by their respective reliabilities $1/\sigma^2_{vis}$ and $1/\sigma^2_{aud}$.
The reliability of the visual or auditory estimate is the inverse of the variance of the corresponding posterior probability. In particular, if the visual input is more reliable than the auditory input ($\sigma^2_{vis}$ smaller than $\sigma^2_{aud}$, hence $1/\sigma^2_{vis}$ larger than $1/\sigma^2_{aud}$), then the bimodal estimate of position will be closer to the visual estimate, and vice versa if audition is more reliable than vision.
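For Gaussian posteriors this reliability-weighted combination has a simple closed form. The sketch below (example numbers only) reproduces the grid result above and shows that the fused variance is smaller than either unimodal variance.

```python
def fuse_gaussian_estimates(x_vis, var_vis, x_aud, var_aud):
    """Combine two unimodal estimates weighted by reliability (1/variance)."""
    w_vis = (1.0 / var_vis) / (1.0 / var_vis + 1.0 / var_aud)
    x_bim = w_vis * x_vis + (1.0 - w_vis) * x_aud
    var_bim = 1.0 / (1.0 / var_vis + 1.0 / var_aud)  # tighter than both inputs
    return x_bim, var_bim

# Vision more reliable (variance 1 vs 9): fused estimate lands near vision
print(fuse_gaussian_estimates(x_vis=-2.0, var_vis=1.0, x_aud=4.0, var_aud=9.0))
# -> (-1.4, 0.9)
```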
Recent psychophysical data suggest that the brain might employ such an adaptive procedure to optimally combine neural responses in different sensory modalities, taking into account the relative reliability of the different sensory cues before combining them (R.J. van Beers, A.C. Sittig, J.J. Denier van der Gon (1996), R.J. van Beers, A.C. Sittig, J.J. Gon (1999), R.J. van Beers, D.M. Wolpert, P. Haggard (2002), M.O. Ernst, M.S. Banks (2002), M.O. Ernst, M.S. Banks, H.H. Bulthoff (2000), J.E. Atkins, J. Fiser, R.A. Jacobs (2001)).
Unfortunately, the Bayesian model is incomplete as a theory of spatial
multisensory integration because it does not explain how sensory modalities can
be combined despite using different frames of reference.
One possible solution to this last problem is to propose that sensory responses are remapped into a common frame of reference before converging on multisensory cells, where the necessary coordinate transformations are computed, putting all the sensory inputs into the same format.
To identify the brain areas involved in cross-modal spatial interactions, Macaluso et al. used brain imaging while subjects were presented with lateralized visual, tactile and bimodal stimuli (E. Macaluso, J. Driver (2001)). They found unimodal areas that responded only to tactile (postcentral gyrus) or visual (lateral and inferior occipital lobe) contralateral stimuli. They also found multisensory areas activated by both contralateral visual and tactile stimuli (anterior intra-parietal sulcus). This result supports the sensory remapping hypothesis, according to which inputs from unimodal sensory areas converge on multisensory areas, where they are presumably recoded into the same frame of reference. Moreover, evidence from electrophysiological recordings in monkeys shows that sensory inputs tend to be recoded into the same frame of reference in multisensory areas: for example, visual, auditory and tactile stimuli are remapped into a skin-centered frame of reference in the premotor cortex (M. Graziano, X. Hu, C. Gross (1997), M.S. Graziano, G.S. Yap, C.G. Gross (1994)).
However, these findings do not explain how auditory inputs are remapped into eye-centered frames of reference, or visual inputs into skin-centered frames of reference.
Thus, Deneve et al. (Deneve S., Pouget A., 2004) proposed a new model, in which multisensory areas combine sensory inputs in a common format, allowing multidirectional sensory predictions. This model accounts for neurophysiological and psychophysical data and performs Bayesian multisensory integration without explicitly representing probability distributions.
In this model, unimodal input layers are interconnected with a multisensory intermediate layer of basis function units (Figure 6). At each iteration, the eye-centered (visual), head-centered (auditory) and eye-position (proprioceptive) inputs are combined in the multisensory layer. These multisensory activities are then fed back into the input layers, in a way that computes the eye-centered position from the head-centered position and the eye position, and vice versa the head-centered position from the eye-centered position and the eye position. This process is iterated until agreement is reached between the visual and auditory positions encoded in the corresponding layers, and the network converges to stable hills of activity. The positions of the stable hills are the network's estimates of the position of the object in eye-centered and head-centered coordinates, as well as of the position of the eyes.
Figure 6: Iterative basis function network performing multisensory predictions.
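A drastically simplified, one-dimensional caricature of this iterative loop is sketched below: scalars stand in for the population codes of the real basis-function network, and a hand-picked relaxation rule replaces its learned feedback weights. The only ingredient retained is the coordinate constraint x_head = x_eye + eye_position, towards which the three noisy estimates are iteratively nudged.

```python
# Noisy initial estimates: eye-centered visual position, head-centered
# auditory position, and eye position (all scalars, arbitrary units).
x_eye, x_head, eye_pos = 5.0, 12.0, 4.0

for _ in range(50):
    # Cross-modal predictions implied by the constraint x_head = x_eye + eye_pos
    pred_eye = x_head - eye_pos
    pred_head = x_eye + eye_pos
    pred_gaze = x_head - x_eye
    # Relax each estimate halfway towards its prediction (simultaneous update)
    x_eye, x_head, eye_pos = (0.5 * (x_eye + pred_eye),
                              0.5 * (x_head + pred_head),
                              0.5 * (eye_pos + pred_gaze))

residual = x_head - x_eye - eye_pos           # -> 0 as the loop converges
print(x_eye, x_head, eye_pos, residual)
```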
2.2 THE RUBBER HAND ILLUSION
Quite recently, in 1998, Botvinick and Cohen, two American psychologists, discovered a striking illusion. They noticed that they could induce an ownership feeling by putting a rubber hand (Figure 7) on a table in front of the subjects and stroking it in the same way as the subjects' hidden real hand (Figure 8).
During this illusion, called the "rubber hand illusion" (RHI), the ownership feeling of the real hand decreased in subjects, who felt as if the rubber hand were their own hand. Hence, the RHI became a useful approach to evaluate how sight, touch and "proprioception" - the sense of body position - combine to create a feeling of body ownership, one of the bases of self-consciousness.
Figure 7: An example of the rubber hand used to perform the experiments.
Figure 8: stimulation of both the rubber and the real hand.
RHI: procedure and results [Botvinick M, Cohen J (1998): Rubber hands ‘feel’ touch that eyes see.
Nature]
Ten subjects took part in this experiment. They were seated with their left arm resting upon a small table. A rubber hand was placed on the table directly in front of the subject, and a standing screen was located beside the arm to hide it from the subject's view, so that the subject could see only the artificial hand. For ten minutes, the experimenter used two small paintbrushes to stroke the rubber hand and the subject's hidden hand, keeping the tactile stimulation of the real hand and of the artificial hand as synchronous as possible.
After the stroking, subjects completed a two-part questionnaire. They were asked to give an open-ended description of their experience and to affirm or deny the
occurrence of nine specific perceptual effects (Figure 9). Three specific questions
were formulated to measure the strength of the RHI:
- “It seems as if I were feeling the touch of the paintbrush in the location where
I saw the rubber hand touched”
- “It seemed as if the touch I felt was caused by the paintbrush touching the
rubber hand” (illusory touch)
- “I felt as if the rubber hand were my own hand” (illusory ownership)
Collecting and analysing the answers, experimenters noticed that subjects
seemed to feel the touch coming from the viewed rubber hand and not from the
hidden real hand.
Figure 9: the typical questionnaire used to evaluate the illusion.
Figure 9 shows the questionnaire results. The underlined statements describe the predicted phenomena. Subjects indicated their responses using a scale ranging from 'agree strongly' (+++) to 'disagree strongly' (---). Points indicate mean responses; bars indicate response ranges. The underlined questions showed a statistically significant tendency to evoke affirmative responses.
A hypothesis was formulated in accordance with these results: the mismatch between visual and tactile inputs produces a distortion of position sense. Thus, a second study was carried out to investigate proprioceptive information during the RHI.
Participants were asked to complete a series of intermanual reaches. With eyes closed, they had to move their right hand towards the left one; in particular, they were asked to align the right index finger with the left one. They performed this task three times. Then the subjects' hand was stimulated under the same conditions as in the first experiment, but for a longer stroking period of 30 minutes. After this stimulation period, participants performed three further intermanual reaches like the previous ones.
Results showed a proprioceptive drift towards the rubber hand, meaning that subjects experiencing the illusion after the stimulation were inclined to move their right finger towards the rubber hand. The magnitude of this displacement varied significantly in proportion to the reported duration of the illusion (Figure 10).
Figure 10: relationship between the reach displacement towards the rubber hand and the illusion duration, during 30' stimulation.
The duration of illusion presence during the stimulation ("illusion prevalence" in Figure 10) was proportional to the magnitude of the illusion felt by the subjects: the more strongly the illusion arose in subjects, the higher the illusion prevalence. In Figure 10, the illusion prevalence is reported along the x-axis, while the y-axis reports the displacement of the three reaches made after the stimulation period with respect to the three made before, calculated as the difference between the means of the two groups.
As a control condition, some subjects were stroked asynchronously, with a delay between the brushing of the two hands. The illusion did not arise in this group. Results showed that the RHI prevalence (the fraction of time during which the illusion was present) dropped from 42% of the stroking period for the synchronously brushed group to 7% for the asynchronously brushed control group. Moreover, the reaching-task displacement in the direction of the artificial hand was not significant in the control group: results showed a 23 mm drift towards the rubber hand for the synchronously brushed group and a 13 mm drift for the control group (Figure 11).
Figure 11: drift towards the rubber hand for the synchronously brushed group [Tsakiris & Haggard (2005)].
Thus, these results confirmed that a synchronous tactile stimulation of the real
hand and the fake hand was a necessary condition in order to induce the RHI.
RHI: conclusion
During a synchronous stimulation of the real hand and the rubber hand, subjects seemed to feel the touch coming from the viewed rubber hand and not from the hidden real hand: subjects felt as if the tactile feedback arose from the rubber hand. The strength of the illusion varies from subject to subject and can be measured using a questionnaire or by calculating the proprioceptive drift with intermanual reaches. By contrast, the illusory ownership feeling could not be induced with asynchronous stimulation.
This illusion belongs to a class of perceptual effects involving intersensory bias, which result when the information available to different sensory modalities is discordant. Neural integration of vision and touch is essential for developing and maintaining the sense of bodily self. Thus, when visual and tactile perception are modified, a multisensory conflict arises. Due to this conflict, the brain is no longer able to create a coherent representation of the body. A misattribution of the fake hand emerges in subjects, who feel the rubber hand become part of their body image.
In summary, these studies suggest that the congruency between vision and touch seems to recalibrate proprioception. The RHI shows that the intermodal matching between visual, tactile and proprioceptive information can be sufficient to explain the ownership feeling and the self-attribution of a non-self object.
RHI: more explorations
During the following years, different variations on the RHI experiment were performed in order to investigate this illusion further.
Armel and Ramachandran used a new approach to test the RHI: the stress response to a threat to the hand, measured by the skin conductance response (SCR) (K. C. Armel and V. S. Ramachandran (2003)). It is known that when our own body is injured by an external object, the anticipation of pain produces autonomic nervous system (ANS) arousal that can be registered by the SCR. They exploited this physiological pain reaction to test what happens when the fake hand is injured, recording the subjects' SCR. Indeed, when the fake hand became integrated into the subject's body image, subjects displayed a strong SCR when the table/fake hand was 'injured', even though nothing was done to the real hand.
Many other studies concerning this topic were conducted over the years. Experimenters examined what happened when changing the position of the rubber hand and the tactile stimulation, and the correlation with the motor sense of agency (D.M. Lloyd (2007); F.H. Durgin, L. Evans, N. Dunphy (2007); M. Tsakiris, M.D. Hesse, C. Boy, P. Haggard, G.R. Fink (2006)).
Other studies were carried out to investigate what happens in the brain during the RHI. Physiological and neural mechanisms responsible for integrating tactile and visual information have been identified. In 1999, Graziano et al. investigated how the brain encodes the relative positions of body parts (M. S. Graziano (1999)). They showed that the position of the arm was represented in the premotor cortex of the monkey (Macaca fascicularis) brain by means of a convergence of visual and proprioceptive cues onto the same neurons. These neurons responded to the felt position of the arm when the arm was covered from view. They also responded in a similar fashion to the position of a false arm.
Moreover, in 2004, Ehrsson et al. studied the neuronal counterparts of the ownership feeling and self-consciousness (H. Ehrsson, C. Spence and R. E. Passingham (2004)). The RHI was used to manipulate the ownership feeling of healthy subjects while brain activity was measured by functional magnetic resonance imaging (fMRI). Results showed that neural activity in the premotor cortex and posterior parietal areas reflected the feeling of ownership of the hand. Moreover, Tsakiris et al. found neural activity in the right posterior insula (Tsakiris, Hesse, Boy, Haggard, Fink (2007)).
These areas contain many neurons that integrate visual, tactile and proprioceptive information in head- and body-part-centred reference frames. Furthermore, these multisensory cells are sensitive to the temporal and spatial congruency of multisensory signals. This neuronal system has the capacity to perform the binding of visual, tactile, proprioceptive and motor signals, which suggests that multisensory integration in these areas provides a mechanism for bodily self-attribution.
Moreover, scientists and psychologists also started to investigate the role of the visual perspective in the process of attributing an external body to the self (Ehrsson, H. H. (2007); Gibson, J. J. (1979); Petkova, V. I., and Ehrsson, H. H. (2008); Slater, M., Perez-Marcos, D., Ehrsson, H. H., and Sanchez-Vives, M. V. (2009); Slater, M., Spanlang, B., Sanchez-Vives, M. V., and Blanke, O. (2010); Lenggenhager, B., Mouthon, M., and Blanke, O. (2009); Lenggenhager, B., Tadi, T., Metzinger, T., and Blanke, O. (2007); Aspell, J. E., Lenggenhager, B., and Blanke, O. (2009); Petkova, V. I., Khoshnevis, M., and Ehrsson, H. H. (2011)).
In spatial cognition research, a basic distinction is made between the first-person perspective (1PP) and the third-person perspective (3PP), related to egocentric and allocentric reference frames, respectively (Vogeley, K., and Fink, G. R. (2003); Klatzky, R. L. (1998)). The 1PP refers to the perception of the visual scene from the subject's point of view, while the 3PP refers to a view of the same scene from another person's viewpoint (Figure 12).
Figure 12: 1PP vs. 3PP.
Many studies were conducted to investigate which of the two basic visual perspectives (1PP vs. 3PP) was most important for the perceptual illusion of ownership and for the general mechanisms of attributing a body to oneself. In these experiments, synchronous visual and tactile stimulation was always applied both to the fake body in sight of the participant and to the participant's own body, out of sight, as in the classical RHI. However, the illusory "own" body was either viewed from a 3PP, as though looking at another individual a couple of meters in front of oneself (Lenggenhager et al. (2007, 2009); Aspell et al. (2009); Petkova and Ehrsson (2008); Slater et al. (2009, 2010)), or from a 1PP, as though directly looking down at one's body (Ehrsson (2007)). Results showed that an important factor in determining how we perceive our own body was the adoption of the 1PP.
This finding demonstrated that the sensation of owning one's body is the result of a multisensory integration in which visual, tactile, and proprioceptive signals are combined in an egocentric reference frame (a coordinate system centered on the body), which presupposes the 1PP (Costantini M., Haggard P. (2007); Rizzolatti et al. (1981); Graziano (1999); Ehrsson et al. (2004); Makin et al. (2008); Petkova, V. I., and Ehrsson, H. H. (2008); Ehrsson, H. H. (2011)).
For the purposes of this project, the characteristics and shape of the rubber hand are another important aspect to analyze, in order to verify whether a virtual hand could replace the rubber hand during the experiment. Armel and Ramachandran suggested that the illusion can be generated even without an arm: they could induce the ownership feeling just as well as with a rubber hand by simply stroking a table-top (Armel and Ramachandran (2003)).
Tsakiris and Haggard contradicted this hypothesis. They proposed that there must be some correlation between the fake arm and the real arm for the illusion to arise. Results showed that the fake hand had to look like a real hand to make the illusion work, and that it should be aligned with the orientation of the real hand (Tsakiris and Haggard (2005)). Although the literature suggests that the fake hand must look like a hand, it does not appear to be important that it looks exactly like the participant's own hand, with the correct skin color and dimensions. This suggests that the illusion may also work with a virtual hand.
2.3 VIRTUAL RUBBER HAND ILLUSION
Mediating the RHI via technology was a further step in the study of self-perception.
In 2006, IJsselsteijn et al. used video projections to perform the classical RHI experiment (IJsselsteijn et al. (2006)). The rubber hand and the participant's hand were stimulated with a small painter's brush held by the experimenter, as in the classical experiment. The rubber hand stroked by the experimenter was then projected on the table in front of the participant. This VR condition provided a fully mediated equivalent of the original rubber hand experiment. Results showed that the RHI still occurred, even if it was weaker than in the classical condition.
In 2008, Hagni et al. studied the ownership feeling by combining virtual reality and mental imagery (Hagni et al. (2008)). As the virtual reality condition, they used a video with two moving virtual arms on a large screen. Subjects were instructed to imagine the two arms to be their own. The experimenters recorded the increase in skin conductance response when the right virtual arm was unexpectedly "stabbed" by a knife and began "bleeding". Results showed that visual input (virtual reality) combined with mental imagery may induce the brain to temporarily, and measurably, incorporate external objects into its body image.
In 2008, Slater et al. demonstrated the feasibility of inducing a feeling of ownership of simulated body parts in virtual reality, showing that the RHI could also be reproduced in this kind of environment (Slater et al. (2008)). Instead of a rubber arm, participants saw a complete 3D virtual arm and hand projecting out of their right shoulder. This was achieved with a back-projected screen onto which a stereo image of an arm was rendered. The real hand was hidden behind a screen, out of view, and resting on a support (Figure 13). Subjects stood in front of a projection screen, wearing glasses with polarising lenses. The computer generated both left-eye and right-eye images, which were filtered by the polarising lenses so that passive stereo vision was realised.
Since participants were also head-tracked, the virtual scene was displayed as a
function of head position and orientation. Accordingly, the virtual arm was seen
from the participants’ point of view as projecting out of their right shoulder.
A second tracked device (a Wand) was employed: its movements in real space were replicated by the movements of a small yellow ball in the 3D virtual space. When the experimenter touched the subject's real hand with the Wand, the virtual ball touched the virtual hand synchronously. In this way, synchronous visual and tactile stimuli could be applied to the virtual and real hand.
The results were comparable with the original Botvinick and Cohen experiment
with respect to the subjective reporting of the illusion and with respect to the
proprioceptive drift. Both these measures were greater in the synchronous
condition than in the asynchronous condition.
With this experiment they demonstrated that a simulated object can be fully
incorporated into the body representation and become part of the participant.
This phenomenon paved the way to further explorations in VR. Many studies involving bodily self-consciousness and VR were therefore carried out (Lenggenhager et al. (2007, 2009); Ehrsson (2007); Perez-Marcos (2009); Slater et al. (2009); Petkova VI, Ehrsson HH (2008)). Experimenters can now address new problems, testing conditions that were impossible in the physical world, e.g. real-time modifications of the virtual limb (length, size, appearance) and complex motions.
Figure 13: Virtual RHI set-up. Left panel: the participant wears passive stereo glasses and a head-tracker, and the virtual image is determined as a function of his head direction. The experimenter taps and strokes the participant's real hand with a special device, whose position is tracked and used to determine the position of the virtual sphere. Right panel: in the projection the participant sees a sphere striking in synchrony and in the same place on the virtual hand as the touch stimuli delivered to his own hand.
3. EXPERIMENTAL AND DATA ANALYSIS METHODS
3.1 RATIONALE OF THE CURRENT EXPERIMENT
The interaction of the human brain with computers is an interesting new area of
applied neuroscience. One application consists in the replacement of a person’s
real body by a virtual representation. From the literature (see Chapter 2) we know that a virtual limb can be felt as part of our body if appropriate multisensory correlations are provided. In the rubber hand illusion, a multisensory conflict leads to false ownership of a fake hand, associated with a mislocalization of one's own hand. The RHI is usually discussed in a cognitive and psychological context, and only conceptual models exist to explain how this illusion arises. Hence, we intended to investigate this illusion using a more systematic approach.
From the literature we know that the ownership feeling of an external object can be recreated by mediating the RHI through technology, e.g. using VR. Operating in a virtual environment, we therefore built a robust virtual set-up that allows a systematic change of variables and a precise reproduction of the RHI experimental conditions across different subjects.
The current project was carried out to test a crucial aspect of body ownership: the first-person perspective (1PP) of the participant. From the literature we know that the 1PP is a critical factor for triggering the illusion of body ownership (Ehrsson H.H. (2007); Petkova V. I., Khoshnevis M. and Ehrsson H.H. (2011)).
Nonetheless, only experiments testing the difference between a scene viewed from the 1PP and from the 3PP exist. Hence, here we wanted to test the ownership feeling under different perspective conditions, systematically manipulating the 1PP (i.e. deviations along several axes) through the VR system.
In the planned experiments the 1PP refers to the visual perspective that the
participant has of his/her own hand. We managed to assess the quantitative
effects of the 1PP on RHI with an automatic and standardised approach, modifying
the perspective of a virtual scene viewed by the subjects.
Two different studies were performed. A preliminary study (study 1) provided useful evidence necessary to better understand the results of a second study (study 2). In particular, study 1 investigated the pure perceptual mechanisms in a virtual environment, focusing on conscious perspective perception, without involving the RHI directly. In study 2 we collected precise measures and evidence of the effect of changes in perspective and in visual-tactile congruency on the ownership feeling.
STUDY 1
As previously mentioned, this experiment was not directly related to the RHI. The objective of this study was to establish the bounds of conscious perception when changing the perspective of a scene. The analysis evaluated the subjects' ability to notice a variation of the 1PP among different virtual scenes. The subjects were asked to evaluate whether the experienced virtual scene was presented with a 1PP or not, by answering the question: "Do you think that the point of view in this trial is the same as yours?"
This experiment was based on the idea that subjects observed the scene through a virtual camera instead of looking at it directly with their own eyes. We assumed that when the camera was centred on the subject's head, the scene had exactly the same point of view as a scene viewed by the subject from his/her position. The camera was designed to move randomly in the space around the subject's head, reproducing scenes from different points of view according to its position. Thus, since the perspective of the scene was related to the position of the virtual camera, we wanted to identify the range of positions that allowed the creation of a 1PP virtual scene. Exploiting VR, we could systematically modify the 1PP by displacing the virtual camera and collect the corresponding answers.
Assuming that subjects can estimate their own movements using sensory information provided by the vestibular system, the visual system, etc., the only source of ambiguity about whether a viewpoint change came from the subjects' own movements or from an actual shift in the virtual camera position was the subjects' uncertainty about their own movements and about their head position with respect to the scene.
This study required a Head Mounted Display (HMD) to deliver and observe the
virtual scene in 3D and a tracking system able to follow the subject’s head position
to compute a correct scene.
The subjects’ answers were then collected and analysed to find out the range of
camera positions that allowed the perception of a 1PP scene.
STUDY 2
The goal of this study was to investigate the role played by the visual perspective in the mechanisms underlying self-attribution and body ownership. We suggested that the sense of ownership of the virtual hand could be modulated by changing the perspective of the scene. We intended to obtain a precise measure of the effects of perspective changes by exploiting the results obtained in study 1. The analysis was also carried out to evaluate how the RHI arose by properly combining visual and tactile stimuli. We suggested that the degree of congruency of the tactile stimulus with the visual stimulus could modulate the ownership feeling.
In this study, subjects were asked not only to passively observe the virtual scene as in study 1. Participants interacted with the virtual environment, receiving synchronous or asynchronous tactile feedback from a vibrating motor attached to their fingers after touching a virtual ball. Subjects were then asked to evaluate whether the experienced virtual hand was felt as their own hand, by answering the question: "Do you think that the virtual hand is your own hand?" The answers depended on the perspective of the virtual scene and on the congruency level between the visual and the tactile stimulation conveyed to the subjects during the experiments. Therefore, this more complex study required tracking the subject's head and also the real arm, to compute the movements of the virtual hand. To convey the tactile feedback, a vibrating motor was attached to the subject's finger.
The subjects’ answers were then collected and analysed, to find out the role
played by the 1PP and the visual-tactile stimulation in the ownership feeling
mechanisms.
3.2 DATA ANALYSIS METHODS
The Bayesian approach and its potential advantages
Bayesian analysis is a statistical method that makes inference on unknown quantities of interest (in this case the model's parameters) by combining prior beliefs about the quantities of interest with the information (or evidence) contained in an observed set of data (Box G.E.P. and Tiao G.C. (1973)) (Figure 14).
Figure 14: Bayesian analysis combines prior beliefs about the quantities of interest with the information (or evidence) contained in an observed set of data, to make inference on unknown quantities.
This is inductive reasoning: on the basis of the current information, the probability of a future event can be estimated.
In the past, statistical analysis based on Bayes' theorem was often daunting because of the numerical integrations needed. Recently developed computer-intensive sampling methods of estimation have revolutionised the application of Bayesian methods, and such methods now offer a comprehensive approach to complex model estimation, for example in hierarchical models with nested random effects (Gilks et al. (1993)). They provide a way of improving estimation in sparse datasets by borrowing strength (e.g. in small-area mortality studies or in stratified sampling) (Richardson and Best (2003); Stroud (1994)), and allow finite-sample inferences without appeal to large-sample arguments, as in maximum likelihood and other classical methods. Sampling-based methods of Bayesian estimation provide a full density profile of a parameter, so that any clear non-normality is apparent, and allow a range of hypotheses about the parameters to be assessed simply, using the collection of parameter samples from the posterior.
Bayesian methods may also improve on classical estimators in terms of the precision of estimates. This happens because specifying the prior brings extra information or data based on accumulated knowledge, and the posterior estimate, being based on the combined sources of information (prior and likelihood), therefore has greater precision. Indeed, a prior can often be expressed in terms of an equivalent 'sample size'.
Bayesian inference is based on Bayes' theorem, expressed by equation (8):

P(θ|D) = P(D|θ) P(θ) / P(D)    (8)

Here θ is the proposition that a hypothesis is true and D represents the collected evidence. The factor P(D|θ)/P(D) represents the impact of the evidence on the confidence in the hypothesis; its numerator, P(D|θ), is called the likelihood. P(θ) is the prior probability, expressing the confidence that the hypothesis is true before the evidence is taken into account. P(θ|D) is the posterior probability, expressing the confidence that the hypothesis is true after the evidence is taken into account. As a result, Bayesian inference is carried out on the observed data and does not rely on the assumption that a hypothetical infinite population of data exists.
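As a minimal numerical sketch of this update (in Python; all probability values are hypothetical and chosen only for illustration):

# Hypothetical Bayes update: H = "the camera viewpoint is shifted",
# D = the subject answers "no" to the 1PP question.
p_h = 0.5              # prior P(H)
p_d_given_h = 0.8      # likelihood P(D|H)
p_d_given_not_h = 0.3  # likelihood P(D|not H)

# Evidence P(D) by the law of total probability.
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

# Posterior P(H|D) via equation (8).
p_h_given_d = p_d_given_h * p_h / p_d
print(round(p_h_given_d, 3))  # 0.727: the evidence raises the confidence in H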
3.3 DATA MODEL
The generic model
A spatial distribution of the data was obtained by performing the experiments and collecting the answers from each subject for different camera positions. A model of the answers' distribution was used to conduct the statistical analysis and obtain the results. This model describes the relationship between changes in the stimulus parameters, such as the camera position or the congruency level, and the associated participants' subjective reports.
For study 1 we assumed that the subject's answer depended only on the camera position, in terms of the distance between the subject's head and the virtual camera:

Ans = f(distance)

For study 2, the subject's answer depended not only on the distance but also on the congruency level λ:

Ans = f(distance, λ)

This formulation is reminiscent of the well-known psychometric function used for studying perceptual thresholds. A psychometric curve for 1D stimuli is typically described by a sigmoid function, with the participants' perception of the stimulus plotted on the y-axis and the stimulus intensity on the x-axis (Figure 15).
Figure 15: the psychometric curve, showing how the parameters a and b affect its centre and its shape; random values of a and b are used just to provide a qualitative example.
Equation (9) describes the psychometric function σ:

σ(x) = 1 / (1 + e^(−x))    (9)

The shape of the sigmoid curve is specified by two parameters, a and b, contained in the x value. The parameter b affects the centre of the curve, while the parameter a affects its width (and hence its shape) (Figure 15). The inflection point of the sigmoid is situated where the function reaches the value of 0.5. This value is usually considered as the sensory threshold, i.e. the capability to perceive the stimulus. If a stimulus is less intense than this sensory threshold, it will not elicit any sensation in subjects.
According to this idea, the probability of observing a generic answer is given by:

p(ans | θ, X) = Φ(X)^ans (1 − Φ(X))^(1−ans)    (10)

and this represents a binomial distribution¹.
¹ When the response data Y are binary (taking only the values 0 and 1), the distribution function is generally chosen to be the binomial distribution. The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p: f(k, n, p) = P(K = k) = (n choose k) p^k (1 − p)^(n−k). Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; when n = 1, the binomial distribution is a Bernoulli distribution.
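A minimal Python sketch of equations (9) and (10) may help; the parameter names a and b follow the text above, while the values used here are arbitrary:

import numpy as np

def sigmoid(eta):
    # Psychometric (logistic) function of equation (9).
    return 1.0 / (1.0 + np.exp(-eta))

def answer_probability(distance, a, b, ans):
    # Bernoulli probability of a binary answer, equation (10),
    # with the linear predictor a*distance + b.
    phi = sigmoid(a * distance + b)
    return phi**ans * (1 - phi)**(1 - ans)

# With a = -2 and b = 1, the probability of answering "yes" (ans = 1)
# decreases as the camera moves away from the head.
for d in [0.0, 0.5, 1.0]:
    print(d, answer_probability(d, a=-2.0, b=1.0, ans=1))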
This model thus falls within the class of Generalized Linear Models (GLM) (Nelder J., Wedderburn R. (1972)), in which the distribution of the dependent (or response) variable Y can be non-normal and does not have to be continuous; i.e., Y is assumed to be generated from a particular distribution in the exponential family, such as the binomial or Poisson distributions. The dependent variable values are predicted from a linear combination of predictor variables, which are "connected" to the dependent variable via a link function.
GLMs are defined by three elements: a probability distribution from the exponential family (in this case, a binomial distribution) P(Y|X), a linear predictor η = Xθ and a link function g. The mean µ of the distribution depends on the independent variables X through:

E(Y) = µ = g⁻¹(Xθ)    (11)

where E(Y) is the expected value of Y. Xθ is the linear predictor, a linear combination of the unknown parameters θ, which are typically estimated with the maximum likelihood method. The maximum likelihood estimates can be computed numerically by using iteratively reweighted least squares. g is the link function that provides the relationship between the linear predictor and the mean of the distribution function. There are several popular link functions for binomial distributions; the most typical is the canonical logit link² and its inverse, shown in formulas (12) and (13):

g(µ) = log(µ / (1 − µ))    (12)

g⁻¹(η) = 1 / (1 + e^(−η))    (13)
According to our model, we take the probability distribution to be binomial, the linear predictor to be η = θX and the link function to be the logit, so that E(ans) = 1 / (1 + e^(−θX)).
² GLMs with this setup are logistic regression models. In statistics, logistic regression (sometimes called the logistic model or logit model) is used for the prediction of the probability of occurrence of an event by fitting data to a logistic curve. It is a generalized linear model used for binomial regression. Like many forms of regression analysis, it makes use of several predictor variables.
According to Eq. (9), this is exactly the equation used to describe a psychometric curve:

σ(θX) = 1 / (1 + e^(−θX))

Therefore our model is mathematically identical to a GLM with a logit link function and a binomial distribution. Thus we could also perform our analysis by applying the GLM to estimate the parameters.
The values of the estimated parameters were obtained both with the iteratively reweighted least squares method for maximum likelihood estimation and with MCMC. As we will show, the values were not significantly different. Since the GLM was faster to compute in R, we used this method to perform the analysis and to predict the probability of obtaining an answer equal to 'Yes'.
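As an illustrative sketch of this fitting procedure (the thesis performed it in R with glm; here Python's statsmodels is used on simulated data, and all numerical values are hypothetical):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated data standing in for the experiment: camera distance as the
# stimulus, binary yes/no answers drawn from a true logistic model.
distance = rng.uniform(0.0, 1.5, size=200)
p_true = 1.0 / (1.0 + np.exp(-(3.0 - 4.0 * distance)))
ans = rng.binomial(1, p_true)

# Logistic GLM: binomial family with the canonical logit link, fitted by
# iteratively reweighted least squares.
X = sm.add_constant(distance)
fit = sm.GLM(ans, X, family=sm.families.Binomial()).fit()
print(fit.params)  # estimates close to (3.0, -4.0)

# Predicted probability of a "yes" answer at new camera distances.
print(fit.predict(sm.add_constant(np.array([0.25, 1.0]))))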
The idea of a psychometric curve joining stimulus and stimulus-response was extended in our model. In this project, the camera displacements (together with the congruency level for study 2) corresponded to the conveyed stimuli, and the answer corresponded to the stimulus-response. For study 1 the stimulus consisted in displacing the virtual camera along three different directions; the same holds for study 2, with the addition of the congruency level as a fourth stimulus parameter. Hence, since in these experiments the stimuli were not 1D stimuli depending on only one parameter, an extension of the 1D psychometric curve to an n-dimensional one was needed.
According to our set-up, the camera was randomly placed around the position of the head, taking a different position from trial to trial. The participants' answers to the questions "Do you feel that the point of view of this trial is the same as yours?" (study 1) and "Do you feel as if the virtual hand were your own hand?" (study 2) corresponded to the stimulus-response.
Our data-set consists of all the stimulus-responses obtained from all subjects in all trials. Figure 16 shows the spatial distribution of the collected answers. Each point represents a data point obtained by placing the camera at the indicated 3D coordinates and recording the associated subject's answer. The big blue sphere in the center represents the head of the subject. The colour of each point indicates the answer given by the subject for a specific camera position: blue meaning "YES" and red meaning "NO", according to the question.
Figure 16: spatial distribution of the collected answers for each camera position (indicated on the axes of the plot). Different colors are used to discriminate different answers: blue points = "YES", red points = "NO". The big blue ball in the middle represents the subject's head.
The optimum model
To find out which parameters most affected our data distribution, a choice between several models was made. In choosing a statistical model, there is always a trade-off between simplicity and completeness. Simple models tend to be easier to understand and computationally more tractable, but they are frequently at odds with the data. On the other hand, complicated models tend to fit the data better and to capture richer conceptual pictures, but they can be computationally awkward or intractable.
To obtain the optimum model, a Bayesian model selection was performed here, applying the Bayesian information criterion (BIC). Bayesian model comparison does not inform us about which model is 'true', but rather about the preference for a model given the data. These preferences can be used to choose a single 'best' model. This approach relies on the estimation of different models, comparing them on the basis of the posterior probability of the model Mi given the data D, P(Mi|D) (Eq. 14). Each model is built by combining all the factors that could be involved in the phenomenon to be modelled. This posterior distribution is calculated using Bayes' rule:

P(Mi|D) = P(D|Mi) P(Mi) / P(D)    (14)
Assuming two candidate models M0 and M1 are regarded as equally probable a priori, the Bayes factor (BF) in equation (15) represents the ratio of the posterior probabilities of the models. The model which is a posteriori most probable is determined by whether the Bayes factor is less than or greater than one (Kass et al. (1995)).

BF = P(M1|D) / P(M0|D) = [P(D|M1) P(M1)] / [P(D|M0) P(M0)]    (15)
Because the Bayes factor is often difficult or impossible to calculate, an alternative is to adopt an approximation to the Bayes factor, such as the Bayesian information criterion (BIC). The BIC was introduced by Schwarz in 1978 "for the case of independent, identically distributed observations, and linear models", as a competitor to the Akaike information criterion (Schwarz (1978)). This criterion is based in part on the likelihood function and is closely related to the Akaike information criterion (AIC). When fitting models, it is possible to increase the likelihood by adding parameters, but doing so may result in overfitting. The BIC resolves this problem by introducing a penalty term for the number of parameters in the model. Consequently, the BIC tends to favour smaller models than the AIC.
Let y denote the observed data and assume that these data are described using a model Mk selected from a set of candidate models Mk1, Mk2, . . . , MkL. Assume that each Mk is uniquely parameterized by a vector θk, where θk is an element of the parameter space Θ(k). Let L(θk | y) denote the likelihood for y based on Mk; this term is equal to the approximating model f(y | θk). Let θ̂k denote the maximum likelihood estimate of θk obtained by maximizing L(θk | y) over Θ(k). The Bayesian (Schwarz) information criterion is then defined in equation (16) as follows:

BIC = −2 ln L(θ̂k | y) + k ln n    (16)

The first term involves the familiar likelihood of the data given the model, computed at the value θ̂k that maximises it. The second term promotes model parsimony by penalising models with increased model complexity and sample size.
In summary, the BIC provides a large-sample estimator of a transformation of the Bayesian posterior probability associated with the approximating model. Then, by choosing the fitted candidate model f(y | θ̂k) corresponding to the minimum value of the BIC, one is attempting to select the candidate model corresponding to the highest Bayesian posterior probability.
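A minimal sketch of how equation (16) can be used to compare candidate models, on simulated data; the factor combinations below are illustrative, not the actual candidate set of the thesis:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Simulated answers that truly depend only on the camera's z displacement.
x = rng.uniform(-1, 1, size=(300, 3))          # camera offsets (x, y, z)
p = 1 / (1 + np.exp(-(0.5 + 2.5 * x[:, 2])))
ans = rng.binomial(1, p)

# Candidate models built from different combinations of factors.
candidates = {"z only": x[:, [2]], "x, y and z": x}

for name, feats in candidates.items():
    fit = sm.GLM(ans, sm.add_constant(feats),
                 family=sm.families.Binomial()).fit()
    k = feats.shape[1] + 1                     # parameters incl. intercept
    bic = -2 * fit.llf + k * np.log(len(ans))  # equation (16)
    print(name, round(bic, 1))
# The candidate with the minimum BIC is preferred.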
3.4 ESTIMATE METHODS
Estimation of the model parameters
We used two different methods to estimate the model parameters.
First, an iteratively reweighted least squares method for maximum likelihood estimation of the model parameters was performed. Recall that the least squares estimator for the ordinary linear regression model is also the maximum likelihood estimator in the case of normally distributed error terms. Assuming that the distribution of the data belongs to the binomial family, it is possible to derive maximum likelihood estimates for the coefficients of a GLM. Each step of the iteration can be given by a weighted least squares fit. Since the weights vary during the iteration, the likelihood is optimized by an iteratively reweighted least squares algorithm.
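A compact numpy sketch of the IRLS iteration just described, assuming the canonical logit link (a standard textbook form, not necessarily the exact implementation used in the analysis software):

import numpy as np

def irls_logistic(X, y, n_iter=25):
    # Maximum-likelihood fit of a logistic GLM by iteratively
    # reweighted least squares.
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ theta                          # linear predictor
        mu = 1 / (1 + np.exp(-eta))              # mean via inverse logit link
        w = np.clip(mu * (1 - mu), 1e-10, None)  # iteration-dependent weights
        z = eta + (y - mu) / w                   # working response
        # Weighted least squares step: solve (X'WX) theta = X'Wz.
        theta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * z))
    return theta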
Second, Bayesian inference was performed on the model's parameters, which were estimated by the Markov chain Monte Carlo (MCMC) method. Approximating an unknown but potentially complex distribution by drawing samples from it is an effective way to estimate the distribution. In practice, the posterior distribution can be difficult to estimate precisely. There are various ways to obtain the posterior distribution (Gelman et al. (2003); Carlin & Louis (2000)); here we used the Markov chain Monte Carlo (MCMC) approach.
MCMC methods draw samples successively from a target distribution p(Θ|data) (the posterior distribution of the parameters given the data). The idea of MCMC simulation is to let the parameters perform a random walk in parameter space according to a Markov chain set up in such a way that its stationary distribution is the posterior distribution. Each sample depends on the previous one, hence the notion of the Markov chain.
A useful algorithm for setting up the Markov chain is the Metropolis-Hastings (MH) algorithm (Metropolis and Ulam (1949); Metropolis et al. (1953); Hastings (1970)).
The MH algorithm is a Markov chain Monte Carlo method for obtaining a sequence of random samples from a probability distribution for which direct sampling is difficult. The general idea of the algorithm is to use a Markov chain that, at sufficiently long times, generates states that obey the target distribution. To this purpose, the Markov chain must fulfil two conditions: ergodicity and the condition of balance, which is generally obtained if it fulfils detailed balance (a stronger condition). The ergodicity condition ensures that at most one asymptotic distribution exists. The detailed balance condition ensures that at least one asymptotic distribution exists.
The MH algorithm works as follows (Gelman et al. (2003)):
1. Choose a starting value Θ0 for the parameters Θ.
2. For u = 1, 2, ...:
a. Choose a candidate point Θ* (the proposal) from a known distribution at iteration u, given the previous sample Θu−1. Denote this known distribution by Ju[Θ*|Θu−1], called the proposal distribution.
b. Calculate the acceptance ratio

r = p(Θ* | data) / p(Θu−1 | data)    (17)

c. Set Θu = Θ* with probability min(r, 1), and Θu = Θu−1 otherwise.
In this way a list of Θu is generated, and the Θu with u > u_burn-in constitute (a sample of) the posterior distribution for Θ, u_burn-in being the point where the process is considered to have converged to its stationary distribution; the period up to this point is called the burn-in period. In general, due to the time necessary to reach stationarity, we discard the samples at the beginning of the Markov chain. Furthermore, due to the dependency of each sample on the past samples introduced by the Markov chain, it is better to keep for statistical analysis only one sample every few time steps. This is called "thinning". The size of the thinning step should be chosen based on the correlation length of the chain.
Thus, for a typical MCMC method, the non-normalized target distribution p(Θ|data), the number of Markov chain steps, the size of the burn-in phase and the thinning step need to be specified. Each parameter is then described by the mean and the standard deviation of all the samples computed with the MCMC algorithm. Once the mean value of each model parameter is estimated, additional statistical inferences can be carried out as desired.
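A minimal Python sketch of this MH scheme for the logistic model, with burn-in and thinning; it assumes a symmetric (random-walk Gaussian) proposal and a flat prior, so the acceptance ratio reduces to the posterior ratio of equation (17). All tuning values are illustrative:

import numpy as np

rng = np.random.default_rng(2)

def log_posterior(theta, X, y):
    # Log of the non-normalized target p(theta|data): Bernoulli
    # log-likelihood of the logistic model plus a flat prior.
    eta = X @ theta
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def metropolis_hastings(X, y, n_steps=20000, burn_in=5000, thin=10, step=0.1):
    theta = np.zeros(X.shape[1])                # step 1: starting value
    lp = log_posterior(theta, X, y)
    samples = []
    for u in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)  # step 2a
        lp_new = log_posterior(proposal, X, y)
        # Steps 2b-c: accept with probability min(r, 1), computed in log space.
        if np.log(rng.random()) < lp_new - lp:
            theta, lp = proposal, lp_new
        if u >= burn_in and u % thin == 0:      # discard burn-in, then thin
            samples.append(theta.copy())
    return np.array(samples)

# Each parameter is then summarised by samples.mean(axis=0) and
# samples.std(axis=0).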
4. MATERIALS & METHODS
4.1 SUBJECTS
For these experiments, twenty-five participants (six for study 1, nineteen for study 2) were recruited at the EPFL - École Polytechnique Fédérale de Lausanne (CH). The subjects' ages ranged from 20 to 30. Volunteers were asked to read and sign an informed consent form and were paid 20 CHF/hour.
No specific requirements were imposed in choosing the subjects. None of the subjects reported any neurological deficits that could influence their performance in the experiments. Not all the participants were naïve with respect to the rubber hand illusion: four of them had taken part in other experiments involving the RHI, for other projects, all carried out in the LNCO laboratory.
4.2 TOOLS
Two main tools were used to perform the experiments: a tracking system and a head-mounted display (HMD). The two devices were combined by placing 4 markers of the tracking system on the HMD.
Tracking system
A tracking system was necessary to let subjects interact with the virtual scene. To perform the two experiments, the movements of the subjects' head and arm had to be tracked. Thus, when subjects turned their head, the scene changed according to the head position. Furthermore, when subjects moved their own arm, the virtual one moved too, following the real hand displacements.
We used a wireless, optical motion capture system: the ATC ReActor2.
The system operates through a network of infrared light-emitting markers and sensing detectors. The latter are contained in an open-sided rectangular frame made of extruded aluminium. The frame defines the motion-capture workspace. A base station, the ReActor PC, collects the data acquired by the sensor modules and calculates the position of the active markers. In addition, it runs FusionCORE™, a Windows software interface (Figure 17) that interprets the data collected by the motion capture system.
Figure 17: the fusionCORE™ software interface.
An active tracking system such as the ReActor 2 has many advantages. It can be used in natural light and does not require any prior calibration. Since a significant amount of signal processing is already carried out by the electronics in the bars and in the PC, only a small amount of noise is present in the ReActor 2's data, and limited post-processing is therefore required. Marker occlusions are also handled intelligently, thereby minimizing marker dropout. Marker cables usually represent one of the disadvantages of this kind of tracking system: subjects are free to move in the capture area, but they can be obstructed by the cables and may not be able to perform the movements as requested.
In our set-up only restricted movements were performed: subjects moved only the arm, and no other, more complex movements were required to execute the task. Thus, the cables did not affect the subjects' movements.
In the ReActor 2, the active IR markers are connected to a belt pack by thin,
flexible cables. The belt pack communicates with the ReActor 2 PC via a wireless
battery powered radio link. During the experiments, subjects were placed in the
middle of the capture area, wearing the belt, with four active IR markers on their
head (on the HMD) and two on their right hand (Figure 18).
Figure 18: subject seated in the middle of the capture area while performing the experiment.
Head-Mounted Displays (HMD)
During the experiments, participants wore a head-mounted display (Figure 19). Head-mounted displays are devices that allow the subject to observe a 3D world simulated by the computer. The device is fully immersive, creating a sense of presence and allowing subjects to experience the feeling of being inside the virtual scene. Thus, with the HMD, subjects can look and move around in the computer-simulated 3D world as if they were actually inside it.
The head-mounted display used for the experiments was the Fakespace Wide5 HMD. It consists of special 3D glasses with two lenses and permits true stereoscopic vision with a wide field of view (~150° horizontal / ~88° vertical).
Figure 19: Head Mounted Display (HMD)
In our experiments we combined the tracking system with the HMD to make subjects fully integrated within the virtual scene. Placing the IR markers around the helmet allowed us to track the position of the subject's head. The graphics hardware then processed the information about the head's position and updated the image to match the motion of the participant's head. In this way, participants were able to "look around" the virtual scene. In our set-up, the originally designed scene was composed of 4 different images (Figure 20); when conveyed to the HMD, they merged together and the subject watched a single 3D image.
Figure 20: the virtual scene seen without wearing the HMD.
The only limit of the Wide5 HMD was the resolution (1600 x 1200 pixels at 60 Hz). The image seen on the computer monitor had a higher level of detail than the 3D one viewed by the subjects through the HMD. This sometimes caused problems in recreating a virtual scene similar to a real one, in terms of image quality and precision.
4.3 SETUP
A virtual scene was necessary to perform these experiments.
We recreated a virtual room, similar to a real one, using OpenGL and Python.
A ground, a gray table, a red ball, a hand with a forearm and a lamp were
combined together to create this virtual room.
• Arm
This object represents a virtual arm, with hand, fingers and forearm. A pre-existing hand model was used to create it, and a real-hand texture was applied to obtain a virtual hand as similar as possible to a real one. A limitation of the framework was the impossibility of rotating the virtual object like a physiological arm: the virtual arm could only move through rigid translations along the Cartesian axes, and wrist rotations were not allowed either. These physical restrictions conflict with physiological movements, so we were forced to confine the subjects' movements to a restricted area to avoid this mismatch between real physiological movements and virtual movements.
• Light
This object, a yellow ball at the top of the virtual scene, represents a light. It was created to illuminate the virtual scene so that the objects and colours could be seen properly.
• Ground
The ground is created to give the impression of being inside a real room. A special texture is used to make this object as close as possible to reality.
• Table
This object represents a gray table (120cm x 60cm) with 4 legs (100cm).
The virtual hand and the red ball are positioned over the table.
• Ball
This object represents a vibrating red sphere (radius = 4 cm), placed on the
table.
Combining all these objects, the virtual scene appeared like a room with which the subject could interact during the experiment (Figure 21).
Figure 21: the virtual scene, with a table, a ball and a hand.
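Purely as an illustration of how such a scene description might be organized in code (all names and positions here are hypothetical; the actual scene was built with OpenGL and Python):

from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    position: tuple  # (x, y, z) in metres, relative to the room origin

scene = [
    SceneObject("light", (0.0, 2.5, 0.0)),   # yellow ball illuminating the room
    SceneObject("ground", (0.0, 0.0, 0.0)),
    SceneObject("table", (0.0, 1.0, 0.6)),   # 120 cm x 60 cm top, 100 cm legs
    SceneObject("ball", (0.2, 1.04, 0.5)),   # vibrating red sphere, radius 4 cm
    SceneObject("arm", (0.3, 1.0, 0.3)),     # rigid translations only
]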
By wearing the HMD, the subject was able to watch this scene in 3D, as if he/she were inside it. The original image was composed of 4 different images of the same scene; using the HMD, they merged into a single 3D image. The position of the subject's head was tracked with the ReActor2 motion capture system. Thus the system could recognize the subject's position in the virtual scene and let him/her interact with all the objects or move inside the room (e.g. walking around the table or looking under it).
A second scene, shown in Figure 22, was created to help the operator conduct the experiment.
Figure 22: the virtual scene viewed by the operator. Green sphere: subject's head. Blue cone: camera. Blue grid: volume inside which the camera randomly moves. Red line: bias between camera and head position. Yellow ball: light.
This virtual scene represents the same scene as before but in more detail, with some objects necessary for the operator. The subject was not allowed to watch this virtual scene. All the objects of the previous scene - the gray table, the red ball, the hand with a forearm, the ground and the lamp - are still present. In addition, three new objects were created: a green sphere, a blue cone and two blue spherical grids.
• Green sphere
This green sphere represents the head of the subject. It moves when the
subject moves his head, tracked by the ReActor2 motion capture system.
• Blue spherical grids
These are two concentric spherical grids built around the subject's head. The region between the two spheres is the space where the virtual camera can move.
• Blue cone
This cone represents the virtual camera, which can randomly change its position inside the blue grid, along the x, y, z axes. The perspective of the virtual scene changes according to the cone's position and orientation. The distance between the cone and the green sphere is considered as the bias (red line) between the camera and the subject's head (Figure 22).
This scene was created to monitor the camera position and the subject's head position. The operator was able to know the point of view of the scene presented to the subject by comparing the position of the camera (blue cone) with the subject's head position (green ball). Thus, the operator immediately knew whether there was a perspective shift in the scene presented to the subject by looking at the bias between the blue cone and the green ball. Moreover, by checking this scene the operator could immediately verify the subject's position in the virtual room, in terms of the subject's distance from the table, the head position and the hand position. This was especially useful at the beginning of the experiment, to place the subject in the correct position in the virtual environment.
Figure 23: virtual scene viewed by the operator while the subject was performing the experiment.
To better understand the experiments, it is important to explain the meaning of displacing the camera in the space around the subject along the x, y and z axes. Figure 24 helps to comprehend the direction of the camera movements with respect to the subject. As shown in Figure 22, the virtual scene is built on a Cartesian coordinate system. According to Figure 24, we indicate with x the transverse axis, with y the longitudinal one and with z the sagittal one.
Figure 24: 3 different body axes for 3 different camera movements.
Thus, changing the camera position along the z axis means moving it along the sagittal axis of the subject: the virtual scene moves nearer to or further from the subject. Changing the camera position along the y axis means moving it vertically along the longitudinal axis, and the virtual scene moves up or down. Finally, changing the camera position along the x axis means moving it laterally along the transverse axis of the subject, with a consequent movement of the virtual scene to the left or right.
4.4 PROCEDURE
Study 1
This first study was not directly related to the RHI. As previously explained, we wanted to find out the camera positions needed to produce a 1PP virtual scene. To establish these camera positions, analyses were carried out to evaluate how subjects noticed differences between virtual scenes characterized by different points of view, depending on different camera positions. In other words, subjects were asked to evaluate whether the virtual scene was presented in 1PP or not.
To perform the experiment, an experimental flow was developed (Figure 25). The flow coordinated the different scenes presented to the subjects and was divided into three main parts (Figure 25): first, a black screen appeared to the subject for 1 second; then the virtual scene appeared for 5 seconds; and lastly the question appeared for 3 seconds.
Figure 25: the experimental flow. The experiment was divided into 3 main parts: black screen,
virtual scene and question.
This sequence represents one single trial, in which the subject had to perform the task. In order to collect data, 180 trials were performed by each subject. Every 60 trials, a break momentarily stopped the experiment to let the subject rest for a while. The length of this break was not fixed: each subject could decide when to go on with the experiment.
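A skeleton of this trial flow might look as follows (a sketch only: the function names and stub behaviours are hypothetical, and the real experiment rendered the scenes on the HMD rather than on a console):

import random
import time

# Stub callbacks standing in for the real rendering and response collection.
def show_black_screen(): pass
def show_scene(camera_xyz): pass
def ask_question(timeout_s): return random.random() < 0.5

def run_block(n_trials=60, scene_s=5, question_s=3):
    # One block of trials following Figure 25:
    # black screen (1 s), scene (5 s), question (3 s).
    data = []
    for _ in range(n_trials):
        show_black_screen(); time.sleep(1)
        camera = [random.uniform(-0.5, 0.5) for _ in range(3)]  # random [x, y, z]
        show_scene(camera); time.sleep(scene_s)
        answer = ask_question(question_s)
        data.append((camera, answer))  # record camera position and answer
    return data

# 180 trials were run as three blocks of 60, with self-paced breaks between.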
The participants were comfortably seated in a chair in the virtual room in the dark (Figure 26).
Figure 26: one subject during the experiment.
The HMD device was placed on the subject's head to let the subject watch the scene. Four markers attached to the HMD allowed the subject's head position to be tracked: we used the ReActor2 motion capture system to record the head position by monitoring the position of the four markers. No other device was necessary for this experiment, and the arm was not tracked. The subjects were asked simply to watch the scene and answer the question when it appeared in the scene. The question was: "Do you feel that the point of view of this trial is the same as yours?"
Figure 27: question for study 1: "Do you feel that the point of view of this trial is the same as yours?"
This task can be thought of as a passive task, as the subject did not interact actively with the virtual scene. Since the objective of this study was to establish the bounds of conscious perception when changing the perspective of a scene, different viewpoints of the same scene were shown to the subject by changing the camera position. For every trial, the virtual camera changed its position in a random manner, moving within the volume between two concentric spheres - the blue grid described in the operator display. For every trial we recorded the camera coordinates [x, y, z] and the subject's answer for that position. After every 60 trials, the experiment was interrupted to allow the subject to rest for a while. Then, when the subject agreed, the operator restarted the experiment.
Study 2
As previously mentioned, the objective of this second study was to precisely measure the effects of a viewpoint change on the ownership feeling of a virtual hand.
As in study 1, the position of the subject's head was tracked with the ReActor2 motion capture system. In this study the subject's right hand was also tracked. Thus, when subjects moved their real hand, the virtual hand moved as well, following the movements of the real one. In this experiment another tool was also employed: a vibrating motor attached to the tip of the subject's index finger. It conveyed the tactile feedback that, together with the visual stimulus, recreated the conditions of the RHI.
The task involved in study 2 can be thought of as an active task, because the subject actively interacted with the virtual scene, reaching the target and receiving tactile feedback.
To perform the experiment, the same experimental flow was used (Figure 25). The flow coordinated the different scenes presented to the subjects and was divided into three main parts as in study 1; only the timing of each part differed between the two studies. A black screen appeared for 1 second at the beginning, then the virtual scene appeared for 10 seconds and lastly the question for 3 seconds. This sequence represents one trial, in which the subject had to perform the task. In order to collect data, 180 trials were performed by each subject, and every 60 trials a break allowed the subjects to rest for a while. In this study too, the participants were comfortably seated in a chair in the virtual room in the dark. The HMD device was placed on the subject's head to let the subject watch the scene. Four markers were attached to the HMD (Figure 26), allowing the subject's head position to be tracked. Two markers were attached to the back of the subject's right hand. We used the ReActor2 motion capture system to record the subject's position by monitoring the position of the six markers. One vibrator was also attached to the tip of the right index finger (Figure 28).
Figure 28: set-up for the subject's hand; two markers placed on the back of the hand and one
vibrating motor on the tip of the finger.
At the start of the experiment, some habituation trials were presented to the subject to let him/her get familiar with the virtual scene. In particular, he/she was instructed to touch the red ball several times in each trial. Then the real experiment started and the subject was asked to observe the scene and touch the red virtual ball, and then to answer the question about the ownership feeling during the task. The question was: "Do you feel the virtual hand as your own hand?" (Figure 29).
Figure 29: question for study 2: "Do you feel the virtual hand as your own hand?"
As the subjects' hand was tracked, when they moved their real arm, the virtual arm in the scene moved too. Thus, while they moved their own arm to catch the ball, they could see the virtual arm touching the ball as well. Every time the subject reached the red ball, it became green and changed its position on the table. These changes in the colour and position of the virtual ball represent the visual stimulus that was necessary to induce the RHI.
At the same time, the subject also received a tactile stimulus from the vibrating motor on his/her finger. The virtual ball was designed with a fast wave movement, which gave the impression of a vibrating ball. When the subject touched the vibrating virtual ball, he/she felt the vibration on the finger, due to the motor, and thus got the impression of really touching the ball. The tactile stimulation could occur either synchronously or asynchronously with the visual stimulus. To induce the illusion, the subject received a synchronous visual and tactile stimulation: the subject felt the vibration at the same time as he/she was touching the ball. Conversely, as a control condition, an asynchronous stimulation was provided, in which there was no tactile feedback (no vibration) when touching the ball.
The tactile stimulation depended on the congruency level λ, randomly set in each trial. λ represents the probability of receiving a vibration when touching the target. Hence, the tactile feedback given by the vibrating motor occurred randomly, depending on the degree of visual-tactile congruence λ.
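In code, this stimulation rule reduces to a single Bernoulli draw per touch (a sketch; the function name is hypothetical):

import random

def tactile_feedback(congruency_lambda):
    # lambda is the probability of receiving the vibration on a touch:
    # 1.0 gives fully congruent visual-tactile stimulation on every touch,
    # 0.0 never triggers the vibration (the asynchronous control).
    return random.random() < congruency_lambda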
Moreover, by changing the camera’s position, the perspective of virtual scene
changed too. Thus we could analyse how much the ownership feeling was
affected by both changing the position of the camera and the degree of
congruency of the tactile stimulus with the visual stimulus. For each trial, the
camera’s position, the congruency level and the related subject’s answer were
recorded. After every 60 trials, the experiment was interrupted to allow the subject
to rest for a while. Then, when the subject agreed, the operator restarted the
experiment.
4.6 PILOT STUDY
Pilot experiments comprise all the preliminary studies conducted before running the
main experiment, to validate and calibrate the final set-up and to check that the
final experiment works as expected.
The particular setup that we used was arrived at after pilot studies with about 10
participants, using different configurations that either did not result in the illusion, or
in which the illusion was reported to a much lesser extent.
The pilot study provided useful enhancements to the virtual scene, which was
modified to become more similar to the real scenario.
We initially focused on the size of the virtual objects, which needed to be tailored
to the physical dimensions of the subject. We fitted the virtual table height using
a pilot subject, who was asked to sit on the chair and to indicate whether the
virtual table had the height of a real table.
We then focused on the position of the virtual ball and the interaction with the
subject. Subjects were asked to reach the ball by moving their arm and, initially, no
constraints were imposed on the position of the object. However, participants often
performed physiological movements involving the rotation of the shoulder, the
elbow and the wrist. This was particularly frequent when the virtual ball was on the
left of the subjects. The virtual arm could not reproduce this complex movement as
it could only rigidly translate towards the left. This non-coherent position of the
virtual hand made the subjects aware that it was not their real one. Furthermore,
another problem involved the virtual arm itself. As shown in Figure 30, the virtual
arm was “cut”: just a hand with the forearm was reproduced in the virtual
environment.
Figure 30: the virtual hand with the forearm as it appears in the scene. It lacked physiological parts
such as the elbow, the upper arm and the shoulder.
When the virtual ball appeared too far from the subjects, by moving the arm they
could notice that it was not anatomically complete, and thus understood that the
arm was a fake one. To solve these two problems involving the position of the
virtual arm in the virtual scenario, we restricted the area where the virtual ball
could move. After these modifications, the virtual ball could move only on the right
side of the table top, to prevent rotations, and on the bottom part of the table top,
to prevent subjects from noticing the incomplete reconstruction of the arm.
Pilot experiments were also useful to improve the interaction between the virtual
object and the subjects, in terms of the tactile feedback perceived from the ball.
Since the subject received a vibrating tactile feedback when touching the ball, we
decided to assign a visual vibration to the virtual object, to convey a more realistic
contact feeling.
Another problem we addressed with the pilot experiments was the 1PP of the
scene. With our set-up, the region between the two blue spherical grids defined
the space where the virtual camera could move. The perspective of the virtual
scene changed according to the camera displacements through this area. Hence,
if this area was dimensionally inadequate, the camera could not move enough to
induce an evident change of perspective. As a consequence, subjects did not
notice a different perspective of the virtual scene. To avoid this inconsistency, the
size of the blue spheres was carefully chosen to obtain the optimal range required
for our studies.
4.7 DATA MODEL
Study 1
The goal of study 1 was to estimate the probability of perceiving the virtual scene
in 1PP. The subject had to answer the question: “Do you feel that the point of
view of this trial is the same as yours?” saying “YES” or “NO”.
The answer “YES” meant that the subject was not aware that the scene's
perspective had changed, and so perceived the virtual scene as a real 1PP
scene. The subject answered “NO” if he/she noticed that the scene was presented
from a perspective different from his/her visual 1PP.
We wanted to estimate the probability of an answer equal to yes
(P(ans=YES)) while systematically moving the virtual camera along the x, y, z
axes to change the perspective of the virtual scene. For each trial, we collected
the camera position and the subject's answer.
Figure 16 shows the spatial distribution of the collected answers. Each point
represents a data-point that was obtained by placing the camera at the indicated
3D-coordinates and measuring the subject’s answer. The big blue sphere in the
center represents the head of the subject. The colour of each point indicates the
answer given by the subject for a specific camera position, blue meaning ‘YES’ and
red meaning ‘NO’.
A model of the data was then created to describe and reproduce the relationship
between answers and distance between subject’s head and virtual camera.
We represented the camera position along each of the three axes x, y, z,
separately. To extend the 1D psychometric curve (see chapter 3.2) some
parameters needed to be introduced: two vectors b and b0 and a matrix A.
The term b represents the 3D stimulus given by displacing the virtual camera. It is
a vector made by three components: b= [dx, dy, dz], representing the current
camera position in the virtual scene.
A is the matrix of the parameters that determine the shape of the answers'
distribution. It can be thought of as a scale factor: it describes how camera
deviations in each direction affect the probability of saying YES.
        ⎛ dxx  dxy  dxz ⎞
    A = ⎜ dyx  dyy  dyz ⎟
        ⎝ dzx  dzy  dzz ⎠
The elements dxx, dyy and dzz along the diagonal weight the camera
displacements along the x, y, z axes respectively. The matrix A models the
possibility that the subjects' answers may have different widths for camera
biases in different directions.
The term b0 is a vector related to the center of the answers’ distribution. It
represents the point in the space that corresponds to the highest probability of
saying YES.
The model was constructed as follows:

P(ans = YES | b, A, b0) = σ[ -(b - b0) A (b - b0)ᵀ + intercept ]     (18)
The term P(ans = YES | b, A, b0) represents the probability of having the answer
equal to yes, given b, A and b0. The function σ is the sigmoid of equation 9,
which represents the psychometric curve shown in Figure 15.
(b - b0) represents the bias between the current camera position b and the
position b0 that corresponds to the highest probability of a yes answer, under the
hypothesis that if the camera was placed in [x, y, z] = [0, 0, 0] we had the
maximum probability of a 1PP scene, since the camera was aligned with the
subject's head.
Hence, the scalar argument of the sigmoid is a quadratic form of the 3D bias
(b - b0) that, depending on the parameters in A, describes arbitrary ellipsoids
around the head position. The minus sign reflects the fact that the probability of
the YES answer (i.e., not feeling a difference from normal view) decreases with
the perceptual bias.
The idea behind this model is that for stimuli b which are close to the center b0,
the probability of saying “YES” will be maximum. For a stimulus far away from b0,
this probability will decrease, depending on the values of the parameters in A in
each direction. Thus, in this model the parameter b0 provides a generalisation of
the center of the standard psychometric curve, while the parameter A provides a
generalisation of its width/shape.
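As a minimal numerical sketch of Eq. (18) in R (the values of A, b0 and the intercept below are illustrative placeholders, not the estimates reported later):

    sigmoid <- function(x) 1 / (1 + exp(-x))

    # P(ans = YES | b, A, b0) for a given camera position b
    p_yes <- function(b, A, b0, intercept) {
      d <- matrix(b - b0, nrow = 1)                   # 3D bias (b - b0), row vector
      as.numeric(sigmoid(-d %*% A %*% t(d) + intercept))
    }

    A  <- diag(c(0.004, 0.003, 0.0001))  # hypothetical shape matrix (x, y, z)
    b0 <- c(0, -12, 0)                   # hypothetical center of the distribution (cm)
    p_yes(b = c(0, 0, 0), A = A, b0 = b0, intercept = 1.3)

The probability is maximal for b = b0 and falls off along ellipsoids whose axes are set by A.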
As already explained in ch. 3.2, a model of this type falls in the class of
Generalized Linear Models (GLMs).
In the notation of Eq. (11), X represents our data set, in terms of camera positions,
and θ is the set of parameters we wanted to estimate (A, b0).
With this notation, Eq. (18) becomes:

P(ans = YES | θ, X) = sigmoid(θX)
To find out which factors most affected the answers, we conducted a
model selection using the function
glmulti(formula, data, family, level, crit) in R, an open-source
environment for statistical computing and graphics. This function carries out
automated model selection and multi-model inference: it finds the n best
models among all possible models. Models are fitted with the specified fitting
function (glm by default) and are ranked using the BIC (Bayesian information
criterion); the best model is the one with the minimum BIC value. The specific
call applied in R was:

    fit_persp = glmulti(ans ~ (dx+dy+dz+dxx+dyy+dzz+dxy+dxz+dyz),
                        data = D_persp, family = binomial(), level = 1, crit = bic)

As formula, we specified the response variable and the terms (main effects and/or
interactions) to be used in the candidate models. level = 1 means that only main
effects (terms of order 1) were used to build the candidate set. A description of the
data distribution (family = binomial) and of the link function (logit by default)
was also provided.
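For reference, the BIC used to rank the candidate models is defined, for a model with k estimated parameters fitted to n observations with maximized likelihood L̂, as

    BIC = k·ln(n) − 2·ln(L̂)

so that lower values reward goodness of fit while penalizing model complexity.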
After having analysed 500 models combining all the factors that could affect the
answer, the best model found (Table 1) was the one that considered only dy (bias
between the subject's head and the camera position along the y axis – vertical
axis), dyy (camera movements along the y axis – vertical displacements) and dxx
(camera movements along the x axis – left/right displacements) as the factors
affecting the subjects' answers. This meant that all the other factors (dx, dz, dzz,
dxy, dxz, dyz) did not contribute much to explaining a positive answer.
Table 1: Optimum model for study 1
glmulti.analysis
Method: h / Fitting: glm / IC used: bic
Level: 1 / Marginality: FALSE
From 100 models:
Best IC: 1141.62559458361
Best model: ans ~ 1 + dy + dxx + dyy
Evidence weight: 0.732566719723095
Worst IC: 1188.17526126577
1 models within 2 IC units.
6 models to reach 95% of evidence weight.
Table 2: Values for optimum model for study 1

                Nb models   Importance
    dzz         46          0.03101595
    dxz         43          0.03117299
    dz          44          0.03419060
    dx          44          0.05051286
    dxy         45          0.05077023
    dyz         48          0.10359727
    dyy         64          0.99999917
    (Intercept) 100         1
    dy          100         1
    dxx         100         1

Table 2 shows that the factors dyy and dxx (vertical and lateral camera
displacements), together with the factor dy (vertical bias), are the elements that
most affect the subject's answer, with a high importance value (~1) and with more
than 60 models containing them.
Study 2
The goal of study 2 was to measure the ownership feeling of a virtual hand viewed
from several perspectives, by displacing the virtual camera and by changing the
congruency level.
The subject had to answer the question: “Do you feel as if the virtual hand were
your own hand?” saying “YES” or “NO”. The answer “YES” meant that the
ownership feeling had arisen in the subject.
We wanted to estimate the probability of an ownership feeling
(P(ans=YES)) as a function of the perspective (camera position) and of the
congruency level.
As for study 1, a data set was obtained by collecting the answers of each subject.
Figure 16 shows the spatial distribution of the collected answers around the head
of the subject (blue sphere). Each point represents the answer (blue = YES /
red = NO) for a specific camera position and a specific congruency level.
As for the previous study, some features of this distribution needed to be extracted,
and a model was therefore required. In this study, not only the camera position
affected the subject's answer, but also the congruency level between the tactile
and visual stimuli.
Thus, to extend the 1D-stimulus psychometric curve, some parameters needed to
be introduced. We used the same parameters b, b0 and A to model the
perspective effect on the answer, and we introduced a new parameter C, modelling
the effect of the tactile stimulus on the answer.
The probability of an answer equal to ‘Yes’ was estimated by the following
model:

P(ans = YES | b, λ, A, b0, C) = σ[ -(b - b0) A (b - b0)ᵀ + Cλ + intercept ]     (19)
The term P(ans = YES | b, λ, A, b0, C) represents the probability of having an answer
equal to yes, given b, λ, A, b0 and C.
The parameters A, b0 and b are the same already explained for study 1. The
term λ is the congruency level, which controls whether the visual and
tactile stimuli were in synchrony. C is the new coefficient that controls the
influence of the congruency level λ on the probability of a YES answer. Finally, σ is
the sigmoid curve that represents the psychometric curve used to build this model
(Figure 15).
In this model, the viewpoint displacement (b - b0) and the congruency level λ are
independent sources (factors) that we use to explain the subjects' answers.
As already explained in ch. 3.2, a model of this type falls in the class of
Generalized Linear Models (GLMs). In the notation of Eq. (11), X represents our
data set, in terms of camera positions and congruency levels, and θ is the set of
parameters we wanted to estimate (A, b0, C).
With this notation, Eq. (19) becomes:

P(ans = YES | θ, X) = sigmoid(θX)
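As an illustration only, Eq. (19) can be evaluated like Eq. (18), with the extra congruency term Cλ (the parameter values below are placeholders, not the estimates reported later):

    # P(ans = YES | b, lambda, A, b0, C)
    p_yes_rhi <- function(b, lambda, A, b0, C, intercept) {
      d <- matrix(b - b0, nrow = 1)                 # 3D viewpoint bias
      x <- -d %*% A %*% t(d) + C * lambda + intercept
      as.numeric(1 / (1 + exp(-x)))                 # sigmoid
    }

    # hypothetical values: a fully congruent stimulation (lambda = 1)
    # raises P(yes) at every viewpoint
    p_yes_rhi(b = c(0, 0, 0), lambda = 1,
              A = diag(c(0.0011, 0.0015, 0.0008)), b0 = c(0, -11, 0),
              C = 1.0, intercept = 0.2)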
Again we performed the model selection using the function glmulti in R, as for
study 1. The call used was:

    fit_rhi = glmulti(ans ~ (lambda+dx+dy+dz+dxx+dyy+dzz+dxy+dxz+dyz),
                      data = D_rhi, family = binomial(), level = 1, crit = bic)
After having analysed 1050 models combining all the factors that could affect the
answer, the best model found was again the one that considered dy (bias
between the subject's head and the camera position along the y axis – vertical
axis), dyy (camera movements along the y axis – vertical displacements) and dxx
(camera movements along the x axis – left/right displacements) as factors
affecting the subjects' answers. But now the displacements along dzz (moving
the camera far from/near to the subject) and the congruency level of the tactile
feedback (λ) also affected the answer (Table 3).
Table 3: Optimum model for study 2
glmulti.analysis
Method: h / Fitting: glm / IC used: bic
Level: 1 / Marginality: FALSE
From 100 models:
Best IC: 3336.13594168629
Best model: ans ~ 1 + lambda + dy + dxx + dyy + dzz
Evidence weight: 0.524233128654795
Worst IC: 3360.82452647859
2 models within 2 IC units.
8 models to reach 95% of evidence weight.
In Table 4 we can see that the factors dxx, dyy, dzz, dy and λ are the elements
that most affect the subject's answer, with a high importance value (>0.5) and with
more than 40 models containing them.
Table 4: Values of optimum model for study 2
                Nb models   Importance
    dyz         34          0.01959709
    dx          34          0.01980592
    dz          34          0.02043470
    dxz         34          0.02130514
    dxy         36          0.10125568
    dzz         46          0.64445074
    dxx         60          0.95796294
    dyy         84          0.99942761
    (Intercept) 100         1
    lambda      100         1
    dy          100         1
4.8 DATA ANALYSIS
Estimation of the parameters
In order to estimate the parameters of the model, we used two different
methods: the MCMC method and the iteratively reweighted least squares method
for maximum likelihood estimation. R provides two specific functions to compute
the estimates of the parameters: the MCMClogit function and the glm function.
MCMClogit function
Applying the MCMC method to the data, the model's parameters were
extracted (see ch. 3.3). We analysed the data in R, using the
dedicated function MCMClogit.
The function MCMClogit(formula, data, burnin, mcmc, thin,
marginal.likelihood) is used to perform the analysis. This function generates
a sample from the posterior distribution of a logistic regression model using a
random walk Metropolis algorithm. We supplied the data distribution (the subjects'
answers associated with the camera positions) and the priors of the elements of
the matrix A and of the vector b0. The main requirement for the prior is that it
should be non-informative: it should not “bias” the parameters of the model more
than the data itself. Thus we chose a Gaussian prior with variance equal to 100,
because a Gaussian with a “large” standard deviation is known to be
uninformative (spread out).
We set burnin = 5·10³. This is the number of burn-in iterations, discarded to make
sure that the algorithm has reached stationarity. Then we set thin = 100, to reduce
the dependency of each sample on the previous ones, and mcmc = 5·10⁴ as the
number of Metropolis iterations.
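As a consistency check on these settings: with 5·10⁴ post-burn-in iterations thinned by a factor of 100, the chain retains 5·10⁴/100 = 500 posterior samples, which matches the “Sample size per chain = 500” reported in the MCMC outputs (Tables 5 and 7).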
Finally, applying these values in the function, we obtained an acceptance rate of
~0.3 in both studies, which is satisfactory (it is generally accepted that an
acceptance rate of about 20% is appropriate (Gelman, Roberts, and Gilks, 1996)).
Using these values for the burn-in phase and the thinning interval, convergence
was reached. In general, convergence refers to the idea that the chosen MCMC
technique will reach a stationary distribution, and that from this point on the
samples stay in this distribution. Thus, once the algorithm has converged,
samples from the conditional distributions can be used to summarize the
posterior distribution of the parameters of interest.
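Convergence can be inspected, for instance, with the coda package (on which MCMCpack builds), e.g. for the chain mcmc_persp fitted below; this is only one possible check, not necessarily the specific diagnostic used here:

    library(coda)
    plot(mcmc_persp)           # trace and density plots for each parameter
    effectiveSize(mcmc_persp)  # effective number of independent samples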
glm function
glm(formula, family, data) is used to fit generalized linear models, specified by
giving a symbolic description of the linear predictor and a description of the data
distribution. We set family = binomial (with its default logit link function),
according to our model.
STUDY 1: estimate of parameters A and b0
We needed to estimate the parameters A and b0 of Eq. (18).
The coefficients of the parameters are shown in the following tables.
MCMClogit function
The analysis was performed with the call:

    mcmc_persp = MCMClogit(I(as.numeric(ans)-1) ~ (dx+dy+dz+dxx+dyy+dzz+dxy+dxz+dyz),
                           data = D_persp, tune = 0.6, burnin = 5e3, mcmc = 5e4, thin = 100)

An acceptance rate equal to 0.36 was obtained. The mean and standard deviation
of the posterior samples were then used as the parameter estimates.
The coefficients' values and the related credible intervals are reported in Table 5.
As we explained before, the factors that most affect the subject's answer are the
terms dy, dxx and dyy, so we focused on them. Table 5 shows that dy has a
negative value. This means that, displacing the camera up/down, the maximum
probability of a 'yes' answer is not centred at zero (i.e., when the camera is in the
same position as the subject's head): there is a bias. dxx and dyy also have
negative values, showing that the probability of a 'yes' answer decreases when
the camera moves along the x and y axes, away from the subject's head.
Looking at the credible intervals, we are 95 percent certain that the parameters
dy, dxx and dyy lie in those ranges of (negative) values.
Table 5: Output of the MCMC estimation, study 1
The Metropolis acceptance rate for beta was 0.36669

Iterations = 5001:54901
Thinning interval = 100
Number of chains = 1
Sample size per chain = 500

1. Empirical mean and standard deviation for each variable, plus standard error of the mean:

                     Mean         SD        Naive SE    Time-series SE
    (Intercept)   1.28e+00   0.1543882   6.90e-03      7.60e-03
    dx            5.29e-03   0.0058971   2.64e-04      2.46e-04
    dy           -7.13e-02   0.0066398   2.97e-04      3.09e-04
    dz           -3.82e-03   0.0058995   2.64e-04      2.80e-04
    dxx          -3.87e-03   0.0004837   2.16e-05      1.83e-05
    dyy          -2.93e-03   0.0004979   2.23e-05      2.56e-05
    dzz           6.36e-05   0.0004490   2.01e-05      2.12e-05
    dxy          -5.37e-04   0.0005606   2.51e-05      2.87e-05
    dxz          -8.35e-05   0.0005378   2.41e-05      2.64e-05
    dyz           1.02e-03   0.0005909   2.64e-05      3.05e-05

2. Quantiles for each variable:

                     2.5%         25%         50%         75%        97.5%
    (Intercept)   0.974774    1.1866541   1.29e+00    1.37365     1.6145712
    dx           -0.0056012   0.0014279   5.34e-03    0.0091658   0.0166227
    dy           -0.0840809  -0.0760146  -7.16e-02   -0.0668546  -0.0578921
    dz           -0.0157524  -0.0077272  -3.74e-03    0.0004935   0.0069497
    dxx          -0.0048034  -0.0042079  -3.86e-03   -0.0035334  -0.0030272
    dyy          -0.0039242  -0.0032363  -2.95e-03   -0.0026152  -0.0018867
    dzz          -0.0007807  -0.0002474   4.38e-05    0.0003761   0.0009295
    dxy          -0.0015796  -0.0009289  -5.41e-04   -0.0001582   0.0005588
    dxz          -0.0011406  -0.0004385  -9.10e-05    0.0002735   0.0010172
    dyz          -0.0001362   0.0006395   1.02e-03    0.0014121   0.0021486
glm function
The function used was:

    fit_persp = glm(ans ~ (dx+dy+dz+dxx+dyy+dzz+dxy+dxz+dyz),
                    data = D_persp, family = binomial())
The values of the coefficients are reported in Table 6.
Table 6: Maximum likelihood estimation, study 1
Call:
glm(formula = ans ~ (dx + dy + dz + dxx + dyy + dzz + dxy + dxz + dyz),
    family = binomial(), data = D_persp)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.0266  -0.9905   0.6040   0.8597   2.6048

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.278e+00  1.572e-01   8.129 4.31e-16 ***
dx           5.468e-03  5.983e-03   0.914   0.3607
dy          -7.028e-02  6.604e-03 -10.642  < 2e-16 ***
dz          -3.543e-03  6.102e-03  -0.581   0.5615
dxx         -3.832e-03  4.821e-04  -7.948 1.90e-15 ***
dyy         -2.925e-03  5.126e-04  -5.706 1.16e-08 ***
dzz          3.521e-05  4.471e-04   0.079   0.9372
dxy         -5.259e-04  5.915e-04  -0.889   0.3739
dxz         -7.348e-05  5.429e-04  -0.135   0.8923
dyz          9.966e-04  6.050e-04   1.647   0.0995 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 1339.5 on 977 degrees of freedom
Residual deviance: 1109.3 on 968 degrees of freedom
AIC: 1129.3

Number of Fisher Scoring iterations: 4
The output shows that the variable “answer” significantly depended on the
coefficients dy (p-value < 2e-16), dxx (p-value = 1.90e-15) and dyy
(p-value = 1.16e-08). With this method too, the estimated value of dy is negative,
as are the values of dxx and dyy. The same considerations already made for the
MCMC estimation therefore hold here: the negative dy value means that, displacing
the camera up/down, the maximum probability of a 'yes' answer is not centred at
zero, and the negative dxx and dyy values mean that the probability of a 'yes'
answer decreases when the camera moves along the x and y axes, away from
the subject's head.
STUDY 2: estimate of parameters A, b0 and C
We needed to estimate the parameters A, b0 and C of Eq. (19).
The coefficients of the parameters are shown in the following tables.
MCMClogit function
We applied the same kind of call:

    mcmc_rhi = MCMClogit(I(as.numeric(ans)-1) ~ (lambda+dx+dy+dz+dxx+dyy+dzz+dxy+dxz+dyz),
                         data = D_rhi, tune = 0.6, burnin = 5e3, mcmc = 5e4, thin = 100)

An acceptance rate equal to 0.34 was obtained. Since convergence was again
reached, the samples from the conditional distributions were used to summarize
the posterior distribution of the parameters of interest.
The coefficients' values and the related credible intervals are reported in Table 7.
As we explained before, the factors that most affect the subject's answer in study
2 are the terms dy, dxx, dyy, dzz and λ. Table 7 shows that dy again has a
negative value. This means that, in this study too, displacing the camera
up/down, the maximum probability of a 'yes' answer is not centred at zero (when
the camera is in the same position as the subject's head): there is a bias.
dxx, dyy and dzz also have negative values, showing that the probability of
a 'yes' answer decreases when the camera moves along the x, y and z axes,
away from the subject's head. On the contrary, λ has a positive value, meaning
that the probability of a 'yes' answer increases with it.
Looking at the credible intervals, we are 95 percent certain that the parameters
lie in those ranges of values (negative for dy, dxx, dyy and dzz, positive for λ).
Table 7: Output of the MCMC estimation, study 2
The Metropolis acceptance rate for beta was 0.34438

Iterations = 5001:54901
Thinning interval = 100
Number of chains = 1
Sample size per chain = 500

1. Empirical mean and standard deviation for each variable, plus standard error of the mean:

                     Mean         SD        Naive SE    Time-series SE
    (Intercept)   1.90e-01   0.1070576   4.79e-03      4.88e-03
    lambda        1.02e+00   0.1482318   6.63e-03      6.12e-03
    dx            3.17e-04   0.0036966   1.65e-04      1.42e-04
    dy           -3.39e-02   0.0037864   1.69e-04      1.42e-04
    dz            7.29e-04   0.0035792   1.60e-04      1.88e-04
    dxx          -1.18e-03   0.0002933   1.31e-05      1.19e-05
    dyy          -1.54e-03   0.0002980   1.33e-05      1.41e-05
    dzz          -8.25e-04   0.0002804   1.25e-05      1.05e-05
    dxy           7.01e-04   0.0003812   1.71e-05      1.73e-05
    dxz           1.28e-04   0.0003483   1.56e-05      1.65e-05
    dyz          -7.16e-06   0.0003495   1.56e-05      1.66e-05

2. Quantiles for each variable:

                     2.5%         25%         50%         75%        97.5%
    (Intercept)  -1.75e-02    0.1148183   1.90e-01    0.2686761   0.3866878
    lambda        7.29e-01    0.9154259   1.01e+00    1.1224778   1.2967216
    dx           -6.81e-03   -0.0022057   3.57e-04    0.0028398   0.0075483
    dy           -4.14e-02   -0.0365163  -3.37e-02   -0.0312897  -0.0268257
    dz           -6.63e-03   -0.0016970   7.49e-04    0.0031646   0.0074596
    dxx          -1.76e-03   -0.0013765  -1.19e-03   -0.0009843  -0.0006250
    dyy          -2.11e-03   -0.0017524  -1.53e-03   -0.0013483  -0.0009667
    dzz          -1.37e-03   -0.0010200  -8.24e-04   -0.0006326  -0.0002369
    dxy          -2.76e-05    0.0004407   7.13e-04    0.0009417   0.0015024
    dxz          -5.26e-04   -0.0001021   1.23e-04    0.0003503   0.0008058
    dyz          -6.97e-04   -0.0002418  -1.92e-06    0.0002172   0.0006825
glm function
The function used was:

    fit_rhi = glm(ans ~ (lambda+dx+dy+dz+dxx+dyy+dzz+dxy+dxz+dyz),
                  data = D_rhi, family = binomial())
The values of the coefficients are reported in Table 8.
Table 8: Maximum likelihood estimation, study 2
Call:
glm(formula = ans ~ (lambda + dx + dy + dz + dxx + dyy + dzz + dxy + dxz + dyz),
    family = binomial(), data = D_rhi)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.7648  -1.1976   0.8027   1.0306   1.9874

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.825e-01  1.082e-01   1.686  0.09171 .
lambda       1.007e+00  1.460e-01   6.894 5.44e-12 ***
dx           4.587e-04  3.694e-03   0.124  0.90116
dy          -3.348e-02  3.798e-03  -8.816  < 2e-16 ***
dz           9.119e-04  3.569e-03   0.255  0.79834
dxx         -1.136e-03  2.844e-04  -3.996 6.44e-05 ***
dyy         -1.514e-03  2.997e-04  -5.052 4.36e-07 ***
dzz         -8.052e-04  2.672e-04  -3.014  0.00258 **
dxy          7.106e-04  3.875e-04   1.834  0.06668 .
dxz          1.304e-04  3.493e-04   0.373  0.70895
dyz          3.991e-06  3.570e-04   0.011  0.99108
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 3457.9 on 2518 degrees of freedom
Residual deviance: 3285.5 on 2508 degrees of freedom
AIC: 3307.5

Number of Fisher Scoring iterations: 4
The variable “answer” significantly depended on the coefficients dy
(p-value < 2e-16), dxx (p-value = 6.44e-05), dyy (p-value = 4.36e-07) and now
also on dzz (p-value = 0.00258). The “answer” also significantly depended on λ
(p-value = 5.44e-12).
With this method too, the estimated value of dy is negative, as are the values of
dxx, dyy and dzz; on the contrary, λ has a positive value. The same
considerations already made for the MCMC estimation hold here: the negative
dy value means that, displacing the camera up/down, the maximum probability
of a 'yes' answer is not centred at zero, and the negative dxx, dyy and dzz values
mean that the probability of a 'yes' answer decreases when the camera moves
along the x, y and even z axes, away from the subject's head. On the contrary,
the probability of a 'yes' answer increases with a higher congruency level
(bigger λ).
Table 9: Comparison between the two estimation methods

                   glm         MCMClogit
    (Intercept)   1.27e+00    1.28e+00
    dy           -6.97e-02   -7.13e-02
    dxx          -3.77e-03   -3.87e-03
    dyy          -2.92e-03   -2.93e-03
Looking at the values of the estimated parameters, we noticed that the glm
estimates did not differ significantly from those estimated by MCMClogit (Table 9).
Thus we performed all the analyses using the glm-fitted parameters, purely for
practical purposes, due to their higher computational efficiency.
5. RESULTS
A model based on the psychometric function was built to extract the parameters
that characterize the data distribution.
Performing Bayesian inference with the MCMC method, the parameters that
affect the stimulus-response relation of the subjects were estimated. Then, given
the parameter estimates, we could predict the subjects' answers with the function
predict(object, newdata, se.fit) in R. With this function we obtained
predictions and, optionally, estimated standard errors of those predictions from a
fitted generalized linear model object. The estimated parameters from the GLM
(fit_persp) were used, and the predicted probabilities were calculated from the
maximum likelihood fit. We created a new data frame containing the predictor
values with which to predict, setting the camera position over a range from -25 cm
to 25 cm along each direction separately.
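A minimal sketch of this prediction step for displacements along x, assuming that the quadratic regressors are the products of the linear ones (e.g. dxx = dx², an assumption about the data coding that the text does not spell out):

    dx <- seq(-25, 25, by = 1)                 # camera positions along x (cm)
    D_pred_persp <- data.frame(dx = dx, dy = 0, dz = 0,
                               dxx = dx^2, dyy = 0, dzz = 0,
                               dxy = 0, dxz = 0, dyz = 0)
    pred_persp <- predict(fit_persp, newdata = D_pred_persp,
                          se.fit = TRUE, type = "response")  # probabilities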
The results are plotted as functions of the camera displacements along the three
directions x, y, z separately, according to the body axes in Figure 24.
Study 1
The relationship between the YES answer and the camera positions was
analysed. In this study, the subject's answer depended only on the camera
distance from the subject's head: ans = f(d).
The probability of a positive answer (“YES”) was calculated by applying the
estimated parameters A and b0 in the model and predicting the answers over a
range of [-25, 25] cm for each direction, with the call:

    pred_persp = predict(fit_persp, newdata = D_pred_persp, se.fit = TRUE)
Figure 31 shows the results obtained by moving the camera laterally, left and right
along the x axis. The head of the subject was assumed to coincide with the center
of the coordinate system [x, y, z] = [0, 0, 0].
Figure 31: study 1 – results. Probability of having an answer equal to yes, displacing the camera
along x (left/right of the subject).
The data are described by a bell-shaped curve with a peak centred at zero
(µ = 0.8 cm, st.dev = 14.6 cm). The highest probability of a 'YES' answer is 0.78 ± 0.026.
This distribution shows that the probability of answering 'Yes' (i.e.,
the subject perceives the virtual scene as a real one, in terms of point of view) is
maximum (~80%) when the camera position is centred on the subject's head, and
that it decreases when the camera moves along the x axis, to the left or to the
right of the subject. To keep the probability of 'Yes' close to its maximum (~80%),
the camera can move within a range of about [-10, 10] cm (Figure 34).
In general, we do not need the precise value of this range but only an approximate
one, in order to compare it with the head dimensions (biparietal diameter ~16 cm
in an adult man). These results thus define the range of positions along x within
which the camera can be displaced while still ensuring a 1PP scene. In summary,
subjects perceive a 1PP scene if the camera is displaced right and left within a
range comparable with their head size.
Figure 32 shows the results obtained by moving the camera up and down along
the y axis. Again the head of the subject was assumed to coincide with the center
of the coordinate system [x, y, z] = [0, 0, 0].
Figure 32: study 1 – results. Probability of having an answer equal to yes, displacing the camera
along y (up/down of the subject).
As for the x displacement, a similar bell-shaped curve is obtained for camera
displacements in the y direction. The maximum probability of a YES answer is
0.84 ± 0.020 and it is centred at -12 cm (µ = -11.9 cm, st.dev = 17.5 cm).
This means that the probability of a “YES” answer also decreases with the vertical
displacement of the camera, moving it up or down with respect to the subject's
head, but with a peak about 12 cm below the head's centre position. This shift
could be due to the set-up of the virtual scene. In particular, if the camera were
aligned with the head's position, subjects would experience an unrealistic scene,
because they would look straight at the table. For a natural view of the scene,
subjects should look down at the table, as they would do in a real situation. Thus,
placing the camera below the subject's head, they experienced a more realistic
scene.
Figure 34 shows the range of camera positions where the probability of a 'Yes'
answer was equal to 80% (~[-20, 0] cm). These results thus define the vertical
range within which the camera can be displaced while still ensuring a 1PP scene.
In summary, the scene should be shifted down with respect to the subject's eyes
to simulate the view of a realistic 1PP scene.
Table 10 summarizes the main features of the probability distribution of having
YES, displacing the camera along the x and y axes.
Table 10: Summary of main findings
              MAX VALUE P(yes)   STANDARD DEVIATION (cm)   BIAS (cm)
    X AXIS    0.78               14.6                        0
    Y AXIS    0.84               17.4                       -12
Lastly, Figure 33 shows the results obtained by moving the camera backward and
forward along the z axis. The head of the subject was still centred at
[x, y, z] = [0, 0, 0].
Figure 33: study 1 – results. Probability of having an answer equal to yes, displacing the camera
along z (near/far from the subject).
In this case, the shape of the curve differs from the other two. Instead of
a bell-shaped function, the results are characterized by a curve which is reasonably
constant when varying z, along the gaze direction. This means that camera
movements along the z axis do not influence the probability of a “YES” answer,
indicating a lower sensitivity along the depth axis. This is due to the nature of
forward/backward movements, which do not modify the subject's perspective but
rather the distance from the scene.
Figure 34: isoprobability curves. The blue area represents the maximum probability of a yes answer
(80%) and the related range of positions along the x and y axes.
Figure 34 represents the isoprobability curves, i.e., constant slices of the 3D
distribution of positive answers in a 2D format. For z = 0, it shows the ranges of
positions along the x and y axes within which the camera could be displaced while
maintaining the same probability of a 'yes' answer, at different probability levels
(80%, 60%, 40%, 20%). Varying the camera position along x and y (z = 0), a high
probability (~80% in study 1) of a 'Yes' answer is ensured if the camera moves
left/right of the subject's head along x within the blue range, approximately
[-8, 12] cm, and up/down along y within the range ~[-23, -3] cm.
Table 11: Summary of the predicted values and estimated standard errors
RIGHT/LEFT DISPLACEMENTS
> summary(pred_persp$fit)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.2525  0.4843  0.6636  0.6090  0.7536  0.7803
> summary(pred_persp$se.fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.02136  0.02168  0.02196  0.02694  0.03164  0.04352

UP/DOWN DISPLACEMENTS
> summary(pred_persp$fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.09141  0.48540  0.77290  0.64170  0.82740  0.84320
> summary(pred_persp$se.fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.01743  0.02020  0.02613  0.02700  0.03273  0.05200

FORWARDS/BACKWARDS DISPLACEMENTS
> summary(pred_persp$fit)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.7803  0.7803  0.7803  0.7803  0.7803  0.7803
> summary(pred_persp$se.fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.02196  0.02196  0.02196  0.02196  0.02196  0.02196
Table 11 shows the predicted values and the estimated standard errors of those
predictions, for camera movements along the x, y and z axes.
RESULTS – study 2
As for study 1, the relationship between the YES answer and the camera positions
was analysed. In study 2, however, the subject's answer depended not only on the
camera distance from the subject's head but also on the congruency level λ
between the visual and tactile stimuli: ans = f(d, λ).
The probability of a positive answer (“YES”) was calculated by applying the
estimated parameters A, b0 and C in the model and predicting the answers over a
range of [-25, 25] cm for each direction, with the call:

    pred_rhi = predict(fit_rhi, newdata = D_pred_rhi, se.fit = TRUE)

First, we focused on the analysis of the congruency effect on the probability of
a 'Yes' answer, because this effect was the same independently of the camera
movements.
The probabilities related to different λ values along x, y and z are plotted in
Figure 35, Figure 36 and Figure 37. The highest P(Yes) values belong to the red
curve, associated with a congruency level of 100%. Vice versa, the lowest P(Yes)
values belong to the blue curve, associated with a congruency level of 0%. The
green curve, related to a congruency level of 50%, represents an intermediate
condition. It is immediately clear that λ is a critical factor in determining the
probability of a 'Yes' answer. With high λ values, the probability of receiving a
tactile feedback when touching the ball increased; vice versa, with λ = 0%, the
subject never received the tactile feedback when touching the virtual ball.
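A hypothetical sketch of how the three curves could be generated, repeating the prediction grid of study 2 for each congruency level:

    lam_levels <- c(0, 0.5, 1)   # congruency levels 0%, 50%, 100%
    preds <- lapply(lam_levels, function(lam) {
      predict(fit_rhi, newdata = transform(D_pred_rhi, lambda = lam),
              type = "response")
    })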
The analysis of the effect of the camera movements must be performed separately
for each direction.
Figure 35 shows the results obtained by moving the camera laterally left and right,
along the x axis.
Figure 35: study 2 – results. Probability of having an answer equal to yes, displacing the camera
along x (left/right of the subject), using 3 different congruency levels (λ = 0%, 50%, 100%).
Similarly to study 1, the data follow a bell-shaped curve. The probability of
ownership P(Yes) is therefore maximum when the camera position is centred on
the subject's head and decreases when the camera moves along the x axis, to
the left or to the right of the subject.
For λ = 100% the maximum probability of a YES answer is 0.76 ± 0.020, centred
at 0 cm (µ = 0.2 cm, st.dev = 26.5 cm). For λ = 50% the maximum value is
0.66 ± 0.018, still centred at 0 cm. For λ = 0% the peak is greatly decreased, with
a value of 0.54 ± 0.026, but still centred at 0 cm.
Figure 36 shows the results related to the vertical movements of the camera.
Figure 36: study 2 – results. Probability of having an answer equal to yes, displacing the camera
along y (up/down of the subject), using 3 different congruency levels (λ = 0%, 50%, 100%).
As already mentioned, the probability of answering 'Yes' increases with the λ
value.
Moreover, as in study 1, the probability of ownership P(Yes) peaks at
y ≈ -10 cm and decreases as the camera is displaced along the y axis,
independently of the congruency level. This bias could be explained with the
same argument made for study 1 (see Results – study 1), involving the necessity
of looking at a downward-shifted scene to simulate a natural view.
For λ = 100% the maximum probability of a YES answer is 0.79 ± 0.012, centred
at -11 cm (µ = -11.1 cm, st.dev = 23.5 cm). For λ = 50% the maximum value is
0.70 ± 0.016, still centred at -11 cm. For λ = 0% the peak decreases, with a value
of 0.59 ± 0.025, but is still centred at -11 cm.
Finally, Figure 37 shows the results of camera displacements along the z axis,
near/far from the subject.
Figure 37: study 2 – results. Probability of having an answer equal to yes, displacing the camera
along z (near/far from the subject), using 3 different congruency levels (λ = 0%, 50%, 100%).
A bell-shaped curve centred at zero describes the probability P(Yes), similarly to
what was observed in the previous two cases (x and y displacements), but
markedly different from the results of study 1.
The same results and considerations made for the x and y displacements hold for
the congruency levels. Again, the probability of ownership P(Yes) is maximum
when the camera position is centred on the subject's head, and it decreases when
the camera moves along the z axis, away from or towards the subject. For
λ = 100% the maximum probability of a YES answer is 0.76 ± 0.020, centred at
0 cm (µ = 0.6 cm, st.dev = 31.5 cm). For λ = 50% the maximum value is
0.66 ± 0.018, still centred at 0 cm. For λ = 0% the peak is greatly decreased, with
a value of 0.54 ± 0.026, but still centred at 0 cm.
Again we plotted the isoprobability curves to analyse the ranges of positions along
the x, y and z axes within which the camera could be displaced while maintaining
the same probability of a 'yes' answer.
Figure 38 shows the position ranges along x, y and z that ensure a high
probability of a 'Yes' answer, with the maximum congruency level (P(Yes) ~ 70%)
and with the minimum congruency level (P(Yes) ~ 50%).
Figure 38a shows the displacements along x and y with λ = 0%. If the camera
moves left/right and up/down within the green area (approximately x ~ [-17, 17] cm
and y ~ [-23, 0] cm), we obtain the maximum probability of a 'Yes' answer under
these conditions. With λ = 100% the probability reaches 70%, moving the camera
within a range of [-20, 20] cm for lateral displacements and [-25, 4] cm for vertical
displacements. For lateral movements along x, this range is bigger than the one
necessary to ensure a 1PP scene according to the results of study 1; nonetheless,
a high probability of an ownership sensation is reached, thanks to the visual-tactile
feedback (Figure 38).
Concerning the range of vertical movements along y, the results are basically the
same as those found in study 1. This means that, again, the scene should be
shifted down with respect to the subject's eyes to simulate the view of a realistic
1PP scene.
For movements along the sagittal axis (z), we found that the camera position
affects the probability of ownership, decreasing it when the camera is placed far
from the subject, beyond a certain distance. However, Figure 38 shows that
subjects are less sensitive to displacements along the z axis (as in study 1): a
probability of 70% is still ensured by moving the camera within a range of about
[-23, 23] cm, and the camera has to be moved more than 40 cm away from the
head to decrease the probability down to 50%.
Figure 38: comparison between the ranges of positions for λ=0% (panels a and b) and λ=100%
(panels c and d), for x (top panels) and z (bottom panels).
Table 12: Estimated standard errors and predicted values
RIGHT/LEFT DISPLACEMENTS
> summary(pred_persp$fit)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6192  0.6880  0.7334  0.7197  0.7582  0.7662
> summary(pred_persp$se.fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.01969  0.01993  0.02033  0.02285  0.02401  0.03787

UP/DOWN DISPLACEMENTS
> summary(pred_persp$fit)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.3557  0.6305  0.7559  0.6942  0.7871  0.7971
> summary(pred_persp$se.fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.01812  0.01954  0.02273  0.02538  0.02922  0.04605

FORWARDS/BACKWARDS DISPLACEMENTS
> summary(pred_persp$fit)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6642  0.7112  0.7428  0.7336  0.7604  0.7662
> summary(pred_persp$se.fit)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 0.01957  0.01983  0.02029  0.02221  0.02300  0.03479
Table 12 shows the predicted values and the estimated standard errors of those
predictions, for camera movements along the x, y and z axes, for λ = 100%.
6. DISCUSSION
This project focused on the RHI in a dynamic virtual reality set-up, and two studies
were developed to investigate the limits of conscious perception. The main goal
of this work was to systematically analyse and quantify how perspective
affects the RHI, testing the perception of both point-of-view changes and
ownership of a virtual hand.
Our experiments were based on the classical RHI set-up (Botvinick M., Cohen J.,
1998). This illusion belongs to a class of perceptual effects involving intersensory
bias, which result when the information available to different sensory modalities is
discordant. As in the classical RHI experiments, the main idea of this project was
that a multisensory conflict arose by modifying visual and tactile perception. As a
consequence, the brain was no longer able to create a coherent representation of
the body. Thus, a misattribution of the fake hand emerged: subjects felt the
rubber hand become part of their body image. To test the subjects' ownership
feeling and to collect data, we based our questions on the classical RHI
experiment: subjects were asked to answer questions taken from the original
experiment by Botvinick and Cohen.
The classical RHI set-up has many limitations. In particular, imprecise
experimental control affects the stimulation, which varies across subjects: the
tactile stimuli, varying in pressure intensity, stimulation timing and stimulated
points, cannot be exactly reproduced in each participant. It is thus difficult to
perform and repeat the experiment under the same conditions, preventing a
modelling of the data. We therefore planned to investigate this cognitive illusion
with a more systematic and standardized approach, operating within a virtual
environment. Working with VR allowed us to achieve better automation in
conveying the tactile stimulus, in collecting data and in replicating the same
conditions across different subjects. The tactile stimulation was systematically
controlled by changing the value λ; using the same congruency levels, we could
test the subjects under the same conditions in terms of tactile stimulation.
Moreover, in the literature we found experiments concerning how perspective
affects the ownership feeling. Nonetheless, only 1PP and 3PP effects were tested
(Lenggenhager et al., 2007, 2009; Aspell et al., 2009; Petkova and Ehrsson,
2008; Slater et al., 2009, 2010; Ehrsson, 2007). Operating within VR, it
was possible to easily go beyond what is feasible in the physical world. Thus we
could systematically modify the perspective of the virtual scene all around the
space, not only testing the 1PP or 3PP conditions, in order to study the effect of a
change in the point of view.
Subjects could interact with the virtual scenario, watching the scene through the
HMD. They experienced the sensation of being seated at a virtual table, looking at
their arm. To study the effects of changing the perspective, different points of view
of the same scene had to be presented to the subjects. Thus, the scene was
reproduced by a virtual camera that showed it from various perspectives, moving
around the subjects' head. Hence subjects observed the scene from different
points of view, depending on randomly changing camera positions.
In study 1 we analysed only the conscious perception of perspective, without
involving the RHI directly. Analyses were carried out to evaluate when subjects
perceived a 1PP virtual scene while the camera moved around the subject, along
the x, y, z axes. We expected that some camera positions would produce a 1PP
scene more than others. Analysing the collected data, we were able to estimate
the range of camera positions that allowed the creation of a 1PP scene.
Since the answer “YES” referred to the question “Do you feel that the point of view
of this trial is the same as yours?”, the probability of a YES answer indicated that
subjects perceived a 1PP scene.
Different probabilities of having “YES” are obtained by shifting the camera position
along the three axes x, y, z, according to Figure 24.
Figure 39 shows the probability of having “YES” given the camera displacements
along x, y, z, respectively.
Figure 39: results – study 1. Probability of having “YES”, given a camera displacement along the
x, y, z axes.
The results for lateral camera displacements along the x axis showed that the
probability of a positive answer (YES) peaked at zero and then decreased.
This means that when the camera position was centred on the subject's head (at
zero) the probability of a YES answer was maximum; moving the camera to the
right/left of the subject, this probability decreases.
The results for camera displacements along the y axis showed that the probability
of a YES answer was still described by a bell-shaped curve. It decreases with the
vertical displacement of the camera, moving it up or down with respect to the
subject's head. The results also showed that in this case the peak was not centred
at zero.
A first hypothesis is that this bias is related to the set-up of the virtual scene. In
particular, when the camera was centred at y = 0, subjects would look straight at
the table (Figure 40a). For a natural view of the scene, subjects should look down
at the table, as they would do in a real situation (Figure 40b). Thus, placing the
camera below the subject's head, they experienced a more realistic 1PP scene.
Figure 40: possible explanation for the y bias, which seems to be related to the set-up.
However, this issue should be investigated more carefully, to establish whether it
is related to the set-up or whether it originates from a cognitive perceptual
mechanism.
Nonetheless, even in this case we conclude that moving the camera position up
or down changes the perceived perspective of the scene: the farther the virtual
camera is placed from the subject's head, the more he/she is able to notice a
change in perspective.
Concerning the results for camera displacements along the z axis, a different
curve trend emerged. Instead of a bell-shaped curve, the results show a flat,
reasonably constant curve. This means that camera movements along the z axis
do not significantly change the perspective of the scene, which is still perceived in
1PP even when the camera moves. We conclude that subjects are less sensitive
in discriminating perspective differences along the depth direction.
Lastly, Figure 41 qualitatively shows the range of positions that the camera can
cover while still producing a 1PP virtual scene (blue area). The lateral range
basically covers the head position, while the vertical one is shifted down with
respect to the subject to ensure a more realistic vision, as previously explained.
Figure 41: isoprobability curves. The light blue area represents the maximum probability of a yes
answer (~80%) and the related range of positions along the x and y axes. The head of the subject is
also plotted for easier comprehension of the figure.
In study 2 we systematically collected measures and data on the effect of changes
in perspective and in visual-tactile congruency on the ownership of a virtual hand.
One result we expected from this experiment was that the RHI would be stronger
when the camera position was closer to the subject's head position. In this
condition, a first person perspective was reproduced, according to the results of
study 1: the subject was able to watch the scene from the same point of view as
his/her own eyes. The scene was therefore closer to the real one, thus allowing
the RHI to occur more strongly.
Analysing the collected data, we were able to estimate the probability of an
answer equal to YES (ownership feeling), given a camera displacement and a
specific congruency level.
Two separate effects on the ownership feeling were studied: the
perspective changes and the congruency level between the visual and the tactile
stimuli, indicated by the value λ.
Figure 42 shows P(ans=YES) given camera displacements along the three axes
(x, y, z), for each congruency level.
Figure 42: results – study 2. Probability of experiencing the ownership feeling, given a camera
displacement along the three axes (x, y, z) and using 3 different congruency levels.
For all three directions, the probability of an answer equal to YES is
maximum when the camera is placed in the same position as the subject's head,
and it decreases when the camera is displaced away from the head.
Focusing on the congruency level effects, Figure 42 shows that for higher λ,
P(ans=YES) increases. In particular, the results show that the probability of
feeling the virtual hand as one's own with a fully synchronous feedback was ~25%
higher than with an asynchronous feedback [P(yes) max ~ 80% with λ=100% and
P(yes) max < 60% with λ=0%].
This is the first main result, meaning that the tactile feedback, together with the
visual feedback, contributes to the rise of the ownership feeling. In particular, if the
visual and tactile stimuli are congruent, the subject feels a stronger ownership
of the virtual hand; vice versa, providing a tactile stimulation incongruent
with the visual stimulus, a feebler feeling arises.
This result is consistent with the findings of the classical RHI experiments, which
inferred that synchronous tactile stimulation of the real hand and the fake hand
was a necessary condition to induce the RHI (Tsakiris & Haggard, 2005).
Concerning the analysis of the effects of changes in perspective, similar results
in terms of probability distribution shapes and widths were found in all the panels
of Figure 42. Independently of the λ value, a bell-shaped curve described the
probability of an ownership feeling as the camera moved along the three different
axes.
The results for camera displacements along the x axis showed that the probability
of ownership peaked around zero and then decreased.
The results obtained with vertical camera displacements along the y axis also
showed that the probability of an ownership feeling peaked and then decreased.
This means that the feeling of ownership decreases when the camera is displaced
up/down from the subject, along the longitudinal axis (fig. 21). We again noticed a
shift in the peak of the curve, and we put forward the same hypothesis already
made for study 1, about a set-up issue.
Concerning the results for camera displacements along the z axis, we found a
difference from the results obtained in study 1. Now the probability of
ownership peaked around zero and then decreased. This means that the
ownership feeling arises more strongly when the virtual camera is close to the
head of the subject (1PP), but it decreases when the camera moves away from
the subject's head along the sagittal axis.
This could be explained by the visual-tactile stimulation, which could provide
a sense of being inside the virtual environment, increasing the subject's sensitivity
to perspective changes even along the depth axis. However, Figure 43a shows
that subjects are still less sensitive to displacements along the z axis (as in
study 1) compared to camera movements along x and y (var z > var x, var y – see
Table 13). In fact, a probability of 70% is still obtained even when moving the
camera within a bigger range (~[-25, 25] cm), and placing the camera very far
from the head (z > ±30 cm) still provides a 50% probability of an ownership
feeling. This effect does not occur to the same extent when moving the camera
laterally/vertically along the x/y axes (Figure 43b).
Figure 43: comparison between the ranges of positions along x (left panel) and z (right panel).
In general, the only significant difference between the curves of study 1 and
study 2 is the width of the bell (for study 2 the standard deviation is bigger than
for study 1 – see Table 13). This suggests that a tactile feedback enhances the
sensation of being immersed in a real environment, and thus also the feeling of
ownership.
Table 13: Comparison of the standard deviation for the two studies presented here
                 Study 1   Study 2
    St. dev x    14.6      26.5
    St. dev y    17.4      23.4
    St. dev z    --        31.5
In summary, we conclude that perspective matters for the rise of the ownership
feeling. This sensation is stronger when the camera covers the positions
necessary to provide a 1PP, and it becomes feebler when the camera is displaced
away from the subject's head, along all the axes x, y and even z. Furthermore, the
synchrony between the visual and tactile stimulation increases the sense of
ownership, coherently with the results obtained in the classical RHI
experiment (see Figure 11) (Tsakiris & Haggard, 2005). Lastly, the tactile feedback
is also important in general because it provides a more realistic sense of being
inside the virtual environment, increasing the feeling of ownership even if the
scene is not perfectly viewed in 1PP.
7. CONCLUSIONS and FUTHER DEVELOPMENTS
This project was carried out to explore the mechanisms responsible for the body
ownership feeling in VR. The classical experiment concerning the rubber hand
illusion (Botvinick M., Cohen J., 1998) was reproduced in a virtual environment,
exploiting the many advantages in working with VR to explore body perception. VR
allowed the construction of an automatic set-up and a successive systematic data
collection, in order to study this ownership illusion with a more scientific approach.
Two different studies were perform in order to achieve this goal. A preliminary
study (study 1) was necessary to better understand the results of a second study
(study 2) focused on the ownership feeling mechanisms.
Study 1 investigated the pure perceptual mechanisms in a virtual environment.
Subjects were ask to indicate if the virtual scene was presented with a 1PP or not.
Since the perspective of the scene was related to the positions of a virtual camera,
we could identify the range of these positions that allowed the creation of a 1PP
virtual scene.
Results showed that a 1PP scene could be reproduced by moving the camera
between -10 cm and +10 cm to the right/left of the subject's head, combined with a vertical
displacement between 3 cm and 20 cm below the subject, while movements
along the depth axis did not greatly affect the perspective of the scene.
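As a rough illustration, these ranges can be encoded as a simple predicate. Note that this hard-threshold box is a hypothetical simplification: the actual result of study 1 is a smooth probability curve, and the depth axis is ignored here because it had little effect.

```python
# Hypothetical simplification of the empirical 1PP region of study 1:
# a hard box (x within +/-10 cm laterally, y from 3 to 20 cm below the
# head). The real result is a smooth probability curve over positions,
# and z (depth) had little effect, so it is ignored here.
def in_1pp_region(x_cm: float, y_cm: float) -> bool:
    return -10.0 <= x_cm <= 10.0 and -20.0 <= y_cm <= -3.0

print(in_1pp_region(0.0, -10.0))   # True: near the ideal 1PP position
print(in_1pp_region(15.0, -10.0))  # False: too far laterally
```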
Study 2 focused on the effects of a 1PP scene and of visuo-tactile feedback
on the ownership feeling, again performing the experiments in VR. Subjects were
asked to evaluate whether the virtual hand in the scene was felt as their own hand. The
answers depended again on the perspective of the virtual scene and also on the
congruency between the visual and the tactile stimulation conveyed to the
subjects during the experiments.
Results showed that the probability of feeling the virtual hand as one's own was
~25% higher with fully synchronous feedback than with asynchronous
feedback [P(yes) max ~ 80% with λ = 100% and P(yes) max < 60%
with λ = 0%].
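The size of this effect can be illustrated with a small hedged sketch, assuming, purely for illustration, that the peak probability interpolates linearly between the asynchronous and fully synchronous values reported above; the linear dependence on λ is an assumption, not the model fitted in this work.

```python
# Illustrative only: peak 'ownership' probability as a function of the
# synchrony level lambda, interpolating linearly between the asynchronous
# (< 60%) and fully synchronous (~80%) peaks reported above. The linear
# dependence on lambda is an assumption, not the fitted model.
def p_yes_max(lam: float, p_async: float = 0.60, p_sync: float = 0.80) -> float:
    assert 0.0 <= lam <= 1.0, "lambda is a fraction between 0 and 1"
    return p_async + lam * (p_sync - p_async)

print(p_yes_max(0.0))  # ~0.60 with asynchronous stimulation
print(p_yes_max(1.0))  # ~0.80 with fully synchronous stimulation
```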
This result was consistent with the results obtained in the classical RHI experiment
(see Figure 11) (Tsakiris & Haggard, 2005). Furthermore, independently of the value of λ,
the ownership feeling reached a maximum when the virtual scene was shown in a
1PP (placing the camera according to the results of study 1) and decreased when
the virtual scenario was presented from a point of view different from the subject's.
This result too was consistent with the literature on the
importance of a 1PP for the ownership mechanisms (Petkova, Khoshnevis
& Ehrsson, 2011; Ehrsson, 2007). Furthermore, we found that providing tactile
stimulation to the subject increased the feeling of ownership, even when the scene was
not viewed in a perfect 1PP. Thus, we can also conclude that tactile feedback plays
an important role in the ownership feeling, giving a more realistic sense of being
inside the virtual environment.
In summary, we conclude that a 1PP virtual scene combined with congruent
visuo-tactile feedback are necessary conditions to induce the ownership feeling of
a virtual hand when performing the classical RHI experiment in VR.
Nonetheless, a more accurate error analysis and an inter-subject variability analysis
should be performed in order to obtain more quantitative results in
support of these conclusions.
In conclusion, this project shows that our virtual set-up was suitable to reproduce
the classical RHI conditions. However, further improvements could be made
to render the virtual scene more realistic. The scene could be improved in
terms of image quality (e.g. adding objects to the scene). The virtual arm
could be rendered more realistically, adding anatomical and physiological
characteristics. The movements of the real arm could also be better reproduced in
the virtual scene by tracking not only the hand, but also the wrist, the forearm and
the elbow.
Furthermore, information about brain activity could be obtained, e.g. by
applying electrodes to the subjects' heads and recording the EEG during the
experiments. Exploiting this virtual RHI set-up, one could detect which brain areas
are activated while automatically and systematically changing the perspective of
the virtual scene and the visuo-tactile congruency of the stimulation.
Finally, the principles of body ownership described here could be employed in
a wide range of industrial and clinical applications, e.g. to investigate
neurological conditions such as schizophrenia or the phantom limb syndrome, or
to optimise artificial devices such as prostheses, orthoses and artificial limbs.
8. BIBLIOGRAPHY
Armel K. C. and Ramachandran V. S. (2003). Projecting sensations to external
objects: Evidence from skin conductance response. Proceedings of the
Royal Society of London: Biological 270: 1499–1506
Ascension Technology Corporation: “Re-Actor2, Installation and Operation
Guide”, ATC 2003.
Aspell J. E., Lenggenhager B., and Blanke O. (2009). Keeping in touch with
one’s self: multisensory mechanisms of self-consciousness. PLoS ONE 4,
e6488. doi: 10.1371/journal.pone.0006488.
Atkins J.E., Fiser J., Jacobs R.A. (2001). Experience-dependent visual cue
integration based on consistencies between visual and haptic percepts,
Vision Res. 41: 449–461.
Blanke O., Landis T., Spinelli L., Seeck M. (2004). Out-of-body experience and
autoscopy of neurological origin. Brain 127: 243–258
Blanke O., Ortigue S., Landis T., Seeck M. (2002). Neuropsychology:
Stimulating illusory own-body perceptions. Nature 419: 269–270.
Botvinick M., Cohen J. (1998). Rubber hands 'feel' touch that eyes see.
Nature 391: 756.
Box G. and Tiao, G.C. (1973) Bayesian Inference in Statistical Analysis,
Wiley, ISBN 0-471-57428-7
Burgess N. (2006). Spatial memory: how egocentric and allocentric
combine. Trends Cogn. Sci. 10: 551–557.
Carlin, B.P. & T.A. Louis (2000). Bayes and empirical Bayesian methods for data analysis. London, U.K.: Chapman & Hall/CRC.
Carlin J., Wolfe R., Hendricks Brown C. and Gelman A. (2001) A case study on
the choice, interpretation and checking of multilevel models for longitudinal
binary outcomes. Biostatistics, 2, 397–416.
Chib S. and Greenberg E. (1995). Understanding the Metropolis-Hastings
Algorithm. The American Statistician, Vol. 49, No. 4.
Costantini M., Haggard P. (2007). The rubber hand illusion: Sensitivity and
reference frame for body ownership. Consciousness and Cognition 16:
229–240.
Deneve S., Pouget A. (2004). Bayesian multisensory integration and cross-
modal spatial links, Journal of Physiology - Paris 98: 249–258
Durgin F.H., Evans L., Dunphy N. (2007). Rubber hands feel the touch of light.
Psychol. Sci. 18(2): 152–157.
Ehrsson H H., Spence C., and Passingham R.E. (2004). That’s my hand!
Activity in premotor cortex reflects feeling of ownership of a limb. Science
305, 875–877
Ehrsson H.H. (2007). The Experimental Induction of Out-of-Body Experiences.
Science 317: 1048
Ehrsson H.H. (2011). “The concept of body ownership and its relationship to
multisensory integration,” in The Handbook of Multisensory Processes,
ed. B. Stein (Boston: MIT Press).
Ernst M.O., Banks M.S. (2002). Humans integrate visual and haptic information
in a statistically optimal fashion. Nature 415: 429–433.
Fogassi L., Gallese V., di Pellegrino G., Fadiga L., Gentilucci M., Luppino G.,
Matelli M., Pedotti A., and Rizzolatti G. (1992). Space coding by premotor
cortex. Exp. Brain Res. 89: 686–690.
Gelman, A., J.B. Carlin, H.S. Stern & D.B. Rubin (2003). Bayesian data analysis. London, U.K.: Chapman & Hall/CRC.
Gibson J.J., ed. (1979). The ecological approach to visual perception. Hillsdale,
New Jersey: Lawrence Erlbaum Associates, Publishers.
Gilks W., Clayton D., Spiegelhalter D., Best N., McNeil A., Sharples L. and Kirby
A. (1993). Modelling complexity: applications of Gibbs sampling in
medicine. Journal of the Royal Statistical Society, Series B, 55, 39–52.
Graziano M. (1999). Where is my arm? The relative role of vision and
proprioception in the neuronal representation of limb position. Proceedings
of the National Academy of Sciences of the United States of America 96:
10418–10421.
Graziano M., and Gross C. G. (1998). Visual responses with and without
fixation: neurons in premotor cortex encode spatial locations independently
of eye position. Exp. Brain Res. 118: 373–380.
Graziano M., Botvinick M. (2002). How the brain represents the body: insights
from neurophysiology and psychology.
Graziano M., Hu X., Gross C. (1997). Visuospatial properties of ventral
premotor cortex, J. Neurophysiol. 77: 2268–2292.
Graziano M.S., Yap G.S., Gross C.G. (1994). Coding of visual space by
premotor neurons, Science 266:1054–1057.
Hagni K., Eng K., Hepp-Reymond M.C., Holper L., Keisker B., Siekierka E.,
Kiper D.C. (2008). Observing Virtual Arms that You Imagine Are Yours
Increases the Galvanic Skin Response to an unexpected Threat.
Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and
their applications. Biometrika 57: 97-109.
IJsselsteijn W., de Kort Y. and Haans A. (2006). Is this my hand I see before
me? The rubber hand illusion in reality, virtual reality and mixed reality.
Presence: Teleoperators & Virtual Environments, 15, 4, 455–464.
James W., ed. (1890). Principles of Psychology. Holt and Company.
Jeannerod M. (2003). The mechanism of self-recognition in humans.
Behavioural Brain Research 142: 1–15.
Kass R. and Raftery A. (1995). Bayes Factors. Journal of the American
Statistical Association 90, 773 - 795.
Klatzky R. L. (1998). “Allocentric and egocentric spatial representations:
definitions, distinctions, and interconnections,” in Spatial Cognition – An
Interdisciplinary Approach to Representation and Processing of Spatial
Knowledge, eds C. Freksa, C. Habel, and K.F. Wender (Berlin: Springer),
1–17.
Knill D.C., Richards W. (1996). Perception as Bayesian Inference, Cambridge
University Press, Cambridge, MA.
Lenggenhager B., Mouthon M., and Blanke O. (2009). Spatial aspects of bodily
self-consciousness. Conscious. Cogn. 18: 110–117
Lenggenhager B., Tadi T., Metzinger T., and Blanke O. (2007). Video ergo sum:
manipulating bodily self-consciousness. Science 317: 1096–1099.
Lloyd D.M. (2007). Spatial limits on referred touch to an alien limb may reflect
boundaries of visuo-tactile peripersonal space surrounding the hand.
Brain and Cognition.
Macaluso E., Driver J. (2001). Spatial attention and crossmodal interactions
between vision and touch, Neuropsychologia 39: 1304–1316.
Maguire E. A., Burgess N., Donnett J. G., Frackowiak R. S., Frith C. D., and
O’Keefe J. (1998). Knowing where and getting there: a human navigation
network. Science 280: 921–924.
Makin T.R., Holmes N.P., and Ehrsson H.H. (2008). On the other hand: dummy
hands and peripersonal space. Behav. Brain Res. 191, 1–10
Merleau-Ponty M., ed. (1945). Phénoménologie de la perception.
Paris: Gallimard.
Metropolis, N. & S. Ulam (1949). The Monte-Carlo method. Journal of the
American Statistical Association 44:335-341.
Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller & E. Teller
(1953). Equation of state calculations by fast computing machines. Journal
of Chemical Physics 21: 1087–1092.
Metzinger T., ed. (2003). Being No One: The Self-Model Theory of
Subjectivity. MIT Press.
Monnet, J. (1995). “Virtual Reality: the technology and its applications”, Wiley &
Sons, New York.
Nelder J., Wedderburn R. (1972). "Generalized Linear Models". Journal of the
Royal Statistical Society. Series A (General) (Blackwell Publishing) 135 (3):
370–384. doi:10.2307/2344614. JSTOR 2344614.
Perez-Marcos D., Slater M., Sanchez-Vives M. (2009). Inducing a virtual hand
ownership illusion through a brain-computer interface. Neuroreport 20:
589–594.
Petkova V. I., and Ehrsson H. H. (2008). If I were you: perceptual illusion of
body swapping. PLoS ONE 3, e3832. doi: 10.1371/journal.pone.0003832
Petkova V. I., Khoshnevis M. and Ehrsson H.H. (2011). The perspective matters!
Multisensory integration in egocentric reference frames determines full-
body ownership. Frontiers in Psychology, Vol. 2, Art. 35.
Prinz W., Hommel B., eds (2002). Common Mechanisms in Perception and
Action: Attention and Performance XIX. Oxford: Oxford University Press.
Richardson S. and Best N. (2003). Bayesian hierarchical models in ecological
studies of health-environment effects. Environmetrics, 14, 129–147.
Rizzolatti G., Scandolara C., Matelli M., Gentilucci M. (1981). Afferent
properties of periarcuate neurons in macaque monkeys. I. Somatosensory
responses. Behavioural Brain Research 2: 125–146.
Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics 6,
461- 464.
Slater M., Perez-Marcos D., Ehrsson H.H., Sanchez-Vives M.V. (2009). Inducing
Illusory Ownership of a Virtual Body. Frontiers in Neuroscience 3: 214–
220.
Slater M., Sanchez-Vives M.V. (2005). From presence to consciousness through
virtual reality. Nature Reviews Neuroscience 6: 332–339.
Slater M., Spanlang B., Sanchez-Vives M. V., and Blanke O. (2010). First
person experience of body transfer in virtual reality. PLoS ONE 5, e10564.
doi: 10.1371/journal.pone.0010564.
Slater M., Perez-Marcos D., Ehrsson H.H. and Sanchez-Vives M.V. (2008).
Towards a Digital Body: The Virtual Arm Illusion. Frontiers in Human
Neuroscience.
Stroud T. (1994). Bayesian analysis of binary survey data. Canadian Journal of
Statistics, 22, 33–45.
Tipping M.E. (2006). Bayesian Inference: An Introduction to Principles and
Practice in Machine Learning. From Least-Squares to Bayesian Inference,
1–19.
Tsakiris M. and Haggard P. (2005). The rubber hand illusion revisited: visuo-
tactile integration and self-attribution. Journal of Experimental Psychology:
Human Perception and Performance, 31, 80–91.
Tsakiris M., Hesse M. D., Boy C., Haggard P., Fink G. R. (2007). Neural
Signatures of Body Ownership: A Sensory Network for Bodily
Self-Consciousness. Cerebral Cortex, Volume 17, Issue 10: 2235–2244.
Van Beers R.J., Sittig A.C., Denier van der Gon J.J. (1996). How humans
combine simultaneous proprioceptive and visual position information. Exp.
Brain Res. 111: 253–261.
Van Beers R.J., Sittig A.C., Denier van der Gon J.J. (1999). Integration of
proprioceptive and visual position-information: An experimentally supported
model. J. Neurophysiol. 81: 1355–1364.
Van Beers R.J., Wolpert D.M., Haggard P. (2002). When feeling is more
important than seeing in sensorimotor adaptation. Curr. Biol. 12: 834–837.
Vogeley K. and Fink G.R. (2003). Neural correlates of the first-person
perspective. Trends in Cognitive Sciences, Vol. 7, No. 1.