transactions on vehicular technology, vol. x, no. x, …

13
TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016 1 Mitigating Large Errors in WiFi-based Indoor Localization for Smartphones Chenshu Wu, Member, IEEE, Zheng Yang, Member, IEEE, Zimu Zhou, Member, IEEE, Yunhao Liu, Fellow, IEEE, and Mingyan Liu, Fellow, IEEE Abstract—Although WiFi fingerprint-based indoor localization is attractive, its accuracy remains a primary challenge especially in mobile environments. Existing approaches either appeal to physical layer information or rely on extra wireless signals for high accuracy. In this paper, we revisit the RSS fingerprint-based localization scheme and reveal crucial observations that act as the root causes of localization errors, yet are surprisingly overlooked or not adequately addressed in previous works. Specifically, we recognize APs’ diverse discrimination for fingerprinting a specific location, observe the RSS inconsistency caused by signal fluctuations and human body blockages, and uncover the transitional fingerprint problem on commodity smartphones. Inspired by these insights, we devise a discrimination factor to quantify different APs’ discrimination, incorporate robust regression to tolerate outlier measurements, and reassemble different normal fingerprints to cope with transitional fingerprints. Integrating these techniques in a unified system, we propose DorFin, a novel scheme of fingerprint generation, representation, and matching, which yields remarkable accuracy without incurring extra cost. Extensive experiments in three campus buildings demonstrate that DorFin achieves a mean error of 2.5 meters and more importantly, decreases the 95th percentile error under 6.2 meters, both significantly outperforming existing approaches. Index Terms—WiFi, fingerprints, smartphones, indoor localization 1 I NTRODUCTION T HE proliferation of mobile computing has spurred extensive interests in location-based services, leading to an urgent need for fine-grained location. The past decade has witnessed the conceptualization and development of various wireless indoor localiz- ation techniques, including WiFi [1], [2], RFID [3], [4], acoustic signals [5], [6], ultrasound [7], [8], etc. Due to the wide deployment and availability of WiFi infrastructure, WiFi fingerprint-based indoor localiza- tion has become one of the most attractive localization techniques [9]–[14]. Roughly speaking, a fingerprint- based scheme consists of two stages: site survey and fingerprint matching. During site survey (a.k.a calib- ration or war-driving), received signal strengths (RSS) from multiple WiFi access points (APs) are recorded at known locations to construct a fingerprint database. To locate a user, localization algorithms match his RSS measurements against the pre-labeled records and estimate his location to be the one with the best-fitted fingerprint. There is generally a tradeoff between accuracy, ubi- quity, and cost in designing a pervasive indoor local- ization system. Accuracy has long been the primary C. Wu, Z. Yang and Y. Liu are with the School of Software and TNLIST, Tsinghua University. Email: {woo, yangzheng}@tsinghua.edu.cn, [email protected]. Zimu Zhou is with the the Computer Engineering and Networks Laboratory (TIK), ETH Zurich. E-mail: [email protected]. Mingyan Liu is with the Department of EECS, University of Michigan. E-mail: [email protected] challenge especially in mobile environments. Even schemes that have been reported to have very high accuracy in some instances, e.g., [2], [13], [15], can experience rapid performance degradation in realistic environments, with median error consistently above 5 meters [16]. In addition, there are always unaccept- ably large tail errors, e.g., 1020m or larger. Recent works [12], [16] found that large errors of prior works could range from 12 to around 40 meters. Mobility further deteriorates the performance especially for smartphone based methods. Efforts to gain high ac- curacy include to leverage physical layer informa- tion [17] and incorporate acoustic ranging [6], [12], among others. Despite of the notable improvements, these methods typically either rely on information unavailable on commodity smartphones, or resort to unrealistic cooperation among a dense crowd of peers, and WiFi fingerprinting is usually employed as a fundamental module [18]. Hence any improvement on WiFi fingerprinting itself is of great significance and is usually not conflict but complementary to enhancements by additional information [19]–[21]. In this paper, we investigate to mitigate large errors for WiFi fingerprinting and achieve accurate and robust localization, especially for mobile phones, without degrading the ubiquity or increasing the costs. To investigate the root cause of limited localization accuracy, we conduct extensive experiments and un- cover or revisit the following characteristics of WiFi fingerprint-based localization: 1) APs have different discriminatory capabilities to fingerprint a specific loc- ation since RSS changes are inversely proportional to

Upload: others

Post on 02-Jan-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016 1

Mitigating Large Errors in WiFi-based IndoorLocalization for Smartphones

Chenshu Wu, Member, IEEE, Zheng Yang, Member, IEEE, Zimu Zhou, Member, IEEE,Yunhao Liu, Fellow, IEEE, and Mingyan Liu, Fellow, IEEE

Abstract—Although WiFi fingerprint-based indoor localization is attractive, its accuracy remains a primary challenge especiallyin mobile environments. Existing approaches either appeal to physical layer information or rely on extra wireless signals for highaccuracy. In this paper, we revisit the RSS fingerprint-based localization scheme and reveal crucial observations that act as theroot causes of localization errors, yet are surprisingly overlooked or not adequately addressed in previous works. Specifically,we recognize APs’ diverse discrimination for fingerprinting a specific location, observe the RSS inconsistency caused by signalfluctuations and human body blockages, and uncover the transitional fingerprint problem on commodity smartphones. Inspiredby these insights, we devise a discrimination factor to quantify different APs’ discrimination, incorporate robust regression totolerate outlier measurements, and reassemble different normal fingerprints to cope with transitional fingerprints. Integratingthese techniques in a unified system, we propose DorFin, a novel scheme of fingerprint generation, representation, and matching,which yields remarkable accuracy without incurring extra cost. Extensive experiments in three campus buildings demonstrate thatDorFin achieves a mean error of 2.5 meters and more importantly, decreases the 95th percentile error under 6.2 meters, bothsignificantly outperforming existing approaches.

Index Terms—WiFi, fingerprints, smartphones, indoor localization

F

1 INTRODUCTION

THE proliferation of mobile computing has spurredextensive interests in location-based services,

leading to an urgent need for fine-grained location.The past decade has witnessed the conceptualizationand development of various wireless indoor localiz-ation techniques, including WiFi [1], [2], RFID [3],[4], acoustic signals [5], [6], ultrasound [7], [8], etc.Due to the wide deployment and availability of WiFiinfrastructure, WiFi fingerprint-based indoor localiza-tion has become one of the most attractive localizationtechniques [9]–[14]. Roughly speaking, a fingerprint-based scheme consists of two stages: site survey andfingerprint matching. During site survey (a.k.a calib-ration or war-driving), received signal strengths (RSS)from multiple WiFi access points (APs) are recordedat known locations to construct a fingerprint database.To locate a user, localization algorithms match his RSSmeasurements against the pre-labeled records andestimate his location to be the one with the best-fittedfingerprint.

There is generally a tradeoff between accuracy, ubi-quity, and cost in designing a pervasive indoor local-ization system. Accuracy has long been the primary

• C. Wu, Z. Yang and Y. Liu are with the School of Software andTNLIST, Tsinghua University.Email: {woo, yangzheng}@tsinghua.edu.cn, [email protected].

• Zimu Zhou is with the the Computer Engineering and NetworksLaboratory (TIK), ETH Zurich.E-mail: [email protected].

• Mingyan Liu is with the Department of EECS, University of Michigan.E-mail: [email protected]

challenge especially in mobile environments. Evenschemes that have been reported to have very highaccuracy in some instances, e.g., [2], [13], [15], canexperience rapid performance degradation in realisticenvironments, with median error consistently above5 meters [16]. In addition, there are always unaccept-ably large tail errors, e.g., 10∼20m or larger. Recentworks [12], [16] found that large errors of prior workscould range from 12 to around 40 meters. Mobilityfurther deteriorates the performance especially forsmartphone based methods. Efforts to gain high ac-curacy include to leverage physical layer informa-tion [17] and incorporate acoustic ranging [6], [12],among others. Despite of the notable improvements,these methods typically either rely on informationunavailable on commodity smartphones, or resort tounrealistic cooperation among a dense crowd of peers,and WiFi fingerprinting is usually employed as afundamental module [18]. Hence any improvementon WiFi fingerprinting itself is of great significanceand is usually not conflict but complementary toenhancements by additional information [19]–[21]. Inthis paper, we investigate to mitigate large errors forWiFi fingerprinting and achieve accurate and robustlocalization, especially for mobile phones, withoutdegrading the ubiquity or increasing the costs.

To investigate the root cause of limited localizationaccuracy, we conduct extensive experiments and un-cover or revisit the following characteristics of WiFifingerprint-based localization: 1) APs have differentdiscriminatory capabilities to fingerprint a specific loc-ation since RSS changes are inversely proportional to

Page 2: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

2 TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016

-70-70

ΔRSS=20dB

ΔdΔd

Figure 1: DiscriminationDiversity

AP

RS

S

Non block Left RightLeft RightAP

RS

D

Time

RS

S 8dB

4dB

Body Blocking Effect

RSS Difference Inconsistency

Figure 2: Fingerprint Inconsist-ency

ID RecordTime (ms) TSF Time (us) MAC RSS SSID

18 1369749309456 264186215408 #3 -91 TNS18 1369749309456 264182232101 #2 -74 CSI232-118 1369749309456 264184882125 #1 -79 NETGEAR

19 1369749310797 264187543624 #2 -74 CSI232-119 1369749310797 264184882125 #1 -79 NETGEAR19 1369749310797 264187543746 #3 -90 TNS

=1.33s

=1.34s

17 1369749308120 264184882125 #1 -79 NETGEAR17 1369749308120 264182232101 #2 -74 CSI232-117 1369749308120 264184882156 #3 -90 TNS

#1 00:0f:b5:dd:21:d6 #3 44:e4:d9:aa:b7:4f#2 c8:d3:a3:a3:02:f5

Figure 3: Transitional finger-prints

the physical distance, subject to radio signal propaga-tion laws. Intuitively, faraway APs may lead to largelocation estimation errors while close ones can helpmitigate the location uncertainty. 2) Biased RSS meas-urements caused by signal fluctuation and humanbody blockage may present themselves as outliersin fingerprint matching. Human body blockage tosmartphones can remove line-of-sight and weaken thereceived signal by up to 10dB, thus greatly exagger-ating the discrepancies of fingerprints measured fromthe same location. 3) The real-time measured RSSvalues may be in fact outdated due to incomplete scanby hardware and software limitations of commoditywireless devices. In other words, latest reported RSSvalues could be cached duplicates of previous scansperformed several seconds ago, as we call outdatedRSS. Considering user mobility, the outdated RSSscould actually be measurements done at a previouslocation, which result in transitional fingerprint, i.e.,a patchwork of the up-to-date RSSs of the currentlocation and the cached RSSs of the past locations. Inoverlooking such farraginous information, previousworks directly compare the transitional fingerprintswith those collected at a single location, incurringfrequent fingerprint mismatches. The above are keyreasons behind location errors of fingerprint-basedschemes, especially in mobile environments; yet sur-prisingly they have not been adequately addressedin existing works (in spite that some of them, e.g.,AP quality and body blockage, have been noticedpreviously [22], [23]).

With these observations in mind, we designDorFin (named after Discrimination diversity,Outdated RSS, and Fingerprint inconsistency), anaccurate and robust fingerprint-based scheme thatunleashes the true potential of WiFi-based localizationfor smartphone applications. DorFin includesthree main components. First, we quantitativelydifferentiate distinct AP’s discriminatory abilityw.r.t. a specific location. APs with stronger abilityare emphasized with more weights in fingerprintmatching, while others are de-emphasized. Second,noting fingerprint inconsistency, we apply a robustregression technique in fingerprint matching identifyand mitigate those outlying RSS values, in the hope

of ensuring accuracy under noisy measurements.Finally, we propose phantom fingerprints thatincorporate multiple normal fingerprints in thefingerprint database to deal with the transitionalfingerprints. Phantom fingerprints are assembledaccording to the specific geometrical constraints ofoutdated RSSs, which are derived by monitoringuser mobility using smartphones’ built-in inertialsensors. Integrating these components, we design auniform fingerprint similarity metric which furthertakes account of common AP ratio as a factor tomitigate erroneous matches of distant fingerprints.

To validate our design, we implement DorFin oncommodity devices and conduct extensive experi-ments in three campus buildings. We also employ twoclassical [1], [2] and two latest [24], [25] approaches forcomparison. Experimental results demonstrate com-petitive performance of DorFin even to solutionsbased on additional ranging techniques. In additionto the average accuracy of 2.5m, DorFin signific-antly reduces large location errors by limiting the95 percentile errors in 6.2m, both outperforming allcomparison approaches by at least 34% and 21%,respectively. Using only the most essential RSS, theproposed approach requires no extra hardware andis amendable to general fingerprint-based frameworkas well as mutually beneficial and complementary toexisting or upcoming augmentations based on inertialsensors, acoustics, images or others [11], [12], [18]. Weenvision our approach as an important step towardsaccurate location estimation on smartphones with theprevalent WiFi infrastructure.

Our contributions are summarized as follows:

• We identify and mitigate several crucial problemsthat explain the root cause of location errors buthave not been adequately studied.

• We are the first to tackle the transitional finger-print problem caused by incomplete scan andhuman mobility. In addition, to the best of ourknowledge, DorFin is the first systematic attemptto integrate a suit of novel techniques in a unifiedsolution. The proposed scheme achieves accurateand robust localization with only the prevalentRSS, requiring no additional hardware.

• We implement a prototype system and conduct

Page 3: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

WU et al.: MITIGATING LARGE ERRORS IN WIFI-BASED INDOOR LOCALIZATION FOR SMARTPHONES 3

real world experiments in multiple buildings us-ing commodity devices. In addition to the re-markable performance, our method can be con-veniently integrated in existing WiFi fingerprint-based localization systems.

The rest of the paper is organized as follows. Section2 presents our preliminary measurements and basicobservations. The method design is detailed in Section3, followed with the experiments and performanceevaluation in Section 4. We discuss the state-of-the-art of indoor localization in Section 5 and concludethe paper in Section 6.

2 PRELIMINARY AND MEASUREMENTS

In this section, we review the classical RSS finger-printing problem and investigate fundamental char-acteristics of radio fingerprints through real meas-urements. Our preliminary results show some crucialfeatures, which, having been largely overlooked in thepast, shed light on how to achieve high accuracy offingerprint-based localization.

2.1 Problem Statement

The working process of a typical fingerprint-basedlocalization scheme consists of two stages: site surveyand fingerprint matching. During site survey, wirelessfingerprints (i.e., the set of RSS values from multipleAPs) are measured and recorded at every locationof interests. A fingerprint database (a.k.a radio map)is accordingly constructed, in which the fingerprint-location relationships are stored. To locate a userwho sends a location query with his current RSSfingerprint, localization algorithms retrieve the finger-print database and return the location of the matchedfingerprint as the user’s location estimation.

Denote a fingerprint as f = [fi, i = 1, · · · , n], wherefi is the RSS value of the AP Ai ∈ A, the set of ndetectable APs appearing in f . For two fingerprintsf and f ′, denote the RSS difference (RSD) vector asδ = [δi, i = 1, · · · , p] where δi = |fi − f ′i | indicatesthe RSD of AP Ai ∈ A ∪ A′ in the two fingerprintsand p = |A ∪ A′|. Since the sample fingerprint f andquery one f ′ do not necessarily contain identical setsof APs, we set fi (f ′i ) to -100, the default minimum RSSvalue, if Ai /∈ A (A′). By doing this, we can alwaysobtain an extended version of fingerprint that containsp effective APs for a couple of any sample and queryfingerprint. Let φ be the dissimilarity between f andf ′, which, if measured by Euclidean distance, can becalculated as φ(f ,f ′) = ‖δ‖ =

√∑pi=1 δ

2i . For all

fingerprints stored in the fingerprint database F , thegoal of fingerprint matching is to find the fingerprintf∗ that achieves the highest similarity with respect tothe query fingerprint f . Formally,

f∗ = arg minf i∈F

φ(f ,f i). (1)

Then the user’s location is estimated as the corres-ponding location L(f∗) of f∗. Assumimg the truelocation of f is L(f), the location estimation error isgiven by ε = ‖L(f)− L(f∗)‖.

2.2 ObservationsObservation 1 (Discrimination Diversity) APs havediverse discrimination capability to fingerprint a specificlocation, subject to inherent constraints of radio signalpropagation.

We term Discrimination capability as the ability of oneAP to distinguish a specific location when includ-ing its RSS observations in the location’s fingerprint.Ideally, subject to the propagation law of wirelesssignals, RSS decays logarithmically with propagationdistance d. More formally, RSS ∝ − log(d), indicatingthat ∆RSS

∆d ∝ − 1d , where ∆RSS denotes the RSS

change and ∆d is the corresponding distance change.In other words, an identical ∆RSS can imply a smal-ler distance change ∆d at closer locations, or a larger∆d at faraway positions. Figure 1 depicts the illustrat-ive RSS spatial distribution of two APs. As seen, anRSS variance of 1dB in value corresponds to vastlydifferent changes in physical distance, depending onthe specific d. Specifically, faraway APs may containlarger uncertainties in location determination thancloser APs. In a nutshell, distance changes indicatedby RSS variances depend on the transmitter-receiverdistance, leading to diverse discrimination capabilityacross different locations.

Several previous works perform AP selection todeal with AP diversity. A subset of APs are chosen forlocation estimation using either specifically definedcomplex metrics or just RSS cutoff [23], [26]. As ispointed out in [27], AP selection is data dependentand thus the selected APs may not always be the mostdiscriminative ones due to significant RSS fluctuationsindoors. Hence selecting only a portion of APs anddiscarding the others may not be the best way to dealwith the discrimination diversity, leaving a room forfurther improvements.

Observation 2 (Fingerprint Inconsistency) The ma-jority of APs hold similar RSSs for fingerprints from thesame/close locations while a small fraction may exhibit largedifferences due to environmental dynamics and humanbody blockages.

Location errors originate from unmatched fingerprintsmeasured from the same/close locations. Our invest-igation on these fingerprints indicate that a major-ity of APs exhibit relatively stable RSSs even whenthese fingerprints are not matched. That is to say, thefingerprint dissimilarity (under certain metric suchas Euclidean distance) is primarily produced by thedrastically fluctuating RSSs of a small portion of APs,which is, however, obviously not caused by location

Page 4: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

4 TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016

0 5 10 15 200

10

20

30

Out

date

d R

atio

(%

)

Outdated Ratio

0 5 10 15 20−100

−80

−60

−40

Mea

n R

SS

(dB

m)

AP Index

Mean RSS

(a) Outdated rates over different APs

1 1.5 2 2.5 3 3.5 4 4.50

0.2

0.4

0.6

0.8

1

CD

F

Time Delay (s)

OutdatedDelay

1st missing

3rd missing

2nd missing

(b) Time delays of outdated RSS

0 1 2 3 4 50

0.2

0.4

0.6

0.8

1

CD

F

Time Delay (s)

Max Outdated DelayMin Outdated Delay

(c) Time delays in transitional fingerprints

Figure 4: Outdated RSS phenomenon: The real-time reported RSSs in one fingerprint might be outdated dueto incomplete scan.

changes, but probably stems from ambient dynamicsand human body blockages [28], [29] especially inmobile environments.

As shown in Figure 2, signal strengths perceived bysmartphones decrease significantly when the humanbody blocks the direct path of signal propagation,compared to when the user is facing the AP. Theseweakened RSS observations of blocked APs tend todeviate from the normal profiles, resulting in abnor-mal RSSs when compared with fingerprints measuredduring the training phase. Taking Figure 2 as an ex-ample, the normal RSS profile absent of body blockageis measured to be f = [−40,−65,−50]. When a useris present and faces left, the right AP is blocked, res-ulting in a biased fingerprint f left = [−40,−65,−65](for simplicity, we assume RSSs of unblocked APsremain unchanged). When facing right, the line ofsight of the left AP is blocked and its RSS is corres-pondingly weakened, creating a fingerprint f right =[−52,−65,−50]. Then comparing f left and f right withthe normal f produces inconsistent RSD distributionsδleft = [0, 0, 15] and δright = [12, 0, 0], both generatingabnormally larger fingerprint dissimilarity and ulti-mately leading to greater location uncertainty.

To tackle with such inconsistency, previous workstypically collect orientation-dependent fingerprintsfor multiple directions [30], [31], but this requires highlabour efforts while offers limited gains. Furthermore,the one-time constructed fingerprints are vulnerableto environmental changes, leading to degraded per-formance over time. Some recent solutions incorpor-ate direction information in the fingerprint database[32] or resort to peer-assisted acoustic ranging amongmultiple phones [12], [33]. The former increases thecosts of fingerprint constructions, while the latter re-lies on cooperation among multiple users, rendering itimpractical. Probabilistic schemes [34], [35] have beendesigned to tolerate RSS variations by modeling theRSS distributions from sufficient number of samples.In contrast, we aim to identify and mitigate them.

Observation 3 (Transitional Fingerprint) The real-time reported RSS values might be outdated due toincomplete scanning results, caused by software and

hardware restrictions.

The incomplete scan phenomenon is a widely ex-isted yet surprisingly overlooked problem on off-the-shelf devices due to the wireless protocol andhardware capability limitations. Figure 3 illustratesa glance of scanning results from mobile devicesrunning the Android OS. Commodity smartphonesacquire WLAN information in a passive scanning modeby listening to periodic beacons from surroundingAPs on all working channels. In this mode, the timea client stays on a channel is 100ms by default,which is specified by the 802.11 standard [36] andis equal to the default beacon interval. Consequently,the latency incurred in capturing the AP informationfor 2.4GHz WiFi is about 1,100ms since there are 11available channels. In practice, it takes about 1∼1.5seconds for mainstream Android OS to complete ascan with commodity smartphones. Due to beaconconflicts and channel collisions, the beacon intervalof 100ms cannot be always guaranteed, potentiallyresulting in some missed APs during a scan. However,to maintain quality of service, these missed APs canstill appear in the scanning results by duplicatinginformation from last several scans a few seconds ago.As shown in Figure 4, a significant portion of APsexperience high outdated rates, ranging from 2% to25%. In particular, about 60% of the outdated RSSsbear an outdated delay of 1.4s, while around 20%and 15% has a delay of 2.7s and 4s, respectively.Translated into fingerprints, Figure 4c indicates thatover 80% of fingerprints contain outdated RSSs, andfor about 20% the maximum delay time exceeds 4s.Similar phenomenon is observed on various commod-ity devices including smartphones and pads such asGoogle Nexus S and Nexus 4, LG D820, and SamsungT210, and laptops such as Lenovo T430s and X1Carbon.

If a user is stationary, such outdated RSS prob-lem has little impact on location fingerprinting sincethe cached RSSs are also measured from the sameposition within a short period of time (less thana few seconds). In mobile environments, however,users may have moved several meters away between

Page 5: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

WU et al.: MITIGATING LARGE ERRORS IN WIFI-BASED INDOOR LOCALIZATION FOR SMARTPHONES 5

Phantom Fingerprint

Discrimination Factor

Robust Regression

Location Estimation

Fingerprint Database

l, θ F

ρ

δ Query Fingerprints

Sensor Hints

Locations

Figure 5: System architecture

Fingerprint Database

Location Space

A

B

C

Phantom Fingerprint

Figure 6: Phantom fingerprint

l-Δll

l+Δl

North

Figure 7: Bequeathal locations

consecutive scans, resulting in a fingerprint comprisedby RSS values that are actually observed at mul-tiple locations, which we call transitional fingerprint.A transitional fingerprint is a patchwork of the up-to-date RSSs of the current location and the cached RSSsof past locations. Previous works treat these spatiallymixed fingerprints as normal ones and directly com-pare them to those stored in the fingerprint database,which are all collected at single locations. Obviously,matching fingerprints mixed from multiple locationsto those from single positions may result in frequentfingerprint mismatches or even localization failures.

Note that the transitional fingerprint problem isvery different from the conventionally denoted out-of-date fingerprints, which refer to fingerprints thatwere collected a considerably long period of timeago [37]. The out-of-date fingerprints are typically theresults of RSS variants due to environment dynamicsand usually will be deprecated or adapted to date forlocalization [37]. The transitional fingerprint problem,however, is an intrinsic, environment-irrelevant issuesubjected to prevalent WiFi and commercial hardwarespecifications in mobile contexts, which usually oc-curs within a short time period of a few seconds.To the best of our knowledge, this problem has notbeen noticed before. Previous solutions [38], [39] formobile users utilize information from the past to comeup with better disambiguation of candidate user loca-tions, which potentially leverage physical constraintsimposed by user movements and thus are completelydifferent from and orthogonal to our consideration.More importantly, previous works treat the entirefingerprint as a basic unit and consider only the timedomain. In contrast, we focus on each RSS componentthat composes the fingerprint, and investigate the spa-tial relationships between them caused by incompletescan and user mobility.

Either having been noticed or not in the literat-ure, the above-mentioned problems have not beenadequately resolved. In this study, we reconsider theRSS fingerprinting scheme based on these signific-ant observations to mitigate large location errors forsmartphone localization.

3 DESIGN METHODOLOGY

By designing DorFin, we do not target at providingthe most accurate solution for indoor localizationamong all existing techniques such as those based onRFID [4], PHY layer information [14], acoustic ranging[6], etc, but attempt to explore the true potentialof pure WiFi fingerprint based localization scheme.DorFin is designed as an amendable technique thatcan be widely incorporated in various existing orupcoming WiFi-based solutions. Pursuing this goal,we do not resort to any extra information exceptfor involving inertial sensing for mobility monitoringin DorFin. As illustrated in Figure 5, the proposedsolution includes a phantom fingerprint assemblingmodule, a robust regression procedure and a dis-criminatory policy, unified in a normal fingerprintmatching scheme.

3.1 Phantom FingerprintsAn intuitive way to overcome transitional fingerprintproblem is to recognize and discard the outdatedentities before fingerprint matching. However, as in-dicated in Figure 4, a significant portion of APs bearoutdated RSSs. Discarding all of them may degradethe performance of localization since larger number ofAPs can typically result in better accuracy [27], [32].

In contrast, one query fingerprint consisting of RSSsobserved at multiple locations should be matchedwith fingerprints recombined by measurements fromthose locations, which, however, are not directly avail-able in the fingerprint database. In this sense, oneneeds to assemble special fingerprints, i.e., combin-ations of fingerprints from multiple locations, formatching, as shown in Figure 6. These newly con-structed fingerprints do not yet exist in the fingerprintdatabase, and are referred to as phantom fingerprints.

For a fingerprint f = [fi, i = 1, · · · , n], denote theencountered timestamp of each AP Ai in f by ti.Recall Figure 3, the scanning delay is typically longerthan 1 second by our measurements while the differ-ences of all APs’ detected time in one fingerprint areusually small (indicated by the Time SynchronizationFunction timestamp provided by the Android OS).Hence, if the time difference between two APs in onefingerprint exceeds a certain value, e.g., 0.5s, then the

Page 6: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

6 TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016

earlier one is definitely outdated. In particular, forAP Ak, the outdated duration ∆tk is computed by∆tk = max

i=1,··· ,nti − tk.

As illustrated in Figure 6, assume that fk is actuallythe measurement of Ak at a previous location, calledbequeathal location (BL), where a user was present ∆tkseconds ago. Further assume that the distance anddirection from the BL to the user’s current locationis `k and θk, respectively (we will describe how tocompute `k and θk shortly). Then when comparing fwith a candidate location, say, Lz , instead of directlycomputing the dissimilarity between f and fz , asample fingerprint of Lz , we match it against thephantom fingerprints ffz assembled from fz andfBL(z), fingerprint from the BL. Concretely, the RSSvalue fk in f is replaced by that of the same AP infBL(z). Considering there would generally be tem-poral RSS samples from AP Ak, we replace all its rawRSS observations with those from the BL and thenaccordingly regenerate a new version of fingerprint.In case of outdated RSSs from multiple APs, all ofthem are replaced according to their individual BLs,finally resulting in a precise phantom fingerprint ffz .

The distance offset ` and direction θ can be estim-ated by dead reckoning method using smartphonebuilt-in inertial sensors like accelerometer, gyroscope,and compass [10], [11], [13]. Specifically, we adoptthe method proposed in [40], which counts steps asaccurately as up to 98%, regardless of the phoneattitudes. The footsteps could then be converted tophysical displacement by multiplying with the user’sstep length, which can be automatically tracked [13].The direction, on the other hand, is estimated usinggyroscope and compass as [13]. In the following, wedemonstrate that although dead-reckoning may notalways be adequate for localization, it is sufficientfor our purpose of estimating ` and θ. Note thatwe merely involve inertial sensing to monitor shortdistance movements, but do not resort to extra in-formation such as a detailed digital floor plan that isrequired by previous works like Zee [13].

Due to noisy sensors and arbitrary human behavior,` and θ cannot be 100% accurately computed. Tocope with the erroneous estimations, we introducean error range for each of them, denoted as ∆` and∆θ, respectively, and demonstrate that the procedureof choosing BLs can tolerate these errors gracefully.Mathematically, as shown in Figure 7, potential BLsneed to satisfy the condition that their distances anddirections to the candidate location are bounded in[` − ∆`, ` + ∆`] and [θ − ∆θ, θ + ∆θ], respectively.The size of the shaded area is S = ∆θ

((` + ∆`)2 −

(` − ∆`)2))

= 4∆θ`∆`. Assuming a location sampledensity of 2m×2m and ∆` ≤ 2 meters, the minimalsize S0 to cover two sample locations should be atleast 4∆` m2. Thus, if ∆` ≤ 2 meters and ∆θ < 1/`,we have S < S0, which means the shaded area covers

at most one sample location, i.e., there is only onecandidate BL. In practice, the maximal value of themissing delay ∆t is less than 5s (APs not seen formore than 5s would no longer be reported until beingdetected again next time). Thus, assuming a normalwalking speed of 1.2 m/s, the distance offset can be atmost 6 meters, resulting in a minimum value of 1/`of 1

6 . In other words, even though the distance anddirection estimations are erroneous, we could identifya suspicious area and, with high probability, there isonly one possible BL in the area, as long as the errorsare in certain ranges (∆` ≤ 2 meters and ∆θ < 1

6 ).In case of multiple BLs (which is rare based on ourmeasurements), the one closest to the center of thesuspicious area (the shaded area shown in Figure 7)is selected. Phantom fingerprints are then constructedby combining fingerprints from the candidate locationand those from the BLs, i.e., substituting the tuplescorresponding to the outdated RSSs, as shown inFigure 6.

According to specific location sampling density, notall outdated RSSs need to be replaced. Only RSSs withdistance offsets ` exceeding half of the unit length ofsampling grids should be replaced. If ` is less thanhalf of the sampling distance (including being equalto 0 which means static user), fingerprints are merelytreated in the traditional way. Different from exist-ing mobility-assisted approaches [11], [13], [39] thatexplore spatial mobility constraints, we solely util-ize essential mobility hints to amend the transitionalfingerprints. Hence DorFin can be further integratedwith previous mobility-assisted techniques to achievebetter performance.

3.2 Robust Fingerprinting

As we have observed, RSSs of one pair of fingerprintsmay contain outliers because of impaired measure-ments due to human body blockage. Since this is aprimary cause of biased RSSs in mobile environments,only RSSs over a small portion of APs (that areblocked) may present outliers while most APs wouldremain consistent. Thus in this section, we propose toapply robust regression method on the inconsistentfingerprints, in the hope of bounding the influence ofoutlying measurements.

There are a large body of robust regression tech-niques, such as M -estimator, S-estimator, L-estimator,etc [41]. Among them, we choose the most widelyadopted Least Median of Squares (LMS) [42] estimatordue to its simplicity, effectiveness, and high break-down point (0.5), which is demonstrated to yield suf-ficient results with efficient computation (as indicatedin Section 4).

Given a query fingerprint fs = [fs,i, 1 ≤ i ≤ p] anda sample fingerprint f t = [ft,i, 1 ≤ i ≤ p] (supposethat both of them have been adjusted to be of p RSSvalues as introduced in Section 2.1), we adopt a simple

Page 7: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

WU et al.: MITIGATING LARGE ERRORS IN WIFI-BASED INDOOR LOCALIZATION FOR SMARTPHONES 7

linear regression model as follows:

yi = α1xi + α2 + ei, i = 1, · · · , p, (2)

where the response variables y are given by fs, whileexplanatory variables x = f t. e = [e1, · · · , ep] indic-ates the error term which is assumed to be normallydistributed with zero mean and an unknown standarddeviation σ.

As the AP number p is usually small, applyingrobust regression on insufficient observations doesnot always produce convincing statistical results. Toobtain sufficient data for regression, we propose tocompare the query fingerprint against all samplefingerprints corresponding to a candidate location,instead of a single averaged fingerprint. Specifically,for the candidate location L = L(f t) with samplefingerprints FL = {fL

k , k = 1, · · · ,m}, we simultan-eously match fs to all records in FL. In doing so,we acquire mp observations, which can achieve thescale of hundreds since there are generally at leastdozens of sample fingerprints for one location in thefingerprint database, and thus are sufficient for LMSregression. In this case, the explanatory variables xbecomes x = [fL

1 , · · · ,fLm]T1×mp and correspondingly

y is expanded as y = [fs, · · · ,fs]T1×mp. The regression

model is thus rewritten as

yk,i = α1xk,i + α2 + ek,i, (3)

where i = 1, · · · , p, k = 1, · · · ,m, and xk,i and yk,iindicate the value of fi in fL

k and fs, respectively.Applying LMS to the data [xy] yields α = [α1, α2]where the estimates αi denote the regression coef-ficients. Multiplying x with these αi, we obtain theestimated values of yi as

yk,i = α1xk,i + α2. (4)

The LMS estimator is given by minimizing the medianof squares of residuals as follows:

minα

medi,k

(yk,i − yk,i)2. (5)

To determine whether a value yk,i is an outlieramong all elements in y, we compare the residualrk,i = yk,i − yk,i to the scale estimate σ∗ defined by[41]. Then each yk,i is adjusted to yk,i as follows:

yk,i =

{yk,i if |rk,i/σ∗| ≤ 2.5yk,i otherwise (6)

where

σ∗ =

√∑mk=1

∑pi=1 wk,ir2

k,i∑mk=1

∑pi=1 wk,i

,

wk,i =

{1 if |rk,i/s0| ≤ 2.50 otherwise ,

s0 = 1.4826 ∗ (1 +5

n)√

medk,ir2k,i.

The involved constant values are widely recognizedfactors that has been suggested by preliminary exper-ience in the literature [41] and can generalize to dif-ference scenarios. Accordingly, the RSS values of thequery fingerprint fs are regulated as fs,i = 1

m

∑k yk,i

and the RSDs δst between fs and f t are thus tunedas δst,i = |fs,i − ft,i|.

3.3 Discriminatory Policy

Given that APs have diverse discrimination capabilityto fingerprint a specific location, it is inappropriate,and also unnecessary, to match two fingerprints withall APs equally involved. More accurate location es-timations can be achieved by relying more on thediscriminative APs, and limiting the influence of thosefluctuating and distant ones. Different from previousworks that select a subset of APs for localization [23],[26], we attempt to appropriately assess and leverageeach AP by seeking a discrimination metric that com-plies with physical constraints of signal propagationand simultaneously stays robust to RSS fluctuations.

To quantitatively differentiate each AP for a specificlocation, we define a discrimination factor accordingto the physical distance estimation between the APand the mobile client using the widely adopted Log-Distance Path Loss (LDPL) model [43]:

Pd = Pd0 − 10γ lg(d

d0), (7)

where Pd0 denotes the received power at a distanced0 (which usually takes the value of 1 meter), γ is thepath loss exponent, and Pd is the RSS in decibel meas-ured at a distance of d (in meters). Generally, Pd0

isa constant empirical value given the AP transmittingpower. Although γ can change between each pair ofAP-client, there are a lot of works targeting at adapt-ively estimating its value [14], which is not withinthe scope of this paper. In the prototyped DorFin,we determine Pd0

and γ by empirical values andexperimental measurements, as detailed in Section 4.

Deriving the distance to AP Ai from the LDPLmodel, we calculate its discrimination factor in fin-gerprint fu to location Lu as follows:

ρui =1

dui= 10

fu,i−Pd010γ , (8)

where dui is the estimated distance between Ai andLu. The rationale of using the reciprocal of physicaldistance lies in that it is consistent with the derivativeof the LDPL equation, which indicates the RSS change∆RSS ∝ − 1

d . More generally speaking, the basic ruleis to emphasize closer APs with stronger RSSs.

While the exponential ρui effectively discriminatesdifferent APs, it may also induce unnecessary match-ing errors in case of fluctuating RSSs, which maylead to significant distance estimation errors. Considerone of the APs in a fingerprint that fluctuates to

Page 8: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

8 TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016

a very large value (e.g., -45dBm). In this case, theeffects of other representative APs, which could holdconsiderable RSSs (e.g., up to -60dBm) and are thusdiscriminative, may become negligible since they canonly get inappreciable factors three or four timessmaller than the fluctuating AP. Hence to cope withnoisy RSSs, we additionally incorporate a sigmoidfunction to retain the effects of most discriminativeAPs. Mathematically, ρui is adjusted as follows:

ρui =

10

fu,i−Pd010γ if fu,i ≤ f0

1a

(1 + e

−2(fu,i+100

10 −c))−1

otherwise

(9)where the watershed RSS value f0 can be a flex-ible empirical value, e.g., -55dBm. Then the constantparameters a and c need to be determined based onthe specific value of γ such that ρui is continuous atf0. When applying to different location systems, γcan be derived by empirical values and experimentalmeasurements [14]. As shown in Section 4, empiricalvalues based on real measurements yield grateful per-formance in practice, better than previous AP sectionpolicy [26]. ρui then serves as a differential weightwhich will be attached to the regressed RSD of Ai

between fu and another fingerprint when computingtheir dissimilarity, as detailed in Section 3.4. Notethat the discrimination factor will be normalized as∑n

k=1 ρuk = 1 to keep the total power of a weighted

fingerprint unchanged.

3.4 LocalizationIntegrating all of the above components in a unifiedsolution, we define a new metric as follows for uni-form fingerprint dissimilarity judgment.

h(fs,f t) =( pst∑

i=1

(ρsti · δst,i)2) 1

2

, (10)

where pst = |As∪At| is the total number of distinctiveAPs in fs and f t, and ρsti = max{ρsi , ρti} denotes thediscrimination capability of AP Ai for matching fs

and f t, which is calculated based on the regressedfingerprints. Note that the RSD δst,i could also begiven by other suitable metrics such as a probabil-ity estimation. Realizing that fingerprints from closerlocations share more common APs (or equivalently,fingerprints with very few common APs is unlikelyto be from adjacent or same locations), the ultimateform of dissimilarity between two fingerprints fs andf t is expanded as follows:

φ(fs,f t) = h(fs,f t) ·pstqst

, (11)

where qst = |As ∩ At| denotes the number of com-mon APs in fs and f t. With the above metric, thedissimilarity of two fingerprints with fewer commondiscriminative APs will be amplified. In case of no

common APs (qst = 0), the dissimilarity will goto infinity, which eradicates the mismatch of twocompletely irrelevant fingerprints.

4 EXPERIMENTS AND EVALUATION

4.1 Experimental Methodology

Data Collection. We develop an application for sitesurvey and implement DorFin on Google Nexus S andNexus 4 phones, which both run the mainstream An-droid OSs (with Android API level 16 and 17, respect-ively). The two models are equipped with differentWiFi chipsets, the former with Broadcom BCM4329and the later with Qualcomm Atheros WCN3660. Wetreat the two models equally and interchangeably fortraining and testing during evaluation and examinethe integrated performance.

We conduct experiments in three campus buildings(denoted as Area #1, Area #2 and Area #3), as shownin Figure 8a, Figure 8b, and Figure 8c, respectively.We manually sample areas of interests in all build-ings (corridors in Area #1 and #2, while corridorsand rooms in Area #3). To construct the fingerprintdatabase, we collect around 30 to 60 sample records ateach location (which typically takes about 1 minute).In Area #1 and #2, we collect RSS data by puttingthe phone on a portable desk. To get rid of humanbody effects, user does not present around the deskwhen the mobile phone is collecting data (but thereare passengers passing through the corridors). A highsampling density of 1m×1m is used for extensiveevaluation. Sparser data are then derived from thesedensely surveyed samples. In total, we obtain 90locations in Area #1 and 83 sample locations in Area#2. In Area #3 where we sample the whole floor,we hold the phone in hand for collection. We use asampling density of 2m×2m and survey 293 locationsin total, for each we gather at least 60 RSS samples.293 sample locations are gathered in Area #3. Notethat the training data collection can be also done viacrowdsourcing-based mechanisms [9], [11], [13]. How-ever, we currently still conduct in manual manner inpurpose of obtaining qualified and reliable groundtruths for evaluation. Our future work includes build-ing a real system that integrates automatic techniquesfor fingerprint collection and adaptation.

We consider both static and mobile cases for testing.For stationary cases, we collect query data by lettingusers record measurements at each location with theirsmartphones held in hand. For a mobile user, thesmartphone measures RSSs while the user is walkingat a constant speed along a designated path with pre-defined start and end points. Note that the individualwalking speed varies from user to user and fromtrace to trace. To obtain the ground truth locationsof fingerprint records along the moving trace, wecompute user’s walking speed by dividing the path

Page 9: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

WU et al.: MITIGATING LARGE ERRORS IN WIFI-BASED INDOOR LOCALIZATION FOR SMARTPHONES 9

50m

50m

(a) Area #1

80m

30m

(b) Area #2

UpDown

UpDown

UDown

UDown

(c) Area #3

Figure 8: Experiment areas with sizes of around (a) 1,000 m2, (b) 1,200 m2 and (c) 1,500 m2. APs deployed bythe university are marked with stars. Locations of most APs are unknown.

length to the total time, and accordingly interpol-ate between the start and end points to obtain thelocation corresponding to each measurement basedon their timestamps. The data are all collected atdifferent time (mostly from afternoon to the night)over two days. When collecting data, people are work-ing routinely in their offices or labs and some willoccasionally walking around and pass through thecorridors. Users hold their phones in hand naturallywith free styles during collection. In total, we collectstatic queries from around 200 locations in Area #1and #2 and all 293 locations in Area #3. We gather over20 mobile traces reported from different pathways.

Methods: We compare DorFin with two classicaland two state-of-the-art schemes for fingerprint-basedlocalization. Despite of numerous fingerprint-basedapproaches built upon RADAR and Horus, we still in-clude them in purpose of confirming the performanceimprovements of DorFin over pure RSS fingerprint-based schemes.

• Enhanced RADAR (RADAR) [1]: RADAR is oneof the most classical and widely adopted finger-printing scheme, upon which a large body ofalgorithms are built [44]. We enhance RADAR byintegrating the proposed common AP (CA) factorand using K-nearest neighbours for location es-timation.

• Enhanced Horus (Horus) [2]: A classical prob-abilistic algorithm that computes the probabilitydistribution of the RSS values at each location asthe fingerprint metric, and retrieves the targets ofthe maximum likelihood as estimated locations.Horus is also implemented with the CA factorand KNN scheme.

• Temporally weighted KNN (TW-KNN) [24]:TW-KNN forms fingerprints with temporallyweighted RSS by applying an iterative recursiveweighted average filter on training RSS samples.Only a set of “important” APs are selected ac-cording to RSS values.

• Kullback-Leibler Divergence (KLDiv) [25], [45]: Afingerprint-based localization scheme that utilizesthe Kullback-Leibler divergence distance betweentwo signal distributions as the similarity measure.

4.2 Performance Evaluation4.2.1 Overall performanceFigure 9a illustrates the localization error distributionsseen by DorFin in different areas. DorFin achievesmean accuracy of around 1.7m in both Area #1 and#2, while the mean error in the larger Area #3 appearsto be higher, achieving 3.8m. Besides the promisingaverage accuracy, DorFin significantly reduces largelocalization errors. In all experimental areas, DorFindecreases the 95th percentile errors to less than 7.0m.

To examine the performance in mobile scenarios,we test the proposed approach on the mobile tracesand report the integrated results. As illustrated inFigure 9b, despite slight drop in accuracy comparedto the static cases, DorFin maintains graceful per-formance in mobile cases, far superior to Horus andRADAR. Specifically, the average and 95th errors areabout 3.0 meters and 8.5 meters, respectively. In com-parison with RADAR and Horus, DorFin decreasesboth errors by nearly 50%. Even though the perform-ance in mobile cases is not as good as static cases,the achieved accuracy remains comparable and prom-ising. In addition, other complementary techniquessuch as path matching [9] can be integrated to furtherimprove the accuracy for continuous localization.

Performance comparison. Integrating all results inFigure 9c, DorFin consistently surpasses comparisonmethods. Specifically, DorFin achieves an average ac-curacy of 2.5m and and 95th percentile accuracy of6.2m. TW-KNN and Horus achieve the most com-parable performance, with mean and 95th percentileerrors of 3.8m, 3.9m and 8.0m and 7.8m, respectively.Among all approaches, RADAR yields the worst res-ults with a mean error of 4.5m. KLDiv suffers fromremarkable large errors, with 95th percentile error of19.8m, although it achieves a slightly better medianaccuracy of 2.6m than DorFin. In addition, DorFin sig-nificantly mitigates the large errors by around 40%,limiting the max location errors within 10m, whileall comparative methods produce max errors up toat least 16m.

Impact of training RSS samples. Due to theinstability of RSS measurements, we are interestedin whether and how the localization accuracy ofDorFin would be affected by different sizes of trainingRSS samples for each location. Hence we tried DorFin

Page 10: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

10 TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016

0 5 10 150

0.2

0.4

0.6

0.8

1

CD

F

Error (m)

Area #3Area #2Area #1

(a) Accuracy in different areas

0 5 10 15 20 250

0.2

0.4

0.6

0.8

1

Error (m)

CD

F

DorFinRADARHorus

(b) Accuracy in mobile cases

0 5 10 15 20 250

0.2

0.4

0.6

0.8

1

CD

F

Error (m)

DorFinRADARHorusTW−KNNKLDiv−KNN

(c) Accuracy comparison

0 2 4 6 8 10 120

0.2

0.4

0.6

0.8

1

CD

F

Error (m)

#Samples = 60#Samples = 40#Samples = 20

(d) Impact of sample number

Figure 9: Accuracy of DorFin

0 5 10 15 20 250

0.2

0.4

0.6

0.8

1

CD

F

Error (m)

DorFin 1×1DorFin 2×2DorFin 3×3Horus 1×1RADAR 1×1

Figure 10: Effect ofsample density

0 5 10 15 20 250

0.2

0.4

0.6

0.8

1C

DF

Error (m)

Basic+DFBasic+RRBasic+CABasic

Figure 11: Effects of indi-vidual modules

0 5 10 150

0.2

0.4

0.6

0.8

1

CD

F

Error (m)

DFNo selectionRSS Cut by −95dBRSS Cut by −90dB

Figure 12: Comparing DFwith RSS-cutting method

0 5 10 15 20 250

0.2

0.4

0.6

0.8

1

Error (m)

CD

F

Basic+PFBasic

Figure 13: Effect of PF(mobile scenarios)

with different number of training samples respect-ively and illustrate the results in Figure 9d. As seen, ityields only marginal differences in performance whenusing 20, 40, and 60 RSS samples for training. Whenshrinking the training sizes from 60 to 20 samples, themean and 95th percentile errors merely increase by2.4% and 6.4%. The results demonstrate the gracefulrobustness of DorFin to RSS variations.

Impact of sample density. As mentioned above, wesample the areas of interests with a density of 1m×1m,which is relatively high for practical operations. To ex-amine the performance with sparser sample locations,we perform DorFin with training data of differentsample densities (sample density is adjusted by siftingparts of the samples according to their locations).As shown in Figure 10, DorFin preserves excellentaccuracy even with sample densities of 2m×2m and3m×3m. Specifically, with density of 2m×2m, themean and 95th percentile errors are still limited at2.5m and 7.0m respectively, both better than those ofHorus and RADAR with density of 1m×1m.

In conclusion, DorFin achieves remarkable perform-ance in both stationary and mobile cases, with reas-onable sample densities. To understand how eachmodule of DorFin contributes to the integral accuracy,we next perform an analysis across different modules.

4.2.2 Effect of Individual ModulesWe separately employ each module of DorFin, i.e., thediscrimination factor (DF) module, the robust regres-sion (RR) module, the common AP constraints (CA)module, and the phantom fingerprint (PF) moduleon the most basic nearest neighbor method (denotedas Basic) described in Section 2.1 and evaluate theindividual performance.

Effect of DF. We evaluate the impact of DF byusing a set of empirical parameters to calculate thediscrimination factor. Specifically, the path loss expo-

nent is set to a typical value of 3 in indoor envir-onments while the referenced received power Pd0 isdetermined as -40dB by some on-site measurements.The sigmoid function parameters a and c accordinglyadopts the values of 4 and 4.3, respectively. As shownin Figure 11, DF limits the 95th percentile estimationerror by about 40%, while the average error is 1.5mlower than the Basic scheme, which has mean and95th percentile errors of 5m and 17.5m. By placingmore weight on more discriminatory APs and limitingthose of the others, DF achieves the improvementby ensuring the similarity between fingerprints ofclose locations. In addition, results from differentbuildings indicate that discrimination factors withuniform parameter settings can generate satisfactoryresults in different scenarios. Previous works employRSS cutting method based on a simple thresholdto select a subset of APs for localization. We alsoimplement this scheme in our settings and compareits performance with DF. As shown in Figure 12, byexploiting potentials of all valuable APs, the proposedDF achieves better performance than the simple RSScutting methods, no matter what thresholds are used.

Effect of RR. As shown in Figure 11, by employingRR over the Basic scheme, an average accuracy of2.2m is achieved, with the corresponding 95th per-centile accuracy of only 6m. Evidently, the advantagesof RR are the most significant among all modules byreducing the mean and 95th percentile errors by about56% and 65% compared with the Basic scheme, re-spectively. Such results on RR confirm our observationthat fingerprint inconsistency counts as a major causeof localization errors of fingerprint-based methodsespecially for smartphones.

Effect of CA. Figure 11 also demonstrates thatthe CA module is simple yet effective. Incorporat-ing the CA module with Basic scheme, the averageand 95th percentile localization errors are reduced by

Page 11: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

WU et al.: MITIGATING LARGE ERRORS IN WIFI-BASED INDOOR LOCALIZATION FOR SMARTPHONES 11

about 27% and 40%, turning into 3.6m and 11.2m, re-spectively. Dissimilarity of fingerprints from farawaylocations is largely enlarged by the common AP ra-tio, while that of fingerprints from close locationsis hardly affected (since close locations share morecommon APs).

Effect of PF. To examine the effectiveness ofphantom fingerprints in dealing with outdated RSSmeasurements, we compare the performance of theBasic method on mobile data with and without con-structing phantom fingerprints. As depicted in Fig-ure 13, the average and 95th percentile errors decreasefrom 3.9m and 10.4m to 2.4m and 6.9m respectivelywhen the sample fingerprints are appropriately re-placed with phantom fingerprints. With these res-ults, it is of interests to examine to what extent themeasured RSSs and further the entire fingerprints areoutdated. As we observed, over 11% of RSS measure-ments are outdated in our experiment data. Further-more, almost every fingerprint undergoes outdatedRSSs. In particular, there frequently exist large offsetdistances ranging from 2m to 6m in most fingerprints.The effectiveness of the PF convincingly validates ourobservation that the transitional fingerprint problemcan lead to location errors in mobile environments.

Building upon these components, DorFin producespromising accuracies that are competitive with thoseachieved by leveraging physical layer information[14], [17] or introducing extra ranging techniques [12],[33] (both with mean accuracy of about 1m∼3m).Without degrading the ubiquity nor increasing thecosts, we believe the performance achieved by DorFinoutperforms most of existing approaches and demon-strates promising potentials in serving as a practicalscheme for worldwide deployment.

Considering potential RSS variations over long-term running, the radio map of some locations willchange over time and lead to higher localizationerrors. Accounting this, self-calibration techniques forradio map updating [11], [37], [46], which tackle theRSS temporal variations to maintain an up-to-datedatabase, could be incorporated for practical deploy-ment and usage.

5 RELATED WORKSIn the literature of indoor localization, many tech-niques have been proposed in the past two decades.The state-of-the-art generally falls into two categories:fingerprint-based and ranging-based.

Fingerprint-based techniques. A large body ofindoor localization approaches adopts fingerprintmatching as the basic scheme for location estimation.Researchers have explored diverse signatures includ-ing WiFi [2], RFID [3], acoustic [5], etc. Among varioussignatures used, WiFi based scheme has been the mostattractive.

Smartphones with various built-in sensors havebeen leveraged in fingerprint-based localization to

reduce or eliminate site survey efforts. Examples in-clude LiFS [9], unloc [11], Zee [13], Walkie-Markie[10], etc. They typically combine user mobility withextra information like digital floor plan [9], [13] orindoor landmarks [11] and usually can only handlemobile trajectories [10]. As these works mainly fo-cus on easing the site survey in the training phase,DorFin is orthogonal to them in targeting at finger-print matching of the online phase to improve local-ization accuracy. Nevertheless, DorFin is also compat-ible to crowdsourced fingerprint database constructedvia these schemes.

Pursuing better accuracy, sophisticated probabilitymodels and advanced machine learning techniqueshave been employed [34], [35]. The study [16] valid-ates a broad range of approaches in a realistic environ-ments and reports that median errors of prior workare consistently greater than 5 meters and, counter-intuitively, that simpler algorithms frequently outper-form more sophisticated ones. Realizing that largeerrors always exist due to possibly faraway locationswith similar WiFi signatures, authors in [12], [33]attempt to incorporate acoustic ranging in WiFi fin-gerprinting to limit the large tail errors. Although sig-nificant improvements are achieved, these approacheseither rely on ranging among a dense crowd of usersor require calibrating additional information. Recentworks also explore new fingerprint features such asneighbour relative RSSs [47], neighbour RSS gradi-ent [21] and RSS ratio over multiple antennas [48]for accurate and robust fingerprinting. To completelybypass the instability of RSS, physical layer ChannelState Information (CSI) is recently introduced andachieves an accuracy of ∼1m [17], but at the costof ubiquity degradation (since CSI is unavailable onmost commodity smartphones).

To reduce computational complexity, d Different cri-teria for AP’s discriminatory ability such as InfoGain[23] and MaxMean [35] have been proposed to choosea subset of APs for localization. A more intuitivemethod called RSS cutoff, i.e., discarding RSS valuesbelow a specific threshold, is preferred in commercialproducts [26]. These methods conduct AP selectionmainly to reduce the computational complexity. Incontrast, we target at appropriately exploiting allavailable APs for more accurate fingerprint matching.Accounting for human body blockage, fingerprints aretypically collected for multiple directions [30], whichmay increase labour efforts while offers limited gains.An elaborate model is designed in [31] to compensatefor the signal attenuation of human body and thusgenerate orientation-independent fingerprints frommeasurements on just one orientation. In contrast tolabour-intensive measurements or vulnerable models,we resort to exploit robust regression techniques toachieve effective robustness to RSS uncertainties.

Ranging-based techniques. These schemes calcu-late locations based on geometrical models rather

Page 12: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

12 TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, MAY 2016

than search for best-fitted signatures from pre-labeledreference database. The prevalent LDPL model, forinstance, builds up a semi-statistical function betweenRSS values and RF propagation distances [15], [49].These approaches trade measurement efforts for thecost of decreasing localization accuracy. EZ [49] em-ploys a modeling method assuming no knowledge ofphysical layout or AP locations, and reports medianerror of 7 meters. Apart from RSS-based ranging, CSIis recently used to obtain for highly accurate distanceand angle estimation [14]. Acoustic ranging is alsoemployed for fine-grained indoor localization, suchas Centour [33], Guoguo [6], etc.

Different from previous works that introduce addi-tional information or extra signal sources for high ac-curacy, we identify the root causes of location errors inWiFi fingerprint-based localization for mobile devices.Specifically, we uncover and solved the transitionalfingerprint problem, which has not been noticed in ex-isting works. Aiming at a ubiquitous location service,we follow a typical RSS fingerprint-based scheme todesign DorFin, which is thus amendable to integratewith existing approaches and can be easily incorpor-ated in deployed systems with little efforts, making ita promising scheme in practical applications.

6 CONCLUSIONS

While WiFi fingerprint-based localization acts as thedominant scheme in indoor localization, the accuracychallenge remains a primary concern. In this paper,we identify several crucial causes of localization errorsin fingerprint-based schemes. These observations thenlead us to the design of a new WiFi fingerprintingscheme which successfully reduces the mean and 95thpercentile location errors to 2.5 meters and 6.2 meters,without degrading ubiquity nor increasing the costs.Our approach marks a significant progress in RSSfingerprint-based indoor localization, especially forsmartphones, and sheds lights on practical deploy-ment in the real world.

ACKNOWLEDGMENT

This work is supported in part by the NSFC undergrant No. 61672319, 61522110, and 61632008.

REFERENCES

[1] P. Bahl and V. N. Padmanabhan, “RADAR: an in-building RF-based user location and tracking system,” in Proceedings ofIEEE INFOCOM, 2000.

[2] M. Youssef and A. Agrawala, “The horus location determin-ation system,” Wirel. Netw., vol. 14, no. 3, pp. 357–374, Jun.2008.

[3] L. M. Ni, Y. Liu, Y. C. Lau, and A. P. Patil, “LANDMARC:indoor location sensing using active RFID,” Wireless Networks,vol. 10, no. 6, pp. 701–710, 2004.

[4] J. Wang and D. Katabi, “Dude, where’s my card? RFID posi-tioning that works with multipath and non-line of sight,” inProceedings of ACM SIGCOMM, 2013.

[5] S. P. Tarzia, P. A. Dinda, R. P. Dick, and G. Memik, “Indoor loc-alization without infrastructure using the acoustic backgroundspectrum,” in Proceedings of ACM MobiSys, 2011.

[6] K. Liu, X. Liu, and X. Li, “Guoguo: Enabling fine-grainedindoor localization via smartphone,” in Proceedings of ACMMobiSys, 2013.

[7] N. B. Priyantha, A. Chakraborty, and H. Balakrishnan, “Thecricket location-support system,” in Proceedings of ACM Mo-biCom, 2000.

[8] P. Lazik and A. Rowe, “Indoor pseudo-ranging of mobiledevices using ultrasonic chirps,” in Proceedings of ACM SenSys,2012.

[9] C. Wu, Z. Yang, and Y. Liu, “Smartphones based crowd-sourcing for indoor localization,” Mobile Computing, IEEETransactions on, vol. 14, no. 2, pp. 444–457, Feb 2015.

[10] G. Shen, Z. Chen, P. Zhang, T. Moscibroda, and Y. Zhang,“Walkie-markie: indoor pathway mapping made easy,” inProceedings of USENIX NSDI, 2013.

[11] H. Wang, S. Sen, A. Elgohary, M. Farid, M. Youssef, andR. R. Choudhury, “No need to war-drive: unsupervised indoorlocalization,” in Proceedings of ACM MobiSys, 2012.

[12] H. Liu, J. Yang, S. Sidhom, Y. Wang, Y. Chen, and F. Ye,“Accurate wifi based localization for smartphones using peerassistance,” Mobile Computing, IEEE Transactions on, vol. 13,no. 10, pp. 2199–2214, Oct 2014.

[13] A. Rai, K. K. Chintalapudi, V. N. Padmanabhan, and R. Sen,“Zee: zero-effort crowdsourcing for indoor localization,” inProceedings of ACM MobiCom, 2012.

[14] S. Sen, J. Lee, K.-H. Kim, and C. Paul, “Avoiding multipathto revive inbuilding wifi localization,” in Proceedings of ACMMobiSys, 2013.

[15] H. Lim, L. C. Kung, J. C. Hou, and H. Luo, “Zero-configurationindoor localization over IEEE 802.11 wireless infrastructure,”Wireless Networks, vol. 16, no. 2, pp. 405–420, 2010.

[16] D. Turner, S. Savage, and A. C. Snoeren, “On the empiricalperformance of self-calibrating wifi location systems,” in Pro-ceedings of IEEE Conference on Local Computer Networks (LCN),2011.

[17] S. Sen, B. Radunovic, R. R. Choudhury, and T. Minka, “Youare facing the mona lisa: spot localization using PHY layerinformation,” in Proceedings of ACM MobiSys, 2012.

[18] H. Xu, Z. Yang, Z. Zhou, L. Shangguan, K. Yi, and Y. Liu,“Enhancing wifi-based localization with visual clues,” in Pro-ceedings of ACM UbiComp, 2015, pp. 963–974.

[19] L. Li, G. Shen, C. Zhao, T. Moscibroda, J.-H. Lin, and F. Zhao,“Experiencing and Handling the Diversity in Data Density andEnvironmental Locality in an Indoor Positioning Service,” inProceedings of ACM MobiCom, 2014.

[20] S. He, T. Hu, and S.-H. G. Chan, “Contour-based trilaterationfor indoor fingerprinting localization,” in Proceedings of ACMSenSys, 2015.

[21] Y. Shu, Y. Huang, J. Zhang, P. Cou, P. Cheng, J. Chen, andK. G. Shin, “Gradient-based fingerprinting for indoor localiz-ation and tracking,” IEEE Transactions on Industrial Electronics,vol. 63, no. 4, pp. 2424–2433, April 2016.

[22] A. Haeberlen, E. Flannery, A. M. Ladd, A. Rudys, D. S.Wallach, and L. E. Kavraki, “Practical robust localization overlarge-scale 802.11 wireless networks,” in Proceedings of ACMMobiCom, 2004.

[23] Y. Chen, Q. Yang, J. Yin, and X. Chai, “Power-efficient access-point selection for indoor location estimation,” Knowledge andData Engineering, IEEE Transactions on, vol. 18, no. 7, pp. 877–888, 2006.

[24] P. Jiang, Y. Zhang, W. Fu, H. Liu, and X. Su, “Indoor mo-bile localization based on wi-fi fingerprints important accesspoint,” International Journal of Distributed Sensor Networks, vol.2015, p. 45, 2015.

[25] P. Mirowski, D. Milioris, P. Whiting, and T. Kam Ho, “Probab-ilistic radio-frequency fingerprinting and localization on therun,” Bell Labs Technical Journal, vol. 18, no. 4, pp. 111–133,2014.

[26] I. Cisco Systems, “Wi-fi location-based services 4.1 designguide,” 2013.

[27] S.-H. Fang and T.-N. Lin, “Principal component localization inindoor wlan environments,” Mobile Computing, IEEE Transac-tions on, vol. 11, no. 1, pp. 100–110, 2012.

Page 13: TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. X, NO. X, …

WU et al.: MITIGATING LARGE ERRORS IN WIFI-BASED INDOOR LOCALIZATION FOR SMARTPHONES 13

[28] T. B. Welch, R. L. Musselman, B. A. Emessiene, P. D. Gift, D. K.Choudhury, D. N. Cassadine, and S. M. Yano, “The effectsof the human body on uwb signal propagation in an indoorenvironment,” Selected Areas in Communications, IEEE Journalon, vol. 20, no. 9, pp. 1778–1782, 2002.

[29] Z. Zhang, X. Zhou, W. Zhang, Y. Zhang, G. Wang, B. Y.Zhao, and H. Zheng, “I am the antenna: accurate outdoor APlocation using smartphones,” in Proceedings of ACM MobiCom,2011.

[30] A. S. Paul and E. Wan, “Rssi-based indoor localization andtracking using sigma-point kalman smoothers,” Selected Topicsin Signal Processing, IEEE Journal of, vol. 3, no. 5, pp. 860–873,2009.

[31] N. Fet, M. Handte, and P. J. Marron, “A model for wlansignal attenuation of the human body,” in Proceedings of ACMUbiComp, 2013.

[32] W. Sun, J. Liu, C. Wu, Z. Yang, X. Zhang, and Y. Liu, “Moloc:on distinguishing fingerprint twins,” in Proceedings of IEEEICDCS, 2013.

[33] R. Nandakumar, K. K. Chintalapudi, and V. N. Padmanabhan,“Centaur: locating devices in an office environment,” in Pro-ceedings of ACM MobiCom, 2012.

[34] M. Youssef and A. Agrawala, “Handling samples correlationin the horus system,” in Proceedings of IEEE INFOCOM, 2004.

[35] M. A. Youssef, A. Agrawala, and A. Udaya Shankar, “WLANlocation determination via clustering and probability distribu-tions,” in Proceedings of IEEE PerCom, 2003.

[36] “IEEE 802.11, part11: Wireless LAN medium access control(MAC) and physical layer (PHY) specifications,” 2012.

[37] J. Yin, Q. Yang, and L. Ni, “Learning adaptive temporal radiomaps for signal-strength-based location estimation,” MobileComputing, IEEE Transactions on, vol. 7, no. 7, pp. 869–883, 2008.

[38] P. Bahl, V. N. Padmanabhan, and A. Balachandran, “Enhance-ments to the radar user location and tracking system,” Tech.Rep., 2000.

[39] S. Hilsenbeck, D. Bobkov, G. Schroth, R. Huitl, and E. Stein-bach, “Graph-based data fusion of pedometer and wifi meas-urements for mobile indoor positioning,” in Proceedings ofACM UbiComp, 2014, pp. 147–158.

[40] C. Wu, Z. Yang, Y. Xu, Y. Zhao, and Y. Liu, “Human mobilityenhances global positioning accuracy for mobile phone local-ization,” Parallel and Distributed Systems, IEEE Transactions on,vol. 26, no. 1, pp. 131–141, Jan 2015.

[41] P. J. Rousseeuw and A. M. Leroy, Robust regression and outlierdetection. Wiley, 2005.

[42] P. J. Rousseeuw, “Least median of squares regression,” Journalof the American statistical association, vol. 79, no. 388, pp. 871–880, 1984.

[43] T. S. Rappaport et al., Wireless communications: principles andpractice. Prentice Hall PTR New Jersey, 1996, vol. 2.

[44] Z. Yang, C. Wu, Z. Zhou, X. Zhang, X. Wang, and Y. Liu,“Mobility increases localizability: A survey on wireless indoorlocalization using inertial sensors,” ACM Computing Surveys,vol. 47, no. 3, pp. 54:1–54:34, Apr. 2015.

[45] P. Mirowski, H. Steck, P. Whiting, R. Palaniappan, M. Mac-Donald, and T. K. Ho, “Kl-divergence kernel regression fornon-gaussian fingerprint based localization,” in Proceedings ofIEEE IPIN, 2011, pp. 1–10.

[46] C. Wu, Z. Yang, C. Xiao, C. Yang, Y. Liu, and M. Liu, “Staticpower of mobile devices: Self-updating radio maps for wire-less indoor localization,” in Proceedings of IEEE INFOCOM,April 2015, pp. 2497–2505.

[47] K. Lin, M. Chen, J. Deng, M. M. Hassan, and G. Fortino,“Enhanced fingerprinting and trajectory prediction for iot loc-alization in smart buildings,” IEEE Transactions on AutomationScience and Engineering, vol. 13, no. 3, pp. 1294–1307, July 2016.

[48] W. Cheng, K. Tan, V. Omwando, J. Zhu, and P. Mohapatra,“Rss-ratio for enhancing performance of rss-based applica-tions,” in Proceedings of IEEE INFOCOM, 2013.

[49] K. Chintalapudi, A. Padmanabha Iyer, and V. N. Padman-abhan, “Indoor localization without the pain,” in Proceedingsof ACM MobiCom, 2010.

Chenshu Wu received his B.E. degree inSchool of Software in 2010 and Ph.D. de-gree in Department of Computer Science in2015, both from Tsinghua University, Beijing,China. He is currently a Postdoc researcherin School of Software, Tsinghua University.His research interests include wireless net-works and pervasive computing. He is amember of the IEEE and the ACM.

Zheng Yang received a B.E. degree in com-puter science from Tsinghua University in2006 and a Ph.D. degree in computer sci-ence from Hong Kong University of Scienceand Technology in 2010. He is currently anassociate professor in Tsinghua University.His main research interests include wirelessad-hoc/sensor networks and mobile comput-ing. He is a member of the IEEE and theACM.

Zimu Zhou is currently a postdoctoral re-searcher at the Computer Engineering andNetworks Laboratory (TIK), ETH Zurich. Hereceived his B.E. degree from the Depart-ment of Electronic Engineering, TsinghuaUniversity in 2011 and Ph.D. degree fromthe Department of Computer Science andEngineering, the Hong Kong University ofScience and Technology in 2015. He is amember of the IEEE.

Yunhao Liu received the BS degree in auto-mation from Tsinghua University, China, in1995, the MS and PhD degrees in computerscience and engineering from Michigan StateUniversity, in 2003 and 2004, respectively. Heis now EMC Chair Professor at Tsinghua Uni-versity. His research interests include wire-less sensor network, peer-to-peer comput-ing, and pervasive computing. He is a Fellowof the ACM and the IEEE.

Mingyan Liu received her Ph.D. Degree inelectrical engineering from the University ofMaryland, College Park, in 2000 and hassince been with the Department of ElectricalEngineering and Computer Science at theUniversity of Michigan, Ann Arbor, whereshe is currently a Professor. Her researchinterests are in optimal resource allocation,performance modeling and analysis, and en-ergy efficient design of wireless, ad hoc, andsensor networks. She is a Fellow of the IEEE.