

electronics

Article

Pre- and Post-Processing Algorithms with Deep Learning Classifier for Wi-Fi Fingerprint-Based Indoor Positioning

Amir Haider , Yiqiao Wei , Shuzhi Liu and Seung-Hoon Hwang *

Division of Electronics and Electrical Engineering, Dongguk University-Seoul, Seoul 04620, Korea; [email protected] (A.H.); [email protected] (Y.W.); [email protected] (S.L.)
* Correspondence: [email protected]; Tel.: +82-2-2260-3994

Received: 28 November 2018; Accepted: 4 February 2019; Published: 8 February 2019

Abstract: To accommodate the rapidly increasing demand for connected infrastructure, automation for industrial sites and building smart cities, the development of Internet of Things (IoT)-based solutions is considered one of the major trends in the modern-day industrial revolution. In particular, providing high-precision indoor positioning services for such applications is a key challenge. Wi-Fi fingerprint-based indoor positioning systems have been adopted as promising candidates for such applications. The performance of such indoor positioning systems degrades drastically due to several impairments such as noisy datasets, high variation in Wi-Fi signals over time, fading of Wi-Fi signals due to multipath propagation caused by obstacles, people walking in the area under consideration and the addition/removal of Wi-Fi access points (APs). In this paper, we propose data pre- and post-processing algorithms with deep learning classifiers for Wi-Fi fingerprint-based indoor positioning, in order to provide immunity against limitations in the database and the indoor environment. In addition, we investigate the performance of the proposed system through simulation as well as extensive experiments. The results demonstrate that the pre-processing algorithm can efficiently fill in the missing Wi-Fi received signal strength fingerprints in the database, resulting in a success rate of 88.96% in simulation and 86.61% in a real-time experiment. The post-processing algorithm can improve the results by 9.05–10.94% for the conducted experiments, providing the highest success rate of 95.94% with a precision of 4 m for Wi-Fi fingerprint-based indoor positioning.

Keywords: Wi-Fi fingerprint; indoor positioning; IoT; data pre- and post-processing; deep learning

1. Introduction

In the past decade, indoor location-based services (LBSs) have attracted considerable attention, driving the development of indoor positioning technologies to fulfil the technological requirements of modern-day communications services. One of the key research issues is the development of a precise and reliable indoor positioning system to improve the inadequate performance of current positioning systems in indoor environments [1]. While global positioning systems (GPS) are unavailable in indoor environments due to the low penetration power of microwaves in indoor locations, several other candidates such as radio waves, acoustic signals, magnetic field, Wi-Fi, and other sensor information from mobile devices [2–4] can serve the purpose of determining the user location in indoor environments. Among the mentioned technologies, Wi-Fi signal-based indoor positioning is the most popular due to its wide deployment and the low cost of Wi-Fi networks [5]. Wi-Fi is considered more suitable in complex environments. Wi-Fi-based positioning technology can be classified into two types, time-space attribute methods and received signal strength (RSS)-based methods [6]. The latter is also known as fingerprinting and is one of the most popular indoor positioning technologies.

Electronics 2019, 8, 195; doi:10.3390/electronics8020195 www.mdpi.com/journal/electronics


Unlike measurement-based methods, fingerprint-based methods can be easily implemented without additional hardware. Generally, the process of Wi-Fi fingerprint-based indoor positioning consists of two phases: offline and online. In the offline phase, the radio map or fingerprint database is constructed by measuring and recording the RSS of Wi-Fi signals from different access points (APs). In the online phase, the real-time RSS measurements from the user are matched and compared with the fingerprints in the database via classification algorithms to estimate the position of the user [7]. The fingerprint database can be constructed using an IoT device or smartphone. This database is then used to train the classifier to be able to predict user position in a real indoor environment with desired target locations. A high-performance positioning server is in charge of database storage and position prediction (classification task). Machine learning and deep learning-based classifiers can be employed to handle this classification task at the positioning server [8–11]. Figure 1 shows the working procedure for a Wi-Fi fingerprint-based indoor positioning system.


Figure 1. Illustration of Wi-Fi fingerprint-based indoor positioning system.
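As a rough illustration of the two-phase procedure described above, the following minimal sketch stores a tiny radio map (offline phase) and matches an online measurement against it by Euclidean distance. The nearest-neighbour matcher is only a stand-in chosen for brevity and is not the deep learning classifier used in this work; the RP labels, RSS values and the names `RADIO_MAP` and `nearest_rp` are hypothetical.

```python
import math

# Offline phase (hypothetical data): RP label -> stored RSS fingerprint in dBm,
# one value per access point, always in the same AP order.
RADIO_MAP = {
    "RP1": [-40.0, -70.0, -85.0],
    "RP2": [-65.0, -45.0, -80.0],
    "RP3": [-85.0, -72.0, -42.0],
}


def nearest_rp(measurement, radio_map):
    """Return the RP whose stored fingerprint is closest to the measurement."""
    def distance(fingerprint):
        # Euclidean distance between the online measurement and a stored fingerprint.
        return math.sqrt(sum((m - f) ** 2 for m, f in zip(measurement, fingerprint)))

    return min(radio_map, key=lambda rp: distance(radio_map[rp]))


# Online phase: a real-time measurement is matched against the radio map.
online_measurement = [-63.0, -48.0, -79.0]
estimated_rp = nearest_rp(online_measurement, RADIO_MAP)
print(estimated_rp)
```

In a deployed system the matching step would be replaced by the trained classifier running on the positioning server.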


In the literature, several Wi-Fi fingerprint-based indoor positioning systems have been proposed which intend to improve specific location prediction and the overall system success rate. An efficient indoor positioning technique is presented in [12,13] and employs Wi-Fi fingerprinting along with trilateration. The trilateration technique employs knowledge of distance for three known APs at every target location while constructing the RSS fingerprint database. Similarly, a contour-based trilateration technique was presented in [14], where the reference points (RPs) with the same signal levels were combined to form a contour and the location was estimated. This approach highly depends on the position and orientation of the desired APs. A novel indoor positioning scheme which operates robustly by extracting the multipath propagation delay profile of Wi-Fi signals was presented in [15]. With this approach, changes in the indoor environment like multipath propagation can be efficiently compensated to improve system robustness. An experimental evaluation of location methods based on RSS measurements was presented in [16]. This work focused on VHF frequencies and outdoor positioning systems and did not take Wi-Fi RSS fingerprint-based indoor positioning into consideration. To improve the fingerprinting database construction, an algorithm termed “Slide” is presented in [17], where the RSS fingerprint is recorded using a smartphone. This algorithm is limited only to linear positioning, which is not the case in most complex-structured indoor environments. As mentioned earlier, the time-varying nature of Wi-Fi signals and the time consumed in the development of the RSS fingerprint database are major constraints in adopting Wi-Fi RSS measurements for indoor positioning. To address this issue, a gradient-based approach to stabilize the RSS gradient of Wi-Fi signals was proposed in [18], but an accuracy of 5.6 m was reported, which is not very practical when the indoor environment is considered. The RSS measurements are prone to environmental dynamics such as weather conditions, which can result in degradation of the accuracy of distance estimation. Considering this issue, the rain attenuation effect was included in the pathloss model in order to estimate the impact of precipitation in the outdoor environment [19]. To address time consumption in the construction of the RSS fingerprint database, an autonomous crowdsourcing approach using multiple smartphones is described in [20].


However, autonomous behaviour for database collection is implemented by employing an existing trusted portable navigator in the target location, which is also immune to variations in the indoor environment, resulting in an estimation error greater than 5.7 m, close to that of the above-mentioned gradient-based approach. Another approach to reduce time consumption in RSS fingerprint database construction is to analytically estimate the RSS distribution instead of manual data collection. To estimate the RSS distribution for APs, a probabilistic approach is described in [21], where a single- and double-peak Gaussian distribution is employed for the AP signals. Extensive efforts to improve the accuracy of the RSS-based centroid localization algorithm in indoor environments were presented in [22]. However, the focus was on wireless sensor network nodes and the indoor area for the experiment was limited to a single room of 6 m × 7 m. The analysis cannot be generally applied to large areas. An extensive overview of the offline evaluation of smartphone-based indoor positioning techniques demonstrated at the International Conference on Indoor Positioning and Indoor Navigation (IPIN) competition 2017 was presented in [23]. The authors concluded that the database for the fingerprint algorithm should be recorded in a static manner or at a very slow continuous motion such as 0.2 m/s. Additionally, the issue of training database density was addressed in [24], where the fingerprint-based approaches were preferred with a dense database.

Both machine learning and deep learning-based classifiers are popular for handling the matching and classification tasks. Recently, deep learning has gained more attention because of its high accuracy with large amounts of training data [25]. Compared to machine learning techniques, deep learning attempts to learn high-level features from data in an incremental manner, which can eliminate the need for domain expertise and hand-crafted feature extraction [26]. Regarding the training time, deep learning algorithms take longer in the offline phase due to the large number of parameters. However, the scenario is totally reversed in the online phase, where deep learning takes much less time to run [27]. That is, deep learning performs better in the case of large data size. Keeping the above-mentioned aspects in mind, we employ a deep learning-based classifier in this work.

Several classifiers have been introduced to indoor localization to handle the matching task, which gives rise to a complicated multi-class classification problem. To address the challenge of developing an efficient classifier for the positioning server, [28] presents several machine learning approaches including kNN, a rules-based classifier (JRip), and random forest, employing them for Wi-Fi fingerprint-based indoor positioning systems. The results indicate that the random forest classifier offers the best performance. Random forest is one of the most powerful machine learning algorithms available today [29]. It can handle both classification and regression tasks and has been widely used in various applications such as Internet traffic interception, voice and image classification. However, it has been rarely utilized in indoor positioning. An on-device learning approach is depicted in [30], where a mobile application allows users to build and train their own RSS maps offline. A deep learning-based classifier for indoor positioning is presented in [31]; however, it employs channel state information (CSI) instead of the Wi-Fi RSS value, which is our focus in this work. To the best of our knowledge, the only reported work on deep learning-based Wi-Fi fingerprinting is done by [32]; however, database collection is not addressed and the existing UJIIndoorLoc dataset [33] is employed.

The key issues with the offline phase of such Wi-Fi fingerprint-based indoor positioning systems are the high variations in Wi-Fi signals over time, fading signals due to multipath propagation caused by obstacles, people walking in the area under consideration and addition/removal of Wi-Fi APs [34–36]. Additionally, the signal strength and resource allocation by the APs are dependent on the number of connected users. As most Wi-Fi-based positioning technologies use the existing infrastructure of the Wi-Fi network in the indoor environment, these problems cannot be avoided, as avoiding them would require control of the infrastructure. It is important to mention that the performance of the positioning server highly depends on the quality of the RSS fingerprint database used to train the server for position prediction [37]. Most real-world data is composed of inaccurate, noisy, inconsistent, and missing data. The reasons for the existence of such data could be technological problems stemming from the gadgets that gather data, human mistakes during data entry and much more. The collected data cannot be used directly for training the server. Some machine learning and deep learning models for the server need information in a specified format. For example, the random forest algorithm does not support null values; therefore, to execute the random forest algorithm, null values have to be managed in the original raw dataset. To solve this problem, data pre-processing is performed. Data pre-processing is an important part of data science [38]. It includes the two concepts “data cleaning” and “feature engineering”. These two are compulsory for achieving better accuracy and performance in machine learning and deep learning algorithms. Data pre-processing is necessary because of the presence of unformatted real-world data. Similarly, during the online phase for Wi-Fi fingerprint-based indoor positioning, there can be some errors in position estimation for users, as the test data might be affected by limitations in the indoor environment. To address this issue, another data processing scheme can be applied, and this process is termed data post-processing. Data post-processing is performed on the result data. Post-processing allows for analysis of the previous result sets in a formal manner. Data pre- and post-processing algorithms are expected to improve the performance of Wi-Fi fingerprint-based indoor positioning systems [39]. A pre-processing approach that involves removing useless APs and their respective RSS from the fingerprint database was presented in [40]. This can help in reducing the computational overhead during position estimation and improves the system performance. A data pre-processing technique was presented in [41] to remove noisy data from the RSS fingerprint database by combining Wi-Fi and a geographical information system (GIS). To accommodate the heterogeneity of RSS measuring devices, a data pre-processing algorithm was presented in [42], which can scale the RSS values from various devices to a uniform range. Unlike the above-mentioned server-based pre-processing algorithms, a user-based data pre-processing library employing RSS filtering was presented for Android-based devices in [43]. To the best of our knowledge, there is no data pre-processing algorithm developed for filling in missing values in the Wi-Fi RSS fingerprint database, which is expected to improve the system performance. Similarly, a brief survey on post-processing algorithms for machine learning and data mining was presented in [44]. However, to the best of our knowledge, there is no post-processing algorithm available for deep learning-based classifiers for Wi-Fi fingerprint-based indoor positioning.

In order to solve the above problems with Wi-Fi fingerprinting, much research has been conducted on algorithms for data pre-processing to remove noisy and missing data, and many techniques for improving the classification server have been developed. However, conventional techniques that have already been developed still have limitations in securing high accuracy, reliable success percentage and compensation for environmental changes that cause performance degradation. To the best of our knowledge, data post-processing has not been described for Wi-Fi fingerprint-based indoor positioning in the literature. There is a need for the development of an algorithm which can efficiently tackle the above-mentioned challenges in real-time using the existing infrastructure of the indoor environment.

In this work, we present a Wi-Fi fingerprint-based indoor positioning system using a deep learning classifier. The main contributions of this work are as follows:

• A data pre-processing algorithm to compensate for impairments in the collected RSS fingerprint database;

• A data post-processing algorithm to enhance the system performance by limiting the effects of the indoor environment on the experimental phase of the proposed system; and

• Investigation of the performance of a deep learning-based classifier at the server in charge of the RSS fingerprint database storage and position prediction.

The rest of the paper is organized as follows: The proposed system model is described in Section 2. The simulation and experimental results for the proposed system are discussed in Section 3. Finally, Section 4 presents the conclusion of our work and future directions.


2. Proposed System

In this section, an overview of the proposed system is first presented. The environment and setup for simulation and experiment along with RSS fingerprint database construction is described in Section 2.2. Description of the data pre-processing algorithm is presented in Section 2.3. The deep learning classifier and data post-processing algorithm are described in Sections 2.4 and 2.5, respectively.

2.1. Overview of the Proposed System

In this work, we propose a Wi-Fi fingerprint-based indoor positioning system equipped with data pre- and post-processing algorithms. Figure 2 depicts the actual two-phase indoor positioning service delivered by the proposed system. The proposed system is a server-based and user-active indoor positioning system. A high-performance positioning server is in charge of database storage and position prediction (classification task) by employing deep learning.


Figure 2. Proposed Wi-Fi fingerprint-based indoor positioning system equipped with data pre- and post-processing modules.

The proposed system consists of two phases: an offline and an online phase. Firstly, in the offline phase, the database is constructed by mapping the RSS measurements at each reference point (RP) through a smartphone. The database in the offline phase of Figure 2 consists of two kinds of database: a training database and a trial database. The training database is used in the offline phase for training the deep learning classifier. The trial database is used to perform the offline test simulation to evaluate the performance of the deep learning classifier. Both training and trial databases are pre-processed using the proposed pre-processing algorithm. The pre-processed databases are then used to train the deep learning classifier, which becomes the trained classifier in the online phase. Secondly, in the online phase of Figure 2, five real-time measurements at unknown user positions are pre-processed to fill the missing values. The pre-processed measurements are then passed to the trained classifier to determine the user location. The trained classifier returns five decisions for the user’s position, which are then passed to the data post-processing algorithm. The post-processing algorithm selects the most frequent decision for user position based on the “Majority-rule”. Finally, the decision on user location is passed to the user’s smartphone.
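The “Majority-rule” selection over the five classifier decisions can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `majority_rule`, the example RP labels and the tie-breaking behaviour (the label seen first wins) are assumptions, since the text above does not specify how ties are resolved.

```python
from collections import Counter


def majority_rule(decisions):
    """Return the most frequent RP label among the classifier decisions."""
    if not decisions:
        raise ValueError("at least one decision is required")
    # Counter.most_common orders equal counts by first occurrence, so a tie
    # resolves to the label that appeared earliest (an assumption made here).
    return Counter(decisions).most_common(1)[0][0]


# Five hypothetical decisions returned by the trained classifier for one user position.
decisions = [12, 13, 12, 12, 40]
final_rp = majority_rule(decisions)
print(final_rp)
```

With five decisions per position, isolated misclassifications caused by momentary signal fluctuations are outvoted as long as the correct RP is the most frequent decision.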

2.2. Environment and Setup

To implement the proposed system model, RSS fingerprint database collection and experimental verification of the simulation data are both performed on the 7th floor of the New Engineering Building at Dongguk University, Seoul, Korea. As shown in Figure 3, the 52 × 32 square meters target area is divided into 74 RPs with an interval of 2 m between two consecutive RPs. The pre-installed anonymous APs on the 7th floor are used in this work. In Figure 3, the location of the APs is unknown to the positioning server. The main motive is to use the pre-installed infrastructure of Wi-Fi APs in the building for indoor localization. The positioning server is a Dell Alienware model P31e (Alienware, hardware subsidiary of Dell, Miami, FL, USA). A Samsung SM-E310K smartphone is used to measure the RSS value from all available APs at each RP.


On the software side, construction of the RSS fingerprint database, data pre-processing, the deep learning classifier, data post-processing, and the online experiment program are all implemented in Python on the TensorFlow platform.

Figure 3. Environment setup for floor map with 74 reference points.


To collect the RSS fingerprint database for the above environment, a smartphone is used tomeasure the RSS value from all available APs at each RP. Each measurement contains the RP label,time, date, number of available APs, MAC address and respective RSS value for each AP at every RPon the floor map. This measurement, termed the fingerprint, is then transmitted to the positioningserver. Five RSS fingerprints are collected in forward as well as in backward directions at each RP. Thesampling time for each RSS measurement is 5 s, which means 25 s for total five measurements. For74 reference points, the total time consumed 31 min for each direction and 62 min for both directions.For the training data, the measurements are taken in the morning, afternoon and evening for sevendays, which results in 22 h approximately. For the trial data, the measurements are taken for two daysin the same manner. Therefore, the time consumption is 6.5 h, approximately. All these fingerprints aresaved as training and trial files in .csv format in the positioning server, as the deep learning classifierat the server is designed to use .csv format files as input. At each RP, several APs are present with

Page 7: Pre- and Post-Processing Algorithms with Deep Learning

Electronics 2019, 8, 195 7 of 17

unique MAC addresses. In the .csv file, the MAC addresses are added on the first row at the headerof the file. If the MAC address is already present, then the RSS value is added to the correspondingcolumn. If the MAC address is not present, the value is added to a new column. For 74 RPs, thenumber of columns can, therefore, be more or less than 256. The number 256 would be the maximumcorresponding to the input size of the deep learning classifier. Therefore, the training and trial .csv filesmust contain 257 columns, where 256 columns are used to save the RSS fingerprint while 1 columnis used to indicate the RP label. After generating the database, the number of the column in bothtraining and trail .csv files are checked. If the number is less than 256, the remaining columns arefilled with “0 s”. Otherwise, extra columns over 256 are discarded, starting from the AP with theminimum number of RSS fingerprints. This collected RSS fingerprint database is then passed to thedata pre-processing section as shown in Figure 2.

2.3. Pre-Processing Algorithm

As the collected dataset contains RSS fingerprints recorded at 74 RPs on the floor map, 256 APs arepresent in the RSS fingerprint database. However, for each individual RP measurement, the numberof APs is around 20 on average, so the remaining spaces are filled with "0 s” while the database isprepared. As discussed earlier, some algorithms like random forest cannot perform well with nullvalues, so zero values should be removed. To accomplish this task, a data pre-processing algorithm isproposed in this work. The flow chart for the proposed data pre-processing algorithm is depicted inFigure 4.

Electronics 2019, 8, 195 7 of 16

of the column in both training and trail .csv files are checked. If the number is less than 256, the remaining columns are filled with "0 s”. Otherwise, extra columns over 256 are discarded, starting from the AP with the minimum number of RSS fingerprints. This collected RSS fingerprint database is then passed to the data pre-processing section as shown in Figure 2.
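The column check described above can be illustrated with a minimal sketch. It assumes each fingerprint is available as a dictionary from MAC address to RSS value; the function name `fit_columns`, the sample MAC strings and the RSS numbers are hypothetical, and the label column is left out for brevity. It is not the authors' program, only one way to keep the most frequently observed APs and zero-pad the rest.

```python
from collections import Counter

INPUT_SIZE = 256  # classifier input width stated in the text


def fit_columns(rows, input_size=INPUT_SIZE):
    """Build a fixed-width RSS matrix from per-fingerprint dicts.

    rows: list of dicts mapping MAC address -> RSS value (one dict per fingerprint).
    Returns (macs, matrix), where every matrix row has exactly input_size values.
    """
    # Count in how many fingerprints each AP appears.
    counts = Counter(mac for row in rows for mac in row)
    # Keep at most input_size APs, so the APs with the fewest fingerprints go first.
    macs = [mac for mac, _ in counts.most_common(input_size)]
    matrix = []
    for row in rows:
        values = [row.get(mac, 0) for mac in macs]   # 0 marks a missing reading
        values += [0] * (input_size - len(values))   # zero-pad unused columns
        matrix.append(values)
    return macs, matrix
```

With three observed APs and `input_size=2`, the AP seen in the fewest fingerprints is dropped; with `input_size=4`, one extra all-zero column is appended.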

2.3. Pre-Processing Algorithm

As the collected dataset contains RSS fingerprints recorded at 74 RPs on the floor map, 256 APs are present in the RSS fingerprint database. However, for each individual RP measurement, the number of APs is around 20 on average, so the remaining entries are filled with zeros while the database is prepared. As discussed earlier, some algorithms like random forest cannot perform well with null values, so the zero values should be replaced. To accomplish this task, a data pre-processing algorithm is proposed in this work. The flow chart for the proposed data pre-processing algorithm is depicted in Figure 4.

Figure 4. Flow graph for the proposed pre-processing algorithm.

The pre-processing algorithm is designed to fill in the missing values, previously recorded as zeros in the RSS fingerprint database for each RP measurement, with the average value for each AP over the 74 RPs. It is important to mention here that the average is taken only over the non-zero RSS values for each AP at the 74 RPs. The steps followed in the pre-processing algorithm are as follows:

• Step 1: Calculate the average of all non-zero values in each column (representing one AP) of the training RSS fingerprint database (Training.csv);
• Step 2: Check the training data file for “0” values;
• Step 3: Replace each zero with the average value from Step 1 for the corresponding column in the training RSS fingerprint database; and
• Step 4: Repeat Step 2 and Step 3 for the trial fingerprint data (Trial.csv). (Note: the averages of the training RSS fingerprint database replace the “0” values in Trial.csv.)
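The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the databases are already loaded as lists of rows without the RP label column, the helper names `column_means_nonzero` and `fill_zeros` are hypothetical, and the RSS numbers are made up. A column that contains only zeros is left at 0.0, a case the text does not discuss.

```python
def column_means_nonzero(matrix):
    """Step 1: per-column (per-AP) mean over the non-zero entries only."""
    means = []
    for column in zip(*matrix):
        seen = [value for value in column if value != 0]
        means.append(sum(seen) / len(seen) if seen else 0.0)
    return means


def fill_zeros(matrix, means):
    """Steps 2 and 3: replace every zero with the mean of its column."""
    return [[mean if value == 0 else value for value, mean in zip(row, means)]
            for row in matrix]


# Illustrative data: three training fingerprints and one trial fingerprint, three APs.
training = [[40, 0, 55], [44, 60, 0], [0, 62, 0]]
trial = [[0, 61, 0]]

means = column_means_nonzero(training)        # averages come from the training set
training_filled = fill_zeros(training, means)
trial_filled = fill_zeros(trial, means)       # Step 4: trial zeros use training means
```

Note that the trial data are filled with the training averages, as the note in Step 4 specifies, rather than with averages computed from the trial data themselves.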


The motive behind the pre-processing is to replace the missing values (zeros) in the database with a computed value that is closer to the RSS fingerprints available in the database. It is helpful for the deep learning classifier to operate on plausible RSS values rather than zeros. Since Training.csv contains 15,540 RSS fingerprints, far more than the 60 in Trial.csv, the average values of Training.csv replace the missing values in Trial.csv. In this manner, the missing values in the RSS fingerprint database are replaced with a computed average value and the database is ready to be used for training the deep learning classifier.

2.4. Deep Learning Classifier

In this work, we employ a deep learning-based classifier to predict the user position. To deal with the varying and unpredictable Wi-Fi signals, the positioning is cast in a convolutional neural network (CNN) structure that is capable of learning reliable features from a large set of noisy samples. The database required for the simulation and experiments is collected from the real world on different days and times to mimic the actual environment. The proposed positioning system is summarized in Figure 2. For the offline CNN training, a four-layer deep neural network (DNN)-based coarse localizer is trained to extract reliable features from a massive, noisy RSS fingerprint database. The deep learning classifier consists of three convolutional layers (CL) and one fully-connected (FC) layer. A SoftMax layer is used for multi-class classification. The architecture of the deep learning classifier is depicted in Figure 5. To reduce the training time, the number of neurons decreases as we go deeper in the network. Each 256-value fingerprint in the .csv file is converted to an input image of size 16 × 16 by using tf.reshape in TensorFlow [45].

This structure has several advantages. First, the CNN learns reliable high-level features automatically from a large set of widely fluctuating RSS fingerprints and avoids hand-engineering. Second, the structure is capable of learning useful features directly from labelled data. Third, the CNN-based estimator is more robust, as it predicts the position using the extracted high-level features. Moreover, the structure has an advantage in handling massive RSS fingerprints, since the prediction at the positioning stage does not rely on any search in the sample space but requires only the forward evaluation of a trained feed-forward neural network. For the online positioning, the pre-processed real-time RSS measurements at target RPs are fed into the classifier to obtain a final position estimate. The parameters of the deep learning classifier are summarized in Table 1.
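To make the input conversion concrete, the sketch below shows how one 256-value fingerprint maps onto a 16 × 16 grid. The text performs this step with tf.reshape; the plain-Python helper `to_image` is a hypothetical stand-in that follows the same row-major ordering, so that the example runs without TensorFlow.

```python
def to_image(fingerprint, side=16):
    """Arrange a flat fingerprint of side*side RSS values into side rows (row-major)."""
    if len(fingerprint) != side * side:
        raise ValueError(f"expected {side * side} values, got {len(fingerprint)}")
    return [fingerprint[r * side:(r + 1) * side] for r in range(side)]


# Column k of the .csv row lands at row k // 16, column k % 16 of the image.
image = to_image(list(range(256)))
```

Under this layout, neighbouring pixels correspond to neighbouring AP columns in the .csv file, so the spatial structure seen by the convolutional layers depends on the column order chosen when the database is built.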

Figure 5. Network architecture for the deep learning classifier.

Table 1. Summary of parameters used in the deep learning classifier.

| Parameters | Value |
|---|---|
| No. of layers | 4 |
| Type of layers | 3-CL, 1-FC |
| Input image size | 16 × 16 = 256 |
| No. of neurons | Layer1: 8192; Layer2: 4096; Layer3: 2048; Layer4: 1024 |
| Batch size | 1000 |
| Learning rate | 0.001 |
| Training epoch | 2000 |
| Training probability | 0.5 |
| Inner-width | 1024 |
| Dropout rate | 0.5 |
| Location tolerance | 2 |
| Regularization (β) | 0.01 |
| nb_classes (SoftMax size) | 74 |

2.5. Post-Processing Algorithm

After analysis of the output data, one further data processing scheme can be performed, which is termed data post-processing. The output data of the Wi-Fi fingerprint-based indoor positioning system need to be post-processed, as they might contain wrong decisions regarding the user location. For example, the output may contain the wrong target location because the environmental conditions or the signal strength of the APs vary over time. This can cause the classifier to select the wrong user location, hence reducing the success percentage and performance of the system. The post-processing algorithm can identify such variations in the output data and compensate for imperfections in the indoor environment, such as variations in the signal strength of Wi-Fi signals over time, lower RSS levels caused by multipath fading, changes in the Wi-Fi infrastructure due to the addition or removal of Wi-Fi APs, and obstruction of the signal path by people present in the vicinity of the user location. Post-processing filters out wrong decision outputs of the indoor positioning system that result from the above impairments in the indoor environment.

A data post-processing algorithm is proposed in this work to perform this task on the output data of the Wi-Fi fingerprint-based indoor positioning system. Figure 6 shows the steps used to implement the proposed post-processing algorithm. The algorithm allows analysis of the classifier's result sets in a formal manner: it employs the “majority rule” to select the most frequent decision on the user's position in the result data of the deep learning classifier. The steps of the post-processing algorithm are as follows:

• Step 1: At every RP, five test RSS measurements are taken with the smartphone in the forward as well as the backward direction. The positioning server returns five predicted results for each RP.
• Step 2: The post-processing algorithm computes the most frequent value among the five predictions by employing the “majority rule”.
• Step 3: The most frequent value found in Step 2 is then declared as the final decision for the user location, given as an RP number.

The flow graph for the post-processing algorithm is shown in Figure 6.
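The majority rule in the steps above can be sketched as follows. The function name `majority_vote` and the sample predictions are hypothetical. The text does not say how ties are resolved; this sketch simply returns the tied value that was predicted first, which is an assumption rather than the authors' rule.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent RP number among the classifier predictions."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    # most_common(1) yields the (value, count) pair with the highest count;
    # among equal counts, the value encountered first is returned.
    return Counter(predictions).most_common(1)[0][0]


# Five predictions returned by the positioning server for one RP (illustrative).
final_rp = majority_vote([12, 12, 13, 12, 40])
```

Here two of the five predictions are wrong, yet the final decision is still RP 12, which is how isolated misclassifications are discarded.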


Figure 6. Flow graph for the proposed post-processing algorithm.

3. Numerical Results

3.1. Simulation Results

The performance of the proposed system is quantified in terms of the training accuracy and the success percentage. Training accuracy is the percentage of correct RP predictions without any margin of error when RSS database samples are reused as input to the trained classifier model [7]. The training accuracy for the positioning classifier is given as:

$$\text{Training accuracy} = \frac{\text{No. of correct predictions}}{\text{Total no. of predictions}} \times 100 \tag{1}$$

Success is the percentage of correct position predictions with a margin of error of 2 RPs when the trial database is used as input [46]. As there are 74 RPs in the database, the total number of samples is 74. The success count is incremented with every correct prediction. The success percentage for the classifier is given as:

$$\text{Success percentage} = \frac{\text{Success count}}{\text{Total no. of samples}} \times 100 \tag{2}$$

As mentioned earlier, the interval between two consecutive RPs is 2 m, so the precision of success is 4 m in this work. The simulation for the proposed Wi-Fi fingerprint-based indoor positioning system is carried out for the raw database as well as the pre-processed database.
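The two metrics of Equations (1) and (2) can be illustrated with a short sketch. It assumes that RPs are numbered consecutively along the path, so that the error in RPs is the absolute difference between the predicted and the true RP number; the function names and the sample predictions are hypothetical.

```python
def training_accuracy(predicted, actual):
    """Equation (1): percentage of exact matches, no margin of error."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def success_percentage(predicted, actual, tolerance=2):
    """Equation (2): a prediction counts as a success within `tolerance` RPs."""
    success_count = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return 100.0 * success_count / len(actual)


actual = [1, 2, 3, 4]
predicted = [1, 4, 6, 10]   # errors of 0, 2, 3 and 6 RPs
exact = training_accuracy(predicted, actual)       # 25.0
tolerant = success_percentage(predicted, actual)   # 50.0
```

With the 2 m spacing between consecutive RPs, the default tolerance of 2 RPs corresponds to the 4 m precision quoted above.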

To stress the need for data pre-processing, Figure 7a shows five RSS fingerprints for 20 APs taken at RP-1. It can be seen clearly that, for several APs, there are missing data that were filled with zeros at the RSS fingerprint database construction stage. This results in a non-uniform RSS fingerprint at every RP. To solve this issue, the pre-processing algorithm is employed, which replaces the zeros in the RSS fingerprint database with the computed average value for each AP. Figure 7b shows the impact of data pre-processing on the RSS fingerprint database. By filling in the missing values using the pre-processing algorithm, the RSS fingerprints at RP-1 become more uniform and, thus, the probability of error at the training stage of the classifier, as well as in the simulation and experimental results, can be significantly reduced. The simulation results for the raw database and the pre-processed database are presented in Table 2. With the data pre-processing algorithm, a gain of 1.5% in success percentage is achieved (88.96% versus 87.46%). The simulation results are depicted in Figure 8. The data points used to generate the graphs of Figures 7 and 8 are available at [47]. Concerning the computation overhead, it takes 6.5 s to pre-process the training and trial databases for the simulation. Although this overhead is large because of the database size, the system performance is not affected, since it occurs only in the offline phase.


Table 2. Summary of simulation results.

| Simulation Type | Training Accuracy (%) | Success Percentage (%) |
|---|---|---|
| Without pre-processing | 83.98 | 87.46 |
| With pre-processing | 60.96 | 88.96 |

Figure 7. Five RSS fingerprints for 20 APs at RP-1: (a) raw database without pre-processing; (b) database with pre-processing. Each panel plots RP-1 Fingerprint-1 to Fingerprint-5 against the index of APs (1 to 20), on a vertical axis from 0 to 70.


Figure 8. Simulation results for RSS fingerprint database with and without pre-processing.


It is important to note that, by employing the pre-processing algorithm, there is not only a gain in the success percentage of the system, but the success percentage also remains constant as the number of training epochs increases. Such consistency in success percentage is vital for the experiment, as the trained classifier .meta file is used in the experiment. For the raw database, the success percentage decreases as the number of epochs increases. This degradation can be explained by looking at the training accuracy. As the training accuracy increases, overfitting occurs. Overfitting happens when a model learns the detail and noise in the training database to the extent that it negatively impacts the performance of the model on trial data. It can be seen clearly in Figure 8 that the training accuracy of the pre-processed database is lower than that of the raw database. This helps to avoid overfitting of the training data and, thus, results in a consistent success rate with an increasing number of training epochs.

3.2. Experiment Results

Since the pre-processed RSS fingerprint database shows better performance in the simulation, the trained model .meta file is saved for the epoch with the highest success percentage and used for the online real-time experiment. In the experiment, a smartphone carried by the user captures the real-time RSSs of the available APs and sends them with the MAC addresses to the positioning server. The server pre-processes the RSS fingerprint and employs the trained classifier model to predict the user position from the received RSS fingerprint. In the online phase, the computation overhead is negligible, about 0.4 ms for each real-time RSS measurement. For five real-time RSS measurements, a total computation overhead of 2 ms is needed. To verify the simulation results, we conducted 10 experiments at different times and days to include the effect of environment change on the proposed system. In each experiment, the user carried a smartphone from RP-1 to RP-74, termed the “forward direction”, and then from RP-74 to RP-1, termed the “backward direction”. At each RP, the RSS fingerprint measurement and the position prediction are both performed five times in succession. The post-processing algorithm is applied to the five predictions received from the positioning server to determine the most frequent value by employing the “majority rule” to predict the user position. The post-processing with the majority rule introduces a negligible computation overhead of 0.03 ms. The detailed experiment results are presented in Table 3.


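Table 3. Summary of experimental results.

| Experiment Index | Time | Direction | Success % Without Post-Processing | Success % With Post-Processing | Gain % |
|---|---|---|---|---|---|
| 1 | Morning | Forward | 88.64 | 97.29 | |
| | | Backward | 84.59 | 94.59 | |
| | | Average | 86.61 | 95.94 | 9.33 |
| 2 | Afternoon | Forward | 86.75 | 95.94 | |
| | | Backward | 83.78 | 94.54 | |
| | | Average | 85.26 | 95.24 | 9.98 |
| 3 | Evening | Forward | 84.05 | 94.59 | |
| | | Backward | 86.75 | 94.59 | |
| | | Average | 85.40 | 94.59 | 9.19 |
| 4 | Morning | Forward | 82.43 | 93.24 | |
| | | Backward | 84.05 | 94.59 | |
| | | Average | 83.24 | 93.91 | 10.67 |
| 5 | Afternoon | Forward | 84.83 | 94.59 | |
| | | Backward | 82.47 | 94.59 | |
| | | Average | 83.65 | 94.59 | 10.94 |
| 6 | Evening | Forward | 84.05 | 93.24 | |
| | | Backward | 85.68 | 94.59 | |
| | | Average | 84.86 | 93.91 | 9.05 |
| 7 | Morning | Forward | 82.61 | 91.89 | |
| | | Backward | 81.74 | 93.24 | |
| | | Average | 82.17 | 92.56 | 10.39 |
| 8 | Afternoon | Forward | 81.58 | 90.54 | |
| | | Backward | 82.33 | 91.89 | |
| | | Average | 81.95 | 91.21 | 9.26 |
| 9 | Evening | Forward | 80.54 | 90.54 | |
| | | Backward | 81.31 | 91.89 | |
| | | Average | 80.92 | 91.21 | 10.29 |
| 10 | Morning | Forward | 83.59 | 93.24 | |
| | | Backward | 84.55 | 94.59 | |
| | | Average | 84.07 | 93.91 | 9.84 |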

The experiment results obtained without employing the post-processing algorithm show a highest success rate of 86.61%, which is 2.35% lower than the simulation result. One reason for this difference is that changes in the environment can result in different propagation paths for Wi-Fi signals and hence greatly affect the RSS fingerprints of APs at each RP. The real-time experiment was performed months after the RSS fingerprint database construction, whereas the test database used for simulation is split from the training database. To solve this issue, we employed the post-processing algorithm described in this work. The post-processing algorithm can mitigate the effects of environment changes by discarding incorrect decisions regarding user position and selecting only the most frequent decision. As shown in Figure 9, the post-processing algorithm improves the results by 9.05 to 10.94 percentage points for the conducted experiments. Furthermore, the variation between multiple experiments is reduced from 5.69% to 4.73% by employing the post-processing algorithm. Other approaches to handle such environment variations will be included in future research.
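The gain and variation figures quoted above can be re-derived from the per-experiment averages in Table 3. The short check below assumes that "gain" is the difference between the two average success percentages and that "variation" is the spread between the best and worst experiment; both readings are inferred from the numbers, not stated in the text.

```python
# Average success percentages of the 10 experiments (Table 3).
without_pp = [86.61, 85.26, 85.40, 83.24, 83.65, 84.86, 82.17, 81.95, 80.92, 84.07]
with_pp = [95.94, 95.24, 94.59, 93.91, 94.59, 93.91, 92.56, 91.21, 91.21, 93.91]

# Gain per experiment: success with post-processing minus success without it.
gains = [round(w - wo, 2) for w, wo in zip(with_pp, without_pp)]

# Variation across experiments: best average minus worst average.
spread_without = round(max(without_pp) - min(without_pp), 2)
spread_with = round(max(with_pp) - min(with_pp), 2)

# Gap between the simulation result (88.96%) and the best raw experiment result.
gap_to_simulation = round(88.96 - max(without_pp), 2)
```

Under these assumptions the computed gains range from 9.05 to 10.94, the spreads are 5.69 and 4.73, and the gap to the simulation result is 2.35, in agreement with the values reported in the text and in the last column of Table 3.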


Figure 9. Comparison between the experimental results with and without post-processing.


4. Conclusions

In this work, we propose a server-based, user-active Wi-Fi fingerprint-based indoor positioning system using a deep learning classifier. The proposed system is equipped with data pre- and post-processing algorithms to reduce the performance degradation caused by limitations in the RSS fingerprint database and the indoor environment. Programs for fingerprint database construction using smartphones, data pre-processing, the deep learning classifier, data post-processing, and the online real-time experiment are developed. The proposed system is implemented and examined in a real-time indoor environment with 74 target locations. The simulation results show that the pre-processing algorithm can efficiently fill in missing RSS fingerprints in the database, resulting in a success rate of 88.96% with an error margin of 2 RPs. The proposed system equipped with the data post-processing algorithm is further verified with online real-time experiments. The proposed system is able to provide an indoor positioning precision of 4 m with a 95.94% success rate. Future work will focus on finding the reasons for the gap between the simulation and the real-time experiment, and on methods to increase the positioning performance of the proposed system.

Author Contributions: A.H., Y.W., and S.-H.H. contributed to the main idea of this research work. A.H. performedthe simulations, experiments and database collection. S.L. performed experiments for the work. The researchactivity was planned and executed under the supervision of S.-H.H. A.H., and S.-H.H. contributed to the writingof this article.

Funding: This work was supported by the Technology Innovation Program (or Industrial Strategic TechnologyDevelopment Program-Industrial site core technology Early development project) (10084909, Development ofimproved RTLS engine and IoT tag using machine learning technique reflecting real environments) funded by theMinistry of Trade, Industry, and Energy (MOTIE, Korea)

Acknowledgments: The authors would like to offer their sincere gratitude to Mr. Sang Moon Lee, CTO of JMP Systems, Korea, for providing the equipment for database collection and setting up the experimental environment.

Conflicts of Interest: The authors declare no conflicts of interest regarding the publication of this article.

References

1. Basri, C.; El Khadimi, A. Survey on indoor localization system and recent advances of WiFi fingerprintingtechnique. In Proceedings of the 5th International Conference on Multimedia Computing and Systems(ICMCS), Marrakech, Morocco, 29 September–1 October2016; pp. 253–259.

2. Zhuang, Y.; Li, Y.; Qi, L.; Lan, H.; Yang, J.; El-Sheimy, N. A two-filter integration of MEMS sensors and Wi-Fifingerprinting for indoor positioning. IEEE Sens. J. 2016, 13, 5125–5126. [CrossRef]

Page 15: Pre- and Post-Processing Algorithms with Deep Learning

Electronics 2019, 8, 195 15 of 17

3. Zou, H.; Chen, Z.; Jiang, H.; Xie, L.; Spanos, C. Accurate indoor localization and tracking using mobile phoneinertial sensors, Wi-Fi and iBeacon. In Proceedings of the IEEE International Symposium on Inertial Sensorsand Systems (INERTIAL), Kauai, HI, USA, 27–30 March 2017; pp. 1–4.

4. Zhang, P.; Zhao, Q.; Li, Y.; Niu, X.; Zhuang, Y.; Liu, J. Collaborative WiFi fingerprinting using sensor-basednavigation on smartphones. Sensors 2015, 7, 17534–17557. [CrossRef]

5. He, S.; CHAN, S.-H.G. Wi-Fi fingerprint-based indoor positioning: Recent advances and comparisons.IEEE Commun. Surv. Tutorials 2016, 18, 466–490. [CrossRef]

6. Hantoush, R. Evaluating Wi-Fi indoor positioning approaches in a real world environment. Ph.D. Thesis,Universidade NOVA de Lisboa, Lisbon, Portugal, 2016.

7. Xia, S.; Liu, Y.; Yuan, G.; Zhu, M.; Wang, Z. Indoor fingerprint positioning based on Wi-Fi: An overview.ISPRS Int. J. Geo-Inf. 2017, 6, 135. [CrossRef]

8. Bozkurt, S.; Elibol, G.; Gunal, S.; Yayan, U. A comparative study on machine learning algorithms for indoorpositioning. In Proceedings of the International Symposium on Innovations in Intelligent Systems andApplications (INISTA), Madrid, Spain, 2–4 September 2015; pp. 1–8.

9. Mascharka, D.; Manley, E. Machine learning for indoor localization using mobile phone-based sensors.Available online: https://arxiv.org/abs/1505.06125 (accessed on 25 November 2018).

10. Zou, H.; Lu, X.; Jiang, H.; Xie, L. A fast and precise indoor localization algorithm based on an onlinesequential extreme learning machine. Sensors 2015, 1, 1804–1824. [CrossRef] [PubMed]

11. Adege, A.; Lin, H.P.; Tarekegn, G.; Jeng, S.S. Applying Deep Neural Network (DNN) for Robust IndoorLocalization in Multi-Building Environment. Appl. Sci. 2018, 7, 1062. [CrossRef]

12. Kodippili, N.S.; Dileeka, D. Integration of fingerprinting and trilateration techniques for improved indoorlocalization. In Proceedings of the Seventh International Conference on Wireless and Optical CommunicationNetworks (WOCN), Colombo, Sri lanka, 6–8 September 2010; pp. 1–6.

13. Chan, S.; Sohn, G. Indoor localization using Wi-Fi based fingerprinting and trilateration techniques forLBS applications. In Proceedings of the 7th International Conference on 3D Geoinformation, Quebec, QC,Canada, 16–17 May 2012.

14. Suining, H.; Tianyang, H.; Chan, S.-H.G. Contour-based trilateration for indoor fingerprinting localization.In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, Seoul, South Korea,1–4 November 2015; pp. 225–238.

15. Zayets, A.; Steinbach, E. Robust Wi-Fi-based indoor localization using multipath component analysis.In Proceedings of the IEEE International Conference on Indoor Positioning and Indoor Navigation (IPIN),Sapporo, Japan, 18–21 September 2017; pp. 1–8.

16. Laitinen, H.; Juurakko, S.; Lahti, T.; Korhonen, R.; Lahteenmaki, J. Experimental Evaluation of LocationMethods Based on Signal-Strength Measurements. IEEE Trans. Veh. Technol. 2007, 56, 287–296.

17. Chen, K.; Wang, C.; Yin, Z.; Jiang, H.; Tan, G. Slide: Towards fast and accurate mobile fingerprinting forWi-Fi indoor positioning systems. IEEE Sens. J. 2018, 3, 1213–1223. [CrossRef]

18. Shu, Y.; Huang, Y.; Zhang, J.; Coué, P.; Cheng, P.; Chen, J.; Shin, K.G. Gradient-based fingerprinting forindoor localization and tracking. IEEE Trans. Ind. Electron. 2016, 63, 2424–2433. [CrossRef]

19. Fang, S.H.; Cheng, Y.C.; Chien, Y.R. Exploiting Sensed Radio Strength and Precipitation for ImprovedDistance Estimation. IEEE Sens. J. 2018, 18, 6863–6873. [CrossRef]

20. Zhuang, Y.; Syed, Z.; Li, Y.; El-Sheimy, N. Evaluation of two WiFi positioning systems based on autonomouscrowdsourcing of handheld devices for indoor navigation. IEEE Trans. Mob. Comput. 2016, 15, 1982–1995.[CrossRef]

21. Chen, L.; Li, B.; Zhao, K.; Rizos, C.; Zheng, Z. An improved algorithm to generate a Wi-Fi fingerprintdatabase for indoor positioning. Sensors 2013, 8, 11085–11096. [CrossRef] [PubMed]

22. Pivato, P.; Palopoli, L.; Petri, D. Accuracy of RSS-Based Centroid Localization Algorithms in an IndoorEnvironment. IEEE Trans. Instrum. Meas. 2011, 60, 11085–11096. [CrossRef]

23. Joaquin, T.S.; Antonio, R.J.; Adriano, M.; Tomás, L.; Lu, M.C.; Knauth, S.; Germán, M.M.S.; Fernando, S.;Antoni, P.N.; Maria, J.N.; et al. Off-Line Evaluation of Mobile-Centric Indoor Positioning Systems: TheExperiences from the 2017 IPIN Competition. Sensors 2018, 18, 1–27.

Page 16: Pre- and Post-Processing Algorithms with Deep Learning

Electronics 2019, 8, 195 16 of 17

24. Liqun, L.; Jacky, S.; Chunshui, Z.; Thomas, M.; Lin, J.H.; Feng, Z. Experiencing and handling the diversity indata density and environmental locality in an indoor positioning service. In Proceedings of the 20th AnnualInternational Conference on Mobile Computing and Networking (Mobicon), Maui, HI, USA, 7–11 September2014; pp. 459–470.

25. Ling, G.; Lei, G.; Nour, E.D.; Chengwu, L. Statistical Machine Learning vs Deep Learning in InformationFusion: Competition or Collaboration? In Proceedings of the IEEE Conference on Multimedia InformationProcessing and Retrieval (MIPR), Miami, FL, USA, 10–12 April 2018; pp. 251–256.

26. Zheng, Y. Methodologies for Cross-Domain Data Fusion: An Overview. IEEE Trans. Big Data 2015, 1, 16–34.[CrossRef]

27. Qingchen, Z.; Laurence, T.Y.; Zhikui, C. Deep Computation Model for Unsupervised Feature Learning onBig Data. IEEE Trans. Serv. Comput. 2016, 9, 161–171.

28. Jedari, E.; Wu, Z.; Rashidzadeh, R.; Saif, M. Wi-Fi based indoor location positioning employing randomforest classifier. In Proceedings of the IEEE International Conference on Indoor Positioning and IndoorNavigation (IPIN), Banff, AB, Canada, 13–16 October 2015; pp. 1–5.

29. Gomes, R.; Ahsan, M.; Denton, A. Random Forest Classifier in SDN Framework for User-Based IndoorLocalization. In Proceedings of the IEEE International Conference on Electro/Information Technology (EIT),Rochester, MI, USA, 3–5 May 2018; pp. 537–542.

30. Nuño-Maganda, M.; Herrera-Rivas, H.; Torres-Huitzil, C.; Marisol Marín-Castro, H.; Coronado-Pérez, Y.On-Device Learning of Indoor Location for Wi-Fi Fingerprint Approach. Sensors 2018, 7, 2202. [CrossRef]

31. Wang, X.; Gao, L.; Mao, S.; Pandey, S. DeepFi: Deep learning for indoor fingerprinting using channel stateinformation. In Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC),New Orleans, LA, USA, 9–12 March 2015; pp. 1666–1671.

32. Nowicki, M.; Wietrzykowski, J. Low-Effort Place Recognition with WiFi Fingerprints Using Deep Learnin; Springer:Cham, Switzerland, 2017; pp. 575–584.

33. Torres-Sospedra, J.; Montoliu, R.; Martínez-Usó, A.; Avariento, J.P.; Arnau, T.J.; Benedito-Bordonau, M.;Huerta, J. UJIIndoorLoc: A new multi-building and multi-floor database for WLAN fingerprint-based indoorlocalization problems. In Proceedings of the IEEE International Conference on Indoor Positioning and IndoorNavigation (IPIN), Busan, South Korea, 27–30 October 2014; pp. 261–270.

34. Kudo, T.; Miura, J. Utilizing Wi-Fi signals for improving SLAM and person localization. In Proceedings ofthe IEEE/SICE International Symposium on System Integration (SII), Taipei, Taiwan, 11–14 December 2017;pp. 487–493.

35. Depatla, S.; Muralidharan, A.; Mostofi, Y. Occupancy estimation using only WiFi power measurements. IEEEJ. Sel. Areas Commun. 2015, 33, 1381–1393. [CrossRef]

36. Cai, X.; Li, X.; Yuan, R.; Hei, Y. Identification and mitigation of NLOS based on channel state information forindoor Wi-Fi localization. In Proceedings of the IEEE International Conference on Wireless Communications& Signal Processing (WCSP), Nanjing, China, 15–17 October 2015; pp. 1–5.

37. Dey, S.; Chugg, K.M.; Beerel, P.A. Morse Code Datasets for Machine Learning. In Proceedings of the9th International Conference on Computing, Communication and Networking Technologies (ICCCNT),Banglore, India, 10–12 July 2018; pp. 1–7.

38. Gharatkar, S.; Ingle, A.; Naik, T.; Save, A. Review pre-processing using data cleaning and stemmingtechnique. In Proceedings of the IEEE International Conference on Innovations in Information, Embeddedand Communication Systems (ICIIECS), Coimbatore, India, 17–18 March 2017; pp. 1–4.

39. Gu, Y.; Lo, A.; Niemegeers, I. A survey of indoor positioning systems for wireless personal networks. IEEECommun. Surv. Tutor. 2009, 11, 13–32. [CrossRef]

40. Eisa, S.; Peixoto, J.; Meneses, F.; Moreira, A. Removing useless APs and fingerprints from WiFi indoorpositioning radio maps. In Proceedings of the IEEE International Conference on Indoor Positioning andIndoor Navigation (IPIN), Montbeliard-Belfort, France, 28–31 October 2013; pp. 1–7.

41. JoshvaDevadas, T.; Seelammal, C.; Sadasivam, S. On data cleaning with intelligent agents to improve theaccuracy of wi-fi positioning system using gis. Asian J. Sci. Res. 2013, 6, 53–66. [CrossRef]

42. Song, C.; Wang, J. WLAN Fingerprint Indoor Positioning Strategy Based on Implicit Crowdsourcing andSemi-Supervised Learning. ISPRS Int. J. Geo-Info. 2017, 6, 356. [CrossRef]

43. Bogdándy, B. Wi-Fi RSSI pre-processing library for Android. In Proceedings of the 19th InternationalCarpathian Control Conference (ICCC), Szilvasvarad, Hungary, 28–31 March 2018; pp. 649–654.

Page 17: Pre- and Post-Processing Algorithms with Deep Learning

Electronics 2019, 8, 195 17 of 17

44. Bruha, I.; Famili, A. Postprocessing in machine learning and data mining. ACM SIGKDD Explor. Newslett.2000, 2, 110–114. [CrossRef]

45. Tf.reshape.Tensorflow.org. Available online: https://www.tensorflow.org/api_docs/python/tf/reshape(accessed on 28 December 2018).

46. Liu, H.; Darabi, H.; Nanerjee, P.; Liu, J. Survey of wireless indoor positioning techniques and systems.IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2007, 37, 1067–1080. [CrossRef]

47. Hwang, S.H. Pre- and Post-Processing Algorithms with Deep Learning Classifier for Wi-Fi Fingerprint-BasedIndoor Positioning. Available online: https://zenodo.org/record/2556161#.XFa40JCP600 (accessed on3 February 2019).

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).