

Proceedings of the 14th APAN Research Workshop 2017
28th August 2017
ISBN 978-4-9905448-7-4

Dalian International Finance Conference Center, Dalian, China

Editors-in-Chief:
Yoshiaki Kasahara, Kyushu University, Japan
Teck Chaw Ling, University of Malaya, Malaysia

Publisher: Asia-Pacific Advanced Network (APAN)

APAN-RW 2017 Committee Members

Workshop Co-Chairs and Technical Program Committee Co-Chairs

Yoshiaki Kasahara, Kyushu University, Japan

Teck Chaw Ling, University of Malaya, Malaysia

Publicity and Local Arrangement

Manesha Hettipathirana, APAN Secretariat

N. S. Weerakoon, APAN Secretariat

Technical Program Committee

J. Adinarayana, IIT Bombay, India
Navaneethan C Arjuman, NLTVC Sdn. Bhd., Malaysia
Chaodit Aswakul, Chulalongkorn University, Thailand
Jun Bi, Tsinghua University, China
Nevil Brownlee, The University of Auckland, New Zealand
ByungRae Cha, GIST, Korea
Chalermpol Charnsripinyo, NECTEC, Thailand
Boncheol Goo, KAIST GSCT, Korea
Shigeki Goto, Waseda University, Japan
Ho Seong Han, Seoul National University Bundang Hospital, Korea
Dongsoo Har, KAIST, Korea
Andrew Howard, Australian National University, Australia
Eiji Kawai, NICT, Japan
JongWon Kim, GIST, Korea
Takuji Kiura, NARO, Japan
Yong-moo Kwon, KIST, Korea
HyunYong Lee, ETRI, Korea
Eueung Mulyana, Institut Teknologi Bandung, Indonesia
Motonori Nakamura, National Institute of Informatics, Japan
Suhaimi Napis, Universiti Putra Malaysia, Malaysia
Faridah Noor, University of Malaya, Malaysia
Koji Okamura, Kyushu University, Japan
Sanghoon Park, Hanwha Thales, Korea
Sun Park, GIST, Korea
Rungsun Rerknimitr, Chulalongkorn University, Thailand
Shuji Shimizu, Kyushu University Hospital, Japan
Wang-Cheol Song, Jeju National University, Korea
Kei Tanaka, NARO, Japan
Nguyen Huu Thanh, Hanoi University of Science and Technology, Vietnam
Denis Villorente, Department of Science and Technology, Philippines
Ye-Nu Wan, NCHU, Taiwan
Yufeng Xin, RENCI, USA
Ma Yan, Beijing University of Posts and Telecommunications, China
Kitamura Yasuichi, Kyushu University Hospital, Japan
Eric Yen, Academia Sinica Grid Computing Centre, Taiwan
Shigetoshi Yokoyama, National Institute of Informatics, Japan
XinWen Yu, Chinese Academy of Forestry, China


Contents

Session 1 – Security and Identity Management

1. Discriminating DRDoS Packets using Time Interval Analysis
   Daiki Noguchi, Tatsuya Mori (Waseda Univ., Japan), Yota Egusa, Kazuya Suzuki (SAKURA Internet Inc., Japan), and Shigeki Goto (Waseda Univ., Japan) (p. 1)

2. Zero-day Malicious Email Behavior Investigation and Analysis
   Sanouphab Phomkeona (Kyushu Univ., Japan), Kristan Edwards (Univ. of Queensland, Australia), Yoshitatsu Ban (Human Techno System Inc., Japan), and Koji Okamura (Kyushu Univ., Japan) (p. 8)

3. A Proof of Stake Sharding Protocol for Scalable Blockchains
   Y. Gao and H. Nobuhara (Univ. of Tsukuba, Japan) (p. 13)

4. Identity Authentication and Data Access Authorization in Regional Education Informatization
   Qi Feng, ZhongLin Chen, FuKe Shen (East China Normal Univ., China), and YuHong Zhu (Shanghai Municipal Education Commission, China) (p. 17)

Session 2 – Cloud/HPC and Miscellaneous

5. Prototyping Workload-based Resource Configuration for Cloud-based HPC/BigData Cluster Sharing
   Namgon Lucas Kim and JongWon Kim (Gwangju Institute of Science and Technology, Korea) (p. 24)

6. NAMD Benchmarking on Publicly Available Philippine Computational Resources
   Ronny Cheng, Ren Tristan Dela Cruz, Francoise Neil Dacanay, Gil Claudio, and Ricky Nellas (Univ. of the Philippines Diliman, Philippines) (p. 29)

7. Comparison of Service Description and Composition for Complex 3-Tier Cloud-based Services
   Moonjoong Kang and Jongwon Kim (Gwangju Institute of Science and Technology, Korea) (p. 37)

8. Future Projections Ship Accessibility for the Arctic Ocean based on IPCC CO2 Emission Scenarios
   Jai-Ho Oh, Sinil Yang (Pukyong National Univ., Korea), and Byong-Lyol Lee (World Meteorological Organization, UN) (p. 41)

Session 3 – Networking

9. Design and Development of the Reactive BGP Peering in Software-Defined Routing Exchanges
   Hao-Ping Liu, Pang-Wei Tsai, Wu-Hsien Chang, and Chu-Sing Yang (National Cheng Kung Univ., Taiwan) (p. 48)

10. Experimental Tests for Outage Analysis in SISO Li-Fi Indoor Communication Environment
    Atchutananda Surampudi, Sankalp Chapalgaonkar (Indian Institute of Technology Madras, India), and Paventhan Arumugam (ERNET, India) (p. 54)

11. Zigbee-based Home Automation and Agricultural Monitoring System
    Rakesh Jha, Shivam Khare, Rahul Sharma, Anubhav Tewari, and Ankit Tyagi (Shri Mata Vaishno Devi Univ., India), and Shubha Jain (SGSITS, Indore, M.P., India) (p. 61)

12. Effective Evacuation Route Strategy during Natural Disaster
    K-zin Phyo and Myint Myint Sein (University of Computer Studies Yangon, Myanmar) (p. 70)


Proceedings of the APAN – Research Workshop 2017, ISBN 978-4-9905448-7-4

Discriminating DRDoS Packets using Time Interval Analysis

Daiki Noguchi, Tatsuya Mori, Yota Egusa, Kazuya Suzuki, and Shigeki Goto

Abstract— The Distributed Reflection Denial of Service (DRDoS) attack represents a critical security threat. Because such attacks generate unidirectional traffic, it is difficult for the targets to protect themselves. To mitigate such attacks, defense mechanisms must be installed on backbone networks to detect and block the attack traffic before it reaches the final destination. Conventional approaches monitor the traffic volume and assume that an attack is in progress if the observed volume exceeds a certain threshold. However, this simple approach allows the attacker to evade detection by adjusting the traffic volume. In this study, we propose a novel approach that accurately detects DRDoS attacks using the time intervals between arriving packets. We applied a K-means clustering algorithm to identify the appropriate threshold value. The proposed algorithm was implemented at a real data center, and the results demonstrate the high level of accuracy that our approach can achieve.

Index Terms—DDoS, DRDoS, NTP, Time interval, K-means

I. INTRODUCTION

Distributed Denial of Service (DDoS) attacks pose a severe security threat. Distributed Reflection Denial of Service (DRDoS) is a sophisticated form of DDoS that makes use of open servers. There are four prominent types of DRDoS attack, using the protocols CharGen, DNS, NTP, and SSDP [1]. When these User Datagram Protocol (UDP)-based attacks are generated by DDoS-as-a-service providers, they are known as Booters [2].

Several mitigation techniques are available. Li et al. report that cloud services may be used as botnets [3], allowing attackers to expand the scale of an attack at low cost, and propose a defense called srcTrace. Arbor Networks offers Peakflow SP, a product for service providers that monitors huge amounts of backbone traffic to detect malicious packets [4]. These approaches assume that malicious packets will produce a large traffic volume. In reality, however, some attacks are not easy to detect by measuring traffic volumes. For example, the packet size of an HTTP GET request is small, while the response from the Web server is lengthy [5]. The UDP query packets that invoke DRDoS attacks are also small.

Daiki Noguchi and Shigeki Goto are with the Department of Computer Science and Engineering, Waseda University, Shinjuku, Tokyo 169-8555 Japan (see http://www.goto.info.waseda.ac.jp). Yota Egusa and Kazuya Suzuki are with SAKURA Internet Inc., Grand Front Osaka Tower A 35F, 4-20 Ofukacho, Kita-ku, Osaka-shi, Osaka 530-0011 Japan, e-mail: y-egusa AT sakura.ad.jp (Yota Egusa), ka-suzuki AT sakura.ad.jp (Kazuya Suzuki).

Manuscript received June 25, 2017.

In this study, we propose a novel approach to detecting DRDoS attacks that use small packets. The key is to leverage time interval analysis: the proposed method compares the time intervals between packets, and after removing outliers, applies a K-means clustering algorithm to the intervals and derives a threshold value.

The rest of this paper is organized as follows. Section II describes the mechanism and introduces the terminology of DDoS and DRDoS attacks. Section III discusses existing methods relevant to the current study. In Section IV, we set out our goals. The proposed method is described in Section V. Section VI reports on an evaluation of the method, carried out in a data center. Some further issues are explored in Section VII, and our conclusions are presented in Section VIII.

II. DDOS AND DRDOS ATTACKS

In Q4 of 2016, DRDoS attacks occurred more frequently than SYN flood attacks. Table I shows examples of DRDoS attacks, which are characterized by their large traffic volumes. In 2014, NTP amplification attacks attracted attention because they generated catastrophic traffic volumes using ordinary NTP servers that are open to the Internet.

Figure 1 shows the mechanism of a DRDoS attack. Such attacks use UDP-based protocols such as CharGen, DNS, NTP, or SSDP, whose responses are much longer than the short queries.


TABLE I
LARGE-VOLUME DRDOS ATTACK INCIDENTS

Date       Protocol  Traffic rate [Gbps]
Mar. 2013  DNS       300
May 2013   DNS       167
Feb. 2014  NTP       400
Aug. 2015  RPC       100


The packet size is amplified by the reflector. For example, the monlist command of the NTP protocol replies to a query with the communication history of, at most, 600 devices. This offers a convenient tool for attackers seeking packet-size amplification. Servers that reply to queries from the Internet are called reflectors.

Fig. 1. Mechanism of DRDoS attacks.

The attacker sends malicious queries to these reflectors while spoofing the source IP address as that of a victim. The reflector then unwittingly returns large-scale responses to that address.

A large number of reflectors exist worldwide, most of which are improperly configured or use a default setting. Several organizations attempt to identify vulnerable servers on the Internet, then notify their administrators [10].

III. RELATED WORK

A DRDoS attack works in one of two ways. In the first, a huge number of packets is generated, occupying the entire bandwidth of certain communication links. This can be detected by measuring the traffic volumes across the links; the detection is typically implemented in an IDS (Intrusion Detection System), which triggers an alarm if a traffic threshold is exceeded.

A second DRDoS attack type uses a small query packet that is answered by a longer reply packet. If the number of query packets is large, the computational resources of a server, including memory, CPU, or process tables, may be overwhelmed. This form of attack is difficult to detect by measuring traffic volumes.

Our earlier work, reported in [11], analyzed the time intervals between DRDoS packets, covering the CharGen, DNS, NTP, and SSDP protocols. These time intervals were used to characterize each attack. The current study extends this approach by applying a clustering method to discriminate between DRDoS attacks and normal communications. Hayashi et al. proposed a time-interval-based method for mitigating HTTP GET request flood attacks on backbone networks [12]. Their approach uses two threshold parameters, $T_{th}$ and $D_{th}$: if two packets arrive at the server within a time $T_{th}$, they are regarded as successive (see Figure 2), and if a series of successive packets lasts longer than $D_{th}$, it is treated as suspicious. However, no specific values for $T_{th}$ and $D_{th}$ were given. In this study, we propose a method for determining the values of $T_{th}$ and $D_{th}$, and we implement it in a data center to assess its performance.

Fig. 2. Packets judged to be successive.
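To make the two-threshold rule concrete, the following is a minimal Python sketch of the test, assuming packet arrival times in seconds; the function name and the sample values are ours, and the $T_{th}$ and $D_{th}$ arguments here are placeholders, not the values derived later in the paper.

    def suspicious_runs(arrival_times, t_th, d_th):
        """Group packets whose inter-arrival gap is <= t_th into runs of
        successive packets, and flag runs lasting longer than d_th."""
        runs = []
        start = prev = arrival_times[0]
        for t in arrival_times[1:]:
            if t - prev > t_th:        # gap too long: the current run ends
                runs.append((start, prev))
                start = t
            prev = t
        runs.append((start, prev))
        return [(s, e) for (s, e) in runs if e - s > d_th]

    # Example: one long run of closely spaced packets is flagged.
    times = [0.00, 0.01, 0.02, 0.03, 5.00]
    print(suspicious_runs(times, t_th=0.1, d_th=0.02))   # [(0.0, 0.03)]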

Li et al. [3] used the entropy of a network flow to detect an attacking flow. This assumes that the transmission rate of an attack flow will be larger than that of a legitimate flow. The current study makes no such assumption; instead, the analysis is based only on the intervals between packet arrival times.

IV. PROBLEM TO BE SOLVED

A. Environment of a Data Center

The configuration of the data center implementation is shown in Figure 3. This center has its own AS number. The edge router forwards packets between the outside AS and the inside AS.

Fig. 3. Configuration of the observation point.

The packets were observed at a certain sampling rate at the edge router. Before the data analysis, the source and destination IP addresses were anonymized using a hash algorithm to protect privacy. Only the headers of the packets were observed. Observations were conducted over the period from November 9, 2016 to December 11, 2016.

The packets were represented in 5-tuple flow format (src_IP, dst_IP, src_port, dst_port, protocol). Sampling was conducted randomly, at a rate of ten flows per hour. The key data were the packet lengths and the intervals between packet arrivals. Flows whose UDP source ports were numbered 17, 19, 53, 111, 123, 137, 161, 1900, 3000, or 27960 were picked up, forming the UDP Port List; these ports are widely used in DRDoS attacks. We also picked up flows with TCP source port 80, as these are sent by Web servers. Potentially abnormal flows were identified by combining the source and destination port numbers, as shown in Table II.
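A minimal sketch of this port-based labeling, using the UDP Port List quoted above; the function name is ours, and the rule follows Table II.

    # DRDoS-prone UDP source ports listed in the text.
    UDP_PORT_LIST = {17, 19, 53, 111, 123, 137, 161, 1900, 3000, 27960}

    def is_abnormal(src_port, dst_port):
        """Table II: a flow is abnormal when port 80 (HTTP) faces a
        DRDoS-prone UDP port; flows toward ephemeral ports are normal."""
        return (src_port == 80 and dst_port in UDP_PORT_LIST) or \
               (src_port in UDP_PORT_LIST and dst_port == 80)

    print(is_abnormal(80, 123))     # True  (HTTP <-> NTP)
    print(is_abnormal(123, 50000))  # False (NTP -> ephemeral port)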


Table III gives a breakdown of the observed flows used in the study. Other UDP flows were also observed, but these non-NTP packets were rare; our analysis therefore focused on the NTP flows.

B. Preliminary Investigation

Since our analysis is based on the time intervals between incoming packets in a flow, the distribution of time intervals is important. Figure 4 shows the time intervals, organized by the ranges shown in Table IV. Significant differences were observed. Note that the Y-axis of Figure 4 uses a logarithmic scale, and each value differs by at least one order of magnitude from the next. A wide range of time intervals can be seen, and at least five groups could be distinguished. However, it was not clear whether this division by order of magnitude was appropriate. We therefore applied a K-means algorithm with n = 5 to the packet intervals to form more robust clusters.

Fig. 4. Packet intervals grouped by five conditions.

C. Identifying a Suspicious Flow

The goal of the study was to propose a new method for discriminating between abnormal and normal flows. The output is therefore labeled either Suspicious or Non-suspicious; the possible outcomes are shown in Table V.

Our proposed method identifies a flow as suspicious if the time spanned by successive packets in the flow exceeds a certain threshold. We discuss the determination of this value in the following section.

The proposed method does not use the port numbers to judge whether a sequence of packets is suspicious, because new protocols or port numbers may be used at some future date, allowing attacks to resume. Instead, our method uses only the time interval between arriving packets.

V. PROPOSED METHOD

The proposed method has three steps, as shown in Figure 5. In step one, outlier values are removed from each flow. In step two, the packet intervals in each flow are classified. In step three, the threshold value indicating a suspicious flow is derived.

Fig. 5. Flow Chart of the proposed method.

A. STEP 1: Removal of Outliers

Outliers may exist in a flow that contains a pause. For example, an attacker may intentionally insert a long pause between certain packets to evade detection. Such outliers would constitute noise when calculating the threshold values in step two.

TABLE II
DEFINITION OF ABNORMAL FLOWS BY PORT NUMBERS

Src \ Dst       80        UDP Port List  Over 49151
80              -         Abnormal       Normal
UDP Port List   Abnormal  -              Normal

TABLE III
BREAKDOWN OF OBSERVED FLOWS

            Normal  Abnormal  Total
NTP flow    3,640   14        3,654
HTTP flow   2,311   0         2,311

TABLE V
POSSIBLE OUTCOMES

                Suggest Suspicious    Suggest Non-suspicious
Abnormal Flow   True Positive (TP)    False Negative (FN)
Normal Flow     False Positive (FP)   True Negative (TN)

TABLE IV
DESCRIPTION OF GROUPS

Group Number  Condition
1             i < 0.1 s
2             0.1 s ≤ i < 1 s
3             1 s ≤ i < 10 s
4             10 s ≤ i < 100 s
5             100 s ≤ i


Any outliers were therefore removed using the ChangeFinder algorithm [14], shown in Figure 6.

Fig. 6. Two-phase learning process of the ChangeFinder algorithm.

ChangeFinder applies two-phase learning. First, the input data are analyzed using an autoregressive (AR) model; then sequentially discounting AR (SDAR) learning is applied. The AR model is given by Equation (1):

$$y_t = \sum_{i=1}^{o} a_i\, y_{t-i} + w \qquad (1)$$

where $y_t$ is the packet-interval data, $a_i$ is an AR coefficient, $o$ is the model order, and $w$ is white noise following a zero-mean distribution. ChangeFinder learns the probability density $p_t$ of the packet intervals $x_t$ by applying the SDAR algorithm, then calculates the outlier score $m(x_t)$ using Equation (2):

$$m(x_t) = -\log p_{t-1}(x_t \mid x^{t-1}) \qquad (2)$$

Here, $p_{t-1}(x_t \mid x^{t-1})$ is the conditional density of $x_t$ under the learned stochastic process, and $x^{t-1}$ is the series $(x_1, x_2, \ldots, x_{t-1})$. The score indicates the degree of separation between the value predicted by the AR model and $x_t$. The AR model mainly assumes stationary data, so an important feature of ChangeFinder is its ability to handle non-stationary data by applying the SDAR algorithm. Figure 7 shows how ChangeFinder detects outliers in the data. The outlier scores are normalized regardless of the data range; in this study, we set a threshold value of 50 for the detection of outliers.

Fig. 7. Packet intervals and outlier scores.
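As an illustration of STEP 1, here is a minimal sketch using the open-source changefinder package, an SDAR-based implementation; this choice of package and its parameter values are our assumptions (the paper cites the algorithm [14] and NTT DATA MSI's tool [15], not this package), and only the score threshold of 50 comes from the text.

    import changefinder   # pip install changefinder (third-party SDAR implementation)

    def remove_outliers(intervals, score_threshold=50.0):
        """Drop packet intervals whose outlier score exceeds the
        threshold (the paper uses 50)."""
        cf = changefinder.ChangeFinder(r=0.01, order=1, smooth=5)
        kept = []
        for x in intervals:
            score = cf.update(x)       # outlier score m(x_t), as in Eq. (2)
            if score < score_threshold:
                kept.append(x)
        return kept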

B. STEP 2: Clustering by K-means

The K-means clustering algorithm [16] was used to discriminate successive packet sequences from terminated packet sequences. The two threshold values, $T_{th}$ and $D_{th}$, were determined in Step 3. The K-means algorithm attempts to minimize the value of Equation (3):

$$\sum_{i=1}^{n} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2 \qquad (3)$$

Here, $C_i$ $(1 \le i \le n)$ represents a cluster, $n$ is the number of clusters, $x$ is an element of the cluster, and $\mu_i$ is the centroid of the cluster. The K-means algorithm selects an appropriate $\mu_i$ for each cluster so as to minimize the value of Equation (3).

We used the K-means implementation in the scikit-learn package [18] and applied it to each normal and abnormal flow. The clusters were ranked in ascending order of their maximum interval. The values Q, R, and U in Table VI were then calculated for each cluster, and the value of U was used to compare the clusters. When calculating Q, we rounded the packet interval times to two decimal places.

The cluster whose U value (= Q × R) is largest is likely to include a large number of intervals between successive packets. Li [3] noted an apparent difference between short packet intervals and long ones. Our observations confirmed this.
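A minimal sketch of STEP 2, using scikit-learn's KMeans as the paper does [18]; the Q, R, and U values follow Table VI, while the variable and function names are ours.

    import numpy as np
    from sklearn.cluster import KMeans

    def rank_clusters(intervals, n_clusters=5):
        """Cluster one flow's packet intervals and score each cluster by
        U = Q * R: Q is the count of the most frequent interval (rounded
        to two decimal places) and R is the cluster size."""
        X = np.asarray(intervals).reshape(-1, 1)
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
        ranked = []
        for c in range(n_clusters):
            members = X[labels == c].ravel()
            _, counts = np.unique(np.round(members, 2), return_counts=True)
            q, r = int(counts.max()), len(members)
            ranked.append((float(members.max()), q * r))   # (max interval, U)
        return sorted(ranked)          # ascending by maximum interval

Under this scoring, the cluster with the largest U plays the role of $C_{succ}$ in the text.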

Fig. 8. Example distribution of packet intervals in different clusters.

Within a flow there are five clusters, each of which is a set of packet intervals. Figure 8 shows a cumulative distribution (CDF) graph of the packet intervals, in which gradient lines connect the points at CDF values of 0.1 and 0.9. In the example, there are five interval distributions, one per cluster, with different gradients. Combining all the flows and plotting the CDFs and gradients produced Figure 9, which also shows the distribution and gradient of the merged clusters 2, 3, 4, and 5.


TABLE VI
DESCRIPTION OF VALUES FOR DETECTION

Value  Description
Q      Maximum number of duplicated packet intervals in the cluster
R      Total number of packet intervals in the cluster
U      Q × R


Fig. 9. Distribution of gradients of all flows.

The gradient line is steep if the distribution of packet intervals is concentrated. As shown in Figure 9, the CDF of cluster 1, for both normal and suspicious flows, was larger than the CDF of the merged clusters 2, 3, 4, and 5. This suggests that shorter packet intervals were concentrated there, although longer packet intervals were also present. Cluster 1 was therefore assumed to include a large number of packet intervals. However, as it was not clear that the same result would hold in every flow, the value Q was used.

Because short intervals were more numerous than long intervals, as can be seen from Figure 4, we also adopted the value R; the figure shows that the majority of intervals were small.

As noted above, successive intervals should be short. We named the cluster with the largest U $C_{succ}$, and the neighboring cluster whose maximum interval is larger, $C_d$. This is shown in Figure 10. Most of the packet intervals in $C_d$ were not successive.

Fig. 10. Examples of clusters and packet intervals.

In Figure 10, interval $i_E$ falls within $C_{succ}$; because $C_{succ}$ has the largest U, $i_E$ is a successive interval. Interval $i_F$ falls within $C_d$ and is therefore non-successive.

C. STEP 3: Determining the Threshold Values

To detect suspicious flows, the optimal threshold $D_{th}$ must be found.

1) Longest interval time $T_{th}$ for successive packets: We set $T_{th}$ to the maximum packet interval time in cluster $C_{succ}$. This threshold value is used in step 2).

2) Formation of successive packet sequences: For each flow, we collected all the interval times from cluster 1 through $C_{succ}$ and formed successive sequences of packets whose gaps fell within the maximum interval time $T_{th}$ from step 1). We then calculated the total duration of a sequence as the sum of its successive arrival intervals. For a normal flow, we used all successive sequences; for an abnormal flow, we used the maximum successive arrival time.

3) Fixing $D_{th}$: Finally, we calculated the distributions of the successive duration times of the normal and abnormal flows, represented as CDFs. The optimal time $D_{th}$ is given when Equation (4) takes its maximum value:

$$CDF(\text{normal arrival time}) - CDF(\text{abnormal arrival time}) \qquad (4)$$

The proposed method uses the derived value of $D_{th}$ as the threshold for detecting a suspicious flow. If the time between the arrivals of successive packets equals or exceeds this threshold, an abnormal flow is suspected.
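A minimal sketch of the threshold search in step 3), assuming the successive-sequence durations of labeled flows have already been computed; it scans candidate values and maximizes Equation (4). The function name is ours.

    import numpy as np

    def find_d_th(normal_durations, abnormal_durations):
        """Return the duration d that maximizes
        CDF(normal arrival time) - CDF(abnormal arrival time)."""
        normal = np.sort(np.asarray(normal_durations))
        abnormal = np.sort(np.asarray(abnormal_durations))

        def cdf(sorted_vals, d):
            # Fraction of values <= d in an already sorted array.
            return np.searchsorted(sorted_vals, d, side="right") / len(sorted_vals)

        candidates = np.union1d(normal, abnormal)
        gaps = [cdf(normal, d) - cdf(abnormal, d) for d in candidates]
        return float(candidates[int(np.argmax(gaps))])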

VI. EVALUATION

A. Threshold Value $D_{th}$

To evaluate our method, it was applied to the equipment in a data center, as shown in Figure 3. The algorithm produced a threshold value of $D_{th} = 4.0$ s.

B. Performance of the Proposed Method

Figure 11 shows the results. These follow the ground truth of flows defined by the port numbers in Table II, in which an abnormal flow has source port 80 (HTTP) and destination port 123 (NTP), or source port 123 (NTP) and destination port 80 (HTTP). Table VII shows the outcomes, where S is the time between the arrivals of successive packets in the real traffic. The true positive (TP) rate was 100%, and the false positive (FP) rate, while not zero, was low at 5%. Most of the analyzed flows were normal; if the FP rate were high, extra resources would be needed to investigate suspicious flows that turn out to be normal.

TABLE VII
RESULTS OF EVALUATION

               S ≥ 4.0   S < 4.0
Abnormal Flow  100.0%    0.0%
Normal Flow    5.0%      95.0%

Fig. 11. Distribution of duration times.


VII. DISCUSSION

A. Effect of Outlier Removal

When the outlier values were not removed, different results were produced. Using the same threshold value of $D_{th} = 4.0$, the FP rate rose to 14.2%. Figure 12 shows the results without removal of the outliers; the performance is clearly inferior to that reported in Figure 11, demonstrating the effectiveness of outlier removal.

Fig. 12. Distribution of interval times.

Fig. 13. Distribution of packet sizes across all flows.

B. Observed Packet Sizes

In the Introduction, we noted that the UDP query packets used to invoke DRDoS attacks are small. Figure 13 shows the distribution of packet sizes across all flows in the data center. It demonstrates that no threshold exists that allows normal HTTP packets to be distinguished from NTP attack packets on the basis of packet size alone.

VIII. CONCLUSIONS

This study proposed a practical method for detecting DRDoS attacks by analyzing the time intervals between the arrival timestamps of packets. Threshold values were determined after outlier removal, using a K-means clustering algorithm.

An evaluation experiment demonstrated that the proposed method is superior to the conventional DRDoS detection method based on measuring traffic volume. This study addressed only the NTP protocol; in future work, we will investigate DRDoS attacks using other protocols, including CharGen, DNS, and SSDP.

ACKNOWLEDGEMENTS

A part of this work was supported by JSPS Grant-in-Aid for Scientific Research B, Grant Number JP16H02832.

REFERENCES

[1] Akamai, “Q4 2016 State of the Internet Security Report,” https://www.akamai.com/us/en/multimedia/documents/state-of-the-internet/q4-2016-state-of-the-internet-security-report.pdf, referred Nov. 5, 2016.

[2] Santanna, José Jair, et al., “Booters—An analysis of DDoS-as-a-service attacks,” Integrated Network Management (IM), 2015 IFIP/IEEE International Symposium on. IEEE, 2015.

[3] B. Li, W. Niu, K. Xu, C. Zhang, P. Zhang, “You can’t hide: a novel methodology to defend DDoS attack based on BotCloud,” Applications and Techniques in Information Security, Communications in Computer and Information Science, Springer Berlin Heidelberg, pp. 203 – 214, 2015.

[4] Arbor Networks, “Arbor Networks SP,” http://www.arbornetworks.com/, referred Oct. 23, 2016.

[5] IMPERVA INCAPSULA, “HTTP FLOOD,” https://www.incapsula.com/ddos/attack-glossary/http-flood.html, referred Oct. 20, 2016.

[6] The Register, “BIGGEST DDoS ATTACK IN HISTORY hammers Spamhaus,” https://www.theregister.co.uk/2013/03/27/spamhaus_ddos_megaflood/, referred Oct. 12, 2016.

[7] Internet Initiative Japan, “Problems about DNS Open Resolver,” https://www.iij.ad.jp/company/development/report/iir/pdf/iir_vol21_internet.pdf, referred Oct. 12, 2016.

[8] JANOG, “NTP Reflection DDoS Attack Explanatory Document,” https://www.janog.gr.jp/wg/doc/ntp-wg-en.pdf, referred Oct. 12, 2016.

[9] Akamai, “Akamai Warns Of 3 New Reflection DDoS Attack Vectors,” https://www.akamai.com/jp/ja/about/news/press/2015-press/akamai-warns-of-3-new-reflection-ddos-attack-vectors.jsp, referred Oct. 12, 2016.

[10] Shadowserver, https://www.shadowserver.org/wiki/, referred Jan. 24, 2017.

[11] Daiki Noguchi and Shigeki Goto, "Defense against DRDoS Attacks by OpenFlow Switches," Proceedings of the Computer Security Symposium 2016, pp. 1183 – 1190, October 2016. (in Japanese)



[12] Yuhei Hayashi et al., “Evaluation of the attack detection method based on duration of continuous packet arrival,” IEICE technical report 115(488), pp. 53 - 58, 2016. (in Japanese).

[13] Arbor Networks, “Worldwide Infrastructure Security Report Volume X,” http://pages.arbornetworks.com/rs/arbor/images/WISR2014_EN2014.pdf, referred Nov. 15, 2016.

[14] J. Takeuchi and K. Yamanishi, "A unifying framework for detecting outliers and change points from time series," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 4, pp. 482 – 492, 2006.

[15] NTT DATA Mathematical Systems Inc., “Change point detection, ChangeFinder,” http://cl-www.msi.co.jp/reports/changefinder.html, referred Oct. 20, 2016.

[16] MacQueen, J. B.,“Some Methods for classification and Analysis of Multivariate Observations,” Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability 1, University of California Press, pp. 281 - 297, 1967.

[17] Python Software Foundation, “PyPI – the Python Package Index,” https://pypi.python.org/pypi, referred Oct. 21, 2016.

[18] scikit learn, “scikit-learn,” http://scikit-learn.org/stable/index.html, referred Oct. 24, 2016.

[19] IMPERVA INCAPSULA, “SNMP REFLECTION / AMPLIFICATION,” https://www.incapsula.com/ddos/attack-glossary/snmp-reflection.html, referred Oct. 22, 2016.

[20] Alejandro Nolla, “Amplification DDoS Attack With Quake3 Servers,” http://blog.alejandronolla.com/2013/06/24/amplification-ddos-attack-with-quake3-servers-an-analysis-1-slash-2/, referred Oct. 22, 2016.

Daiki Noguchi received the B.S. degree in Computer Science and Engineering from Waseda University in March 2016. He is now a master's student at the Department of Computer Science and Communications Engineering, Waseda University. His research interests cover the Future Internet and cyber security.

Tatsuya Mori is currently an associate professor at Waseda University, Tokyo, Japan. He received B.E. and M.E. degrees in applied physics, and a Ph.D. degree in information science, from Waseda University in 1997, 1999, and 2005, respectively. He joined NTT laboratories in 1999; since then, he has been engaged in research on Internet measurement and security. He is a member of ACM, IEEE, IEICE, IPSJ, and USENIX.

Yota Egusa received the B.S. degree in Computer Science and Engineering from Osaka University in March 2014. He is now a technology executive officer of SAKURA Internet Inc.

Kazuya Suzuki received an Associate Degree of Engineering from National Institute of Technology, Ibaraki College, in 2014. He is now a network engineer at SAKURA Internet Inc.

Shigeki Goto is a professor at the Department of Computer Science and Engineering, Waseda University, Japan. He received his B.S. and M.S. in Mathematics from the University of Tokyo. Prior to becoming a professor at Waseda University, he worked for NTT for many years. He also earned a Ph.D. in Information Engineering from the University of Tokyo. He is the president of JPNIC, a member of ACM and IEEE, and was a trustee of the Internet Society from 1994 to 1997.


Proceedings of the APAN – Research Workshop 2017, ISBN 978-4-9905448-7-4

Zero-day Malicious Email Behavior Investigation and Analysis

Sanouphab Phomkeona, Kristan Edwards, Yoshitatsu Ban and Koji Okamura

Abstract— Zero-day malware created by cyber criminals is a critical risk and menace, because neither machines nor cyber security tools can easily detect it. Phishing emails are the most common point of intrusion for attackers, who send malware to general users at random. Motivated by the rise of phishing emails carrying malware with zero-day behavior, this research uses information security analysis tools, as well as newly developed tools, to define a procedure for investigating malware behavior, with the aims of understanding malware better, tracking it effectively, and collecting information to find and help infected victims inside an organization's network.

Keywords: zero-day attack; phishing mail; cybercrime; information security analysis; basic analysis; dynamic analysis.

I. INTRODUCTION

Phishing emails containing zero-day malware are a serious problem that grows every year. A recent critical case for cyber security is a new type of ransomware named WannaCry (also known as WanaCrypt, WanaCrypt0r, or WanaDecrypt0r), first reported on 12 May 2017. Using spear phishing emails combined with the EternalBlue exploit and the DoublePulsar backdoor, plus a vulnerability in older versions of the Windows operating system, the attack spread very successfully and threatened victims into paying a ransom in the digital currency Bitcoin. Over 300,000 PCs were affected within 3 days across 150 countries. A single click by any user could cause the malicious behavior to spread over an entire organization's network, compromising the availability of its stored information, a situation that is very difficult for both users and IT professionals to investigate and resolve. This work focuses on two research paths: (i) investigating phishing emails to find malware behavior and (ii) advancing malware analysis for cyber security and risk. The paths have been designed to track malware behavior, collect information, filter for useful facts, and use these to prepare for the next zero-day attack, as well as to improve technical processing by developing effective and efficient solutions.

By configuring a mail server and simulating an independent network environment, separate from the general network, we can operate the investigation procedures. The operation process starts with an online investigation, followed by a file format investigation, surface analysis, basic dynamic analysis, and advanced static analysis of brand-new phishing emails; it also assists feature collection for data mining and a further understanding of malware behaviors. The process additionally targets industry representatives from anti-virus and IDS/IPS tools that aim to stop cybercrime committed by terrorists. In this paper, we present the simulation results and discussion in the Results and Discussion section. Finally, we conclude this work and propose future work in the Conclusion section.

II. RELATED WORK

This section presents related work.

A. Zero-day Attack

A zero-day attack is a cyber-attack exploiting a vulnerability that has not been disclosed publicly. There is almost no defense against a zero-day attack: while the vulnerability remains unknown, the affected software cannot be patched, and anti-virus products cannot detect the attack through signature-based scanning because, in general, zero-day attack data are not available until after the attack is discovered. Zero-day vulnerabilities are believed to be used primarily for carrying out targeted attacks; this belief is based on post-mortem analyses revealing vulnerabilities that security analysts have connected to zero-day attacks [13]. By searching 11 million Windows hosts over a period of 4 years to identify executable files linked to exploits of known vulnerabilities, Symantec Research Labs identified 18 vulnerabilities exploited in the wild before their disclosure, 11 of which were not previously known to have been employed in a zero-day attack [5]. After the disclosure of zero-day vulnerabilities, the volume of attacks that exploit them increases by up to 5 orders of magnitude. McQueen et al. [12] analyzed the lifespans of known zero-day vulnerabilities in order to estimate the real number of zero-day vulnerabilities that existed in the past.



In contrast to this previous research, we analyze the malware behavior and collect data on real hosts infected with malware, to understand how to prevent infection and how to find and recover victims inside a local network.

B. Malicious Code Analysis

Malicious code analysis has been a popular research topic in recent years, and accordingly a number of analysis methods already exist. The research divides into two main techniques: static analysis and dynamic analysis. Static analysis is the process of analyzing a program's code without actually executing it. Christodorescu et al. [6] introduced a technique that uses model checking to identify parts of a program that implement a previously specified malicious code template; this technique was later extended to allow more general code templates and the use of advanced static analysis techniques [7]. Kruegel et al. [9] developed a static analysis technique to identify malicious behavior in kernel modules that indicates a rootkit, while Kirda et al. [8] researched a behavior-based approach that relies heavily on static code analysis to detect Internet Explorer plug-ins that exhibit spyware-like behavior. However, the weakness of static analysis is that the code analyzed may not be the code that is actually run. In particular, this is true for downloaders and for self-modifying programs that use polymorphic or metamorphic techniques [14]. Malware may also draw on a wide range of obfuscation mechanisms [10, 11] that can make static analysis very difficult.

While static analysis is a naturally safe and useful detection method, obfuscated code breaches its fundamental limits, and so dynamic analysis is more commonly used. Borders et al. [15] presented a behavior-based approach that aims to dynamically detect evasive malware by injecting user input into the system and monitoring the resulting actions. In addition, a number of approaches directly analyze code dynamically. Moser et al. [4] explored multiple execution paths of Windows executables for malware analysis by tracking how a program processes interesting input (e.g., the local time, file checks, reads from the network), and Crandall et al. [16] researched the detection of hidden, time-based triggers in malware, attempting to automatically discover time-dependent behavior by setting different values for the system time. Our research uses basic dynamic analysis to manually investigate malware files and architecture before and after execution, as well as to monitor and track malware behavior while its processes are actively running.

III. METHODOLOGY

In this section, we present the malware investigation process in three parts. The first part presents the experimental malware and an overview of the simulation topology. The second part presents the online and URL investigation procedures, and the last part focuses on the file investigation procedures, based on both basic dynamic analysis and static analysis.

A. Simulation

Malicious emails are usually sent with a phishing URL or an attachment file inside. Before the link is clicked or the malicious file is run, the kind of threat, and how critical the resulting damage will be, is unpredictable. To analyze and investigate malware effectively and in a realistic environment, we set up a mail and recording server, as well as a client PC running a Windows operating system, on a network separated from the main network. The purpose of the independent network is to prevent the malware from infecting the network and spreading to unprotected hosts; the environment used here minimizes this risk. Malicious emails suspected to contain malware are forwarded from outside the network to the mail server. We also prepared a number of analysis tools installed on the client PC, such as Sandboxie, Toolwiz Time Freeze, TrID, BinText, Process Explorer, Process Monitor, and others. Tcpdump is used to record every transaction from the client PC to the Internet, and the recordings are then reviewed using Wireshark.

B. URL and Online Investigation

In cases where phishing emails contain suspicious URL links, the investigation process starts with a URL site survey. By collecting information from cyber security sites such as Urlquery.net, Aguse.jp, and Virustotal.com, we can obtain some basic information. Next, we run tcpdump on the recording server to record the traffic to the Internet before accessing the suspicious URL link. We analyze the recorded traffic using Wireshark, tracking the communication between the client and the target site. Every protocol used in the transactions between the client and outside hosts provides hints about the malware's behavior. Sometimes the targets are phishing sites, with or without a digital signature, that fraudulently ask for the victim's account username and password or trick the user into installing malicious software. If there is a file download or installation process at this point, we continue to the next part, file investigation and monitoring.

C. Suspicious File Investigation

Normally, phishing emails that contain suspicious files randomly generate both the header information (subject and body) and the receiver's address. A common tactic is to use a subject and filename that disclose no discernible information, instead using titles with random numbers that are clearly not specific to the target. However, the most dangerous phishing emails these days are spear phishing attacks. The headers of these emails might contain an important message or keyword that convinces the user to open the file or click on the link, for example an important message from a CEO, a bank account update, an online payment, an email account lock, or even a parking area issue. The first part of the process

Fig. 1. Simulation Environment


starts with the file format investigation. Using analysis tools such as TrID or BinText, we can learn what file format is hidden inside. Surface analysis follows: this analysis uses Internet tools to detect whether the malware is new or common. The specimen is uploaded to analysis sites that display known useful information; the sites also take the hash value of the specimen and perform a retrieval with it, so if the malware is already known, it is identified at this point. Another surface analysis is character string extraction, in which we attempt to retrieve information from the character strings contained in the file, such as IP addresses and URLs. The next process is a basic dynamic analysis. It starts by using Process Monitor combined with Process Explorer to monitor the Windows system processes while the malware runs, as well as using Regshot to check changes in the registry before and after the malware operates. We track the malware behavior by following the system calls and child processes that occur after execution. Network traffic is also an important key in this process, so we combine the process investigation with tcpdump to track the packet information sent and received by each process. Some malware is very quiet, sometimes waiting a long time without any operation, or hiding its tasks inside the operating system's common processes such as explorer.exe or powershell.exe, which makes this type of malware difficult to track. For this reason, we also developed an active method of spyware detection that keeps monitoring network traffic and searching for suspicious behavior. Using this tool, all suspicious outgoing and incoming packets are scanned and matched against general keywords, such as the host name, private IP addresses, or other additional keywords, to find out which packets are generated by spyware; a sketch of this idea follows. After completing this part of the analysis process, we have largely understood the malware's behavior. We place static analysis at the end of the process, using the IDA Pro disassembler and reverse engineering to discover the instructions inside the malicious code, and then perform dynamic analysis again using OllyDbg. This is a comprehensive analysis, as the debugger gives insight into the values stored in each register and in memory, adding further details about the malware.
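A minimal sketch of the keyword-matching idea behind this active detection tool, assuming a capture file recorded with tcpdump and the third-party scapy library; the keyword list and function name are illustrative assumptions, not the authors' implementation.

    from scapy.all import rdpcap, IP, Raw   # pip install scapy

    # Illustrative keywords: the local host name, private address prefixes, etc.
    KEYWORDS = [b"my-hostname", b"192.168.", b"10.0."]

    def suspicious_packets(pcap_path):
        """Return (src, dst) pairs of packets whose payload contains a keyword."""
        hits = []
        for pkt in rdpcap(pcap_path):
            if pkt.haslayer(IP) and pkt.haslayer(Raw):
                payload = bytes(pkt[Raw].load)
                if any(k in payload for k in KEYWORDS):
                    hits.append((pkt[IP].src, pkt[IP].dst))
        return hits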

D. Dynamic Signature

After the two main investigation processes are completed, the information from each analysis is used to form part of the dynamic signature, which can widen the scope of detection by matching a family of malware to a single signature. Dynamic signatures combine static-signature and behavior-based approaches; in a proof-of-concept detection test, a dynamic signature verified its ability to detect malware from within its family. The structure of dynamic signatures allows them to be matched to a larger set of malware than the static-based approaches presented in previous works. The dynamic signature also has an advantage over behavior-based function call graphs, because it allows resources as features. Combining resources in a function call graph [18] eases the constraint on the operations performed, which is easily defeated by obfuscation methods.

A behavioral graph is a directed graph represented by a tuple $G_m = (V, E, O)$, where $V$ is a finite set of vertices (resources), $O$ is a finite set of operations, and $E$ is a finite set of edges given as ordered triplets, $E \subseteq V \times O \times V$. Implementation of the dynamic signature and its detection procedure in anti-virus and IDS products would increase their currently observable set of malware. Figure 3 shows a dynamic signature created by combining the significant events determined in each investigation step, starting at the malicious .docx file represented by $v_1$; each edge connects two resources with an associated operation. An edge $(v_n, v_{n+1}, o_n) \in E$ expresses that the resource $v_n$ applied the operation $o_n$ to the resource $v_{n+1}$; an edge therefore represents a significant event that occurred between two resources. Given the first edge, $e_1 = (v_1, v_2, o_1)$, which links a new resource to the source executable resource, there exists a path from $v_1$ to $v_2$. Since any new resource $v_n$ must be added on an edge $e_i = (v_x, v_{i+1}, o_z)$, where $v_x$ is already an element of $V$, there always exists a path from $v_1$ to $v_n$, proving that the resource is used by the malware. Each of the 46 vertices represents one of the 46 most significant resources that contribute to the malicious behavior of the malware.
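A minimal sketch of such a behavioral graph and of the reachability argument above; the class, resource names, and operations are illustrative, not the paper's.

    from collections import defaultdict

    class BehaviorGraph:
        """Directed graph G_m = (V, E, O) with edges (v, v', o)."""
        def __init__(self):
            self.adj = defaultdict(list)        # v -> [(operation, v')]

        def add_event(self, src, op, dst):
            """Record that resource src applied operation op to resource dst."""
            self.adj[src].append((op, dst))

        def reachable(self, start, target):
            """True if a path start -> ... -> target exists, i.e. the target
            resource is (directly or indirectly) used by the malware."""
            seen, stack = set(), [start]
            while stack:
                v = stack.pop()
                if v == target:
                    return True
                if v not in seen:
                    seen.add(v)
                    stack.extend(dst for _, dst in self.adj[v])
            return False

    g = BehaviorGraph()
    g.add_event("v1:malicious.docx", "drops", "v2:payload.exe")
    g.add_event("v2:payload.exe", "connects-to", "v3:198.51.100.7")
    print(g.reachable("v1:malicious.docx", "v3:198.51.100.7"))   # True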

Fig. 2. Method Lifecycle

Fig. 3. Dynamic Signature


IV. RESULTS AND DISCUSSION

From the phishing emails received, we completed behavioral analyses and tracked malware information, yielding more than 86 characteristics within 4 months; the investigation is still ongoing. After the simulation and investigation of each malware sample, the results and all related information, including the survey date, summary, phishing email subject and content, type of threat, time, source and destination, IP address, hash value, screenshots, etc., are updated and published on an internal web server [17]. The data on the web server can be referenced when a user receives a similar malicious email, allowing us to advise them on how to deal with it. In the worst case, users might already have been infected with the virus without noticing; the purpose of the database is therefore to find victims inside a network by comparing a potential victim's behavior with the malware's behavior, in order to help the victim and protect the network as quickly as possible. Zero-day malware is usually not detected by IDS machines or online investigation, and it is difficult to investigate with static or dynamic analysis; one should therefore always watch for news of system vulnerabilities, patches, and update information that may be the key to protecting a system. We also continue to focus on the WannaCry ransomware and its related behaviors and vulnerability attacks. Based on the current information, our next research will focus on developing this manual investigation process into an automatic system that can learn, track, and respond rapidly using machine learning, as well as upgrading the active spyware detection method to catch suspicious behaviors that operate without the user's permission.

V. CONCLUSION

We have presented a process for investigating phishing and malicious emails in a real environment, where some of the emails are assumed to be zero-day attacks. Within four months, 86 characteristics were extracted from the phishing emails. The purpose of this research is to investigate malware behavior, with the aims of understanding and tracking malware effectively and of collecting information to find and help infected victims inside an organization's network. Continued investigation and the updating of analyzed results on the server are therefore necessary. The recorded data are also useful for managing network security and policy and for distributing update information to organization members.

REFERENCES

[1] T. Tsikrika, B. Akhgar, V. Katos, S. Vrochidis, P. Burnap and M. L. Williams, "Terrorist Online Content & Advances in Data Science for Cyber Security and Risk on the Web," in 1st International Workshop on Search and Mining, Feb. 2017.

[2] M. Aziz, K. Okamura, "An Analysis of Botnet Attack for SMTP Server using Software Defined Network (SDN)," APAN Research Workshop, 2016.

[3] A. Dinaburg, P. Royal, M. Sharif and W. Lee, "Ether: Malware Analysis via Hardware Virtualization Extensions," Oct. 2008.

[4] A. Moser, C. Kruegel, and E. Kirda. Exploring Multiple Execution Paths for Malware Analysis. In IEEE Symposium on Security and Privacy. May. 2007

[5] L. Bilge, T. Dumitras. “An Empirical Study of Zero-Day Attacks in The Real World,” ACM conference on Computer and communications security. Oct. 2012.

[6] M. Christodorescu and S. Jha. “Static Analysis of Executables to Detect Malicious Patterns. In Usenix Security Symposium”. 2003.

[7] M. Christodorescu, S. Jha, S. Seshia, D. Song, and R. Bryant. “Semantics-aware Malware Detection”. In IEEE Symposium on Security and Privacy, May. 2005.

[8] E. Kirda, C. Kruegel, G. Banks, G. Vigna, and R. Kemmerer. “Behavior-based Spyware Detection”. In Usenix Security Symposium, 2006.

[9] C. Kruegel, W. Robertson, and G. Vigna. “Detecting Kernel-Level Rootkits Through Binary Analysis”. In Annual Computer Security Application Conference (ACSAC), 2004.

[10] C. Linn and S. Debray. “Obfuscation of Executable Code to Improve Resistance to Static Disassembly”. In ACM Conference on Computer and Communications Security, 2003.

[11] G.Wroblewski. “GeneralMethod of Program Code Obfuscation”. PhD thesis, Wroclaw University of Technology, 2002.

[12] M. A. McQueen, T. A. McQueen, W. F. Boyer, and M. R. Chaffin. “Empirical estimates and observations of 0day vulnerabilities. In Hawaii International Conference on System Sciences”. 2009.

[13] Symantec Corporation. Symantec Internet security threat report, volume 17. http://www.symantec.com/threatreport/. Apr. 2012.

[14] P. Szor. “The Art of Computer Virus Research and Defense”. Addison Wesley. 2005.

[15] K. Borders, X. Zhao, and A. Prakash. “Siren: Catching Evasive Malware (Short Paper)”. In IEEE Symposium on Security and Privacy. 2006.

[16] J. Crandall, G. Wassermann, D. Oliveira, Z. Su, F. Wu, and F. Chong. “Temporal Search: Detecting Hidden Malware Timebombs with Virtual Machines”. In Conference on Architectural Support for Programming Languages and OS. 2006.

[17] Y. Ban, K. Okamura. “Result of Analyzed Phishing Mail & Malware Behavior” URL: https://zmal.cs.kyushu-u.ac.jp/info/. 2017.

[18] K. Edwards, K. Okamura, M. Portmann, "Malicious Software Analysis Procedure for Generating Dynamic Signature," Master's thesis, The University of Queensland, 2017.

[19] S. S. Hansen, T. M. T. Larsen, M. Stevanovic, and J. M. Pedersen, "An approach for detection and family classification of malware based on behavioral analysis," in 2016 International Conference on Computing, Networking and Communications (ICNC), 2016, pp. 1-5.

Sanouphab Phomkeona is a lecturer at the Department of Computer Engineering and Information Technology, Faculty of Engineering, National University of Laos, Lao P.D.R. He received his Bachelor's degree (B.Sc. in Mathematics and Computer Science) from the National University of Laos and his Master's degree (M.Eng. in Knowledge-Based Information Engineering) from Toyohashi University of Technology. He worked at a private company in Vientiane Capital as an IT specialist for two years, between 2011 and 2013. Supported by the AUN/SEED-Net project of JICA, he has temporarily left his lecturer position and is currently a Ph.D. candidate in the Department of Advanced Information Technology, Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan.


Koji Okamura is a Professor at the Research Institute for Information Technology, Kyushu University, and Director of the Cybersecurity Center, Kyushu University, Japan. He received B.S. and M.S. degrees in Computer Science and Communication Engineering and a Ph.D. from the Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan, in 1988, 1990, and 1998, respectively. He was a researcher at MITSUBISHI Electronics Corporation, Japan, for several years, and has been a Research Associate at the Graduate School of Information Science, Nara Institute of Science and Technology, Japan, and at the Computer Centre, Kobe University, Japan. His areas of interest are the Future Internet and Next Generation Internet, multimedia communication and processing, multicast/IPv6/QoS, human communication over the Internet, and active networks. He is a member of the WIDE, ITRC, GENKAI, and HIJK projects and a key person in the Core University Program on the Next Generation Internet between Korea and Japan sponsored by JSPS/KOSEF.

Kristan Edwards is an undergraduate student in the School of Information Technology and Electrical Engineering at The University of Queensland, Australia. He is in the final year of a Bachelor (Hons.) and Master of Electrical and Computer Engineering. He is currently completing a three-month placement as a Research Assistant at the Cybersecurity Center, Kyushu University, Japan.

Yoshitatsu Ban received his Bachelor's degree in April 2014 and his Master's degree (M.Eng.) from the Graduate School of Information Science and Electrical Engineering, Kyushu University, in 2016. He is interested in malware analysis and email filtering. Since April 2016 he has been working at Human Techno System Inc. in a security programming research position, and he has been a visiting researcher at the Cyber Security Center, Kyushu University since October 2016.


Proceedings of the APAN – Research Workshop 2017
ISBN 978-4-9905448-7-4

A Proof of Stake Sharding Protocol for Scalable Blockchains

Y. Gao and H. Nobuhara

Abstract— Cryptocurrencies such as Bitcoin have drawn great attention recently. The public-ledger blockchain serves as a secure database for cryptocurrencies. However, only 3 to 7 transactions can be processed per second, which means the blockchain does not scale. To address this problem, we propose a new consensus protocol based on sharding and proof of stake. The scalability of the proposed method is expected to increase linearly with the network size. We discuss the proposed method from the scalability, complexity, and security perspectives.

Index Terms—Blockchain protocol, Proof of Stake, Scalability, Sharding.

I. INTRODUCTION

Since being introduced in 2008, Bitcoin [1] has become a global decentralized cryptocurrency and has led to more than 700 alternative coins [2]. The total venture capital invested in Bitcoin reached around 330 million USD at the end of 2016 [3]. Bitcoin has attracted great attention in finance, in technology, and in academia. The core technology underlying Bitcoin is the Nakamoto consensus protocol, which plays a key role in maintaining the transaction history of Bitcoin in a public ledger called the blockchain. The blockchain, serving as a distributed database that records Bitcoin transactions chronologically and securely, is considered the most significant technology of Bitcoin. With its high security, immutability, and decentralization, the blockchain has also been applied to the protection of public/private/semi-public records, physical asset keys, and intangible assets [4].

Despite these advantages, a major concern about the blockchain is its scalability [5, 6, 7]. The Bitcoin blockchain can deal with at most 7 tps (transactions per second) [8]. By contrast, centralized payment systems such as PayPal [9] are able to process around 115 tps, and Visa's network [10] can reach a peak rate of 56,000 tps. The processing speed of a blockchain is determined by two factors: block size and block interval. Given Bitcoin's 10-minute average block interval and the 1 MB average size of each block, the throughput is limited to 7 tps. The throughput can be improved simply by increasing the block size or reducing the block interval. However, an increased block size results in slower block broadcasting in the Bitcoin network, and a reduced block interval leads to centralization [5, 6].

The aim of this paper is to propose a scalable protocol for blockchains based on sharding and a proof of stake (PoS) algorithm. TABLE I gives a simple comparison between the Bitcoin protocol and the proposed protocol. The paper is organized in the following way. Chapter 2 explains three significant concepts related to this research. Chapter 3 presents the design of the proposed sharding proof of stake protocol, which can be a possible solution to the blockchain's scalability problem. Chapter 4 discusses the evaluation of the scalability, complexity, and security of the proposed method. The last chapter summarizes the proposed method.

TABLE I
COMPARISON OF BITCOIN PROTOCOL WITH PROPOSED PROTOCOL

                      Bitcoin protocol       Proposed protocol
Consensus algorithm   Proof of Work (PoW)    Proof of Stake (PoS)
Sharding              ×                      ✓
Scalable              ×                      ✓

Submitted on June 24th, 2017.
Y. Gao is with the Department of Intelligent Interaction Technologies, University of Tsukuba, Ibaraki, Japan (e-mail: [email protected]).
H. Nobuhara is with the Department of Intelligent Interaction Technologies, University of Tsukuba, Ibaraki, Japan (e-mail: [email protected]).

II. RELATED CONCEPTS

This section introduces three important concepts: Proof of Work (PoW), Proof of Stake (PoS), and sharding. Proof of Work is the consensus algorithm of Bitcoin, while Proof of Stake is used to reach consensus in the proposed method. Sharding is another significant technique used in this research.

A. Proof of work [1,11,12,13]

Proof of work is a consensus mechanism used in cryptocurrencies to maintain the security of the blockchain. In the case of Bitcoin, nodes (also known as "miners") compete to solve a difficult math puzzle in order to include new blocks in the blockchain, so that they can receive bitcoins as a reward. The probability that a node generates a new block is proportional to its CPU power, which means the higher the CPU power is, the more likely the node is to receive the reward for creating a block. The blocks are connected chronologically by one-way hash algorithms such as SHA-256 to form a blockchain. An attacker would have to perform as much proof of work computation as the rest of the Bitcoin network does; an attack can succeed only if the attacker controls more than 51% of the CPU power of the entire Bitcoin network.

Despite the security merits, producing proof of work data is costly. The costs, including electricity and hardware, are estimated at over one million USD per day [15]. Six hundred trillion SHA-256 computations are conducted by the Bitcoin network every second; however, these calculations turn out to be useless in practice [14].
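For illustration, the hash-puzzle idea can be sketched in a few lines of Python (the block format and difficulty encoding below are simplified assumptions, not Bitcoin's actual wire format):

import hashlib

def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Search for a nonce such that SHA-256(block_data || nonce) falls
    below a target, i.e., has roughly `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce  # proof found
        nonce += 1

# Finding the nonce takes about 2^difficulty_bits hashes on average, e.g.:
# nonce = mine(b"prev_hash|transactions", 20)

Note the asymmetry: mining is expensive, but verifying a claimed nonce takes a single hash, which is exactly what makes proof of work usable as a consensus mechanism.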

B. Proof of stake [12]

Proof of stake is one of the alternative consensus mechanisms to PoW. As shown in TABLE II, the probability of generating a new block is proportional to the node's stake status rather than its CPU power. In Peercoin, the stake status is known as coin age, which is defined as the coin amount times the holding period [14]. A user holding a large amount of coins for a longer time (i.e., owning a larger coin age) has a higher probability of creating a new block. Without the need for large quantities of hash calculation, PoS is much more cost-effective than PoW. Additionally, penalties can be set to make 51% attacks much more expensive in PoS than in PoW [16].
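A minimal sketch of coin-age-weighted block-producer selection, as described above (the data layout and the use of simple weighted random choice are illustrative assumptions, not Peercoin's actual implementation):

import random

def coin_age(amount: float, holding_time: float) -> float:
    # Coin age = coin amount times holding period.
    return amount * holding_time

def select_producer(stakes: dict[str, tuple[float, float]]) -> str:
    """Pick a node with probability proportional to its coin age."""
    nodes = list(stakes)
    weights = [coin_age(*stakes[n]) for n in nodes]
    return random.choices(nodes, weights=weights, k=1)[0]

# select_producer({"A": (100, 30), "B": (50, 10)})
# A has coin age 3000 versus B's 500, so A is six times more likely to be chosen.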

C. Sharding [17]

In current blockchains, the nodes are distributed around the world, processing all of the transactions and storing the whole transaction history. This contributes to high security but limits scalability. In the case of Bitcoin, only 3 to 7 transactions can be processed per second. Several sharding protocols have been proposed to solve the scalability problem. Luu et al. (2016) proposed ELASTICO for open blockchains; this sharding protocol divides the mining network into small groups in which the transaction shards are processed in parallel.

III. PROPOSED METHOD

In this chapter, we introduce our proposed method as a possible solution to the blockchain scalability problem. Assume there are cn nodes in the network forming c groups, so that each group contains n nodes. Two types of blocks are generated in the proposed method. The middle blocks are generated by the regular node groups and sent to the final validation node group. The final blocks are generated by the final validation group and broadcast to the network. To distinguish them, the middle blocks are written in lower case ("block") and the final blocks in upper case ("BLOCK").

A. Overview of the proposed method

The proposed method is based mainly on a sharding protocol and a PoS consensus scheme. Assume the initial number of nodes in the network is cn. The cn nodes form c groups, which means each group contains n nodes. One of the c node groups works as the validation node group, and the other c-1 node groups are regular groups. The regular node groups create middle blocks from the transaction shards assigned to them. The middle blocks are then processed in the validation node group to produce the final BLOCK, which is recorded in the blockchain. Fig. 1 shows the main steps of the proposed method.

Each epoch contains 4 steps (a condensed sketch of the whole epoch follows the list):

Step 1: Form node groups. Each node belongs to a group. After a node group is formed, a leader node is chosen randomly and the identities of all nodes in this group are sent to it. After the group leader gathers all the node information in its group, an identity list is generated and broadcast to the other group leaders. This process reduces the communication complexity between nodes from O((cn)²) to O(cn).

Step 2: Run internal group consensus. A transaction shard is assigned to each node group randomly. An internal PoS consensus is run in each node group. A node with a larger coin age (coin amount times holding time) has a higher probability of being chosen to generate a new middle block.

Step 3: Generate the final BLOCK. The final validation group collects and combines the middle blocks. A PoS consensus is run to generate a final BLOCK, which is broadcast to the whole network.

Step 4: Reshuffle the nodes. After t epochs, all of the nodes are reshuffled to form new node groups.
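The four steps can be condensed into a short sketch (the group-splitting rule, shard representation, and weighted PoS selection are our illustrative simplifications of the protocol described above):

import random

def run_epoch(nodes: list[str], c: int, shards: list[list[str]],
              coin_age: dict[str, float]) -> dict:
    """One epoch: form c groups, run PoS per group, build the final BLOCK."""
    random.shuffle(nodes)                     # Step 4 of the previous epoch
    groups = [nodes[i::c] for i in range(c)]  # Step 1: c groups of n nodes
    validators, regulars = groups[0], groups[1:]
    def pick(group):                          # PoS: coin-age-weighted choice
        return random.choices(group, weights=[coin_age[n] for n in group])[0]
    blocks = [{"producer": pick(g), "txs": s}             # Step 2: middle blocks
              for g, s in zip(regulars, shards)]
    return {"producer": pick(validators), "blocks": blocks}  # Step 3: BLOCK

Here the c-1 regular groups each produce one middle block from their assigned shard, and the validation group combines them, matching Steps 1 through 4.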

B. Form node groups

First, node groups are formed. Assume a group contains n nodes. The identities of the nodes are supposed to be known to the others. A simple way is for each node to broadcast its identity to all other nodes; however, this results in O((cn)²) message complexity. The strategy to reduce this complexity is presented in Section IV.

Fig. 1. Steps 2 and 3 of the proposed method

TABLE II
DIFFERENCES BETWEEN POW AND POS CONSENSUS ALGORITHMS

                   Proof of Work (PoW)     Proof of Stake (PoS)
Based on           CPU power               Coin age
Cost               High                    Low
Security concern   Potential 51% attack    Lower probability of 51% attack


C. Run internal group consensus

After the node groups are formed, transaction shards are randomly assigned to the groups. An internal group consensus is run in each group to generate middle blocks. We choose the PoS consensus mechanism: the node owning the highest coin age is the most likely to be chosen to generate the middle block. The middle block is sent to the final validation node group.

D. Generate the final block

The final validation node group collects the middle blocks and generates the final BLOCK. A PoS consensus is run to select a node to generate the final BLOCK. A final BLOCK mainly includes two parts: the previous BLOCK hash and the new middle blocks.
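As a sketch, a final BLOCK might be represented as follows (only the two parts named above come from the protocol description; the serialization and hash-function choice are assumptions):

import hashlib, json
from dataclasses import dataclass, field

@dataclass
class FinalBlock:
    prev_hash: str                  # hash of the previous final BLOCK
    middle_blocks: list = field(default_factory=list)  # from regular groups

    def block_hash(self) -> str:
        payload = json.dumps(
            {"prev": self.prev_hash, "blocks": self.middle_blocks},
            sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()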

E. Reshuffle the nodes

Nodes are reshuffled to form new groups every t epochs for higher security. Reshuffling could help to reduce the risk of centralization. After the new node groups are formed, a new epoch starts from Step 1.

IV. EVALUATION AND ANALYSIS

This section discusses the planned evaluation and analyzes the complexity and security of the proposed method.

A. Evaluation

In our proposed method, the network is separated into node groups in which the transaction shards are processed in parallel. Therefore, the throughput of the network is expected to be c times higher than that of non-sharding consensus protocols. Experiments will be conducted to evaluate the scalability of the proposed method. An emulated network will be built on Amazon EC2, with the number of nodes ranging from 100 to 1000. Following the steps introduced in Section III, node groups are formed first; then transaction shards are randomly assigned to the node groups for processing, and the processing time is recorded. Finally, the relationship between the network size and the processing time will be analyzed.

Although the experiments are still being performed, the proposed method is expected to scale linearly. The reason is that transactions are processed in parallel by c node groups, which means the processing speed is c times that of non-sharding protocols.

B. Complexity analysis

Assume the network contains cn nodes. If each node broadcasts its identity to all other nodes in the network, the message complexity is O((cn)²). To reduce the complexity, we propose to form node groups. Assume each group contains n nodes; then c node groups are formed in total. When a node group is formed, a leader node is randomly selected, and all other n-1 nodes in the group send their identities to the leader node. The leader node generates an identity list based on the collected information and broadcasts this list to the other group leaders; at the same time, it also receives the identity lists from the other node groups. To reduce the number of messages, a non-leader node receives only the identity list of its own group. In this way, each node knows the identities of the other nodes in the same group, and the leader node of every group has a view of the nodes in the whole network. The complexity is reduced to O(cn).
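The two message counts can be checked with a small calculation (assuming one message per identity sent, leaders exchanging lists pairwise, and leaders returning the merged group list to their members):

def all_to_all(c: int, n: int) -> int:
    total = c * n                      # every node broadcasts to every other node
    return total * (total - 1)         # O((cn)^2)

def leader_based(c: int, n: int) -> int:
    intra = c * (n - 1)                # members send identities to their leader
    exchange = c * (c - 1)             # leaders exchange identity lists
    fan_out = c * (n - 1)              # leaders return the group list to members
    return intra + exchange + fan_out  # O(cn + c^2), i.e., O(cn) for c <= n

# all_to_all(10, 100) -> 999,000 messages; leader_based(10, 100) -> 2,070.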

C. Security analysis

One of the security problems related to the blockchain is known as the 51% attack. If one or more nodes take control of over 51% of the CPU power, they may successfully perform malicious attacks [18]. According to Larimer D. (2013), a 51% attack is much more costly and difficult in a PoS network than in a PoW one [20]. In a PoW network, a 51% attack can be executed with enough money and hardware. In a PoS network, however, a 51% attack requires not only cost (possession of over 51% of the stake) but also holding time. In our proposed method, we set limitations on coin age. The coin age is defined as the amount of coins times the holding time, and it is valid only if the holding time is between t and t+α. This limitation helps to reduce the risk of a 51% attack.
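A one-function sketch of this validity window (our reading of the rule; t and α would be protocol constants):

def valid_coin_age(amount: float, holding_time: float,
                   t: float, alpha: float) -> float:
    """Coin age counts only while the holding time lies in [t, t + alpha]."""
    if t <= holding_time <= t + alpha:
        return amount * holding_time
    return 0.0  # stake held too briefly, or too long, carries no weight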

Another effect related to the 51% attack is the incentive. Suppose a 51% attack succeeds in a PoW network and creates a false blockchain fork; with enough CPU power, the attacker can keep extending the false fork to receive more profits. This is one possible incentive. In a PoS network, however, even if a 51% attack succeeds in creating a false blockchain fork, all the attacker can receive is a reward of 1% of its stake. Since the coin age returns to zero after the attack, the attacker cannot keep extending the false fork to keep making profits. Therefore, the incentive to mount a 51% attack is lower in a PoS network than in a PoW network.

V. CONCLUSION

In this paper, we present the design of a proof of stake sharding protocol that is considered a possible solution to the blockchain's scalability problem. To the best of our knowledge, this is the first blockchain protocol combining a sharding protocol with a proof of stake consensus algorithm. We expect the proposed method to increase the blockchain's scalability linearly with the network size. Experiments will be conducted in an emulated network to evaluate the proposed method.

REFERENCES

[1] Nakamoto, S. (2008). Bitcoin: A Peer-to-Peer Electronic Cash System. [Online]. Available: https://bitcoin.org/bitcoin.pdf
[2] Crypto Currency Market Capitalizations. [Online]. Available: https://coinmarketcap.com/currencies/views/all/
[3] COINFOX. (2016). Bitcoin venture capital in 2016: slowing growth rate. [Online]. Available: http://www.coinfox.info/news/reviews/6496-bitcoin-venture-capital-in-2016-slowing-growth-rate
[4] Swan, M. (2015). "Blockchain 2.0: Contracts," in Blockchain: Blueprint for a New Economy. O'Reilly Media, Inc., p. 10.
[5] Eyal, I., Gencer, A. E., Sirer, E. G., & Van Renesse, R. (2016, March). Bitcoin-NG: A scalable blockchain protocol. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16) (pp. 45-59). USENIX Association.
[6] Croman, K., Decker, C., Eyal, I., Gencer, A. E., Juels, A., Kosba, A., & Song, D. (2016, February). On Scaling Decentralized Blockchains. In International Conference on Financial Cryptography and Data Security (pp. 106-125). Springer Berlin Heidelberg.
[7] Luu, L., Narayanan, V., Zheng, C., Baweja, K., Gilbert, S., & Saxena, P. (2016, October). A secure sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (pp. 17-30). ACM.
[8] Scalability. Bitcoin Wiki. [Online]. Available: https://en.bitcoin.it/wiki/Scalability
[9] PayPal. [Online]. Available: https://web.archive.org/web/20141226073503/https://www.paypal-media.com/about
[10] VISA. [Online]. Available: https://usa.visa.com/dam/VCOM/download/corporate/media/visa-fact-sheet-Jun2015.pdf
[11] Proof of work. Bitcoin Wiki. [Online]. Available: https://en.bitcoin.it/wiki/Proof_of_work
[12] Proof of Stake versus Proof of Work. [Online]. Available: http://bitfury.com/content/5-white-papers-research/pos-vs-pow-1.0.2.pdf
[13] What Proof of Stake Is And Why It Matters. [Online]. Available: https://bitcoinmagazine.com/articles/what-proof-of-stake-is-and-why-it-matters-1377531463/
[14] PPCoin: Peer-to-Peer Crypto-Currency with Proof-of-Stake. [Online]. Available: https://peercoin.net/whitepaper
[15] Proof of Stake FAQ. [Online]. Available: https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ
[16] A Proof of Stake Design Philosophy. [Online]. Available: https://medium.com/@VitalikButerin/a-proof-of-stake-design-philosophy-506585978d51
[17] On sharding blockchains. [Online]. Available: https://github.com/ethereum/wiki/wiki/Sharding-FAQ
[18] Luu, L., Narayanan, V., Zheng, C., Baweja, K., Gilbert, S., & Saxena, P. (2016, October). A secure sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (pp. 17-30). ACM.
[19] Bradbury, D. (2013). The problem with Bitcoin. Computer Fraud & Security, 2013(11), 5-8.
[20] Larimer, D. (2013). Transactions as Proof-of-Stake. [Online]. Available: https://bravenewcoin.com/assets/Uploads/TransactionsAsProofOfStake10.pdf

Yuefei Gao was born in Guizhou Province, China in 1992. She received the B.S. degree in Engineering from Beijing Science and Technology University, China, in 2013 and the M.S. degree from the University of Tsukuba in 2016. She is currently pursuing the Ph.D. degree at the same university. She has been engaged in research on decentralized trusted timestamps. Her research interests include decentralized systems, cryptocurrencies, and blockchain technology. Ms. Gao was a recipient of the Outstanding Master Thesis Award.

Hajime Nobuhara received the Ph.D. degree from Tokyo Institute of Technology, Japan in 2002. He worked as a postdoctoral fellow at the University of Alberta, Canada from April to September 2002. From 2002 to 2006, he was affiliated with Tokyo Institute of Technology, Japan. Since 2006, he has been affiliated with the University of Tsukuba, Japan. He is a member of IEEE, the Japan Society for Fuzzy Theory and Intelligent Informatics, and the Institute of Electronics, Information and Communication Engineers.


Proceedings of the APAN – Research Workshop 2017
ISBN 978-4-9905448-7-4

Identity Authentication and Data Access Authorization in Regional Education Informatization

Qi Feng, ZhongLin Chen, FuKe Shen and YuHong Zhu

Abstract—In the process of regional informatization, different types of challenges are encountered. Governments, universities, and enterprises, which play the key roles in the construction mechanism, are building their own information systems, but these systems cannot interoperate because they lack an interconnection standard. An effective data-sharing mechanism for these systems is therefore a major concern. In this paper, starting from the top-level design of the informatization ecosystem, we discuss the roles played by these participants and the mechanism by which their systems interconnect, so as to advance regional informatization. We take the authentication system built for Shanghai education as an illustration, comprehensively presenting its process, design, implementation, and current status. The new approach in this paper is a fusion scheme of OAuth2 and Shibboleth that combines the advantages of both; for federation maintenance, we also provide methods and practices to improve the service level.

Key words: Authentication; cross-region authentication; Shibboleth; OAuth2

I. INTRODUCTION

In the development of regional informatization, three key challenges are encountered:

Difficulty in sharing information: for historical reasons, the resources of governments, educational institutes, and internal departments have mostly developed into exclusive monopolies, with a lack of communication between one another, eventually making information sharing difficult.

Difficulty in data analysis: owing to the lack of a scientific top-level design and of coordinated information construction, and to the independent development of information resources in terms of organization, capital, and systems, construction has remained at a low level, which has also left data collection incomplete. These issues bring enormous difficulties to data analysis; thus, Big Data analysis in regional education has become an empty slogan.

No cohesion among government, educational institutes, and enterprises: currently, the main body of regional education informatization comprises three parties: government, educational institutes, and enterprises. Among them, the government provides financial support to multiple independent projects without taking data-sharing into consideration, leaving behind one "information islet" after another. Educational institutions likewise possess a large number of low-level information systems with analogous functions; repeated construction, constrained staffing, and ineffective system functions result in low returns on informatization. Although enterprises have some enthusiasm for participating, their cooperation with educational institutions is quite fossilized under the current strategy; many companies show interest in participating but cannot find an entry point to start from.

Qi Feng is with the Information Technology Services Center, East China Normal University, North Zhongshan Road Campus: 3663 N. Zhongshan Rd., Shanghai, China (e-mail: [email protected]).
ZhongLin Chen is with the Information Technology Services Center, East China Normal University, North Zhongshan Road Campus: 3663 N. Zhongshan Rd., Shanghai, China (e-mail: [email protected]).
FuKe Shen is with the Information Technology Services Center, East China Normal University, North Zhongshan Road Campus: 3663 N. Zhongshan Rd., Shanghai, China (e-mail: [email protected]).
YuHong Zhu is with the Information Center, Shanghai Municipal Education Commission: 25N, WeiHai Rd., Shanghai, China (e-mail: [email protected]).

II. THE TOP-LEVEL DESIGN OF REGIONAL INFORMATIZATION

In the mechanism of regional informatization, it is critically important to clarify the responsibilities of the different participants and to make clear the relationships among them. The top-level design of informatization should emphasize the design of the overall framework of the regional information system, as well as the design of an information infrastructure that can support multiple construction bodies, for the purpose of developing a regional information ecosystem. Meanwhile, we are required to respect the independence of each main body and to avoid interfering with its information planning, management system, etc. The following discussion focuses on the relationship between identity management and resource sharing, in order to delineate the different responsibilities of educational institutes, governments, and enterprises in the top-level design.

Educational institutes: Educational institutes supervise and identify their users, and supply functional education resources.

The users of an education information system are primarily teachers and students, whose identities are managed entirely by their respective institutes. The authorization mechanism in cyberspace should mirror that of the physical world. That is why academic institutes are required to bring forward reasonable solutions, construct effective identity authentication systems, and provide authentication services for their users. An institute also delivers the corresponding types of application authorization to different users.

Conversely, each academic institute tends to hold a large amount of distinctive resources, and how to derive value from these resources must be taken into consideration. The sharing of school resources should highlight these distinctive applications.

Government: The government is the primary construction body of the information infrastructure, which is capable of interconnecting multiple identities and application resources. Moreover, it helps form a scientific and reasonable sharing mechanism.

It is quite difficult to standardize academic information systems into a unified product that satisfies all user requirements. The government should use this infrastructure, along with corresponding incentive strategies, to motivate enterprises and educational institutes to deliver top-notch, personalized services and to make full use of the resources of each application.

Enterprises: Enterprises have advantages in producing different applications and resources, and they can provide high-quality services.

A good educational information ecosystem cannot work without the participation of enterprises. Enterprises can bring forward personalized applications for different users. Enterprises are expected to assume responsibility for the development of the academic information ecosystem and to share that responsibility, in addition to earning a reasonable commercial profit.

A sustainable regional education information framework should comprise three key aspects: authentication, authorization, and data-sharing.

Authentication: Each user's authentication is the responsibility of the user's own organization. Each participant should fully trust the authentication results from the other providers in the federation.

Authorization: Authorization can be divided into two aspects.

The first is authorization for resource access, also termed user authorization. For instance, we can define a user's access limits so that he can only read. Authorization should be based on each individual's features; for instance, users can be assigned different roles to distinguish between staff and students.

The second is authorization for the different application development participants, also called application authorization. For instance: what kind of application can participate in the ecosystem? How much educational resource can be utilized by these application developers? And what operations can they perform on the data in the infrastructure? All of these matters should be controlled by application authorization.

Data-sharing: Data forms the base of the information infrastructure. A data-sharing mechanism is quite significant to support the top-level design as well as to supply personalized services to users. A productive educational ecosystem should include user elementary data together with user behavior data.

User elementary data is associated with the user's identity and personal information. Elementary data is primarily collected offline; once the user is transferred to an online character, the elementary data is entered into the online system.

User behavior data is defined as the data logging user behavior, as well as the data produced by operational systems. These two kinds of data form the basis of Big Data analysis.

Authentication, authorization, and data-sharing should be encapsulated in different services and openly provided to every participant through Application Program Interfaces (APIs).

For the purpose of supporting several construction participants as well as users from multiple educational institutes, authentication must be performed by each organization itself, and the authentication results of the multiple organizations must be trusted by one another within the regional federation. Shibboleth is an open-source and quite effective solution to this issue.

Shibboleth is standard open-source software that allows users belonging to the same federation to log in through the web. When a user visits protected resources, Shibboleth provides safe and private identity authentication services.

There are three constituents of Shibboleth:

IDP (Identity Provider): The key function of the IDP is validating users and providing the user's attributes to resource providers. In accordance with these attributes, the server responds to the user's activities.

SP (Service Provider): The key role of the resource service provider is responding to the user's request and inquiring about the attributes of the user from the IDP to which the user belongs. Thereafter, in accordance with the result of the inquiry, the SP decides whether the user is allowed to visit the resource or not.

DS (Discovery Service): The Discovery Service is responsible for guiding users to their own IDP for authentication.

Figure 1 presents the logical framework of Shibboleth.


Fig 1 Logic framework of Shanghai Cross-organization Education

When a user visits resources from the federation, the data flow goes through the following steps, shown in Figure 2 (a condensed sketch of the flow follows the figure caption below):

A. The user makes a request to access the protected resources.
B. The SP judges whether the user has already been authenticated; if not, it redirects the user to the DS.
C. The user chooses the IDP of his own organization on the DS.
D. Authentication is requested from the IDP selected by the user.
E. The IDP authenticates the user.
F. The IDP sends a message containing the user's confirmed identity to the SP.
G. The SP requests the user's attributes from the IDP.
H. In accordance with the rules already agreed upon by the federation organizations, the IDP sends the relevant attributes to the SP.
I. Based on the user's attributes, the SP finishes the authorization procedure.

Fig 2 Shibboleth auth flow
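A compressed sketch of steps A-I as direct function calls (real deployments exchange signed SAML messages over browser redirects; the class shapes and the attribute-release rule below are illustrative assumptions, not Shibboleth's API):

from dataclasses import dataclass, field

@dataclass
class IdP:
    name: str
    users: dict                                  # username -> attribute dict
    def authenticate(self, user) -> bool:        # steps D/E: login at home IdP
        return user in self.users
    def release_attributes(self, user) -> dict:  # steps G/H: federation rules
        allowed = {"typeOf", "org"}              # agreed attribute set (assumed)
        return {k: v for k, v in self.users[user].items() if k in allowed}

@dataclass
class SP:
    sessions: set = field(default_factory=set)
    def request(self, user, discovery, idps):
        if user in self.sessions:                # steps A/B: session check
            return "resource"
        idp = discovery(user, idps)              # step C: user picks IdP at DS
        if not idp.authenticate(user):           # steps D/E/F
            return "denied"
        attrs = idp.release_attributes(user)     # steps G/H
        self.sessions.add(user)
        return "resource" if attrs.get("typeOf") else "denied"  # step I

idps = [IdP("IDP1", {"alice": {"typeOf": "Teacher", "org": "univ1"}})]
print(SP().request("alice", lambda u, lst: lst[0], idps))  # -> "resource"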

III. SHANGHAI EDUCATION AUTHENTICATION CENTER

The Shanghai Education Authentication Center (SEAC) provides authentication, authorization, and data-sharing for Shanghai education users. Shibboleth delivers a cross-organization authentication framework together with a logical solution for authorization based on user attributes. Nonetheless, its authorization for applications is comparatively simple, being based on the distributed authorization of the authentication nodes; it lacks centralized authorization management. Furthermore, the transfer of authentication data to applications also lacks standards. The distributed architecture must be centrally managed in order to support the cooperation of multiple bodies.

That is the reason we require a centralized authorization management policy to coordinate applications and authentication nodes. However, the existing distributed authentication framework cannot be abolished. In our design, we integrate OAuth2 with Shibboleth: SEAC's solution performs authorization management with OAuth2 and distributed authentication with Shibboleth. The frame layout is illustrated in Figure 3.

Fig 3 Framework of SEAC

The SEAC unified authentication center performs the key function of providing distributed authentication, centralized authorization, and data-exchange services for the docked application market. An application is authorized through the open OAuth2 interface, and the cross-organization authentication, docking Shibboleth for distributed authentication, is carried out during the authorization process. As users' attribute data passes through the SEAC identity authentication center platform, the information is integrated and cleaned, then packaged in a unified standard format and issued as an API for the applications to call.

In this way, OAuth2 effectively acts as a Shibboleth SP proxy. Compared with the native Shibboleth SP interface (SAML), the OAuth2 API (REST) is clearly friendlier to developers. More importantly, this provides an opportunity for data cleansing, which can significantly enhance the federation's service level.

Consider the following situation: each IDP provides an attribute "typeOf" to distinguish the user's identity, such as "teacher" or "student". However, the value of typeOf is not standardized across IDPs, as TABLE I shows:

TABLE I
IDP ATTRIBUTES

IDP     typeOf
IDP1    Teacher, Student
IDP2    Tc, Stu
IDP3    05331, 05332

Without the proxy, this would require additional data-cleansing work in every application. Now that work is done by the SP proxy: for applications, the data returned by the API is always standard and consistent, which is more appealing to developers.
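A minimal sketch of the cleansing the SP proxy performs for this attribute (the mapping follows TABLE I; which numeric code denotes teacher versus student, and the fallback value, are assumptions):

# Per-IDP mappings from local typeOf values to the federation standard.
TYPEOF_MAP = {
    "IDP1": {"Teacher": "teacher", "Student": "student"},
    "IDP2": {"Tc": "teacher", "Stu": "student"},
    "IDP3": {"05331": "teacher", "05332": "student"},  # assumed code meanings
}

def normalize_typeof(idp: str, raw: str) -> str:
    """Return the standard value, or 'unknown' if an IDP sends a new code."""
    return TYPEOF_MAP.get(idp, {}).get(raw, "unknown")

# normalize_typeof("IDP2", "Stu") -> "student"

Applications then see only the normalized values, regardless of which IDP authenticated the user.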

In conclusion, SEAC makes use of an authorization mode, OAuth2, which is the mainstream of Internet authentication, offers fast access, and is easy to promote. For cross-organization authentication, the authentication data flows through the authentication center before being redirected to the application, which gives SEAC an opportunity to carry out data cleaning. Moreover, this helps standardize data structures and promotes application docking. In this way we are able to construct an open authentication platform like those of Google/Facebook/GitHub through the SEAC center, and form an ecosystem of regional education informatization.

SEAC is meant to deliver services to the whole of Shanghai education, so it is important to guarantee the reliability of its services; monitoring of SEAC is therefore of utmost importance as well. Nonetheless, because the federated authentication framework has multiple participating bodies, quite unlike a single-body system, it is difficult to apply a strong monitoring program to it. We can only use different schemes for gathering monitoring information according to the different objects in the framework, and display results and alarms to users through a unified monitoring platform.

There are two types of monitored applications in the regional informatization authentication framework:

The first type is applications supported directly by the regional informatization administrator, such as those that deliver the OAuth2 interface. We can apply a robust collecting method to these applications; for instance, an agent can be deployed directly on the server to collect its status and logs and report them to the monitoring platform.

The second type is applications supported by regional informatization participants, such as the IDPs delivered by educational institutes. These applications reside in each respective educational institute, so we recommend that they use a polling technique to monitor the running status and report it to the platform.

The monitoring framework is illustrated in Figure 4.

Fig 4 Framework of Monitor

For example, the platform monitors the HTTP status interface of each IDP, checking the status of an IDP and reporting it to the platform periodically. If an IDP node is shut down, the collector receives an error code, and the platform raises an alarm as required. A minimal sketch of such a collector appears after Fig. 5 below.

The following figure shows the average request latency of 4 IDP nodes, in milliseconds (ms).

Fig 5 IDP latency time
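A minimal sketch of such a status collector (the URLs, timeout, and alarm hook are assumptions; a production platform would also persist the latency series plotted in Fig. 5):

import time
import urllib.request

IDP_STATUS_URLS = ["https://idp1.example.edu/idp/status"]  # hypothetical endpoints

def poll_once(url: str, timeout: float = 5.0):
    """Return (http_status, latency_ms); any error counts as status 0."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, (time.monotonic() - start) * 1000
    except Exception:
        return 0, (time.monotonic() - start) * 1000

def collect(alarm=print):
    for url in IDP_STATUS_URLS:
        status, latency_ms = poll_once(url)
        if status != 200:
            alarm(f"IDP down: {url} (status {status})")
        # otherwise report (url, status, latency_ms) to the monitoring platform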

Because the SEAC informatization administrator supports the DS node directly, we are able to collect quite detailed application information. We can collect not only CPU, storage space, and disk I/O status, but also the average number of authentication requests, through the collection and analysis of the DS server's logs. This statistical analysis filters out worthless requests, so it is more accurate than average daily page views.

As presented in the following figure, SEAC requests appear periodically over the semester, being low during the summer and winter holidays. This is because, during term time, teachers and students have frequent cross-organization communications, leading to more use of the wireless service based on SEAC.

Fig 6 DS request counts

Because the federated authentication adopts a distributed authentication framework, the quality of the user authentication service depends on each IDP node, yet in practice the service quality of the IDP nodes cannot be centrally controlled: alongside very high-quality IDPs, there are IDPs whose services are frequently unavailable or cannot be provided normally. The inherently loose coupling of the federated authentication architecture is the key cause. Nonetheless, SPs always want a stable authentication experience that matches their expectations, for an unstable experience is difficult to promote and becomes resistance to the federation's development. We have tried a variety of ways to improve the federation's service level:

If the IDP is also an important function within a university, more stable operation can be expected. Therefore, we work with vendors that provide identity authentication solutions for universities. In the future, we will have the federated identity module integrated into university identity authentication systems, so that the IDP becomes an important function within each university. This cooperation is under negotiation.

For some small, technically weak universities, we provide the IDP function as SaaS (IDaaS), for free. The service has been tested and verified, and will be officially launched in the second half of 2017.

Some universities do not want to use cloud services. For them, we provide remote support services based on Jumpserver, which records all operation logs, so we can offer safe and controllable remote operation and maintenance of their IDPs.

The most important measure is IDP classification and certification. InCommon Assurance has established an IDP certification scheme based on security considerations. In addition to such a certification scheme, we believe we can rate IDPs on the attributes they provide in their responses, on service stability, on response time (such as 5×8 or 7×24 support), and on other comprehensive factors. With evaluation and monitoring by a third-party organization, we should award top-level certification to those IDPs that deliver exceptional services. Meanwhile, publishing the evaluation and monitoring results online would provide SPs with a clear and predictable authentication experience. SPs can also adjust their service targets in accordance with the results of the IDP level analyses, promoting and publicizing with a clear target.

IV. PRACTICE IN SHANGHAI

On the basis of Shanghai SEAC, we have engineered numerous education applications, docking them to current systems and providing a large amount of state-of-the-art resources. The cross-organization authentication system has spread rapidly and broadly, offering a friendly interface to users. We introduce several distinctive cross-organization applications below.

1. Shanghai Educational Cross-organization Wi-Fi (SECW)

The Shanghai Educational Cross-organization Wi-Fi (SECW) system offers wireless resource sharing. This technology is quite similar to the global wireless roaming framework eduroam, which allows a user to access the shared resources of any supporting college once his own account is successfully authenticated.

The technology of eduroam is based on 802.1x and RADIUS and is fully distributed in its management, which demands more technical management from the schools providing roaming services. SECW, by contrast, is based on a web portal and applies half-centralized, half-distributed management, which requires comparatively less technical management and is handier to popularize.

Fig 7 Shanghai educational Cross-organization Wi-Fi

As of December 2016, SECW comprised 40 units supporting Shanghai educational wireless roaming, including 3 Bureaus of Education (covering identity authentication in the primary and secondary schools of each bureau). Over the same period, there were 4 colleges supporting eduroam.

2. Cross-organization Education by Universities from Northeast Shanghai

In order to motivate academic resource sharing, the Shanghai Municipal Education Commission supports the development of an inter-course enrollment system, based on SEAC, serving 12 universities in the Shanghai Northeast Region. It allows students from all 12 member universities, including Fudan University, Tongji University, Shanghai University of Finance and Economics, etc., to participate in short courses run by the other colleges. Once a student passes the exams, he secures a certificate for the course awarded by the host university. The platform for intercollegiate education by universities from northeast Shanghai (http://www.kxxfx.shec.edu.cn) uses SEAC authentication to transfer users' attributes, avoiding rebuilding accounts for users and, at the same time, bringing convenience to those who take part in another college's minor courses.

Fig 8 Cross-organization Education by Universities from Northeast Shanghai

3. Construction and Sharing Platform of High Quality Resources in the Shanghai Area

The construction and sharing platform of high quality resources in the Shanghai area (http://www.kxzy.sh.edu.cn) is a resource-sharing project supported by the Shanghai Municipal Education Commission, aiming at the collection of every university's databases, characteristic data, and high-quality resources. At present, the platform has integrated the high-quality databases of 12 colleges, including Fudan University, Tongji University, Shanghai Normal University, East China Normal University, etc. The databases house a diverse variety of resources, for instance books from the Republic of China era, ancient books, ordinary reading material, teachers' guides, degree theses, etc. The platform holds more than 300 thousand documents, 85% of which offer the full text of the original document, while 15% offer copies of the documents. The entire authentication for these resources is based on SEAC.

Fig 9 Construction and Sharing Platform of High Quality Resources in Shanghai Area

4. Eastern Airlines Campus Promotions

Eastern Airlines Campus Promotions is a promotion activity aimed at education users, allowing them to purchase discounted tickets in the app once they have been authenticated. Many enterprises have their own promotion policies for education users, but auditing eligibility is inconvenient and raises several safety issues. Eastern Airlines Campus Promotions, by contrast, is based on SEAC, and this successful practice has earned many compliments.

Fig 10 Eastern Airlines Campus Promotions

5. Shanghai Massive Intelligent Learning Environment (SMILE)

SMILE is a platform that docks third-party providers using OAuth 2.0 technology on top of SEAC. The platform aims to provide distinctive academic services, including relevant resources, study materials, and many kinds of online courses, and is also known as "WeShare Science & Technology".

With OAuth2 technology, the WeShare platform allows users to log in in numerous ways, for instance via WeChat, QQ by Tencent, SEAC, etc. For a SEAC user, his academic attributes can greatly support the promotion of the WeShare platform and enhance the quality of its service. The composition of logins is presented below:

Fig 11 Shanghai Massive Intelligent Learning Environment

V. CONCLUSION AND PROSPECTS

The fundamental objective pursued by regional informatization is resource sharing. So that every participant can live up to its potential and each organization can perform its own functions in the educational informatization ecosystem, the government must offer the prerequisite infrastructure as well as corresponding policies to develop education resource prosperity through multi-agent participation. SEAC has proven to be an effective technique: based on cross-organization authentication technology, we carried out several trials on the sharing of Wi-Fi resources (SECW), learning resources, and library resources, all of which exhibited excellent effects. Meanwhile, in the practice of resource sharing, we have distributed to some extent the proposed standard specification and identity authentication system, and both attained great results.

In the future, we will put further effort into three aspects. Regarding the access and promotion of applications, deployment for applicants should be made simpler and easier. Regarding the access and promotion of users, pilot projects are expected to bring more high-quality applications to the open platform, and these applications should in turn drive positive feedback from users and their usage. Regarding the stability and security of the system, a consolidated supervisory console performs monitoring of the status of authentication nodes and applications; as for nodes and applications that lack system maintenance and are unwilling to keep up maintenance, we should consent to their withdrawal for the sake of the platform, its service, and its supporting capability.


Qi Feng Male, engineer at the Information Technology Services Center of East China Normal University. Member of the authentication center in the Shanghai Education Construction Team, and major technology support for cross-organization certification of Shanghai municipal education. Research directions: infrastructure operation and maintenance monitoring, SSO, and cross-organization authentication.

ZhongLin Chen Male, engineer at the Information Technology Services Center of East China Normal University, majoring in software testing and campus informatization services. Focuses on troubleshooting in various operating systems and on the management of school applications.

FuKe Shen Male, Professor and Senior Engineer at East China Normal University, director of the China Higher Education Information Academy, member of the Shanghai education and research computer network expert committee, vice chairman of the Shanghai Association of Higher Education, and director of the Shanghai Education Science Network IPv6 Laboratory (ECNU) Research Center. Research directions: communication and network technology (network architecture, next-generation network protocols, and the principles of network traffic monitoring and management), and educational informatization / digital campus (SSO, cross-domain authentication, e-learning, smart campus). Corresponding author of this article.

YuHong Zhu Female, senior engineer, chief engineer of the information center of the Shanghai Municipal Education Commission, engaged in educational informatization and responsible for the technical management of Shanghai construction and application. She has participated in a number of projects in educational informatization research, including the evaluation index system of university informatization in Shanghai, Shanghai educational informatization planning and development research, and the cross-organization authentication system.

Prototyping Workload-based Resource Coordination for Cloud-leveraged HPC/BigData Cluster Sharing

Namgon Lucas Kim and JongWon Kim

Abstract — Recently, high-performance computing (HPC) and BigData workloads have increasingly been running over cloud-leveraged shared resources, whereas traditionally dedicated clusters have been configured only for specific workloads. That is, in order to improve resource utilization efficiency, shared resource clusters are required to support both HPC and BigData workloads. Thus, in this paper, we discuss a prototyping effort to enable workload-based resource coordination for a cloud-leveraged shared HPC/BigData cluster. Taking an OpenStack cloud-leveraged shared cluster as an example, we demonstrate the possibility of workload-based bare-metal cluster reconfiguration with interchangeable cluster provisioning and associated monitoring support.

Index Terms — HPC/HTC workload, BigData workload, cloud-based shared cluster, dynamic resource configuration, and bare-metal cluster provisioning.

I. INTRODUCTION

Nowadays we can easily realize diversified applications at a low cost owing to the emerging cloud-first computing paradigm that leverages flexible resource pooling. In particular, high-performance computing (HPC) and BigData workloads are increasingly spreading over cloud-leveraged shared resource infrastructure to enjoy its scaling and reliability benefits. Thus, it is important to leverage the resource pooling power of hyper-scale cloud-based shared clusters while balancing the dedicated engineering for HPC MPI (message passing interface) parallel computing workloads and/or data-intensive BigData computing/storage workloads.

This work is supported in part by a collaborative research project of PLSI supercomputing infrastructure service and application, funded by Korea Institute of Science & Technology Information (KISTI). Also, this work is supported in part by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. R7117-16-0218, Development of automated SaaS compatibility techniques over hybrid/multisite clouds).

Namgon Lucas Kim and JongWon Kim are with the School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, Korea (e-mail: {namgon, jongwon}@nm.gist.ac.kr).

However, traditionally, dedicated clusters for HPC and BigData parallel workloads have been separately configured only for the chosen workload, and thus most dedicated clusters cannot flexibly match and utilize the full capacity of cluster resources. To improve the efficiency of resource utilization, various types of shared clusters have been proposed [4-9]. Besides, the growing popularity of x86 hardware and the Linux operating system is accelerating the adoption of hyper-converged (i.e., compute/storage/networking-integrated) cluster nodes (denoted as boxes in this paper) [1]. Thus, it is becoming cheaper and easier to flexibly support both HPC and BigData workloads on a single cloud-leveraged cluster of hyper-converged boxes, which are to be coordinated by composable resource management.

Thus, in this paper, we discuss a prototyping effort to enable workload-based resource coordination for a cloud-leveraged shared HPC/BigData cluster. The workload-based resource coordination (and thus sharing) is handled by an entity called composable resource management, where software-based resource management for the required dynamic coordination is prototyped with resource management APIs. More specifically, as depicted in Fig. 1, we enable workload-based cluster reconfiguration over hyper-converged SmartX Boxes [2], clustered with the open-source OpenStack cloud infrastructure software [3]. That is, workload-based cluster resource coordination is designed


and prototyped over the shared resource pools of an OpenStack cloud-leveraged cluster. In particular, flexible and automated resource coordination is realized by leveraging bare-metal OpenStack cloud provisioning. By taking an OpenStack cloud-leveraged shared cluster as an example, we demonstrate the possibility of workload-based bare-metal cluster reconfiguration with interchangeable cluster provisioning and associated monitoring support.

Fig. 1. Workload-based cluster resource configuration: Concept. [figure]

II. CLOUD-LEVERAGED CLUSTER SHARING FOR HPC/BIGDATA WORKLOADS

Based on the coordination power of composable resource management, cloud-leveraged cluster sharing should serve multiple heterogeneous workloads in general. If we configure a dedicated cluster over a pay-per-use (mostly VM-unit) public cloud, it can easily become inefficient when the demanded workload does not match the configured resources. Thus, in order to efficiently manage and operate a cloud-leveraged shared cluster, the composable resource management should manage shared resource pools based on both workload-aware resource coordination (i.e., reconfiguration) and scheduling policy.

As discussed above, composable resource management for a cloud-leveraged shared cluster usually includes workload management to take charge of complex and difficult scheduling logic. The shared cluster needs to flexibly manage the resource allocation for workloads by utilizing resource management APIs. Several early studies have applied one shared cluster to heterogeneous workloads such as HPC and BigData. First, Hadoop on HPC ported Apache Hadoop, as a BigData processing framework, to execute on an HPC cluster [4]. Also, Univa supports API-based Apache Mesos resource scheduling [5] through the universal resource broker (URB) [6] as a workload-aware tool for grid-computing-style resource management; that is, scheduling-based resource coordination is supported with Mesos software frameworks without modifying Univa Grid Engine. YARN-MPI [7] modified YARN, the resource management tool for Apache Hadoop clusters, to enable running MPI-based parallel computing. Note, however, that most of these proposals require specialized implementation to support additional (i.e., not initially designed-for) workloads.

Moreover, we may selectively choose the type of node (i.e., bare-metal or virtual machine) when configuring cloud-leveraged shared clusters. A bare-metal cluster can exhibit more computing power than a virtual machine cluster [8]. For example, we can provision a cloud-leveraged HPC cluster with a focus on bare-metal clustering for HPC workloads [9]. Similarly, BigData clusters can be enabled over workload-customized provisioning of bare-metal resource boxes with iSCSI-based storage to improve overall performance [10]. This work is indeed quite close to ours, except that it only considers a BigData cluster with iSCSI-based storage and PXE+TFTP capability.

In summary, first, without any major update of related software, the resource coordination module in the proposed prototype should execute dynamic provisioning over a reconfigurable HPC/BigData cluster. Also, we utilize more flexible resource management, including cloud-leveraged bare-metal provisioning. Finally, we adopt iPXE+HTTP to deploy bare-metal images, reducing the provisioning time for resource coordination.

III. PROTOTYPING WORKLOAD-BASED RESOURCE COORDINATION

A. Cloud-leveraged Cluster and Resource Coordination

As depicted in Fig. 1, the proposed cloud-leveraged shared cluster is built with hyper-converged bare-metal nodes. In order to flexibly install and operate the shared cluster over the OpenStack cloud environment, we leverage OpenStack Ironic bare-metal node provisioning to prepare the targeted shared cluster nodes (i.e., bare-metal instances) with bare-metal images including the related libraries for a targeted cluster. Thus, the proposed workload-based resource coordination should be able to support bare-metal image generation/management and consistent cluster configuration management, as part of automated cluster provisioning. Automated provisioning is critical since we need to frequently go through the so-called CRUD (create, read, update, and delete) operations on shared cluster nodes to match time-dependent workload variations. That is, we need to define the life-cycle (tied with CRUD capability) of the desired resource coordination, and then design and implement automated cluster provisioning (with reconfiguration). Also, depending on the operation policy and the resource status of the whole shared cluster, the required amount of resource slices is chosen from the available resource pools. The selected resource slices are allocated and then configured to execute the demanded workloads under the coordination of composable resource management. In addition, the composable resource management needs to be periodically updated with the operation status data of the shared cluster so that it can continuously monitor resources and workloads.

The required provisioning for a cloud-leveraged shared cluster can be divided into the following two stages.

First, we prepare well-arranged bare-metal images and execute node creation by downloading them (from a public or private repository) and installing them onto the bare-metal nodes selected from the OpenStack cloud resource pool. Note that all bare-metal images should be ready to communicate with the proposed resource coordination via pre-installed agents.

Second, when the shared cluster is ready to start operation, we need to continuously apply coordination actions to sustain its operation. The key feature of the required resource coordination is dynamic cluster reconfiguration supported by composable resource management. Note that, in order to match diverse workload demands, the status of resources and workloads should be continuously monitored. Resource coordination also needs to cover the required orchestration of shared cluster operation. For this, eventually, resource management APIs are to be implemented to support various reconfigurations of shared


cluster operation. In addition, this kind of resource coordination can be executed only if we have sufficient access/control authority (i.e., privilege) over the underlying shared cluster nodes and their operation. Thus, for each cluster, we typically arrange cluster compute/agent nodes and coordinate them via a cluster master node. Note that, if the master node cannot access/control the compute/agent nodes, we may directly control the troubled nodes.
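As a minimal sketch (in Python) of the CRUD-style life-cycle described above, assuming the prototype's habit of shelling out to the OpenStack CLI; the class and method names here are hypothetical illustrations, not the prototype's actual API:

    import subprocess

    class ClusterCoordinator:
        """Hypothetical facade for CRUD-style coordination of cluster nodes."""

        def _openstack(self, *args):
            # Invoke the standard `openstack` CLI, as the prototype embeds
            # an OpenStack client for the same purpose.
            result = subprocess.run(["openstack", *args],
                                    capture_output=True, text=True, check=True)
            return result.stdout

        def create(self, name, image, flavor, network):
            # Boot a bare-metal instance from a workload-specific image.
            return self._openstack("server", "create", "--image", image,
                                   "--flavor", flavor, "--network", network,
                                   "--wait", name)

        def read(self, name):
            # Inspect the current state of a cluster node.
            return self._openstack("server", "show", name)

        def update(self, name, image):
            # Reconfigure a node by rebuilding it with another workload image.
            return self._openstack("server", "rebuild", "--image", image, name)

        def delete(self, name):
            # Release the node back to the shared resource pool.
            return self._openstack("server", "delete", name)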

To execute a workload on the shared cluster, a reconfiguration request file is needed to set up the workload type (HPC, BigData, or mixed), the scheduling type, the maximum wait time and min/max resource requirements, and cluster scaling. Meanwhile, the shared cluster operator prepares an operation policy that defines the number of simultaneous users, the maximum amount of resources per user, the maximum wait time, and other limits. Thus, the requested reconfiguration files must be verified for completeness as well as for operational conflicts. For example, if some reconfiguration parameters are not reasonable, we need to reject them (or change them to defaults) with warning messages, as sketched below.
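A minimal validation sketch in Python, assuming a simple dictionary encoding of the request file; all field names and policy values are illustrative, not the prototype's actual schema:

    # Hypothetical reconfiguration request and operator policy.
    request = {
        "workload_type": "mixed",     # HPC | BigData | mixed
        "scheduling_type": "fifo",
        "max_wait_minutes": 90,
        "nodes_min": 2,
        "nodes_max": 8,
    }
    policy = {"max_nodes_per_user": 4, "max_wait_minutes": 60}

    def validate(req, pol):
        warnings = []
        # Completeness/sanity checks: unreasonable parameters are rejected.
        if req["workload_type"] not in ("HPC", "BigData", "mixed"):
            raise ValueError("unknown workload type; request rejected")
        if req["nodes_min"] > req["nodes_max"]:
            raise ValueError("min/max resource requirements are inconsistent")
        # Operational conflicts: clamp to the operator policy with warnings.
        if req["nodes_max"] > pol["max_nodes_per_user"]:
            warnings.append("nodes_max clamped to per-user policy limit")
            req["nodes_max"] = pol["max_nodes_per_user"]
        if req["max_wait_minutes"] > pol["max_wait_minutes"]:
            warnings.append("max wait time reduced to the policy default")
            req["max_wait_minutes"] = pol["max_wait_minutes"]
        return req, warnings

    checked_request, notes = validate(request, policy)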

B. Prototype Implementation

We implement a prototype of the coordination of composable resource management using an OpenStack-based cloud resource pool. Fig. 2 shows an overview of the prototype implementation, including its building blocks and associated procedures. As explained already, we adopt OpenStack Ironic [11] and use bare-metal node images for an HPC cluster with Slurm and a BigData cluster with Mesos, respectively. The Mesos agent node image is pre-installed with Mesos and related packages, following the OpenStack bare-metal image format. Similarly, the Slurm compute node image is prepared by utilizing OpenHPC [12] packages and configured for efficient HPC cluster operation. One thing to mention is that both

OpenStack and OpenHPC packages can support bare-metal node provisioning; thus, in this work, we only integrate the cluster management (i.e., reconfiguration-related) features into the Slurm compute image. Slurm requires synchronization files for authenticated cluster operation, and the consistency of these sync files across cluster nodes is important. Finally, all bare-metal images are created by utilizing the open-source Diskimage-builder [13].
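For illustration, image builds with Diskimage-builder are driven by its disk-image-create tool; the sketch below (wrapped in Python for consistency) uses the standard centos7 and baremetal elements, while the slurm-compute and mesos-agent element names are hypothetical custom elements, not the authors' actual ones:

    import subprocess

    def build_image(custom_element, output_name):
        # The `baremetal` element makes disk-image-create emit the
        # kernel/ramdisk pair alongside the disk image, as Ironic
        # deployment requires.
        subprocess.run(["disk-image-create", "centos7", "baremetal",
                        custom_element, "-o", output_name], check=True)

    build_image("slurm-compute", "hpc-compute-image")   # HPC node image
    build_image("mesos-agent", "bigdata-agent-image")   # BigData node image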

Each cluster node is provisioned by the cooperation of bare-metal node provisioning and composable resource management. Since we utilize the OpenStack CLI to execute the required OpenStack commands, we internally embed an OpenStack client. Note that all OpenStack bare-metal images have cloud-init software and a user agent installed so that they can support centralized control of all cluster nodes. The operation policy is also pre-installed before creating and running the shared cluster. Finally, at this stage of the prototype implementation, we fix both the Mesos and Slurm masters on designated nodes and run them independently. We also execute both direct and indirect monitoring every 10 minutes.
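The overall procedure of Fig. 2(b) can be summarized by the following control-flow sketch; every function here is a hypothetical stub standing in for the prototype's building blocks:

    def handle_request(request, provisioned_cluster):
        desired = derive_cluster_config(request)    # cluster configuration step
        if desired != provisioned_cluster:          # same cluster already provisioned?
            provision_cluster(desired)              # bare-metal (Ironic) provisioning
        apply_operation_policy(desired)             # reconfiguration step
        run_and_monitor_workload(request)           # periodic (10-min) monitoring

    # Stubs so the sketch is self-contained; the real building blocks wrap
    # OpenStack provisioning, policy handling, and the monitoring agents.
    def derive_cluster_config(request): return request.get("workload_type")
    def provision_cluster(config): print("provisioning", config)
    def apply_operation_policy(config): print("applying policy to", config)
    def run_and_monitor_workload(request): print("running", request)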

IV. EVALUATIONS ON PROTOTYPE IMPLEMENTATION

We verify the feasibility of the proposed workload-based resource coordination by prototyping it over an OpenStack cloud-leveraged shared cluster. Table I shows the hardware specification of the shared cluster with three nodes, which is decomposed into a small-size resource pool of 1 master and 2 compute nodes. All these nodes are installed with CentOS 7.3, OpenStack Ocata, Slurm 16.05, and Mesos 1.1.0. In addition, an OpenStack controller node (Intel Xeon X3330 and 8 GB DDR2 RAM) is set up separately. All cluster nodes can be power-controlled via IPMI (Intelligent Platform Management Interface).

For benchmarking workloads, we use the Intel MPI Benchmark (IMB) [14] for the HPC workload and Spark-Perf [15] for the BigData workload, respectively. For the HPC workload, IMB-NBC is selected as the test case and executed with default parameters: 10–1000 iterations, message sizes from 0 B to 4 MB, and 2/4/8 processes. Note that, since each compute node has 12 logical cores, we do not apply Hyper-Threading for the MPI workload, to avoid potential performance degradation. In addition, the K-Means test case is selected for the BigData workload with Scale_Factor 1, which roughly matches a 20-instance workload on Amazon AWS EC2 m1.xlarge.
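As a hedged illustration of how the IMB runs could be launched through Slurm (the exact invocation is our assumption, not stated in the paper):

    import subprocess

    # IMB-NBC with 2, 4, and 8 MPI processes across the two compute nodes,
    # matching the parameter ranges reported above.
    for nprocs in (2, 4, 8):
        subprocess.run(["srun", "-N", "2", "-n", str(nprocs), "IMB-NBC"],
                       check=True)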

A. Evaluation: Workload-based Cluster Coordination

First, we check the provisioning performance of bare-metal cluster nodes by adopting the OpenStack Ironic bare-metal

Fig. 2. Prototype implementation of resource coordination: (a) building blocks (bare-metal image generation/management for the Mesos agent and Slurm compute node images, bare-metal cluster provisioning, resource management with an embedded OpenStack client, cluster configuration, operation policy, reconfiguration, and resource/workload monitoring); (b) overall procedure (request, cluster configuration, provisioning if not already provisioned, applying the operation policy, then running and monitoring the workload). [figure]

TABLE I
SHARED CLUSTER HARDWARE SPECIFICATION (2 COMPUTE NODES)

  CPU:     12 cores, 24 threads (Intel Xeon; model and clock garbled in source)
  RAM:     DDR4 64 GB (PC4-17000)
  Storage: 400 GB SSD (Intel S3500)
  Network: 2-port 10 GbE (each), 2-port 1 GbE (each)


provisioning tool with the pxe_ipmitool driver over a 1 Gbps network interconnection, for both single-node and multi-node cases. Fig. 3 shows the total elapsed time for BigData cluster provisioning under different provisioning options. Note that the bare-metal image of the BigData cluster is larger than that of the HPC cluster, so its overall provisioning takes longer. With OpenStack Ironic bare-metal node provisioning, we need two types of images: a deploy image and a user image. The deploy image includes the OpenStack IPA (Ironic Python Agent), which prepares the bare-metal node provisioning itself. This deploy image is identical for both HPC and BigData cluster provisioning and consists mostly of a kernel (33 MB) and a ramdisk (334 MB). On the other hand, separate user images are selectively used for workload-based cluster configuration; each is decomposed into three parts: kernel (5.2 MB), ramdisk (38 MB), and user image (714 MB).

We represent the overall cluster node provisioning time by separating the time associated with OpenStack Ironic bare-metal node provisioning from the node boot time. The time for OpenStack Ironic bare-metal provisioning is measured by the OpenStack Nova compute service, and includes the time consumed for deploying images and for node boot/reboot. In comparison, the node boot time indicates the time consumed for the final boot after the completion of the OpenStack Ironic provisioning process, which cannot be measured by OpenStack Nova. Fig. 4 shows the detailed time comparison of cluster node boot for several booting options. The firmware option means that we load UEFI firmware before loading the OS boot loader. Also, without the local_boot option, the boot loader consumes more time since it needs to receive images from the controller node. The userspace stage commonly takes longer, and the DHCP-interface option needs the longest boot time, around 30 seconds. Finally, as shown in Fig. 3 and Fig. 4, we can compare the overall reconfiguration time according to the choice of bare-metal image delivery and the local_boot option (i.e., whether user images are written to local storage or not). The comparison shows that HTTP is better than TFTP in general, and that the local_boot option can reduce cluster node boot time.
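The boot stages named in Fig. 4 (firmware, loader, kernel, initrd, and userspace) correspond to the breakdown reported by systemd-analyze on systemd-based distributions such as CentOS 7; one plausible way to collect them per node (an assumption, since the paper does not state its measurement tooling) is:

    import subprocess

    for node in ("compute-1", "compute-2"):   # hypothetical node names
        out = subprocess.run(["ssh", node, "systemd-analyze", "time"],
                             capture_output=True, text=True, check=True).stdout
        # Typical output: "Startup finished in ... (firmware) + ... (loader)
        # + ... (kernel) + ... (initrd) + ... (userspace) = ..."
        print(node, out.strip())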

B. Evaluation: Workload Execution and Monitoring

Now we verify workload execution performance by monitoring the operation (i.e., running) status of workload execution. We perform each experiment independently on the same shared cluster, which is provisioned beforehand. Fig. 5 shows the resource usage patterns monitored over several stages (before, during, and after) of workload execution. We insert 1 minute of idle time between experiments and reserve key resources (CPU and RAM) for monitoring, to avoid unnecessary resource contention.

From Fig. 5, each workload shows two different monitoring results: one directly collected from the cluster compute nodes, and the other collected from the master node for composable resource management. In the case of the HPC workload, CPU usage shows different patterns between the two monitoring options, since Slurm collects a 5-minute average load via Linux kernel APIs. On the other hand, the Mesos-based master node only manages the allocated resource status, which makes the reported CPU and RAM usage fixed at the allocated values. Note also that, with direct monitoring, the usage impact of the operating system kernel and daemons is included. In summary, with the two different monitoring results, we can verify that the resource coordination module could figure

Fig. 3. Provisioning time comparison (bare-metal cluster node for the BigData workload) across PXE+TFTP, PXE+TFTP+Local_Boot, iPXE+HTTP, and iPXE+HTTP+Local_Boot, split into Ironic provisioning time and machine boot time. [figure]

Fig. 5. Monitoring results for workload execution: (a) HPC-IMB with Slurm (direct CPU/memory utilization vs. Slurm-reported values); (b) Spark-Perf with Mesos (direct CPU/memory utilization vs. Mesos-reported values). [figure]

Fig. 4. Boot time comparison (bare-metal cluster node) across the same four options, broken down into firmware, loader, kernel, initrd, and userspace stages. [figure]


out and assist workload-based resource usage.
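A minimal sketch of the two monitoring paths compared above, placing direct node-level sampling next to the scheduler-level views from Slurm and Mesos; the host names are hypothetical and the collection commands are our assumptions, while the 10-minute period mirrors the prototype description:

    import json
    import subprocess
    import time
    import urllib.request

    def direct_sample(node):
        # Direct monitoring: OS-level load on the compute node itself
        # (includes kernel/daemon usage, as noted above).
        return subprocess.run(["ssh", node, "cat", "/proc/loadavg"],
                              capture_output=True, text=True).stdout

    def slurm_view():
        # Slurm exposes per-node CPU load (5-min average) via sinfo.
        return subprocess.run(["sinfo", "-N", "-O", "nodelist,cpusload,memory"],
                              capture_output=True, text=True).stdout

    def mesos_view(master="http://mesos-master:5050"):
        # The Mesos master reports allocated-resource metrics over HTTP.
        with urllib.request.urlopen(master + "/metrics/snapshot") as resp:
            return json.load(resp)

    while True:
        for node in ("compute-1", "compute-2"):
            print(direct_sample(node))
        print(slurm_view())
        print(mesos_view())
        time.sleep(600)   # the prototype monitors every 10 minutes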

C. Provisioning Scalability Expectation

Finally, we check cluster node scalability by estimating the cluster provisioning time for up to 50 nodes. Based on the provisioning time performance in Section IV-A and with the iPXE local_boot option, we depict the estimated time in Fig. 6. Around 20 minutes is estimated to simultaneously provision a 50-node cluster, using the 17-second delta time for each additional node. Also, a 1 GB bare-metal image transfer takes around 13 seconds over the 1 Gbps network (assuming 600 Mbps effective throughput). In addition, we account for the delay caused by the OpenStack controller node.
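The estimate can be reproduced with back-of-the-envelope arithmetic; the ~8.5-minute two-node base time is taken from Section V, and the 600 Mbps effective throughput is the stated assumption:

    BASE_SECONDS = 8 * 60 + 30   # measured two-node provisioning time (Sec. V)
    DELTA_PER_NODE = 17          # additional seconds per extra node

    def provisioning_estimate(nodes):
        return BASE_SECONDS + DELTA_PER_NODE * max(0, nodes - 2)

    transfer_s = 8e9 / 600e6     # 1 GB image at 600 Mbps -> ~13.3 s
    print(f"1 GB image transfer: {transfer_s:.1f} s")
    # ~22 min, in line with the roughly 20-minute figure cited above.
    print(f"50-node cluster: {provisioning_estimate(50) / 60:.1f} min")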

V. CONCLUSION

We presented a prototype of a resource coordination module that performs bare-metal cluster coordination for both HPC and BigData workloads based on an OpenStack cloud-leveraged resource pool. We also verified the possibility of cluster reconfiguration depending on the type of workload by leveraging automated OpenStack Ironic bare-metal provisioning. Note that the provisioning delay for shared resource clustering is 8 minutes and 30 seconds with only two compute nodes, which might be a non-negligible overhead for overall shared cluster performance. Remember, however, that the combined HPC and BigData workloads will usually run for several hours.

REFERENCES

[1] J. Kim, "Realizing Diverse Services Over Hyper-converged Boxes with SmartX Automation Framework," in Proc. Conference on Complex, Intelligent, and Software Intensive Systems (CISIS 2017).
[2] A. C. Risdianto, J. Shin, and J. Kim, "Building and Operating Distributed SDN-Cloud Testbed with Hyper-convergent SmartX Boxes," in Proc. 6th EAI International Conference on Cloud Computing, Daejeon, Korea, Oct. 2015.
[3] OpenStack, http://openstack.org.
[4] A. Luckow et al., "Hadoop on HPC: Integrating Hadoop and Pilot-based Dynamic Resource Management," arXiv preprint arXiv:1602.00345, 2016.
[5] B. Hindman et al., "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center," in Proc. NSDI 2011.
[6] Univa URB, http://www.univa.com/resources/files/urb.pdf.
[7] MPICH2-Yarn, https://github.com/alibaba/mpich2-yarn.
[8] C. G. Kominos, N. Seyvet, and K. Vandikas, "Bare-metal, virtual machines and containers in OpenStack," in Proc. Innovations in Clouds, Internet and Networks (ICIN), 2017.
[9] P. Rad et al., "Benchmarking bare metal cloud servers for HPC applications," in Proc. Cloud Computing in Emerging Markets (CCEM), 2015.
[10] A. Turk et al., "An experiment on bare-metal bigdata provisioning," in Proc. 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 16), 2016.
[11] OpenStack Ironic, https://wiki.openstack.org/wiki/Ironic.
[12] K. W. Schulz et al., "Cluster Computing with OpenHPC," in Proc. HPCSYSPROS16, 2016.
[13] Diskimage-builder, http://docs.openstack.org/developer/diskimage-builder.
[14] Intel, Intel MPI Benchmarks, https://software.intel.com/en-us/articles/intel-mpibenchmarks.
[15] Spark-Perf, https://github.com/databricks/spark-perf.

Fig. 6. Estimated node boot time versus the number of machines (up to 50). [figure]


Proceedings of the APAN – Research Workshop 2017, ISBN 978-4-9905448-7-4

NAMD Benchmarking on Publicly Available Philippine Computational Resources

Ronny Cheng, Ren Tristan Dela Cruz, Francoise Neil Dacanay, Gil Claudio, Ricky Nellas

Abstract — NAMD benchmarks were performed on five proteins of varying system size (anoplin, kalata B1, North Atlantic ocean pout antifreeze protein, Pseudomonas aeruginosa PAO1 lipase, and the octopamine receptor in mushroom bodies, OAMB), solvated with TIP3P water, on four different publicly available computing resources in the Philippines. Our results show that the high-end desktop generated the most ns/day for small and medium-sized systems (anoplin, kalata B1, and the antifreeze protein), while the BlueGene/P generated the most ns/day for larger systems (the lipase and the octopamine receptor). Although these computing resources are capable of exploring protein behavior through molecular dynamics (MD) simulations for small to medium-sized systems, dealing with large systems requires tremendous computational resources. This benchmark highlights the importance of intercommunication in NAMD. Moreover, our results show the advantage of using GPU-accelerated desktops for certain MD simulations. However, the poor scalability of the high-end desktop does not make it viable for simulating large systems. Improvements in Philippine computing infrastructure and protocols are highly recommended to keep up with global advances in high-performance computing.

This work is supported in part by the Newton Agham Programme (Project Number FP160010), the Office of the Vice Chancellor for Research and Development (Grant Number PNE151512), the Natural Sciences Research Institute, and the Office of the Vice President for Academic Affairs of the University of the Philippines Diliman under the Emerging Interdisciplinary Research (EIDR) Program. Acknowledgment is also made to ASTI and the Computing and Archiving Research Environment (CoARE) of the Department of Science and Technology, Philippines, Computational Science Research Center and Philippine Genome Center - Core Facility for Bioinformatics of the University of the Philippines for the allocation of computing resources required for this study.

Ronny L. Cheng is affiliated with the Institute of Chemistry, University of the Philippines, Diliman, Quezon City, Philippines (e-mail: [email protected]).

Ren Tristan A. dela Cruz is affiliated with the Department of Computer Science, University of the Philippines, Diliman, Quezon City, Philippines (e-mail: [email protected])

Francoise Neil D. Dacanay is affiliated with the Institute of Chemistry, University of the Philippines, Diliman, Quezon City, Philippines (e-mail: [email protected]).

Gil C. Claudio is affiliated with the Institute of Chemistry, University of the Philippines, Diliman, Quezon City, Philippines (e-mail: [email protected]).

Ricky B. Nellas is affiliated with the Institute of Chemistry, University of the Philippines, Diliman, Quezon City, Philippines (e-mail: [email protected]).

Index Terms — NAMD, benchmarking, Philippine computational resources, BlueGene/P, GPU-accelerated desktops

I. INTRODUCTION

Molecular dynamics (MD) simulation is a technique that computes the classical many-body problem of interacting large systems at the molecular level, i.e., biomolecules and materials. [1], [2] Its development has been crucial to the molecular-level understanding of the complex motions of proteins with important biological functions. [3] Using MD, we can infer the relation between the structure and the motion of biomolecules, and the resulting structure-function relationship. [4] Understanding the motions of these biomolecules has significant importance in protein engineering and drug design. [5] MD simulations are also used in the ab initio folding of proteins for predicting 3D protein structures, and for understanding enzyme kinetics and its underlying mechanisms. [6], [7], [8] MD is also used to provide insight into nucleation phenomena and to design and optimize molecularly imprinted polymers. [9], [10]

The earliest implementations of MD include the simulation of hard-sphere systems, [1] the fluid dynamics of liquid water and argon, [11], [12], [13] the calculation of thermodynamic properties of binary liquid mixtures, [14] and the protein folding of crambin (46 amino acid residues), [15] among others. Most of these studies were limited to fewer than a thousand atoms because of the computer capabilities existing at the time. [16] With the advent of faster computers and smarter algorithms, simulations of tens of thousands up to millions of atoms are now possible, with timescales ranging from picoseconds to milliseconds. [17] Simulations at these magnitudes require computational capabilities greater than a commercial desktop computer can handle. [2] To address this problem, parallel processing is utilized to perform simulations that would otherwise be very time consuming and impractical to execute on commercial off-the-shelf computers. [17] The simulation benefits from numerous microprocessors configured to perform parallel computations that significantly reduce simulation time. [17]

Another advancement in high-performance computing is the use of graphics processing units (GPUs). [18] Initially used as pure graphics processors, GPUs can also be used in tandem with central processing units (CPUs) to further increase computing capability. [19]


In this set-up, the GPU handles the parallel portions of a code while the CPU handles the serial portions, providing optimized performance compared to using CPU cores alone. [19] GPU-accelerated computer systems are gaining popularity as cost-effective alternatives to HPC, along with their potential to conserve power and space. [18] Desktops with Intel processors can also use hyper-threading, wherein virtual cores become available after all physical cores are utilized; [20] further parallelization occurs through the sharing of resources between physical and virtual cores. [20]

Nanoscale Molecular Dynamics (NAMD) is an MD software package able to simulate biological systems in realistic environments by taking advantage of parallel computing machines to handle the computational complexity of large molecules, using spatial and force decomposition. [21], [22] It is available on multiple platforms, including parallel clusters, desktops, and laptops. [22] NAMD is highly applicable to multimillion-atom systems, capable of simulating up to 2.64 million atoms, and can run simulations at a femtosecond time scale. [17], [21] The implementation of parallel computation in NAMD makes the algorithm scalable, and parallel efficiency can be related to factors such as system size, the number of processors, and the intercommunication infrastructure. [21], [23] A source of NAMD's scalability is the overlap of the calculation of non-bonded forces with the communication-intensive Particle Mesh Ewald method. [24] Scalability is influenced by the number of atoms in the system and the number of processors used. [22] Parallel efficiency decreases with increased communication time between processors, especially for relatively small molecules for which communication time exceeds computing time; [17] thus, parallel efficiency is determined by the communication-to-computation ratio. [21]

The maximum speed-up of an algorithm is the ratio of serial simulation time to parallel simulation time, theoretically bounded by the proportion of parallel components of the algorithm relative to the serial ones. As stipulated by Amdahl's law, the speed-up of a program is limited by the serial components of the task. [25] A worked form of this bound is given below.
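In LaTeX form (the symbols are chosen here for illustration), with $p$ the parallelizable fraction of the work and $n$ the number of processors, Amdahl's law bounds the speed-up as

    S(n) = \frac{T_{\mathrm{serial}}}{T_{\mathrm{parallel}}}
         = \frac{1}{(1 - p) + p/n},
    \qquad
    \lim_{n \to \infty} S(n) = \frac{1}{1 - p},

so even a small serial fraction $1 - p$ caps the achievable speed-up regardless of processor count.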

NAMD benchmarking is done to compare the simulation time of the algorithm on different processors, since it is greatly affected by processor speed. [22] It is also affected by computing time and the intercommunication efficiency between processors. [26] Some NAMD benchmarks found a turnover point: for systems with more atoms than this point, reduced computing time is expected as the number of cores increases, enabling extrapolation of computing time. [17] Using BlueGene/P, increased system size generally results in a linear increase in calculation speed. [27] In comparison to the CPU-only version of NAMD, GPU-accelerated NAMD runs 7.1 times faster and is 2.73 times more power efficient. [18]

This study aims to benchmark NAMD on different publicly available computing resources located in the Philippines by simulating five biochemical systems of different sizes. The results may be used to determine the number of nanoseconds per day that each resource can simulate using NAMD. The benchmark results may also serve, in the future, as a guide for a Philippine roadmap for the development of biomolecular computation and related research.

Fig. 1: Cartoon representations of the benchmarked systems: (a) anoplin (Uniprot ID: P0C005, 11 amino acid residues), (b) kalata B1 (PDB ID: 1NB1, 29 residues), (c) North Atlantic ocean pout antifreeze protein (PDB ID: 1KDF, 70 residues), (d) Pseudomonas aeruginosa PAO1 lipase (PDB ID: 1EX9, 285 residues), and (e) octopamine receptor in mushroom bodies OAMB (Uniprot ID: Q7JQF1, 645 residues). [figure]

II. SYSTEMS AND METHODS

The proteins used in this study were anoplin (Uniprot ID: P0C005), [28] kalata B1 (PDB ID: 1NB1), [29] North Atlantic ocean pout antifreeze protein (PDB ID: 1KDF), [30] Pseudomonas aeruginosa lipase (PDB ID: 1EX9), [31] and an octopamine receptor in mushroom bodies OAMB homology model (Uniprot ID: Q7JQF1), [32], [33] whose cartoon representations are shown in Figs. 1a to 1e, respectively. [34], [35] For charged proteins, ions were introduced to neutralize the system. The CHARMM force field was applied to every protein, and all systems were solvated with a TIP3P water box spanning 15 Å from the protein. [36] The number of atoms in each solvated system is shown in Table I.

TABLE I: List of system specifics used for benchmarking.

  Protein                             Total atoms   Protein atoms   Solvent atoms   Ions
  Anoplin                             10 615        187             10 425          3
  Kalata B1                           13 828        379             13 449          0
  N. Atlantic ocean pout antifreeze   21 920        991             20 929          0
  P. aeruginosa PAO1 lipase           52 546        4 196           48 345          5
  Octopamine receptor OAMB            112 856       9 159           103 692         5



Fig. 2: NAMD benchmark results for the ASTI HPC: (a) NAMD performance and (b) NAMD speed-up against the number of processors. [figure]

Fig. 3: NAMD benchmark results for the BlueGene/P: (a) NAMD performance and (b) NAMD speed-up against the number of processors. [figure]

The simulations were done at 300 K, applying Langevin

dynamics to regulate the simulation temperature and pressure, while the Particle Mesh Ewald (PME) method was used to calculate long-range interactions. [37], [38], [39] The SHAKE algorithm was also used to constrain water bond geometries. [40] The cutoff for van der Waals and electrostatic interactions was set at 10 Å, and a smooth switching function for both was applied from an interatomic distance of 8 Å.
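A hedged sketch of the corresponding NAMD configuration keywords, assembled from the parameters reported above (this is our reconstruction, not the authors' actual input file; the output file name is an assumption):

    # Write an illustrative NAMD configuration fragment to disk.
    config = """
    temperature         300
    langevin            on      ;# Langevin thermostat at 300 K
    langevinTemp        300
    langevinPiston      on      ;# Langevin piston for pressure control
    langevinPistonTemp  300
    PME                 yes     ;# Particle Mesh Ewald long-range electrostatics
    cutoff              10.0    ;# 10 A cutoff for vdW/electrostatics
    switching           on
    switchdist          8.0     ;# smooth switching from 8 A
    rigidBonds          water   ;# SHAKE-like constraint on water geometry
    """

    with open("benchmark.conf", "w") as f:
        f.write(config)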

NAMD simulations were done using four different computer systems: (1) a high-performance computing cluster (HPC) located at the Advanced Science and Technology Institute (ASTI) (48 × Intel Xeon CPU E5-2697 v2 @ 2.70 GHz); (2) a BlueGene/P located at the Philippine Genome Center (PGC) (1 rack, 1024 × 4-core IBM PowerPC 450 @ 850 MHz); (3) an HPC located at the Computational Science Research Center (CSRC) (2 × 4-core Intel Xeon CPU E5405 @ 2.0 GHz); and (4) a high-end desktop computer located at the good ViBES laboratory, Research Building, Institute of Chemistry, University of the Philippines Diliman (4 × Intel Core i7-4790 cores with 4 virtual cores @ 3.60 GHz, accelerated with an NVIDIA GeForce JetStream GTX 970). NAMD version 2.12 was run on the ASTI and CSRC HPCs, NAMD version 2.7b1 on the BlueGene/P platform, and a CUDA build of NAMD on the high-end desktop.
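Hedged examples of how such runs are typically launched (the core counts mirror the benchmark ranges; paths and the configuration file name are illustrative):

    import subprocess

    # CPU cluster run via NAMD's charmrun launcher, e.g., 48 processes
    # on the ASTI HPC.
    subprocess.run(["charmrun", "+p48", "namd2", "benchmark.conf"], check=True)

    # CUDA build on the high-end desktop: 4 CPU threads plus GPU device 0.
    subprocess.run(["namd2", "+p4", "+devices", "0", "benchmark.conf"],
                   check=True)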

III. RESULTS AND DISCUSSION

A. Computing resource performance at different system sizes

1) ASTI HPC: The ASTI HPC is capable of simulating between ∼4.58 ns/day for OAMB with 44 processors and ∼32.98 ns/day for anoplin with 24 processors, with a linear decrease in ns/day for increasing system size (Fig. 2a). A positive correlation between the relative speed-up ratio and the number of processors used was observed for 1EX9 and OAMB, while a breakdown was


observed after 24–32 processors for the other systems, indicating scaling breakdown (Fig. 2b). The scaling breakdown can be attributed to the high latency of the Gigabit Ethernet network compared to other intercommunication networks, which incurs relatively high CPU overhead when sending data, leading to longer idle times for processors. [41] Thus, good scalability on the ASTI HPC is only achieved at larger system sizes.

An advantage of the ASTI HPC is its accessibility to Philippine researchers. A repercussion of this, however, is that multiple jobs requiring HPC capabilities (not limited to NAMD) share the same nodes, causing interference that greatly affects parallel performance. [22] Thus, there is a limit on the number of cores and processed jobs allotted to each user. In addition, a seven-day job time limit is imposed, and jobs surpassing it are automatically terminated.

2) BlueGene/P: The BlueGene/P is capable of generating between ∼6.05 ns/day for OAMB using 256 processors and ∼30.03 ns/day for anoplin using 192 processors (Fig. 3a). A negative correlation was observed between system size and generated ns/day. As shown in Fig. 3b, an increase in the number of processors used led to an increase in the relative speed-up ratio, indicating that the algorithm is scalable on the BlueGene/P. The slopes indicate that parallel efficiency has not yet been maximized, and the lack of a plateau indicates that the number of processors can be increased further without compromising speed-up. This information is vital, especially for the BlueGene/P, since only 1 rack composed of 256 available cores is installed at the Philippine Genome Center, whereas a maximum of 72 racks can be utilized. [42] The scalability of the BlueGene/P can be attributed to its intercommunication infrastructure, which utilizes five different networks. Of the five, the 3-D torus is the main network used for message passing, wherein each node is connected to six nearby nodes. In effect, the average path length between nodes and the required bandwidth decrease, thus reducing latency. [43]

Fig. 4: NAMD benchmark results for the CSRC HPC: (a) NAMD performance and (b) NAMD speed-up against the number of processors. [figure]

Fig. 5: NAMD benchmark results for the high-end desktop computer: (a) NAMD performance and (b) NAMD speed-up against the number of processors. [figure]


3) CSRC HPC: The CSRC HPC, on the other hand, is capable of simulating between ∼0.95 ns/day for OAMB and ∼11.19 ns/day for anoplin, both using 8 processors (Fig. 4a). A decrease in generated ns/day is observed for increasing system size, while the relative speed-up ratio increases, indicating good scalability (Fig. 4b). While the CSRC HPC also uses Gigabit Ethernet, its scalability can be explained by the lower computational capability of its processors (2 × 4-core Intel Xeon CPU E5405 @ 2.0 GHz), wherein computation within processors becomes the bottleneck. The insignificant difference in slope between OAMB and 1EX9 indicates that parallel efficiency has already been maximized, which means that a lower relative speed-up ratio is to be expected if larger systems are simulated.

4) High-end desktop computer: Benchmarks using the high-end desktop computer show a decrease in generated ns/day for increasing system size. It is able to simulate between ∼4.05 ns/day for OAMB and ∼44.64 ns/day for anoplin, both using four processors (Fig. 5a). An increase in the relative speed-up ratio was observed for increasing processors, followed by a decrease after four processors (Fig. 5b). This indicates that hyper-threading on the high-end desktop computer is not beneficial to NAMD; the reduced performance may be attributed to inferior specifications such as the synchronous dynamic random-access memory (SDRAM) interface. The high-end desktop utilizes DDR3 SDRAM (1600/1333/1066 MHz), while its successor, DDR4 SDRAM, is widely available and capable of higher transfer rates. [44] This indicates poor scalability in NAMD due to its dependency on intercommunication. Since GPUs utilize a single-instruction multiple-data (SIMD) organization, lower throughput compared to using CPUs alone is expected due to minimal overlap, [45] a result of prioritizing high aggregate performance over the optimized performance of individual threads. [45] In CUDA-accelerated NAMD, the GPU handles short-range non-bonded forces while the CPU keeps the atom coordinates and calculates long-range electrostatic and bonded forces. [24] Communication comes from the transfer of coordinates and calculated non-bonded forces. [24] Since intercommunication is processed before the calculations start on each node in every timestep, idle time aggregates, making intercommunication the limiting factor for high-end desktop computers. [18], [46]

B. Which computing resource to use?

1) Small system size: The comparison for small system sizes (the anoplin-in-water system) is shown in Fig. 6. The high-end desktop computer provided the most ns/day, followed by the ASTI HPC, BlueGene/P, and CSRC HPC (Fig. 6a). For small system sizes, where dependence on intercommunication is lower due to minimal load balancing, the performance of the high-end desktop computer is probably due to the clock rate and bandwidth of its processors, which are the highest among the computing systems used. This indicates that the high-end desktop computer has the best computational capability among the benchmarked resources.

The relative speed-up ratios for small system sizes suggest that increasing the number of processors for the CSRC HPC and BlueGene/P would further increase the ns/day generated by both resources (Fig. 6b). Sublinear speed-up is observed for all computing resources, in agreement with how NAMD behaves on parallel systems due to its dependence on intercommunication. The relative speed-up ratio values suggest that the parallel capabilities of NAMD have not yet been maximized.

2) Medium system size: Simulations for medium-sized systems (the 1KDF-in-water system) show that the high-end desktop computer and the BlueGene/P generated nearly identical ns/day values, followed by the ASTI HPC and CSRC HPC (Fig. 7a). The benchmark shows that intercommunication becomes more important at medium system sizes.

Based on the relative speed-up ratios for medium system sizes, the CSRC HPC and BlueGene/P would benefit from additional processors, generating more ns/day (Fig. 7b).

Fig. 6: Benchmark of the various computing resources on the anoplin-water system, representative of a small system size: (a) NAMD performance; (b) relative speed-up ratio. [figure]


Sublinear speed-up is also observed for all computing resources. The parallelization of NAMD is not yet maximized for medium system sizes, as exhibited by the relative speed-up ratios.

3) Large system size: For large system sizes (the OAMB-in-water system), the BlueGene/P generated the most ns/day, followed by the high-end desktop computer, the ASTI HPC, and the CSRC HPC (Fig. 8a).

With increasing system size, the BlueGene/P becomes a more suitable computing resource for NAMD. The results show that processor speed is not the only factor in simulation time: the BlueGene/P processors have lower processing capability than the high-end desktop processors due to their lower clock rate, a choice made to keep the BlueGene/P power efficient. [47] Intercommunication becomes a very important factor in NAMD performance at larger system sizes.

The relative speed-up ratios for all computing resources at large system sizes show that further speed-up can be achieved by increasing the number of processors for the ASTI HPC, CSRC HPC, and BlueGene/P (Fig. 8b). Sublinear speed-up is observed, with the highest ratio values for the BlueGene/P and ASTI HPC; hence, the parallel performance of NAMD is not yet maximized.

4) Synthesis: The benchmarks show that GPU-accelerated desktop systems provide an alternative to the other publicly available computing systems in the Philippines for solving problems with NAMD, especially for system sizes below 20 000 atoms. For larger system sizes, however, the BlueGene/P provides the most ns/day. This signifies the importance of intercommunication in NAMD simulations, especially considering that the BlueGene/P processors have the lowest clock rate among all the computing resources used, in exchange for being comparatively power efficient. [47] Despite the high-end desktop computer having superior computational capabilities, the bottleneck due to

Fig. 7: Benchmark of the various computing resources on the 1KDF-water system, representative of a medium system size: (a) NAMD performance; (b) relative speed-up ratio. [figure]

Fig. 8: Benchmark of the various computing resources on the OAMB-water system, representative of a large system size: (a) NAMD performance; (b) relative speed-up ratio. [figure]


intercommunication diminishes its computational advantage, especially at larger system sizes where intercommunication is vital.

Our computing resources are capable of performing simulations for applications such as studying solvent and membrane environments in the protein dynamics of small compounds. [2] However, simulating certain problems, such as protein function elucidation and biological processes, still requires better HPC specifications capable of reaching the microsecond timescale. [48]

To increase the output of local computational research, local computing resources should be enhanced. Better computational resources would allow us to simulate other biomolecules of interest with much larger sizes, such as ribosomes, viruses, and cellulose, among others. [49] Despite GPU-accelerated desktops being more cost effective, their poor scalability shows that adding cores does not provide a solution for simulating larger systems. Instead, systems with better scalability, such as the BlueGene/P, should be invested in, for which the computational resources and time required to complete a NAMD simulation of a given system size can be estimated. BlueGene/P racks can be added, as previous NAMD benchmarks on the BlueGene/P indicated a further decrease in simulation time when 8192 cores are used instead of 4096. [27] The successor of the BlueGene/P, the BlueGene/Q system, may also be used; benchmarks show improvements of the BlueGene/Q over the BlueGene/P in power efficiency and computational capability. [50]

IV. CONCLUDING REMARKS

In this study, NAMD benchmarks on four different publicly available Philippine computing resources were performed for different system sizes. For all computer systems, increased system size resulted in a decrease in generated ns/day. Nearly similar times were found for the ASTI HPC and BlueGene/P; however, accessibility issues hamper performance on both. The CSRC HPC, while free of accessibility issues, had the slowest simulation times among all the computing systems due to inferior specifications. Although the high-end desktop computer had the best hardware specifications, high intercommunication cost affects its NAMD performance, with optimal performance at four cores. Hyper-threading did not improve NAMD simulation on the high-end desktop computer but instead contributed to the intercommunication bottleneck.

For small and medium system sizes, the high-end desktop computer generated the most ns/day. However, for large system sizes such as OAMB, more ns/day was simulated using the BlueGene/P, owing to its optimized intercommunication network. The benchmark shows the significance of intercommunication in NAMD simulation, especially for larger system sizes.

The benchmarks also show the feasibility of using GPU-accelerated desktops as alternatives to non-GPU-accelerated HPCs in NAMD for certain system sizes. However, considering that global computing resources can generate hundreds of nanoseconds per day, the performance of publicly available Philippine computing resources lags behind international counterparts. The protocols governing HPC use also need to be revised to accommodate simulations that may take several weeks or months to process.

The Philippines is capable of producing computational research despite challenges in resources. However, to keep up with international advances in computational science and its applications, the Philippines should invest in the latest scalable HPC resources, such as the BlueGene/Q. Investment in better computing resources provides not only increased computational capability to solve larger problems but also improved safety and power costs, applicable not only to computational chemistry but also to other fields such as particle physics and weather forecasting.

ACKNOWLEDGMENT

This work is supported in part by the Newton Agham Programme (Project Number FP160010), the Office of the Vice Chancellor for Research and Development (Grant Number PNE151512), the Natural Sciences Research Institute, and the Office of the Vice President for Academic Affairs of the University of the Philippines Diliman under the Emerging Interdisciplinary Research (EIDR) Program. Acknowledgment is also made to ASTI and the Computing and Archiving Research Environment (CoARE) of the Department of Science and Technology, Philippines, the Computational Science Research Center, and the Philippine Genome Center - Core Facility for Bioinformatics of the University of the Philippines for the allocation of computing resources required for this study.

REFERENCES

[1] B. Alder and T. Wainwright, "Phase Transition for a Hard Sphere System," in J. Chem. Phys., vol. 27, p. 1208, 1957.
[2] M. Karplus and J. McCammon, "Molecular dynamics simulations of biomolecules," in Nat. Struct. Biol., vol. 9, p. 646, 2002.
[3] M. Karplus, "Molecular Dynamics of Biological Macromolecules: A Brief History and Perspective," in Biopolymers, vol. 9, pp. 350–358, 2002.
[4] Y. Duan and P. Kollman, "Pathways to a Protein Folding Intermediate Observed in a 1-Microsecond Simulation in Aqueous Solution," in Science, vol. 282, pp. 740–744, 1998.
[5] B. Isralewitz, J. Baudry, J. Gullingsrud, D. Kosztin, and K. Schulten, "Steered molecular dynamics investigations of protein function," in J. Mol. Graphics, vol. 19, no. 1, pp. 13–25, 2001.
[6] A. Liwo, M. Khalili, and H. A. Scheraga, "Ab initio simulations of protein-folding pathways by molecular dynamics with the united-residue model of polypeptide chains," in P. Natl. Acad. Sci. USA, vol. 102, no. 7, pp. 2362–2367, 2005.
[7] Q. Cui and M. Karplus, "Quantum mechanical/molecular mechanical studies of the triosephosphate isomerase-catalyzed reaction: Verification of methodology and analysis of reaction mechanisms," in J. Phys. Chem. B, vol. 106, no. 7, pp. 1768–1798, 2002.
[8] I. Feierberg and J. Åqvist, "Computational modeling of enzymatic keto-enol isomerization reactions," in Theor. Chem. Acc., vol. 108, no. 2, pp. 71–84, 2002.
[9] G. C. Sosso, J. Chen, S. J. Cox, et al., "Crystal nucleation in liquids: Open questions and future challenges in molecular dynamics simulations," in Chem. Rev., vol. 116, no. 12, p. 7078, 2016.
[10] B. Minisini and F. Tsobnang, "Molecular dynamics study of specific interactions in grafted polypropylene organomodified clay nanocomposite," in Compos. Part A-Appl. S., vol. 36, no. 4, pp. 539–544, 2005.
[11] A. Rahman, "Correlations in the Motion of Atoms in Liquid Argon," in Phys. Rev., vol. 136, no. 2A, p. A405, 1964.
[12] J. Barker and R. Watts, "Structure of Water: A Monte Carlo Simulation," in Chem. Phys. Lett., vol. 3, no. 3, p. 144, 1969.
[13] A. Rahman and F. Stillinger, "Molecular Dynamics Study of Liquid Water," in J. Chem. Phys., vol. 55, no. 7, p. 3336, 1971.
[14] I. McDonald, "NpT-ensemble Monte Carlo calculations for binary liquid mixtures," in Mol. Phys., vol. 23, no. 1, pp. 41–58, 1972.
[15] A. Brunger, G. Clore, A. Gronenborn, et al., "Three-dimensional structure of proteins determined by molecular dynamics with interproton distance restraints: Application to crambin," in Proc. Natl. Acad. Sci. USA, vol. 83, pp. 3801–3805, 1986.
[16] W. V. Gunsteren and H. Berendsen, "Computer Simulation of Molecular Dynamics: Methodology, Applications, and Perspectives in Chemistry," in Angew. Chem. Int. Ed. Engl., vol. 29, pp. 992–1023, 1990.
[17] K. Sanbonmatsu and C. S. Tung, "High performance computing in biology: Multimillion atom simulations of nanoscale systems," in J. Struct. Biol., vol. 157, pp. 470–480, 2007.
[18] V. V. Kindratenko, J. J. Enos, G. Shi, et al., "GPU Clusters for High-Performance Computing," in Proc. IEEE International Conference on Cluster Computing and Workshops (CLUSTER '09), 2009, pp. 1–8.
[19] J. Nickolls and W. Dally, "The GPU computing era," in IEEE Micro, vol. 30, no. 2, 2010.
[20] A. Jackson, "Stepping Up," in Proc. 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), IEEE, 2013, pp. 37–41.
[21] L. Kalé, R. Skeel, M. Bhandarkar, et al., "NAMD2: Greater scalability for parallel molecular dynamics," in J. Comput. Phys., vol. 151, no. 1, pp. 283–312, 1999.
[22] J. C. Phillips, R. Braun, W. Wang, et al., "Scalable molecular dynamics with NAMD," in J. Comput. Chem., vol. 26, no. 16, pp. 1781–1802, 2005.
[23] A. Y. Grama, A. Gupta, and V. Kumar, "Isoefficiency: Measuring the Scalability of Parallel Algorithms and Architectures," in IEEE Parallel and Distributed Technology, Special Issue on Parallel and Distributed Systems: From Theory to Practice, 1993, pp. 12–21.
[24] J. C. Phillips, J. E. Stone, and K. Schulten, "Adapting a Message-Driven Parallel Application to GPU-Accelerated Clusters," in Proc. International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2008), IEEE, 2008, pp. 1–9.
[25] G. M. Amdahl, "Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities," in Proceedings of the April 18-20, 1967, Spring Joint Computer Conference, ACM, 1967, pp. 483–485.
[26] A. Poghosyan, L. Arsenyan, H. Astsatryan, et al., "NAMD package benchmarking on the base of Armenian grid infrastructure," in Communications and Network, vol. 4, pp. 34–40, 2012.
[27] A. H. Poghosyan, L. H. Arsenyan, and H. V. Astsatryan, "Comparative NAMD Benchmarking on BlueGene/P," in Proc. 35th International Convention MIPRO, IEEE, 2012, pp. 319–321.
[28] K. Konno, M. Hisada, R. Fontana, et al., "Anoplin, a novel antimicrobial peptide from the venom of the solitary wasp Anoplius samariensis," in BBA-Protein Struct. M., vol. 1550, no. 1, pp. 70–80, 2001.
[29] K. J. Rosengren, N. L. Daly, M. R. Plan, et al., "Twists, knots, and rings in proteins: structural definition of the cyclotide framework," in J. Biol. Chem., vol. 278, no. 10, pp. 8606–8616, 2003.
[30] F. D. Sönnichsen, C. I. DeLuca, et al., "Refined solution structure of type III antifreeze protein: Hydrophobic groups may be involved in the energetics of the protein–ice interaction," in Structure, vol. 4, no. 11, pp. 1325–1337, 1996.
[31] M. Nardini, D. A. Lang, K. Liebeton, et al., "Crystal structure of Pseudomonas aeruginosa lipase in the open conformation: the prototype for family I.1 of bacterial lipases," in J. Biol. Chem., vol. 275, no. 40, pp. 31219–31225, 2000.
[32] K.-A. Han, N. S. Millar, and R. L. Davis, "A novel octopamine receptor with preferential expression in Drosophila mushroom bodies," in J. Neurosci., vol. 18, no. 10, pp. 3650–3658, 1998.
[33] S. Balfanz, T. Strünker, S. Frings, et al., "A family of octopamine receptors that specifically induce cyclic AMP production or Ca2+ release in Drosophila melanogaster," in J. Neurochem., vol. 93, no. 2, pp. 440–451, 2005.
[34] H. M. Berman, J. Westbrook, Z. Feng, et al., "The Protein Data Bank," in Nucleic Acids Res., vol. 28, no. 1, pp. 235–242, 2000.
[35] The UniProt Consortium, "UniProt: the universal protein knowledgebase," in Nucleic Acids Res., vol. 45, no. D1, p. D158, 2017.
[36] K. Vanommeslaeghe, E. Hatcher, C. Acharya, et al., "CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields," in J. Comput. Chem., vol. 31, no. 4, pp. 671–690, 2010.
[37] S. E. Feller, Y. Zhang, R. W. Pastor, et al., "Constant pressure molecular dynamics simulation: the Langevin piston method," in J. Chem. Phys., vol. 103, no. 11, pp. 4613–4621, 1995.
[38] T. Darden, D. York, and L. Pedersen, "Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems," in J. Chem. Phys., vol. 98, no. 12, pp. 10089–10092, 1993.
[39] U. Essmann, L. Perera, M. L. Berkowitz, et al., "A smooth particle mesh Ewald method," in J. Chem. Phys., vol. 103, no. 19, pp. 8577–8593, 1995.
[40] H. C. Andersen, "Rattle: A velocity version of the SHAKE algorithm for molecular dynamics calculations," in J. Comput. Phys., vol. 52, no. 1, pp. 24–34, 1983.
[41] B. M. Bode, J. J. Hill, and T. R. Benjegerdes, "Cluster Interconnect Overview," in Proc. USENIX 2004 Annual Technical Conference, FREENIX Track, 2004, pp. 217–223.
[42] G. Lakner, IBM System Blue Gene Solution: Blue Gene/P System Administration, IBM Redbooks, 2009.
[43] A. D. Hospodor and E. L. Miller, "Interconnection Architectures for Petabyte-Scale High-Performance Storage Systems," in Proc. NASA/IEEE MSST 2004, Twelfth NASA Goddard Conference on Mass Storage Systems and Technologies, 2004, p. 273.
[44] Micron Technology, Inc., DDR3 to DDR4. [Online]. Available: https://www.micron.com/products/dram/ddr3-to-ddr4
[45] J. C. Phillips and J. E. Stone, "Probing biomolecular machines with graphics processors," in Commun. ACM, vol. 52, no. 10, pp. 34–41, 2009.
[46] J. E. Stone, D. J. Hardy, I. S. Ufimtsev, et al., "GPU-accelerated molecular modeling coming of age," in J. Mol. Graphics, vol. 29, no. 2, pp. 116–125, 2010.
[47] S. Alam, R. Barrett, M. Bast, et al., "Early Evaluation of IBM BlueGene/P," in Proc. 2008 ACM/IEEE Conference on Supercomputing (SC '08), Piscataway, NJ, USA: IEEE Press, 2008, pp. 23:1–23:12. [Online]. Available: http://dl.acm.org/citation.cfm?id=1413370.1413394
[48] J. L. Klepeis, K. Lindorff-Larsen, R. O. Dror, et al., "Long-timescale molecular dynamics simulations of protein structure and function," in Curr. Opin. Struc. Biol., vol. 19, no. 2, pp. 120–127, 2009.
[49] J. R. Perilla, B. C. Goh, C. K. Cassidy, et al., "Molecular dynamics simulations of large macromolecular complexes," in Curr. Opin. Struc. Biol., vol. 31, pp. 64–74, 2015.
[50] D. Chen, N. A. Eisley, P. Heidelberger, et al., "The IBM Blue Gene/Q Interconnection Network and Message Unit," in Proc. 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC '11), New York, NY, USA: ACM, 2011, pp. 26:1–26:10.


Comparison of Service Description and Composition for Complex 3-Tier Cloud-based Services

Moonjoong Kang and JongWon Kim

Abstract — This paper focuses on service description and composition for complex 3-tier datacenter application services combined with firewalls and load balancing. Adopting a cloud-native, container-based microservices architecture (MSA) for a small-sized datacenter, we compare several approaches to service description and composition, especially from the viewpoint of service function chaining (SFC). We also prototype them on OpenStack-based cloud virtual machines (VMs) and compare their performance and resource usage.

Index Terms — Cloud-native computing, microservices architecture, container orchestration, service description and composition, service function chaining.

I. INTRODUCTION

With the advent of the cloud-first computing era, the value chain around the cloud industry has been growing rapidly. This has led to a gradual migration of specialized application services from dedicated clusters to cloud-based shared infrastructures [1, 2]. Following this trend, the service-oriented computing paradigm for diversified application services is transforming into the so-called microservices architecture (MSA), which stitches together multiple openAPI-based component services (i.e., functions) to compose a whole composite service. MSA is known for several benefits such as dynamic agility, easy and flexible maintenance, and cost effectiveness due to shared resource pooling [3].

Typically, the service composition for MSA-based application services is done by service function chaining (SFC). With SFC, we start by allocating the necessary resource slices from the shared cloud infrastructure to accommodate all component services (i.e., functions). These functions and their stitching requirements are then satisfied by enabling diverse inter-connections among them.

[Footnote: This work was supported by Institute for Information & communications Technology Promotion (IITP) grants funded by the Korea government (MSIT): No. R7117-16-0218 (Development of automated SaaS compatibility techniques over hybrid/multisite clouds) and No. 2015-0-00575 (Global SDN/NFV open-source software core module/function development). Moonjoong Kang and JongWon Kim are with the School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), 123 Cheomdangwagi-ro, Buk-gu, Gwangju, 61005, Republic of Korea (e-mail: {mjkang, jongwon}@nm.gist.ac.kr).]

However, effective SFC-based service description and composition is not an easy target for the complicated form of datacenter Web-App-DB 3-tier application services, which may include additional firewalls and load balancing [4]. There are several general approaches to handling this kind of complex SFC-based service description and composition. Thus, in this paper, we explain and compare them using example complex Web-App-DB 3-tier application services. We also attempt to understand the whole procedure behind service description and composition by prototyping the realization of SFC-based service description and composition and by evaluating its performance and cost.

II. SFC-BASED APPROACHES FOR SERVICE DESCRIPTION AND COMPOSITION

We compare three SFC-based approaches to service description and composition in this paper: container-based MSA-SFC for web-based SaaS (Software as a Service) applications, HOT (Heat Orchestration Template)-SFC for OpenStack Heat-based (cloud-integrated or web-based) SaaS applications, and IETF (Internet Engineering Task Force) NSH (Network Service Header)-SFC for network infrastructure-focused SaaS applications.

Regarding the service description aspects, all three SFC approaches commonly describe application services as a set of abstracted functions with ordering constraints. The descriptions include identifiers for all component functions, access interfaces for function binaries (e.g., embedded scripts or URIs), and the directional dependency relations among the functions. IETF NSH-SFC, however, may also include non-abstracted functions, which are matched 1-to-1 to specific physical machines and do not require function binaries. IETF NSH-SFC is also unique in handling overlay networking by relying on an NFV (Network Function Virtualization)-enabled network infrastructure, while the other SFCs rely on service-transparent, encapsulation-based overlay networking [5-8]. Because of this, IETF NSH-SFC introduces the NSH concept [9], where any involved function must handle the NSH directly or be wrapped with a service function proxy¹. Due to this unique encapsulation for overlay networking, IETF NSH-SFC is also more effective than the other approaches in monitoring and adjusting the complete routes of packets [9]. HOT-SFC covers OpenStack cloud infrastructure [8], but its encapsulation is completely delegated to OpenStack Neutron and is transparent to its service.

Next, regarding the service composition aspects, the three SFC approaches mostly share the similar step-by-step procedure depicted in Fig. 1. The SFC orchestration tool first parses the description for service composition and checks the resource requirements of the involved functions. If the resource requirements are met, the orchestration tool identifies and allocates the demanded resources for the involved functions, considering their adjacency for stitching efficiency. Once the resources are allocated, the orchestration tool performs the necessary interconnections to establish flexible networking among them. Then, all involved functions are deployed (i.e., located, placed, and instantiated). Finally, the deployed functions are activated and stitched together to establish the end-to-end composite service.

Fig. 1. Step-by-step procedure for SFC-based service description and composition: (0) the service description (functions' IDs, codes, configs, and resource (box) requirements; the service function and resource graph) as a prerequisite, then (1) resource allocation, (2) resource inter-connect, (3) function deployment, and (4) function stitching.
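To make the five steps concrete, the following is a minimal Python sketch of how such an orchestration loop might be organized. All names here (Function, compose, the box format) are illustrative assumptions of ours, not the API of any of the three SFC approaches discussed in this paper:

```python
from dataclasses import dataclass, field

@dataclass
class Function:
    ident: str                         # string-based function identifier
    cpu: float                         # demanded vCPU cores
    depends_on: list = field(default_factory=list)

def compose(description, boxes):
    """Hypothetical SFC pipeline: (0) parse the description and check the
    resource requirements, then (1) allocate, (2) interconnect,
    (3) deploy, and (4) stitch, mirroring the steps of Fig. 1."""
    funcs = [Function(**f) for f in description["functions"]]        # (0) parse
    if sum(f.cpu for f in funcs) > sum(b["cpu"] for b in boxes):     # check requirements
        raise RuntimeError("insufficient resources for this composition")
    placement = {f.ident: boxes[i % len(boxes)]["name"]              # (1) naive allocation
                 for i, f in enumerate(funcs)}
    overlay = sorted({b["name"] for b in boxes})                     # (2) one isolated overlay
    deployed = {f.ident: f"container@{placement[f.ident]}"           # (3) deploy binaries
                for f in funcs}
    links = [(dep, f.ident) for f in funcs for dep in f.depends_on]  # (4) stitch dependencies
    return placement, overlay, deployed, links

description = {"functions": [
    {"ident": "web", "cpu": 1.0, "depends_on": []},
    {"ident": "app", "cpu": 2.0, "depends_on": ["web"]},
    {"ident": "db",  "cpu": 1.0, "depends_on": ["app"]},
]}
boxes = [{"name": "worker1", "cpu": 4}, {"name": "worker2", "cpu": 4}]
print(compose(description, boxes))
```

The allocation step here is a naive round-robin; real orchestration tools weigh function adjacency for stitching efficiency, as noted above.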

During function stitching, all three SFCs can also manage the interconnections among functions via KV (key-value) storage. With the function identifier as the key, we can orchestrate the scaling and load balancing of the service composition to mitigate service downtime. However, OpenStack Heat-based HOT-SFC provides function stitching only for OpenStack-integrated functions, leaving the handling of web-based SaaS application services to one's own implementation. Table I summarizes these differences.

[Footnote 1: A service function proxy translates the NSH for non-NSH-aware functions. All inbound traffic must first pass through a classifier that attaches the NSH, while the NSH is detached from outbound traffic.]
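As a toy illustration of the KV-based stitching just described, the sketch below uses an in-memory dict to stand in for the KV store; it is a hypothetical simplification of ours, not the storage layout of any specific orchestrator. Resolving a function identifier round-robins over its registered replicas, so scaling a function changes only the stored value list while callers keep using the same key:

```python
import itertools

kv = {}                  # function identifier -> list of live endpoints
_rr = itertools.count()  # shared round-robin counter

def register(ident, endpoint):
    """Stitching step: a deployed function replica announces itself
    under its function identifier (the KV key)."""
    kv.setdefault(ident, []).append(endpoint)

def resolve(ident):
    """Callers look up a function by identifier only; scaling just
    lengthens the value list, so no caller configuration changes."""
    replicas = kv[ident]
    return replicas[next(_rr) % len(replicas)]

register("app", "10.0.0.11:8080")
register("app", "10.0.0.12:8080")      # a scaled-out replica
print(resolve("app"), resolve("app"))  # round-robins across replicas
```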

III. CONTAINER-BASED MSA-SFC SERVICE DESCRIPTION & COMPOSITION

Now, choosing container-based MSA-SFC service description and composition, we explain the whole procedure by prototyping its realization and evaluating its performance and cost. Note that MSA-SFC does not require any change to the application service being composed, as it needs no explicit encapsulation at the service level. It may also provide high availability for Web-App-DB 3-tier application services without an additional implementation of one's own.

For container-based MSA-SFC service description and composition of complex Web-App-DB 3-tier cloud-based services with more than 10 functions, we compare two popular container orchestration tools: Docker Swarm and Kubernetes.

As shown in Fig. 2, both Docker Swarm and Kubernetes follow almost identical workflows for service description and composition. The description for service composition uses string-based function identifiers. Each function can be given required (and optional) resource amounts and an identifier-based dependency configuration. After parsing the description for service composition, the orchestration tool allocates resource boxes matching the requirements of the lightweight Docker container functions. The orchestration tool then creates dedicated and isolated overlay networking interconnecting all resource boxes. Binaries for the involved functions are downloaded from Docker Hub and deployed (i.e., located, placed, and instantiated) to the resource boxes. Remember that the interconnections of the resource boxes are managed via KV-based storage. A deployment order implied by the identifier-based dependencies can be derived as sketched below.

TABLE I
SERVICE DESCRIPTION AND COMPOSITION APPROACHES: DIFFERENCES
  Application services: container-based MSA-SFC, web-based SaaS; OpenStack Heat-based HOT-SFC, web-based and cloud infrastructure-integrated SaaS; IETF NSH-SFC, network infrastructure-focused SaaS.
  Encapsulation: MSA-SFC and HOT-SFC, transparent to application services; NSH-SFC, all functions need explicit NSH encapsulation or a service function proxy.
  Flow tracking: MSA-SFC and HOT-SFC, not available; NSH-SFC, the NSH includes whole-path information for packets.
  Service function stitching: MSA-SFC, no implementation required for high availability; HOT-SFC, needs its own implementation for high availability; NSH-SFC, no implementation required for high availability.

Fig. 2. Example SFC for socks shop applications with an additional firewall and 3x scaling of selected functions via Docker Swarm or Kubernetes.
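The sketch below shows how a startup/stitching order can be derived from such an identifier-based dependency configuration, using Kahn's topological sort; the dependency map and the socks-shop-like service names are our illustrative assumptions, not the internal logic of Docker Swarm or Kubernetes:

```python
from collections import deque

def deploy_order(deps):
    """Topological order of functions from an identifier-based
    dependency map {function: [functions it depends on]}."""
    indeg = {f: len(reqs) for f, reqs in deps.items()}
    users = {f: [] for f in deps}          # reverse edges: who depends on f
    for f, reqs in deps.items():
        for r in reqs:
            users[r].append(f)
    queue = deque(f for f, d in indeg.items() if d == 0)
    order = []
    while queue:
        f = queue.popleft()
        order.append(f)
        for u in users[f]:
            indeg[u] -= 1
            if indeg[u] == 0:
                queue.append(u)
    if len(order) != len(deps):
        raise ValueError("cyclic dependency in service description")
    return order

# Illustrative socks-shop-like dependency configuration.
deps = {
    "front-end": ["catalogue", "carts", "orders"],
    "catalogue": ["catalogue-db"], "catalogue-db": [],
    "carts": ["carts-db"], "carts-db": [],
    "orders": ["orders-db"], "orders-db": [],
}
print(deploy_order(deps))  # databases first, front-end last
```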

However, while both orchestration tools allocate box-style resources for the service composition in the form of Docker containers, the Docker Swarm orchestration tool is tightly integrated with an extended Docker engine, whereas Kubernetes only utilizes Docker containers through the Docker APIs and wraps them with other open-source tools for the required orchestration. Kubernetes deploys its own functions as additional Docker containers running inside the resource boxes, while Docker Swarm does not. For example, when interconnecting the allocated resource boxes, Docker Swarm provides its own native network driver to form overlay networking for the application services to be composed. In contrast, Kubernetes provides no native counterpart, and the operator must choose a network add-on from third parties.

IV. EXPERIMENT ENVIRONMENT AND RESULTS

While Docker Swarm and Kubernetes can both orchestrate container-based MSA-SFC service description and composition, they have quite different architectures that may lead to performance differences. Thus, in order to evaluate service description and composition for complex Web-App-DB 3-tier datacenter application services, an experiment environment was built on two types of distributed OpenStack clouds to simulate remote users accessing the cloud-leveraged services. The primary OpenStack cloud consists of 3 boxes, each with an Intel Xeon E5-2640v3, 24 GB DDR4 ECC registered RAM, and a 400 GB Intel 750 NVMe SSD, distributed across 3 different sites. The secondary OpenStack cloud is based on one box (with the same specification as the primary cloud boxes) as the OpenStack control node, plus 4 Supermicro SYS-E200-8D boxes, each with an Intel Xeon D-1528, 32 GB DDR4 ECC RAM, and a 500 GB Samsung SSD. Each experiment is performed mostly on VMs with the same configuration of 4 vCPU cores, 8 GB RAM, and 40 GB storage.

To eliminate any possible mutual interference, the service compositions by Docker Swarm and Kubernetes are realized at different sites of the primary cloud. The cluster configurations of the two orchestration tools are almost identical, with one VM as a manager node and two other VMs as worker nodes; no service function is scheduled to the manager node. For container networking, Docker Swarm is configured to use its native network driver and Kubernetes is configured to use the Weave Net network add-on from Weaveworks. The application services use overlay networking to interconnect all involved functions. Also, both orchestration tools are configured to interconnect resource boxes using the same network interface that listens for inbound connections from outside.

As the example application services for the composition on both clusters, we chose the socks shop application services from Weaveworks, considering their similarity to the complex application services used in production and their load-testing function that simulates users to test the composed service. We also use the Linux kernel's Netfilter firewall function via Docker Swarm and Kubernetes to block any unintended access to the application services, and we modify the service description to scale the socks shop's functions (i.e., 1 Web and 2 App functions with 3 replications for load balancing). To perform load testing on the composed application services, a load-testing function is placed at the last unoccupied site of the primary cloud and at the secondary cloud. Each load-testing function generates 3 clients and 40,000 requests; therefore, in total, 6 clients with 80,000 requests are generated against the service composed by each orchestration tool from 2 remote places.

Our evaluation of the service description and composition tools focuses on the resource usage of the composed services for the same 3-tier application. For measurement, we use the Intel Snap telemetry framework, placing its agents on each VM to collect CPU utilization percentages, RAM usage percentages, disk read/write bytes per second, and network interface sent/received bytes per second; together these metrics represent the usage of the computing, storage, and networking resources. The collected metrics are stored in an InfluxDB time-series database on another VM located in the secondary cloud.

TABLE II
AVERAGE (μ) AND STANDARD DEVIATION (σ) OF COLLECTED METRICS DURING THE LOAD TEST ON THE COMPOSED SERVICES
(Ethernet and disk metrics in bytes per second)

Tool          Node     Value  CPU Active %  RAM Util %  Eth Received  Eth Sent    Disk Read   Disk Written
Docker Swarm  Manager  μ      0.86          44.31       1009086.44    966419.12   0.00        4741.95
                       σ      0.76          0.02        361933.12     345394.14   0.00        10297.25
              Worker   μ      266.52        146.04      1594812.37    1594812.37  0.00        1045358.23
                       σ      67.75         7.34        494473.55     494473.55   0.00        798980.66
Kubernetes    Manager  μ      4.67          29.62       1013737.24    974305.11   0.00        60993.94
                       σ      2.30          0.04        363253.35     346119.57   0.00        40172.65
              Worker   μ      375.11        154.86      2214655.04    2214655.04  394567.97   32735233.68
                       σ      81.74         11.87       669057.69     669057.69   3288517.74  9230700.90

As shown in Table II, when comparing CPU active percentages on the manager node, Kubernetes shows slightly higher usage than Docker Swarm. On the worker nodes, this slight gap grows, reaching an average difference of 108.53% (where one core's full utilization is scaled to 100%). The memory utilization percentages show that Docker Swarm uses more RAM on its manager node, about 14.68% more than Kubernetes, and both environments show almost constant usage during the whole load test. Summed over the worker nodes, however, Kubernetes' RAM utilization is 8.82% higher than Docker Swarm's. The network interface metrics show that Kubernetes generates more traffic than Docker Swarm: bytes received are 4.54 KB/s higher at the manager node and bytes sent are 7.70 KB/s higher at the manager node, while both are 605.32 KB/s higher overall at the worker nodes. While neither Kubernetes nor Docker Swarm ever reads from its manager node's disk (that metric always stays at 0), Kubernetes writes only 54.43 KB/s more than Docker Swarm at the manager node, although Docker Swarm's write rate is more constant, with a standard deviation lower by about 29,875. The most contrasting result, however, is the bytes written to disk at the worker nodes: Kubernetes is 30.22 MB/s higher than Docker Swarm, with a standard deviation higher by 8,431,720.24. As seen in Fig. 3, Docker Swarm shows almost no bytes written compared to Kubernetes.

Fig. 3. Collected metrics during load testing on the composed services. From left to right, top to bottom: a) CPU utilization on manager nodes; b) CPU utilization on worker nodes; c) RAM utilization on manager nodes; d) RAM utilization on worker nodes; e) received/sent bytes via network on manager nodes; f) received/sent bytes via network on worker nodes; g) read/written bytes from/to disk on manager nodes; h) read/written bytes from/to disk on worker nodes.

Considering all the collected metrics, we believe that Docker Swarm is more effective than Kubernetes in terms of resource usage for the composition of complex Web-App-DB 3-tier application services.

V. CONCLUSION

This paper compared service description and composition approaches for complex Web-App-DB 3-tier datacenter application services by evaluating Docker Swarm and Kubernetes on OpenStack-based cloud VMs. For the described and composed socks shop application service, VM resource usage was collected and evaluated during load testing.

REFERENCES

[1] N. Kratzke and P.-C. Quint, “Understanding cloud-native applications after 10 years of cloud computing - a systematic mapping study,” Journal of Systems and Software, vol. 126, pp. 1-16, Apr. 2017.
[2] N. Dragoni et al., “Microservices: Yesterday, today, and tomorrow,” arXiv:1606.04036, Jun. 2016.
[3] K. Karanasos et al., “Mercury: Hybrid centralized and distributed scheduling in large shared clusters,” in Proc. USENIX ATC, 2015.
[4] J. Stubbs, M. Walter, and R. Dooley, “Distributed systems of microservices using Docker and Serfnode,” in Proc. IEEE International Workshop on Science Gateways (IWSG), 2015.
[5] V. Marmol, R. Jnagal, and T. Hockin, “Networking in containers and container clusters,” in Proc. NetDev 0.1, Feb. 2015.
[6] T.-A. Bui et al., “Cloud network performance analysis: An OpenStack case study,” 2016.
[7] A. L. Kavanagh, “OpenStack as the API framework for NFV: The benefits, and the extensions needed,” Ericsson Review, no. 2, 2015.
[8] Y. Yamato et al., “Development of template management technology for easy deployment of virtual resources on OpenStack,” Journal of Cloud Computing, vol. 3, no. 1, Jul. 2014.
[9] J. Halpern and C. Pignataro, “Service Function Chaining (SFC) Architecture,” IETF RFC 7665, Oct. 2015.



Future projections of ship accessibility for the Arctic Ocean based on IPCC CO2 emission scenarios

Jai-Ho Oh, Sinil Yang and Byong-Lyol Lee

Abstract— Changes in the extent of Arctic sea ice resulting from climate change offer new opportunities to use the Northern Sea Route (NSR) for shipping. However, choosing to navigate the Arctic Ocean remains challenging due to the limited accessibility for ships and the balance between economic gain and potential risk. As a result, more detailed information on both weather and sea-ice change in the Arctic region is required. In this research, a high-resolution global AGCM was used to provide detailed simulations of the extent and thickness of sea ice in the Arctic Ocean: an AMIP-type simulation of the present-day climate over the 31 years from 1979 to 2009 with prescribed SST and sea-ice concentration. For the future climate projection, we simulated the historical climate during 1979-2005 and subsequently the future climate during 2010-2099, using the mean of four CMIP5 models under the two Representative Concentration Pathway scenarios (RCP 8.5 and RCP 4.5), respectively. First, the AMIP-type simulation was evaluated by comparison with the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) dataset; the model reproduces the maximum (March) and minimum (September) sea-ice extents and the annual cycle. Based on this validation, the projected future sea-ice extents show a decreasing trend in both the maximum and minimum seasons, and RCP 8.5 shows more sharply decreasing sea-ice patterns than RCP 4.5. Under both scenarios, ships classified as Polar Class (PC) 3 and Open-Water (OW) were predicted to have the largest and smallest numbers of ship-accessible days (in any given year) for the NSR, respectively. Based on the RCP 8.5 scenario, the projections suggest that after 2070, PC3 and PC6 vessels will have year-round access across the Arctic Ocean. In contrast, OW vessels will continue to have a seasonal handicap inhibiting their ability to pass through the NSR.

Index Terms— Climate change, Arctic sea ice extent, Arctic sea ice thickness, ship accessibility, marine navigation.

I. INTRODUCTION

Mean global atmospheric CO2 reached 396.0 parts per million (ppm) in 2013 [1]. The continuous increase in atmospheric greenhouse gases may be responsible for the anomalously high Arctic sea surface temperatures experienced over at least the last 1,450 years [2], and for the resulting decrease in Arctic sea ice, which has proceeded at a rate of ~3.5-4.1% per decade [3]. This level of sea-ice disappearance is equivalent to ~0.45-0.51 million km2 per decade, with sea-ice melting particularly significant during the summer months. However, these changes offer the possibility of longer navigation seasons along the Northern Sea Route (NSR) as well as substantial economic benefits, including access to the Arctic's natural resources.

[Footnote: Jai-Ho Oh is with the Department of Environmental and Atmospheric Sciences, Pukyong National University, Busan, Korea (e-mail: [email protected]). Sinil Yang is with the Department of Environmental and Atmospheric Sciences, Pukyong National University, Busan, Korea (e-mail: [email protected]). Byong-Lyol Lee is with the Commission for Agricultural Meteorology, World Meteorological Organization, United Nations, Geneva, Switzerland (e-mail: [email protected]).]

Therefore, understanding the current sea-ice trend and projecting future sea-ice change is very important for the new challenges of Arctic maritime activities. Many modeling studies have provided projections of Arctic sea-ice extent to 2100 from the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4) models under emission scenarios [4], [5], [6], and they agree on summer Arctic sea-ice losses of 50 to 80%, with large model-to-model differences. Overland and Wang [6] have suggested the timing of ice-free conditions considering recent data and expert opinion based on Coupled Model Intercomparison Project phase 5 (CMIP5) models for the IPCC Fifth Assessment Report (AR5). Moreover, [7], following [8], which defined methods for calculating ship-accessible area at sea, projected shipping accessibility to 2100 based on CCSM4 (the fourth version of the Community Climate System Model) sea-ice simulations under the radiative forcing scenarios of IPCC AR5. However, the spatial resolution of CCSM4 (~1.25°) limits its ability to describe sea-ice extents for the Arctic's complex geography, such as the Canadian Archipelago.

The objective of the present study is to investigate the future change of ship-accessible days in the NSR using a high-resolution Atmospheric Global Climate Model (AGCM) until 2099. This AGCM has the finest resolution among comparable previous studies, and its results can provide detailed regional information on variations in Arctic sea-ice extent and thickness.

This paper is organized as follows. The model, observational data, target shipping route, and definition of ship accessibility are described in Section II. Section III evaluates the simulated sea-ice extent against observations and presents the future sea-ice projections; based on these results, the potential ship-accessible dates are presented in the same section. Finally, the conclusions of the study are summarized in Section IV.



TABLE I
SPECIFICATION OF FOUR MODELS SELECTED FROM THE CMIP5

No.  Model       Resolution           Country
1    CanESM2     2.8125° × 2.8125°    Canada
2    CNRM-CM5    1.40625° × 1.40625°  France
3    HadGEM2-ES  1.875° × 1.24°       United Kingdom
4    MIROC5      1.40625° × 1.40625°  Japan

II. MATERIALS AND METHODS

A. Model simulation and its validation

Sea-ice simulations were conducted using the GME AGCM, an operational global numerical weather prediction model of the German Weather Service [9]. This model is distinguished from other general circulation models (GCMs) by its uniform icosahedral-hexagonal grid. In addition, the GME model has a sea-ice scheme using a bulk thermodynamic ice model, based on a self-similar parametric representation (assumed shape) of the evolving temperature profile within the ice and on the integral heat budget of the ice slab [10].

First, for model validation in the present-day climate, we conducted an Atmospheric Model Intercomparison Project (AMIP)-type simulation using the high-resolution GME AGCM at 40 km/40 layers from 1979 to 2009, with SST and sea-ice concentration prescribed by NCAR following the procedure described by [11]. The IPCC AR5 recommended several scenarios for atmospheric CO2 concentrations, including the Representative Concentration Pathway (RCP) scenarios 4.5 and 8.5. For the future climate change, we also conducted a historical simulation during 1979-2005 and, sequentially, future climate projections during 2010-2099 based on the RCP scenarios. These future climate simulations used composite SST and SIC averaged from coupled GCM (CGCM) models of the CMIP5. This method may reduce and underestimate the year-to-year variability in the future climate projection; nevertheless, the use of averaged SST and SIC is useful for predicting the general tendency of possible ship accessibility. From the more than 50 available CGCMs from over 20 modeling groups [12], four CGCM models from different groups were selected in light of the large diversity of individual models in sea-ice simulations [13]; Table I lists their detailed specifications. For these models, the most important factor was the concentration of atmospheric CO2.

To evaluate the capability of the sea-ice simulations, the modeled sea-ice extents were compared with the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) dataset, which contains 1° × 1° data observed between 1979 and 2009 [14]. In addition, the simulated monthly sea-ice volumes in the GME were evaluated against the Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS) Arctic sea-ice volume reanalysis [15] and against the sea-ice volumes from four historical runs of the CMIP5 (CNRM-CM5, MIROC5, MPI-ESM-LR, and MPI-ESM-P) for the analysis period 1979 to 2005.

B. Shipping routes in Arctic Ocean

Two potential trans-Arctic navigation routes were analyzed in terms of decreasing sea-ice area: the NWP and the NSR [7]. First, the NSR is a shipping route following the northern Eurasian coast between the Bering Strait and the Barents Sea [16]. Second, the NWP follows the northern North American coast and the Canadian Arctic Archipelago [17], [18].

Previous studies have focused on the NSR, with relatively few considering the NWP, perhaps because sea ice in the NWP is thicker than on the NSR. Furthermore, the geographical complexity of the coastline along the NWP, as compared with the NSR, provides an additional difficulty for shipping.

C. Definition of ship accessibility

For the projection of future ship-accessible days, we followed the method of [7], who introduced a formula defining ship accessibility for various ship types as follows:

IN = (Ca × IMa) + (Cb × IMb) + … + (Cn × IMn)    (1)

where IN is the ice numeral, Ca, Cb, …, Cn are the sea-ice concentrations of ice types a, b, …, n, and IMa, IMb, …, IMn are the ice multipliers of those ice types [7], [17]. The ice multiplier (IM) is a non-zero integer indicating the risk presented by a particular ice type to a vessel of a given class (Table II). If IN is negative, the area is inaccessible to a vessel, while if IN is positive, navigation through the ice regime is possible. When calculating the IM index in this study, six ice types in Table II were identified in the Arctic Ice Regime Shipping System (AIRSS) [19] with accompanying thickness ranges; thickness ranges for the older ice classes were calculated from observed thickness and age data [20], [21]. Following the approach of [7], three vessel classes, Polar Class 3 (PC3), PC6, and Open-Water (OW), were considered when calculating the IN under the two RCP scenarios in 20-year time windows (Table II).

TABLE II
ICE MULTIPLIERS FOR SELECTED VESSEL CLASSES

Ice Type                   Thickness (cm)   PC3   PC6   OW
Open-Water                 ~10               2     2     2
Gray                       10~15             2     2     1
Gray-white                 15~30             2     2    −1
Thin 1st-year, 1st stage   30~50             2     2    −1
Thin 1st-year, 2nd stage   50~70             2     2    −1
Medium 1st-year            70~120            2     1    −2
Thick 1st-year             120~160           2    −1    −3
2nd-year                   160~190           1    −3    −4
Multi-year                 190~             −1    −4    −4

Vessel class abbreviations: PC = Polar Class; OW = Open-Water.

PC3 vessels are icebreakers capable of 'year-round operation in second-year ice, which may include multi-year ice inclusions'. PC6 vessels are moderately ice-strengthened ships capable of 'summer/autumn operation in medium first-year ice, which may include old ice inclusions'. OW ships have no ice-strengthening features.
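Evaluating equation (1) against the multipliers in Table II is a direct weighted sum. The sketch below transcribes the PC6 column of Table II; the ice regime itself is a made-up illustrative example (concentrations in tenths, following the AIRSS convention), not data from this study:

```python
# Ice multipliers for a PC6 vessel, transcribed from Table II.
IM_PC6 = {
    "open-water": 2, "gray": 2, "gray-white": 2,
    "thin-fy-1": 2, "thin-fy-2": 2, "medium-fy": 1,
    "thick-fy": -1, "second-year": -3, "multi-year": -4,
}

def ice_numeral(concentrations, multipliers):
    """Eq. (1): IN = sum of C_i * IM_i over the ice types present.
    Concentrations are in tenths; a negative IN marks the ice regime
    as inaccessible for that vessel class."""
    return sum(c * multipliers[ice] for ice, c in concentrations.items())

# Illustrative regime: 6/10 medium first-year ice, 3/10 thick
# first-year ice, 1/10 open water.
regime = {"medium-fy": 6, "thick-fy": 3, "open-water": 1}
IN = ice_numeral(regime, IM_PC6)
print(IN, "-> accessible" if IN >= 0 else "-> inaccessible")
# 6*1 + 3*(-1) + 1*2 = 5 -> accessible for PC6
```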

In addition, we used the 3-hourly sea-ice fraction and sea-ice thickness at the 40 km horizontal resolution of the GME model. To investigate ship accessibility, daily and monthly means were calculated for each time window, and the ship-accessible dates on the Arctic shipping routes were then calculated by considering both the sea-ice results and the land use at each grid point along the coastlines. Finally, based on these sea-ice results, we analyzed ship accessibility every 10 years up to 2090 using the A* search algorithm, which is widely used in pathfinding to find a traversable path between multiple points. The results of this study also have a finer temporal resolution than [7].
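As an illustration of this route search, the sketch below runs A* over a small 2-D grid of ice numerals, treating cells with a negative IN as impassable. The grid, 4-neighbour moves, and Manhattan heuristic are simplifying assumptions of ours, not the geometry or implementation actually used in this study:

```python
import heapq

def astar(grid, start, goal):
    """A* over a 2-D grid of ice numerals: cells with IN < 0 are
    impassable. Returns the number of steps of a shortest path,
    or None if the route is closed."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start)]
    best = {start: 0}
    while open_set:
        _, g, (r, c) = heapq.heappop(open_set)
        if (r, c) == goal:
            return g
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] >= 0:
                if g + 1 < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = g + 1
                    heapq.heappush(open_set, (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
    return None

# Toy daily IN field: a band of negative (impassable) ice in the middle.
field = [
    [ 4,  4,  4,  4],
    [-1, -3,  2,  4],
    [ 2,  2,  2,  4],
]
print(astar(field, (0, 0), (2, 0)))  # path exists around the ice: 6 steps
```

In the real analysis, the cells, neighbours, and passability would come from the 40 km model output and the coastline mask described above.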

III. RESULTS AND DISCUSSION

A. Evaluation of simulated sea-ice extent

To assess the ability of the model simulation, the simulated sea-ice thicknesses for March and September, the months of maximum and minimum sea-ice extent in the Arctic Ocean, for the period 1979-2009 were compared with the observed sea-ice extent from the HadISST (Fig. 1). The results show that the simulated sea-ice extent was in good agreement with the 15% concentration boundary of the HadISST dataset for both the maximum and minimum seasons, consistent with the multi-model average of sea-ice extent in previous similar studies [5], [22].

The seasonal variations in the averaged sea-ice extent were also in good agreement with the HadISST dataset, with the maxima observed in February and March and the minima observed in August and September. The simulated sea-ice extent was underestimated for most months; this underestimation corresponds with that of most of the models analyzed by [22] and may reflect the design of the current sea-ice model in the GME, which does not allow sea ice to move into neighboring grids by wind action. In addition, the sea-ice thickness was evaluated as a sea-ice volume (10^3 km3). A comparison of the simulated seasonal cycle of sea-ice volume (Table III) shows remarkable agreement between all modeled ice volumes and the PIOMAS reanalysis, with all five correlation coefficients greater than 0.98. In particular, the four historical sea-ice volumes of the CMIP5 show correlation coefficients with the reanalysis of almost 1.0. Although the sea-ice thickness in the GME can change only through radiative heating or cooling, or through heat exchange with the atmosphere (i.e., a thermodynamic sea-ice model), the sea-ice volume in the GME has the lowest root mean squared error (RMSE) among the modeled ice volumes (Table III). This might be due to differences in experimental design (e.g., AMIP versus historical runs).
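For reference, the statistics in Table III are ordinary Pearson correlation coefficients and RMSEs over the 12-month climatologies. A minimal sketch with invented monthly sea-ice volumes (not the real PIOMAS or model values):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root mean squared error between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Illustrative 12-month sea-ice volume climatologies (10^3 km^3),
# not actual PIOMAS/GME data.
piomas = [28, 30, 31, 30, 27, 22, 15, 10, 9, 12, 18, 24]
model  = [25, 27, 28, 27, 24, 19, 13, 8, 7, 10, 15, 21]
print(round(pearson_r(piomas, model), 2), round(rmse(piomas, model), 2))
```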

B. Future sea-ice projections

The spatial distributions of future changes in sea-ice thickness for March and September (Fig. 2) were calculated for the 2030s (2030-2039), 2050s (2050-2059), 2070s (2070-2079), and 2090s (2090-2099). The results show a decreasing trend in sea-ice thickness for both the maximum and minimum seasons as a result of global warming, in turn due to increased levels of atmospheric greenhouse gases, under both the RCP 4.5 and RCP 8.5 scenarios. However, RCP 8.5 shows more sharply decreasing patterns of sea ice for both March and September than RCP 4.5.

From the 2030s to the 2090s, the most significant monotonically decreasing patterns are found in the Greenland Sea and Barents Sea in March, where the Arctic sea-ice extent and thickness reach their maxima. In particular, after the 2070s, the March sea-ice thickness is < 1.5 m along both the NSR and the NWP (Fig. 2). This shrinking sea-ice area and decreasing sea-ice thickness in the Arctic Ocean, even during the maximum sea-ice season, indicate a longer period of opportunity for commercial shipping through both the NSR and the NWP in the near future.

In a similar manner, Arctic sea ice also diminishes in September. The simulations project that sea ice will have almost completely disappeared after the 2030s under both the RCP 4.5 and RCP 8.5 scenarios, with the exception of a limited area around the North Pole. This disappearance of sea ice during the summer will provide an open channel for commercial shipping through the Arctic Ocean in the near future.

Fig. 1. Spatial distribution of sea ice thickness (m) in the present-day climate simulations (1979-2009) in the Northern Hemisphere for (a) March and (b) September. The black solid line denotes the observed 15% concentration boundary of the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) dataset.

TABLE III
STATISTICAL PARAMETERS DETERMINED USING THE MONTHLY CLIMATOLOGY OF MODELED ARCTIC SEA ICE VOLUME AGAINST THE PIOMAS REANALYSIS FOR 1979 TO 2005

Model       r      RMSE (10^3 km^3)
CNRM-CM5    0.99   6.08
MIROC5      0.99   12.78
MPI-ESM-LR  0.99   4.91
MPI-ESM-P   0.99   6.43
GME         0.98   4.63


When validated relative to the present-day climate (1979-2005), the decadal changes in the averaged Arctic sea-ice thickness (calculated for both RCP 4.5 and RCP 8.5) in March and September, from the 2010s to the 2090s, also clearly show decreasing trends. The September trend is similar to that of [22], which presents the time series of September sea-ice extent based on 89 ensemble members from 36 CMIP5 models, but decreases slightly faster than the multi-model average. Comparing the two scenarios, the decrease in the maximum (March) Arctic averaged sea-ice thickness under the RCP 8.5 scenario (−0.3229) was more than double that under the RCP 4.5 scenario (−0.1729) during the 2090s. Decreases were also observed in the minimum (September) Arctic averaged sea-ice thickness, although in this case both scenarios produced the same magnitude of change (e.g., −0.1651 in the 2090s).

Fig. 2. Future projections of Arctic sea ice thickness (m) in March (panels a and b) and September (panels c and d) for the (1) 2030s (2030-2039), (2) 2050s (2050-2059), (3) 2070s (2070-2079), and (4) 2090s (2090-2099), relative to the current climate (1979-2005), based on Representative Concentration Pathways (RCP) 4.5 (panels a and c) and 8.5 (panels b and d).

C. Potential ship-accessible dates

Based on the validation against the present-day climate simulations, we estimated the numbers of ship-accessible days for five future years, from 2010 to 2090 in 20-year steps (Table IV). The results show that ship-accessible days will gradually increase, although, as expected, the rate of increase is higher under the RCP 8.5 scenario than under RCP 4.5 for all vessel classes. In particular, the increase in ship-accessible days is most noticeable after the 2050s, which may reflect the fact that most thick multi-year sea ice will have disappeared from the Arctic Ocean (except the region near Greenland and the Canadian Arctic Archipelago), to be replaced by either newly formed first-year sea ice or thin perennial sea ice.

Under the RCP 4.5 scenario, ship-accessible days for OW, PC6, and PC3 vessels increase from 59 (2010) to 140 days (2090; +137.2%), from 128 to 185 days (+44.5%), and from 188 to 278 days (+47.8%), respectively. Under the RCP 8.5 scenario these ranges are extended, with increases from 87 to 223 days (+156.3%), from 151 to 273 days (+80.7%), and from 220 to 365 days (+65.9%), respectively (Table IV).

For commercial (OW) ships, the potential ship-accessible periods within the NSR under the RCP 4.5 scenario can be summarized as follows. In the 2030s, the NSR will open on July 15 and close on November 2, yielding a total of 111 accessible days. In the 2050s, the opening and closing dates will be July 10 and November 18, respectively, giving 132 ship-accessible days. For the 2070s and 2090s, the NSR will open on July 5 and July 4 and close on November 21 and November 24, respectively; thus, the NSR will be accessible to OW ships for 140 days in the 2070s and 144 days in the 2090s.

The future ship-accessible days for commercial OW ships are extended under the RCP 8.5 scenario. In the 2030s, 2050s, 2070s, and 2090s, the NSR will open on July 31, July 13, June 30, and June 3, respectively, while the projected closing dates are November 2, November 21, December 17, and January 11, respectively. Thus, under the RCP 8.5 scenario, OW ships will be able to access the NSR for 95, 135, 171, and 223 days in the 2030s, 2050s, 2070s, and 2090s, respectively. Furthermore, under the RCP 8.5 scenario, PC3 ships will be able to operate year-round by the 2090s.

TABLE IV
POTENTIAL OPENING DATES, CLOSING DATES, AND NUMBER OF SHIP-ACCESSIBLE DAYS BY SCENARIO AND VESSEL CLASS FOR THE NORTHERN SEA ROUTE (NSR)

Vessel  Year    RCP 4.5                     RCP 8.5
class           Open    Close   Days        Open    Close   Days
OW      2010    8/11    10/8     59         7/26    10/20    87
OW      2030    7/15    11/2    111         7/31    11/2     95
OW      2050    7/10    11/18   132         7/13    11/24   135
OW      2070    7/5     11/21   140         6/30    12/17   171
OW      2090    7/4     11/24   144         6/3     1/11    223
PC6     2010    7/25    11/29   128         7/11    12/8    151
PC6     2030    7/12    12/23   165         7/8     12/24   170
PC6     2050    7/10    1/9     184         7/9     1/12    188
PC6     2070    7/4     1/3     184         6/21    2/11    236
PC6     2090    7/4     1/4     185         6/1     2/28    273
PC3     2010    7/27    1/30    188         6/29    2/3     220
PC3     2030    6/20    2/14    240         6/18    2/18    246
PC3     2050    6/18    3/9     265         6/16    3/18    276
PC3     2070    6/9     3/11    276         5/24    4/26    338
PC3     2090    6/11    3/7     270         -       -       365

Notes: data based on Representative Concentration Pathways (RCP) 4.5 and 8.5. Vessel class abbreviations: PC = Polar Class; OW = Open-Water.

IV. CONCLUSIONS

In this study, a high-resolution global numerical weather prediction model (GME) was used to predict the changing extent and thickness of Arctic sea ice in the 21st century. Comparison of the simulated sea-ice data with observations from the HadISST dataset showed that the GME model is capable of accurately simulating current sea ice in the Arctic Ocean. The spatial distribution of simulated sea ice for the present-day climate (1979-2009) was in good agreement with the HadISST observations, including the maximum extent of sea ice in February and March and the minimum extent in August and September. However, the model somewhat underestimated the sea-ice extent, which may have been caused by the absence of wind-driven sea-ice advection in the model, or by the way in which the HadISST dataset is constructed.

We also used the GME model results to investigate the changing sea-ice extent in terms of ship-accessible days for three ship classes in the NSR. The dominant feature was an increasing trend in ship-accessible days, along with the progress of warming in the Arctic Ocean. This trend was more significant (for all vessel types) under the RCP 8.5 scenario.


Despite significant advances in ship design and technology in recent decades, and the receding sea-ice cover, navigation in Arctic waters remains limited due to its remoteness, with navigational mistakes potentially fatal both for the operators and for the environment. The results of this study provide validated high-resolution forecasts of sea-ice extent in the Arctic, which are essential for safer navigation under current sea-ice conditions, as well as long-term predictions of sea-ice thickness, which have practical and economic impacts for commercial shipping.

ACKNOWLEDGMENT

Computing resources were provided by the Partnership & Leadership for the Nationwide Super Computing Infrastructure (PLSI) project of the Korea Institute of Science and Technology Information (KISTI).

REFERENCES

[1] World Meteorological Organization (WMO), "Record Greenhouse Gas Levels Impact Atmosphere and Oceans," Press Release No. 1002, Geneva, Sep. 9, 2014.
[2] C. Kinnard, C. M. Zdanowicz, D. A. Fisher, E. Isaksson, A. de Vernal, and L. G. Thompson, "Reconstructed changes in Arctic sea ice over the past 1,450 years," Nature, vol. 479, pp. 509-512, 2011.
[3] T. F. Stocker, D. Qin, and G.-K. Plattner (eds.), "Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change," Cambridge University Press, 2013.
[4] X. Zhang and J. E. Walsh, "Toward a seasonally ice-covered Arctic Ocean: Scenarios from the IPCC AR4 model simulations," Journal of Climate, vol. 19, pp. 1730-1747, 2006.
[5] O. Arzel, T. Fichefet, and H. Goosse, "Sea ice evolution over the 20th and 21st centuries as simulated by the current AOGCMs," Ocean Modelling, vol. 12, pp. 401-415, 2006.
[6] J. E. Overland and M. Wang, "When will the summer Arctic be nearly sea ice free?," Geophysical Research Letters, vol. 40, pp. 2097-2101, 2013.
[7] S. R. Stephenson, L. C. Smith, L. W. Brigham, and J. A. Agnew, "Projected 21st-century changes to Arctic marine access," Climatic Change, vol. 118, pp. 885-899, 2013.
[8] S. R. Stephenson, L. C. Smith, and J. A. Agnew, "Divergent long-term trajectories of human access to the Arctic," Nature Climate Change, vol. 1, pp. 156-160, 2011.
[9] D. Majewski, D. Liermann, P. Prohl, B. Ritter, M. Buchhold, T. Hanisch, G. Paul, W. Wergen, and J. Baumgardner, "The operational global icosahedral-hexagonal gridpoint model GME: Description and high-resolution tests," Monthly Weather Review, vol. 130, pp. 319-338, 2002.
[10] D. Mironov and B. Ritter, "A first version of the ice model for the global NWP system GME of the German Weather Service," Report No. 33, 2003.
[11] J. W. Hurrell, J. J. Hack, D. Shea, J. M. Caron, and J. Rosinski, "A new sea surface temperature and sea ice boundary dataset for the Community Atmosphere Model," Journal of Climate, vol. 21, pp. 5145-5153, 2008.
[12] K. E. Taylor, R. J. Stouffer, and G. A. Meehl, "An overview of CMIP5 and the experiment design," Bulletin of the American Meteorological Society, vol. 93, pp. 485-498, 2012.
[13] F. Massonnet, T. Fichefet, H. Goosse, C. Bitz, G. Philippon-Berthier, M. Holland, and P.-Y. Barriat, "Constraining projections of summer Arctic sea ice," The Cryosphere, vol. 6, pp. 1383-1394, 2012.
[14] N. A. Rayner et al., "Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century," Journal of Geophysical Research, vol. 108 (D14), p. 4407, 2003.
[15] J. Zhang and D. Rothrock, "Modeling global sea ice with a thickness and enthalpy distribution model in generalized curvilinear coordinates," Monthly Weather Review, vol. 131, pp. 845-861, 2003.
[16] M. Liu and J. Kronbak, "The potential economic viability of using the Northern Sea Route (NSR) as an alternative route between Asia and Europe," Journal of Transport Geography, vol. 18, no. 3, pp. 434-444, 2010.
[17] S. E. L. Howell and J. J. Yackel, "A vessel transit assessment of sea ice variability in the Western Arctic (1969-2002): Implications for ship navigation," Canadian Journal of Remote Sensing, vol. 30, pp. 205-215, 2004.
[18] V. C. Khon, I. I. Mokhov, M. Latif, V. A. Semenov, and W. Park, "Perspectives of Northern Sea Route and Northwest Passage in the twenty-first century," Climatic Change, vol. 100, pp. 757-768, 2010.
[19] Transport Canada, "Arctic Ice Regime Shipping System (AIRSS) Standard," Ottawa, 1998.
[20] R. Kwok, G. F. Cunningham, H. J. Zwally, and D. Yi, "Ice, Cloud, and land Elevation Satellite (ICESat) over Arctic sea ice: Retrieval of freeboard," Journal of Geophysical Research, vol. 112, C12013, 2007.
[21] J. A. Maslanik, C. Fowler, J. Stroeve, S. Drobot, J. Zwally, D. Yi, and W. Emery, "A younger, thinner Arctic ice cover: Increased potential for rapid, extensive sea-ice loss," Geophysical Research Letters, vol. 34, L24501, 2007.
[22] H. R. Langehaug, F. Geyer, L. H. Smedsrud, and Y. Gao, "Arctic sea ice decline and ice export in the CMIP5 historical simulations," Ocean Modelling, vol. 71, pp. 114-126, 2013.

Jai-Ho Oh is a Professor in the Department of Environmental and Atmospheric Sciences and Director of the Supercomputing Center of Pukyong National University, Busan, Korea. After completing his Bachelor's degree in Atmospheric Sciences at Seoul National University, Prof. Oh joined the Korean Air Force as a weather forecasting officer in 1976 and retired as a captain. He received his master's degree and Ph.D. from Oregon State University, Corvallis, Oregon, USA, after which he worked at the University of Illinois at Urbana-Champaign and at Argonne National Laboratory in the USA. His primary field of research is climate system modeling, which includes the major interactions among the atmosphere, hydrosphere, biosphere, and lithosphere. His current research interests include numerical weather prediction, disaster prevention, early warning, and the regional impact of climate change. He was also Director of the Center for Atmospheric Sciences and Earthquake Research, in which he conducted about 100 projects, for the period 2005-2010.

Sinil Yang received the B.S. and M.S. degrees in atmospheric science from Pukyong National University, Busan, South Korea, in 2011 and 2013. He is currently pursuing the Ph.D. degree at Pukyong National University, Busan, South Korea, studying extreme wave climate under climate change based on climate modeling.


Byong-Lyol Lee is President of the Commission for Agricultural Meteorology of the World Meteorological Organization, United Nations, Geneva, Switzerland. He is also an Overseas Dean of the College of Applied Meteorology, Nanjing University of Information Science and Technology, Nanjing, China. He received the M.S. degree in Crop Ecology from Seoul National University in 1981 and the Ph.D. degree in Agricultural Ecology and Field Crop Science from Cornell University in 1990. He is a science committee member of the Global Environment and Natural Resource Institute, a chief scientist at NCAM for global collaborations, and an advisor to KMA for external collaborations. He is also a member of the Advisory Committee of the IDMP (Integrated Drought Management Programme) of WMO/GWP and a member of the Inter-Commission Expert Group on the WMO Information System (WIS).


Design and Development of the Reactive BGP Peering in Software-Defined Routing Exchanges

Hao-Ping Liu, Pang-Wei Tsai, Wu-Hsien Chang and Chu-Sing Yang

Abstract—Software-Defined Networking (SDN) has recently been considered an improved solution for applying flexible control and operation in the network. Its characteristics include centralized management, a global view, and fast adjustment and adaptation. Many experimental and research networks have already migrated to SDN-enabled architectures. As the global network continues to grow at a fast pace, how to use SDN to improve the networking field has become a popular research topic. One interesting topic is enabling routing exchanges between SDN-enabled networks and production networks. However, considering that many production networks still operate on legacy architectures, the enabled SDN routing functionalities have to support a hybrid mode of operation. In this paper, we propose a routing exchange mechanism that enables reactive BGP peering actions among SDN and legacy network components. The experimental results show that our SDN development is able to work as a transit Autonomous System (AS) to exchange routing information with other BGP routers.

Index Terms—Software-Defined Networking, OpenFlow, BGP, Software-Defined Routing.

I. INTRODUCTION

As the network evolves, there are more and more requirements for new protocol testing or device updates in all network environments. However, under the current network architecture, carrying out these tasks incurs a huge cost in both time and money. For example, routers play an indispensable role in environments such as data centers or backbone networks, where even a short shutdown for an update can result in unpredictable losses. Besides, network management and performance tuning are quite challenging because network devices are usually vertically integrated black boxes [1]. The development of the devices is mastered by the vendors, whereas customers can only passively wait for the expensive and inflexible products they provide.

[Footnote: Hao-Ping Liu, Pang-Wei Tsai, Wu-Hsien Chang and Chu-Sing Yang are with the Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University, Taiwan (email: [email protected], [email protected], [email protected], [email protected]).]

The above-mentioned example shows the limits of the legacy network. Eventually, networks with this closed architecture become ossified [2] and become a bottleneck for the progress of the real world. The emergence of Software-Defined Networking (SDN) [3] provides a solution to this problem. SDN introduced the concept of separating the data plane and the control plane of a network, allowing network operators to directly operate networks in a centralized manner through an independent controller in the control plane, while the devices in the data plane, such as switches, simply forward packets according to the policies set by the SDN controller. There is already a body of ongoing research on and implementations of SDN [4], and Fig. 1 shows the most commonly referenced SDN architecture [5].

In this SDN architecture, developers can easily deploy their innovations just by programming applications in the application layer. The core network services in the control layer interact with the applications through the Northbound Interface, such as a RESTful Application Programming Interface (API) [6], and dynamically modify the forwarding behavior of the network devices in the infrastructure layer through the Southbound Interface, that is, the OpenFlow protocol [2]. A device in the infrastructure layer maintains flow tables composed of flow rules. A flow rule contains a match field and an instruction field: the match field defines a series of packet characteristics, and the instruction field defines the actions used to manipulate a matched packet. When a packet arrives at a data-plane device, a pipeline procedure compares the incoming packet against the match fields of these flow rules and finally determines the output port or other operations for this packet.

Fig. 1. The logical view of an SDN architecture [5].
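As a toy model of the flow-table lookup just described, the sketch below represents flow rules as (priority, match, actions) records and returns the actions of the highest-priority matching rule; the field names and the table-miss behaviour are illustrative assumptions of ours, not exact OpenFlow semantics:

```python
from dataclasses import dataclass

@dataclass
class FlowRule:
    priority: int
    match: dict      # header-field values this rule requires
    actions: list    # e.g., ["output:2"] or ["packet-in"]

def pipeline(flow_table, packet):
    """Return the actions of the highest-priority rule whose match
    fields are all satisfied by the packet's header fields."""
    for rule in sorted(flow_table, key=lambda r: -r.priority):
        if all(packet.get(k) == v for k, v in rule.match.items()):
            return rule.actions
    return ["drop"]  # table-miss behaviour chosen for this sketch

table = [
    FlowRule(100, {"eth_type": 0x0800, "tcp_dst": 179}, ["packet-in"]),  # BGP to controller
    FlowRule(10,  {"eth_type": 0x0800},                 ["output:2"]),   # other IPv4 traffic
]
bgp_packet = {"eth_type": 0x0800, "tcp_dst": 179, "ipv4_dst": "10.0.0.1"}
print(pipeline(table, bgp_packet))  # -> ['packet-in']
```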


By centralizing the control intelligence and modifying the flow tables, SDN breaks the monopoly of vendor-dependent network appliances, using commodity hardware with a free, open-source Network Operating System (NOS) [7]. Network hardware and software can then evolve independently, and function developers can focus on exploiting their new ideas without worrying about the difficulty of subsequent deployment. SDN augments programmability and virtualization while simultaneously simplifying the configuration and troubleshooting of networks. Though many challenges are still being worked on, SDN has been considered a revolution in current networking. Besides new deployments of SDN in wide-area networks [8], the conversion from legacy IP networks to SDN or hybrid networks is also an ongoing research issue [4]. The challenge is, as Sezer et al. [9] have pointed out, that it requires a hybrid infrastructure in which the legacy and SDN-enabled network nodes can operate in harmony. Such interoperability needs SDN communication interfaces that provide backward compatibility with the existing IP routing, to retain the connection between the SDN network and other legacy IP networks. To solve this challenge, Lin et al. [10], Rothenberg et al. [12], and Thai et al. [13] have proposed utilizing BGP [14]. Because BGP is stable and widely deployed in current IP networks, keeping BGP in use during a gradual update is more practical.

In this paper, we design a virtual BGP entity that integrates a reactive BGP peering mechanism into the SDN control logic. With this design, the SDN domain is able to act as a transit AS, which can reactively build BGP sessions with external legacy networks and propagate routing information, as well as inter-domain IP flows, from one external network to the others. The remainder of this paper is organized as follows. Section II gives a brief introduction to related work. Section III presents the comprehensive design of our system. Section IV describes an experiment that verifies the functionality of our implementation. Section V discusses the experimental results and indicates potential improvements. Finally, a conclusion is provided in Section VI.

II. RELATED WORK

There is already some research on, and implementations of, BGP-enabled SDN frameworks and hybrid systems that associate SDN with IP routing. These works bring many great ideas, and this section gives a brief introduction to them.

2.1 RouteFlow [15] uses virtual machines (VMs) to control the behavior of OpenFlow switches by mapping each active port of a switch to a virtual network interface on a VM, one by one. These VMs run open source routing protocols such as BGP and Open Shortest Path First (OSPF) [16], and form a virtual topology by connecting with each other. The VMs can therefore exchange routing information and control the behavior of the switches as if they were running a distributed control plane.

2.2 Open Source Hybrid IP/SDN networking (OSHI) [17] combines regular IP routing with SDN-based forwarding and provides a hybrid IP/SDN network node on Linux. This hybrid node uses Quagga software [18] for OSPF routing and Open vSwitch software [19] for OpenFlow-based switching. Packets can be routed either by the regular IP method or along SDN-based paths (SBPs), by considering headers at different protocol levels. Evaluations are also presented to show the performance of SBPs.

2.3 Hong et al. [20] propose a hybrid system consisting of both legacy forwarding devices and programmable SDN switches. They study how to satisfy a variety of traffic engineering goals, such as load balancing and fast failure recovery, during the incremental deployment of SDN. An evaluation on real ISP and enterprise topologies is also presented and discussed.

2.4 SDN-IP [10] and BTSDN [11] both propose a peering approach between SDN and IP networks. In their SDN context, several legacy BGP routers are attached to the OpenFlow switches. These BGP routers are responsible for peering with the external IP networks. The routing information received by these routers in the data plane is synchronized to the SDN-IP application in the SDN controller via an out-of-band control link, as Figure 2 shows. This approach utilizes legacy BGP routers as a BGP proxy for the SDN domain. However, considering the spirit of SDN, that is, centralizing all configuration and control of the network, we think removing the proxy BGP routers and integrating the BGP control mechanism directly into the SDN/OpenFlow architecture is more intuitive. This idea became our motivation.

III. SYSTEM DESIGN

The biggest difference between SDN-IP and our system is that we integrate BGP capability into the SDN control logic rather than using a legacy BGP router in the data plane as a proxy. BGP messages from neighbors are therefore encapsulated as OpenFlow packet-in messages and sent to the controller by the switches. Similarly, replies from the controller are encapsulated as OpenFlow packet-out messages and sent to the corresponding switch, which then forwards them to the corresponding neighbor. The details of the operation are described in the following sections. In this section, part A gives an overall view of the scenario.

Fig. 2. The architecture of SDN-IP network peering [9].


Part B describes how the peering mechanism is achieved through the cooperation of the modules we designed. Part C shows the receipt, handling, and advertisement of routing information as well as the subsequent update of the Routing Information Base (RIB). Finally, part D describes how we fulfill the requirement of software-defined routing for IP traffic over the SDN network.

A. Overview

Our approach simplifies the peering mechanism of SDN-IP by removing the legacy BGP routers in the SDN data plane. Figure 3 depicts a scenario in which two legacy networks with AS numbers 65001 and 65002 connect to an SDN network with AS number 65000. Each external network has an edge BGP router (named r1 or r2) used to peer with the SDN domain. In the SDN domain, s1 and s2 are OpenFlow-enabled switches that connect to r1 and r2 respectively, and the remaining OpenFlow-enabled switches in the SDN domain are called intermediate switches. All of these switches are controlled by an SDN controller.

In the controller, we leverage virtual network interfaces and several programming modules to constitute a virtual BGP entity that handles the External BGP (eBGP) sessions. Figure 4 shows all of the modules used by the virtual BGP entity and their organization. After initialization by the Main module, every BGP control message from the neighbors (i.e., r1 and r2) matches a proactively installed table-miss flow rule in the data plane and is then encapsulated as an OpenFlow packet-in message to the controller. The Protocol Handler module is responsible for parsing the BGP packets in these packet-in messages and deciding the next step, such as replying with a BGP open message or a BGP keep-alive message to start or maintain an eBGP session. In this manner, our virtual BGP entity can properly interoperate with its neighbors.
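A minimal Ryu sketch of this interception step is shown below. It is our reconstruction under assumed class and method names, not the authors' released code: it filters packet-in events for TCP port 179 (BGP) and returns any reply produced by the protocol handler as a packet-out through the same switch port.

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.lib.packet import packet, tcp
from ryu.ofproto import ofproto_v1_3

BGP_PORT = 179  # well-known BGP TCP port

class VirtualBgpEntity(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def _packet_in(self, ev):
        msg, dp = ev.msg, ev.msg.datapath
        parser, ofp = dp.ofproto_parser, dp.ofproto
        pkt = packet.Packet(msg.data)
        seg = pkt.get_protocol(tcp.tcp)
        if not seg or BGP_PORT not in (seg.src_port, seg.dst_port):
            return                       # not BGP control traffic; ignore here
        in_port = msg.match['in_port']
        reply = self.protocol_handler(msg.data)  # hypothetical helper
        if reply:                        # send the reply back out the same port
            out = parser.OFPPacketOut(datapath=dp,
                                      buffer_id=ofp.OFP_NO_BUFFER,
                                      in_port=ofp.OFPP_CONTROLLER,
                                      actions=[parser.OFPActionOutput(in_port)],
                                      data=reply)
            dp.send_msg(out)

    def protocol_handler(self, data):
        """Placeholder for the layered parsing described in Section III-B."""
        return None
```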

B. Peering Mechanism

To achieve BGP peering, we need to handle the entire control of the communication, so our Protocol Handler module must be able to respond correctly to different kinds of requests, including ARP, the TCP handshake, and BGP queries. During initialization, the Main module acquires the neighbors' information by reading a configuration file set in advance. The system is then ready to parse incoming packets and starts waiting for requests from the external BGP routers. Mirroring the layered design of the TCP/IP suite, our Protocol Handler is also designed in a layered manner. For an incoming packet from a neighbor, the Protocol Handler calls submodules, including ARP Handler, ETH Handler, IPv4 Handler, TCP Handler, and BGP Handler, to handle the packet headers at different protocol levels, and generates the appropriate reply (see the sketch below). Afterward, the Main module assigns the corresponding switch to send this reply back to the neighbor. This is how a control packet from a neighbor is handled.
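As an illustration of the topmost layer of this dispatch, the following minimal sketch (our reconstruction; the handler name is ours) uses the message-type codes from RFC 4271 to choose a reply for one received BGP message:

```python
# BGP message types from RFC 4271; the header is a 16-byte marker,
# a 2-byte length, and a 1-byte type, so the type byte sits at offset 18.
BGP_OPEN, BGP_UPDATE, BGP_NOTIFICATION, BGP_KEEPALIVE = 1, 2, 3, 4

def bgp_handler(bgp_bytes: bytes) -> str:
    """Decide the next step for one received BGP message."""
    msg_type = bgp_bytes[18]
    if msg_type == BGP_OPEN:
        return "reply OPEN + KEEPALIVE"   # start the eBGP session
    if msg_type == BGP_KEEPALIVE:
        return "reply KEEPALIVE"          # maintain the session
    if msg_type == BGP_UPDATE:
        return "raise RIB-update event"   # handled in Section III-C
    return "drop"

# 19-byte KEEPALIVE: all-ones marker, length 19, type 4
keepalive = b"\xff" * 16 + (19).to_bytes(2, "big") + bytes([BGP_KEEPALIVE])
print(bgp_handler(keepalive))             # -> reply KEEPALIVE
```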

C. RIB Update

We need to update the RIB of the virtual BGP entity once a BGP update message is recognized by the Protocol Handler. An RIB update event is triggered, informing the BGP Handler to extract the information, including Network Layer Reachability Information (NLRI), path attributes, and withdrawn routes (if any), from the packet; the RIB Handler then uses this information to insert or delete prefixes in the local RIB. Finally, after the RIB update, our BGP entity advertises this update information to the other neighbors to continue the propagation.
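A compact sketch of that bookkeeping, with data structures we assume purely for illustration (prefix strings mapped to path-attribute dicts, and a stand-in advertise helper), might look like this:

```python
local_rib = {}   # prefix -> path attributes learned from neighbors

def advertise(peer, nlri, withdrawn):
    """Stand-in for building and sending a BGP UPDATE to one peer."""
    print(f"to {peer}: announce {nlri}, withdraw {withdrawn}")

def on_bgp_update(nlri, withdrawn, attrs, from_peer, peers):
    # withdrawn routes leave the RIB; NLRI prefixes enter with their attributes
    for prefix in withdrawn:
        local_rib.pop(prefix, None)
    for prefix in nlri:
        local_rib[prefix] = attrs
    # propagate the change to every neighbor except the sender
    for peer in peers:
        if peer != from_peer:
            advertise(peer, nlri, withdrawn)

on_bgp_update(["10.2.0.0/24"], [], {"as_path": [65002]},
              from_peer="r2", peers=["r1", "r2"])
```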

D. Software-defined Routing Mechanism

After RIB updates, our virtual BGP entity has learned and propagated routing information among the neighbors. Each external BGP router regards our virtual BGP entity as the next hop to the others. For the external IP flows from one external BGP router to another, that is, the inter-domain IP flows, we design a software-defined routing mechanism to arrange a path inside the SDN data plane. This path, called a flow path, is composed of a series of switches that forward the IP traffic switch by switch over the SDN domain. However, OpenFlow switches can do nothing before flow rules have been installed on them. Also, the destination MAC address of the IP flows is still the MAC address of our virtual BGP entity. Some of the above functionalities are based on our previous works [21][22]. The following descriptions introduce how the Path Handler module dynamically installs and removes flow rules to achieve routing of the external IP flows.

1) Flow Path Installation

To prepare a path between two corresponding neighbors for a new inter-domain IP flow, the current prototype selects the shortest path among the switches. The first packet of this IP flow causes a packet-in event at the controller and triggers the Path Handler to install a series of flow rules on the switches along the selected flow path. Switches with these flow rules match the input port as well as the destination IP prefix of the packets, and send the matched packets to an output port. Traffic of the IP flows can be routed along this path by continuously matching these rules from one switch to the next, finally reaching the corresponding neighbor.

Fig. 3. The scenario of our approach.

Fig. 4. The organization of our modules.

2) Layer 2 and Layer 3 Routing Mechanism

Even though we have arranged a path for the inter-domain traffic from one neighbor to another, the destination MAC address of this traffic is still the MAC address of the virtual interface, because the sender regards our virtual BGP entity as the next hop. If the switches simply forwarded the packets, they would be dropped due to the wrong destination MAC address. So, to satisfy the layer 2 connection, we add a destination MAC rewriting action to the flow rule. To satisfy layer 3 routing, we also add a Time To Live (TTL) decrement action to this flow rule. Eventually, after a packet matches this flow rule, the packet's TTL value is decreased by 1 and its destination MAC address is changed to the MAC address of the other neighbor, just as a router routes an IP packet.

3) Flow Path Elimination

Besides installation, we also need a mechanism to eliminate useless flow rules to avoid excessive entries on the switches. OpenFlow provides an idle timeout for the removal of a flow rule. We use this feature and set a proper timeout period for flows at different priority levels. A flow rule is then automatically removed from the flow table if no packets match it before the timer expires. By combining the flow path installation and elimination mechanisms, we can dynamically insert and remove flow paths in the SDN domain.
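Under the same Ryu/OpenFlow 1.3 assumptions as before, one hop of such a flow path could be installed roughly as follows (a sketch, not the authors' exact rule layout): the rule matches the input port plus the destination prefix, decrements the TTL, rewrites the destination MAC, forwards the packet, and lets the idle timeout reclaim the entry.

```python
def install_path_rule(dp, in_port, dst_prefix, next_hop_mac, out_port,
                      idle=30, priority=100):
    """Install one hop of a flow path on switch `dp` (OpenFlow 1.3 via Ryu)."""
    ofp, parser = dp.ofproto, dp.ofproto_parser
    match = parser.OFPMatch(in_port=in_port, eth_type=0x0800,
                            ipv4_dst=dst_prefix)  # e.g. ('10.0.2.0', '255.255.255.0')
    actions = [parser.OFPActionDecNwTtl(),                     # layer-3: TTL - 1
               parser.OFPActionSetField(eth_dst=next_hop_mac), # layer-2 rewrite
               parser.OFPActionOutput(out_port)]
    inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
    dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=priority,
                                  idle_timeout=idle,  # rule self-removes when unused
                                  match=match, instructions=inst))
```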

IV. EXPERIMENT RESULTS

At the current stage, we have devoted our effort to the architecture design and the implementation of a first prototype. The following experiments are designed to verify the feasibility of our idea, including the handling of the BGP sessions as well as the software-defined routing mechanism for the inter-domain IP flows.

We adopt Mininet [23] as the network emulator for this experiment. With its virtualization capability, researchers can use Mininet to design a customized virtual network testbed on a single Linux kernel. We run the experimental topology shown in Figure 5 on Mininet. This topology contains three legacy IP networks with AS numbers 65001, 65002, and 65003, respectively. They all connect to an SDN domain with AS number 65000. There is a BGP router in each legacy IP network (i.e., r1, r2, and r3), and each BGP router connects to a host in its domain (i.e., h1, h2, and h3). The data plane of the SDN domain contains three inter-connected OpenFlow switches (i.e., s1, s2, and s3), each of which links to one legacy network. These OpenFlow switches are all controlled by a single SDN controller in the SDN control plane.

For r1, r2, and r3, we adopt the Quagga routing suite version 0.99.22.4 as the BGP software. Quagga is a network routing software suite that provides a Unix-like system with multiple routing mechanisms. In the SDN domain, we adopt Open vSwitch version 2.0.2 implemented in Mininet as the switch software for s1, s2, and s3, and Ryu [24] version 4.10 as the SDN controller software. Ryu is an SDN framework based on Python module components and provides many well-defined APIs that simplify the development of management and control in the SDN environment.

Fig. 5. Experimental topology.

Fig. 6. Routing tables of each external router.

Fig. 7. The successful Ping test between hosts.


Both the Mininet and Ryu software run on the same VM, which is equipped with 2 processors and 4 GB RAM and runs Ubuntu-14.04-desktop-amd64 as the OS. This VM is managed by the VirtualBox software [25] running on a PC with 12 GB RAM, an Intel Core i7-4770 CPU, and Microsoft Windows 10 as the OS.
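For reference, a minimal Mininet script approximating the topology of Figure 5 might look like the following sketch; the names and wiring are ours, and the Quagga BGP configuration on r1-r3 is done separately and omitted here.

```python
#!/usr/bin/env python
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.topo import Topo

class ExchangeTopo(Topo):
    def build(self):
        switches = [self.addSwitch('s%d' % i) for i in (1, 2, 3)]
        for i, sw in enumerate(switches, start=1):
            r = self.addHost('r%d' % i)   # legacy BGP router (runs Quagga)
            h = self.addHost('h%d' % i)   # end host inside AS 6500<i>
            self.addLink(r, sw)
            self.addLink(h, r)
        # fully inter-connect the three OpenFlow switches
        self.addLink(switches[0], switches[1])
        self.addLink(switches[1], switches[2])
        self.addLink(switches[0], switches[2])

if __name__ == '__main__':
    net = Mininet(topo=ExchangeTopo(),
                  controller=lambda name: RemoteController(name, ip='127.0.0.1'))
    net.start()     # the Ryu application should already be listening
    net.pingAll()   # reachability once BGP sessions and flow paths converge
    net.stop()
```

With the topology up, Quagga would be started on r1-r3 and the controller application launched with ryu-manager before running the Ping tests described next.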

To validate the feasibility of our design, we start Ryu with our approach as an application to control this topology. After a short time for building BGP sessions and exchanging routing information, we check the routing tables of the three external BGP routers. As the routing tables in Figure 6 show, every BGP router records the IP prefixes of the other ASes. Thus we confirm that our SDN domain can properly receive BGP update messages from an external IP network and advertise these routes to the others. We then run Ping tests between the hosts in different legacy networks (i.e., h1, h2, and h3) to verify the software-defined routing of IP traffic over the SDN domain. As Figure 7 shows with the successful Ping requests and replies, the software-defined routing mechanism of our approach works as intended.

V. DISCUSSION

The above experimental results illustrate that an SDN domain with our approach is able to exchange routing information with its neighbors. IP flows are also able to traverse the SDN domain via the software-defined routing mechanism. To the external BGP routers, the SDN domain behaves just like a legacy BGP router. We have achieved basic stitching between these two types of network paradigms. However, to become more practical, the BGP routing mechanism should still satisfy several requirements, such as a high-capacity RIB, fast IP lookup, and high reliability. We have not tested our system with BGP routers in a real Internet environment. Scalability issues are foreseeable due to the restricted size of the flow tables in the switches and the performance of a single controller. Furthermore, there are still many topics worth studying that build on the advantages of SDN in this approach. For example, flow paths are currently selected by considering only the shortest path among the switches; we could design a better flow path selection algorithm that takes more conditions into account, such as switch load or flow priority, and even design a flow migration mechanism to survive a switch failure or link break. These are all potential issues which we plan to address in the future.

VI. CONCLUSION

In this paper, we design a reactive BGP peering suite and a software-defined routing mechanism that allow an SDN domain to act as a transit AS, propagating routing information and IP flows among adjacent external networks. This design increases the compatibility of SDN and legacy IP networks during incremental deployment. To integrate BGP control into the SDN control logic, we design a virtual BGP entity in the SDN controller. By utilizing OpenFlow packet-in and packet-out messages, our system enables the SDN controller to exchange BGP messages with neighbors through the OpenFlow switches in the data plane. Our approach also provides a software-defined routing mechanism for inter-domain IP traffic. This mechanism arranges a flow path and enables IP flows to traverse an SDN domain, achieving layer 3 IP routing by decreasing the TTL value and layer 2 Ethernet delivery by rewriting the destination MAC address. Finally, the experimental results prove the feasibility of our approach as an application in the Ryu controller acting as a transit AS that propagates routing information and inter-domain IP traffic among multiple domains.

REFERENCES

[1] B. A. A. Nunes, M. Mendonca, X.-N. Nguyen, K. Obraczka and T. Turletti, "A survey of software-defined networking: Past, present, and future of programmable networks," IEEE Communications Surveys & Tutorials, vol. 16, no. 3, pp. 1617-1634, 2014.

[2] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker and J. Turner, "OpenFlow: enabling innovation in campus networks," ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69-74, 2008.

[3] N. McKeown, "Software-defined networking," INFOCOM keynote talk, vol. 17, no. 2, pp. 30-32, 2009.

[4] D. Kreutz, F. M. Ramos, P. E. Verissimo, C. E. Rothenberg, S. Azodolmolky and S. Uhlig, "Software-defined networking: A comprehensive survey," Proceedings of the IEEE, vol. 103, no. 1, pp. 14-76, 2015.

[5] Open Networking Foundation (ONF). (2012, April 13) Software-defined networking: The new norm for networks. [Online]. Available: https://www.opennetworking.org/images/stories/downloads/sdn-resources/white-papers/wp-sdn-newnorm.pdf

[6] R. T. Fielding and R. N. Taylor, "Principled design of the modern Web architecture," ACM Transactions on Internet Technology (TOIT), vol. 2, no. 2, pp. 115-150, 2002.

[7] P. Berde, M. Gerola, J. Hart, Y. Higuchi, M. Kobayashi, T. Koide, B. Lantz, B. O'Connor, P. Radoslavov and W. Snow, "ONOS: towards an open, distributed SDN OS," in Proceedings of the third workshop on Hot topics in software defined networking. ACM, 2014, pp. 1-6.

[8] O. Michel and E. Keller, "SDN in wide-area networks: A survey," in Proceedings of the IEEE Fourth International Conference on Software Defined Systems (SDS), 2017, pp. 37-42.

[9] S. Sezer, S. Scott-Hayward, P. K. Chouhan, B. Fraser, D. Lake, J. Finnegan, N. Viljoen, M. Miller and N. Rao, "Are we ready for SDN? Implementation challenges for software-defined networks," IEEE Communications Magazine, vol. 51, no. 7, pp. 36-43, 2013.

[10] P. Lin, J. Hart, U. Krishnaswamy, T. Murakami, M. Kobayashi, A. Al-Shabibi, K.-C. Wang and J. Bi, "Seamless interworking of SDN and IP," ACM SIGCOMM Computer Communication Review, vol. 43, no. 4, pp. 475-476, 2013.

[11] P. Lin, J. Bi and H. Hu, "Internetworking with SDN using existing BGP," in Proceedings of the Ninth International Conference on Future Internet Technologies. ACM, 2014, p. 21.

[12] C. E. Rothenberg, M. R. Nascimento, M. R. Salvador, C. N. A. Corrêa, S. C. de Lucena and R. Raszuk, "Revisiting routing control platforms with the eyes and muscles of software-defined networking," in Proceedings of the first workshop on Hot topics in software defined networks. ACM, 2012, pp. 13-18.

[13] P. W. Thai and J. C. De Oliveira, "Decoupling BGP policy from routing with programmable reactive policy control," in Proceedings of the ACM Conference on CoNEXT Student Workshop. ACM, 2012, pp. 47-48.

[14] Y. Rekhter, T. Li, and S. Hares, "A Border Gateway Protocol 4 (BGP-4)," RFC 4271, 2006.

[15] M. R. Nascimento, C. E. Rothenberg, M. R. Salvador, C. N. A. Corrêa, S. C. de Lucena, and M. F. Magalhães, "Virtual routers as a service: the routeflow approach leveraging software-defined networks," in Proceedings of the 6th International Conference on Future Internet Technologies. ACM, 2011, pp. 34-37.

[16] J. Moy, "OSPF specification," RFC 1131, 1989.

[17] S. Salsano, P. L. Ventre, L. Prete, G. Siracusano, M. Gerola, and E. Salvadori, "OSHI - Open Source Hybrid IP/SDN networking (and its emulation on Mininet and on distributed SDN testbeds)," in Proceedings of the IEEE Third European Workshop on Software Defined Networks (EWSDN), 2014, pp. 13-18.

[18] P. Jakma and D. Lamparter, "Introduction to the Quagga Routing Suite," IEEE Network, vol. 28, no. 2, pp. 42-48, 2014.

[19] "Open vSwitch, An Open Virtual Switch," 2014. [Online]. Available: http://openvswitch.org/

[20] D. K. Hong, Y. Ma, S. Banerjee, and Z. M. Mao, "Incremental deployment of SDN in hybrid enterprise and ISP networks," in Proceedings of the Symposium on SDN Research. ACM, 2016, p. 1.

[21] P.-W. Tsai, P.-W. Cheng, H.-Y. Chou, M.-Y. Luo, and C.-S. Yang, "Toward inter-connection on OpenFlow research networks," in Proceedings of Asia-Pacific Advanced Network, vol. 36, 2013, pp. 9-16.

[22] P.-W. Tsai, P.-M. Wu, C.-T. Chen, M.-Y. Luo, and C.-S. Yang, "On the implementation of path switching over SDN-enabled network: A prototype," in Proceedings of the IEEE International Conference on Consumer Electronics-Taiwan, 2015, pp. 90-91.

[23] "Mininet: An instant virtual network on your laptop (or other PC)," 2012. [Online]. Available: http://mininet.org/

[24] "Ryu SDN Framework," 2013. [Online]. Available: https://osrg.github.io/ryu/

[25] "Oracle VM VirtualBox," 2008. [Online]. Avaliable: https://www.virtualbox.org

Hao-Ping Liu received his Bachelor's degree in Electronic Engineering and is now pursuing his Master's degree in the Institute of Computer and Communication Engineering at National Cheng Kung University. His research interest focuses on software-defined networking.

Pang-Wei Tsai received the Bachelor's degree in Electronic Engineering and the Master's degree in the Institute of Computer and Communication Engineering from National Cheng Kung University. His research interests are software-defined networking, cloud computing, virtualization, and network management. He is also experienced in designing large-scale network testbeds. His current research focuses on developing the future Internet testbed on the Taiwan Advanced Research and Education Network.

Wu-Hsien Chang received the Bachelor's degree from the Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan, in 2016. He is currently pursuing the Master's degree in the Institute of Computer and Communication Engineering, National Cheng Kung University, Tainan, Taiwan. His research topic focuses on software-defined networking.

Chu-Sing Yang is a Professor of Electrical Engineering in the Institute of Computer and Communication Engineering at National Cheng Kung University, Tainan, Taiwan. He received the B.Sc. degree in Engineering Science from National Cheng Kung University in 1976 and the M.Sc. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University in 1984 and 1987, respectively. He joined the faculty of the Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, as an Associate Professor in 1988. Since 1993, he has been a Professor in the Department of Computer Science and Engineering, National Sun Yat-sen University. He was the chair of the Department of Computer Science and Engineering, National Sun Yat-sen University, from August 1995 to July 1999, and the director of the Computer Center, National Sun Yat-sen University, from August 1998 to October 2002. He joined the faculty of the Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan, as a Professor in 2006. He participated in the design and deployment of the Taiwan Advanced Research and Education Network and served as the deputy director of the National Center for High-performance Computing, Taiwan, from January 2007 to December 2008. His research interests include future classroom/meeting rooms, intelligent computing, and network virtualization.


Experimental tests for outage analysis in SISO Li-Fi Indoor Communication Environment

Atchutananda Surampudi 1, Sankalp Shirish Chapalgaonkar 1, A. Paventhan 2

1 Department of Electrical Engineering, Indian Institute of Technology Madras, India; 2 ERNET, India

1 {ee16s003, ee15b018}@ee.iitm.ac.in; 2 {paventhan}@eis.ernet.in

Abstract—Visible Light Communication, popularly known as Li-Fi, for indoor communications is capable of providing internet access at high data rates. This technology utilises visible light waves as carriers for passband modulation, which can also be called optical free space modulation, on a static, linear time invariant optical wireless channel. For indoor applications, testing such a technology for coverage in both line of sight and non line of sight scenarios is advantageous for efficient deployment. In this work, we have practically deployed a Single Input Single Output Li-Fi communication pair and have experimentally analysed its performance in terms of received power and outage distance for different colours and installation heights. These tests can be standardised for future works.

Index Terms—Li-Fi, height, colour, received power, Philips.

I. INTRODUCTION

VISIBLE Light Communications (VLC) had been successfully used for the exchange of information long ago, even before the telephone was invented. Intensity-modulated light signals can carry information, but because of the lack of efficient light transmission and reception devices, this could not be analysed further. After the invention of the Light Emitting Diode (LED) and its commercialization during the mid twentieth century, research in VLC took a new turn, with early papers appearing around the late twentieth century [1]. The use of the LED for VLC is described in [2]. This technology for indoor access was coined Li-Fi by Professor Harald Haas at The University of Edinburgh, U.K., and its capabilities for indoor internet access were first demonstrated at a TED Global talk in 2011 [3].

The Li-Fi communication technology, using visible light as the carrier, can provide reliable and high speed data access because of the huge visible light spectrum available, in the range of Terahertz (THz). This technology can also be used in areas where traditional Radio Frequency (RF) communications fail to provide coverage, thus acting as a supplement. In health care environments and oil and gas industries, where radio waves may be harmful to operate for communication, Li-Fi can be a potential communication method utilising safe visible light waves [4]. The 5G mobile communication standard introduces the Internet of Things (IoT), where every device will be interconnected; here too this technology can help provide high speed data access as well as security. Performing experimental outage analysis using standard tests therefore becomes crucial for understanding this technology and deploying it effectively in indoor environments. The received power from a given coloured LED, at a given height and radius in cylindrical coordinates, provides an idea of the coverage cone, or coverage area, provided by the LED transmitter, so analysing outage in an optical wireless communication scenario is very important. In traditional wireless communications, the channel is assumed to be linear time variant, and various works have analysed it [5]. But for a static, linear time invariant optical wireless channel, experimental analysis becomes more important because the channel is deterministic. So in this work a set of experimental tests has been conducted for both Line of Sight (LOS) and Non Line of Sight (NLOS) Li-Fi communication to analyse the maximum distance at which outage occurs. We test NLOS communication with the help of coloured reflectors. Also, a Li-Fi transmitter generates a conical flux of light coverage over a given area, due to the limited Half Power Semi Angle (HPSA) of the transmitter; measuring the HPSA over that conical coverage is therefore important, and this has also been included as an experiment.

In this work, the terms Li-Fi transmitter and LED refer to the same downlink transmitter, and the terms Li-Fi receiver and photodetector (PD) refer to the same downlink receiver. This naming convention is used interchangeably according to convenience. Further, the same convention can be used for infrared (IR) uplink Li-Fi transmission, which will be specified where relevant. This paper is arranged as follows. Section II describes the system model and the components used. Section III describes the experimental tests performed. Section IV presents the experimental results with appropriate graphs along with the inferences. The paper concludes with Section V.

II. SYSTEM MODEL

A Single Input Single Output (SISO) Li-Fi communication link is considered in all our tests for both the uplink (user to the internet) and downlink (internet to the user) communication paths.


Basically, data is transferred from a light-transmitting LED to a detecting PD. In all our experiments, for the downlink, from Table I, the transmitting LED is the Philips DN561B and the receiving PD is integrated with the Li-Fi dongle. For the uplink, the Infra Red (IR) LED on the Li-Fi dongle becomes the transmitting LED, and the Philips LBRD14016-3 IR uplink receiver, placed adjacent to the DN561B, becomes the receiving PD.

The downlink SISO geometry is discussed now; it is applicable to the uplink SISO as well. Consider the downlink of the LED-PD communication scenario limited by the modulation bandwidth of the LED, as considered in [6]. The modulation bandwidth refers to the range of frequencies at which the LED can be intensity modulated, or the rates at which the intensity flickering can happen, imperceptible to the human eye. Let the light source be at an elevation height h from the origin and the PD be at a distance z from the origin, as shown in Fig. 1.

Fig. 1. This figure shows the LOS light propagation geometry. The triangular-shaped LED source is at a height h and is tagged to the PD at distance z from the origin, with a given Field of Vision (FOV). The angles θ_{0,tr} and θ_{0,rec} are the transmission angle at the LED and the incidence angle at the PD with respect to the normal (shown by the dotted line), respectively. θ_{0.5} is the Half Power Semi Angle (HPSA) of the LED. Adapted from [6].

In Fig. 1, θ_{0,tr} is the transmission angle from the LED at the origin and θ_{0,rec} is the angle of incidence of the same light ray at the PD. FOV denotes the Field Of View of the PD, which is the maximum solid angle within which received rays can be detected. θ_{0.5} denotes the Half Power Semi Angle (HPSA) of the transmitter; the HPSA is the solid angle at which the optical power becomes half of that at the normal (solid angle = 0°). Let A_{pd} be the area of the PD. Moreover, from [6], the LTI channel theoretically contributes a gain G(z) from the LED light source to the given PD receiver at a position z on the ground, which is given in (1).

$$G(z) = \frac{(m+1)\,A_{pd}}{2\pi l^2}\cos^m(\theta_{0,tr})\cos(\theta_{0,rec}). \qquad (1)$$

TABLE I
COMPONENTS: THE LI-FI COMMUNICATION COMPONENTS RECEIVED FROM PHILIPS, EINDHOVEN.

Component | Product Name | Function | Quantity
Li-Fi downlink transmitter (Fig. 2) | Luxspace DN561B | White LED transmitter for downlink. | 1
Modem board (Fig. 3) | Modem board | A central circuitry which contains a microcontroller to control dimming and modulation. | 1
Xitanium 20W LED power driver (Fig. 4) | LBRD1514-1 | To control and stabilize the current and drive the LED appropriately. | 1
Uplink IR receiver (Fig. 5) | LBRD14016-3 | Uplink receiver placed adjacent to the downlink transmitter. | 1
Li-Fi dongle (Fig. 6 and Fig. 7) | Li-Fi dongle | Attached to the user device. Acts as both uplink transmitter and downlink receiver. Has an indicator to show Li-Fi connectivity. | 1

In (1), l denotes the cartesian distance from the LED light source to the PD, as given in (2). Also, m is the Lambertian emission order of the light source, which is given in (3).

$$l^2 = h^2 + z^2. \qquad (2)$$

$$m = -\frac{\ln(2)}{\ln(\cos(\theta_{0.5}))}. \qquad (3)$$

These expressions are not used explicitly in the paper.
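Although the paper does not use these expressions explicitly, they can be evaluated numerically for intuition. The short sketch below is ours, with assumed values (the 25° HPSA from Fig. 2, a 1 cm² detector area, and aligned transmit/receive angles); it is an illustration of equations (1)-(3), not part of the measurement procedure.

```python
import math

def lambertian_order(theta_half_deg):
    """m = -ln(2) / ln(cos(theta_0.5)), eq. (3)."""
    return -math.log(2) / math.log(math.cos(math.radians(theta_half_deg)))

def channel_gain(h, z, theta_tr_deg, theta_rec_deg, a_pd, theta_half_deg):
    """G(z) from eq. (1), with l^2 = h^2 + z^2 from eq. (2)."""
    m = lambertian_order(theta_half_deg)
    l_sq = h ** 2 + z ** 2
    return ((m + 1) * a_pd / (2 * math.pi * l_sq)
            * math.cos(math.radians(theta_tr_deg)) ** m
            * math.cos(math.radians(theta_rec_deg)))

# Transmitter 2 m above the origin, receiver 1 m off-axis on the ground,
# so theta_tr = theta_rec = atan(1/2) ~ 26.6 degrees (assumed example values).
print(channel_gain(h=2.0, z=1.0, theta_tr_deg=26.6, theta_rec_deg=26.6,
                   a_pd=1e-4, theta_half_deg=25.0))
```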

A. Components received

The experiments have been performed using a Li-Fi transmitter and receiver pair received from Philips, Eindhoven, for the purpose of academic demonstration and testing of indoor Li-Fi capabilities. The received components are listed in Table I.

Fig. 2. The Luxspace DN561B downlink LED transmitter. In the inset, there are 14 LEDs present: 4 on the inner ring and 10 on the outer ring. The HPSA of the combined transmitter is 25°.

B. Other testing components

These components are used for precise and accurate measurement of the experimental values. They are listed in Table II.


Fig. 3. The modem board. This is a central circuitry to control dimming and modulation using a microcontroller. It has an Ethernet port to receive data from a wired network and convert the data suitably for Li-Fi communication.

Fig. 4. The Xitanium 20W LED power driver, LBRD1514-1.

III. EXPERIMENTAL TESTS

The tests are performed in a closed laboratory at night time, so that the effect of ambient constant illumination due to other luminaires or sunlight can be considered negligible and it becomes easier to obtain the results. Nevertheless, the Li-Fi technology works in daylight also, because the constant illumination from sunlight on the receiver PD does not affect the modulated intensities; it rather adds a negligible DC wander at the PD, an effect which can be cancelled easily.

TABLE II
COMPONENTS: THE COMPONENTS USED FOR ACCURATE EXPERIMENTAL VALUES.

Component | Property | Function
Low Noise Amplifier (Fig. 8) | Internal resistance = 1000 Ω; internal capacitance = 1 µF | To measure free space optical power at a given location.
Meter tape | Length = 350 cm | To measure the distance between LED and PD.
Protractor | Angle = 0° to 180° | To measure the angle of deviation, from the normal, of the free space coordinate where the average received optical power is measured.
Coloured transparent fiber sheet (red, green and blue) (Fig. 9) | Thickness = 0.05 mm | To experiment using different colour wavelengths. This sheet is placed before the LED transmitter to obtain a coloured LED from the white LED.

Fig. 5. The IR Uplink receiver, LBRD14016-3.

Fig. 6. The Li-Fi dongle. This is connected to the user device using a universal serial bus (USB) cable. There are two circular insets on the right side of the device. The upper inset contains the uplink IR transmitter. The lower inset contains the downlink receiver PD.

The LEDs for both uplink and downlink have a Half Power Semi Angle (HPSA). The flux of light emitted and the maximum solid angle are limited by the HPSA of the LEDs. This flux gives a conical coverage over the entire illumination region. As a note, we have done the tests using the component parameters listed in Table II; the experimental values obtained may change as the Table II parameters change. For now, the current parameters serve as the reference.

A. LOS - Received power at different coordinates

The experimental setup is shown in Fig. 10. The schematic is shown in Fig. 1. The purpose of this experiment is to find the HPSA of the downlink LED. In this experiment, the average received power is measured using a Low Noise Amplifier (LNA) at a given location in free space, for a given colour of LED, angle θ_tr of the LED, and slant height l of the location along the inclined trajectory. The separation height h then becomes the trigonometric tangent measure of l with respect to θ_tr. The variation of colour, h, and θ_tr is as shown in Table III. As these parameters vary, the received optical power at the LNA varies. From this variation, the conical coverage and the HPSA of the LED can be estimated experimentally.


Fig. 7. The Li-Fi dongle. This is connected to the user device using a universal serial bus (USB) cable. There is an indicator which glows to confirm Li-Fi connectivity.

Fig. 8. The Optical Power receiver - Low noise amplifier circuit (LNA).

B. LOS - Maximum distance - Outage analysis

The experimental setup is shown in Fig. 10. The schematic is shown in Fig. 1. The purpose of this experiment is to find the variation of the outage distance with the height of the LED installation and the colour, for both uplink and downlink. In this experiment, for a given LED installation height h and a given LED colour, the distance z is measured at a point on the ground where the indicator on the Li-Fi dongle stops blinking. This situation we refer to as the outage, and the distance as the maximum distance z_max. We repeat this for different LED illumination colours, and for each colour we vary the height h. The outage situation arises when the edge of the coverage cone is reached for that height. The ranges are described in Table IV, and the cone geometry is sketched below.
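Since outage occurs at the edge of the coverage cone, the maximum ground distance grows roughly linearly with the installation height, z_max ≈ h · tan(θ_edge). The tiny sketch below is our illustration under an assumed edge angle of 45°, chosen for consistency with the FOV between 40° and 50° inferred later in Section IV-A; it is not a measured result.

```python
import math

THETA_EDGE_DEG = 45.0   # assumed cone edge angle (between 40 and 50 degrees)
for h in (50, 100, 150):  # installation heights in cm
    z_max = h * math.tan(math.radians(THETA_EDGE_DEG))
    print(f"h = {h} cm -> z_max ~ {z_max:.1f} cm")
```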

TABLE III
PARAMETERS: THE PARAMETER RANGE FOR TEST A.

Parameter | Range
Height h (the downlink LED transmitter is kept fixed, the PD location is varied) | 5 cm to 150 cm (at an interval of 5 cm)
Angle θ | 0°, 30°, 40°, 50°
Colour | Red, Green, Blue

Fig. 9. The blue, green, and red colour fiber sheets (in clockwise order).

Fig. 10. The experimental setup for both LOS - A and B tests.

C. NLOS - Reflection colour based outage analysis

In this test, the purpose is to find the variation of the outage distance for the downlink in the case of Non Line of Sight (NLOS) reflection-based communication. Here, the LED transmitter is kept on a horizontal plane. The experimental setup is shown in Fig. 11. The experimental schematic is shown in Fig. 12. From Fig. 12 we observe that the downlink LED transmitter, placed at a distance d_1, faces a colour-specific reflector placed at the origin. The uplink IR receiver is always placed at the origin. The PD is placed at a distance d_2 from the origin, facing the reflector. Now, for a given d_1 and a given reflector colour, d_2 is varied along the normal of the plane of the reflector. We stop at a point on the ground where the indicator on the Li-Fi dongle stops blinking. This situation we refer to as the outage, and the distance as the maximum distance d_{2,max}. We repeat this for different reflector colours, and for each colour we vary the distance d_1. The input parameter ranges are described in Table V.

TABLE IV
PARAMETERS: THE PARAMETER RANGE FOR TEST B.

Parameter | Range
Height h (the downlink LED transmitter height is varied using a vertically movable stand) | 5 cm to 150 cm (at an interval of 5 cm)
Colour | Red, Green, Blue


Fig. 11. Experimental setup for the NLOS communication scenario. Here only the experimental setup is shown; the actual experiment happens when the constant ambient illumination is switched off and the room is dark. Also, because the experiment is NLOS, a tunnel-like covering is provided to prevent the light waves from spreading out. The DN561B is inside the tunnel at a distance d_1. The reflector is at the origin and the dongle is at a distance d_2.

Fig. 12. The experimental schematic for NLOS test C. Here, the reflector (rectangle-shaped) is placed at the origin. The downlink LED (triangle-shaped) is placed at a distance d_1, facing the reflector along the normal of the reflector's plane. The Li-Fi dongle is placed at a distance d_2 from the reflector, facing the reflector; it has both the uplink LED and the downlink PD. The uplink IR PD is placed at the origin.

IV. EXPERIMENTAL RESULTS AND INFERENCES

In this section we describe the experimental results through graphs and make appropriate inferences.

A. LOS - Test A

Here the average received optical power is measured in free space by varying the height h, the LED colour, and the angle of transmission θ_tr.

TABLE V
PARAMETERS: THE PARAMETER RANGE FOR TEST C.

Parameter | Range
Downlink LED transmitter distance d_1 from origin (the downlink LED transmitter distance is varied on a horizontal plane) | 35 cm, 90 cm, 120 cm
Reflector colour | Red, Green, Blue, White, Black

Fig. 13. Test A - θ_tr = 0°. (Plot: power received in watts vs. height from source in cm, for the red, green, blue, and white colours.)

Fig. 13 to Fig. 16 show the variation of P_{opt,rec} vs. height h for different θ_tr: 0°, 30°, 40°, and 50°. We observe that for a given θ_tr and a given h, P_{opt,rec} is larger for a colour of larger wavelength. Also, for a given colour and h, as θ_tr increases, the average power decreases. Using these graphs, we can infer that the FOV of the downlink LED transmitter is between 40° and 50°, because a drastic difference in received optical power occurs between these angles.

Fig. 14. Test A - θ_tr = 30°. (Plot: power received in watts vs. height from source in cm, for the red, green, blue, and white colours.)


Fig. 15. Test A - θ_tr = 40°. (Plot: power received in watts vs. height from source in cm, for the red, green, blue, and white colours.)

Fig. 16. Test A - θ_tr = 50°. (Plot: power received in watts vs. distance from source in cm, for the red, green, blue, and white colours.)

B. LOS - Test B

In Fig. 17 we observe that as the installation height h increases, the coverage distance z_max also increases. This is due to the limitation of the coverage area by the FOV of the LED transmitter: as h increases, the area subtended by the coverage cone of light flux increases. Also, for a given h, as the wavelength increases from blue to red, the coverage distance also increases. This supports the fact that an electromagnetic wave (even a light wave) with a larger wavelength travels a larger distance and experiences less attenuation.

Fig. 17. Graph for Test B. (Plot: coverage distance in cm vs. source height from ground in cm, for the red, green, blue, and white colours.)

C. NLOS - Test C

In Fig. 18 we observe that as the colour of the reflector changes, we get different coverage distances. The reflector colour therefore becomes important.

Fig. 18. Graph for Test C. (Plot: outage distance in cm vs. distance of source from reflector in cm, for the red, green, blue, white, and black reflectors.)


V. CONCLUSION

In this work, a set of tests for outage analysis and received power was described using the Li-Fi components received from Philips, Eindhoven. We saw that these tests for outage are able to produce primary and novel results. Also, we were able to observe a trend in the values obtained which helps in estimating certain parameters like the HPSA.

These tests are, however, at a primary stage. In future, they may be standardised for Li-Fi hardware by making more precise measurements and comparing them with standard theoretical values. These tests can also be used for real-time deployment of Li-Fi hardware. The NLOS test can be improved by using a wider range of angles of incidence and colours of the reflectors.

ACKNOWLEDGEMENT

We sincerely acknowledge the Philips, Eindhoven team for providing the Li-Fi communication components for the indoor experiments. We also thank ERNET India for providing the opportunity to set up a demonstration laboratory, where the tests could be successfully conducted.

REFERENCES

[1] Barry, John R., Joseph M. Kahn, William J. Krause, Edward A. Lee, and David G. Messerschmitt. "Simulation of multipath impulse response for indoor wireless optical channels." IEEE Journal on Selected Areas in Communications 11, no. 3 (1993): 367-379.

[2] Komine, Toshihiko, and Masao Nakagawa. "Fundamental analysis for visible-light communication system using LED lights." IEEE Transactions on Consumer Electronics 50, no. 1 (2004): 100-107.

[3] Haas, Harald. Harald Haas: Wireless data from every light bulb. TED, 2011.

[4] Haas, Harald. "LiFi: Conceptions, misconceptions and opportunities." In Photonics Conference (IPC), 2016 IEEE, pp. 680-681. IEEE, 2016.

[5] Andrews, Jeffrey G., Francois Baccelli, and Radha Krishna Ganti. "A tractable approach to coverage and rate in cellular networks." IEEE Transactions on Communications 59, no. 11 (2011): 3122-3134.

[6] Chen, Cheng, Stefan Videv, Dobroslav Tsonev, and Harald Haas. "Fractional frequency reuse in DCO-OFDM-based optical attocell networks." Journal of Lightwave Technology 33, no. 19 (2015): 3986-4000.


Zigbee Based Home Automation and Agricultural Monitoring System

A mesh networking approach for autonomous and manual system control

Rakesh Kumar Jha [1], Shivam Khare [2], Rahul Sharma [3], Anubhav Tewari [4], Ankit Tyagi [5], Shubha Jain [6]

[1], [2], [3], [4], [5] Shri Mata Vaishno Devi University, Katra, J&K, India; [6] SGSITS, Indore, M.P., India

Abstract— Today's generation of electronic devices is more enhanced and capable than the previous one, and exciting changes in technology now allow a variety of home devices to be controlled with the help of a home automation system. These devices can include lights, fans, doors, surveillance systems, and consumer electronics. However, along with smartness and intuitiveness, we want a system that is economical as well as low power consuming. ZigBee technology collects and monitors different types of measurements that reflect energy consumption and environment parameters. This paper details the design of a protocol to monitor various environmental conditions in a home. We use the advanced technology of MICAz motes (which have their own routing capabilities), nesC language programming, and MoteWorks (used as a data acquisition platform).

Index Terms — MEMS; TinyOS (Tiny Operating System); SoC (System on Chip); WSN (Wireless Sensor Networks); IEEE 802.15.4; LR-WPAN (Low Rate Wireless Personal Area Networks); WiFi (Wireless Fidelity); XBee (Zigbee); OTAP (Over The Air Programming)

I. INTRODUCTION

New generation electromechanical automation systems aim at real time data processing and transmission to a remote location for monitoring and sensing, which adds to the end user's knowledge about the region where the system is deployed. Various communication standards and platforms have been devised in previous decades to turn such concepts into reality, and tremendous progress in semiconductor and MEMS technologies over the past decade has led to the production of inexpensive sensors and microcontroller platforms that can be easily interfaced with each other. Many startups now deal with these technologies and roll out products with unique features, such as insignificant power consumption resulting in prolonged battery life, low maintenance modules with added short circuit prevention, encrypted data transmission standards for software level security, and use of the intended user's biometric data for authentication.

Many such examples can be quoted, like NEST, a subsidiary of Alphabet Inc., which initially started with a thermostat controller for indoor heating systems but now has a product ecosystem consisting of a Thermostat, a Smoke + CO alarm, and a Camera. These work in tandem under its Home/Away Assist platform, which tries to tackle the issue of conventional geo-fencing HAS by recognizing a close group of up to 10 users under NEST Protect app family accounts, who can be recognized as authentic users even if the primary phone has run out of battery.

Present day approaches for home automation systems include improvement of basic device interoperability through research and standards, as well as monolithic systems that integrate multiple devices for specific tasks. Remote sensing options were already available, but we have pursued the aim of providing automation and manual control too. The project was developed keeping current barriers in mind, providing an adaptable, low cost system that end users can themselves install, configure, upgrade, and control, consisting of a server and diverse subsystems. Contemporary approaches include remote control and sensing implemented using conventional devices like smartphones and personal computers. These solutions give end users the facility to remotely access their homes. Additionally, new features can be introduced through future update roll-outs to make the system self-adaptive.

This paper introduces the concept of home automation and agricultural monitoring services using the WSN technology platform to achieve the primary need for smart, durable, efficient, and adaptive control over the target environment. Actuators and sensors work in parallel to model the remote environment and take the required decisions accordingly through the computation algorithm.


Fig. 1 Detailed system components overview

A. CONTRIBUTION

In this paper we propose an automated system based on Wireless Sensor Networking which can be accessed and controlled from any remote location, resulting in real time responses. This purpose is achieved using MoteWorks, a data acquisition platform developed by Memsic Inc. which supports mesh networking and mote monitoring and stores the data in a PSQL ODBC database. An energy efficient and self-adaptive environment is implemented to reduce manual effort and enhance output efficiency by providing accurate results at regular intervals, resulting in better analysis and more precise actions. In case of a system failure, an additional feature, remote sensing and control using smartphones, has been added to keep the system working. The system is self-sufficient in gathering data from dispersed nodes at the base station, where the MDA300CA (which operates on TinyOS software) has been interfaced, and it responds by controlling appliances on the basis of the values obtained and a threshold value set in advance, without any external assessment; a minimal sketch of this decision step follows. We also developed a model to establish an environment in which any kind of break-in, theft, or malfunction such as fire, gas leakage, or water leak can be easily detected and the requisite actions taken instantly.
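The decision step itself is simple. The sketch below is our Python illustration of the logic, not the nesC code running on the motes; the threshold names and set-points are assumed for the example.

```python
# Assumed set-points; in the deployed system such thresholds would be
# configured at the base station driving the MDA300CA relay channels.
THRESHOLDS = {"temperature_c": 35.0, "gas_ppm": 400.0}

def decide(readings):
    """Map sensor readings to relay commands (True = energize the relay)."""
    return {name: readings[name] > limit
            for name, limit in THRESHOLDS.items() if name in readings}

print(decide({"temperature_c": 37.2, "gas_ppm": 120.0}))
# -> {'temperature_c': True, 'gas_ppm': False}, e.g. switch a fan on
```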

B. OVERVIEW

From here onwards the paper is divided into the following sections, with a small description of each:

1. ZigBee and Hardware – Deals with the SoC platform specifications along with the attached sensor board modules and the wireless interface technology used.

2. Development Platform – The software suite used for programming and data acquisition, with the underlying language base used to program individual motes and the OS running them.

3. System Model – Sheds light on the overall layout of the device and sensor ecosystem created, through a detailed graphic to aid visual understanding of its working.

4. Mathematical Modelling – Contains the equations used, related to the ZigBee protocol and others.

5. Algorithm – Charts the logical behavioral flow of the system for unique conditions.

6. Simulation Parameters – Derived results are discussed across various standards of measurement and category classifications.

7. Result Analysis – Screenshots of the output and practical working are shared and compared with previous application cases.

8. Conclusion and Future Work – Future scope related to the concerned field and usage scenario, with end user experience as conclusion.

9. References – Lists the publications, images, documents, and other sources used in the creation of this paper.

62

Page 65: Proceedings of the 14th APAN Research Workshop 2017€¦ · Proceedings of the 14th APAN Research Workshop 2017 28th August 2017, ISBN 978-4-9905448-7-4 Dalian International Finance

II. ZIGBEE AND HARDWARE

A. ZIGBEE PLATFORM

IEEE 802.15.4 is an infrastructure-less protocol that serves as the foundation for the ZigBee platform; it specifies the physical and media access control layer standards for LR-WPAN networks. ZigBee is maintained and published by the ZigBee Alliance, which provides support for the upper networking layers according to the intended SoC.

This mesh-networking-targeted standard works in ISM band frequency ranges (2.4 GHz, 784 MHz, 868 MHz, 915 MHz), with data rates ranging from 20 kbps (868 MHz) to 250 kbps (2.4 GHz). Numerous routing protocols like flooding, TEEN, APTEEN, and LEACH have been proposed, but so far no single one has been accepted by the industry. Device classes are divided into physical and logical types, and further into Full Function Devices (FFDs) and Reduced Function Devices (RFDs), which serve unique purposes. Nodes acting purely in sensing or control mode are RFDs, whereas FFDs are responsible for routing functions, catering to child devices in a cluster with an FFD as the cluster head.

It finds wide application in fields such as animal tracking, home automation, industrial control, medical health monitoring, and numerous others.

B. MEMSIC WSN KIT

B.1 MICAZ Mote

The MICAz mote is specifically designed for wireless sensor networks; it has its own routing capabilities to communicate with the surrounding nodes and uses the IEEE 802.15.4 protocol (which specifies the physical layer and media access control for low rate wireless personal area networks) to set up low-cost, power-efficient, battery-operated networks.

Fig. 2 MPR2400 component block view

The MICAz mote comprises the MPR2400CA platform based on the Atmel ATmega128L microcontroller, as depicted in Fig. 2, which uses its internal flash memory to run MoteWorks. The MPR2400CA platform simultaneously communicates with the surrounding nodes and runs the sensor applications. MICAz motes use wireless ad hoc networking with a mesh topology to set up an autonomous network, provide a data rate of 250 kbps among the nodes and the base station, and interface with a wide range of external peripherals.

B.2 MIB520CB

The MIB520CB acts as a USB connector for MICAz motes, for communication and in-system programming purposes. A MICAz mote connected to the MIB520CB acts as a base station and helps collect data from all the other motes present in the system. The MIB520CB exposes two different ports, one for in-system mote programming and another for data communication over USB at a baud rate of 57.6K; when connected to a USB port it does not require an external power source.

B.3 MTS420

The MTS420 is an environment monitoring sensor board which can easily be deployed in remote locations, as it requires very low maintenance and has an extended battery life. The MTS420 provides a wide range of features, such as temperature, humidity, and pressure (300 mbar to 1100 mbar) sensors and a light intensity sensor, along with a dual axis accelerometer. In addition to the above features, the MTS420 also has a GPS module to obtain the coordinates of the motes.

B.4 MDA300CA

The MDA300CA is an extremely flexible data acquisition board with temperature and humidity sensors embedded on its platform; it offers a wide range of on-board features such as ADC channels, digital I/O channels, and two relay channels (one open and one closed), and supports an external I2C interface. The MDA300CA has a 64K EEPROM to store the data measured by the sensors, and it operates using TinyOS software.

III. DEVELOPMENT ENVIRONMENT

A. MOTEWORKS

MoteWorks is a data acquisition platform developed by Memsic Inc. which supports mesh networking and mote monitoring and stores the data in a PSQL ODBC database. The overall software kit is divided into three separate tiers: mote, server, and client. The mote tier supports a multi-hop, infrastructure-less, ad-hoc mesh networking protocol for LR-WPAN wireless networks based on ZigBee. The motes communicate via multi-hop communication for improved reliability and radio coverage; they are connected to the PC via the MIB520CB gateway, which is equipped with an antenna to remotely program the motes, and this constitutes the XMesh mote tier.


Fig. 3 MoteConfig

The server tier manages the SQL database and the applications that interface the mesh to higher-level layers and to outside applications via terminal exchange.

Fig. 4 MoteView

The client tier completes the end-to-end solution across all tiers for the user or developer: it displays statistical information straight from the sensors as text or graphical charts, and can render past readings by querying the database. Individual nodes can be updated and configured, according to the attached sensor board and the selected communication channel, by programming them through the gateway as shown in Fig. 3.

The software package provided by Memsic can be subdivided into the following components:

Table no. 1

TinyOS and MoteWorks tools – An event-driven OS for wireless sensor networks, plus tools for debugging
nesC compiler – An extension of the C language designed for TinyOS
Cygwin – A Linux-like environment for Windows
AVR tools – A suite of software development tools for Atmel's AVR processors
Programmer's Notepad – IDE for code compilation and debugging
XSniffer – Network monitoring tool for the RF environment
MoteConfig – GUI environment for mote programming and OTAP
LotusConfig – GUI environment for Lotus programming
Graphviz – Viewer for the files generated by make docs

B. TinyOS

The environment selected for the current work is TinyOS, which is supported by the MEMSIC kit and suits the network and routing requirements of the design and the deployment of the motes. The overall operation is exposed through the application layer, which is in direct contact with the user or administrator. TinyOS has the advantage of being open source and freely available on the internet. The motes are programmed in the nesC language. TinyOS is readily compatible with the motes and is event-driven in nature; the libraries used here are available by default and are brought into the code simply by including them.

As reported in [19], three IDE (integrated development environment) plugins are available for TinyOS, to be used within Eclipse:

YETI 2, ETH Zurich
XPairtise
TinyDT

These plugins are installed into Eclipse in order to run.

The version of TinyOS used for this paper is 2.1.2. TinyOS supports microprocessors ranging from 8-bit to 32-bit architectures, with anywhere from 2 KB to 32 MB of RAM (or more).

C. NESC

For programming motes, a new language ecosystem has been created that is focused on a component- and interface-driven methodology. It provides a simpler linking model than C and derives its features from C, C++ and Java, with the aim of building components, much like Java objects, that can be compiled into complete concurrent systems for robust embedded network applications.

Writing a program involves writing components and wiring them together through interfaces; wiring occurs at compile time and is bidirectional in nature. This builds up a concurrency model of hardware event handlers and tasks for each sensor module.

Fig. 5 shows Ubuntu 9.10 with a working TinyOS 2.x installation running in VirtualBox. Eclipse was installed in the machine with the YETI 2 development plugin so that the nesC libraries and syntax could be recognized during further code development.


IV. SYSTEM MODEL

The motes are programmed with pre-written programs so that one of them acts as a server and the others act as nodes. The gateway collects data from the different nodes and passes the data on to the base station.

Communication between the motes and the base station, and between the base station and the acquisition board, is based on ZigBee. The devices are operated under specific conditions and are connected to the MDA300CA acquisition board, as described in Fig. 6. The data acquisition board is in turn connected to the internet over WiFi so that the system can be controlled from anywhere.

Fig. 5 Eclipse + Yeti2 running in UbunTos virtual machine

The nesC code was run in a UbunTos (Ubuntu + TinyOS) virtual machine with Eclipse installed, along with the YETI 2 plugin for nesC syntax recognition and compilation.

Fig. 6 System working layout

V. MATHEMATICAL MODELLING

Numerous parameters arise when discussing ZigBee transmission technology. Some of them are:

1. Battery consumption of the on-board radio – the power usage of a ZigBee radio can be broken down into a combination of its distinct states:

p_con = p_tx + p_rx + p_sleep + p_idle        (1)

where p = E/t for each state.

2. The battery lifetime in hours can be calculated using the following formula (Peukert's law, with the symbols defined in Table no. 2):

t = i_c / i^n        (2)

3. Path loss, which is crucial in designing battery-powered systems, can be calculated for communication technologies that utilize the ISM band from the free-space model:

PL(dB) = −10 log10( G_t G_r λ² / (4πd)² )        (3)

where λ = c/f is the wavelength at carrier frequency f.

Table no. 2

p_con – total power consumption
p_tx – transmitted signal power
p_rx – received signal power
p_sleep – sleep-state power
p_idle – idle-state power, when no packets are transmitted or received
t – battery lifetime in hours
i_c – battery capacity in mAh
i – load current in mA
n – Peukert's exponent; it ranges from 1 to 1.3, where 1 is the nominal value
PL – path loss
G_t – transmitter antenna gain
G_r – receiver antenna gain
f – frequency of the wave
d – transmission distance
E – energy consumption
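For concreteness, Eqs. (1)-(3) can be evaluated with a few lines of Python. This is a minimal sketch based on the forms of the equations given above; the numeric values in the example calls are illustrative, not measurements from the kit.

import math

def total_power(p_tx, p_rx, p_sleep, p_idle):
    """Eq. (1): total radio power consumption as the sum of its state powers."""
    return p_tx + p_rx + p_sleep + p_idle

def battery_lifetime_hours(capacity_mah, load_ma, n=1.0):
    """Eq. (2): Peukert battery lifetime t = i_c / i^n, in hours."""
    return capacity_mah / (load_ma ** n)

def path_loss_db(g_t, g_r, freq_hz, dist_m):
    """Eq. (3): free-space path loss PL = -10*log10(Gt*Gr*lambda^2 / (4*pi*d)^2)."""
    lam = 3.0e8 / freq_hz                        # wavelength from frequency
    return -10.0 * math.log10(g_t * g_r * lam ** 2 / (4.0 * math.pi * dist_m) ** 2)

# Illustrative example: a 2.4 GHz link over 30 m, 2000 mAh cells, ~20 mA draw
print(battery_lifetime_hours(2000.0, 20.0, n=1.1))   # ≈ 74 h
print(path_loss_db(1.0, 1.0, 2.4e9, 30.0))           # ≈ 69.6 dB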

VI. ALGORITHM

The algorithm is described here through flow charts for both automated operation and manual control override. The user must intervene in case of a sensor or mote failure, which is notified through the app, or when he or she wants to personalize an appliance's properties beyond the presets.

As is evident from Fig. 6, the user connects to the system through any wireless network; command data are sent to the base-station mote via an Ethernet-connected PC and forwarded to the respective motes through OTAP (over-the-air programming).

Fig. 7 Flow chart for manual control

The autonomous programming model shown in Fig. 8 is based on context-aware computing by the motes: each property of the room (here humidity, temperature and moisture level) is monitored through a sensor and compared against personalized threshold values already stored, and the appliance properties are adjusted accordingly.

Fig. 8 Flow chart for autonomous control

PSEUDO CODE: Smart Home Control

if (device_state == 1)                              // device is in the ON state
{
    // code for connection establishment

    // Fan control
    if (room_temp > 25 && people_in_room > 0)       // temperature in °C
    {
        turnonfan();
    }

    // Light control
    if (light_intensity < 400 && people_in_room > 0)  // intensity in lux
    {
        turnonlight();
    }

    // Irrigation
    if (soil_water_content < 0.1)                   // volumetric soil moisture
    {
        turnonmotor();
    }
}
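The same control logic can be expressed as a short runnable sketch. The following Python fragment is illustrative only: the sensor-reading dictionary and the actuator callbacks are hypothetical stand-ins for the values delivered by MoteView and the Arduino-driven actuators.

# Thresholds follow the pseudo code above; the helpers are hypothetical.
TEMP_THRESHOLD_C = 25.0
LIGHT_THRESHOLD_LUX = 400.0
SOIL_MOISTURE_THRESHOLD = 0.1

def control_step(readings, actuators):
    """Apply one pass of the smart-home rules to the current sensor readings."""
    occupied = readings["people_in_room"] > 0
    if readings["room_temp_c"] > TEMP_THRESHOLD_C and occupied:
        actuators["fan"](True)        # fan on only when the room is occupied
    if readings["light_lux"] < LIGHT_THRESHOLD_LUX and occupied:
        actuators["light"](True)      # light on only when the room is occupied
    if readings["soil_moisture"] < SOIL_MOISTURE_THRESHOLD:
        actuators["motor"](True)      # irrigation does not depend on occupancy

def make_actuator(name):
    return lambda on: print(name, "->", "ON" if on else "OFF")

# Example invocation with made-up readings and print-based actuators
readings = {"people_in_room": 2, "room_temp_c": 27.5,
            "light_lux": 320.0, "soil_moisture": 0.05}
actuators = {n: make_actuator(n) for n in ("fan", "light", "motor")}
control_step(readings, actuators)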

VII. SIMULATION PARAMETERS

The simulated results are benchmarked against various parameters, which vary with the sensor module type. For the MTS420CC used here they include:

1. Humidity/temperature (°C) – an SHT11 single-chip sensor module providing calibrated digital output values.

2. Barometric pressure (mbar) – an Intersema M55ER SMD hybrid piezoresistive sensor with a 3-wire ADC interface.


3. Light intensity (lux) – a TSL2250 digital sensor whose dual photodiode provides an effective 12-bit dynamic range.

4. Acceleration (g) – a MEMS micromachined 2-axis, ±2 g accelerometer capable of tilt detection and of movement, vibration and seismic measurements.

5. GPS – a Leadtek GPS-9546 or u-blox LEA-4A module, supplying antenna power and serial data on USART1, for position detection.

VIII. RESULT ANALYSIS

Using the values obtained from MoteView, we can take the automation decisions. The decision is made according to the pseudo code given above. The motes send their values to the base station, which is responsible for accumulating the data; each record carries values such as voltage, humidity and temperature. From the pseudo code we derived an algorithm that turns the peripheral devices on under specific conditions (Figs. 7 and 8), while the data are monitored on a continuous time basis (Figs. 9 and 10). As an example, if the room temperature is above 25 °C, the room is considered hot and the fan should be turned on; however, to save power we first check whether people are present in the room (observed by a PIR sensor) and only then turn the fan on. The same applies when the humidity exceeds 30%, which is uncomfortable for humans. Other peripherals, such as the light and the motor, are likewise controlled on the basis of simple reasoning about humanly habitable conditions. The simulation results show that, for the chosen set of parameters, the system takes the proper steps to maintain human comfort through home automation.

Fig. 9 MoteView with mote topology graphview

The whole system was built with sensors and actuators connected to an Arduino for autonomous and manual control. Some pictures are attached here:

Fig. 10 MoteView displaying data acquired from sensors

Fig. 11 Detailed view of hardware model

Fig. 12 Agricultural model


Fig. 13 Mobile app login page

Fig. 14 User mode select menu

Fig. 15 Sensor data being monitored from app over Bluetooth

IX. CONCLUSION AND FUTURE WORK

With the upward trend in home automation, we require a system capable of simultaneously sensing and monitoring the environment and acting on it in real time to provide safe and secure surroundings. Early HAS relied mainly on appliance interfacing and control, but the prime focus has now shifted to achieving a secure and energy-efficient environment, which is being realized using WSN and ZigBee technology. In this paper, a ZigBee platform built from MICAZ motes with the MPR2400CA module has been implemented to obtain a self-forming, infrastructure-less network that can use various topologies as required, monitor its surroundings continuously and take decisions on demand. One major concern with such systems is energy efficiency; with component-based on-board power switching, power consumption has been substantially reduced, providing extended life and low maintenance cost. Home automation requires that the user stay continuously updated and be able to access the system from any remote location. Using the MIB520CB gateway with a LAN-connected PC running as a remote server linked to the authenticated home owner, the user can control the system by issuing commands in real time from anywhere in the world and also receives push notifications on a smartphone for any activity.

Forthcoming advances in sensor nodes will help overcome the remaining WSN issues with improved fault tolerance, context awareness, power management, quality of service and security. These developments will yield devices applicable to cognitive sensing, spectrum management, time-critical systems, mobile micro-machines, smart rotating building control, structural health monitoring, environment-friendly adaptive systems and cold-chain management.


X. REFERENCES

[1] Stephen Ellis, Lorson Blair, Yenumula B. Reddy, "Implementing a wireless sensor network using MEMSIC's professional kit," LA 71245, USA.
[2] Mattia Gamba, Alessandro Gonella, Claudio E. Palazzi, "Design issues and solutions in a modern home automation system," International Workshop on Networking Issues in Multimedia Entertainment, ICNC Workshop, 2015.
[3] Rajesh Kannan Megalingam, Vineeth Mohan, Paul Leons, Rizwin Shooja, Ajay M., "Smart traffic controller using wireless sensor network for dynamic traffic routing and over speed detection," 2011 IEEE Global Humanitarian Technology Conference, DOI 10.1109/GHTC.2011.99.
[4] Crossbow, XServe User's Manual, Revision E, April 2007, PN: 7430-0111-01.
[5] S. Benjamin Arul, "Wireless home automation system using ZigBee," International Journal of Scientific & Engineering Research, vol. 5, issue 12, December 2014, ISSN 2229-5518.
[6] Kunho Hong, Sukyong Lee, Kyoungwoo Lee, "Performance improvement in ZigBee-based home networks with coexisting WLANs," Pervasive and Mobile Computing, vol. 19, pp. 156-166, Elsevier, May 2015.
[7] Woong Hee Kim, Sunyong Lee, Jongwoon Hwang, "Real-time energy monitoring and controlling system based on ZigBee sensor networks," Procedia Computer Science, vol. 5, pp. 794-797, Elsevier, 2011.
[8] Mu-Sheng Lin, Jenq-Shiou Leu, Kuen-Han Li, Jean-Lien C. Wu, "ZigBee-based Internet of Things in 3D terrains," Computers & Electrical Engineering, vol. 39, issue 6, pp. 1667-1683, Elsevier, August 2013.
[9] Khusvinder Gill, Shuang-Hua Yang, Fang Yao, and Xin Lu, "A ZigBee-based home automation system," IEEE Transactions on Consumer Electronics, vol. 55, no. 2, May 2009.
[10] Intark Han, Hong-Shik Park, Youn-Kwae Jeong, and Kwang-Roh Park, "An integrated home server for communication, broadcast reception, and home automation," IEEE Transactions on Consumer Electronics, vol. 52, no. 1, February 2006.
[11] Ilker Korkmaz, Senem Kumova Metin, Alper Gurek, Caner Gur, Cagri Gurakin, Mustafa Akdeniz, "A cloud based and Android supported scalable home automation system," Computers & Electrical Engineering, vol. 43, pp. 112-128, Elsevier, April 2015.
[12] Francesco Palmieri, Massimo Ficco, Silvio Pardi, Aniello Castiglione, "A cloud-based architecture for emergency management and first responders localization in smart city environments," Computers & Electrical Engineering, Elsevier, DOI: 10.1016/j.compeleceng.2016.02.012.
[13] Benjamin Planche, Bryan Isaac Malyn, Daniel Buldon Blanco, Manuel Cerrillo Bermejo, "The Brightnest web-based home automation system," UCAmI 2014, LNCS 8867, pp. 72-75, Springer International Publishing Switzerland, 2014.
[14] Rischan Mafrur, M. Fiqri Muthohar, Gi Hyun Bang, Do Kyeong Lee, Deokjai Choi, "Awareness home automation system based on user behavior through mobile sensing," Springer-Verlag Berlin Heidelberg, 2015.
[15] Shuang-Hua Yang, "ZigBee smart home automation systems," DOI: 10.1007/978-1-4471-5505-8_13, Springer-Verlag London, 2014.
[16] Fei Ding, Aiguo Song, En Tong, and Jianqing Li, "A smart gateway architecture for improving efficiency of home network applications," Journal of Sensors, Hindawi, vol. 2016, Article ID 2197237, 10 pages.
[17] Peizhong Yi, Abiodun Iwayemi, and Chi Zhou, "Building automation networks for smart grids," International Journal of Digital Multimedia Broadcasting, Hindawi, vol. 2011, Article ID 926363, 12 pages.
[18] Mingfu Li and Hung-Ju Lin, "Design and implementation of smart home control systems based on wireless sensor networks and power line communications," IEEE Transactions on Industrial Electronics, vol. 62, no. 7, July 2015.
[19] Dae-Man Han and Jae-Hyun Lim, "Smart home energy management system using IEEE 802.15.4 and ZigBee," IEEE Transactions on Consumer Electronics, vol. 56, no. 3, August 2010.
[20] Alaa Alhamoud, Arun Asokan Nair, Christian Gottron, Doreen Bohnstedt, and Ralf Steinmetz, "Presence detection, identification and tracking in smart homes utilizing Bluetooth enabled smartphones," 13th Annual IEEE Workshop on Wireless Local Networks (WLN), 8-11 Sept. 2014, pp. 784-789.


Effective Evacuation Route Strategy during Natural Disaster

K-zin Phyo and Myint Myint Sein

(Manuscript received 25th June 2017. This work was supported in part by the University of Computer Studies, Yangon. K-zin Phyo is with the Geographical Information System Lab., University of Computer Studies, Yangon, Myanmar; e-mail: [email protected]. Prof. Myint Myint Sein is the Head of the Geographical Information System Lab., University of Computer Studies, Yangon, Myanmar; e-mail: [email protected].)

Abstract— Nowadays, most countries around the world are affected by disasters. A disaster can occur at any time and anywhere, without any warning. During a disaster, rapid response and recovery activities are critical to saving lives and property, and effective response actions play a vital role because a large number of lives and much property depend on them. However, rescue teams and emergency organizations face many problems and delays in responding effectively to the affected areas. To reduce risk and damage, identifying the best evacuation routes for the rescue teams is vital. The proposed system identifies not only the rescue teams located near the affected area but also the best evacuation routes for moving people from the hazardous place to safe places. This paper describes a web-based application for best evacuation route assessment during natural disaster.

Index Terms— Best evacuation route assessment, disaster situation, effective response.

I. INTRODUCTION

Over the past years, almost all countries around the world have suffered disasters, including floods, hurricanes, earthquakes, landslides and fires. These can cause huge damage to people and the loss of a large amount of property. Effective emergency response systems are essential to reduce the large risks posed by natural disasters. Today, every country struggles to recover and to respond to hazardous areas immediately. To deliver effective response actions, an emergency response system must identify the risk area directly, provide the services or rescue teams located closest to the victims' location, and supply the best evacuation route to a safe place.

In Myanmar, major natural disasters such as floods, earthquakes, tsunamis, cyclones and landslides have occurred. The impact of Cyclone Nargis, which struck in 2008, affected 2,400,000 people, left 138,000 fatalities and caused an estimated US$4 billion of damage to Myanmar. During this disaster, emergency rescue teams faced difficulties in carrying out effective evacuation operations in time. To overcome such difficulties during natural disasters, a best evacuation route estimation is developed and tested. The main goal is to evacuate people from the disaster area to a safe area in the event of an earthquake or flood.

In the case of any disaster, it is most important to respond, take recovery actions and reach the affected area in time. Much previous research has been carried out on route guidance systems. Urban fire is a violent problem for both developing and developed countries; for effective firefighting, a GIS-based route discovery system for fire events was developed using Dijkstra's algorithm and implemented based on the landmarks of the tested region [1]. A bus route information system for public transportation was developed to calculate the shortest route using the A* algorithm; it provides the shortest route and bus number, with information on public bus transport, for the Yangon Region [2-4]. R. Fadlalla et al. [8] proposed a system for producing digital route-guided maps and improving services in emergencies such as accidents; they utilized the capabilities of GIS in network analysis and visualization to enhance decision making when selecting the route to the nearest hospital, mapping the service areas based on travel time. Route analysis for a decision support system has also been suggested to find the shortest route between one facility and another at the time of a disaster [7]. The research part of this work comprises Geographic Information Systems (GIS) technologies, GIS web services, and how these interact with each other. Finding the best evacuation route along a road network is the main issue in an effective route navigation system. In this paper, a best evacuation route assessment providing effective evacuation processes in disaster situations is developed and tested.

II. BACKGROUND THEORIES

In many route-finding processes, the road network is treated as a graph: locations or points of interest are nodes or vertices, the connections between locations are edges, and the distances between nodes are the edge weights. Most graphs representing road networks consist of millions of nodes and edges. As the road network grows, route finding takes a large amount of time and the accuracy of an algorithm becomes harder to measure. Shortest-path algorithms are used to solve route-finding problems, which is not a simple task on large graphs; the algorithm must return the shortest path with high accuracy and as fast as possible. Optimal path-finding algorithms are widely studied and appear in many applications of daily life. In this paper, a modified Dijkstra's algorithm is proposed to improve the accuracy of the route-finding process.

Dijkstra's algorithm computes the shortest paths from the start node to all other nodes in the graph. Its main weakness is that it traverses all nodes in the graph, so it takes many iterations and consumes a lot of time, whereas a shortest-path algorithm should produce its result as fast as possible. Dijkstra's algorithm can find the shortest path in the road network, but it does not consider the road condition. To avoid closed and one-ended roads, a new variable n_state is added to the algorithm, and to reduce memory usage and processing time, the new condition d[n] == ∞ is added to the original code. The proposed method reduces the number of visited nodes that are not reachable from the source node; it is intended to improve the performance of the algorithm and to reduce the consumption of memory space. In this work, the modified Dijkstra's algorithm is used to find the optimal route for emergency vehicles. The pseudo code of the proposed algorithm is as follows:

function MD(G, source, destination)
    int n_state;                    // road condition of a node: 1 = open, 0 = closed/one-ended
    for each vertex v in G: d[v] := ∞;
    d[source] := 0;
    q := the set of all nodes in G;
    while q is not empty:
        n := vertex in q with minimum distance in d[ ];
        if n_state(n) == 1:
            remove n from q;
        else if d[n] == ∞ || n_state(n) == 0:
            break;
        end if
        for each neighbor v of n:
            temp_d := d[n] + d_between(n, v);
            if temp_d < d[v]:
                d[v] := temp_d;
                previous[v] := n;
                decrease-key v in q;
            end if
        end for
    end while
    return d[destination];

In the route-finding process, the proposed method eliminates the nodes that are not reachable from the source node as well as the nodes whose state value is '0'. Applying this method to optimal route calculation reduces the number of iterations, increases the performance of the algorithm and decreases the consumption of memory space significantly.
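To make the idea concrete, the following is a minimal runnable Python sketch of the modified algorithm. It uses a heap-based priority queue with lazy deletion in place of the decrease-key operation in the pseudo code, and the graph encoding, the node_state map and the sample edges (written in the style of Table 3) are illustrative assumptions, not the authors' implementation.

import heapq
import math

def modified_dijkstra(adj, node_state, source, destination):
    """Shortest path that stops early on closed roads (state 0) or when only
    unreachable nodes (d == inf) remain, mirroring the pseudo code above."""
    d = {v: math.inf for v in adj}
    d[source] = 0.0
    previous = {}
    pq = [(0.0, source)]              # (distance, node) min-heap
    visited = set()
    while pq:
        dist, n = heapq.heappop(pq)
        if n in visited:
            continue                  # stale heap entry (lazy deletion)
        if math.isinf(dist) or node_state.get(n, 1) == 0:
            break                     # unreachable or closed/one-ended road
        visited.add(n)
        for v, w in adj[n]:
            temp_d = dist + w
            if temp_d < d[v]:
                d[v] = temp_d
                previous[v] = n
                heapq.heappush(pq, (temp_d, v))
    return d[destination], previous

# Tiny example in the style of Table 3: (from_node, to_node, distance)
edges = [(21, 23, 8), (23, 22, 35), (22, 25, 85), (25, 19, 42)]
adj = {}
for a, b, w in edges:
    adj.setdefault(a, []).append((b, w))
    adj.setdefault(b, [])
state = {n: 1 for n in adj}           # all roads open
print(modified_dijkstra(adj, state, 21, 19)[0])   # 8 + 35 + 85 + 42 = 170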

Computing the distance between locations is an important part of much research related to geographical information systems. To calculate the distance between two locations, two geographical coordinates are required; the appropriate method depends on the objectives of the study, the nature of the data and the type of coordinates. On a flat surface, calculating the distance between points is simple, but on a spherical surface the main issue is to account for the curvature. The haversine formula is an appropriate equation for calculating the distance between two points in navigation: it gives the great-circle distance between two points on a sphere from their longitudes and latitudes and uses a constant r that represents the radius of the Earth.

Accounting for the Earth's curvature, the haversine formula is:

d = 2r · arcsin( √( sin²((φ₂ − φ₁)/2) + cos(φ₁) cos(φ₂) sin²((λ₂ − λ₁)/2) ) )        (1)

In the above formula, r is the radius of the Earth and d is the distance between two points with longitudes λ₁, λ₂ and latitudes φ₁, φ₂ respectively.
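As a sanity check, the formula can be implemented in a few lines. This sketch assumes coordinates in decimal degrees and a mean Earth radius of 6371 km; the example call uses two fire-station coordinates from Table 1.

import math

def haversine_km(lat1, lon1, lat2, lon2, r=6371.0):
    """Great-circle distance (km) between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Dagon Seikkan Fire Station to Hlaingtharyar_B Fire Station (Table 1)
print(haversine_km(16.845701, 96.265748, 16.875918, 96.068839))  # ≈ 21 km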

III. GENERAL ARCHITECTURE

Routing applications on urban road networks are very useful in daily life. Effective evacuation planning and a well-structured road network are essential components in a disaster situation: the road transport network is required for search-and-rescue operations, for delivering emergency supplies and services, and for carrying victims away from collapsed shelters. Such strategies can save lives, decrease suffering, and provide substantial savings and benefits to humanity. The general architecture of the effective evacuation route response system is shown in Fig. 1. The system provides an evacuation route not only for the people who need to move from the risk area to a safe place but also for the emergency vehicles travelling to the hazard area to save lives and property in time.

Fig.1. General Architecture


IV. SYSTEM DESIGN AND DATA CREATION

Fig.2. System Overview

The proposed system is intended to provide the best evacuation route for people and emergency vehicles during a disaster. Owing to complex road network structures and the absence of a good evacuation route guidance system, many developing countries face great difficulties. Road network transportation is important for the evacuation process: for providing emergency facilities and for moving people from the disaster-affected area to safe places. To save valuable lives and property, emergency vehicles need to reach the hazard area as fast as possible, and effective response actions and evacuation processes play a vital role during a natural disaster. The proposed system provides hazard-location and nearby-services discovery components and calculates the best evacuation route using the proposed modified method. An overview of the proposed system is given in Fig. 2.

The road network data, emergency service locations and damage locations are prepared and stored in the database. The data used in this system were collected from the relevant emergency service departments of the Yangon Region, from Google Maps and with a GARMIN eTrex 10 GPS device. A total of 41 fire stations, 47 police stations and 80 medical services were collected and used as emergency service location points. Sample emergency service data, with their geolocations, are shown in Table 1. Table 2 shows example data for incident location identification; in this work, street names are used as addresses. The road network table, created to calculate the optimal route, is illustrated by the sample data in Table 3.

Table 1. Emergency Services Location

Emergency Service Name Latitude Longitude

Dagon Seikkan Fire Station 16.845701 96.265748

Hlaingtharyar_B Fire Station 16.875918 96.068839

Mingalardon Fire Station 17.046965 96.140114

South Dagon Fire Station 16.854688 96.223212

North Dagon Fire Station 16.959331 96.295907

Shwepyithar Fire Station 16.97397 96.076451

Thaketa_B Fire Station 16.807593 96.21824

Shwpaukkan Fire Station 16.928111 96.184485

Rose Hill Hospital 16.809522 96.155364

Thaketa Hospital 16.805619 96.217417

La Gabar Hospital 16.901581 96.160019

Min Ga Lar Don Hospital 16.921114 96.133447

Kan Thar Yar Hospital 16.841681 96.203544

Yangon General Hospital 16.778903 96.148975

Orthopedic Hospital 16.819436 96.122583

Workers' Hospital 16.797989 96.172372

Yangon University Hospital 16.825747 96.134581

Yangon Children Hospital 16.788158 96.136464

Bayint Naung Police Station 16.864288 96.101723

Pazundaung Police Station 16.7786903 96.1719782

Pabedan Police Station 16.7753184 96.154622

Thingangyun Police Station 16.830411 96.186589

Thuwunna Police Station 16.847473 96.186586

Botahtaung Police Station 16.7715459 96.1726236

Mingalar Taung Nyunt Police Station 16.796389 96.152754

Table 2. Location Identification Sample Data

Id Street Name Latitude Longitude

1 Myanandar 1st 16.847820 96.11723

2 Myanandar 5th 16.852990 96.11611

3 Myanandar 6th 16.852390 96.11975

4 Myanandar 7th 16.852165 96.12692


5 Myanandar 8th 16.851607 96.12282

6 Gantgaw 2nd 16.842031 96.12482

7 Gantgaw 3rd 16.842573 96.12632

8 Gantgaw 4th 16.842783 96.12641

9 Gantgaw 5th 16.843318 96.12559

10 Gantgaw 7th 16.844727 96.12531

11 Gonnisetyone 16.851136 96.12499

12 Hlaing Buter Yone 16.837112 96.12665

13 Hlaing Sabal 16.849316 96.11772

14 HlaingYandanar Mon 16.853857 96.11433

15 Htantapin 16.833965 96.11413

16 Kan 16.838037 96.11939

17 Kha Poung 16.838654 96.11588

18 KhaYae 16.846591 96.12651

19 KhaYae 3rd 16.846747 96.12648

20 Khine Shwe War 16.832765 96.11248

21 Padauk Shwe War 1 16.837185 96.12227

22 Padauk Shwe War 2 16.837070 96.12197

23 Padauk Shwe War 3 16.837026 96.12164

24 Padauk Shwe War 4 16.83735209 96.12113

25 Paday Tha Yazar 16.87641512 96.17377

Table 3. Sample Data for Route Calculation

From_Node To_Node Distance

21 23 8

21 17 32

22 25 85

23 21 8

23 22 35

24 34 40

24 23 140

25 19 42

26 25 34

27 82 114

28 39 29

29 33 74

30 22 8

31 33 83

32 33 41

34 32 96

35 13 49

37 38 256

38 36 42

40 41 116

41 37 44

42 754 81

43 367 123

44 28 152

45 44 42

V. SYSTEM IMPLEMENTATION

The research work was implemented and tested on the Yangon Region.

Fig.3. Emergency Services Locations in Tested Region

Fig.4. Damage Location Identification


Fig.5. Defining the Close Emergency Services

Fig.6. Showing Best Evacuation Route

The road network of the Yangon Region consists of 87,038 edges and 27,852 nodes. Fig. 3 shows the emergency service locations in the tested region. Identification of the victim area's location is shown in Fig. 4, and the services located near the victim area are illustrated in Fig. 5. The evacuation route for moving people from the disaster area to a safe place, or the optimal route for the rescue team to reach them, is shown in Fig. 6.

VI. SYSTEM EXPERIMENT

The effectiveness of the proposed work was tested by calculations on the road network of the Yangon Region, with about 87,100 edges and 27,900 nodes. The performance of the proposed method is compared with the traditional Dijkstra's algorithm to demonstrate its improved efficiency. The two algorithms are evaluated by comparing processing time, as shown in Fig. 7, and the number of visited nodes, as given in Table 4.

Fig. 7. Comparison of the two methods in processing time

Table 4: Comparison of the two algorithms in visited nodes

Proposed Algorithm    Dijkstra's Algorithm
10                    25
20                    36
30                    45
40                    61
50                    92
100                   137

VII. CONCLUSION

The proposed work was developed to solve the problems faced by emergency rescue teams during natural disasters. It makes it possible to verify the precise location of the disaster area, to identify the emergency rescue teams located near the victim area, and to compute the optimal evacuation routes for transporting people from the hazard location to safe places. This system can significantly help emergency rescue teams by supplying the best route to reach the disaster location in time, and it improves the effectiveness and efficiency of evacuation and recovery actions. As future work, the proposed system will be advanced into a mobile application for worldwide use.

REFERENCES

[1] K. Phyo and M. M. Sein, "Effective Emergency Response System by Using Improved Dijkstra's Algorithm," 14th International Conference on Computer Applications, ICCA-2017, Yangon, Myanmar.
[2] M. T. Zar and M. M. Sein, "Public Transportation System for Yangon Region," 12th International Conference on Computer Applications, ICCA-2015, Yangon, Myanmar.
[3] M. T. Zar and M. M. Sein, "Finding Shortest Path and Transit Nodes in Public Transportation System," 9th International Conference on Genetic and Evolutionary Computing, ICGEC-2015, Yangon, Myanmar.
[4] M. T. Zar and M. M. Sein, "Using A* Algorithm for Public Transportation System in Yangon Region," in Proceedings of the International Conference on Science, Technology, Engineering and Management (ICSTEM 2015), Singapore.
[5] N. Kumar, M. Kumar and S. Kumarsrivastva, "Geospatial Path Optimization for Hospital: A Case Study of Allahabad City, Uttar Pradesh," International Journal of Modern Engineering Research, vol. 4, iss. 10, Oct. 2014, pp. 9-14.
[6] Ashraf Gubara, Zakaria Ahmed, Ali Amasha and Shawki El Ghazali, "Decision Support System Network Analysis for Emergency Applications," 9th International Conference on Informatics and Systems (INFOS), 2014, pp. 40-46.
[7] V. Bhanumurthy, Vinod M. Bothale, Brajesh Kumar, Neha Urkude and Reedhi Shukla, "Route Analysis for Decision Support System in Emergency Management Through GIS Technologies," International Journal of Advanced Engineering and Global Technology, vol. 03, issue 02, February 2015, pp. 345-350.
[8] R. Fadlalla, A. Elsheikh, A. Elhag, S. Eddeen Khidir Sideeg and A. Elhadi Mohammed, "Route Network Analysis in Khartoum City," SUST Journal of Engineering and Computer Science (JECS), vol. 17, no. 1, 2016, pp. 50-57.

K-zin Phyo was born in Kyaiktho, Mon State, in 1988. She received the B.C.Sc (Hons.) degree from Computer University, Mawlamyine, Myanmar in 2009 and the M.C.Sc degree from Computer University, Mawlamyine, Myanmar in 2011. She is currently working towards the Ph.D. degree at the University of Computer Studies, Yangon, Myanmar. Her research interests are image processing, geographical information systems, spatial databases and Android applications. She is a student member of IEEE.

Myint Myint Sein received the Ph.D. in Electrical Engineering from the Graduate School of Engineering, Osaka City University, Osaka, Japan in 2001. She joined the Keihanna Human Info-Communication Research Center, Kyoto, Japan as a postdoctoral researcher and research fellow. She has been serving as the Head of the Geographical Information System Lab., University of Computer Studies, Yangon, Myanmar since 2005. Her research interests are pattern recognition, image processing, soft computing, 3D reconstruction, 3D image retrieval, GIS and Android applications. She is a member of IEEE.


All enquiries should be forwarded to APAN Secretariat ([email protected])

Copyright © 2017 Asia-Pacific Advanced Network (APAN)