a probabilistic privacy preserving strategy for word-of-mouth social networksdownloads.hindawi.com...

Research ArticleA Probabilistic Privacy Preserving Strategy forWord-of-Mouth Social Networks

Tao Jing Qiancheng Chen and Yingkun Wen

School of Electronics and Information Engineering Beijing Jiaotong University Beijing China

Correspondence should be addressed to Qiancheng Chen 16120048bjtueducn

Received 3 May 2018 Accepted 21 June 2018 Published 8 July 2018

Academic Editor Fuhong Lin

Copyright copy 2018 Tao Jing et al This is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

An online social network (OSN) is a platform that makes people communicate with friends share messages accelerate businessand enhance teamwork In the OSN privacy issues are increasingly concerned especially in private message leaks in word-of-mouth A userrsquos privacy may be leaked out by acquaintances without userrsquos consent In this paper an integrated system is designedto prevent this illegal privacy leak In particular we only use the method of space vector model to determine whether the userrsquosprivate message is really leaked Canary traps techniques are used to detect leakers Then we define a trust degree mechanism toevaluate trustworthiness of a communicator dynamically Finally we set up a new message publishing system to determine whocan obtain the message of publisher Secrecy performance analysis is provided to verify the effectiveness of the proposed messagepublishing systemAccordingly a user in social networks can checkwhether other users are trustworthy before sending their privatemessages

1 Introduction

Fog computing is a distributed collaborative architecture thatenables specific applications or services between actual datasources and the cloud to be managed in the most efficientplace [1] This type of computing is effectively extendingcloud computing capabilities and services to the edge ofthe network bringing their advantages and functions tothe point where data can be executed and manipulatedin the closest proximity In other words fog computing isan extension of the concept of cloud computing Differentfrom cloud computing fog computing wins by volumeemphasising quantity no matter how weak a single com-puting node is Cloud computing adjusts computing powertypically by a high performance computing device in a stack[2]

Fog computing-based typical services in online socialnetworks (OSNs) such as Wechat Facebook Twitter andLinkedIn gradually become a primary mode to interact andcommunicate between participants Each user in an OSN is anode component that makes up the social network topologyThese nodes do not have strong computing and storagecapabilities but they can help with things like data transfer

Therefore it is a reliable assumption to apply fog computingto social networks Edge nodes in social networks are moremobile and more decentralized The most concerned issue insocial networks is privacy leaks In the process of communi-cation among these edge nodes there is a novel way of privacyleakage through word-of-mouth Word-of-mouth is a formof privacy disclosure on social networksThis kind of privacyleakage exists in a large number of real social networks butis seldom studied For simplicity we call it a word-of-mouthsocial network

A word-of-mouth social network exists in the realhuman-centric world which can pass messages from oneperson to another by oral communication and finally leadto the rapid spread of messages Sometimes the spread ofprivate information may be peeped misused and takenillegally by other strangers [3] As a result it is importantto prevent privacy disclosure caused by word-of-mouth Infact the disclosure degree of one userrsquos privacy spreadingis related to how to control others to access his data Inaddition it also depends on howmuch andwhat data the userwants to release To tackle this issue access control has beenenvisioned as a promising and effective approach to protectprivacy of a personrsquos account [4ndash6] We intend to carefully

HindawiWireless Communications and Mobile ComputingVolume 2018 Article ID 6031715 12 pageshttpsdoiorg10115520186031715

2 Wireless Communications and Mobile Computing

design a strategy among users in the word-of-mouth socialnetwork to share their data within a trusted user set

Considering the human-centric network our objective isto design an approach to achieve trustworthy word-of-mouthinformation release This approach can trace the source ofprivacy disclosure and automatically adjust the trust degreeof nodes in OSNs Our exploration of this uncharted areaneeds to answer the following three challenges First how canwe determine whether one userrsquos privacy has been alreadydisclosed and detect the leaker Second how to update socialrelationship of a user after detecting information disclosureFinally how to prevent acquaintances fromdisclosing privacythrough the manner of word-of-mouth

Aiming at the first challenges we intend to use thevector space model (VSM) and the canary trap technologyto achieve message disclosure detection The authors of [7]present a mathematical expression of a VSM to measurethe similarity of two sets Then Li et al improve the cal-culation method of the VSM in [8] They apply semanticresources to reduce the dimensionality of feature itemsSimilarly we apply the VSM in this article to calculate thesimilarity between the suspected leaked messages and thepublished ones to determine whether the messages is com-promised After determining message leakage we employthe canary trap technology to trace the source of the leakedmessages In brief different versions of sensitive data aresent to suspected leakers to find which version gets leaked[9]

Next we introduce the trust degree of a user to ensuresecure data sharing to cope with the second challenge Thetrust mechanism has been widely studied to achieve privacypreserving in traditional OSNs The authors of [10] investi-gate a recommendation belief based on a distributed trustmanagementmodel for peer-to-peer networksThey quantifyand evaluate the reliability of nodes and introduce a similarityfunction to construct the recommended credibility In [11]the authors propose a method to check usersrsquo credibilitybefore they enter a network In addition the authors in [12]design a secure recommendation system for mobile usersto learn about potential friends opportunistically Differentfrom the existing works we employ the dynamic trustdegree to classify the recipients and to rank the sensitivityof publisher to information that needs to be published Thereason is that a user will unintentionally disclose privacyand later regret the behavior if the user is in an emotionalstate at the time of posting [13] In order to avoid incorrectaccess to the sensitive-privacy information the premise ofaccess control for users is to classify the privacy informationclearly

Moreover in order to prevent the privacy leakage oneuserrsquos (publisherrsquos) privacy should only be shared with otherones (recipients) in a ldquocorrectrdquo user set in OSNs [14ndash16] Theset composed of numerous trusted recipients can be updatedbased on the dynamic trust value that is a personal perceptionof a publisher to recipients In this paper we allow thepublisher to rate recipients based on the value to determinewhether they are trustworthy or not and dynamically adjustprivacy settings of recipients by privacy disclosureWe designa systematic information publishing algorithm to reduce the

P U

P publisher

messages

２2

２1

２n

２n-1

３２-1５

３２５

３２2５

３２1５

[２1２2middot middot middot２n] recipientsU unexpected receiver

４０２

４０２2

４０２1

４０２-1

Figure 1 A privacy disclosure model in word-of-mouth OSNs

leakage probability of private messages In this case we canreduce the risk of illegal spread of messages [17 18]

To the best of our knowledge there are few solutions topreserve privacy in word-of-mouth OSNs which motivatesour work The main contributions of this paper are summa-rized as follows

(i) We employ the VSM to measure the similaritybetween leaked messages and original ones so as todetect whether privacy disclosure exists According tothe tracing results by using the canary trap methodwe determine which recipients are the possible leak-ers

(ii) We define a dynamic approach to compute trustdegree between users To handle privacy leaks fairlywe design a novel utility function to punish the trustdegree of leakers

(iii) We design a probabilistic privacy preserving strategybased on a publisherrsquos sensitivity to messages andrecipientsrsquo trust degree for publishing a userrsquos mes-sages securely

The rest of the paper is organized as follows Section 2introduces the privacy disclosure model in word-of-mouthOSNs In Section 3 we present our privacy disclosure detec-tion method via computing the similarity between leakedmessages and original ones Section 4 introduces the calcula-tion method of trust degree and the punishment mechanismfor leakers Next a trust degree based publishing system isproposed to achieve messages sharing with privacy preserv-ing in Section 5 Then we analyze the secure performance ofour strategy in Section 6 and survey relatedwork in Section 7Finally we conclude our work in Section 8

2 System Model

In this section we propose a privacy disclosure modelas shown in Figure 1 This model can be formulated as aweighted directed social graph Each node in the graph

Wireless Communications and Mobile Computing 3

denotes a user while a directed edge between two nodesrepresents the direction of messages transmission We define119879PR119894 and 119878R119894U as the trust degree of the publisher P toits recipients R119894 119894 isin [1 sdot sdot sdot 119899] and the similarity betweenthe unexpected receiver U with Prsquos recipients respectivelyAccording to the graph of social relationship we can extractthe corresponding social attributes to explore a dynamic trustevaluation mechanism to achieve privacy preserving

The privacy disclosure model in Figure 1 indicates that Pis aware that its private messages are leaked to U after P sendsthe messages to R119894 The threat of privacy disclosure may becaused by one or more users in R119894 So the publisher is awarethat his information is leaked and hoping to know whichrecipient sold him out In this article we hope to completethe tracking process of leakers Because it is a post-evaluationmethod we can only determine the probability that informa-tion will be leaked by a certain recipient Recipients who havedisclosed the privacy of the publisher should be penalized toprotect the publishersrsquo private message

In this paper we intend to employ the VSM algorithm tohelp A to determine whether the messages are really leakedoutNext using the canary trapmethod to assist A to trace thepossible leakers in the recipients set and then give penaltiesto these leakers (Here what we consider is that recipientsrsquomotivation for disclosing information is not malicious Theyare not trying to damage the publisherrsquos interests Thereforethemessage disclosure should get the penalty of trust withoutconsidering legal or other punishment) For the leakerswe dynamically adjust their trust degree to explore ourprivacy preserving strategy Here we present a novel utilityfunction to measure the update standard of recipientsrsquo trustdegree based on the centrality degree119862119890119899(119894) of recipients andthe similarity 119878119894119895 between two users In our model socialnetworking platforms do not need to monitor messagingbetween users The task of the platform is to help usersfind the leaker when they submit information detectionapplications

3 Detection and Tracing Scheme for Leakers

A publisher realizes that his information may have beenleaked but heshe does not have enough confidence todetermine what happened In our model the publisher canapply to the platform for leakage detection The messagingplatform uses the VSM to detect whether the publisherrsquosinformation is actually leaked In this section we first com-pute the similarity score between suspected leaked messagesand the corresponding published ones to determine whetherthe private message is leaked or not The implementation ofthe VSM-based algorithm is actually quantifying the processby which publishers realize that privacy is compromisedIn the second part after privacy disclosure detection weintend to employ the canary trap technology to trace privacyleakers

31 Privacy Disclosure Detection Comparing with imagesfiles and events we believe that only texts leakage cannotbe directly judged visually Therefore we only consider theprivate message to be a text and use the VSM method

to calculate the similarity between suspected text and theoriginal one

In general we can characterize text as a form of space vec-tor based on the VSM Therefore the similarity between twotexts can be measured by computing the similarity betweentwo vectors Assume that (1199051 1199052 119905119899) indicate the text tobe detected and (1199081 1199082 119908119899) denote the correspondingcoordinate values of the 119899-dimension space Then we intendto exploit theVSM to find out a score that indicates the degreeof semantic equivalence between two texts The followingdetails describe the procedure to determinewhether two textsare similar or not

311 Text Preprocessing First we use the NLPIR wordsegmentation system to complete the word segmentation(NLPIR system is a software developed by the Institute ofComputing Technology Chinese Academy of Sciences theprinciple of this system is based on the information crossentropy to automatically discover new language features andadapt to the language probability distribution model of thetest corpus to realize adaptive participle) and obtain 119899 wordsets (1199041 1199042 119904119899) that contain all the words that appear in thetext Next we continue to remove the stop word that refersto the words with high frequency but no practical meaningSuch words include prepositions adverbs and conjunctionsThey usually have no definite meaning in themselves andonly when they are placed in a complete sentence can theyhave a certain effect The widespread use of a stop wordin documents can easily cause interference with effectiveinformation It is very significant to eliminate noise beforefeature weighting and selection As a result we remove allstop words in the word set s119895 119895 isin [1 sdot sdot sdot 119899]312 Feature Extraction and Weight Calculation As featureitems higher analytical accuracy of phrases and sentencesmay result in higher analysis error rate In this paperwe choose words as a feature of text rather than phrasesand sentences In order to better reflect the performancedifference of feature terms in the text content we assign aweight value to each feature term In particular we calculatethe weight of feature terms of each text segment separatelyFor the 119895th paragraph of the 119894th text 119904119894119895 the feature termweight refers to the performance of the feature term 119905119896 in thetext segment 119904119894119895The formula for computingweights of featureterms can be indicated as follows119908119894119895119896 = 119905119891119894119895119896 = 119888119894119895119896119897119894119895 (1)

where 119905119891119894119895119896 is the frequency of the occurrence of feature term119905119896 in text segment 119904119894119895 119888119894119895119896 denotes the number of feature terms119905119896 in the text segment 119904119894119895 and 119897119894119895 represents the number ofwords contained in the text segment 119904119894119895313 Text Vectorization After extracting feature terms ofeach paragraph and assigning the corresponding weights thetext can be expressed as the form of a vectormatrix If the text119863 is divided into 119899 parts and each part has 119898 feature termsThe text 119863 can be expressed as follows


Input text 119863Output the vectorization result 119881[119863] of text119863(1) 119881[119863] = [997888rarr1198811997888rarr1198812 sdot sdot sdot 997888rarr119881119899]minus1(2) 119881119894 = [1199091 1199092 119909119899](3) Let section(D) be the segmentation result of the text D(4) For each segmentation in section(D) do(5) Word set 119904119895 larr997888NLPIR segmentword(119904119894)(6) For each 119908119894 in words set 119904119895 do(7) While stop words list contains 119908119894 do(8) Remove 119908119894 from word set 119904119895(9) Tlarr997888extract features in 119904119895(10) For each 119908119895 in T do(11) 119898 larr997888count(119908119895)countword(119904119895)(12) 119881119894 larr997888 119898119879(13) 119881[119863] larr997888 119881[119863] + 119881119894(14) Return 119881[119863]

Algorithm 1 Vectorization procedure

θ

cos

x

y

z

i

jdis(ij)

Figure 2 Cosine similarity and Euclidean distance in three dimen-sions

119863 = [1199041 1199042 119904119899]minus1 times [1199051 1199052 119905119899]= (11990811 sdot sdot sdot 1199081119898 d

1199081198991 sdot sdot sdot 119908119899119898) (2)

The process of text vectorization is shown in Algorithm 1Here we segment paragraphs and words remove the stopwords extract the feature terms and calculate the weight offeature terms

In Algorithm 1 ldquosection(D)rdquo denotes the processing pro-cedure of text segmentation ldquoNLPIR segmentwordrdquo repre-sents a segmentation interface ldquostop words listrdquo representsthe disabled word list defined in the text preprocessingpart ldquocountrdquo is a function that calculates the number ofoccurrences of each feature term in the text and ldquocountwordrdquois a function that calculates the number of words containedin the text

314 Similarity Measurement Cosine similarity and Euclid-ean distance are the most common methods to measuresimilarity as shown in Figure 2

From Figure 2 we can see that cosine similarity mea-sures the angle of vector space which shows the differenceof vectors in direction Euclidean distance measures the

Table 1 The result of similarity calculation

sentences similarity scores1 Julie loves me more than Linda loves me (12) 07536025322252 Jane likes me more than Julie loves me (13) 01284080270023 He likes basketball more than baseball (23) 0257423195662

absolute distance between two points The cosine similarityis not sensitive to the absolute data so it is applicable tothe similarity analysis of specific features Therefore wetake advantage of the cosine distance calculating methodto measure the similarity Intuitively two vectors (texts) areindependent or irrelevant if 119878119894119898(1198941 1198942) is equal to 0 while119878119894119898(1198941 1198942) is equal to 1 when two vectors (texts) are the sameThe similarity between information 1198941 and information 1198942 canbe calculated as follows

119878119894119898 (1198941 1198942) = 1119899 119899sum119895=1

cos 120579119895 = 1119899 119899sum119895=1

997888rarrV1119895 sdot 997888rarrV211989510038171003817100381710038171003817V111989510038171003817100381710038171003817 sdot 10038171003817100381710038171003817V211989510038171003817100381710038171003817= 1119899 119899sum119895=1

2sum119898119895119896=1

11990811198951198961199082119895119896radic(sum119898119895119896=1

11990811198951198962) (sum119898119895119896=1 11990821198951198962) (3)

In order to verify the feasibility of our solution weimplement the algorithm using Python tools on WindowsDue to limited space we only introduce the experimental textwe selected and experimental results We carried out vectorextraction operations on three sentences shown inTable 1 andcalculated the similarity between them Finally according tothe cosine distance we obtain the similarity between threetexts respectively as shown in Table 1

We simply showed how to calculate the similaritybetween texts When the publisher realized that their sen-sitive information had been compromised OSNs calculatethe similarity between suspicious information and targetinformation to determine whether publisherrsquos information isactually compromised or not After a lot of experiments weconcluded that when the similarity score is greater than 075the meaning of the sentence is basically the same Thereforein this chapter we say that the text-privacy has been leakedwhen the similarity score between the suspect text and thetarget text is greater than 075

When a publisher realizes that their a sensitive messagehas been compromised we would compute the similaritybetween a suspicious message 119868119904 and the original message ofpublisher In this case we can determine whether publisherrsquosprivacy is actually compromised or not The VSM approachquantifies the process of publishersrsquo awareness of the leakageof private message Once the publisher is aware of a privacybreach they can adopt a canary trap approach to detect whichusers have recently leaked their private message

32 Canary Trap Techniques After conducting the VSM-based privacy disclosure detection we assume that a pub-lisherrsquos privacy has been leaked In order to detect leakers it isnecessary to have a strategy to determine whether or not suchmessage is used by some user illegally The canary trap is an


Table 2 Canary traps for different types of message

Message types Trapping settingsDigital images Different watermarksDatabase files Different values of some cellsEvents Different values of some attributesTexts Mixture of different paragraphs

approach to detect information leakage sourceThe basic ideais that the publisher sends each suspect a different version ofsensitive files and focuses on which version is leaked

We consider different types of message using differenttraps for distinction according to the leakage message Wedivide the types of message into four categories digitalimages database files events and texts as shown in Table 2Accordingly we provide four different types of message trapsin our canary trap algorithm

Recent events indicate that a user in publishersrsquo friendcircle leaks publisherrsquos private message After verifying thefact that private message is disclosed we use the canary traptechnique to find out leakers First we consider that thetype of message trap is a digital image We send an imageembedded with different fingerprints to 119899 users 1198771 1198772119877119899The digital fingerprint is embedded in each userrsquos copy with aunique ID that can be extracted to help track the leaker whenunauthorised leaks are foundWe use digital watermarking toembed unique fingerprints in each copy of an image beforereleasing the image

Definition 1 (digital watermarking) Digital watermarkingtechnology directly inserts some identification informationinto the digital carrier andmodifies the structure of a specificarea without affecting the value of the original carrier Digitalwatermarks are not easy to detect and remodify but can beidentified by the manufacturer Through these watermarkinformation hidden in the carrier we can achieve the purposeof confirming that the content creator and purchaser transmitsecret information or determine whether the carrier has beentampered with

For digital images represented by vector119883 119899 recipients119877119899is required and the image owner generates unique fingerprint119908119894 for 119877119899 Watermark images will be passed to the recipient119877119899 that can be expressed as follows119884119894 = 119883 + 119869119873119863 sdot 119908119894 (4)

where 119884119894 is the digital image embedded with a fingerprint119869119873119863 is used to achieve the imperceptibility of embeddingfingerprint in 119884119894 which makes each copy unique In thiscase we can identify the leaker if dishonest users illegallyrepost its copy We assume that recipients will not be able todecrypt the fingerprint by collusion and a digital fingerprintidentification system can track the leaker as shown inFigure 3

Secondly for a database file we modify data that isnot important to implement different versions What is notimportant here is that changing of its value does not affectthe trend of the database We need to generate 119899 different

+

+

+

+

X

P

Fingerprint extracting

U

Fingerprintextracting

Leaker identifying

Leaker Tracing

２1

２2

２k

２N

Figure 3 Digital fingerprinting system

versions of data Supposing the number of data units thatneed to be changed is 119909 we can get 119909 = log2119899 Similarly oncea database has been detected and leaked we can judge theleaker

Thirdly if we want to select a short event as an informa-tion trap we modify the basic attributes like time address ortarget of the event In short 119899 users 1198771 1198772119877119899 may obtain 119899completely different times

Finally we define that a text has 119898 paragraphs if weintend to select the text as an information trap There aresix different versions of each paragraph and the mixing ofthese paragraphs is unique to each numbered copy of thetext Each version has minor changes such as the font orspacing of words used in text Unless someone tries to findthe difference it would not be noticed There are more than6119898 possible permutations but the actual text has only 119899numbered copies (assume 6119898 gt 119899) If someone refers to twoor three paragraphs of these paragraphs we knowwhich copyhe sees so we know who leaked private message

We consider two special cases of false positive and falsenegative First the false positive condition means that theleaker in the last time privacy breach event might not havebeen detected by this canary trap experiment But a personwith a high frequency of leaks can not protect himself fromevery detection Second we are considering another possiblescenario in which although a user accidentally leaked in thecanary trap we deliberately falsified information We cannotbe absolutely sure that the previously detected sensitiveinformation must be the user who leaked information in thecanary trap This is a false negative result However even ifthe previous message is not leaked by this user he should stillbe punished for his mistake

Although this approach can only determine the leaker ofprivacy in a certain probability we still want to determinethe person who leaks the privacy as fair as possible Whatwe need is to fairly assign different levels of trust penaltiesto different users Therefore we introduce the concept ofsimilarity and centrality to distribute the trust degree penaltyvalue fairlyThis process will be described in detail in the nextsection


0

Absolutedistrust Indecisiveness

Absolutetrust

High distrust Moderate

distrustModerate

trust

High trust

1

Figure 4 Trust degree range in an online social network

4 Trust Degree Mechanism

In this section we construct a trust degree mechanism tocalculate and punish the trust degrees of the publisher torecipients Trust acts as the glue that holds networks togetherenabling networks to function effectively even though theylack a hierarchal power structure

In the first part we introduce a dynamic calculationapproach of trust degree when there is no leakers In thesecond part we introduce how to punish the trust degree ofleakers when there exist leakers

41 Dynamic Calculation Approach of Trust Degree In ourmechanism the value of trust degree ranges from 0 to 10 represents absolute distrust while 1 means absolute trustspecific classification is shown in Figure 4

The trust degree of a user to another user consists oftwo parts direct trust degree (119863119879) and recommendationtrust degree (119877119879) 119863119879 indicates the direct trust level of thepublisher to the recipient based on the direct interactionexperience 119877119879 represents the recommendation trust andrelies on rating information from the other recommenderWe set P R and M to represent publisher recipient andrecommender respectively The trust degree of the user P tothe user R is defined as 119879(119875 119877) which can be computed asfollows 119879 (119875 119877) = 120572 times 119863119879 (119875 119877) + 120573 times 119877119879 (119875 119877) (5)

where119863119879(119875 119877) indicates the value of direct trust degree and119877119879(119875 119877) indicates the value of recommendation trust degree120572 and 120573 are the trust degree regulation factors Their valuesare related to the proportion that the publisher pays attentionto DT and RT Generally we can set 120572 and 120573 according to theinteraction frequency between P (or M) and R Specifically ifP transacts with Rmore frequently thanM thenwe can give120572a higher value and 120573 a lower value and vice versa Thereforewe need to calculate DT and RT respectively

DT(PR) DT between the publisher P and the recipient Rcan be calculated as follows119863119879 (119875 119877) = sum119870119896=1 119864119896 (119875 119877) times 119891119896 times 119908119896 (119875 119877)sum119870119896=1 119891119896 (6)

(i) We assume that the publisher P has conducted a totalof K transactions in the past with the recipient R

(ii) The evaluation value of the 119896th transaction is119864119896(119875 119877)119864119896(119875 119877) is provided by the user P and belongs to[0 1](iii) 119891119896 is a time fading function and is defined as follow

Definition 2 (time fading function) In order to improvethe authenticity and dynamic adaptivity of direct trustwe consider that the interaction behavior of the past hasattenuated the direct trust effect compared to the currentinteraction behavior Specifically compared with the 119870thcurrent transaction the importance of the 119896th transactionin the past same transactions is depreciated We define thisattenuated function as time fading function that can becalculated as follows119891119896 = 120575119870minus119896 0 lt 120575 lt 1 1 le 119896 le 119870 (7)

where 119891119896 = 1 is previous interaction without attenuation and1198911 = 120575119870minus1 is the first interaction with the largest attenuation(4) Last weight of the interaction behavior of the 119896thtransaction is represented as119908119896(119875 119877)The interaction behav-ior is defined by user P and can be divided into five gradesaccording to the magnitude of behavior We assign differentweights to different interactions which can distinguish theeffect of different interactions on trust to a certain extentWeights of each grade from big to small are 1 08 06 04 02Therefore the direct trust degree 119863119879(119875 119877) of the (119870 + 1)thinteraction can be calculated by (6)

RT(PR) RT of user P to user R is converged by P for thedirect trust of the all recommendation users to R Here RT isa comprehensive evaluation for R by all users who have beeninteracted with 119877 which represents the overall credibility of119877 in social networks So we define the value of RT of user Pto user R as119877119879 (119875 119877) = sum119872isin119866119863119879 (119872119877) times 119862119875119872sum119872isin119866119862119875119872 119862119875119872 ge Θ (8)

where 119877119879(119875 119877) is the recommendation trust degree of recip-ients 119862119875119872 is the credibility of P put to M and G representsa collection of all trusted recommended users Value of 119862119875119872is equal to the direct trust 119863119879(119875119872) of publisher P to rec-ommenderM In this process of calculating recommendationtrust the publisher needs to provide a constant threshold ΘThe publisher P adopts recommendation information onlywhen trustworthiness 119862119875119872 of the recommender is greaterthan Θ

42 Trust Degree Punishment of Leakers In the previoussection we detect private message leakers through the canarytrap techniques In this section we punish these leakersby reducing the trust degrees of them We punish trustdegree of leakers based on a utility function The utilityfunction is integrated by the similarity between recipientsand unexpected receiver with the centrality of the recipientsTherefore we introduce two concepts the similarity betweentwo users and the centrality of a user

Definition 3 (similarity) The similarity indicates the degreeof separation between two users It can be calculated by theamount of common friends in social networks Sociologistshave found that if two people have one or more friends theyhave a greater chance of knowing and meeting each other


Definition 4 (centrality) The centrality of a user is theindex of the relative importance of quantized user in socialnetworksThere are many manners to define the centrality ofthe user such as betweenness centrality closeness centralityand degree centrality Among these manners we apply thedirect centrality calculating approach that is defined as thenumber of other users in direct contact with the user tocalculate similarity

Accordingly the centrality of user 119894 can be expressed asfollows

119862119890119899119894 (120591) = sum119873119896=1 119889119894119896 (120591)119873 (9)

where 119889119894119896(120591) = 1 or 119889119894119896(120591) = 0 indicates whether there is aconnection or not between the user i and the user k at time120591 119873 is the number of users in the network The similaritybetween two users can be derived as follows119878119894119895 (120591) = 1 + 10038161003816100381610038161003816119865119894 (120591) cap 119865119895 (120591)10038161003816100381610038161003816 (10)

where 119865119894(120591)(119865119895(120591)) denotes the set of friends of user 119894(119895)at time 120591 When we compute the utility function belowwe convolve with the centrality and the similarity If thesimilarity is equal to zero the utility function makes nosense However when two users have no mutual friends thesimilarity between them is equal to zero So we add 1 in (10)

We apply the utility function as the standard metrics topunish the recipient who spread publisherrsquos privacy infor-mation If leakers who make mistakes in the canary trapare friends of the unexpected receiver U we calculate thesocial similarity between leakers and the userUby comparingtheir friendship lists We also calculate the social centralityof leakers Note that social similarity and social centralityonly reflect the characteristics of the network structure Inaddition we also need to consider the dynamics of socialnetworks Because of the dynamic change of nodes thecomprehensive utility should be a time-varying functionTo address dynamic characteristics and avoid cumulativeeffects we define the utility function as the convolution ofsimilarity and centrality with a factor decaying as time Thetotal utility function value is the convolution of 119862119890119899119877(120591) with119878119877119880(120591) 119862119890119899119877(120591) is the centrality of the recipient and 119878119877119880(120591)is the similarity between the recipient R and the unexpectedreceiver U119884119877119880 (119879) = 119878119877119880 (119879) otimes 1 minus 119862119890119899119877 (120591)119879= int119879

120591=0119878119877119880 (120591) sdot 1 minus 119862119890119899119877 (120591)119879 minus 120591 (11)

where the convolution operation provides a time-decayingdescription of all prior values of similarity and centrality And119884119877119880(119879) is updated each time by accumulation of similarityand centrality when a leak event occurs

In the previous section we can get recipients who makemistakes in the canary trapNowwe punish themby reducingtheir trust degree values Specific penalties scale dependson the utility function value of a leaker 119897119894 The higher the

degree of utility function the more the value of trust degreeis reduced Above we have proved that the higher the utilityfunction of the leaker the greater the probability of leakinginformation Therefore we calculate the value of the trustpenalty to a leaker based on the ratio of the utility function ofthe leaker and the sum of all utility function value We definethis specific measure of punishment as follows

119879119904119906119887 (119897119894) = 119884119897119894119880 (119879)sum119871119894=1 119884119897119894119880 (119879) lowast 119879119904119906119887 (12)

where 119879119904119906119887(119897119894) is a specific penalty in trust degree of leaker119897119894 and L is a collection of leakers who make mistakes incanary traps 119879119904119906119887 is a total attenuation in trust degree causedby a leak event of privacy information and this value isdetermined on the publisher After attenuation the trustdegree of a leaker can be calculated as follows119879 (119875 119897119894)1015840 = 119879 (119875 119897119894) minus 119879119904119906119887 (119897119894) (13)

The utility function is proportional to the similarity betweenthe leaker and the unexpected receiver and inversely propor-tional to the centrality of the leaker A leaker with a highutility function will be penalized with a high trust valueBut the total trust penalty will be equal to the publisherrsquosexpectation of a trust penalty

|119871|sum119894=1

119879119904119906119887 (119897119894) = 119879119904119906119887 (14)

Users who make more mistakes in canary traps will bepunished with more trust values Users who receive a trustpenalty are likely to fall into lower security classes (thisconcept is described in the next section) So recipients wholeak usersrsquo privacy too often will be less likely to receiveprivate information from publishers

So far we complete the calculation process and punish-ment process of the trust degree mechanism In the firstpart we propose a calculation approach of trust degreein the case of no leakers In the calculation approach oftrust degree the interaction between users can help toenhance the trust between users We comprehensively takeinto account the direct trust of the publisher to the recipientand the comprehensive evaluation of the recipient by otherusers around the social network This approach of trustcalculation takes into account the difference in degree ofinfluence of interaction between different time periods Everyinteraction between users changes the value of trust degreeThe recommendation trust of users also directly changesthe publisherrsquos trust in the recipient In the second part wedemonstrate how to punish the trust degree of leakers inthe case of existing leakers Compared with the traditionalmethod of trust degree mechanism we has a better real-timeand dynamic trust degree mechanism

5 Design of Message Publishing System

In this section we design a message publishing system toreduce the risk that publishersrsquo private message would be


Input Publisherrsquos weighted social graphOutput Publishing rules(1) Phase I The publisher classifies the sensitivity of the message(2) Let I be the message that user want to be published

(3) 119878119868 larr997888 119904119890119899119904119894119905119894V119894119905119910(119868)(4) Execute sensitivity classification after each publishing(5) Phase II Divide the secure level of the recipient(6)Let 119877 be the recipient(7) 119878119871(119877) = 119883119883 isin [1 2 3 4 5](8) Phase III Determine whether to share message 119868 with 119877 or not(9) if 119878119868 gt 119878119871(119877) then(10) Return Message 119868 can be sent to recipient 119877(11)(12) else(13) Return Message 119868 can not be sent to recipient 119877(14)(15) end

Algorithm 2 Rating scheme

compromised First we propose a rating scheme to classifysecurity level of recipients and sensitivity of publisher Nextwe introduce our message publishing system model formallyand analyze the work flow of the system

51 Rating Scheme We propose a rating scheme based onthe trust degree of a publisher to recipients and publisherrsquossensitivity to message The rating scheme protects the privatemessage of the publisher in OSNs

Definition 5 (publisher sensitivity rating) Rating of thesensitivity of a publisher is based on the fact that differentpublishers have different sensitivities Lower sensitivity of apublisher has relatively lower demand for privacy protectionwhile higher sensitivity of a publisher requires a strong pro-tection We divide the sensitivity of the publisher on specificmessage into five levels expressed as 119878119894(119878119894 = 1 2 3 4 5)Level 5 is the lowest sensitivity and level 1 is the highestsensitivity Since the publisher may have different sensitivityrequirements for information at different time it is necessaryto perform sensitivity rating at each publish request

Definition 6 (the secure level rating) A publisher needsto evaluate the trust degree of recipients that they wantto interact with According to the trust degree mechanismproposed in the previous section the publisher can obtainthe trust degree of all recipients including leakersThe securelevel (SL) of recipients is divided into five levels according tothe trust degree 119879(119875 119877)1015840 We quantify this trust degree ratingprocess and the secure level is upgraded from 1 to 5 as shownin Table 3

In Algorithm 2 we obtain the publisherrsquos request level formessage sensitivity and classify the recipientrsquos secure levelConsidering message sensitivity and the secure level of arecipient the proposed algorithm can help the publisherdetermine whether to share their private message withrecipients or not

Table 3 Security level of receivers in OSNs

Security level Trust level Value of trust degree1 Absolute trust 08 to 12 High trust 06 to 083 Indecisiveness 04 to 064 High distrust 02 to 045 Absolute distrust 0 to 02

Obtain the publisherssensitivity to information

Trust degreemechanism

Access control suggestion

Rating scheme

Decision on expected recipients

2 Suggestionsof recipients

4 Publishing

3 Revision ofrecipients suggestions

1 Posting(textsphotosvideos)

I

II

III

Figure 5 Overview of message publishing system

52 System Overview In order to prevent the leakage ofprivacy in OSNs it is necessary to develop guidelines todetermine the message publishing system The system isexploited before a user posts on a social network site Thesystem determines who should be allowed to access messagesand inform the user of the recommendation In additionusers can add or exclude their own access to the proposednonproprietary information

The overview of the proposed system is shown as inFigure 5 It consists of three components the rating scheme


module the access control suggestion module and therevision of recipient suggestion module We describe thesemodules as below

(i) Rating scheme moduleAuser needs to provide theirsensitivity to the post while applying for publishing apost because different users have different sensitivityto different messages The system then queries therecipientrsquos trust degree to build their secure levels Inthis way the system obtains the sensitivity level of thepublisher to the messages and the secure level of therecipient Then the system can compare these valuesand formulate a access control suggestion for the nextmodule

(ii) Access control suggestion module According to thelevel of disclosure determined in the previous stepthis module determines the intended recipient basedon the level of trustworthiness in the self-network ofOSNs We take into account the situation where theuser is expected to send but the level of trust degreedoes not meet the requirements We also take intoaccount the situation where the level of trust degreemeets the requirements but the user is not expected tosend During this procedure it considers the revisedmessages associated with the previous posting andadds or excludes the intended recipient

(iii) Recipient suggestion revision module This moduleinforms the publisher about the associated accesscontrol suggestion (analysis results) The access con-trol suggestion contains a list of recipients When apublisher gets a list of recipients heshe can choosewhether to publish information to the recipients inthe list or revise the recipients in the list The autho-rizationrevision messages are stored in the systemand would be considered in future access controlrecommendations

The proposed message publishing system can help thepublisher to choose the right recipients to access thesemessages Because our system can prevent recipients whomay leak privacy of the publisher from receiving messagesthe risk of privacy leakage for the publisher can be reducedeffectively

6 Security Performance Analysis

In this section we study the secure performance in terms ofattack analysis secrecy analysis and access control analysis

61 Attack Analysis The leaking model we proposed indi-cates that recipients may spread privacy to other friends orstrangers without userrsquos consent According to the privacydisclosure model in word-of-mouth OSNs shown in Figure 1P shares its private messages to its recipients 119877119894 119894 isin [1 sdot sdot sdot 119899]but it is aware that the messages are known by U after aperiod of timeUsers who have disclosed the privatemessagescan probably be these recipients and this manner of privacydisclosure is known as word-of-mouth

By calculating the similarity between a suspicious mes-sage and the original one of publisher we exploit the VSM to

help P know whether its privacy is disclosed Then a canarytrap technique is used to trace the source of disclosure Inthis technique we consider a possible scenario that we cannotbe absolutely sure that the previously detected sensitiveinformation must be the user who leaked in the canary trapbecause the recipient leaked the information in this canarytrap does not mean that he also leaked the previous privacyof the publisher However even if the previous messages arenot leaked by the user it should still be punished for itsmistake What we need is to fairly assign different levels oftrust penalties to different users As a result we combine thecentrality degree 119862119890119899(119877) of recipient R with the similarity119878119877119880 between R and unexpected receiver U to obtain a utilityfunction The specific penalties scale depends on the utilityfunction value of the leaker Intuitively the higher the degreeof the utility function is the more the value of trust degree isreduced According to the dynamic evaluation of recipientsrsquotrustworthiness decreasing of the recipientrsquos trust degreedirectly affects the ability to obtain private messages Finallywe set up a newmessage publishing system to determine whocan obtain private messages

62 Secrecy Analysis According to the sensitivity degreeof a publisher on private messages our privacy preservingstrategy first classifies users into different sensitive levelsNext the publisher computes and classifies the trust degreeof recipients Finally sensitivity rating and the secure levelof the publisher are used to determine whether a recipientis allowed to obtain publisherrsquos private messages or not Thetrust-based privacy preserving strategy may change alongwith time publisher itself and other factors In addition theprocess may start by issuing an interactive request from thepublisher to the recipients so that the recipients cannot accessa message without authorization

63 Access Control Analysis When a user in an online socialnetwork is ready to publish privatemessages it considers thatthe private messages can be known only by a small group ofits recipients rather than by random strangers The proposedmessage publishing system can help the publisher to choosethe right recipients to access these messages Our systemrequires users to submit their own sensitivity levels to thepost before applying for a post because different users havedifferent sensitivity to different messages After the publishersensitivity level acquisition process is complete we countthe secure level of the publisherrsquos recipients through ourdynamic trust degree mechanism and then give the publishera list of suggested recipients of messages The publisher canmake revisions to the recipient list to delete or add andgive feedback about the revision information to the accesscontrol system This feedback also would be considered infuture published access control recommendations Becauseour system can prevent recipients whomay leak privacy of thepublisher from receivingmessages the risk of privacy leakagefor the publisher can be reduced effectively

7 Related Work

A service of online social networks (OSNs) such as WechatFacebook Twitter and LinkedIn gradually becomes a


primary mode to interact and communicate between par-ticipants A person can use these social applications at anytime to contact with friends and send messages regardlessof age gender and even socioeconomic status [19ndash21] Thesemessages should contain sensitive and private informationeg location [22] channel state information [23] routinginformation [24] social relationships [25] browsing data ofInternet [26] health data [27] and financial transactions [28]Obviously the person should be worried about disclosures ofpersonal information which may be harmful to him eitherin virtual or real world [13 29 30] As a result an efficientprivacy preserving strategy should be investigated to detectthese threats [31]

The existing works in privacy preserving focus on infor-mation disclosure caused by malicious nodes (eg attackersor eavesdroppers) in OSNs [32 33] Essentially speaking theauthors of these works first convert nodes and the corre-sponding links in OSNs into vertices and edges of a weightedgraph and then exploit graph theory and cryptographytechnologies to develop various protection solutions [34ndash36]Although these solutions can protect personal informationfrom network attacking and illegal eavesdropping there stillexists a more serious threat to usersrsquo privacy ie informationdisclosure by word-of-mouth

Text similarity calculation has been widely used inInternet search engine [37] intelligent question and answermachine translation information filtering and informationretrieval In this paper we use the text similarity calculation todetermine whether the publisherrsquos text information is leakedIn terms of algorithm the most commonly used VSM in textsimilarity calculationwas first proposed byGerard Salton andMcGill in 1969 [38]The basic idea of the algorithm is to mapthe document to an n-dimensional vector so as to transformthe processing of text into a vector operation on a spatialvector The similarity between documents is determinedby comparing the relations between vectors Among themthe most widely used weight calculation method is TF-IDFalgorithm [39] and various improved algorithms The mostcommonly used similarity measurement method is cosinesimilarity measurement [40]

To sum up the privacy protection strategy in socialnetwork can be divided into two ways role access control anddata anonymity The anonymous method is mainly used formultidimensional data such as network topology For specificprivacy such as user attributes access control based on trustis a reliable way to protect privacy

8 Conclusion

Considering privacy word-of-mouth disclosure by acquain-tances we put forward a novel privacy preserving strategyin this paper In particular we carefully combine the privacyleaking detection with the trust degree mechanism The tra-ditional privacy protection schemes aremainly on computinga trust degree threshold In their schemes a friend whosetrust degree exceeds the threshold is considered believableIn contrast to these schemes our strategy incorporates twonew points of classifying each publisherrsquos sensitivity and eachready interactive recipient Our proposed publishing system

consists of publisherrsquos sensitivity level to information ratingand recipientrsquos secure level rating What is noteworthy is thatour scheme still gives the decision of message release to thepublisher after giving the suggestion of access control Thepublisher can make their own changes to the recipient listand these revisions alsowill be considered in future publishedaccess control recommendationsTherefore our informationpublishing strategy greatly reduces the risk of illegal spread ofuserrsquos private messages

In future we want to provide a solution that is intendedto prevent the recipient from illegally forwarding themessageitself rather than the content of the message Also we mayexplore a sensitive message transmission interface for OSNsapplications to protect usersrsquo privacy

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (Grants nos 61471028 61571010 and61572070) and the Fundamental Research Funds for the Cen-tral Universities (Grants nos 2017JBM004 and 2016JBZ003)

References

[1] J Barmagen ldquoFog computing introduction to a new cloudevolutionrdquo Jos Francisco Fornis Casals pp 111ndash126 2013

[2] H R Arkian A Diyanat and A Pourkhalili ldquoMIST Fog-baseddata analytics scheme with cost-efficient resource provisioningfor IoT crowdsensing applicationsrdquo Journal of Network andComputer Applications vol 82 pp 152ndash165 2017

[3] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks in socialnetworksrdquo IEEE Transactions on Dependable and Secure Com-puting 2016

[4] H Yiliang J Di and Y Xiaoyuan ldquoThe revocable attributebased encryption scheme for social networksrdquo in Proceedingsof the International Symposium on Security and Privacy inSocial Networks and Big Data SocialSec 2015 pp 44ndash51 chnNovember 2015

[5] S Machida T Kajiyama S Shigeru and I Echizen ldquoAnalysis ofFacebook Friends Using Disclosure Levelrdquo in Proceedings of the2014 Tenth International Conference on Intelligent InformationHiding and Multimedia Signal Processing (IIH-MSP) pp 471ndash474 Kitakyushu Japan August 2014

[6] S Machida T Kajiyama S Shigeru and I Echizen ldquoAnalysisof facebook friends using disclosure levelrdquo in Proceedings of the10th International Conference on Intelligent Information Hidingand Multimedia Signal Processing IIH-MSP 2014 pp 471ndash474jpn August 2014

[7] M Suzuki N Yamagishi T Ishidat M Gotot and S HirasawaldquoOn a new model for automatic text categorization based on


vector space modelrdquo in Proceedings of the 2010 IEEE Interna-tional Conference on Systems Man and Cybernetics SMC 2010pp 3152ndash3159 tur October 2010

[8] L Xu S Sun and Q Wang ldquoText similarity algorithm basedon semantic vector space modelrdquo in Proceedings of the 2016IEEEACIS 15th International Conference on Computer andInformation Science (ICIS) pp 1ndash4 Okayama Japan June 2016

[9] F P Miller A F Vandome and J Mcbrewster ldquoCanary Traprdquo in1em plus 05em minus 0 4em Alphascript Publishing 2010

[10] J Hu Q Wu and B Zhou ldquoRBTrust A RecommendationBelief Based Distributed Trust Management Model for P2PNetworksrdquo in Proceedings of the 2008 10th IEEE InternationalConference on High Performance Computing and Communica-tions (HPCC) pp 950ndash957 Dalian China September 2008

[11] P Yadav S Gupta and S Venkatesan ldquoTrust model for privacyin social networking using probabilistic determinationrdquo inProceedings of the 2014 4th International Conference on RecentTrends in Information Technology ICRTIT 2014 ind April 2014

[12] Y He F Li B Niu and J Hua ldquoAchieving secure and accuratefriend discovery based on friend-of-friendrsquos recommendationsrdquoin Proceedings of the 2016 IEEE International Conference onCommunications ICC 2016 mys May 2016

[13] Y Wang G Norcie S Komanduri A Acquisti P G Leonand L F Cranor ldquoldquoI regretted the minute I pressed sharerdquordquoin Proceedings of the the Seventh Symposium p 1 PittsburghPennsylvania July 2011

[14] M Gambhir M N Doja and Moinuddin ldquoAction-basedtrust computation algorithm for online social networkrdquo inProceedings of the 4th International Conference on AdvancedComputing and Communication Technologies ACCT 2014 pp451ndash458 ind February 2014

[15] F Nagle and L Singh ldquoCan friends be trusted Exploringprivacy in online social networksrdquo in Proceedings of the 2009International Conference onAdvances in Social NetworkAnalysisand Mining ASONAM 2009 pp 312ndash315 grc July 2009

[16] Y Yustiawan W Maharani and A A Gozali ldquoDegree Cen-trality for Social Network with Opsahl Methodrdquo in Proceedingsof the 1st International Conference on Computer Science andComputational Intelligence ICCSCI 2015 pp 419ndash426 idnAugust 2015

[17] A Pandey A Irfan K Kumar and S Venkatesan ldquoComput-ing Privacy Risk and Trustworthiness of Users in SNSsrdquo inProceedings of the 5th International Conference on Advances inComputing and Communications ICACC 2015 pp 145ndash150 indSeptember 2015

[18] F Riquelme and P Gonzalez-Cantergiani ldquoMeasuring userinfluence on Twitter a surveyrdquo Information Processing amp Man-agement vol 52 no 5 pp 949ndash975 2016

[19] Z-J M Shen ldquoIntegrated supply chain design models asurvey and future research directionsrdquo Journal of Industrial andManagement Optimization vol 3 no 1 pp 1ndash27 2007

[20] A M Vegni and V Loscrı ldquoA survey on vehicular socialnetworksrdquo IEEE Communications Surveys amp Tutorials vol 17no 4 article no A3 pp 2397ndash2419 2015

[21] J Heidemann M Klier and F Probst ldquoOnline social networksa survey of a global phenomenonrdquo Computer Networks vol 56no 18 pp 3866ndash3878 2012

[22] Y Huo C Hu X Qi and T Jing ldquoLoDPD A LocationDifference-Based Proximity Detection Protocol for Fog Com-putingrdquo IEEE Internet of Things Journal vol 4 no 5 pp 1117ndash1124 2017

[23] L Huang X Fan Y Huo C Hu Y Tian and J Qian ldquoA NovelCooperative Jamming Scheme for Wireless Social NetworksWithout Known CSIrdquo IEEE Access vol 5 pp 26476ndash264862017

[24] M Wang J Liu J Mao H Cheng J Chen and C Qi ldquoRoute-Guardian Constructingrdquo Tsinghua Science and Technology vol22 no 4 pp 400ndash412 2017

[25] X Zheng G Luo and Z Cai ldquoA Fair Mechanism for PrivateData Publication inOnline Social Networksrdquo IEEE Transactionson Network Science and Engineering pp 1-1

[26] J Mao W Tian P Li T Wei and Z Liang ldquoPhishing-AlarmRobust and Efficient Phishing Detection via Page ComponentSimilarityrdquo IEEE Access vol 5 pp 17020ndash17030 2017

[27] C Hu H Li Y Huo T Xiang and X Liao ldquoSecure andEfficient Data Communication Protocol for Wireless BodyArea Networksrdquo IEEE Transactions on Multi-Scale ComputingSystems vol 2 no 2 pp 94ndash107 2016

[28] H Alhazmi S S Gokhale and D Doran ldquoUnderstandingsocial effects in online networksrdquo in Proceedings of the 2015International Conference on Computing Networking and Com-munications ICNC 2015 pp 863ndash868 usa February 2015

[29] M Fire R Goldschmidt and Y Elovici ldquoOnline social net-worksThreats and solutionsrdquo IEEE Communications Surveys ampTutorials vol 16 no 4 pp 2019ndash2036 2014

[30] M Sleeper J Cranshaw P G Kelley et al ldquoldquoI read myTwitter the next morning and was astonishedrdquo a conversationalperspective on Twitter regretsrdquo in Proceedings of the 31st AnnualCHI Conference on Human Factors in Computing SystemsChanging Perspectives CHI 2013 pp 3277ndash3286 fra May 2013

[31] W Dong V Dave L Qiu and Y Zhang ldquoSecure frienddiscovery in mobile social networksrdquo in Proceedings of the IEEEINFOCOM pp 1647ndash1655 April 2011

[32] C Akcora B Carminati and E Ferrari ldquoPrivacy in socialnetworks How risky is your social graphrdquo in Proceedings of theIEEE 28th International Conference on Data Engineering ICDE2012 pp 9ndash19 usa April 2012

[33] B Zhou and J Pei ldquoPreserving privacy in social networksagainst neighborhood attacksrdquo in Proceedings of the 2008 IEEE24th International Conference onData Engineering ICDErsquo08 pp506ndash515 mex April 2008

[34] Q Liu G Wang F Li S Yang and J Wu ldquoPreserving Privacywith Probabilistic Indistinguishability in Weighted Social Net-worksrdquo IEEE Transactions on Parallel and Distributed Systemsvol 28 no 5 pp 1417ndash1429 2017

[35] L Zhang X-Y Li K Liu T Jung and Y Liu ldquoMessage in aSealed Bottle Privacy Preserving Friending in Mobile SocialNetworksrdquo IEEE Transactions on Mobile Computing vol 14 no9 pp 1888ndash1902 2015

[36] R Zhang J Zhang Y Zhang J Sun and G Yan ldquoPrivacy-preserving profile matching for proximity-based mobile socialnetworkingrdquo IEEE Journal on Selected Areas in Communica-tions vol 31 no 9 pp 656ndash668 2013

[37] H J Zeng Q C He Z Chen W Y Ma and J Ma ldquoLearningto cluster web search resultsrdquo in Proceedings of the InternationalACM SIGIR Conference on Research and Development in Infor-mation Retrieval pp 210ndash217 July 2004

[38] Salton and Buckley ldquoTeam weighting approaches in automatictext retireval readings in information retrievalrdquo in Proceedingsof the in International Conference on Computer Vision Theoryand Applications pp 652ndash657 1998


[39] T Peng L Liu and W Zuo ldquoPU text classification enhancedby term frequency-inverse document frequency-improvedweightingrdquo Concurrency and Computation Practice and Expe-rience vol 26 no 3 pp 728ndash741 2014

[40] V Oleshchuk and A Pedersen ldquoOntology based semanticsimilarity comparison of documentsrdquo in Proceedings of the14th International Workshop on Database and Expert SystemsApplications DEXA 2003 pp 735ndash738 cze September 2003

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018


Active and Passive Electronic Components

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of



Journal ofEngineeringVolume 2018

SensorsJournal of



RotatingMachinery


Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation


Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom


design a strategy among users in the word-of-mouth socialnetwork to share their data within a trusted user set

Considering the human-centric network our objective isto design an approach to achieve trustworthy word-of-mouthinformation release This approach can trace the source ofprivacy disclosure and automatically adjust the trust degreeof nodes in OSNs Our exploration of this uncharted areaneeds to answer the following three challenges First how canwe determine whether one userrsquos privacy has been alreadydisclosed and detect the leaker Second how to update socialrelationship of a user after detecting information disclosureFinally how to prevent acquaintances fromdisclosing privacythrough the manner of word-of-mouth

Aiming at the first challenges we intend to use thevector space model (VSM) and the canary trap technologyto achieve message disclosure detection The authors of [7]present a mathematical expression of a VSM to measurethe similarity of two sets Then Li et al improve the cal-culation method of the VSM in [8] They apply semanticresources to reduce the dimensionality of feature itemsSimilarly we apply the VSM in this article to calculate thesimilarity between the suspected leaked messages and thepublished ones to determine whether the messages is com-promised After determining message leakage we employthe canary trap technology to trace the source of the leakedmessages In brief different versions of sensitive data aresent to suspected leakers to find which version gets leaked[9]

Next we introduce the trust degree of a user to ensuresecure data sharing to cope with the second challenge Thetrust mechanism has been widely studied to achieve privacypreserving in traditional OSNs The authors of [10] investi-gate a recommendation belief based on a distributed trustmanagementmodel for peer-to-peer networksThey quantifyand evaluate the reliability of nodes and introduce a similarityfunction to construct the recommended credibility In [11]the authors propose a method to check usersrsquo credibilitybefore they enter a network In addition the authors in [12]design a secure recommendation system for mobile usersto learn about potential friends opportunistically Differentfrom the existing works we employ the dynamic trustdegree to classify the recipients and to rank the sensitivityof publisher to information that needs to be published Thereason is that a user will unintentionally disclose privacyand later regret the behavior if the user is in an emotionalstate at the time of posting [13] In order to avoid incorrectaccess to the sensitive-privacy information the premise ofaccess control for users is to classify the privacy informationclearly

Moreover in order to prevent the privacy leakage oneuserrsquos (publisherrsquos) privacy should only be shared with otherones (recipients) in a ldquocorrectrdquo user set in OSNs [14ndash16] Theset composed of numerous trusted recipients can be updatedbased on the dynamic trust value that is a personal perceptionof a publisher to recipients In this paper we allow thepublisher to rate recipients based on the value to determinewhether they are trustworthy or not and dynamically adjustprivacy settings of recipients by privacy disclosureWe designa systematic information publishing algorithm to reduce the

P U

P publisher

messages

２2

２1

２n

２n-1

３２-1５

３２５

３２2５

３２1５

[２1２2middot middot middot２n] recipientsU unexpected receiver

４０２

４０２2

４０２1

４０２-1

Figure 1 A privacy disclosure model in word-of-mouth OSNs

leakage probability of private messages In this case we canreduce the risk of illegal spread of messages [17 18]

To the best of our knowledge there are few solutions topreserve privacy in word-of-mouth OSNs which motivatesour work The main contributions of this paper are summa-rized as follows

(i) We employ the VSM to measure the similaritybetween leaked messages and original ones so as todetect whether privacy disclosure exists According tothe tracing results by using the canary trap methodwe determine which recipients are the possible leak-ers

(ii) We define a dynamic approach to compute trustdegree between users To handle privacy leaks fairlywe design a novel utility function to punish the trustdegree of leakers

(iii) We design a probabilistic privacy preserving strategybased on a publisherrsquos sensitivity to messages andrecipientsrsquo trust degree for publishing a userrsquos mes-sages securely

The rest of the paper is organized as follows Section 2introduces the privacy disclosure model in word-of-mouthOSNs In Section 3 we present our privacy disclosure detec-tion method via computing the similarity between leakedmessages and original ones Section 4 introduces the calcula-tion method of trust degree and the punishment mechanismfor leakers Next a trust degree based publishing system isproposed to achieve messages sharing with privacy preserv-ing in Section 5 Then we analyze the secure performance ofour strategy in Section 6 and survey relatedwork in Section 7Finally we conclude our work in Section 8

2 System Model

In this section we propose a privacy disclosure modelas shown in Figure 1 This model can be formulated as aweighted directed social graph Each node in the graph















θ

cos

x

y

z

i

jdis(ij)



1199081198991 sdot sdot sdot 119908119899119898) (2)








119878119894119898 (1198941 1198942) = 1119899 119899sum119895=1

cos 120579119895 = 1119899 119899sum119895=1


2sum119898119895119896=1

11990811198951198961199082119895119896radic(sum119898119895119896=1

11990811198951198962) (sum119898119895119896=1 11990821198951198962) (3)















+

+

+

+

X

P


U


Leaker identifying

Leaker Tracing

２1

２2

２k

２N








0


Absolutetrust


distrustModerate

trust

High trust

1




















119862119890119899119894 (120591) = sum119873119896=1 119889119894119896 (120591)119873 (9)




120591=0119878119877119880 (120591) sdot 1 minus 119862119890119899119877 (120591)119879 minus 120591 (11)




119879119904119906119887 (119897119894) = 119884119897119894119880 (119879)sum119871119894=1 119884119897119894119880 (119879) lowast 119879119904119906119887 (12)



|119871|sum119894=1

119879119904119906119887 (119897119894) = 119879119904119906119887 (14)



















Rating scheme



4 Publishing



I

II

III

















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia
















θ

cos

x

y

z

i

jdis(ij)



1199081198991 sdot sdot sdot 119908119899119898) (2)








119878119894119898 (1198941 1198942) = 1119899 119899sum119895=1

cos 120579119895 = 1119899 119899sum119895=1


2sum119898119895119896=1

11990811198951198961199082119895119896radic(sum119898119895119896=1

11990811198951198962) (sum119898119895119896=1 11990821198951198962) (3)















+

+

+

+

X

P


U


Leaker identifying

Leaker Tracing

２1

２2

２k

２N








0


Absolutetrust


distrustModerate

trust

High trust

1




















119862119890119899119894 (120591) = sum119873119896=1 119889119894119896 (120591)119873 (9)




120591=0119878119877119880 (120591) sdot 1 minus 119862119890119899119877 (120591)119879 minus 120591 (11)




119879119904119906119887 (119897119894) = 119884119897119894119880 (119879)sum119871119894=1 119884119897119894119880 (119879) lowast 119879119904119906119887 (12)



|119871|sum119894=1

119879119904119906119887 (119897119894) = 119879119904119906119887 (14)



















Rating scheme



4 Publishing



I

II

III

















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia












+

+

+

+

X

P


U


Leaker identifying

Leaker Tracing

２1

２2

２k

２N








0


Absolutetrust


distrustModerate

trust

High trust

1




















119862119890119899119894 (120591) = sum119873119896=1 119889119894119896 (120591)119873 (9)




120591=0119878119877119880 (120591) sdot 1 minus 119862119890119899119877 (120591)119879 minus 120591 (11)




119879119904119906119887 (119897119894) = 119884119897119894119880 (119879)sum119871119894=1 119884119897119894119880 (119879) lowast 119879119904119906119887 (12)



|119871|sum119894=1

119879119904119906119887 (119897119894) = 119879119904119906119887 (14)



















Rating scheme



4 Publishing



I

II

III

















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia



0


Absolutetrust


distrustModerate

trust

High trust

1




















119862119890119899119894 (120591) = sum119873119896=1 119889119894119896 (120591)119873 (9)




120591=0119878119877119880 (120591) sdot 1 minus 119862119890119899119877 (120591)119879 minus 120591 (11)




119879119904119906119887 (119897119894) = 119884119897119894119880 (119879)sum119871119894=1 119884119897119894119880 (119879) lowast 119879119904119906119887 (12)



|119871|sum119894=1

119879119904119906119887 (119897119894) = 119879119904119906119887 (14)



















Rating scheme



4 Publishing



I

II

III

















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia





119862119890119899119894 (120591) = sum119873119896=1 119889119894119896 (120591)119873 (9)




120591=0119878119877119880 (120591) sdot 1 minus 119862119890119899119877 (120591)119879 minus 120591 (11)




119879119904119906119887 (119897119894) = 119884119897119894119880 (119879)sum119871119894=1 119884119897119894119880 (119879) lowast 119879119904119906119887 (12)



|119871|sum119894=1

119879119904119906119887 (119897119894) = 119879119904119906119887 (14)



















Rating scheme



4 Publishing



I

II

III

















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia
















Rating scheme



4 Publishing



I

II

III

















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia















7 Related Work







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia







8 Conclusion




Data Availability




Acknowledgments


References














































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia








































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia


a probabilistic privacy preserving strategy for word-of-mouth social networksdownloads.hindawi.com...

Documents