A preliminary analysis of the #FeesMustFall Twitter Campaign

Khan, Y¹, Thakur, S² and Shabat, S³

¹ Masters candidate, KZN CoLab, DUT
² Director, KZN eSkills CoLab, DUT
³ Lecturer, Information Technology, DUT

Table of Contents

A preliminary analysis of the #FeesMustFall Twitter Campaign
Abstract
Introduction
Background
The Twitter Data
VADER
The data
[Outlying interesting observations]
Describe the data [YASEEN]
The Twitter platform
The impact of bots on campaign
Interesting Finding
Data Set
Detecting a bot
The emotion trend
Sentiment Analysis
Further work
Conclusion
Bibliography
Word pool
Prayer Index
Bot detection
Timelines vs frequency
‘Amita Bachan – twitter cleans followers (bots)’
Bell Pottinger

Abstract

The #FeesMustFall (FMF) campaign was started in 2015 by students to lobby government to fund university education in order to redress past imbalances. A unique feature of FMF was the leveraging of social media platforms to coordinate the campaign, inform and lobby students as well as activists, garner support, and retain media and community attention. This study therefore examined the Twitter component of FMF, primarily through the acquisition of 576,583 tweets.

These tweets were collated, pre-processed and cleaned. This meant, inter alia, removing duplicates and data that could not be interpreted. The cleaned data was then subjected to a series of analyses. The analytical methods included, amongst others, descriptive statistics, sentiment analysis using a natural language processing (NLP) approach with VADER (Valence Aware Dictionary and sEntiment Reasoner), timeline analysis, and hashtag analysis. The results were triangulated with real-world events.

This study is relevant to understanding student activism. At a government level, the model and methodology may be extended to anticipate and mitigate service delivery protests, and even to help track the sources of illnesses such as listeriosis. At a commercial level, companies may use the approach for real-time sentiment tracking.

The study shows that Twitter was a key and active platform of the campaign. It found intriguing evidence that software robots, commonly called bots, were deployed to drive public sentiment. To the authors' knowledge, this was not mentioned in the media or in any other study during the campaign. This influence will be analysed. Furthermore, perceived negative events can and did drive sentiment. One example is the arson incident in which the UKZN library was torched. This incident will be analysed in this paper.

Students, it seems, remain students: activism was largely confined to weekdays and university term time. Contrary to some perceptions, slacktivism, although probably present, was not a key component of the campaign. Slacktivism refers to actions performed via the Internet in support of a political or social cause that require little time or involvement, e.g. signing an online petition.

The FMF campaign had the desired effect: the new President, Ramaphosa, announced that from 2017 education would be free for students from families with a combined income of less than ZAR 350,000.

Introduction

The #FeesMustFall (FMF) campaign was started in October 2015 by students to forcefully lobby government to fund university education in order to redress past imbalances. A unique feature of FMF was the leveraging of social media platforms to coordinate the campaign, inform and lobby students as well as activists, garner support, and retain media and community attention. This study therefore examined the Twitter component of FMF, primarily because the researchers had access to 576,583 tweets that were posted from March 2015 to March 2017.

Background

The #FeesMustFall (FMF) campaign was started in October 2015 by students to force government to fully fund university education in order to redress past imbalances. The FMF campaign was waged on the back of the #RhodesMustFall (RMF) campaign at UCT. There was intriguing evidence of one tweet in March 2015 (put tweet here) as well as one tweet in April 2015 (put tweet here), both during the RMF campaign, suggesting that RMF gave birth to the FMF campaign. Similarly, it may well be that the DataMustFall campaign expands as a product of FMF.

FMF students argued that higher education entrenched a new form of apartheid based on a class system, as the poor could ill afford the fees, accommodation, travel and meals. FMF is celebrated as a non-partisan, largely student protest movement, although many political parties tried to gain mileage out of the process. The movement enjoyed much support from across the political spectrum, from rich and poor, business, academia, and civil society.

The actual #FeesMustFall (FMF) movement started in Johannesburg after the University of the Witwatersrand (Wits) declared an unaffordable rise in fees for 2016. Wits claimed that the government subsidy would not be enough to accommodate the net increase in the university's costs for library books, journal subscriptions, research equipment, and academics’ salaries. Rhodes University in Grahamstown then announced a minimum initial payment of 50% of fees for 2016, meaning that the average student living in residence needed an upfront payment of ZAR 45,000. The FMF movement became a rallying cry against financial exclusion and debt traps for economically disadvantaged students (Pillay, 2016).

Digital activism enabled the movement to flourish. Facebook, Twitter, and instant

messaging services allowed supporters to swiftly communicate, coordinate and

organise meetings and protest marches. On 23 October 2015, thousands of

supporters marched to the Union Buildings to demand free education from the then

State President, Jacob Zuma, and then Minister of Higher Education, Blade

Nzimande.

This campaign regrettably pitted students against university administrations, which sometimes required police intervention and backing. The irony is deepened when one considers that both parties in the ecosystem supported the campaign. Although society supported the FMF movement, that support was expressed largely on social media, with little physical support. Pillay is unequivocal that “silence is (also) violence.”

There were some particularly horrifying events during the standoff. Emotions swayed towards the students when they were tear-gassed and fired upon with rubber bullets. On the other hand, society reacted with horror when a security guard was killed at CPUT, a ZAR 100 million building was torched at UJ and an irreplaceable historic library was burnt at UKZN.

A particularly intriguing feature of FMF was the leveraging of social media platforms by students and the public to garner support and keep the event in the public eye in a sustained manner. Of particular interest was the students' use of Twitter with the hashtag #FMF. The researchers gathered 597,000 tweets from the period 2015-2017 for this campaign.

The Twitter Data

These tweets were collated, preprocessed and cleaned. The latter meant removing duplicates and data that were uninterpretable. The cleaned data was subjected to a series of analyses. The analytical methods included, amongst others, descriptive statistics, sentiment analysis using natural language processing (NLP) with VADER, timeline analysis, and hashtag analysis. Valence Aware Dictionary and sEntiment Reasoner (VADER) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media and also works well on texts from other domains (Python Foundation, 2018). The results were also triangulated with real-world events.

The students argued that higher education entrenched a new form of apartheid based on a class system, as the poor could ill afford the fees, accommodation, travel and meals. FMF is celebrated as a non-partisan, largely student protest movement, although many political parties tried to gain mileage out of the process. The movement enjoyed much support from across the political spectrum, from rich and poor, business, academia, and civil society. It must be noted that the FMF campaign started as a lobby against fee increases but progressed to reminding government of its free education pledge (Hehe, 2017).

The movement is the first national struggle waged almost entirely on social media (SM) platforms. SM was used to mobilize students through virtual tools and to amplify their physical presence on particular campuses, such as Wits University, at particular times. At the same time, media houses were strategically informed to ensure that events occurred in the glare of the public eye. The simultaneously private and public nature of SM allowed student leaders to network and coordinate without the knowledge of authorities. It must be mentioned that although students distrusted the university administrations, perhaps because they felt that the administrations were not doing enough to support their cause, the two were philosophically on the same side.

Until the launch of FMF, the campaign for free education was fought by Student Representative Councils (SRCs) with their respective universities. This had limited success. RMF created the intrigue and the counter-memory angle (Bosch, 2017) which ignited academic passion. This was matched by some revolting attention-grabbing incidents, such as the dropping of faeces, which ignited the Twittersphere. RMF fuelled FMF, with the first mention of FMF coming six months before October 2015.

South Africa is a country with a rich history of activism against insurmountable odds. This began when settlers invaded the land and declared parts of it to be under the sovereignty of the Netherlands, Britain, Germany and others. This affected the tribes as well as the Khoi San.

The dispossession was entrenched in law and took on a race-based hue when the 1948 government wrote apartheid into law and reduced black people to servant status. Even among black people, the oppressor government saw fit to accord some groups, namely the so-called Indian and Coloured people, more privileges than African people. Gender discrimination was a given across all populations.

In 1955, in Kliptown, the people of SA passed several resolutions, not least the right to free education. The people of SA rose up over the next three decades and, after a protracted struggle, the new country was born in 1994, with Mandela installed as our inaugural president. Since then, the country has had many competing priorities in redressing apartheid, some of which required a form of reverse apartheid, where certain positions were reserved for people of color to statistically redress the imbalance. The people have been patient for a reasonable period of time.

However, as time passed, the patience of the people, particularly the disadvantaged, was tested, and an increasing number of service delivery protests resulted. These have in many cases forced government to react, which lent further credibility to vigilantism as a method of attracting government attention.

South Africa has the most unequal school system in the world, according to Nic Spaull of the University of Stellenbosch (Spaull, 2017). Many schools are deemed no-fee schools and receive a full subsidy. This is a method of redressing the school situation. The irony is that a poor student may go through their entire schooling career without paying fees and then suddenly be required to pay tertiary fees.

Moreover, Bosch (2017) argues that youth are increasingly using social networking sites to develop a new biography of citizenship which is characterized by more individualized forms of activism. In the present case, Twitter affords youth an opportunity to participate in political discussions, as well as discussions of broader socio-political issues of relevance in contemporary South African society, reflecting a form of sub-activism (Bosch, 2017).

VADER

Valence Aware Dictionary and sEntiment Reasoner, also known as VADER, is a parsimonious rule-based model for sentiment analysis of social media text. According to Hutto & Gilbert (2014), the effectiveness of VADER was compared to eleven typical sentiment analysis benchmarks, such as Affective Norms for English Words (ANEW), Linguistic Inquiry and Word Count (LIWC) and SentiWordNet (SWN), as well as those that utilise machine learning techniques based on Naïve Bayes, Maximum Entropy and Support Vector Machine (SVM) algorithms. VADER ranked the highest in predictive accuracy when tested on 4200 tweets from Twitter, 3708 product review snippets from Amazon.com, 10605 movie review snippets and 5190 article snippets from NY Times editorials.

VADER is an example of a lexical method for sentiment analysis and its algorithm is

based on the following principles (Gab, 2017):

$E \in [-4, 4]$, where $E$ represents the sentiment score per word.

Sentiment score or Emotion intensity of a word is measured on a scale from -4 to +4, where -4 is the most negative and +4 is the most positive. The midpoint 0 represents a neutral sentiment.

The overall sentiment ($S$) is normalized using the formula

$$S = \frac{\sum_{i=0}^{n} E_i}{\sqrt{\left(\sum_{i=0}^{n} E_i\right)^2 + \alpha}}, \qquad i = 0, 1, 2, \ldots, n$$

where $E_i$ represents the sentiment score of the $i$th word, $S \in [-1, 1]$ is the overall sentiment, and $\alpha$ is a normalization parameter set at a value of 15. $S > 0$ is positive, $S < 0$ is negative, and $S = 0$ is neutral.

The overall sentiment score of a sentence is the normalization of the sum of the sentiment scores of its sentiment-bearing words. A value below zero indicates a negative overall sentiment, with -1 being the most negative, whilst a value above zero indicates a positive sentiment, with +1 being the most positive; a value of zero indicates a neutral sentiment.
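The normalization and the sign rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the VADER library itself: the function names `normalize` and `label` and the example word scores are hypothetical, and only the value of 15 for the normalization parameter and the positive/negative/neutral rule are taken from the text.

```python
import math

ALPHA = 15  # normalization parameter value stated in the text


def normalize(word_scores, alpha=ALPHA):
    """Map the sum of per-word scores (each in [-4, 4]) into (-1, 1)."""
    total = sum(word_scores)
    return total / math.sqrt(total * total + alpha)


def label(s):
    """Classify an overall score using the rule S > 0, S < 0, S = 0."""
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


# Hypothetical per-word scores for a short sentence (not from the data set).
example = [1.9, -0.4, 2.5]
s = normalize(example)
print(round(s, 3), label(s))
```

Because the denominator is always larger than the magnitude of the numerator, the result stays strictly between -1 and +1 however many sentiment-bearing words a tweet contains.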

VADER incorporates colloquialisms, emoticons and punctuation into its sentiment algorithm by considering five heuristics (Gab, 2017). They are as follows:

1. Punctuation

2. Capitalization

3. Degree Modifiers

4. Shift in polarity due to “but”

5. Tri-gram examination before a sentiment-laden lexical feature to catch polarity negation

The development of VADER by Hutto & Gilbert (2014) involved 20 prescreened and appropriately trained human raters, which boosts its credibility, and it can be applied to various domains. VADER is based on the English language and therefore lacks linguistic diversity; continuous development is required to improve this area.

The data

These tweets were collated, preprocessed and cleaned. The latter meant removing duplicates and data that made no sense. The cleaned data was subjected to a series of analyses. The analytical methods included, amongst others, descriptive statistics, sentiment analysis using natural language processing (NLP) with VADER, timeline analysis, and hashtag analysis. The results were also triangulated with real-world events. The findings are analyzed in this paper.

There is a school of thought which says that online activism promotes slacktivism, which refers to actions performed via the Internet in support of a political or social cause but regarded as requiring little time or involvement, such as signing an online petition or joining a campaign group on a social media website or application. FMF is different because it had a physical protest component. Another campaign in which online activism had a physical component was Tahrir Square in Egypt during 2012.

[Outlying interesting observations]

The range of languages used in the tweets was interesting. A small number of tweets were in Chinese (32), isiZulu (100), Afrikaans, …

Perspectives range across the full spectrum, from those who view the Internet as potentially disruptive (Aday et al., 2010; Howard, 2010) to those who argue that it may even support authoritarian regimes (Morozov, 2011). Tufekci and Wilson (2012) examined evidence that social media and the Internet were being used by protesters as events unfolded in real time, in particular the use of social media amongst participants in the Tahrir Square protests in Egypt. Their central research questions were: did social media use shape how participants learned about the protests, how they planned their involvement, and how they documented their involvement?

Software Robot

A social bot is a software robot or program that simulates human behavior in automated interactions on social network platforms such as Facebook and Twitter. Social bots can be sophisticated enough to fool other users and be taken for humans.

Social bots populate techno-social systems: they are often benign, or even useful, but

some are created to harm, by tampering with, manipulating, and deceiving social

media users (Ferrara, Varol, Davis, Menczer and Flammini, 2016).

Social bots have been used to infiltrate political discourse, manipulate the stock market, steal personal information, and spread misinformation. The detection of social bots is therefore an important research endeavour (Ferrara, Varol, Davis, Menczer and Flammini, 2016).

Describe the data [YASEEN]

Range type dates etc.

Twitter data, also known as tweets, were purchased from a professional data service provider and consist of 576,583 data points. Each data point or tweet has the following metadata:

• Tweet – A message containing text, emoticons and symbols, limited to 280 characters (140 previously)
• Time Stamp – Date according to the Gregorian calendar as well as the time of the tweet
• User name – Unique identification of the user who tweeted
• Source of Tweet – The device used to send the tweet
• Favourite – Tweet tagged as favourite
• Retweeted – Tweet that has been reposted or forwarded

Tweets were stored as text in Microsoft Excel format; other media such as images, videos and audio were omitted from the analysis. The tweets gathered range from 21 March 2015 to 10 April 2017.

The data underwent a preprocessing phase prior to analysis which consisted of the

following:

• Removing duplicates and corrupt data points
• Validating data at random against tweets on the Twitter.com website
• Applying the VADER analysis using Python
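As an illustration of the first preprocessing step, the sketch below drops corrupt rows and exact duplicates from a list of tweet records. It is an assumption-laden example, not the procedure actually used: the field names (`tweet`, `user_name`, `time_stamp`), the rule that a row with an empty key field is "corrupt", and the rule that two rows are duplicates when user, timestamp and text all match are hypothetical choices made for the example.

```python
def clean_tweets(rows):
    """Drop corrupt rows and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for row in rows:
        text = (row.get("tweet") or "").strip()
        user = (row.get("user_name") or "").strip()
        stamp = (row.get("time_stamp") or "").strip()
        # Assumption: a row is corrupt if any key field is missing or empty.
        if not (text and user and stamp):
            continue
        # Assumption: same user, time and text means a duplicate record.
        key = (user, stamp, text)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical records, not taken from the data set.
rows = [
    {"tweet": "#FeesMustFall", "user_name": "a", "time_stamp": "2015-10-21 08:00"},
    {"tweet": "#FeesMustFall", "user_name": "a", "time_stamp": "2015-10-21 08:00"},
    {"tweet": "", "user_name": "b", "time_stamp": "2015-10-21 09:00"},
]
print(len(clean_tweets(rows)))
```

Retweets by different users keep distinct keys under this rule, so they are not treated as duplicates; whether they should be is a design decision that depends on the analysis.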

The Twitter platform

Twitter is a popular social networking and microblogging tool which was released in 2006. It has about 300 million users, who write over five hundred million messages each day. Twitter users express whatever is on their minds through a so-called tweet. A tweet is much like an SMS, but it may be at most 280 characters long. This length may well be extended, according to Twitter. Tweets are sometimes annotated with a tag called a hashtag, which is introduced by the symbol #. A user's Twitter name is preceded by an @, so Justin Timberlake, whose Twitter name is jtimberlake, has the handle @jtimberlake.

For our study, the student protest movement known as the Fees Must Fall movement was tagged. It became known as #FeesMustFall. This opportunistic tagging allows for contextual searches, grouping of responses, identification of trends and other forms of meta-analysis. Sometimes multiple tags are used in the same tweet. For example, the university acronym was often included in the tweet, as in the following tweet:

“8am DUT steve biko campus that's where imma be at tomorrow#FeesMustFall #DUT #MUT #UKZN #DurbanShutDown” [YASEEN]

Each hashtag thus acts as a hyperlink that leads the reader from a tweet to other tweets on the same story.
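Hashtags can be pulled out of a tweet with a regular expression, which is one way the hashtag analysis could be started. The sketch below applies this to the tweet quoted above; the helper name `hashtags` and the decision to lower-case the tags are assumptions made for the example.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def hashtags(text):
    """Return the hashtags in a tweet, lower-cased, without the # symbol."""
    return [tag.lower() for tag in HASHTAG.findall(text)]


tweet = ("8am DUT steve biko campus that's where imma be at "
         "tomorrow#FeesMustFall #DUT #MUT #UKZN #DurbanShutDown")
tags = hashtags(tweet)

# Tally tags over a collection of tweets (here a collection of one).
counts = Counter(tag for t in [tweet] for tag in hashtags(t))
print(tags)
```

The pattern matches `#FeesMustFall` even though the author left no space before it, while the plain word "DUT" at the start of the tweet is not counted as a tag.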

The impact of bots on campaign

The massive spread of digital misinformation has been identified as a global risk that could impact elections, national security, and company and individual reputations (Shao et al, 2017). Much research is being undertaken to understand the viral diffusion of misinformation. Indeed, Shao et al (2017) conducted research on the 2016 US Presidential campaign to mine for misinformation.

Twitter has two kinds of directed relationships: friend and follower. In the case where user A adds B as a friend, A is a follower of B while B is a friend of A. In Twitter terms, A follows B. B can also add A as a friend (namely, following back or returning the follow), but is not required to. From the standpoint of information flow, tweets flow from the source (author) to subscribers (followers). More specifically, when a user posts tweets, these tweets are displayed on both the author’s homepage and those of the author's followers (Chu et al, 2012).

The growing number of users and the very open nature of Twitter have made it an ideal target for exploitation by automated programs, known as bots. Furthermore, cyborgs have emerged as an intermediary between humans and bots; these are either human-assisted bots or bot-assisted humans. Cyborgs have become a feature on Twitter and display interwoven characteristics of both manual and automated behavior (Chu, Gianvecchio, Wang and Jajodia, 2012).

Table 1.0 Tweet distribution

Year  Month      Number of tweets
2015  March      1
2015  April      1
2015  October    289,458
2015  November   13,452
2015  December   3,922
2016  January    13,318
2016  February   7,215
2016  March      3,898
2016  April      2,076
2016  May        900
2016  June       1,551
2016  July       2,238
2016  August     6,541
2016  September  38,472
2016  October    82,712
2016  November   9,505
2016  December   3,626
2017  January    4,113
2017  February   2,244
2017  March      2,843
2017  April      2,363

October 2015 has the highest number of tweets, with a tally of 289,458, which is greater than the total number of tweets for the entire year of 2016.
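The comparison between October 2015 and the whole of 2016 can be checked directly from the monthly counts in Table 1.0. The short sketch below transcribes those counts and sums them per year; the variable names are illustrative only.

```python
from collections import defaultdict

# Monthly tweet counts transcribed from Table 1.0.
monthly = {
    (2015, "March"): 1, (2015, "April"): 1, (2015, "October"): 289458,
    (2015, "November"): 13452, (2015, "December"): 3922,
    (2016, "January"): 13318, (2016, "February"): 7215,
    (2016, "March"): 3898, (2016, "April"): 2076, (2016, "May"): 900,
    (2016, "June"): 1551, (2016, "July"): 2238, (2016, "August"): 6541,
    (2016, "September"): 38472, (2016, "October"): 82712,
    (2016, "November"): 9505, (2016, "December"): 3626,
    (2017, "January"): 4113, (2017, "February"): 2244,
    (2017, "March"): 2843, (2017, "April"): 2363,
}

# Sum the monthly counts for each year.
yearly = defaultdict(int)
for (year, _month), count in monthly.items():
    yearly[year] += count

# Month with the largest count, and the comparison made in the text.
peak = max(monthly, key=monthly.get)
print(peak, monthly[peak], dict(yearly))
print(monthly[peak] > yearly[2016])
```

On these figures the 2016 months sum to 172,052 tweets, so the single month of October 2015 exceeds the whole of 2016 by a wide margin.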

[Figure: Total tweets per weekday, Monday to Sunday; vertical axis from 0 to 120,000 tweets]

The tweet count is noticeably lower over the weekend (Saturday and Sunday) than on the midweek days (Wednesday, Thursday and Friday), when it is significantly higher.
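A weekday tally of this kind can be produced from the tweet timestamps alone. The sketch below is a minimal illustration; the timestamp format and the three sample timestamps are assumptions, since the layout of the Time Stamp field in the purchased data is not specified here.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def tweets_per_weekday(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets per weekday, returning every day in calendar order."""
    tally = Counter(datetime.strptime(ts, fmt).weekday() for ts in timestamps)
    return {day: tally.get(index, 0) for index, day in enumerate(DAYS)}


# Hypothetical timestamps: two on a Wednesday and one on a Saturday.
sample = ["2015-10-21 08:00:00", "2015-10-21 09:30:00", "2015-10-24 10:00:00"]
per_day = tweets_per_weekday(sample)
print(per_day)
```

Indexing by `weekday()` rather than by a formatted day name keeps the result independent of the locale of the machine running the analysis.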

It was widely believed that the first widespread use of political bots to shift public opinion was the alleged work of Bell Pottinger, a public relations company employed to support the Gupta family. This was allegedly revealed by documents now known as #GuptaLeaks. Guild (2017) suggests that the Twitterati come to the defense of South Africa’s democracy by outing fake pro-Gupta Twitter bot accounts, which have been used to promote the family by praising pro-Gupta supporters and selectively targeting journalists and others perceived to be anti-Gupta. Many were created in India. One such Twitter account, “Esaia Theron”, was shown to be fake (Child, 2017).

For example, Theron praised known supporter Andile Mngxitama:

@Mngxitama “We all have to admit that is the greatest of all so much passion he has for the improvement of the country. #BLF”

On the other hand, journalist Barry Bateman was trolled by the bots, picking up 500 new fake followers daily, which forced him to lock his account. Theron condemned him for locking his account and blocking him (Theron).

Interesting Finding

The significant finding of this research was that evidence of the use of bots in #FMF was uncovered. Given the nature of the FMF campaign and its participants, who are largely viewed as the intelligentsia of the country, it was interesting, though in retrospect unsurprising, to find evidence of a bot.

It is highly probable that some slacktivists, who may well be bright students or academics, authored the bots. Students are inherently intelligent and will always find the easiest way to do something.

Table 2.0 Description of the platforms

Platform | Twitter | WhatsApp | SMS | Messenger
User currently online feature | Not available | Available | Not available | Available
Size | 140 characters | No restriction | 160 characters | No restriction
Cost | Almost free or low cost | Bandwidth cost | Charge per SMS | Bandwidth cost
User searchable | Hashtags, on all but private tweets | Only user can search. Encrypted | Only at user level | Only at user level
Communication mode | Broadcast or Direct Message (DM) | One-to-one defined, one-to-many or many-to-many closed user groups | One-to-one or one-to-many | One-to-one or one-to-many

Source: (Differences, 2017)

Data Set

Information entropy is defined as “the average amount of information produced by a probabilistic stochastic source of data.” As such, it is one effective way to quantify the amount of randomness within a data set (Kramar, 2017).

Since one can reasonably conjecture that actual humans are more complicated than automated programs, entropy can be a useful signal when attempting to identify bots, as has been done by a number of previous researchers. Of the recent research in social bot detection, particularly notable is the excellent work by groups of researchers from the University of California and Indiana University. Their “BotOrNot” system uses a random forest machine learning model that incorporates 1,150 features derived from user account metadata, friend/follower data, network characteristics, temporal features, content and language features, and sentiment analysis. BotOrNot is now called Botometer (Kramar, 2017).
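To make the entropy signal concrete, the sketch below computes the Shannon entropy of a list of observed values. Applying it to the gaps between an account's consecutive tweets is an assumption made for illustration, as are the example values; it is not the Botometer model, which uses a far larger feature set.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    # Each distinct value contributes p * log2(1 / p), with p = count / n.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Hypothetical gaps between consecutive tweets, in minutes.
timer_driven = [60, 60, 60, 60]   # identical gaps, as a timer would produce
irregular = [3, 47, 180, 12]      # varied gaps, as a person might produce

print(shannon_entropy(timer_driven), shannon_entropy(irregular))
```

In practice the gaps would need to be binned (for example to the nearest minute) before counting, and the bin width is a tuning choice: too fine and every account looks random, too coarse and every account looks regular.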

Detecting a bot

The author of a tweet may be a software social robot (bot) pretending to be human, a human, or a human who uses bot technology to help them post more, faster, and longer (cyborg). Each has characteristics that assist in distinguishing between them. Chu et al (2012) observed that a typical human user is very likely to follow “famous” or reputable accounts.

$$\text{Account Reputation} = \frac{\text{follower\_no}}{\text{follower\_no} + \text{friend\_no}}$$

A celebrity has many followers and few friends, e.g. Justin Timberlake, who has a reputation value close to one. In contrast, a bot has few followers and many friends, which gives a reputation close to zero. It follows that humans should have the highest Account Reputation, followed by cyborgs, and then bots.
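The Account Reputation formula can be sketched in a few lines. The function name, the handling of an account with no followers and no friends, and the sample counts are assumptions for illustration; they are not taken from the data set.

```python
def account_reputation(follower_no, friend_no):
    """Account Reputation = follower_no / (follower_no + friend_no)."""
    total = follower_no + friend_no
    if total == 0:
        # Assumption: an account with no followers and no friends is
        # given the lowest reputation rather than raising an error.
        return 0.0
    return follower_no / total


# Hypothetical accounts, for illustration only.
celebrity = account_reputation(follower_no=50_000_000, friend_no=200)
suspected_bot = account_reputation(follower_no=20, friend_no=1980)
```

With these made-up counts the celebrity scores very close to one and the suspected bot scores 0.01, which matches the ordering described above.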

In terms of the number of tweets, it turns out that cyborgs generate the most tweets, followed by humans and finally bots (source?). At a superficial level this is surprising, but on reflection bots tweet at a higher rate than humans for a short sustained period and then hibernate for a long period, perhaps to avoid detection. Some bot accounts are now being suspended for extreme or aggressive activity (Chu et al., 2012).

Indicators of a cyborg

1. Follows very few accounts, followed by very few

2. Usually topic specific

3. Frequency may be defined by characteristics

4. By equal periods

5. Short frequent bursts

6. Exact time in a day every day

7. Account Reputation

(Chu et al., 2012; Kramar, 2017)

Bot-authored tweets

1. Use timers to tweet

2. Or fixed intervals

3. Exhibit regular repetitive behavior

The use of equal periods between tweets was a very simple method to determine cyborg and bot activity.
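The equal-period check can be sketched as follows. This is an illustrative approach, not necessarily the procedure the researchers applied: it uses the coefficient of variation of the gaps between tweets, and the function names and the 0.1 threshold are assumptions.

```python
from statistics import mean, pstdev


def interval_regularity(timestamps):
    """Coefficient of variation of the gaps between consecutive tweets.

    timestamps are seconds since an arbitrary start. A value near zero
    means the gaps are almost identical, as with a timer. Returns None
    when there are too few tweets to judge.
    """
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)


def looks_timer_driven(timestamps, threshold=0.1):
    """Flag an account whose tweet gaps vary by less than the assumed threshold."""
    cv = interval_regularity(timestamps)
    return cv is not None and cv < threshold


# Hypothetical posting times, for illustration only.
timer_account = [0, 900, 1800, 2700, 3600]      # every 15 minutes
human_account = [0, 120, 4000, 4300, 20000]     # irregular gaps
```

A flagged account still needs manual inspection, since a human who uses a scheduling tool (a cyborg) produces the same pattern.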

Johannesburg marketing strategist Andrew Fraser, who has analysed many bots, said they are easy to identify by their strange names, their identical profiles and the fact that they all tweet the same thing at the same time. Using online tools, he found many were generated in India.

Social media (SM) may be used to legally monitor activity for evidence of the spread of an illness. Chew and Eysenbach (2010) call this infoveillance.

The emotion trend

Of interest is the emotional mood swing of the tweets during the FMF campaign. Was positive sentiment always in the majority? Did negative moods hold sway at any point? Did major destructive events such as the burning of the library and the UJ hall sway moods?

[Figure: Sentiment Analysis; vertical axis from 0% to 100%; legend: Negative, Neutral, Positive]

Figure 1.0 Sentiment Analysis

The figure above depicts the sentiment of tweets as a proportion of the total number of tweets accumulated in a given month. Outliers have been filtered out by excluding months in which the total number of tweets was less than 100.
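The monthly proportions described above can be sketched as follows, assuming each tweet already carries a VADER compound score. The cut-offs of ±0.05 follow the convention suggested in the VADER documentation; the paper does not state which cut-offs were used, so they are an assumption here, as are the function names and the sample data.

```python
from collections import Counter, defaultdict


def classify(compound, threshold=0.05):
    """Map a compound score to a label using assumed cut-offs of +/- threshold."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def monthly_proportions(scored_tweets, min_tweets=100):
    """Share of negative, neutral and positive tweets per month.

    scored_tweets is an iterable of (month, compound) pairs such as
    ("2016-09", -0.51). Months with fewer than min_tweets are dropped.
    """
    by_month = defaultdict(Counter)
    for month, compound in scored_tweets:
        by_month[month][classify(compound)] += 1

    result = {}
    for month, counts in sorted(by_month.items()):
        total = sum(counts.values())
        if total < min_tweets:
            continue  # sparse month, treated as an outlier
        result[month] = {
            label: counts[label] / total
            for label in ("negative", "neutral", "positive")
        }
    return result


# Hypothetical scores, for illustration only.
sample = (
    [("2016-09", -0.5)] * 60
    + [("2016-09", 0.0)] * 30
    + [("2016-09", 0.4)] * 30
    + [("2015-03", 0.2)] * 5   # fewer than 100 tweets, so excluded
)
proportions = monthly_proportions(sample)
```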

Negative sentiment was highest in proportion to the other sentiment classifications for the months of September, October and November 2016. This coincides with events such as the burning of the UKZN Howard library on 06 September 2016 and the burning of a lecture hall at the University of Johannesburg (UJ).

[Discuss UKZN library incident]

Bots

Bots are automated applications that perform human-like tasks. In the context of social media, they are known as socialbots (Chatbots). Modern socialbots are capable of holding a conversation with a human, but not without limitations. Limitations exist due to the complex nature of human communication and language, and while natural language processing (NLP) is advancing, there are and probably always will be factors that curb its efficiency. Factors such as sarcasm, pragmatics and colloquialisms are complex to overcome, but machine learning offers some relief, and by utilising efficient algorithms together with NLP this gap becomes narrower.

In a specific context a bot may be undetectable even with the best of methods, as was seen when the Turing test was passed years ago, but for a bot to achieve this in modern times it has to evolve with its counterparts that aim to detect and remove anti-social bot activities. Twitter has witnessed the emergence of bot activities that aim to influence government and financial markets, such as the Bell Pottinger incident (Bond, 2017). Twitter has its own security measures that detect unusual activities within its platform, but bots adapt and evolve and continue to plague social media platforms. Thus, there is a need for growing research in this area, as malicious bots aim to influence societies, governments and economies. Twitterbots have been known to attempt to influence the sentiment of Twitter users by associating a hashtag with positive or negative text together with fake news and additional hashtags.

** Facebook has recently launched a website called wit.ai whereby users can create their own bots.

Bot characteristics

Unsurprisingly, an average bot displays repetitive behaviour, a high volume of output and very frequent activity. A Twitterbot is a type of bot software that uses the Twitter API to control a Twitter account (Chu et al., 2012). Twitterbots are capable of performing tasks autonomously, such as tweeting, retweeting, liking, following, unfollowing, or direct messaging other accounts. Twitter imposes a set of automation rules that cap Twitterbot behaviour (Twitter Inc., 2018), but these rules do not effectively remove all malicious bots (Shao et al., 2017).

According to Gilani et al. (2017), there are clear distinctions between bot and human activity across the following metrics:

Age of Account

User Tweets

User Retweets

User replies and mentions

URLs in tweets

Content uploaded

Likes per tweet

Retweets per tweet

Tweets Favourited

Friend-follower ratio

Activity sources count

The following accounts have been shown to exhibit bot activity:

Twitter suspends accounts based on the following three criteria (Twitter Inc., 2018):

Spam

Account security at risk

Abusive tweets or behaviour

Users have now begun to use automated tools in order to boost their profiles on social media, and in particular there are such tools for Twitter that follow and unfollow users automatically (Karlson, 2017). A framework for bot detection has been constructed by Varol et al. (2017), whereby more than 1000 features are leveraged to evaluate the degree to which a Twitter account is similar in characteristics to known

social bots (Davis, et al., 2016). This framework is adopted in the website,

truthy.indiana.edu/botornot (now Botometer), and is free to use online.

Bot influence

According to Gilani et al. (2017), bots have been observed to have a profound influence in social media. Since sentiment shared on social media has been recognized to affect external events such as financial markets and political affinity, it is unsurprising to witness the emergence of bots that aggressively lobby viewpoints or spread malicious information.

Bot Networks

Online media has become influential in affecting the sentiment of users, and bot creators have been targeting websites and social media to promote products or causes, or to spread fake news. Sophisticated bots have been known to circumvent the controls of social media platforms and to form networks of their own, known as bot networks. The recent Trump and Russia allegations revealed bot networks on Twitter that targeted the accounts of journalists and other users who opposed Donald Trump, with some human accounts temporarily suspended after being attacked by a network of bots¹. This means that a bot network is capable of getting human accounts suspended by triggering a breach of the rules of Twitter.

Analysis of #FeesMustFall Twitter Data

According to Botometer, EduFunder is rated 74% a bot. It also exhibited outlier behaviour (8 tweets within 2 seconds).

The table below is an analysis of the #FeesMustFall Twitter data collected from 21 March 2015 until 09 April 2017. A total of 576 583 tweets were gathered. The table lists the top 10 users who posted the most tweets in this period.

Table 3.0 The most prolific tweeters

| User Name | Avg. Sentiment Score | No. of hashtags (#) | Favourite | No. of URLs in Tweet | Number of Tweets | Retweet |
|---|---|---|---|---|---|---|
| Camaren Peter | -0,07 | 63817 | 488 | 15362 | 15403 | 242 |
| EduFunder | 0,05 | 13665 | 146 | 4111 | 7018 | 319 |
| Wake up SA!! | 0,00 | 15684 | 1013 | 2215 | 2318 | 1025 |
| Jou Ma Se Party | -0,09 | 533 | 56 | 2294 | 2258 | 70 |
| #AFRICA | -0,09 | 7355 | 426 | 2221 | 2193 | 100 |
| Jacaranda News | -0,01 | 3388 | 2035 | 712 | 2063 | 7185 |
| EWN Reporter | -0,04 | 2206 | 6532 | 969 | 1739 | 28974 |
| The Daily VOX | -0,02 | 1561 | 3523 | 1096 | 1053 | 12034 |
| POWER987News | 0,00 | 1297 | 1371 | 298 | 1041 | 6622 |
| ANN7 | -0,01 | 259 | 1114 | 366 | 949 | 4171 |

1 https://krebsonsecurity/2017/08/twitter-bots-use-likes-rts-for-intimidation/

- Avg. Sentiment Score – The average sentiment of the user's tweets. Sentiment has been calculated using the VADER² (Valence Aware Dictionary and Sentiment Reasoner) library for Python³.

- No. of hashtags – The total number of hashtags (#) in the user's tweets.

- Favourite – The total number of times the user's tweets were marked as a favourite.

- No. of URLs in Tweet – The total number of URLs in the user's tweets.

- Number of Tweets – The total number of tweets by the user.

- Retweet – The total number of times the user's tweets were retweeted.
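The column definitions above can be illustrated with a short aggregation sketch. The record fields (`user`, `text`, `favourites`, `retweets`, `compound`) are hypothetical names, the sentiment score is assumed to have been computed beforehand, and counting `#` and `http` in the text is a crude stand-in for proper hashtag and URL extraction; none of this is confirmed as the researchers' actual procedure.

```python
from collections import defaultdict


def summarise_users(tweets, top_n=10):
    """Aggregate per-user totals and return the top_n users by tweet count."""
    totals = defaultdict(lambda: {
        "tweets": 0, "hashtags": 0, "urls": 0,
        "favourites": 0, "retweets": 0, "compound_sum": 0.0,
    })
    for tweet in tweets:
        row = totals[tweet["user"]]
        row["tweets"] += 1
        row["hashtags"] += tweet["text"].count("#")   # crude hashtag count
        row["urls"] += tweet["text"].count("http")    # crude URL count
        row["favourites"] += tweet["favourites"]
        row["retweets"] += tweet["retweets"]
        row["compound_sum"] += tweet["compound"]

    summary = [
        {
            "user": user,
            "avg_sentiment": row["compound_sum"] / row["tweets"],
            "hashtags": row["hashtags"],
            "favourites": row["favourites"],
            "urls": row["urls"],
            "tweets": row["tweets"],
            "retweets": row["retweets"],
        }
        for user, row in totals.items()
    ]
    # Most prolific users first.
    summary.sort(key=lambda r: r["tweets"], reverse=True)
    return summary[:top_n]


# Hypothetical records, for illustration only.
sample = [
    {"user": "a", "text": "#FeesMustFall https://t.co/x",
     "favourites": 2, "retweets": 1, "compound": 0.4},
    {"user": "a", "text": "#FeesMustFall #NationalShutDown",
     "favourites": 0, "retweets": 3, "compound": -0.2},
    {"user": "b", "text": "hello",
     "favourites": 1, "retweets": 0, "compound": 0.0},
]
summary = summarise_users(sample)
```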

The researchers argue, with reasonable justification, that news media Twitter accounts such as ANN7, EWN and POWER987News are self-serving, because they direct traffic towards their news articles as a marketing exercise. Twitter users view these accounts as reputable sources of information and retweet (RT) them.

2 VADER Sentiment Analysis. VADER (Valence Aware Dictionary and Sentiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains - https://github.com/cjhutto/vaderSentiment

3 Python is an interpreted high-level programming language for general-purpose programming - https://www.python.org/

EduFunder was found to be a bot, as the volume and frequency of its tweets exhibited abnormal behaviour, as can be seen in Figure 2 below:

Figure 2: Sample of Tweets from EduFunder

The vertical axis represents time in the format h:m:s (hour:minute:second) and the horizontal axis represents the corresponding volume of tweets. Notice the significant amount of activity within 10 seconds and at least 2 tweets per second in this sample of data. EduFunder also used IFTTT (If This Then That) to tweet, which is widely used to create bot applets.
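Bursts such as "8 tweets within 2 seconds" can be found with a sliding window over the tweet timestamps. The sketch below is one possible way to do this under stated assumptions: the function names and the sample timestamps are made up for illustration and do not come from the EduFunder data.

```python
from collections import Counter
from datetime import datetime


def tweets_per_second(times):
    """Count tweets for each whole second."""
    return Counter(t.replace(microsecond=0) for t in times)


def max_in_window(times, window_seconds):
    """Largest number of tweets whose timestamps lie less than window_seconds apart."""
    ordered = sorted(times)
    best = 0
    start = 0
    for end, current in enumerate(ordered):
        # Advance the window start until it is within range of the current tweet.
        while (current - ordered[start]).total_seconds() >= window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical timestamps, for illustration only.
base = datetime(2016, 10, 3, 12, 0, 0)
sample = (
    [base.replace(second=0)] * 2
    + [base.replace(second=1)] * 3
    + [base.replace(second=2)] * 3
    + [base.replace(second=30)]
)
peak_per_second = max(tweets_per_second(sample).values())
peak_in_two_seconds = max_in_window(sample, 2)
```

A sustained rate of several tweets per second is difficult for a person typing by hand, so a high peak is a useful prompt for closer inspection of the account.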

Table 4: A few tweets from Figure 2

| Tweet | Sentiment |
|---|---|
| NikkyCage: RT Notty_Mnguni: "The only Blade we acknowledge" ☹☹ #FeesMustFall #NationalShutDown #UKZNFeesMustFall #… https://t.co/StX4389VIX | 0 |
| BCM_82: RT IOL: 10 powerful placards of #FeesMustFall https://t.co/3iWxA75HXs | 0,4215 |
| ABasaJJ: RT sihle_mda: #NationalShutdown ✊ #FeesMustFall ✊ | 0 |
| Ceendie_: Mandela didn't spend years in prison for this #Feesmustfall | -0,5106 |
| Mageba_Zulu: RT SASCO_Jikelele: Solidarity from Namibian students. #FeesMustFall #NationalShutDown https://t.co/2J9vbEiyDj | 0,296 |

Table 4 displays a few tweets corresponding to Figure 2 in order to deduce the nature of the bot’s intentions. It can be seen that the bot has a lot of retweets and virtually no

‘original’ tweets. Perhaps it was created to amplify awareness of the #FMF campaign. Some tweets from the bot were retweeted, and it can therefore be said that influence of some nature occurred between the bot and other Twitter users.

Camaren Peter is more of a cyborg, as Hootsuite and Buffer were used to automate tweets for this account. Camaren Peter’s tweeting activity displayed punctuality, unlike a completely human-controlled Twitter account, and also exhibited a lack of variety in the substance of tweets. Figure 3 portrays this behaviour as follows:

Figure 3: Tweets vs Minutes of an hour
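The punctuality shown in a plot of tweets against the minute of the hour can be quantified with a simple histogram. The sketch below is illustrative only; the function names, the choice of the top `k` minutes and the sample schedule are assumptions.

```python
from collections import Counter
from datetime import datetime


def minute_histogram(times):
    """Count tweets by the minute of the hour (0 to 59) at which they were posted."""
    return Counter(t.minute for t in times)


def top_minute_share(times, k=3):
    """Fraction of tweets that fall on the k most used minutes of the hour."""
    hist = minute_histogram(times)
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in hist.most_common(k)) / total


# Hypothetical schedule: posts on the hour and half hour, plus one manual tweet.
sample = [
    datetime(2016, 10, 3, hour, minute)
    for hour in range(8, 12)
    for minute in (0, 30)
] + [datetime(2016, 10, 3, 11, 56)]
share = top_minute_share(sample, k=2)
```

A share close to one indicates that nearly all tweets land on the same few minutes, which is consistent with a scheduling tool rather than spontaneous posting.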

Table 5 shows an example of a tweet from Camaren Peter:

Table 5: A sample tweet from Camaren Peter

| Tweet | Date | Favourite | Retweet | Tweet Source | Sentiment |
|---|---|---|---|---|---|
| Thought Factory (Oct 2015): Student Protests Scuppered by Institutions https://t.co/JTRUqaIc8e #FeesMustFall #SouthAfrica #leadership | 2016-10-03 11:56 | 5 | 5 | Hootsuite | -0,2263 |

The tweet from Table 5 is calculated as a negative sentiment by VADER and on

inspection seems accurate as the text leaves the reader with a feeling of failure for the

student protests. Also note that this tweet was tagged as a favourite 5 times and

retweeted 5 times. This signifies influence of some degree and, in the broader context, means that cyborgs can affect social media users. The link in the tweet refers a user to a blog that discusses South African politics, thereby drawing users towards the aims of the cyborg.

Further work

This study is relevant to understanding student activism. The model and methodology may, at a government level, be extended to anticipate and mitigate service delivery protests and even help in tracking sources of illnesses like listeriosis. At a commercial level, companies may use this for tracking real-time sentiment. Newly emerging campaigns such as #DataMustFall may also be tracked.

Conclusion

Although not part of the study, the researchers are pleased to announce that the campaign had a desired effect, as the new President Ramaphosa announced that education will from 2017 be free for students from families with a combined income of less than ZAR 350 000.

The study shows that Twitter was a key and active platform of the campaign. Contrary to some perceptions, slacktivism, although present, was not a key component of the campaign. The study found intriguing evidence of software robots, commonly called social bots or simply bots, which was, to the authors' knowledge, not mentioned in the media or in any study during this campaign.

Bibliography

Chew, C. and Eysenbach, G. 2010. Pandemics in the Age of Twitter: Content Analysis of Tweets during the 2009 H1N1 Outbreak. PLOS ONE 5(11): e14118. https://doi.org/10.1371/journal.pone.0014118

Child, K. 2017. Sunday Times. Report. Pro-Gupta bots unmasked. 10 July 2017.

Chu, Z., Gianvecchio, S., Wang, H. and Jajodia, S., 2012. Detecting automation of twitter accounts: Are you a human, bot, or cyborg?. IEEE Transactions on Dependable and Secure Computing, 9(6), pp.811-824.

Davis, C. A. et al., 2016. BotOrNot: A system to evaluate social bots. Proc. 25th Intl. Conf. Companion on World Wide Web, pp. 273-274.

Differences. 2017. Differences between Twitter and Texting. Report. Available at: http://www.differencebetween.net/technology/internet/difference-between-twitter-and-texting/

Ferrara, E., Varol, O., Davis, C., Menczer, F., Flammini, A.: 2016. The rise of social bots. Commun. ACM 59(7), 96–104 (June 2016)

Gilani, Z. et al., 2017. An in-depth characterisation of Bots and Humans on Twitter. arXiv preprint arXiv:1704.01508, pp. 1-18.

Karlson, K., 2017. AGGREGATE: everyone’s using automated twitter following tools. [Online] Available at: https://aggregateblog.com/automated-twitter-following-tools/ [Accessed 13 February 2018]

Kramar, S. 2017. Identifying viral bots and cyborgs in social media. O'Reilly Media. Available at: https://www.oreilly.com/ideas/identifying-viral-bots-and-cyborgs-in-social-media

Python Foundation. 2018. Vader Sentiment 2.5. Available at: https://pypi.python.org/pypi/vaderSentiment

Shao, C. et al., 2017. The spread of fake news by social bots. arXiv preprint arXiv:1707.07592, pp. 1-27.

Spaull, N. 2017. South Africa has one of the world’s worst education systems. The Economist, 7 Jan 2017. Available at: https://www.economist.com/news/middle-east-and-africa/21713858-why-it-bottom-class-south-africa-has-one-worlds-worst-education

Twitter inc., 2018. About suspended accounts. [Online] Available at: https://help.twitter.com/en/managing-your-account/suspended-twitter-accounts [Accessed 8 February 2018].

Twitter Inc., 2018. Automation Rules. [Online] Available at: https://help.twitter.com/en/rules-and-policies/twitter-automation

Pillay, S.R. 2016. Silence is violence: (critical) psychology in an era of Rhodes Must Fall and Fees Must Fall

Varol, O. et al., 2017. Online Human-Bot Interactions: Detection, Estimation, and Characterization. arXiv:1703.03107v2, 27 March.pp. 1-11.

Word pool [social yob][social mob]coordinate manipulate incite

Prayer Index What was the prayer index during the campaign?

Bot detection

Timelines vs frequency

‘Amita Bachan – twitter cleans followers (bots)’

Bell Pottinger Slash and burn strategy Instant gratification