The Benefits of a Notification Process in Addressing the Worsening Computer Virus Problem
Mike O’Leary
Director, Applied Mathematics Laboratory
Towson University
Abstract
We used epidemiological models to analyze how behavior affects the spread of a computer virus.
– In particular, we created a simulation to model a corporate computer network.
• Parameters for the simulation were obtained from a survey.
– The results of the simulation were compared to a simple analytic model.
These showed the benefit of a well-defined process for notification in preventing the spread of viruses.
Conclusion
Instituting a formal process that notifies the sender of a virus as well as the network administrator is effective in reducing the spread of computer viruses.
This may be more cost-effective than other technological mitigation techniques.
Project Origins
This project is the result of a collaboration between two local companies, Science Applications International Corporation and Science Communication Studies, and the Towson University Applied Mathematics Laboratory.
The Applied Mathematics Laboratory
Founded in 1980.
Searches for mathematical research projects at the advanced undergraduate level.
Projects are sponsored by local companies and government agencies.
– We charge a fee to cover our costs.
Two faculty members act as project directors.
Three to six students are chosen by invitation to participate in each project.
Projects usually last one full year.
At the end of the Fall Semester, an interim report and an interim presentation are made by the students to the sponsoring organization.
A final report and final presentation are made by the students at the end of the Spring Semester.
Project Collaborators
Joan L. Aron, Science Communication Studies
Ron Gove, Science Applications International Corporation (SAIC)
Shiva Azadegan, Department of Computer & Information Science, Towson University
M. Cristina Schneider
Student Team
Shadi Alagheband
Michael R. Connelly
Sarah Faris
Michael Thomas
Contributors
John McKnight
Myron Cramer
Cedric Armstrong
Jim Frazer
Department of Defense
What Is a Virus?
A virus is a piece of computer code that is designed to enter another user’s computer and execute without that user’s permission.
Types of Viruses
Macro viruses
– Word
– Excel
– Access
Executable viruses
Boot sector viruses
Worms
A worm is a virus that can self-propagate.
How Do We Stop Viruses?
Anti-virus software
– On workstations
– On email servers
– On network servers
Anti-virus software compares unknown files with a collection of virus signatures.
If there is a match, the software concludes that the file is infected.
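The matching step can be sketched in Python. This is a minimal illustration only, assuming that a signature is a plain byte pattern searched for inside a file's contents. The signature names and byte values below are made up, and real anti-virus products use more elaborate matching.

```python
# Hypothetical signature collection: name -> byte pattern (made-up values).
SIGNATURES = {
    "Example.MacroA": b"\x4d\x41\x43\x52\x4f\x41",
    "Example.BootB": b"\xde\xad\xbe\xef",
}


def scan(data: bytes, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


def is_infected(data: bytes, signatures=SIGNATURES) -> bool:
    """A file is judged infected if at least one signature matches."""
    return bool(scan(data, signatures))
```

A virus whose pattern is not yet in the collection passes this check, which is why the signature files have to be kept current.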
Technical Details
Virus signature files must be updated regularly.
– In many cases, this process is now automated.
Anti-virus software companies are interested in technological solutions.
– They use the analogy of a “vaccine” against computer viruses.
Lessons From Epidemiology
There are diseases which remain problematic despite effective treatments and/or vaccines. Why?
– Behavior
– Environment
– Host factors
Problems With Total Reliance on Technology
Problems in deployment.
Improper installation.
Improper configuration.
Maintenance.
Windows of vulnerability.
– Re-install.
– Rapid growth.
– Change in IT personnel.
Undetectable viruses.
– Melissa et al.
Example
Failure to update anti-virus signatures on our campus
Methods
Virus Survey
Conducted a Computer Virus Epidemiology Survey (CVES) to
– Examine indicators of the impact of computer viruses
– Provide reasonable ranges for parameters in the simulation model
A WWW survey
Online from June 1998 to September 1999
Advertised
– by links in search engines
– by links in security web sites
– by direct email
106 respondents
Obvious sources of bias
Questions
Organizational characteristics
Severity index
– Effects of computer viruses in the preceding 12 months
Anti-virus posture
– Number of machines running anti-virus software
– Virus signature update procedure
The Simulation
Language
Simulation language was MODSIM.
An object-oriented discrete-event simulation language.
Simulation governed by a continuous time variable.
Actions can be scheduled on the basis of the simulation time.
Sample Code
FOR I := 1 TO Recipients
  IF (ASK RandomCommChecked UniformReal(0.0, 1.0)) < ProbabilityCommChecked
    TELL Network[Listener[I]] TO SetStatus(ComputerSender, MethodOfComm, FileTransfer, IntegerInfectionRep);
  ELSE
    WaitTime := ASK RandomWaitTime Exponential(AvgDelayToRespond);
    IF (WaitTime + SimTime()) > (FLOAT(Days) * 8.0)
      WaitTime := (FLOAT(Days * 8) - SimTime());
    END IF;
    TELL Network[Listener[I]] TO SetStatus(ComputerSender, MethodOfComm, FileTransfer, IntegerInfectionRep) IN WaitTime;
  END IF;
END FOR;
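For readers unfamiliar with MODSIM, the delay logic of the sample code can be restated in Python. This is a sketch, not the project's code: the function name and arguments are hypothetical, and it assumes that Days is the 1-based index of the current day, so that the day ends at Days * 8 time units.

```python
import random

UNITS_PER_DAY = 8.0  # the simulation uses 8 time units per day


def response_delay(sim_time, day, prob_checked, avg_delay, rng=random):
    """Return how long a recipient waits before handling a message.

    A return value of 0.0 means the message is handled immediately.
    Assumption: day is 1-based, so the current day ends at day * UNITS_PER_DAY.
    """
    if rng.random() < prob_checked:
        return 0.0
    wait = rng.expovariate(1.0 / avg_delay)  # exponential with mean avg_delay
    end_of_day = day * UNITS_PER_DAY
    if sim_time + wait > end_of_day:
        wait = end_of_day - sim_time  # cap the wait at the end of the day
    return wait
```

The cap mirrors the inner IF of the sample code: a response never spills past the end of the current simulated day.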
Parameters
Based on the survey results, we examined 11 factors that we thought would have a significant role in the transmission of a virus.
Probability of effective anti-virus use
Probability of
– Email use
– Network connection use
– Floppy use
Probability that users would share a computer
Cleanup probabilities
Notification probabilities
Detection probabilities
Exposure probabilities
Re-infection probabilities (lingering)
Scrub threshold
Parameter Selection
For each parameter, a base, a low, and a high value were set.
Representative values were determined from survey parameters or extant literature.
A sequence of simulations was run, two for each parameter: one with that parameter at its high value and one at its low value, with the other parameters kept at their base values.
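The one-at-a-time design described above can be sketched as follows. The function name and the dictionary layout are hypothetical, and the extra all-base reference run is an addition for illustration that the slides do not mention.

```python
def one_at_a_time_runs(levels):
    """Build parameter sets for a one-at-a-time sensitivity study.

    levels maps each parameter name to a dict with keys "low", "base",
    and "high". Returns (label, params) pairs: one all-base reference run
    (an assumption, not stated in the slides) plus two runs per parameter.
    """
    base = {name: v["base"] for name, v in levels.items()}
    runs = [("base", dict(base))]
    for name, v in levels.items():
        for setting in ("low", "high"):
            params = dict(base)          # every other parameter stays at base
            params[name] = v[setting]    # only this one moves
            runs.append((f"{name}={setting}", params))
    return runs
```

With 11 parameters this yields 22 varied runs, so the effect of each factor can be read off by comparing its two runs.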
Based on these results, we focused our attention on the following:
– Probability that a user had effective anti-virus software [AV]
– Communication rate [Comm]
– Exposure rate [Exposure]
– Notification probability [Notify]
Parameters: Basic
Simulation length (365 days)
Number of computers (200)
Parameters: Viruses
Number of distinct virus types (20)
– Word macro viruses (76%)
– Excel macro viruses (5%)
– Boot sector viruses (2%)
– Executable viruses (17%)
Frequencies taken from WildList, August 1998.
Parameters: Communication
Number of communication events per day (100, 200, 400, 1000) [Comm]
Methods
– Email (75%)
– Network connection (20%)
– Floppy disk (5%)
Data
– Word documents (70%)
– Excel spreadsheets (10%)
– Executable file (5%)
– Other (15%)
Probability that a communication is checked immediately (70%)
Average delay to respond to a communication (1 hour)
Average number of recipients of an email message (3)
Parameters: Anti-Virus
Probability that a computer has effective anti-virus software (80%, 95%) [AV]
Probability per day of a computer’s exposure to a virus from an outside source (0.1%, 0.5%, 2%) [Exposure]
Parameters: Behavior
Probability that a virus recipient notifies sender and administrator (10%, 25%, 50%, 75%, 90%) [Notify]
Probability that a user who is notified that they have a virus will be able to successfully remove it (85%)
Probability per day that a user without effective anti-virus software will recognize a virus (5%)
The Simulation: Initialization
Initialize random number generators
Read input parameters from file
Randomly configure and assign virus types
Construct network as an array of computer objects
Determine which machines have effective anti-viral software
Determine which computers are initially infected
Simulation: One Day
Simulation is managed by SimTime, with 8 units of time to one day.
At the start of the day
– Record the network status
– Introduce n new external infections by sampling a binomial distribution
– Re-introduce m infections from previously cleaned machines by sampling a binomial distribution
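The two binomial draws at the start of the day might look like this in Python. It is a sketch under stated assumptions: the number of trials is taken to be the number of computers (for outside exposure) and the number of previously cleaned machines (for re-infection), and all function and argument names are hypothetical.

```python
import random


def binomial(n, p, rng=random):
    """Sample Binomial(n, p) as the number of successes in n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def start_of_day_infections(n_computers, n_cleaned, p_exposure, p_reinfect, rng=random):
    """Return (n, m): new external infections and re-introduced infections."""
    new_external = binomial(n_computers, p_exposure, rng)
    reintroduced = binomial(n_cleaned, p_reinfect, rng)
    return new_external, reintroduced
```

For example, with 200 computers and a daily exposure probability of 0.5%, the expected number of new external infections is one per day.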
Simulation: One Communication
Sample from an exponential distribution to determine the time of the communication.
Sample from uniform distribution to determine the sending computer.
Determine the type of communication
– For email communications, sample from an exponential distribution to determine the number of recipients.
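A single communication event could be generated as in the Python sketch below. The names are hypothetical. Two choices are assumptions rather than details from the slides: the exponential sample for the number of email recipients is rounded up so that there is at least one recipient, and non-email communications have exactly one recipient.

```python
import math
import random

METHODS = ["email", "network", "floppy"]
METHOD_WEIGHTS = [0.75, 0.20, 0.05]  # method frequencies from the parameter slides


def next_communication(now, events_per_day, n_computers,
                       mean_recipients=3.0, units_per_day=8.0, rng=random):
    """Return (time, sender, method, recipients) for the next communication."""
    rate = events_per_day / units_per_day   # events per unit of simulation time
    time = now + rng.expovariate(rate)      # exponential gap between events
    sender = rng.randrange(n_computers)     # uniform choice of sending computer
    method = rng.choices(METHODS, weights=METHOD_WEIGHTS)[0]
    if method == "email":
        # Assumption: round up so every email has at least one recipient.
        recipients = max(1, math.ceil(rng.expovariate(1.0 / mean_recipients)))
    else:
        recipients = 1
    return time, sender, method, recipients
```

Note that rounding up shifts the mean number of recipients above the nominal value of 3, so a faithful reimplementation would need to know how the original handled the conversion to an integer.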
Simulation: Response
For each computer that receives a message, check to see if the computer user will respond immediately to the message.
– If not, sample from an exponential distribution to determine the wait time.
– If the wait time extends beyond the current day, response will occur at the start of the next day.
Simulation: Virus?
Is there a virus? Can it be passed in this communication?
No:
– This communication event is done.
Yes:
– Does the anti-virus software stop it?
• Yes: check to see if the user informs the sender and the network administrator.
• No: then infect this machine.
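The decision logic for one delivered message can be written as a small Python function. This sketch assumes that a communication carrying no transmissible virus simply ends, and that only a machine whose anti-virus software stops the virus can notify. The function name and outcome labels are hypothetical.

```python
import random


def receive(carries_virus, can_transmit, has_effective_av, p_notify, rng=random):
    """Classify one delivery as 'clean', 'blocked', 'blocked_notified', or 'infected'."""
    if not (carries_virus and can_transmit):
        return "clean"                   # nothing to pass on: event is done
    if has_effective_av:
        if rng.random() < p_notify:
            return "blocked_notified"    # sender and administrator are informed
        return "blocked"
    return "infected"                    # no effective anti-virus: machine is infected
```

Only the "blocked_notified" outcome feeds back to the sender and the administrator, which is where the notification probability enters the simulation.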
Simulation: Recovery
If a user is informed that they sent a virus, then they attempt to clean their machine.
If the network administrator receives sufficiently many notifications of virus activity, then every user on the network attempts to clean their machine.
At the end of each day, check to see if a user notices a virus on their machine. If so, then they attempt to clean their machine.
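The recovery rules can be sketched in Python as follows. The class and function names are hypothetical, and the sketch assumes that the administrator's notification count resets after a network-wide scrub, which the slides do not specify.

```python
import random


class Admin:
    """Counts virus notifications and signals when a network-wide scrub is due."""

    def __init__(self, scrub_threshold):
        self.scrub_threshold = scrub_threshold
        self.notifications = 0

    def notify(self):
        """Record one notification; return True if the threshold is reached."""
        self.notifications += 1
        if self.notifications >= self.scrub_threshold:
            self.notifications = 0  # assumption: the count resets after a scrub
            return True
        return False


def attempt_cleanup(infected, p_cleanup, rng=random):
    """Return the infection status after one cleanup attempt."""
    if infected and rng.random() < p_cleanup:
        return False
    return infected


def scrub_network(infected_flags, p_cleanup, rng=random):
    """Every machine on the network attempts a cleanup."""
    return [attempt_cleanup(flag, p_cleanup, rng) for flag in infected_flags]
```

The same attempt_cleanup step would serve all three triggers: a notified sender, a network-wide scrub, and a user who notices a virus at the end of the day.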
The Analytic Model
Effective Contacts
The number of effective contacts per communication event is
$$\lambda_V = \sum_{C} \mathrm{Prob}(C) \cdot \#\mathrm{Recipients}(C) \cdot \mathrm{Prob}(C \text{ transmits } V)$$
where the sum runs over the communication methods $C$.

                                   Prob[C transmits V]
Comm      Prob[C]  Recipients   Word   Excel  Exec.  Boot
Email     0.75     3            0.70   0.10   0.05   0
Network   0.20     1            0.70   0.10   0.05   0
Floppy    0.05     1            0.70   0.10   0.05   1

With these values $\lambda_V \le 1.75$, the maximum being attained by Word macro viruses.
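The effective-contact sum can be checked numerically with the table values. The Python sketch below uses hypothetical names; the numbers are the ones in the table.

```python
# Communication method -> (probability of the method, number of recipients)
METHODS = {"email": (0.75, 3), "network": (0.20, 1), "floppy": (0.05, 1)}

# Virus type -> probability that each method transmits it
TRANSMIT = {
    "word":  {"email": 0.70, "network": 0.70, "floppy": 0.70},
    "excel": {"email": 0.10, "network": 0.10, "floppy": 0.10},
    "exec":  {"email": 0.05, "network": 0.05, "floppy": 0.05},
    "boot":  {"email": 0.00, "network": 0.00, "floppy": 1.00},
}


def effective_contacts(virus):
    """Sum Prob(C) * recipients(C) * Prob(C transmits virus) over all methods C."""
    return sum(p * n * TRANSMIT[virus][m] for m, (p, n) in METHODS.items())
```

Email dominates the sum because it is both the most common method and the only one with several recipients, which is why boot sector viruses, carried only by floppy disk, have by far the fewest effective contacts.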
Analytic Model: Variables
$y$ is the fraction of infected machines.
$C_V = (\mathrm{Comm}/200)\,\lambda_V$ is the daily contact rate, where $\lambda_V$ is the number of effective contacts per communication event and 200 is the number of computers.
$\alpha$ is the fraction of machines with effective anti-virus software.
$\gamma_V = \mathrm{Recognize} + C_V\,(\mathrm{Notify})(\mathrm{Cleanup})$
$G_V$ is the fraction of new infections from a particular virus $V$.
Analytic Model
Our simplified model, for each virus $V$, is
$$\frac{dy}{dt} = (1-\alpha)\,C_V\,y\,(1-y) \;-\; \gamma_V\,y \;+\; \mathrm{Exposure}\,(1-\alpha)\,G_V\,(1-y)$$
– First term: infection rate due to contact with infected machines on the network.
– Second term: rate at which machines are cleaned, either by recognition or by cleanup after a notification.
– Third term: rate at which machines are infected because of exposure to an outside virus.
This equation is autonomous and has a stable equilibrium point.
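The equilibrium can be computed in closed form, because the right-hand side of the model is a quadratic in y. The Python sketch below assumes the model has the form: contact term (1 - av) * contact * y * (1 - y), minus removal * y, plus exposure * (1 - av) * g * (1 - y). The function and argument names are hypothetical: contact stands for the daily contact rate, av for the fraction of machines with effective anti-virus software, removal for the cleaning rate, and g for the fraction of new infections due to the virus in question.

```python
import math


def rhs(y, contact, av, removal, exposure, g):
    """Right-hand side dy/dt of the simplified model."""
    return ((1 - av) * contact * y * (1 - y)
            - removal * y
            + exposure * (1 - av) * g * (1 - y))


def equilibrium(contact, av, removal, exposure, g):
    """Return the non-negative root of rhs(y) = 0.

    Writing a = (1 - av) * contact and b = exposure * (1 - av) * g, the
    equation is a*y^2 - k*y - b = 0 with k = a - removal - b.
    """
    a = (1 - av) * contact
    b = exposure * (1 - av) * g
    k = a - removal - b
    disc = math.sqrt(k * k + 4 * a * b)
    if k > 0:
        return (k + disc) / (2 * a)
    denom = disc - k  # equivalent form that avoids cancellation when k <= 0
    return 2 * b / denom if denom > 0 else 0.0
```

As a usage example, for Word macro viruses with Comm = 200 the contact rate is 1.75, and with Recognize = 5%, Notify = 50%, and Cleanup = 85% the cleaning rate is 0.05 + 1.75 * 0.5 * 0.85. Calling equilibrium(1.75, 0.95, 0.05 + 1.75 * 0.5 * 0.85, 0.005, 0.76) then gives the long-run infected fraction for AV = 95% and Exposure = 0.5%. Raising the notification probability raises the cleaning rate and therefore lowers the equilibrium.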
Results
Results: AV = 95%
Changing the notification probability from 10% to 90% results in a 2-fold to 10-fold drop in the number of computer viruses in the network.
For high anti-virus software use, increased communication results in fewer viruses in the network.
These follow from both the simulation and the analytic approximation.
Results: AV = 80%
Increasing the level of notification has an even greater relative effect, from 7-fold to as much as 1000-fold.
For high levels of the notification probability, increased communication still had a protective effect.
For low levels of the notification probability, increased communication had a detrimental effect.
Results: Reproduction Ratio
Management Recommendations
Improving the notification probability has a significant role in reducing the spread of computer viruses.
This is a parameter that can be modified within an organization.
Behavior changes may be cheaper than complex technological solutions.
Increasing user awareness may help mitigate viruses that cannot be detected by existing virus signatures.