Operational Security Moving Beyond the Firewall Argonne National Laboratory Michael A. Skwarek, Deputy CIO & Cyber Security Program Manager Christopher Poetzel, Computer Network Engineer Argonne National Laboratory 2009 DOE Cyber Security Conference May 13, 2009

Operational Security – Moving Beyond the Firewall

Argonne National Laboratory

Michael A. Skwarek, Deputy CIO & Cyber Security Program Manager

Christopher Poetzel, Computer Network Engineer

Argonne National Laboratory

2009 DOE Cyber Security Conference

May 13, 2009

About the Presenters

Michael A. Skwarek, Deputy CIO and Cyber Security Program Manager
– Responsible for the effective balance between Cyber Security and Science
– Strong supporter of risk-based cyber security systems that integrate and provide efficiencies and effective communications to those in the trenches

Christopher Poetzel, Computer Network Engineer
– Responsible for the management and integration of the Laboratory firewalls
– Strong code and analysis capabilities in anomalous intrusion detection

Argonne National Laboratory

Diverse population:
– 2500 employees
– 10,000+ visitors annually
– Off-site computer users
– Foreign national employees, users, and collaborators

Diverse funding:
– Not every computer is a DOE computer.
– IT is funded in many ways.

Our goal: a consistent and comprehensively secure environment that supports the diversity of IT and requirements.

Argonne is managed by the UChicago Argonne LLC for the Department of Energy.

Laboratory IT Environment


Emphasis on the Synergies of Multi-Program Science, Engineering & Applications

Accelerator Research

Catalysis Science

Nuclear Fuel Cycle

Transportation Science

Computational Science

Materials Characterization

Structural Biology

Fundamental Physics

User Facilities

Infrastructure Analysis

... and much more.

Operational Security – Moving Beyond the Firewall

Operational Cyber Security is very reactive in nature
– The “Dashboard view” drives the day: “Green lights vs. Red lights”.
– Intel from outside sources only provides a catalyst to the actions.

Questions continue to remain on the table
– How do we do more with less?
– How do we learn from our incidents and those of others?
– How do we leverage a risk-based approach for cyber security?
– How do we architect a cyber defense system that is not going to get “top heavy”?

Incident review and root cause analysis
– Learning from your “mistakes” and the pain felt by others can be a healthy process.
– Think like a hacker.
– Results of analysis and a clear understanding of the threat and risks can build new and effective defense-in-depth cyber systems.
– Realization that there will never be a silver bullet to solve all of our problems.

Group Exercise: An incident in slow motion

Walk through a hypothetical cyber security incident that carries many trademarks of today’s reality.

Review the root cause elements that allowed for the incident to manifest and continue.

Through reflection, we will describe a number of mitigations in place today at Argonne that can be leveraged across the complex to mitigate similar and future attacks.


Phase I: Creation and Delivery of Infection

E-Mail addresses are harvested via an online phone book of employees within the Organization.

E-mail messages are crafted along with a Microsoft Word attachment that contains a malicious Zero-day exploit found within Office 2007.

Microsoft and AV vendors have not provided patches/virus signatures.

Local desktop administrative permissions are not required for exploitation.

Successful exploitation runs the malicious code with the permissions of the user who opened the file.

Cyber Incident in Review

[Figure: AV/SPAM filter, Firewall, and IDS, with the message marked CLEAN.]

Phase II: Infection

Recipient “A” is a member of the HR division. The employee has the following IT environment:
– Desktop: Fully patched Windows XP running Office 2003 and a member of the domain.
– Virus Protection: Fully up-to-date.
– Access Permission: Non-Administrator

Recipient “B” is a Post Doc within a Programmatic division. The employee has the following IT environment:
– Desktop: Fully patched Windows XP running Office 2003 and a member of the domain.
– Virus Protection: Fully up-to-date.
– Access Permission: Administrator

Phase III: Command and Control

Recipient “B” system detects and reports that the user is a local administrator -> Attack successful

Recipient “B” system has established a command and control session with “the mothership” over a “VPN like” connection on TCP/22 that does not use the SSH protocol.
– The local system is modified to create a new local service to ensure that command and control can be re-established after a reboot.
– Antivirus is disabled on the local system to prevent detection of certain tools.
– The remote attacker installs a virtual machine on the infected system, stealing an open IP address on the subnet.

The attacker now has horizontal movement across the organization in mind.

Communications are Established

[Figures, three slides: Firewall and IDS; CnC beacons carrying “Non-Admin” and “*Admin*” flags; outbound TCP/22 session from system B.]

Phase IV: Horizontal Movement

With the intention of moving to other systems within the organization, the attacker needs valid credentials.

– Option 1: Crack the desktop local passwords
  • Attacker is hoping the cracked password is used on all systems
    – Often seen for ease of administration in large environments
– Option 2: Use the “Pass the Hash toolkit”
  • Within seconds, the attacker can have the password hashes of the last 10 logons to the system.
    – Most likely, an admin has logged on once… if not SMS

With credentials in hand, it is time to scan the network to see where the attacker can go.

Horizontal Discovery and Spread

[Figure: Firewall and IDS; systems A and B; PII; outbound TCP/22 sessions. Command shown on the slide:]

iam.exe -h administrator:mydomain:0102030405060708090A0B0C0D0E0F10:0102030405060708090A0B0C0D0E0F10

Root Cause Analysis and Associated Impacts


Phase I: Exploit Creation and Delivery

Incident Phase | Root Cause | Impact
Creation / Delivery | Public facing website provides valid email address of all employees. | Valid targeted email addresses available to hacker
Creation / Delivery | Attacker has knowledge of zero day exploit to gain local access. | Affected systems can be compromised
Creation / Delivery | Vendors do not have patches or signatures to detect exploit | Malicious code can be executed and not detected
Creation / Delivery | Attacker has knowledge, time and intent | We are in trouble

Phase I: Exploit Creation and Delivery - Argonne Mitigations and Actions

Root Cause: Public facing website provides valid email address of all employees.
– Argonne has removed the capability to harvest email addresses from the public facing website.
– Initial email communications are established via a website that proxies the requested email to the recipient, thus hiding the valid email address from attackers.

Root Cause: Attacker has knowledge of zero day exploit
– Attempting to stay up-to-date on zero day exploits can be difficult but not impossible with strong communication and awareness.

Root Cause: Vendors do not have patches or signatures
– This happens a number of times throughout the year, and in some cases there are workarounds to mitigate some of the risk.
– User education and awareness are your first line of defense. Over-communicate where possible (email, daily newsletters, etc.) and patch ASAP when available.

Root Cause: Attacker has knowledge, time and intent

Phase II: Infection Recap

Incident Phase | Root Cause | Impact
Infection | Recipients are enticed to open and execute email attachment | Malicious code is executed.
Infection | Recipients are administrators of their local desktops | Malicious code is executed with permissions of recipient.
Infection | Recipients are users of their local desktops | Malicious code is executed with permissions of recipient.

Phase II: Infection - Argonne Mitigations and Actions

Root Cause: Recipients are enticed to open and execute email attachment
– User education is key as the employee is the last line of defense.
– Conducted a number of social engineering assessments. For those that “went too far”, immediate training is provided.
– Plans to conduct future social engineering tests leveraging Core Impact.

Root Cause: Recipients are a user of their local desktops
– This is a best case scenario. The exploit, if run, will assume the permission level of the recipient. In most cases, a non-administrative system is not interesting to attackers.

Root Cause: Recipients are administrator of their local desktop
– This is a worst case scenario. Administrative permissions in the wrong hands can spell disaster.
– Argonne has taken a hard stance on administrative permissions, and has required that employee accounts be provided the least user access required to fulfill their job requirements.

Phase II: Least User Access (LUA) - Argonne Mitigations and Actions

Least User Access rollout has both cultural and technical hurdles.
– Interestingly, many employees are not aware of what it means to be “admin”.

Not every scenario works in a true LUA environment.
– Exceptions to the rule are a fact of life since older applications were never written for LUA compliance.
– Argonne exceptions to LUA are requested and vetted through the Cyber Security Office.
– In the event that administrative permissions are required:
  • BeyondTrust is tried first to bridge the gap
  • If BeyondTrust does not meet the need, the user is provided a local administrative account and educated on its use.
– In near real-time, systems are monitored with Snare and Splunk for account membership changes.
– On a nightly basis, each system is interrogated for local administrative membership and compared against the approved list.
  • Rogue accounts are removed
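The nightly comparison described above might be sketched as follows. This is a minimal illustration, not Argonne's script: the host names, account names, and the shape of the `observed` and `approved` mappings are all assumptions, and the step that actually collects group membership or removes accounts is left out.

```python
def find_rogue_admins(observed, approved):
    """Return {host: [accounts]} for local admins not on the approved list.

    observed: {host: iterable of account names found in the local
              Administrators group} (hypothetical input format)
    approved: {host: iterable of account names vetted for that host}
    """
    rogue = {}
    for host, members in observed.items():
        allowed = {name.lower() for name in approved.get(host, ())}
        extra = {name.lower() for name in members} - allowed
        if extra:
            rogue[host] = sorted(extra)
    return rogue


# Hypothetical sample data for two desktops.
observed = {
    "hr-desk-01": ["Administrator", "jdoe"],
    "sci-desk-07": ["Administrator"],
}
approved = {
    "hr-desk-01": ["Administrator"],
    "sci-desk-07": ["Administrator"],
}
rogue_report = find_rogue_admins(observed, approved)
print(rogue_report)
```

Comparing names case-insensitively avoids false positives when the same account is reported with different capitalization.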

Phase III: Command and Control Recap

Incident Phase | Root Cause | Impact
Command and Control | Infected systems have unfettered access to Internet. | System beacons to external “mothership”, establishes CnC.
Command and Control | Attacker alters local desktop mitigations | Antivirus detection disabled and service created for reboot situation.
Command and Control | Attacker able to gather and install toolkits without detection | Further capabilities of internal exploitation (VMware, nmap, PTH)

Phase III: Command and Control - Argonne Mitigations and Actions

Root Cause: Infected systems have unfettered access to Internet.
– Installed divisional firewalls with both ingress and egress rulesets
  • Egress firewall deny logs are analyzed for anomalous behavior via Splunk and in-house written code to strengthen the signal to noise.
  • Beacon knocking is easily detected
– Leverage web filtering product to block identified malicious websites
– Instituted “DNS Blackhole” capability to “deny” DNS resolution to known malicious domains (currently ~32K domains).
– Created inline IDS rules to detect when traffic on a given service port does not match the protocol’s characteristics.
  • Ex: Identified traffic on TCP/22 does not look like SSH
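The TCP/22 example above can be illustrated with a small sketch. It relies on one fact from the SSH transport specification: each side announces itself with an identification line beginning with `SSH-`. Everything else here is an assumption for illustration, including the flow record layout and the function names; a production IDS rule would be written in the IDS's own rule language and would inspect more than the first payload.

```python
def looks_like_ssh(first_bytes: bytes) -> bool:
    """True if any line of the opening payload starts with the SSH
    identification prefix ("SSH-")."""
    return any(line.startswith(b"SSH-") for line in first_bytes.splitlines())


def flag_port_mismatch(flows):
    """Return flows to TCP/22 whose opening payload does not resemble SSH.

    flows: list of dicts with hypothetical keys "src", "dst_port", "payload".
    """
    return [
        flow for flow in flows
        if flow["dst_port"] == 22 and not looks_like_ssh(flow["payload"])
    ]


# Hypothetical captured flows: one real SSH client, one tunnel on port 22.
flows = [
    {"src": "10.1.1.10", "dst_port": 22, "payload": b"SSH-2.0-OpenSSH_5.1\r\n"},
    {"src": "10.1.1.99", "dst_port": 22, "payload": b"\x16\x03\x01\x00\x8f"},
    {"src": "10.1.1.20", "dst_port": 443, "payload": b"\x16\x03\x01\x00\x8f"},
]
suspicious = flag_port_mismatch(flows)
print([flow["src"] for flow in suspicious])
```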

Root Cause: Attacker alters local desktop mitigations
– Installed Snare on clients to forward anomalous events to Splunk.
  • Account Management, Group Management, Logon Failures
  • Script created to generate event log for service creation
– Script created to monitor antivirus process and restart if a STOP is detected.
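As a rough sketch of how the forwarded events above could be triaged, the function below picks out the event types named on the slide and lists hosts whose antivirus service should be restarted. The event field names (`category`, `action`, `service`, `host`) and the `AV_SERVICE` value are hypothetical; they do not reflect the actual Snare or Splunk schemas, and the restart itself is not performed here.

```python
WATCHED_CATEGORIES = {"Account Management", "Group Management", "Logon Failure"}
AV_SERVICE = "antivirus"  # hypothetical service name


def triage(events):
    """Split events into (alerts, hosts_needing_av_restart)."""
    alerts, restarts = [], []
    for event in events:
        av_stopped = (
            event.get("service") == AV_SERVICE
            and event.get("action") == "service_stopped"
        )
        if av_stopped:
            restarts.append(event["host"])
        if (
            av_stopped
            or event.get("category") in WATCHED_CATEGORIES
            or event.get("action") == "service_created"
        ):
            alerts.append(event)
    return alerts, restarts


# Hypothetical events from one infected desktop plus routine noise.
events = [
    {"host": "sci-desk-07", "category": "System", "action": "service_created", "service": "updater"},
    {"host": "sci-desk-07", "category": "System", "action": "service_stopped", "service": "antivirus"},
    {"host": "hr-desk-01", "category": "Logon Success", "action": "logon"},
]
alerts, restarts = triage(events)
print(len(alerts), restarts)
```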


Phase III: Command and Control - Argonne Mitigations and Actions (Continued)

Root Cause: Attacker able to gather and install toolkits without detection
– Note: In some cases, attackers have installed virtual machines on the infected systems to ensure that tools are present and working.
– Completed an integration between the authoritative host warehouse and the router ARP cache.
  • New MAC/IP pairs detected through ARP are compared against the host db
  • If a MAC/IP pair is not known, a shun is installed into the divisional firewall to halt traffic
  • A natural future extension is to drop the port, which has some issues revolving around hubs/switches.
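The ARP-to-host-database comparison above could look something like the sketch below. The input formats and function names are assumptions for illustration; collecting the ARP cache from routers and installing the shun in the firewall are outside its scope.

```python
def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.lower().replace("-", ":")


def unknown_pairs(arp_entries, host_db):
    """Return sorted (mac, ip) pairs seen in ARP but absent from the host db.

    Both arguments are iterables of (mac, ip) tuples (hypothetical format).
    A registered MAC appearing on an unregistered IP is also reported,
    because the pair as a whole is unknown.
    """
    known = {(normalize_mac(mac), ip) for mac, ip in host_db}
    seen = {(normalize_mac(mac), ip) for mac, ip in arp_entries}
    return sorted(seen - known)


# Hypothetical data: a virtual machine has claimed an open IP on the subnet.
host_db = [
    ("00:11:22:33:44:55", "10.1.1.10"),
    ("00-11-22-33-44-66", "10.1.1.11"),
]
arp_entries = [
    ("00:11:22:33:44:55", "10.1.1.10"),
    ("00:11:22:33:44:66", "10.1.1.11"),
    ("0A:0B:0C:0D:0E:0F", "10.1.1.99"),
]
to_shun = unknown_pairs(arp_entries, host_db)
print(to_shun)
```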


Phase IV: Horizontal Discovery and Spread

Incident Phase | Root Cause | Impact
Horizontal Spread | Ability to easily harvest valid credentials from system and domain. | Compromise of other systems and possibly entire domain
Horizontal Spread | Unfettered internal network access. | Ability to probe systems across the network
Horizontal Spread | Anomalous internal network behavior not detected. | Ability to probe systems across the network undetected.
Horizontal Spread | Anomalous amounts of traffic payloads not detected. | Ability to exfiltrate data offsite undetected.

Phase IV: Horizontal Discovery and Spread - Argonne Mitigations and Actions

Root Cause: Ability to easily harvest valid credentials from system and domain.
– Due to the advent of “Pass the Hash”, Windows environments are at great risk.
– A legacy configuration that caches credentials, so users can log on when the network is unavailable or when working remotely, lets an attacker gain domain admin within seconds.
  • Altered the Cached Credential values on all Windows systems
  • Desktops and Servers = 0
  • Laptops = 1
– Script created to randomize the administrator password on all Windows systems after reboot.
  • This password is not known by anyone, and is not the same on any two systems.
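The password randomization idea above can be sketched with Python's `secrets` module, which is intended for security-sensitive random values. The length and character set are assumptions, and the step that actually sets the local administrator password on the system is deliberately omitted.

```python
import secrets
import string

# Assumed character set and length; adjust to local password policy.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_admin_password(length: int = 24) -> str:
    """Generate a random password that is never stored or displayed elsewhere."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Each system would call this independently at boot, so no two share a value.
password_a = random_admin_password()
password_b = random_admin_password()
print(len(password_a), password_a != password_b)
```

Because each system generates its own value and nobody records it, a hash captured on one machine is of no use on another.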

Root Cause: Unfettered internal network access.
– Network VLAN boundaries are defined at the division level, and in some cases broken down even further.
– Ingress firewalls are installed to hamper horizontal network access

Phase IV: Horizontal Discovery and Spread - Argonne Mitigations and Actions (Continued)

Root Cause: Anomalous internal network behavior not detected.
– Leveraging network NetFlow records, traffic patterns can be analyzed against what is expected/normal traffic.
– In-house written scripts are in place watching for anomalous traffic patterns throughout the core of the network.
  • Ex: It is not normal for a system to chat with >50 hosts in a minute on a given port(s).
  • “Noisy” network reconnaissance is easily identified
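A minimal sketch of the ">50 hosts in a minute" check above, assuming flow records have already been reduced to simple tuples. The tuple layout and function name are hypothetical, and the sketch uses fixed one-minute buckets rather than a sliding window, so a scan that straddles a minute boundary could be undercounted.

```python
from collections import defaultdict


def noisy_sources(flows, max_hosts=50):
    """Return sorted (src_ip, dst_port) pairs that contacted more than
    max_hosts distinct destinations within one clock minute.

    flows: iterable of (timestamp_seconds, src_ip, dst_ip, dst_port).
    """
    peers = defaultdict(set)
    for timestamp, src, dst, port in flows:
        peers[(src, port, int(timestamp) // 60)].add(dst)
    return sorted(
        {(src, port) for (src, port, _minute), dsts in peers.items()
         if len(dsts) > max_hosts}
    )


# Hypothetical flows: one host sweeps 60 addresses on TCP/445 in a minute.
flows = [(i, "10.1.1.99", f"10.1.2.{i}", 445) for i in range(60)]
flows += [(5, "10.1.1.10", "10.1.2.1", 443), (6, "10.1.1.10", "10.1.2.2", 443)]
scanners = noisy_sources(flows)
print(scanners)
```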

Root Cause: Anomalous amounts of traffic payloads not detected.
– Leveraging network NetFlow records, traffic patterns can be analyzed for large payload flows leaving the network.
  • Creating a baseline of “norms” by hour and day can identify anomalies.
  • Special attention is paid towards systems of known sensitivity
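One simple way to express the hour-and-day baseline above is a mean-plus-deviation threshold per time slot. This is only a sketch under assumptions: the `baseline` mapping keyed by (weekday, hour), the three-sigma threshold, and the function name are illustrative choices, not the method used at Argonne.

```python
from statistics import mean, pstdev


def is_large_outbound(baseline, weekday, hour, observed_bytes, sigmas=3.0):
    """Return True if observed_bytes exceeds the slot's historical norm.

    baseline: {(weekday, hour): [outbound byte counts from past weeks]}
    Slots with no history return False rather than raising an alert.
    """
    history = baseline.get((weekday, hour))
    if not history:
        return False
    # Floor the deviation at 1 byte so a perfectly flat history still works.
    threshold = mean(history) + sigmas * max(pstdev(history), 1.0)
    return observed_bytes > threshold


# Hypothetical history for Wednesdays (weekday 2) at 14:00.
baseline = {(2, 14): [100_000, 120_000, 110_000, 105_000]}
print(is_large_outbound(baseline, 2, 14, 5_000_000))
print(is_large_outbound(baseline, 2, 14, 115_000))
```

Systems of known sensitivity could be given a lower `sigmas` value so that smaller deviations are surfaced.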


Takeaways and Lessons Learned

Presentation Top Takeaways and Lessons Learned


No two incidents will behave the same
– Build general defenses to compensate for deviations in the attack vector.

There is no silver bullet answer
– Sadly, the answer to all of our problems is not hiding out there.

Build systems that integrate
– Integration of cyber defense systems can build new defense systems.

Keep in mind the “signal to noise” of defense systems
– Red lights that are really not “Red” will only cause them all to be ignored.

Communicate and educate in every way possible
– Find ways to reach your employees, management and peers.

Build strong systems from the “ground up”
– Preach and follow strong configuration management, enough said.

Contact Information and Questions

Please feel free to contact us with any questions or comments that you may have regarding the systems and capabilities mentioned within the presentation.

Michael A. Skwarek (mskwarek at anl dot gov)
Chris Poetzel (cpoetzel at anl dot gov)

Questions?