18th airborne big data analytics tech brief_june 2 2015

XVIIIth Airborne Corps - Enterprise Data Management John Welby, CEO & Chief Strategist/Warfighter-Support, LLC [email protected] Mobile: +1 919/247.7891 ****** Warfighter-Support, LLC Confidential******* 04/28/2022 1

Upload: john-welby

Post on 23-Feb-2017


Page 1: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

05/01/2023 ****** Warfighter-Support, LLC Confidential******* 1

XVIIIth Airborne Corps -Enterprise Data Management

John Welby, CEO & Chief Strategist/Warfighter-Support, [email protected]

Mobile: +1 919/247.7891

Page 2: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Agenda
• The “New Data World”
• Current Systems (Discuss)
• Define Types of Storage
• Big Data Analytics (BDA) Primer
• BDA for Strategic, Operations & Tactical Intelligence
• Components of BDA/Enterprise Data Management

Page 3: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Army Challenges
• Policy
• Laws
• Culture
• Access to Resources from Secure Mobile Devices

Page 4: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Project Goals & Background
• Design, test and implement a common user experience across echelons, formations and phases [integrate w/ SOCOM’s TACLAN]
• Solutions for supporting smaller combat teams
• Extend services to tactical edge [integrate w/ Digital Edge program]
• Deploy Small Teams Anywhere in the World in Austere Environments
• Self-Defending Networks
• Everything into the Cloud

Page 5: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Project Optimization w/ MapR
• Most Reliable Hadoop Solution
• Unique Globalization Architecture
• Scales from very large data center deployments [CENTCOM] to smaller deployments [FOB] to very small ones [Forward Deployed Personnel]
• Information is available to harness, store, analyze and use to increase mission performance
• “The Perfect Big Data Platform”
• Hadoop / NoSQL / SQL-on-Hadoop

Page 6: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Project Goals & Background
• Network-Enable:
  • 24/7 Situational Awareness
  • Reachback
  • “Project, People, & Technology”
  • Ramp Up to Support the Warfighter
• Codify Home Station Missions
• Moving Mobility Down to the Field (e.g. A/D running in vehicles)
• Level of Acceptable Risk Assessments
• Always ON Global Infrastructure
• Theater Intelligence Command (6) Combatant Commander Intel feeds to Home
• Military Utility or Internet of Things (IoT) – sensors on everything (vehicles/facilities/soldiers)

Page 7: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

System Requirements (per XVIIIth)

Current Tactical Field Communications Kit Upgrade Req’s

More powerful / additional capabilities

Lighter (current system approx. 500lbs)

Support up to 20 paratroopers

Satellite communications

LMR voice

Active Directory

Email

Storage

Self-Contained Power

Page 8: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Types of Storage
Definition of Terms
Benefits to XVIIIth Airborne
Questions to Ask

Page 9: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

“Hot” / “Warm” / “Cold” Storage
• Hot storage is used for frequently accessed data that must be retrieved very quickly. An example is flash array storage.
• Warm storage offers medium IOPS and medium bandwidth. An example is hard disk drives.
• Cold storage is used for infrequently accessed data. An example is magnetic tape.
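The three tiers can be read as a simple placement rule keyed on how often data is accessed. The sketch below is illustrative only: the brief gives no thresholds, so the `HOT_MIN_READS_PER_DAY` and `WARM_MIN_READS_PER_DAY` values and the `choose_tier` helper are assumptions made up for this example.

```python
from enum import Enum


class Tier(Enum):
    HOT = "flash array"
    WARM = "hard disk drive"
    COLD = "magnetic tape"


# Hypothetical thresholds, chosen only for illustration; the brief states none.
HOT_MIN_READS_PER_DAY = 100
WARM_MIN_READS_PER_DAY = 1


def choose_tier(reads_per_day: float) -> Tier:
    """Pick a storage tier from how often the data is read."""
    if reads_per_day >= HOT_MIN_READS_PER_DAY:
        return Tier.HOT
    if reads_per_day >= WARM_MIN_READS_PER_DAY:
        return Tier.WARM
    return Tier.COLD
```

In practice the cut-offs would come from measured access patterns and the cost of each medium rather than fixed constants.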

Page 10: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


What is Big Data?

Page 11: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Big Data is…
• Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or other certain advanced methods to extract value from data, and seldom to a particular size of data set.

Source: Wikipedia

Page 12: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Big Data Analytics
• The Army, like any other entity, generates terabytes & petabytes of data daily.
• U.S. intelligence agencies and the military are increasingly leveraging analytics platforms based on machine learning to sift through data sources like social media. In the vernacular of the Pentagon, these efforts are generally referred to as open source intelligence initiatives.
• The U.S. intelligence community is spending billions of dollars on geospatial intelligence.

Page 13: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Machine Data (aka log data)

Intelligence Data
• Full-Motion Drop-Zone Video
• Video Analytics
• Logs
• Image Processing
• Geo-Spatial Processing
• Graph Analytics
• Text Processing
• Sentiment Analysis

“Maintenance” Data
• Hardware & Software Inventory
• Software Version
• Patch Updates
• End-of-Life Information
• Supply Levels
• Vehicle Maintenance Records
• Compliance Information

Page 14: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Gleaning Strategic, Operational & Tactical Intelligence from Machine Data
• One of the biggest challenges for intelligence analysts is the soaring volume of unstructured open source data as the bad guys resort to Facebook and Twitter to communicate and recruit.
• Employ Splunk and MapR on Cisco UCS to store, analyze, package and disseminate timely, actionable intel to Commanders.

Page 15: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Benefits of Next Generation Army Network
• Turn data into actionable information – unified, timely information and predictions
• Assure the 18th’s creed, “America’s Contingency, Anywhere in 18 hours,” is fulfilled with maximum impact, consistency, transparency, reliability and effectiveness
• Focus resources intelligently by putting them in the right place, on the right day and at the right time
• Results rigorously measured and commanders held accountable for their performance
• More effectively interface with Allies and Conventional Forces

Some info on following slides taken from CW5 Rick Pina’s Keynote Address at WWT Geek Day (May 2015)

Page 16: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Benefits of Next Generation Army Network
• Create “Commanders Risk Reduction Dashboard”
  • consolidate info from multiple Army databases
  • Soldiers can’t move if “at risk1”
• Company & Battalion Commanders (28 feeds)
• Cyber Network Security
  • capture every packet and analyze later
• Tactical Operational Center (TOC)
  • Mobile applications reside in TOC
  • Secure delivery of mobile applications
  • Working with DISA on Army App Store (DISA has its own App Store)

Page 17: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Benefits of Next Generation Army Network
• DISA is partnering with Army and Air Force to change the way DoD secures and protects information networks
  • Firewall/Intrusion Detection/Enterprise Management/VRF/Big Data Analytics
• Partner with and take advantage of DISA’s upgrade of DISN:
  • Global Infrastructure – 100Gb Fiber
  • All Installations = 10Gb connections
  • Major Installations = 20Gb connections
• Security Upgrades/Consolidation w/ Joint Regional Security Stacks [JRSS]
  • 25 Top-Level Architecture [TLA] Stacks (* future, now approx. 1000 stacks)

Page 18: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Benefits of Next Generation Army Network
• Big Data Support Down-Range [adding additional capabilities leveraging existing infrastructure]
  • Commercial Satellites
  • Wideband Satellites
  • Line-of-Sight Microwave
  • Distributed Nodes
  • 4G Wireless

Page 19: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Questions to Ask
• What Army programs are already in play / can we leverage?
• Collaborate with partners that have performed in the past [IGOV]
• Where is XVIIIth/Army in current program lifecycle?
• What is the key mission challenge(s) to solve?
• What if the XVIIIth had access to data it currently cannot access?
• What “other” data will enhance XVIIIth’s mission?
• What existing capabilities do we have now?
• What is the state of my data, the data I want to “predict” from?

Page 20: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Questions to Ask
• What do we get for the cost, what do we need? [Spend more $$$ for pre-packaged or build what you need]

Pre-Packaged End-to-End Solution
• Advantage: Requires fewer in-house resources & expertise
• Disadvantage: More $$$

A Platform for Building
• Advantage: Less $$$; more flexibility
• Disadvantage: Requires resources to build & customize
• Comment: More legwork, but you are not paying for stuff you don’t need

Page 21: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

How Do We Harness BDA?
• Turn data into actionable information – unified, timely information and predictions
• Help missions to have greater impact, consistency, transparency, reliability and effectiveness

Page 22: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Enterprise Data Management Components
• Cisco UCS
• Hadoop
• MapR
• Splunk

Page 23: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Cisco UCS Reference Architecture
• UCS C220/C240 M4 Servers
• Nexus 2232 Fabric Extenders
• UCS 6200 Series Fabric Interconnect

Page 24: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

MapR Technologies, Inc.
• The MapR Distribution combines the innovation of the Apache open source community, in which MapR participates, with MapR’s own data platform and management control system.

Page 25: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

MapR Improvements

Hadoop Distributed File System (HDFS)
• Many limitations with HDFS
• Java Virtual Machine (JVM) issues
• Single point-of-failure
• Read-and-append-only file system (not R/W)

MapR-FS
• Native NFS support – any application that can read/write to an NFS mount can plug into this architecture
• No single point-of-failure (C++)
• Application data is automatically replicated
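The practical meaning of the NFS point is that a program needs no Hadoop-specific client: ordinary file I/O against the mounted path is enough. A minimal sketch, assuming the cluster is already mounted somewhere on the host; the real mount path is site-specific and not given in the brief, so a temporary directory stands in for it here, and `write_and_read` and the file name are hypothetical.

```python
import tempfile
from pathlib import Path


def write_and_read(mount_point: Path, name: str, text: str) -> str:
    """Write a file under mount_point with plain file I/O and read it back."""
    target = mount_point / name
    target.write_text(text, encoding="utf-8")
    return target.read_text(encoding="utf-8")


# A temporary directory stands in for the NFS mount so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as stand_in_mount:
    echoed = write_and_read(Path(stand_in_mount), "sitrep.txt", "status: green\n")
```

Pointing `mount_point` at an actual NFS mount would exercise the same code path without any change to the application.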

Page 26: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

MapR Security
• Pluggable security model
• Linux pluggable authentication modules (PAMs)
• Kerberos is an option¹

¹ Not optimal for long-running jobs

Page 27: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

MapR – Zeta Application Architecture
• Simplifies
  • Data Protection Schemes
  • How to Backup Data
  • Failure Recovery
  • Running Multiple Instances of Software
• Better hardware utilization = lower OpEx
• Google runs on a Zeta Architecture (over 2 Billion container deployments/week)

Page 28: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

MapR and Hadoop
• MapR works with and adopts open-source community developments into an integrated solution offering
• Hadoop is a scalable centralized data hub / distribution solution
• Runs same problem on multiple computers
• Uses new more flexible tools and existing tools

Page 29: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Benefits of MapR and Hadoop
• Faster time-to-value
• Smaller hardware footprint (COTS hardware)
• Reliability
• Real snapshots for data versioning, data protection & mirroring (DR)
• Hadoop is a scalable centralized data hub
• Runs work/same problem on multiple computers
• Uses new more flexible tools and existing tools

Page 30: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

MapR
• “Traditional” Data Warehouse accepts:
  • SQL data
• MapR w/ Hadoop:
  • SQL
  • Machine-learned data
  • Video Analytic data
  • Relational Schemas
  • Files
  • Logs
  • Click Streams
  • Geo-Spatial data
  • Sentiment Analysis
  • WASP scanner data
  • KACE inventory data

Page 31: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Google’s Example

This architecture allows scaling up to Google’s level.

Page 32: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


MapR and Big Data Analytics in Action

Page 33: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


AADHAAR

In the course of attaining the milestone of 600 million users, the Aadhaar technology backend has become the largest biometric identity repository in the world and the first to provide an online, anytime anywhere, multi-factor authentication service. A strong technology foundation based on open architecture enabled the rapid evolution of the Aadhaar system. It was important to document all aspects of Aadhaar technology and make it available in public domain. The three white papers published by the UIDAI Technology Centre fulfill this need.

Page 34: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

AADHAAR Strategy

• A partnership model – The UIDAI approach leverages the existing infrastructure of government and private agencies across India. The UIDAI is the regulatory authority managing a Central Identity Data Repository (CIDR), which will issue Aadhaar numbers, update resident information, and authenticate the identity of residents as required. UIDAI partners with agencies such as central and state departments who are the 'Registrars' for the UIDAI. Registrars conduct the enrollment camps using UIDAI software and procedures, upload the encrypted enrolment data to the CIDR to de-duplicate resident information, and help seed the Aadhaar number into their beneficiary databases.

Page 35: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

AADHAAR Strategy

• Process to ensure no duplicates – Registrars send the applicant's encrypted data packet to the UIDAI data centers for de-duplication. The Aadhaar enrollment system performs a search on key demographic fields and on the biometrics for each new enrolment, to ensure uniqueness.
• Process to keep data up to date – Incentives in the Aadhaar system are aligned towards a self-cleaning mechanism. The existing patchwork of multiple databases in India gives individuals the incentive to provide different personal information to different agencies. Since de-duplication in the Aadhaar system ensures that residents have only one chance to be in the database, individuals are incentivized to provide accurate data. This incentive becomes especially powerful as benefits and entitlements are linked to the Aadhaar number. Regular usage of identity across many services naturally incentivizes the resident to keep the Aadhaar system up to date.
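The de-duplication step described on this slide (search on demographic fields and on biometrics before issuing a number) can be sketched in a few lines. This is a toy model, not the UIDAI implementation: real biometric matching is a fuzzy similarity search, whereas the hypothetical `biometric_id` field below is an exact-match stand-in, and `Enrolment`, `Registry`, and `demographic_key` are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Enrolment:
    name: str
    date_of_birth: str  # ISO date, e.g. "1990-01-31"
    biometric_id: str   # hypothetical exact-match stand-in for a biometric template


def demographic_key(e: Enrolment) -> Tuple[str, str]:
    """Normalize case and whitespace so trivially different spellings collide."""
    return (" ".join(e.name.lower().split()), e.date_of_birth)


@dataclass
class Registry:
    by_biometric: Dict[str, int] = field(default_factory=dict)
    by_demographic: Dict[Tuple[str, str], List[int]] = field(default_factory=dict)
    next_number: int = 1

    def possible_matches(self, e: Enrolment) -> List[int]:
        """Numbers already issued to residents with the same demographic key."""
        return list(self.by_demographic.get(demographic_key(e), []))

    def enrol(self, e: Enrolment) -> Optional[int]:
        """Issue a new number, or return None if the biometric is already enrolled."""
        if e.biometric_id in self.by_biometric:
            return None
        number = self.next_number
        self.next_number += 1
        self.by_biometric[e.biometric_id] = number
        self.by_demographic.setdefault(demographic_key(e), []).append(number)
        return number
```

Here a demographic collision only flags candidates for review, while a biometric collision blocks enrolment outright; that split is a design choice of the sketch, since two different residents can legitimately share a name and birth date.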

Page 36: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

AADHAAR Strategy

• Online authentication – UIDAI offers a strong form of online authentication. When residents wanting to avail a service require identity/address verification, agencies can compare demographic and biometric information of the resident with the record stored in the central database.
• Technology undergirds the UIDAI system – Technology systems have a major role across the UIDAI infrastructure. Large scale biometric de-duplication, online authentication, data security, analytics, etc. require well designed, secure, and scalable systems.

Page 37: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Splunk

Page 38: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Splunk, The Platform For Machine Data

[Figure: Splunk platform diagram. Recoverable labels follow.]
• Real-time machine data sources: GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security devices, Desktops, CDRs
• Platform functions: Ad hoc search; Monitor and alert; Report and analyze; Custom dashboards; Developer Platform
• External Lookups: Troop/Supply/Geo-Spatial Info; Network Segments / Honeypots; Datastores

Page 39: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Splunk App for Enterprise Security

Pre-built searches, alerts, reports, dashboards, threat intel feeds, workflow

[Screenshots: Incident Investigations & Management; Dashboards and Reports; Statistical Outliers; Asset and Identity Aware]

Page 40: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Splunk
• Provides a More Complete View of Threat Landscape
• US Army Authorized Splunk with Certificate of Networthiness (CoN)
• Real-time search and analysis of terabytes of data across the Army’s IT infrastructure.
• Patented Time-Series Indexing Technology (borrowed from MapReduce¹)
• The Army and approximately 70% of all US federal agencies rely on Splunk for real-time visibility of their IT data for security, compliance and application availability
• Splunk App for FISMA
  • Used by the DOJ and NASA

¹ Please reference Appendix for MapReduce information.

Page 41: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Splunk on SIPR
• Army Enterprise Certificate of Networthiness
  • Networthiness Certification applies to all organizations fielding, using, or managing ISs on the Army Enterprise Architecture/LandWarNet (LWN), to include Commercial Off-the-Shelf (COTS) and Government Off-the-Shelf (GOTS).
  • In accordance with AR 25-1, paragraph 6-8, activities must obtain a Certificate of Networthiness (CoN) before they connect hardware/software to the LWN.
  • Therefore, Splunk does not need to go through JITC.
• Over 50 U.S. Army customers using Splunk

Page 42: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Splunk FISMA App¹

¹ Based on NIST 800-53 Rev 3

Page 43: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Licensing Splunk
• Based on how much [new] data Splunk indexes in a 24-hr period
• Data is ingested into Splunk once and counts against the licensed volume; data indexing, manipulation and modeling thereafter is unlimited. There is only one charge per unit of data.
• Overages do not “turn off” your system.
• License Enforcement (30-day period)
  • 1st overage: message to admin
  • 2nd overage: message to admin
  • 3rd overage: message to admin
  • 4th overage: message to admin
  • 5th overage: correlator is turned off
  • Contact Splunk Account Manager or Systems Engineer to get a “reset key.” No charge; the intent is to spur a conversation between 18th Airborne and Splunk regarding capacity planning.
• Theoretically, can exceed license 48x/year (4 overages in each of 12 30-day periods) w/out contacting Splunk.
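The enforcement ladder and the 48x/year figure on this slide reduce to simple counting. The sketch below models only what the slide states; it is not Splunk's licensing code. The daily volumes, the 100 GB/day license size, and the helper names are hypothetical, and treating a year as twelve 30-day periods is an assumption made to reproduce the slide's arithmetic.

```python
from typing import List

OVERAGES_TOLERATED_PER_PERIOD = 4  # 1st through 4th overage: message to admin
PERIODS_PER_YEAR = 12              # assumption: twelve 30-day periods per year


def count_overages(daily_indexed_gb: List[float], licensed_gb_per_day: float) -> int:
    """Count the days on which indexed volume exceeded the daily license."""
    return sum(1 for gb in daily_indexed_gb if gb > licensed_gb_per_day)


def enforcement_state(overages_in_period: int) -> str:
    """Map an overage count within one 30-day period to the slide's outcome."""
    if overages_in_period == 0:
        return "within license"
    if overages_in_period <= OVERAGES_TOLERATED_PER_PERIOD:
        return "message to admin"
    return "reset key required"


# 4 tolerated overages per period x 12 periods = 48 per year.
max_overages_per_year = OVERAGES_TOLERATED_PER_PERIOD * PERIODS_PER_YEAR

usage_gb = [80, 120, 95, 130, 101, 60]   # hypothetical daily volumes
overages = count_overages(usage_gb, 100)  # hypothetical 100 GB/day license
```

For capacity planning, the useful signal is how close `overages` runs to the tolerated count in a typical period, since that is what triggers the conversation with the vendor.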

Page 44: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Leidos
• Information Dominance / Command and Control
• Consistently ranked among the top federal systems integration contractors, Leidos is the company that "pulls it all together" for U.S. forces and allies. As the lead integrator for the Global Command and Control System (GCCS), for example, we help give warfighters an integrated picture of the battlespace and commanders greater capability to deploy a U.S. fighting force around the globe at any time and provide it with the information and direction to complete its mission.

• As the military's key command, control, computing, communications, and intelligence (C4I) system, the GCCS uses the Defense Information Infrastructure Common Operating Environment (DII COE) to support joint warfighting needs. Helping to ensure that C4I maintains its pace with technology, Leidos leads several significant projects to bring leading edge DII COE-compliant technologies to the GCCS community. These include Defense Advanced Research Projects Agency (DARPA) efforts supporting senior levels of command, such as the National Command Authority and Joint Staff, down through Joint Task Force Commanders and service components, such as the Marine Corps' Chemical/Biological Warfare Incident Response Force.

Page 45: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Wrapping it Up…

Page 46: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

The Solution
• Cisco UCS is the hardware platform
• MapR – “Hadoop in a box” w/ Zeta architecture
• Hadoop [can be used to] provide the file system and programming platform.
• Splunk is the “search engine on steroids.”
• Splunk and MapR “make Hadoop easier.”
• Overcomes the Big Data skills gap

Page 47: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Use Cases

Combat & Command
• Timely & Applicable Intel for cutting orders
• Policies
• Blend troop movement w/ SitReps, historical intel, Sat images, UAV data and provide to commanders on a single pane of glass

Company
• True Numbers
• Data Validity
• Add Info

Page 48: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Wrapping it up…
• Big Data Analytics will enable the 18th Airborne Corps to gather, store, and analyze machine data efficiently and effectively to increase mission success and potentially save lives.
• Machine data such as UAV images, HUMINT and social media data are stored in a Cloudera Enterprise Hub, extracted into Splunk [or Hadoop], transformed in Splunk [or Hadoop], searched [by Splunk], and presented in a usable form that becomes actionable intelligence.

Page 49: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Thank You!

Page 50: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Additional Technologies
• LISP – Locator ID Separation Protocol
• “MAC in MAC” routing
• Natively supports Equal Cost Multi-Path routing (Dijkstra’s SPF algorithm)
• Alternative to running Spanning Tree
• IS-IS for Layer 2 switching – computes SPT
• Splits Locator info from Identifier
• Locator Endpoint ID Overlay

Page 51: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Locator/ID Split and LISP
• Addresses today combine location and identity semantics in a single 32-bit or 128-bit number
• Separating Location and Identity changes this…
  • Provide a clear separation at the Network Layer between what we are looking for vs. how best to get there
  • Translation vs. Tunneling is a key question
• Network Layer Identifier: WHO you are in the network
  • Long-term binding to the thing that it names; does not change often at all
• Network Layer Locator: WHERE you are in the network
  • Think of the source and destination “addresses” used in routing and forwarding
• WHERE you are can change; WHO you are should stay the same
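The identifier/locator split can be pictured as a lookup table from a stable identifier to a changeable locator. The sketch below is a toy illustration of that idea only, not the LISP control plane: `MappingSystem` and its methods are hypothetical names, and the addresses come from the IP ranges reserved for documentation.

```python
from typing import Dict, Optional


class MappingSystem:
    """Toy identifier-to-locator table: WHO (identifier) maps to WHERE (locator)."""

    def __init__(self) -> None:
        self._id_to_locator: Dict[str, str] = {}

    def register(self, identifier: str, locator: str) -> None:
        """Record, or update after a move, where an identifier is reachable."""
        self._id_to_locator[identifier] = locator

    def lookup(self, identifier: str) -> Optional[str]:
        """Return the current locator for an identifier, or None if unknown."""
        return self._id_to_locator.get(identifier)


mapping = MappingSystem()
host_id = "192.0.2.10"                      # WHO: stays the same
mapping.register(host_id, "198.51.100.1")   # WHERE: first attachment point
before_move = mapping.lookup(host_id)
mapping.register(host_id, "203.0.113.1")    # host moves; only the locator changes
after_move = mapping.lookup(host_id)
```

Peers keep addressing the host by `host_id` throughout; only the table entry changes when the host moves, which is the property a deployed team relocating between sites would rely on.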

Page 52: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Appendix

Page 53: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Training Requirements

Page 54: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Training Recommendations - Splunk

Function | Course Title | Education/Experience | Delivery | Duration (hours)
Administrator | For Splunk Administrators | Some college preferred/some network administration experience preferred | eLearning/WBT/Instructor-led | 51.5
Architect | For Splunk Architects | Associates Degree, Network/Programming experience | eLearning/WBT/Instructor-led | 78.5
Info Sec | For Enterprise Security Customers | Associates Degree, Network/Security/Programming experience | eLearning/WBT/Instructor-led | 83.5

Page 55: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Training Recommendations - MapR

Function | Course Title | Education/Experience | Delivery | Duration (hours)
Administrator | | Some college preferred/some network administration experience preferred | eLearning/WBT/Instructor-led |
Architect | | Associates Degree, Network/Programming experience | eLearning/WBT/Instructor-led |
Info Sec | | Associates Degree, Network/Security/Programming experience | eLearning/WBT/Instructor-led |

Page 56: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Page 57: 18th Airborne Big Data Analytics Tech Brief_June 2 2015


Digital Forensic Tools

Page 58: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Digital Forensic Tools
• Dshell – Army internal analysis framework using Python running on Linux.
  • Purpose – help analysts investigate compromises within their environments
• Cisco’s OpenSOC Security Analytics Framework.
  • Designed to consume and monitor massive amounts of network traffic and machine “exhaust” data of a Data Center.
  • Network analysis plug-in available to analyze network traffic at multiple layers of the OSI stack.

Page 59: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Digital Forensic Tools
• AccessData’s Forensic Toolkit (FTK)
  • Support for Microsoft’s Volume Shadow Copy (VSC)
  • Retrieve metadata for deleted files
  • Chronology of how documents, user activity, programs changed over time
  • Geomapping – data visualization feature

Use Case:
• Retrieving information after a disk has been wiped clean by an anti-forensics tool
• After cleaning, the HD showed no evidence of the proprietary data
• Examining VSCs allowed recovery of destroyed Registry files that proved the proprietary data had been accessed

Page 60: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Hadoop’s MapReduce Technology
• Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
• A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system. The framework takes care of scheduling tasks, monitoring them, and re-executing the failed tasks.
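The map, sort, and reduce stages described above can be shown in miniature with a word count. This is a single-process sketch of the programming model in plain Python, not Hadoop's API: the function names are illustrative, and a real job would run the map and reduce tasks in parallel across cluster nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(chunk: str) -> Iterator[Tuple[str, int]]:
    """Map task: emit (word, 1) for every word in one independent input chunk."""
    for word in chunk.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Framework step: sort the map outputs and group the values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in sorted(pairs):
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce task: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(chunks: Iterable[str]) -> Dict[str, int]:
    """Run map over every chunk, shuffle the results, then reduce per key."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because each chunk is mapped independently and each key is reduced independently, both stages can be spread over many machines, which is where the framework's scheduling and re-execution of failed tasks come in.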

Page 61: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Hadoop
• A Distributed, Fault-Tolerant Framework for Storing and Analyzing Data.

Composed of:
1. Hadoop Distributed File System (HDFS)
2. MapReduce Application Engine / Programming Framework
  • Allows code to be written that Hadoop can process in a massively parallel way.

• Very broadly distributed, very efficient programming and storage of LARGE datasets.
• Hadoop does the heavy-lifting and batch processing of the MASSIVE amounts of data.

Page 62: 18th Airborne Big Data Analytics Tech Brief_June 2 2015

Big Data Extract, Transform & Load (ETL)

[Figure: ETL diagram. Labeled data sources: KACE Inventory Data; WASP Scanner Data]