splunk 6.0.2 installation

Splunk Enterprise 6.0.2 Installation Manual Generated: 3/11/2014 2:55 pm Copyright (c) 2014 Splunk Inc. All Rights Reserved

Upload: marcio-willian

Post on 23-Nov-2015


  • Table of Contents

    Welcome to the Splunk Enterprise Installation Manual
        What's in this manual

    Plan your Splunk Enterprise installation
        Installation overview
        System requirements
        Components of a Splunk Enterprise deployment
        Estimate your storage requirements
        Splunk architecture and processes
        Information on Windows third-party binaries distributed with Splunk
        Step-by-step installation instructions

    Secure your Splunk Enterprise installation
        About securing Splunk
        Secure your system before you install Splunk
        Install Splunk securely
        More ways to secure Splunk

    Estimate hardware requirements
        Hardware capacity planning for your Splunk Enterprise deployment
        How incoming data affects Splunk Enterprise performance
        How indexed data impacts Splunk Enterprise performance
        How the number of concurrent users impacts Splunk Enterprise performance
        How saved searches affect Splunk Enterprise performance
        How search types impact Splunk Enterprise performance
        How Splunk apps affect Splunk Enterprise performance
        How Splunk Enterprise calculates disk storage
        Reference hardware
        Performance questionnaire
        Summary of performance recommendations

    Install Splunk Enterprise on Windows
        Choose the Windows user Splunk Enterprise should run as
        Prepare your Windows network for a Splunk Enterprise installation as a network or domain user
        Install on Windows
        Install on Windows via the command line
        Correct the user selected during Windows installation

    Install Splunk Enterprise on Unix, Linux or Mac OS X
        Install on Linux
        Install on Solaris
        Install on Mac OS X
        Install on FreeBSD
        Install on AIX
        Install on HP-UX
        Run Splunk Enterprise as a different or non-root user

    Start using Splunk Enterprise
        Start Splunk for the first time
        What happens next?
        Learn about Splunk's accessibility

    Install a Splunk Enterprise license
        About Splunk licenses
        Install a license

    Upgrade or migrate Splunk Enterprise
        How to upgrade Splunk
        About Upgrading to 6.0 - READ THIS FIRST
        How Splunk Web procedures have changed from version 5 to version 6
        Changes for Splunk App developers
        Upgrade to 6.0 on UNIX
        Upgrade to 6.0 on Windows
        Migrate a Splunk Enterprise instance
        Migrate to the new Splunk licenser

    Uninstall Splunk Enterprise
        Uninstall Splunk Enterprise

    Reference
        PGP Public Key

  • Welcome to the Splunk Enterprise Installation Manual

    What's in this manual

    Use the Installation Manual to learn how to install Splunk Enterprise.

    In this manual, you can find:

    System requirements
    Licensing information
    Procedures for installing
    Procedures for upgrading from a previous version

    ...and more.

    Note: If you want to install the Splunk universal forwarder, read "Universal forwarder deployment overview" in the Forwarding Data Manual. Unlike Splunk heavy and light forwarders, which are full Splunk Enterprise instances with some features changed or disabled, the universal forwarder is an entirely separate executable, with its own set of installation procedures. For an introduction to forwarders, see "About forwarding and receiving".

    Find what you need

    You can use the table of contents to the left of this panel, or simply search for what you want in the search box in the upper right.

    If you're interested in more specific scenarios and best practices, you can visit the Splunk Community Wiki to see how other users Splunk IT.

    Make a PDF

    If you'd like a PDF of any version of this manual, click the red Download as PDF link below the table of contents on the left side of this page. A PDF version of the manual is generated on the fly for you, and you can save it or print it out to read later.


  • Plan your Splunk Enterprise installation

    Installation overview

    This topic discusses the basic steps required to install Splunk Enterprise on a computer. We strongly suggest that you read this topic and the rest of this chapter before performing an installation.

    Installation basics

    The following list provides general guidance on how to install Splunk:

    1. Review the system requirements for installation. Specific additional requirements might apply based on the operating system you install Splunk on, and how you plan to use Splunk.

    2. Read "Components of a Splunk deployment" to learn about the Splunk Enterprise ecosystem, and "Splunk architecture and processes" to learn what the Splunk installer puts on your computer.

    3. Review this manual's chapter on securing your Splunk Enterprise instance and, where appropriate, secure the server(s) on which you plan to install Splunk.

    4. Download the correct installation package for your system from the Splunk Enterprise download page.

    5. Perform the installation using the step-by-step installation instructions for your operating system.

    6. If this is the first time you have installed Splunk Enterprise, you might want to consider reading the Splunk Search Tutorial to learn how to index data into Splunk and search that data using the Splunk Enterprise search language.

    7. After you've installed Splunk Enterprise, you can calculate how much space you need to index your data. Read "Estimate your storage requirements" for additional information.

    8. If you plan to run Splunk in a production environment, review "Hardware capacity planning for your Splunk Enterprise deployment" in this manual for insight into the amount of hardware a Splunk deployment requires.


  • Upgrading or migrating a Splunk instance?

    If you're upgrading from an earlier version of Splunk Enterprise, read "How to upgrade Splunk Enterprise" in this manual for information and specific instructions. For tips on migrating from one specific version to another, read the "READ THIS FIRST" topic for the version you want to upgrade to. This topic is in the "Upgrade or Migrate Splunk Enterprise" chapter.

    If you want to know how to migrate a Splunk Enterprise instance from one system to another, read "Migrate a Splunk instance" in this manual.

    System requirements

    Before you download and install the Splunk software, read this topic to learn which computing environments Splunk supports.

    Refer to the download page for the latest version to download. Check the release notes for details on known and resolved issues.

    For a discussion of hardware planning for deployment, review "Hardware capacity planning for your Splunk deployment" in this manual.

    If you have ideas or requests for new features to add to future releases, get in touch with Splunk Support. You can also review our product road map.

    Supported OSes

    Important: Read the following tables carefully when researching the system requirements. Splunk availability has changed significantly from previous versions.

    The tables below list the computing platforms that Splunk is available for.

    To find out whether or not Splunk is available for your platform:

    1. Find the operating system you wish to install Splunk on in the left column.

    2. Then, read across to find the appropriate computing architecture in the center column that best matches your environment.


  • The tables show availability for two different types of Splunk, as shown in the two columns on the right: Splunk Enterprise/Trial, and Splunk Universal Forwarder. An 'x' in the box that intersects your computing platform and desired Splunk type means that Splunk is available for that platform. An empty box means that Splunk is not available for that platform.

    Some boxes have characters in addition to - or instead of - an 'x'. Refer to the bottom of the tables to find out what the additional characters represent.

    Unix operating systems

    Operating system    Architecture    Enterprise / Trial    Universal Forwarder

    Solaris 8* and 9
        x86 (64-bit): x
        SPARC: x x
        x86 (32-bit): x*

    Solaris 10 and 11*
        x86 (64-bit): x* x*
        SPARC: x x
        x86 (32-bit): x* x*

    Linux, 2.4+ with Native POSIX Thread Library
        x86 (64-bit):
        x86 (32-bit): x

    Linux, 2.6+
        x86 (64-bit): x x
        x86 (32-bit): x x

    Linux, 3.0+
        x86 (64-bit): x x
        x86 (32-bit): x x

    PowerLinux, 2.6+
        PowerPC: x

    FreeBSD 7**, 8, and 9
        x86 (64-bit): x x
        x86 (32-bit): x x

    Mac OS X 10.7, 10.8, and 10.9
        Intel: x x

    AIX 5.3
        PowerPC: x

    AIX 6.1 and 7.1
        PowerPC: x x

    HP/UX† 11i v2 and 11i v3
        Itanium: x x
        PA-RISC: x

    * Solaris 8 does not support 64-bit Splunk installs. Also, Solaris 11 does not support 32-bit Splunk installs.
    ** Be sure to read important notes on FreeBSD 7 below.
    † You must use gnu tar to unpack the HP/UX installation archive.

    Windows operating systems

    The table below lists the Windows computing platforms that Splunk is available for.

    Operating system    Architecture    Enterprise / Trial    Universal Forwarder

    Windows Server 2003 and Server 2003 R2
        x86 (64-bit): x x
        x86 (32-bit): x*** x***

    Windows Server 2008 and Server 2008 R2
        x86 (64-bit): x x
        x86 (32-bit): x*** x***

    Windows Server 2012
        x86 (64-bit): x x

    Windows XP
        x86 (64-bit): x
        x86 (32-bit): x***

    Windows Vista
        x86 (64-bit): x
        x86 (32-bit): x***

    Windows 7
        x86 (64-bit): x x
        x86 (32-bit): x*** x***

    Windows 8
        x86 (64-bit): x x
        x86 (32-bit): x x

    *** This version of Splunk is supported but is not recommended on this platform and architecture.
    Splunk Enterprise is not available on this platform. However, Splunk Trial and Splunk Universal Forwarder are available.


  • Operating system notes and additional information

    Windows

    Certain parts of Splunk on Windows require elevated user permissions to function properly. For additional information about what is required, read the following topics:

    "Splunk architecture and processes" in this manual.
    "Choose the user Splunk should run as" in this manual.
    "Considerations for deciding how to monitor remote Windows data" in the Getting Data In Manual.

    FreeBSD 7.x

    To run Splunk 6.x on 32-bit FreeBSD 7.x, install the compat6x libraries. Splunk Support will supply "best effort" support for users running on FreeBSD 7.x. For more information, refer to "Install Splunk on FreeBSD 7" in the Community Wiki.

    Deprecated operating systems and features

    As we continue to version the Splunk product, we gradually deprecate support of older operating systems. Be sure to read "Deprecated features" in the Release Notes for information on which platforms and features have been deprecated or removed entirely.

    Creating and editing configuration files on non-UTF-8 OSes

    Splunk expects configuration files to be in ASCII or Universal Character Set Transformation Format-8-bit (UTF-8) format. If you edit or create a configuration file on an OS that does not use UTF-8 character set encoding, then you must ensure that the editor you are using is configured to save in ASCII/UTF-8.
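    One way to confirm that an editor saved a configuration file in an acceptable encoding is to try decoding its bytes as UTF-8. The following is a minimal sketch, not part of Splunk; the function name is_utf8 is hypothetical, and the check relies only on the fact that ASCII is a subset of UTF-8.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8.

    Pure ASCII is a subset of UTF-8, so ASCII-only files also pass.
    """
    try:
        Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

    A file saved in a legacy single-byte encoding usually fails this check as soon as it contains an accented or other non-ASCII character.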

    IPv6 platform support

    All Splunk-supported OS platforms are supported for use with IPv6 configurations except for the following:

    AIX
    HP/UX on PA-RISC architecture
    Solaris 9


  • Refer to "Configure Splunk for IPv6" in the Admin Manual for details on Splunk IPv6 support.

    Supported browsers

    Splunk supports the following browsers:

    Firefox 10.x and latest
    Internet Explorer 7, 8, 9, and 10
    Safari (latest)
    Chrome (latest)

    You should also make sure you have the latest version of Adobe Flash installed to render any charts that use options not supported by the JSChart module. For more information about this subject, see "About JSChart" in the Splunk Data Visualizations Manual.

    Recommended hardware

    Splunk is a high-performance application. If you are performing a comprehensive evaluation of Splunk for production deployment, we recommend that you use hardware typical of your production environment. This hardware should meet or exceed the recommended hardware capacity specifications below.

    For a discussion of hardware planning for production deployment, see "Hardware capacity planning for your Splunk deployment" in this manual.

    Splunk and virtual machines

    If you run Splunk in a virtual machine (VM) on any platform, performance degrades. This is because virtualization works by abstracting the hardware on a system into resource pools from which VMs defined on the system draw as needed. Splunk needs sustained access to a number of resources, particularly disk I/O, for indexing operations. Running Splunk in a VM or alongside other VMs can cause reduced indexing performance.

    Recommended and minimum hardware capacity

    Non-Windows platforms
        Recommended hardware capacity/configuration: 2x six-core, 2+ GHz CPU, 12 GB RAM, Redundant Array of Independent Disks (RAID) 0 or 1+0, with a 64-bit OS installed.
        Minimum supported hardware capacity: 1x 1.4 GHz CPU, 1 GB RAM

    Windows platforms
        Recommended hardware capacity/configuration: 2x six-core, 2+ GHz CPU, 12 GB RAM, RAID 0 or 1+0, with a 64-bit OS installed.
        Minimum supported hardware capacity: Pentium 4 or equivalent at 2 GHz, 2 GB RAM

    Note: RAID 0 configurations do not provide fault-tolerance. Be certain that a RAID 0 configuration meets your data reliability needs before deploying a Splunk indexer on a system configured with RAID 0.

    All configurations other than universal and light forwarder instances require at least the recommended hardware configuration.

    The minimum supported hardware guidelines are designed for personal use of Splunk. The requirements for Splunk in a production environment are significantly higher.

    Important: For all installations, including forwarders, you must have a minimum of 5 GB of hard disk space available in addition to the space required for any indexes. Refer to "Estimate your storage requirements" in this manual for additional information.
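    The disk space rule above can be checked before installing. The sketch below is illustrative only: the function names are hypothetical, and the 5 GB figure is interpreted as 5 * 1024**3 bytes, which is an assumption about the manual's units.

```python
import shutil

# The manual's 5 GB minimum, interpreted here as 5 * 1024**3 bytes (assumption).
MIN_FREE_BYTES = 5 * 1024**3


def enough_space(free_bytes, index_bytes=0):
    """True if free space covers the 5 GB minimum plus the planned index size."""
    return free_bytes >= MIN_FREE_BYTES + index_bytes


def volume_has_enough_space(path, index_bytes=0):
    """Apply the same check to the volume that holds `path`."""
    return enough_space(shutil.disk_usage(path).free, index_bytes)
```

    Pass the planned installation directory as path and an index size estimate as index_bytes.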

    Hardware requirements for universal and light forwarders

    Recommended: Dual-core 1.5 GHz+ processor, 1 GB+ RAM
    Minimum: 1.0 GHz processor, 512 MB RAM

    Supported file systems

    Platform: File systems
    Linux: ext2/3/4, reiser3, XFS, NFS 3/4
    Solaris: UFS, ZFS, VXFS, NFS 3/4
    FreeBSD: FFS, UFS, NFS 3/4, ZFS
    Mac OS X: HFS, NFS 3/4
    AIX: JFS, JFS2, NFS 3/4
    HP-UX: VXFS, NFS 3/4
    Windows: NTFS, FAT32

    Note: If you run Splunk on a filesystem that is not listed above, Splunk might run a startup utility named locktest to test the viability of a filesystem for running Splunk. Locktest is a program that tests the start up process. If locktest runs and fails, then the filesystem is not suitable for running Splunk.

    Considerations regarding file descriptor limits (FDs) on *nix systems

    Splunk allocates file descriptors on *nix systems for actively monitored files, forwarder connections, deployment clients, users running searches, and so on.

    Usually, the default file descriptor limit (controlled by the ulimit command on a *nix-based OS) is 1024. Your Splunk administrator should determine the correct level, but it should be at least 8192. Even if Splunk allocates just a single file descriptor for each of the activities above, it's easy to see how a few hundred monitored files, a few hundred forwarders sending data, and a handful of very active users, on top of reading from and writing to the datastore, can exhaust the default setting.

    The more tasks your Splunk instance is doing, the more FDs it will need, so you should increase the ulimit value if you start to see your instance run into problems with low FD limits.

    For more information, read about ulimit in the Troubleshooting Manual.

    This consideration is not applicable to Windows-based systems.
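    The limit can also be inspected programmatically. The following sketch uses Python's standard resource module (available on *nix only) to compare the file descriptor limit against the 8192 figure above. It reports the limit of the process that runs it, which matches splunkd only if both are started by the same user in the same environment; the function name is hypothetical.

```python
import resource

# Minimum file descriptor limit suggested in the text above.
RECOMMENDED_MIN_FDS = 8192


def fd_limit_report(minimum=RECOMMENDED_MIN_FDS):
    """Return the soft and hard open-file limits and whether the soft limit is enough."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit"; it can be a negative number on some systems,
    # so test for it explicitly before comparing.
    ok = soft == resource.RLIM_INFINITY or soft >= minimum
    return {"soft": soft, "hard": hard, "meets_minimum": ok}
```

    The soft limit is the one enforced at run time; it can be raised up to the hard limit without extra privileges.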

    Considerations regarding Network File System (NFS)

    When choosing to use Network File System (NFS) as a storage medium for Splunk indexing, it is important to consider all of the ramifications of file level storage.

    Splunk strongly recommends that you use block level storage rather than file level storage for indexing your data.

    In environments with reliable, very high-bandwidth low-latency links, or with vendors that provide high-availability, clustered network storage, NFS can be an appropriate choice. However, customers who plan to choose this strategy should work closely with their hardware vendor to confirm that the storage platform they choose performs to the desired specification in terms of both performance and data integrity.

    If you choose to use NFS, note the following caveats:

    Splunk does not support "soft" NFS mounts (mounts which cause a program attempting a file operation on the mount to report an error and continue in case of a failure). Only "hard" NFS mounts (mounts where the client continues to attempt to contact the server in case of a failure) are reliable with Splunk.

    Do not disable attribute caching. If you have other applications which require disabling or reducing attribute caching, then you must provide Splunk a separate mount with attribute caching enabled.

    Do not use NFS mounts over a wide area network (WAN). Doing so causes performance issues and can potentially lead to data loss.
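    The first two caveats can be checked against the mount options the operating system reports. The sketch below assumes a Linux host, where each line of /proc/mounts has the form "device mountpoint fstype options dump pass", and looks for the standard NFS options soft and noac. The function name is hypothetical, and the check does not cover reduced (rather than disabled) attribute caching or WAN links.

```python
def nfs_mount_problems(mount_line):
    """Return a list of problems found in one /proc/mounts-style line.

    Lines for non-NFS filesystems return an empty list.
    """
    fields = mount_line.split()
    if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
        return []
    options = set(fields[3].split(","))
    problems = []
    if "soft" in options:
        problems.append("soft mount")
    if "noac" in options:
        problems.append("attribute caching disabled (noac)")
    return problems
```

    To scan a live system, call the function on each line read from /proc/mounts and review any mount that returns a non-empty list.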

    Considerations regarding solid state drives

    Solid state drives (SSDs) deliver significant performance gains over conventional hard drives for Splunk in "rare" searches - searches that request small sets of results over large swaths of data - when used in combination with bloom filters. They also deliver performance gains with concurrent searches overall.

    Supported server hardware architectures

    32 and 64-bit architectures are supported for some platforms. See the download page for details.

    Components of a Splunk Enterprise deployment

    By using a single software component and easy to understand configurations, Splunk Enterprise can coexist with existing infrastructure or be deployed as a universal platform for accessing IT data.

    The simplest deployment is the one you get by default when you install Splunk: indexing and searching on the same server. You log into Splunk Web or the CLI on the server and configure data inputs to collect machine data. You then use the same server to search, monitor, alert, and report on the incoming data.

    Depending on your needs, you can also deploy components of Splunk on different servers to address your load and availability requirements. This section introduces the types of components. For a more thorough introduction, see the Distributed Deployment manual, particularly the topic, "Scale your deployment: Splunk components".


  • Indexer

    Splunk indexers provide indexing capability for local and remote data and host the primary Splunk datastore. Refer to "How indexing works" in the Managing Indexers and Clusters manual for more information.

    Search head

    A search head is a Splunk instance configured to distribute searches to indexers (referred to as "search peers" in this context). Search heads can be either dedicated or not, depending on whether they also perform indexing. Dedicated search heads don't have any indexes of their own (other than the usual internal indexes). Instead, they consolidate and display results that originate from remote search peers.

    See "What is distributed search" in the Distributed Search Manual to configure a search head to search across a pool of indexers.

    Forwarder

    Forwarders are Splunk instances that forward data to remote indexers for indexing and storage. In most cases, they do not index data themselves. Refer to the "About forwarding and receiving" topic in the Forwarding Data manual.

    Deployment server

    A Splunk instance can also serve as a deployment server. The deployment server is a tool for distributing configurations, apps, and content updates to groups of Splunk Enterprise instances. You can use it to distribute updates to most types of Splunk Enterprise components: forwarders, non-clustered indexers, and search heads. Refer to "About deployment server and forwarder management" in the Updating Splunk Enterprise Instances manual for additional information.

    Functions at a glance

    Functions    Indexer    Search head    Forwarder    Deployment server

    Indexing: x
    Web: x
    Direct search: x
    Forward to indexer: x
    Deploy configurations: x x x

    Index replication and clusters

    A cluster is a group of indexers configured to replicate each other's data, so that the system keeps multiple copies of all data. This process is known as index replication. By maintaining multiple, identical copies of data, clusters prevent data loss while promoting data availability for searching.

    Splunk Enterprise clusters feature automatic failover from one indexer to the next. This means that, if one or more indexers fail, incoming data continues to get indexed and indexed data continues to be searchable.

    Besides enhancing data availability, clusters have other key features that you should consider when you're scaling a deployment. For example, they include a capability to coordinate configuration updates easily across all indexers in the cluster. They also include a built-in distributed search capability. For more information on clusters, see "About clusters and index replication" in the Managing Indexers and Clusters manual.

    Estimate your storage requirements

    This topic describes how to estimate the size of your Splunk Enterprise index, so that you can plan your storage capacity requirements.

    When Splunk Enterprise indexes your data, it creates two main types of files: the "rawdata" file that contains the original data in compressed form and the index files that point to this data. (It also creates a few metadata files, which don't consume much space.) With a little experimentation, you can estimate how much index disk space you will need for a given amount of incoming data.

    Typically, the compressed rawdata file is 10% the size of the incoming, pre-indexed raw data. The associated index files range in size from approximately 10% to 110% of the rawdata file. The number of unique terms in the data affects this value.
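    The rule of thumb above can be turned into a quick range calculation. This is a sketch of the arithmetic only; the function name and parameter names are hypothetical, and the default ratios are the typical figures quoted above, not measurements of any particular data set.

```python
def estimate_index_size(raw_size, rawdata_ratio=0.10, index_ratio_range=(0.10, 1.10)):
    """Estimate on-disk size for `raw_size` units of incoming raw data.

    Returns (rawdata, low_total, high_total) in the same units as raw_size:
    the compressed rawdata size, and the smallest and largest expected
    totals once the index files are added.
    """
    rawdata = raw_size * rawdata_ratio
    low_total = rawdata * (1 + index_ratio_range[0])
    high_total = rawdata * (1 + index_ratio_range[1])
    return rawdata, low_total, high_total
```

    For 100 GB of incoming data, this gives about 10 GB of rawdata plus 1 to 11 GB of index files, or roughly 11 to 21 GB in total. Indexing a representative sample, as described next, gives a more reliable ratio for your own data.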

    Depending on the data's characteristics, you might want to tune your segmentation settings, as described in "About segmentation" in the Getting Data In Manual.

    The best way to get an idea of your space needs is to experiment by indexing a representative sample of your data, and then checking the sizes of the resulting directories in defaultdb.

    On *nix systems, follow these steps

    Once you've indexed your data sample:

    1. Go to $SPLUNK_HOME/var/lib/splunk/defaultdb/db.

    2. Run du -ch hot_v* and look at the last total line to see the size of the index.
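    The same measurement can be scripted. The sketch below walks every hot_v* directory under a given db directory and sums file sizes; the function name is hypothetical. It adds up apparent file sizes, whereas du reports disk blocks used, so the two totals can differ slightly.

```python
import os
from pathlib import Path


def bucket_sizes(db_dir, pattern="hot_v*"):
    """Map each matching bucket directory name to the total size of its files in bytes."""
    sizes = {}
    for bucket in sorted(Path(db_dir).glob(pattern)):
        if not bucket.is_dir():
            continue
        total = 0
        # Walk the bucket recursively, like du does.
        for root, _dirs, files in os.walk(bucket):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        sizes[bucket.name] = total
    return sizes
```

    Calling sum(bucket_sizes(db_dir).values()) on the defaultdb/db directory gives a figure comparable to the total line printed by du -ch.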

    On Windows systems, follow these steps

    1. Download the du utility from Microsoft TechNet.

    2. Extract du.exe from the downloaded ZIP file and place it into your %SYSTEMROOT% or %WINDIR% folder.

    Note: You can also place it anywhere in your %PATH%.

    3. Open a command prompt.

    4. Once there, go to %SPLUNK_HOME%\var\lib\splunk\defaultdb\db.

    5. Run del %TEMP%\du.txt & for /d %i in (hot_v*) do du -q -u %i\rawdata | findstr /b "Size:" >> %TEMP%\du.txt.

    6. Open the %TEMP%\du.txt file. You will see Size: n, which is the size of each rawdata directory found.

    7. Add these numbers together to find out how large the compressed persisted raw data is.

    8. Next, run for /d %i in (hot_v*) do dir /s %i, the summary of which is the size of the index.

    9. Add this number to the total persistent raw data number.

    This is the total size of the index and associated data for the sample you have indexed. You can now use this to extrapolate the size requirements of your Splunk index and rawdata directories over time.
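    Step 7 of the Windows procedure (adding up the Size: lines) can be automated. The sketch below assumes each collected line starts with "Size:" followed by a number that may contain comma thousands separators; the exact layout of the du output is an assumption, and the function name is hypothetical.

```python
import re

# Matches lines such as "Size:   1,024 bytes" (assumed format of the du output).
SIZE_LINE = re.compile(r"^Size:\s*(\d[\d,]*)")


def total_rawdata_size(du_text):
    """Sum the numbers on every line that begins with 'Size:'."""
    total = 0
    for line in du_text.splitlines():
        match = SIZE_LINE.match(line)
        if match:
            total += int(match.group(1).replace(",", ""))
    return total
```

    Read the contents of %TEMP%\du.txt into a string and pass it to the function; the result is in whatever unit du reported.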

    Answers

    Have questions? Visit Splunk Answers to see what questions and answers other Splunk users had about data sizing.

    Splunk architecture and processes

    This topic discusses Splunk's internal architecture and processes at a high level. If you're looking for information about third-party components used in Splunk, refer to the credits section in the Release notes.

    Processes

    A Splunk server runs two processes (installed as services on Windows systems) on your host, splunkd and splunkweb:

    splunkd is a distributed C/C++ server that accesses, processes and indexes streaming IT data. It also handles search requests. splunkd processes and indexes your data by streaming it through a series of pipelines, each made up of a series of processors.

    Pipelines are single threads inside the splunkd process, each configured with a single snippet of XML.

    Processors are individual, reusable C or C++ functions that act on the stream of IT data passing through a pipeline. Pipelines can pass data to one another via queues. splunkd supports a command line interface for searching and viewing results.

    splunkweb is a Python-based application server based on CherryPy that provides the Splunk Web user interface. It allows users to search and navigate data stored by Splunk servers and to manage your Splunk deployment through a Web interface.

    splunkweb and splunkd can both communicate with your Web browser via Representational State Transfer (REST):

    splunkd also runs a Web server on port 8089 with SSL/HTTPS turned on by default.

    splunkweb runs a Web server on port 8000 without SSL/HTTPS by default.
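    After installation, a quick way to confirm that both processes are up is to see whether their default ports accept connections. The sketch below builds the two default base URLs from the port and protocol defaults listed above and offers a plain TCP reachability check; the function names are hypothetical, and the ports differ if you change them during or after installation.

```python
import socket


def default_endpoints(host="localhost"):
    """Base URLs for splunkd (HTTPS, 8089) and splunkweb (HTTP, 8000) using the defaults."""
    return {
        "splunkd": f"https://{host}:8089",
        "splunkweb": f"http://{host}:8000",
    }


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

    For example, is_listening("localhost", 8089) returning False shortly after startup suggests that splunkd is not running or is using a non-default management port.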


  • On Windows systems, splunkweb.exe is a third-party, open-source executable that Splunk renames from pythonservice.exe. Since it is a renamed file, it does not contain the same file version information as other Splunk for Windows binaries.

    Read information on other Windows third-party binaries distributed with Splunk.

    Splunk and Windows in Safe Mode

    The splunkd, splunkweb, and SplunkForwarder services do not start if Windows is in Safe Mode. Additionally, if you attempt to start Splunk from the Start Menu while in Safe Mode, Splunk does not alert you to the fact that its services are not running.

    Additional processes for Splunk on Windows

    On Windows instances of Splunk, in addition to the two services described above, there are additional processes that Splunk uses when you create specific data inputs on a Splunk instance. These scripted inputs run when configured by certain types of Windows-specific data input.

    splunk.exe

    splunk.exe is the control application for the Windows version of Splunk. It provides the command line interface (CLI) for the program, and allows you to start, stop, and configure Splunk, similar to the *nix splunk program.

    Important: splunk.exe requires an elevated context to run because of how it controls the splunkd and splunkweb processes. Splunk might not function correctly if this executable is not given the appropriate permissions on your Windows system. This is not an issue if you install Splunk as the Local System user.

    splunk-admon

    splunk-admon.exe is spawned by splunkd whenever you configure an Active Directory (AD) monitoring input. splunk-admon's purpose is to attach to the nearest available AD domain controller and gather change events generated by AD. Splunk then stores these events in the desired index.


  • splunk-perfmon

    splunk-perfmon.exe runs when you configure Splunk to monitor performance data on the local machine. This service attaches to the Performance Data Helper libraries, which query the performance libraries on the system and extract performance metrics both instantaneously and over time.

    splunk-netmon

    splunk-netmon (new for version 6.0) runs when you configure Splunk to monitor Windows network information on the local machine.

    splunk-regmon

    splunk-regmon.exe runs when you configure a Registry monitoring input in Splunk. This scripted input initially writes a baseline for the Registry as it currently exists (if desired), then monitors changes to the Registry over time. Those changes come back into Splunk as searchable events.

    splunk-winevtlog

    You can use this utility to test defined event log collections, and it outputs events as they are collected for investigation. Splunk has a Windows event log input processor built into the engine.

    splunk-winhostmon

    splunk-winhostmon (new for version 6.0) runs when you configure a Windows host monitoring input in Splunk. This scripted input gets detailed information about Windows hosts.

    splunk-winprintmon

    splunk-winprintmon (new for version 6.0) runs when you configure a Windows print monitoring input in Splunk. This scripted input gets detailed information about Windows printers and print jobs on the local system.

    splunk-wmi

    When you configure a performance monitoring, event log or other input against aremote computer, this program starts up. Depending on how you configure theinput, either it attempts to attach to and read Windows event logs as they comeover the wire, or it executes a Windows Query Language (WQL) query against

    16

  • the Windows Management Instrumentation (WMI) provider on the specifiedremote machine(s). Splunk then stores the events.

    Architecture diagram

    Information on Windows third-party binaries distributed with Splunk

    This topic provides additional information on the third-party Windows binaries that the Splunk Enterprise and the Splunk universal forwarder packages include.

    For more information about Splunk's universal forwarder, read "Deploy the universal forwarder" in the Forwarding Data Manual.

    Third-party Windows binaries included with Splunk Enterprise

    The following third-party Windows binaries ship with Splunk Enterprise. Except where indicated, only the Splunk Enterprise product includes these binaries.

    These binaries provide functionality to Splunk as shown in their individual descriptions. None of them contains file version information or Authenticode signatures (certificates which prove the binary file's authenticity). Additionally, Splunk does not provide support for debug symbols related to third-party modules.

    Note: Only the third-party binaries, apps, and scripts that ship with Splunk have been tested for Certified for Windows Server 2008 R2 (CFW2008R2) Windows Logo compliance. Any other binaries, apps, or scripts - such as those you download from the Internet in the course of extending Splunk's capabilities - have not been tested for this compliance.

    Archive.dll

    Libarchive.dll is a multi-format archive and compression library.

    Both Splunk Enterprise and the Splunk universal forwarder include this binary.

    Bzip2.exe

    Bzip2 is a freely available, patent-free (see below), high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression.

    Jsmin.exe

    Jsmin.exe is an executable that removes whitespace and comments from JavaScript files, reducing their size.

    Libexslt.dll

    Libexslt.dll is the Extensions to Extensible Stylesheet Language Transformation (EXSLT) dynamic link C library developed for libxslt (a part of the GNOME project).

    Both Splunk Enterprise and the Splunk universal forwarder include this binary.

    Libxml2.dll

    Libxml2.dll is the Extensible Markup Language (XML) C parser and toolkit developed for the GNOME project (but usable outside of the GNOME platform).

    Both Splunk Enterprise and the Splunk universal forwarder include this binary.

    Libxslt.dll

    Libxslt.dll is the XML Stylesheet Language for Transformations (XSLT) dynamic link C library developed for the GNOME project. XSLT itself is an XML language to define transformations for XML. Libxslt is based on libxml2, the XML C library developed for the GNOME project. It also implements most of the EXSLT set of processor-portable extension functions and some of Saxon's evaluate and expressions extensions.

    Both Splunk Enterprise and the Splunk universal forwarder include this binary.

    Minigzip.exe

    Minigzip.exe is the minimal implementation of the "gzip" compression tool.

    Openssl.exe

    The OpenSSL Project is a collaborative effort to develop a robust, commercial-grade, full-featured, and open source toolkit implementing the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security (TLS v1) protocols as well as a full-strength general purpose cryptography library.

    Both Splunk Enterprise and the Splunk universal forwarder include this binary.

    Python.exe

    Python.exe is the Python programming language binary for Windows.

    Pythoncom.dll

    Pythoncom.dll is a module that encapsulates the Object Linking and Embedding (OLE) automation API for Python.

    Pywintypes27.dll

    Pywintypes27.dll is a module that encapsulates Windows types for Python version 2.7.

    Step-by-step installation instructions

    Now that you've learned what Splunk Enterprise is and what is needed to install it, you can get detailed installation procedures for your operating system:

    Windows
    Windows (from the command line)
    Linux
    Solaris
    Mac OS X
    FreeBSD
    AIX
    HP-UX

    Secure your Splunk Enterprise installation

    About securing Splunk

    As soon as you set up and begin using your new Splunk installation or upgrade, you should perform a few additional steps to ensure that Splunk and your data are secure. Taking the proper steps to secure Splunk reduces its attack surface and mitigates the risk and impact of most vulnerabilities.

    This chapter highlights some of the ways you can secure Splunk before, during, and after installation. The Securing Splunk manual provides more detailed information about the many ways you can or should secure Splunk.

    Secure your system before you install Splunk

    Before you even install Splunk, take a few steps to be sure that your operating system is secure. Splunk strongly recommends hardening all Splunk server operating systems.

    If your organization does not have internal hardening standards, Splunk recommends the CIS hardening benchmarks.

    As a minimum, limit shell/command line access to your Splunk servers.
    Secure physical access to all Splunk servers.
    Ensure that Splunk end users practice sound physical and endpoint security.

    Install Splunk securely

    Take the following steps when downloading and installing Splunk:

    Configure redundant Splunk instances, both indexing a copy of the same data.

    Verify your Splunk download using a hash function such as MD5 to compare the hashes. For example:

    ./openssl dgst -md5 <filename-of-downloaded-package>
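If you prefer to script the comparison, the following is a minimal sketch in Python. It assumes that you have the downloaded package on local disk and a published MD5 value to compare against. The function names md5_of_file and matches_published_hash are illustrative only and are not part of Splunk.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large packages do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_md5):
    """Compare the local file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_md5.strip().lower()
```

If matches_published_hash returns False, do not install the package; download it again and repeat the check.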

    More ways to secure Splunk

    Once you have Splunk installed, you can take more steps to secure your configuration.

    Configure user authentication and role-based access control

    Set up users and use roles to control access. Splunk allows you to configure users in three ways:

    Splunk's own built-in system, described in "Set up user authentication with Splunk's built-in system."
    LDAP, described in "Set up user authentication with LDAP."
    A scripted authentication API for use with an external authentication system, such as PAM or RADIUS, described in "Set up user authentication with external systems."

    Once you've configured users, you can assign roles that determine and control capabilities and access levels. For more information about roles and capabilities, read "About role-based user access."

    Use SSL certificates to configure encryption and authentication

    Splunk comes with a set of default certificates and keys that, when enabled, provide encryption and data compression. You can also use your own certificates and keys to secure communications between your browser and Splunk Web as well as data sent from forwarders to a receiver, such as an indexer.

    For more information about securing Splunk communications with SSL, see "About securing Splunk with SSL" in this manual.

    Audit Splunk

    Splunk includes audit features to allow you to track the reliability of your data. We recommend that you explore some of the following ways you can audit Splunk.

    Monitor Files and Directories

    Audit Splunk activity

    Cryptographically sign audit events

    Configure IT data block signing

    About archive signing

    Configure event hashing

    Harden your Splunk installation

    We also recommend you take the following steps to harden your Splunk installation:

    Deploy secure passwords across multiple servers

    Use Splunk's Access Control Lists

    Secure your service accounts

    Disable unnecessary Splunk components

    Secure Splunk on your network

    Estimate hardware requirements

    Hardware capacity planning for your Splunk Enterprise deployment

    Splunk Enterprise is a flexible product that meets almost any scale and redundancy requirement in the course of its operation. Taking advantage of that flexibility requires careful planning. This chapter discusses high-level hardware guidance for Splunk deployments and describes how Splunk uses hardware resources in various situations.

    Before deciding on your hardware outlay for Splunk:

    1. Be sure to review "Components of a Splunk Enterprise deployment" in this manual for a description of all of the elements of a Splunk installation.

    2. Next, learn about the type of hardware that comprises a "single indexer" by reading "Reference hardware."

    3. Finally, read the remaining topics in this chapter to learn how Splunk operations impact performance and how to maximize that performance.

    Dimensions of a Splunk Enterprise deployment

    In some cases, a single indexer can handle the load of both searching and indexing.

    There are scenarios where you must consider adding infrastructure to your Splunk Enterprise deployment for maximum efficiency and performance. Below is a list of things that significantly impact performance:

    1. The amount of incoming data. The more data you send to Splunk, the more time Splunk needs to index it into results that you can search, report, and generate alerts on.

    2. The amount of indexed data. As the amount of data stored in an index goes up, the server that indexes that data requires additional bandwidth both to store the data and provide results for searches.

    3. The number of concurrent users. If more than one person at a time uses an instance of Splunk, that instance requires more resources for those users to do searches and create reports and dashboards.

    4. The number of saved searches. If you plan on running a lot of saved searches, Splunk needs capacity to perform those searches promptly and efficiently. The more saved searches you run in a given period of time, the more resources are required.

    5. The types of search you employ. Almost as important as the number of saved searches are the types of search that you run against a Splunk system. There are several different types of search, each of which affects how the indexer responds to search requests.

    6. Whether or not you run Splunk apps. Splunk apps and solutions can have unique performance, deployment, and configuration considerations. If you plan on running apps, make sure you consider the resource requirements of the app(s) you are using. Refer to the installation and deployment section of your app or solution's documentation for additional information. Additionally, read "Hardware capacity planning for a distributed Splunk deployment" to learn how to properly size your environment for an app's increased resource requirements.

    How do these dimensions impact overall performance?

    Follow the links above to determine how each of the dimensions impacts performance on a reference indexer.

    While these factors impact the basic sizing requirements of your Splunk deployment on the whole, it's important to understand that addressing each of them individually does not guarantee peak efficiency for your Splunk deployment. You must discover how these factors correlate with one another in your specific application in order to realize maximum performance.

    For example, if your Splunk Enterprise deployment calls for a low amount of indexing but has a high number of concurrent users, it has significantly different resource needs than a setup with a low number of concurrent users and a high amount of daily indexing volume. Additionally, as both user count and amount of indexed data rise, you must distribute the environment across multiple servers to maintain a similar performance level. Search types complicate matters further, as some are bound by available CPU resources, and others are bound by the speed of the disk subsystem.

    When should I scale my Splunk Enterprise deployment?

    To best answer this question you must understand how the above Splunk deployment dimensions apply to your specific use case. Ask yourself these questions, then refer to the performance questionnaire later in this chapter to help ascertain when you should add more hardware resources:

    How much data do you expect to index daily?
    How much data do you need to retain?
    How many users do you expect to search through the data at any one time?
    Do you plan to use certain specific searches more than once?
    Do you want or need to use a Splunk app to present or manipulate your data?

    The key to a well-performing installation is to develop a plan early in the deployment cycle to account for both your initial outlay of hardware resources, as well as the addition of resources when the deployment scales up.

    You can read about capacity planning for a distributed deployment at "Hardware capacity planning for a distributed Splunk deployment" in the Distributed Deployment manual.

    How incoming data affects Splunk Enterprise performance

    This topic discusses how incoming data impacts indexing performance in Splunk Enterprise.

    A reference Splunk indexer can index a significant amount of data in a short period of time - up to 5.8 MB of data per second - or 500 GB per day. This is if the server is doing nothing else but consuming data.
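The relationship between the per-second and per-day figures is simple arithmetic. The following Python sketch shows the conversion; the function name daily_volume_gb is illustrative, and the choice between 1000 and 1024 MB per GB is an assumption, because this manual does not state which convention it uses.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_volume_gb(rate_mb_per_second, mb_per_gb=1000):
    """Convert a sustained indexing rate in MB/s to a volume in GB/day."""
    return rate_mb_per_second * SECONDS_PER_DAY / mb_per_gb


print(round(daily_volume_gb(5.8)))        # 501 (decimal gigabytes)
print(round(daily_volume_gb(5.8, 1024)))  # 489 (binary gigabytes)
```

Either way, the result is close to the rounded figure of 500 GB per day quoted above.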

    Performance changes depending on the size and amount of incoming data. Larger events slow down indexing performance. As events increase in size, the indexer uses more system memory to process and index them.

    If you need more indexing capacity than a single indexer can provide, you must add indexers into the deployment to account for the increased demand.

    How indexed data impacts Splunk Enterprise performance

    This topic discusses how data that has already been consumed by Splunk Enterprise affects performance.

    Once Splunk Enterprise consumes data and places it into indexes, those indexes grow, taking up disk space. As the indexes grow and available disk space decreases, Splunk takes more time to index incoming data because the indexer's disk subsystem takes more time to find space to store the data.

    This impacts search as well. On a single indexer, disk throughput splits between indexing (which is ongoing) and search requests (which are interrupts based on requests scheduled by users). As indexes grow, search slows down because not only does the disk subsystem need to account for search requests, it also needs to handle increasingly longer requests to store incoming data. Depending on the type of search, those kinds of requests can be very I/O-intensive.

    How the number of concurrent users impacts Splunk Enterprise performance

    This topic discusses how the number of concurrent users impacts Splunk Enterprise performance on a single indexer.

    A reference indexer needs to dedicate one of its available CPU cores for every user that logs into the system. This CPU core only handles the actual session itself. When a user starts searching, each search request takes up an additional CPU core, for as long as the search is active.

    These figures assume that CPUs are idle when they receive a login or search request. This does not account for other system requests, or CPU cores used by Splunk to index data. If they're processing any other system requests, then the load splits across other available CPUs.

    As CPU cores get used up, all activities on an indexer slow down as the computer splits processing time between indexing, search, and handling on-line users. At that point, only additional indexers can increase capacity for all three functions of Splunk operation.
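As a rough illustration of the core accounting described in this topic, the following Python sketch counts one core per logged-in user and one more per active search. The function names are illustrative, and the model deliberately ignores the operating system and indexing load, just as the figures above do.

```python
def cores_in_use(logged_in_users, active_searches):
    """One core per user session plus one core per active search."""
    return logged_in_users + active_searches


def cores_left(total_cores, logged_in_users, active_searches):
    """Cores not claimed by sessions or searches; never less than zero."""
    return max(0, total_cores - cores_in_use(logged_in_users, active_searches))
```

For example, on the 12-core reference machine described in "Reference hardware", four users who each run one search claim eight cores, so cores_left(12, 4, 4) leaves four cores for indexing and other work.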

    How saved searches affect Splunk Enterprise performance

    This topic discusses how the number of saved searches - searches that you save to use again at a later time - affects performance in Splunk Enterprise.

    On a reference indexer, a saved search consumes about 1 CPU core and a specified amount of memory while it executes. It also increases the amount of disk I/O temporarily as the disk subsystem looks through the indexes to fetch the desired data.

    Each additional saved search that executes at the same time consumes an additional CPU core. This consumption is separate from CPU usage from the operating system and Splunk indexing and storage processes.

    If more saved searches execute than can be accepted for processing, they will queue. Splunk also warns you when the system reaches the maximum number of saved searches. When searches queue, search results return more slowly.

    Adding indexers and search heads provides additional CPU cores to run more concurrent searches. Adding RAM to your existing machines helps with concurrent searches but does not give you additional search capacity.

    How search types impact Splunk Enterprise performance

    This topic discusses how the different types of search impact overall performance on a single reference indexer.

    There are four basic types of search that you can invoke against data stored in a Splunk index. Each of these search types impacts the Splunk indexer in a different way. The search types are:

    Dense. A dense search is a search that returns a large percentage (10% or more) of matching results for a given set of data in a given period of time. A reference server should be able to fetch up to 50,000 matching events per second for a dense search. Dense searches usually tax a server's CPU first, because of the overhead required to decompress the raw data stored in a Splunk index.

    Sparse. Sparse searches return smaller numbers of results for a given set of data in a given period of time (anywhere from 0.01% to 1%) than dense searches do. A reference indexer should be able to fetch up to 5,000 matching events per second when executing a sparse search.

    Super-sparse. A super-sparse search is a "needle in the haystack" search that retrieves only a very small number of results across the same set of data within the same time period as the other searches. A super-sparse search is very I/O intensive because the indexer must look through all of the buckets of an index to find the desired results. This can take up to two seconds per searched bucket. If you have a large amount of data stored on your indexer, there are a lot of buckets, and a super-sparse search can take a very long time to complete.

    Rare. Rare searches are like super-sparse searches in that they match just a handful of results across a number of index buckets. The major difference with rare searches is that bloom filters - data structures that test whether or not an element is a member of a set - significantly reduce the number of buckets that need to be searched by eliminating those buckets which do not contain events that match the search request. This allows a rare search to complete anywhere from 20 to 100 times faster than a super-sparse search, for the same amount of data searched.
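To make the idea of a bloom filter concrete, the following Python sketch builds a very small one and uses it to skip buckets that cannot contain a search term. It illustrates the general data structure only and is not Splunk's implementation; the class, the function, the bucket names, and the sizing parameters are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: can report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        # If any position is unset, the term was definitely never added.
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def buckets_to_search(bucket_filters, term):
    """Keep only the buckets whose filter cannot rule the term out."""
    return [name for name, bf in bucket_filters.items() if bf.might_contain(term)]


# Hypothetical buckets: only bucket_a has seen the term "error-4711".
filters = {"bucket_a": BloomFilter(), "bucket_b": BloomFilter()}
filters["bucket_a"].add("error-4711")
print(buckets_to_search(filters, "error-4711"))
```

A filter that answers "no" is always right, so the search can skip that bucket without reading it. A filter that answers "yes" might be wrong, so the bucket still has to be searched.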

    Summary

    The following table summarizes the different search types. Note that for dense and sparse searches, Splunk measures performance based on number of matching events, while with super-sparse and rare searches, performance is measured based on total indexed volume.

    Search type | Description | Ref. indexer throughput | Performance impact
    Dense | Dense searches return a large percentage of results for a given set of data in a given period of time. | Up to 50,000 matching events per second | Generally CPU-bound
    Sparse | Sparse searches return a smaller amount of results for a given set of data in a given period of time than dense searches do. | Up to 5,000 matching events per second | Generally CPU-bound
    Super-sparse | Super-sparse searches return a very small number of results from each index bucket which match the search. Depending on how large the set of data is, these types of search can take a long period of time. | Up to 2 seconds per index bucket | Primarily I/O bound
    Rare | Rare searches are similar to super-sparse searches, but are assisted by bloom filters which help eliminate index buckets that do not match the search request. Rare searches return results anywhere from 20 to 100 times faster than a super-sparse search does. | From 10 to 50 index buckets per second | Primarily I/O bound
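The throughput figures in the summary can be turned into rough time estimates. The following Python sketch does so under the assumption that a search runs alone on a reference indexer at exactly the quoted rate; the function names are illustrative, and real search times depend on much more than these figures.

```python
def dense_search_seconds(matching_events, events_per_second=50_000):
    """Best-case time for a dense search, bounded by matching events."""
    return matching_events / events_per_second


def sparse_search_seconds(matching_events, events_per_second=5_000):
    """Best-case time for a sparse search, bounded by matching events."""
    return matching_events / events_per_second


def super_sparse_search_seconds(buckets, seconds_per_bucket=2.0):
    """Worst-case time for a super-sparse search, bounded by buckets read."""
    return buckets * seconds_per_bucket


def rare_search_seconds(buckets, buckets_per_second=10):
    """Time for a rare search at a given bucket rate (10 to 50 per second)."""
    return buckets / buckets_per_second
```

For an index with 1,000 buckets, super_sparse_search_seconds(1000) gives up to 2,000 seconds, while rare_search_seconds(1000, 10) and rare_search_seconds(1000, 50) give 100 and 20 seconds respectively, which is consistent with the 20 to 100 times speedup described for rare searches.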

    How Splunk apps affect Splunk Enterprise performance

    This topic discusses how Splunk apps impact overall Splunk Enterprise performance on a single reference indexer.

    While many apps can run on a single indexer - Splunk actually runs several included with the product - the more things an app does, the more likely you must distribute it across multiple machines.

    Many apps require a distributed Splunk Enterprise deployment by design. Whether it's a case of universal forwarders fetching data and sending it to a single central instance, or many indexers and search heads connected together and serving up reports, dashboards, or alerts, Splunk apps often need more than one server to realize both maximum performance and potential in the enterprise.

    How Splunk Enterprise calculates disk storage

    This topic discusses how Splunk Enterprise calculates disk storage.

    At a high level, Splunk calculates total disk storage as follows:

    ( Daily average indexing rate ) x ( retention policy ) x 1/2

    If you want to base your calculation on the specific type(s) of data that Splunk will index, you can use the method described in "Estimate your storage requirements" in this manual.

    Splunk Enterprise stores raw data at up to approximately half its original size due to compression. On a volume that contains 500 GB of usable disk space, this means you can store nearly 6 months' worth of data at an indexing rate of 5 GB/day, or 10 days' worth at a rate of 100 GB/day.
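The high-level formula in this topic can be applied in both directions: to find the storage that a retention policy needs, or to find the retention that a volume allows. The following Python sketch shows both. The function names are illustrative, and the factor of one half is the approximate compression ratio quoted above, so treat the results as estimates.

```python
COMPRESSION_FACTOR = 0.5  # raw data is stored at roughly half its original size


def storage_needed_gb(daily_gb, retention_days, factor=COMPRESSION_FACTOR):
    """(Daily average indexing rate) x (retention policy) x 1/2."""
    return daily_gb * retention_days * factor


def days_of_retention(usable_gb, daily_gb, factor=COMPRESSION_FACTOR):
    """How many days of data fit on a volume at a given daily indexing rate."""
    return usable_gb / (daily_gb * factor)


print(days_of_retention(500, 5))    # 200.0 days at 5 GB/day
print(days_of_retention(500, 100))  # 10.0 days at 100 GB/day
```

Going the other way, storage_needed_gb(100, 10) returns 500.0, which matches the 500 GB volume in the example.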

    If you need additional storage, you can opt for either more local disks (required for frequent searching) or attached or network storage (acceptable for occasional searching). Low-latency connections over NFS or SMB/CIFS (Server Message Block/Common Internet File System) are acceptable for searches over long time periods where instant search returns can be compromised to lower cost per GB.

    Important: Shares mounted over a Wide Area Network (WAN) connection or on standby storage such as tape are never suitable storage choices for Splunk operations.

    Reference hardware

    When sizing your Splunk Enterprise environment's hardware needs, a reference machine helps you understand when it is time to scale and distribute the deployment. Following is an example of such a machine. Refer to this configuration as the standard for the remainder of this chapter.

    The reference machine described below produces the following index and search performance metrics for a given sample of data:

    Indexing performance

    Up to 5.8 megabytes per second (500 GB per day) of raw indexing performance, provided no other Splunk activity is occurring.

    Search performance

    Up to 50,000 events per second for dense searches
    Up to 5,000 events per second for sparse searches
    Up to 2 seconds per index bucket for super-sparse searches
    From 10 to 50 buckets per second for rare searches with bloom filters

    To find out more about the types of searches and how they affect Splunk Enterprise performance, read "How search types impact Splunk Enterprise performance" in this manual.

    Bare-metal hardware

    Intel x86 64-bit chip architecture
    2 CPUs, 6 cores per CPU (12 cores total), at least 2 GHz per core
    12 GB RAM
    Standard 1 Gb Ethernet NIC, optional 2nd NIC for a management network
    Standard 64-bit Linux or Windows distribution

    Disk subsystem

    The reference computer's disk subsystem should be capable of handling a high number of averaged Input/Output Operations Per Second (IOPS).

    IOPS are a measurement of how much data throughput a hard drive can produce. Since a hard drive reads and writes at different speeds, there are IOPS numbers for disk reads and writes. The average IOPS is the blend between those two figures.

    The more average IOPS a hard drive can produce, the more data it can index and search in a given period of time. While many variable items factor into the amount of IOPS that a hard drive can produce, the three most important elements are:

    its rotational speed (in revolutions per minute)
    its average latency (the amount of time it takes to spin its platters half a rotation)
    its average seek time (the amount of time it takes to retrieve a requested block of data)

    To get the most IOPS out of a hard drive, always choose those drives that have high rotational speeds and low average latency and seek times. Every drive manufacturer provides this information (and some provide much more).

    For additional information on IOPS and how to calculate them, review the following articles:

    "Getting the hang of IOPS" (http://www.symantec.com/connect/articles/getting-hang-iops-v13) on Symantec's Connect Community.
    "Analyzing I/O performance in Linux" (http://www.cmdln.org/2010/04/22/analyzing-io-performance-in-linux) on CMDLN.ORG (a sysadmin blog).

    For this application, we use eight 146-gigabyte, 15,000 RPM serial-attached SCSI (SAS) HDs in a Redundant Array of Independent Disks (RAID) 1+0 fault tolerance scheme as the disk subsystem. Each hard drive is capable of about 200 average IOPS. The combined array produces a little over 800 IOPS.
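The following Python sketch shows one common rule of thumb for estimating these numbers. It is an approximation under stated assumptions, not the method this manual used: the 3 ms average seek time is a hypothetical value for a 15,000 RPM drive, and halving the array total reflects the fact that RAID 1+0 writes every block to two drives. The function names are illustrative.

```python
def average_latency_ms(rpm):
    """Time for half a platter rotation, in milliseconds."""
    return 0.5 * 60_000 / rpm


def drive_iops(rpm, average_seek_ms):
    """Rule of thumb: one operation per (average latency + average seek time)."""
    return 1000 / (average_latency_ms(rpm) + average_seek_ms)


def raid10_write_iops(drive_count, iops_per_drive):
    """RAID 1+0 mirrors each write, so usable write IOPS are half the raw total."""
    return drive_count * iops_per_drive / 2


per_drive = drive_iops(15_000, average_seek_ms=3.0)  # assumed seek time
print(per_drive)                        # 200.0
print(raid10_write_iops(8, per_drive))  # 800.0
```

With these assumptions, the estimate lands on the same order as the per-drive and array figures quoted above. Read-heavy workloads on the same array can achieve more, because either drive in a mirrored pair can serve a read.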

    Important: Splunk is often constrained by disk I/O first, so always consider disk infrastructure first when specifying your hardware.

    Virtual hardware

    Splunk Enterprise performs fastest when deployed directly on to bare-metal hardware, as described above. However, Splunk can and does deliver on virtual equipment. What's more, we fully support deploying Splunk Enterprise on virtual hardware.

    Using the bare-metal hardware as a baseline, Splunk Enterprise generally indexes data about 30% slower on a virtual machine (VM) than it does on a standard reference machine. Search performance is on par with the real-world hardware.

    This is a best-case scenario that does not account for resource contention with other active VMs on the same physical server. It also does not take into account certain vendor-specific I/O enhancement techniques (such as Direct I/O or Raw Device Mapping).

    Splunk Enterprise in the cloud

    While you can run Splunk in the cloud, there are various concerns that you must be aware of when doing so. In addition to the security concerns of running Splunk in a public cloud, you must also note that performance degrades significantly compared to bare-metal hardware. Using that benchmark as a baseline again, Splunk indexing performance on a cloud-based computer is roughly half that of a real one. Searching suffers, too - results return anywhere from 15 to 20 percent slower than on a physical machine.
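To see what these percentages mean against the reference machine's 500 GB per day, consider the following Python sketch. It assumes that the slowdowns scale linearly from the bare-metal baseline and interprets "15 to 20 percent slower" as a search taking 15 to 20 percent longer; both are simplifying assumptions, and the function names are illustrative.

```python
BARE_METAL_GB_PER_DAY = 500  # reference indexer, indexing only


def vm_indexing_gb_per_day(bare_metal=BARE_METAL_GB_PER_DAY):
    """About 30% slower indexing on a virtual machine (best case)."""
    return bare_metal * (1 - 0.30)


def cloud_indexing_gb_per_day(bare_metal=BARE_METAL_GB_PER_DAY):
    """Roughly half the bare-metal indexing rate in the cloud."""
    return bare_metal * 0.5


def cloud_search_seconds(bare_metal_seconds, slowdown=0.20):
    """Search time in the cloud, assuming a 15% to 20% longer run time."""
    return bare_metal_seconds * (1 + slowdown)
```

Under these assumptions, a reference-class VM indexes about 350 GB per day and a comparable cloud instance about 250 GB per day, before any contention from other tenants.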

    Performance questionnaire

    Overview

    This topic helps you make the choice on whether or not to distribute your Splunk Enterprise deployment.

    This questionnaire is for a single-server Splunk Enterprise deployment based on the reference architecture described in "Reference hardware."

    Determine when to scale your Splunk Enterprise deployment

    Before you consider whether or not to scale, estimate how much data you need to index, and whether or not you need more than one concurrent Splunk user to search that data.

    Depending on how much data you index and how many concurrent users you require, you might need to scale your environment to multiple machines. Even if your indexing amount and user count fall within the capabilities of a single server, you might have to distribute your deployment based on the types of searches you employ, and whether or not you use summary indexes.

    If you want to run a Splunk app or solution in your Splunk environment, or you create elements that generate a large number of saved searches, you might have to distribute Splunk Enterprise components across a number of machines.

    Question 1: Do you want to create or run a Splunk app, alert or solution that executes a large number of saved searches (more than 8 concurrently)?

    A saved search is a search that a user saves to make available for later use. The number of saved searches - especially those run concurrently - directly impacts a Splunk server's performance. If you answered "NO" to this question, then proceed to Question 2. You don't need to consider scaling your Splunk deployment to multiple machines just yet.

    However, if you answered "YES" then you should scale your Splunk deployment to multiple machines. Review detailed information on hardware capacity planning for distributed Splunk deployments in "Hardware capacity planning for a distributed Splunk deployment" in the Distributed Deployment Manual.

    Question 2: Do you need to index more than 2 GB of data per day?

    Question 3: Do you need more than 2 users signed in at one time?

    If the answer to both questions is "NO" then your Splunk Enterprise instance can safely share one of the reference servers with other services, with the caveat that Splunk must have sufficient disk I/O bandwidth on the shared machine.

    If you answered "YES" to either question then proceed to Question 4.

    Note: If you are deploying Splunk Enterprise on Windows, you must not share full Splunk services on servers that run Microsoft Exchange, Active Directory domain services, or machine virtualization software. This is because those services are often very disk I/O intensive, and can dramatically reduce indexing and search performance. Additionally, you must ensure that any anti-virus software installed on the server does not scan the Splunk installation directory.

    Question 4: Do you need to index more than 100 GB per day?

    Question 5: Do you need more than 4 concurrent users?

    If the answer to both questions is "NO" then a single dedicated Splunk server of our reference architecture should be able to handle your workload.

    Question 6: Do you need more than 500 GB of total storage?

    Read "How Splunk Enterprise calculates disk storage" to learn how Splunk calculates disk storage.

    If the answer to this question is "NO" then a single dedicated reference server should be able to handle your workload, but you might need to add fast storage to the system to account for the increased space usage.

    If the answer to this question is "YES" then you should consider scaling your deployment to additional indexers to cope with the increased demand of indexing and searching.

    Question 7: Do you need to search large quantities of data for a small set (less than 1 percent) of results?

    Searches that cover large quantities of data and return small sets of results are known as super-sparse searches. These searches require lots of disk I/O because the indexer must search a number of buckets to find the data you're looking for.

    If the answer to this question is "NO" then you probably do not need to scale your deployment. However, adding additional indexers does improve both indexing and search performance.

    If the answer to this question is "YES" then you should definitely consider scaling your deployment up. Read the following section to determine how Splunk Enterprise calculates storage.

    Summary of performance recommendations

    This topic summarizes the performance recommendations that were given in the performance questionnaire. The table below shows the number of reference servers that are required to index and search data in Splunk Enterprise, depending on the number of concurrent users and amounts of data that the instance indexes.

    As a reminder, the reference hardware is:

    Intel x86 64-bit chip architecture
    2 CPUs, 6 cores per CPU (12 cores total), at least 2 GHz per core
    12 GB RAM
    Disk subsystem capable of producing 800 IOPS
    Standard 1 Gb Ethernet NIC, optional 2nd NIC for a management network
    Standard 64-bit Linux or Windows distribution

    For additional information about the reference server, read "Reference hardware" in this manual.

    Important: The figures shown in the table below only account for the reference server in question performing a single task, such as either indexing or searching. If a server is performing both actions at the same time, performance can and does degrade depending on the amount of indexing and searching happening at the time. The figures shown here are approximate guidelines only.

    If you run Splunk apps, have higher indexing volumes, employ multiple or I/O-heavy searches, or need more concurrent users than this table shows, then you should scale your deployment as described in "Hardware capacity planning for a distributed Splunk deployment" in the Distributed Deployment Manual.

    If you need more guidance, contact Splunk.

    Daily Indexing Volume       Number of Concurrent    Recommended     Recommended
                                Search Users            Indexers        Search Heads
    < 2 GB/day                  < 2                     1, shared       N/A
    2 GB/day to 100 GB/day      up to 4                 1, dedicated    N/A
    100 GB/day to 200 GB/day    up to 8                 2               1

    Note: For indexing requirements greater than 100 GB per day, or for additional concurrent users, review "Hardware capacity planning for a distributed Splunk deployment" in the Distributed Deployment Manual.
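
    As a rough illustration, the recommendation table can be expressed as a lookup function. The sketch below is an assumption-laden reading of the table, not an official sizing tool: the function name and return format are invented for this example, and the handling of boundary values (for example, exactly 100 GB per day) is a guess, because the table's ranges overlap at their edges.

```python
def recommend_reference_servers(daily_index_gb, concurrent_users):
    """Look up the recommendation table for a workload.

    Returns a dict with the table's own cell values, or None when the
    workload is outside the table. Boundary handling is an assumption.
    """
    # Row 1: less than 2 GB/day and fewer than 2 concurrent search users.
    if daily_index_gb < 2 and concurrent_users < 2:
        return {"indexers": "1, shared", "search_heads": "N/A"}

    # Row 2: up to 100 GB/day and up to 4 concurrent search users.
    if daily_index_gb <= 100 and concurrent_users <= 4:
        return {"indexers": "1, dedicated", "search_heads": "N/A"}

    # Row 3: up to 200 GB/day and up to 8 concurrent search users.
    if daily_index_gb <= 200 and concurrent_users <= 8:
        return {"indexers": "2", "search_heads": "1"}

    # Outside the table: consult the Distributed Deployment Manual.
    return None


if __name__ == "__main__":
    print(recommend_reference_servers(150, 6))
```

    A return value of None corresponds to the case where the manual directs you to the distributed capacity planning topic.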

    Answers

    Have questions? Visit Splunk Answers to see what questions and answers other Splunk users had about hardware and Splunk.

    Install Splunk Enterprise on Windows

    Choose the Windows user Splunk Enterprise should run as

    This topic discusses the steps you should take to choose which Windows user Splunk Enterprise should run as when you install Splunk on Windows.

    When you run the Windows Splunk Enterprise installer, it presents you with the option to select the user that Splunk should run as. Splunk strongly recommends you read this topic before installing in order to understand the ramifications of choosing the user type.

    This topic applies to all versions of Splunk, including Splunk Enterprise and the Splunk universal forwarder. It applies to installing Splunk on Windows only.

    The user you choose depends on what you want Splunk Enterprise to monitor

    The user Splunk Enterprise runs as determines what it can monitor. The Local System user has access to all data on the local machine, but nothing else. A user other than Local System has access to whatever data you want it to, but you must give the user that access prior to installing Splunk.

    If you already know that the computer you're installing Splunk on will not access remote Windows data then you can proceed directly to "Install on Windows" in this manual (or, if you want to install using the command prompt, "Install on Windows via the command line").

    If there is a possibility that you will need to access remote Windows data, or you are not sure, then read on - this topic contains important information about the user you should install Splunk as.

    About the "Local System user" and "other user" choices

    The basics

    The Windows Splunk Enterprise installer provides two ways to install Splunk: as the "Local System" user, or as another existing user on your Windows computer or network, which you designate.

    If you intend to do any of the following with Splunk, then you must install Splunk as an "other user":

    read Event Logs remotely

    collect performance counters remotely

    read network shares for log files

    enumerate the Active Directory schema using Active Directory monitoring

    Note: This is not an all-inclusive list.

    The user that you specify must, at a minimum:

    Be a member of the Active Directory domain or forest you wish to monitor (when using AD).

    Be a member of the local Administrators group on the server you're installing Splunk Enterprise on.

    Have specific user security rights assigned to it prior to installing Splunk. Read "Minimum permissions requirements" later in this topic for specific information.

    Caution: If the user does not meet these minimum requirements, Splunk Enterprise installation might fail. Even if installation succeeds, Splunk might not run correctly, or at all.

    The user also has unique password constraints - read "User accounts and password concerns" later in this topic for specifics.

    If you're not sure which user Splunk Enterprise should run as, then review "Considerations for deciding how to monitor remote Windows data" in the Getting Data In Manual for additional information on how to configure the Splunk user with the access it needs.

    User accounts and password concerns

    Another important issue that arises when you install Splunk Enterprise with a user account is that any active password enforcement security policy controls the password's validity. If your Windows server or network enforces password changes, consider these options:

    Before the password expires, change it, reconfigure Splunk Enterprise services on every machine to use the changed password, and then restart Splunk.

    Configure the account so that its password never expires.

    Use a managed service account (read "Use managed service accounts on Windows Server 2008, Windows Server 2012 and Windows 7" later in this topic).

    Use managed service accounts on Windows Server 2008, Windows Server 2012 and Windows 7

    If you run Windows Server 2008, Windows Server 2008 R2, Windows Server 2012, or Windows 7 in Active Directory, and your AD domain has at least one Windows Server 2008 R2 or Server 2012 domain controller, you can install Splunk Enterprise to run as a managed service account (MSA).

    The major benefits of using an MSA are:

    Increased security from the isolation of accounts for services.

    Administrators no longer need to manage the credentials or administer the accounts. This means that, among other things, passwords automatically change after they expire, and you do not have to manually set passwords or restart services associated with these accounts.

    Administrators can delegate the administration of these accounts to non-administrators.

    Some important things to understand before installing Splunk with an MSA are:

    The MSA requires the same permissions as a domain account on the machine that runs Splunk.

    The MSA must be a local administrator on the machine that runs Splunk.

    You cannot use the same account on different computers, as you would with a domain account.

    You must correctly configure and install the MSA on the machine that runs Splunk before you install Splunk on the machine. For information and instructions on how to do this, review "Service Accounts Step-by-Step Guide" (http://technet.microsoft.com/en-us/library/dd548356%28WS.10%29.aspx) on MS Technet.

    To install Splunk Enterprise using an MSA, read "Prepare your Windows network for a Splunk Enterprise installation as a network or domain user" in this manual.

    Security and remote access considerations

    Minimum permissions requirements

    If you choose to install Splunk as a domain user, then there are a minimum number of permissions required on the server that runs Splunk.

    The following is a list of the minimum user rights and permissions that the splunkd, splunkweb, and splunkforwarder services require when Splunk is installed using a domain user. Depending on the sources of data you want to monitor, the Splunk user might need a significant amount of additional permissions.

    Required basic permissions for the splunkd or splunkforwarder services

    Full control over Splunk's installation directory

    Read access to any flat files you want to index

    Required Local/Domain Security Policy user rights assignments for the splunkd or splunkforwarder services

    Permission to log on as a service

    Permission to log on as a batch job

    Permission to replace a process-level token

    Permission to act as part of the operating system

    Permission to bypass traverse checking

    Important: Failure to assign these permissions to the Splunk user prior to installation can result in a failed Splunk install, or an installation which does not function correctly, or at all.

    Required basic permissions for the splunkweb service

    Full control over Splunk's installation directory

    Required Local/Domain Security Policy user rights assignments for the splunkweb service

    Permission to log on as a service

    Note: Splunk Enterprise does not require these permissions when it runs as the Local System account.
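
    The user rights listed above correspond to standard Windows user-right constants. The following Python sketch records that mapping and reports which required rights are missing from a set of granted rights. The data structure and function name are assumptions made for this example, and the sketch does not query Windows itself: you would have to supply the granted rights from your own policy export.

```python
# Windows constant names for the user rights named in this topic.
SPLUNKD_RIGHTS = {
    "SeServiceLogonRight": "Log on as a service",
    "SeBatchLogonRight": "Log on as a batch job",
    "SeAssignPrimaryTokenPrivilege": "Replace a process-level token",
    "SeTcbPrivilege": "Act as part of the operating system",
    "SeChangeNotifyPrivilege": "Bypass traverse checking",
}

# Per-service requirements, as listed in this topic.
REQUIRED_RIGHTS = {
    "splunkd": SPLUNKD_RIGHTS,
    "splunkforwarder": SPLUNKD_RIGHTS,
    "splunkweb": {"SeServiceLogonRight": "Log on as a service"},
}


def missing_rights(service, granted):
    """Return the display names of required rights absent from `granted`.

    `granted` is a collection of Windows constant names (hypothetical
    input, for example parsed from a security policy export).
    """
    required = REQUIRED_RIGHTS[service]
    return sorted(name for const, name in required.items()
                  if const not in granted)


if __name__ == "__main__":
    print(missing_rights("splunkd", {"SeServiceLogonRight"}))
```

    A non-empty result for the account you plan to use indicates that installation as that account is likely to run into the failures described in the Important note above.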

    How to assign these permissions

    This section contains high-level concepts on how to assign the appropriate user rights and permissions to the Splunk service account before attempting to install. For step-by-step instructions, read "Prepare your Windows network for a Splunk Enterprise installation as a network or domain user" in this manual.

    Use Group Policy to assign rights to multiple machines

    If you want to assign the policy settings shown above to a number of workstations and servers in your AD domain or forest, you can define a Group Policy object (GPO) with these specific rights, and deploy that GPO across the domain. Read "Prepare your Windows network for a Splunk Enterprise installation as a network or domain user" in this manual for specific instructions.

    Once you've created and enabled the GPO, the workstations and servers in your domain pick up the changes either during the next scheduled Group Policy refresh (usually every 1 1/2 to 2 hours) or at the next boot time. Alternatively, you can force a Group Policy update using the GPUPDATE command line utility on the server on which you want to update Group Policy.

    When setting user rights, remember that rights assigned by a GPO override identical Local Security Policy rights on a machine, and you can't change this setting. If you wish to retain previously existing rights that are explicitly defined through Local Security Policy on a machine, you must also assign these rights within the GPO.
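
    The override behavior described above can be modeled in a few lines. This is a simplified sketch, not a representation of how Windows stores policy: the function name, the dictionary layout, and the account names are all assumptions for illustration. It shows that for any right the GPO defines, the GPO's list of accounts replaces the local list, while rights the GPO leaves undefined keep their local assignments.

```python
def effective_user_rights(local_policy, gpo):
    """Combine local and GPO user rights, with the GPO taking precedence.

    Both arguments map a right name to a set of account names
    (a hypothetical, simplified model of the two policies).
    """
    effective = {right: set(accounts) for right, accounts in local_policy.items()}
    for right, accounts in gpo.items():
        # A right defined in the GPO replaces the local assignment outright;
        # local entries are not merged in.
        effective[right] = set(accounts)
    return effective


if __name__ == "__main__":
    local = {
        "Log on as a service": {"HOST\\backupsvc"},
        "Log on as a batch job": {"HOST\\ops"},
    }
    gpo = {"Log on as a service": {"CORP\\Splunk Accounts"}}
    for right, accounts in sorted(effective_user_rights(local, gpo).items()):
        print(right, sorted(accounts))
```

    In this example the hypothetical HOST\backupsvc account loses "Log on as a service" once the GPO applies, which is why any local assignment you want to keep must also be listed in the GPO.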

    Troubleshoot permissions issues

    The rights described above are the rights that the splunkd, splunkweb, and splunkforwarder services specifically require. Other rights might be needed, depending on your usage and what data you want to access. Additionally, many user rights assignments and other Group Policy restrictions can prevent Splunk from running. If you have issues, consider using a tool such as Process Monitor or GPRESULT to troubleshoot GPO application in your environment.

    Prepare your Windows network for a Splunk Enterprise installation as a network or domain user

    The following procedures detail the steps you must take to prepare your Windows network to allow for Splunk Enterprise installation as a network or domain user other than the "Local System" user.

    Important: Do not perform these instructions if you plan to install Splunk Enterprise or universal forwarder as the "Local System" user.

    The instructions shown here have been tested for Windows Server 2008 R2 and Windows Server 2012, and might differ slightly for other versions of Windows.

    Caution: These instructions require full administrative access to the computer and/or Active Directory domain you want to prepare for Splunk operations. Do not attempt to perform this procedure without this access.

    Additionally, the rights you assign using these instructions are the minimum rights required for a successful Splunk installation. You might need to assign additional rights, either within the Local Security Policy or a Group Policy object (GPO), or to the user and group accounts you create, in order for Splunk to access the data you want.

    Prepare Active Directory for Splunk installation as a domain user

    The following instructions guide you through preparing your Active Directory to allow for installations of Splunk Enterprise or the Splunk universal forwarder as a domain account.

    Splunk recommends that you follow Microsoft's Best Practices (http://technet.microsoft.com/en-us/library/bb727085.aspx) when creating users and groups. This typically involves creating a specific Organizational Unit for groups within the organization.

    These instructions assume the following:

    You are running Active Directory.

    You are a domain administrator for the AD domain(s) you want to configure.

    The computer(s) you plan to install Splunk on are members of the AD domain.

    Create groups

    1. Run the Active Directory Users and Computers tool by selecting Start > Administrative Tools > Active Directory Users and Computers.

    2. Once the program loads, select the domain that you want to prepare for Splunk operations.

    3. Double-click an existing appropriate container folder to open it, or create a new Organizational Unit by selecting New > Organizational Unit from the Action menu.

    4. From the Action menu, select New > Group.

    5. In the dialog that appears, type in a name that represents Splunk user accounts, for example, "Splunk Accounts".

    Ensure that the Group scope is set to Domain Local, and Group type is set to Security.

    6. Click OK to create the group.

    7. Create a second group and specify a name that represents Splunk-enabled computers, for example, "Splunk Enabled Computers". This group will contain computer accounts that get assigned the appropriate permissions to run Splunk as a domain user.

    Ensure that the Group scope is set to Domain Local, and Group type is set to Security.

    Assign users and computers to groups

    If you have not already created the user account(s) that you want to use to run Splunk, now is a good time to do so. Follow Microsoft's best practices for creating users and groups if you do not have your own internal policy.

    Once you have created the user account(s), add the account(s) to the Splunk Accounts group, and add the computer accounts of the computers that will run Splunk to the Splunk Enabled Computers group.

    After you have done this, you can exit Active Directory Users and Computers.

    Define a Group Policy object (GPO)

    1. Run the Group Policy Management Console (GPMC) tool by selecting Start > Administrative Tools > Group Policy Management.

    2. In the tree view pane on the left, select Domains.

    3. Click the Group Policy Objects folder.

    4. In the Group Policy Objects folder, right-click and select New from the menu that pops up.

    5. In the New GPO dialog, type in a name that represents the fact that the GPO will assign user rights to the servers you apply it to, for example, "Splunk Access."

    Leave the Source Starter GPO field set to "(none)".

    6. Click OK to save the GPO.

    Add rights to the GPO

    1. While still in the GPMC, right-click on the newly created group policy object and select Edit from the pop-up menu that appears.

    2. In the Group Policy Management Editor that appears, in the left pane, browse to Computer Configuration -> Policies -> Windows Settings -> Security Settings -> Local Policies -> User Rights Assignment.

    a. In the right pane, double-click on the Act as part of the operating system entry.

    b. In the window that opens, check the Define these policy settings checkbox.

    c. Click Add User or Group...

    d. In the dialog that opens, click Browse...

    e. In the Select Users, Computers, Service Accounts, or Groups dialog that opens, type in the name of the "Splunk Accounts" group you created earlier, then click Check Names...

    Windows underlines the name if it is valid. Otherwise it tells you that it cannot find the object and prompts you for an object name again.

    f. Click OK to close the "Select Users..." dialog.

    g. Click OK again to close the "Add User or Group" dialog.

    h. Click OK again to close the rights properties dialog.

    3. Repeat Steps 2a-2h for the following additional rights:

    Bypass traverse checking

    Log on as a batch job

    Log on as a service

    Replace a process-level token

    Change per-server Administrators group membership

    The following steps restrict who is a member of the Administrators group on the server(s) to which you apply this GPO.

    Caution: Make sure to add all accounts that need access to the Administrators group on each server to the Restricted Groups policy setting. Failure to do so can cause you to lose administrative access to the servers to which you apply this GPO!

    1. While still in the Group Policy Management Editor window, in the left pane, browse to Computer Configuration -> Policies -> Windows Settings -> Security Settings -> Restricted Groups.

    a. In the right pane, right-click and select Add Group... in the pop-up menu that appears.

    b. In the dialog that appears, type in Administrators and click OK.

    c. In the properties dialog that appears, click the Add button next to Members of this group:.

    d. In the Add Member dialog that appears, click Browse...

    e. In the Select Users, Computers, Service Accounts, or Groups dialog that opens, type in the name of the "Splunk Accounts" group you created earlier, then click Check Names...

    Windows underlines the name if it is valid. Otherwise it tells you that it cannot find the object and prompts you for an object name again.

    f. Click OK to close the "Select Users..." dialog.

    g. Click OK again to close the "Add Member" dialog.

    h. Click OK again to close the group properties dialog.

    2. Repeat Steps 1a-1h for the following additional users or groups:

    Domain Admins

    any additional users who need to be a member of the Administrators group on every server to which you apply the GPO.

    3. Close the Group Policy Management Editor window to save the GPO.
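
    Before applying a Restricted Groups setting, it can help to compare the current members of a server's Administrators group against the list you entered in the GPO. The Python sketch below illustrates the risk named in the Caution above under a simplifying assumption: that the "Members of this group" list replaces the existing membership. The function name and the account names are hypothetical, and the current membership must be gathered separately on each server.

```python
def accounts_losing_admin(current_admins, restricted_members):
    """Return current Administrators members not listed in the GPO.

    Comparison ignores case, because Windows account names are not
    case sensitive. Inputs are plain lists of account names.
    """
    keep = {name.casefold() for name in restricted_members}
    return sorted(name for name in current_admins
                  if name.casefold() not in keep)


if __name__ == "__main__":
    current = ["HOST\\Administrator", "CORP\\Domain Admins", "CORP\\ops-team"]
    restricted = ["CORP\\Domain Admins", "CORP\\Splunk Accounts"]
    print(accounts_losing_admin(current, restricted))
```

    Any account the function returns would need to be added to the Restricted Groups setting if it should keep administrative access.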

    Restrict GPO application to select computers

    1. While still in the GPMC, in the GPMC's left pane, select the GPO you created and added rights to, if it is not already selected.

    GPMC displays information about the GPO in the right pane.

    2. In the right pane, under Security Filtering, click Add...

    3. In the Select User, Computer, or Group dialog that appears, type in "Splunk Enabled Computers" (or the name of the group that represents Splunk-enabled computers that you created earlier).

    4. Click Check Names. If the group is valid, Windows underlines the name. Otherwise, it tells you it cannot find the object and prompts you for an object name again.

    5. Click OK to return to the GPO information window.

    6. Repeat Steps 2-5 to add the "Splunk Accounts" group (the group that represents Splunk user accounts that you created earlier).

    7. Under Security Filtering, click the Authenticated Users entry to highlight it.

    8. Click Remove.

    GPMC removes the "Authenticated Users" entry from the "Security Filtering" field, leaving only "Splunk Accounts" and "Splunk Enabled Computers."

    Apply the GPO

    1. While still in the GPMC, in the GPMC's left pane, select the domain to which you want to apply the GPO you created.

    2. Right-click on the domain, and select Link an Existing GPO... in the menu that pops up.

    Note: If you only want the GPO to affect the OU that you created earlier, then select the OU instead and right-click to bring up the pop-up menu.

    3. In the Select GPO dialog that appears, select the GPO you created and edited, and click OK. GPMC applies the GPO to the selected domain.

    4. Close GPMC by selecting File > Exit from the GPMC menu.

    Note: Active Directory controls when Group Policy updates occur and GPOs get applied to computers in the domain. Typically, a Group Policy refresh happens every 90-120 minutes. You must wait this amount of time before attempting to install Splunk as a domain user. Alternatively, you can force a Group Policy update by running GPUPDATE /FORCE from a command prompt on the computer on which you want to update Group Policy.
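
    If you script your preparation steps, the two options in the note above (wait, or force an update) might look like the following Python sketch. The function names are assumptions made for this example. The only external command used is GPUPDATE /FORCE, as named in the note, and the 120-minute figure is the upper end of the 90-120 minute range quoted there.

```python
import platform
import subprocess
from datetime import datetime, timedelta


def gpupdate_command():
    """Return the argument list for the forced update named in the note."""
    return ["gpupdate", "/force"]


def latest_expected_refresh(gpo_linked_at):
    """Return the time by which a refresh has typically happened.

    Uses the upper bound (120 minutes) of the range quoted above.
    """
    return gpo_linked_at + timedelta(minutes=120)


def force_group_policy_update():
    """Run GPUPDATE /FORCE on the local computer (Windows only)."""
    if platform.system() != "Windows":
        raise RuntimeError("gpupdate is only available on Windows")
    subprocess.run(gpupdate_command(), check=True)


if __name__ == "__main__":
    linked = datetime(2014, 3, 11, 14, 55)
    print("Install no earlier than:", latest_expected_refresh(linked))
```

    Run force_group_policy_update() from an administrative session on each computer where you do not want to wait for the scheduled refresh.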

    Install Splunk with a managed service account

    Alternatively, you can install Splunk with a managed service account (MSA). Follow these instructions to do so:

    1. Create and configure the MSA that you plan to use to monitor Windows data.

    Note: You can use the instructions in "Prepare Active Directory for Splunk installation as a domain user" earlier in this topic to assign the MSA the appropriate security policy rights and group memberships.

    2. Install Splunk from the command line as the "Local System" user.

    Important: You must install Splunk from the command line and use the LAUNCHSPLUNK=0 flag to keep Splunk from starting after installation is completed.

    3. After installation is complete, use the Windows Explorer or the ICACLS command line utility to grant the MSA "Full Control" permissions to the Splunk installation directory and all its sub-directories.

    Note: You might need to break NTFS permission inheritance from parent directories above the Splunk installation directory and explicitly assign permissions from that directory and all subdirectories.

    4. Follow the instructions in the topic "Correct the user selected during Windows installation" in this manual to change the default user for Splunk's service account. In this instance, the correct user is the MSA you configured prior to installing Splunk.

    Important: You must append a dollar sign ($) to the end of the username when completing Step 4 in order for the MSA to work properly. For example, if the MSA is SPLUNKDOCS\splunk1, then you must enter SPLUNKDOCS\splunk1$ in the appropriate field in the properties dialog for the service. You must do this for both the splunkd and splunkweb services.

    5. Confirm that the MSA has the "Log on as a service" right.

    Note: If you use the Services control panel to make the service account changes, Windows grants this right to the MSA automatically.

    6. Start Splunk. Splunk will run as the MSA configured above, and will have access to all data that the MSA has access to.
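
    The command-line parts of Steps 2 through 4 can be sketched as Python helpers that build the argument lists without running them. Everything named here is an assumption for illustration: the function names, the installer file name, and the installation directory are hypothetical, and the helpers show only the LAUNCHSPLUNK=0 flag, the ICACLS grant, and the dollar-sign rule described above. The ICACLS arguments use (OI)(CI) so that the Full Control (F) entry is inherited by files and sub-directories, and /T to apply it to existing ones.

```python
def msa_logon_name(account):
    """Append the trailing dollar sign that an MSA logon name requires."""
    return account if account.endswith("$") else account + "$"


def msiexec_install_command(msi_path):
    """Build an install command that keeps Splunk from starting afterwards."""
    return ["msiexec.exe", "/i", msi_path, "LAUNCHSPLUNK=0"]


def icacls_grant_command(install_dir, account):
    """Build an ICACLS command granting the MSA Full Control recursively."""
    grant = f"{msa_logon_name(account)}:(OI)(CI)F"
    return ["icacls", install_dir, "/grant", grant, "/T"]


if __name__ == "__main__":
    # Hypothetical installer name and directory, for illustration only.
    print(msiexec_install_command("splunk.msi"))
    print(icacls_grant_command(r"C:\Program Files\Splunk", r"SPLUNKDOCS\splunk1"))
    print(msa_logon_name(r"SPLUNKDOCS\splunk1"))
```

    The helpers return lists suitable for passing to subprocess.run on the target Windows machine; review the resulting commands against your own environment before running them.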

    Prepare a local machine or non-AD network for Splunk installation

    If you are not using Active Directory, follow these instructions to give administrative access to the user you want Splunk to run as on the computers you want to install Splunk on.

    1. Give the user Splunk should run as administrator rights by adding the user to the local Administrators group.

    2. Start Local Security Policy by selecting Start > Administrative Tools > Local Security Policy.

    Local Security Policy launches and displays the local security settings.

    3. In the left pane, expand Local Policies and then click User Rights Assignment.

    a. In the right pane, double-click on the Act as part of the operating system entry.

    b. Click Add User or Group...

    c. In the dialog that opens, click Browse...

    d. In the Select Users, Computers, Service Accounts, or Groups dialog that opens, type in the name of the user that you want Splunk to run as, then click Check Names...

    Windows underlines the name if it is valid. Otherwise it tells you that it cannot find the object and prompts you for an object name again.

    e. Click OK to close the "Select Users..." dialog.

    f. Click OK again to close the "Add User or Group" dialog.

    g. Click OK again to close the rights properties dialog.

    4. Repeat Steps 3a-3g for the following additional rights:

    Bypass traverse checking

    Log on as a batch job

    Log on as a service

    Replace a process-level token

    Once you have completed these steps, you can then install Splunk as the desired user.

    Install on Windows

    This topic describes the procedure for installing Splunk Enterprise on Windows with the Graphical User Interface (GUI)-based installer. More options (such as silent installation) are available if you install from the command line.

    Important: Running the 32-bit version of Splunk for Windows on a 64-bit Windows system is not recommended. If you attempt to run the 32-bit installer on a 64-bit system, the installer will warn you of this.

    We strongly recommend that you run 64-bit Splunk on 64-bit hardware. The performance is greatly improved over the 32-bit version.

    Note: If you want to install the Splunk universal forwarder, see the Forwarding Data manual: "Universal forwarder deployment overview". Unlike Splunk Enterprise heavy and light forwarders, which are full Splunk instances with some features changed or disabled, the universal forwarder is an