
Virtual Machine README

Phil Hagen edited this page Dec 13, 2023 · 6 revisions

Background

This page contains details for the SOF-ELK® (Security Operations and Forensics Elasticsearch, Logstash, Kibana) VM. The VM is provided as a community resource but is covered at varying depths in the following SANS course(s):

All parsers and dashboards for this VM are now maintained in the project's Github repository.

Download

The latest version of the VM itself is always available at https://for572.com/sof-elk-vm.

Latest Distribution Vitals

  • Basic details on the distribution
    • VM is a CentOS 7.9 base with all OS updates as of 2023-06-23
    • Includes Elastic stack components v8.8.1
    • Configuration files are from the "public/v20230622" branch of this Github repository
  • Metadata
    • Filename and size: Public SOF-ELK v20230623.7z (2,228,849,930 bytes)
    • MD5: fac5e0f35232d6e15718ec265a283217
    • SHA256: 07ce02a53b008073aa91a8e16d53a207d8970fdfb6e59eb35ff31c84b93ea927
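Before extracting the archive, it can be worth confirming that the download matches the SHA256 value listed above. The following is a minimal sketch, assuming a Linux host where `sha256sum` is available (on macOS, `shasum -a 256` prints the same format). The `verify_sha256` helper is a hypothetical name used for illustration and is not part of SOF-ELK.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Hypothetical helper (not shipped with SOF-ELK). Succeeds only when the
# SHA256 digest of FILE equals EXPECTED_HEX.
verify_sha256() {
    [ -r "$1" ] || return 1
    # sha256sum prints "<digest>  <file name>"; keep the digest only.
    actual=$(sha256sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Example call, using the file name and digest listed above:
# verify_sha256 "Public SOF-ELK v20230623.7z" \
#     07ce02a53b008073aa91a8e16d53a207d8970fdfb6e59eb35ff31c84b93ea927 \
#     && echo "checksum OK"
```

A mismatch usually means an incomplete or corrupted download, so fetching the archive again is the first thing to try.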

General Information

  • The VM was created with VMware Fusion v13.0.2 and ships with virtual hardware v18.
    • If you're using an older version of VMware Workstation/Fusion/Player, you will likely need to convert the VM back to a previous version of the hardware.
    • Some VMware software provides this function via the GUI, or you may find the free "VMware vCenter Converter" tool helpful.
  • The VM is deployed with the "NAT" network mode enabled
  • Credentials:
    • username: elk_user
      • password: forensics
      • has sudo access to run ALL commands
  • Logstash will ingest all files from the following filesystem locations:
    • /logstash/aws/: JSON-formatted Amazon Web Services CloudTrail log files. Use the included aws-cloudtrail2sof-elk.py loader script.
    • /logstash/azure/: JSON-formatted Microsoft Azure logs. At this time, the following log types are supported: Event Logs, Sign In Logs, Audit Logs, Admin Activity Logs, and Storage Logs.
    • /logstash/gcp/: JSON-formatted Google Cloud Platform logs.
    • /logstash/gws/: JSON-formatted Google Workspace logs extracted using the Google Workspace API.
    • /logstash/httpd/: Apache logs in common, combined, or vhost-combined formats
    • /logstash/kape/: JSON-format files generated by the KAPE triage collection tool. (See this document for details on which specific output files are currently supported and their required file naming structure.)
    • /logstash/nfarch/: Archived NetFlow output, formatted as described below
    • /logstash/microsoft365/: JSON-formatted Microsoft 365 logs only.
    • /logstash/passivedns/: Logs from the passivedns utility
    • /logstash/plaso/: CSV bodyfile-format files generated by the Plaso tool from the log2timeline framework. (See this document for details on creating CSV files in a supported format.)
    • /logstash/syslog/: Syslog-formatted data
      • NOTICE: Remember that syslog DOES NOT reflect the year of a log entry! Therefore, Logstash has been configured to look for a year value in the path to a file. For example: /logstash/syslog/2015/var/log/messages will assign all entries from that file to the year 2015. If no year is present, the current year will be assumed. This is enabled only for the /logstash/syslog/ directory.
    • /logstash/zeek/: JSON-formatted logs from the Zeek Network Security Monitoring platform. These must be in decompressed form. The following Zeek logs are supported:
      • conn.log: Treated like NetFlow and stored in the netflow-* indices.
      • dns.log: Treated like other DNS logs and stored in the logstash-* indices.
      • http.log: Treated like other HTTP logs and stored in the httpdlog-* indices.
  • Commands to be familiar with:
    • /usr/local/sbin/sof-elk_clear.py: DESTROY contents of the Elasticsearch database. Most frequently used with an index name base (e.g. sof-elk_clear.py -i logstash deletes all data from the Elasticsearch logstash-* indices). Other options are detailed with the -h flag.
    • /usr/local/sbin/sof-elk_update.sh: Update the SOF-ELK® configuration files from the Github repository. (Requires sudo.)
  • Files to be familiar with:
    • /etc/logstash/conf.d/*.conf: Symlinks to github-based configuration files that handle input, preprocessing, parsing, postprocessing, and output of log events.
    • /usr/local/sof-elk/*: Clone of the project Github repository, with the public/v* branch corresponding to the virtual machine's release version.
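The year-in-path rule described for /logstash/syslog/ above can be illustrated with a small shell function. This is only a sketch of the idea: it assumes the year appears as a four-digit directory component, and the actual pattern in the Logstash configuration may differ. The `year_for_syslog_path` name is hypothetical.

```shell
#!/bin/sh
# year_for_syslog_path PATH
# Prints the first four-digit directory component found below
# /logstash/syslog/, or the current year when there is none.
# Illustration only; not the pattern Logstash itself uses.
year_for_syslog_path() {
    rest=${1#/logstash/syslog/}
    # Only paths under /logstash/syslog/ are examined for a year.
    if [ "$rest" != "$1" ]; then
        # Walk the directory components, leaving the file name out.
        while [ "$rest" != "${rest#*/}" ]; do
            part=${rest%%/*}
            rest=${rest#*/}
            case $part in
                [0-9][0-9][0-9][0-9]) echo "$part"; return 0 ;;
            esac
        done
    fi
    date +%Y
}

# year_for_syslog_path /logstash/syslog/2015/var/log/messages   prints 2015
```

Checking a path this way before copying evidence into place helps avoid entries being assigned to the current year by mistake.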

How to Use

  • Extract the compressed archive to your host system
  • Open and boot the VM
  • Log into the VM with the elk_user credentials (see above)
    • Logging in via SSH is recommended. If you use the console login with a non-US keyboard, run sudo loadkeys uk, replacing uk as needed for your local keyboard mapping
  • cd to one of the /logstash/*/ directories as appropriate
  • Place files in this location (Mind the above warning about the year for syslog data. Files must also be readable by the "logstash" user.)
  • Open the main Kibana dashboard using the Kibana URL shown in the pre-authentication screen, http://<ip_address>:5601
    • This dashboard gives a basic overview of what data has been loaded and how many records are present
    • There are links to several stock dashboards on the left-hand side
  • Wait for Logstash to parse the input files, load the appropriate dashboard URL, and start interacting with your data
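The "place files" step above, including the requirement that files be readable by the "logstash" user, might look like the following sketch. The `stage_for_logstash` helper is a hypothetical name, not a script shipped with the VM. It assumes that making the copy world-readable is acceptable on an analysis VM, and that any directories it creates receive the usual default permissions so that the logstash user can traverse them.

```shell
#!/bin/sh
# stage_for_logstash SRC DEST_DIR
# Hypothetical helper. Copies SRC into DEST_DIR (creating the directory
# if needed) and makes the copy world-readable so that the "logstash"
# user can read it.
stage_for_logstash() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    chmod a+r "$dest_dir/$(basename "$src")"
}

# Example: place a syslog file under a year directory.
# stage_for_logstash ./messages /logstash/syslog/2015/var/log
```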

Sample Data Included

  • Syslog data in ~elk_user/lab-2.3_source_evidence/
    • Unzip each of these files into the /logstash/syslog/ directory, such as: unzip -d /logstash/syslog/ ~elk_user/lab-2.3_source_evidence/<file>
    • Use the time frame 2013-06-08 15:00:00 to 2013-06-08 23:30:00 to examine this data.
  • NetFlow data in ~elk_user/lab-3.1_source_evidence/
    • Use the nfdump2sof-elk.sh script and write output to the /logstash/nfarch/ directory, such as: cd /home/elk_user/lab-3.1_source_evidence/ ; nfdump2sof-elk.sh -e 10.3.58.1 -r ~elk_user/lab-3.1_source_evidence/netflow/ -w /logstash/nfarch/lab-3.1_netflow.txt
    • Use the time frame 2012-04-02 to 2012-04-07 to examine this data.

Ingesting Archived NetFlow

  • To ingest existing nfcapd-created NetFlow evidence, it must be parsed into a specific format. The included nfdump2sof-elk.sh script will take care of this.
    • Read from single file: nfdump2sof-elk.sh -r /path/to/netflow/nfcapd.201703190000 -w /logstash/nfarch/inputfile_1.txt
    • Read recursively from directory: nfdump2sof-elk.sh -r /path/to/netflow/ -w /logstash/nfarch/inputfile_2.txt
    • Optionally, you can specify the IP address of the exporter that created the flow data: nfdump2sof-elk.sh -e 10.3.58.1 -r /path/to/netflow/ -w /logstash/nfarch/inputfile_3.txt
  • To ingest existing AWS VPC Flow data files in JSON format, use the included aws-vpcflow2sof-elk.sh script.
    • Read recursively from directory: aws-vpcflow2sof-elk.sh -r /path/to/aws-vpcflow/ -w /logstash/nfarch/aws-vpcflow_1.txt
  • To ingest existing GCP VPC Flow data files in JSON format, use the included gcp-vpcflow2sof-elk.py script.
    • Read from single file: gcp-vpcflow2sof-elk.py -r /path/to/gcp-vpcflow/file1.json -w /logstash/nfarch/gcp-vpcflow_1.txt
    • Read recursively from directory: gcp-vpcflow2sof-elk.py -r /path/to/gcp-vpcflow/ -w /logstash/nfarch/gcp-vpcflow_1.txt
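When NetFlow evidence is spread across several directories, the nfdump2sof-elk.sh invocation shown above can be repeated once per directory, each writing its own output file. The sketch below assumes only the -r and -w options documented here. The `convert_netflow_dirs` function and the `NFDUMP2SOFELK` override variable are hypothetical names introduced for illustration; the override exists so the loop can be tried out with a stand-in command.

```shell
#!/bin/sh
# convert_netflow_dirs OUT_DIR SRC_DIR...
# Hypothetical wrapper. Runs the converter once per source directory and
# writes OUT_DIR/<directory name>.txt for each one. Set NFDUMP2SOFELK to
# substitute another command (for example, when doing a dry run).
convert_netflow_dirs() {
    out_dir=$1
    shift
    for src in "$@"; do
        name=$(basename "$src")
        "${NFDUMP2SOFELK:-nfdump2sof-elk.sh}" -r "$src" -w "$out_dir/$name.txt" || return 1
    done
}

# Example:
# convert_netflow_dirs /logstash/nfarch /path/to/netflow/siteA /path/to/netflow/siteB
```

Using one output file per source directory keeps it clear which evidence produced which records.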

Credits

  • Derek B: Cisco ASA parsing/debugging and sample data
  • Barry A: Sample data and troubleshooting
  • Ryan Johnson: Testing
  • Matt Bromiley: Testing
  • Mike Pilkington: Testing
  • Mark Hallman: Testing
  • David Szili: Testing and troubleshooting
  • Pierre Lidome: Microsoft 365 assistance and overall testing
  • Josh Lemon: GCP assistance
  • David Cowen: AWS assistance
  • Megan Roddie: Testing

Admin/Legal

  • This virtual appliance is provided "as is" with no express or implied warranty for accuracy or accessibility. No support for the functionality the VM provides is offered outside of this document.
  • This virtual appliance includes GeoLite2 data created by MaxMind, available from https://www.maxmind.com and subject to the GeoLite2 EULA. The included GeoIP database files are from December 17, 2019 and are covered by the previous MaxMind license that permitted redistribution of these files without collecting user contact information. Installation of updated GeoIP databases should be accomplished by the included /usr/local/sbin/geoip_bootstrap.sh script. This script also optionally enables scheduled automatic updates to the databases for Internet-connected systems. You can learn more about the GeoLite2 databases, as well as sign up for a free MaxMind account by clicking here.
  • SOF-ELK® is a registered trademark of Lewes Technology Consulting, LLC. Content is copyrighted by its respective contributors. SOF-ELK logo is a wholly owned property of Lewes Technology Consulting, LLC and is used by permission.