This Incident Response Cheat Sheet is for performing live analysis on a system you suspect is compromised. It is important to follow through each step in sequence as you handle an incident.
Step 1: Preparation
- Identify who will be responding to the incident in question. Incident handlers should be available to assist with the investigation from start to finish.
- Use forms to assist you through each step. Sample forms from SANS are available at https://www.sans.org/score/incident-forms/.
- Incident handlers should be given physical access to the suspicious system and should know how to obtain it. Physical access is preferred to remote access, since the attacker could detect an investigation conducted remotely. A physical copy of the hard disk might be necessary for forensic and evidence purposes, and physical access might also be needed to disconnect the suspected machine from the network.
- A good knowledge of the usual network activity, services, software, and users of the machine/server is needed. You should keep a file in a secure place describing the usual activity, so you can compare it efficiently against the current state. If you use a standard OS/software bundle for your systems, identify what is different and unique about this system now.
- Have a “known good state,” such as a backup or a snapshot of the system.
The more you know about the system in its normal state, the better your chances of detecting any fraudulent activity originating from it.
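Comparing the current state against the documented baseline can be partly automated. The sketch below assumes the baseline and the current observations have each been recorded as plain lists of names (accounts, services, and so on). The `diff_against_baseline` helper and the sample account names are hypothetical and serve only as illustration.

```python
from typing import Dict, Iterable, Set


def diff_against_baseline(baseline: Iterable[str], current: Iterable[str]) -> Dict[str, Set[str]]:
    """Return the items that appeared or disappeared relative to the baseline.

    Names are compared case-insensitively, since Windows account and
    service names are not case-sensitive.
    """
    base = {item.strip().lower() for item in baseline}
    now = {item.strip().lower() for item in current}
    return {"added": now - base, "removed": base - now}


# Hypothetical data: local administrators recorded before the incident
# versus the list observed during the investigation.
baseline_admins = ["Administrator", "helpdesk"]
current_admins = ["Administrator", "helpdesk", "svc_update"]

report = diff_against_baseline(baseline_admins, current_admins)
```

Anything in `report["added"]` is a candidate for closer inspection during identification; it is not proof of compromise by itself.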
Step 2: Identification
We highly recommend the SysInternals suite of tools to help with investigating a system, especially Process Explorer and TCPView. SysInternals is available for download at https://docs.microsoft.com/en-us/sysinternals/
- Look for unusual accounts created, especially in the Administrators group:
- C:\> lusrmgr.msc
- C:\> net localgroup administrators
- Look for unusually large files on the storage media, larger than 5 MB. This can be an indication of a system compromised for illegal content storage.
- Look for unusual files added recently in system folders, especially C:\WINDOWS\system32.
- Use WinDirStat to show disk usage statistics: https://windirstat.net/.
- Look for files using the “hidden” attribute in all subfolders:
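The file checks above (large files, recently changed files, hidden files) can be sketched as a single directory walk. This is a minimal sketch, not a replacement for the tools named above: the `scan` and `is_hidden` helpers are hypothetical, the 7-day window for "recently" is an assumption, and the hidden-attribute check only has an effect on Windows.

```python
import os
import stat
import time
from typing import Iterator, List, Optional, Tuple

SIZE_THRESHOLD = 5 * 1024 * 1024  # 5 MB, the threshold named in the checklist
RECENT_DAYS = 7                   # assumption: "recently" means the last 7 days


def is_hidden(st: os.stat_result) -> bool:
    """True if the Windows 'hidden' attribute is set (always False elsewhere)."""
    attrs = getattr(st, "st_file_attributes", 0)
    return bool(attrs & getattr(stat, "FILE_ATTRIBUTE_HIDDEN", 0))


def scan(root: str, now: Optional[float] = None) -> Iterator[Tuple[str, List[str]]]:
    """Yield (path, reasons) for each file under root that deserves a look."""
    now = time.time() if now is None else now
    cutoff = now - RECENT_DAYS * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file is locked or vanished during the walk
            reasons = []
            if st.st_size > SIZE_THRESHOLD:
                reasons.append("large")
            if st.st_mtime >= cutoff:
                reasons.append("recent")
            if is_hidden(st):
                reasons.append("hidden")
            if reasons:
                yield path, reasons
```

Calling `scan(r"C:\WINDOWS\system32")` and printing each result gives a starting list to compare against your known good state. Modification times can be forged by an attacker, so treat the "recent" flag as a hint only.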
Unusual registry entries
- Look for unusual programs launched at boot time in the Windows registry:
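As an illustration of the registry check, the sketch below reads one autostart key and reports entries that are not on a known-good list. The key path is an assumption (a commonly used autostart location, not one named in this cheat sheet), other autostart locations exist, and the `read_run_entries` and `unexpected_entries` helpers are hypothetical. The reader function uses the standard-library `winreg` module and therefore runs on Windows only.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: one commonly used autostart key under HKEY_LOCAL_MACHINE.
RUN_KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Run"


def read_run_entries(subkey: str = RUN_KEY) -> Dict[str, str]:
    """Read name -> command pairs from the given HKLM subkey (Windows only)."""
    import winreg  # standard library, available only on Windows

    entries = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # (subkeys, values, mtime)
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            entries[name] = str(data)
    return entries


def unexpected_entries(entries: Dict[str, str], known_good: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (name, command) pairs whose name is not on the known-good list."""
    known = {name.lower() for name in known_good}
    return sorted((n, c) for n, c in entries.items() if n.lower() not in known)
```

On the suspect machine, `unexpected_entries(read_run_entries(), baseline_names)` would list boot-time programs missing from your baseline, where `baseline_names` comes from your own documentation of the system.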
Unusual processes and services
- Check all running processes for unusual/unknown entries, especially processes with username “SYSTEM” and “ADMINISTRATOR”:
- C:\> taskmgr.exe (or tlist, tasklist depending on Windows release)
- Use Sysinternals Process Explorer (procexp) if possible.
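If you save the process list to a file, the review for unknown privileged processes can be sketched in a few lines. The example assumes the list was exported with `tasklist /V /FO CSV`, whose header includes the "Image Name", "PID", and "User Name" columns on English-language systems; the sample rows and the `flag_privileged_unknown` helper are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, List, Tuple

# Account name endings treated as privileged, following the checklist.
PRIVILEGED_SUFFIXES = ("SYSTEM", "ADMINISTRATOR")

# Hypothetical sample of `tasklist /V /FO CSV` output.
SAMPLE = r'''"Image Name","PID","Session Name","Session#","Mem Usage","Status","User Name","CPU Time","Window Title"
"lsass.exe","700","Services","0","12,345 K","Unknown","NT AUTHORITY\SYSTEM","0:00:01","N/A"
"svch0st.exe","4120","Services","0","8,000 K","Unknown","NT AUTHORITY\SYSTEM","0:00:09","N/A"
"notepad.exe","5000","Console","1","4,000 K","Running","PC01\alice","0:00:00","Untitled - Notepad"
'''


def parse_tasklist_csv(text: str) -> List[Dict[str, str]]:
    """Parse CSV process output into one dict per process."""
    return list(csv.DictReader(io.StringIO(text)))


def flag_privileged_unknown(rows: Iterable[Dict[str, str]], known_images: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (image, pid, user) for privileged processes not in the baseline."""
    known = {name.lower() for name in known_images}
    flagged = []
    for row in rows:
        user = (row.get("User Name") or "").upper()
        image = row.get("Image Name") or ""
        if user.endswith(PRIVILEGED_SUFFIXES) and image.lower() not in known:
            flagged.append((image, row.get("PID") or "", row.get("User Name") or ""))
    return flagged


flagged = flag_privileged_unknown(parse_tasklist_csv(SAMPLE), ["lsass.exe"])
```

A name match alone proves nothing, since malware can reuse a legitimate image name; check the file path and signature of anything flagged, for example in Process Explorer.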
Unusual network services running
- Look for unusual/unexpected network services installed and started:
- C:\> services.msc
- C:\> net start
Unusual network activity
- Check for file shares and verify each one is linked to a normal activity:
- C:\> net view \\127.0.0.1
- Use SysInternals TCPView (tcpview) if possible.
- Look at the opened sessions on the machine:
- Look at the sessions the machine has opened with other systems:
- Look for any suspicious Netbios connections:
- Look for any suspicious activity on the system’s ports:
- C:\> netstat -na 5 (5 makes it refresh every 5 seconds)
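To compare open ports against what you expect on this machine, you can capture the output of `netstat -na` and parse it. The sketch below assumes the English-language output layout (protocol, local address, foreign address, state); the sample text, the expected port set, and the `listening_tcp_ports` helper are hypothetical.

```python
from typing import Set

# Hypothetical sample of `netstat -na` output.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:4444           0.0.0.0:0              LISTENING
  TCP    192.168.1.10:49712     203.0.113.5:443        ESTABLISHED
  TCP    [::]:445               [::]:0                 LISTENING
  UDP    0.0.0.0:5353           *:*
"""


def listening_tcp_ports(netstat_output: str) -> Set[int]:
    """Collect local TCP ports that are in the LISTENING state."""
    ports = set()
    for line in netstat_output.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "TCP" and parts[3] == "LISTENING":
            # rpartition handles both "0.0.0.0:135" and "[::]:445".
            _host, _sep, port = parts[1].rpartition(":")
            if port.isdigit():
                ports.add(int(port))
    return ports


expected_ports = {135, 445}  # assumption: taken from your documented baseline
unexpected_ports = listening_tcp_ports(SAMPLE) - expected_ports
```

Any port left in `unexpected_ports` should be traced back to the owning process, for example with TCPView.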
Check startup folders of user accounts
- Look for unusual startup programs for all users (path depends on Windows release):
- C:\Documents and Settings\<USER>\Start Menu\Programs\Startup
- C:\WinNT\Profiles\<USER>\Start Menu\Programs\Startup
Check for unusual automated tasks
- Look at the list of scheduled tasks for any unusual entries:
Check for unusual log entries
If you are using Splunk:
- Search for “index=<index-of-your-Windows-Event Logs> XXXX”
- Some Windows Event IDs to look for (depends on your OS):
- 64004 - Windows File Protection warning event.
- 4688|592 - New process created. Look for unusual processes or wrong names (spelling is off, lowercase drive letters, extra spaces).
- 1001|4097 - Application crash. Look for buffer overflow as the cause.
- 601|4697 - Created and installed a new service.
- 602|4698 - Created a new scheduled task.
- 567|4657 - Modify registry key for service to start at boot.
- 7034, 7035, 7036, 7040 - Virus protection mechanism changes.
- Use Event Viewer locally on the system:
- C:\> eventvwr.msc
- Search for events affecting the firewall, the anti-virus, the file protection, or any suspicious new service.
- Look for a large number of failed login attempts or locked-out accounts.
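If you export log entries for offline review, the event IDs listed above can be used as a simple filter. The sketch below assumes each exported record is a dictionary with an "EventID" field; that record format, the sample records, and the `summarize` helper are hypothetical, and the descriptions are shortened from the list above.

```python
from collections import Counter
from typing import Dict, Iterable

# Event IDs from the checklist above, mapped to short descriptions.
WATCHED_EVENTS = {
    64004: "Windows File Protection warning",
    4688: "New process created", 592: "New process created",
    1001: "Application crash", 4097: "Application crash",
    601: "New service installed", 4697: "New service installed",
    602: "New scheduled task", 4698: "New scheduled task",
    567: "Registry key modified", 4657: "Registry key modified",
    7034: "Virus protection mechanism change",
    7035: "Virus protection mechanism change",
    7036: "Virus protection mechanism change",
    7040: "Virus protection mechanism change",
}


def summarize(records: Iterable[Dict[str, str]]) -> Counter:
    """Count watched events by description, ignoring everything else."""
    counts = Counter()
    for record in records:
        try:
            event_id = int(record.get("EventID", ""))
        except (TypeError, ValueError):
            continue  # skip records with a missing or malformed ID
        if event_id in WATCHED_EVENTS:
            counts[WATCHED_EVENTS[event_id]] += 1
    return counts


# Hypothetical exported records.
records = [{"EventID": "4688"}, {"EventID": "4688"}, {"EventID": "4698"},
           {"EventID": "4624"}, {"EventID": "n/a"}]
summary = summarize(records)
```

The counts only tell you where to look; each flagged event still needs to be read in context in Event Viewer or your log platform.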
Check for Rootkits
- Use two or more of the following rootkit detectors and compare their results, as false positives can occur:
Check for Malware
Run at least one anti-virus product on the whole disk, and several if possible. Each anti-virus product must be fully up to date before you run it.
Step 3: Containment
Determine, by questioning the owner of the system, what other systems they have access to or have logged into.
NOTE: Any user who accessed the compromised machine during the time it was compromised should change their passphrase IMMEDIATELY.
Determine if the system contains sensitive data such as Social Security Numbers or credit card numbers by reviewing your asset inventory or interviewing the users of the system. If the system stored sensitive data, contact the Division of IT Security Operations Center immediately at firstname.lastname@example.org or 301-226-4225.
Choose one of the following options to proceed, taking into account the criticality of the compromised system:
Option 1 - For critical systems that CANNOT be disconnected from the network
If the machine is considered critical for your department/unit’s business activity and CANNOT be disconnected from the network:
- Consider making a copy of the system’s memory for further analysis. Tools to assist with memory capture:
- Back up all important data onto an external hard drive or USB thumb drive.
- Change all local and domain passphrases for all accounts associated with the system. Consider removing any accounts used by the attacker.
- Patch the system and all neighboring systems.
Option 2 - For non-critical systems that CAN be disconnected from the network
If the machine is not critical for your department/unit’s business activity and CAN be disconnected from the network:
- Remove the system from the network by unplugging the Ethernet cable or disabling the wireless network adapter.
- Turn off the system by unplugging its power cord from the wall outlet (for desktop systems/server), or on a laptop press the “off” button for a number of seconds until it shuts off.
Step 4: Eradication
- Remove all malicious files and software installed by the attacker.
- If rootkits were found, you MUST build the system from scratch, formatting the drive first.
- If you are able to restore from backup, locate the most recent clean backup of the system.
- Move the rebuilt system to a new name or IP address and change DNS names.
- Ensure rebuilt system is properly patched and hardened.
- Perform a vulnerability analysis across your network to ensure no other systems were compromised in the same way. If you find any, cycle back through the incident handling process for those systems.
Step 5: Recovery
- Validate that the restored system is back to a normal state.
- Determine an appropriate time to bring the restored, validated system back online. You should monitor the system carefully from the moment it is placed in production.
- Monitor OS and application logs.
- Look for attacker artifacts, which would signal a re-compromise:
- Changes to configuration through registry keys and values.
- Unusual processes.
- Reappearance of accounts previously used by the attacker.
Step 6: Lessons learned
A report should be written and made available, covering the following topics:
- How was this initially detected?
- Timeline of important events of the incident.
- Actions taken (most importantly during containment, eradication, and recovery).
- What went right?
- What went wrong?
- Incident cost to the department.
Ensure that all parties involved in the incident handling process agree to what is written in the report. If someone strongly disagrees, they should write their own report to document the incident from their point of view.
The report(s) should be reviewed by the team and potentially upper management and discussed in a meeting held within two weeks of resuming production.