Cull Data by Date
Documents that existed before the data collection date but were not present during the breach event are commonly culled from data mining efforts. Potentially compromised data is typically collected after the breach has been remediated. Running the date searches outlined below can help assessors identify which documents to include in the data mining process. Use the following dates to perform these searches, based on available information:
- Remediation Date
- The date when the incident response team is confident that the breach has been remediated. This date can be used to cull the data collected.
- Data Collection Date
- The date when the incident response team performed its first forensic data collection for upload after the remediation date. This date can also be tracked for each collection instance, allowing for culling based on either the custodian or the upload set. Data collections that occurred prior to the remediation date do not need to be culled by date, as that data would potentially be compromised.
Considerations for Incident Response Teams:
- Remediation - Certainty that hackers are no longer present is assumed through remediation. If a hacker is found to still be present in the system, then all data collected, regardless of the date, is potentially compromised.
- Files Without Dates - Depending on how the data collection was performed, some files may lack dates in their properties, except for the date the file was uploaded to the system. An example of this is when a text or CSV file is zipped without capturing the file system’s metadata during the zipping process. In this case, Canopy uses the date the file was uploaded to Canopy.
- Files Changed and Moved Between Remediation and Collection Events - Changes made to data after remediation may result in incorrect creation dates or inaccurate “last modified date” values for some file types.
For instance, suppose a group of text and CSV files was present during the breach, and the following actions were taken on these files after remediation but before data collection began:
- The files were moved to another directory without using a robust copy, and
- The document headers were modified in a way that caused the last modified date to be updated to a date after remediation.
In this case, these files could be wrongly excluded from the scope, even though they were present during the breach, because their creation and last modified dates fall after the remediation date but before the collection date.
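The scenario above can be illustrated with a minimal Python sketch. This is not Canopy code: the function name `likely_not_compromised` and the sample timestamps are hypothetical, and the sketch assumes a document is scoped out only when both its creation and last modified dates fall strictly between the remediation and collection dates.

```python
from datetime import datetime

# Hypothetical sample dates, chosen only for illustration.
REMEDIATED = datetime(2024, 3, 1, 12, 10, 38, 292000)
COLLECTED = datetime(2024, 4, 10, 8, 37, 12)


def likely_not_compromised(created, modified,
                           remediated=REMEDIATED, collected=COLLECTED):
    """Return True when both dates fall strictly between remediation and collection."""
    return (remediated < created < collected
            and remediated < modified < collected)


# A file present during the breach, with its original dates intact.
original = likely_not_compromised(datetime(2024, 2, 1), datetime(2024, 2, 15))

# The same file after a post-remediation move (new creation date)
# and a header edit (new last modified date).
moved = likely_not_compromised(datetime(2024, 3, 5), datetime(2024, 3, 6))

print(original)  # False: the file stays in scope for data mining
print(moved)     # True: the file would be culled despite being present during the breach
```

The second result shows why files handled after remediation may need to be reviewed separately rather than culled by date alone.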
Here are a few illustrative search examples to scope data for mining, given the following dates:
- Date Remediated
- 2024-03-01 12:10:38.292
- Date Collected
- 2024-04-10 08:37:12.000
- Potentially Compromised Documents that Contain Dates
- This query captures any document that was last modified or created prior to the remediation date:
meta.master_created_datetime:[* TO "2024-03-02 00:00:00.000"] OR meta.master_modified_datetime:[* TO "2024-03-02 00:00:00.000"]
- Potentially Compromised Documents without Dates
- This query captures documents whose creation and modified dates were not contained within the data, so both dates reflect the upload to Canopy on or after the collection date:
meta.master_created_datetime:["2024-04-10 00:00:00.000" TO *] AND meta.master_modified_datetime:["2024-04-10 00:00:00.000" TO *]
- Potentially Compromised Documents to Data Mine
- Putting both of the above searches together, the administrator can find the data that was potentially compromised. Here are two examples:
- Using precise date/times:
(meta.master_created_datetime:[* TO "2024-03-01 12:10:38.292"] OR meta.master_modified_datetime:[* TO "2024-03-01 12:10:38.292"]) OR (meta.master_created_datetime:["2024-04-10 08:37:12.000" TO *] AND meta.master_modified_datetime:["2024-04-10 08:37:12.000" TO *])
- Using dates:
(meta.master_created_datetime:[* TO "2024-03-02 00:00:00.000"] OR meta.master_modified_datetime:[* TO "2024-03-02 00:00:00.000"]) OR (meta.master_created_datetime:["2024-04-10 00:00:00.000" TO *] AND meta.master_modified_datetime:["2024-04-10 00:00:00.000" TO *])
- Documents Not Likely Compromised
- This search returns the set of documents where both the creation and modified dates are between the remediation and collection dates. The curly brackets make the range exclusive: documents with dates between the two values are included, but documents whose date and time exactly match either endpoint are not.
(meta.master_created_datetime:{"2024-03-01 12:10:38.292" TO "2024-04-10 08:37:12.000"} AND meta.master_modified_datetime:{"2024-03-01 12:10:38.292" TO "2024-04-10 08:37:12.000"})
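Because these searches repeat the same two timestamps several times, it can be less error-prone to generate them than to type them by hand. The following Python sketch is one way to do that. It only builds query strings and does not call Canopy; the helper names (`fmt`, `compromised_query`, `not_compromised_query`) are hypothetical, and it assumes zero-padded `YYYY-MM-DD HH:MM:SS.mmm` timestamps. The field names are the ones used in the searches above.

```python
from datetime import datetime

# Field names taken from the example searches above.
FIELDS = ("meta.master_created_datetime", "meta.master_modified_datetime")


def fmt(dt):
    """Format a datetime as zero-padded YYYY-MM-DD HH:MM:SS.mmm."""
    return dt.strftime("%Y-%m-%d %H:%M:%S.") + f"{dt.microsecond // 1000:03d}"


def compromised_query(remediated, collected):
    """Created or modified up to remediation, OR both dates at or after collection."""
    dated = " OR ".join(f'{f}:[* TO "{fmt(remediated)}"]' for f in FIELDS)
    undated = " AND ".join(f'{f}:["{fmt(collected)}" TO *]' for f in FIELDS)
    return f"({dated}) OR ({undated})"


def not_compromised_query(remediated, collected):
    """Both dates strictly between remediation and collection (exclusive range)."""
    between = " AND ".join(
        f'{f}:{{"{fmt(remediated)}" TO "{fmt(collected)}"}}' for f in FIELDS
    )
    return f"({between})"


remediated = datetime(2024, 3, 1, 12, 10, 38, 292000)
collected = datetime(2024, 4, 10, 8, 37, 12)

print(compromised_query(remediated, collected))
print(not_compromised_query(remediated, collected))
```

With the dates used in this article, the two printed strings correspond to the precise date/time search and the "Documents Not Likely Compromised" search shown above.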