TABLE OF CONTENTS
- Processing Options
- Evidence Processing
- Compound Files
- Search Text Index
- Data Carving
- Creating Thumbnails for Videos
- Creating Common Video Files
- Optical Character Recognition
- Explicit Image Detection
- Cerberus Analysis
- About Cerberus Stage 1 Threat Analysis
- About Cerberus Score Weighting
- About Cerberus Override Scores
- Running Cerberus Analysis
- Filtering Scanned Files and Viewing Threat Scores
- Cerberus Stage 1 Threat Scores
- Cerberus Stage 1 File Information
- About Cerberus Stage 2 Static Analysis
- Cerberus Stage 2 Function Call Data
- File Access Call Categories
- Networking Functionality Call Categories
- Process Manipulation Call Categories
- Security Access Call Categories
- Windows Registry Call Categories
- Surveillance Call Categories
- Uses Cryptography Call Categories
- Low-level Access Call Categories
- Loads a driver Call Categories
- Subverts API Call Categories
- Document Content Analysis
- Language Identification
- Entity Extraction
- Lab/E-discovery Options
- Evidence Refinement
- Index Refinement
- Creating Custom Processing Profiles
Processing Options
Processing, in simple terms, is the treatment applied to the data in added evidence; the results are stored in the database to facilitate an efficient review of the data.
Generally, processing is done right away while loading the evidence into a case, or just prior to performing an analysis of the data. Typically, data processing involves any or all of the following:
- Generating hash values for the files in the evidence.
- Categorizing the data by file types such as graphics, Office documents, encrypted files, etc.
- Extracting the contents of container and compound files, such as ZIP and TAR files.
- Creating an index of the words encountered in the evidence files for quick searching and retrieval.
- Creating thumbnails for the graphics and videos in the evidence for easier identification.
- Decrypting encrypted files, if any.
- Identifying files that may need attention before review, such as (Windows) system files and archive files.
Elements of Processing Options
- Evidence Processing
- Compound Files
- Search Text Index
- Data Carving
- Creating Thumbnails for Videos
- Creating Common Video Files
- Optical Character Recognition
- Explicit Image Detection
- Cerberus Analysis
- Document Content Analysis
- Language Identification
- Entity Extraction
- Lab/E-discovery Options
- Evidence Refinement
- Index Refinement
- Processing Profiles
Evidence Processing
When a case is created, you can define the default processing options that are to be used whenever evidence is added to that case. By specifying default processing options for a case, you do not have to manually configure the processing options each time you add new evidence. The case-level defaults can be overridden and customized when you add new evidence or when you perform an additional analysis.
Commonly Used Processing Options
Processing Option | Description |
MD5 Hash | Creates a digital fingerprint using the Message Digest 5 algorithm, based on the contents of the file. This fingerprint can be used to verify file integrity and to identify duplicate files. |
SHA-1 Hash | Creates a digital fingerprint using the Secure Hash Algorithm-1, based on the contents of the file. This fingerprint can be used to verify file integrity and to identify duplicate files. |
SHA-256 Hash | Creates a digital fingerprint using the Secure Hash Algorithm-256, based on the contents of the file. This fingerprint can be used to verify file integrity and to identify duplicate files. SHA-256 is a hash function computed with 32-bit words, giving it a longer digest than SHA-1. |
Flag Duplicate Files | Identifies files that are found more than once in the evidence. This is done by comparing file hashes (a hashing sketch follows this table). |
KFF | Enables the Known File Filter (KFF), which lets you identify either known insignificant files that you can ignore or known illicit or dangerous files that you want to be alerted to. See the Known File Filter section. |
Expand Compound Files | Automatically extracts and processes the contents of compound files such as ZIP, email, and OLE files. |
Expand Compound Image Files | Expands any evidence image files contained within an evidence image file and adds their contents to the evidence. |
Enhanced File Identification | Enables additional processing to determine the contents of multimedia files. Note: You are advised to perform this processing since some multimedia files may be misidentified without it. |
File Signature Analysis | File Signature Analysis is an optional processing option. It lets you initially see the contents of compound files without necessarily having to process them. Processing can be done later, if it is deemed necessary or beneficial to the case, by selecting File Signature Analysis. |
Flag Bad Extensions | Identifies files whose types do not match their extensions, based on the file header information. Enabling this automatically forces File Signature Analysis to be selected. |
Entropy Test | Performs an entropy test. This is useful in conjunction with indexing, for example to avoid indexing binary data. |
Include Deleted Files | Scans enumerated objects (for example, file systems, ZIP archives, and email archives) for deleted items. Note: If this is not selected during initial processing, it cannot be done later. |
Search Text Index | Stores the words from evidence in an index for quick retrieval. Using this processing option requires additional space, approximately 25% of the space needed for the total evidence in the case. When FTK Central creates a full-text index of evidence and places the text characters in an index file with the case, it does not capture spaces or the symbols that are configured as spaces in the indexing options (see Configuring Case Indexing Options). |
Create Thumbnails for Graphics | Creates thumbnails for all graphics in the case. The thumbnails are always created in JPG format, regardless of the original graphic file type. |
Create Thumbnails for Videos | Creates thumbnails for all videos in the case. The thumbnails are always created in JPG format, regardless of the original video file type. Note: You can set the frequency for picking thumbnails from the video, either as a percentage (1 thumbnail per n% of the video) or as a time interval (1 thumbnail per n seconds of the video). |
Generate Common Video File | When you process the evidence in your case, you can choose to create a common video type for all the videos in your case. These common video types are not the actual video files from the evidence, but a copied conversion of the media that is generated and saved as an MP4 file that can be previewed on the video tab. |
EXIF for Videos | Parses XMP metadata (similar to EXIF data) from processed MP4 files and most other modern video file formats. When parsed from a video file, the metadata values are displayed on the Properties tab of the file viewer pane. |
HTML File Listing | Creates an HTML version of the File Listing in the case folder. |
CSV File Listing | Creates a File Listing Database in CSV format instead of an MDB file. |
Data Carve | Carves data immediately after pre-processing. This uses file signatures to identify deleted files contained in the evidence. |
Meta Carve | Carves deleted directory entries and other metadata. The deleted directory entries often lead to data and file fragments that can prove useful to the case and that could not be found otherwise. |
Optical Character Recognition (OCR) | Scans graphic files for text and converts graphics-text into actual text. That text can then be indexed, searched, and treated as any other text in the case. |
Explicit Image Detection | Generates explicit image scores (range 0-100) for graphic files. |
Cerberus Analysis | Calculates the Cerberus Stage 1 score for the evidence. See the Cerberus Analysis section. |
Process Internet Browser History for Visualization | Processes internet browser history files so that you can see them in the detailed visualization timeline. |
Language Identification | Analyzes the first two pages of every document to identify the languages contained within. |
Document Content Analysis | Analyzes the content and groups it according to topic in the Overview tab. |
Entity Extraction | Identifies and extracts specific types of data in your evidence. See the Entity Extraction section. |
Enable File Encryption Detection | Identifies files that may be encrypted. |
Perform Automatic Decryption | Attempts to decrypt files using a list of passwords that you provide. |
Populate Family for FTK Central | Makes SMS and MMS messages (and their associated family objects / attachments) available for review in FTK Central. |
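The following is a minimal Python sketch, not part of FTK, showing in principle how the hash options above (MD5, SHA-1, SHA-256) and hash-based duplicate flagging work; the evidence paths in the usage comment are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digests(path: Path, chunk_size: int = 1 << 20) -> dict:
    """Compute MD5, SHA-1, and SHA-256 digests of a file by streaming its contents."""
    hashes = {"md5": hashlib.md5(), "sha1": hashlib.sha1(), "sha256": hashlib.sha256()}
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            for h in hashes.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashes.items()}

def flag_duplicates(paths: list) -> dict:
    """Group files sharing the same SHA-256 digest, i.e. files with identical content."""
    groups = defaultdict(list)
    for p in paths:
        groups[file_digests(p)["sha256"]].append(p)
    return {digest: ps for digest, ps in groups.items() if len(ps) > 1}

# Example (hypothetical paths):
# dupes = flag_duplicates([Path("evidence/a.doc"), Path("evidence/b.doc")])
```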
Compound Files
You can expand individual compound file types. This lets you see child files that are contained within a container such as ZIP files. You can access this feature during case creation and additional analysis.
Be aware of the following before you expand compound files:
- If you have labeled or hashed a family of files, then later choose to expand a compound file type that is contained within that label or family, the newly expanded files do not inherit the labeling from the parent, and the family hashes are not automatically regenerated.
- Compound file types such as AOL, Blackberry IPD Backup, EMFSpool, EXIF, MSG, PST, RAR, and ZIP can be selected individually for expansion.
- Only the file types selected are expanded. For example, if you select ZIP, and a RAR file is found within the ZIP file, the RAR is not expanded.
Note: When you expand data, the case will contain files that were generated when the data was processed and that were not part of the original data.
Filtering the Compound File Expansion Options List
It is possible to filter the Compound File Expansion Options list by category. Use the Categories dropdown at the top of the list to select a category. Use the Select All and Deselect All buttons to select or clear all options within the selected category.
Supported Compound File Types
Compound File Expansion Options Category List
Category | Description |
All | This is the full list of supported Compound File Expansion Options. |
All Communication | This option includes all supported file types that are used for communication. |
All Mobile | This option includes all supported file types found on any mobile device. |
Archives | This option includes all supported archive file types. |
Browsers | This option includes all supported file types used within a browser. |
Email | This option includes all supported email file types. |
Logs | This option includes all supported log file types. |
Other Forensic Tools | This option includes all supported third-party forensic tool image types. |
Windows | This option includes all supported file types used within a Windows system. |
Search Text Index
All evidence should be indexed to aid in searches. Index evidence when it is added to the case by checking the Search Text Index option in the process evidence dialog, or index after the fact by running Additional Analysis and specifying indexing options.
Scheduling is another factor in determining which process to select. Time constraints may not allow for all tasks to be performed initially. For example, if you disable indexing, it shortens the time needed to process a case. You can return at a later time and index the case if needed.
Warning: When performing this processing option, note that File Slack and Free Space are not indexed by default. These areas can be indexed by selecting the Index Refinement options.
Search Text Indexing Space Requirements
To estimate the space required for a Search Text Index, plan on approximately 25% of the space needed for each case’s evidence.
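For example, a case containing 400 GB of evidence would need roughly 400 GB × 0.25 = 100 GB of additional space for its text index.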
Configuring Case Indexing Options
Case Indexing gives you almost complete control over what goes into your case index. These options can be applied globally during case creation.
Option | Description |
Letters | Specifies the letters and numbers to index. Specifies Original, Lowercase, Uppercase, and Unaccented. Choose Add or Remove to customize the list. You may need to add characters to this list for specific index searches to function properly. For example, you may want to do an index search for ‘Password#123’. By default, the # symbol is treated as a space and is not indexed. To have the # symbol included in the index, you would need to do two things: add the # character to this Letters list, and remove the # character from the Spaces list (see the tokenization sketch after this table). |
Noise Words | A list of words to be considered “noise” and ignored during indexing. Choose Add or Remove to customize the list. |
Hyphens | Specifies which characters are to be treated as hyphens. You can add standard keyboard characters, or control characters. You can remove items as well. |
Hyphen Treatment | Specifies how hyphens are to be treated in the index. Options are:
- Hyphens will be treated as if they never existed. For example, the term “counter-culture” would be indexed as “counterculture.”
- Hyphens will be treated literally. For example, the term “counter-culture” would be indexed as “counter-culture.”
- Hyphens will be replaced by a non-breaking space. For example, the term “counter-culture” would be indexed as two separate entries, “counter” and “culture.”
- Terms with hyphens will be indexed using all three hyphen treatments. For example, the term “counter-culture” will be indexed as “counterculture”, “counter-culture”, and as two separate entries, “counter” and “culture.” |
Spaces | Specifies which special characters should be treated as spaces. Remove characters from this list to have them indexed as any other text. Choose Add or Remove to customize the list. You may need to remove characters from this list for specific index searches to function properly. For example, you may want to do an index search for ‘Password#123’. By default, the # symbol is treated as a space and is not indexed. To have the # symbol included in the index, you would need to do two things: add the # character to the Letters list, and remove the # character from this Spaces list. |
Note: If special characters (e.g. @, ., etc. used in emails or other strings) are configured as spaces within the indexing options, users must search for single words beginning/ending with an asterisk (*). dtSearch does not allow partial words to be searched for without a wildcard asterisk. |
Ignore | Specifies which control characters or other characters to ignore. Choose Add or Remove to customize the list. |
Max Word Length | Allows you to set a maximum word length to be indexed. |
Index Binary Files | Specifies how binary files are to be indexed. |
Enable Date Recognition | Choose to enable or disable this option. |
Presumed Date Format for Ambiguous Dates | If date recognition is enabled, specifies which date format to presume when an ambiguous date (for example, 03/04/2021) is encountered during indexing. |
Set Max Memory | Allows you to set a maximum size for the index. |
Auto-Commit Interval (MB) | Allows you to specify an Auto-Commit Interval while indexing the case. When the index reaches the specified size, the indexed data is saved to the index. The size resets, and indexing continues until it reaches the maximum size, and saves again, and so forth. |
Cache Filtered Text in Index | Filtered text is cached in the dtSearch index by default; however, caching can be toggled on or off. The advantage of caching filtered text is that it produces more reliable search hit highlighting and reduces the time to return index search results. However, not caching filtered text results in a smaller index and a shorter time to complete the indexing process. |
Modify for TR1 Expressions | Configures the indexing engine to index TR1 regular expressions. When this option is selected, a set of special characters (for example, /, @, :) is automatically added to the Letters section so that these characters are included in the search index and appear in search results. The added special characters should also be removed from the Spaces list. |
Create Optional Accent Sensitive Index | Generates the index in such a way that, when the “Accents are Significant” option is enabled for index searching, the investigator can optionally control whether characters with accent marks are distinguished from those without, for example “abc” versus “äbc”. FTK has always defaulted, and still defaults, to an accent-sensitive index. This means that “abc” will only find “abc” and “äbc” will only find “äbc”. |
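To illustrate why the Letters and Spaces lists matter for searches such as ‘Password#123’, here is a tiny Python sketch (not the dtSearch engine itself) showing how a string tokenizes when # is treated as a space versus added to the indexable characters:

```python
import re

def tokenize(text: str, extra_letters: str = "") -> list:
    """Split text into index tokens; any character not in the letter set acts like a space."""
    letters = r"A-Za-z0-9" + re.escape(extra_letters)
    return re.findall(f"[{letters}]+", text)

print(tokenize("Password#123"))                     # ['Password', '123']  (# treated as a space)
print(tokenize("Password#123", extra_letters="#"))  # ['Password#123']     (# added to Letters)
```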
Data Carving
Data carving is the process of looking for data on media that was deleted or lost from the file system. Often this is done by identifying file headers and/or footers, and then “carving out” the blocks between these two boundaries.
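As a simplified illustration of the header/footer approach (not Exterro's actual carver implementation), the following Python sketch scans a raw image for JPEG start-of-image and end-of-image markers and carves out the bytes between them; the file paths in the usage comment are hypothetical.

```python
from pathlib import Path

JPEG_HEADER = b"\xff\xd8\xff"  # JPEG start-of-image marker
JPEG_FOOTER = b"\xff\xd9"      # JPEG end-of-image marker

def carve_jpegs(image_path: Path, out_dir: Path) -> int:
    """Carve JPEG candidates from a raw image by matching header/footer byte signatures."""
    data = image_path.read_bytes()          # simplification: reads the whole image into memory
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    start = data.find(JPEG_HEADER)
    while start != -1:
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break
        # Write everything between the header and the end of the footer as one candidate file.
        (out_dir / f"carved_{count:04d}.jpg").write_bytes(data[start:end + len(JPEG_FOOTER)])
        count += 1
        start = data.find(JPEG_HEADER, end + len(JPEG_FOOTER))
    return count

# Example (hypothetical paths):
# carve_jpegs(Path("evidence.dd"), Path("carved_output"))
```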
Exterro provides several specific pre-defined carvers that you can select when adding evidence to a case. Data carving can be selected in the Case Creation dialog or from Additional Analysis.
Supported Carving Options
Importing Data Carvers
To import data carvers:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options.
2. Select Data Carve.
3. Click the Context menu to open the configuration.
- The Carving Options pop-up is displayed.
4. Click Import.
5. In the Windows Explorer dialog, select the carvers to import. They must be .XML files.
6. Click Open.
7. The carver(s) selected will be imported and available for use globally.
Note: Users can delete any imported carvers by clicking the delete button.
Creating Thumbnails for Videos
You can generate thumbnail graphics based on the content that exists within video files in your case. Video thumbnail generation is accomplished during processing. You can either set up video thumbnail generation during Case Creation, or you can run the processing against an existing case by using Additional Analysis.
To generate thumbnails for videos:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options.
2. Select Create Thumbnails for Videos.
3. Click the Context menu to open the configuration.
- The Video Formatting Options pop-up is displayed.
4. Set the following values:
- Percent – This option generates thumbnails against videos based on the percentage of a video's total content. For example, if you set this value to 5, then at every 5% of the video a thumbnail is generated.
- Interval – This option generates thumbnails against videos based on seconds. For example, if you set this value to 5, then at every 5 seconds within a video, a thumbnail is generated.
5. Click Apply.
6. Click Process Data or Run Analysis.
Creating Common Video Files
When you process the evidence during Case Creation or during Additional Analysis, you can choose to create a common video type for videos in your case. These common video types are not the actual video files from the evidence, but a copied conversion of the media that is generated and saved as an MP4 file that can be previewed in the viewer.
To create a common video file:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options
2. Select Create Common Video Files.
3. Click the Context menu to open the configuration.
- The Video Formatting Options pop-up is displayed.
4. Set the following values:
- Lines of Resolution – Sets the number of vertical lines in the video. The higher it is, the better the resolution.
- Bit Rate – Sets the bit rate in Kbps. The higher it is, the better the video quality.
5. Click Apply.
6. Click Process Data or Run Analysis.
Optical Character Recognition
The Optical Character Recognition (OCR) process lets you extract text that is contained in graphics files. The text is then indexed so that it can be searched and bookmarked.
Running OCR against a file type creates a new child file item. The graphic files are processed normally, and another file containing the parsed text from the graphic is created. The new OCR file is named the same as the parent graphic, [graphicname.ext], but with the added extension .ocr, for example, graphicname.ext.ocr.
You can view a graphic file in the Viewer when it is selected in the Grid View. The Native tab shows the graphic in its original form. The Text tab shows the OCR text that was added to the index.
The LeadTools OCR engine can be selected in the Case Processing and Additional Analysis areas of the application interface. The ABBYY FineReader OCR engine integration is available as a separate add-on tool (with a separate license from ABBYY).
Before running OCR, be aware of the following:
- OCR is only a helpful tool for the investigator to locate images from index searches. OCR results should not be considered evidence without further review.
- OCR can have inconsistent results. OCR engines by nature have error rates. This means that it is possible to have results that differ between processing jobs on the same machine with the same piece of evidence.
- Some large images can cause OCR to take a very long time to complete. Under some circumstances, they may not generate any output.
- Graphical images that have no text or pictures with unaligned text can generate bad output.
- OCR is best on typewritten text that is cleanly scanned or similarly generated. All other picture files can generate unreliable output that can vary from run to run.
Running Optical Character Recognition
To run OCR:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options
2. Select Optical Character Recognition.
3. Click the Context menu to open the configuration.
- The OCR Options pop-up is displayed.
4. Configure the following:
Options | Description |
File Types | Specify which file types to include in the OCR process during case processing. For PDF files, you can also control the maximum filtered text size for which to run OCR against. |
Filtering Options | Specify a range of file sizes to include in the OCR process. You can also specify whether to run OCR only against black and white and grayscale images. The Restrict File Size option is selected by default. By default, OCR file generation is restricted to files larger than 5K. If you do not want to limit the size of OCR files, you must disable this option. |
Language | Specify the output language for the OCR text. |
Engine | Specify the processing engine to use for the OCR process. |
5. Click OK.
6. Click Apply.
7. Click Process Data or Run Analysis.
ABBYY FineReader Integration
FTK can leverage the Exterro API to access the ABBYY FineReader OCR engine integration, which provides a robust alternative OCR engine for indexing graphic image files. In addition to an Exterro FTK Central installation, the ABBYY product integration requires an add-on component installation and a license sold separately by ABBYY (not included with Exterro licensing; please contact sales@exterro.com). The option to select the ABBYY OCR engine in the processing options interface is grayed out until the add-on is properly installed and configured.
Note: To use ABBYY, you must have followed the KB article ABBYY/Zeta OCR: Installation
Optical Character Recognition: Confidence Score
There is an option to show the confidence score for each file that has been processed with OCR. It is recommended to use this feature to sort documents processed using OCR to determine which files may need to be manually reviewed for the desired keywords.
The OCR Confidence Score value may be one of the following:
Options | Description |
1-100% | The OCR confidence % score for a document that had a successful OCR process; the higher the score, the higher the confidence. |
No Score Available (2) | The OCR results are from a previous version. |
Minimal Confidence (1) | The OCR extraction is not in a supported language or is not clear. |
No Text Found (0) | The OCR process did not identify any text to extract. |
OCR Skipped (-1) | The OCR process was skipped due to some condition. |
OCR Extraction Error (-2) | The OCR process failed for that file. |
Blank | The file does not need the OCR process; for example, a .DOC file or email. |
To use the OCR Confidence Score:
- Process your data using the Optical Character Recognition option.
- Add a custom column named OcrScore. Refer to the Custom Columns section for more information.
Explicit Image Detection
Explicit Image Detection (EID) reads all graphics in a case and assigns a score to both the files and the folders that contain them, according to what it interprets as possibly illicit content.
Adding EID evidence to cases
To add EID evidence to a case:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options.
2. Select Explicit Image Detection.
3. Click the Context menu to open the configuration.
- The Explicit Image Detection Options pop-up is displayed.
4. Select the required option. The available profiles are described below:
Profile Name | Level | Description |
X-DFT | Default (XS1) | This is the most generally accurate. It is always selected. |
X-FST | Fast (XTB) | This is the fastest. It scores a folder by the number of files it contains that meet the criteria for a high likelihood of explicit material. It is built on a different technology than X-DFT and does not use “regular” DNAs. It is designed for very high volumes, or real-time page scoring. Its purpose is to quickly reduce, or filter, the volume of data to a meaningful set. |
X-ZFN | Less False Negatives (XT2) | This is a profile similar to X-FST but with more features and with fewer false negatives than X-DFT. You can apply this filter after initial processing to all evidence, or to only the folders that score highly using the X-FST option. Check-mark or highlight those folders to isolate them for Additional Analysis. In Additional Analysis, File Signature Analysis must be selected for EID options to work correctly. |
5. Click OK.
6. Click Apply.
7. Click Process Data or Run Analysis.
Tip: Exterro recommends that you run Fast (X-FST) for folder scoring, and then follow with Less False Negatives (X-ZFN) on high-scoring folders to achieve the fastest, most accurate results.
After you select EID in Evidence Processing or Additional Analysis, and the processing is complete, you must select or modify a filter to include the EID related columns in the Grid View.
Cerberus Analysis
Cerberus lets you do a malware analysis on executable binaries. You can use Cerberus to analyze executable binaries that are on a disk, on a network share, or that are unpacked in system memory.
Cerberus consists of the following stages of analysis.
- Stage 1: Threat Analysis
Cerberus stage 1 is a general file and metadata analysis that quickly examines an executable binary file for common attributes it may possess. It identifies potentially malicious code and generates and assigns a threat score to the executable binary.
- Stage 2: Static Analysis
Cerberus stage 2 is a disassembly analysis that takes more time to examine the details of the code within the file. It learns the capabilities of the binary without running the actual executable.
Cerberus first runs the Stage 1 threat analysis. After it completes Stage 1 analysis, it will then automatically run a static analysis against binaries that have a threat score that is higher than the designated threshold. Cerberus analysis may slow down the speed of your overall processing.
Warning: Cerberus momentarily writes binaries to the AD Temp folder in order to perform the malware analysis. Upon completion, it quickly deletes the binary. It is important to ensure that your antivirus software is not scanning the AD Temp folder; if the antivirus deletes or quarantines the binary from the temp folder, Cerberus analysis will not be performed.
Cerberus analyzes executable binary file types.
About Cerberus Stage 1 Threat Analysis
Cerberus stage 1 analysis is a general analysis for executable binaries. The Stage 1 analysis engine scans through the binary looking for malicious artifacts. It examines several attributes from the file's metadata and file information to determine its potential to contain malicious code within it. For each attribute, if the condition exists, Cerberus assigns a score to the file. The sum of all of the file’s scores is the file’s total threat score.
More serious attributes have higher positive scores, such as +20 or +30. Safer attributes have smaller or even negative numbers such as +5, -10 or -20.
The existence of any particular attribute does not necessarily indicate a threat. However, if a file contains several attributes, it will have a higher total score, which may indicate that the executable binary warrants further investigation. The higher the threat score, the more likely the file is to contain malicious code.
For example, you may have a file that had four attributes discovered. Those attributes may have scores of +10, +20, +20, and +30 for a sum of +80. You may have another file with four attributes of scores of +5, +10, -10, -20 for a sum of -15. The first file has a much higher risk than the second file.
Cerberus stage 1 analysis also examines each file’s properties and provides information such as its size, version information, signature etc.
About Cerberus Score Weighting
There are default scores for each attribute of Cerberus Stage 1 threat scoring. However, you can modify the scoring so that you can weigh the threat score attributes with your own values.
For example, the Bad Signed attribute has a default value of +20. You can give it a different weight of +30.
You must configure these scores before the files are analyzed.
About Cerberus Override Scores
Some threat attributes have override scores. If a file has one of these attributes, instead of the score being the sum of the other attributes, the score is overridden with a set value of 100 or -100. This is useful in quickly identifying files that are automatically considered either as a threat or safe. If a bad artifact is found that requires immediate attention, the file is given the maximum score. If an artifact is found that is considered safe, the file is automatically given the minimum score.
Score ranges have maximum and minimum values of -100 to 100.
- High threat signatures will result in a final score of 100.
- Low threat signatures will result in a final score of -100.
Cerberus attributes that have maximum override scores include:
- Bad signatures
- Revoked signatures
- Expired signatures
- Packed with known signature
Note: If any of these attributes are found, the score is overridden with a score of +100.
Cerberus Minimum override score includes:
- Valid digital signature
If this attribute is found, the score is overridden with a score of -100.
Note: If a file that is malware has a valid digital signature, the override will score the file as -100 (low threat), even though the file is really malware.
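As an illustration of the scoring and override behavior described above, here is a small Python sketch; the attribute names and weights are examples taken from this section, the precedence given to the maximum override is an assumption, and none of this reflects Cerberus's actual implementation.

```python
# Illustrative only: weights mirror some of the default Stage 1 scores listed in this section.
DEFAULT_WEIGHTS = {
    "Network": 5, "Persistence": 20, "Obfuscation": 30,
    "Bad Signed": 20, "PE Good": -10, "PE Malware": 30,
}
MAX_OVERRIDES = {"Bad signature", "Revoked signature", "Expired signature",
                 "Packed with known signature"}   # force a score of +100
MIN_OVERRIDES = {"Valid digital signature"}       # force a score of -100

def stage1_score(attributes: set, weights: dict = DEFAULT_WEIGHTS) -> int:
    """Sum the weights of detected attributes, apply any override, and clamp to [-100, 100]."""
    if attributes & MAX_OVERRIDES:      # assumed to take precedence over the minimum override
        return 100
    if attributes & MIN_OVERRIDES:
        return -100
    total = sum(weights.get(a, 0) for a in attributes)
    return max(-100, min(100, total))

# Example: four attributes scoring +10, +20, +20, and +30 would sum to +80,
# as in the worked example earlier in this section.
```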
Running Cerberus Analysis
To run Cerberus Analysis:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options.
2. Select Cerberus Analysis.
3. Click the Context menu to open the configuration.
- The Cerberus Analysis pop-up is displayed.
4. Select one from the options.
- In the Cerberus Analysis dialog, you can define the weight assigned to each Cerberus stage 1 score. These Stage 1 scores are designed to identify and score specific malware properties and traits.
- In the Cerberus Analysis dialog, you can choose the option Perform Cerberus Analysis stage 2 if stage 1 threshold is greater than the value provided. This option lets you choose to automatically run stage 2 analysis after stage 1 analysis completes. Do one of the following:
Options | Description |
To run stage 1 analysis only | Deselect the option Perform Cerberus Analysis stage 2 if stage 1 threshold is greater than. Only Cerberus Analysis stage 1 is then run. |
To run both stage 1 and 2 analysis | Select the option to Perform Cerberus Analysis stage 2 if stage 1 threshold is greater than n. Specify a threshold for a minimum threat score against which you want to run the stage 2 analysis. If a file’s threat score is higher than the threshold value that you set, then stage 2 is run. If a file’s threat score is lower than the threshold value, then stage 2 analysis is not run. By default, the threshold automatically runs stage 2 analysis against files with a threat score greater than +20. |
5. Click OK.
6. Click Apply.
7. Click Process Data or Run Analysis.
Filtering Scanned Files and Viewing Threat Scores
After you have processed evidence with Cerberus enabled, you can view a threat score for each executable file by filtering for scanned files. Using the CerberusScore column shows the Cerberus scores that were calculated during processing.
To filter scanned files and view threat scores:
- See Facet Filters section.
- Click Cerberus.
- Click Cerberus Stage 1 Analysis or Cerberus Stage 2 Analysis.
- Click your selected attribute type.
- See Using Custom Columns section.
- Use the Cerberus Score column.
Cerberus Stage 1 Threat Scores
The following table lists the threat scores that are provided in a Stage 1 analysis:
Attribute | Default Score | Description |
Network | +5 | The Network category is triggered when a program contains the functionality to access a network. This could involve any kind of protocol from high-level HTTP to a custom protocol written using low-level raw sockets. |
Persistence | +20 | Persistence indicates that the application may try to persist permanently on the host. For example, the application would resume operation automatically even if the machine were rebooted. |
Process | +5 | Process indicates the application may start a new process or attempt to gain access to inspect or modify other processes. Malicious applications attempt to gain access to other processes to obfuscate their functionality or attack vector or for many other reasons. For example, reading or writing into a process’s memory, or injecting code into another process. |
Crypto | +6 | Crypto is triggered when an application appears to use cryptographic functionality. Malicious software uses cryptography to hide data or activity from network monitors, anti-virus products, and investigators. |
Protected Storage | +10 | ProtectedStorage indicates that the application may make use of the Windows “pstore” functionality. This is used on some versions of Windows to store encrypted data on the system. For example, Internet Explorer stores a database for form-filling in protected storage. |
Registry | +5 | Registry is triggered when a target application attempts to use the registry to store data. The registry is commonly used to store application settings, auto-run keys, and other data that the application wants to store permanently but not in its own file. |
Security | +5 | Imports functions used to modify user tokens. For example, attempting to clone a security token to impersonate another logged on user. |
Obfuscation | +30 | Stage 1 searches for signs that the application is 'packed', or obfuscated in a way that hinders quick inspection. The Obfuscation category is triggered when the application appears to be packed, encrypted, or otherwise obfuscated. This represents a deliberate decision on behalf of the developer to hinder analysis. |
Process Execution Space | +2 | Unusual activity in the Process Execution Space header. For example, a zero length raw section, unrealistic linker time, or the file size doesn't match the Process Execution Space header. |
Bad Signed | +20 | This category is triggered when a binary is cryptographically signed, but the signature is invalid. A signature is generally used to demonstrate that some entity you trust (like a government or legitimate company, called a 'signing authority') has verified the authorship and good intentions of the signed application. However, signatures can be revoked and they can expire, meaning that the signature no longer represents that the signing authority has trust in the application. |
Embedded Data | +10 | This category is triggered when an application contains embedded executable code. While all programs contain some program code, this category indicates that the application has an embedded 'resource', which contains code separate from the code which runs normally as part of the application. |
Bad / Bit-Bad | +20 | This category is triggered when the application contains signatures indicating it uses the IRC protocol, or contains a shellcode signature. Many malware networks use IRC to communicate between the infected hosts and the command-and-control servers. |
Signed / Bit-Bad | -20 | This category is triggered when a program is signed. A program that is signed is verified as 'trusted' by a third party, usually a legitimate entity like a government or trusted company. The signature may be expired or invalid though; check the 'BadSigned' category for this information. |
PE Good | -10 | Scores for good artifacts in PE headers. |
PE Malware | +30 | Scores for known malware artifacts in PE headers. |
Cerberus Stage 1 File Information
The following table lists the file information that is provided in a Stage 1 analysis:
Item | Description |
File Size | Displays the size of the file in bytes. |
Import Count | Displays the number of functions that Cerberus examined. |
Entropy Score | Displays a score of the binary's entropy, which is used to detect suspected packing or encryption. |
Entropy may be packed | Displays whether the file is possibly packed. |
Interesting Functions | Displays the name of functions from the process execution space that contributed to the file’s threat score. |
Suspected Packer List | Attempts to display a list of suspected packers whose signatures match known malware packers. |
Modules | Displays the DLL files included in the binary. |
Has Version | Displays whether or not the file has a version number. |
Version Info | Displays information about the file that is gathered from the Windows API. |
Is Signed | Displays whether or not the file is signed. If the file is signed, additional signature information is also provided. |
Unpacker results | Attempts to show if and which packers were used in the binary. |
About Cerberus Stage 2 Static Analysis
When you run a stage 1 analysis, you configure a score that will launch a Cerberus stage 2 analysis. If an executable receives a score that is equal to or higher than the configured score, Cerberus stage 2 is performed. Cerberus stage 2 disassembles the code of an executable binary without running the actual executable.
Cerberus Stage 2 Function Call Data
Stage 2 analysis data is generated for the following function call categories:
- File Access
- Networking functionality
- Process Manipulation
- Security Access
- Windows Registry
- Surveillance
- Uses Cryptography
- Low-level Access
- Loads a driver
- Subverts API
- Misc
File Access Call Categories
Cerberus Stage 2 File Access Function Call Categories
Category | Description |
File Access – Functions that manipulate (read, write, delete, modify) files on the local file system. | |
Filesystem.File.Read.ExecutableExtension | This is triggered by functionality which reads executable files from disk. The executable code can then be executed, obfuscated, stored elsewhere, transmitted, or otherwise manipulated. |
FileSystem.Physical.Read | This application may attempt to read data directly from disk, bypassing the filesystem layer. This is very uncommon in normal applications, and may indicate subversive activity. |
FileSystem.Physical.Write | This application may attempt to write data directly to disk, bypassing the filesystem layer in the operating system. This is very uncommon in normal applications, and may indicate subversive activity. It is also easy to do incorrectly, so this may help explain any system instability seen on the host. |
FileSystem.Directory.Create: | This indicates the application may attempt to create a directory. Modifications to the file system are useful for diagnosing how an application persists, where its code and data are stored, and other useful information. |
FileSystem.Directory.Create.Windows: | This indicates an application may try to create a directory in the \Windows directory. This directory contains important operating system files, and legitimate applications rarely need to access it. |
FileSystem.Directory.Recursion: | This indicates the application may attempt to recurse through the file system, perhaps as part of a search functionality. |
FileSystem.Delete: | This indicates the application may delete files. With sufficient permissions, the application may be able to delete files which it did not write or even system files which could affect system stability. |
FileSystem.File.DeleteWindows: | This indicates the application may try to delete files in the \Windows directory, where important system files are stored. This is rarely necessary for legitimate applications, so this is a strong indicator of suspicious activity. |
FileSystem.File.DeleteSystem32: | This indicates the application may try to delete files in the \Windows\System32 directory, where important system files are stored. This is rarely necessary for legitimate applications, so this is a strong indicator of suspicious activity. |
FileSystem.File.Read.Windows | This indicates the application may attempt to read from the \Windows directory, which is very uncommon for legitimate applications. \Windows is where many important system files are stored. |
FileSystem.File.Write.Windows: | This indicates the application may attempt to write to the \Windows directory, which is very uncommon for legitimate applications. \Windows is where many important system files are stored. |
FileSystem.File.Read.System32: | This indicates the application may attempt to read from the \Windows\System32 directory, which is very uncommon for legitimate applications. \Windows\System32 is where many important system files are stored. |
FileSystem.File.Write.System32: | This indicates the application may attempt to write to the \Windows\System32 directory, which is very uncommon for legitimate applications. \Windows\System32 is where many important system files are stored. |
FileSystem.File.Write.ExecutableExtension: | This indicates the application may attempt to write an executable file to disk. This could indicate malicious software that has multiple ‘stages’, or it could indicate a persistence mechanism used by malware (i.e. write an executable file into the startup folder so it is run when the system starts up). |
FileSystem.File.Filename.Compression: | This indicates the program may write compressed files to disk. Compression can be useful to obfuscate strings or other data from quick, automated searches of every file on a filesystem. |
FileSystem.File.Filename.Autorun: | This indicates the application may write a program to a directory so that it will run every time the system starts up. This is a useful persistence mechanism. |
Networking Functionality Call Categories
Cerberus Stage 2 Networking Functionality Function Call Categories
Category | Description |
Networking Functionality - Functions that enable sending and receiving data over a network. | |
Network.FTP.Get: | Describes the use of FTP to retrieve files. This could indicate the vector a malware application uses to retrieve data from a C&C server. |
Network.Raw: | Functions in this category indicate use of the basic networking commands used to establish TCP, UDP, or other types of connections to other machines. Programmers who use these build their own communication protocol over TCP (or UDP or other protocol below the application layer) rather than using an application-layer protocol such as HTTP or FTP. |
Network.Raw.Listen: | Functionality in this category indicates the application accepts incoming connections over TCP, UDP, or other lower-level protocol. |
Network.Raw.Receive: | Functionality in this bucket indicates that the application receives data using a socket communicating over a lower-level protocol such as TCP, UDP, or a custom protocol. |
Network.DNS.Lookup.Country.XX: | This indicates the application may attempt to resolve the address of machines in one of several countries. “XX” will be replaced by the ‘top level domain’, or TLD associated with the lookup, indicating the application may attempt to establish contact with a host in one of these countries. |
Network.HTTP.Read: | The application may attempt to read data over the network using the HTTP protocol. This protocol is commonly used by malware so that its malicious traffic appears to ‘blend in’ with legitimate web traffic. |
Network.HTTP.Connect.Nonstandard.Request: | This indicates the application may make an HTTP request which is not a head, get, or post request. The vast majority of web applications use one or more of these 3 kinds of requests, so this category indicates anomalous behavior. |
Network.HTTP.Connect.Nonstandard.Port: | Most HTTP connections occur over either port 80 or 443. This indicates the application is communicating with the server over a non-standard port, which may be a sign that the server is not a normal, legitimate web server. |
Network.HTTP.Connect.Nonstandard.Header: | HTTP messages are partially composed of key-value pairs of strings which the receiver will need to properly handle the message. This indicates the application includes non-standard or very unusual header key-value pairs. |
Network.HTTP.Post: | This indicates the application makes a ‘post’ http request. ‘post’ messages are normally used to push data to a server, but malware may not honor this convention. |
Network.HTTP.Head: | This indicates the application makes a ‘head’ http request. ‘head’ messages are normally used to determine information about a server’s state before sending a huge amount of data across the network, but malware may not honor this convention. |
Network.Connect.Country.XX: | This indicates the application may attempt to connect to a machine in one of several countries. “XX” will be replaced by the ‘top level domain’, or TLD associated with the lookup. |
FTP.Put: | The application may attempt to send files over the network using FTP. This may indicate an exfiltration mechanism used by malware. |
Process Manipulation Call Categories
Cerberus Stage 2 Process Manipulation Function Call Categories
Category | Description |
Process Manipulation – May contain functions to manipulate processes. | |
ProcessManagement.Enumeration: | This functionality indicates the application enumerates all processes. This could be part of a system survey or other attempt to obtain information about the host. |
ProcessManagement.Thread.Create: | This indicates the target application may create multiple threads of execution. This can give insight into how the application operates, operating multiple pieces of functionality in parallel. |
ProcessManagement.Thread.Create.Suspended: | This indicates the application may create threads in a suspended state. Similar to suspended processes, this may indicate that the threads are only executed sometime after they’re created or that some properties are modified after they are created. |
ProcessManagement.Thread.Create: | This indicates the application may attempt to create a thread in another process. This is a common malware mechanism for ‘hijacking’ other legitimate processes, disguising the fact that malware is on the machine. |
ProcessManagement.Thread.Create.Remote: | This indicates that the application may create threads in other processes such that they start in a suspended state. Thus, their functionality or other properties can be modified before they begin executing. |
ProcessManagement.Thread.Open: | The application may try to gain access to observe or modify a thread. This behavior can give insight into how threads interact to affect the host. |
ProcessManagement.Process.Open: | This application may attempt to gain access to observe or modify other processes. This can give strong insight into how the application interacts with the system and what other processes it may try to subvert. |
ProcessManagement.Process.Create: | This application may attempt to create one or more other processes. Similar to threads, multiple processes can be used to parallelize an application’s functionality. Understanding that processes are used rather than threads can shed insight on how an application accomplishes its goals. |
ProcessManagement.Process.Create.Suspended: | Describes functionality to create new processes in a suspended state. Processes can be created in a ‘suspended’ state so that none of the threads execute until it is resumed. While a process is suspended, the creating process may be able to substantially modify its behavior or other properties. |
Security Access Call Categories
Cerberus Stage 2 Security Access Function Call Categories.
Category | Description |
Security Access - Functions that allow the program to change its security settings or impersonate other logged on users. | |
Security: | This category indicates use of any of a large number of security related functions, including those manipulating security tokens, Access Control Entries, and other items. Even without using an exploit, modification of security settings can enable a malicious application to gain more privileges on a system than it would otherwise have. |
Windows Registry Call Categories
Cerberus Stage 2 Windows Registry Function Call Categories.
Category | Description |
Windows Registry – Functions that manipulate (read, write, delete, modify) the local Windows registry. This also includes the ability to modify autoruns to persist a binary across boots. | |
Registry.Key.Create: | The application may attempt to create a new key in the registry. Keys are commonly used to persist settings and other configuration information, but other data can be stored as well. |
Registry.Key.Delete: | This application may attempt to delete a key from the registry. While it is common to delete only keys that the application itself created, with sufficient permissions, Windows may not prevent an application from deleting other applications’ keys as well. |
Registry.Key.Autorun: | This indicates the application may use the registry to try to ensure it or another application is run automatically on system startup. This is a common way to ensure that a program continues to run even after a machine is restarted. |
Registry.Value.Delete: | This indicates the application may attempt to delete the value associated with a particular key. As with the deletion of a key, this may not represent malicious activity so long as the application only deletes its own keys’ values. |
Registry.Value.Set: | The application may attempt to set a value in the registry. This may represent malicious behavior if the value is set in a system key or the key of another application. |
Registry.Value.Set.Binary: | This indicates the application may store binary data in the registry. This data could be encrypted, compressed, or otherwise is not plain text. |
Registry.Value.Set.Text: | This indicates the application may write plain text to the registry. While the ‘text’ flag may be set, this does not mandate that the application write human-readable text to the registry. |
Registry.Value.Set.Autorun: | The application may set a value indicating it will use the registry to persist on the machine even after it restarts. |
Surveillance Call Categories
Cerberus Stage 2 Surveillance Function Call Categories.
Category | Description |
Surveillance – Usage of functions that provide audio/video monitoring, keylogging, etc. | |
Driver.Setup: | Functionality in this category involves manipulation of INF files, logging, and other driver-related tasks. Drivers are used to gain complete control over a system, potentially even gaining control of other security products. |
Driver.DirectLoad: | Functionality in this category involves loading drivers. As noted in ‘driver.setup’, drivers represent ultimate control over a host system and should be extremely trustworthy. |
Uses Cryptography Call Categories
Cerberus Stage 2 Uses Cryptography Function Call Categories.
Category | Description |
Uses Cryptography – Usage of the Microsoft CryptoAPI functions. | |
Crypto.Hash.Compute: | This indicates a hash function may be used by the target application. Hash functions are used to verify the integrity of communications or files to ensure they were not tampered with. |
Crypto.Algorithm.XX: | The “XX” could be any of several values, including ‘md5’, ‘sha-1’, or ‘sha-256’. These represent particular kinds of hashes which the target application may use. |
Crypto.MagicValue: | This indicates that the target contains strings associated with cryptographic functionality. Even if the application does not use Windows OS functionality to use cryptography, the ‘magic values’ will exist so long as the target uses standard cryptographic algorithms. |
Low-level Access Call Categories
Cerberus Stage 2 Low-level Access Function Call Categories
Category | Description |
Low-level Access – Functions that access low-level operating system resources, for example reading sectors directly from disk. | |
Driver.Setup: | Functionality in this category involves manipulation of INF files, logging, and other driver-related tasks. Drivers are used to gain complete control over a system, potentially even gaining control of other security products. |
Driver.DirectLoad: | Functionality in this category involves loading drivers. As noted in ‘driver.setup’, drivers represent ultimate control over a host system and should be extremely trustworthy. |
Debugging.dbghelp: | This indicates use of functionality included in the dbghelp.dll module from the "Debugging Tools for Windows" package from Microsoft. With the proper permissions, the functionality in this library represents a powerful mechanism for disguising activity from investigators or for gaining control of other processes. |
Misc.SystemRestore: | Describes functionality involved in the System Restore feature, including removing and adding restore points. Restore points are often used as part of a malware-removal strategy, so removal of arbitrary restore points, especially without user interaction, may represent malicious activity. |
Debugging.ChecksForDebugger: | This is triggered if the application tries to determine whether it is being debugged. Malicious applications commonly try to determine whether they’re being analyzed so that they can modify the behavior seen by analysts, making it difficult to discover their true functionality. |
Loads a driver Call Categories
Cerberus Stage 2 Loads a driver Function Call Categories.
Category | Description |
Loads a driver | Functions that load drivers into a running system. |
Subverts API Call Categories
Cerberus Stage 2 Subverts API Function Call Categories.
Category | Description |
Subverts API | Undocumented API functions, or unsanctioned usage of Windows APIs (for example, using native API calls). |
Document Content Analysis
You can use Document Content Analysis to group document data together for quicker review.
The application uses an algorithm to cluster the data. The algorithm accomplishes this by creating an initial set of cluster centers called pivots. The pivots are created by sampling documents that are dissimilar in content. For example, a pivot may be created by sampling one document that may contain information about children’s books and sampling another document that may contain information about an oil drilling operation in the Arctic. Once this initial set of pivots is created, the algorithm examines the entire data set to locate documents that contain content that might match the pivot’s parameters. The algorithm continues to create pivots and clusters documents around the pivots. As more data is added to the case and processed, the algorithm uses the additional data to create more clusters.
Word frequency or occurrence count is used by the algorithm to determine the importance of content within the data set. Noise words that are excluded from Document Content Analysis are also not included in the Cluster Topic pivots or clusters.
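A minimal sketch of the pivot-and-cluster idea follows. The word-overlap (Jaccard) similarity, the 0.20 threshold, the tiny noise-word list, and the sample documents are all illustrative assumptions, not the application's actual measure or limits.

```python
# Sketch of pivot-based clustering: a document that is not sufficiently similar
# to any existing pivot becomes a new pivot; otherwise it joins the cluster of
# the most similar pivot.

NOISE_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for"}  # stand-in noise-word list

def terms(doc):
    # Noise words are excluded, so they never influence pivots or clusters.
    return {word for word in doc.lower().split() if word not in NOISE_WORDS}

def similarity(a, b):
    a, b = terms(a), terms(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(documents, threshold=0.20):
    pivots, clusters = [], []          # clusters[i] holds documents grouped around pivots[i]
    for doc in documents:
        scores = [similarity(doc, pivot) for pivot in pivots]
        if scores and max(scores) >= threshold:
            clusters[scores.index(max(scores))].append(doc)
        else:
            pivots.append(doc)         # dissimilar content -> new pivot
            clusters.append([doc])
    return clusters

docs = [
    "classic children's books and bedtime stories",
    "illustrated children's books with bedtime stories for early readers",
    "arctic oil drilling operation and pipeline logistics",
]
for group in cluster(docs):
    print(group)
```

Run over the three sample documents, the two children's-book documents fall into one cluster while the Arctic drilling document seeds a pivot of its own.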
Note: If you activated Document Content Analysis as an Evidence Processing option when you created the case, Document Content Analysis will automatically run after processing data and will not need to be run manually.
Considerations of Cluster Topic
Be aware of the following considerations when examining the Cluster Topic categories:
- Not all data will be grouped into categories at once. The application creates categories in an incremental fashion in order to return results as quickly as possible. Since the application is continually creating categories, the Cluster Topic container is continually updated.
- Duplicate documents are grouped together as they match a specific category. However, if a category is particularly large, duplicate documents may not be included as part of any category. This is to avoid performance issues. You can examine any duplicate documents or any documents not included in a category by highlighting the UNCLUSTERED category of the Cluster Topic container/filter.
- Cluster Topic results can vary when the process is performed on different databases and/or different computers. This is due to the analytic behavior of the Document Content Analysis process: because limits are placed on the algorithm to allow for efficient collection of data, large amounts of content can produce varying results.
Running Document Content Analysis
To run document content analysis:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options.
2. Select Document Content Analysis.
3. Click the Context menu to open the configuration.
- The Document Content Analysis pop-up is displayed.
4. Configure the Analysis Threshold to set the level of similarity (as a percentage) required for documents to be considered related or near duplicates. The higher the percentage, the more similar the documents must be in order to be considered related (see the example after these steps).
5. Click OK.
6. Click Apply.
7. Click Process Data or Run Analysis.
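The example below illustrates how the Analysis Threshold works as a percentage cutoff. Python's difflib ratio is only a stand-in for the application's internal similarity measure, and the two sample strings and the 90% value are assumptions for illustration.

```python
from difflib import SequenceMatcher

a = "Quarterly budget report for the drilling operation, revision 2"
b = "Quarterly budget report for the drilling operation, revision 3"

similarity = SequenceMatcher(None, a, b).ratio() * 100   # similarity as a percentage
threshold = 90                                           # e.g. an Analysis Threshold of 90%

# At a 90% threshold these near-duplicates are grouped as related; raising the
# threshold toward 100% requires documents to be increasingly alike.
verdict = "related" if similarity >= threshold else "not related"
print(f"{similarity:.1f}% similar -> {verdict}")
```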
Filtering Documents by Document Content Analysis
Documents processed with Document Content Analysis can be filtered by the content of the documents in the evidence. The Cluster Topic container is created from data processed with Document Content Analysis. Data included in the Cluster Topic container is taken from documents, including Word documents, text documents, and PDF documents.
To filter documents by Document Content Analysis:
- See Facet Filters section.
- Click General.
- Click Document Content.
- Click Cluster Topic.
- Check a topic from the list to see related items in the Grid.
Language Identification
When processing evidence, you can perform automatic language identification. The application analyzes the first two pages of every document to identify its language. To identify languages, you must enable the Language Identification processing option.
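The language identification engine itself is internal to the application; the sketch below only illustrates the general idea of sampling roughly the first two pages of a document and scoring the sample against per-language word lists. The page-size estimate and the tiny stopword lists are assumptions for illustration.

```python
# Illustrative stopword-counting heuristic, not the application's engine.
STOPWORDS = {
    "English": {"the", "and", "of", "to", "is", "in"},
    "Spanish": {"el", "la", "de", "que", "y", "en"},
    "German":  {"der", "die", "und", "das", "ist", "nicht"},
}

def identify_language(text, chars_per_page=3000):
    # Sample roughly the first two pages of the document.
    sample = text[: 2 * chars_per_page].lower().split()
    scores = {lang: sum(word in words for word in sample)
              for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unknown"

print(identify_language("El informe de la operación fue entregado en la oficina"))  # Spanish
```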
Performing Language Identification
To perform language identification:
1. From the Process Evidence page of case creation, click Customize Options.
Note: Alternatively, during review, select the desired items, right-click > Additional Analysis > Customize Options.
2. Select Language Identification.
3. Click the Context menu to open the configuration.
- The Language Identification Options pop-up is displayed.
4. Set the Language Identification Options as explained below.
Options | Description |
Document Types to Process | You can select to process the following file types:
|
Languages to Identify | You can select to identify the following:
|
5. Click OK.
6. Click Apply.
7. Click Process Data or Run Analysis.
Note: The Language Identification processing option is disabled by default. If you enable it, the basic language setting and all four document types are enabled by default.
Viewing Language Identified Documents
After processing is complete, you can add the Language column in the File List in the Grid.
See Using Custom Columns section.
You can filter by the Language field within review and determine who needs to review which documents based on the language contained within the document. If there are multiple languages in a document, the first language will be identified.
Basic Languages
With the Basic option, the system performs language identification for ten commonly used languages, including English.
If the language to identify is one of the ten basic languages (except for English), select Basic when choosing Language Identification. The Extended option also identifies the basic ten languages, but the processing time is significantly greater.
Extended Languages
With the Extended option, the system performs language identification for 67 different languages. This is the slowest processing option.
Entity Extraction
The Entity Extraction process extracts data from the content of files in your evidence. Unlike other processing options, this option extracts data from the body of the files rather than from the metadata; a regular-expression sketch of these syntaxes follows the table and warnings below. Users can extract the following types of data:
- Credit Card Numbers
- Phone Numbers
- Social Security Numbers
- E-Mail Addresses
Information Type | Syntax |
Credit Card Numbers | 16-digit numbers used by VISA, MasterCard, and Discover |
Credit Card Numbers | 15-digit numbers used by American Express |
Phone Numbers | Standard 7-digit numbers |
Phone Numbers | Standard 10-digit numbers (a leading 1 for long distance, or 001 for international, is not included in the extraction; however, a +1 is) |
International Phone Numbers | |
Social Security Numbers | Standard 9-digit numbers |
Email Addresses | A prefix to the left of the @ symbol and a domain to the right of the @ symbol. |
Warnings:
- Entities whose syntaxes overlap with each other may be wrongly identified. For instance, the 15-digit credit card number 5105-1051-051-5100 may also be extracted as the phone number 510-5100.
- Apart from 16-digit and 15-digit credit card numbers, other formats, such as 14-digit Diners Club numbers, will not be extracted as credit card numbers.
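The regular expressions below approximate some of the syntaxes in the table; they are illustrative assumptions, not the application's actual extraction rules, and the sample values are well-known test numbers rather than real data.

```python
import re

# Illustrative approximations of the syntaxes described above.
PATTERNS = {
    "Credit Card Number (16-digit)": r"\b(?:\d{4}[- ]?){3}\d{4}\b",
    "Phone Number (7-digit)":        r"\b\d{3}[- ]\d{4}\b",
    "Social Security Number":        r"\b\d{3}-\d{2}-\d{4}\b",
    "Email Address":                 r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b",
}

def extract_entities(text):
    return {label: re.findall(pattern, text) for label, pattern in PATTERNS.items()}

sample = ("Contact jane.doe@example.com or 555-0182; "
          "card 4111-1111-1111-1111, SSN 078-05-1120.")
for label, hits in extract_entities(sample).items():
    print(label, hits)
```

As the first warning notes, numbers whose syntaxes overlap can be extracted as more than one entity type, so matches from loose patterns like these should be reviewed rather than trusted outright.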
Lab/E-discovery Options
De-duplication is handled separately for email items and non-email items. Within each group, the available options can be applied by Case or by Custodian (People).
The following table provides more information regarding each option; a sketch of scope-based de-duplication follows the table.
Option | Description |
Enable Advanced De-duplication Analysis. | Enable this option to perform de-duplication on the email items and non-email items. This acts as the parent function for all the child function options listed in the page. |
Email Items - De-duplication Scope | Choose whether you want this de-duplication process to be applied at the Case level, or at the Custodian (People) level. |
Email Items - De-duplication Options | Select the duplicates to be eliminated from the case as the application processes the collected evidence. Options available:
|
Non-email items - De-duplication Scope | Choose whether you want this de-duplication process to be applied at the Case level, or at the Custodian (People) level. There is only one additional option for non-email items; it determines whether de-duplication applies to the actual files only, or to all files, including children, zipped, OLE, and carved files. |
Propagate Email Attribute | When an email has attachments or OLE items, marking this option causes the email’s attributes to be copied and applied to all “child” files of the email “parent.” |
Cluster Analysis | Invokes extended analysis of documents to determine related documents, near duplicates, and email threads. This lets you specify the options for Cluster Analysis. You can specify which document types to process:
You can also specify the similarity threshold, which determines the level of similarity required for documents to be considered related or near duplicates. Click Cluster Analysis Options to select the document types for performing Cluster Analysis. |
Include Extended Information in the Index | Enable this to make the index data fully compatible with Summation/eDiscovery. This is generally enabled if you created a case in Exterro FTK and need to review it in Summation or eDiscovery. |
Create Email Threads | Enable this to sort and group emails by conversation threads. |
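The sketch below illustrates the difference between Case-level and Custodian-level de-duplication using content hashes. The file list, custodian names, and the choice of MD5 are assumptions for illustration; the application's actual de-duplication keys and options are configured in the page described above.

```python
import hashlib

# Illustrative files: (custodian, path, contents).
FILES = [
    ("Alice", "report.docx",      b"q3 budget"),
    ("Bob",   "report copy.docx", b"q3 budget"),
    ("Bob",   "notes.txt",        b"meeting notes"),
]

def deduplicate(files, scope="case"):
    """Keep the first instance of each unique content hash.
    scope='case'      -> one instance across the whole case
    scope='custodian' -> one instance per custodian (person)"""
    seen, kept = set(), []
    for custodian, path, data in files:
        digest = hashlib.md5(data).hexdigest()
        key = digest if scope == "case" else (custodian, digest)
        if key not in seen:
            seen.add(key)
            kept.append((custodian, path))
    return kept

print(deduplicate(FILES, scope="case"))       # Bob's duplicate report is dropped
print(deduplicate(FILES, scope="custodian"))  # each custodian keeps one copy
```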
Evidence Refinement
The Evidence Refinement options allow you to specify how the evidence is sorted and displayed, and to exclude specific data from the case evidence.
Many factors can affect which processing options are required. For example, if you have text-based data, you may perform a full-text index to aid the review process. Similarly, if you have determined that the dataset contains no encrypted data, an entropy test may not be needed.
Refining Evidence by File Status/Type
Processing Option | Description |
Include File Slack | To include file slack space in which evidence may be found. |
Include Free Space | To include unallocated space in which evidence may be found. |
Include KFF Ignorable Files | To include files flagged as ‘Ignorable’ in the KFF for analysis. |
Include OLE Streams | To include Object Linked and Embedded (OLE) data streams that are layered, linked, or embedded. |
eDiscovery Refinement | To exclude files and folders that are not useful for most eDiscovery cases. |
Don’t Expand Embedded Graphics | This option lets you skip processing the graphics embedded in the email files. |
Deleted | To decide how to treat the deleted files. You can choose to:
|
Encrypted | To decide how to treat encrypted files. You can choose to:
|
From Email | To decide how to treat email files. You can choose to:
|
File Types | To select the required file types. File types that are not selected are excluded when you proceed to the next step. |
Only add items to the case that match both File Status and File Type criteria | To add files matching all the criteria selected in both. |
Exclude by Category | To exclude any categories from indexing. |
Refining Evidence by File Date/Size
You can filter files by the date range defined for Created, Last Modified, or Last Accessed date of the files. Files matching any of the three date filters will be considered here.
Similarly, you can filter files by size using the At least and At most fields; files within the specified minimum and maximum bounds are considered here.
Warning: When both date and size filters are used, only files matching both conditions are included.
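A short sketch of that filter logic, assuming illustrative date and size settings: a file passes the date filter if any one of its three dates falls in the range, passes the size filter if it lies within the bounds, and is included only when it passes both.

```python
from datetime import date

# Illustrative settings; in the application these come from the refinement dialog.
DATE_RANGE = (date(2023, 1, 1), date(2023, 12, 31))
SIZE_RANGE = (1_000, 10_000_000)   # At least / At most, in bytes

def passes_date(created, modified, accessed):
    lo, hi = DATE_RANGE
    # A file matches if ANY of its three dates falls inside the range.
    return any(lo <= d <= hi for d in (created, modified, accessed))

def passes_size(size):
    lo, hi = SIZE_RANGE
    return lo <= size <= hi

def included(created, modified, accessed, size):
    # When both filters are used, a file must satisfy BOTH to be included.
    return passes_date(created, modified, accessed) and passes_size(size)

print(included(date(2022, 5, 1), date(2023, 6, 1), date(2022, 7, 1), 4_096))  # True
print(included(date(2022, 5, 1), date(2022, 6, 1), date(2022, 7, 1), 4_096))  # False
```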
Index Refinement
The Index Refinement option allows you to specify types of data that you do not want to index. You may choose to exclude data to save time and resources, or to increase searching efficiency.
Warning: Exterro strongly recommends that you use the default index settings.
Refining an Index by File Status/Type
Refining an index by file status and type lets the investigator focus attention on the specific files needed for a case through a refined index defined in a dialog. At the bottom of the Status/Type Index Refinement tab, you can mark the box for Only index items that match both File Status AND File Types criteria, if that suits your needs.
Processing Option | Description |
Include File Slack | Mark to include free space between the end of the file and the end of a sector, in which evidence may be found. |
Include Free Space | Mark to include both allocated (partitioned) and unallocated (unpartitioned) space in which evidence may be found. |
Include KFF Ignorable Files | Mark to include files flagged as ignorable in the KFF for analysis. |
Include Message Headers | Marked by default. Includes the headers of messages in filtered text. Unmark this option to exclude message headers from filtered text. |
Do not include document metadata in filtered text | Not marked by default. This option lets you turn off the collection of internal metadata properties for the indexed filtered text. The fields for these metadata properties are still populated to allow for field-level review, but you will no longer see information such as Author, Title, Keywords, Comments, etc., in the Filtered Text panel of the review screen. If you use an export utility such as ECA or eDiscovery and include the filtered text file with the export, you will also not see this metadata in the exported file. |
Include OLE Streams | Includes Object Linked or Embedded (OLE) data streams that are part of files that meet the other criteria. |
Deleted | Specifies the way to treat deleted files. Options are:
|
Encrypted | Specifies the way to treat encrypted files. Options are:
|
From Email | Specifies the way to treat email files. Options are:
| |
File Types | Specifies types of files to include and exclude. |
Only add items to the index that match both File Status and File Type criteria | Applies selected criteria from both File Status and File Types tabs to the refinement. Will not add items that do not meet all criteria from both pages. |
Refining an Index by File Date/Size
Refine index items based on a date range or file size that you specify.
Processing Option | Description |
Refine Index by File Date | To refine index content by file date:
|
Refine Index by File Size | To refine index content by file size:
|
Creating Custom Processing Profiles
You can create a processing profile by selecting a set of processing options and then saving them as a profile. Processing profiles can only be created during case creation or within the case summary page.
Creating a Custom Processing Profile
To create a custom processing profile:
- Navigate to a Case Summary page.
- Click Customize Options.
- Select the applicable processing options.
- Click Save As.
- Enter a Name and Description (optional) for the custom processing profile.
- Click Save.
Note: Users can delete custom processing profiles by clicking the Delete Profile button.
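Conceptually, a saved profile is just a named bundle of processing-option settings captured by Save As. The structure below is purely hypothetical (the application stores profiles internally), and the option names are illustrative; it only shows what such a profile amounts to.

```python
# Hypothetical representation of a custom processing profile.
profile = {
    "name": "Quick Triage",
    "description": "Hashing and indexing only; no carving or OCR",
    "options": {
        "md5_hash": True,
        "sha256_hash": True,
        "search_text_index": True,
        "data_carving": False,
        "optical_character_recognition": False,
    },
}

enabled = [option for option, on in profile["options"].items() if on]
print(f"{profile['name']}: {', '.join(enabled)}")
```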