Question
What is the Processing Manager and what does it do?
Answer
The Processing Manager (PM) can be thought of as the "brains" of the engine. It controls and orchestrates how work is done when submitting jobs that require an engine. The PM can be found in both the local Evidence Processing Engine (EP) and the Distributed Processing Manager (DPM) components.
Controlling Work
The Processing Manager controls the distribution of work to a single local engine or several distributed engines. This is done by dividing up the data to be processed into small manageable chunks and then distributing those chunks to the engine(s).
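The divide-and-distribute idea above can be pictured with a toy sketch. This is not the Processing Manager's actual logic, which is not described here. The chunk size, the turn-taking order, and the names `make_chunks` and `distribute` are all assumptions made purely for illustration.

```python
from collections import deque


def make_chunks(items, chunk_size):
    """Split a list of items into consecutive chunks of at most chunk_size."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def distribute(chunks, engines):
    """Simulate engines taking turns pulling the next chunk from a shared queue.

    Hypothetical model only: a real manager may assign work very differently.
    """
    queue = deque(chunks)
    assigned = {engine: [] for engine in engines}
    while queue:
        for engine in engines:
            if not queue:
                break
            assigned[engine].append(queue.popleft())
    return assigned


if __name__ == "__main__":
    chunks = make_chunks(list(range(5)), 2)
    print(distribute(chunks, ["engine-A", "engine-B"]))
```

With a single entry in `engines`, the same sketch models the single local engine case: every chunk goes to that one engine.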
During installation of the EP or DPM, the installer prompts for the Processing State Folder. This folder is where the Processing Manager stores information about what is being worked on and which engine is working on it. While data is being processed, a new job folder named with a unique ID (e.g., "7f482a5d4cc847859b11a16bb74ea0de.job") should appear under "[PM_folder]\10\Jobs\". This job folder contains several other folders, including the Queue and Wrk folders. The individual components of the engine (Processor, Loader, and Indexer, or "PLI") retrieve their assigned chunks from the Queue and Wrk folders.
Queue & Wrk - Queue is where chunks of data to be loaded, processed, and indexed are stored until they are assigned out and picked up by an engine. Wrk is where they go once they are being worked on. Each chunk is identified by a ".w" (dot double-u) file with its own unique identifier (e.g., "e0a67a60cafdc45e3b202db0aaa4476fc00.w"). Each .w file contains reference information for a chunk of data located within the evidence being processed (e.g., E01, AD1, or native files). Data to be loaded or indexed is in the Ldr and Idx folders, respectively, while data to be processed is in the root of the Queue or Wrk folder.
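The folder layout described above can be summarized as a small lookup. The sketch below is only an illustration: it assumes that the Ldr and Idx folders sit directly inside Queue and Wrk, and the function name `classify_chunk` and the labels it returns are hypothetical, not part of the product.

```python
from pathlib import PureWindowsPath

# Assumed mapping from subfolder name to the kind of work waiting there.
STAGE_BY_SUBFOLDER = {"Ldr": "load", "Idx": "index"}


def classify_chunk(relative_path):
    """Return (state, stage) for a .w file path given relative to the job folder.

    state is "queued" (under Queue) or "in progress" (under Wrk).
    stage is "process" (root of Queue/Wrk), "load" (Ldr), or "index" (Idx).
    """
    # PureWindowsPath accepts both "\" and "/" separators on any platform.
    parts = PureWindowsPath(relative_path).parts
    if len(parts) not in (2, 3) or not parts[-1].lower().endswith(".w"):
        raise ValueError(f"not a recognized .w file location: {relative_path}")
    if parts[0] == "Queue":
        state = "queued"
    elif parts[0] == "Wrk":
        state = "in progress"
    else:
        raise ValueError(f"not under Queue or Wrk: {relative_path}")
    if len(parts) == 2:
        return state, "process"
    stage = STAGE_BY_SUBFOLDER.get(parts[1])
    if stage is None:
        raise ValueError(f"unexpected subfolder: {parts[1]}")
    return state, stage


if __name__ == "__main__":
    print(classify_chunk(r"Queue\Ldr\e0a67a60cafdc45e3b202db0aaa4476fc00.w"))
    print(classify_chunk(r"Wrk\e0a67a60cafdc45e3b202db0aaa4476fc00.w"))
```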
During processing, the .w files should be moving from the Queue to the Wrk folders regularly. Some objects may take longer to process or index than others so their corresponding .w files won't move through the system quite as fast.
Watching the total number of items in the Queue and Wrk folders can help determine whether the engine is still actively working on a job when the status shown in the software's user interface (UI) updates infrequently.
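One way to do this check without counting by hand is to compare two snapshots of the .w files taken some time apart. The following is a minimal sketch and not a vendor tool. The helper names `snapshot` and `job_is_moving` are hypothetical, and the job folder path must be supplied by the user.

```python
from pathlib import Path


def snapshot(job_dir):
    """Return the set of .w file paths (relative to job_dir) under Queue and Wrk."""
    job = Path(job_dir)
    found = set()
    for state in ("Queue", "Wrk"):
        base = job / state
        if base.is_dir():
            # rglob also covers subfolders such as Ldr and Idx.
            found.update(
                p.relative_to(job).as_posix()
                for p in base.rglob("*.w")
                if p.is_file()
            )
    return found


def job_is_moving(before, after):
    """True if any .w file appeared, disappeared, or moved between snapshots."""
    return before != after


def summarize(paths):
    """Count .w files per top-level folder (Queue or Wrk)."""
    counts = {"Queue": 0, "Wrk": 0}
    for path in paths:
        counts[path.split("/", 1)[0]] += 1
    return counts
```

A typical use would be to call `snapshot` on the job folder, wait a few minutes, call it again, and pass both results to `job_is_moving`. An unchanged snapshot does not by itself prove the engine has stalled, since some objects take longer to process or index than others.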
Distributed Processing Benefits
Using distributed processing can improve not only the performance of jobs that use the engine (e.g., processing, indexing, and exporting), but the user experience as well. A properly specified, dedicated processing server can significantly decrease processing times. Installing the Processing Manager alongside the other system services, rather than relying on a local engine, can also aid in recovering from engine crashes.
Using multiple Processing Managers in a single environment can, however, cause problems and is not a supported architecture. The PM is not aware of other managers that may be installed on different servers. Therefore, if two PMs are working on the same case, they can task different engines to write to the same dtSearch indexes or database tables at the same time. If this happens, it will corrupt the indexes, the database, or both. For this reason, only a single EP or DPM should be used in an environment.
Overview
Understanding what the Processing Manager is and what it does provides a deeper understanding of how Forensic Tools function. It can also assist in troubleshooting issues that may arise.