How can I process very large amounts of data (tens or hundreds of millions of objects)?
Note: Before attempting an investigation of this scale, the proper hardware should be set up and configured. The best way to architect such a system is to use the Performance Guidelines article together with the system spec guides, and to consult Exterro Professional Services.
The best way to proceed is to create multiple cases, perform the necessary culling in each one (keyword searching, filtering, etc.), export the responsive results, and then aggregate those results and process them into a new case. In this way, all responsive results can be de-duplicated and indexed in a single, smaller case.
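The aggregation step can be sketched outside the product as a simple script. The example below is only an illustration, not an Exterro tool or API: it assumes each case's responsive results were exported as ordinary files into their own folder, and it uses a SHA-256 content hash to skip exact duplicates while copying everything into one staging folder. The names `file_digest`, `aggregate_exports`, and the folder paths are hypothetical. De-duplication inside the new case is still performed by the product during processing; this sketch only reduces the volume handed to it.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def aggregate_exports(export_dirs, staging_dir):
    """Copy one instance of each unique file from export_dirs into staging_dir.

    Hypothetical helper: export_dirs are the per-case export folders and
    staging_dir is the folder that will be processed into the new case.
    staging_dir must not be located inside any of the export folders.
    Returns (unique_count, duplicate_count).
    """
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    seen = set()
    unique = duplicates = 0
    for export_dir in export_dirs:
        for path in sorted(Path(export_dir).rglob("*")):
            if not path.is_file():
                continue
            digest = file_digest(path)
            if digest in seen:
                # Exact content already staged from an earlier export.
                duplicates += 1
                continue
            seen.add(digest)
            # Prefix the name with part of the digest so that different
            # files sharing a filename do not overwrite each other.
            shutil.copy2(path, staging / f"{digest[:16]}_{path.name}")
            unique += 1
    return unique, duplicates
```

A call such as `aggregate_exports(["exports/case_a", "exports/case_b"], "staging")` (paths are placeholders) returns the number of unique files staged and the number of exact duplicates skipped. Hashing the full content catches only byte-identical files; it does not replace the de-duplication performed when the new case is processed.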
Exterro is continually working to make its products more stable and robust with large quantities of processed data. As we continue to develop and integrate with new technologies, our products will be able to create larger and more complex cases. In versions 6.1 and older, cases were only stable up to roughly 20 million objects with MSSQL.