Duplicate and Near-Duplicate Detection (eDiscovery)

Electronic discovery (also called e-discovery or eDiscovery) refers to any process in which electronic data is sought, located, secured, and searched with the intent of using it as evidence in a civil or criminal legal case. In electronic discovery, data of all types can serve as evidence, including text, images, databases, spreadsheets, audio files, animation, websites, and computer programs. Irosoft's eDiscovery solution, DocUnik, allows information-rich organizations to sift through large amounts of data in search of evidence, or simply to clean up legacy data (e.g., remove duplicates and near-duplicates) prior to content archiving. This solution:

  • Quickly analyzes the content of millions of files
  • Detects duplicate and near-duplicate files in several formats (e.g., Word, PDF, XML, XLS, JPG)
  • Scans the contents of archive files (.ZIP) and email attachments
  • Performs on-demand DeNISTing (filtering out known system and application files)
  • Embeds a search environment for user-friendly manual reviews
  • Is fully scalable: processing can be distributed across server farms
  • Is easy to deploy and use
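
To make the terms in the list above more concrete, here is a minimal Python sketch of three generic techniques: exact-duplicate detection by content hash, near-duplicate detection by word-shingle overlap (Jaccard similarity), and DeNISTing by filtering against a set of known file hashes. It is an illustration under stated assumptions, not a description of how DocUnik works internally. All function names, the shingle size, and the similarity threshold are hypothetical choices, and the known-hash set stands in for a reference list such as the one NIST publishes.

```python
import hashlib
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def find_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[content_hash(data)].append(name)
    # Keep only groups with more than one member.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def denist(files: dict[str, bytes], known_hashes: set[str]) -> dict[str, bytes]:
    """Drop files whose hash appears in a set of known hashes.

    `known_hashes` is a hypothetical stand-in for a reference list of
    known system and application files.
    """
    return {
        name: data
        for name, data in files.items()
        if content_hash(data) not in known_hashes
    }


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a: str, text_b: str,
                      threshold: float = 0.7, k: int = 3) -> bool:
    """Treat two texts as near-duplicates above an assumed threshold."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold


if __name__ == "__main__":
    sample = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"world"}
    print(find_exact_duplicates(sample))
    print(sorted(denist(sample, {content_hash(b"world")})))
    print(is_near_duplicate(
        "the quick brown fox jumps over the lazy dog",
        "the quick brown fox jumps over the lazy cat",
    ))
```

Comparing every pair of documents this way does not scale to millions of files. At that volume, a sketching technique such as MinHash with locality-sensitive hashing is a common way to approximate the same similarity measure.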