Electronic Discovery Center - Electronic Evidence and Discovery - Deduplication
Newsflash

Several law firms are asking vendors to provide "near deduplication" of emails. This process suppresses the earlier messages within an email thread, leaving only the last, most complete message for review. Note that most firms still rely on and request only deduplication, not near deduplication.
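
The thread suppression described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not how any particular vendor does it: a message is treated as "earlier" when its normalized text appears in full inside a longer message, and quoted text is assumed to be marked with ">" prefixes. The names normalize and suppress_earlier_messages are hypothetical.

```python
def normalize(body: str) -> str:
    """Strip '>' quote markers and collapse whitespace (assumed quoting style)."""
    lines = [line.lstrip("> ").strip() for line in body.splitlines()]
    return " ".join(" ".join(lines).split())


def suppress_earlier_messages(messages):
    """Return the ids of messages that are not fully contained in a longer one.

    messages: list of (message_id, body) pairs.
    """
    normalized = [(mid, normalize(body)) for mid, body in messages]
    keep = []
    for i, (mid, text) in enumerate(normalized):
        contained = any(
            j != i and len(other) > len(text) and text in other
            for j, (_, other) in enumerate(normalized)
        )
        if not contained:
            keep.append(mid)
    return keep


# Hypothetical thread: m3 quotes m2, which quotes m1; m4 is unrelated.
thread = [
    ("m1", "Can we meet Friday?"),
    ("m2", "Yes, 10am works.\n> Can we meet Friday?"),
    ("m3", "Confirmed.\n> Yes, 10am works.\n> > Can we meet Friday?"),
    ("m4", "Unrelated: invoice attached."),
]
kept = suppress_earlier_messages(thread)
print(kept)
```

Real email threads also carry headers, signatures and edited quotes, so a production tool would need far more careful matching than plain substring containment.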

 
Deduplication
Written by Administrator   
Wednesday, 08 March 2006

As described on Wikipedia:

In database maintenance, deduplication, which is sometimes referred to as referential integrity and by various other names, is the task of removing duplicate data from within a database. For example, similar rows featuring "J.Smith" and "John Smith" may well refer to the same conceptual individual, and the rows within the database may need to be merged. This is often achieved with the merge/purge algorithm of Fellegi and Sunter.
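
To make the "J.Smith" / "John Smith" example concrete, here is a minimal sketch that groups rows likely to refer to the same person. It is not the Fellegi and Sunter method, which weighs field agreements probabilistically; it only compares a crude key of first initial plus surname. The function names and sample rows are hypothetical.

```python
import re
from collections import defaultdict


def name_key(name: str):
    """Reduce a name to (first initial, last token), or None if it is empty."""
    parts = [p for p in re.split(r"[.\s]+", name.strip().lower()) if p]
    if not parts:
        return None
    return (parts[0][0], parts[-1])


def merge_candidates(rows):
    """Group row names that share a key; groups of two or more may need merging."""
    groups = defaultdict(list)
    for name in rows:
        groups[name_key(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]


rows = ["J.Smith", "John Smith", "Mary Jones"]
candidates = merge_candidates(rows)
print(candidates)
```

A key this coarse would also pair "Jane Smith" with "John Smith", so the groups are only candidates for review, not confirmed duplicates.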

In electronic discovery, deduplication is mostly used to compare email records within a single custodian's data, or across custodians, and remove duplicate records, which can yield substantial cost savings. Not only email stores (PST files and Lotus Notes databases) can be deduplicated, but standalone electronic documents as well. Most often, electronic evidence vendors compute a hash value for each file and then compare that hash value against those of the other files in a database.
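
The hash comparison described above might look something like the sketch below. The choice of SHA-256, the record layout of (custodian, file name, content), and the function names are assumptions made for illustration; the article does not say which hash or data model vendors use.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """Hash the file content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def deduplicate(records, across_custodians=True):
    """Split records into unique items and duplicates.

    records: iterable of (custodian, file_name, content_bytes).
    across_custodians=True compares every file against all custodians;
    False only removes duplicates within the same custodian.
    """
    seen = set()
    unique, duplicates = [], []
    for custodian, name, content in records:
        digest = file_hash(content)
        key = digest if across_custodians else (custodian, digest)
        if key in seen:
            duplicates.append((custodian, name))
        else:
            seen.add(key)
            unique.append((custodian, name))
    return unique, duplicates


# Hypothetical sample data: the same memo appears three times.
records = [
    ("alice", "memo.doc", b"Q1 forecast"),
    ("alice", "memo_copy.doc", b"Q1 forecast"),
    ("bob", "memo.doc", b"Q1 forecast"),
    ("bob", "notes.txt", b"Call opposing counsel"),
]
unique_across, dups_across = deduplicate(records, across_custodians=True)
unique_within, dups_within = deduplicate(records, across_custodians=False)
print(len(unique_across), len(unique_within))
```

Deduplicating across custodians removes more records, but it also loses the fact that a second custodian held the same file, so which mode to use is a decision for the review team rather than the tool.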

Reducing the responsive data in this way, before any imaging begins, brings down the costs inherent in that activity.

Last Updated ( Wednesday, 08 March 2006 )
 
(C) 2008 Electronic Discovery Center- Electronic Evidence and Discovery