
U.S. Department of Justice, Antitrust Division
Standard Summation Specifications for Load File Production


This document describes the specifications and procedures for producing an image-based production to the Antitrust Division in the form of Summation load files. The following page describes the sample files provided with these specifications. Included is a sample Load Files Submission. Please contact the Division if you have any questions about these technical details. The items below highlight areas of particular importance.

1) Database files (such as an Access .MDB) should not be produced in this manner. Database productions should be discussed with the appropriate government legal and technical staff to determine the optimal production format.

2) For attachments, pay special attention to the PARENTID and ATTCHIDS fields, which Summation uses to keep track of email families and which vendors have often delivered incorrectly.

3) PowerPoint and Excel files require special handling. Both should be produced natively, with links referenced in the DOCLINK field. In addition, you must provide TIFF images of the entire PowerPoint file, in full slide image format, with any speaker notes following the corresponding full slide image. For Excel spreadsheets, provide TIFF images of the first 5 pages only, because images of spreadsheets often run to thousands of pages and are not useful for review purposes.

4) Extracted text should be provided with all records, except for documents that originated as hard copy. For the hard-copy records, please provide OCR text. For redacted documents, provide full text for the redacted version. The extracted text files should include page breaks that correspond to the "pagination" of the image files.

5) Before beginning production, you must produce a sample, including emails, attachments, and non-email files. We will take a day or so to evaluate that sample and confirm the technical details. If we identify a problem, you will need to resubmit the sample until we confirm there are no problems.

6) When you provide the full production, we must receive a cover letter that gives the total document and page counts, so that we can verify the records in the database. The counts should also be reported by custodian. Within any submission, documents from an individual custodian should be confined to a single load file.

7) You must label the media provided, whether CD, DVD, or hard drive. At a minimum, the label must include a unique number (Submission #), the name of the company providing the response, and any references necessary to link to the information in the cover letter.
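
The family-tracking concern raised in item 2 can be checked mechanically before a sample or production is delivered. The following is a minimal sketch in Python, not a Division-supplied tool. It assumes, purely for illustration, that each record carries a document ID in a field named BEGDOC (a hypothetical name), that an attachment's PARENTID holds its parent's document ID, and that a parent's ATTCHIDS holds a semicolon-separated list of its attachments' document IDs. Confirm the actual field definitions and delimiter against the Metadata Fields & Family Record Specifications before relying on it.

```python
from typing import Dict, List


def split_ids(value: str, sep: str) -> List[str]:
    """Split a multi-entry field into trimmed, non-empty IDs."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def check_families(records: List[Dict[str, str]],
                   id_field: str = "BEGDOC",  # assumed ID field name
                   sep: str = ";") -> List[str]:  # assumed delimiter
    """Return a list of family inconsistencies; empty when none are found."""
    by_id = {rec[id_field]: rec for rec in records}
    problems: List[str] = []
    for rec in records:
        doc_id = rec[id_field]

        # Attachment side: the named parent must exist and list this record.
        parent_id = rec.get("PARENTID", "").strip()
        if parent_id:
            parent = by_id.get(parent_id)
            if parent is None:
                problems.append(f"{doc_id}: PARENTID {parent_id} not in load file")
            elif doc_id not in split_ids(parent.get("ATTCHIDS", ""), sep):
                problems.append(f"{doc_id}: not listed in ATTCHIDS of {parent_id}")

        # Parent side: every listed attachment must exist and point back.
        for child_id in split_ids(rec.get("ATTCHIDS", ""), sep):
            child = by_id.get(child_id)
            if child is None:
                problems.append(f"{doc_id}: ATTCHIDS entry {child_id} not in load file")
            elif child.get("PARENTID", "").strip() != doc_id:
                problems.append(f"{doc_id}: attachment {child_id} does not point back")
    return problems


# Illustrative records only; the IDs are made up.
sample = [
    {"BEGDOC": "DOJ-0000001", "PARENTID": "", "ATTCHIDS": "DOJ-0000003;DOJ-0000005"},
    {"BEGDOC": "DOJ-0000003", "PARENTID": "DOJ-0000001", "ATTCHIDS": ""},
    {"BEGDOC": "DOJ-0000005", "PARENTID": "", "ATTCHIDS": ""},  # PARENTID left blank in error
]
for problem in check_families(sample):
    print(problem)
```

Checking the relationship from both sides catches the case where only one of the two fields was populated, which a one-directional check would miss.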


Folders and Files within the ZIP file


One Sample Load Files Submission

LoadFileSubmission
Shows how the DOJ would like a Load File Submission laid out. Replace the DOJ001 folder with <CompanyPrefix>001 for the company's first submission, then increment the number sequentially for each subsequent submission.
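
The folder naming convention can be sketched as a small Python helper. This is an illustration only: the three-digit zero padding is inferred from the 001 in the sample folder name, the prefix "ACME" is a made-up company prefix, and the behavior past submission 999 is not addressed by these specifications.

```python
def submission_folder(company_prefix: str, submission_number: int) -> str:
    """Build a folder name such as DOJ001 from a prefix and a 1-based number."""
    if submission_number < 1:
        raise ValueError("submission numbers start at 1")
    # Three-digit zero padding is an assumption based on the sample name.
    return f"{company_prefix}{submission_number:03d}"


# First three submissions for a hypothetical company prefix.
folders = [submission_folder("ACME", n) for n in range(1, 4)]
```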

Files

Summation Load File Submission Requirements

Provides the overall requirements for load file productions.

Image Details & Load File Specifications

Outlines the image, DII, metadata, and searchable text files production formats.

Metadata Fields & Family Record Specifications

Lists and defines the requested metadata fields, and describes the production of "family" records.

Sample Cover Letter Spreadsheet

Sample format for providing the statistics associated with each submission (custodians, number of records/images, the media identifier, the submission number, etc.).

Sample DII Load File

Sample format for the image (and possibly searchable text) load file.

Sample Extracted Text Control List File

Sample Control List for loading extracted/OCR text. This is an optional method of loading text files, versus loading text by means of the DII file.

Sample Metadata Load File

Sample delimited metadata file.

Sample Deduplication - Custodian Append File
This is to be used only when de-duplicating ACROSS custodians (i.e., horizontal deduplication). Provide it on an additive, rolling basis starting with the second submission. As more custodians are discovered for previously produced documents, this file is updated with the new custodian information. It is used to identify the master documents and then append custodian data to their records.
Note: AppendDate is a multi-entry date field that designates when this data was appended, so Antitrust can keep track of every time that record was edited with new custodian information.
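
How an append file of this kind might be applied to master records can be sketched in Python as follows. This is a hedged illustration, not the Division's loading procedure: apart from AppendDate, the field names (Custodians, the document ID used as the key), the row layout, and the date format are all assumptions chosen for the example. The sample append file in the ZIP defines the real layout.

```python
from typing import Dict, Iterable, List, Tuple

# One append row: (master document ID, newly discovered custodian, append date).
# This layout is hypothetical.
AppendRow = Tuple[str, str, str]


def apply_custodian_append(masters: Dict[str, Dict[str, List[str]]],
                           append_rows: Iterable[AppendRow]) -> List[str]:
    """Append custodians to master records; return IDs with no master record."""
    unmatched: List[str] = []
    for doc_id, custodian, append_date in append_rows:
        master = masters.get(doc_id)
        if master is None:
            # The append file named a document that was never produced.
            unmatched.append(doc_id)
            continue
        if custodian not in master["Custodians"]:
            master["Custodians"].append(custodian)
            # AppendDate is multi-entry: keep one entry per date on which
            # new custodian information was actually added.
            if append_date not in master["AppendDate"]:
                master["AppendDate"].append(append_date)
    return unmatched


# Illustrative data only; names, IDs, and dates are made up.
masters = {"DOJ-0000001": {"Custodians": ["Smith"], "AppendDate": []}}
rows = [
    ("DOJ-0000001", "Jones", "2010-03-01"),
    ("DOJ-0000001", "Jones", "2010-04-01"),  # repeat on a rolling file: no change
    ("DOJ-0000009", "Lee", "2010-04-01"),    # no master record with this ID
]
unmatched = apply_custodian_append(masters, rows)
```

Because the file is provided on an additive, rolling basis, the sketch treats repeated rows as harmless and only records a date when a custodian is newly added.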