Summation Database Specifications (Attachment 4)


August 2008

U.S. DEPARTMENT OF JUSTICE, ANTITRUST DIVISION
Summation Database Specifications
  • DOJ will provide an empty Summation database shell. The Division currently uses Summation version 2.6.3.
  • DOJ will accept loaded Summation databases with the following conditions:
    • Each database has no more than 5-6 GB of OCRBase.
    • If the database will have more than 120,000 records, it must be addressed with Division staff prior to production.
    • Custodians do not cross databases, except under limited circumstances.
    • Metadata fields must be populated.
    • Data should be structured in the standard Summation format, with images stored inside the Images subdirectory of the case folder.
    • When records include a Doclink field to a native file, the native file should also reside inside the case folder in a folder called DocLink.
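
The layout conditions above can be checked before a database is shipped. The following is a minimal sketch, not an official validation tool: the function names `check_case_folder` and `needs_staff_review` are hypothetical, and the caller is assumed to know whether the database uses the Doclink field.

```python
from pathlib import Path

# Record count above which the database must be discussed with Division staff.
MAX_RECORDS = 120_000


def check_case_folder(case_dir, uses_doclink):
    """Return a list of layout problems found in one Summation case folder.

    case_dir     -- path to the case folder (hypothetical parameter name)
    uses_doclink -- True when records carry a Doclink field to native files
    """
    case_dir = Path(case_dir)
    problems = []
    # Images must be stored inside the Images subdirectory of the case folder.
    if not (case_dir / "Images").is_dir():
        problems.append("missing Images subdirectory")
    # Native files referenced by Doclink must sit in a folder called DocLink.
    if uses_doclink and not (case_dir / "DocLink").is_dir():
        problems.append("missing DocLink folder for native files")
    return problems


def needs_staff_review(record_count):
    """True when the database holds more than 120,000 records."""
    return record_count > MAX_RECORDS
```

An empty list from `check_case_folder` means only that the two expected folders exist; it does not inspect their contents or the OCRBase size.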

See Example of Case Directory Structure, below.

[Figure: Summation Database Structure in Windows Explorer]


Summation Submission Requirements

Via e-mail or on CD-ROM, the DOJ has provided a DOJShell Summation database directory. Please organize your submissions by placing the DOJShell folder under sequentially named database folders. Each folder name should consist of 2 to 3 letters (indicating your company) followed by 3 digits. For example:

For the first 3 databases from ABC Co., the root of the piece of media (external hard drive, CD, or DVD) should display the following folders: ABC001, ABC002, and ABC003. Each of these folders should contain the loaded DOJShell Summation Case Directory.
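
The naming scheme can be expressed as a short pattern check. This is an illustrative sketch only: the helper names are hypothetical, and accepting lowercase letters is an assumption, since the example above shows only uppercase.

```python
import re

# 2 to 3 ASCII letters (the company) followed by exactly 3 digits, e.g. ABC001.
# Assumption: either letter case is accepted; the example shows uppercase only.
FOLDER_NAME = re.compile(r"[A-Za-z]{2,3}[0-9]{3}")


def is_valid_folder_name(name):
    """True when the whole name matches the letters-plus-digits scheme."""
    return FOLDER_NAME.fullmatch(name) is not None


def expected_folder_names(prefix, count):
    """Sequential folder names starting at 001 for a given company prefix."""
    return [f"{prefix}{number:03d}" for number in range(1, count + 1)]
```

For instance, `expected_folder_names("ABC", 3)` yields the three folder names listed in the example above.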

The cover letter for each submission of loaded Summation databases should include information about the loaded Summation database(s) included on each External Hard Drive or other piece of media submitted, preferably in spreadsheet format.

Include the following for each submission:

A. For each piece of media:

  1. Assign each piece of media a unique identifier that is also readily identifiable on the piece of media itself (i.e., Submission #; a serial number is also acceptable), and
  2. Identify the Databases on the piece of media.

B. For each Database:

  1. The Custodians included;
  2. The total number of records;
  3. The number of records for each Custodian (e.g., ABC001 contains 183,000 records: Jones - 150,000 records, Smith - 13,000 records, Doe - 20,000 records);
  4. The Bates number ranges (and any gaps therein) for each Custodian;
  5. The total number of native files in each Database;
  6. The number of native files for each Custodian (e.g., ABC001 contains 15,980 native files: Jones - 1,500; Smith - 5,250; Doe - 9,230);
  7. The total number of images included in each database; and
  8. The total number of images included for each Custodian (e.g., ABC001 contains 15,980 images: Jones - 1,500; Smith - 5,250; Doe - 9,230).
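
The per-database and per-custodian counts in items 2 through 8 follow one pattern: a total plus a breakdown that sums to it. A minimal sketch of producing such a line from per-custodian counts, so the total can never disagree with the breakdown, is shown here. The function name `format_counts` and its parameters are hypothetical.

```python
def format_counts(database, unit, per_custodian):
    """Build one cover-letter line from a mapping of custodian -> count.

    The total is computed from the breakdown rather than entered separately,
    so the two figures always agree.
    """
    total = sum(per_custodian.values())
    details = "; ".join(
        f"{custodian} - {count:,}" for custodian, count in per_custodian.items()
    )
    return f"{database} contains {total:,} {unit}: {details}"


# Counts taken from the native-file example in item 6.
line = format_counts(
    "ABC001", "native files", {"Jones": 1500, "Smith": 5250, "Doe": 9230}
)
print(line)
# ABC001 contains 15,980 native files: Jones - 1,500; Smith - 5,250; Doe - 9,230
```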

METADATA & FAMILY RECORD SPECIFICATIONS

Default File Layout (.txt). Field names are limited to 8 characters.

| Field Name | Field Description | Field Type | Hard Copy | E-mail | Spreadsheets | Presentations | Other Elec. Docs |
|------------|-------------------|------------|-----------|--------|--------------|---------------|------------------|
| Company | Company submitting data | Note Text | X | X | X | X | X |
| Box# | Submission/Volume # | Note Text | X | X | X | X | X |
| Custdian | Custodian(s)/Source(s), formatted Last, First | Multi-Entry | X | X | X | X | X |
| Begdoc# | Start Bates (including Prefix), no spaces | Note Text | X | X | X | X | X |
| Enddoc# | End Bates (including Prefix), no spaces | Note Text | X | X | X | X | X |
| DocID | Populate with exact same value as Start Bates | Note Text | X | X | X | X | X |
| PGCount | Page count | Integer | X | X | X | X | X |
| ParentID | Parent Bates, including Prefix (ONLY IN CHILD RECORDS) | Note Text | X | X | X | X | X |
| Attchids | Child document list: Start Bates of each Child (ONLY IN PARENT RECORDS) | Multi-Entry | X | X | X | X | X |
| FamlyRng | Family Start and End Bates, including Prefix (e.g., ABC-001 - ABC-003) | Note Text | X | X | X | X | X |
| Prprties | Record Type (Redacted, Document Withheld Based On Privilege) | Multi-Entry | X | X | X | X | X |
| From | Author, formatted Last, First | Multi-Entry |  | X | X | X | X |
| To | Recipient, formatted Last, First | Multi-Entry |  | X | X | X | X |
| Cc | Cc field, formatted Last, First | Multi-Entry |  | X | X | X | X |
| Bcc | Bcc field, formatted Last, First | Multi-Entry |  | X | X | X | X |
| Subject | Subject/Document Title | Note Text |  | X | X | X | X |
| DocDate | Document Date / Date Sent, MM/DD/YYYY | Date Keyed |  | X |  |  |  |
| Timesent | Time email was sent | Time |  | X |  |  |  |
| Datecrtd | Date Created | Date |  |  | X | X | X |
| Datesvd | Date Modified | Date |  |  | X | X | X |
| Datercvd | Date Accessed / Received | Date |  | X | X | X | X |
| Filesize | File size | Note Text |  |  | X | X | X |
| Altitle | File name: name of file as it appeared in original location | Note Text |  |  | X | X | X |
| Applicat | Application used to create native file (e.g., Excel, Word) | Note Text |  |  | X | X | X |
| FilePath | Data's source filepath information | Note Text |  | X | X | X | X |
| Doclink | Current filepath location to the native file | Note Text |  |  | X | X | X |
| FolderID | Email folder path (sample: Inbox/active) or Hard Copy Folder Information | Note Text | X | X |  |  |  |
| Paragrph | Paragraph # to which the document is responsive | Note Text | X | X | X | X | X |
| Hash | Hash Value (used for deduplication or other processing) | Note Text |  | X | X | X | X |
| Srchtrms | List of Terms used to identify record as responsive (if search terms used) | Multi-Entry |  | X | X | X | X |

* Indicated field may be empty if only native files are produced.
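
Several of the field rules above lend themselves to an automated check before loading: the 8-character limit on field names, the no-spaces rule for Begdoc# and Enddoc#, DocID matching the start Bates number, and the MM/DD/YYYY format for DocDate. The sketch below assumes each record is available as a Python dict of field name to string value; that representation and the function name `check_record` are assumptions for illustration, not part of the Summation load format.

```python
import re
from datetime import datetime

MAX_FIELD_NAME_LENGTH = 8
DATE_SHAPE = re.compile(r"[0-9]{2}/[0-9]{2}/[0-9]{4}")  # MM/DD/YYYY


def check_record(record):
    """Return a list of problems for one record (dict of field name -> str)."""
    problems = []
    # Field names have an 8 character limit.
    for name in record:
        if len(name) > MAX_FIELD_NAME_LENGTH:
            problems.append(f"field name too long: {name}")
    # Start and End Bates numbers must not contain spaces.
    for name in ("Begdoc#", "Enddoc#"):
        value = record.get(name) or ""
        if " " in value:
            problems.append(f"{name} contains spaces")
    # DocID carries the exact same value as the start Bates number.
    if record.get("DocID") != record.get("Begdoc#"):
        problems.append("DocID differs from Begdoc#")
    # DocDate, when present, must be a real calendar date in MM/DD/YYYY form.
    doc_date = record.get("DocDate")
    if doc_date:
        try:
            if not DATE_SHAPE.fullmatch(doc_date):
                raise ValueError("wrong shape")
            datetime.strptime(doc_date, "%m/%d/%Y")
        except ValueError:
            problems.append(f"DocDate not MM/DD/YYYY: {doc_date}")
    return problems
```

A record that passes returns an empty list. The check covers only the rules named here and does not verify which fields are required for each document type.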

Parent IDs, Attachment IDs, and Family Range Details
Customer Notes:
     
Y Confirm Family Range definition for attached files    
Y Confirm Field names and types    
Y Each member of the Family is its own record    
         
Family Range Definition:    
  All records will have a family range when the file or email has a parent or children
  Family Range will start with the first page of the topmost parent and run through the last page of the last child
  Each member of the Family is its own record    
         
Example:

|              | Doc No. 1         | Doc No. 2           | Doc No. 3           |
|--------------|-------------------|---------------------|---------------------|
| Description  | Topmost Email     | Attachment to Doc 1 | Attachment to Doc 1 |
| Begin Bates  | ABC-001           | ABC-011             | ABC-016             |
| End Bates    | ABC-010           | ABC-015             | ABC-020             |
| ParentID     | {empty}           | ABC-001             | ABC-001             |
| Attchids     | ABC-011;ABC-016   | {empty}             | {empty}             |
| Family Range | ABC-001 – ABC-020 | ABC-001 – ABC-020   | ABC-001 – ABC-020   |
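
The family rules can be sketched in code. The example below is illustrative rather than a definitive implementation, and it makes several assumptions: a family is one parent followed by its children in Bates order, every child points to the topmost parent, Attchids lists the start Bates of each child separated by semicolons (per the Attchids field description), and the family range is written with a plain hyphen. The function name `build_family_fields` is hypothetical.

```python
def build_family_fields(documents):
    """Derive ParentID, Attchids, and FamlyRng for one document family.

    documents -- list of (begin_bates, end_bates) tuples, parent first,
                 then each child in Bates order.
    """
    parent_begin = documents[0][0]
    last_end = documents[-1][1]
    child_begins = [begin for begin, _end in documents[1:]]
    # A record only gets a family range when it has a parent or children.
    family_range = f"{parent_begin} - {last_end}" if child_begins else ""

    records = []
    for index, (begin, end) in enumerate(documents):
        is_parent = index == 0
        records.append({
            "Begdoc#": begin,
            "Enddoc#": end,
            # ParentID is populated only in child records.
            "ParentID": "" if is_parent else parent_begin,
            # Attchids is populated only in the parent record.
            "Attchids": ";".join(child_begins) if is_parent else "",
            # Every member of the family carries the same range.
            "FamlyRng": family_range,
        })
    return records


# Each member of the family is its own record.
family = build_family_fields([
    ("ABC-001", "ABC-010"),  # topmost email
    ("ABC-011", "ABC-015"),  # first attachment
    ("ABC-016", "ABC-020"),  # second attachment
])
```

Nested families (an attachment that itself has attachments) are not handled by this sketch.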
Updated June 25, 2015