Our last post was about simple productions, which is how all productions should be.
Occasionally, though, you’ll be stuck with an agreement with the other side that calls for some specific format option. These agreements are usually negotiated by people who don’t know better. By the time you have to produce, if the other side won’t agree to a sensible format (like our EDRM load file + multipage PDF, which will save them 50% or more on their own hosting bill relative to legacy TIFF-based formats), there’s often nothing worth doing about it; you’re not going to take a production-format fight to court.
Disco gives you the options you need to deal with these requirements. Just click “Show Advanced Options” on the New Production page. You’ll see the following options:
Volumes. Here you can pick your starting volume label and how volumes are broken. “Volumes” are just folders that divide up a large production into separate pieces. Using the “Volume Break” option, volumes can be broken by custodian, by size, or not at all. Volume labels are numbered much like Bates numbers, so, for example, if this is a rolling production, you can start the volume labels at a higher number, like VOL0005, or wherever you left off after your last production.
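To make the volume options concrete, here is a minimal Python sketch of how a starting label and a volume break could interact. It is only an illustration, not Disco’s code: the function names, the (name, custodian, size) tuples, and the size cap are all assumptions made up for this example.

```python
def volume_label(number, prefix="VOL", width=4):
    """Format a volume label such as VOL0005 (prefix and width are assumed)."""
    return f"{prefix}{number:0{width}d}"


def break_volumes(docs, start=1, mode="size", max_bytes=1_000_000_000):
    """Split docs into labeled volumes.

    docs is a list of (name, custodian, size_bytes) tuples, already in
    production order. mode is "size", "custodian", or "none".
    """
    volumes = []
    current = []
    current_size = 0
    for name, custodian, size in docs:
        if mode == "size":
            # Start a new volume when this document would exceed the cap.
            start_new = bool(current) and current_size + size > max_bytes
        elif mode == "custodian":
            # Start a new volume whenever the custodian changes.
            start_new = bool(current) and current[-1][1] != custodian
        else:
            # "none": everything goes into a single volume.
            start_new = False
        if start_new:
            volumes.append(current)
            current, current_size = [], 0
        current.append((name, custodian, size))
        current_size += size
    if current:
        volumes.append(current)
    # Number the volumes from the chosen starting point.
    return {
        volume_label(start + i): [doc[0] for doc in volume]
        for i, volume in enumerate(volumes)
    }


docs = [("a.pdf", "Lay", 600), ("b.pdf", "Lay", 600), ("c.pdf", "Skilling", 300)]
print(break_volumes(docs, start=5, mode="custodian"))
```

Passing start=5 is the rolling-production case: the first volume of the new production picks up at VOL0005 rather than restarting at VOL0001.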
Deduplication level. You can also pick your deduplication level. Most review platforms force you to make a deduplication decision when you process and load data into the system. Disco doesn’t do this. Instead, we store every individual instance of a document that we encountered during processing. During review, you can review deduplicated documents, but you can also abstract further to threads or families or drill down deeper to the underlying instances. Similarly, during production, you can pick what deduplication level you want for each individual production.
By default, we produce one copy per custodian and one copy per parent. This means that family members in a production appear together and that each custodian who had a document gets at least one copy of it (plus a further copy for each additional family in that custodian’s data that contains the document). This is what people expect when they print out sections of a production or review them linearly in the order in which they are produced: you will see emails, then their attachments, then emails, then their attachments, then free-floating documents, etc., just as though you were looking at a paper collection or production.
You can also pick no deduplication, meaning you get one copy per original instance of each produced document; one copy per custodian, meaning you get exactly one copy for each custodian, even if the document occurs in multiple families in that custodian (meaning that families will not be together); or full deduplication, meaning that you get only one copy of each document in the entire production. In all cases, deduplicated records are accompanied by concatenated metadata, so that you do not lose metadata when you deduplicate the documents. The higher the level of deduplication you choose, the smaller the production, because there are fewer images to separately stamp.
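One way to picture the four deduplication levels is as four different grouping keys over the same stored instances. The Python sketch below is an assumption-laden illustration, not Disco’s implementation: the field names (instance_id, hash, custodian, parent, path), the level names, and the “; ” separator for concatenated metadata are all hypothetical.

```python
# Each level groups instances by a different key; one record is produced
# per group. All field and level names here are hypothetical.
DEDUP_KEYS = {
    "none": lambda i: (i["instance_id"],),
    "custodian_and_parent": lambda i: (i["hash"], i["custodian"], i["parent"]),
    "custodian": lambda i: (i["hash"], i["custodian"]),
    "full": lambda i: (i["hash"],),
}


def deduplicate(instances, level="custodian_and_parent"):
    """Return one produced record per group, with concatenated metadata."""
    key_fn = DEDUP_KEYS[level]
    groups = {}
    for inst in instances:
        groups.setdefault(key_fn(inst), []).append(inst)

    produced = []
    for members in groups.values():
        record = dict(members[0])  # the first instance stands in for the group
        # Concatenate metadata from every instance in the group so nothing
        # is lost; dict.fromkeys drops repeats while keeping first-seen order.
        record["all_custodians"] = "; ".join(
            dict.fromkeys(m["custodian"] for m in members)
        )
        record["all_paths"] = "; ".join(dict.fromkeys(m["path"] for m in members))
        produced.append(record)
    return produced


# The same memo (hash h1) stored four times during processing.
instances = [
    {"instance_id": 1, "hash": "h1", "custodian": "Lay", "parent": "E1", "path": "lay.pst/Inbox"},
    {"instance_id": 2, "hash": "h1", "custodian": "Lay", "parent": "E1", "path": "lay_backup.pst/Inbox"},
    {"instance_id": 3, "hash": "h1", "custodian": "Lay", "parent": "E2", "path": "lay.pst/Sent"},
    {"instance_id": 4, "hash": "h1", "custodian": "Skilling", "parent": "E3", "path": "skilling.pst/Inbox"},
]
for level in DEDUP_KEYS:
    print(level, len(deduplicate(instances, level)))
```

Because every instance is kept in storage and the grouping happens only at production time, the same data can go out at a different level in each production.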
Natives. Certain file formats are best produced as natives because even the best imaging results in a substantial degradation in the usefulness of the file. By default, we produce CAD files and Excel files as natives with slipsheets, but you can choose other filetypes to produce as natives with slipsheets. Clicking in the box displays a list of filetypes in the database you’re working in to make it easy to select additional ones. The “Also include natives for all images unless redacted” option allows you to include natives in addition to images (rather than instead of images). Whenever a native has redactions, these options will be overridden, and a redacted image will be produced instead of the native.
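The way these native options combine, with redaction overriding everything else, can be written as a short decision function. This Python sketch is illustrative only: the function name, the extension list standing in for “CAD files and Excel files,” and the output labels are assumptions, not Disco’s actual settings.

```python
# Hypothetical stand-ins for the default native filetypes (CAD and Excel).
DEFAULT_NATIVE_TYPES = frozenset({"dwg", "xls", "xlsx"})


def production_form(filetype, redacted,
                    native_types=DEFAULT_NATIVE_TYPES,
                    also_include_natives=False):
    """Return what gets produced for one document."""
    if redacted:
        # Redactions override the native options: never release the native.
        return ["redacted image"]
    if filetype.lower() in native_types:
        # Native instead of an image, with a slipsheet holding its place.
        return ["slipsheet", "native"]
    if also_include_natives:
        # Native in addition to the image.
        return ["image", "native"]
    return ["image"]


print(production_form("xlsx", redacted=False))
print(production_form("xlsx", redacted=True))
print(production_form("pdf", redacted=False, also_include_natives=True))
```

Checking for redactions first is the important design choice: no combination of the other settings can leak an unredacted native.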
Sort order. By default, productions are sorted by custodian then, within each custodian, by primary date (meaning send date for emails, family date for children, and last modified date for other documents). But you can choose to sort productions by custodian, then path; date, then path; path only; or by reference id. A reference id is an external id, like a Nuix processing id, that was added to documents before they arrived in Disco. This can be ingested in Disco as a reference id and then used to sort productions from Disco so that they come out in the same order they came in. (Of course you should never use preprocessing software like this before Disco; you should simply send natives to Disco, and take advantage of Disco’s automagical processing.)
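Each sort order amounts to a different sort key over the same documents. Here is a small Python sketch of that idea; the field names (kind, sent, family_date, modified, path, reference_id) and the option names are assumptions chosen for the example rather than anything taken from Disco.

```python
from datetime import date


def primary_date(doc):
    """Send date for emails, family date for children, else last modified."""
    if doc["kind"] == "email":
        return doc["sent"]
    if doc["kind"] == "child":
        return doc["family_date"]
    return doc["modified"]


# One key function per sort option (option names are hypothetical).
SORT_KEYS = {
    "custodian_date": lambda d: (d["custodian"], primary_date(d)),
    "custodian_path": lambda d: (d["custodian"], d["path"]),
    "date_path": lambda d: (primary_date(d), d["path"]),
    "path": lambda d: d["path"],
    "reference_id": lambda d: d["reference_id"],
}


def sort_production(docs, order="custodian_date"):
    # sorted() is stable, so documents that tie keep their incoming order.
    return sorted(docs, key=SORT_KEYS[order])


docs = [
    {"name": "A", "custodian": "Lay", "kind": "email", "sent": date(2001, 3, 2),
     "path": "b/a.msg", "reference_id": "N-003"},
    {"name": "B", "custodian": "Lay", "kind": "other", "modified": date(2001, 1, 15),
     "path": "c/b.doc", "reference_id": "N-001"},
    {"name": "C", "custodian": "Skilling", "kind": "email", "sent": date(2000, 12, 1),
     "path": "a/c.msg", "reference_id": "N-002"},
    {"name": "D", "custodian": "Lay", "kind": "child", "family_date": date(2001, 3, 2),
     "path": "b/a.msg/d.xls", "reference_id": "N-004"},
]
for order in SORT_KEYS:
    print(order, [d["name"] for d in sort_production(docs, order)])
```

Note that the attachment D takes its family’s date, so under the default order it lands right after its parent email A instead of being sorted by its own last modified date.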
Stamping. “Do not apply bates stamp to images” results in the stamp appearing in the filenames of produced documents, but not being endorsed on the images themselves. So you will see, for example, Enron000101.pdf, but the PDF will not actually be stamped Enron000101 when you open it or print it.
Omit redactions. You can omit redactions, for example, when creating a set for internal use.
Store OCR text file in same folder as images. This is required by some legacy products to ingest productions successfully. Yuck.
Include all custom fields. This includes custom fields (for example, the type of camera used in taking digital pictures) in the production load files. Custom fields are those non-Disco fields ingested during processing.
Include tags in load file. Just like it sounds. This is useful for archiving a database from Disco: with this option, “Also include natives for all images unless redacted,” and “Include all custom fields” all checked, you have a set of data and load files that can be used to restore a database in Disco years later, for example, if a case goes up on appeal and is then remanded for further proceedings in the trial court.
Avoiding all these options is why we do our best to keep productions simple. Lawyers don’t need these options; they’re just there because of the unfortunate agreements we sometimes find ourselves trapped by in real-world cases and the legacy software we sometimes need to deal with — until Disco replaces it!