Software patents after Alice Corp. v. CLS Bank Int’l

Last week, in Alice Corp. v. CLS Bank Int’l, the Supreme Court held that you cannot patent the combination of (a) a known fundamental procedure or formula and (b) a known computer configuration used to automate the application of the procedure or formula. Because many software inventions fall under this description, the Supreme Court’s decision means that many software inventions are likely not patentable — and that many software patents issued before Alice are likely invalid.

What Alice says

The Supreme Court has long held that “laws of nature, natural phenomena, and abstract ideas” are not patentable, but that “applications of such concepts to a new and useful end” are patentable. Op. at 5-6. Alice is about how to tell the difference between an abstract idea and a patentable application of an abstract idea.

Alice sets out a two-part test: (1) does the invention involve an abstract idea; and (2) if so, are the other elements of the invention inventive independent of the abstract idea. Op. at 7. If yes, the invention is patentable; if not, not.

Alice does not set out a test for determining whether something is an abstract idea. We know from Alice and the Court’s earlier cases that long-existing and widespread economic or commercial practices (hedging, intermediated settlement, payroll, double-entry accounting, etc.) are abstract ideas. Op. at 8-10. Mathematical formulas and algorithms are also abstract ideas. Op. at 8. But the Court does not offer any definition beyond examples like these.

Even if an invention involves an abstract idea, it may be patentable if it satisfies the second part of the Alice test, that is, if the other parts of the invention are independently inventive. The operative holding of Alice is that adding “on a computer,” or other words to the same effect, does not satisfy this test. Op. at 13. If your invention is the automatic application of an abstract idea on a computer, and nothing else, then your invention is not patentable under Alice.

This is the holding that clarifies the law around software patents.

What Alice doesn’t say

Alice doesn’t say how to tell when something is an abstract idea. But it does explain that the reason for this exception from patentability is the fear that patents on abstract ideas will sweep too broadly, will remove from use basic building blocks of future inventions, and will, in doing so, retard the invention that patent law is supposed to promote. The definition suggested by this reasoning is that abstract ideas are those whose protection is likely to retard invention more than promote it. Ideas that are well known and longstanding, ideas that are new but basic to a field (“laws of nature”), and the like would be abstract ideas under this definition.

Alice also doesn’t say how much more than implementation on a generic computer would qualify as an independently inventive part of the invention. Implementing an abstract idea on novel hardware — computer components that were themselves genuine inventions, like a working quantum computer — would qualify. But how far short of that can you go? How about implementing an abstract idea on a known hardware configuration, but where the choice of that hardware configuration (or network or system architecture) is a genuine invention? Document-based databases; fast, redundant solid-state storage; and cached web servers may all be known, but might the choice of this architectural combination as the one on which to run a new search algorithm be patentable even though the new search algorithm is an abstract idea?

The questions after Alice are whether a sufficiently narrow algorithm or computational method can be for that reason not an “abstract idea”; and whether, if it is one, a choice or configuration of computing hardware that makes the algorithm or computational method particularly effective can make it patentable even though the configuration (as opposed to its selection for this use case) is not itself new.


Mass tagging

You can mass-tag documents directly from the search screen.

Let’s say you’re doing an initial pass to identify documents that are potentially privileged. You are doing this by searching for law-firm domain names and mass tagging them Attorney-Client Privilege for later review. Here is how you would do this in Disco:

1. Run a search to find the documents you want to mass tag, for example, domain( to find all emails to or from Vinson & Elkins’s domain:

Search for V&E emails in the Enron dataset

2. Click the “tag all results” button in the top right of the search results grid to tag all these search results. If you want to tag only a subset of them, use the check boxes to the left of each result. Notice that if you click checkboxes, the “tag all results” button changes to “tag selected results.”

Button text updates

3. A mass-tag dialog will appear.

Pick tags to apply and remove and how far you want the changes to propagate

First, select the tags you want to apply and the tags you want to remove. You can apply and remove multiple tags at the same time in a single operation.

Second, pick how far you want your tag changes to propagate: by default, when you tag an email, the tags flow down to the email’s attachments. You can use the checkboxes to turn this off or to propagate tags up from emails to the email conversations that include them, up from email attachments to the emails that contain them, and up one more time from those emails to the email conversations that contain them. Together, these options let you decide whether to include complete families and complete conversation threads in your mass tagging operation.
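One way to picture the propagation options above is as a function that expands your selection before any tags are changed. This is only a sketch and not Disco's implementation: the `Doc` class, its fields, and the option names (`down_to_attachments`, `up_to_parent_email`, `up_to_conversation`) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Doc:
    doc_id: str
    kind: str                              # "email" or "attachment"
    parent_id: Optional[str] = None        # containing email, for attachments
    conversation_id: Optional[str] = None  # thread, for emails
    tags: set = field(default_factory=set)


def expand_selection(selected, docs, down_to_attachments=True,
                     up_to_parent_email=False, up_to_conversation=False):
    """Return the ids of every document a mass tag would touch."""
    by_id = {d.doc_id: d for d in docs}
    targets = set(selected)
    if up_to_parent_email:
        # attachment -> the email that contains it
        for doc_id in list(targets):
            doc = by_id[doc_id]
            if doc.kind == "attachment" and doc.parent_id:
                targets.add(doc.parent_id)
    if up_to_conversation:
        # email -> every email in the same conversation
        convs = {by_id[i].conversation_id for i in targets
                 if by_id[i].kind == "email" and by_id[i].conversation_id}
        targets |= {d.doc_id for d in docs
                    if d.kind == "email" and d.conversation_id in convs}
    if down_to_attachments:
        # email -> its attachments (the default behavior described above)
        targets |= {d.doc_id for d in docs
                    if d.kind == "attachment" and d.parent_id in targets}
    return targets


def mass_tag(selected, docs, apply=(), remove=(), **options):
    """Apply and remove tags on the expanded selection in one operation."""
    targets = expand_selection(selected, docs, **options)
    for d in docs:
        if d.doc_id in targets:
            d.tags |= set(apply)
            d.tags -= set(remove)
    return targets
```

With all three options on, tagging a single attachment reaches its parent email, the rest of the conversation, and every attachment in that conversation, which is the "complete families and complete threads" case.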

When you’re done, click “Tag 388 results.”

4. A processing wheel will appear beside the top navigation bar in Disco. You can continue with your work while the mass tag runs. When it’s done, the wheel will disappear.

When the wheel disappears, the mass tag is complete

That’s it! Mass tagging made easy.

. . . and advanced productions too

Our last post was about simple productions, which is how all productions should be.

Occasionally, though, you’ll be stuck with an agreement with the other side calling for some specific format option or other. These are usually negotiated by people who don’t know better, but by the time it comes to produce, if the other side won’t agree to a sensible format (one like our EDRM load file + multipage PDF that will save them 50% or more on their own hosting bill relative to legacy TIFF-based formats), there’s often nothing worth doing about it; you’re not going to take a production-format fight to court.

Disco gives you the options you need to deal with these requirements. Just click “Show Advanced Options” on the New Production page. You’ll see this:

Advanced options . . . you should never need them, but sometimes you do.

Volumes. Here you can pick your starting volume label and how volumes are broken. “Volumes” are just folders that divide up a large production into separate pieces. Volumes can be broken by custodian, by size, or not at all using the “Volume Break” option. Volumes can be labeled with something like a Bates number, so, for example, if this is a rolling production, you can start the volume labels at a higher number, like VOL0005, or wherever you left off after your last production.
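To make the volume options concrete, here is a rough sketch of how labels and breaks could be computed. It is not how Disco does it internally: the function names, the `volume_break` values ("none", "custodian", "size"), and the `max_bytes` parameter are all assumptions made for illustration.

```python
import re


def next_label(label):
    """Increment the numeric suffix of a label, keeping its zero padding."""
    prefix, digits = re.fullmatch(r"(.*?)(\d+)", label).groups()
    return f"{prefix}{int(digits) + 1:0{len(digits)}d}"


def assign_volumes(files, start_label="VOL0001", volume_break="none",
                   max_bytes=None):
    """Return one volume label per file.

    files: list of (custodian, size_in_bytes) in production order.
    max_bytes must be given when volume_break is "size".
    """
    labels, label, used, last_custodian = [], start_label, 0, None
    for index, (custodian, size) in enumerate(files):
        new_volume = False
        if index > 0:
            if volume_break == "custodian" and custodian != last_custodian:
                new_volume = True
            elif volume_break == "size" and used + size > max_bytes:
                new_volume = True
        if new_volume:
            label, used = next_label(label), 0
        labels.append(label)
        used += size
        last_custodian = custodian
    return labels
```

For a rolling production you would pass the label you left off at, for example `assign_volumes(files, start_label="VOL0005", volume_break="custodian")`.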

Deduplication level. You can also pick your deduplication level. Most review platforms force you to make a deduplication decision when you process and load data into the system. Disco doesn’t do this. Instead, we store every individual instance of a document that we encountered during processing. During review, you can review deduplicated documents, but you can also abstract further to threads or families or drill down deeper to the underlying instances. Similarly, during production, you can pick what deduplication level you want for each individual production.

By default, we produce one copy per custodian and one copy per parent. This means that family members that are in a production appear together and that each custodian who had a document gets at least one copy of it (plus a further copy for each additional family in that custodian’s data that contains the document). This is what people expect when they print out sections of a production or review them linearly in the order in which they are produced: you will see emails, then their attachments, then emails, then their attachments, then free-floating documents, etc., just as though you were looking at a paper collection or production.

You can also pick no deduplication, meaning you get one copy per original instance of each produced document; one copy per custodian, meaning you get exactly one copy for each custodian, even if the document occurs in multiple families in that custodian (meaning that families will not be together); or full deduplication, meaning that you get only one copy of each document in the entire production. In all cases, deduplicated records are accompanied with concatenated metadata, so that you do not lose metadata when you deduplicate the documents. The higher the level of deduplication you choose, the smaller the production, because there are fewer images to separately stamp.
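The four levels differ only in what counts as "the same copy." A minimal sketch, assuming each stored instance carries a content hash, a custodian, and the id of its parent: the `Instance` type, the level names, and the semicolon-joined metadata format are hypothetical, and this is not Disco's actual code.

```python
from typing import NamedTuple, Optional


class Instance(NamedTuple):
    content_hash: str
    custodian: str
    parent_id: Optional[str]
    path: str


def dedup_key(inst, index, level):
    """Instances with the same key collapse into one produced copy."""
    if level == "none":
        return index  # every instance is its own copy
    if level == "custodian_and_parent":  # the default described above
        return (inst.content_hash, inst.custodian, inst.parent_id)
    if level == "custodian":
        return (inst.content_hash, inst.custodian)
    if level == "full":
        return inst.content_hash
    raise ValueError(f"unknown deduplication level: {level}")


def deduplicate(instances, level="custodian_and_parent"):
    """Group instances by key and concatenate their metadata."""
    groups = {}
    for index, inst in enumerate(instances):
        groups.setdefault(dedup_key(inst, index, level), []).append(inst)
    records = []
    for members in groups.values():
        records.append({
            "content_hash": members[0].content_hash,
            "custodians": "; ".join(sorted({m.custodian for m in members})),
            "paths": "; ".join(m.path for m in members),
        })
    return records
```

Because the grouping happens at production time rather than at load time, the same stored instances can be produced at different levels in different productions.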

Natives. Certain file formats are best produced as natives because even the best imaging results in a substantial degradation in the usefulness of the file. By default, we produce CAD files and Excel files as natives with slipsheets, but you can choose other filetypes to produce as natives with slipsheets. Clicking in the box displays a list of filetypes in the database you’re working in to make it easy to select additional ones. The “Also include natives for all images unless redacted” option allows you to include natives in addition to images (rather than instead of images). Whenever a native has redactions, these options will be overridden, and a redacted image will be produced instead of the native.

Sort order. By default, productions are sorted by custodian then, within each custodian, by primary date (meaning send date for emails, family date for children, and last modified date for other documents). But you can choose to sort productions by custodian, then path; date, then path; path only; or by reference id. A reference id is an external id, like a Nuix processing id, that was added to documents before they arrived in Disco. This can be ingested in Disco as a reference id and then used to sort productions from Disco so that they come out in the same order they came in. (Of course you should never use preprocessing software like this before Disco; you should simply send natives to Disco, and take advantage of Disco’s automagical processing.)
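Each sort order is just a different sort key. The sketch below assumes documents are plain dictionaries; the field names (`kind`, `send_date`, `family_date`, `last_modified`, `reference_id`) and the order names are made up for this example and are not Disco's schema.

```python
from datetime import date


def primary_date(doc):
    """Send date for emails, family date for children, else last modified."""
    if doc.get("family_date") is not None:
        return doc["family_date"]
    if doc["kind"] == "email":
        return doc["send_date"]
    return doc["last_modified"]


SORT_KEYS = {
    "custodian_then_date": lambda d: (d["custodian"], primary_date(d)),
    "custodian_then_path": lambda d: (d["custodian"], d["path"]),
    "date_then_path": lambda d: (primary_date(d), d["path"]),
    "path": lambda d: d["path"],
    "reference_id": lambda d: d["reference_id"],
}


def sort_production(docs, order="custodian_then_date"):
    """Return docs in the chosen production order."""
    return sorted(docs, key=SORT_KEYS[order])


# Made-up sample data: an email, its attachment, and a loose file.
sample = [
    {"id": "A", "kind": "email", "custodian": "Holcombe", "path": "/inbox/2",
     "send_date": date(2001, 5, 2), "reference_id": "R3"},
    {"id": "B", "kind": "file", "custodian": "Taylor", "path": "/docs/1",
     "last_modified": date(2001, 1, 9), "reference_id": "R1"},
    {"id": "C", "kind": "file", "custodian": "Holcombe", "path": "/inbox/2/att",
     "last_modified": date(1999, 1, 1), "family_date": date(2001, 5, 2),
     "reference_id": "R2"},
]
```

Note how the attachment sorts next to its parent email under the default order because it uses the family date rather than its own last modified date.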

Stamping. “Do not apply bates stamp to images” results in the stamp appearing in the filenames of produced documents, but not being endorsed on the images themselves. So you will see, for example, Enron000101.pdf, but the PDF will not actually be stamped Enron000101 when you open it or print it.

Omit redactions. You can omit redactions, for example, when creating a set for internal use.

Store OCR text file in same folder as images. This is required by some legacy products to ingest productions successfully. Yuck.

Include all custom fields. This includes custom fields (for example, the type of camera used in taking digital pictures) in the production load files. Custom fields are those non-Disco fields ingested during processing.

Include tags in load file. Just like it sounds. This is useful for archiving a database from Disco: with this option, “Also include natives for all images unless redacted,” and “Include all custom fields” all checked, you have a set of data and load files that can be used to restore a database in Disco years later, for example, if a case goes up on appeal and is then remanded for further proceedings in the trial court.


Avoiding all these options is why we do our best to keep productions simple. Lawyers don’t need these options; they’re just there because of the unfortunate agreements we sometimes find ourselves trapped by in real-world cases and the legacy software we sometimes need to deal with — until Disco replaces it!



Simple productions

Producing documents doesn’t have to be hard.

Producing in Disco: pick tags, Bates stamps, confidentiality stamps, format, and go!

Disco’s production interface presents only the options lawyers actually care about. Here’s how it works:

  1. Pick the documents you want to produce using tags. Disco defaults to producing all documents tagged responsive, but not any documents that have a privilege tag. As you change your selections of tags to include in or exclude from the production, the plain English sentence underneath the tag selection boxes updates so that you can always see exactly what you’re doing.
  2. Next, pick your Bates prefix and starting number. These default to the last prefix you used and the next number.
  3. If you have confidentiality stamps to apply, click “Add a Stamp” and then pick the tag and the exact text from your protective order for the stamp. You can add as many of these as you need. For example, you can stamp every document tagged “Confidential” with the text “Highly Confidential” and every document tagged “AEO” with the text “Highly Confidential — Attorneys’ Eyes Only.”
  4. Finally, pick a production format. We support the industry-agreed EDRM format as well as legacy Concordance and Summation formats and native-only exports. Our productions load into all modern review tools that the other side might be using — although of course they should use Disco! — and meet the requirements of, for example, the SEC or DOJ.
  5. Click “Create,” see the production begin to run, and receive an email when it’s complete and ready for download. If it’s too big to download over your Internet connection (more than 100 GB or so), you can ask Disco operations to ship you an encrypted drive with the contents.
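The first two steps above boil down to a tag filter followed by sequential numbering. Here is a minimal sketch under stated assumptions: the function names are hypothetical, the six-digit padding simply mirrors numbers like Enron000101, and giving each page its own Bates number is a common convention that we assume here, not a description of Disco's internals.

```python
def select_for_production(docs, include_tags, exclude_tags):
    """Pick documents with at least one included tag and no excluded tag.

    docs: mapping of document id -> set of tags.
    """
    include, exclude = set(include_tags), set(exclude_tags)
    return [doc_id for doc_id, tags in docs.items()
            if tags & include and not tags & exclude]


def bates_numbers(prefix, start, page_counts, width=6):
    """Give each document a (first, last) Bates range, one number per page.

    Returns the ranges and the next unused number, which would become the
    default starting number for the following production.
    """
    ranges, number = [], start
    for pages in page_counts:
        first, last = number, number + pages - 1
        ranges.append((f"{prefix}{first:0{width}d}",
                       f"{prefix}{last:0{width}d}"))
        number = last + 1
    return ranges, number
```

For example, `select_for_production(docs, {"Responsive"}, {"Attorney-Client Privilege"})` expresses the default of producing everything tagged responsive except privileged documents.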

That’s it. That’s how productions should be.

Search quick reference

Here is a quick reference to searching in Disco.

  • (space) . . . or
  • & . . . and
  • % . . . not
  • /n . . . proximity
  • “ ” . . . phrase
  • ! . . . stemming / truncation / trailing wildcard
  • ~ . . . fuzzy / typo search
  • field(terms) . . . search field for terms


  • contract payment . . . matches documents that contain either contract or payment
  • contract & payment . . . matches documents that contain both contract and payment
  • contract % payment . . . matches documents that contain contract, but not payment
  • “contract payment” . . . matches documents that contain the phrase contract payment
  • contract! . . . matches documents that contain any term starting with contract, for example, contract, contracts, contracted, contractually, etc.
  • guaranty~ . . . matches documents that contain any term similar to guaranty, for example, guaranty, guarantee, garantee, garanty, etc.
  • contract /10 signed . . . matches documents that contain contract and signed within 10 words of each other in any order
  • custodian(Holcombe) . . . matches documents that contain Holcombe in the custodian field
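If it helps to see the operators as logic, the toy matcher below mirrors the examples above on a single tokenized document. It is a sketch, not Disco's search engine: the function names are invented, the proximity check assumes "within n words" means the word positions differ by at most n, and the fuzzy match swaps in Python's `difflib.SequenceMatcher` with an arbitrary 0.8 threshold as a stand-in for whatever similarity measure a real engine uses.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has(doc, term):
    """Plain term, or a trailing-! term that matches any word with that prefix."""
    if term.endswith("!"):
        return any(w.startswith(term[:-1]) for w in doc)
    return term in doc


def either(doc, a, b):       # contract payment
    return has(doc, a) or has(doc, b)


def both(doc, a, b):         # contract & payment
    return has(doc, a) and has(doc, b)


def but_not(doc, a, b):      # contract % payment
    return has(doc, a) and not has(doc, b)


def phrase(doc, words):      # "contract payment"
    n = len(words)
    return any(doc[i:i + n] == words for i in range(len(doc) - n + 1))


def within(doc, a, b, n):    # contract /10 signed, in any order
    pos_a = [i for i, w in enumerate(doc) if w == a]
    pos_b = [i for i, w in enumerate(doc) if w == b]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)


def fuzzy(doc, term, threshold=0.8):   # guaranty~
    return any(SequenceMatcher(None, term, w).ratio() >= threshold
               for w in doc)
```

Each function takes the token list returned by `tokens(...)`, so `within(tokens(text), "contract", "signed", 10)` corresponds to the query contract /10 signed.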

Standard document fields

  • text()
  • id()
  • batesnumbers()
  • tag()
  • custodian()
  • author()
  • filename()
  • path()
  • folder()
  • to()
  • cc()
  • pagecount()
  • parentcount()
  • childcount()
  • documentnotes()
  • domain()

Search queries that do not specify a field (e.g., contract & payment) search a combined index of document text, ID, Bates numbers, document notes, custodians, subject, to, from, and cc.

Use the keyword “to” to search a range, for example, batesnumbers(Enron000001 to Enron000101).
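A range like this can be read as "same prefix, number between the two endpoints." The sketch below makes that reading explicit; treating both endpoints as inclusive is our assumption for illustration, and the helper names are hypothetical.

```python
import re


def split_bates(value):
    """Split a Bates number such as Enron000101 into ("Enron", 101)."""
    prefix, digits = re.fullmatch(r"(.*?)(\d+)", value).groups()
    return prefix, int(digits)


def in_range(value, range_expr):
    """True if value falls inside a "LOW to HIGH" range (inclusive)."""
    low, high = (part.strip() for part in range_expr.split(" to "))
    prefix, number = split_bates(value)
    low_prefix, low_number = split_bates(low)
    high_prefix, high_number = split_bates(high)
    return (prefix == low_prefix == high_prefix
            and low_number <= number <= high_number)
```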

Date fields

  • date() . . . primary date, which is sent date for email and last modified date for everything else
  • createdate() . . . for native file
  • lastmodifieddate() . . . for native file
  • loaddate() . . . date loaded into Disco
  • senddate() . . . date email was sent
  • receiveddate() . . . date email was received
  • familydate() . . . for all members of a family, the primary date of the parent (you can also use family date to sort documents and see parents and attachments grouped together)
  • alldates() . . . combines all dates

Example date searches: date(after 06/20/2012), date(before 2012), date(after 06/20/2012 & before 07/20/2012), date(on 06/20/2012), date(06/20/2012), date(2012)
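Each of these date expressions can be thought of as an inclusive date range. The sketch below turns the examples above into (start, end) pairs. It rests on assumptions: we treat "after" and "before" as excluding the named day or year, we only handle the MM/DD/YYYY and YYYY forms shown, and the function names are invented for this example.

```python
from datetime import date, timedelta


def parse_point(text):
    """Return (first_day, last_day) covered by MM/DD/YYYY or YYYY."""
    if "/" in text:
        month, day, year = (int(part) for part in text.split("/"))
        day_value = date(year, month, day)
        return day_value, day_value
    year = int(text)
    return date(year, 1, 1), date(year, 12, 31)


def parse_date_query(expr):
    """Return an inclusive (start, end) range for a date(...) argument."""
    start, end = date.min, date.max
    for clause in expr.split("&"):
        words = clause.split()
        # A bare date or year behaves like "on".
        op = words[0] if words[0] in ("after", "before", "on") else "on"
        first, last = parse_point(words[-1])
        if op == "after":
            start = max(start, last + timedelta(days=1))
        elif op == "before":
            end = min(end, first - timedelta(days=1))
        else:
            start, end = max(start, first), min(end, last)
    return start, end
```

A document then matches when its date falls between `start` and `end`, so combining clauses with & simply narrows the range.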

True / false fields

  • hasprivilege(true) or hasprivilege(false) . . . true if document has at least one privilege tag
  • hasredactions(true) or hasredactions(false) . . . true if document is marked for redaction in Disco
  • hasdocumentnote(true) or hasdocumentnote(false) . . . true if document has a note

Try searching now — with 0.3 second search results on even the largest multi-TB datasets — in the live demo.

Single, simple search bar

Disco’s single, simple search bar controls all search features in Disco.

Single, simple search bar on top of the review screen.

You don’t have to wade through wizards or advanced search screens or long lists of checkboxes and dropdowns. Instead, you can control everything by just typing in searches.

If you need help, Disco provides an unobtrusive search builder, search examples, search history, saved searches, and assignments right underneath the search bar on the same screen.

Additional search features appear underneath the search box.

The first time you click in the search box, the search builder is shown by default. To hide it, uncheck the “show search builder by default” checkbox in the lower-right corner.

Search builder

Use the search builder when you’re first getting started.

Disco leverages lawyers’ existing knowledge of Westlaw- and Lexis-style syntax to minimize training time. For example, we use the field(terms) format for field searching (e.g., custodian(Taylor)), /n for proximity searching (e.g., lunch /40 shred), and ! for stemming (e.g., lunch /40 shred!). Search examples gives examples of the most common kinds of searching:

Refresh your recollection on search syntax using search examples.

Clicking on the blue text does what you would expect: custodians pops up a sortable list of custodians and document counts, and clicking on a custodian name runs a search for that custodian; document type does the same thing with document types; etc. Document fields shows a popup with the searchable metadata fields in the particular database you are reviewing, all of which can be searched using the field(terms) syntax.

Everything does what you expect.

Our approach to search emphasizes power and minimalism: we give lawyers tools they already know how to use and minimize the distance between these tools and the data. No separate screens for search. No wizards or endless visual options. Just powerful search syntax that lawyers already know, right on top of the results.

Oh, and of course results that show up only 0.3 seconds later, even on the largest multi-TB datasets.

Visual overhaul, instances, tied multiscreen, advanced production options

Today we released a new version of Disco. Read the press release here.

Some highlights from the release:

  • Visual overhaul of the entire user experience, increasing our focus on minimalism and simplicity.
  • 3x improvement in document-processing speed.
  • Instances: we now maintain individual instances for deduplicated documents, letting you drill down from deduplicated documents to instances during review and defer deduplication decisions until the time of action (e.g., generating a privilege log or doing a production).
  • Tied multiscreen view to support review on multiple monitors.
  • Advanced production options like choice of production sort order and deduplication level.

Try the new version at