Please enjoy this blog post co-authored by Andrew Lyons, eDiscovery Technology Specialist, A&L Goodbody Solicitors, and
Kayode Okubanjo, eDiscovery Analyst, A&L Goodbody Solicitors.
Technology-assisted review (TAR) has gone from a new technology to a critical element in a relatively short period of time. The first judicial approval of TAR in e-discovery was in the United States, in the February 2012 Opinion and Order in Da Silva Moore v. Publicis Groupe. Ireland followed in 2015 and the UK in 2016, and the use of TAR is now an everyday occurrence across the globe.
When embarking on a new project, the question of whether or not to use TAR is often near the top of the list, and most large projects won't go ahead without some element of TAR to speed along the review. However, for those new to discovery, or those not heavily involved in the technology of discovery, it can be difficult to work out exactly what is being proposed, or how the different methods differ. This article gives a brief introduction to TAR and lays out the basics of what you need to know to start taking advantage of it.
What is TAR?
TAR is the use of computer software in legal proceedings or cases to automatically categorise documents. This is driven by a process of using "machine learning" which helps the computer make decisions. Machine learning is a form of Artificial Intelligence (AI) in which the computer first learns from human coding decisions, and then tries to make additional coding decisions independently.
The most basic function of TAR is to classify documents as relevant or irrelevant.
A successful TAR process can evaluate and classify a significant number of documents with fewer hours of manual document review, reducing cost and time while achieving more accurate results. There are a few ways to assess the accuracy of a TAR process, and whether a process has been shown to be sufficiently accurate is often a point of discussion between the parties in a case.
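As a rough illustration of what assessing accuracy can look like, the sketch below compares the computer's decisions with human coding on a small validation sample and reports recall (the share of truly relevant documents the computer found) and precision (the share of documents the computer marked relevant that really are). The choice of these two measures, the function name review_metrics and the sample data are our assumptions for illustration only; the measures and thresholds used in a given case are a matter for the parties.

```python
def review_metrics(human_labels, tar_labels):
    """Compare TAR decisions with human coding on the same validation sample.

    Both arguments are lists of booleans (True = relevant), in the same order.
    """
    pairs = list(zip(human_labels, tar_labels))
    true_pos = sum(1 for human, tar in pairs if human and tar)
    false_pos = sum(1 for human, tar in pairs if not human and tar)
    false_neg = sum(1 for human, tar in pairs if human and not tar)

    # Guard against division by zero when a sample has no positives.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical validation sample of eight documents.
human = [True, True, True, False, False, True, False, False]
tar = [True, True, False, False, True, True, False, False]

metrics = review_metrics(human, tar)
print(metrics)
```

In this made-up sample the computer finds three of the four relevant documents and wrongly flags one irrelevant document, so both measures come out at 0.75.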
What types of TAR do I need to be aware of?
Although there are different types of TAR you might hear about, the basic technology today can be divided into two categories: sample-based learning and active learning. These can also be referred to as TAR 1.0 and TAR 2.0.
TAR 1.0 uses a sampling method to build a training set of documents, which is then used to code other similar documents. The accuracy of the computer's coding is assessed by human reviewers in a blind test over a series of coding rounds, and the process ends when the computer is shown to be sufficiently accurate in its coding decisions. In this method, not all documents identified by the computer as relevant will have been reviewed by a human, so it may be necessary to carry out a secondary review for privilege or other issues.
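The round-based workflow described above can be sketched in a few lines. This is a toy illustration only: the word-counting "model", the function names (train, predict, tar_one), the 0.9 accuracy target and the sample documents are all assumptions made for the example, and real TAR platforms use far more sophisticated models.

```python
import random
from collections import Counter


def train(coded_docs):
    """Toy model: a word scores +1 for each relevant document it appears in, -1 for each irrelevant one."""
    weights = Counter()
    for text, relevant in coded_docs:
        for word in set(text.lower().split()):
            weights[word] += 1 if relevant else -1
    return weights


def predict(weights, text):
    """Predict 'relevant' when the document's words score above zero overall."""
    return sum(weights[word] for word in set(text.lower().split())) > 0


def tar_one(pool, control_set, batch_size=2, target=0.9, seed=0):
    """Run sample-based training rounds until the control-set target is met."""
    rng = random.Random(seed)
    unreviewed = list(pool)
    rng.shuffle(unreviewed)  # documents are sampled at random for training
    training_set, rounds, accuracy = [], 0, 0.0
    while unreviewed and accuracy < target:
        batch, unreviewed = unreviewed[:batch_size], unreviewed[batch_size:]
        training_set.extend(batch)  # stands in for humans coding the batch
        weights = train(training_set)
        # Blind test: compare the model with human coding on the control set.
        agree = sum(predict(weights, text) == label for text, label in control_set)
        accuracy = agree / len(control_set)
        rounds += 1
    return rounds, accuracy, len(unreviewed)


# Hypothetical documents paired with the human coding decision (True = relevant).
POOL = [
    ("supply contract breach", True),
    ("team lunch friday", False),
    ("contract termination notice", True),
    ("office party invite", False),
    ("breach of contract claim", True),
    ("lunch menu update", False),
    ("contract payment dispute", True),
    ("holiday party photos", False),
]
CONTROL = [
    ("contract dispute letter", True),
    ("friday lunch plans", False),
    ("breach notice served", True),
    ("party invite list", False),
]

rounds, accuracy, never_reviewed = tar_one(POOL, CONTROL)
print(rounds, accuracy, never_reviewed)
```

The third value returned is the number of documents no human looked at during training, which is why a secondary review of the computer-coded documents may still be needed.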
TAR 2.0, also known as Continuous Active Learning, has become the default approach for the use of the technology. TAR 2.0 improves on the TAR 1.0 process by providing a more intuitive way of promoting the documents most likely to be relevant for human review, and easier tracking and visualization of the state of the computer model. Documents are continually streamed to human reviewers, with documents assessed by the computer as most likely to be relevant prioritized. Human reviewers keep reviewing until they stop seeing relevant documents, and as they review documents, the computer uses their coding decisions to make up-to-date decisions about which documents are most likely to be relevant.
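A minimal sketch of this loop is shown below, under stated assumptions: the word-counting score, the function names, the batch size of two and the rule "stop after the first batch with no relevant documents" are all simplifications chosen for illustration, not how any particular platform works. In practice the decision to stop a review is usually supported by further validation.

```python
from collections import Counter


def score(weights, text):
    """Toy relevance score: sum of the learned weights of the document's words."""
    return sum(weights[word] for word in set(text.lower().split()))


def continuous_active_learning(docs, human_code, seed_doc, batch_size=2):
    """Serve the highest-scoring unreviewed documents until a batch has no relevant ones."""
    weights = Counter()
    reviewed = {}
    queue = [seed_doc]
    while queue:
        found_relevant = False
        for doc in queue:
            relevant = human_code(doc)  # the human reviewer's decision
            reviewed[doc] = relevant
            found_relevant = found_relevant or relevant
            # The model is updated after every coding decision.
            for word in set(doc.lower().split()):
                weights[word] += 1 if relevant else -1
        if not found_relevant:
            break  # reviewers have stopped seeing relevant documents
        unreviewed = [doc for doc in docs if doc not in reviewed]
        unreviewed.sort(key=lambda doc: score(weights, doc), reverse=True)
        queue = unreviewed[:batch_size]
    return reviewed


# Hypothetical documents with the decision a human reviewer would make (True = relevant).
LABELS = {
    "supply contract breach": True,
    "team lunch friday": False,
    "contract termination notice": True,
    "office party invite": False,
    "breach of contract claim": True,
    "lunch menu update": False,
    "contract payment dispute": True,
    "holiday party photos": False,
}

reviewed = continuous_active_learning(
    docs=list(LABELS),
    human_code=lambda doc: LABELS[doc],
    seed_doc="supply contract breach",
)
print(sum(reviewed.values()), "relevant found in", len(reviewed), "documents reviewed")
```

In this toy run all four relevant documents are served to the reviewer early, and the review stops after seven of the eight documents, with one irrelevant document never needing human review.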
TAR 3.0 has also been introduced recently. The key aim of TAR 3.0 is to leverage the features of TAR 2.0 with new techniques that will allow for earlier and smarter identification of relevant documents. Research is ongoing into how best to implement TAR 3.0 practically.
What is not TAR?
TAR is a term that can be applied to many technical methods used to refine data. However, not all of these involve predictive coding, or identifying documents that are likely to be relevant through continuous learning. Examples include sentiment analysis, communications analysis and image analysis. Although these methods can be classified as Technology Assisted Review, the abbreviation 'TAR' is generally reserved for the use of predictive coding.
How do I know what kind of TAR to use / what kind of TAR I am using?
It is unlikely that you are using TAR 1.0. TAR 1.0 was effectively rendered obsolete by TAR 2.0, which introduced continuous learning and removed the need for training rounds. Unless stated otherwise, you are most likely using TAR 2.0 - and if you are not, you should be!
Benefits of TAR
There are a number of key benefits which TAR can bring to a document review:
- Time and cost reductions – reviewing fewer documents and freeing up lawyers to focus on more substantive legal work
- Improved accuracy – minimises human error and manual review, and allows more time for document analysis
- Early access to information – allows legal teams to determine facts and problems more efficiently by isolating the most relevant documents more quickly
- Organisation – allows for quicker triage of data and removal of irrelevant documents. We also see improvements when dealing with rolling data sets or reviews, as TAR results can be carried through to new data in the case