Many articles, webinars, blog posts and conference sessions have been dedicated to exploring Technology Assisted Review (commonly known as TAR), and what exactly it means. The abundance of terminology and technical language, combined with the fast-moving pace of the field, means that it's very easy to lose track of which acronym means what.
It's possible to encounter situations where any use of technology or analytics within a document review is referred to as TAR; however, it's most commonly used to refer to the use of computer algorithms to help identify relevant documents within the data set. The algorithms are trained by human reviewers, who teach them how to identify relevant documents – a process also known as "predictive coding".
CAL, or Continuous Active Learning, is often heard in the same breath as TAR – but what exactly is the difference between the two? Can we use the terms interchangeably, and what is going on behind the scenes to make them different?
"TAR" is generally used as shorthand for TAR 1.0, which is the first iteration of predictive coding widely available within the industry, and recognized by courts. Similarly, you may also hear CAL referred to as TAR 2.0, as it’s the second incarnation of predictive coding to gain popularity, and to be implemented widely. So both TAR and CAL refer to predictive coding methods, but why are they different?
The difference between TAR and CAL is the underlying computer algorithm used to predict which documents are relevant to the case. The choice of computer algorithm affects how human reviewers train the system, how resilient the algorithm is against incorrect human tagging, and the practical considerations you'll need to be aware of to get the best results out of your review.
In general, TAR 1.0 requires a training phase, followed by a review phase that begins once a control set of pre-reviewed documents shows that the optimal point has been reached – at that point, the computer stops learning. During a TAR 1.0 review, not all documents categorized by the algorithm as relevant will have been seen by a human, so extra effort may be required to locate privileged or sensitive material in your relevant data set. For CAL reviews, the computer continuously learns from human coding decisions, and prioritizes human review of the documents most likely to be relevant to the matter. Review continues until all documents identified by the algorithm as relevant have been checked by a human reviewer. For many teams, CAL (TAR 2.0) has replaced TAR 1.0 as the preferred approach to predictive coding.
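To make the contrast concrete, here is a minimal Python sketch of the two workflows. Everything in it is an assumption for illustration: the toy word-weight scorer stands in for whatever algorithm a real review platform uses, the names `cal_review`, `tar1_review`, `DOCS` and `is_relevant` are hypothetical, and the control set that TAR 1.0 uses to decide when to stop training is left out entirely.

```python
from collections import defaultdict

def score(weights, doc):
    """Average learned word weight; above zero is treated as 'predicted relevant'."""
    words = doc.lower().split()
    return sum(weights[w] for w in words) / max(len(words), 1)

def update(weights, doc, relevant):
    """Nudge each distinct word's weight up or down after one human coding decision."""
    step = 1.0 if relevant else -1.0
    for w in set(doc.lower().split()):
        weights[w] += step

def tar1_review(docs, human_label, training_ids):
    """TAR 1.0 style: train once, freeze the model, then classify the rest."""
    weights = defaultdict(float)
    for i in training_ids:
        update(weights, docs[i], human_label(docs[i]))
    # The model no longer learns; these documents are predicted relevant
    # but have not been seen by a human reviewer.
    return [i for i in range(len(docs))
            if i not in training_ids and score(weights, docs[i]) > 0]

def cal_review(docs, human_label, seed_ids, batch_size=2):
    """CAL style: humans review the top-ranked documents and the model keeps learning."""
    weights = defaultdict(float)
    coded = {}
    for i in seed_ids:
        coded[i] = human_label(docs[i])
        update(weights, docs[i], coded[i])
    while True:
        # Unreviewed documents the current model still predicts as relevant.
        candidates = [i for i in range(len(docs))
                      if i not in coded and score(weights, docs[i]) > 0]
        if not candidates:
            break  # every predicted-relevant document has been checked by a human
        candidates.sort(key=lambda i: score(weights, docs[i]), reverse=True)
        for i in candidates[:batch_size]:
            coded[i] = human_label(docs[i])     # human coding decision
            update(weights, docs[i], coded[i])  # model learns from it immediately
    return coded

# Hypothetical six-document data set; a "reviewer" who tags anything about the merger.
DOCS = [
    "merger agreement draft attached",
    "lunch menu for friday",
    "revised merger terms attached",
    "friday parking notice",
    "comments on merger terms",
    "office lunch schedule",
]

def is_relevant(doc):
    return "merger" in doc

if __name__ == "__main__":
    print("TAR 1.0 predicted relevant, unseen by a human:",
          tar1_review(DOCS, is_relevant, training_ids=[0, 1]))
    coded = cal_review(DOCS, is_relevant, seed_ids=[0, 1])
    print("CAL human-confirmed relevant:",
          sorted(i for i, r in coded.items() if r))
```

In this toy run both approaches surface the same merger documents. The difference is who has looked at them: the TAR 1.0 function returns predictions nobody has reviewed, while the CAL loop only stops once a human has coded every document the model flags, and the clearly irrelevant documents are never put in front of a reviewer at all.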
Those in the know may also have heard the rumble of TAR 3.0 approaching, with the promise of further improvements to the predictive coding process. TAR 3.0 takes the principle of CAL and applies it to clusters of documents.
Hopefully this has given you enough grounding to start investigating the differences between TAR and CAL. If you're interested in learning more, please check out the session "Wait, Aren't They the Same Thing?! – The TAR v. CAL Duel" at ILTACON this summer, 22-26 August 2021, which promises to be a deep dive into the direction AI is moving in when it comes to applying analytics to your files. This session will explore the subtle differences between CAL and TAR and how each can help your case, and is not to be missed!