ILTA Just in Time: Agents: Where GenAI Value Actually Lives

By Floor Blindenbach posted 2 days ago


Please enjoy this blog coauthored by Ruark W. Chick, Chief Information Officer, Jones Walker LLP & Floor Blindenbach-Driessen, Founder, Organizing4Innovation

Agents: Where GenAI Value Actually Lives

Most GenAI tools make existing work faster. Agents make different work possible.
An attorney who spent twenty minutes reconstructing her timesheet at the end of each day now spends four. A knowledge management team that once fielded the same research questions repeatedly has encoded their best answers into something that works while they sleep. A partner who could never quite keep up with inbox triage has a standing system that surfaces what needs attention before the morning meeting starts.
These are not dramatic transformations. They are quietly significant ones — and they represent something meaningfully different from what a faster search or a better first draft delivers. Agents, when they work, change the shape of a day.
The question is not whether agents are valuable. It is how organizations move from a list of promising ideas to agents that people actually trust and use. That gap — between potential and deployment — is where most of the real work happens.

Where the Market Actually Sits

A knowledge management team recently ran a workshop and surfaced more than a dozen thoughtful ideas for GenAI agents. The ideas were practical, well-scoped, and closely tied to real work. Months later, none had been deployed.
At first glance, this reads as stalled momentum. In reality, it is more indicative of where the market sits today. Agentic technologies are advancing quickly, but not evenly. Capabilities and interfaces are evolving faster than most organizations can comfortably integrate them into established workflows and governance models. Firms are generating strong, actionable ideas while simultaneously working through when and how those ideas should move from exploration into production.
This is less a failure of execution and more a hallmark of a platform shift. As GenAI evolves from isolated prompting to agent-driven, workflow-embedded systems, the constraint is no longer idea generation. It is the ability to select, structure, and scale the right use cases with intention.

What an Agent Is — and Why the Definition Matters

Part of the challenge is that the term “agent” is being used to describe very different things. At one end, an agent is essentially a standing approach to a recurring task — a structured way of working so users do not need to re-explain their intent in each prompt. At the other end, agents more closely resemble systems: connected to shared data, able to invoke tools, trigger workflows, and take action beyond generating text.
What distinguishes agents from earlier automation is the engine: a Large Language Model that can interpret ambiguous instructions, handle variation, and operate without explicit rules for every situation. That distinction matters for ownership, risk, and expectations. The definitions are also blurring, not because the term is used imprecisely, but because the technology itself is converging: as agent frameworks mature, the boundaries between these categories are becoming less rigid, and open ecosystems have accelerated that convergence considerably.
There is also a subtler challenge. At the technology level, agents feel almost magically easy: ask anything, get anything, and it “just works.” In reality, they behave much more like software built on probabilistic, imperfect systems. The same question can produce different answers. Edge cases surface quickly. None of the traditional concerns around testing, ownership, permissions, or failure handling disappear. What is new is that many people do not realize they have stepped into a software development role.

Three Categories That Should Not Be Mixed

Once teams begin talking seriously about agents, there is a tendency to treat all ideas as variations of the same problem. A more durable approach is to distinguish between three broad categories: Personal agents, Workflow agents, and Enterprise agents. Each carries a different set of expectations around ownership, testing, governance, and what it actually means to be “deployed.”

Category	Build	Buy / Out-of-the-box	Right for…	The real challenge
Personal Agents	Low-code agent builders, AI assistant customization features	AI writing assistants, built-in AI features in productivity tools	Personal productivity, recurring tasks already solved in chat, small groups	Unplanned sharing. A personal agent that has never been stress-tested can quietly become informal firm-wide infrastructure — and with it, an unmanaged data leakage and access control risk.
Workflow Agents	Full low-code agent platforms, AI coding assistants	Workflow-specific solutions (contract management, due diligence platforms) and similar purpose-built legal AI	Specific repeatable workflows, practice-level consistency, shared data access	Easy to start, hard to deploy. Buy solutions face trust and fit challenges; build solutions face permissions, data scoping, and failure handling — complexity that teams consistently underestimate.
Enterprise Agents	Cloud AI platforms, API-based builds, custom development	Enterprise software with embedded LLM capability	Firm-wide or client-facing solutions where reliability, testing, and ownership are non-negotiable	The build starts faster than traditional development. What responsible deployment requires has not changed.

Each category has both a build path and a buy path. At every level, the buy option generally deploys faster and fits less precisely. The build option is the inverse — and that tradeoff applies to agents exactly as it does to any other software decision.

One distinction tends to hold across all three categories:
Agents that surface, summarize, or structure tend to hold up well. Agents expected to validate, approve, or decide tend to struggle.
A policy finder that surfaces relevant documents is useful on day one. A policy checker that must confirm whether an expense complies will not be, because it must be right every time. Same domain. Completely different success criteria. Most early agent disappointments come from confusing the two.

Sort Before You Build

The single most useful step — and the one most consistently skipped — is sorting an agent idea into the right category before building anything. The category determines ownership, testing requirements, and what deployment actually involves. Treating all ideas as roughly the same kind of project is one of the most reliable ways to ensure none of them ship.
A few questions can confirm which category an idea belongs to, and whether building is even the right starting point.

Personal Agent	Workflow Agent	Enterprise Agent
Most answers yes? Build or adopt for yourself.	Most answers yes? Involve IT from the start.	Most answers yes? Treat it as a software project.
☐ Have I already solved this in chat?	☐ Is this for a specific, repeatable workflow or process?	☐ Is there client impact or material risk exposure?
☐ Is this for my own workflow or a small group?	☐ Will multiple people rely on consistent output?	☐ Does it require ongoing ownership and maintenance?
☐ Does it rely on judgment rather than strict accuracy?	☐ Does it require shared data or access controls?	☐ Would failure have consequences beyond the immediate team?
☐ Could the intended users build or configure this themselves?	☐ Does failure need to be handled explicitly?	☐ Does the answer need to be right every time?
☐ Before building: is there an existing tool that already does this out of the box?	☐ Before building: does a vendor already offer this workflow?	☐ Before building: is there enterprise software with LLM capability that could be a better fit?

If answers point in different directions, the project likely needs scoping before it needs building — or buying. The Workflow agent category is where most initiatives stall: it appears accessible, with no-code interfaces and quick early results, but hides genuine complexity around data permissions, failure handling, and what happens when output is wrong at scale.
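
As a rough illustration, the sorting step above can be written as a small tally. This is only a sketch: the function name `triage`, the "more than half" threshold, and the rule that exactly one category must clear it are assumptions made for the example, not a method prescribed here. Each list holds the yes/no answers to the five questions in one column of the checklist.

```python
def triage(answers: dict[str, list[bool]], threshold: float = 0.5) -> str:
    """Return the one category whose checklist is mostly 'yes'.

    Assumption for this sketch: "most answers yes" means a share of yes
    answers strictly above `threshold`. If no category, or more than one,
    clears that bar, the answers point in different directions and the
    idea is flagged for scoping instead.
    """
    matches = []
    for category, ticks in answers.items():
        if ticks and sum(ticks) / len(ticks) > threshold:
            matches.append(category)
    return matches[0] if len(matches) == 1 else "needs scoping"


# Hypothetical answers for a timesheet-reconstruction idea.
timesheet_idea = {
    "personal": [True, True, True, True, False],
    "workflow": [False, True, False, False, False],
    "enterprise": [False, False, False, False, False],
}
print(triage(timesheet_idea))  # personal

# Hypothetical answers that straddle two categories.
intake_idea = {
    "personal": [True, True, True, True, False],
    "workflow": [True, True, True, False, False],
    "enterprise": [False, False, False, False, False],
}
print(triage(intake_idea))  # needs scoping
```

The value of a tally like this is less the answer it returns than the fact that it forces every question to be answered before anyone starts building.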

The Sequence That Works

When agents succeed, the sequence is usually consistent. Someone first develops a reliable way to approach the task using GenAI directly. Only after that approach proves useful does it make sense to encode it into an agent. Problems tend to arise when that sequence is reversed — building an agent before the task is well understood hardens uncertainty rather than resolving it.
____________________________________________________________________________________________________________________________________
If you can’t get what you want from chat, the task likely isn’t ready to be encoded into an agent. An agent locks in a solution; it doesn’t discover one.
____________________________________________________________________________________________________________________________________
Experimentation cycles are shortening. Open and community-driven environments are surfacing effective patterns quickly, and commercial platforms are beginning to formalize those lessons. But agents are not shortcuts to understanding how GenAI should be applied. They are a way to make that understanding repeatable and scalable once it exists.

What Success Looks Like

A number of attorneys have experimented with using agents to reconstruct timesheets — working from notes, emails, and calendar entries to capture billable time at the end of the day. A few find it genuinely valuable. Most do not. The difference is consistent: those who succeed are already frequent, fluent GenAI users. Not because the task requires advanced prompting, but because they understand what the agent can and cannot do. They use it to surface items otherwise forgotten and to structure narratives that would have taken longer to write from scratch. Billable time increases not because the agent is always right, but because it is useful enough to change behavior.
The ones who succeed have already figured out how to approach the task — and use the agent to do it more consistently. Those who give up are often looking for the agent to solve the problem for them. That gap is where most agent efforts quietly fail, at every level of the spectrum.

Four Things That Trip Firms Up

Beyond the visible challenges of data access and failure handling, four dynamics consistently undermine agent adoption without ever being named explicitly.

• Agent adoption rarely fails at rollout. It fails the first time an agent produces a surprising output and no one knows what to do with it. Users who understand how these tools work will check the result, recalibrate, and continue. Users who lack that understanding will conclude the tool does not work and quietly disengage. That trust break cannot be repaired after deployment — it has to be addressed before the agent reaches them, through honest expectation setting and a named person who owns the output.

• Testing agents is also fundamentally different from testing traditional software. Traditional testing assumes repeatability. Agents break that assumption — they operate probabilistically, which means “it worked in testing” no longer means what people expect. Small unstructured pilots can be especially misleading: low volume and high attention mask failure modes that only emerge at scale. Pilots designed with explicit success criteria and scale-testing protocols are more reliable — but they require more rigor than most teams apply.

• Personal success with an agent creates a particular kind of false confidence. An agent that works well for one person, refined through their own iteration, is often assumed to be production-ready. What that person has actually done is enter a software-like development role without realizing it. Deciding not to scale a personal agent can then feel like undoing progress, which makes the decision emotionally harder than it should be.

• Finally, agents do not just require learning a new tool. They change how judgment, review, and responsibility are distributed in daily work. A team that adopts an agent for contract review is reorganizing who reads what, when, and with what level of scrutiny. Training alone does not address that shift. Clear ownership, defined review protocols, and explicit agreement about what the agent is and is not responsible for are what make the difference.
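
The testing point above can be made concrete with a short sketch. Because an agent's output varies from run to run, a single passing run says little; one option is to repeat the same prompt many times and compare the pass rate with a success criterion agreed in advance. Everything named here is hypothetical: `make_flaky_agent` is a stand-in that simulates an unreliable agent with a seeded random generator, and `accept` is whatever check a reviewer would actually apply. A real test would call the firm's own agent instead.

```python
import random
from typing import Callable


def pass_rate(
    agent: Callable[[str], str],
    prompt: str,
    accept: Callable[[str], bool],
    runs: int = 50,
) -> float:
    """Send the same prompt `runs` times and return the share of accepted outputs."""
    passed = sum(1 for _ in range(runs) if accept(agent(prompt)))
    return passed / runs


def make_flaky_agent(failure_rate: float, seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a real agent: fails at roughly `failure_rate`."""
    rng = random.Random(seed)

    def agent(prompt: str) -> str:
        if rng.random() < failure_rate:
            return "UNSURE"
        return f"Summary of: {prompt}"

    return agent


agent = make_flaky_agent(failure_rate=0.1, seed=7)
rate = pass_rate(agent, "engagement letter", lambda out: out.startswith("Summary"), runs=200)

# The success criterion is written down before the pilot, not after it.
REQUIRED_PASS_RATE = 0.95  # assumed threshold for this sketch
print(f"pass rate {rate:.2f}, meets criterion: {rate >= REQUIRED_PASS_RATE}")
```

A handful of supervised runs could easily miss a failure rate like this, which is why small, high-attention pilots tend to look better than the agent will at scale.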

What Governance Actually Looks Like

Governance is most effective when it aligns with impact. Personal agents benefit primarily from guidance, shared examples, and clear guardrails — the goal is not central control, but raising the quality of what individuals do for themselves. Workflow agents require more deliberate oversight: ownership, testing approaches, and failure handling need to be defined before deployment. Enterprise agents should follow the same expectations applied to any other firm-wide system.
The firms that move fastest on agents are usually the ones that prepared most carefully, not the ones that skipped the steps. Across all categories, governance works best when it prioritizes clarity and proportionality over blanket restriction: the goal is to ensure that what scales is trusted, understood, and sustainable.

Category	How to govern it
Personal Agents	Help the owner build or choose. Provide guidance, share what works, and make good examples visible. Ask whether an out-of-the-box tool already solves the problem before encouraging a build. The goal is not to centralize — it is to raise the quality of what individuals do for themselves.
Workflow Agents	Work backwards from your software path — or ask whether a vendor already solved it. For in-house builds: start with what responsible software deployment requires — testing, ownership, data scoping, failure handling — and identify what can be safely streamlined at this scale. For vendor solutions: evaluate them as you would any software procurement, with the added question of where GenAI-specific risks sit.
Enterprise Agents	Follow your existing build/buy path. Whether building on a cloud AI platform or API, or procuring enterprise software with embedded LLM capability, the organizational requirements are the same: business case, named owner, testing protocol, change management, and maintenance plan. The faster build does not compress the governance.

The Workflow agent category typically requires the most new organizational thinking — it sits between familiar territory on both sides and has the least established process. If that process does not yet exist inside a firm, building it is more urgent than building any individual agent. For law firms specifically, confidentiality obligations and privilege considerations add a further layer of governance requirements that apply across all three categories — particularly where agents access client matter data, communicate externally, or generate outputs that could enter the record.
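
One way to keep governance proportional is to write the per-category expectations down as data and check each proposed agent against them. The sketch below is illustrative only. The Workflow and Enterprise items are taken from the table above; the single Personal item is an assumption added for the example, and the names `REQUIRED` and `missing_requirements` are hypothetical.

```python
# Deployment prerequisites per category. Workflow and Enterprise entries follow
# the governance table; the Personal entry is an assumption for this sketch.
REQUIRED = {
    "personal": {"checked for out-of-the-box tool"},
    "workflow": {"testing", "ownership", "data scoping", "failure handling"},
    "enterprise": {
        "business case",
        "named owner",
        "testing protocol",
        "change management",
        "maintenance plan",
    },
}


def missing_requirements(category: str, completed: set[str]) -> list[str]:
    """Return the prerequisites for `category` that are not yet completed."""
    return sorted(REQUIRED[category] - completed)


# A hypothetical workflow agent with testing and ownership settled.
print(missing_requirements("workflow", {"testing", "ownership"}))
# ['data scoping', 'failure handling']
```

Confidentiality and privilege checks for law firms would sit on top of a list like this for every category.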

Agents as an Outcome, Not a Starting Point

The industry is already moving from prompt-centric interaction toward agentic, workflow-embedded systems. This shift is visible across open experimentation and commercial platforms alike, and it will continue regardless of individual organizational readiness.
____________________________________________________________________________________________________________________________________
Agents aren’t the next step after chat. They’re the reward for understanding how GenAI works. Too often, that reward is claimed before the work is done.
____________________________________________________________________________________________________________________________________
Agents amplify whatever understanding already exists. Where that understanding is strong — where the task is well scoped, the approach is validated, and ownership is clear — agents create real leverage. Where it is not, they scale the confusion just as effectively. Sequencing, categorization, and proportional governance are not obstacles to momentum. They are what allow momentum to turn into sustained use rather than disappointment.
Firms that learn first, encode later, and govern proportionally will not deploy the most agents. They will deploy the agents that deliver the most meaningful, sustained impact.