Summary
- Disciplined Research & Development (R&D) is critical for exploring practical AI applications, balancing experimentation and collaboration to assess feasibility and viability
- Continuous R&D helps organizations balance capability, uncertainty, and transparency for deeper understanding and proactive adaptation to AI advancements
- Integrating AI requires significant changes in technology, team structures, and workflows that R&D helps identify
- Generative AI models, particularly AI agents, offer new capabilities for autonomous tasks
Most software is like a vending machine: deterministic and predictable. The same input always gives the same output. You approach the machine, select your item, insert your money, and it dispenses your snack and change. Its deterministic design ensures it works perfectly almost every time, and when it breaks, repairs are quickly diagnosed and addressed.
But agentic AI systems are not vending machines.
Imagine walking into a convenience store to buy a snack. You select your item, approach the cashier, hand them a $10 bill, and await your change. You watch the cashier struggle to complete the transaction. They spend 15 minutes pressing every button and flipping through the manual. Eventually, they give up, and you leave the store confused and… snackless.
On your way out, you remember you built the store, designed the cash register, and trained the cashier. Questions flood your mind: Where did I go wrong? Does the register even work? Did I explain how to use it correctly? Is the manual unclear? Is the cashier capable?
This perplexes you even more because two days ago, the same cashier effortlessly managed a transaction for the same snack.
Determined, you try again later that week. The cashier again struggles to operate the register. However, this time, halfway through the transaction, the cashier builds a new register, completes the transaction, hands over your change, and then recommends several low-fee index funds to invest your leftover cash.
You didn’t even know they could do that. Next time, you’ll just stick to using the vending machine.
Controlled Chaos
Unlike vending machines, harnessing the power of AI means dealing with a healthy dose of non-determinism. Minor variations in user behavior, seemingly innocuous changes to error messages, and slight adjustments to instructions can all result in wildly different outputs, and the root cause can be challenging to identify. We embrace this non-determinism not because we like unpredictability, but because it unlocks capabilities that deterministic systems can't offer.
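To make the point concrete, here is a minimal sketch using the OpenAI Python SDK. It assumes the openai package (v1+) and an OPENAI_API_KEY in the environment; the model name and prompt are illustrative, not recommendations. The same request, sent five times, will often come back as several different responses.

```python
# Minimal sketch: the same prompt, sent repeatedly, can yield different outputs.
# Assumes the `openai` Python package (v1+) and OPENAI_API_KEY in the
# environment; the model name below is illustrative.
from openai import OpenAI

client = OpenAI()
prompt = "Summarize the return policy for a defective snack in one sentence."

outputs = set()
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,      # sampling enabled; variation is expected
    )
    outputs.add(response.choices[0].message.content)

# A vending machine would print 1 here; an LLM often will not.
print(f"{len(outputs)} distinct outputs from 5 identical requests")
```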
The capabilities unlocked from embracing non-determinism bring a new kind of challenge: it’s no longer enough to just write code and expect predictable behavior. We need processes that help us explore, test, and refine system behavior in uncertain conditions, and for these to be effective, they must move beyond open-ended tinkering. In a space where ambiguity is the norm, disciplined structure becomes the differentiator.
Unifying Skeptics, Enthusiasts, and Tinkerers
LLM-driven agents, like cashiers, often exhibit advanced skills (e.g., analogical reasoning, collaboration, tool use), but the difference between real and perceived competence is hard to delineate. This leads stakeholders either to overestimate their capabilities or to dismiss them as risky and overhyped. Even experts who have spent their entire careers in the field struggle to assess the viability of potential use cases when it comes to agentic AI. While we can identify what's impossible, many requests fall into a "let's try and see" category, leading to endless tinkering.
So, how do we bring together the enthusiasts, the skeptics, the tinkerers, and the tech to solve a problem for which there is no established roadmap?
Disciplined R&D.
Explore Boldly, Build Responsibly
R&D becomes essential, not only to build effective AI solutions, but also to create a shared understanding across the organization of what this technology can realistically achieve, what the trade-offs are, and how to prepare for an increasingly agentic future.
Disciplined R&D takes this impulse to “try it out” and creates processes for technical and non-technical stakeholders to collaborate. It begins with structured exploration—surfacing promising ideas and scoping their theoretical potential. From there, iterative experimentation cycles use controlled tests to assess feasibility under real-world organizational constraints and determine:
- What is theoretically possible under ideal conditions
- What is technically feasible in the context of your organization
- What is operationally viable based on its reliability, safety, and cost
When executed well, this process helps an organization build institutional knowledge through demonstration and direct experimentation rather than hype-driven narratives. As cycles of rapid learning and problem redefinition are completed, skeptics and proponents move toward a more grounded understanding of the technology, technical and non-technical stakeholders develop a shared vocabulary, and the organization begins to build working prototypes and the collective knowledge needed to use them effectively.
Over time, iterative R&D cycles help to transform uncertainty into informed judgment. Enthusiasts gain structure, skeptics gain evidence, and leadership gains clarity. Most importantly, the organization builds the capacity to not only assess what agentic AI can do, but to make confident, intentional decisions about what it should do and how to do it responsibly.
Agentic AI’s Potential and Promise
The rapid advancement of commercial-grade generative AI models has sparked intense interest in a wide array of potential applications, including the development of AI agents.
Unlike basic LLMs, agents are assigned specific roles and instructions that define a focused set of goals and expected behaviors. They are designed to control their own problem-solving process, use external tools at their discretion, and operate autonomously, whether independently, as part of an agent team, or in collaboration with human users.
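In code, that description maps to a small set of parts: a role, instructions, tools, and a loop the agent controls. The sketch below is purely structural, with the LLM call stubbed out; the class, fields, and the plan stub are our own illustration, not any particular framework's API.

```python
# Structural sketch of an agent: a role, instructions, tools it may call at
# its discretion, and a loop it controls. The LLM call is stubbed out, and
# all names here are illustrative, not a specific framework's API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    role: str                     # e.g., "cashier"
    instructions: str             # focused goals and expected behaviors
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def plan(self, task: str, history: list[str]) -> tuple[str, str]:
        """In a real system this is an LLM call that chooses the next action.
        Stubbed here: use the first tool once, then finish."""
        if not history:
            return ("tool", next(iter(self.tools)))
        return ("finish", history[-1])

    def run(self, task: str, max_steps: int = 5) -> str:
        history: list[str] = []
        for _ in range(max_steps):          # the agent controls its own loop
            kind, detail = self.plan(task, history)
            if kind == "finish":
                return detail
            history.append(self.tools[detail](task))
        return "gave up"                    # bounded autonomy: hard step limit

cashier = Agent(
    role="cashier",
    instructions="Complete the sale and return correct change.",
    tools={"cash_register": lambda task: "change: $8.50"},
)
print(cashier.run("Ring up a $1.50 snack paid with $10."))
```

Even at this level of abstraction, the design choices show through: the tools the agent may touch, and the step limit that bounds its autonomy, are decisions we make, not properties of the model.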
However, applying "Agentic AI" to real-world use cases can be a bewildering experience for developers and users alike. Discerning the difference between what is practical, plausible, and impossible is not a trivial matter. It requires a tolerance for uncertainty that can only be reduced through continuous experimentation and iterative refinement, supported by disciplined R&D that enables an organization to adapt and learn as quickly as the technology develops.
Without clarity and structure, the promise of utility from these systems can dissolve into unpredictability. That’s where discipline makes the difference.
The Discipline Behind Decisions
Let’s assume we’ve successfully vetted an agentic use case, established technical feasibility, and now we’re ready to operationalize a prototype by piloting the product with a limited user group. Implicitly or explicitly, during development we have made some assumptions about the right balance between the capability of the system, the variability of its outputs, and how transparent its internal logic needs to be for users to trust it. We hope we’ve gotten it right because:
- We know the agents will generate inconsistent outputs, behave unpredictably in edge cases, or struggle with intent alignment. Our early efforts to impose strict constraints to reduce this uncertainty inadvertently limited the system's utility and adaptability, so we gave the agents wide latitude to act.
- We hope we implemented enough auditability to explain how and why an agent behaves the way it does, so that we can debug, audit, and adjust its parameters after it does something we never saw in testing (a minimal sketch of such an audit trail follows below).
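In practice, that auditability can be as mundane as an append-only trace of every step the agent takes. Here is a minimal sketch; the field names and JSONL format are our own illustration, not a prescribed standard.

```python
# Minimal audit-trail sketch: append one JSON line per agent step so that
# surprising behavior can be reconstructed after the fact. Field names are
# illustrative; a real system would add user/session IDs, model versions, etc.
import json
import time
import uuid

def log_step(path: str, run_id: str, step: int, action: str, detail: dict) -> None:
    record = {
        "run_id": run_id,        # groups all steps of one agent run
        "step": step,
        "timestamp": time.time(),
        "action": action,        # e.g., "tool_call", "model_response", "handoff"
        "detail": detail,        # inputs, outputs, chosen tool, parameters
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

run_id = str(uuid.uuid4())
log_step("agent_trace.jsonl", run_id, 1, "tool_call",
         {"tool": "cash_register", "input": "$10 for a $1.50 snack"})
log_step("agent_trace.jsonl", run_id, 2, "model_response",
         {"output": "change: $8.50"})
```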
Now imagine attempting this outside of an R&D process.
In reality, the R&D process is essential for exploring and shaping this trade space intentionally rather than reactively. By systematically prototyping use cases, instrumenting agent behavior, and evaluating performance across a range of scenarios, R&D efforts allow teams to identify viable regions of operation where capability is high enough to deliver value, uncertainty is low enough to manage risk, and transparency is sufficient to support governance and accountability.
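One concrete shape this instrumentation takes is a scenario harness: run the agent many times across a suite of cases, measure how often its behavior lands in the acceptable region, and flag where it does not. A minimal sketch follows, with the agent stubbed out and the scenarios and pass threshold invented for illustration.

```python
# Scenario-evaluation sketch: run an agent across a suite of cases, several
# times each (outputs vary run to run), and flag scenarios whose pass rate
# falls below a threshold. The agent is stubbed; the scenarios and the 0.9
# threshold are illustrative.
import random

def agent(task: str) -> str:
    """Stub for a real agent call; randomness stands in for non-determinism."""
    return "$8.50" if random.random() > 0.05 else "$85.00"

scenarios = [
    {"task": "Change for $10 on a $1.50 snack", "expected": "$8.50"},
    {"task": "Change for $5 on a $1.50 snack", "expected": "$3.50"},  # stub never returns this; gets flagged
]

RUNS_PER_SCENARIO = 20
THRESHOLD = 0.9

for s in scenarios:
    passes = sum(agent(s["task"]) == s["expected"] for _ in range(RUNS_PER_SCENARIO))
    rate = passes / RUNS_PER_SCENARIO
    status = "OK" if rate >= THRESHOLD else "NEEDS WORK"
    print(f"{status}: {rate:.0%} pass rate on {s['task']!r}")
```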
This process surfaces not just technical trade-offs, but design and process decisions that can shift the balance between these three dimensions, and it forces the conversation among users, developers, and organizational stakeholders. Understanding and shaping this trade space is what allows organizations to move from experimental agents to deployable systems that are aligned with operational, ethical, and mission-driven goals.
Adapt Now or Fail Tomorrow
If we accept the proposition that AI agents are here to stay, then we must begin adapting the environment to suit the technology, not the other way around. This marks a shift from thinking about AI as a feature we integrate to treating it as an actor in our systems: a participant in business processes, a user of software, and sometimes, a decision-maker. That shift requires us to change not only how we build technology, but how we structure teams, write requirements, define roles, and evaluate outcomes. In short: this is a sociotechnical change, and R&D is the space where we learn how to navigate it.
R&D in this context extends beyond the technical dimension by helping surface the organizational changes and design practices needed to derive real value from agentic systems. It forces us to ask questions like: What assumptions in our systems no longer hold true when the user is an AI agent? How do we evolve the infrastructure, workflows, organizational documentation, and knowledge management practices of today to support tomorrow’s mixed human-agent teams?
Follow Curiosity, Take Action
Today's R&D activities help us identify pain points that, if removed, would enable agentic systems to operate more effectively. In practical terms, this could mean redesigning APIs to accommodate machine users, updating access control models to reflect agent permissions, or rewriting documentation so it can be parsed and reasoned over by an LLM. It might involve refining business processes to clarify when agents act autonomously versus when they escalate to humans, or rethinking knowledge management so agents can access the same institutional memory as their human counterparts.
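As a small illustration of the first of these, compare an error message written for a person with one an agent can act on. The payload shape below is our own invention for illustration, not a standard.

```python
# Sketch: redesigning an API error for a machine user. A human can work with
# prose; an agent does better with structured, machine-actionable fields.
# The payload shape here is invented for illustration, not a standard.

human_oriented_error = "Oops! Something went wrong. Please try again later."

agent_oriented_error = {
    "error_code": "INVENTORY_LOCK_TIMEOUT",    # stable identifier to branch on
    "message": "Could not reserve item within 5s.",
    "retryable": True,                         # tells the agent whether retry is sane
    "retry_after_seconds": 30,
    "suggested_action": "retry_with_backoff",  # from a documented, finite set
    "escalate_to_human_if": "retries_exhausted",
}

def agent_handle(error: dict) -> str:
    """Toy policy: an agent can branch on fields it could never parse from prose."""
    if error.get("retryable"):
        return f"retry in {error['retry_after_seconds']}s"
    return "escalate to human"

print(agent_handle(agent_oriented_error))
```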
These changes, while technical in execution, have profound implications for how work is organized, how accountability is managed, and how collaboration is defined. R&D gives us a controlled space to explore these transformations proactively rather than waiting for them to be forced upon us.
We're not just fine-tuning individual parts—we're upgrading the entire experience and system. If traditional software is the vending machine, predictable and limited, agentic systems are something entirely new: adaptive and proactive. They don't wait for inputs; they interpret, adapt, and respond.
The challenge isn't to make agents behave like vending machines, but to design the processes, guardrails, and expectations so their creativity doesn't turn into chaos. The goal isn't perfect automation; it's purposeful augmentation through systems that reduce friction, extend human capability, and deliver value in ways we couldn't program in advance.
Bottom line: Agentic AI isn't a vending machine. There's no manual, and no way to predict every outcome. But disciplined R&D prepares us to work with that uncertainty, and that is how we realize the promise of agentic systems.