Agents in the CLI

Executive Summary

An enterprise Salesforce environment is one of the most complex systems a consultant will ever encounter: hundreds of custom objects, thousands of Apex classes, layered automation triggers, and validation rules that have accumulated over years of development. Mapping this environment manually takes a seasoned consultant days of clicking through Setup menus. We gave an AI agent access to the Salesforce CLI and watched it do the same job in under an hour. Then we built a custom skill to make it repeatable, and it got substantially faster. This is the story of what happens when you give an autonomous agent a terminal.

The Problem: Enterprise Archaeology

Every enterprise CRM implementation carries layers of history. Custom objects were created during a 2019 migration. Apex triggers were written by a contractor who left two years ago. Flows were built by an admin who learned by trial and error. Validation rules conflict with each other. Nobody has a complete map.

When a new initiative requires modifying the Opportunity object, the first question is always the same: "What will break?" Answering that question today means a consultant manually navigating through Setup, clicking into each trigger, each flow, each validation rule, then cross-referencing Apex classes in the developer console. It is slow, expensive, and error-prone.

The Manual Process: A consultant opens Object Manager, clicks through fields one by one, opens each trigger in a new tab, searches for class references in the Developer Console, and manually documents what they find. For a complex org, this takes 2-4 days of focused work.
The Cognitive Bottleneck: A human can hold perhaps a dozen relationships in working memory. An enterprise org has thousands. The consultant is not just slow — they are fundamentally limited by human cognition when tracing dependency chains across 234 Apex classes.
The Cost: At enterprise consulting rates, a multi-day org discovery engagement can cost $15,000-$40,000 before a single line of code is changed. And the resulting documentation is immediately stale.

The Core Tension

The Salesforce CLI already exposes every piece of metadata programmatically. The Tooling API can query triggers, flows, validation rules, and Apex classes. The data is all there — behind commands, not clicks. What was missing was an intelligence layer capable of orchestrating dozens of these commands in sequence, interpreting the output, and synthesizing it into architectural understanding.

The Experiment: Agent Meets Terminal

The setup was simple. We authenticated to a Salesforce sandbox via the SF CLI, then handed terminal access to an AI agent running in an agentic coding environment. The instruction: "Explore this org. Map its architecture. Tell me what's in here."

What happened next was remarkable — not because any single command was complex, but because the agent composed them. It didn't just list objects; it listed objects, then queried custom fields per object, ranked them by complexity, and moved on to triggers before we'd finished reading the first output.

The Discovery Sequence

The agent orchestrated its own workflow, executing commands in a logical sequence that a human would plan out in advance but an agent derived in real time:

# Step 1: Inventory (what objects exist?)
$ sf sobject list --sobject custom --target-org sandbox
→ 47 custom objects discovered

# Step 2: Complexity (which objects are heavy?)
$ sf data query --query "SELECT TableEnumOrId, COUNT(Id) fieldCount
    FROM CustomField GROUP BY TableEnumOrId
    ORDER BY COUNT(Id) DESC LIMIT 20" --use-tooling-api
→ Top object: Opportunity with 78 custom fields

# Step 3: Automation (what code runs on these objects?)
$ sf data query --query "SELECT Name FROM ApexClass" --target-org sandbox
$ sf data query --query "SELECT Name, TableEnumOrId, Status
    FROM ApexTrigger" --use-tooling-api
→ 234 Apex classes, 23 triggers catalogued

# Step 4: Flows (what low-code automation exists?)
$ sf data query --query "SELECT MasterLabel, ProcessType, Status
    FROM Flow WHERE Status = 'Active'" --use-tooling-api
→ 67 active flows (record-triggered, screen, subflows)
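
One way to let each step's output feed the next is to capture it as JSON on disk. The sketch below is an illustration under stated assumptions, not the agent's actual implementation: `run_step` and `OUT_DIR` are hypothetical names, and the only CLI feature it relies on is the `--json` output flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original write-up): run one discovery
# command, append --json, and save the result so later steps can read it.
# OUT_DIR is an assumed variable; it defaults to ./discovery.

run_step() {
  local name="$1"; shift
  local out="${OUT_DIR:-./discovery}/${name}.json"

  mkdir -p "$(dirname "$out")" || return 1

  # "$@" is the command to run, e.g. sf data query --query "..."
  if ! "$@" --json > "$out"; then
    echo "step '${name}' failed: $*" >&2
    return 1
  fi

  # Print the path so the caller can hand it to the next step.
  echo "$out"
}

# Example usage (requires an authenticated org with the alias "sandbox"):
#   run_step triggers sf data query \
#     --query "SELECT Name, TableEnumOrId, Status FROM ApexTrigger" \
#     --use-tooling-api --target-org sandbox
```

Keeping one file per step means a later step, or a human reviewer, can inspect exactly what the earlier query returned without re-running it against the org.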

Why This Is Novel

Each command above is trivial. Any Salesforce developer could run them. The breakthrough is that the agent chains them autonomously, interprets the output of each to decide what to run next, and synthesizes the results into a coherent architectural narrative — all without being told the specific sequence in advance. It reasons about what it doesn't yet know, then fills the gaps. A human does this over days of tabbed browsing. The agent does it in a continuous stream of execution that takes minutes.

The Deep Dive: Tracing an Object

After the broad inventory, the agent went deep. Given a single object — say, Opportunity — it ran a second orchestration layer:

Field Inventory: Queried every custom field on the object via the Tooling API — names, data types, descriptions — producing a complete field map.
Validation Rules: Retrieved every active validation rule, its error message, and which fields it references — surfacing conflicts and redundancies a human would miss.
Trigger Inventory: Identified all Apex triggers firing on the object, then parsed each trigger body to trace which handler classes they invoke — mapping the full before-insert, after-update execution chain.
Flow Automation: Found every record-triggered flow attached to the object, categorized by trigger type (before save, after save), and mapped their interaction with the Apex trigger layer.
Code Cross-Reference: Retrieved the entire Apex codebase locally, then searched every class file for references to the object — SOQL queries, DML operations, Schema references — producing a complete dependency graph.
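
The trigger-to-handler tracing step can be approximated with plain text matching on a retrieved trigger file. The sketch below is a heuristic we wrote for illustration, not the agent's parser: `trigger_handlers` is a hypothetical name, the exclusion list of built-in types is an assumption, and patterns such as dynamic instantiation will be missed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the class names a trigger file appears to call,
# based on two patterns: "new ClassName" and "ClassName.method(".
# This is a text heuristic, not an Apex parser.

trigger_handlers() {
  # Prefix every line with a space so each identifier has a preceding
  # character; the pattern then rejects matches that start mid-word.
  sed 's/^/ /' "$1" \
    | grep -oE '[^A-Za-z0-9_.](new +[A-Z][A-Za-z0-9_]*|[A-Z][A-Za-z0-9_]*\.[A-Za-z_][A-Za-z0-9_]*\()' \
    | sed -E 's/^.//; s/^new +//; s/\..*$//' \
    | grep -vxE 'Trigger|System|Database|Schema|Test|List|Set|Map|String|Integer|Date|Datetime|Math|JSON' \
    | sort -u
}

# Example usage, assuming triggers were retrieved into ./temp-triggers:
#   trigger_handlers ./temp-triggers/triggers/OpportunityTrigger.trigger
```

Each name it prints is a candidate handler class to open next, which is the same "follow the delegation" move described above.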

The output was a structured architectural profile: 78 custom fields, 23 validation rules, 2 triggers routing to handler classes, 8 record-triggered flows, and 14 Apex classes containing direct references. The profile also laid out the Salesforce automation execution order (before-save flows, then before triggers, then validation rules, then after triggers, then after-save flows), mapped to the specific code running at each stage.

Speed Comparison

A senior Salesforce consultant performing this same object deep-dive manually — clicking through Object Manager, opening each trigger, searching the Developer Console, cross-referencing flows — would spend 3-4 hours on a single complex object. The agent completed it in under 8 minutes. That is not an incremental improvement. It is a category shift.

Dependency Mapping: The Blast Radius Problem

The most valuable — and most dangerous — work in enterprise CRM development is understanding what will break before you change something. This is the "blast radius" problem. Modify a handler class and you need to know: What triggers call this class? What other classes does it call? What objects does it query? What flows invoke it? What test classes cover it?

The agent attacked this by working in both directions — upstream and downstream:

Upstream Analysis

Retrieved all Apex source code and searched for instantiation or static method calls to the target class. Queried all triggers for references. Checked all active flows for Apex action invocations. Produced a list of every entry point that could invoke the class.
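
The flow check in particular is easy to skip by hand. As a minimal sketch, assuming flows have been retrieved in source format (for example with `sf project retrieve start --metadata Flow --output-dir ./temp-flows`), the function below lists flows whose XML contains an Apex action naming the target class. `flows_invoking_class` is a hypothetical name, and the match is textual rather than a full XML parse.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list flows that call a given Apex class as an action.
# Looks for <actionName>ClassName</actionName> in files that also contain
# <actionType>apex</actionType>. Textual match only; it does not verify that
# both elements sit inside the same <actionCalls> block.

flows_invoking_class() {
  local dir="$1" cls="$2"
  grep -rlF --include='*.flow-meta.xml' "<actionName>${cls}</actionName>" "$dir" \
    | while IFS= read -r f; do
        grep -qF '<actionType>apex</actionType>' "$f" \
          && basename "$f" .flow-meta.xml
      done \
    | sort
}

# Example usage:
#   flows_invoking_class ./temp-flows LeadConversionHandler
```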

Downstream Analysis

Parsed the class source code to extract every class it instantiates, every object it queries via SOQL, every DML operation, every HTTP callout, and every field reference. Built the complete tree of what this single class touches.
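
Two of these downstream extractions, the objects queried via SOQL and the DML operations performed, can be sketched with standard text tools. The functions below are illustrative assumptions (`soql_objects` and `dml_operations` are hypothetical names): they match on surface patterns, so they will miss dynamic SOQL and `Database.insert(...)` style calls, and may pick up the word "from" in comments.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for downstream analysis of one Apex class file.
# Both are text heuristics, not an Apex parser.

# Objects named after FROM in inline SOQL queries.
soql_objects() {
  sed 's/^/ /' "$1" \
    | grep -oiE '[^A-Za-z0-9_]FROM +[A-Za-z_][A-Za-z0-9_]*' \
    | awk '{print $2}' \
    | sort -u
}

# DML statement keywords used in the form "<keyword> <variable>;".
dml_operations() {
  sed 's/^/ /' "$1" \
    | grep -oiE '[^A-Za-z0-9_.](insert|update|upsert|delete|undelete|merge) +[A-Za-z_][A-Za-z0-9_.]* *;' \
    | sed -E 's/^.//' \
    | awk '{print tolower($1)}' \
    | sort -u
}

# Example usage, assuming classes were retrieved into ./temp-apex:
#   soql_objects   ./temp-apex/classes/LeadConversionHandler.cls
#   dml_operations ./temp-apex/classes/LeadConversionHandler.cls
```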

The result was a full dependency graph: 4 upstream callers, 6 downstream classes, 5 objects accessed, 23 fields referenced, and a complete change impact assessment. The kind of analysis that prevents the "we changed one class and broke the entire opportunity pipeline" scenario that haunts enterprise teams.

# The agent's dependency trace: automated, not manual
$ sf project retrieve start --metadata ApexClass --output-dir ./temp-apex
$ grep -r "LeadConversionHandler" ./temp-apex --include="*.cls" -l
→ Found in: LeadTrigger.cls, LeadBatchProcessor.cls,
  LeadConversionService.cls, LeadConversionTest.cls

# Then parsed each for SOQL, DML, and class references
→ Downstream: ContactService, AccountService, TaskCreator
→ Objects: Lead, Contact, Account, Task, Opportunity
→ Test coverage: 87% (12 test methods)

What a Human Cannot Do

A human can trace one dependency chain at a time. The agent traces all of them simultaneously. It doesn't forget that a class it checked 10 minutes ago also references the target. It doesn't skip a trigger because it opened too many tabs. It holds the entire graph in context and delivers a complete picture — every time.

The Compounding Move: Building a Custom Skill

The first round of exploration was impressive but ad-hoc. The agent figured out the right commands on the fly, which meant it occasionally ran queries in a suboptimal order or missed a cross-reference pattern. The real breakthrough came when we encoded what worked into a custom skill — a structured set of instructions, command references, and output templates that the agent loads before every Salesforce engagement.

This is the difference between a talented analyst doing something for the first time and a seasoned consultant following a refined playbook. The skill didn't just make the agent faster; it made its output more complete and more consistent.

Anatomy of the Skill

The custom skill we built is a structured knowledge package containing five layers, each designed to eliminate a specific failure mode:

1. Orchestrated Workflows

Three predefined execution sequences — Full Org Discovery, Object Deep Dive, and Dependency Mapping — each specifying the exact commands, their order, and what to synthesize from the output. The agent no longer invents the sequence; it follows a battle-tested playbook that produces consistent, comprehensive results.

2. Complete Command Reference

A comprehensive CLI command catalog organized by metadata type — objects, Apex classes, triggers, flows, validation rules, custom fields, permission sets. Every command includes the exact syntax, output format options (--json, --result-format csv), and the Tooling API queries that expose metadata the standard CLI does not.

3. Metadata Relationship Map

A document encoding how Salesforce metadata types interconnect — Objects to Fields, Objects to Triggers, Triggers to Apex Classes, Classes to SOQL queries — along with the exact cross-reference queries to trace each relationship. This is the institutional knowledge that takes consultants years to internalize, packaged as context the agent loads in seconds.

4. Cross-Reference Patterns

Ten advanced analysis patterns, including Complete Object Automation Stack, Apex Class Dependency Graph, Field Usage Analysis, Test Class Coverage Mapping, Permission Analysis, Flow-Apex Integration Points, Scheduled Job Inventory, External Integration Points, and full Change Impact Analysis. Each pattern is a recipe the agent can execute on demand.

5. Example Outputs

Complete sample outputs for each workflow showing the expected format, level of detail, and synthesis style. These function as "few-shot" examples — the agent doesn't just know what commands to run, it knows what the final deliverable should look like. The output matches what a client would expect from a senior consultant's engagement report.
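
To make the five layers concrete, here is one way such a package could be laid out on disk. This is a sketch under our own assumptions: the write-up does not specify the skill's file format, so the directory names, file names, and the `scaffold_skill` function are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical scaffold for a skill package with the five layers described
# above. All paths are illustrative; the real skill's layout is not given.

scaffold_skill() {
  local root="$1"
  [ -n "$root" ] || { echo "usage: scaffold_skill <dir>" >&2; return 1; }

  mkdir -p "$root/workflows" "$root/examples" || return 1

  local f
  for f in \
    workflows/full-org-discovery.md \
    workflows/object-deep-dive.md \
    workflows/dependency-mapping.md \
    command-reference.md \
    relationship-map.md \
    cross-reference-patterns.md \
    examples/full-org-discovery-output.md \
    examples/object-deep-dive-output.md \
    examples/dependency-mapping-output.md
  do
    # Create the file only if it is missing, so reruns keep existing content.
    [ -e "$root/$f" ] || : > "$root/$f"
  done
}

# Example usage:
#   scaffold_skill ./salesforce-metadata-analyzer
```

Splitting workflows, reference material, and example outputs into separate files lets each layer be refined independently as the playbook matures.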

The Exponential Effect

Without the skill, the agent would spend tokens reasoning about what to do next. With the skill, it spends those tokens executing and analyzing. The improvement is not linear — it compounds:

Zero Planning Overhead: The agent no longer needs 3-4 turns figuring out which commands exist. It immediately executes the right workflow for the user's request.
No Missed Cross-References: Without the relationship map, the agent might forget to check if flows also reference a class it's tracing. With it, every relationship path is followed every time.
Consistent Output Quality: Example outputs serve as a formatting contract. Every analysis produces the same structured, actionable deliverable — categorized, quantified, and prioritized.
Transferable Knowledge: The skill encodes consultant-grade expertise into a reusable artifact. Any team member — regardless of Salesforce experience — can trigger a full org analysis just by asking.

The Skill as an Efficiency Accelerator

In our Agentic Engineering Handbook, we introduced the concept of "Efficiency Accelerators": living artifacts that encode institutional knowledge for agent consumption. The Salesforce Metadata Analyzer skill is a textbook implementation. It transforms years of platform expertise into structured context that an agent loads in seconds, then applies the same way on every run.

Why This Is Wildly Faster Than a Human

It is tempting to frame this as "automation." It isn't. Automation is a script that runs the same commands every time. What the agent does is reason about what to do next based on what it just learned. When it discovers an object has 78 custom fields, it decides to examine validation rules more carefully. When it finds a trigger that delegates to a handler class, it traces that handler class without being told to. This is not a script — it is an analyst with infinite patience, perfect memory, and the ability to read 234 Apex classes in seconds.

30+ CLI Commands: orchestrated per org analysis
<1hr Full Org Map: vs. 2-4 days manually
100% Coverage: no missed dependencies

The Three Speed Multipliers

1. No Context Switching

A human jumps between Object Manager, Developer Console, Flow Builder, and a text editor to document findings. Each switch costs cognitive overhead — remembering where you were, what you were looking for, what you already found. The agent operates in a single terminal session. Every piece of data it retrieves stays in context. There are no tabs to manage, no browsers to refresh, no mental models to reload.

2. Parallel Comprehension

When the agent retrieves 234 Apex classes and searches them all for references to the Opportunity object, it processes every file. A human performing the same search would use the Developer Console's limited search, miss files that use dynamic references, and stop after finding "enough" results. The agent's search is exhaustive by default. It doesn't satisfice — it completes.

3. Instant Synthesis

After gathering data from dozens of queries, the agent produces a structured analysis immediately. A human would spend hours organizing their notes, building a spreadsheet, and writing a summary. The agent's synthesis is simultaneous with its discovery — it categorizes Apex classes into handlers, services, utilities, batch jobs, and tests as it encounters them, not as a separate documentation pass.

The Broader Pattern: CLI as the Universal Interface

Salesforce is the case study, but the pattern is universal. Every enterprise platform that exposes a CLI becomes a surface for agentic automation. AWS, GCP, Azure, Kubernetes, Terraform, GitHub — all of them have CLIs that agents can operate.

The key insight is that the CLI was never designed for humans to use at scale. It was designed for humans to use one command at a time. Agents remove that constraint. They can execute 30 commands in sequence, hold all the output in context, cross-reference the results, and synthesize a deliverable — all in the time it takes a human to run 3 commands and start a spreadsheet.

Infrastructure Audits: An agent with AWS CLI access can inventory every resource, trace IAM policies, map VPC configurations, and produce a security posture report — the same work a cloud architect does manually over a week.
CI/CD Analysis: An agent with GitHub CLI access can analyze workflow runs, identify flaky tests, trace deployment failures across repositories, and recommend pipeline optimizations.
Database Archaeology: An agent with database CLI access can map schema relationships, identify orphaned tables, trace query patterns, and produce data architecture documentation that no one ever wrote.

The Pattern

Authenticate → Inventory → Cross-Reference → Synthesize. It is the same loop regardless of the platform. The CLI provides the data. The agent provides the intelligence. The custom skill provides the expertise. Together, they perform in minutes what used to require days of specialized consulting.

Conclusion

The terminal is the most powerful interface in enterprise software. Agents are the first intelligence capable of using it at scale.

The CLI already exposes everything. Agents unlock it.

Autonomous orchestration replaces manual discovery.

Custom skills encode expertise and compound performance.

Minutes, not days. Complete, not partial. Every time.

The question is no longer whether AI agents can do enterprise platform analysis. They can. The question is how long your organization will pay for multi-day manual discovery when the same work can be done before lunch.