Conversational AI: From RAG Prototypes to Domain-Specific Supervision

Building a Tool Builder: Making AI Tools Configurable Without Code

The first chat API had two tools: look up an order and estimate delivery. They were hardcoded in the service layer - function definitions passed to OpenAI, with handlers wired directly to Shopify’s GraphQL API.
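In rough terms, that early version looked like the sketch below - the tool names are real, but the Shopify helper and queries are placeholders rather than the actual code.

```python
# Sketch of the hardcoded setup: definitions and handlers live in the same module.
ORDER_QUERY = "query($orderNumber: String!) { order(...) }"      # placeholder GraphQL
DELIVERY_QUERY = "query($orderNumber: String!) { delivery(...) }" # placeholder GraphQL

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Look up a Shopify order by order number.",
            "parameters": {
                "type": "object",
                "properties": {"order_number": {"type": "string"}},
                "required": ["order_number"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "estimate_delivery",
            "description": "Estimate delivery for an order by order number.",
            "parameters": {
                "type": "object",
                "properties": {"order_number": {"type": "string"}},
                "required": ["order_number"],
            },
        },
    },
]

def shopify_graphql(query: str, variables: dict) -> dict:
    """Placeholder for the real client: an HTTP POST to the store's GraphQL endpoint."""
    raise NotImplementedError

def handle_tool_call(name: str, args: dict) -> dict:
    # Adding or changing a tool means editing this function and redeploying.
    if name == "lookup_order":
        return shopify_graphql(ORDER_QUERY, {"orderNumber": args["order_number"]})
    if name == "estimate_delivery":
        return shopify_graphql(DELIVERY_QUERY, {"orderNumber": args["order_number"]})
    raise ValueError(f"Unknown tool: {name}")
```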

When I built a corporate AI Slack bot trained on company data, it needed different tools. Document search, internal API lookups, data queries. Hardcoding tools per deployment wasn’t going to scale.

The problem

Every new tool required a code change, a deployment, and testing. The tool definitions (name, description, parameters) were mixed in with the execution logic. Adding a tool meant touching the AI service, the handler layer, and the deployment pipeline.

For an internal tool where non-developers might want to add or modify capabilities, this was a bottleneck.

Two types of tools

The solution split tools into two categories:

Built-in tools are code-level integrations - document search, Shopify order lookup, email sending. They ship with the application and can be enabled or disabled from the admin panel. The code exists, but whether it’s available to the AI is a configuration choice.
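A minimal sketch of the built-in side - a registry of handlers that ship with the app, with availability decided by admin settings. The field names, example tool, and settings shape are illustrative:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class BuiltinTool:
    name: str
    description: str
    parameters: dict                    # JSON Schema for the tool's arguments
    handler: Callable[[dict], dict]     # code-level integration that ships with the app

BUILTIN_TOOLS = {
    "document_search": BuiltinTool(
        name="document_search",
        description="Search indexed company documents.",
        parameters={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
        handler=lambda args: {"results": []},   # stub for the real search integration
    ),
}

def enabled_builtin_tools(settings: dict) -> list[BuiltinTool]:
    # settings comes from the admin panel, e.g. {"document_search": True, "order_lookup": False}
    return [tool for name, tool in BUILTIN_TOOLS.items() if settings.get(name, False)]
```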

Custom tools are defined entirely through the admin dashboard. An admin specifies the tool name, description, parameters (with types and validation), and execution rules. The AI sees them as function calls, and the system routes execution based on the tool definition.
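Roughly, a custom tool is a row of configuration rather than code. A sketch of the shape, with illustrative field names - the point is that once it exists, it is exposed to the model exactly like a built-in function:

```python
from dataclasses import dataclass

@dataclass
class CustomTool:
    name: str            # what the model calls
    description: str     # tells the model when to use it
    parameters: dict     # JSON Schema assembled from the admin's parameter rows
    execution: dict      # execution rules, e.g. {"type": "http", "method": "GET", "url": "..."}
    enabled: bool = True

def to_function_schema(tool: CustomTool) -> dict:
    """Custom and built-in tools look identical to the model: a function definition."""
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.parameters,
        },
    }
```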

What the admin sees

The dashboard has a tool management section. Each tool shows:

  • Name and description (what the AI sees)
  • Parameter schema (what inputs the tool accepts)
  • Execution configuration (how the tool runs)
  • Usage statistics (how often it’s called, success rate, average duration)
  • Enable/disable toggle

Creating a new tool is a form. Define the name, write a description that tells the AI when to use it, specify the parameters, and configure execution. No code, no deployment.
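Behind the form, execution is routed off the stored definition. A sketch under the assumption that custom tools resolve to templated HTTP calls - the URL and execution fields are illustrative, not the actual configuration format:

```python
import requests

def execute_tool(name: str, args: dict, builtin_tools: dict, custom_tools: dict) -> dict:
    """Dispatch a tool call to a code-level handler or to the admin-configured execution rule."""
    if name in builtin_tools:
        return builtin_tools[name].handler(args)

    tool = custom_tools[name]
    if tool.execution.get("type") == "http":
        method = tool.execution.get("method", "GET")
        response = requests.request(
            method=method,
            url=tool.execution["url"].format(**args),   # e.g. "https://internal.example.com/assets/{asset_id}"
            json=args if method in ("POST", "PUT") else None,
            timeout=10,
        )
        response.raise_for_status()
        return response.json()

    raise ValueError(f"Unsupported execution type for tool '{name}'")
```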

Execution tracking

Every tool call is logged with inputs, outputs, errors, and duration. This turned out to be more useful than I expected.
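The tracking itself is just a wrapper around execution. A minimal sketch - the real records presumably persist somewhere the dashboard can aggregate into usage stats, but a log line captures the same fields:

```python
import logging
import time

logger = logging.getLogger("tool_execution")

def execute_with_tracking(name: str, args: dict, execute) -> dict:
    """Record inputs, outputs, errors, and duration for every tool call."""
    started = time.monotonic()
    try:
        result = execute(name, args)
        duration_ms = (time.monotonic() - started) * 1000
        logger.info("tool=%s status=ok duration_ms=%.0f args=%s", name, duration_ms, args)
        return result
    except Exception as exc:
        duration_ms = (time.monotonic() - started) * 1000
        logger.error("tool=%s status=error duration_ms=%.0f args=%s error=%s", name, duration_ms, args, exc)
        raise
```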

The logs revealed that the AI was calling certain tools unnecessarily - asking for document search on questions it could answer from context. They also showed which tools had high error rates and which were slow enough to affect response time.

Tool execution tracking made the system debuggable. Without it, you’re guessing why a response was slow or wrong. With it, you can trace the exact sequence of tool calls, see what each returned, and understand the AI’s decision path.

Pattern matching before tool calling

An optimisation: before sending the message to the LLM with tool definitions, pattern-match against tool trigger conditions. If a message mentions an order number and the order lookup tool is enabled, skip the LLM’s tool-calling decision and go straight to execution.

This saves a round trip for obvious cases. The LLM’s tool-calling is still available for ambiguous situations where the intent isn’t clear from simple pattern matching.
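A sketch of the pre-check, assuming a simple regex per tool - the pattern and argument name here are illustrative:

```python
import re

# Trigger patterns per tool; checked before the message is sent to the LLM with tool definitions.
TRIGGERS = {
    "order_lookup": re.compile(r"\border\s*#?\s*(\d{4,})\b", re.IGNORECASE),
}

def match_tool(message: str, enabled_tools: set[str]) -> tuple[str, dict] | None:
    """Return (tool_name, args) when intent is obvious from the text, else None to let the LLM decide."""
    for name, pattern in TRIGGERS.items():
        if name not in enabled_tools:
            continue
        match = pattern.search(message)
        if match:
            return name, {"order_number": match.group(1)}
    return None
```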

What I’d change

The custom tool system works for tools with simple execution patterns - API calls, data lookups, formatted responses. For tools that need complex logic (multi-step workflows, conditional execution, error recovery), you still need code.

A middle ground would be a visual workflow builder for tool execution, but that’s a product in itself.