Large Language Models (LLMs) like GPT often struggle to answer mathematical questions. In fact, if you ask a human a tough math question, like what 185 cm is in feet, they'll struggle as well. They'd likely need a calculator to perform this conversion, and so do LLMs.
LLMs are built to handle natural language. While they're generally good at generating words and stringing language together, they often need help when it comes to math.
Unlike a calculator or math library, LLMs don't truly reason or process symbolic logic. So, while they can manage basic arithmetic, especially if it's something familiar from their training data, they typically struggle with more complex problems, particularly word problems.
The main question is: how do we fix this LLM limitation?
No doubt, LLMs have evolved with the launch of reasoning models like OpenAI's o1 or Llama 3.3. But they still hallucinate, lack real-time data access, struggle with complex math, and produce non-deterministic outputs. Fortunately, we can address these problems using AI agents.
AI agents are autonomous software that use LLMs to perform tasks beyond simple text generation.
They make decisions and execute actions. AI agents rely on LLMs for language understanding but add capabilities like memory, real-time interaction, and decision-making.
Agents augment the capabilities of LLMs in the following ways:
To help address some of the math limitations LLMs experience, let's create an AI agent that builds a calculator using MathJS and BaseAI tool calls.
In this tutorial, I'll be using the following tech stack:
To start creating an AI agent, you need to create a directory on your local machine and install all the relevant dev dependencies in it. You can do this by navigating to it and running the following command in the terminal:
This command will create a file in your project directory with default values. It will also install the package to read environment variables from the file, and to handle math operations.
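The exact commands aren't shown above, but assuming the two helper packages described are dotenv (to read environment variables) and mathjs (to handle math operations), a typical setup would look something like this:

```shell
# Create package.json with default values
npm init -y

# Install the helper packages: an env-var loader and a math library
# (dotenv and mathjs are assumed here based on the tutorial's stack)
npm install dotenv mathjs
```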
Next, we'll create an AI agent pipe. Pipes differ from other agents: they are serverless AI agents with agentic tools that can work with any language or framework. They are easy to deploy, and with just one API they let you connect 100+ LLMs to any data to build any developer workflow.
To create your AI agent pipe, navigate to your project directory. Run the following command:
Upon running that command, you'll see the following prompts:
Once you have provided the name, description, and status of the AI agent pipe, everything will be set up automatically for you. Your pipe will be created at .
Create a file in the root directory of your project and add the OpenAI and Langbase API key in it. You can access your Langbase API key from here.
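The exact file and key names depend on your setup, but assuming a standard dotenv configuration, the environment file would typically contain entries like these (the key names below are illustrative):

```
# Hypothetical key names — match whatever your pipe configuration reads
OPENAI_API_KEY=your-openai-api-key
LANGBASE_API_KEY=your-langbase-api-key
```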
In this step, we'll configure the AI agent pipe created according to our needs.
Navigate to your project directory and open the AI agent pipe you created. You can add a system prompt to the pipe if you want. This is what it will look like:
Tool calling lets an LLM use external tools, such as functions, APIs, or other resources, to get information or perform tasks beyond its built-in knowledge.
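To make the idea concrete, here is what a tool definition typically looks like in the widely used OpenAI-style function-calling format. The tool name and fields below are illustrative, not taken from this project. The model never executes the function itself: it only emits a JSON payload naming the tool and its arguments, and the host application runs the actual code.

```typescript
// An illustrative tool definition in the OpenAI-style function-calling
// format. The JSON Schema under `parameters` tells the model what
// arguments it may supply when it decides to call the tool.
const calculatorTool = {
  type: "function",
  function: {
    name: "calculate",
    description: "Evaluate a mathematical expression and return the result.",
    parameters: {
      type: "object",
      properties: {
        expression: {
          type: "string",
          description: "The expression to evaluate, e.g. '120 / 30.48'",
        },
      },
      required: ["expression"],
    },
  },
};

console.log(JSON.stringify(calculatorTool, null, 2));
```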
In this step, we'll create a Calculator Tool using BaseAI tools. This tool will handle all mathematical computations in your project, ensuring they are error-free and trustworthy. The tool is versatile, handling both simple calculations and more advanced ones.
It will also be particularly helpful in reducing hallucinations: by offloading computations to an external tool, it avoids the incorrect or fabricated answers that LLMs might otherwise generate, and it can recheck results or gather additional data to ensure accuracy.
By using BaseAI's smart tool-calling and memory features, we can reduce AI hallucinations by 21% while improving the model's ability to self-correct its outputs.
These enhancements are useful when dealing with complex mathematical expressions or formula evaluations and should really improve the quality and accuracy of the LLM's answers.
To create a calculator tool in your project that will be responsible for doing all the calculations without errors, run this command in your terminal:
You'll be asked to provide a name and description of the tool in your terminal. This is what I'm providing:
Your tool will be created at .
To configure the tool, navigate to your project directory and open the tool you created. You can find it at .
This is what the code will look like:
The key in the object is the function that will be executed when the tool is called. You can write your own logic there to perform the mathematical calculation for a given expression.
Update the calculator tool's description and code by adding parameters to the calculator function. The LLM will supply values for these parameters when it calls the tool. It'll even import math from . This is the final code:
In this step, we'll integrate the tool in the AI agent pipe we created. For that, open the pipe file present at and import the calculator tool at the top of the file. We will also call the calculator tool in the tools array of the pipe.
Now we'll integrate the AI agent pipe you created into the Node.js project to build an interactive command-line interface (CLI) for the calculator tool. This Node.js project will serve as the base for testing and interacting with the AI agent pipe (at the beginning of the tutorial, we set up a Node.js project by initializing npm).
Now, create a file:
In this TypeScript file, import the AI agent pipe you created. We will use the pipe primitive from to run the pipe.
Add the following code to the file:
This code creates an interactive CLI for chatting with an AI agent, using a pipe from the library to process user input. Here's what happens:
In the function:
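The overall shape of such a CLI can be sketched as follows. This is a minimal illustration, not the project's actual code: `runPipe` below is a hypothetical stand-in for calling the BaseAI pipe with the user's message, and the real version would stream the model's reply.

```typescript
import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

// Hypothetical stand-in for running the BaseAI pipe — the real project
// would send the query to the pipe and return the model's response.
async function runPipe(userQuery: string): Promise<string> {
  return `You asked: ${userQuery}`;
}

async function cli(): Promise<void> {
  const rl = readline.createInterface({ input: stdin, output: stdout });
  // Keep prompting until the user types "exit".
  while (true) {
    const query = await rl.question("Enter your query: ");
    if (query.trim().toLowerCase() === "exit") break;
    console.log(await runPipe(query));
  }
  rl.close();
}

// Only start the interactive loop when attached to a terminal, so the
// module can also be imported or tested non-interactively.
if (stdin.isTTY) {
  cli();
}
```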
To run the AI agent pipe locally, you need to start the BaseAI server. Run the following command in your terminal:
Run the file using the following command:
In your terminal, you'll be prompted to "Enter your query." For example, let's ask: "What is 120 cm in feet?" LLMs often hallucinate when converting centimeters to feet. But because of the self-healing tool calling of the BaseAI framework, the agent detects and corrects its own errors.
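For comparison, this is the deterministic arithmetic the tool performs for that query. One foot is exactly 30.48 cm, so unlike an LLM's approximation, the code below returns the same answer every time:

```typescript
// 1 foot = 30.48 cm, by definition.
const CM_PER_FOOT = 30.48;

function cmToFeet(cm: number): number {
  return cm / CM_PER_FOOT;
}

console.log(cmToFeet(120).toFixed(2)); // "3.94"
```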
Large Language Models (LLMs) often struggle with mathematical reasoning due to their focus on language, which leads to frequent calculation errors, especially with complex math problems.
AI agents extend LLM capabilities by integrating tool calls. They handle real-time data, ensure more consistent outputs, and reduce hallucination.
By incorporating MathJS and tool calls via the BaseAI framework, developers can create custom serverless AI agents called pipes that serve as reliable calculators and address LLMs' inherent limitations.