Perplexity Free Best free AI chatbot for coding and research
I've been around technology for long enough that very little excites me, and even less surprises me. But shortly after OpenAI's ChatGPT was released, I asked it to write a WordPress plugin for my wife's e-commerce site. When it did, and the plugin worked, I was indeed surprised.
That was the beginning of my deep exploration into chatbots and AI-assisted programming. Since then, I've subjected 10 large language models (LLMs) to four real-world tests.
Unfortunately, not all chatbots code equally well. It's been 18 months since that first test, and even now, five of the 10 LLMs I tested can't create working plugins.
In this article, I'll show you how each LLM performed against my tests. There are two chatbots I recommend you use, but they cost $20/month. The free versions of the same chatbots do well enough that you could probably get by without paying. But the rest, whether free or paid, are not so great. I won't risk my programming projects on them or recommend that you do until their performance improves.
I've written a lot about using AIs to help with programming. Unless it's a small, simple project, like my wife's plugin, AIs can't write entire apps or programs. But they excel at writing a few lines and are not bad at fixing code.
Rather than repeat everything I've written, go ahead and read this article: How to use ChatGPT to write code: What it can and can't do for you.
If you want to understand my coding tests, why I've chosen them, and why they're relevant to this review of the 10 LLMs, read this article: How I test an AI chatbot's coding ability - and you can too.
Let's start with a comparative look at how the chatbots performed:
Next, let's look at each chatbot individually. I'll discuss nine chatbots, even though the above chart shows 10 LLMs, because the results for GPT-4 and GPT-4o are both covered under ChatGPT Plus. Ready? Let's go.
I tested nine chatbots, and four passed most of my tests. The other chatbots, including a few pitched as great for programming, each passed only one of my tests -- and Microsoft's Copilot didn't pass any.
I'm mentioning them here because people will ask, and I did test them thoroughly. Some of them do just fine for other work, so I'll point you to their more general reviews if you're just curious about how they function.
Meta AI is Facebook's general-purpose AI. As you can see above, it failed three of my four tests.
Also: How to get started with Meta AI in Facebook, Instagram, and more
The AI did generate a nice user interface but with zero functionality. And it did find my annoying bug, which is a fairly serious challenge. Given the specific knowledge required to find the bug, I was surprised it choked on a simple regular expression challenge. But it did.
Meta Code Llama is Facebook's AI designed specifically for coding help. It's something you can download and install on your server. I tested it running on a Hugging Face AI instance.
Also: Can Meta AI code? I tested it against Llama, Gemini, and ChatGPT - it wasn't even close
Weirdly, even though both Meta AI and Meta Code Llama choked on three of four of my tests, they choked on different problems. AIs can't be counted on to give the same answer twice, but this result was a surprise. We'll see if that changes over time.
Anthropic claims the 3.5 Sonnet version of its Claude AI chatbot is ideal for programming. After watching it fail all but one of my tests, I'm not so sure.
If you're not using it for programming, Claude may be a better choice than the free version of ChatGPT.
Also: 4 things Claude AI can do that ChatGPT can't
My ZDNET colleague Maria Diaz reports that Claude can handle uploaded files, process more words than the free version of ChatGPT, provide information roughly a year more current than GPT-3.5, and access websites.
Gemini Advanced is Google's $20/month pro version of its Gemini (formerly Bard) chatbot. I expected the tool to do better than one out of four. Interestingly, it passed the one test that every AI other than GPT-4/4o failed -- knowledge of a fairly obscure programming language produced by one programmer in Australia.
Also: 3 ways Gemini Advanced beats other AI assistants, according to Google
So, if it knew that language, why couldn't it handle basic regular expressions or other first-year programming student problems?
You'd think the company with the "Developers! Developers! Developers!" mantra in its DNA would have an AI that does better on the programming tests. Microsoft produces some of the best coding tools on the planet. And yet, Copilot failed every one of my tests.
Also: What are Microsoft's different Copilots? Here are the differences and how you can use them
The one positive thing is that Microsoft always learns from its mistakes. So, I'll check back later and see if this result improves.
The results of my tests were fairly surprising, especially given the big investments of Microsoft and Google. But this area of innovation is improving at warp speed, so we'll be back with updated tests and results over time. Stay tuned.
Have you used any of these AI chatbots for programming? What has your experience been? Let us know in the comments below.