Programmers have spent decades writing code for AI models, and now, in a full-circle moment, AI is being used to write code. But how does an AI code generator compare to a human programmer?
How Good Is ChatGPT at Coding?
A recent study published in the June issue of IEEE Transactions on Software Engineering evaluated the code produced by OpenAI’s ChatGPT in terms of functionality, complexity, and security. The results reveal a wide range of success rates and raise important questions about the future of AI in software development.
The Study: ChatGPT’s Coding Capabilities
The study, conducted by a team including Yutian Tang, a lecturer at the University of Glasgow, evaluated ChatGPT on 728 coding problems from the LeetCode testing platform. These problems were tackled in five programming languages: C, C++, Java, JavaScript, and Python. The findings show that ChatGPT’s success at producing functional code varies widely with the difficulty of the task, the programming language, and other factors.
Performance Before and After 2021
Interestingly, the study found that ChatGPT performed better on coding problems that existed on LeetCode before 2021. For example, it was able to produce functional code for easy, medium, and hard problems with success rates of about 89%, 71%, and 40%, respectively. However, when tackling problems introduced after 2021, the success rates dropped dramatically. For easy problems, the success rate fell from 89% to 52%, and for hard problems, it plummeted from 40% to a mere 0.66%.
Tang explains, “A reasonable hypothesis for why ChatGPT can do better with algorithm problems before 2021 is that these problems are frequently seen in the training dataset.” This suggests that ChatGPT’s training data heavily influences its performance, and it struggles with newer, unfamiliar problems.
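To put the figures above side by side, here is a minimal Python sketch that computes the drop in success rate in percentage points and in relative terms. It uses only the numbers quoted in this article; the names `RATES` and `relative_drop` are made up for illustration, and the medium-difficulty figure for newer problems is omitted because the article does not give it.

```python
# Success rates in percent, as quoted in the article:
# (problems from before 2021, problems introduced after 2021).
# The post-2021 rate for medium problems is not quoted, so it is left out.
RATES = {
    "easy": (89.0, 52.0),
    "hard": (40.0, 0.66),
}


def relative_drop(before: float, after: float) -> float:
    """Return the decline from `before` to `after` as a fraction of `before`."""
    if before <= 0:
        raise ValueError("the starting rate must be positive")
    return (before - after) / before


for level, (before, after) in RATES.items():
    points = before - after
    share = relative_drop(before, after)
    print(f"{level}: down {points:.2f} percentage points ({share:.1%} relative)")
```

Seen this way, the easy problems lose well under half of their earlier success rate, while the hard problems lose nearly all of it.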
Strengths and Limitations
The study highlighted several strengths of ChatGPT in coding. For example, ChatGPT was able to generate code with smaller runtime and memory overheads than at least 50% of human solutions to the same LeetCode problems. This efficiency is a significant advantage in optimizing software performance.
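One way to read "smaller than at least 50% of human solutions" is as a simple ranking: count how many human submissions are slower than the generated one. The sketch below assumes that reading. The function name `share_beaten` and the runtimes are invented for illustration and are not data from the study or from LeetCode.

```python
from typing import Sequence


def share_beaten(candidate: float, others: Sequence[float]) -> float:
    """Return the fraction of `others` that are strictly larger than `candidate`.

    For runtimes or memory use, larger means worse, so this is the share of
    submissions that the candidate solution beats.
    """
    if not others:
        raise ValueError("need at least one value to compare against")
    return sum(1 for value in others if value > candidate) / len(others)


# Invented runtimes in milliseconds, for illustration only.
human_runtimes_ms = [12.0, 15.0, 18.0, 22.0, 30.0, 41.0]
generated_runtime_ms = 17.0

print(f"beats {share_beaten(generated_runtime_ms, human_runtimes_ms):.0%} of submissions")
```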
However, the analysis also revealed some security concerns with AI-generated code. ChatGPT-generated code had a fair number of vulnerabilities, such as missing null tests (checks that a value is not null before it is used), though many of these were easily fixable. Additionally, while ChatGPT was good at fixing compilation errors, it generally struggled to correct its own mistakes, even when given feedback. Tang notes, “ChatGPT may generate incorrect code because it does not understand the meaning of algorithm problems, thus, this simple error feedback information is not enough.”
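To show what a missing null test looks like in practice, here is a small Python sketch in the style of a LeetCode linked-list problem. It is an illustration written for this article, not code from the study; `ListNode`, `middle_value_unchecked`, and `middle_value` are hypothetical names. In Python the null value is `None`, and the fix is a single guard at the top of the function.

```python
from typing import Optional


class ListNode:
    """A singly linked list node, in the style of LeetCode problems."""

    def __init__(self, val: int, next: "Optional[ListNode]" = None):
        self.val = val
        self.next = next


def middle_value_unchecked(head):
    """Return the middle value, but crash on an empty list (head is None)."""
    slow = fast = head
    # With head set to None, `fast.next` raises AttributeError.
    while fast.next is not None and fast.next.next is not None:
        slow = slow.next
        fast = fast.next.next
    return slow.val


def middle_value(head: Optional[ListNode]) -> Optional[int]:
    """Return the middle value, or None for an empty list."""
    if head is None:  # the null test that the unchecked version lacks
        return None
    slow = fast = head
    # Move `fast` two steps for every one step of `slow`.
    while fast.next is not None and fast.next.next is not None:
        slow = slow.next
        fast = fast.next.next
    return slow.val
```

Both versions behave the same on non-empty lists; they differ only on the empty input, which is the kind of edge case that is easy to fix once it has been pointed out.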
Enhancing AI Code Generation
To mitigate these issues, Tang suggests that developers using ChatGPT provide additional information to help the AI better understand problems and avoid vulnerabilities. “For example, when encountering more complex programming problems, developers can provide relevant knowledge as much as possible, and tell ChatGPT in the prompt which potential vulnerabilities to be aware of,” he says. This approach can enhance the quality and security of AI-generated code.
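Tang's suggestion can be sketched as a small prompt builder that attaches background knowledge and a list of vulnerabilities to the problem statement. This is one possible layout, not a format prescribed by the study; the function `build_prompt`, its parameters, and the section labels are assumptions made for illustration. The sketch only assembles text and does not call any model.

```python
from typing import Sequence


def build_prompt(
    problem: str,
    background: Sequence[str] = (),
    vulnerabilities: Sequence[str] = (),
) -> str:
    """Assemble a prompt from a problem statement plus optional extra context."""
    sections = [f"Problem:\n{problem.strip()}"]
    if background:
        notes = "\n".join(f"- {item}" for item in background)
        sections.append(f"Relevant knowledge:\n{notes}")
    if vulnerabilities:
        risks = "\n".join(f"- {item}" for item in vulnerabilities)
        sections.append(f"Potential vulnerabilities to avoid:\n{risks}")
    return "\n\n".join(sections)


# Example inputs are invented for illustration.
prompt = build_prompt(
    "Return the middle value of a singly linked list.",
    background=["Two pointers moving at different speeds find the middle in one pass."],
    vulnerabilities=["Missing null test: the list may be empty."],
)
print(prompt)
```

Keeping the vulnerabilities in their own section makes it easy to reuse the same checklist across different problems.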
The Future of AI in Coding
While ChatGPT shows promise in automating software development tasks and enhancing productivity, it’s important to understand its strengths and limitations. The study underscores the need for continuous improvement in AI training techniques and better integration of human oversight in the coding process. By addressing these challenges, AI code generators like ChatGPT can become valuable tools for developers, complementing human expertise rather than replacing it.
Conclusion
The study on ChatGPT’s coding capabilities offers valuable insights into the potential and pitfalls of AI in software development. With a success rate ranging from 0.66% to 89%, depending on various factors, ChatGPT demonstrates both impressive capabilities and notable limitations. As AI continues to evolve, understanding these dynamics will be crucial for leveraging its strengths and addressing its weaknesses. By fostering a collaborative approach between AI and human programmers, the future of coding can become more efficient, secure, and innovative.