Alongside its Gemini artificial intelligence model, Google today unveiled AlphaCode 2, an improved version of the AlphaCode code generator that Google’s DeepMind lab introduced about a year ago.
AlphaCode 2 is actually powered by Gemini, or at least by its Gemini Pro variant, augmented with data from programming contests. And according to Google, the model is far more capable than its predecessor – at least on one benchmark.
In a subset of programming competitions hosted on Codeforces, a platform for programming contests, AlphaCode 2 – writing in Python, Java, C++ and Go – performed better than 85 per cent of entrants on average, Google says. Its predecessor managed to beat only about 50 per cent of contestants on the same subset.
“We selected 12 recent contests with more than 8,000 entrants, either from Division 2 or the more challenging 1+2 division. This yielded a total of 77 challenges,” AlphaCode 2’s technical description reads. “AlphaCode 2 solves 43% of the tasks in 10 attempts, which is almost twice as many as the original AlphaCode (25%).”
AlphaCode 2 can understand programming problems involving “complex” mathematics and theoretical computer science. Among other fairly sophisticated techniques, AlphaCode 2 is capable of dynamic programming, explains DeepMind researcher Rémi Leblond.
Dynamic programming involves simplifying a complex problem by breaking it down into simpler subproblems over and over again. Leblond says AlphaCode 2 knows not only when to apply this strategy correctly, but also where to use it. That’s no small thing, considering that problems requiring dynamic programming were a major sticking point for the original AlphaCode.
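To make the idea concrete, here is a minimal sketch of dynamic programming on a classic textbook task: finding the fewest coins that add up to a target amount. The problem and the function name `min_coins` are illustrative choices for this article, not examples taken from AlphaCode 2 or its benchmark.

```python
def min_coins(coins, target):
    """Return the fewest coins that sum to target, or None if impossible."""
    INF = float("inf")
    # best[a] holds the fewest coins needed to make amount a.
    # Amount 0 needs no coins; everything else starts as "unreachable".
    best = [0] + [INF] * target
    for amount in range(1, target + 1):
        for coin in coins:
            # Reuse the already-solved subproblem for (amount - coin).
            if coin <= amount and best[amount - coin] + 1 < best[amount]:
                best[amount] = best[amount - coin] + 1
    return None if best[target] == INF else best[target]
```

Each amount is solved once and then reused by larger amounts, which is the "breaking down into subproblems" step described above. A greedy approach that always takes the largest coin would fail on inputs such as coins 1, 3 and 4 with a target of 6.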
“AlphaCode 2 has to show some level of understanding, some level of reasoning and solution design, before it can go to actual implementation to solve a programming problem,” Leblond says. “And it does all of this on problems it has never encountered before.”
AlphaCode 2 solves problems by first drawing on a family of “policy models” that generate a number of code samples for each problem. Code samples that do not match the problem description are filtered out, and a clustering algorithm groups “semantically similar code samples” to avoid redundancy. Finally, a scoring model within AlphaCode 2 picks the best candidate from each of the top 10 “clusters” of code samples, and those candidates make up AlphaCode 2’s answer to the problem.
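The filter, cluster and score steps can be sketched roughly as follows. This is a toy illustration, not DeepMind's implementation: candidate programs are stood in for by plain Python callables, "semantically similar" is assumed to mean "produces identical outputs on a set of probe inputs", clusters are ranked by size, and `select_candidates`, `probe_inputs` and `score` are hypothetical names introduced here.

```python
from collections import defaultdict


def select_candidates(samples, examples, probe_inputs, score, top_k=10):
    """Filter, cluster and rank candidate programs (toy illustration).

    samples:      callables standing in for generated programs.
    examples:     (input, expected_output) pairs from the problem statement.
    probe_inputs: extra inputs used only to compare behaviour between samples.
    score:        hypothetical function mapping a sample to a number
                  (higher is better).
    Outputs must be hashable so they can be used as cluster keys.
    """
    def run(sample, value):
        # Treat a crash as a distinct outcome instead of aborting the pipeline.
        try:
            return sample(value)
        except Exception as exc:
            return ("error", type(exc).__name__)

    # 1. Filter: drop samples that fail any example from the problem statement.
    passing = [s for s in samples
               if all(run(s, x) == expected for x, expected in examples)]

    # 2. Cluster: samples with identical outputs on the probe inputs are
    #    treated as semantically similar (an assumption of this sketch).
    clusters = defaultdict(list)
    for s in passing:
        signature = tuple(run(s, x) for x in probe_inputs)
        clusters[signature].append(s)

    # 3. Keep the top_k largest clusters and take the best-scored member of each.
    largest = sorted(clusters.values(), key=len, reverse=True)[:top_k]
    return [max(cluster, key=score) for cluster in largest]
```

Clustering before scoring means the final shortlist is not filled with near-identical programs, so each of the limited attempts tests a genuinely different solution.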
All artificial intelligence models have flaws, and AlphaCode 2 is no exception. According to the technical description, AlphaCode 2 requires a lot of trial and error, is too expensive to run at scale, and relies heavily on the ability to weed out obviously bad code samples. Moving to a more capable version of Gemini, such as Gemini Ultra, may alleviate some of these problems, the description suggests.
As for whether AlphaCode 2 can be expected to appear in products at some point – AlphaCode was never released – Eli Collins, vice president of products at DeepMind, hinted at the possibility during a briefing.
“One of the most interesting things to me about the latest results is that when programmers collaborate with Gemini-based AlphaCode 2 to define certain properties that the code should follow, the performance of the model becomes even better,” Collins said. “In the future, we’ll see programmers using high-performance AI models as collaborative tools to help with the software development process – from reasoning about problems to helping with implementation.”