As Google enters AI coding autocomplete race, Kite for Python language gets smarter

Developers who build machine-learning applications are themselves getting coding suggestions generated by AI.

Developers write code for a living, which means long hours typing at a keyboard. They could therefore benefit from something like Google's AI-generated Smart Compose suggestions in Gmail. But until recently, developers haven't had many smart autocomplete options.

However, earlier this month Google announced that the new version of its popular Dart software development kit for building smartphone apps would ship with 'ML Complete', which uses machine learning to deliver code-completion suggestions in Dart. 

It's the first smart autocomplete tool Google has delivered to developers, but it's something that Kite, the San Francisco startup behind an AI-powered code-autocompletion tool for Python developers, has been plugging away at for years.

Kite has just announced Intelligent Snippets, a feature that allows developers to complete the equivalent of a whole sentence made up of 'tokens' in Python, a language that's become essential for programmers, thanks to the growth of machine learning.

The spoken equivalent of a token in programming is a word, and now Kite can suggest multiple tokens without users having to manually define the structure of a sentence first, meaning it can adapt on the fly to a developer's style of coding. 
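To make the word-and-token comparison concrete, here is a minimal sketch that uses Python's standard `tokenize` module to split one line of code into tokens. The sample line and the `code_tokens` helper are illustrative assumptions and have nothing to do with how Kite itself processes code.

```python
import io
import tokenize

# Token types that carry layout or bookkeeping rather than code "words".
SKIPPED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
    tokenize.INDENT, tokenize.DEDENT, tokenize.COMMENT,
}

def code_tokens(source):
    """Return the token strings in a piece of Python source (illustrative helper)."""
    reader = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(reader)
        if tok.type not in SKIPPED
    ]

line = "result = json.loads(payload)"
print(code_tokens(line))
# ['result', '=', 'json', '.', 'loads', '(', 'payload', ')']
```

A single-token completion would offer only `loads`; a multi-token completion would offer the rest of the call in one go.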

But for now, Kite's new Intelligent Snippets feature is available only to Python developers, via the latest version of Kite for code editors such as Atom, PyCharm, Sublime Text, VS Code, and Vim.

"Intelligent Snippets are larger chunks of code where the user can fill in the blanks," Kite CEO Adam Smith tells ZDNet. 

The feature builds on Kite's main premise of helping developers save time and effort by allowing them to type faster and avoid the need to look up reference documentation on the web.  

Kite currently has a user base of 30,000 developers. Smith said European coders account for just under a third of its web traffic, making the region its second-largest group behind the US.

Around 3,000 Python developers in Europe started using Kite for the first time in August, suggesting growing interest among developers from the EU region. 

Kite has also received support from Joachim Ansorg, a well-known German developer who built the Kite plugin for PyCharm, the integrated development environment (IDE) from JetBrains, the Czech-based company behind the official Android programming language Kotlin.

Intelligent Snippets is designed to address the limitations in using machine-learning models to predict more complex suggestions involving multiple tokens. 

"Intelligent Snippets basically works with the editor to give you an experience where there's a blank inside the completion," explains Smith. 

"What we're ultimately trying to build towards is an interaction between the human developer and the editing environment, where there's as close of a symbiosis as possible. This is an important step to help our users get the intelligence of these models," says Smith.
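The idea of a completion with blanks in it can be sketched roughly as follows. This is only an illustration: the `${1:x}` placeholder notation is borrowed from the snippet syntax common in code editors, and the `fill_snippet` helper is hypothetical. The article does not describe the format Kite actually uses.

```python
import re

# Matches placeholders written as ${index:label}, e.g. ${1:x}.
PLACEHOLDER = re.compile(r"\$\{(\d+):([^}]*)\}")

def fill_snippet(snippet, values):
    """Fill numbered blanks; unfilled blanks fall back to their label."""
    def replace(match):
        index, label = int(match.group(1)), match.group(2)
        return values.get(index, label)
    return PLACEHOLDER.sub(replace, snippet)

# A multi-token suggestion with two blanks for the developer to fill in.
snippet = "plt.plot(${1:x}, ${2:y})"

print(fill_snippet(snippet, {1: "dates", 2: "prices"}))  # plt.plot(dates, prices)
print(fill_snippet(snippet, {}))                         # plt.plot(x, y)
```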

Kite has its limits, though. TabNine, autocomplete software from Canadian computer-science undergraduate Jacob Jackson, has taken a novel approach to the developer challenge by using natural-language processing, allowing it to provide smart suggestions for over a dozen programming languages, including Python, JavaScript, Java, C++, C, and PHP. 

Jackson built TabNine on OpenAI's GPT-2, which was intended for predicting words in natural-language text but is used by TabNine to predict the next token typed by developers, based on preceding tokens.
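Predicting the next token from the preceding ones can be illustrated with something far simpler than GPT-2. The sketch below is a toy bigram counter, not TabNine's model: it looks at only one previous token, and the tiny corpus and function names are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(token_sequences):
    """Count which token follows which across the training sequences."""
    following = defaultdict(Counter)
    for tokens in token_sequences:
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following

def predict_next(following, prev_token, k=3):
    """Return up to k of the most frequent successors of prev_token."""
    counts = following.get(prev_token, Counter())
    return [tok for tok, _ in counts.most_common(k)]

# Illustrative corpus of already-tokenized lines.
corpus = [
    ["import", "numpy", "as", "np"],
    ["import", "pandas", "as", "pd"],
    ["import", "numpy", "as", "np"],
]
model = train_bigrams(corpus)

print(predict_next(model, "import"))  # ['numpy', 'pandas']
print(predict_next(model, "as"))      # ['np', 'pd']
```

A deep model such as GPT-2 conditions on a much longer window of preceding tokens, but the interface is the same: tokens in, ranked guesses out.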

Whether Jackson's approach is the winning strategy remains to be seen, but for now it appears to offer more flexibility than Kite. 

"We're at a stage where web search was in 1995 or 1994. It's very early days. It's very unclear which technology or which approaches are going to win at the end of the day," says Smith.  

"For TabNine versus Kite, it's a pretty interesting set of trade-offs," says Smith. "TabNine doesn't use any of the semantic information. That means their model doesn't understand or use the deeper structure of the code you're working with. It learns some elements of that, but it's pretty limited."

Smith points out that one of the key differences between natural language and code is that, in natural language, context is defined locally. 

So if a person says a pronoun like 'that', 'where' or 'herself', the model could look at the words said before it to understand what the speaker is referring to. But for a model to understand a function from a programming language, it would need access to non-local information, according to Smith. 
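Smith's point about non-local information can be sketched in a few lines. In this assumed example, the text at the call site says nothing about which arguments `resize_image` takes; that information lives in a definition elsewhere, which a tool has to locate and parse. The function, its source, and the `parameters_of` helper are all hypothetical.

```python
import ast

# Stand-in for a different file in the project (hypothetical contents).
LIBRARY_SOURCE = '''
def resize_image(path, width, height, keep_aspect=True):
    return (path, width, height, keep_aspect)
'''

# What the developer has typed so far; no parameter names appear nearby.
CALL_SITE = "thumb = resize_image("

def parameters_of(source, func_name):
    """Find a function definition in source and return its parameter names."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            return [arg.arg for arg in node.args.args]
    return None

print(parameters_of(LIBRARY_SOURCE, "resize_image"))
# ['path', 'width', 'height', 'keep_aspect']
```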

Python is by no means the wrong language to support if you had to pick one today, but Kite would need to build an entirely new engine to support Java or JavaScript.

"One of the trade-offs of using GPT-2 compared to the model we've used so far, which is the user program analysis engine, is it's not naturally cross-language," says Smith. "You almost need to have a different engine for each language you want to support." 

TabNine's Jackson thinks this framing of the trade-offs between the two approaches is flawed because it's not an 'either-or' choice.

"There's a danger of a false dichotomy here: when comparing deep models with semantic approaches, it's easy to forget that you can use both at once. Indeed, this is what TabNine does when semantic completion is enabled: it uses the semantic completer to filter the deep model's results," Jackson told ZDNet in an email. 

"In my view it's not a question of which approach is better, but rather 'How can these approaches be used together to complement each other?'."
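The combination Jackson describes, a semantic completer filtering a deep model's results, might look something like the sketch below. It is not TabNine's implementation: the ranked guesses are hard-coded stand-ins for model output, and the "semantic" check is simply whether the name exists on the target object.

```python
import json

def filter_by_semantics(ranked_candidates, target):
    """Keep only candidates that are real attributes of target, preserving rank."""
    valid = set(dir(target))
    return [name for name in ranked_candidates if name in valid]

# Hypothetical ranked guesses for what follows "json." in the editor.
model_guesses = ["loads", "parse", "dumps", "read_json", "load"]

print(filter_by_semantics(model_guesses, json))
# ['loads', 'dumps', 'load']
```

The model supplies the ranking and the semantic step removes suggestions that could never work, such as `parse`, which the `json` module does not define.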

Another AI autocomplete option developers have is Microsoft's IntelliCode in Visual Studio. Smith concedes Microsoft's approach is "simple to build, simple to implement, and very fast", but notes it lacks multi-token completions and takes "very little context" into account.   

"You don't have to worry about performance, you don't have to send your code to a server to be processed," he says of Microsoft's IntelliCode.

Performance was a major constraint that both TabNine and Kite initially had to work around, leading both to launch as cloud-based services that could exploit more compute power than even a high-end laptop offers.

However, as of this year both TabNine and Kite offer locally run systems, so developers don't need to send their private source code to the cloud.

As for why Smith settled on Python rather than JavaScript or Java, the CEO says it was a pragmatic decision: dynamically typed languages like Python and JavaScript are harder to analyze than statically typed languages such as Java or C#. Python tooling is also less mature, which made it an easier market to target.

"The development environments for JavaScript and Python even today aren't really as good [as those] for Java or C#," Smith notes.
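Why dynamic typing makes analysis harder can be seen in a small, assumed example. The function below returns an `int` for some inputs and a `str` for others, so a tool cannot know from the source alone which methods to suggest on the result.

```python
def parse_value(text):
    """Return an int when text is numeric, otherwise return the text itself."""
    try:
        return int(text)
    except ValueError:
        return text

# The return type is only known once the program runs with real input.
print(type(parse_value("42")).__name__)   # int
print(type(parse_value("abc")).__name__)  # str
```

In a statically typed language such as Java, the declared return type would settle the question before the program runs.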

As for the choice of Python over JavaScript, Smith contends JavaScript is a more difficult target because there are so many different flavors of it, from the browser version to server-side runtimes like Node.js and frameworks like React.

"If we wanted to support JavaScript, it would just be a lot more engineering effort to do that, especially in the early days where we just wanted to iterate as quickly as possible, so we liked the lower surface area of Python. Since, we've been kind of lucky in a way that Python has been very successful," says Smith.
