video description
The video shows the Godot code editor with some unfinished code. After the user presses a button offscreen, the code completes itself, seemingly because an AI fills in the blanks. The examples shown include a print_hello_world function and a vector_length function. The user can accept or decline the generated code by pressing Tab or Backspace, respectively.
This is an addon I am working on. It can help you write some code and stuff.
It works by hooking into your local LLMs on Ollama, which is a FOSS way to run large language models locally.
Here’s a chat interface, which is also part of the package:
video description
The video shows a chat interface in which the user can talk to a large language model. The model can read the user's code and answer questions about it.
Do you have any suggestions for what I can improve? (Besides removing the blue particles around the user text field)
Important: This plugin is WIP and not released yet!
Currently the completion is implemented via keyboard shortcut.
Would you prefer it if I made it complete the code automatically? I personally feel that intentionally asking for a completion is more natural than waiting for one to appear.
Are there some other features you would like to see? I am currently working on a function-refactoring UI.
Completion via a keyboard shortcut is perfect.
As far as other features go, I don’t want any. I just want code completion via keyboard shortcut.
I think a hard aspect is figuring out what context to feed the LLM. Iirc GitHub Copilot only feeds what is in the current file, above the cursor, but I think feeding the whole file + other open code tabs would be super useful.
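To make the trade-off concrete, here is a minimal sketch of how context could be assembled under a size budget. Everything in it is an assumption for illustration: the `build_context` name, the character-based budget, and the `# file:` header format are hypothetical and not taken from the addon. The text right before the cursor is always kept, and other open files only get whatever room is left.

```python
from typing import Dict, Optional


def build_context(current_file: str,
                  cursor_offset: int,
                  other_files: Optional[Dict[str, str]] = None,
                  max_chars: int = 4000) -> str:
    """Hypothetical helper: build a prompt that fits within max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # The code nearest the cursor matters most, so keep the tail of the prefix.
    prefix = current_file[:cursor_offset][-max_chars:]
    remaining = max_chars - len(prefix)

    parts = []
    for name, text in (other_files or {}).items():
        header = f"# file: {name}\n"
        room = remaining - len(header) - 1  # reserve 1 char for the newline
        if room <= 0:
            break  # budget used up, skip the remaining files
        snippet = text[:room]
        parts.append(header + snippet + "\n")
        remaining -= len(header) + len(snippet) + 1

    # Other files go first so the prompt ends exactly at the cursor.
    return "".join(parts) + prefix
```

A real implementation would more likely budget in tokens than in characters, but the idea is the same: the smaller the budget, the less the model has to read before it can start answering.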
You are right in that it can be useful to feed in all of the contents in other related files.
However!
LLMs take a really long time before writing anything when given a large context input. The fact that GitHub's Copilot can generate code so quickly even though it has to keep the entire code file in context is a miracle to me.
Including all related or opened GDScript files would be way too much for most models, and it would likely take about 20 seconds before it actually starts generating code (this delay is also called first token lag). So I will likely only put the current file into the context window, as that might already take some time. Remember, we are running local LLMs here, so not everyone has a blazingly fast GPU or CPU (I use a GTX 1060 6GB, for instance).
Example
I just tried it, and it took a good 10 seconds to complete a 111-line script without any other context using this pretty small model, and then about 6 seconds to write about 5 lines of comment documentation (on my CPU). It takes about 1 second with a very short script.
You can try this yourself using something like HuggingChat to test out a big-context-window model like Command R+: fill its context window with some really, really long string (copy-paste it a bunch of times) and see how it takes longer to respond. For me, it’s the difference between one second and 13 seconds!
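If you want to measure this instead of eyeballing it, a small timing helper is enough. The sketch below is only an illustration: `time_to_first_token` and `fake_model` are made-up names, and `fake_model` just sleeps in proportion to the prompt length as a stand-in for a real streaming LLM response.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_token(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first token arrived, that first token).

    Raises StopIteration if the stream yields nothing.
    """
    start = time.perf_counter()
    first = next(iter(stream))
    return time.perf_counter() - start, first


def fake_model(prompt: str, seconds_per_char: float = 0.0001) -> Iterator[str]:
    """Stand-in for a streaming model: longer prompts delay the first token."""
    time.sleep(len(prompt) * seconds_per_char)  # pretend to process the prompt
    yield "func"
    yield " foo"


short_delay, _ = time_to_first_token(fake_model("x" * 10))
long_delay, first = time_to_first_token(fake_model("x" * 1000))
print(f"short prompt: {short_delay:.3f}s, long prompt: {long_delay:.3f}s")
```

Swapping `fake_model` for any iterator that yields tokens from a real model gives the same measurement for an actual setup.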
I am thinking about embedding either the current working file, or maybe some other opened files though, to get the most important functions out of the script to keep context length short. This way we can shorten this first token delay a bit.
This is a completely different story with hosted LLMs, as they tend to have blazingly quick first token delays, which makes the wait trivial.
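As a rough illustration of the embedding idea, the sketch below ranks functions by similarity to the code being completed and keeps only the best matches. It is an assumption-heavy toy: `toy_embed` is a simple word-count vector standing in for a real embedding model, and `top_functions` is a hypothetical helper, not something the addon provides.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: counts lowercase words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_functions(functions: Dict[str, str], query: str, k: int = 2) -> List[str]:
    """Return the names of the k functions most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(functions,
                    key=lambda name: cosine(toy_embed(functions[name]), q),
                    reverse=True)
    return ranked[:k]


functions = {
    "vector_length": "func vector_length(v): return sqrt(v.x * v.x + v.y * v.y)",
    "print_hello_world": "func print_hello_world(): print('hello world')",
    "load_texture": "func load_texture(path): return load(path)",
}
query = "func normalize(v): return v / vector_length(v)"
print(top_functions(functions, query, k=1))
```

Only the selected functions would then be placed in the prompt, which keeps the context short at the cost of an extra embedding step per file.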