The debate about whether AI will eventually take over our jobs, even those in software engineering, is not a new one. It may well resurface now that OpenAI and Microsoft have collaborated to offer an AI-powered pair-programming capability, GitHub Copilot, in Visual Studio Code.
The dedicated webpage emphasizes that this capability is different from the standard autocomplete functionality available in most integrated development environments (IDEs): it uses context from your comments, function names, and the code itself to try to synthesize suitable code for you. For example, if you write a meaningful comment followed by a function signature, GitHub Copilot can automatically write the rest of the function for you, including handling the return value. In an ideal scenario, this means you could simply write a comment describing what a piece of code should do and have the AI actually write the code.
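To illustrate that workflow, here is a hypothetical sketch (not actual Copilot output): the developer types only the comment and the `def` line, and a Copilot-style assistant fills in the body.

```python
# Hypothetical sketch of a Copilot-style completion: the developer writes only
# the comment and the function signature; the tool suggests the body below.

# Parse an expense report where each line is "<item> <price>"
# and return the total cost as a float.
def total_expenses(report: str) -> float:
    total = 0.0
    for line in report.splitlines():
        parts = line.split()
        if len(parts) >= 2:           # skip blank or malformed lines
            total += float(parts[-1])
    return total
```

Here the function name `total_expenses` and the report format are invented for the illustration; the point is that the comment and signature alone carry enough context for a plausible completion, return type included.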
In terms of technicalities, GitHub Copilot is powered by OpenAI Codex, an AI system developed by OpenAI that has been trained on natural-language text and source code from publicly available sources, including public GitHub repositories. The AI is obviously not perfect, and the developer using it should be held accountable at the end of the day; still, when tested on a set of Python functions with empty bodies, it produced a correct implementation on the first attempt 43% of the time, and within 10 attempts 57% of the time – the kind of improvement from repeated sampling that is common in AI systems.
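The "correct within 10 attempts" figure reflects an evaluation style in which several candidate completions are sampled and each is checked against unit tests; the task counts as solved if any candidate passes. A minimal sketch of that loop, with made-up candidate implementations of `abs()` standing in for model samples:

```python
import random

# Hypothetical candidate implementations a model might sample for abs()
CANDIDATES = [
    lambda x: x,                     # wrong: fails for negative inputs
    lambda x: -x,                    # wrong: fails for positive inputs
    lambda x: x if x >= 0 else -x,   # correct
]

def passes_tests(fn) -> bool:
    """Run the unit tests for abs() against one candidate."""
    try:
        return fn(-3) == 3 and fn(2) == 2 and fn(0) == 0
    except Exception:
        return False

def solved_within_k(k: int, seed: int = 0) -> bool:
    """The task counts as solved if any of k sampled candidates passes."""
    rng = random.Random(seed)
    return any(passes_tests(rng.choice(CANDIDATES)) for _ in range(k))
```

With more attempts, the chance that at least one sampled candidate passes the tests rises, which is why the reported accuracy climbs from 43% to 57% between one and ten tries.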
GitHub Copilot should get better the more it is used. Although it is trained on publicly available code, the AI rarely reproduces code verbatim from the training set, which preserves the originality of developers' work. Work is also planned on removing unsafe code from the training data. Filters are already in place to block offensive words and to avoid generating suggestions in sensitive contexts, but since the project is still at the technical preview stage, problematic output may occasionally slip through.
Those worried about having their code and telemetry sent to GitHub Copilot should carefully read the following statement from the project's FAQ page:
In order to generate suggestions, GitHub Copilot transmits part of the file you are editing to the service. This context is used to synthesize suggestions for you. GitHub Copilot also records whether the suggestions are accepted or rejected. This telemetry is used to improve future versions of the AI system, so that GitHub Copilot can make better suggestions for all users in the future. In the future we will give users the option to control how their telemetry is used.
All data is transmitted and stored securely. Access to the telemetry is strictly limited to individuals on a need-to-know basis. Inspection of the gathered source code will be predominantly automatic, and when humans read it, it is specifically with the aim of improving the model or detecting abuse.
[…] We use telemetry data, including information about which suggestions users accept or reject, to improve the model. We do not reference your private code when generating code for other users.
It is important to note that GitHub Copilot is currently available only as a technical preview for Visual Studio Code. Since it requires powerful hardware on the service's backend, it is being made available for free to a select group of testers only – although you can sign up here to join the waitlist. The team has not yet decided whether GitHub Copilot will be offered as a paid product once it is commercially available at scale; that is a question it plans to answer after the technical preview ends.