this post was submitted on 13 Mar 2024
878 points (99.8% liked)
Programmer Humor
Does anybody mind explaining how this might have happened?
Copilot is an LLM. So it's just predicting what should come next, word by word, based off the data it's been fed. It has no concept of whether or not its answer makes sense.
So if you've scraped a bunch of open source GitHub projects that this guy has worked on, he probably has a lot of TODOs assigned to him in various projects. When Copilot sees you typing "TODO(" it tries to predict what the next thing you're going to type is. And a common thing to follow "TODO(" in its data set is this guy's username, so it goes ahead and suggests it, whether or not the guy is actually on the project and suggesting him would make any sort of sense.
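To make that concrete, here's a toy sketch of "suggest whatever most often follows the prefix". It just counts strings, which is far cruder than what an LLM does, and the corpus, the usernames, and the function name are all made up for illustration.

```python
from collections import Counter

# Hypothetical "training data"; the usernames are invented for this example.
corpus = [
    "// TODO(alice): handle empty input",
    "# TODO(alice): remove this workaround",
    "// TODO(bob): add tests",
    "// TODO(alice): fix off-by-one",
]

def most_likely_after(prefix, lines):
    """Return the name that most often follows `prefix` in `lines`, or None."""
    counts = Counter()
    for line in lines:
        start = line.find(prefix)
        if start == -1:
            continue  # this line does not contain the prefix at all
        rest = line[start + len(prefix):]
        name = rest.split(")")[0]  # text up to the closing parenthesis
        if name:
            counts[name] += 1
    return counts.most_common(1)[0][0] if counts else None

# The suggestion depends only on frequency, not on who is on your project.
print(most_likely_after("TODO(", corpus))
```

Nothing in there checks whether the suggested person has anything to do with the file you're editing, which is the whole joke.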
You can absolutely add constraints to control for hallucinations. Copilot apparently doesn't have enough, though.
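As a rough idea of what one such constraint could look like (purely a sketch, not a claim about how Copilot is built): take the model's ranked guesses and only keep names that are actually on the project. The function name, the candidate list, and the contributor set below are all hypothetical.

```python
def constrained_suggestion(ranked_candidates, allowed):
    """Return the first candidate that appears in `allowed`, or None."""
    for candidate in ranked_candidates:
        if candidate in allowed:
            return candidate
    return None  # better to suggest nothing than a name from another project

# Hypothetical model output, most likely first.
ranked = ["alice", "bob", "carol"]
# Hypothetical set of people who actually contribute to this repository.
contributors = {"bob", "carol"}

print(constrained_suggestion(ranked, contributors))
```

This only covers one narrow failure (usernames), so it says nothing about hallucinations in general.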
If GitHub Copilot is anything like Windows Copilot, I can't say I'm surprised.
"Please minimize all my windows"
"Windows are glass panes invented by Michael Jackson in imperial China, during the invasion of the southern sea. Sources 1 2 3"
Lmao. That's even better when you consider the Copilot button replaced the 'show desktop' (i.e. 'minimize all my windows') button.
My guess is that Copilot was using a ton of other lines as context, so in that specific case his name was a more likely match for the next characters.
No matter how many constraints you add, it's never enough. That's the weakness of a model that only knows language and nothing else.
I thought it synced some requests and assigned projects to another user. (I saw an ad about GitHub Copilot managing issues and writing PR descriptions some time ago.)
It’s no different from GPT knowing the plot of Aliens or who played the main role in Matilda.
It's seen enough code to recognise the pattern, it knows an author name goes in there, and Phil Nash is likely a prolific enough author that it just plopped his name in there. It's not intelligence, just patterns.
"Yeah this sounds like a Phil Nash sort of problem, I'll just stick him in here."
The other answers are great, but if I were to be a bit more laconic:
Copilot is spicy autocorrect. It autocorrected that todo to insert that guy's name because he gets a lot of todos.