this post was submitted on 12 Apr 2025
1262 points (98.5% liked)

Programmer Humor

[–] jorm1s@sopuli.xyz 4 points 1 week ago (2 children)

Isn't writing tests with AI like a really bad idea? I mean, the whole point of writing separate tests is hoping that you won't make the same mistakes twice, and therefore catch any behavior in the code that does not match your intent. But if you use an LLM to write a test using said code as context (instead of the original intent you would use yourself), there's a risk that it'll just write a test case that makes sure the code contains the wrong behavior.

Okay, it might still be okay for regression testing, but you're still missing most of the benefit you'd get by writing the tests manually. Unless you only care about closing tickets, that is.
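To make that risk concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: `is_leap_year` stands in for the code under test (with the century rule deliberately left out), and the two `check_*` functions stand in for a test derived from the code versus a test derived from the intent.

```python
def is_leap_year(year: int) -> bool:
    """Hypothetical code under test.

    Intent: the Gregorian leap-year rule.
    Bug (deliberate, for illustration): the century exception is missing.
    """
    return year % 4 == 0


def check_from_code() -> None:
    # Written by reading the implementation, so it records the bug as "expected".
    assert is_leap_year(2024) is True
    assert is_leap_year(1900) is True  # wrong per intent, but matches the code


def check_from_intent() -> None:
    # Written from the rule itself: 2000 is a leap year, 1900 is not.
    assert is_leap_year(2024) is True
    assert is_leap_year(2000) is True
    assert is_leap_year(1900) is False


def passes(check) -> bool:
    """Run one check and report whether every assertion held."""
    try:
        check()
        return True
    except AssertionError:
        return False


if __name__ == "__main__":
    print("derived from code passes:  ", passes(check_from_code))
    print("derived from intent passes:", passes(check_from_intent))
```

The check derived from the code passes and so is only useful for regression; the check derived from intent is the one that fails and exposes the bug.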

[–] Grazed@lemmy.world 5 points 1 week ago

"Unless you only care about closing tickets, that is."

Perfect. I'll use it for tests at work then.

[–] EmilyIsTrans@lemmy.blahaj.zone 2 points 1 week ago* (last edited 1 week ago)

I've used it most extensively for non-professional projects, where if I wasn't using this kind of tooling to write tests they would simply not be written. That means no tickets to close either. That said, I am aware that the AI is almost always at best testing for regression (I have had it correctly realise my logic is incorrect and write tests that catch it, but that is by no means reliable). Part of the "hand holding" I mentioned involves making sure it has sufficient coverage of use cases and edge cases, and that what it expects to be correct is actually correct according to intent.

I essentially use the AI to generate a variety of scenarios and complementary test data, then further evaluate their validity and expand from there.
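A rough sketch of what that workflow could look like in Python, not my actual setup: `clamp`, the `Scenario` class, and the `reviewed` flag are all made-up names for illustration. The idea is that generated scenarios only count as tests once a human has checked the expected value against intent.

```python
from dataclasses import dataclass


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


@dataclass(frozen=True)
class Scenario:
    name: str
    args: tuple[int, int, int]
    expected: int
    reviewed: bool  # True only after a human checked `expected` against intent


# Imagine these were generated by the tool; the last one is not yet reviewed.
SCENARIOS = [
    Scenario("inside range", (5, 0, 10), 5, reviewed=True),
    Scenario("below range", (-3, 0, 10), 0, reviewed=True),
    Scenario("above range", (42, 0, 10), 10, reviewed=True),
    Scenario("on upper bound", (10, 0, 10), 10, reviewed=False),
]


def run_reviewed(scenarios: list[Scenario]) -> list[str]:
    """Run only human-reviewed scenarios; return the names of failures."""
    failures = []
    for s in scenarios:
        if not s.reviewed:
            continue
        if clamp(*s.args) != s.expected:
            failures.append(s.name)
    return failures


def pending_review(scenarios: list[Scenario]) -> list[str]:
    """Names of generated scenarios still waiting for a human check."""
    return [s.name for s in scenarios if not s.reviewed]


if __name__ == "__main__":
    print("failures:", run_reviewed(SCENARIOS))
    print("needs review:", pending_review(SCENARIOS))
```

Keeping the unreviewed scenarios visible instead of silently running them is the point of the flag: it separates "the tool thinks this is right" from "I confirmed this is what I meant".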