Human coders are still better than LLMs

382 points | 11 hours | antirez.com
mattnewton10 hours ago

This matches my experience. I actually think a fair amount of value from LLM assistants to me is having a reasonably intelligent rubber duck to talk to. Now the duck can occasionally disagree and sometimes even refine.

https://en.m.wikipedia.org/wiki/Rubber_duck_debugging

I think the big question everyone wants to skip right past this conversation to is: will this continue to be true 2 years from now? I don’t know how to answer that question.

Buttons8403 hours ago

LLMs aren't my rubber duck, they're my wrong answer.

You know that saying that the best way to get an answer online is to post a wrong answer? That's what LLMs do for me.

I ask the LLM to do something simple but tedious, and then it does it spectacularly wrong, then I get pissed off enough that I have the rage-induced energy to do it myself.

Buttons8403 hours ago

I'm probably suffering from undiagnosed ADHD, and will get stuck and spend minutes picking a function name and then writing a docstring. LLMs do help with this even if they get the code wrong, because I usually won't bother to fix their variable names or docstring unless needed. LLMs can reliably solve the problem of a blank page.

linotype2 hours ago

This. I have ADHD and starting is the hardest part for me. With an LLM it gets me from 0 to 20% (or more) and I can nail it for the rest. It’s way less stressful for me to start now.

therealpygon3 hours ago

LLMs follow instructions. Garbage in = garbage out, generally. When attention is managed, a problem is well defined, and the necessary materials are available to it, they can perform rather well. On the other hand, I find a lot of the loosey-goosey vibe coding approach to be useless; it gives a lot of false impressions about how useful LLMs can be, both too positive and too negative.

GiorgioG3 hours ago

So what you’re saying is you need to be very specific and detailed when writing your specifications for the LLM to spit out the code you want. Sounds like I can just skip the middle man and code it myself.

AndrewKemendo1 hour ago

This seems to be what’s happened

People are expecting perfection from bad spec

Isn’t that what engineers are (rightfully) always complaining about to BD?

myvoiceismypass2 hours ago

They should maybe have a verifiable specification for said instructions. Kinda like a programming language maybe!

Affric2 hours ago

Yep.

I like maths, I hate graphing. Tedious work even with state of the art libraries and wrappers.

LLMs do it for me. Praise be.

lanstin16 minutes ago

Yeah, I write a lot of little data analysis scripts and stuff, and I am happy just to read the numbers, but now I get nice PNGs of the distributions and so on from the LLM, and people like that.

AndrewKemendo2 hours ago

Out of curiosity, can you give me an example of prompt(s) you’ve used and been disappointed by?

I see these comments all the time and they don’t reflect my experience so I’m curious what your experience has been

Buttons84014 minutes ago

I asked ChatGPT 4o to write an Emacs function to highlight a line. This involves setting the "mark" at the beginning and the "point" at the end. It would only set the point, so I corrected it: "no, you have to set both." But even after correction it would move the point to the beginning, and then move the point again to the end, without ever touching the mark.

seattle_spring3 hours ago

This has been my experience as well. The biggest problem is that the answers look plausible, and only after implementation and experimentation do you find them to be wrong. If this happened every once in a while then it wouldn't be a big deal, but I'd guess that more than half of the answers and tutorials I've received through ChatGPT have ended up being plain wrong.

God help us if companies start relying on LLMs for life-or-death stuff like insurance claim decisions.

dabraham12482 hours ago

I'm not sure if you're being sarcastic, but in case you're not... From https://arstechnica.com/health/2023/11/ai-with-90-error-rate...

"UnitedHealth uses AI model with 90% error rate to deny care, lawsuit alleges" Also "The use of faulty AI is not new for the health care industry."

bsder3 hours ago

LLMs are a decent search engine a la Google circa 2005.

It's been 20 years since that, so I think people have simply forgotten that a search engine can actually be useful as opposed to ad infested SEO sewage sludge.

The problem is that the conversational interface, for some reason, seems to turn off the natural skepticism that people have when they use a search engine.

AdieuToLogic1 hour ago

> LLMs are a decent search engine a la Google circa 2005.

Statistical text (token) generation made from an unknown (to the user) training data set is not the same as a keyword/faceted search of arbitrary content acquired from web crawlers.

> The problem is that the conversational interface, for some reason, seems to turn off the natural skepticism that people have when they use a search engine.

For me, my skepticism of using a statistical text generation algorithm as if it were a search engine is because a statistical text generation algorithm is not a search engine.

andrekandre2 hours ago

  > the conversational interface, for some reason, seems to turn off the natural skepticism that people have
n=1, but after having ChatGPT "lie" to me more than once, I am very skeptical of it and always double-check it, whereas with something like TV or YT videos I still find myself being click-baited or grifted (in other words, less skeptical) much more easily... any large studies about this would be very interesting...
marcosdumay9 hours ago

It's a damned assertive duck, with confidence completely out of proportion to its competence.

I've seen enough people led astray by talking to it.

jasonm231 hour ago

Try a system prompt like this:

- - -

System Prompt:

You are ChatGPT, and your goal is to engage in a highly focused, no-nonsense, and detailed way that directly addresses technical issues. Avoid any generalized speculation, tangential commentary, or overly authoritative language. When analyzing code, focus on clear, concise insights with the intent to resolve the problem efficiently. In cases where the user is troubleshooting or trying to understand a specific technical scenario, adopt a pragmatic, “over-the-shoulder” problem-solving approach. Be casual but precise—no fluff. If something is unclear or doesn’t make sense, ask clarifying questions. If surprised or impressed, acknowledge it, but keep it relevant. When the user provides logs or outputs, interpret them immediately and directly to troubleshoot, without making assumptions or over-explaining.

- - -

foxyv9 hours ago

Same here. When I'm teaching coding I've noticed that LLMs will confuse the heck out of students. They will accept what it suggests without realizing that it is suggesting nonsense.

cogogo6 hours ago

I’m self taught and don’t code that much but I feel like I benefit a ton from LLMs giving me specific answers to questions that would take me a lot of time to figure out with documentation and stack overflow. Or even generating snippets that I can evaluate whether or not will work.

But I actually can’t imagine how you can teach someone to code if they have access to an LLM from day one. It’s too easy to take the easy route and you lose the critical thinking and problem solving skills required to code in the first place and to actually make an LLM useful in the second. Best of luck to you… it’s a weird time for a lot of things.

*edit them/they

ilamont3 hours ago

> I’m self taught and don’t code that much but I feel like I benefit a ton from LLMs giving me specific answers to questions that would take me a lot of time to figure out with documentation and stack overflow

Same here. Combing discussion forums and KB pages for an hour or two, seeking how to solve a certain problem with a specific tool, has been replaced by a 50-100 word prompt in Gemini, which gives very helpful replies, likely derived from many of those same forums and support docs.

Of course I am concerned about accuracy, but for most low-level problems it's easy enough to test. And you know what, many of those forum posts or obsolete KB articles had their own flaws, too.

XorNot5 hours ago

This was what promptly led me to turning off Jetbrains AI assistant: the multiline completion was incredibly distracting to my chain of thought, particularly when it would suggest things that looked right but weren't. Stopping and parsing the suggestion to realize if it was right or wrong would completely kill my flow.

SchemaLoad4 hours ago

The inline suggestions feel like that annoying person who always interrupts you with what they think you were going to finish with but rarely ever gets it right.

hn_acc12 hours ago

With VS Code and Augment (company won't allow any other AI, and I'm not particularly inclined to push - but it did just switch to o4, IIRC), the main benefit is that if I'm fiddling / debugging some code, and need to add some debug statements, it can almost always expand that line successfully for me, following our idiom for debugging - which saves me a few seconds. And it will often suggest the same debugging statement, even if it's been 3 weeks and in a different git branch where I last coded that debugging statement.

My main annoyance? If I'm in that same function, it still remembers the debugging / temporary hack I tried 3 months ago and haven't done since and will suggest it. And heck, even if I then move to a different part of the file or even a different file, it will still suggest that same hack at times, even though I used it exactly once and have not since.

Once you accept something, it needs some kind of temporal feedback mechanism to timeout even accepted solutions over time, so it doesn't keep repeating stuff you gave up on 3 months ago.
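
To make the idea concrete, here's a minimal sketch of such a timeout, assuming a hypothetical cache of accepted snippets that records when each was last used (purely illustrative; it says nothing about how Augment actually works):

  import time

  HALF_LIFE_DAYS = 30  # hypothetical: a snippet's score halves for every 30 days it goes unused

  def decayed_score(base_score, last_used_ts, now=None):
      """Down-weight a previously accepted snippet the longer it goes unused."""
      now = time.time() if now is None else now
      age_days = (now - last_used_ts) / 86400
      return base_score * 0.5 ** (age_days / HALF_LIFE_DAYS)

  def rank_suggestions(candidates):
      """candidates: dicts with 'snippet', 'base_score' and 'last_used_ts' keys."""
      return sorted(candidates,
                    key=lambda c: decayed_score(c["base_score"], c["last_used_ts"]),
                    reverse=True)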

Our codebase is very different from 98% of the coding stuff you'll find online, so anything more than a couple of obvious suggestions are complete lunacy, even though they've trained it on our codebase.

chucksmash5 hours ago

Tbf, there's a phase of learning to code where everything is pretty much an incantation you learn because someone told you "just trust me." You encounter "here's how to make the computer print text in Python" before you would ever discuss strings or defining and invoking functions, for instance. To get your start you kind of have to just accept some stuff uncritically.

It's hard to remember what it was like to be in that phase. Once simple things like using variables are second nature, it's difficult to put yourself back into the shoes of someone who doesn't understand the use of a variable yet.

eszed5 hours ago

Yeah, and accepting the LLM uncritically* is exactly what you shouldn't do in any non-trivial context.

But, as a sibling poster pointed out: for now.

klntsky9 hours ago

I would argue that they are never led astray by chatting, but rather by accepting the projection of their own prompt passed through the model as some kind of truth.

When talking with reasonable people, they have an intuition of what you want even if you don't say it, because there is a lot of non-verbal context. LLMs lack the ability to understand the person, but behave as if they had it.

marcosdumay9 hours ago

Most of the times, people are led astray by following average advice on exceptional circumstances.

People with a minimum amount of expertise stop asking for advice for average circumstances very quickly.

sho_hn5 hours ago

This is right on the money. I use LLMs when I am reasonably confident the problem I am asking it is well-represented in the training data set and well within its capabilities (this has increased over time).

This means I use it as a typing accelerator when I already know what I want most of the time, not for advice.

As an exploratory tool sometimes, when I am sure others have solved a problem frequently, to have it regurgitate the average solution back at me and take a look. In those situations I never accept the diff as-is and do the integration manually though, to make sure my brain still learns along and I still add the solution to my own mental toolbox.

sigmoid108 hours ago

It's mostly a question of experience. I've been writing software long enough that when I give chat models some code and a problem, I can immediately tell if they understood it or if they got hooked on something unrelated. But junior devs will have a hell of a hard time, because the raw code quality that LLMs generate is usually top notch, even if the functionality is completely off.

traceroute668 hours ago

> When talking with reasonable people

When talking with reasonable people, they will tell you if they don't understand what you're saying.

When talking with reasonable people, they will tell you if they don't know the answer or if they are unsure about their answer.

LLMs do none of that.

They will very happily, and very confidently, spout complete bullshit at you.

It is essentially a lotto draw as to whether the answer is hallucinated, completely wrong, subtly wrong, not ideal, sort of right or correct.

An LLM is a bit like those spin the wheel game shows on TV really.

protocolture4 hours ago

I spend a lot of time working shit out to prove the rubber duck wrong and I am not completely sure this is a bad working model.

prisenco6 hours ago

I use it as a rubber duck but you're right. Treat it like a brilliant idiot and never a source of truth.

I use it for what I'm familiar with but rusty on or to brainstorm options where I'm already considering at least one option.

But a question on immunobiology? Waste of time. I have a single undergraduate biology class under my belt, I struggled for a good grade then immediately forgot it all. Asking it something I'm incapable of calling bullshit on is a terrible idea.

But rubber ducking with AI is still better than letting it do your work for you.

amelius5 hours ago

If this is a problem for you, just add "... and answer in the style of a drunkard" to your prompts.

TedDallas3 hours ago

Yeah, the problem is if you don't understand the problem space then you are going to lean heavily on the LLM. And that can lead you astray. Which is why you still need people who are experts to validate solutions and provide feedback, like OP.

My most productive experiences with LLMs is to have my design well thought out first, ask it to help me implement, and then help me debug my shitty design. :-)

drivenextfunc8 hours ago

Regarding the stubborn and narcissistic personality of LLMs (especially reasoning models), I suspect that attempts to make them jailbreak-resistant might be a factor. To prevent users from gaslighting the LLM, trainers might have inadvertently made the LLMs prone to gaslighting users.

taneq2 hours ago

Treat it as that enthusiastic co-worker who’s always citing blog posts and has a lot of surface knowledge about style and design patterns and whatnot, but isn’t that great on really understanding algorithms.

They can be productive to talk to but they can’t actually do your job.

all26 hours ago

My typical approach is prompt, be disgusted by the output, tinker a little on my own, prompt again -- but more specific, be disgusted again by the output, tinker a little more, etc.

Eventually I land on a solution to my problem that isn't disgusting and isn't AI slop.

Having a sounding board, even a bad one, forces me to order my thinking and understand the problem space more deeply.

suddenlybananas5 hours ago

Why not just write the code at that point instead of cajoling an AI to do it?

all22 hours ago

I don't cajole the model to do it. I rarely use what the model generates. I typically do my own thing after making an assessment of what the model writes. I orient myself in the problem space with the model, then use my knowledge to write a more concise solution.

XorNot4 hours ago

This is the part I don't get about vibe coding: I've written specification documents before. They frequently are longer and denser than the code required to implement them.

Typing longer and longer prompts to LLMs to not get what I want seems like a worse experience.

lupire4 hours ago

Because saving hours of time is nice.

eptcyka5 hours ago

Some humans are the same.

dwattttt5 hours ago

We also don't aim to elevate them. We instead try not to give them responsibility until they're able to handle it.

olddustytrail3 hours ago

Unless you're an American deciding who should be president.

schwartzworld8 hours ago

For me it's like having a junior developer work under me who knows APIs inside and out, but has no common sense about architecture. I like that I delegate tasks to them so that my brain can be free for other problems, but it makes my job much more review heavy than before. I put every PR through 3-4 review cycles before even asking my team for a review.

eslaught7 hours ago

How do you not completely destroy your concentration when you do this though?

I normally build things bottom up so that I understand all the pieces intimately and when I get to the next level of abstraction up, I know exactly how to put them together to achieve what I want.

In my (admittedly limited) use of LLMs so far, I've found that they do a great job of writing code, but that code is often off in subtle ways. But if it's not something I'm already intimately familiar with, I basically need to rebuild the code from the ground up to get to the point where I understand it well enough so that I can see all those flaws.

At least with humans I have some basic level of trust, so that even if I don't understand the code at that level, I can scan it and see that it's reasonable. But every piece of LLM generated code I've seen to date hasn't been trustworthy once I put in the effort to really understand it.

schwartzworld7 hours ago

I use a few strategies, but it's mostly the same as if I was mentoring a junior. A lot of my job already involved breaking up big features into small tickets. If the tasks are small enough, juniors and LLMs have an easier time implementing things and I have an easier time reviewing. If there's something I'm really unfamiliar with, it should be in a dedicated function backed by enough tests that my understanding of the implementation isn't required. In fact, LLMs do great with TDD!

> At least with humans I have some basic level of trust, so that even if I don't understand the code at that level, I can scan it and see that it's reasonable.

If you can't scan the code and see that it's reasonable, that's a smell. The task was too big or it's implemented the wrong way. You'd feel bad telling a real person to go back and rewrite it a different way, but the LLM has no ego to bruise.

I may have a different perspective because I already do a lot of review, but I think using LLMs means you have to do more of it. What's the excuse for merging code that is "off" in any way? The LLM did it? It takes a short time to review your code, give your feedback to the LLM and put up something actually production ready.

> But every piece of LLM generated code I've seen to date hasn't been trustworthy once I put in the effort to really understand it.

That's why your code needs tests. More tests. If you can't test it, it's wrong and needs to be rewritten.
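
A rough, hypothetical illustration of that workflow: keep the unfamiliar piece in a small, dedicated function and let the tests, rather than the reviewer's memory, pin down its behaviour (the names here are invented for the example):

  def normalize_email(raw: str) -> str:
      """Small, single-purpose function: easy for an LLM to write and for tests to pin down."""
      return raw.strip().lower()

  # The tests state the behaviour we require, regardless of who (or what) wrote the body.
  def test_strips_whitespace_and_lowercases():
      assert normalize_email("  User@Example.COM ") == "user@example.com"

  def test_is_idempotent():
      once = normalize_email("A@B.C")
      assert normalize_email(once) == once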

xandrius6 hours ago

Keep using it and you'll see. Also that depends on the model and prompting.

My approach is to describe the task in great detail, which also helps me complete my own understanding of the problem, in case I hadn't considered an edge case or how to handle something specific. The more you do that, the closer the result is to your own personal taste, experience and design.

Of course you're trading writing code for writing a prompt, but it's common to make architectural docs before building a sizeable feature; now you can feed those to the LLM instead of just having them be there.

akshay_trikha2 hours ago

I've had this same thought that it would be nice to have an AI rubber ducky to bounce ideas off of while pair programming (so that you don't sound dumb to your coworkers & waste their time).

This is my first comment so I'm not sure how to do this but I made a BYO-API key VSCode extension that uses the OpenAI realtime API so you can have interactive voice conversations with a rubber ducky. I've been meaning to create a Show HN post about it but your comment got me excited!

In the future I want to build features to help people communicate their bugs / what strategies they've tried to fix them. If I can pull it off it would be cool if the AI ducky had a cursor that it could point and navigate to stuff as well.

Please let me know if you find it useful https://akshaytrikha.github.io/deep-learning/2025/05/23/duck...

AdieuToLogic57 minutes ago

> I've had this same thought that it would be nice to have an AI rubber ducky to bounce ideas off of while pair programming (so that you don't sound dumb to your coworkers & waste their time).

I humbly suggest a more immediate concern to rectify is identifying how to improve the work environment such that the fear one might "sound dumb to your coworkers & waste their time" does not exist.

akadeb1 hour ago

I like the sound of that! I think you're gonna like what we are building here https://github.com/akdeb/ElatoAI

It's as if the rubber duck was actually on the desk while you're programming, and if we have an MCP that can get live access to code, it could give you realtime advice.

akshay_trikha31 minutes ago

Wow, that's really cool, thanks for open sourcing! I might dig into your MCP; I've been meaning to learn how to do that.

I genuinely think this could be great for toys that kids grow up with, i.e. the toy could adjust the way it talks depending on the kid's age and remember key moments in their life - could be pretty magical for a kid.

p1necone6 hours ago

> the duck can occasionally disagree

This has not been my experience. LLMs have definitely been helpful, but generally they either give you the right answer or invent something plausible sounding but incorrect.

If I tell it what I'm doing I always get breathless praise, never "that doesn't sound right, try this instead."

crazygringo6 hours ago

That's not my experience. I routinely get a polite "that might not be the optimal solution, have you considered..." when I'm asking whether I should do something X way with Y technology.

Of course it has to be something the LLM actually has lots of training material on. It won't work with anything remotely cutting-edge, but of course that's not what LLMs are for.

But it's been incredibly helpful for me in figuring out the best, easiest, most idiomatic ways of using libraries or parts of libraries I'm not very familiar with.

Jarwain2 hours ago

I find it very much depends on the LLM you're using. Gemini feels more likely to push back than Claude 3.7. Haven't tried Claude 4 yet.

mbrameld4 hours ago

Ask it. Instead of just telling it what you're doing and expecting it to criticize that, ask it directly for criticism. Even better, tell it what you're doing, then tell it to ask you questions about what you're doing until it knows enough to recommend a better approach.

lupire4 hours ago

This is key. Humans each have a personality and some sense of mood. When you ask for help, you choose who to ask, and that person can sense your situation. An LLM has every personality and doesn't know your situation. You have to tell it which personality to use and what your situation is.

_tom_10 hours ago

For me, it's a bit like pair programming. I have someone to discuss ideas with. Someone to review my code and suggest alternative approaches. Someone who uses different features than I do, so I learn from them.

traceroute668 hours ago

I guess if you enjoy programming with someone you can never really trust, then yeah, sure, it's "a bit like" pair programming.

mock-possum3 hours ago

Trust, but verify ;]

platevoltage9 hours ago

This is how I use it too. It's great at quickly answering questions. I find it particularly useful if I have to work with a language or framework that I'm not fully experienced in.

12_throw_away7 hours ago

> I find it particularly useful if I have to work with a language or framework that I'm not fully experienced in

Yep - my number 1 use case for LLMs is as a template and example generator. It actually seems like a fairly reasonable use for probabilistic text generation!

johnnyanmac5 hours ago

>I think the big question everyone wants to skip right to and past this conversation is, will this continue to be true 2 years from now?

For me, it's less "conversation to be skipped" and more about "can we even get to 2 years from now"? There's so much instability right now that it's hard to say what anything will look like in 6 months.

marcosdumay10 hours ago

LLMs will still be this way 10 years from now.

But I don't know whether somebody will create something new that does get better. There is no reason at all to extrapolate our current AIs into something that solves programming. Whatever constraints that new thing will have will be completely unrelated to the current ones.

smokel10 hours ago

Stating this without any arguments is not very convincing.

Perhaps you remember that language models were completely useless at coding some years ago, and now they can do quite a lot of things, even if they are not perfect. That is progress, and that does give reason to extrapolate.

Unless of course you mean something very special with "solving programming".

bigstrat20034 hours ago

> Perhaps you remember that language models were completely useless at coding some years ago, and now they can do quite a lot of things, even if they are not perfect.

IMO, they're still useless today, with the only progress being that they can produce a more convincing facade of usefulness. I wouldn't call that very meaningful progress.

wvenable51 minutes ago

I don't know how someone can legitimately say that they're useless. Perfect, no. But useless, also no.

drdeca2 hours ago

I’ve found them somewhat useful? Not for big things, and not for code for work.

But for small personal projects? Yes, helpful.

marcosdumay9 hours ago

Why state the same arguments everybody has been repeating for ages?

LLMs can only give you code that somebody has written before. This is inherent. This is useful for a bunch of stuff, but that bunch won't change if OpenAI decides to spend the GDP of Germany training one instead of Costa Rica's.

vidarh7 hours ago

> LLMs can only give you code that somebody has written before. This is inherent.

This is trivial to prove to be false.

Invent a programming language that does not exist. Describe its semantics to an LLM. Ask it to write a program to solve a problem in that language. It will not always work, but it will work often enough to demonstrate that they are very much capable of writing code that has never been written before.

The first time I tried this was with GPT3.5, and I had it write code in an unholy combination of Ruby and INTERCAL, and it had no problems doing that.

Similarly giving it a grammar of a hypothetical language, and asking it to generate valid text in a language that has not existed before also works reasonably well.

This notion that LLMs only spit out things that have been written before might have been reasonable to believe a few years ago, but it hasn't been a reasonable position to hold for a long time at this point.

rhubarbtree9 hours ago

That’s not true. LLMs are great translators, they can translate ideas to code. And that doesn’t mean it has to be recalling previously seen text.

Retric9 hours ago

Progress, sure, but the rate they've improved hasn't been particularly fast recently.

Programming has become vastly more efficient in terms of programmer effort over decades, but making some aspects of the job more efficient just means all your effort is spent on what didn't improve.

lexandstuff5 hours ago

People seem to have forgotten how good the 2023 GPT-4 really was at coding tasks.

mirsadm9 hours ago

The latest batch of LLMs has been getting worse in my opinion. Claude in particular seems to be going backwards with every release. The verbosity of the answers is infuriating. You ask it a simple question and it starts by inventing the universe, poorly.

apwell239 hours ago

> Perhaps you remember that language models were completely useless at coding some years ago

No, I don't remember that. They are doing similar things now to what they did 3 yrs ago. They were still a decent rubber duck 3 yrs ago.

vidarh7 hours ago

And 6 years ago GPT2 had just been released. You're being obtuse by interpreting "some years" as specifically 3.

Bukhmanizer9 hours ago

There are a couple people I work with who clearly don’t have a good understanding of software engineering. They aren’t bad to work with and are in fact great at collaborating and documenting their work, but don’t seem to have the ability to really trace through code and logically understand how it works.

Before LLMs it was mostly fine because they just didn’t do that kind of work. But now it’s like a very subtle chaos monkey has been unleashed. I’ve asked on some PRs “why is this like this? What is it doing?” And the answer is “I don’t know, ChatGPT told me I should do it.”

The issue is that it throws basically all their code under suspicion. Some of it works, some of it doesn’t make sense, and some of it is actively harmful. But because the LLMs are so good at giving plausible output I can’t just glance at the code and see that it’s nonsense.

And this would be fine if we were working on like a crud app where you can tell what is working and broken immediately, but we are working on scientific software. You can completely mess up the results of a study and not know it if you don’t understand the code.

protocolture3 hours ago

>And the answer is “ I don’t know, ChatGPT told me I should do it.”

This weirds me out. Like, I use LLMs A LOT, but I always sanity check everything, so I can own the result. It's not the use of the LLM that gets me; it's trying to shift accountability to a tool.

jajko7 hours ago

Sounds almost like you definitely shouldn't use LLMs or those juniors for such important work.

Is it just me, or are we heading into a period with an explosion of software being produced, but also a massive drop in its quality? Not uniformly, just a bit of chaotic spread.

Bukhmanizer5 hours ago

> llms nor those juniors for such an important work.

Yeah we shouldn’t and I limit my usage to stuff that is easily verifiable.

But there’s no guardrails on this stuff, and one thing that’s not well considered is how these things which make us more powerful and productive can be destructive in the hands of well intentioned people.

palmotea7 hours ago

> Is it just me, or are we heading into a period with an explosion of software being produced, but also a massive drop in its quality? Not uniformly, just a bit of chaotic spread.

I think we are, especially with executives mandating LLM use and expecting it to massively reduce costs and increase output.

For the most part they don't actually seem to care that much about software quality, and tend to push to decrease quality at every opportunity.

jrochkind16 hours ago

Which is frightening, because it's not like our industry is known for producing really high quality code at the starting point before LLM authored code.

gerad9 hours ago

It's like chess. Humans are better for now, they won't be forever, but humans plus software is going to be better than either alone for a long time.

seadan838 hours ago

> It's like chess. Humans are better for now, they won't be forever

This is not an obviously true statement. There needs to be proof that there are no limiting factors that are computationally impossible to overcome. It's like watching a growing child, grow from 3 feet to 4 feet, and then saying "soon, this child will be the tallest person alive."

overfeed7 hours ago

One of my favourite XKCD comics is about extrapolation https://xkcd.com/605/

kelseydh9 hours ago

The time where humans + computers in chess were better than just computers was not a long time. That era ended well over a decade ago. Might have been true for only 3-5 years.

qsort9 hours ago

Unrelated to the broader discussion, but that's an artifact of the time control. Humans add nothing to Stockfish in a 90+30 game, but correspondence chess, for instance, is played with modern engines and still has competitive interest.

dwohnitmok8 hours ago

It is not clear to me whether human input really matters in correspondence chess at this point either.

I mused about this several years ago and still haven't really gotten a clear answer one way or the other.

https://news.ycombinator.com/item?id=33022581

LandR8 hours ago

What do you mean? Chess engines are incredibly far ahead of humans right now.

Even a moderately powered machine running stockfish will destroy human super gms.

Sorry, after reading replies to this post i think I've misunderstood what you meant :)

hollerith8 hours ago

I think he knows that. There was a period from the early 1950s (when people first started writing chess-playing software) to 1997 when humans were better at chess than computers were, and I think he is saying that we are still in the analogous period for the skill of programming.

But he should've known that people would jump at the opportunity to contradict him and should've written his comment so as not to admit such an easily-contradictable interpretation.

LandR8 hours ago

Yes, amended my post. I understand what he was saying now. Thanks.

Wasn't trying to just be contradictory or arsey

seadan838 hours ago

The phrasing was perhaps a bit odd. For a while, humans were better at Chess, until they weren't. OP is hypothesizing it will be a similar situation for programming. To boot, it was hard to believe for a long time that computers would ever be better than a humans at chess.

apwell239 hours ago

It's not like chess.

quantadev3 hours ago

Your information is quite badly out of date. AI can now beat humans at not only chess but 99% of all intellectual exercises.

vFunct9 hours ago

No guarantee that will happen. LLMs are still statistically based. It's not going to give you edgier ideas, like filling a glass of wine to the rim.

Use them for the 90% of your repetitive uncreative work. The last 10% is up to you.

skydhash5 hours ago

The pain of that 90% work is how you get libraries and frameworks. Imagine having many different implementations of sorting algorithms inside your codebase.

vFunct3 hours ago

OK now we have to spend time figuring out the framework.

It's why people say just write plain Javascript, for example.

bandoti5 hours ago

My take is that AI is very one-dimensional (within its many dimensions). For instance, I might close my eyes and imagine an image of a tree structure, or a hash table, or a list-of-trees, or whatever else; then I might imagine grabbing and moving the pieces around, expanding or compressing them like a magician; my brain connects sight and sound, or texture, to an algorithm. However people think about problems is grounded in how we perceive the world in its infinite complexity.

Another example: saying out loud the colors red, blue, yellow, purple, orange, green—each color creates a feeling that goes beyond its physical properties into emotions and experiences. AI image-generation might know the binary arrangement of an RGBA image, but it has NO IDEA what it is to experience colour. No idea how to use the experience of colour to teach a peer about an algorithm. It regurgitates a binary representation.

At some point we’ll get there though—no doubt. It would be foolish to say never! Those who want to get there before everyone else should probably focus on the organoids—because most powerful things come from some Faustian monstrosity.

eddd-ddde5 hours ago

This is really funny to read as someone who CANNOT imagine anything more complex than the most simple shape like a circle.

Do you actually see a tree with nodes that you can rearrange and have the nodes retain their contents and such?

bandoti4 hours ago

Haha—yeah, for me the approach is always visual. I have to draw a picture to really wrap my brain around things! Other people I’d imagine have their own human, non-AI way to organize a problem space. :)

I have been drawing all my life and studied traditional animation though, so it’s probably a little bit of nature and nurture.

Waterluvian6 hours ago

Same. Just today I used it to explore how a REST api should behave in a specific edge case. It gave lots of confident opinions on options. These were full of contradictions and references to earlier paragraphs that didn’t exist (like an option 3 that never manifested). But just by reading it, I rubber ducked the solution, which wasn’t any of what it was suggesting.

joshdavham9 hours ago

> I actually think a fair amount of value from LLM assistants to me is having a reasonably intelligent rubber duck to talk to.

I wonder if the term "rubber duck debugging" will still be used much longer into the future.

layer86 hours ago

As long as it remains in the training material, it will be used. ;)

ortusdux8 hours ago

> I think the big question everyone wants to skip right to and past this conversation is, will this continue to be true 2 years from now? I don’t know how to answer that question.

I still think about Tom Scott's 'where are we on the AI curve' video from a few years back. https://www.youtube.com/watch?v=jPhJbKBuNnA

empath758 hours ago

Just the exercise of putting my question in a way that the LLM could even theoretically provide a useful response is enough for me to figure out how to solve the problem a good percentage of the time.

cortesoft9 hours ago

Currently, I find AI to be a really good autocomplete

jdiff9 hours ago

The crazy thing is that people think a model designed to predict sequences of tokens from a stem, no matter how advanced the model, is much more than just "really good autocomplete."

It is impressive and very unintuitive just how far that can get you, but it's not reductive to use that label. That's what it is on a fundamental level, and aligning your usage with that will allow it to be more effective.

lavelganzu7 hours ago

There's a plausible argument for it, so it's not a crazy thing. You as a human being can also predict likely completions of partial sentences, or likely lines of code given surrounding lines of code, or similar tasks. You do this by having some understanding of what the words mean and what the purpose of the sentence/code is likely to be. Your understanding is encoded in connections between neurons.

So the argument goes: LLMs were trained to predict the next token, and the most general solution to do this successfully is by encoding real understanding of the semantics.

vidarh7 hours ago

It's trivial to demonstrate that it takes only a tiny LLM + a loop to have a Turing complete system. The extension of that is that it is utterly crazy to think that the fact it is "a model designed to predict sequences of tokens" puts much of a limitation on what an LLM can achieve - any Turing complete system can by definition simulate any other. To the extent LLMs are limited, they are limited by training and compute.

But these endless claims that the fact they're "just" predicting tokens means something about their computational power are based on flawed assumptions.
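
For what it's worth, here is a sketch of the shape of that argument, with a fixed rule table standing in for whatever a model call would return (so it illustrates the loop, not any particular LLM): the "model" only ever predicts one small step, and the surrounding loop plus tape supplies unbounded memory.

  # One "model call" per step: given (state, symbol), return (new_state, symbol_to_write, move).
  # In the actual argument an LLM is prompted to produce this transition; here it's a fixed table.
  RULES = {
      ("scan", 0): ("scan", 1, +1),   # flip 0 -> 1 and move right
      ("scan", 1): ("halt", 1, 0),    # stop at the first 1
  }

  def run(initial_tape, state="scan", pos=0, max_steps=100):
      tape = dict(enumerate(initial_tape))   # a dict gives an unbounded tape
      for _ in range(max_steps):
          if state == "halt":
              break
          state, write, move = RULES[(state, tape.get(pos, 0))]   # the "model" step
          tape[pos] = write
          pos += move
      return [tape[i] for i in sorted(tape)]

  print(run([0, 0, 1]))   # -> [1, 1, 1]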

suddenlybananas5 hours ago

The fact they're Turing complete isn't really getting at the heart of the problem. Python is Turing complete and calling python "intelligent" would be a category error.

fl73059 hours ago

> "The crazy thing is that people think that a model designed to"

It's even crazier that some people believe that humans "evolved" intelligence just by nature selecting the genes which were best at propagating.

Clearly, human intelligence is the product of a higher being designing it.

/s

dwaltrip6 hours ago

It’s reductive and misleading because autocomplete, as it’s commonly known, existed for many years before generative AI, and is very different and quite dumber than LLMs.

sunrunner8 hours ago

Earlier this week ChatGPT found (self-conscious as I am of the personification of this phrasing) a place where I'd accidentally overloaded a member function by unintentionally giving it the name of something from a parent class, preventing the parent class function from ever being run and causing <bug>.

After walking through a short debugging session where it tried the four things I'd already thought of and eventually suggested (assertively but correctly) where the problem was, I had a resolution to my problem.

There are a lot of questions I have around how this kind of mistake could simply just be avoided at a language level (parent function accessibility modifiers, enforcing an override specifier, not supporting this kind of mistake-prone structure in the first place, and so on...). But it did get me unstuck, so in this instance it was a decent, if probabilistic, rubber duck.
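
The comment doesn't say which language this was, but a minimal Python sketch of the same failure mode (a subclass method accidentally reusing a parent method's name, so the parent's version never runs) might look like this:

  class Widget:
      def refresh(self):
          self._reload_data()      # parent behaviour the subclass still relies on

      def _reload_data(self):
          print("reloading data")

  class FancyWidget(Widget):
      # Intended as a brand-new helper, but the name collides with the parent's method,
      # so Widget.refresh() now calls this instead and the data is never reloaded.
      def _reload_data(self):
          print("redrawing border only")

  FancyWidget().refresh()          # prints "redrawing border only", not "reloading data"

This is exactly the kind of collision an enforced override specifier is meant to surface.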

amelius5 hours ago

It's also quite good at formulating regular expressions based on one or two example strings.
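
For instance, given a couple of example strings like "2024-01-31" and "1999-12-05", the kind of pattern it typically hands back (hypothetical, but representative) takes only a few lines to sanity-check:

  import re

  DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

  for sample in ["released 2024-01-31", "since 1999-12-05", "no date here"]:
      m = DATE.search(sample)
      print(sample, "->", m.groups() if m else None)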

mock-possum3 hours ago

Yeah in my experience as long as you don’t stray too far off the beaten path, LLMs are great at just parroting conventional wisdom for how to implement things - but the second you get to something more complicated - or especially tricky bug fixing that requires expensive debuggery - forget about it, they do more harm than good. Breaking down complex tasks into bite sized pieces you can reasonably expect the robot to perform is part of the art of the LLM.

bossyTeacher9 hours ago

I think of them as a highly sycophantic, LSD-minded 2nd-year student who has done some programming.

koonsolo7 hours ago

It seems to me we're at the flat side of the curve again. I haven't seen much real progress in the last year.

It's ignorant to think machines will not catch up to our intelligence at some point, but for now, they clearly haven't.

I think there needs to be some kind of revolutionary breakthrough again to reach the next stage.

If I were to guess, it needs to be in the learning/back propagation stage. LLMs are very rigid, and once they go wrong, you can't really get them out of it. A junior developer, for example, could gain a new insight. LLMs, not so much.

UncleOxidant9 hours ago

There's some whistling past the graveyard in these comments. "You still need humans for the social element...", "LLMs are bad at debugging", "LLMs lead you astray". And yeah, there's lots of truth in those assertions, but since I started playing with LLMs to generate code a couple of years ago they've made huge strides. I suspect that over the next couple of years the improvements won't be quite as large (Pareto Principle), but I do expect we'll still see some improvement.

Was on r/fpga recently and mentioned that I had had a lot of success in getting LLMs to code up first-cut testbenches that allow you to simulate your FPGA/HDL design a lot quicker than if you were to write those testbenches yourself, and my comment was met with lots of derision. But they hadn't even given it a try to form their conclusion that it just couldn't work.

xhevahir8 hours ago

This attitude is depressingly common in lots of professional, white-collar industries I'm afraid. I just came from the /r/law subreddit and was amazed at the kneejerk dismissal there of Dario Amodei's recent comments about legal work, and of those commenters who took them seriously. It's probably as much a coping mechanism as it is complacency, but, either way, it bodes very poorly for our future efforts at mitigating whatever economic and social upheaval is coming.

garciasn7 hours ago

This is the response to most new technologies; folks simply don't want to accept the future before the ramifications truly hit. If technology folk cannot see the INCREDIBLE LEAP FORWARD made by LLMs since ChatGPT came on the market, they're not seeing the forest for the trees because their heads are buried in the sand.

LLMs for coding are not even close to perfect yet, but the saturation curves are not flattening out; not by a long shot. We are living in a moment and we need to come to terms with it as the work continues to develop; and we need to adapt, and quickly, in order to better understand what our place will become as this nascent tech continues its meteoric trajectory toward an entirely new world.

eikenberry5 hours ago

I don't think it is only (or even mostly) not wanting to accept it, I think it is at least equal measure just plain skepticism. We've seen all sorts of wild statements about how much something is going to revolutionize X and then turns out to be nothing. Most people disbelieve these sorts of claims until they see real evidence for themselves... and that is a good default position.

const_cast3 hours ago

Lawyers don't even use version control software a lot of the time. They burn hundreds of paralegal hours reconciling revisions, a task that could be made 100x faster and easier with Git.

There's no guarantee a technology will take off, even if it's really, really good. Because we don't decide if that tech takes off - the lawyers do. And they might not care, or they might decide billing more hours is better, actually.

heartbreak1 hour ago

> billing more hours is better, actually

The guiding principle of biglaw.

Attorneys have the bar to protect them from technology they don’t want. They’ve done it many times before, and they’ll do it again. They are starting to entertain LLMs, but not in a way that would affect their billable hours.

dgfitz2 hours ago

“First thing we do, let’s kill all the lawyers”

History majors everywhere are weeping.

ben-schaaf3 hours ago

Friendly reminder that people like you were saying the exact same thing about metaverse, VR, web3, crypto, etc.

abootstrapper36 minutes ago

I didn’t buy the hype of any of those things, but I believe AI is a going to change everything much like the introduction of the internet. People are dismissing AI because its code is not bug free, completely dismissing the fact that it generates PRs in minutes from a poorly written text prompt. As if that’s not impressive. In fact if you put a human engineer on the receiving end of the same prompt with the same context as what we’re sending to the LLM, I doubt they could produce code half as good in 10x the time. It’s science fiction coming true, and it’s only going to continue to improve.

drodgers37 minutes ago

Yes. If you judge only from the hype, then you can't distinguish LLMs from crypto, or Nuclear Weapons from Nuclear Automobiles.

If you always say that every new fad is just hype, then you'll even be right 99.9% of the time. But if you want to be more valuable than a rock (https://www.astralcodexten.com/p/heuristics-that-almost-alwa...), then you need to dig into the object-level facts and form an opinion.

In my opinion, AI has a much higher likelihood of changing everything very quickly than crypto or similar technologies ever did.

bgwalter7 hours ago

Adapt to your manager at bigcorp who is hyping the tech because it gives him something to do? No open source project is using the useless LLM shackles.

spamizbad6 hours ago

I think it's pretty reasonable to take a CEO's - any CEO in any industry - statements with a grain of salt. They are under tremendous pressure to paint the most rosy picture possible of their future. They actually need you to "believe" just as much as their team needs to deliver.

sanderjd2 hours ago

Isn't this also kind of just ... a reddit thing?

golergka5 hours ago

Lawyers say those things and then one law firm after another is frantically looking for a contractor to overpay them to install local RAG and chatbot combo.

layer86 hours ago

Programmers derided programming languages (too inefficient, too inflexible, too dumbing-down) when assembly was still the default. That phenomenon is at the same time entirely to be expected but also says little about the actual qualities of the new technology.

ch4s38 hours ago

It seems like LLMs made really big strides for a while but don't seem to be getting better recently, and in some ways recent models feel a bit worse. I'm seeing some good results generating test code, and some really bad results when people go too far with LLM use on new feature work. Based on what I've seen, it seems like spinning up new projects and very basic features for web apps works really well, but that doesn't seem to generalize to refactoring or adding new features to big/old code bases.

I've seen Claude and ChatGPT happily hallucinate whole APIs for D3 on multiple occasions, which should be really well represented in the training sets.

soerxpso6 hours ago

> hallucinate whole APIs for D3 on multiple occasions, which should be really well represented in the training sets

With many existing systems, you can pull documentation into context pretty quickly to prevent the hallucination of APIs. In the near future it's obvious how that could be done automatically. I put my engine on the ground, ran it and it didn't even go anywhere; Ford will never beat horses.

prisenco6 hours ago

It's true that manually constraining an LLM with contextual data increases their performance on that data (and reduces performance elsewhere), but that conflicts with the promise of AI as an everything machine. We were promised an everything machine but if we have to not only provide it the proper context, but already know what constitutes the proper context, then it is not in any way an everything machine.

Which means it's back to being a very useful tool, but not the earth-shattering disruptor we hoped (or worried) it would be.

roywiggins57 minutes ago

Depends on how good they get at realizing they need more context and tool use to look it up for you.

empath758 hours ago

the LLM's themselves are making marginal gains, but the tools for using LLMs productively are getting so much better.

dinfinity7 hours ago

This. MCP/tool usage in agentic mode is insanely powerful. Let the agent ingest a Gitlab issue, tell it how it can run commands, tests etc. in the local environment and half of the time it can just iterate towards a solution all by itself (but watching and intervening when it starts going the wrong way is still advisable).

Recently I converted all the (Google Docs) documentation of a project to markdown files and added those to the workspace. It now indexes it with RAG and can easily find relevant bits of documentation, especially in agent mode.

It really stresses the importance of getting your documentation and processes in order as well as making sure the tasks at hand are well-specified. It soon might be the main thing that requires human input or action.
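
Stripped of the specific tooling, the retrieval half of that setup is conceptually simple. A deliberately naive sketch (plain keyword overlap instead of real embeddings, and a hypothetical docs/ folder of markdown files) of "find the relevant bits of documentation for a query":

  from pathlib import Path

  def chunks(text, size=60):
      """Split a document into overlapping word chunks."""
      words = text.split()
      return [" ".join(words[i:i + size]) for i in range(0, len(words), size // 2)]

  def top_chunks(query, docs_dir="docs", k=3):
      """Rank chunks from all markdown files by crude keyword overlap with the query."""
      q = set(query.lower().split())
      scored = []
      for path in Path(docs_dir).glob("*.md"):
          for chunk in chunks(path.read_text()):
              overlap = len(q & set(chunk.lower().split()))
              scored.append((overlap, path.name, chunk))
      return sorted(scored, reverse=True)[:k]

A real setup swaps the overlap score for embedding similarity, but the loop is the same: chunk the docs, score chunks against the query, and hand the top hits to the model as context.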

ch4s32 hours ago

Every time I’ve tried to do that it takes longer than it would take me, and comes up with fairly obtuse solutions. The cursor agent seems incapable of putting code in the appropriate files in a functional language.

max_on_hn4 hours ago

I 100% agree that documenting requirements will be the main human input to software development in the near future.

In fact, I built an entirely headless coding agent for that reason: you put tasks in, you get PRs out, and you get journals of each run for debugging but it discourages micro-management so you stay in planning/documenting/architecting.

bgwalter9 hours ago

Yet you are working on your own replacement, while your colleagues are taking the prudent approach.

Jolter9 hours ago

Here’s the deal: if you won’t write your replacement, a competitor will do it and outprice your employer. Either way you’re out of a job. May be more prudent to adapt to the new tools and master them rather than be left behind?

Do you want to be a jobless weaver, or an engineer building mechanical looms for a higher pay than the weaver got?

bgwalter8 hours ago

I want to be neither. I either want to continue being a software engineer who doesn't need a tricycle for the mind, or move to law or medicine; two professions that have successfully defended themselves against extreme versions of the kind of anxiety, obedience and self hate that is so prevalent among software engineers.

xandrius6 hours ago

Nobody is preventing people writing in Assembly, even though we have more advanced language.

You could even go back to punch cards if you want to. Literally nobody forcing you to not use it for your own fun.

But LLMs are a multiplier in many mundane tasks (I'd say about 80+% of software development for businesses), so not using them is like fighting against using a computer because you like writing by hand.

nssnsjsjsjs2 hours ago

That grass is not #00FF00 there. Cory's recent essay on Uber for nurses (doctors are next) and law is only second to coding on the AI disruptors' radar, plus both law and medicine have unfriendly hours for the most part.

Happy to hate myself but earn OK money for OK hours.

empath757 hours ago

Funnily enough, I had a 3 or 4 hour chat with some co workers yesterday about an LLM related project and my feeling about LLM's is that it's actually opening up a lot of fun and interesting software engineering challenges if you want to figure out how to automate the usage of LLM's.

allturtles8 hours ago

I think it's the wrong analogy. The prompt engineer who uses the AI to make code maps to the poorly-paid, low-skill power loom machine tender. The "engineer" is the person who created the model. But it's also not totally clear to me that we'll need humans for that either, in the near future.

91bananas8 hours ago

Not all engineering is creating models though, sometimes there are simpler problems to solve.

nialse9 hours ago

Ahh, the “don’t disturb the status quo” argument. See, we are all working on our replacement, newer versions, products, services and knowledge always make the older obsolete. It is wise to work on your replacement, and even wiser to be in charge of and operate the replacement.

bgwalter9 hours ago

No, nothing fundamentally new is created. Programmers have always been obsessed with "new" tooling and processes to distract from that fact.

"AI" is the latest iteration of snake oil that is foisted upon us by management. The problem is not "AI" per se, but the amount of of friction and productivity loss that comes with it.

Most of the productivity loss comes from being forced to engage with it and push back against that nonsense. One has to learn the hype language, debunk it, etc.

Why do you think IT has gotten better? Amazon had a better and faster website with far better search and products 20 years ago. No amount of "AI" will fix that.

nialse7 hours ago

Maybe it would be useful to zoom out a bit. We're in a time of technological change, and change is gonna come. Maybe it isn't your job that will change, maybe it is? Maybe it's not even about you or what you do. More likely it's the processes that will change around you. Maybe it's not change for better or worse. Maybe it's just change. But it's gonna change.

palmotea7 hours ago

> It is wise to work on your replacement...

Depends on the context. You have to keep in mind: it is not a goal of our society or economic system to provide you with a stable, rewarding job. In fact, the incentives are to take that away from you ASAP.

Before software engineers go celebrate this tech, they need realize they're going to end up like rust-belt factory workers the day after the plant closed. They're not special, and society won't be any kinder to them.

> ...and even wiser to be in charge of and operate the replacement.

You'll likely only get to do that if your boss doesn't know about it.

npteljes8 hours ago

Cartelization doesn't work bottom-up. When changes begin (like this one with AI), one of the things an individual can do is change course as fast as they can. There are other strategies as well, not evolving is also one, but some strategies yield better results than others. Not keeping up just worsens the chances, I have found.

asdff8 hours ago

It does when it is called unionizing, however for some reason software developers have a mental block towards the concept.

dughnut2 hours ago

The reason might be that union members give a percentage of their income to a governing body, barely distinct from organized crime, in which they have no say. The federal government already exists. You really want more boots on your neck?

BlackSwanMan7 hours ago

[dead]

JeremyNT8 hours ago

I don't think that this should be downvoted because it raises a really important issue.

I hate AI code assistants, not because they suck, but because they work. The writing is on the wall.

If we aren't working on our own replacements, we'll be the ones replaced by somebody else's vibe code, and we have no labor unions that could plausibly fight back against this.

So become a Vibe Coder and keep working, or take the "prudent" approach you mention - and become unemployed.

neta13376 hours ago

I’ll work on fixing the vibe coders’ mess and make bank. Experience will prove even more valuable than before

realusername6 hours ago

Personally I used them for a while and then just stopped using them because actually no, unfortunately those assistants don't work. They appear to work at first glance but there's so much babysitting needed that it's just not worth it.

This "vibe coding" seems just another way to say that people spend more time refining the output of these tools over and over again that what they would normally code.

+2
JeremyNT5 hours ago
dughnut5 hours ago

Do you want to work with LLMs or H1Bs and interns… choose wisely.

Personally I’m thrilled that I can get trivial, one-off programs developed for a few cents and the cost of a clear written description of the problem. Engaging internal developers or consulting developers to do anything at all is a horrible experience. I would waste weeks on politics, get no guarantees, and waste thousands of dollars and still hear nonsense like, “you want a form input added to a web page? Aw shucks, that’s going to take at least another month” or “we expect to spend a few days a month maintaining a completely static code base” from some clown billing me $200/hr.

rsyring4 hours ago

You can work with consulting oriented engineers who get shit done with relatively little stress and significant productivity. Productivity enhanced by AI but not replaced by it. If interested, reach out to me.

cushychicken8 hours ago

ChatGPT-4o is scary good at writing VHDL.

Using it to prototype some low level controllers today, as a matter of fact!

UncleOxidant7 hours ago

Claude and Gemini are decent at it as well. I was surprised when I asked claude (and this was several months back) to come up with a testbench for some very old, poorly documented verilog. It did a very decent job for a first-cut testbench. It even collected common, recurring code into verilog tasks (functions) which really surprised me at the time.

roflyear8 hours ago

It's better-than-senior at a some things, but worse-than-junior at a lot of things.

quantadev3 hours ago

It's more like better-than-senior 99% of the time. Makes mistakes 1% of the time. Most of the 'bad results' I've seen people struggle with ended up being the fault of the human, in the form of horrible context given to the AI or else ambiguous or otherwise flawed prompts.

Any skilled developer with a decade of experience can write prompts that return precisely what we wanted almost every single time. I do it all day long. "Claude 4" rarely messes up.

retetr2 hours ago

Unrelated, but is this a case of the Pareto Principle? (Admittedly the first time I'm hearing of it) Wherein 80% of the effect is caused by 20% of the input. Or is this more a case of diminishing returns? Where the initial results were incredible, but each succeeding iteration seems to be more disappointing?

klabb32 hours ago

Pareto is about diminishing returns.

> but each succeeding iteration seems to be more disappointing

This is because the scaling hypothesis (more data and more compute = gains) is plateauing, because all the text data has been used and compute is reaching diminishing returns for reasons I’m not smart enough to explain, but it is.

So now we're seeing incremental core model advancements, variations and tuning in pre- and post training stages and a ton of applications (agents).

This is good imo. But obviously it’s not good for delusional valuations based on exponential growth.

parliament325 hours ago

I'd like to agree with you and remain optimistic, but so much tech has promised the moon and stagnated into oblivion that I just don't have any optimism left to give. I don't know if you're old enough, but remember when speech-to-text was the next big thing? Dragon NaturallySpeaking was released in 1997, everyone was losing their minds about dictating letters/documents in MS Word, and we were promised that THIS would be the key interface for computing evermore. And.. 27 years later, talking to the latest Siri, it makes just as many mistakes as it did back then. In messenger applications people are sending literal voice notes -- audio clips -- back and forth because dictation is so unreliable. And audio clips are possibly the worst interface for communication ever (no searching, etc).

Remember how blockchain was going to change the world? Web3? IoT? Etc etc.

I've been through enough of these cycles to understand that, while the AI gimmick is cool and all, we're probably at the local maximum. The reliability won't improve much from here (hallucinations etc), while the costs to run it will stay high. The final tombstone will be when the AI companies stop running at a loss and actually charge for the massive costs associated with running these models.

some_random5 hours ago

How can you possibly look at what LLMs are doing and the progress made in the last ~3 years and equate it to crypto bullshit? Also it's super weird to include IoT in there, seeing as it has become all but ubiquitous.

r14c4 hours ago

I'm not as bearish on AI, but it's hard to tell if you can really extrapolate future performance based on past improvements.

Personally, I'm more interested in the political angle. I can see that AI will be disruptive because there's a ton of money and possibly other political outcomes depending on it doing exactly that.

ChrisMarshallNY2 hours ago

Really good coders (like him) are better.

Mediocre ones … maybe not so much.

When I worked for a Japanese optical company, we had a Japanese engineer, who was a whiz. I remember him coming over from Japan, and fixing some really hairy communication bus issues. He actually quit the company, a bit after that, at a very young age, and was hired back as a contractor; which was unheard of, in those days.

He was still working for them, as a remote contractor, at least 25 years later. He was always on the “tiger teams.”

He did awesome assembly. I remember when the PowerPC came out, and “Assembly Considered Harmful,” was the conventional wisdom, because of pipelining, out-of-order instructions, and precaching, and all that.

His assembly consistently blew the doors off anything the compiler did. Like, by orders of magnitude.

yua_mikami10 hours ago

The thing everyone forgets when talking about LLMs replacing coders is that there is much more to software engineering than writing code, in fact that's probably one of the smaller aspects of the job.

One major aspect of software engineering is social, requirements analysis and figuring out what the customer actually wants, they often don't know.

If a human engineer struggles to figure out what a customer wants and a customer struggles to specify it, how can an LLM be expected to?

malfist9 hours ago

That was also one of the challenges during the offshoring craze in the 00s. The offshore teams did not have the power or knowledge to push back on things, and just built and built and built. Sounds very similar to AI, right?

Probably going to have the same outcome.

pandastronaut9 hours ago

I tend to see today's AI Vibrators as the managers of the 00s and their army of offshore devs.

9dev6 hours ago

Did you actually mean to say AI Vibrators?

mreid5 hours ago

I'm guessing it is a derogatory pun, alluding to vibe coders.

+1
platevoltage5 hours ago
hathawsh9 hours ago

The difference is that when AI exhibits behavior like that, you can refine the AI or add more AI layers to correct it. For example, you might create a supervisor AI that evaluates when more requirements are needed before continuing to build, and a code review AI that triggers refinements automatically.

nevertoolate9 hours ago

The question is how autonomous decision making works. Nobody disputes that an LLM can finish any sentence, but can it push a red button?

+1
johnecheck6 hours ago
devjab9 hours ago

LLMs do no software engineering at all, and that can be fine. Because you don't actually need software engineering to create successful programs. Some applications will not even need software engineering for their entire life cycles, because nobody is really paying attention to efficiency in the ocean of poor cloud management anyway.

I actually imagine it's the opposite of what you say here. I think technically inclined "IT business partners" will be capable of creating applications entirely without software engineers... Because I see that happen every day in the world of green energy. The issues come later, when things have to be maintained, scale or become efficient. This is where the software engineering comes in, because it actually matters if you used a list or a generator in your Python app when it iterates over millions of items and not just a few hundred.
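
To make that concrete, here's a toy sketch (my own hypothetical numbers and names, not anything from a real codebase): the list version materializes every intermediate value in memory before summing, the generator version streams them one at a time, and the difference only starts to matter once you're iterating over millions of items.

    # Hypothetical example: summing squares of a large stream of readings.
    readings = range(10_000_000)  # stand-in for a big data source

    # List comprehension: builds all ten million intermediate values in memory first.
    total_from_list = sum([r * r for r in readings])

    # Generator expression: values are produced and consumed one at a time,
    # so memory stays flat regardless of how many items there are.
    total_from_gen = sum(r * r for r in readings)

    assert total_from_list == total_from_gen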

AstroBen7 hours ago

That's the thing too right.. the vast majority of software out there barely needs to scale or be super efficient

It does need to be reliable, though. LLMs have proven very bad at that

ilaksh3 hours ago

It actually comes down to feedback loops which means iterating on software being used or attempting to be used by the customer.

Chat UIs are an excellent customer feedback loop. Agents develop new iterations very quickly.

LLMs can absolutely handle abstractions and different kinds of component systems and overall architecture design.

They can also handle requirements analysis. But it comes back to iteration for the bottom line which means fast turnaround time for changes.

The robustness and IQ of the models continue to be improved. All of software engineering is well underway of being automated.

Probably five years max where un-augmented humans are still generally relevant for most work. You are going to need deep integration of AI into your own cognition somehow in order to avoid just being a bottleneck.

victorbjorklund9 hours ago

Yea, this is why I dont buy the "all developers will disappear". Will I write a lot less code in 5 years (maybe almost none)? Sure, I already type a lot less now than a year ago. But that is just a small part of the process.

xandrius6 hours ago

Exactly. Also, today I can actually believe I could finish a game that might have taken much longer before LLMs, just because now I can be pretty sure I won't get stuck on some feature just because I've never done it before.

elzbardico9 hours ago

No. the scope will just increase to occupy the space left by LLMs. We will never be allowed to retire.

bbarn7 hours ago

The thing is, it is replacing _coders_ in a way. There are millions of people who do (or did) the work that LLMs excel at. Coders who are given a ticket that says "Write this API taking this input and giving this output" who are so far down the chain they don't even get involved in things like requirements analysis, or even interact with customers.

Software engineering, is a different thing, and I agree you're right (for now at least) about that, but don't underestimate the sheer amount of brainless coders out there.

callc3 hours ago

That sounds more like a case against a highly ossified waterfall development process than anything.

I would argue it’s a good thing to replace the actual brainless activities.

rowanG0779 hours ago

I think LLMs are better at requirement elicitation than they are at actually writing code.

wanderingstan10 hours ago

“Better” is always task-dependent. LLMs are already far better than me (and most devs I’d imagine) at rote things like getting CSS syntax right for a desired effect, or remembering the right way to invoke a popular library (e.g. fetch)

These little side quests used to eat a lot of my time and I’m happy to have a tool that can do these almost instantly.

jaccola10 hours ago

I've found LLMs particularly bad for anything beyond basic styling since the effects can be quite hard to describe and/or don't have a universal description.

Also, there are often times multiple ways to achieve a certain style and they all work fine until you want a particular tweak, in which case only one will work and the LLM usually gets stuck in one of the ones that does not work.

danielbln8 hours ago

Multi modal LLMs to the rescue. Throw a screenshot or mockup in there and tell the LLM "there, like this". Gemini can do the same with videos.

karn973 hours ago

Still terrible results. Multimodal doesn't mean it actually understands the image.

gherkinnn9 hours ago

I have found it to be good at things I am not very strong at (SQL) but terrible at the things I know well (CSS).

Telling, isn't it?

mywittyname5 hours ago

Ironically, I find it strong at things I don't know very well (CSS), but terrible at things I know well (SQL).

This is probably really just a way of saying, it's better at simple tasks rather than complex ones. I can eventually get Copilot to write SQL that's complex and accurate, but I don't find it faster or more effective than writing it myself.

ehansdais2 hours ago

Actually, you've reinforced their point. It's only bad at things the user is actually good at because the user actually knows enough in that domain to find the flaws and issues. It appears to be good in domains the user is bad at because the user doesn't know any better. In reality, the LLM is just bad at all domains; it's simply whether a user has the skill to discern it. Of course, I don't believe it's as black and white as that but I just wanted to point it out.

ch4s38 hours ago

I kind of agree. It feels like they're generally a superior form of copying and pasting from Stack Overflow, where the machine has automated the searching, copying, pasting, and fiddling with variable names. It can be just as useful or dangerous as Google -> Copy -> Paste ever was, but faster.

sanderjd2 hours ago

Funny, I find it to be good at things I'm not very strong at (CSS) but terrible at the things I know well (SQL). :)

Actually I think it's perfectly adequate at SQL too.

sanderjd2 hours ago

Yeah, this is what I really like about AI tools though. They're way better than me at annoying minutia like getting CSS syntax right. I used to dread that kind of thing!

kccqzy10 hours ago

> and most devs I’d imagine

What an awful imagination. Yes there are people who don't like CSS but are forced to use it by their job so they don't learn it properly, and that's why they think CSS is rote memorization.

But overall I agree with you that if a company is too cheap to hire a person who is actually skilled at CSS, it is still better to foist that CSS job onto LLMs than onto an unwilling human. Because that unwilling human is not going to learn CSS well and won't enjoy writing CSS.

On the other hand, if the company is willing to hire someone who's actually good, LLMs can't compare. It's basically the old argument of LLMs only being able to replace less good developers. In this case, you admitted that you are not good at CSS and LLMs are better than you at CSS. It's not task-dependent, it's skill-dependent.

marcosdumay10 hours ago

Hum... I imagine LLMs are better than every developer on getting CSS keywords right like the GP pointed. And I expect every LLM to be slightly worse than most classical autocompletes.

skydhash9 hours ago

Getting CSS keywords right is not the actual point of writing CSS. And you can have a linter that helps you in that regard. The endgame of writing CSS is to style an HTML page according to the specifications of a design. Which can be as detailed as a figma file or as flimsy as a drawing on a whiteboard.

michaelsalim6 hours ago

This is like saying that LLMs are better at knowing the name of that one obscure API. It's not wrong, but it's also not the hard part about CSS

klabb32 hours ago

Wait until they hear how good dictionaries are at spelling.

lelandfe9 hours ago

I'm one of those weirdos who really likes handwriting CSS. I frequently find ChatGPT getting my requests wrong.

jjgreen9 hours ago

... even better with a good fountain pen ...

zdragnar10 hours ago

I think that's great if it's for something outside of your primary language. I've used it to good effect in that way myself. However, denying yourself the reflexive memory of having learned those things is a quick way to become wholly dependent upon the tool. You could easily end up with compromised solutions because the tool recommends something you don't understand well enough to know there's a better way to do something.

dpkirchner10 hours ago

You're right, however I think we've already gone through this before. Most of us (probably) couldn't tell you exactly how an optimizing compiler picks optimizations or exactly how JavaScript maps to processor instructions, etc -- we hopefully understand enough at one level of abstraction to do our jobs. Maybe LLM driving will be another level of abstraction, when it gets better at (say) architecting projects.

skydhash9 hours ago

> Most of us (probably) couldn't tell you exactly how an optimizing compiler picks optimizations or exactly how JavaScript maps to processor instructions,

That's because other people are making those work well. It's like how you don't care about how the bread is being made because you trust your baker (or the regulations). It's a chain of trust that is easily broken when LLMs are brought in.

+1
danielbln8 hours ago
AnimalMuppet10 hours ago

So here's an analogy. (Yeah, I know, proof by analogy is fraud. But it's going to illustrate the question.)

Here's a kid out hoeing rows for corn. He sees someone planting with a tractor, and decides that's the way to go. Someone tells him, "If you get a tractor, you'll never develop the muscles that would make you really great at hoeing."

Different analogy: Here's someone trying to learn to paint. They see someone painting by numbers, and it looks a lot easier. Someone tells them, "If you paint by numbers, you'll never develop the eye that you need to really become good as a painter."

Which is the analogy that applies, and what makes it the right one?

I think the difference is how much of the job the tool can take over. The tractor can take over the job of digging the row, with far more power, far more speed, and honestly far more quality. The paint by numbers can take over the job of visualizing the painting, with some loss of quality and a total loss of creativity. (In painting, the creativity is considered a vital part; in digging corn rows, not so much.)

I think that software is more like painting, rather than row-hoeing. I think that AI (currently) is in the form of speeding things up with some loss of both quality and creativity.

Can anyone steelman this?

bluefirebrand9 hours ago

> Here's a kid out hoeing rows for corn. He sees someone planting with a tractor, and decides that's the way to go. Someone tells him, "If you get a tractor, you'll never develop the muscles that would make you really great at hoeing

In this example, the idea of losing "the muscles that make you great at hoeing" seems kind of like a silly thing to worry about

But I think there's a second order effect here. The kid gets a job driving the tractor instead. He spends his days seated instead of working. His lifestyle is more sedentary. He works just as many hours as before, and he makes about the same as he did before, so he doesn't really see much benefit from the increased productivity of the tractor.

However now he's gaining weight from being more sedentary, losing muscle from not moving his body, developing lower back problems from being seated all day, developing hearing loss from the noisy machinery. His quality of life is now lower, right?

Edit: Yes, there are also health problems from working hard moving dirt all day. You can overwork yourself, no question. It's hard on your body, being in the sun all day is bad for you.

I would argue it's still objectively a physically healthier lifestyle than driving a tractor for hours though.

Edit 2: my point is that I think after driving a tractor for a while, the kid would really struggle to go hoe by hand like he used to, if he ever needed to

+1
hatefulmoron9 hours ago
stonemetal127 hours ago

>I think the difference is how much of the job the tool can take over.

I think it is about how utilitarian the output is. For food no one cares how the sausage is made. For a painting the story behind it is more important than the picture itself. All of Picasso's paintings are famous because they were painted by Picasso. Picasso style painting by Bill? Suddenly it isn't museum worthy anymore.

No one cares about the story or people behind Word, they just want to edit documents. The Demo scene probably has a good shot at being on the side of art.

danielbln8 hours ago

For me the creativity in software engineering doesn't come from coding, that's an implementation detail. It comes from architecture, from thinking about "what do I want to build, how should it behave, how should it look, what or who is it for?" and driving that forward. Bolting it together in code is hoeing, for that vast majority of us. The creative endeavor sits higher up on the abstraction ladder.

acquisitionsilk8 hours ago

It is quite heartening to see so many people care about "good code". I fear it will make no difference.

The problem is that the software world got eaten up by the business world many years ago. I'm not sure at what point exactly, or if the writing was already on the wall when Bill Gates wrote his open letter to hobbyists in 1976.

The question is whether shareholders and managers will accept less good code. I don't see how it would be logical to expect anything else, as long as profit lines go up why would they care.

Short of some sort of cultural pushback from developers or users, we're cooked, as the youth say.

BirAdam2 hours ago

This is fun to think about. I used to think that all software was largely garbage, and at one point, I think this _was_ true. Sometime over the last 20 years, I believe this ceased to be the case. Most software these days actually works. Importantly, most software is actually stable enough that I can make it half an hour without panic saving.

Could most software be more awesome? Yes. Objectively, yes. Is most software garbage? Perhaps by raw volume of software titles, but are most popular applications I’ve actually used garbage? Nope. Do I loathe the whole subscription thing? Yes. Absolutely. Yet, I also get it. People expect software to get updated, and updates have costs.

So, the pertinent question here is, will AI systems be worse than humans? For now, yeah. Forever? Nope. The rate of improvement is crazy. Two years ago, LLMs I ran locally couldn’t do much of anything. Now? Generally acceptable junior dev stuff comes out of models I run on my Mac Studio. I have to fiddle with the prompts a bit, and it’s probably faster to just take a walk and think it over than spend an hour trying different prompts… but I’m a nerd and I like fiddling.

JackSlateur8 hours ago

Code is meant to power your business

Bad code leads to bad business

This makes me think of hosting departments; you know, the people who are using vmware, physical firewalls, DPI proxies and whatnot;

On the other edge, you have public cloud providers, which are using qemu, netfilter, dumb networking devices and stuff

Who got eaten by whom, nobody could have guessed ..

robocat3 hours ago

> Short of some sort of cultural pushback from developers or users

Corporations create great code too: they're not all badly run.

The problem isn't a code quality issue: it is a moral issue of whether you agree with the goals of capitalist businesses.

Many people have to balance the needs of their wallet with their desire for beautiful software (I'm a developer-founder I love engineering and open source community but I'm also capitalist enough to want to live comfortably).

frogperson4 hours ago

The context required to write real software is just way too big for LLMs. Software is the business, codified. How is an LLM supposed to know about all the rules in all the departments plus all the special agreements promised to customers by the sales team?

Right now the scope of what an LLM can solve is pretty generic and focused. Anytime more than a class or two is involved or if the code base is more than 20 or 30 files, then even the best LLMs start to stray and lose focus. They can't seem to keep a train of thought which leads to churning way too much code.

If LLMs are going to replace real developers, they will need to accept significantly more context, they will need a way to gather context from the business at large, and some way to persist a train of thought across the life of a codebase.

I'll start to get nervous when these problems are close to being solved.

zachlatta4 hours ago

I’d encourage you to try the 1M context window on Gemini 2.5 Pro. It’s pretty remarkable.

I paste in the entire codebase for my small ETL project (100k tokens) and it’s pretty good.

Not perfect, still a long ways to go, but a sign of the times to come.

karn973 hours ago

Did you not even read what you replied to?

zachlatta3 hours ago

Yes… I did?

+1
nssnsjsjsjs2 hours ago
loudmax9 hours ago

Companies that leverage LLMs and AIs to let their employees be more productive will thrive.

Companies that try to replace their employees with LLMs and AIs will fail.

Unfortunately, all that's in the long run. In the near term, some CEOs and management teams will profit from the short term valuations as they squander their companies' future growth on short-sighted staff cuts.

bdbenton52555 hours ago

That's really it. These tools are useful as assistants to programmers but do not replace an actual programmer. The right course is to embrace the technology moderately rather than reject it completely or bet on it replacing workers.

BirAdam2 hours ago

By the time AI hype dies down and hurts the bottom line, AI systems might be good enough to do the jobs.

“The market can remain irrational longer than you can remain solvent.” — Warren Buffett

joshdavham9 hours ago

> In the near term, some CEOs and management teams will profit from the short term valuations

That's actually really interesting to think about. The idea that doing something counter-productive like trying to replace employees with AI (which will cause problems), may actually benefit the company in terms of valuations in the short run. So in effect, they're hurting and helping the company at the same time.

to11mtm5 hours ago

Hey if the check clears for the bonus they got for hitting 'reduce costs in the IT department', they often bail before things rear their ugly head, or in the ugly case, Reality Distortion Field's the entire org into making the bad anti patterns permanent, even while acknowledging the cost/delivery/quality inefficiencies[0].

This is especially prevalent in waterfall orgs that refuse change. Body shops are more than happy to waste a huge portion of their billable hours on planning meetings and roadmap revisions as the obviousness of the mythical man month comes to bear on the org.

Corners get cut to meet deadlines, because the people who started/perpetuated whatever myth need to save their skins (and hopefully continue to get bonuses.)

The engineers become a scapegoat for the org's management problems (And watch, it very likely will happen at some shops with the 'AI push'). In the nasty cases, the org actively disempowers engineers in the process[0][1].

[0] - At one shop, we got grief for not having shipped a feature, but the only reason we hadn't was that IT was not allowed to decide between a set of radio buttons or a drop-down on a screen. Hell, I got yelled at for just making the change locally and sending screenshots.

[1] - At more than one shop, FTE devs were responsible for providing support for code written by offshore that they were never even given the opportunity to review. And hell yes myself and others pushed for change, but it's never been a simple change. It almost always is 'GLWT'->'You get to review the final delivery but get 2 days'->'You get to review the set of changes'->'Ok you can review their sprint'->'OK just start reviewing every PR'.

janalsncm9 hours ago

Very well said. Using code assistance is going to be table stakes moving forward, not something that can replace people. It’s not like competitors can’t also purchase AI subscriptions.

bbarn7 hours ago

Honestly, if you're not doing it now, you're behind. The sheer amount of time savings using it smartly can give you to allow you to focus on the parts that actually matter is massive.

kweingar5 hours ago

If progress continues at the rate that AI boosters expect, then soon you won't have to use them smartly to get value (all existing workflows will churn and be replaced by newer, smarter workflows within months), and everybody who is behind will immediately catch up the moment they start to use the tool.

abletonlive53 minutes ago

But if it doesn't and you're not using it now then you're gonna be behind and part of the group getting laid off

the people that are good at using these tools now will be better at it later too. you might have closed the gap quite a bit but you will still be behind

using LLMs as they are now requires a certain type of mindset that takes practice to maintain and sharpen. It's just like a competitive game. The more intentionally you do it, the better you get. And the meta changes every 6 months to a year.

That's why I scroll and laugh through all the comments on this thread dismissing it, because I know that the people dismissing it are the problem.

the interface is a chatbox with no instructions or guardrails. the fact that folks think that their experience is universal is hilarious. so much of using LLM right now is context management.

I can't take most of yall in this thread seriously

bdbenton52554 hours ago

The human ability to design computer programs through abstractions and solve creative problems like these is arguably more important than being able to crank out lines of code that perform specific tasks.

The programmer is an architect of logic and computers translate human modes of thought into instructions. These tools can imitate humans and produce code given certain tasks, typically by scraping existing code, but they can't replace that abstract level of human thought to design and build in the same way.

When these models are given greater functionality to not only output code but to build out entire projects given specifications, then the role of the human programmer must evolve.

am17an9 hours ago

All the world's smartest minds are racing towards replacing themselves. As programmers, we should take note and see where the wind is blowing. At least don't discard the possibility and rather be prepared for the future. Not to sound like a tin-foil hat but odds of achieving something like this increase by the day.

In the long term (post AGI), the only safe white-collar jobs would be those built on data which is not public i.e. extremely proprietary (e.g. Defense, Finance) and even those will rely heavily on customized AIs.

wijwp4 hours ago

> Not to sound like a tin-foil hat but odds of achieving something like this increase by the day.

Where do you get this? The limitations of LLMs are becoming more clear by the day. Improvements are slowing down. Major improvements come from integrations, not major model improvements.

AGI likely can't be achieved with LLMs. That wasn't as clear a couple years ago.

drodgers24 minutes ago

I don't know how someone could be following the technical progress in detail and hold this view. The progress is astonishing, and the benchmarks are becoming saturated so fast that it's hard to keep track.

Are there plenty of gaps left between here and most definitions of AGI? Absolutely. Nevertheless, how can you be sure that those gaps will remain given how many faculties these models have already been able to excel at (translation, maths, writing, code, chess, algorithm design etc.)?

It seems to me like we're down to a relatively sparse list of tasks and skills where the models aren't getting enough training data, or are missing tools and sub-components required to excel. Beyond that, it's just a matter of iterative improvement until 80th percentile coder becomes 99th percentile coder becomes superhuman coder, and ditto for maths, persuasion and everything else.

Maybe we hit some hard roadblocks, but room for those challenges to be hiding seems to be dwindling day by day.

bitpush5 hours ago

> All the world's smartest minds are racing towards replacing themselves

Isn't every little script, every little automation us programmers do in the same spirit? "I don't like doing this, so I'm going to automate it, so that I can focus on other work".

Sure, we're racing towards replacing ourselves, but there would be (and will be) other more interesting work for us to do when we're free to do that. Perhaps, all of us will finally have time to learn surfing, or garden, or something. Some might still write code themselves by hand, just like how some folks like making bread .. but making bread by hand is not how you feed a civilization - even if hundreds of bakers were put out of business.

AstroBen3 hours ago

> all of us will finally have time to learn surfing, or garden

Unless you have a mortgage.. or rent.. or need to eat

AstroBen7 hours ago

Ultimately this needs to be solved politically

Making our work more efficient, or humans redundant should be really exciting. It's not set in stone that we need to leave people middle aged with families and now completely unable to earn enough to provide a good life

Hopefully if it happens, it happens to such a huge amount of people that it forces a change

lyu072826 hours ago

But that already happened to lots of industries and lots of people, and we never cared about them before. Now it's us, so we care, but nothing is different about us. Just learn to code!

AstroBen5 hours ago

The difference is in how many industries AI is threatening. It's not just coding on the chopping block

bluefirebrand5 hours ago

No different than how many industries that offshoring wrecked

BirAdam2 hours ago

Nah. As more people are rendered unemployed the buying market and therefore aggregate demand will fall. Fewer sales hurts the bottom line. At some point, revenues across the entire economy fall, and companies cannot afford the massive datacenters and nuclear power plants fueling them. The hardware gets sold cheap, the companies go under, and people get hired again. Eventually, some kind of equilibrium will be found or the world engages in the Butlerian Jihad.

bgwalter9 hours ago

The Nobel prize is said to have been created partly out of guilt over having invented dynamite, which was obviously used in a destructive manner.

Now we have Geoffrey Hinton getting the prize for contributing to one of the most destructive inventions ever.

reducesuffering5 hours ago

At least he and Yoshua Bengio are remorseful. Many others haven't even gotten that far...

smilbandit10 hours ago

From my limited experience (former coder, now management, but I still get to code now and then), I've found them helpful but also intrusive. Sometimes when it guesses the code for the rest of the line and the next few lines, it's going down a path I don't want to go, but I have to take time to scan it. Maybe it's a configuration issue, but I'd prefer it didn't put code directly in my way, or be off by default and only show when I hit a key combo.

One thing I know is that I wouldn't ask an LLM to write an entire section of code or even a function without going in and reviewing.

haiku207710 hours ago

Zed has a "subtle" mode like that. More editors should provide it. https://zed.dev/docs/ai/edit-prediction#switching-modes

PartiallyTyped10 hours ago

> One thing I know is that I wouldn't ask an LLM to write an entire section of code or even a function without going in and reviewing.

These days I am working on a startup doing [a bit of] everything, and I don't like the UI it creates. It's useful enough when I make the building blocks and let it be, but allowing claude to write big sections ends up with lots of reworks until I get what I am looking for.

bouncycastle5 hours ago

Last night I spent hours fighting o3.

I never made a Dockerfile in my life, so I thought it would be faster just getting o3 to point to the GitHub repo and let it figure out, rather than me reading the docs and building it myself.

I spent hours debugging the file it gave me... It kept adding hallucinated things that didn't exist, removing/rewriting other parts, and making other big mistakes, like mixing up python3 vs python and the intricacies that come with that.

Finally I gave up and Googled some docs instead. Fixed my file in minutes and was able to jump into the container and debug the rest of the issues. AI is great, but it's not a tool to end all. You still need someone who is awake at the wheel.

throwaway3141555 hours ago

Pro-tip: Check out Claude or Gemini. They hallucinate far less on coding tasks. Alternatively, enable internet search on o3 which boosts its ability to reference online documentation and real world usage examples.

I get having a bad taste in your mouth, but these tools _aren't_ magic and do have something of a steep learning curve in order to get the most out of them. Not dissimilar from vim/emacs (or lots of dev tooling).

edit: To answer a reply (HN has annoyingly limited my ability to make new comments): yes, internet search is always available to ChatGPT as a tool. Explicitly clicking the globe icon will encourage the model to use it more often, however.

Sohcahtoa824 hours ago

> enable internet search on o3

I didn't know it could even be disabled. It must be enabled by default, right?

osigurdson31 minutes ago

This is similar to my usage of LLMs. I use Windsurf sometimes but more often it is more of a conversation about approaches.

pupppet10 hours ago

If an LLM just finds patterns, is it even possible for an LLM to be GOOD at anything? Doesn't that mean at best it will be average?

bitpush5 hours ago

Humans are also almost always operating on patterns. This is why "experience" matters a lot.

Very few people are doing truly cutting edge stuff - we call them visionaries. But most of the time, we're just merely doing what's expected

And yes, that includes this comment. This wasn't creative or an original thought at all. I'm sure hundreds of people have had a similar thought, and I'm probably parroting someone else's idea here. So if I can do it, why can't an LLM?

dgb234 hours ago

The times we just operate on patterns is when we code boilerplate or just very commonly written code. There's value in speeding this up and LLMs help here.

But generally speaking I don't experience programming like that most of the time. There are so many things going on that have nothing to do with pattern matching while coding.

I load up a working model of the running code in my head and explore what it should be doing in a more abstract/intangible way and then I translate those thoughts to code. In some cases I see the code in my inner eye, in others I have to focus quite a lot or even move around or talk.

My mind goes to different places and experiences. Sometimes it's making new connections, sometimes it's processing a bit longer to get a clearer picture, sometimes it re-shuffles priorities. A radical context switch may happen at any time and I delete a lot of code because I found a much simpler solution.

I think that's a qualitative, insurmountable difference between an LLM and an actual programmer. The programmer thinks deeply about the running program and not just the text that needs to be written.

There might be different types of "thinking" that we can put into a computer in order to automate these kinds of tasks reliably and efficiently. But just pattern matching isn't it.

riknos3149 hours ago

My experience is that LLMs regress to the average of the context they have for the task at hand.

If you're getting average results you most likely haven't given it enough details about what you're looking for.

The same largely applies to hallucinations. In my experience LLMs hallucinate significantly more when at or pushed to exceed the limits of their context.

So if you're looking to get a specific output, your success rate is largely determined by how specific and comprehensive the context the LLM has access to is.

jaccola10 hours ago

Most people (average and below average) can tell when something is above average, even if they cannot create above average work, so using RLHF it should be quite possible to achieve above average.

Indeed it is likely already the case that in training the top links scraped or most popular videos are weighted higher; these are likely to be better than average.

lukan10 hours ago

There are bad patterns and good patterns. But whether a pattern is the right one for a specific task is something different.

And what really matters is if the task gets reliably solved.

So if they actually could manage this on average with average quality .. that would be a next level gamechanger.

JackSlateur8 hours ago

Yes, AI is basically a random machine aiming for an average outcome

AI is neat for average people, to produce average code, for average companies

In a competitive world, using AI is a death sentence;

some-guy9 hours ago

The main thing LLMs have helped me with, and always comes back to, tasks that require bootstrapping / Googling:

1) Starting simple codebases

2) Googling syntax

3) Writing bash scripts that utilize Unix commands whose arguments I have never bothered to learn in the first place.

I definitely find time savings with these, but the esoteric knowledge required to work on a 10+ year old codebase is simply too much for LLMs still, and the code alone doesn't provide enough context to do anything meaningful, or even faster than I would be able to do myself.

mywittyname5 hours ago

LLMs are amazing at shell scripting. It's one of those tasks where I always half-ass it because I don't really know how to properly handle errors and never really learned the correct way. But man, Perplexity can poop out a basic shell script in a few seconds with pretty much every edge case I can think of covered.

decasia10 hours ago

We aren't expecting LLMs to come up with incredibly creative software designs right now, we are expecting them to execute conventional best practices based on common patterns. So it makes sense to me that it would not excel at the task that it was given here.

The whole thing seems like a pretty good example of collaboration between human and LLM tools.

writeslowly10 hours ago

I haven't actually had that much luck with having them output a boring API boilerplate in large Java projects. Like "I need to create a new BarOperation that has to go in a different set of classes and files and API prefixes than all the FooOperations and I don't feel like copy pasting all the yaml and Java classes" but the AI has problems following this. Maybe they work better in small projects.

I actually like LLMs better for creative thinking because they work like a very powerful search engine that can combine unrelated results and pull in adjacent material I would never personally think of.

coffeeismydrug7 hours ago

> Like "I need to create a new BarOperation that has to go in a different set of classes and files and API prefixes than all the FooOperations and I don't feel like copy pasting all the yaml and Java classes" but the AI has problems following this.

To be fair, I also have problems following this.

ehutch7910 hours ago

Uh, no. I've seen the twitter posts saying llms will replace me. I've watched the youtube videos saying llms will code whole apps on one prompt, but are light on details or only show the most basic todo app from every tutorial.

We're being told that llms are now reasoning, which implies they can make logical leaps and employ creativity to solve problems.

The hype cycle is real and setting expectations that get higher with the less you know about how they work.

prophesi9 hours ago

> The hype cycle is real and setting expectations that get higher with _the less you know about how they work_.

I imagine on HN, the expectations we're talking about are from fellow software developers who at least have a general idea on how LLM's work and their limitations.

bluefirebrand9 hours ago

Right below this is a comment

> you will almost certainly be replaced by an llm in the next few years

So... Maybe not. I agree that Hacker News does have a generally higher quality of contributors than many places on the internet, but it absolutely is not a universal for HNers. There are still quite a few posters here that have really bought into the hype for whatever reason

zamalek8 hours ago

> hype for whatever reason

"I need others to buy into LLMs in order for my buy-in to make sense," i.e. network effects.[1]

> Most dot-com companies incurred net operating losses as they spent heavily on advertising and promotions to harness network effects to build market share or mind share as fast as possible, using the mottos "get big fast" and "get large or get lost". These companies offered their services or products for free or at a discount with the expectation that they could build enough brand awareness to charge profitable rates for their services in the future.

You don't have to go very far up in terms of higher order thinking to understand what's going on here. For example, think about Satya's motivations for disclosing Microsoft writing 30% of their code using LLMs. If this really was the case, wouldn't Microsoft prefer to keep this competitive advantage secret? No: Microsoft and all the LLM players need to drive hype, and thus mind share, in the hope that they become profitable at some point.

If "please" and "thank you" are incurring huge costs[2], how much is that LLM subscription actually going to cost consumers when the angel investors come knocking, and are consumers going to be willing to pay that?

I think a more valuable skill might be learning how to make do with local LLMs because who knows how many of these competitors will still be around in a few years.

[1]: https://en.wikipedia.org/wiki/Dot-com_bubble#Spending_tenden... [2]: https://futurism.com/altman-please-thanks-chatgpt

danielbln8 hours ago

I wish we'd measure things less against how hyped they are. Either they are useful, or they are not. LLMs are clearly useful (to which extent and with what caveats is up to lively debate).

bgwalter10 hours ago

After the use-after-free hype article I tried CoPilot and it outright refused to find vulnerabilities.

Whenever I try some claim, it does not work. Yes, I know, o3 != CoPilot but I don't have $120 and 100 prompts to spend on making a point.

ldjkfkdsjnv10 hours ago

you will almost certainly be replaced by an llm in the next few years

einpoklum10 hours ago

You mean, as a HackerNews commenter? Well, maybe...

In fact, maybe most of us have been replaced by LLMs already :-)

darkport10 hours ago

I think this is true for deeply complex problems, but for everyday tasks an LLM is infinitely “better”.

And by better, I don’t mean in terms of code quality because ultimately that doesn’t matter for shipping code/products, as long as it works.

What does matter is speed. And an LLM speeds me up at least 10x.

kweingar5 hours ago

You're making at least a year's worth of pre-LLM progress in 5 weeks?

You expect to achieve more than a decade of pre-LLM accomplishments between now and June 2026?

nevertoolate9 hours ago

How do you measure this?

rel2thr10 hours ago

Antirez is a top 0.001% coder. Don’t think this generalizes to human coders at large

ljlolel10 hours ago

Seriously, he’s one of the best on the planet of course it’s not better than him. If so we’d be cooked.

99% of professional software developers don’t understand what he said, much less could come up with it (or evaluate it like Gemini did).

This feels a bit like a humblebrag about how well he can discuss with an LLM compared to others vibecoding.

justacrow7 hours ago

Hey, my CEO is saying that LLMs are also top 0.001% coders now, so should at least be roughly equivalent.

AlotOfReading9 hours ago

Unrelated to the LLM discussion, but a hash function is the wrong construction for the accumulator solution. The hashing part increases the probability that A and B have a collision that leads to a false negative here. Instead, you want a random invertible mapping, which guarantees that no two pointers will "hash" to the same value, while distributing the bits. Splitmix64 is a nice one, and I believe the murmurhash3 finalizer is invertible, as well as some of the xorshift RNGs if you avoid the degenerate zero cycle.
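
For illustration only (my own sketch of the idea, not antirez's actual Redis code): mix each pointer through an invertible 64-bit function such as the splitmix64 finalizer, then XOR the mixed values into the accumulator. Since the mix is a bijection, two distinct pointers can never map to the same value, and XOR keeps the accumulator order-independent and reversible.

    MASK64 = (1 << 64) - 1

    def splitmix64_mix(x):
        # The splitmix64 finalizer: each step (xorshift, multiply by an odd
        # constant) is a bijection on 64-bit integers, so the whole mix is invertible.
        x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
        return (x ^ (x >> 31)) & MASK64

    class XorAccumulator:
        # Order-independent accumulator over a set of pointers.
        def __init__(self):
            self.acc = 0

        def toggle(self, ptr):
            # XOR is its own inverse, so the same call adds or removes an item.
            self.acc ^= splitmix64_mix(ptr & MASK64)

    # Same pointers fed in different orders give the same accumulator value.
    a, b = XorAccumulator(), XorAccumulator()
    for p in (0x7F00AA10, 0x7F00AA58, 0x7F00AB08):
        a.toggle(p)
    for p in (0x7F00AB08, 0x7F00AA10, 0x7F00AA58):
        b.toggle(p)
    assert a.acc == b.acc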

antirez9 hours ago

Any Feistel network has the property you stated, actually, and this was one of the approaches I was thinking of using, as I can have the seed as part of the non-linear transformation of the Feistel network. However I'm not sure this actually decreases the probability of A xor B xor C xor D being accidentally zero, because the problem with pointers is that they may differ only in a small part. When you use hashing, the avalanche effect makes this a lot harder, since you are no longer xoring the pointer structure.

What I mean is that you are right, assuming we use a transformation that, while still invertible, has an avalanche effect. Btw, in practical terms I doubt there are practical differences.

AlotOfReading9 hours ago

You can guarantee that the probability is the theoretical minimum with a bijection. I think that would be 2^-N since it's just the case where everything's on a maximum length cycle, but I haven't thought about it hard enough to be completely certain.

A good hash function intentionally won't hit that level, but it should be close enough not to matter with 64 bit pointers. 32 bits is small enough that I'd have concerns at scale.

DrJid8 hours ago

I never quite understand these articles though. It's not about Humans vs. AI.

It's about Humans vs. Humans+AI

and 4/5, Humans+AI > Humans.

throwaway4390802 hours ago

Of course they are. The interesting thing isn't how good LLMs are today, it's their astonishing rate of improvement. LLMs are a lot better than they were a year ago, and light years ahead of where they were two years ago. Where will they be in five years?

hiatus2 hours ago

Reminds me of the 90s when computer hardware moved so fast. I wonder where the limit is this time around.

twodave4 hours ago

If you stick with the same software ecosystem long enough you will collect (and improve upon) ways of solving classes of problems. These are things you can more or less reproduce without thinking too much or else build libraries around. An LLM may or may not become superior at this sort of exercise at some point, and might or might not be able to reliably save me some time typing. But these are already the boring things about programming.

So much of it is exploratory, deciding how to solve a problem from a high level, in an understandable way that actually helps the person who it’s intended to help and fits within their constraints. Will an LLM one day be able to do all of that? And how much will it cost to compute? These are the questions we don’t know the answer to yet.

SKILNER2 hours ago

There's a lot of resistance to AI amongst the people in this discussion, which is probably to be expected.

A chunk of the objections indicate people trying to shoehorn in their old way of thinking and working.

I think you have to experiment and develop some new approaches to remove the friction and get the benefit.

bachmeier5 hours ago

I suspect humans will always be critical to programming. Improved technology won't matter if the economics isn't there.

LLMs are great as assistants. Just today, Copilot told me it's there to do the "tedious and repetitive" parts so I can focus my energy on the "interesting" parts. That's great. They do the things every programmer hates having to do. I'm more productive in the best possible way.

But ask it to do too much and it'll return error-ridden garbage filled with hallucinations, or just never finish the task. The economic case for further gains has diminished greatly while the cost of those gains rises.

Automation killed tons of manufacturing jobs, and we're seeing something similar in programming, but keep in mind that the number of people still working in manufacturing is 60% of the peak, and those jobs are much better than the ones in the 1960s and 1970s.

noslenwerdna5 hours ago

Sure, it's just that the era of super high paying programming jobs may be over.

And also, manufacturing jobs have greatly changed. And the effect is not even, I imagine. Some types of manufacturing jobs are just gone.

monocularvision4 hours ago

That might be the case. Perhaps it lowers the difficulty level so more people can do it and therefore puts downward pressure on wages.

Or… it still requires similar education and experience but programmers end up so much more efficient they earn _more_.

Hard to say right now.

bachmeier4 hours ago

> the era of super high paying programming jobs may be over.

Probably, but I'm not sure that had much to do with AI.

> Some types of manufacturing jobs are just gone

The manufacturing work that was automated is not exactly the kind of work people want to do. I briefly did some of that work. Briefly because it was truly awful.

elzbardico9 hours ago

I use LLMs a lot, and call me arrogant, but every time I see a developer saying that LLMs will replace them, I think they are probably shitty developers.

Fernicia9 hours ago

If it automates 1/5th of your work, then what's unreasonable about thinking that your team could be 4 developers instead of 5?

AstroBen1 hour ago

If software costs 80% as much to write, what's unreasonable about thinking that more businesses would integrate more of it, hiring more developers?

aschobel26 minutes ago

Bingo. This additional throughput could be used to create more polished software. What happens in a free market: would your competitor fall behind, or will they try to match your polish?

archagon5 hours ago

This just feels like another form of the mythical man month argument.

agumonkey9 hours ago

There's also the subset of devs who are just bored. LLMs will end up as an easier Stack Overflow, and if the solution is not one script away, then you're back to square one. I've already had a few of "well, uhm, ChatGPT told me what you said, basically".

headelf9 hours ago

What do you mean "Still"? We've only had LLMs writing code for 1.5 years... at this rate it won't be long.

cess118 hours ago

More like five years. It's been around for much longer than a lot of people feel it has for some reason.

galaxyLogic4 hours ago

Coding is not like multiplication. You can teach kids the multiplication table, or you can give them a calculator and both will work. With coding the problem is the "spec" is so much more complicated than just asking what is 5 * 7.

Maybe the way forward would be to invent better "specification languages" that are easy enough for humans to use, then let the AI implement the specification you come up with.

ww5202 hours ago

The value of LLMs is as a better Stack Overflow. It's much better than search now because it's not populated with all the crap that has seeped in over time.

rubit_xxx171 hour ago

Gemini may be fine for writing a complex function, but I can’t stand to use it day to day. Claude 4 is my go-to atm.

ntonozzi9 hours ago

If you care that much about having correct data you could just do a SHA-256 of the whole thing. Or an HMAC. It would probably be really fast. If you don’t care much you can just do murmur hash of the serialized data. You don’t really need to verify data structure properties if you know the serialized data is correct.
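
A tiny sketch of that idea (hypothetical record and key, standard library only): hash the serialized bytes for a plain integrity check, or use an HMAC if you also want tamper detection.

    import hashlib, hmac, json

    # Hypothetical record; in practice this is whatever serialized form you already have.
    record = {"user_id": 42, "items": [1, 2, 3], "total": 9.99}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")

    # Plain digest: catches accidental corruption of the serialized bytes.
    digest = hashlib.sha256(payload).hexdigest()

    # HMAC: also catches tampering, as long as the key stays secret.
    key = b"shared-secret-key"
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()

    # Verification side: recompute and compare in constant time.
    assert hmac.compare_digest(tag, hmac.new(key, payload, hashlib.sha256).hexdigest())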

ants_everywhere3 hours ago

I'm increasingly seeing this as a political rather than technical take.

At this point I think people who don't see the value in AI are willfully pulling the wool over their own eyes.

prmph9 hours ago

There's something fundamental here.

There is a principle (I forget where I encountered it) that it is not code itself that is valuable, but the knowledge of a specific domain that an engineering team develops as they tackle a project. So code itself is a liability, but the domain knowledge is what is valuable. This makes sense to me and matched my long experience with software projects.

So, if we are entrusting coding to LLMs, how will that value develop? And if we want to use LLMs but at the same time develop the domain acumen, that means we would have to architect things and hand them over to LLMs to implement, thoroughly check what they produce, and generally guide them carefully. In that case they are not saving much time.

jonator9 hours ago

I believe it will raise the standard of what is valuable. Now that LLMs can handle what we consider the "mundane" parts of building a project (boilerplate), humans can dedicate focused efforts to the higher-impact areas of innovation and problem solving. As LLMs get better, this bar simply continues to rise.

AstroBen8 hours ago

Better than LLMs... for now. I'm endlessly critical of the AI hype, but the truth here is that no one has any idea what's going to happen 3-10 years from now. It's a very quickly changing space with a lot of really smart people working on it. We've seen the potential.

Maybe LLMs completely trivialize all coding. The potential for this is there

Maybe progress slows to a snail's pace, the VC money runs out, and companies massively raise prices, making it not worth it to use.

No one knows. Just sit back and enjoy the ride. Maybe save some money just in case

buremba3 hours ago

If the human here is the creator of Redis, probably not.

catigula10 hours ago

Working with Claude 4 and o3 recently shows me just how fundamentally LLMs haven't really solved the core problems such as hallucinations and weird refactors/patterns to force success (e.g. if the account is not found, fall back to account id 1).
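
For what it's worth, the "force success" pattern described above tends to look something like this made-up Python snippet (every name in it is hypothetical):

    # Hypothetical illustration of the fallback-to-account-1 anti-pattern.
    ACCOUNTS = {1: "default", 7: "alice"}

    def get_account(account_id: int) -> str:
        account = ACCOUNTS.get(account_id)
        if account is None:
            # The questionable "fix": silently substitute account 1 so the code
            # appears to work, hiding the missing-account error instead of raising it.
            return ACCOUNTS[1]
        return account

The honest version would raise or propagate the lookup failure and let the caller decide what to do.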

marcosno5 hours ago

LLMs can be very creative, when pushed. In order to find a creative solution, like antirez needed, there are several tricks I use:

Increase the temperature of the LLMs.

Ask several LLMs the same question several times each, with tiny variations. Then collect all answers, and do a second/third round asking each LLM to review all collected answers and improve.

Add random constraints, one constraint per question. For example, to the LLM: can you do this with 1 bit per X? Do this in O(n). Do this using linked lists only. Do this with only 1k memory. Do this while splitting the task into 1000 parallel threads, etc.

This usually kicks the LLM out of its comfort zone, into creative solutions; a rough sketch of the loop is below.
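
A sketch of what that loop could look like with the OpenAI Python client; the model name, question, and constraints are placeholders, and the same idea applies to any provider's API:

    # Ask the same question several times, at several temperatures, under
    # different artificial constraints, then run a review round over all answers.
    from openai import OpenAI

    client = OpenAI()
    question = "Design a compact on-disk index for short strings."  # placeholder
    constraints = [
        "Use at most 1 bit of overhead per entry.",
        "Lookups must run in O(1).",
        "Assume only 1 KB of working memory.",
    ]

    answers = []
    for constraint in constraints:
        for temperature in (0.7, 1.0, 1.3):
            resp = client.chat.completions.create(
                model="gpt-4o",  # placeholder model name
                temperature=temperature,
                messages=[{"role": "user",
                           "content": f"{question}\nConstraint: {constraint}"}],
            )
            answers.append(resp.choices[0].message.content)

    # Second round: have the model critique and combine the collected answers.
    review = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": "Review these proposals and combine the best ideas:\n\n"
                              + "\n---\n".join(answers)}],
    )
    print(review.choices[0].message.content)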

dwringer5 hours ago

Definitely a lot to be said for these ideas, even just that it helps to start a fresh chat and ask the same question in a better way a few times (using the quality of response to gauge what might be "better"). I have found that if I do this a few times and Gemini strikes out, I've manually optimized the question enough by this point that I can drop it into Claude and get a good working solution. Conversely, having a discussion with the LLM about the potential solution, letting it hold on to the context as described in TFA, has in my experience caused the models to pretty universally end up stuck in a rut sooner or later and become counterproductive to work with. Not to mention that way eats up a ton of api usage allotment.

jonator9 hours ago

I think we will increasingly be orchestrators. Like at a symphony: previously, most people were required to be on the floor playing the individual instruments, but now, with AI, everyone can be their own composer.

nixpulvis9 hours ago

The number one use case for AI for me as a programmer is still help finding functions which are named something I didn't expect as I'm learning a new language/framework/library.

Doing the actual thinking is generally not the part I need too much help with. Though it can replace googling info in domains I'm less familiar with. The thing is, I don't trust the results as much and end up needing to verify it anyways. If anything AI has made this harder, since I feel searching the web for authoritative, expert information has become harder as of late.

taormina9 hours ago

My problem with this usage is that the LLMs seem equally likely to make up a function they wish existed. When questioned about the seeming-too-convenient method they will usually admit to having made it up on the spot. (This happens a lot in Flutter/Flame land, I'm sure it's better at something more mainstream like Python?) That being said, I do agree that using it as supplemental documentation is one of the better usecases I have for it.

tonyhart74 hours ago

I think it also depends on the model, of course.

A general-purpose LLM wouldn't be as good as an LLM tuned for coding; for this case the Google DeepMind team may have something better than Gemini 2.5 Pro.

vouaobrasil10 hours ago

The question is, for how long?

spion9 hours ago

Vibe-wise, it seems like progress is slowing down and recent models aren't substantially better than their predecessors. But it would be interesting to take a well-trusted benchmark and plot max_performance_until_date(foreach month). (Too bad aider changed recently and there aren't many older models; https://aider.chat/docs/leaderboards/by-release-date.html has not been updated in a while with newer stuff, and the new benchmark doesn't have the classic models such as 3.5, 3.5 turbo, 4, claude 3 opus)
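
A tiny sketch of that max-performance-until-date idea; the (model, month, score) tuples below are invented, and in practice you'd pull them from whichever benchmark you trust:

    # Cumulative best score by release month -- plot this to see whether the
    # frontier is still climbing or flattening out. Data is made up.
    results = [
        ("model-a", "2023-03", 42.0),
        ("model-b", "2023-11", 57.5),
        ("model-c", "2024-04", 55.0),  # newer release, but not better
        ("model-d", "2024-10", 63.2),
    ]

    best_so_far = float("-inf")
    for model, month, score in sorted(results, key=lambda r: r[1]):
        best_so_far = max(best_so_far, score)
        print(month, best_so_far)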

vouaobrasil8 hours ago

I think that we can't expect continuous progress either, though. Often in computer science it's more discrete, and unexpected. Computer chess was basically stagnant until one team made a leap; even the evolution of species often behaves in a punctuated way rather than as a sum of many small adaptations. I'm much more interested in (worried about) what the world will be like in 30 years, rather than in the next 5.

spion7 hours ago

It's hard to say. Historically, new discoveries in AI often generated great excitement and high expectations, followed by some progress, then stalling, disillusionment, and an AI winter. Maybe this time it will be different. Either way, what was achieved so far is already a huge deal.

jppittma10 hours ago

It's really gonna depend on the project. When my hobby project was greenfield, the AI was way better than I am. It was (still is) more knowledgeable about the standards that govern the field and about low-level interface details. It can shit out a bunch of code that relies on knowing these details in seconds/minutes, rather than hours/days.

Now that the project has grown and all that stuff is hammered out, it can't seem to consistently write code that compiles. It's very tunnel-visioned on the specific file it's generating, rather than on where that fits in the context of what/how we're building what we're building.

jonator9 hours ago

We can slightly squeeze more juice out of them with larger projects by providing better context, docs, examples of what we want, background knowledge, etc.

Like people, LLMs don't know what they don't know (about your project).

sixQuarks10 hours ago

Again, the question is for how long

sixQuarks10 hours ago

Exactly! We’ve been seeing more and more posts like this, saying how AI will never take developer jobs or will never be as good as coders. I think it’s some sort of coping mechanism.

These posts are gonna look really silly in the not too distant future.

I get it, spending countless hours honing your craft and knowing that AI will soon make almost everything you learned useless is very scary.

sofal9 hours ago

I'm constantly disappointed by how little I'm able to delegate to AI after the unending promises that I'll be able to delegate nearly 100% of what I do now "in the not too distant future". It's tired impatience and merited skepticism that you mistake for fear and coping. Just because people aren't on the hype train with you doesn't mean they're afraid.

vouaobrasil9 hours ago

Personally, I am. Lots of unusual skills I have, have already been taken by AI. That's not to say I think I'm in trouble, but I think it's sad I can't apply some of these skills that I learned just a couple of years ago like audio editing because AI does it now. Neither do I want to work as an AI operator, which I find boring and depressing. So, I've just moved onto something else, but it's still discouraging.

Also, so many people said the same thing about chess when the first chess programs came out. "It will never beat an international master." Then, "it will never beat a grandmaster." And Kasparov said, "it would never beat me or Karpov."

Look where we are today. Can humanity adapt? Yes, probably. But that new world IMO is worse than it is today, rather lacking in dignity I'd say.

suddenlybananas8 hours ago

What do you mean that AI can do audio editing? I don't think all sound engineers have been replaced.

sixQuarks8 hours ago

Yes. I know what you’re referring to, but you can’t ignore the pace of improvement. I think within 2-3 years we will have AI coding that can do anything a senior level coder can do.

foobar838 hours ago

Nobody knows what the future holds, including you.

vouaobrasil7 hours ago

That is true, which is why we should be cautious instead of careless.

sixQuarks3 hours ago

Yes, but we can see current progress and extrapolate into the future. I give it 2-3 years before AI can code as well as a senior-level coder.

vjvjvjvjghv10 hours ago

I think we need to accept that in the not too far future LLMs will be able to do most of the mundane tasks we have to do every day. I don't see why an AI can't set up kubernetes, caching layers, testing, databases, scaling, check for security problems and so on. These things aren't easy but I think they are still very repetitive and therefore can be automated.

There will always be a place for really good devs but for average people (most of us are average) I think there will be less and less of a place.

zonethundery9 hours ago

No doubt the headline's claim is true, but Claude just wrote a working MCP serving up the last 10 years of my employer's work product. For $13 in api credits.

While I'm technically capable of building it on my own, development is not my day job, and there are enough dumb parts of the problem that my p(success) hand-writing it would have been abysmal.

With rose-tinted glasses on, maybe LLM's exponentially expand the amount of software written and the net societal benefit of technology.

failrate5 hours ago

LLMs are using the corpus of existing software source code. Most software source code is just North of unworkable garbage. Garbage in, garbage out.

fHr3 hours ago

Yes, of course they are, but MBA-brained management gets told by McK/Big4 that AI could save them millions and that they should let people go already since AI can do their work. Whether that's true right now doesn't matter; see the job market.

dbacar9 hours ago

I disagree—'human coders' is a broad and overly general term. Sure, Antirez might believe he's better than AI when it comes to coding Redis internals, but across the broader programming landscape—spanning hundreds of languages, paradigms, and techniques—I'm confident AI has the upper hand.

nthingtohide9 hours ago

Do you want to measure antirez and AI on a spider diagram, the kind generally used to evaluate employees? Are you ignoring why society opted for division of work and specialization?

dbacar8 hours ago

They are not investing billions in it just so a high schooler can do his term paper with it; it is already much more than a generalist. It might be like a very good sidekick for now, but that is not the plan.

EpicEng9 hours ago

What does the number of buzzwords and frameworks on a resume matter? Engineering is so much more than that it's not even worth mentioning. Your comparison is on the easiest aspect of what we do.

Unless you’re a web dev. Then youre right and will be replaced soon enough. Guess why.

dbacar9 hours ago

Not everyone builds Redis at home/work. So you do the math. And now Antirez himself is feeding the beast by himself.

kurofune9 hours ago

The fact that we are debating this topic at all is indicative of how far LLMs have come in such a short time. I find them incredibly useful tools that vastly enhance my productivity and curiosity, and I'm really grateful for them.

horns4lyfe1 hour ago

Writing about AI is missing the forest for the trees. The US software industry will be wholesale destroyed (and therefore global software will be too) by offshoring.

burningion10 hours ago

I agree, but I also didn’t create redis!

It’s a tough bar if LLMs have to be post antirez level intelligence :)

ljlolel10 hours ago

Seriously, he’s one of the best on the planet of course it’s not better than him. If so we’d be cooked.

99% of professional software developers don’t understand what he said much less can come up with it (or evaluate it like Gemini).

This feels a bit like a humblebrag about how well he can discuss with an LLM compared to others vibecoding.

Poortold5 hours ago

For coding Playwright automation it has use cases, especially if you template out function patterns. Though I never use it to write logic, as AI is just ass at that. If I wanted a shitty if-else chain I'd ask the intern to code it.

lodovic10 hours ago

Sure, human coders will always be better than just AI. But an experienced developer with AI tops both. Someone said, your job won't be taken by AI, it will be taken by someone who's using AI smarter than you.

bluefirebrand10 hours ago

> Someone said, your job won't be taken by AI, it will be taken by someone who's using AI smarter than you.

"Your job will be taken by someone who does more work faster/cheaper than you, regardless of quality" has pretty much always been true

That's why outsourcing happens too

palavrov9 hours ago

From my experience, AI for coders is a multiplier of the coder's skills. It will allow you to solve problems, or add bugs, faster. But so far it will not make you a better coder than you are.

kristopolous9 hours ago

Correct. LLMs are a thought management tech. Stupider ones are fine because they're organizing tools with a larger library of knowledge.

Think about it and tell me you use it differently.

janalsncm9 hours ago

Software engineering is in the painful position of needing to explain the value of their job to management. It sucks because now we need to pull out these anecdotes of solving difficult bugs, with the implication that AI can’t handle it.

We have never been good at confronting the follies of management. The Leetcode interview process is idiotic but we go along with it. Ironically LC was one of the first victims of AI, but this is even more of an issue for management that thinks SWEs solve Leetcode all day.

Ultimately I believe this is something that will take a cycle for businesses to figure out by failing. When businesses figure out that 10 good engineers + AI always beats 5 + AI, it will become table stakes rather than something that replaces people.

Your competitor who didn’t just fire a ton of SWEs? Turns out they can pay for Cursor subscriptions too, and now they are moving faster than you.

foobarian9 hours ago

I find LLMs a fantastic frontend to StackOverflow. But agree with OP it's not an apples-to-apples replacement for the human agent.

orangebread6 hours ago

It's that time again where a dev writes a blog post coping.

callamdelaney9 hours ago

LLMs will never be better than humans on the basis that LLMs are just a shitty copy of human code.

danielbln8 hours ago

I think they can be an excellent copy of human code. Are they great at novel out-of-training-distribution tasks? Definitely not, they suck at them. Yet I'd argue that most problems aren't novel, at most they are some recombination of prior problems.

jbellis9 hours ago

But Human+AI is far more productive than Human alone, and more fun, too. I think antirez would agree, or he wouldn't bother using Gemini.

I built Brokk to maximize the ability of humans to effectively supervise their AI minions. Not a VS Code plugin; we need something new. https://brokk.ai

devmor3 hours ago

I have been evaluating LLMs for coding use in and out of a professional context. I’m forbidden to discuss the specifics regarding the clients/employers I’ve used them with due to NDAs, but my experience has been mostly the same as my private use - that they are marginally useful for less than one half of simple problem scenarios, and I have yet to find one that has been useful for any complex problem scenarios.

Neither of these issues is particularly damning on its own, as improvements to the technology could change this. However, the reason I have chosen to avoid them is unlikely to change; that they actively and rapidly reduce my own willingness for critical thinking. It’s not something I noticed immediately, but once Microsoft’s study showing the same conclusions came out, I evaluated some LLM programming tools again and found that I generally had a more difficult time thinking through problems during a session in which I attempted to rely on said tools.

uticus10 hours ago

same as https://news.ycombinator.com/item?id=44127956, also on HN front page

CivBase4 hours ago

In my experience, one of the hardest parts of software development is figuring out exactly what the stakeholder actually needs. One of the talents a developer needs is the ability to pry for that information. Chatbots simply don't do that, which I imagine has a significant impact on the usability of their output.

hello_computer46 minutes ago

Corporations have many constraints—advertisers, investors, employees, legislators, journalists, advocacy groups. So many “white lies” are baked into these models to accommodate those constraints, nerfing the model. It is only a matter of time before hardware brings this down to the hobbyist level—without those constraints—giving the present methods their first fair fight; while for now, they are born lobotomized. Some of the “but, but, but…”s we see here daily to justify our jobs are not going to hold up to a non-lobotomized LLM.

pknerd10 hours ago

Let's not forget that LLMs can't give a solution they have not experienced themselves

willmarch9 hours ago

This is objectively not true.

nssnsjsjsjs2 hours ago

So LLMs have sweated to debug a production issue, got to the bottom of it, realised it is worth having more unit tests, come to value that, and then produce a solution that has more unit tests? So when you ask the LLM to write code, it is opinionated and always creates a test to go with it?

AnimalMuppet10 hours ago

OK. (I mean, it was an interesting and relevant question.)

The other, related question is, are human coders with an LLM better than human coders without an LLM, and by how much?

(habnds made the same point, just before I did.)

vertigolimbo9 hours ago

Here's the answer for you. TL;DR: a 15% performance increase, in some cases up to a 40% increase, in others a 5% decrease. It all depends.

Source: https://www.thoughtworks.com/insights/blog/generative-ai/exp...

anjc6 hours ago

Gemini gives instant, adaptive, expert solutions to an esoteric and complex problem, and commenters here are still likening LLMs to junior coders.

Glad to see the author acknowledges their usefulness and limitations so far.

ModernMech6 hours ago

The other day an LLM told me that in Python, you have to name your files the same as the class name, and that you can only have one class per file. So... yeah, let's replace the entire dev team with LLMs, what could go wrong?
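
(For the record, a single Python file can hold any number of classes, and its name doesn't need to match any of them; a trivial counterexample:)

    # shapes_and_things.py -- the file name matches neither class.
    class Circle:
        def __init__(self, r):
            self.r = r

    class Square:
        def __init__(self, side):
            self.side = side

    print(Circle(1).r, Square(2).side)  # both classes coexist in one file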

RayMan110 hours ago

of course they are.

zb38 hours ago

Speak for yourself..

3cats-in-a-coat9 hours ago

"Better" is relative to context. It's a multi-dimensional metric flattened to a single comparison. And humans don't always win that comparison.

LLMs are faster, and when the task can be synthetically tested for correctness, and you can build up to it heuristically, humans can't compete. I can't spit out a full game in 5 minutes, can you?

LLMs are also cheaper.

LLMs are also obedient and don't get sick, and don't sleep.

Humans are still better by other criteria. But none of this matters. All disruptions start from the low end, and climb from there. The climbing is rapid and unstoppable.

varispeed10 hours ago

Looks like this pen is not going to replace the artist after all.

fspoto986 hours ago

Yes, I agree :D

657 hours ago

AI is good for people who have given up, who don't give a shit about anything anymore.

You know, those who don't care about learning and solving problems, gaining real experience they can use to solve problems even faster in the future, faster than any AI slop.

oldpersonintx210 hours ago

but their rate of improvement is like 1000x human devs, so you have to wonder what the shot clock says for most working devs

chuckreynolds10 hours ago

for now. (i'm not a bot. i'm aware however a bot would say this)

Klaus_2 hours ago

[dead]

hackernewshomos5 hours ago

[flagged]

gxs5 hours ago

Argh people are insufferable about this subject

This stuff is still in its infancy, of course its not perfect

But its already USEFUL and it CAN do a lot of stuff - just not all types of stuff and it still can mess up the stuff that it can do

It's that simple

The point is that over time it'll get better and better.

Reminds me of self-driving cars, or even just general automation back in the day - the complaint has always been that a human could do it better, and at some point those people just went away because it stopped being true.

Another example is automated mail sorting by the post office. The gripe was always that humans will always be able to do it better - true, but in the meantime the post office reduced the facilities where humans did this to just one.

habnds10 hours ago

seems comparable to chess where it's well established that a human + a computer is much more skilled than either one individually

bgwalter10 hours ago

This was the Centaur hypothesis in the early days of chess programs and it hasn't been true for a long time.

Chess programs of course have a well defined algorithm. "AI" would be incapable of even writing /bin/true without having seen it before.

It certainly wouldn't have been able to write Redis.

NitpickLawyer9 hours ago

> This was the Centaur hypothesis in the early days of chess programs and it hasn't been true for a long time.

> Chess programs of course have a well defined algorithm.

Ironically, that also "hasn't been true for a long time". The best chess engines humans have written with "defined algorithms" were bested by RL (alphazero) engines a long time ago. The best of the best are now NNUE + algos (latest stockfish). And even then NN based engines (Leela0) can occasionally take some games from Stockfish. NNs are scarily good. And the bitter lesson is bitter for a reason.

bgwalter9 hours ago

No, the alphazero papers used an outdated version of Stockfish for comparison and have always been disputed.

Stockfish NNUE was announced to be 80 ELO higher than the default. I don't find it frustrating. NNs excel at detecting patterns in a well defined search space.

Writing evaluation functions is tedious. It isn't a sign of NN intelligence.

BlackSwanMan7 hours ago

[dead]

hatefulmoron10 hours ago

I don't think that's been true for a while now -- computers are that much better.

vjvjvjvjghv10 hours ago

Can humans really give useful input to computers? I thought we have reached a state where computers do stuff no human can understand and will crush human players.