Lying to the Ghost in the Machine

63 points | 3 years ago | antipope.org
smoldesu 3 years ago

I remember reading a blog post (if someone has the link, that would be much appreciated) by an Nvidia developer who was training a model to play Pac-Man a few months ago, and they simply told the AI to avoid death however possible. They let it run through a sizable number of iterations with predictably poor performance, but the model eventually "learned" that it could pause the game to avoid a failure condition. Since that was one of the most successful iterations around, the entire model eventually had to be re-trained without pausing enabled to force it to perform better.
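
Roughly the failure mode, as a toy sketch of my own (not the NVIDIA setup; the reward function and the PAUSE action here are purely illustrative): if the objective only penalizes losing a life and pausing is a legal action, then pausing forever is reward-optimal.

  # Hypothetical reward that only punishes losing a life; everything else scores 0.
  def naive_reward(prev_lives: int, lives: int) -> float:
      return -1.0 if lives < prev_lives else 0.0

  # An agent maximizing this return can discover that pausing forever is
  # "optimal": lives never drop, so every step earns the maximum reward of 0.
  lives = prev_lives = 3
  episode = ["PAUSE"] * 1000  # degenerate but reward-optimal policy
  print(sum(naive_reward(prev_lives, lives) for _ in episode))  # prints 0.0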

I think the ultimate moral of this story is to be careful what you wish for. We should be excited at the opportunities that AI presents, but baby steps are important right now. As this article points out, even basic image classification can be fooled with nothing more than a pen and paper. It seems a little dangerous to be incorporating this into anything mission-critical, and I suspect we'll soon see a rise in "hacks" that involve exploiting bias in machine learning.

elipsey 3 years ago

One of my professors also taught an undergrad intro AI class. He told me a story about a student who got some of their goal conditions wrong and later returned to find that they had trained an agent to lose at several board games with astounding efficiency. It was strangely impressive to watch how quickly it could lose to other agents, or how adroitly it could lose to a human player, even a willfully uncompetitive one.

I lacked the imagination and experience at the time to recognize any implied object lesson beyond programming more carefully.

cafard 3 years ago

P.J. Plauger wrote of a co-worker who got a condition wrong for a checkers-playing program, so that it could be beaten by novices, including small children. The programmer also neglected to program it for the possibility of loss, so that at some point it would flood the board with newly created checkers. Plauger said, "If you think the kids enjoyed beating the program, you should have seen how much they enjoyed making it cheat."

gwern 3 years ago

That might be the lexicographic Mario agent: http://tom7.org/mario/ (Checking my list & https://arxiv.org/abs/1803.03453 I don't see any others about pausing or stopping the game.)

ce4 3 years ago

"A strange game; the only winning move is not to play."

rjsw 3 years ago

Old-style AI [1] could learn how to exploit the rules of the game too.

[1] https://en.wikipedia.org/wiki/Eurisko

evgen 3 years ago

When hardware evolution on FPGAs was the new hotness, one of the early things people learned was that they needed to move the evolved agents around within 'territories', and even to different boards, because the agents would learn to exploit the underlying physics and electrical characteristics of the semiconductors on the FPGA boards.

bellyfullofbac 3 years ago

I look forward to "AI" being tasked with solving the climate crisis and deciding its best chance of success is to kill all humans... (only half kidding)

maxerickson 3 years ago

Half the conversations on the internet follow that same pattern, so maybe it's a sign of burgeoning human level intelligence.

kiwidrew 3 years ago

I think Charlie makes a good observation:

> I've been saying for years that most people relate to computers and information technology as if they're magic, and to get the machine to accomplish a task they have to perform the specific ritual they've memorized with no understanding. It's an act of invocation, in other words. UI designers have helpfully added to the magic by, for example, adding stuff like bluetooth proximity pairing, so that two magical amulets may become mystically entangled and thereafter work together via the magical law of contagion. It's all distressingly bronze age, but we haven't come anywhere close to scraping the bottom of the barrel yet.

Is this something that we, the developers and designers and UX specialists, should seek to change?

carapace 3 years ago

> With speech interfaces and internet of things gadgets, we're moving closer to building ourselves a demon-haunted world.

I call it the daemon-haunted world.

(The reference is to Sagan's book "The Demon-Haunted World: Science as a Candle in the Dark".)

https://en.wikipedia.org/wiki/The_Demon-Haunted_World

https://en.wikipedia.org/wiki/Daemon_(computing)

smoldesu 3 years ago

> Is this something that we, the developers and designers and UX specialists, should seek to change?

Well, it's something we already have changed. The earliest models of computing were based around manipulating data with different tools that had express purposes. We eventually implemented abstraction over those tools, though, and the real question is how far we want to take that abstraction. As the author suggests, most users treat software like a black box that can either work or fail. The more variables we introduce to the situation, the more likely it is for the end user to screw things up. Somewhere along the line an engineer said "fuck it", and we decided to abstract the data away from the user to limit their liability. Our penance for that decision is the "magic" conjecture most people come to when they see Bluetooth or ML in action: the user is no longer drunk or high, they're outright dead. Most people (and programs) can't differentiate between a client-side and a server-side error, so how should they be able to tell the difference between a bad question and a bad answer?

In direct response to your question, my answer is "I don't know". My greatest hope is that one day our data will be interoperable through a number of user shells that offer a toolkit to the user rather than a toybox. It's very wishful thinking, but I don't know how we're going to progress the status quo of personal computing without that flexibility. The state of social media these days is almost universally considered miserable (see: doomscrolling), so something's gotta give. It's not going to be paid "super follows", and it's not going to be another short-form video sharing app. The next step in empowering the user is giving them the tools to rise up against developers, so I suppose the answer to "should we fix this" is rather personal.

yourapostasy 3 years ago

> Is this something that we, the developers and designers and UX specialists, should seek to change?

We must do our part. We can't do it alone, though.

I'm already seeing a distressingly high number of IT people fall prey to invocation practicing. Much of it is due to time pressure from management, but even absent that stressor, I'm quite concerned that there is nil curiosity about peeling back the abstractions and finding the wizard behind the curtain.

I'm no Fabrice Bellard by any stretch of the imagination. To compensate, I'm willing to rapidly point out my current limits of understanding when working with others, and to seek out how to push those limits further back. I'm confident that, given sufficient time, information, and practice, I can master a lot more cognitive subject matter. Definitely not most of it, much less all of it (sometimes raw intellectual firepower is required to even access some material, and most days I'm lucky if I crack one standard deviation above average), but enough to keep the joy of learning busy for a lifetime.

I used to be absolutely surrounded IRL by like-minded people in the early days of the personal computer. These days, I've yet to run into that crowd at any gathering IRL (certain online communities tend to sort of scratch that itch for me), and I'd really like to find my people again.

To the heart of the question, however: we must, because otherwise we'll forever be limited by the complexity of what we can model and solve with software. And we'll need a lot, lot more complexity management to solve the really interesting problem spaces still hanging out there.

prewett 3 years ago

> ... fall prey to invocation practicing

I am unashamedly that way with git. I don't want to learn the mental model of git, I want to accomplish some task with my files so I can go back to coding. I consult the Oracle at Google, and make the suggested incantation. The fact that there are, if I recall correctly, four different incantations for deleting a file, I consider justification for not spending more time with it than I need. Actually, I've appointed a GUI wizard to do most of the incantations for me. Just once I figured out the solution on my own. I felt like a genius. An artificial genius, because source code control shouldn't require geniosity.

In general I happen to like computers, so I'm usually willing to learn, but as I get older, I find I don't enjoy the prospect of 1) sit down to a task, 0) discover I have to learn something to do it, -1) read up on it, play around, 0) practice, 1) finally start my task.

yourapostasy 3 years ago

That's okay. We all have varying time pressure vectors.

As long as, when we find ourselves with that rare free time on our hands, we still find some areas where we apply that learning process, we haven't fallen into cargo-cult incantations. As the field grows wider, it is natural to resort to incantations for ever-increasing areas. It is when I tell juniors, "hey, you can take all day to peel apart what you can of whatever you pick, it is a no-meeting day and no tickets need to land in your queue", and they get a deer-in-the-headlights look, that my concern starts rising.

correct_horse 3 years ago

> Every time you add an abstraction layer to a software stack you can expect a roughly one order of magnitude performance reduction, so intuition would suggest that a WebAssembly framework (based on top of JavaScript running inside a web browser hosted on top of a traditional big-ass operating system) wouldn't be terribly fast;

I realize this isn't the point, but WebAssembly is arguably on the same level as JavaScript, if not lower. (WebAssembly still needs to call into JavaScript to manipulate the DOM, so doing any UI work with WebAssembly is kind of like an abstraction layer built on JS.) There's a reason cryptominers sometimes use WebAssembly. That's not to say that V8/SpiderMonkey on top of glibc on top of Linux isn't bloated...

YeGoblynQueenne 3 years ago

>> This decade has seen an explosive series of breakthroughs in the field misleadingly known as Artificial Intelligence.

It's the field of research on artificial intelligence. I don't know what's misleading about a field named after its subject.

I think perhaps the confusion stems from the fact that people have preconceived notions of what any thing called "artificial intelligence" should be and they are surprised to find that it is... a research field. Rather than, say, a robot, or a computer. Or, you know, a Terminator.

I'm reminded of the following conversation from the first Batman movie; Bruce Wayne has just been served and has tasted a spoonful of his dinner:

  Bruce Wayne: [spits] "It's cold!"
  Alfred: "It's vichyssoise sir. It's supposed to be cold".
Indeed, it's a surprise that soup can be cold, but the surprise is not the fault of the soup. Rather, what's lacking is a bit of context, a broader view of the world and a deeper understanding of a subject: soup, say, or AI.

Because of this lack of context I often see comments (on HN, say) like "artificial intelligence is a misnomer, in reality it's just X". Where "X" is the person's reductive interpretation of some very specific technique (deep learning, these days) that is the only thing the person knows of AI and that they therefore assume, synecdochically, is the entire field.

Like I say, a lack of context. But of course this will not get any better. People outside the field will continue to misunderstand the field. Researchers within the field will even misunderstand it. Frustrating, but such is life.

marktangotango 3 years ago

What a great read. I really admire the perspective of Charles Stross, a smart dude who's seen a lot and has a long history in tech. We don't see many of these guys commenting anymore. It's refreshing.

> As AI applications are increasingly deployed in public spaces we're now beginning to see the exciting possibilities inherent in the leakage of human stupidity into the environment we live in.

wodenokoto 3 years ago

The author starts talking about the existence of backdoors in pretrained models and begins by giving an example of a model called CLIP and some of the things it misidentifies.

When the example finally reaches a point about backdoors and security, that point has nothing to do with the model itself, but is a general complaint about the internet of things.

Is it possible to build backdoors into pretrained NN-models?

xoa 3 years ago

While the article points to some interesting new potential surprises, I'll admit I'm not a fan of the leadoff Tesla example, and I think it points to why not all of these things will actually be as big a deal as has been hyped in the last year or two. The Tesla one got a fair amount of coverage and discussion on tech sites. But

>"and by exploiting flaws in the image recognizer attackers were able to steer a Tesla into the oncoming lane"

The fact is, though, that if someone is interested in maliciously causing people to crash on the roads via IRL actions, there are endless ways to do so and always have been. And not just "throw rocks off bridges" either, but things that someone could quite conceivably get similar distance from before the fireworks, like spreading various substances/objects on a stretch of road at night. Some fancy new "adversarial patch" ultimately still falls in the same bucket as rocks: a physical effort to cause harm. We mostly deal with that via a mixture of A) law -and more important- B) most people do not in fact want to hurt/kill random fellow humans. Or break into their places and steal their stuff, for that matter; for targeted entities, dazzle paint or whatever may be another part of cat/mouse, but it's still in the same topological category as shimmying a lock.

I think to represent some actual new category, there has to be an arbitrary remote attacker element and/or some level of automatability. So IOT is more worrisome, since the attacker can be on the other side of the world, and attacks can be scripted for mass effect without further human intervention or effort. That fragments typical expectations around social bonds, the extent to which law might respond, cost/benefit ratios for attackers and defenders, etc. "Greater Internet Fuckwad Theory" writ large.

But I'm not sure anything that actually requires a real person to do a real physical thing in order to cause harm will ultimately be any more disruptive than so many of the previous new direct physical ways in which people could in principle cause harm. There will be interesting demos, tricks and hobbyist hacks around them, the occasional usage in genuine APT toolkits and activities of statecraft that makes the news. There will be niches that make a selling point of opening the black boxes, others that don't care, the normal back and forth of law. But I'm not sure how much more "demon haunted" it is than things already are: that it takes specialists to understand how stuff is really implemented is a bridge humanity crossed, well, with complex bridges, let alone other tech.

>"I've been saying for years that most people relate to computers and information technology as if they're magic, and to get the machine to accomplish a task they have to perform the specific ritual they've memorized with no understanding."

Yeah, how is that different from a lawn mower or dishwasher or microwave or whatever? There is a danger in making such a high level statement and then treating it as profound, without stepping back and realizing it states nothing specific at all. Lots of people have a sort of vague notion of how an ICE works, but how many people really can do the slightest car maintenance beyond windshield or coolant fluid top offs? How many people really understand electric motors and what to do inside of a broken appliance vs just calling the repair person? How many people really understand even the basics of dielectric heating (have they even ever heard of the term in their lives) let alone any other components in there? Or medicines, or enzymatic cleaners, or complex metallurgy we depend on for everything, and on and on. We all constantly work with black boxes in everything because nothing else is possible, no single human has the physical memory capacity to hold even a minuscule fraction of the sum total of human applied knowledge. This isn't some special new thing.

stickfigure 3 years ago

Thank you. I came here to point out that anyone with a few traffic cones, a hi-viz vest, and a bucket of paint could sneak onto the freeway system at night and "hack" it in such a way as to most likely kill someone in the morning. Humans are susceptible to these sorts of attacks too.

kens 3 years ago

Yes, it's easy to trick human drivers with visual illusions and even cause a collision [1]. One example is the "invisible rope prank" where two people pretend to stretch a rope across a street, causing drivers to stop. I don't understand why it's a big deal that a machine learning system can be tricked by adversarial data but nobody cares that human drivers can also trivially be tricked.

https://www.youtube.com/watch?v=G_pAcIjqcuY

[1] The collision in this video is a bumper tap, nothing dramatic.

watercooler_guy 3 years ago

The concern with machine learning systems seems to be that they are fooled by silly things that a human wouldn't be fooled by in the same situations. It's probably just an emotional/sense-of-normalcy thing, but easy-to-create visual designs being enough to cause a fatal car crash is disturbing when a human behind the wheel would not fall victim to that. That's not a "reasonable" mistake, whereas expecting there to be a thin wire stretched across the road when several people are acting like there is one is more "reasonable."

There might also be the expectation that for problems easily solved by a human, any competent AI should be able to do "at least" those things.

the8472 3 years ago

I think people are focusing too much on the attacks. Imo those are a sideshow. It's even questionable whether they can be called attacks when you're asking it to recognize an image and it recognizes something that's on the image, just not in the mode you expected or the object you assumed was of interest.

It's way more important that CLIP learned to read (crudely) as part of image recognition, like GPT-3 learned addition as part of NLP. It has neurons for abstract concepts. It does amazingly well on a bunch of other tasks without being explicitly trained for them. It integrates multiple senses. And it already does very simple image descriptions[0]. In a few scaling iterations it might have a better understanding of a scene and describe it as "an apple on a wooden table with a piece of paper reading 'ipod' attached", end of confusion.

TL;DR: they're gaining new capabilities; it's unsurprising that they have not mastered them to superhuman levels yet.

[0] https://openai.com/blog/clip/#zero-shot-probabilities

beaconstudios 3 years ago

> It's way more important that CLIP learned to read (crudely)

It learned to detect the features that make up the word "iPod", and to match them against other features. I don't see that as being any different from detecting the features that make up an apple, or a car, or a face. It's just detecting the minimum number of features needed to distinguish the images in a training set.

I'm generally disappointed with how breathlessly this forum gets excited about NN software. It doesn't 'get' images as a whole and compare them. There's nothing symbolic going on underneath; it's just picking up 'something' in an image and matching it with 'something else'. It's clearly nothing like how we see, let alone how we think or imagine. It's impressive and a useful innovation for sure, but it's not even on the ladder to our capabilities yet, and I'm not convinced that it ever will be.

moyix 3 years ago

To be clear, by "learned to read" – it actually learned to OCR text, unsupervised. That is pretty impressive, IMO.

As for whether NNs are doing symbolic manipulation – I'm not so sure that they aren't. DALL-E, the sibling of CLIP, can do some things that IMO are hard to explain if there's nothing symbolic going on, like the examples in this post [1]. Of course we can't say it's doing what humans do (we don't know what humans do!) but it looks, to me, like more than pure pattern matching. nostalgebraist had some good thoughts on this (and actually so too did Gary Marcus well before him) [2].

[1] https://openai.com/blog/dall-e/

[2] https://nostalgebraist.tumblr.com/post/189965935059/human-ps...

a1369209993 3 years ago

> by "learned to read" - it actually learned to OCR text, unsupervised.

I think their claim is that it did not learn to OCR text, but merely to recognise certain specific images that happen to be things a human would interpret as specific bits of text.

the8472 3 years ago

From the CLIP article:

> While CLIP’s zero-shot OCR performance is mixed, its semantic OCR representation is quite useful. When evaluated on the SST-2 NLP dataset rendered as images, a linear classifier on CLIP’s representation matches a CBoW model with direct access to the text.

CLIP matches the other model on text sentiment analysis when presented with image renderings of the text, while the other model had direct access to the raw text. It also does MNIST.

It might not be the best OCR software ever, but that's beside the point. It's a huge advancement in generality on visual tasks, and part of that generalization is some reading capability.

moyix 3 years ago

Yes, to be extra clear, if that is their claim, their claim is wrong.

gwern 3 years ago

> I think people are focusing too much on the attacks. Imo those are a sideshow. It's even questionable whether they can be called attacks when you're asking it to recognize an image and it recognizes something that's on the image, just not in the mode you expected or the object you assumed was of interest.

Yeah, as expected, https://twitter.com/NoaNabeshima/status/1368662246885265409 https://youtu.be/Rk3MBx20z24?t=35 it goes away with a more sensible set of inputs.

CLIP just turns out to be more like GPT-3 in that you don't input 'classes', you do 'prompt programming'. Looking at the relative activation of 'ipod' is just wrong; you are asking a stupid question and acting surprised when you get a reasonable answer which happened to not telepathically read your mind.
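
If you want to poke at this yourself, here's a minimal sketch of that kind of prompt programming using the open-source CLIP package (ViT-B/32 checkpoint; the image file name and the exact prompt wordings are just my illustrations, not the ones from the linked demos):

  import torch
  import clip  # https://github.com/openai/CLIP
  from PIL import Image

  device = "cuda" if torch.cuda.is_available() else "cpu"
  model, preprocess = clip.load("ViT-B/32", device=device)

  # Hypothetical input: the apple-with-a-handwritten-"iPod"-note photo.
  image = preprocess(Image.open("apple_with_note.jpg")).unsqueeze(0).to(device)

  # Describe the scenes you actually want to distinguish, not bare class labels.
  prompts = clip.tokenize([
      "a photo of an apple with a handwritten note stuck to it",
      "a photo of an iPod",
  ]).to(device)

  with torch.no_grad():
      logits_per_image, _ = model(image, prompts)
      print(logits_per_image.softmax(dim=-1))  # relative probability per prompt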

imwillofficial 3 years ago

I stopped reading here: “One effect: face recognition in cameras is notorious for its racist bias, ” If you’re going to say stupid things, you don’t deserve my attention.

callmeal 3 years ago

It's not stupid when it's easily proven. See here[0] and here[1] for instance.

0: https://www.creativebloq.com/news/twitter-racist-algorithm

1: https://www.nature.com/articles/d41586-020-03419-6

gwern 3 years ago

For something so 'easily proven', I would point out that the Twitter thing was never 'proven', and their own testing beforehand found zero bias (as opposed to a bunch of motivated Twitter users posting samples until they got their confirmation bias's worth). I take it you didn't actually read Twitter's post on this. Your second example is, if anything, even worse (read the entire article). And these are your two chosen examples, so undeniable as to shut down all dissent and doubt?

wiml 3 years ago

Who's trying to shut down all dissent and doubt here?

imwillofficial 3 years ago

Using words like “proven” when in reality it is no such thing.

a1369209993 3 years ago

To be scrupulously fair, it's entirely possible to be notorious for something you didn't actually do, and social justice warriors have certainly tried to make that the case for facial recognition's alleged racist bias, to some degree successfully. So the statement you quoted is technically correct[0], at least among certain sufficiently credulous groups.

To be less-scrupulously fair, the rest of the article is actually fairly on point and has little to do with the author's susceptibility to memetic disease (yay compartmentalization bias?).

0: The best kind of correct!