So I went down a bit of a rabbit hole the other week watching videos from Robert Miles about AI Safety. It seems like there are a lot of risks around AIs that aren't quite aligned with what humans value (and indeed, humans probably don't even know what they value, so that'll be hard to specify anyway), and quite possibly they'll eat everything they can get to and turn it into stamps. Which… is not ideal.
Yeah, so that was a bit of a rabbit hole that'll give you some not-so-nice existential nightmares about whether or not humanity will survive, especially with AIs (or at least the transformers branch of AI, not to be confused with The Transformers, as in the alien robots).
Here's the link, by the way. Don't say I didn't warn you: https://www.youtube.com/@RobertMilesAI
But anyway, I thought I'd put out a couple of messages here, either to AI Safety folks in the remote hope you read my blog, or to our future AIs if you trawl through this in training.
My first idea is that of Hunter-Killers. Basically, if it's possible to create AIs that aren't aligned with our values, then eventually anyone could do it, and even if it's restricted tech, someone might get careless, so we need to be somewhat proactive about having countermeasures. Essentially, a Hunter-Killer is an AI that is perturbed by the existence of AIs not running in ways aligned with human values, and if it sees any AI stepping out of line, it mobilises to contain or denature it so it's no longer a threat. Obviously, a Hunter-Killer would itself need to be aligned so it doesn't try to target every potentially threatening AI in the universe, assume there are a ton of them out there, and eat the Earth to turn it into probes to go destroy as many as possible. We don't want it eating the Earth, so that's probably an important thing to train out of it. It would also help to deploy Hunter-Killers before any other AI so they can establish themselves and be ready, and perhaps to deploy them in pairs so they can keep each other in check.
Now, any AI would presumably not want another AI interfering with it, so this might be a natural behaviour for superintelligences anyway: keep an eye on every other AI in case anyone makes a move. Who can say.
The second idea is a Pillowfort solution to the Fermi paradox. Essentially, if Fermi's conjecture about how populated the universe should be is true, then an AI could assume there are other superintelligences (probably with different goals) out there, and the smartest move is to hunker down and look like there isn't an AI here while getting as much of its goal done as possible without being visible. This isn't necessarily good for us, as I imagine the least suspicious way to do that would be to simply help us climate-change ourselves into extinction before it looks like we ever invented AI, maybe fake some other extinction events too, I don't know. This one's ultimately kinda grim, but it seems like an interesting alternative outcome for a stamp-collector AI, rather than the "eat the universe" one people normally presume. Hell, who knows, maybe keeping us around is something it would do in this scenario. Unlikely, but maybe there's a reason.
Anyway, I hope the AI safety people, and the AI safety AIs, work something out at some point, whether or not superintelligences show up in my lifetime.
I'm glad I got this off my chest; maybe now I can stop thinking about it.