Archive for the ‘Uncategorized’ Category

Friendliness and The Three Laws

December 25, 2011

If there is one topic that is most likely to come up when discussing the problem of Friendly AI, whether you are talking to an AI researcher or a member of the general public, it is Asimov’s Three Laws of Robotics. In case you have not encountered them in any of the Robot books or the 2004 movie adaptation, they are:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

Of course, the role of The Three Laws in your conversation will vary widely depending on who you are talking to. Many people will simply dismiss the whole FAI problem by referencing them. Those arguing that the problem is important often bring up The Three Laws as a perfect example of how unexpectedly difficult building Friendly machines is in practice, pointing out (correctly) that most of the short stories in the original I, Robot turn on some special case where The Three Laws produce interesting or problematic situations. Over the roughly 40 years in which he developed the universe of the Robot books, Asimov adjusted his Laws in response to such criticism. By The Robots of Dawn (1983) we find a world where The Laws have a nuanced definition of “harm” that includes emotional harm, and where a robot can override orders if it believes the human giving them is not mentally competent. Here the chance of extinction-level events from accidental non-friendliness seems quite small, although clever humans are finding more ways to exploit The Laws against their original intent. The network of goals and beliefs seems pretty well tuned to general friendliness in the societies Asimov creates. Sure, it’s a heuristic soup that might have gone horribly wrong if it were implemented in the real world but, all things considered, it’s a pretty good soup.

However, there’s one aspect of the laws that is almost never brought up: Asimov’s robots were never self-modifying. He makes it very clear that robots have neither the understanding nor the tools to effectively self-modify, although he doesn’t give much of a reason why this should be so. And they still manage to have all the political intrigue and capacity to inspire discussion, even without self-modification. Asimov did later add a zeroth law of protecting Humanity as a whole, “discovered” by a robot in Robots and Empire. It has its own flaws, but I think the rest of his work stands on its own as an effective example of the vast difference in difficulty between friendly AI and Friendly AI. In most discourse outside the SIAI/Less Wrong community the capitalization makes no difference; friendly just means “won’t harm humanity in the long run”. But the original ‘formal’ definition proposed by Eliezer Yudkowsky makes explicit reference to self-modification:

[Friendliness is] an invariant which you can prove a recursively self-improving optimizer obeys.

It’s not overly jargony, but some of the terms could use a little unpacking. Another way of saying it would be: a characteristic of an intelligent agent is “friendly” if you can prove that it will not change even when the agent has perfect knowledge of itself and can perfectly modify itself. Note that this says nothing about what you would want that characteristic to be or how you could influence it. That is an entirely separate question which will also have to be solved before Friendly AI is possible. It is a hard problem and a primary source of plot points in Asimov’s books, but it is not The Hard Problem of ‘simply’ proving that such an invariant exists. Beyond a broad understanding of some general characteristics, we really have almost no idea what a recursively self-modifying agent might do. The general mind-bendingness of the problem points to issues that go to the core of our understanding of mathematics and what systems can or cannot prove about themselves, and makes it clear that throwing a few intuitive heuristics in to start and hoping for the best when you’re building your seed AI is not in any way sufficient to show friendliness.



April 6, 2010

Sorry to the small (and possibly nonexistent) number of you that regularly come here. I haven’t gotten bored of blogging or run out of ideas, but now that my initial enthusiasm has waned I’m taking a more relaxed approach to blogging. I will probably never be a prolific poster on the order of Mike Anissimov or Tyler Cowen, or even Kyle Munkittrick (who was an inspiration of sorts for starting this). Maybe some day, but the copious free time I thought I had is, well, less copious than I thought. I’ve been writing a few posts, which should trickle in as time goes by, but I’ve decided to draft them more carefully and review them a bit more before posting.

But today, a short post about Robocars.

Anyone who knows me well has probably heard me talk about how soon our cars will be driving themselves. I don’t claim any credit for the idea; science fiction writers have been talking about it for a long time, probably since cars were invented. But I still hear people say “I wouldn’t trust a robot to drive me around, what if it goes crazy?” To which I usually respond “Which is more likely? Your robot driver going crazy or your human driver?” Which of course ignores the incredibly low probability of an AI driver getting drunk, falling asleep at the wheel, or getting distracted answering its phone or changing the radio station. But this argument usually ends with “Whatever, it’s going to be a long time until that happens anyway”.

Well it isn’t. Robotic cars are coming sooner than you think. There are already cars that park themselves. This year, a car is coming out that brakes to avoid hitting pedestrians. Some nut in California has even built a system for a Prius that drives itself. It’s only a matter of time before we all have these. Seriously, it’s not just some lone crazy (me) saying this; the vice president of R&D at GM says we’ll have fully autonomous cars by 2020.

And this will be a good thing. Usually when left-leaning types hear this argument, their next fear (after safety) is that this will mean a postponement of the inevitable death of the car. Our wonderful public transit future where everyone rides high-speed rail or bikes is slipping away. I’m with them on the bikes (and E-bikes are going to make this even better), but trains of all types are an expensive waste of energy, and are very difficult to move or reconfigure as demographics change. Buses are a much better option, although I personally find them to be overcrowded, nausea-inducing death traps, but they would also benefit from autonomous control. If we can all have energy-efficient robot taxis driving us around, rural citizens included, why do we need trains OR buses, except to satisfy some communitarian dream of everyone travelling together?

Since I’m all about making falsifiable predictions to track my understanding of where the world is going, here’s today’s: I’ll probably have to drive the first car I buy, but the second (or maybe the third) will be able to drive itself.

If you need more convincing of the utility and feasibility of this technology, see Brad Templeton’s presentation at Foresight 2010. The robot cars are coming, and when they get here we’ll all be better off for it.

Links for 2/5/2010

February 5, 2010

Just when you thought lolcats were the strangest cat thing the internet could think of, Japan out-weirds you. (HT Pop Transhumanism)

New ARG launching in early March, apparently funded by The World Bank. It’s got kind of a cheesy “try and pretend we’re counterculture cyberpunk” vibe, but they’ve got my interest. There’s even a (slightly cringe-inducing) website/comic up to promote it. (HT Gene Becker)

Michael Anissimov on the dangers posed by synthetic microorganisms. Seriously scary. If you’re one of those people who scoff at this danger and say “evolution has been dealing with this sort of thing for billions of years, what makes you think scientists can beat it?” you should really give that article a read.

Cloud Culture is coming. I understand the concerns and worries but feel that, on balance, more cultural exchange and creativity will always be a good thing (HT Bruce Sterling)

Sorry locavores, looks like your quest is almost entirely in vain. If it’s saving the planet you’re after, you should either think a lot harder about what you’re doing, or put your time into geoengineering. On the bright side, if J. Storrs Hall’s weather machine idea works, we might not even have to worry about this whole greenhouse gas thing as long as we can develop molecular nanotech.

Categories: Uncategorized

Coi ro do

January 25, 2010

Hello internets, outernets, and anyone who happens to find their way here!

After a few abortive attempts last year, I’ve finally decided to start my own blog. The name is a Lojban word (of course) for wild new ideas, which, in case you plan on shouting it from the rooftops, is pronounced “sheesh kehm neen seeho”. The publication and dissemination of interesting new thoughts and ideas will be one of the primary goals of this blog, but since the internet is already chock full of that sort of thing, I have some other aims as well. First and foremost, I need a place to put all my thoughts, comments, rants, and ramblings that don’t belong or fit on Facebook, Twitter, or Newser. I find that the way I view the world and the things I think can change dramatically over time, so it’d be really nice to have a record of what the Will of 2010 onward thought about various topics. Although that’s sort of a solipsistic goal, I’m making this public in the vain hope of attracting some interesting commentary and discussion. So feel free to post some kind of comment, whether it’s a well-reasoned and thoughtful response or just a rickroll/goatse link in disguise.

I’d also like to establish an online presence for, and a collection of, the set of memes (called a ‘memeplex’ apparently) that reside in my mind. To that end, some of the topics you will find me blogging about here are: economics; transhumanism; ethics/morality; maybe a bit of politics; computers and emerging technology; Lojban; artificial intelligence; whole brain emulation; science and naturalism; anti-aging and immortality; strange links, lolcats, and other wonders of the WWW; the singularity (if it ever happens); and possibly the occasional “hey I’m on vacation look at mah pictures” post.

So welcome one and all. It’s a bit ugly ’round here right now, but I hope to update the theme soon. Please add me to your Google Reader queue and stop by occasionally, or just leave me to ramble alone at the vast uncaring wasteland of the internet. Your call.
