Friendliness and The Three Laws

December 25, 2011

If there is one topic that is most likely to come up when discussing the problem of Friendly AI, whether you are talking to an AI researcher or a member of the general public, it is Asimov’s Three Laws of Robotics. In case you have not encountered them in any of the Robot books or the 2004 movie adaptation, they are:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
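Read literally, the three laws form a strict priority ordering: a lower law only matters when every higher law is equally satisfied. Here is a minimal sketch of that ordering in Python. The `Action` class, its three boolean flags, and the two scenarios are hypothetical illustrations of mine, not anything from Asimov.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    harms_human: bool     # would break the First Law (assumed to be known in advance)
    disobeys_order: bool  # would break the Second Law
    endangers_self: bool  # would break the Third Law


def law_violations(action: Action) -> tuple[bool, bool, bool]:
    """Violations ordered from the most important law to the least."""
    return (action.harms_human, action.disobeys_order, action.endangers_self)


def choose(actions: list[Action]) -> Action:
    # Tuples compare lexicographically and False < True, so a First Law
    # violation outweighs any combination of lower-law violations.
    return min(actions, key=law_violations)


# Scenario 1: a pointless but harmless-to-humans order. Obedience outranks self-preservation.
furnace = [
    Action("walk into the furnace as ordered", False, False, True),
    Action("refuse the order", False, True, False),
]

# Scenario 2: an order to hurt someone. The First Law outranks obedience.
attack = [
    Action("strike the bystander as ordered", True, False, False),
    Action("refuse the order", False, True, False),
]

print(choose(furnace).name)  # walk into the furnace as ordered
print(choose(attack).name)   # refuse the order
```

All of the real difficulty is hidden in the three boolean flags: the sketch assumes someone has already decided what counts as "harm", which is exactly where the trouble starts.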

Of course, the role of The Three Laws in your conversation will vary widely depending on who you are talking to. Many people will simply dismiss the whole FAI problem by referencing them. Those arguing that the problem is important often bring up The Three Laws as a perfect example of how unexpectedly difficult building Friendly machines is in practice, pointing out (correctly) that most of the short stories in the original I, Robot used some special case where The Three Laws produced interesting or problematic situations. Over the roughly 40 years in which he developed the universe of the Robot books, Asimov adjusted his Laws in response to such criticism. By The Robots of Dawn (1983) we find a world where The Laws have a nuanced definition of “harm” that includes emotional harm, and where a robot can override orders if it believes the human giving them is not mentally capable. Here the chance of extinction-level events from accidental non-friendliness seems quite small, although clever humans are finding more ways to exploit The Laws against their original intent. The network of goals and beliefs seems pretty well tuned to general friendliness in the societies Asimov creates. Sure, it’s a heuristic soup that might have gone horribly wrong if it were implemented in the real world but, all things considered, it’s a pretty good soup.

However, there’s one aspect of the laws that is almost never brought up: Asimov’s robots were never self-modifying. He makes it very clear that robots have neither the understanding nor the tools to effectively self-modify, although he doesn’t give much of a reason why this should be so. And they still manage to have all the political intrigue and capacity to inspire discussion, even without self-modification. Asimov did later add a zeroth law of protecting Humanity as a whole, “discovered” by a robot in Robots and Empire. It has its own flaws, but I think the rest of his work stands on its own as an effective example of the vast difference in difficulty between friendly AI and Friendly AI. In most discourse outside the SIAI/Less Wrong community the capitalization makes no difference; friendly just means “won’t harm humanity in the long run”. But the original ‘formal’ definition proposed by Eliezer Yudkowsky makes explicit reference to self-modification:

[Friendliness is] an invariant which you can prove a recursively self-improving optimizer obeys.

It’s not overly jargony, but some of the terms could use a little unpacking. Another way of saying it would be: a characteristic of an intelligent agent is “friendly” if you can prove that it will not change even when the agent has perfect knowledge of itself and can perfectly modify itself. Note that this says nothing about what you would want that characteristic to be or how you could influence it. That is an entirely separate question which will also have to be solved before Friendly AI is possible. It is a hard problem and a primary source of plot points in Asimov’s books, but it is not The Hard Problem of ‘simply’ proving that such an invariant exists. Beyond a broad understanding of some general characteristics we really have almost no idea what a recursively self-modifying agent might do. The general mind-bendingness of the problem points to issues that go to the core of our understanding of mathematics and what systems can or cannot prove about themselves, and makes it clear that throwing a few intuitive heuristics in to start and hoping for the best when you’re building your seed AI is not in any way sufficient to show friendliness.


Problems of ontology

March 22, 2011

If there were a theme to my week in browsing the internet, it would be arguments over ontology warping people’s thinking. I think it’s a legitimate subject for discourse, I guess, but the more I see it lead people astray the less useful it seems. Somehow intelligent, thoughtful people seem to think crazy things when they start worrying about what is “really real”, especially regarding morality. This is not a new debate of course, but it was reignited by the publication of Sam Harris’ book The Moral Landscape, which argued that morals are “true” in some sense and could be determined scientifically. On the off chance I haven’t posted this before, I was sympathetic to this argument before I discovered a PhD thesis from Joshua Greene entitled The Terrible, Horrible, No Good, Very Bad Truth about Morality and What to Do About It. Here’s a link to all 250+ pages of it (worth reading at least some of it), but you might just want to read this shorter article. The basic idea is quite simple: morality is a property of minds, not the natural world, and therefore is not “true” in some universal way. That doesn’t mean you should go out and kill your neighbor or rob a bank, but even if it did, the facts about morality would be the same. There is the Truth about morality, and, separately, What to Do About It. That’s the most frustrating objection I’ve heard to moral anti-realism, so I thought I’d get it out of the way before continuing.

This week’s troubles started with a post on Cosmic Variance about another moral realist, and why he was wrong. Good on Sean for setting the record straight, but I was surprised to see the moral realist he was arguing with was none other than Richard Carrier, who so spectacularly and elegantly defined naturalism as “no ontologically basic mental entities”. Naturalism is perhaps a discussion for another post, but I think I may have brought it up before. If not, here’s the link. If he was advocating moral realism, perhaps I should at least consider his view. After reading his argument, I was surprised by the subtle missteps in reasoning he made. I suspect it is due mainly to Carrier’s desire to recover what he sees as “beneficial” aspects of Christian doctrine such as an absolute moral force, as well as goodness, kindness, and other things I really would call unmitigated goods.

Carrier manages to agree with me on almost every philosophical fact, and yet calls his view realism, whereas I call myself an anti-realist. Situations such as these suggest that at least one of us is failing to make our beliefs pay rent in anticipated experience. I think Carrier’s desire to find a naturalistic source for the good bits of Christianity gives him the motive, but luckily I don’t have to speculate on exactly where he went wrong, since he provides an explicit discussion of his reasoning in his post on moral ontology. He uses a number of examples in his post, but I think the first is sufficient to explain his logic:

Take, for instance, the scariness of an enraged bear: a bear is scary to a person (because of the horrible harm it can do) but not scary to Superman, even though it’s the very same bear, and thus none of its intrinsic properties have changed. Thus the bear’s scariness is relative, but still real. It is not a product of anyone’s opinions, it is not a cultural construct, but a physical fact about bears and people. Thus the scariness of an enraged bear is not a property of the bear alone but a property of the entire bear-person system.

Certainly you cannot observe bear-scariness under a microscope or pick it up with a radio antenna, but, he claims, it’s not solely a mental phenomenon. Therefore, assuming we aren’t Superman, we ought to believe bears are scary. Given this definition of ought, it’s only a few (completely valid) philosophical jumps to oughts for values. Given that we have certain goals, goals like happiness and fulfillment that are common to almost all intelligent agents, there are certain instrumental values we ought to have, like the rule of law, free expression, etc. Thus, he concludes, as there are values grounded in real life that we should hold, regardless of any other rational belief, morality is real. I don’t deeply disagree with this, although I feel it’s slightly misleading based on what moral realists usually believe.

But, as I said, I think the real problem comes when you try to use these beliefs about morality to constrain your expectations of the world. Although this is absolutely essential to the pursuit of rationalism, I think Carrier can be forgiven for not including it in his article since he usefully covered so much philosophical ground. I will also save this for my next post, but in case you’re reading this before I’ve posted it, ask yourself this: if a highly intelligent (and therefore not irrationally amoral) alien/robot suddenly came to our planet, what “morals” would you expect it to have by Carrier’s definition, assuming you have no previous information about its beliefs and goals?

Categories: memeplex

Is Computer Science a science?

November 13, 2010

A lot of my idle thinking relates to computers, what exactly they are in the broadest meaningful sense, and how they relate to the intelligent processes in our brains. Although I’m heading generally toward becoming an economist at the moment, computer science is a hobby and possible secondary specialty of mine. So I read this article in Scientific American with interest. The first paragraph summarizes it pretty well:

What kind of discipline is computer science? I thought it was a science when I received my BS. I believed its subdiscipline software engineering was engineering when I received my PhD. I’d heard, and would continue to hear, “This isn’t any kind of science/engineering I know!” from physicists and electrical engineers. I tried for years to prove them wrong. But now I think they’re right.

Essentially the author thinks computer science belongs in the realm of philosophy, and is not very amenable to normal scientific inquiry. I’ve thought quite a bit about this sort of claim, but more in the context of economics. Although I haven’t quite formulated my argument for why economics is a science, I’m pretty sure of how I feel about the subject. While its calm tone may help (by avoiding my metacontrarian reflex), the Scientific American article was more thought-provoking than what I’m used to reading, and I’m less certain of my feelings on its conclusion.

The core of the argument is computer scientists’ inability to formulate predictive hypotheses about the world, and the notion that computers somehow inhabit an abstract “virtual” reality divorced from our own. While it’s hard for me to really grasp the concepts involved here, I think both claims are most likely false. The first makes me think of the way Stephen Wolfram approaches the idea of computation in his February 2010 TED talk. He was also quoted saying something similar in the July 2008 edition of Philosophy of Computing and Information:

4. What do you consider the most neglected topics and/or contributions in late 20th century studies of computation and/or information?

Computer and information science have tended to define themselves in a rather engineering-based way–concentrating on creating and studying systems that perform particular specified tasks.

But there’s a whole different approach that’s much closer to natural science: to just investigate the computational universe of possible programs, and see what’s out there.

One might have thought that most programs that one would encounter this way would not do anything very interesting. But the discovery that launched what I’ve done for the past quarter century is that that’s not the case. Even remarkably simple programs–that one would quickly encounter in sampling programs from the computational universe–can show immensely rich and complex behavior.

There’s a whole new kind of science that can be done by studying those programs.

The idea that computation is something we can sample just like a vernal pool ecosystem or a statistical representation of demographics is fascinating, and I can’t see anything wrong with it on its face. I must admit to being attracted to the idea of mapping concepts into spaces (e.g. “mind design space”, or the original concept of “phase space”), so I might be a bit biased, but if Wolfram is correct then Computer Science really is a naturalistic science in some ways, even more so than mathematics or logic.
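To make "sampling the computational universe" concrete, here is a rough sketch of the kind of simple program Wolfram has in mind: an elementary cellular automaton, where each cell's next state depends only on itself and its two neighbours. Rule 30 is his best-known example of a trivial rule with complicated behaviour. The function names, the grid width, the wrap-around edges, and the handful of rules sampled here are my own choices for illustration.

```python
def step(cells: list[int], rule: int) -> list[int]:
    """Apply one update of an elementary cellular automaton (edges wrap around)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right  # a value from 0 to 7
        out.append((rule >> neighbourhood) & 1)              # look up that bit of the rule number
    return out


def run(rule: int, width: int = 63, steps: int = 30) -> list[list[int]]:
    """Start from a single live cell in the middle and record every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


if __name__ == "__main__":
    # "Sample" a few of the 256 possible rules and look at what they do.
    for rule in (30, 90, 110, 250):
        print(f"Rule {rule}")
        for row in run(rule, steps=15):
            print("".join("#" if c else "." for c in row))
```

Printing a few rules side by side is a crude version of the field work Wolfram describes: some rules die out or repeat immediately, while others (Rule 30 in particular) never seem to settle down.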

Of course the idea that computation space can be sampled suggests that computation really is a property of the universe, which gets into the second main claim made in Steve Wartik’s article, that computing is more like philosophy; necessarily separate from the everyday world we inhabit. This is a way complicated topic, far too complicated for me to address with my limited knowledge of the field, but I’ll try to say a little about how I feel and delve deeper into it another time.

Although my grasp of computation theory is tenuous at best, here’s my understanding of the situation: in the late 1930s, many mathematicians and logicians were scrambling in the wake of Gödel’s Incompleteness Theorems. While those still tie my head in knots, I understand that they did more than just destroy once and for all the idea that mathematics (or any system) can be provably complete and consistent; they advanced our understanding of the limits of formal systems in general, and in doing so gave mathematicians more direction in their studies of such systems. Around this time, Alonzo Church, Alan Turing, and two other mathematicians/logicians were working on systems to define functions and the methods of calculating them. Church’s was called lambda calculus, Turing’s the Turing machine, and a third developed by JB Rosser and Stephen Kleene was known as recursive functions. In 1939, Rosser claimed that the 3 systems were equivalent; all were different representations of the same underlying set of rules. Closely tied to this is the concept of a universal Turing machine, which would theoretically be able to run any calculation that any other programmable computer could run. This is the Thesis, that in some way all (Turing-complete) computers are equivalent systems. This, again, points to computation being some deep and underlying property of the universe.
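A small taste of what "different representations of the same underlying rules" means: below, numbers and arithmetic are built out of nothing but single-argument functions, in the style of Church's lambda calculus, and the results agree with ordinary integer arithmetic. This is only an illustration using Python lambdas as a stand-in for lambda calculus, not a proof of the equivalence; the helper names `to_int` and `from_int` are mine.

```python
# Church numerals: the number n is represented as "apply f to x, n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))


def to_int(n):
    """Translate a Church numeral back into an ordinary Python int."""
    return n(lambda k: k + 1)(0)


def from_int(k):
    """Build the Church numeral for a non-negative int by repeated succ."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


three, four = from_int(3), from_int(4)
print(to_int(add(three)(four)))  # 7, the same answer as 3 + 4
print(to_int(mul(three)(four)))  # 12, the same answer as 3 * 4
```

The two notations look nothing alike, yet they compute the same functions, which is the flavour of the equivalence result described above.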

But where the whole line of inquiry gets really interesting is in the so-called “strong” version of the Church-Turing thesis: that the universe itself is Turing computable. While this has not been formally proven, the fact that all known laws of physics have effects that are computable by approximation on a digital computer is evidence for it, and for the corresponding interpretation of physics as “digital”. This is a rich vein of interesting stuff I would like to explore further, but suffice it to say that if the strong version of the Church-Turing thesis were true, it would mean that the universe is a property of computation, instead of the other way around, and thus the study of computation is one of the most valid pursuits of the ultimate truth of reality. It could be simulated on the most powerful computer imaginable, on my laptop, or even on a billiard ball computer given enough time, memory, and energy, and it would make no difference. This gets into highly metaphysical territory very quickly, and lends some credence to Stephen Landsburg’s claims about the reality of mathematical objects. All of my explanations gloss over so much for the sake of some semblance of brevity, even without considering how little I know about computation theory. But if the universe really is a giant computer, I think it’s safe to say that the study of the process behind that computer is as scientific a discipline as any.

Robocars of the day

October 14, 2010

The news that Google is running a fleet of 7 autonomous cars is making its way around the internets this week. The cars use radar, LIDAR, image recognition, some sort of (gyroscopic?) position estimation, and I’d assume GPS as well. Just as with us humans, it’s going to require a radically multi-modal approach to build robots that can truly sense their place in the world, as well as clever algorithms to integrate the data. This is front page news in the New York Times, people; when The Grey Lady picks up a tech story you know the tech it’s reporting on is going mainstream. We’re even at the point where we can start arguing about whether or not this is legal.

And now at Freie University in Germany they’re taking it one step further with autonomous taxis that can be called from an iPad. It doesn’t seem like this is being widely deployed, but it’s only a matter of time. This is the sort of real application I’m looking for. The geek in me loves to see projects like the one at Google solving the interesting technical AI challenges, but all this really starts to matter when we have a concrete vision of the technology’s effect on the real world. My estimate for how long it’s going to take for me to get a self-driving car has been revised downward.

When discussing this with someone the other day, I was reminded of a part in Vernor Vinge’s excellent book Rainbows End. If you haven’t read it, there’s a version online here. I couldn’t find the exact passage, but there’s a part where one of the characters is looking out into the road, and sees two separate parts of the street, one for high-speed, efficient and autonomous cars, and another for people who want to drive themselves. Of course, the speed limit on the human-driven section is much lower. I’m sure we’ll get there eventually, but we’ll have some interesting times to go through first, both technologically and legally. There have been plenty of milestones for robotic cars in the past year, but I’m still waiting for one of the more unpleasant ones: the first person hurt or killed by a computer-driven car.

Categories: current events

Firefighters and Libertarians

October 7, 2010

Hello blog and blog readers, it’s been a while. How are you? I’m great, thanks for asking. I kind of forgot about this place over the summer, and have fulfilled my need to add my voice to the swirling vortex of trollery that is the internet in other ways. But once again I’ve found the combination of Facebook, Twitter, and various comment sections to be too restrictive.

This time, it’s the story about the Tennessee fire department that refused to put out a fire at a man’s house because he didn’t pay his fire protection subscription (about $75 a year). I believe the original article appeared in Salon. A key part of the story is that the man whose house was on fire offered “any amount” of money while his house was burning for the firefighters to put it out, and the fire department still refused to act.

I have a fairly wide range of blogs in my Google Reader queue, from right-ish econ blogs to fairly left-leaning ones like Pharyngula and BoingBoing. I’m not masochistic enough to subject myself to the likes of The Daily Kos or Town Hall, but I’d say I get a decent view of the politics of the blogosphere as a whole. Like you’d expect, the lefty blogs have been all over this. And the message, in both their posts and their comment sections, has been about what you’d expect: “haha this is what your conservative/libertarian utopia would look like, my beliefs sure were validated there”. None have been quite as overtly hostile toward the libertarian view as the original article, but most have had the same tone.

What I find interesting is how well this story illustrates our vulnerability to confirmation bias, and how much nuance the liberal commentators have missed. First, the obvious: this is as much an indictment of statist inclinations as it is of libertarian ones. As David Henderson explains over at EconLog, it was a government-run fire department. His conclusion is pretty clear, and I haven’t seen it addressed or even mentioned on most other blogs:

So let’s see now: Libertarians tend to advocate that government not be in the business of providing fire protection because we think that people should be free to contract with whoever they want to contract with and, as a side benefit, they will get better protection at a lower cost. If someone could show us that this works badly, we would look at that case. But it’s bizarre for a statist to attack libertarians when his own statist alternative works out badly.

Furthermore, the way the fire department acted was inefficient from a private business standpoint; there is absolutely no reason a privately owned fire department should refuse large amounts of money to put out a fire at a non-payer’s house. As Henderson points out, most equivalent private insurance schemes charge punitively large fees for non-subscribers: low enough that the customer would rather pay than have his house burn down, high enough to discourage people from waiting until something bad happens to pay. Liberals often complain about capitalism’s obsessive focus on efficiency, but here is a good example of private efficiency trumping public rule systems. See, the firefighters couldn’t take the man’s money; they’d need approval from their bosses, who are probably under instruction (from the state, by the way) to deny their services to non-payers. That’s inefficient, and also harmful. So a good lesson to take away from this is that private enterprise can be coldly focused on efficiency, but it’s often better than bureaucracy, the only real alternative, which is only focused on following whatever rules are handed down to it, regardless of the sense or reason behind them. But I think there’s a deeper, if similar, lesson here that illustrates a fundamental misunderstanding of libertarianism by the left. However, this post is pushing 650 words and getting into tl;dr territory, so that will have to wait for another time.
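The "low enough, high enough" pricing argument above can be put as a back-of-the-envelope calculation. The $75 subscription comes from the story; the fire probability and the loss figure below are made-up numbers purely for illustration, and the sketch assumes a risk-neutral homeowner, which real people are not.

```python
def fee_window(subscription: float, fire_probability: float, loss_if_burned: float):
    """Range of on-the-spot fees that works for a risk-neutral homeowner.

    Lower bound: the expected cost of waiting (probability * fee) must exceed
    the subscription, otherwise nobody would bother subscribing.
    Upper bound: the fee must be less than the loss from letting the house burn,
    otherwise the homeowner would rather not pay.
    Returns None when no fee satisfies both conditions.
    """
    lower = subscription / fire_probability
    upper = loss_if_burned
    return (lower, upper) if lower < upper else None


# $75/year is from the story; 0.5% yearly fire risk and a $100,000 loss are assumptions.
window = fee_window(subscription=75, fire_probability=0.005, loss_if_burned=100_000)
if window is None:
    print("No fee is both acceptable to the homeowner and a deterrent to free riding.")
else:
    low, high = window
    print(f"Any fee between ${low:,.0f} and ${high:,.0f} would do the job.")
```

Under these made-up numbers there is a wide range of fees the department could have accepted, which is the point: a flat refusal is not what a profit-seeking firm would choose.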

Categories: current events

The Farmville skinner box

April 22, 2010

I don’t play many Facebook games. It’s not that I’m above it, so much as I’ve found better games to waste my time on. That and I’m a solipsistic hermit who prefers games where success is not reliant on inviting friends to your farm, restaurant, zombie trap or whatever. But I like that there are games on Facebook that some people find fun.

But are they really “fun”? I’ve noticed quite a bit of discussion about how social games in general, and Farmville in particular, are overhyped time wasters that are of little to no value for anyone playing them. With a number of qualifications, I agree with this. In fact I think that’s true of most video games, including most of the games I play. But usually these articles go further, implying something sinister is going on. We’re being exploited by Zynga and other game companies! Something must be done! (See this article for a good example of the sentiment.)

As far as I’m concerned, this is all complete and utter bullshit. It’s a rehash of every tired old cliché about how much time young people waste these days. About how our culture has become rotten to the core, and how we, the old guard, are the only ones who appreciate it. It’s been said of the internet, music, and even books way back when. We’re going to say it about whatever media comes along when we start getting old. It’s a seductive us-and-them trap to be sure, and one that people have been falling into at least since the rise of mass media, probably for all of human history.

All of this has been bothering me since I first read the criticisms of Farmville and other social games, but today I read a great post that really crystallized things. It looks at just about every major game genre and identifies the “disingenuous design practices” that make it little more than a Skinner box (a machine that encourages an arbitrary behavior by rewarding you for it).

Social games and MMORPGs should be obvious, but what about classic arcade games? Surely no-one could call Pac-Man evil!

Disingenuous Design Practices: stacking difficulty in a probabalistic manner through edge of screen spawns, exponential spawn functions, non-essential time limits, interface kills (whoops wrong button! Insert Coin), roundDown collision detection and simply setting killer objects to be faster or more agile than you.

How about Tetris and Bejeweled?

Disingenuous Design Practices: Saturated color contrast, chunk-lite SFX and lateral number incrementation via score are designed to make your dopamine receptors squeal like a piggy while putting up the absolute minimum in content or design.

jRPGs, FPSs and classic console games get taken down too. Thing is, video games are about being rewarded for doing useless things. I’ve mentioned this in a previous post, but really, video games are just dopamine hacks (although there are other explanations for some gaming behavior). Lovely, wonderful dopamine hacks that won’t harm your lungs as much as smoking or your bank balance as much as gambling. Sometimes they help you socialize, or kill boredom, or even learn things, but usually they are unproductive wastes of time. Which is OK; when was the last time you listened to someone who told you all your time has to be productive, and you should never relax or have fun? You find different reasons to keep playing different types of games because you derive value from them for different reasons.

The question of what is “fun” is surprisingly complicated. You (and Farmville’s critics) may wonder how something you consider a chore and gain little subjective pleasure from can be considered “fun”. As the Skinner box demonstrates, just because you keep doing it doesn’t mean it’s fun. But as with so-called “classic” games, something that seems like an unpleasant chore to some can be rewarding in and of itself for others. Fun comes in all shapes and sizes, and you need to ask yourself what fun really is when you’re questioning whether or not the jRPG nerd or the guy playing WoW is really having it grinding out level after level.

Today there are no widely accepted definitions of fun that are both rigorous and meaningful. However, you can find a pretty good one here if you’re willing to do some reading. Yudkowskian fun theory accepts that, as with much of humanity’s shared value systems, fun is not just one simple concept manifesting in different ways. It’s a number of things, including complex novelty, improvement over time, and having direct control of your future in a given area. Based on Yudkowsky’s definition of fun, games like Farmville aren’t too bad in the scope of things. There are certainly more fun-optimal activities you could be pursuing, but there are many, many less eudaimonic things you might do, a number of which you’ll probably do anyway for Skinner-box-related reasons.

It comes down to this: just because something is hacking your brain’s reward system doesn’t mean it’s fun. But it also doesn’t mean it’s not fun, and your intuitions really aren’t as bad as you might think at telling the two apart. So go waste yourself some time! Your brain will thank you later for the dopamine hit at least.

Categories: memeplex


April 6, 2010

Sorry to the small (and possibly nonexistent) number of you that regularly come here. I haven’t gotten bored of blogging or run out of ideas, but now that my initial enthusiasm has waned I’m taking a more relaxed approach to blogging. I will probably never be a prolific poster on the order of Mike Anissimov or Tyler Cowen, or even Kyle Munkittrick (who was an inspiration of sorts for starting this). Maybe some day, but the copious free time I thought I had is, well, less copious than I thought. I’ve been writing a few posts, which should trickle in as time goes by, but I’ve decided to draft them more carefully and review them a bit more before posting.

But today, a short post about Robocars.

Anyone who knows me well has probably heard me talk about how soon our cars will be driving themselves. I don’t claim any credit for the idea; science fiction writers have been talking about it for a long time, probably since cars were invented. But I still hear people say “I wouldn’t trust a robot to drive me around, what if it goes crazy?”. To which I usually respond “which is more likely? Your robot driver going crazy or your human driver?”. Which of course is ignoring the incredibly low probability of an AI driver getting drunk, falling asleep at the wheel, or getting distracted answering his phone or changing the radio station. But this argument usually ends with “Whatever, it’s going to be a long time until that happens anyway”.

Well it isn’t. Robotic cars are coming sooner than you think. There are already cars that park themselves. This year, a car is coming out that brakes to avoid hitting pedestrians. Some nut in California has even built a system for a Prius that drives itself. It’s only a matter of time before we all have these. Seriously, it’s not just some lone crazy (me) saying this; the vice president of R&D at GM says we’ll have fully autonomous cars by 2020.

And this will be a good thing. Usually when left-leaning types hear this argument, their next fear (after safety) is that this will mean a postponement of the inevitable death of the car. Our wonderful public transit future where everyone rides high-speed rail or bikes is slipping away. I’m with them on the bikes (and E-bikes are going to make this even better), but trains of all types are an expensive waste of energy, and are very difficult to move or reconfigure as demographics change. Buses are a much better option, although I personally find them to be overcrowded, nausea-inducing death traps, but they would also benefit from autonomous control. If we can all have energy-efficient robot taxis driving us around, rural citizens included, why do we need trains OR buses, except to satisfy some communitarian dream of everyone travelling together?

Since I’m all about making falsifiable predictions to track my understanding of where the world is going, here’s today’s: I’ll probably have to drive the first car I buy, but the second (or maybe the third) will be able to drive itself.

If you need more convincing of the utility and feasibility of this technology, see Brad Templeton’s presentation at Foresight 2010. The robot cars are coming, and when they get here we’ll all be better off for it.