Towards an Agent Society

Tuesday, January 31, 2006

Agent Group Selection

From: Evolving Social Rationality for MAS using "Tags"
Hales & Edmonds, 2003

This paper discusses adding "tags" to agents in a multi-agent system to help co-operation break out over a number of generations. Each agent has some strategy for playing a game (let's say the Prisoner's Dilemma, seeing as they do), and they also have some tag which is a number of bits which is not related to their strategy at all, but will be reproduced via the same evolutionary principles.

Each round of this iterated Prisoner's Dilemma game, the agents pair up in some way and play each other, with the best performing agents being represented more strongly in the next round. A low level of mutation is also included - all standard evolutionary stuff. With random pairing the system will quickly devolve into every agent defecting as this is individually optimal.

The twist comes in that the agents choose a partner with the same tag as them if there is one available. In this way the system essentially becomes group selection with each tag having an associated "group". This sounds a lot like what people have evolved to do such as in Carpenter et al, 2003. They don't talk about the idea of group selection in a general sense although they do explain that the reason positive trends develop is due to groups forming. This leaves me wondering whether they just don't know about all the work done in philosophy on group selection theories that would almost certainly help further this work, or whether they just chose not to address them as they're not strictly vital to what's being done in this paper.

Hales and Edmonds show some pretty pictures which show almost total co-operation breaking out over a few thousand generations. More significantly they show an inverse scaling phenomenon in terms of how long it takes given the number of agents - with more agents there are more chances for co-operative groups to form early and influence the evolution in their direction. Of course, given the occasional mutations, every so often a co-operative group will be invaded by a defector who will quickly rip them apart by getting the highest payoff every time. But overall the trend will be towards co-operation.

More impressive is their final section which shows that a bunch of robots using these tags can outperform a group of robots using the obvious co-operative strategy. While impressive at first glance, as they admit the results look very preliminary.

This is applicable stuff for areas where you want a robust system designed to learn the most efficient way to deal with a problem over time. On the other hand, it doesn't really look useful in a society kind of sense, because a manipulator can easily insert a defecting agent to join co-operative groups and feed off each until they die out before moving onto the next. One way around that would be to allow agents to have more detailed strategies and force a newly created agent to "pay his dues" to other agents with the same tag. Again, various techniques for this are in Friedman, Resnick, 1999.

I think I'll find myself referencing that paper by Friedman and Resnick a lot, so I should probably find more papers in the same area that are more recent, if nothing else.

Of course, such techniques add many layers of complexity to what the agents have to know about instead of just "help or don't help" kinds of decisions. It might be interesting to think about what happens when the agents are free to deliberately change groups and remember other agents, in a persistent rather than evolutionary setting. To get anywhere this would require a paying your dues kind of strategy to be allowed or a defector could always try to invade other groups.

Once one goes down that path though, it's not long before one arrives at the point where the groups are irrelevant, and the agents are left choosing based on individual evaluations of their opponents in some way. Then it becomes something more like a reputation based system. So for the situation where you want co-operative agents or robots to do a good job without having to design a strategy each time, maybe we'll leave it as is.

Carpenter, Mathewws, Ong'ong'a - Why Punish? Social Reciprocity and the Enforcement of Prosocial Norms, 2003

Friedman, Resnick - The Social Cost of Cheap Pseudonyms, 1999

Monday, January 30, 2006

Thought Experiments

I was thinking about an agent society (as one in my position should) and I wondered whether one could describe a "SocietyBot" which represented in some way the wishes of society, and had a certain amount of power over the other agents.

How would such a creature operate? I guess the idea is that every other agent can remonstrate with SocietyBot, but in such a way as to not allow manipulation by any single agent. SocietyBot can penalise agents as it feels appropriate and should probably use machine learning techniques or such to improve how it does these things to increase overall "happiness" over time.

Why am I even thinking about this? Surely the point of an agent society is that society evolves from the agent, rather than enforcing this single agent representing a government in some way. I don't really have an answer for this. It's just something that seemed worth thinking about. If nothing else we can consider it to be jumping straight to the point where the agents elect the most trusted of their number to make decisions for everyone. But let us keep running with the idea.

An obvious problem is that the agents report on their happiness to SocietyBot, so a strictly dominant strategy would be to always be unhappy with the current situation and want more. So then everyone in society is always unhappy and poor SocietyBot can't do anything right. This is pretty reminiscent of focus groups in the real world whose job it is to just be relentlessly unhappy at the government to get further concessions for their pet issue.

One way around this is to have happiness be rewarded in some way. So a "happy" agent is always given better deals etc by its contemporaries, and stops hassling SocietyBot. I kind of think this is where human society has evolved to. People who don't complain constantly are much better regarded. Smiling people are considered consistently more attractive, and those who are cheerful are often given better deals than surly people. If I searched, I could probably find some scientific studies that claim these results in a slightly more formal way. But this is a thought experiment; I don't need that sort of justification.

The other way is to penalise the remonstrations. If it costs an agent to complain then they better have a genuinely good reason.

Ways around both of these: In our agent world, the human manipulator on the outside can introduce bogus extra agents whose purpose is purely to manipulate the system. So the "real" agents keep their heads down and act sweet and nice to get good deals with their good reputations, whilst the "bogus" agents (and an arbitrary number thereof) just complain constantly to benefit the "real" agent. Even if the cost to complain is in terms of real money, it may well be of benefit to the manipulator to spend a lot of money in order to make a lot more money, by pumping money through these bogus agents. Still sounds a lot like focus groups to me. Even more so in fact.

How do we disallow these agents? Firstly, we can't rely on being able to identify such agents, as potentially every complaint could come from a new agent. One way is to only allow established agents to complain - i.e. those that have made a contribution to society by trading over time. There are problems with that too though. Probably this segues into the kind of thing discussed by Friedman and Resnick in The Social Cost of Cheap Pseudonyms. I don't really think human society has very satisfactory ways around this. Maybe there hasn't been enough time to really solve this?

Either way, I think the more interesting problems may be in the more egalitarian agent society. A SocietyBot is a fine idea if you have a centralised administration for the agent society (in fact it's just the embodiment of a more powerful administration), but peer-to-peer kinds of agents societies have a lot of advantages with interesting difficulties.

Friday, January 27, 2006

Socially Intelligent Agents

Paper: Modelling Socially Intelligent Agents
Bruce Edmonds, 1997

The title of the paper under discussion for today is rather reminiscent of the heading for this blog. So it seemed worth a look. This is reasonably light stuff, and seems more like an introduction to something more impressive than a paper in itself, but anyway. Given it's a few years old I assume the research has continued since then and I intend to read some more.

The idea here is not to create something incredibly useful so much as to create agents that interact socially with each other in some kind of rational and believable (by which I mean human-like) way.

After some introductory comments, the bulk of the paper is devoted to the example of a bar which plays good Irish music on a Thursday night. The agents would like to go to this unless they think it's going to be crowded, in which case they'd rather just stay home. Each agent has its own model with which to make individual predictions. Left roughly to their own devices the number of agents attending hovers around 60%, the number at which the bar becomes too crowded.

When the model is extended with a very basic social structure such that the agents can communicate with each other then life gets more interesting. The agents have to choose whether to go and also whether they say they are going to go or not. There is also more benefit to going to the bar for an agent if its friends are going.

There is a modelling language that each of the agents uses to write its strategies in. This way they kind of design their own strategies based on the other agents and what they learn over time, and it's easyish for a human to read and see what they're doing. It seems that the agents are then left to their own devices over the course of a simulated year or two, with different strategies developing over this time.

The results would be more interesting, but there's only one case study mentioned, which only involves 5 agents. If the bar is crowded with only 3 people in it, it must be pretty darn small. My suspicion is that it's really just the closet where they keep their moonshine. In the end, some agents just learn to do their own thing rather than trusting anyone else, just choosing a mixed strategy to attend 60% of the time. Others become influenced positively or negatively by whether others say they'll attend.

To be honest, given the utility functions specified, it would seem that there would be an equilibrium of everyone going regardless of the crowdedness because their payoff would always be higher than staying at home that way. Only if someone goes while it's too crowded yet none of their friends are there is the payoff worse than staying at home. Given that it's a highly connected network this is only remotely possible for 3 of the agents, and given that it's strictly dominant for the other 2 to go regardless of who else is there (as those 2 are completely connected), I don't see why they don't just learn to always go.

I guess that's what they mean about learning to act as a society rather than act optimally. Or I've missed something in the text. I must read some more based on this soon to see if it does go anywhere.

Monday, January 23, 2006

Manipulating Elections

How many candidates are needed to make elections hard to manipulate?
Conitzer, Lang, Sandholm (2003)

Seems to be heavily based on Conitzer and Sandholm's previous research in "Complexity of manipulating elections with few candidates", continuing the work therein to several other voting systems. Not bad stuff, but I've got some concerns.

The premise is that you have a few candidates (probably not many, as in most real decisions), and you want to know at what number of candidates manipulating the election becomes NP-complete with respect to the number of voters. There are a few assumptions that should be discussed.

1. This is the really controversial one as far as I'm concerned. They have assumed that the manipulators have absolute information about the actions of the other voters, and can base their decision on that. They have useful results given this assumption, it seems, but this isn't really an appropriate assumption for most real-world situations.

2. They assume that it is a coalition of voters conspiring to manipulate the election. That's fine, and they make a good case for it. Either way, it's a general case - so at least as good.

3. They assume that the voters are weighted. This apparently makes a difference because it's always easy for the unweighted case. This increasingly makes me question the usefulness of this, as in the vast majority of cases votes are unweighted. Or at worst case at least integer weighted. It's easy in the unweighted case as with only 4 candidates or so, then with strict orderings the voters can just be partitioned into 4! voting "vectors". Easy. I'm surprised that the weighted case can't be reduced to this in some way (i.e. take the greatest common divisor of all voter weights so you have integer weightings effectively. Then just assume that they are unweighted but there are groups that always vote together or something?). Anyway, it's not controversial, as again it's a general case.

4. They assume that the manipulators are either trying to make one candidate win, or trying to make sure one candidate doesn't win. Fair enough - they examine both cases.

The proofs are largely the usual sort of NP-complete fare, showing a polynomial conversion with the PARTITION problem.

In the end, they have the appropriate tables showing how many candidates are needed to make the elections "hard" to manipulate for 9 different voting systems. It's interesting to note that apparently it's far easy under most systems to make sure someone doesn't win than vice versa.

As noted, I think there's some flaws with the line of research, as usual with everything. Seems kind of interesting stuff though.

Friday, January 20, 2006

Ideas

The heading for this blog is "Towards an agent society". Maybe I should justify this. It is, after all, a particularly pompous title.

1. I find the ways that humans maintain social norms kind of intriguing.

2. I think that online agent societies are more or less inevitable, once a few hurdles are overcome. I'm talking about societies in a very loose sense here.

3. My instincts insist that there should be some fundamentally simple mechanics underlying rules of social interaction that an agent should be able to learn or emulate. Note the "shoulds" dominating that sentence.

4. If agents can be convinced to interact in some sensible way with each other, then it opens up the way for a host of interesting applications. More realistic simulations of real societies, more fun examples of fake societies, and agents that people can trust to go online and interact with other agents and do something useful for them.

Under the heading of something useful, perhaps the most obvious thing is trading with other agents. So your agent talks to other agents, forms coalitions, trades money or whatever it is you want your agent to trade, and forms appropriate relationships with other agents in terms of trust, respect, etc.

There is a lot of big problems with this, at least some of which are obvious now, others of which are no doubt apparent with further inspection. I've written some of these down in my "working ideas document" to go through in more detail.

Thursday, January 19, 2006

Inter-community Connections

From: http://www.aip.org/pnu/2002/split/604-3.html
Number 604 #3, September 13, 2002 by Phil Schewe, James Riordon, and Ben Stein

"Eventually, the benefits of cooperating return the system to equilibrium, but the more long range connections in the network, the slower the system's recovery. Although the model is clearly a crude reflection of human interactions, it suggests that increasing numbers of long range connections between people may help destabilize communities. The result is in contrast to the general perception that connections across cultures and nations is exclusively beneficial to society."

That's not surprising at all, as per the introduction of "The Sociobiology of Conflict" by J.M.G. van der Dennen et al, "conflict serves to establish and maintain the identity and boundary lines of societies and groups" etc. So when a connection is made between different cultures, blurring the boundaries, and meaning that there isn't conflict where there ought to be between cultures, the identity of the cultural group is threatened, hence the destabilisation.

That's my rationalisation at least. The implication in the quote is interesting though, as it implies that the destabilisation of the communities is necessarily a bad thing.

Wednesday, January 18, 2006

Formal Game Theory

My recent reading has been far more along the lines of sociology papers, so reading some more formal logic has been a bit of a stretch for me. Sadly I missed the talk based on this paper - the one time there's a game theory talk in my research school would be while I'm on holiday.

A Constructive Approach To Sequential Nash Equlibria, Rene Vestergaard, 2005

Abstract:
"We present a Coq-formalised proof that all non-cooperative, sequential games have a Nash equilibrium point. Our proof methodology follows the style advocated by LCF-style theorem provers, i.e., it is based on inductive definitions and is computational in nature. The proof i) uses simple computational means, only, ii) basically is by construction, and iii) reaches a constructively stronger conclusion than informal efforts. We believe the development is a first as far as formalised game theory goes."

Coq is a formal proof management system, and most of the results and description is given in terms of Coq code, which is thankfully reasonably easy on the eyes, even if it is functional.

This paper is concerned with Nash equilibria - the situation in a game where given the strategies of the other players no player would choose to change their individual strategy. Full marks for studying something that is useful. As usual though, the method for finding these equilibria (or at least proving their existance) is through finding a good solution and working backwards, which in many ways is not ideal. In complex games an agent wants to know what their next move is, one at a time, and a good guiess at that is a lot more useful than spending the lifetime of the universe finding a complete solution. Given that it's proving their existance in a formal logic way I'll assume that it does need doing and move on.

Basically it's mainly all about defining Games, Strategies, Plays, Payoffs etc in a functional manner, and because of how he's constructed them, the proofs are mainly trivial. Said proofs culminate with a theorem saying that there exists a function which sends any game to a Backwards Induction equlibrium for it. Without putting in a bit more time to go through the last part of the paper more thoroughly I'm not sure whether this "function" can be found easily and whether you could actually use it even so, but it's still a noteworthy result, I suppose.

Accompanying the paper is a bunch of Coq code I don't really want to read unless it seems like it will really be useful.

The Idea

It seems that it would be convenient to have an online repository of my thoughts on the various papers I read. This is almost purely for my own benefit, so I entirely reserve the right to ignore major points of papers, to badmouth the perfectly valid results, and to not bother reading the details if I don't get around to doing so.