Agent Group Selection
From: Evolving Social Rationality for MAS using "Tags"
Hales & Edmonds, 2003
This paper discusses adding "tags" to agents in a multi-agent system to help co-operation break out over a number of generations. Each agent has some strategy for playing a game (let's say the Prisoner's Dilemma, seeing as they do), and each also has a tag: a short string of bits that is unrelated to its strategy but is reproduced via the same evolutionary principles.
Each round of this iterated Prisoner's Dilemma game, the agents pair up in some way and play each other, with the best performing agents being represented more strongly in the next round. A low level of mutation is also included - all standard evolutionary stuff. With random pairing the system will quickly devolve into every agent defecting as this is individually optimal.
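To make the "individually optimal" point concrete, here's a tiny sketch of a one-shot Prisoner's Dilemma. The payoff numbers are the conventional textbook ones (my assumption, not necessarily the values used in the paper), and `payoff` and `best_reply` are names I've made up for illustration.

```python
# Conventional textbook payoffs, assumed for illustration only:
# temptation > reward > punishment > sucker
T, R, P, S = 5, 3, 1, 0

def payoff(my_move, their_move):
    """Return my payoff; moves are "C" (co-operate) or "D" (defect)."""
    table = {
        ("C", "C"): R,  # mutual co-operation
        ("C", "D"): S,  # I co-operate, they defect
        ("D", "C"): T,  # I defect, they co-operate
        ("D", "D"): P,  # mutual defection
    }
    return table[(my_move, their_move)]

def best_reply(their_move):
    """The move that maximises my payoff against a fixed opponent move."""
    return max("CD", key=lambda move: payoff(move, their_move))

if __name__ == "__main__":
    for theirs in "CD":
        print(f"opponent plays {theirs}: my best reply is {best_reply(theirs)}")
```

Whatever the opponent does, defecting pays more, even though mutual co-operation beats mutual defection. That's the trap random pairing falls into.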
The twist is that each agent chooses a partner with the same tag as its own if one is available. In this way the system essentially becomes group selection, with each tag defining a "group". This sounds a lot like what people have evolved to do, as in Carpenter et al, 2003. Hales and Edmonds don't talk about the idea of group selection in a general sense, although they do explain that the positive trends develop because groups form. This leaves me wondering whether they just don't know about all the work done in philosophy on group selection theories, which would almost certainly help further this work, or whether they chose not to address it as it's not strictly vital to what's being done in this paper.
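The partner-choice rule is simple enough to sketch. This is my reading of it rather than the authors' code: the `Agent` class, the integer tag, and the `pick_partner` name are all hypothetical, and falling back to a random partner when no tag matches is an assumption on my part.

```python
import random
from dataclasses import dataclass

@dataclass
class Agent:
    tag: int          # hypothetical: the tag's bits packed into an integer
    cooperates: bool  # pure strategy: always co-operate or always defect

def pick_partner(agent, population, rng):
    """Prefer a partner sharing the agent's tag; otherwise pick anyone else.

    Assumes the population contains at least one other agent.
    """
    others = [a for a in population if a is not agent]
    same_tag = [a for a in others if a.tag == agent.tag]
    return rng.choice(same_tag if same_tag else others)

if __name__ == "__main__":
    rng = random.Random(1)
    # Random tags (2 bits here) and random strategies, purely for illustration.
    population = [Agent(rng.randrange(4), rng.random() < 0.5) for _ in range(10)]
    me = population[0]
    partner = pick_partner(me, population, rng)
    print(f"my tag: {me.tag}, partner's tag: {partner.tag}")
```

Note that the strategy plays no part in the choice. Agents sort themselves purely by tag, and that is what turns each tag into a de facto group.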
Hales and Edmonds show some pretty pictures in which almost total co-operation breaks out over a few thousand generations. More significantly, they show an inverse scaling phenomenon: the more agents there are, the less time co-operation takes to emerge, because there are more chances for co-operative groups to form early and pull the evolution in their direction. Of course, given the occasional mutations, every so often a co-operative group will be invaded by a defector who will quickly rip it apart by getting the highest payoff every time. But overall the trend will be towards co-operation.
More impressive is their final section, which shows that a bunch of robots using these tags can outperform a group of robots using the obvious co-operative strategy. As they admit, though, the results look very preliminary.
This is applicable stuff for areas where you want a robust system designed to learn the most efficient way to deal with a problem over time. On the other hand, it doesn't really look useful in a society kind of sense, because a manipulator can easily insert a defecting agent to join co-operative groups and feed off each one until it dies out before moving on to the next. One way around that would be to allow agents to have more detailed strategies and force a newly created agent to "pay his dues" to other agents with the same tag. Various techniques for this are in Friedman, Resnick, 1999.
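As a rough illustration of what "paying dues" could look like, here's a deliberately simplified rule. It is my own sketch, not the scheme from either paper: the function name, the `dues_rounds` parameter, and the use of rounds played as a stand-in for "newly created" are all assumptions, and it ignores any punishment of agents who have defected in the past.

```python
def pay_your_dues_move(my_rounds_played, partner_rounds_played, dues_rounds=1):
    """Choose "C" (co-operate) or "D" (defect) under a simplified dues rule.

    Hypothetical rule: an agent counts as a newcomer until it has played
    `dues_rounds` rounds. Newcomers must co-operate; established agents
    defect against newcomers and co-operate with each other.
    """
    if my_rounds_played < dues_rounds:
        return "C"  # I'm new: pay my dues by co-operating
    if partner_rounds_played < dues_rounds:
        return "D"  # I'm established, partner is new: collect dues
    return "C"      # both established: co-operate

if __name__ == "__main__":
    print("newcomer vs veteran:", pay_your_dues_move(0, 5))
    print("veteran vs newcomer:", pay_your_dues_move(5, 0))
    print("veteran vs veteran: ", pay_your_dues_move(5, 5))
```

The point is that a freshly inserted defector can't profit on arrival. Established agents refuse to co-operate with it until it has sat through the dues period, which makes hopping from group to group expensive.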
I think I'll find myself referencing that paper by Friedman and Resnick a lot, so I should probably find more papers in the same area that are more recent, if nothing else.
Of course, such techniques add many layers of complexity to what the agents have to know about, compared with simple "help or don't help" decisions. It might be interesting to think about what happens when the agents are free to deliberately change groups and remember other agents, in a persistent rather than evolutionary setting. To get anywhere, this would need some kind of paying-your-dues strategy to be available, or a defector could simply keep invading one group after another.
Once one goes down that path though, it's not long before one arrives at the point where the groups are irrelevant, and the agents are left choosing based on individual evaluations of their opponents in some way. Then it becomes something more like a reputation-based system. So for the situation where you want co-operative agents or robots to do a good job without having to design a strategy each time, maybe we'll leave it as is.
Carpenter, Matthews, Ong'ong'a - Why Punish? Social Reciprocity and the Enforcement of Prosocial Norms, 2003
Friedman, Resnick - The Social Cost of Cheap Pseudonyms, 1999
