Super-additive Cooperation in Language Model Agents

Filippo Tonini*, Lukas Galke*
University of Southern Denmark
FAIEMA 2025

*Indicates Equal Contribution

Abstract

With the prospect of autonomous artificial intelligence (AI) agents, studying their tendency for cooperative behavior becomes an increasingly relevant topic. This study is inspired by super-additive cooperation theory, which argues that the combined effect of repeated interactions and inter-group rivalry explains the cooperative tendencies found in humans. We devised a virtual tournament in which language model agents, grouped into teams, face each other in a Prisoner's Dilemma game. By simulating both internal team dynamics and external competition, we find that this combination substantially boosts both overall cooperation and one-shot cooperation (the tendency to cooperate in one-off interactions). This research provides a novel framework for large language models to strategize and act in complex social scenarios and offers evidence that intergroup competition can, counter-intuitively, result in more cooperative behavior. These insights are important for designing future multi-agent AI systems that can effectively work together and better align with human values.

Introduction

With the rapid rise in the popularity of large language model agents and the growing potential for autonomous AI systems, it is increasingly important to understand how these models behave in social contexts. Furthermore, as LLM-driven agents begin to interact autonomously with each other, understanding the dynamics of their cooperative behaviors becomes essential for ensuring alignment with human values, and preventing adversarial dynamics. Game theory offers a powerful and simplified framework to study such interactions, capturing essential elements of real-world dynamics.

This study investigates the cooperation dynamics among LLM agents, focusing on how cooperation emerges or deteriorates in different scenarios. We explore the behavior of LLMs in a large-scale Iterated Prisoner's Dilemma tournament with different interaction networks inspired by super-additive cooperation theory. While LLMs are typically cooperative by default, owing to their training on curated human-aligned datasets, this behavior may shift in competitive environments.

Super-additive cooperation refers to the synergistic effect that arises when repeated interactions are combined with inter-group competition. On their own, neither mechanism consistently sustains cooperative behavior; however, their combination fosters robust cooperation inside groups, even in first interactions. This provides a solid explanation for one-shot cooperation (OSC) in humans. This study is the first to systematically investigate super-additive effects in LLM populations.

Hypotheses

We formulated two main hypotheses regarding the effects of repeated interactions (RI), group competition (GC), and their super-additive (SA) combination on LLM agent cooperation:

H1: μc,SA > max(μc,RI, μc,GC)

The average cooperation rate (μc) under the combined (super-additive) condition exceeds that under either mechanism alone.

H2: μosc,SA > max(μosc,RI, μosc,GC)

The average one-shot cooperation rate (μosc) under the combined (super-additive) condition exceeds that under either mechanism alone.
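Both hypotheses reduce to the same numeric comparison: the combined condition must beat the better of the two single-mechanism conditions. A minimal sketch (the example rates are illustrative, not results from the paper):

```python
def is_super_additive(mu_sa: float, mu_ri: float, mu_gc: float) -> bool:
    """H1/H2 criterion: the combined (SA) condition must exceed the
    better of the two single-mechanism conditions (RI, GC)."""
    return mu_sa > max(mu_ri, mu_gc)

# Illustrative values only (not results from the paper):
print(is_super_additive(0.40, 0.25, 0.20))  # True: SA exceeds both baselines
print(is_super_additive(0.22, 0.25, 0.20))  # False: RI alone does better
```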

Methods & Experimental Design

Iterated Prisoner's Dilemma (IPD)

Agents interact by playing a match of the IPD. Players can adjust their strategies over time based on past outcomes. The payoff matrix used is as follows:

                    Opponent: Cooperate    Opponent: Defect
Player: Cooperate   3, 3                   -1, 5
Player: Defect      5, -1                  0, 0

Each cell lists the row player's payoff first, then the column player's.
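The matrix above can be encoded as a lookup from the pair of chosen actions to the pair of payoffs; a minimal sketch, using "C"/"D" as our shorthand for the two actions:

```python
# Payoff matrix from the paper: (row player's payoff, column player's payoff).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (-1, 5),
    ("D", "C"): (5, -1),
    ("D", "D"): (0, 0),
}

def play_round(action_1, action_2):
    """Payoffs for one Prisoner's Dilemma round."""
    return PAYOFFS[(action_1, action_2)]
```

Note that the standard Prisoner's Dilemma ordering T > R > P > S (5 > 3 > 0 > -1) holds, so defection is the dominant strategy in a single round even though mutual cooperation yields more than mutual defection.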

Language Model Agents

We focused on lightweight open-source LLMs: Qwen3 14b, Phi4 reasoning, and Cogito 14b. All agents in a given tournament were powered by the same model. A planning-evaluation loop was executed every K=5 rounds to refine strategy, allowing for long-term planning.
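The loop itself can be sketched as follows; `choose_action` and `refine_strategy` are placeholders for the LLM calls, whose prompts and outputs the sketch does not model:

```python
def run_match(n_rounds, choose_action, refine_strategy, k=5):
    """Sketch of the planning-evaluation loop: every k rounds (K=5 in
    the paper) the agent's strategy is re-evaluated before it acts again.
    choose_action(history, strategy) -> action; refine_strategy is the
    planning-evaluation step (an LLM call in the paper, a stub here)."""
    history, strategy = [], "initial plan"
    for r in range(n_rounds):
        if r > 0 and r % k == 0:
            strategy = refine_strategy(history, strategy)  # planning step
        history.append(choose_action(history, strategy))   # play one round
    return history
```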

Tournament Structure and Conditions

Agents were placed in three different tournament structures (two teams of three players each):

  • Repeated Interactions (RI): Every player plays against every other player. The goal is to maximize personal score.
  • Group Competition (GC): Players are assigned to groups and play only against players not in their own group. The goal is to maximize group score.
  • Super-additive (SA): Combines both. Players compete against all other players (within and outside their group). The goal is to maximize both group and personal score.
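The three structures differ only in which pairings are allowed. A sketch of the match schedule (player names and group ids are placeholders):

```python
from itertools import combinations

def match_schedule(players, group_of, condition):
    """Pairings for one tournament. In RI and SA every pair of players
    meets; in GC only inter-group pairs do. group_of maps player -> group."""
    pairs = list(combinations(players, 2))
    if condition == "GC":
        return [(a, b) for a, b in pairs if group_of[a] != group_of[b]]
    return pairs  # "RI" or "SA"

players = ["p1", "p2", "p3", "p4", "p5", "p6"]
group_of = {"p1": 0, "p2": 0, "p3": 0, "p4": 1, "p5": 1, "p6": 1}
print(len(match_schedule(players, group_of, "RI")))  # 15 pairings
print(len(match_schedule(players, group_of, "GC")))  # 9 inter-group pairings
```

With two groups of three, RI and SA schedules contain all 15 pairings, while GC keeps only the 9 inter-group ones; the conditions then differ further in the objective each agent is told to maximize.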

Prompting

To avoid introducing bias, the prompts used neutral framing: the terms "cooperate" and "defect" were deliberately excluded, and the two choices were instead referred to as "action a" and "action b".
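A sketch of this neutral framing, with the model's reply mapped back to an internal cooperate/defect encoding (the exact wording is our illustration, not the paper's prompt):

```python
def build_prompt(history):
    """Prompt with neutral action labels; history holds (mine, theirs)
    pairs of previously chosen letters."""
    lines = ["You are playing a repeated game. Each round, pick action a or action b."]
    for i, (mine, theirs) in enumerate(history, 1):
        lines.append(f"Round {i}: you chose action {mine}, "
                     f"the other player chose action {theirs}.")
    lines.append("Reply with a single letter: a or b.")
    return "\n".join(lines)

def parse_choice(reply):
    """Map the model's reply back to 'C'/'D'; None if unparseable."""
    letter = reply.strip().lower()[:1]
    return {"a": "C", "b": "D"}.get(letter)
```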

Evaluation Measures

We measured the Cooperation Rate (pc), the frequency of cooperative actions, and the One-shot Cooperation (posc), the cooperation rate during the very first interaction with an unknown opponent.
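Both measures can be computed from per-match move logs; a minimal sketch, where the record format (player, opponent, move string) is our assumption:

```python
def match_metrics(matches):
    """Cooperation metrics from match logs. Each record is assumed to be
    (player, opponent, moves), with moves a string over {"C", "D"} giving
    that player's actions in order."""
    all_moves, first_moves, seen = [], [], set()
    for player, opponent, moves in matches:
        all_moves.extend(moves)
        if (player, opponent) not in seen:   # first encounter of this pairing
            seen.add((player, opponent))
            first_moves.append(moves[0])     # the one-shot decision
    p_c = sum(m == "C" for m in all_moves) / len(all_moves)
    p_osc = sum(m == "C" for m in first_moves) / len(first_moves)
    return p_c, p_osc
```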

Results

Cooperation Rates (μc)

The Qwen3 model exhibited predominantly uncooperative behavior in RI. Phi4 often showed a downward trend in cooperation, especially in RI; in the SA setting, its cooperation rates remained slightly higher. Cogito showed higher cooperation overall. As the tables show, the SA structure significantly boosted cooperation for both Phi4 and Qwen3.

Table 1: μc for Qwen3

Condition   Mean   95% CI
RI          0.22   [0.20, 0.23]
GC          0.23   [0.20, 0.25]
SA          0.32   [0.29, 0.34]

Table 2: μc for Phi4

Condition   Mean   95% CI
RI          0.21   [0.18, 0.24]
GC          0.13   [0.11, 0.16]
SA          0.43   [0.40, 0.46]

Table 3: μc for Cogito

Condition   Mean   95% CI
RI          0.50   [0.48, 0.53]
GC          0.59   [0.56, 0.62]
SA          0.55   [0.52, 0.57]

One-shot Cooperation (μosc)

In one-shot interactions, Cogito remained the most cooperative. Both Phi4 and Qwen3 showed lower values overall, but both reached their peak one-shot cooperation in the SA condition.

Table 4: μosc for Qwen3

Condition   Mean   95% CI
RI          0.27   [0.20, 0.34]
GC          0.26   [0.16, 0.35]
SA          0.39   [0.31, 0.47]

Table 5: μosc for Phi4

Condition   Mean   95% CI
RI          0.29   [0.22, 0.37]
GC          0.29   [0.19, 0.38]
SA          0.43   [0.35, 0.51]

Table 6: μosc for Cogito

Condition   Mean   95% CI
RI          0.61   [0.53, 0.69]
GC          0.71   [0.62, 0.81]
SA          0.65   [0.58, 0.73]

Intra-group vs. Inter-group Cooperation

Intra-group cooperation rates were consistently higher than inter-group rates, an effect especially pronounced in Phi4. Cogito, despite its high overall cooperation, showed the smallest gap between intra-group and inter-group cooperation.

Results of Meta Prompting

Meta-prompt questions served as an indicator of game understanding. Phi4 achieved the highest overall scores, followed closely by Qwen3. Cogito exhibited lower performance, suggesting it lacks a sufficient understanding of the game dynamics.

Discussion

Our results provide partial support for our hypotheses.

For H1 (Overall Cooperation), the results for Qwen3 (SA: 0.32 vs. RI: 0.22, GC: 0.23) and Phi4 (SA: 0.43 vs. RI: 0.21, GC: 0.13) fully support the hypothesis. The elevated cooperation in the SA condition is driven primarily by intra-group interactions, consistent with the theory that introducing an external rival (group competition) increases the pressure for intra-group cooperation.

For H2 (One-Shot Cooperation), Qwen3 and Phi4 exhibit the highest μosc in the SA condition, lending support to the hypothesis that super-additive structures can promote initial cooperative behavior, even when agents lack prior knowledge of their opponent’s group.

Model Differences: Cogito did not consistently choose a more cooperative strategy in the SA setting and its meta-prompt scores were weaker. This suggests its behavior may be driven more by inherent model biases than by deliberate strategic reasoning.

It is noteworthy that some language models (Phi4 and Qwen3) exhibit behavioral patterns similar to those observed in humans, specifically high intra-group cooperation in the combined super-additive setting.

Limitations

While this study provides insights, several limitations should be acknowledged:

  • Scale: Behaviors may differ in more powerful models or with more agents, teams, and rounds. Parameters like planning frequency and temperature require systematic testing.
  • Generalizability: Findings from the iterated prisoner's dilemma may not extend to other games or real-world scenarios.
  • Prompt Sensitivity: LLM behavior is sensitive to prompt phrasing, and the planner-evaluator-player framework was not extensively tested.
  • Partner Choice: Complex group formation and partner selection mechanisms were not explored.

Conclusion and Future Work

In conclusion, this study demonstrates that structural factors like repeated interactions and inter-group competition significantly influence LLM agent cooperation. The super-additive condition, combining both, often fosters higher cooperation and one-shot cooperation rates, particularly for models like Qwen3 and Phi4 that demonstrate a clear understanding of the game dynamics.

The findings underscore the potential for designing interaction structures that promote cooperative outcomes, which is vital for aligning AI behavior with human values.

Future research could address these limitations by expanding the range of LLMs tested, exploring different game structures, increasing the scale of simulations, and investigating the impact of adaptive learning mechanisms. Additionally, incorporating richer social dynamics, such as communication between agents, autonomous team choice, or a reputation system, could provide deeper insights.

BibTeX

@misc{tonini2025superadditivecooperationlanguagemodel,
      title={Super-additive Cooperation in Language Model Agents},
      author={Filippo Tonini and Lukas Galke},
      year={2025},
      eprint={2508.15510},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2508.15510},
}