3. expressiveintelligencestudio UC Santa Cruz
Real-Time Strategy Games
Building human-level AI for RTS games
remains an open research challenge
[Image: StarCraft II, Blizzard Entertainment]
4. Task Environment Properties

Property                         Chess          StarCraft       Taxi Driving
Fully vs. partially observable   Fully          Partially       Partially
Deterministic vs. stochastic     Deterministic  Deterministic*  Stochastic
Episodic vs. sequential          Sequential     Sequential      Sequential
Static vs. dynamic               Static         Dynamic         Dynamic
Discrete vs. continuous          Discrete       Continuous      Continuous
Single vs. multiagent            Multi          Multi           Multi

[Russell & Norvig 2009]
5. Motivation
RTS games present complex environments
and complex tasks
Professional players demonstrate a broad
range of reasoning capabilities
Human behavior can be observed, emulated,
and evaluated
[Langley 2011, Mateas 2002]
7. Research Questions
What competencies are necessary for
expert StarCraft gameplay?
Which competencies can be learned
from demonstrations?
How can these competencies be
integrated in a real-time agent?
11. Gameplay Scales in StarCraft
Scales of coordination: Individual, Squad, Global
Examples: supporting a siege line, worker harassment, aggressive mine placement
12. State Space
The number of possible states, considering only unit type and location:
(Types × X × Y)^Units
States on a 256×256 tile map:
(100 × 256 × 256)^1700 > 10^11,500
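The magnitude of this bound is easy to check numerically. The short sketch below plugs in the slide's numbers (100 unit types, a 256×256 tile map, up to 1700 units) and works in log space, since the raw count has over 11,000 digits:

```python
import math

# State-space bound from the slide: (Types * X * Y) ** Units,
# with 100 unit types, a 256x256 tile map, and up to 1700 units.
TYPES, X, Y, UNITS = 100, 256, 256, 1700

# The raw count has over 11,000 digits, so work in log10 space.
log10_states = UNITS * math.log10(TYPES * X * Y)

print(int(log10_states))      # roughly 11,588 digits
print(log10_states > 11_500)  # True: exceeds 10 ** 11,500
```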
13. Decision Complexity
The set of possible actions that can be executed at a particular moment:
O(2^W(A × P) + 2^T(D + S) + B(R + C))
W – number of workers
A – number of worker assignment types
P – average number of workplaces
T – number of troops
D – number of movement directions
[Aha et al. 2005]
14. Decision Complexity
The set of possible actions that can be executed at a particular moment, assuming unit actions can be selected independently:
O(W × A × P + T × D × S + B(R + C))
Resulting complexity: assuming 50 worker units on a 256×256 tile map results in more than 1,000,000 possible actions
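The independent-action estimate can be sanity-checked numerically. Only W = 50 and the map size come from the slide; the other parameter values below are illustrative assumptions, chosen to show how the worker term alone can pass one million:

```python
# Branching-factor estimate once unit actions are selected
# independently: O(W*A*P + T*D*S + B*(R + C)).
def decision_complexity(W, A, P, T, D, S, B, R, C):
    workers = W * A * P      # workers x assignment types x workplaces
    troops = T * D * S       # troops x directions x targets
    buildings = B * (R + C)  # buildings x (research + training options)
    return workers + troops + buildings

# 50 workers (from the slide); the remaining values are assumptions.
# With thousands of reachable workplaces on a 256x256 map, the worker
# term alone already exceeds one million actions.
total = decision_complexity(W=50, A=5, P=4096, T=20, D=8, S=10, B=10, R=4, C=6)
print(total > 1_000_000)  # True
```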
17. Multi-Scale AI
Multiple scales
Actions are performed across multiple
levels of coordination
Interrelated tasks
Performance in each task impacts other tasks
Real-time
Actions are performed in real time
18. Reactive Planning
Provides useful mechanisms for building
multi-scale agents
Advantages
Efficient behavior selection
Interleaved plan expansion and execution
Disadvantages
Lacks deliberative capabilities
[Loyall 1997, Mateas 2002]
19. Agent Design
Implemented in the ABL reactive planning
language
Architecture
Extension of McCoy & Mateas integrated agent
framework
Partitions gameplay into distinct competencies
Uses a blackboard for coordination
[McCoy & Mateas 2008]
20. EISBot Managers
Strategy Manager
Income Manager – Gather Resources
Production Manager – Construct Buildings
Tactics Manager – Attack Opponent
Recon Manager – Scout Opponent
21. Multi-Scale Idioms
Design patterns for authoring multi-scale AI
Idioms
Message passing
Daemon behaviors
Managers
Unit subtasks
Behavior locking
22. Idioms in EISBot
[Diagram: an Initial_tree subgoals the Tactics, Strategy, and Income Managers. Subgoaled behaviors include Form Squad, Squad Attack, Squad Retreat, Attack Enemy, and Pump Probes; daemon behaviors include Squad Monitor and Dragoon Dance; message passing occurs via the Timing Attack WME and Probe Stop WME.]
23. Multi-Scale AI
StarCraft gameplay is multi-scale
Reactive planning provides mechanisms for
multi-scale reasoning
Idioms are applied in EISBot to support
StarCraft gameplay
25. Learning from Demonstration
Objective
Emulate capabilities exhibited by expert players
by harnessing gameplay demonstrations
Methods
Classification and regression model training
Case-based goal formulation
Parameter selection for model optimization
26. Strategy Prediction
Tasks:
Identify opponent build orders
Predict when buildings will be constructed
[Chart: Spawning Pool timing vs. game time (minutes)]
[Hsieh & Sun 2008]
27. Approach
Feature encoding:
Each player's actions are encoded in a single vector
Vectors are labeled using a build-order rule set
Features describe the game cycle when a unit or building type is first produced by a player:
f(x) = { t, the time when x is first produced by P
       { 0, if x was not (yet) produced by P
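A minimal sketch of this encoding, assuming replay actions arrive as (game cycle, type) pairs; the action format and the Protoss type names are illustrative assumptions:

```python
# Encode a player's replay as first-production times: f(x) is the
# game cycle when type x first appears, or 0 if it has not yet.
def encode_first_production(actions, types):
    features = {x: 0 for x in types}
    for cycle, produced in actions:
        # Record only the first occurrence of each type.
        if produced in features and features[produced] == 0:
            features[produced] = cycle
    return features

replay = [(120, "Pylon"), (300, "Gateway"), (450, "Gateway"), (600, "Zealot")]
vector = encode_first_production(
    replay, ["Pylon", "Gateway", "Zealot", "Cybernetics Core"])
print(vector)
# {'Pylon': 120, 'Gateway': 300, 'Zealot': 600, 'Cybernetics Core': 0}
```

The second Gateway at cycle 450 does not overwrite the first-production feature, and the never-built Cybernetics Core stays at 0, matching the definition of f(x).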
28. Strategy Prediction Results
[Chart: precision and recall vs. game time (0–12 minutes) for the NNge, Boosting, Rule Set, and State Lattice classifiers]
29. Strategy Learning
Task: learn build orders from demonstration
Trace Algorithm:
Converts replays to a trace representation
Formulates goals based on the most similar situation:
q = argmin_{c ∈ L} distance(s, c)
g = s + (q′ − q)
[Ontañón et al. 2010]
30. Trace Retrieval: Example
Consider a planning window of size 2
S = <3, 0, 1, 1>
T1 = <2, 0, 0.5, 1>
T2 = <3, 0, 0.7, 1>
T3 = <4, 1, 0.9, 1>
T4 = <4, 1, 1.1, 2>
31. Trace Retrieval: Step 1
The system retrieves the most similar case, q:
S = <3, 0, 1, 1>
T1 = <2, 0, 0.5, 1>
T2 = <3, 0, 0.7, 1>   ← q
T3 = <4, 1, 0.9, 1>
T4 = <4, 1, 1.1, 2>
34. Trace Retrieval: Step 4
g is computed:
S = <3, 0, 1, 1>
T1 = <2, 0, 0.5, 1>
T2 = <3, 0, 0.7, 1>
T3 = <4, 1, 0.9, 1>
T4 = <4, 1, 1.1, 2>
g = S + (T4 − T2) = <4, 1, 1.4, 2>
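The retrieval and goal-formulation steps can be sketched directly; Euclidean distance is an assumption (the slides do not fix the metric), and the numbers reproduce the worked example:

```python
import math

# Case-based goal formulation: retrieve the most similar trace entry q,
# look ahead by the planning window to q', and set g = s + (q' - q).
def formulate_goal(s, trace, window):
    # Only entries with a full planning window ahead are candidates.
    best = min(range(len(trace) - window),
               key=lambda i: math.dist(s, trace[i]))
    q, q_ahead = trace[best], trace[best + window]
    return [si + (qa - qi) for si, qi, qa in zip(s, q, q_ahead)]

S = [3, 0, 1, 1]
T = [[2, 0, 0.5, 1],   # T1
     [3, 0, 0.7, 1],   # T2 <- retrieved as q (closest to S)
     [4, 1, 0.9, 1],   # T3
     [4, 1, 1.1, 2]]   # T4 <- q' for a window of 2
g = formulate_goal(S, T, window=2)  # approximately <4, 1, 1.4, 2>
```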
35. Strategy Learning Results
Opponent modeling with a window size of 20:
[Chart: prediction error (RMSE) vs. actions performed by the player, for the Null, IB1, Trace, and MultiTrace models]
36. State Estimation
Task
Estimate enemy positions
given prior observations
Particle Model
Apply movement model
Remove visible particles
Reweight particles
[Thrun 2002, Bererton 2004]
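The three update steps above can be sketched as follows; the random-walk movement model and the decay constant are illustrative assumptions, not the talk's learned trajectory weights:

```python
import random

# One particle per hypothesized enemy position, with a confidence weight.
class Particle:
    def __init__(self, x, y, weight=1.0):
        self.x, self.y, self.weight = x, y, weight

def update(particles, visible_tiles, decay=0.95, step=1.0):
    survivors = []
    for p in particles:
        # 1. Apply the movement model (a random walk, for illustration).
        p.x += random.uniform(-step, step)
        p.y += random.uniform(-step, step)
        # 2. Remove particles on tiles the player can currently see:
        #    if the enemy were there, it would have been observed.
        if (int(p.x), int(p.y)) in visible_tiles:
            continue
        # 3. Reweight: confidence decays as the last observation ages.
        p.weight *= decay
        survivors.append(p)
    return survivors
```

Each frame, the agent would feed in the currently visible tiles and read off the highest-weight surviving particles as estimated enemy positions.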
37. Parameter Selection
Free parameters
Trajectory weights
Decay rates
State estimation is represented as an
optimization problem
Input: parameter weights
Output: particle model error
Replays are used to implement a particle model
error function
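A toy version of this optimization loop; the quadratic stand-in error function (with an assumed minimum at a decay rate of 0.9) replaces the replay-derived particle-model error used in the talk:

```python
# Treat state estimation as optimization: sweep candidate parameter
# settings and keep the one that minimizes the error function.
def optimize(error_fn, candidates):
    return min(candidates, key=error_fn)

# Stand-in for the replay-derived error; assumed minimum at decay = 0.9.
error_fn = lambda decay: (decay - 0.9) ** 2

candidates = [i / 100 for i in range(80, 100)]  # decay rates 0.80 .. 0.99
best_decay = optimize(error_fn, candidates)
print(best_decay)  # 0.9
```

The same sweep extends to multiple free parameters (trajectory weights plus decay rates) by searching over tuples instead of single values.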
38. State Estimation Results
[Chart: threat prediction error vs. game time (minutes) for the Null Model, Perfect Tracker, Default Model, and Optimized Model]
39. Learning from Demonstration
Anticipation
Classification and regression models
Adaptation
Case-based goal formulation
Estimation
Model optimization
49. Integrating Learning
ABL agents can be interfaced with external
learning components
Applying the goal-driven autonomy (GDA) model enabled tighter
coordination across capabilities
EISBot incorporates ABL behaviors, a particle
model, and a GDA implementation
50. Evaluation
Claim
Reproducing expert-level StarCraft
gameplay involves integrating
heterogeneous reasoning capabilities
Experiments
Ablation studies
User study
52. GDA Results
Overall results from the GDA experiments:

Agent       Win Ratio
Base        0.73
Formulator  0.77
Predictor   0.81
GDA         0.92
53. User Study
Experiment setup:
Matches hosted on ICCup
3 trials
Testing script:
1. Launch StarCraft
2. Connect to server
3. Host match
4. Announce experiment
[Image: Dennis Fong, pro-gamer]
54. Performance on Tau Cross
[Chart: ICCup score vs. number of games played for the Base, Formulator, Predictor, and GDA agents]
56. EISBot Ranking
Percentile rankings achieved by the complete GDA agent:

Trial       Percentile Ranking
Longinus    32nd
Python      8th
Tau Cross   66th
Average     48th
57. Evaluation
Ablation Studies
Optimized particle model
Complete GDA model
Integrating additional capabilities into EISBot
improved performance
EISBot performed at the level of a competitive
amateur StarCraft player
58. Conclusion
Objective
Identify and realize capabilities necessary for
expert-level StarCraft gameplay in an agent
Approach
Decompose gameplay
Learn capabilities from demonstrations
Integrate learned gameplay models
Evaluate versus humans and agents
59. Contributions
Idioms for authoring multi-scale agents
Methods for learning from demonstration
Integration approaches for ABL agents
60. Integrating Learning in a Multi-Scale Agent
Ben G. Weber
Ph.D. Candidate
Expressive Intelligence Studio
UC Santa Cruz
bweber@soe.ucsc.edu
Funding
NSF Grant IIS – 1018954
61. References
Aha, Molineaux, & Ponsen. 2005. “Learning to Win: Case-Based Plan
Selection in a Real-Time Strategy Game”, Proceedings of ICCBR.
Bererton. 2004. “State Estimation for Game AI using Particle Filters”,
Proceedings of the AAAI Workshop on Challenges in Game AI.
Hsieh & Sun. 2008. “Building a Player Strategy Model by Analyzing Replays
of Real-Time Strategy Games”, Proceedings of IJCNN.
Langley. 2011. “Artificial Intelligence and Cognitive Systems”, AISB
Quarterly.
Loyall. 1997. “Believable Agents: Building Interactive Personalities”, Ph.D.
thesis, CMU.
Mateas. 2002. “Interactive Drama, Art and Artificial Intelligence”,
Ph.D. thesis, CMU.
McCoy & Mateas. 2008. “An Integrated Agent for Playing Real-Time
Strategy Games”, Proceedings of AAAI.
Molineaux, Klenk, Aha. 2010. “Goal-Driven Autonomy in a Navy Strategy
Simulation”, Proceedings of AAAI.
Muñoz-Avila, Aha, Jaidee, Klenk, Molineaux. 2010. “Applying Goal Driven
Autonomy to a Team Shooter Game”, Proceedings of FLAIRS.
Ontañón, Mishra, Sugandh, Ram. 2010. “On-line Case-Based Planning”,
Computational Intelligence.
Russell & Norvig. 2009. Artificial Intelligence: A Modern Approach.
Shannon. 1950. “Programming a Computer for Playing Chess”,
Philosophical Magazine.
Thrun. 2002. “Particle Filters in Robotics”, Proceedings of UAI.