Argumentation in Artificial Intelligence: From
Theory to Practice
Part 2: Practice!
Federico Cerutti Mauro Vallati
Cardiff University
University of Huddersfield
Table of contents
1. Assessing the State of the Art
2. Analysis of the State of the Art in Abstract Argumentation
3. Learning for Argumentation
1
Assessing the State of the Art
How to Select a Solver
I understand what argumentation is about, and I want to use it to solve
some of my problems. How do I pick the best solver(s)?
... or, how to compare solvers fairly
2
How to Select a Solver
Clearly, one may not have enough time, resources, benchmarks, or
experience, to run a full experimental comparison among solvers.
This is one of the reasons why standards are introduced and usually
exploited.
3
Standards
First, we need to define a standard way of comparing solvers.
Specifically:
• standard language for input and output
• challenging, diverse, and representative instances to deal with (aka,
benchmarks)
• or, ways for creating and selecting benchmarks
The larger and more diverse the set of available benchmarks, the higher
the probability the results of the comparison are relevant for your specific
set of instances and problems.
4
Something more about benchmarks
Benchmarks can be created using generators such as AFBenchGen [4, 5]
or Probo [6]
• Purely random generated AFs.
• AFs based on structured graphs
• Watts-Strogatz [16]
• Erdős-Rényi [9]
• Barabasi-Albert [1]
• Focus on Stable
• Focus on SCC
Otherwise, AFs generated by considering “applications”
• Planning
• Wikipedia pages
• etc..
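As a rough illustration of the "purely random" case, the sketch below draws each possible attack independently with a fixed probability (in the spirit of an Erdős-Rényi graph) and prints the AF as arg/att facts. It is not the algorithm used by AFBenchGen or Probo; the function names `random_af` and `to_facts` and the parameters are hypothetical.

```python
import random

def random_af(n_args, attack_prob, seed=0):
    """Return (arguments, attacks) for a randomly generated AF.

    Every ordered pair of arguments (self-attacks included) becomes an
    attack with probability attack_prob. Hypothetical helper, not the
    AFBenchGen or Probo generator.
    """
    rng = random.Random(seed)  # fixed seed keeps the benchmark reproducible
    args = [f"a{i}" for i in range(1, n_args + 1)]
    atts = [(s, t) for s in args for t in args if rng.random() < attack_prob]
    return args, atts

def to_facts(args, atts):
    """Serialise an AF as arg(...)/att(...) facts, one per line."""
    lines = [f"arg({a})." for a in args]
    lines += [f"att({s},{t})." for s, t in atts]
    return "\n".join(lines)

if __name__ == "__main__":
    arguments, attacks = random_af(n_args=5, attack_prob=0.3, seed=42)
    print(to_facts(arguments, attacks))
```

Fixing the seed matters here: two people comparing solvers should be able to regenerate exactly the same instances.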
5
Competitions in AI: problem solved?
Standardised way for comparing solvers.
6
Can I Blindly Trust Competition Results?
NO
Ok, let me elaborate on this...
7
Sources of Performance Variation
There are various sources of performance variation that affect results.
Your settings (in a wide sense) and needs can be very different from
those used during competitions
(Sorry Ariel, not only low-level details)
8
Sources of Performance Variation (1)
Solver randomisation and other stochastic effects
• Many solvers take advantage of randomisation
• Very different solver trajectories
• Computationally expensive to draw a complete figure of the
performance of a randomised solver [11]
Other sources: operating system, cache, shared hard drives..
[Figure: instances solved across 100 runs on application benchmarks for the top 3 SAT 2014 solvers (from [11])]
9
Sources of Performance Variation (2)
Running time and memory limits
• Generally, more running time or memory results in higher coverage
• The improvement in performance obtained with increased limits tends
not to be distributed evenly across all solvers
[Figure: coverage (number of instances) against memory limit (1 to 8 GB) for
Hpp-ce, Hpp, Hflow, SPMaS, Rlazya, CedalionGamer, DPMPlan, Dynamic-Gamer,
SymBA-2, SymBA-1, NuCeLaR, Metis, MIPlan, RIDA, and cGamer-bd]
IPC 2014: planners that perform extensive precomputation benefit more
from increased memory limits [14]
10
Sources of Performance Variation (3)
Hardware and Software environment
• Solvers are affected to varying degrees by different CPUs or other
hardware elements [10]
• Java, C++ compilers, libraries, Python, linkers, etc.
[Figure: coverage (number of instances) of Madagascar, YAHSP3-mt, Madagascar-pc,
YAHSP3, Probe, BFS-f, Mercury, Jasper, ArvandHerd, USE, IBaCoP2, Cedalion, and
IBaCoP across the configurations gpj, Gpj, gPj, GPj, gpJ, GpJ, gPJ, and GPJ]
IPC 2014: coverage of top solvers with respect to the C++, Python, and Java versions
11
Sources of Performance Variation (4)
Choice of benchmark (distribution)
• Benchmarks should be challenging (not trivial, not too hard)
• What does challenging mean? (dynamic or static property?)[15]
• How to create them?
• How to select them?
12
Sources of Performance Variation (5)
Ranking mechanism: The techniques for aggregating results across the
set of benchmarks strongly affect the outcome of a competition [14]
Two main orthogonal dimensions:
• What metrics do we care about?
• Absolute vs relative ranking
• Example: IPC score, coverage, Borda ranking, PAR10..
13
Are Competitions Useful?
Don’t get me wrong, competitions in AI are awesome.
• Foster the advancement of the state of the art
• Provide a large set of benchmarks
• Support the standardisation
• Provide a large number of ready-to-use solvers
• Highlight issues that need to be tackled by the community (e.g.,
areas not receiving enough attention, lack of applications, etc.)
14
A Pinch of Salt
Results from competitions in AI cannot necessarily be easily generalised.
They refer to the considered solvers, solving the selected benchmarks,
ordered according to selected metrics, run on the specific hardware and
software configuration used during the competition.
15
Analysis of the State of the Art
in Abstract Argumentation
IPC Score
\[
\mathrm{IPC}(s, P) =
\begin{cases}
0 & \text{if } P \text{ is unsolved} \\
\dfrac{1}{1 + \log_{10}\left(\dfrac{t_P(s)}{T^*_P}\right)} & \text{otherwise}
\end{cases}
\]
$t_P(s)$ denotes the time needed by solver $s$ to solve $P$
$T^*_P$ is the minimum amount of time required by any
considered solver to solve $P$
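A minimal sketch of the IPC score as defined above, assuming runtimes are given in seconds and an unsolved instance is represented by `None`. The function name and the example runtimes are hypothetical.

```python
import math

def ipc_score(runtime, best_runtime):
    """IPC score of one solver on one instance.

    runtime: seconds needed by the solver, or None if unsolved.
    best_runtime: minimum time needed by any considered solver (T*_P).
    """
    if runtime is None:
        return 0.0
    return 1.0 / (1.0 + math.log10(runtime / best_runtime))

# Hypothetical runtimes (seconds) of three solvers on a single instance.
runtimes = {"solver_a": 2.0, "solver_b": 20.0, "solver_c": None}
best = min(t for t in runtimes.values() if t is not None)
scores = {name: ipc_score(t, best) for name, t in runtimes.items()}
```

The fastest solver gets 1, a solver ten times slower gets 0.5, and an unsolved instance gets 0, so the score rewards speed on a logarithmic scale.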
16
PAR10 score
Penalised Average Runtime 10.
\[
\mathrm{PAR10}(s, P) =
\begin{cases}
10 \cdot T & \text{if } P \text{ is unsolved} \\
t_P(s) & \text{otherwise}
\end{cases}
\]
$T$ indicates the considered timeout
$t_P(s)$ denotes the time needed by solver $s$ to solve $P$
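The per-instance definition above is usually averaged over a benchmark set. The sketch below does that, assuming `None` marks an unsolved instance and that a run exceeding the timeout also counts as unsolved; the function name and sample values are hypothetical.

```python
def par10(runtimes, timeout):
    """Average PAR10 over a list of runtimes (seconds, None = unsolved)."""
    penalised = [
        10 * timeout if (t is None or t > timeout) else t
        for t in runtimes
    ]
    return sum(penalised) / len(penalised)

# With a 600-second timeout, a solver that solves nothing scores 6000.0.
nothing_solved = par10([None, None, None], timeout=600)
half_solved = par10([100.0, None], timeout=600)
```

A PAR10 of 6000.0 under a 600-second timeout therefore corresponds to zero coverage.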
17
ICCMA 2015 (1)
Four Semantics:
• complete (CO)
• preferred (PR)
• grounded (GR)
• stable (ST)
Four computational tasks:
• determine some extension (SE)
• determine all extensions (EE)
• decide whether a given argument is contained in some extension
(DC)
• decide whether a given argument is contained in all extensions (DS)
18
ICCMA 2015 (2)
18 solvers, tested on 192 AFs
10 minutes and 4 GB of RAM for solving a task.
1 point for each solved instance (used for in-track ranking).
General ranking done using Borda score.
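As an illustration of how a Borda-style aggregation combines per-track rankings, here is a generic Borda count. It is a sketch of the general technique, not necessarily the exact scheme used by ICCMA 2015; the solver names and rankings are made up.

```python
from collections import defaultdict

def borda(rankings):
    """Aggregate rankings (best solver first) with a plain Borda count.

    In a track with n solvers, the solver in position i (0-based)
    receives n - 1 - i points. Ties in total are broken by name.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, solver in enumerate(ranking):
            scores[solver] += n - 1 - position
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Three hypothetical tracks, each ranking the same three solvers.
tracks = [["s1", "s2", "s3"], ["s2", "s1", "s3"], ["s2", "s3", "s1"]]
overall = borda(tracks)
```

Note that `s1` wins one track but comes second overall, which is one reason the choice of aggregation matters.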
19
Main Classes of Solvers
Solvers that took part in ICCMA 2015 can be (roughly) classified as
• reduction-based approaches: the argumentation problem is
encoded as a known problem such as SAT, ASP, MAX-SAT, etc.
• Can exploit availability of well-engineered solvers and established
techniques.
• direct approaches: the argumentation problem is tackled directly.
20
ICCMA 2015 – Results
EE-PR
1. Cegartix
2. ArgSemSAT
3. CoQuiAAS
4. ASPARTIX-V
5. LabSATSolver
6. prefMaxSAT
7. ASGL
8. ASPARTIX-D
9. ConArg
10. ArgTools
11. . . .
EE-ST
1. ASPARTIX-D
2. ArgSemSAT
3. CoQuiAAS
4. ASGL
5. ConArg
6. ArgTools
7. LabSATSolver
8. DIAMOND
9. Dungell
Carneades
ASSA
21
ICCMA 2015: Impression
First Impression:
Reduction-based systems
are the most efficient
22
Is That Always the Case?
EE-PR
              All                  Barabasi-Albert Erdős-Rényi     StableM         Watts-Strogatz
Solver        PAR10   Cov.   F.t   PAR10   Cov.    PAR10   Cov.    PAR10   Cov.    PAR10   Cov.
Cegartix      1350.4  79.1   229   1662.6  74.2    1266.6  81.0    1439.2  77.0    1028.6  84.2
ArgSemSAT     1916.2  69.1   35    3532.3  41.9    433.7   94.2    2530.9  58.7    1171.1  81.5
LabSATSolver  2050.3  66.8   9     3430.7  43.5    261.3   96.5    2869.5  53.0    1657.5  73.9
prefMaxSAT    2057.2  66.8   273   3482.1  42.9    444.0   94.2    3625.2  40.3    697.5   89.4
DIAMOND       2417.0  61.0   1     3447.8  43.2    1366.7  79.0    2831.8  53.7    2026.0  68.0
ASPARTIX-D    2728.6  56.1   4     4101.5  32.6    3067.8  51.6    2068.8  66.7    1630.3  74.3
ASPARTIX-V    2772.2  55.2   21    3646.6  40.3    3292.6  47.1    2340.7  62.0    1772.4  71.9
CoQuiAas      3026.4  50.5   78    3736.1  38.4    2873.4  53.5    2836.4  53.3    2645.1  57.1
ASGL          3477.3  43.2   1     4809.7  20.3    96.1    100.0   4475.4  26.0    4585.5  25.4
Conarg        3696.3  39.3   158   1128.7  81.6    2813.9  55.8    4934.6  18.3    6000.0  0.0
ArgTools      3906.2  35.2   322   3694.4  39.0    45.2    100.0   6000.0  0.0     6000.0  0.0
GRIS          4543.7  24.4   174   254.6   96.1    6000.0  0.0     6000.0  0.0     6000.0  0.0
23
State of the Art
• Reduction-based solvers do not always outperform non-reduction-based
systems;
• State-of-the-art solvers show a high level of complementarity
(especially those able to deal with EE-PR problems), and are thus
suitable for being combined in portfolios.
24
Parallelising the Reasoning Process
ICCMA focused on sequential solvers. Can we parallelise?
25
Parallelising the Reasoning Process
Quick and clean solution: run multiple solvers in parallel.
Strengths
• Easy to implement
• Low overhead of communication
Weaknesses
• No information shared among the solvers
• Does not make it possible to solve instances that are too large for
sequential solvers
26
Parallelising the Reasoning Process
Example: P-SCC-REC [7], for enumerating preferred extensions in large
AFs.
It leverages the notion of Strongly Connected Components (SCCs) and the
SCC-recursiveness schema for defining extension-based semantics [2]
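The first ingredient, splitting an AF into its strongly connected components, can be sketched with Tarjan's algorithm. This only illustrates the SCC decomposition, not P-SCC-REC itself; the attack relation below is hypothetical and merely chosen so that the arguments group into {a, b}, {c, d}, {e, f}, and {g, h}.

```python
def strongly_connected_components(args, atts):
    """Tarjan's algorithm over the attack graph; returns a list of frozensets."""
    successors = {a: [] for a in args}
    for source, target in atts:
        successors[source].append(target)

    index, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in successors[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(frozenset(component))

    for a in args:
        if a not in index:
            visit(a)
    return sccs

# Hypothetical attacks: mutual attacks inside each pair, one-way across pairs.
arguments = list("abcdefgh")
attacks = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("d", "c"),
           ("b", "e"), ("e", "f"), ("f", "e"), ("d", "g"), ("f", "h"),
           ("g", "h"), ("h", "g")]
components = strongly_connected_components(arguments, attacks)
```

The recursive formulation is fine for small examples; very large AFs would need an iterative variant to avoid Python's recursion limit.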
27
P-SCC-REC: idea
Creation of the SCCs-tree structure: {S1, S2}, {S3}, where S1 = {c, d},
S2 = {e, f}, and S3 = {g, h}.
[Figure: example AF over the arguments a, b, c, d, e, f, g, h, with its SCCs arranged in Level 1 and Level 2]
28
P-SCC-REC: Results
29
Learning for Argumentation
What does “Learning” Mean?
I have a set of AFs that I want to analyse, I know the problem I am
working on, and I have picked a solver that works decently.
...but, in order to deploy the system, I need it to be faster.
Let’s learn something then.
30
Learning: idea
Generic solver
Knowledge (about the problem, solver, ...)
Knowledge-boosted approach
31
However...
Extracting additional knowledge could, in principle, be easy. But...
32
Which Kind of Knowledge?
• Combination and Selection of solvers
• Configuration of solvers
• Configuration (Reformulation) of AFs
Here we focus on knowledge that can be automatically extracted.
33
Combining and Selecting Solvers
(Solver selection can be seen as a particular case of portfolio
configuration)
• Static: the same portfolio is used for analysing any AF
• Dynamic: portfolio is configured according to some characteristics of
the AF
34
Static Portfolio: Process
35
Static Portfolio
Defined by:
1. the selected solvers;
2. the order in which solvers will be run; and
3. the runtime allocated to each solver.
36
Static Portfolio: Approaches
In [8] two approaches were proposed:
Shared-k
Each component solver is allocated maxRuntime/k seconds. Solvers are
selected and ordered according to their overall PAR10
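A static portfolio is just an ordered list of (solver, time budget) pairs, and Shared-k is one way of filling it in. The sketch below assumes the training runtimes of each solver are available as a list (seconds, `None` for unsolved) and simulates the sequential execution of the schedule on one instance. All names and numbers are hypothetical, and this is not the implementation from [8].

```python
def par10(runtimes, timeout):
    """Average PAR10 over a list of runtimes (seconds, None = unsolved)."""
    penalised = [10 * timeout if (t is None or t > timeout) else t
                 for t in runtimes]
    return sum(penalised) / len(penalised)

def shared_k(training_runtimes, k, max_runtime):
    """Pick the k solvers with the best (lowest) PAR10; share time equally."""
    ranked = sorted(training_runtimes,
                    key=lambda s: par10(training_runtimes[s], max_runtime))
    return [(solver, max_runtime / k) for solver in ranked[:k]]

def run_portfolio(schedule, runtimes, instance):
    """Simulated time to solve `instance` with the schedule, or None."""
    elapsed = 0.0
    for solver, budget in schedule:
        t = runtimes[solver][instance]
        if t is not None and t <= budget:
            return elapsed + t
        elapsed += budget  # the solver used up its whole slice
    return None

# Hypothetical runtimes of three solvers on three instances.
data = {"A": [10.0, None, 500.0],
        "B": [None, 50.0, None],
        "C": [400.0, 400.0, 400.0]}
schedule = shared_k(data, k=2, max_runtime=600)
```

In this toy data the solver with the best PAR10 (`C`) needs 400 seconds everywhere, so halving its budget makes it useless. That is the kind of effect that explains why Shared-k can lose to its own best component.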
FDSS
Starting from an empty portfolio, we iteratively either add a new
component solver or extend the CPU time allocated to a solver already in
the portfolio, depending on which option most improves the PAR10 score
of the portfolio
37
Dynamic Portfolio: Process
38
Dynamic Portfolio
For each AF, a vector of features is computed.
Similar instances should have similar feature vectors.
Portfolios are configured using empirical performance models
39
Dynamic Portfolio: Features
Features can be extracted from different representations of an AF [3].
E.g., Directed graph representation.
• Graph size features: number of vertices, number of edges, ratios
vertices/edges and inverse, and graph density
• Degree features: average, standard deviation, maximum, minimum
degree values across the nodes in the graph.
• SCC features: number of SCCs, average, standard deviation, maximum
and minimum size.
• Graph structure: presence of auto-loops, number of isolated
vertices, etc.
Similarly, features can be extracted from the undirected graph or
matrix representations.
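Some of the directed-graph features listed above can be computed as follows. This is a sketch for a non-empty AF, not the feature extractor of [3]: the dictionary keys are hypothetical, the degree of a node is taken as in-degree plus out-degree, and density is assumed to be edges / (n · (n − 1)). SCC features are left out.

```python
from statistics import mean, pstdev

def graph_features(args, atts):
    """Size, degree, and structure features of a non-empty AF."""
    n, m = len(args), len(atts)
    degree = {a: 0 for a in args}
    for source, target in atts:
        degree[source] += 1  # outgoing attack
        degree[target] += 1  # incoming attack (a self-attack counts twice)
    values = list(degree.values())
    return {
        "vertices": n,
        "edges": m,
        "vertices_per_edge": n / m if m else 0.0,
        "edges_per_vertex": m / n,
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "degree_mean": mean(values),
        "degree_std": pstdev(values),
        "degree_max": max(values),
        "degree_min": min(values),
        "self_loops": sum(1 for s, t in atts if s == t),
        "isolated": sum(1 for d in values if d == 0),
    }

# Small AF with three arguments and four attacks, one of them a self-attack.
features = graph_features(
    ["a1", "a2", "a3"],
    [("a1", "a3"), ("a2", "a2"), ("a3", "a1"), ("a3", "a2")],
)
```

The resulting dictionary can be flattened into the feature vector used by the empirical performance models.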
40
Dynamic Portfolio: Approaches
Classification-based
Classify
It classifies a given AF into a single category, corresponding to the single solver
predicted to be the fastest, and allocates all the available CPU time to that solver
Regression-based
1-Regression
Given the predicted runtime of each solver, the solver predicted to be the fastest is
selected and is allocated all the available CPU time
M-Regression
Initially the solver predicted to be the fastest is selected, but it is allocated only its
predicted CPU time plus 10%. If that solver does not solve the given AF in the
allocated time, it is stopped and is no longer available for selection, and the process
iterates by selecting a different solver
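The M-Regression loop can be simulated as below. The predicted runtimes would come from the empirical performance models; here they are simply given as a hypothetical dictionary. The sketch assumes that "a different solver" means the next-fastest predicted solver still available, and that a budget never exceeds the time left.

```python
def m_regression(predicted, actual, timeout, slack=0.10):
    """Simulate M-Regression on one AF.

    predicted: {solver: predicted runtime in seconds}
    actual:    {solver: real runtime in seconds, or None if it never solves}
    Returns (solver that solved the AF, total elapsed time) or (None, None).
    """
    available = dict(predicted)
    elapsed = 0.0
    remaining = float(timeout)
    while available and remaining > 0:
        solver = min(available, key=available.get)  # fastest prediction
        budget = min(available.pop(solver) * (1 + slack), remaining)
        runtime = actual[solver]
        if runtime is not None and runtime <= budget:
            return solver, elapsed + runtime
        elapsed += budget  # stopped: the solver is not considered again
        remaining -= budget
    return None, None

# Hypothetical case: A is predicted fastest but the prediction is far off.
winner, total = m_regression(
    predicted={"A": 20.0, "B": 100.0},
    actual={"A": 50.0, "B": 80.0},
    timeout=600,
)
```

Here `A` wastes its 22-second slice before `B` solves the AF, so a wrong prediction costs only a small, bounded amount of time instead of the whole timeout.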
41
Some interesting results when using representative training instances:
EE-PR
System Cov. PAR10
VBS 91.4 562.9
Classify 89.7 665.2
1-Regression 88.6 734.7
M-Regression 82.8 1068.3
FDSS 80.0 1311.4
Cegartix 79.1 1350.4
Shared-2 73.2 1678.0
Shared-3 69.4 1892.0
ArgSemSAT 69.1 1916.2
LabSATSolver 66.8 2050.3
prefMaxSAT 66.8 2057.2
Shared-4 65.7 2105.5
Shared-5 63.3 2240.3
DIAMOND 61.0 2417.0
ASPARTIX-D 56.1 2728.6
ASPARTIX-V 55.2 2772.2
CoQuiAas 50.5 3026.4
ASGL 43.2 3477.3
Conarg 39.3 3696.3
ArgTools 35.2 3906.2
GRIS 24.4 4543.7
42
Selection of Solvers
EE-PR
System Class. M-Reg.
ArgSemSAT 0 253
ArgTools 311 305
ASGL 6 36
ASPARTIX-D 2 80
ASPARTIX-V 1 99
Cegartix 221 403
Conarg 157 122
CoQuiAas 43 44
DIAMOND 0 65
GRIS 153 278
LabSATSolver 13 208
prefMaxSAT 297 301
43
Leave-one-set-out Scenario: Can We Generalise?
EE-PR
Barabasi-Albert Erdős-Rényi StableM Watts-Strogatz
System Cov. PAR10 Cov. PAR10 Cov. PAR10 Cov. PAR10
Classify 78.9 1321.4 88.6 745.0 74.4 1574.3 89.5 677.8
1-Regression 76.3 1479.0 63.0 2255.2 76.5 1453.9 83.0 1079.9
M-Regression 70.4 1828.4 67.3 2039.7 77.0 1434.7 79.6 1267.6
FDSS 69.1 1916.2 80.9 1245.5 79.1 1341.9 78.6 1380.0
Shared-2 73.2 1678.0 73.2 1678.0 74.2 1620.4 73.2 1678.0
Shared-3 69.4 1892.0 67.3 2007.9 69.5 1896.7 69.4 1892.0
Shared-4 65.7 2106.2 65.7 2101.1 65.7 2108.1 65.7 2103.9
Shared-5 63.3 2240.9 63.4 2235.8 63.3 2242.9 63.3 2242.9
44
Configuration of Algorithms
Solvers can be configured to improve performance on a class of problems
/ instances.
Image taken from [13].
45
Configuration of Algorithms
There exist several configuration approaches, based on different
underlying ideas.
For the sake of this talk, we focus on SMAC [12], which was used for
configuring ArgSemSAT
Image taken from [12].
46
Configuration of the Solver
Parameter Domain Default
SOLVER-ExtEnc {001111, 010101, 010111, ......, 111111} 101010
GLUCOSE-gc-frac [0.0, 500.0] 0.2
GLUCOSE-rnd-freq [0.0, 1.0] 0.0
GLUCOSE-cla-decay [0.0, 1.0] 0.999
GLUCOSE-max-var-decay [0.0, 1.0] 0.95
GLUCOSE-var-decay [0.0, 1.0] 0.8
GLUCOSE-phase-saving 0,1,2 2
GLUCOSE-ccmin-mode 0,1,2 2
GLUCOSE-K [0.0, 1.0] 0.8
GLUCOSE-R [1.0, 5.0] 1.4
GLUCOSE-szTrailQueue [10,10000] (int) 5000
GLUCOSE-szLBDQueue [10,10000] (int) 50
GLUCOSE-simp-gc-frac [0.0, 5000.0] 0.5
GLUCOSE-sub-lim [-1,10000] (int) 20
GLUCOSE-cl-lim [-1,10000] (int) 1000
GLUCOSE-grow [-10000,10000] (int) 0
GLUCOSE-incReduceDB [0,10000] (int) 300
GLUCOSE-firstReduceDB [0,10000] (int) 2000
GLUCOSE-specialIncReduceDB [0,10000] (int) 1000
GLUCOSE-minLBDFrozenClause [0,10000] (int) 30
47
Configuration of the Framework
Order arguments/attacks according to:
1. The number of attacks received;
2. The number of attacks to other arguments;
3. The presence of self-attacks;
4. The difference between the number of received attacks and the
number of attacks to other arguments;
5. Being an argument in a mutual attack.
In addition, arguments can be listed in direct or inverse order.
The orderings of arguments and of attacks are independent.
48
Configuration of the Framework (2)
a1 a3 a2
arg(a1).
arg(a2).
arg(a3).
att(a1,a3).
att(a2,a2).
att(a3,a1).
att(a3,a2).
arg(a2).
arg(a3).
arg(a1).
att(a2,a2).
att(a3,a2).
att(a3,a1).
att(a1,a3).
List of arguments ordered according to the number
of received attacks and, subsequently, the number
of outgoing attacks; and the list of attacks ordered
prioritising self-attacks and, subsequently, the
number of outgoing attacks
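The reordering in this example can be reproduced with two sorts. The sketch below orders arguments by received attacks and then outgoing attacks (both descending), and attacks by self-attack first and then by the attacker's number of outgoing attacks. The final tie-break, the position of the attacked argument in the reordered argument list, is an assumption made only so that the output matches the listing above; the function name is hypothetical.

```python
from collections import Counter

def reorder(args, atts):
    """Return (ordered arguments, ordered attacks) for an AF."""
    received = Counter(target for _, target in atts)
    outgoing = Counter(source for source, _ in atts)

    # Arguments: most attacked first, then most attacking first.
    ordered_args = sorted(args, key=lambda a: (-received[a], -outgoing[a]))
    position = {a: i for i, a in enumerate(ordered_args)}

    # Attacks: self-attacks first, then attackers with more outgoing attacks.
    # Tie-break on the target's position is an assumption (see lead-in).
    ordered_atts = sorted(
        atts,
        key=lambda st: (st[0] != st[1], -outgoing[st[0]], position[st[1]]),
    )
    return ordered_args, ordered_atts

new_args, new_atts = reorder(
    ["a1", "a2", "a3"],
    [("a1", "a3"), ("a2", "a2"), ("a3", "a1"), ("a3", "a2")],
)
```

The AF itself is unchanged; only the order in which the solver reads arguments and attacks differs, which is what makes this a configuration of the framework rather than of the solver.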
49
Parametrisation
Parameter Domain Default
args ingoingFirst [-1.0,1.0] 0
args outgoingFirst [-1.0,1.0] 0.2
args autoFirst [-1.0,1.0] -1
args eachOther [-1.0,1.0] -1
args differenceFirst [-1.0,1.0] -1
atts ingoingFirst [-1.0,1.0] 0
atts outgoingFirst [-1.0,1.0] 0
atts autoFirst [-1.0,1.0] 0.2
atts eachOther [-1.0,1.0] 0
atts differenceFirst [-1.0,1.0] 0
atts orders {0,1,2,3,4} 0
0 Same ordering applied to the first argument of the attack pair
1 Same ordering applied to the second argument of the attack pair
2 Inverse ordering applied to the first argument of the attack pair
3 Inverse ordering applied to the second argument of the attack pair
4 Attack-specific ordering
50
Results: Representative Training Instances
Set              Configuration  IPC Score  PAR10   Fastest (%)
Barabasi-Albert  Default        78.0       1921.0  2.5
                 Configured     125.2      1863.1  60.5
Erdős-Rényi      Default        56.8       3426.5  16.5
                 Configured     60.4       3329.2  18.0
Watts-Strogatz   Default        116.6      1967.3  28.0
                 Configured     118.1      1967.9  23.5
General          Default        110.0      1665.4  11.0
                 Configured     143.0      1376.8  62.5
51
Results: Cross-Validation
Training set     Test set
                 Barabasi-Albert  Erdős-Rényi  Watts-Strogatz  General
Barabasi-Albert  119.2            6.9          34.5            42.8
Erdős-Rényi      92.3             58.6         105.3           125.7
Watts-Strogatz   116.2            52.6         115.6           129.2
General          87.5             57.6         113.5           133.2
52
Configuration: Most Important Single Parameters
Set 1st 2nd 3rd
Barabasi-Albert S-ExtEnc (011111) G-firstReduceDB (1528) G-cla-decay (0.32)
Erdős-Rényi F-autoFirst (-1.00) G-rnd-freq (0.00) G-K (0.26)
Watts-Strogatz S-ExtEnc (101010) G-Grow (0) G-rnd-freq (0.08)
General S-ExtEnc (101010) G-R (2.09) G-cla-decay (0.99)
53
Configuration: Interaction Between Parameters
54
Learning for Argumentation: Summarising
Exploiting additional knowledge can help argumentation reasoners to
improve their runtime performance.
3 main approaches analysed so far:
• Portfolio / Algorithm Selection
• Algorithm Configuration
• Model Reformulation
55
Let’s move to the last bit of this tutorial.
55
References I
[1] A. Barabasi and R. Albert.
Emergence of scaling in random networks.
Science, 286(5439), 1999.
[2] P. Baroni and M. Giacomin.
A General Recursive Schema for Argumentation Semantics.
In Proceedings of the 14th European Conference on Artificial
Intelligence (ECAI 2004), pages 783–787.
[3] F. Cerutti, M. Giacomin, and M. Vallati.
Algorithm selection for preferred extensions enumeration.
In Computational Models of Argument - Proceedings of COMMA,
pages 221–232, 2014.
56
References II
[4] F. Cerutti, M. Giacomin, and M. Vallati.
Generating challenging benchmark AFs.
In Proceedings of COMMA, pages 457–458, 2014.
[5] F. Cerutti, M. Giacomin, and M. Vallati.
Generating challenging benchmark AFs: AFBenchGen2.
In Proceedings of COMMA, 2016.
[6] F. Cerutti, N. Oren, H. Strass, M. Thimm, and M. Vallati.
A benchmark framework for a computational argumentation
competition.
In Computational Models of Argument - Proceedings of COMMA,
pages 459–460, 2014.
57
References III
[7] F. Cerutti, I. Tachmazidis, M. Vallati, S. Batsakis, M. Giacomin,
and G. Antoniou.
Exploiting parallelism for hard problems in abstract
argumentation.
In Proceedings of the Twenty-Ninth AAAI Conference on Artificial
Intelligence, pages 1475–1481, 2015.
[8] F. Cerutti, M. Vallati, and M. Giacomin.
Where are we now? state of the art and future trends of
solvers for hard argumentation problems.
In Computational Models of Argument - Proceedings of COMMA,
pages 207–218, 2016.
[9] P. Erdős and A. Rényi.
On random graphs. I.
Publicationes Mathematicae Debrecen, 6:290–297, 1959.
58
References IV
[10] A. E. Howe and E. Dahlman.
A critical assessment of benchmark comparison in planning.
J. Artif. Intell. Res. (JAIR), 17:1–3, 2002.
[11] B. Hurley and B. O’Sullivan.
Statistical regimes and runtime prediction.
In Proceedings of the Twenty-Fourth International Joint Conference
on Artificial Intelligence, IJCAI, pages 318–324, 2015.
[12] F. Hutter, H. H. Hoos, K. Leyton-Brown, and K. P. Murphy.
Time-bounded sequential parameter optimization.
In Learning and Intelligent Optimization, 4th International
Conference, LION, pages 281–298, 2010.
[13] F. Hutter, H. H. Hoos, K. Leyton-Brown, and T. Stützle.
ParamILS: An automatic algorithm configuration framework.
J. Artif. Intell. Res. (JAIR).
59
References V
[14] C. Linares López, S. J. Celorrio, and A. G. Olaya.
The deterministic part of the seventh international planning
competition.
Artif. Intell., 223:82–119, 2015.
[15] M. Vallati and T. Vaquero.
Towards a protocol for benchmark selection in IPC.
In Proceedings of the 4th Workshop on the International Planning
Competition (WIPC), 2015.
[16] D. J. Watts and S. H. Strogatz.
Collective dynamics of ’small-world’ networks.
Nature, 393(6684):440–442, 1998.
60

Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

Argumentation in Artificial Intelligence: From Theory to Practice (Practice)

  • 1. Argumentation in Artificial Intelligence: From Theory to Practice Part 2: Practice! Federico Cerutti Mauro Vallati Cardiff University University of Huddersfield
  • 2. Table of contents 1. Assessing the State of the Art 2. Analysis of the State of the Art in Abstract Argumentation 3. Learning for Argumentation 1
  • 3. Assessing the State of the Art
  • 4. How to Select a Solver I understand what argumentation is about, and I want to use it to solve some of my problems. How do I pick the best solver(s)? ... or, how do I compare solvers fairly?
  • 5. How to Select a Solver Clearly, one may not have enough time, resources, benchmarks, or experience, to run a full experimental comparison among solvers. This is one of the reasons why standards are introduced and usually exploited. 3
  • 6. Standards
      First, we need to define a standard way of comparing solvers. Specifically:
      • a standard language for input and output
      • challenging, diverse, and representative instances to deal with (aka benchmarks)
      • or, ways of creating and selecting benchmarks
      The larger and more diverse the set of available benchmarks, the higher the probability that the results of the comparison are relevant for your specific set of instances and problems.
  • 7. Something more about benchmarks
      Benchmarks can be created using generators such as AFBenchGen [4, 5] or Probo [6]:
      • Purely randomly generated AFs
      • AFs based on structured graphs
        • Watts-Strogatz [16]
        • Erdős-Rényi [9]
        • Barabasi-Albert [1]
      • Focus on Stable
      • Focus on SCC
      Otherwise, AFs can be generated by considering “applications”:
      • Planning
      • Wikipedia pages
      • etc.
  • 8. Competitions in AI: problem solved? Standardised way for comparing solvers. 6
  • 9. Can I Blindly Trust Competition Results? NO Ok, let me elaborate on this... 7
  • 10. Sources of Performance Variation There are various sources of performance variation that affect results. Your settings (in a wide sense) and needs can be very different from those used during competitions (Sorry Ariel, not only low-level details) 8
  • 12. Sources of Performance Variation (1)
      Solver randomisation and other stochastic effects
      • Many solvers take advantage of randomisation
      • Solver trajectories can differ widely from run to run
      • It is computationally expensive to draw a complete picture of the performance of a randomised solver [11]
      Other sources: operating system, cache, shared hard drives, ...
      [Figure: instances solved across 100 runs on application benchmarks for the top 3 SAT 2014 solvers (from [11])]
  • 14. Sources of Performance Variation (2)
      Running time and memory limits
      • Generally, more running time or memory results in higher coverage
      • The improvement obtained with increased limits tends not to be distributed evenly across solvers
      [Figure: coverage (#instances) against memory limit (1 to 8 GB) for the IPC 2014 planners. Planners that perform extensive precomputation benefit more from increased memory limits [14]]
  • 16. Sources of Performance Variation (3)
      Hardware and software environment
      • Solvers are affected to varying degrees by different CPUs or other hardware elements [10]
      • Java, C++ compilers, libraries, Python, linkers, etc.
      [Figure: IPC 2014, coverage (#instances) of the top solvers with respect to the C++, Python, and Java versions]
  • 17. Sources of Performance Variation (4)
      Choice of benchmark (distribution)
      • Benchmarks should be challenging (not trivial, not too hard)
      • What does challenging mean? (Is it a dynamic or a static property?) [15]
      • How to create them?
      • How to select them?
  • 18. Sources of Performance Variation (5)
      Ranking mechanism: the techniques used for aggregating results across the set of benchmarks strongly affect the outcome of a competition [14]
      Two main orthogonal dimensions:
      • What metrics do we care about?
      • Absolute vs relative ranking
      • Examples: IPC score, coverage, Borda ranking, PAR10, ...
  • 20. Are Competitions Useful?
      Don’t get me wrong, competitions in AI are awesome. They:
      • Foster the advancement of the state of the art
      • Provide a large set of benchmarks
      • Support standardisation
      • Provide a large number of ready-to-use solvers
      • Highlight issues that need to be tackled by the community (e.g., areas not receiving enough attention, lack of applications, etc.)
  • 21. A Pinch of Salt Results from competitions in AI cannot necessarily be easily generalised. They refer to the considered solvers, solving the selected benchmarks, ordered according to selected metrics, run on the specific hardware and software configuration used during the competition. 15
  • 22. Analysis of the State of the Art in Abstract Argumentation
  • 23. IPC Score
      \[
      \mathrm{IPC}(s, P) =
      \begin{cases}
      0 & \text{if } P \text{ is unsolved} \\
      \dfrac{1}{1 + \log_{10}\left( \dfrac{t_P(s)}{T^*_P} \right)} & \text{otherwise}
      \end{cases}
      \]
      $t_P(s)$ denotes the time needed by solver $s$ to solve $P$.
      $T^*_P$ is the minimum amount of time required by any considered solver to solve $P$.
  • 24. PAR10 score
      Penalised Average Runtime 10.
      \[
      \mathrm{PAR10}(s, P) =
      \begin{cases}
      10 \cdot T & \text{if } P \text{ is unsolved} \\
      t_P(s) & \text{otherwise}
      \end{cases}
      \]
      $T$ indicates the considered timeout.
      $t_P(s)$ denotes the time needed by solver $s$ to solve $P$.
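The two scores defined above can be sketched in a few lines of Python. This is a minimal illustration, not code from the tutorial: the function names, the use of `None` for an unsolved instance, and the runtimes are all assumptions made for the example.

```python
import math


def ipc_score(runtime, best_runtime):
    """IPC score of one solver on one instance.

    runtime is t_P(s) in seconds, or None if the instance is unsolved;
    best_runtime is T*_P, the minimum runtime over all considered solvers.
    """
    if runtime is None:
        return 0.0
    return 1.0 / (1.0 + math.log10(runtime / best_runtime))


def par10(runtime, timeout):
    """PAR10 of one solver on one instance: 10 * T if unsolved, else t_P(s)."""
    return 10.0 * timeout if runtime is None else runtime


# Made-up runtimes (seconds) of two hypothetical solvers on two instances.
TIMEOUT = 600.0
runtimes = {"s1": [10.0, None], "s2": [100.0, 50.0]}

# T*_P for each instance: the fastest solver that solved it.
best = [min(t for t in column if t is not None) for column in zip(*runtimes.values())]

ipc = {s: [ipc_score(t, b) for t, b in zip(ts, best)] for s, ts in runtimes.items()}
avg_par10 = {s: sum(par10(t, TIMEOUT) for t in ts) / len(ts) for s, ts in runtimes.items()}
print(ipc, avg_par10)
```

The IPC score is relative (it depends on the other solvers through T*_P), while PAR10 is absolute, which is one reason the two metrics can rank the same solvers differently.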
  • 25. ICCMA 2015 (1)
      Four semantics:
      • complete (CO)
      • preferred (PR)
      • grounded (GR)
      • stable (ST)
      Four computational tasks:
      • determine some extension (SE)
      • determine all extensions (EE)
      • decide whether a given argument is contained in some extension (DC)
      • decide whether a given argument is contained in all extensions (DS)
  • 26. ICCMA 2015 (2)
      18 solvers, tested on 192 AFs.
      10 minutes and 4 GB of RAM for solving a task.
      1 point for each solved instance (used for in-track ranking).
      General ranking done using the Borda score.
  • 27. Main Classes of Solvers
      Solvers that took part in ICCMA 2015 can be (roughly) classified as
      • reduction-based approaches: the argumentation problem is encoded as a known problem such as SAT, ASP, MAX-SAT, etc.
        • These can exploit the availability of well-engineered solvers and established techniques.
      • direct approaches: the argumentation problem is tackled directly.
  • 28. ICCMA 2015 – Results
      EE-PR
      1. Cegartix
      2. ArgSemSAT
      3. CoQuiAAS
      4. ASPARTIX-V
      5. LabSATSolver
      6. prefMaxSAT
      7. ASGL
      8. ASPARTIX-D
      9. ConArg
      10. ArgTools
      11. . . .
      EE-ST
      1. ASPARTIX-D
      2. ArgSemSAT
      3. CoQuiAAS
      4. ASGL
      5. ConArg
      6. ArgTools
      7. LabSATSolver
      8. DIAMOND
      9. Dungell Carneades ASSA
  • 29. ICCMA 2015: Impression First Impression: Reduction-based systems are the most efficient 22
  • 30. Is That Always the Case? EE-PR
                                         All Barabasi-Albert     Erdős-Rényi         StableM  Watts-Strogatz
      Solver           PAR10    Cov.     F.t   PAR10    Cov.   PAR10    Cov.   PAR10    Cov.   PAR10    Cov.
      Cegartix        1350.4    79.1     229  1662.6    74.2  1266.6    81.0  1439.2    77.0  1028.6    84.2
      ArgSemSAT       1916.2    69.1      35  3532.3    41.9   433.7    94.2  2530.9    58.7  1171.1    81.5
      LabSATSolver    2050.3    66.8       9  3430.7    43.5   261.3    96.5  2869.5    53.0  1657.5    73.9
      prefMaxSAT      2057.2    66.8     273  3482.1    42.9   444.0    94.2  3625.2    40.3   697.5    89.4
      DIAMOND         2417.0    61.0       1  3447.8    43.2  1366.7    79.0  2831.8    53.7  2026.0    68.0
      ASPARTIX-D      2728.6    56.1       4  4101.5    32.6  3067.8    51.6  2068.8    66.7  1630.3    74.3
      ASPARTIX-V      2772.2    55.2      21  3646.6    40.3  3292.6    47.1  2340.7    62.0  1772.4    71.9
      CoQuiAas        3026.4    50.5      78  3736.1    38.4  2873.4    53.5  2836.4    53.3  2645.1    57.1
      ASGL            3477.3    43.2       1  4809.7    20.3    96.1   100.0  4475.4    26.0  4585.5    25.4
      Conarg          3696.3    39.3     158  1128.7    81.6  2813.9    55.8  4934.6    18.3  6000.0     0.0
      ArgTools        3906.2    35.2     322  3694.4    39.0    45.2   100.0  6000.0     0.0  6000.0     0.0
      GRIS            4543.7    24.4     174   254.6    96.1  6000.0     0.0  6000.0     0.0  6000.0     0.0
  • 31. State of the Art
      • Reduction-based solvers do not always outperform non-reduction-based systems;
      • State-of-the-art solvers show a high level of complementarity (especially those able to deal with EE-PR problems), so they are well suited to being combined in portfolios.
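One common way to quantify complementarity is to compare each solver's coverage with that of the virtual best solver (VBS), an oracle that picks the fastest solver for every instance. The sketch below is only an illustration under stated assumptions: the solver names and runtimes are made up, and `None` marks an instance that was not solved within the timeout.

```python
def coverage(runtimes):
    """Percentage of instances solved; None marks an unsolved instance."""
    return 100.0 * sum(t is not None for t in runtimes) / len(runtimes)


def virtual_best(runtimes_by_solver):
    """Per-instance best runtime over all solvers (None if nobody solved it)."""
    per_instance = zip(*runtimes_by_solver.values())
    return [
        min((t for t in times if t is not None), default=None)
        for times in per_instance
    ]


# Made-up runtimes (seconds) of two hypothetical solvers on four instances.
runtimes_by_solver = {
    "reduction_based": [5.0, None, 30.0, None],
    "direct": [None, 12.0, 40.0, None],
}

vbs = virtual_best(runtimes_by_solver)
for name, runtimes in runtimes_by_solver.items():
    print(name, coverage(runtimes))
print("VBS", coverage(vbs))
```

The wider the gap between the best single solver and the VBS, the more a portfolio can gain over any individual solver.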
  • 32. Parallelising the Reasoning Process ICCMA focused on sequential solvers. Can we parallelise? 25
  • 33. Parallelising the Reasoning Process
      Quick and clean solution: run multiple solvers in parallel.
      Strengths
      • Easy to implement
      • Low communication overhead
      Weaknesses
      • No information is shared among the solvers
      • Cannot solve instances that are too large for sequential solvers
  • 34. Parallelising the Reasoning Process
      Example: P-SCC-REC [7], for enumerating preferred extensions in large AFs. It leverages the notion of strongly connected components (SCCs) and the SCC-recursiveness schema for defining extension-based semantics [2].
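The "run multiple solvers in parallel" idea can be sketched as follows. This is a toy illustration: `fast_solver` and `slow_solver` are hypothetical stand-ins that sleep and return a canned answer, whereas real argumentation solvers would normally be launched as external processes and terminated once a winner is known.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fast_solver(framework):
    """Hypothetical stand-in for a solver that answers quickly."""
    time.sleep(0.01)
    return "answer from fast_solver"


def slow_solver(framework):
    """Hypothetical stand-in for a solver that needs more time."""
    time.sleep(0.3)
    return "answer from slow_solver"


def run_in_parallel(solvers, framework):
    """Start every solver on the same AF and return the first answer.

    No information is exchanged between the solvers. Note that leaving the
    `with` block still waits for the slower threads to finish; a real
    implementation would kill the remaining solver processes instead.
    """
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = {pool.submit(fn, framework): name for name, fn in solvers.items()}
        for finished in as_completed(futures):
            return futures[finished], finished.result()


winner, answer = run_in_parallel(
    {"fast_solver": fast_solver, "slow_solver": slow_solver},
    framework=None,  # placeholder: the stand-in solvers ignore their input
)
print(winner, answer)
```

The wall-clock time is that of the fastest solver on each instance, but an AF that no component solver can handle sequentially stays unsolved.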
  • 35. P-SCC-REC: idea
      Creation of the SCC-tree structure: {S1, S2}, {S3}, where S1 = {c, d}, S2 = {e, f}, and S3 = {g, h}.
      [Figure: AF over the arguments a, b, c, d, e, f, g, h, with its SCCs grouped into Level 1 and Level 2]
  • 36. P-SCC-REC: Results
      [Figure: plots not recoverable from the extracted text]
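To make the level structure concrete, the sketch below computes the SCCs of a small AF and groups them into levels, where an SCC sits one level below the deepest SCC attacking it. It is not the P-SCC-REC implementation from [7]: the attack relation is hypothetical (the edges of the slide's figure are not recoverable) and was chosen only so that the output has the same shape as the slide's example, and the arguments a and b are left out.

```python
from collections import defaultdict


def strongly_connected_components(nodes, attacks):
    """Tarjan's algorithm (recursive, fine for small AFs); returns frozensets."""
    successors = defaultdict(list)
    for source, target in attacks:
        successors[source].append(target)
    index, low, stack, on_stack, sccs = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in successors[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(frozenset(component))

    for v in nodes:
        if v not in index:
            visit(v)
    return sccs


def scc_levels(nodes, attacks):
    """Group SCCs by level: 1 + the deepest level among the SCCs attacking it."""
    sccs = strongly_connected_components(nodes, attacks)
    scc_of = {v: c for c in sccs for v in c}
    attackers = {c: set() for c in sccs}
    for source, target in attacks:
        if scc_of[source] != scc_of[target]:
            attackers[scc_of[target]].add(scc_of[source])
    level = {}

    def depth(c):
        if c not in level:
            level[c] = 1 + max((depth(p) for p in attackers[c]), default=0)
        return level[c]

    by_level = defaultdict(set)
    for c in sccs:
        by_level[depth(c)].add(c)
    return [by_level[k] for k in sorted(by_level)]


# Hypothetical attacks: c<->d, e<->f, g<->h, plus d->g and f->g.
nodes = list("cdefgh")
attacks = [("c", "d"), ("d", "c"), ("e", "f"), ("f", "e"),
           ("d", "g"), ("f", "g"), ("g", "h"), ("h", "g")]
levels = scc_levels(nodes, attacks)
print(levels)
```

SCCs on the same level do not attack each other, which is what makes them natural candidates for being processed in parallel.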
  • 39. What does “Learning” Mean? I have a set of AFs that I want to analyse, I know the problem I am working on, and I have picked a solver that works decently. ...but, in order to deploy the system, I need it to be faster. Let’s learn something then.
  • 42. Learning: idea Generic solver + knowledge (about the problem, solver, ...) → knowledge-boosted approach
  • 43. However... Extracting additional knowledge could, in principle, be easy. But... 32
  • 45. Which Kind of Knowledge?
      • Combination and selection of solvers
      • Configuration of solvers
      • Configuration (reformulation) of AFs
      Here we focus on knowledge that can be automatically extracted.
  • 46. Combining and Selecting Solvers
      (Solver selection can be seen as a particular case of portfolio configuration)
      • Static: the same portfolio is used for analysing any AF
      • Dynamic: the portfolio is configured according to some characteristics of the AF
  • 48. Static Portfolio
      Defined by:
      1. the selected solvers;
      2. the order in which solvers will be run; and
      3. the runtime allocated to each solver.
  • 49. Static Portfolio: Approaches
      In [8] two approaches were proposed:
      Shared-k: each component solver is allocated maxRuntime/k seconds. Solvers are selected and ordered according to their overall PAR10.
      FDSS: starting from an empty portfolio, we iteratively either add a new solver component or extend the CPU time allocated to a solver already in the portfolio, depending on what most improves the PAR10 score of the portfolio.
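A Shared-k style portfolio can be sketched as below. This is a simplified reading of the description above, not the implementation from [8]: the solver names and runtimes are invented, `None` marks an unsolved instance, and the portfolio is simulated on recorded runtimes rather than by launching solvers.

```python
def par10(runtime, timeout):
    """PAR10 of one run: the runtime if solved within the timeout, else 10 * timeout."""
    solved = runtime is not None and runtime <= timeout
    return runtime if solved else 10 * timeout


def shared_k(runtimes, k, timeout):
    """Pick the k solvers with the best overall PAR10; each gets timeout / k seconds."""
    ranked = sorted(runtimes, key=lambda s: sum(par10(t, timeout) for t in runtimes[s]))
    return ranked[:k], timeout / k


def run_schedule(order, time_slice, runtimes, instance):
    """Simulate running the solvers one after the other on one instance.

    Returns the total time until the instance is solved, or None if every
    solver uses up its slice without solving it.
    """
    elapsed = 0.0
    for solver in order:
        runtime = runtimes[solver][instance]
        if runtime is not None and runtime <= time_slice:
            return elapsed + runtime
        elapsed += time_slice
    return None


# Made-up runtimes in seconds (None = not solved within the 600 s timeout).
TIMEOUT = 600
RUNTIMES = {
    "A": [10, None, 400],
    "B": [500, 20, None],
    "C": [None, None, 50],
}

order, time_slice = shared_k(RUNTIMES, k=2, timeout=TIMEOUT)
results = [run_schedule(order, time_slice, RUNTIMES, i) for i in range(3)]
print(order, time_slice, results)
```

In this toy data the third instance, which solver A alone would solve in 400 s, is lost because A only gets a 300 s slice. That is the basic trade-off of splitting the runtime evenly, and part of the motivation for schedules with uneven allocations such as FDSS.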
  • 51. Dynamic Portfolio For each AF, a vector of features is computed. Similar instances should have similar feature vectors. Portfolios are configured using empirical performance models 39
  • 52. Dynamic Portfolio: Features
      Features can be extracted from different representations of an AF [3]. E.g., the directed graph representation:
      • Graph size features: number of vertices, number of edges, the ratio vertices/edges and its inverse, and graph density
      • Degree features: average, standard deviation, maximum, and minimum degree values across the nodes in the graph
      • SCC features: number of SCCs, and average, standard deviation, maximum, and minimum size
      • Graph structure: presence of auto-loops, number of isolated vertices, etc.
      Similarly, features can be extracted by considering the undirected graph or the matrix representation.
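A few of the directed-graph features listed above can be computed as in the sketch below. The feature names are hypothetical, and some definitions are assumptions rather than those of [3]: degree counts incoming plus outgoing attacks (so a self-attack counts twice), and density is edges / (n * (n - 1)). SCC features are left out for brevity.

```python
from statistics import mean, pstdev


def graph_features(arguments, attacks):
    """Size, degree, and structure features of an AF seen as a directed graph."""
    n, m = len(arguments), len(attacks)
    degree = {a: 0 for a in arguments}
    for source, target in attacks:
        degree[source] += 1
        degree[target] += 1
    values = list(degree.values())
    return {
        "vertices": n,
        "edges": m,
        "vertices_per_edge": n / m if m else None,
        "edges_per_vertex": m / n if n else None,
        "density": m / (n * (n - 1)) if n > 1 else None,
        "degree_avg": mean(values),
        "degree_std": pstdev(values),
        "degree_max": max(values),
        "degree_min": min(values),
        "self_attacks": sum(s == t for s, t in attacks),
        "isolated": sum(d == 0 for d in values),
    }


# A three-argument AF: a1 and a3 attack each other, a3 attacks a2, a2 attacks itself.
arguments = ["a1", "a2", "a3"]
attacks = [("a1", "a3"), ("a2", "a2"), ("a3", "a1"), ("a3", "a2")]
features = graph_features(arguments, attacks)
print(features)
```

The resulting dictionary can be flattened into the feature vector that the empirical performance models take as input.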
  • 53. Dynamic Portfolio: Approaches
      Classification-based
      • Classify: the given AF is classified into a single category, which corresponds to the single solver predicted to be the fastest; that solver is allocated all the available CPU time.
      Regression-based
      • 1-Regression: given the predicted runtime of each solver, the solver predicted to be the fastest is selected and is allocated all the available CPU time.
      • M-Regression: initially we select the solver predicted to be the fastest, but we allocate only its predicted CPU time +10%. If that solver does not solve the given AF in the allocated time, it is stopped and is no longer available for selection, and the process iterates by selecting a different solver.
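The two regression-based strategies can be sketched as follows. The predicted runtimes would come from an empirical performance model, which is not shown; here they are made-up numbers, as are the solver names and the `run` callback that stands in for actually launching a solver. Capping each slice at the remaining overall budget is an assumption of this sketch.

```python
def one_regression(predicted):
    """1-Regression: pick the solver with the smallest predicted runtime."""
    return min(predicted, key=predicted.get)


def m_regression(predicted, run, timeout):
    """M-Regression: try solvers in order of predicted runtime.

    Each solver gets its predicted runtime + 10% (capped by the time left).
    `run(solver, budget)` returns the runtime if the solver succeeds within
    the budget, else None. Returns (solver, total_time) or (None, None).
    """
    available = dict(predicted)
    elapsed = 0.0
    while available and elapsed < timeout:
        solver = min(available, key=available.get)
        budget = min(1.1 * available.pop(solver), timeout - elapsed)
        runtime = run(solver, budget)
        if runtime is not None:
            return solver, elapsed + runtime
        elapsed += budget
    return None, None


# Made-up predictions and (hidden) actual runtimes, in seconds.
predicted = {"A": 100.0, "B": 50.0, "C": 300.0}
actual = {"A": 90.0, "B": 200.0, "C": 10.0}


def run(solver, budget):
    """Stand-in for launching a solver with a time limit."""
    return actual[solver] if actual[solver] <= budget else None


chosen = one_regression(predicted)
solver, total_time = m_regression(predicted, run, timeout=600.0)
print(chosen, solver, total_time)
```

With these numbers 1-Regression commits to B, whose prediction was too optimistic, while M-Regression gives up on B after about 55 s and then succeeds with A. The price is that M-Regression can also abandon a solver that would have succeeded shortly after its slice ended.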
  • 54. Some interesting results when using representative training instances: EE-PR
      System            Cov.   PAR10
      VBS               91.4   562.9
      Classify          89.7   665.2
      1-Regression      88.6   734.7
      M-Regression      82.8  1068.3
      FDSS              80.0  1311.4
      Cegartix          79.1  1350.4
      Shared-2          73.2  1678.0
      Shared-3          69.4  1892.0
      ArgSemSAT         69.1  1916.2
      LabSATSolver      66.8  2050.3
      prefMaxSAT        66.8  2057.2
      Shared-4          65.7  2105.5
      Shared-5          63.3  2240.3
      DIAMOND           61.0  2417.0
      ASPARTIX-D        56.1  2728.6
      ASPARTIX-V        55.2  2772.2
      CoQuiAas          50.5  3026.4
      ASGL              43.2  3477.3
      Conarg            39.3  3696.3
      ArgTools          35.2  3906.2
      GRIS              24.4  4543.7
  • 55. Selection of Solvers: EE-PR
      System          Class.  M-Reg.
      ArgSemSAT            0     253
      ArgTools           311     305
      ASGL                 6      36
      ASPARTIX-D           2      80
      ASPARTIX-V           1      99
      Cegartix           221     403
      Conarg             157     122
      CoQuiAas            43      44
      DIAMOND              0      65
      GRIS               153     278
      LabSATSolver        13     208
      prefMaxSAT         297     301
  • 56. Leave-one-set-out Scenario: Can We Generalise? EE-PR
                     Barabasi-Albert     Erdős-Rényi         StableM  Watts-Strogatz
      System            Cov.   PAR10    Cov.   PAR10    Cov.   PAR10    Cov.   PAR10
      Classify          78.9  1321.4    88.6   745.0    74.4  1574.3    89.5   677.8
      1-Regression      76.3  1479.0    63.0  2255.2    76.5  1453.9    83.0  1079.9
      M-Regression      70.4  1828.4    67.3  2039.7    77.0  1434.7    79.6  1267.6
      FDSS              69.1  1916.2    80.9  1245.5    79.1  1341.9    78.6  1380.0
      Shared-2          73.2  1678.0    73.2  1678.0    74.2  1620.4    73.2  1678.0
      Shared-3          69.4  1892.0    67.3  2007.9    69.5  1896.7    69.4  1892.0
      Shared-4          65.7  2106.2    65.7  2101.1    65.7  2108.1    65.7  2103.9
      Shared-5          63.3  2240.9    63.4  2235.8    63.3  2242.9    63.3  2242.9
  • 57. Configuration of Algorithms Solvers can be configured to improve performance on a class of problems / instances. Image taken from [13]. 45
  • 58. Configuration of Algorithms There exist several configuration approaches, based on different underlying ideas. For the sake of this talk, we focus on SMAC [12], used for configuring ArgSemSAT. Image taken from [12].
  • 59. Configuration of the Solver
      Parameter                   Domain                                    Default
      SOLVER-ExtEnc               {001111, 010101, 010111, ......, 111111}  101010
      GLUCOSE-gc-frac             [0.0, 500.0]                              0.2
      GLUCOSE-rnd-freq            [0.0, 1.0]                                0.0
      GLUCOSE-cla-decay           [0.0, 1.0]                                0.999
      GLUCOSE-max-var-decay       [0.0, 1.0]                                0.95
      GLUCOSE-var-decay           [0.0, 1.0]                                0.8
      GLUCOSE-phase-saving        0,1,2                                     2
      GLUCOSE-ccmin-mode          0,1,2                                     2
      GLUCOSE-K                   [0.0, 1.0]                                0.8
      GLUCOSE-R                   [1.0, 5.0]                                1.4
      GLUCOSE-szTrailQueue        [10,10000] (int)                          5000
      GLUCOSE-szLBDQueue          [10,10000] (int)                          50
      GLUCOSE-simp-gc-frac        [0.0, 5000.0]                             0.5
      GLUCOSE-sub-lim             [-1,10000] (int)                          20
      GLUCOSE-cl-lim              [-1,10000] (int)                          1000
      GLUCOSE-grow                [-10000,10000] (int)                      0
      GLUCOSE-incReduceDB         [0,10000] (int)                           300
      GLUCOSE-firstReduceDB       [0,10000] (int)                           2000
      GLUCOSE-specialIncReduceDB  [0,10000] (int)                           1000
      GLUCOSE-minLBDFrozenClause  [0,10000] (int)                           30
  • 60. Configuration of the Framework
      Order arguments/attacks according to:
      1. The number of attacks received;
      2. The number of attacks to other arguments;
      3. The presence of self-attacks;
      4. The difference between the number of received attacks and the number of attacks to other arguments;
      5. Being an argument in a mutual attack.
      In addition, arguments can be listed in direct or inverse order.
      The orderings of arguments and of attacks are independent.
  • 61. Configuration of the Framework (2)
      [Figure: AF with arguments a1, a2, a3]
      Original description:
        arg(a1). arg(a2). arg(a3).
        att(a1,a3). att(a2,a2). att(a3,a1). att(a3,a2).
      Reordered description:
        arg(a2). arg(a3). arg(a1).
        att(a2,a2). att(a3,a2). att(a3,a1). att(a1,a3).
      The list of arguments is ordered according to the number of received attacks and, subsequently, the number of outgoing attacks; the list of attacks is ordered prioritising self-attacks and, subsequently, the number of outgoing attacks.
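The reordering in this example can be reproduced with two sorts. The sketch below is an illustration, not the tool used by the authors: the function names are hypothetical, and the final tie-break for attacks (position of the attacker, then of the target, in the reordered argument list) is an assumption chosen because it reproduces the order shown on the slide.

```python
def reorder(arguments, attacks):
    """Reorder an AF description as in the example above.

    Arguments: most received attacks first, then most outgoing attacks.
    Attacks: self-attacks first, then attacks whose source has the most
    outgoing attacks; remaining ties follow the new argument order (assumed).
    """
    received = {a: 0 for a in arguments}
    outgoing = {a: 0 for a in arguments}
    for source, target in attacks:
        outgoing[source] += 1
        received[target] += 1

    new_args = sorted(arguments, key=lambda a: (-received[a], -outgoing[a]))
    position = {a: i for i, a in enumerate(new_args)}
    new_atts = sorted(
        attacks,
        key=lambda st: (st[0] != st[1], -outgoing[st[0]], position[st[0]], position[st[1]]),
    )
    return new_args, new_atts


def to_text(arguments, attacks):
    """Serialise the AF as arg(...)/att(...) facts, one per line."""
    lines = [f"arg({a})." for a in arguments]
    lines += [f"att({s},{t})." for s, t in attacks]
    return "\n".join(lines)


arguments = ["a1", "a2", "a3"]
attacks = [("a1", "a3"), ("a2", "a2"), ("a3", "a1"), ("a3", "a2")]
new_args, new_atts = reorder(arguments, attacks)
print(to_text(new_args, new_atts))
```

The AF itself is unchanged; only the order in which the solver reads it differs, which is why this counts as reformulating the input rather than configuring the solver.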
  • 62. Parametrisation
      Parameter              Domain       Default
      args ingoingFirst      [-1.0,1.0]   0
      args outgoingFirst     [-1.0,1.0]   0.2
      args autoFirst         [-1.0,1.0]   -1
      args eachOther         [-1.0,1.0]   -1
      args differenceFirst   [-1.0,1.0]   -1
      atts ingoingFirst      [-1.0,1.0]   0
      atts outgoingFirst     [-1.0,1.0]   0
      atts autoFirst         [-1.0,1.0]   0.2
      atts eachOther         [-1.0,1.0]   0
      atts differenceFirst   [-1.0,1.0]   0
      atts orders            {0,1,2,3,4}  0
      Values of atts orders:
      0 Same ordering applied to the first argument of the attack pair
      1 Same ordering applied to the second argument of the attack pair
      2 Inverse ordering applied to the first argument of the attack pair
      3 Inverse ordering applied to the second argument of the attack pair
      4 Attack-specific ordering
  • 63. Results: Representative Training Instances
      Set              Configuration  IPC Score   PAR10  Fastest (%)
      Barabasi-Albert  Default             78.0  1921.0          2.5
                       Configured         125.2  1863.1         60.5
      Erdős-Rényi      Default             56.8  3426.5         16.5
                       Configured          60.4  3329.2         18.0
      Watts-Strogatz   Default            116.6  1967.3         28.0
                       Configured         118.1  1967.9         23.5
      General          Default            110.0  1665.4         11.0
                       Configured         143.0  1376.8         62.5
  • 64. Results: Cross-Validation
                       Test sets
      Training sets    Barabasi-Albert  Erdős-Rényi  Watts-Strogatz  General
      Barabasi-Albert            119.2          6.9            34.5     42.8
      Erdős-Rényi                 92.3         58.6           105.3    125.7
      Watts-Strogatz             116.2         52.6           115.6    129.2
      General                     87.5         57.6           113.5    133.2
  • 65. Configuration: Most Important Single Parameters
      Set              1st                  2nd                     3rd
      Barabasi-Albert  S-ExtEnc (011111)    G-firstReduceDB (1528)  G-cla-decay (0.32)
      Erdős-Rényi      F-autoFirst (-1.00)  G-rnd-freq (0.00)       G-K (0.26)
      Watts-Strogatz   S-ExtEnc (101010)    G-Grow (0)              G-rnd-freq (0.08)
      General          S-ExtEnc (101010)    G-R (2.09)              G-cla-decay (0.99)
  • 67. Learning for Argumentation: Summarising Exploiting additional knowledge can help argumentation reasoners to improve their runtime performance. 3 main approaches analysed so far: • Portfolio / Algorithm Selection • Algorithm Configuration • Model Reformulation 55
  • 68. Let’s move to the last bit of this tutorial. 55
  • 69. References I
      [1] A. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439), 1999.
      [2] P. Baroni and M. Giacomin. A General Recursive Schema for Argumentation Semantics. In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI 2004), pages 783–787.
      [3] F. Cerutti, M. Giacomin, and M. Vallati. Algorithm selection for preferred extensions enumeration. In Computational Models of Argument - Proceedings of COMMA, pages 221–232, 2014.
  • 70. References II
      [4] F. Cerutti, M. Giacomin, and M. Vallati. Generating challenging benchmark AFs. In Proceedings of COMMA, pages 457–458, 2014.
      [5] F. Cerutti, M. Giacomin, and M. Vallati. Generating challenging benchmark AFs: AFBenchGen2. In Proceedings of COMMA, 2016.
      [6] F. Cerutti, N. Oren, H. Strass, M. Thimm, and M. Vallati. A benchmark framework for a computational argumentation competition. In Computational Models of Argument - Proceedings of COMMA, pages 459–460, 2014.
  • 71. References III
      [7] F. Cerutti, I. Tachmazidis, M. Vallati, S. Batsakis, M. Giacomin, and G. Antoniou. Exploiting parallelism for hard problems in abstract argumentation. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pages 1475–1481, 2015.
      [8] F. Cerutti, M. Vallati, and M. Giacomin. Where are we now? State of the art and future trends of solvers for hard argumentation problems. In Computational Models of Argument - Proceedings of COMMA, pages 207–218, 2016.
      [9] P. Erdős and A. Rényi. On random graphs. I. Publicationes Mathematicae Debrecen, 6:290–297, 1959.
  • 72. References IV
      [10] A. E. Howe and E. Dahlman. A critical assessment of benchmark comparison in planning. J. Artif. Intell. Res. (JAIR), 17:1–3, 2002.
      [11] B. Hurley and B. O’Sullivan. Statistical regimes and runtime prediction. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI, pages 318–324, 2015.
      [12] F. Hutter, H. H. Hoos, K. Leyton-Brown, and K. P. Murphy. Time-bounded sequential parameter optimization. In Learning and Intelligent Optimization, 4th International Conference, LION, pages 281–298, 2010.
      [13] F. Hutter, H. H. Hoos, K. Leyton-Brown, and T. Stützle. ParamILS: An automatic algorithm configuration framework. J. Artif. Intell. Res. (JAIR).
  • 73. References V
      [14] C. Linares López, S. J. Celorrio, and A. G. Olaya. The deterministic part of the seventh international planning competition. Artif. Intell., 223:82–119, 2015.
      [15] M. Vallati and T. Vaquero. Towards a protocol for benchmark selection in IPC. In Proceedings of the 4th Workshop on the International Planning Competition (WIPC), 2015.
      [16] D. J. Watts and S. H. Strogatz. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440–442, 1998.