Research Overview
Provenance Themes
Dr. Martin Chapman
March 17, 2017
King’s College London
martin.chapman@kcl.ac.uk
Overview
Learning the language of error
Playing Hide-And-Seek
Storing and processing MT300s
Learning the language of error
Problem
Problem: An assertion in a program fails for some given input, and
you want to know how the sequence of method calls in the program
might have contributed to this.
/* Insert one or more nondeterministic values into the global list. */
static void gl_read() {
    do {
        gl_insert(nondet_int());
    } while (nondet_int());
}

/* Allocate a node, link it into gl_list and initialise its nested list. */
static void gl_insert(int value) {
    struct node *node = malloc(sizeof *node);
    node->value = value;
    list_add(&node->linkage, &gl_list);
    INIT_LIST_HEAD(&node->nested);
}

/* Splice new in between prev and next. */
static inline void __list_add(struct list_head *new,
                              struct list_head *prev,
                              struct list_head *next) {
    next->prev = new;
    new->next = next;
    new->prev = prev;
    prev->next = new;
}

/* Take the entry before head, assert that its nested list is still
   self-linked, then walk the ring back round to that entry. */
static void inspect(const struct list_head *head) {
    head = head->prev;
    const struct node *node =
        list_entry(head, struct node, linkage);
    assert(node->nested.prev == &node->nested);
    for (head = head->next; &node->linkage != head;
         head = head->next);
}

/* Unlink whatever lies between prev and next. */
static inline void __list_del(struct list_head *prev,
                              struct list_head *next) {
    next->prev = prev;
    prev->next = next;
}

/* Insert new immediately after head. */
static inline void list_add(struct list_head *new,
                            struct list_head *head) {
    __list_add(new, head, head->next);
}

int main() {
    gl_read();
    inspect(&gl_list);
}
Analysing large amounts of code to understand this can be difficult.
Solution: Learning the language of error
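Proposed solution: Summarise all the paths that lead to a failing
program assertion as a deterministic finite automaton (DFA)
[Chapman et al., 2015].
[Figure: learned DFA with transitions labelled gl_read, gl_insert, list_add and inspect, leading to the failing assertion. Callout on a cycle in the automaton: "Did we really want this method call loop in our program?"]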
Much easier to analyse. Provides an overview of program
behaviours, some of which may be unexpected.
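As a rough illustration of what such a summary looks like in code, the sketch below encodes a small DFA over method-call names and checks whether a call sequence reaches the failing assertion. The states, their names and the transition table are assumptions made for this example, loosely modelled on the figure; they are not the automaton produced by the tool.

```c
#include <assert.h>
#include <stddef.h>

/* Alphabet: one symbol per method call in a trace. */
enum call { GL_READ, GL_INSERT, LIST_ADD, INSPECT, N_CALLS };

/* Hypothetical states; DEAD absorbs every call order we do not expect. */
enum { START, READING, INSERTING, LINKED, FAILED, DEAD, N_STATES };

static const int delta[N_STATES][N_CALLS] = {
    /*              gl_read  gl_insert  list_add  inspect */
    [START]     = { READING, DEAD,      DEAD,     DEAD   },
    [READING]   = { DEAD,    INSERTING, DEAD,     DEAD   },
    [INSERTING] = { DEAD,    DEAD,      LINKED,   DEAD   },
    [LINKED]    = { DEAD,    INSERTING, DEAD,     FAILED },
    [FAILED]    = { DEAD,    DEAD,      DEAD,     DEAD   },
    [DEAD]      = { DEAD,    DEAD,      DEAD,     DEAD   },
};

/* Returns 1 if the call sequence ends in the failing-assertion state. */
int dfa_accepts(const enum call *trace, size_t len) {
    int state = START;
    for (size_t i = 0; i < len; i++) {
        assert((unsigned)trace[i] < N_CALLS);
        state = delta[state][trace[i]];
    }
    return state == FAILED;
}
```

The transition from LINKED back to INSERTING is the kind of method call loop the callout asks about: the table makes it visible at a glance, where the source code does not.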
Implementation
Paths are formed from software counterexamples (method calls
that lead to a failing assertion in a program).
Our software learns these counterexamples via the L* algorithm
[Angluin, 1987] (where the oracle is a model checker).
Membership queries pertain to individual counterexamples, while
conjecture queries pertain to full automata.
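The two query types can be sketched as a small oracle interface. This is only an illustration under strong assumptions: the model checker is replaced by a fixed, hypothetical target DFA (the language "one or more gl_insert calls, then inspect"), the alphabet has two symbols, and the conjecture query compares the hypothesis with the target only on words up to MAX_LEN rather than proving equivalence.

```c
#include <assert.h>
#include <stddef.h>

#define N_SYMS  2   /* assumed alphabet: 0 = gl_insert, 1 = inspect */
#define MAX_LEN 6   /* assumed bound on the words a conjecture is tested on */

struct dfa {
    int n_states;             /* state 0 is the start state */
    int delta[4][N_SYMS];
    int accepting[4];
};

/* Stand-in for the model checker: accepts gl_insert+ inspect. */
const struct dfa target = {
    4,
    {{1, 3}, {1, 2}, {3, 3}, {3, 3}},
    {0, 0, 1, 0}
};

static int run(const struct dfa *d, const int *word, size_t len) {
    int s = 0;
    for (size_t i = 0; i < len; i++) {
        assert(word[i] >= 0 && word[i] < N_SYMS);
        s = d->delta[s][word[i]];
    }
    return d->accepting[s];
}

/* Membership query: does this single call sequence reach the failure? */
int membership_query(const int *word, size_t len) {
    return run(&target, word, len);
}

/* Conjecture query: returns 1 if the hypothesis agrees with the oracle on
   every word of length <= MAX_LEN; otherwise returns 0 and stores a
   shortest disagreeing word in cex (which must hold MAX_LEN entries). */
int conjecture_query(const struct dfa *hyp, int *cex, size_t *cex_len) {
    int word[MAX_LEN];
    for (size_t len = 0; len <= MAX_LEN; len++) {
        size_t total = (size_t)1 << len;   /* 2^len words, since N_SYMS == 2 */
        for (size_t code = 0; code < total; code++) {
            for (size_t i = 0; i < len; i++)
                word[i] = (int)((code >> i) & 1u);
            if (run(hyp, word, len) != membership_query(word, len)) {
                for (size_t i = 0; i < len; i++) cex[i] = word[i];
                *cex_len = len;
                return 0;
            }
        }
    }
    return 1;
}
```

A learner in the style of L* would call membership_query to fill in its observation table and conjecture_query each time it has a candidate automaton, refining the candidate with the returned counterexample.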
Supported by a Google faculty research award.
Case study: Automatic merging
Unexpected behaviours are particularly prevalent when code is
automatically merged:
main {
    ...
    functionA();
    functionB();
    ...
}
(a) Source

main {
    ...
    functionA();
    ...
    functionZ();
}
functionZ() {
    functionB();
}
(b) Branch A

main {
    ...
    functionA();
    functionB();
    functionC();
    ...
}
(c) Branch B

main {
    ...
    functionA();
    ...
    functionZ();
}
functionZ() {
    functionB();
    functionC();
    ...
}
(d) Merged
Capitalising on automata representation (1)
Our software uses an automaton representation to draw the
developer’s attention to the changes introduced by the merge.
First we generate three automata:
Branch A (B1) → Automaton A1; Merged Code (P) → Automaton AMerged; Branch B (B2) → Automaton A2
We then compute the following: AMerged \ A1 and AMerged \ A2
in order to show the new behaviours.
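A difference such as AMerged \ A1 can be computed with the standard product construction, sketched below. The struct layout, the three-symbol alphabet and the size limit are assumptions for this example; both inputs must be complete DFAs (every state has a transition on every symbol) with state 0 as the start state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STATES 8
#define N_SYMS     3   /* assumed alphabet: 0 = Z, 1 = B, 2 = C */

struct dfa {
    int n_states;      /* state 0 is the start state */
    int delta[MAX_STATES * MAX_STATES][N_SYMS];
    int accepting[MAX_STATES * MAX_STATES];
};

/* Build a DFA for L(a) \ L(b). The product state (p, q) is encoded as
   p * b->n_states + q and accepts when a accepts and b does not. */
void dfa_difference(const struct dfa *a, const struct dfa *b,
                    struct dfa *out) {
    assert(a->n_states <= MAX_STATES && b->n_states <= MAX_STATES);
    out->n_states = a->n_states * b->n_states;
    for (int p = 0; p < a->n_states; p++) {
        for (int q = 0; q < b->n_states; q++) {
            int s = p * b->n_states + q;
            for (int c = 0; c < N_SYMS; c++)
                out->delta[s][c] =
                    a->delta[p][c] * b->n_states + b->delta[q][c];
            out->accepting[s] = a->accepting[p] && !b->accepting[q];
        }
    }
}

/* Returns 1 if the DFA accepts the given word. */
int dfa_accepts(const struct dfa *d, const int *word, size_t len) {
    int s = 0;
    for (size_t i = 0; i < len; i++)
        s = d->delta[s][word[i]];
    return d->accepting[s];
}
```

For instance, if a hypothetical merged automaton accepts only the sequence Z B C and a branch automaton accepts only Z B, the difference accepts Z B C and rejects Z B, which is the new behaviour a developer would want highlighted.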
Capitalising on automata representation (2)
[Figure 1: AMerged \ A1, or behaviour not in Branch A. Automaton labels: D, Z, C, Learn, assert.]
[Figure 2: AMerged \ A2, or behaviour not in Branch B. Automaton labels: D, Z, B, C, Learn, assert.]
Note: Subtracting the union of A1 and A2 (the behaviour already present
in either branch) from AMerged would also allow us to summarise all the
new behaviour introduced by the merge.
Capitalising on automata representation (3)
Why an automaton?
1. Processing the source code directly in order to achieve a
similar representation is likely to be inefficient (operations on
automata are well established).
2. The automata representation is highly intelligible.
Playing Hide-And-Seek
Overview (1)
Problem: Network attacks are becoming more frequent.
Potential solution: Construct formal decision-making models
(e.g. game-theoretic frameworks) that capture network security
scenarios and can be processed and solved automatically, in order
to aid automated response.
An interesting class of network security models: network security
games (NSGs).
• Typically consider the interactions between an attacker and a
defender.
A common approach to deriving an NSG model is to apply
existing types of games to unexplored network security problems.
Overview (2)
Unexplored network security problem: Multiple node attacks
(e.g. botnets and attack pivots).
How do we link multiple node attacks to an existing type of game?
The link: Multiple node attacks exhibit the two-sided search
problem (looking for something that does not want to be found)
with multiple hidden entities: the bots in a botnet from the
defender's perspective, or hidden, sensitive resources from the
attacker's perspective.
Overview (3)
Search games are designed to model and investigate the two-sided
search problem, as interactions between a hider and a seeker.
Hide-and-seek games, a subset of search games, are designed to do
this for multiple hidden objects.
Initial proposal: It is logical to study hide-and-seek games in
order to study multiple node attacks [Chapman et al., 2014].
• The hider is the defender, and the seeker is the attacker, or
vice-versa.
Hide-And-Seek Games
Different permutations of the same basic model exist. The permutation
of interest to us:
• Two competing players; the hider and the seeker
• A search space; for our purposes, a network graph
• Hidden objects to be concealed on the network
• Some cost to seeker for undertaking a search; the hider is
rewarded in an inverse amount.
Different strategies are explored for both the hider and the seeker.
This model is simple, but already promising in what it can capture
from a multiple node attack.
Richer variants of the model are natural, so why aren't they explored?
'Complexity'.
Methodology
We increase the richness of the model, and thus what it can
capture of the security domain (e.g. timesteps, repeated
interactions).
We compensate for any increase in complexity by using an
Empirical Game-Theoretic Analysis (EGTA) approach: we realise
computational representations of the different strategies, evaluate
their performance in simulation, and use the results to estimate
the payoff values associated with them.
• Also allowed us to contribute a computational platform, which
can be used as the basis for Distributed Research Games
(more at cyberhands.co.uk)
The performance of different strategies provides the basis for
heuristics that can be applied to real security applications.
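To make the simulation-based estimate concrete, here is a minimal sketch of estimating a payoff by repeated play. Everything in it is an assumption for illustration: the search space is a flat set of N_NODES nodes with unit search cost rather than a real network graph, the two hider and two seeker strategies are invented, and the hider's payoff is taken to be the seeker's search cost as a fraction of the number of nodes. It is not the platform described in the talk.

```c
#include <assert.h>

#define N_NODES  10   /* assumed size of the search space */
#define N_HIDDEN 3    /* assumed number of hidden objects */

enum hider_strategy  { H_RANDOM, H_FIRST_NODES };
enum seeker_strategy { S_SWEEP, S_RANDOM_ORDER };

/* Small linear congruential generator so that runs are reproducible. */
static unsigned long rng_state = 12345UL;
static unsigned rng_next(unsigned bound) {
    rng_state = rng_state * 1103515245UL + 12345UL;
    return (unsigned)((rng_state >> 16) & 0x7fffUL) % bound;
}

/* Fisher-Yates shuffle of a[0..n-1]. */
static void shuffle(int *a, int n) {
    for (int i = n - 1; i > 0; i--) {
        int j = (int)rng_next((unsigned)i + 1u);
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

/* Play one game; returns how many nodes the seeker searched before
   every hidden object was found. */
static int play_round(enum hider_strategy h, enum seeker_strategy s) {
    int order[N_NODES], spots[N_NODES], hidden[N_NODES] = {0};
    for (int i = 0; i < N_NODES; i++) { order[i] = i; spots[i] = i; }
    if (h == H_RANDOM) shuffle(spots, N_NODES);
    for (int i = 0; i < N_HIDDEN; i++) hidden[spots[i]] = 1;
    if (s == S_RANDOM_ORDER) shuffle(order, N_NODES);
    int found = 0;
    for (int step = 0; step < N_NODES; step++) {
        found += hidden[order[step]];
        if (found == N_HIDDEN) return step + 1;
    }
    return N_NODES;
}

/* Empirical hider payoff: mean search cost divided by N_NODES. */
double estimate_payoff(enum hider_strategy h, enum seeker_strategy s,
                       int rounds) {
    assert(rounds > 0);
    long total = 0;
    for (int r = 0; r < rounds; r++) total += play_round(h, s);
    return (double)total / ((double)rounds * N_NODES);
}
```

Calling estimate_payoff for every pair of strategies fills in an empirical payoff matrix. In this toy setting a hider that always uses the first nodes is found immediately by a sweeping seeker, while a random hider forces a longer search.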
Results (1)
Multiple interaction game: the same attacker (hider) and
defender (seeker) meet each other multiple times.
Natural for the defender to keep data on the actions of an
attacker, to help plan future strategies, by observing how the
attacker interacts with the environment (e.g. where objects are
hidden).
Therefore, natural for attacker to attempt to manipulate this data
(i.e. switch the source of the data from the environment to
themselves).
Finding: deceptive strategies are not effective if the defender is
sophisticated at determining the source of data (i.e. at detecting
when manipulation is being attempted).
Results (2)
[Figure: Payoff (axis from 0.45 to 1) by Strategy, for hDeceptive, hRandomSet and sHighProbability; two comparisons are marked ***.]
Storing and processing MT300s
Storing and processing MT300s (1)
Brief: “To design and build a distributed ledger POC system to
process and store proprietary messages for inter-subsidiary forex
transactions (MT300s) internal to a major Fortune 500 financial
institution.”
• Taking pairs of messages about forex transactions (e.g. £ → $;
$ → £), and storing them on the ‘blockchain’.
Distributed ledger: A generalised term for a blockchain,
emphasising that it’s not only currency exchanges that can be
stored in this sequential, secure and replicated way.
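The "sequential" and tamper-evident part of this idea can be sketched as a hash chain over message pairs. This is an illustration only: the message strings are placeholders rather than real MT300 content, the FNV-1a hash used here is not cryptographically secure (a real ledger would use a cryptographic hash), and replication across nodes is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_BLOCKS 16   /* assumed capacity of this toy ledger */
#define MSG_LEN    64   /* assumed maximum message size */

struct block {
    char first[MSG_LEN];    /* one message of the pair (placeholder text) */
    char second[MSG_LEN];   /* the matching message of the pair */
    uint64_t prev_hash;     /* hash of the previous block, 0 for the first */
    uint64_t hash;          /* hash over both messages and prev_hash */
};

struct ledger { struct block blocks[MAX_BLOCKS]; int count; };

/* 64-bit FNV-1a, continued from h. Not a cryptographic hash. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len) {
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

static uint64_t block_hash(const struct block *b) {
    uint64_t h = 14695981039346656037ULL;
    h = fnv1a(h, b->first, sizeof b->first);
    h = fnv1a(h, b->second, sizeof b->second);
    return fnv1a(h, &b->prev_hash, sizeof b->prev_hash);
}

/* Append a message pair; returns the new block's index, or -1 if full. */
int ledger_append(struct ledger *l, const char *first, const char *second) {
    assert(l && first && second);
    if (l->count >= MAX_BLOCKS) return -1;
    struct block *b = &l->blocks[l->count];
    memset(b, 0, sizeof *b);
    strncpy(b->first, first, MSG_LEN - 1);
    strncpy(b->second, second, MSG_LEN - 1);
    b->prev_hash = l->count > 0 ? l->blocks[l->count - 1].hash : 0;
    b->hash = block_hash(b);
    return l->count++;
}

/* Returns 1 if every block still matches its hash and its predecessor. */
int ledger_verify(const struct ledger *l) {
    uint64_t prev = 0;
    for (int i = 0; i < l->count; i++) {
        const struct block *b = &l->blocks[i];
        if (b->prev_hash != prev || b->hash != block_hash(b)) return 0;
        prev = b->hash;
    }
    return 1;
}
```

Because each block's hash covers the previous block's hash, editing any stored message makes ledger_verify fail from that block onwards, which is what makes the record sequential and tamper-evident.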
Storing and processing MT300s (2)
Why? Intermediate message processors add time and money.
Cynically: become familiar with the technology that may one day
supplant them.
Output. A platform that:
• Integrates a wider range of different technologies to achieve
its aim (e.g. BigChainDB, and ErisDB (permissioned chains based
on Ethereum / EVM) + Tendermint).
• Focuses on scalability and throughput.
Research into inter-chain interaction for processing and storing
data (e.g. using a separate chain to store filtered transactions).
Provenance Themes
A recurring theme of both conceptual provenance, and data
provenance, in my work:
• Learning the language of error: Understanding the functions
that have had an impact on input data, using a graph-based
representation, and how this has led to an error.
• Playing Hide-And-Seek: Understanding the origin of data in
order to make strategic decisions.
• MT300 Processing: Using a distributed ledger (blockchain) to
provide a secure historic record of all the actions involving an
entity. (Lots of research to be done at the intersection here!)
Summary (1)
Experience of, and achievements as a part of, projects that require
not only good development skills, but also research capabilities.
Strong programming ability.
Wide range of experience working with different systems, some
large in scale or designed to be scalable; in particular, systems
that have required me to consider how to facilitate
communication between heterogeneous entities (e.g. learn
tool, HANDS platform, distributed ledger projects).
Ph.D. with a focus on game theory, artificial intelligence, and
elements of learning.
Summary (2)
Themes of provenance, and graph-based representation,
throughout work.
Additional experience as a teaching academic staff member at
King’s: significant teaching responsibilities, in addition to
administrative and pastoral responsibilities. Some system
development as a part of this role.
References
Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75(2):87–106.
Chapman, M., Chockler, H., Kesseli, P., Kroening, D., Strichman, O., and Tautschnig, M. (2015). Learning the language of error. In International Symposium on Automated Technology for Verification and Analysis, pages 114–130. Springer.
Chapman, M., Tyson, G., McBurney, P., Luck, M., and Parsons, S. (2014). Playing hide-and-seek: an abstract game for cyber security. In Proceedings of the 1st International Workshop on Agents and CyberSecurity, page 3. ACM.

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
American Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptxAmerican Type Culture Collection (ATCC).pptx
American Type Culture Collection (ATCC).pptx
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 

Martin Chapman: Research Overview, 2017

  • 1. Research Overview: Provenance Themes. Dr. Martin Chapman, March 17, 2017, King’s College London, martin.chapman@kcl.ac.uk
  • 2. Overview: Learning the language of error; Playing Hide-And-Seek; Storing and processing MT300s
  • 6. Problem: One of the assertions in a program fails, based upon given input, and you want to know how the sequence of method calls in your program might have had an impact on this.

      static void gl_read() {
          do {
              gl_insert(nondet_int());
          } while (nondet_int());
      }

      static void gl_insert(int value) {
          struct node *node = malloc(sizeof *node);
          node->value = value;
          list_add(&node->linkage, &gl_list);
          INIT_LIST_HEAD(&node->nested);
      }

      static inline void __list_add(struct list_head *new,
                                    struct list_head *prev,
                                    struct list_head *next) {
          next->prev = new;
          new->next = next;
          new->prev = prev;
          prev->next = new;
      }

      static void inspect(const struct list_head *head) {
          head = head->prev;
          const struct node *node =
              list_entry(head, struct node, linkage);
          assert(node->nested.prev == &node->nested);
          for (head = head->next; &node->linkage != head;
               head = head->next);
      }

      static inline void __list_del(struct list_head *prev,
                                    struct list_head *next) {
          next->prev = prev;
          prev->next = next;
      }

      static inline void list_add(struct list_head *new,
                                  struct list_head *head) {
          __list_add(new, head, head->next);
      }

      int main() {
          gl_read();
          inspect(&gl_list);
      }

    Analysing large amounts of code to understand this can be difficult.
  • 10. Solution: Learning the language of error. Proposed solution: Summarise all the paths that lead to a failing program assertion as a DFA [Chapman et al., 2015]. [Figure: learned DFA; recoverable labels include gl_read, gl_insert, list_add, inspect, Learn and Assert.] Did we really want this method call loop in our program? The DFA is much easier to analyse, and provides an overview of program behaviours, some of which may be unexpected.
  • 14. Implementation: Paths are formed from software counterexamples (sequences of method calls that lead to a failing assertion in a program). Our software learns these counterexamples via the L* algorithm [Angluin, 1987], where the oracle is a model checker. Membership queries pertain to individual counterexamples, while conjecture queries pertain to full automata. Supported by a Google faculty research award.
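
As a rough illustration of what a membership query asks, the C sketch below runs a sequence of calls through a small hand-written DFA and reports whether it ends in the accepting (failing assertion) state. The states, the transition table and the names `Call`, `State`, `dfa_step` and `dfa_accepts` are assumptions made for illustration, loosely following the calls in the example program. This is not the automaton learned in [Chapman et al., 2015], and no L* learning or model checking is performed here.

```c
#include <stddef.h>

/* Alphabet: calls from the example program (illustrative choice). */
typedef enum { GL_READ, GL_INSERT, LIST_ADD, INSPECT, NUM_CALLS } Call;

/* Hypothetical states; S_FAIL stands for "the assertion has failed". */
typedef enum {
    S_START, S_READ, S_INSERT, S_ADDED, S_FAIL, S_DEAD, NUM_STATES
} State;

/* Complete transition table: every state has a successor on every call. */
static const State transitions[NUM_STATES][NUM_CALLS] = {
    /*               GL_READ  GL_INSERT  LIST_ADD  INSPECT */
    /* S_START  */ { S_READ,  S_DEAD,    S_DEAD,   S_DEAD },
    /* S_READ   */ { S_DEAD,  S_INSERT,  S_DEAD,   S_DEAD },
    /* S_INSERT */ { S_DEAD,  S_DEAD,    S_ADDED,  S_DEAD },
    /* S_ADDED  */ { S_DEAD,  S_INSERT,  S_DEAD,   S_FAIL },  /* loop back on gl_insert */
    /* S_FAIL   */ { S_DEAD,  S_DEAD,    S_DEAD,   S_DEAD },
    /* S_DEAD   */ { S_DEAD,  S_DEAD,    S_DEAD,   S_DEAD },
};

/* Follow a single transition. */
State dfa_step(State s, Call c) {
    return transitions[s][c];
}

/* Membership query: returns 1 if the call sequence drives the DFA
   from S_START to S_FAIL, and 0 otherwise. */
int dfa_accepts(const Call *word, size_t len) {
    State s = S_START;
    for (size_t i = 0; i < len; i++) {
        s = dfa_step(s, word[i]);
    }
    return s == S_FAIL;
}
```

Under these assumptions, the sequence gl_read, gl_insert, list_add, inspect is accepted, as is any sequence that repeats the gl_insert, list_add pair before inspect; the loop back from `S_ADDED` is the kind of method call loop the automaton view makes visible.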
  • 15. Case study: Automatic merging. Unexpected behaviours are particularly prevalent when code is automatically merged:

      (a) Source
      main { ... functionA(); functionB(); ... }

      (b) Branch A
      main { ... functionA(); ... functionZ(); }
      functionZ() { functionB(); }

      (c) Branch B
      main { ... functionA(); functionB(); functionC(); ... }

      (d) Merged
      main { ... functionA(); ... functionZ(); }
      functionZ() { functionB(); functionC(); ... }
  • 18. Capitalising on automata representation (1): Our software uses an automaton representation to draw the developer’s attention to the changes introduced by the merge. First we generate three automata: A_1 from Branch A (B_1), A_2 from Branch B (B_2), and A_Merged from the merged code (P). We then compute the differences A_Merged \ A_1 and A_Merged \ A_2 in order to show the new behaviours.
  • 19. Capitalising on automata representation (2). Figure 1: A_Merged \ A_1, or behaviour not in Branch A [automaton figure; recoverable labels: Z, C, assert]. Figure 2: A_Merged \ A_2, or behaviour not in Branch B [automaton figure; recoverable labels: Z, B, C, assert]. Footnote: Subtracting the union of A_1 and A_2 (the behaviour already present in the branches) from A_Merged would also allow us to summarise all the new behaviour introduced by the merge.
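
The difference between two automata can be sketched with the standard product construction: run both DFAs in lockstep and accept exactly when the first accepts and the second does not. The C code below is a minimal sketch under the assumption that both DFAs are complete and share one alphabet; the `Dfa` layout and the names `dfa_difference` and `dfa_contains` are hypothetical and are not taken from the tool described in the slides.

```c
#include <stddef.h>

#define MAX_STATES 64
#define MAX_SYMBOLS 4

/* A complete DFA: every state has a transition on every symbol. */
typedef struct {
    int num_states;
    int num_symbols;
    int start;
    int next[MAX_STATES][MAX_SYMBOLS];
    int accepting[MAX_STATES];
} Dfa;

/* Build out so that L(out) = L(a) \ L(b) using the product construction.
   The pair of states (p, q) is encoded as p * b->num_states + q.
   Returns 0 on success, -1 if the alphabets differ or the product
   would not fit in MAX_STATES. */
int dfa_difference(const Dfa *a, const Dfa *b, Dfa *out) {
    if (a->num_symbols != b->num_symbols) return -1;
    if (a->num_states * b->num_states > MAX_STATES) return -1;

    out->num_states = a->num_states * b->num_states;
    out->num_symbols = a->num_symbols;
    out->start = a->start * b->num_states + b->start;

    for (int p = 0; p < a->num_states; p++) {
        for (int q = 0; q < b->num_states; q++) {
            int pq = p * b->num_states + q;
            /* Accept when a accepts and b rejects. */
            out->accepting[pq] = a->accepting[p] && !b->accepting[q];
            for (int s = 0; s < a->num_symbols; s++) {
                out->next[pq][s] =
                    a->next[p][s] * b->num_states + b->next[q][s];
            }
        }
    }
    return 0;
}

/* Returns 1 if the DFA accepts the word (an array of symbol indices). */
int dfa_contains(const Dfa *d, const int *word, size_t len) {
    int state = d->start;
    for (size_t i = 0; i < len; i++) {
        state = d->next[state][word[i]];
    }
    return d->accepting[state];
}
```

For instance, if a merged automaton accepts the call sequences "B" and "B C" while a branch automaton accepts only "B", the difference accepts only "B C", which is the behaviour the merge introduced relative to that branch.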
  • 22. Capitalising on automata representation (3): Why an automaton?
      1. Processing the source code directly in order to achieve a similar representation is likely to be inefficient (operations on automata are well established).
      2. The automata representation is highly intelligible.
  • 27. Overview (1). Problem: Network attacks are becoming more frequent. Potential solution: Construct formal decision-making models (e.g. game-theoretic frameworks) that capture network security scenarios in order to aid automated response (e.g. through automatic processing and solution). An interesting class of network security models: network security games (NSGs), which typically consider the interactions between an attacker and a defender. A common approach to deriving an NSG model is to apply existing types of games to unexplored network security problems.
  • 30. Overview (2). Unexplored network security problem: Multiple node attacks (e.g. botnets and attack pivots). How do we link multiple node attacks to an existing type of game? The link: Multiple node attacks exhibit the two-sided search problem (looking for something that does not want to be found) with multiple hidden entities: the bots in a botnet from the defender’s perspective, or hidden, sensitive resources from the attacker’s perspective.
  • 33. Overview (3). Search games are designed to model and investigate the two-sided search problem, as interactions between a hider and a seeker. Hide-and-seek games, a subset of search games, are designed to do this for multiple hidden objects. Initial proposal: It is logical to study hide-and-seek games in order to study multiple node attacks [Chapman et al., 2014]. The hider is the defender, and the seeker is the attacker, or vice versa.
  • 36. Hide-And-Seek Games. There are different permutations of the same basic model. The permutation of interest to us:
      • Two competing players: the hider and the seeker
      • A search space: for our purposes, a network graph
      • Hidden objects to be concealed on the network
      • Some cost to the seeker for undertaking a search; the hider is rewarded in an inverse amount.
    Different strategies are explored for both the hider and the seeker. This model is simple, but already promising in what it can capture from a multiple node attack. Richer variants of the model are natural, so why aren’t they explored? ‘Complexity’.
  • 39. Methodology. We increase the richness of the model, and thus what it can capture of the security domain (e.g. timesteps, repeated interactions). We compensate for any increase in complexity by using an Empirical Game-Theoretic Analysis (EGTA) approach: we estimate the payoff values associated with different strategies by realising computational representations of them and evaluating their performance in simulation. This also allowed us to contribute a computational platform, which can be used as the basis for Distributed Research Games (more at cyberhands.co.uk). The performance of different strategies provides the basis for heuristics that can be applied to real security applications.
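
The idea of estimating payoffs by simulation can be sketched as follows. This C example is a deliberately tiny stand-in, not the HANDS platform: the search space is a line of `NUM_NODES` nodes rather than a network graph, the two hider and two seeker strategies are invented for illustration, and the seeker's cost is simply the number of nodes searched before all objects are found (the hider's reward being the inverse). All names and constants are assumptions.

```c
#define NUM_NODES 10
#define NUM_OBJECTS 3

typedef enum { HIDE_RANDOM, HIDE_LAST, NUM_HIDERS } HiderStrategy;
typedef enum { SEEK_SEQUENTIAL, SEEK_RANDOM, NUM_SEEKERS } SeekerStrategy;

/* Small linear congruential generator so that runs are reproducible. */
static unsigned int lcg_next(unsigned int *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Fill order with a pseudo-random permutation of 0..n-1 (Fisher-Yates). */
static void shuffled_order(int *order, int n, unsigned int *rng) {
    for (int i = 0; i < n; i++) order[i] = i;
    for (int i = n - 1; i > 0; i--) {
        int j = (int)(lcg_next(rng) % (unsigned int)(i + 1));
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
}

/* Play one round and return the seeker's cost: the number of nodes
   searched until every hidden object has been found. */
int play_round(HiderStrategy h, SeekerStrategy s, unsigned int *rng) {
    int hidden[NUM_NODES] = {0};
    int order[NUM_NODES];

    if (h == HIDE_RANDOM) {
        shuffled_order(order, NUM_NODES, rng);
        for (int i = 0; i < NUM_OBJECTS; i++) hidden[order[i]] = 1;
    } else { /* HIDE_LAST: use the nodes a sequential seeker reaches last */
        for (int i = NUM_NODES - NUM_OBJECTS; i < NUM_NODES; i++) hidden[i] = 1;
    }

    if (s == SEEK_RANDOM) {
        shuffled_order(order, NUM_NODES, rng);
    } else { /* SEEK_SEQUENTIAL */
        for (int i = 0; i < NUM_NODES; i++) order[i] = i;
    }

    int found = 0, cost = 0;
    for (int i = 0; i < NUM_NODES && found < NUM_OBJECTS; i++) {
        cost++;
        found += hidden[order[i]];
    }
    return cost;
}

/* Estimate the mean seeker cost for every strategy pair by simulation.
   Assumes rounds > 0. */
void estimate_payoffs(int rounds, unsigned int seed,
                      double cost[NUM_HIDERS][NUM_SEEKERS]) {
    unsigned int rng = seed;
    for (int h = 0; h < NUM_HIDERS; h++) {
        for (int s = 0; s < NUM_SEEKERS; s++) {
            long total = 0;
            for (int r = 0; r < rounds; r++) {
                total += play_round((HiderStrategy)h, (SeekerStrategy)s, &rng);
            }
            cost[h][s] = (double)total / rounds;
        }
    }
}
```

Calling `estimate_payoffs(1000, 42u, cost)` fills an empirical payoff matrix. In this toy setting, hiding in the last nodes against a sequential seeker always costs the seeker a full sweep of `NUM_NODES` searches, whereas the randomised pairings land somewhere between `NUM_OBJECTS` and `NUM_NODES`. Comparing such estimates across strategies is the kind of evidence from which heuristics would be drawn.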
  • 43. Results (1). Multiple interaction game: the same attacker (hider) and defender (seeker) meet each other multiple times. It is natural for the defender to keep data on the actions of an attacker, to help plan future strategies, by observing how the attacker interacts with the environment (e.g. where objects are hidden). It is therefore natural for the attacker to attempt to manipulate this data (i.e. switch the source of the data from the environment to themselves). Finding: deceptive strategies are not effective if the defender is sophisticated in respect of determining the source of data (i.e. determining when manipulation is being attempted).
  • 46. Storing and processing MT300s (1). Brief: “To design and build a distributed ledger POC system to process and store proprietary messages for inter-subsidiary forex transactions (MT300s) internal to a major Fortune 500 financial institution.” This means taking pairs of messages about forex transactions (e.g. £ → $; $ → £), and storing them on the ‘blockchain’. Distributed ledger: A generalised term for a blockchain, emphasising that it’s not only currency exchanges that can be stored in this sequential, secure and replicated way.
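
The "sequential, secure" part of that description can be illustrated with a hash chain, where each block stores the hash of its predecessor so that altering an earlier record becomes detectable. The C sketch below is only a toy under stated assumptions: it uses the non-cryptographic FNV-1a hash as a stand-in for a real cryptographic hash, stores placeholder strings rather than real MT300 message fields, and has no replication or consensus. The names `Block`, `Ledger`, `ledger_append` and `ledger_verify` are hypothetical and unrelated to BigChainDB, ErisDB or Tendermint.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BLOCKS 16
#define MSG_LEN 64

/* One block holds a pair of messages plus the link to its predecessor. */
typedef struct {
    char message_a[MSG_LEN];  /* placeholder text, not a real MT300 layout */
    char message_b[MSG_LEN];
    uint64_t prev_hash;       /* hash of the previous block, 0 for the first */
    uint64_t hash;            /* hash over both messages and prev_hash */
} Block;

typedef struct {
    Block blocks[MAX_BLOCKS];
    size_t count;
} Ledger;

/* FNV-1a, continued from state h. NOT cryptographically secure;
   used here only to keep the sketch self-contained. */
static uint64_t fnv1a(const void *data, size_t len, uint64_t h) {
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

static uint64_t block_hash(const Block *b) {
    uint64_t h = 14695981039346656037ULL;  /* FNV-1a 64-bit offset basis */
    h = fnv1a(b->message_a, MSG_LEN, h);
    h = fnv1a(b->message_b, MSG_LEN, h);
    h = fnv1a(&b->prev_hash, sizeof b->prev_hash, h);
    return h;
}

/* Append a message pair. Returns 0 on success, -1 if the ledger is full. */
int ledger_append(Ledger *l, const char *a, const char *b) {
    if (l->count == MAX_BLOCKS) return -1;
    Block *blk = &l->blocks[l->count];
    memset(blk, 0, sizeof *blk);
    strncpy(blk->message_a, a, MSG_LEN - 1);
    strncpy(blk->message_b, b, MSG_LEN - 1);
    blk->prev_hash = l->count ? l->blocks[l->count - 1].hash : 0;
    blk->hash = block_hash(blk);
    l->count++;
    return 0;
}

/* Returns 1 if every block's hash and back-link are consistent, else 0. */
int ledger_verify(const Ledger *l) {
    uint64_t prev = 0;
    for (size_t i = 0; i < l->count; i++) {
        const Block *b = &l->blocks[i];
        if (b->prev_hash != prev || b->hash != block_hash(b)) return 0;
        prev = b->hash;
    }
    return 1;
}
```

After appending a few pairs, `ledger_verify` returns 1; changing any byte of a stored message without recomputing the chain makes it return 0. A real distributed ledger adds replication across parties and agreement on the chain, which this sketch leaves out entirely.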
  • 47. Storing and processing MT300s (2). Why? Intermediate message processors add time and money. Cynically: to become familiar with the technology that may one day supplant them. Output: A platform that:
      • Integrates a wide range of different technologies to achieve its aim (e.g. BigChainDB, ErisDB (permissioned chains based on Ethereum / EVM) + Tendermint).
      • Focuses on scalability and throughput.
    Research into inter-chain interaction for processing and storing data (e.g. using a separate chain to store filtered transactions).
  • 51. Provenance Themes. A recurring theme of both conceptual provenance and data provenance in my work:
      • Learning the language of error: Understanding the functions that have had an impact on input data, using a graph-based representation, and how this has led to an error.
      • Playing Hide-And-Seek: Understanding the origin of data in order to make strategic decisions.
      • MT300 Processing: Using a distributed ledger (blockchain) to provide a secure historic record of all the actions involving an entity. (Footnote: Lots of research to be done at the intersection here!)
  • 52. Summary (1). Experience of, and achievements as a part of, projects that require not only good development skills, but also research capabilities. Strong programming ability. Wide range of experience working with different systems, some large in scale, or designed to be scalable; in particular, systems that have required me to consider how to facilitate communication between heterogeneous entities (e.g. learn tool, HANDS platform, distributed ledger projects). Ph.D. with a focus on game theory, artificial intelligence, and elements of learning.
  • 53. Summary (2). Themes of provenance, and graph-based representation, throughout work. Additional experience as a teaching academic staff member at King’s: significant teaching responsibilities, in addition to administrative and pastoral responsibilities. Some system development as a part of this role.
  • 54. References
      • Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75(2):87–106.
      • Chapman, M., Chockler, H., Kesseli, P., Kroening, D., Strichman, O., and Tautschnig, M. (2015). Learning the language of error. In International Symposium on Automated Technology for Verification and Analysis, pages 114–130. Springer.
      • Chapman, M., Tyson, G., McBurney, P., Luck, M., and Parsons, S. (2014). Playing hide-and-seek: an abstract game for cyber security. In Proceedings of the 1st International Workshop on Agents and CyberSecurity, page 3. ACM.