Using CWL to support EHR-based phenotyping
Martin Chapman
King’s College London
EHR-based phenotyping i
Data mining rebranded for use with electronic health records (EHRs).
Simplest example of EHR-based phenotyping process: flag patients with a certain clinical code
as having a disease (e.g. COVID-19).
PatientID   Clinical Code   Has COVID-19?
001         C19             ✓
002         COPD            -
003         C19             ✓
004         -               -
Figure 1: EHRs at a given clinic
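A minimal sketch of the flagging shown in Figure 1. The record layout and the `flag_cases` helper are assumptions made for illustration; they are not taken from the talk.

```python
# Hypothetical in-memory version of the EHRs in Figure 1.
records = [
    {"PatientID": "001", "ClinicalCode": "C19"},
    {"PatientID": "002", "ClinicalCode": "COPD"},
    {"PatientID": "003", "ClinicalCode": "C19"},
    {"PatientID": "004", "ClinicalCode": None},
]


def flag_cases(rows, target_code):
    """Return a copy of each row with a boolean flag for the target code."""
    return [
        {**row, "HasCondition": row["ClinicalCode"] == target_code}
        for row in rows
    ]


flagged = flag_cases(records, "C19")
cases = [row["PatientID"] for row in flagged if row["HasCondition"]]
print(cases)
```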
EHR-based phenotyping ii
A slightly more complex phenotyping process might look at multiple criteria. It might
consider a patient to have a given condition if any of these criteria are true.
For example, if a patient has a code from one of a number of different coding schemes:
PatientID   Clinical Code   ICD-10 code?   SNOMED code?   Has COVID-19?
001         ICD-C19         ✓              -              ✓
002         COPD            -              -              -
003         SNOMED-C19      -              ✓              ✓
004         -               -              -              -
Figure 2: EHRs at a given clinic
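The "any criterion" logic of Figure 2 could be sketched as follows. The `CODELISTS` mapping and the `evaluate` function are hypothetical names; the codes are the illustrative ones from the figure.

```python
# Hypothetical codelists, one per coding scheme, mirroring Figure 2.
CODELISTS = {
    "ICD-10": {"ICD-C19"},
    "SNOMED": {"SNOMED-C19"},
}


def evaluate(patient_code):
    """Check one patient's code against every scheme; a case if any matches."""
    matches = {
        scheme: patient_code in codes for scheme, codes in CODELISTS.items()
    }
    matches["case"] = any(matches.values())
    return matches


patients = {"001": "ICD-C19", "002": "COPD", "003": "SNOMED-C19", "004": None}
results = {pid: evaluate(code) for pid, code in patients.items()}
```

Each entry in `results` corresponds to one row of the table: one boolean per coding scheme plus the overall case flag.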
Phenotype definitions
The EHR-based phenotyping process is captured as a phenotype definition (abstract and
non-executable logic), and in turn implemented for use in practice as a computable
phenotype (concrete and executable implementation).
Phenotype definition (flowchart):
[Flowchart: an EHR is checked for an ICD-10 code and for a SNOMED code; a "Yes" on either leads to CASE.]
Computable phenotype (SQL):
    SELECT UserID, Codes
    FROM Patients
    WHERE Codes IN ('ICD-C19', 'SNOMED-C19');
    ...
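To make the computable phenotype concrete, here is a small sketch that runs a query of this shape against an in-memory SQLite database. The table contents reuse the code values from Figure 2; the choice of SQLite and the `ORDER BY` clause are assumptions for illustration only.

```python
import sqlite3

# Build a throwaway Patients table holding the rows from Figure 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (UserID TEXT, Codes TEXT)")
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?)",
    [("001", "ICD-C19"), ("002", "COPD"), ("003", "SNOMED-C19"), ("004", None)],
)

# Select every patient holding a code from either coding scheme.
rows = conn.execute(
    "SELECT UserID, Codes FROM Patients "
    "WHERE Codes IN ('ICD-C19', 'SNOMED-C19') ORDER BY UserID"
).fetchall()
conn.close()
print(rows)
```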
Challenges and CWL solutions i
The wider landscape of phenotype definitions and computable phenotypes is more complex:
[Screenshots: portal.caliberresearch.org and phekb.org]
Challenges and CWL solutions ii
• Phenotype definitions come in lots of different forms (flowcharts, text descriptions,
weights for a classifier, etc.) and lack standardisation. This reduces intelligibility and
thus phenotypic reproducibility (the ability to accurately implement the logic intended
by the definition author).
Our response: a new model, based on CWL, to structure definitions.
• Computable phenotypes often don’t exist at all. This affects phenotypic portability
(the effort associated with implementing a definition).
Our response: an architecture, Phenoflow, that parses definitions under our new model and
makes them available to researchers to download in CWL.
1. CWL-based model
Why a workflow?
All phenotype definitions can be considered as, or reduced to, a set of steps, which start
with a patient population, apply a number of criteria to that population, and, depending on
the relationship between those criteria, determine cases of the disease.
1. For simpler definitions, considered here, if any of the criteria are met, a patient is
considered a case, and this can be flagged by individual steps.
2. For more complex definitions, where multiple criteria must all be met, or meeting a
criterion (e.g. being under a certain age) actually should exclude an individual from
having a condition, this can be determined at the end of the workflow, before a final
cohort is produced. Here, the sequential nature of a workflow is important.
3. Nested workflows can be used to handle complex branches (more shortly).
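The points above can be sketched as a chain of functions, where each step receives the cohort produced by the previous one and the final step resolves the relationship between criteria. All data, step names and the age threshold are hypothetical.

```python
def load(_):
    """Connector step: return a hypothetical patient population."""
    return [
        {"id": "001", "age": 45, "codes": ["ICD-C19"]},
        {"id": "002", "age": 12, "codes": ["SNOMED-C19"]},
        {"id": "003", "age": 60, "codes": ["COPD"]},
    ]


def has_code(cohort):
    """Criterion step: flag patients holding a COVID-19 code."""
    for patient in cohort:
        patient["has_code"] = bool({"ICD-C19", "SNOMED-C19"} & set(patient["codes"]))
    return cohort


def under_age(cohort):
    """Criterion step: flag an exclusion criterion (illustrative threshold)."""
    for patient in cohort:
        patient["under_age"] = patient["age"] < 18
    return cohort


def output(cohort):
    """Final step: combine the earlier flags into the final cohort."""
    return [p["id"] for p in cohort if p["has_code"] and not p["under_age"]]


def run(steps):
    """Execute the steps strictly in sequence, piping data between them."""
    data = None
    for step in steps:
        data = step(data)
    return data


final_cohort = run([load, has_code, under_age, output])
print(final_cohort)
```

Because the exclusion is only applied in the last step, the order of the steps matters, which is the property a workflow language provides.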
CWL-based model i
A new CWL-based model for the definition of a phenotype:
[Figure: a step has the fields number, group, id, description and type; an Input (id, description) and an Output (id, description, extension); and one or more implementation units (implementationUnitA, implementationUnitB), each with a path, language and params. The layers of the model are labelled Abstract, Functional and Computational.]
Figure 3: CWL-based definition model (step) and implementation units*.
*the bits of code actually executed by definitions structured under this model; separate from the model
itself.
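One way to picture the fields in Figure 3 is as plain data classes. This is only a sketch: the class names, the shared `Port` type and the example values are assumptions, and the real model is expressed in CWL rather than Python.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    """An input or output of a step."""
    id: str
    description: str
    extension: str = ""  # in Figure 3 only the output carries an extension


@dataclass
class ImplementationUnit:
    """A piece of code that can execute a step."""
    path: str
    language: str
    params: str = ""


@dataclass
class Step:
    number: int
    group: str
    id: str
    description: str
    type: str
    input: Port
    output: Port
    implementation_units: List[ImplementationUnit] = field(default_factory=list)


# Hypothetical first step of a definition.
step = Step(
    number=1,
    group="",
    id="load",
    description="Load patient data from a file.",
    type="load",
    input=Port("patients", "Raw patient records."),
    output=Port("cohort", "Patients available for phenotyping.", "csv"),
    implementation_units=[ImplementationUnit("load.py", "python")],
)
```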
CWL-based model ii
Model is separated into layers:
• Abstract Expresses the logic of a phenotype through a set of simple sequential,
potentially nested steps, each of which is annotated with multiple descriptions. Emphasis
on intelligibility.
• Functional Specifies the metadata of entities passed between the operations within the
abstract layer, e.g., the format of an intermediate cohort.
• Computational Defines an environment for the execution of one or more
implementation units (e.g. a script, data pipeline module, etc.) for each step in the
abstract layer. Inherently supports implementation by providing a template for
development.
CWL-based model iii
Step 2 (group: -, id: icd10, type: logic)
Description: A case is identified in the presence of patients associated with the stated icd10 COVID-19 codes.
Input: covid19 cohort (Potential covid19 cases.)
Output: covid19 cases icd10 (covid19 cases, as identified by icd10 coding.), extension: csv
Implementation unit: icd10.py (language: python, params: -)
    for row in csv_reader:
        newRow = row.copy()
        for cell in row:
            if [value for value in row[cell].split(",") if value in codes]:
                newRow["covid19"] = "CASE"
    ...
Implementation unit: icd10.js (language: javascript, params: -)
    for (row of csvData) {
        newRow = row.slice();
        for (cell of row) {
            if (cell.split(",").filter(code => codes.indexOf(code) > -1).length) {
                newRow.push("CASE");
    ...
Layers: Abstract, Functional, Computational
Figure 4: Individual step of COVID-19 phenotype definition and implementation units.
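The Python fragment in Figure 4 is truncated, so the following is a self-contained sketch of the same idea rather than the actual implementation unit. The `codes` value, the CSV layout and the `identify_cases` name are assumptions.

```python
import csv
import io

codes = {"ICD-C19"}  # hypothetical stand-in for the icd10 COVID-19 codelist


def identify_cases(csv_text):
    """Mark rows as CASE when any comma-separated cell value is in `codes`."""
    out_rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        new_row = dict(row)
        for cell in row:
            values = (row[cell] or "").split(",")
            if any(value in codes for value in values):
                new_row["covid19"] = "CASE"
        out_rows.append(new_row)
    return out_rows


# Hypothetical cohort: the second column may hold several codes per patient.
cohort = 'patient_id,codes\n001,"ICD-C19,COPD"\n002,COPD\n'
result = identify_cases(cohort)
```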
Relationship to CWL i
‘Informal subset’ of CWL, specified using step type metadata:
• The first step must be of a connector type (currently load or external), designed to
extract data from a datasource without performing any processing on that data, and pass
it to the second step.
• Other steps in a definition must describe the logic of the phenotype (the types are currently
boolean logic and generic logic, the latter supporting, for example, case exclusion).
• The final step must be of an output type, writing a final condition cohort to disk and
taking into account any relationships between boolean steps (e.g. all must be true).
More: https://github.com/kclhi/phenoflow/wiki/Model
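The ordering rules above can be checked mechanically. The sketch below assumes the type names are the strings `load`, `external`, `boolean`, `logic` and `output`; the exact identifiers used by Phenoflow may differ, and `validate_step_types` is a hypothetical helper.

```python
CONNECTOR_TYPES = {"load", "external"}
LOGIC_TYPES = {"boolean", "logic"}


def validate_step_types(types):
    """Return a list of rule violations for an ordered list of step types."""
    if len(types) < 2:
        return ["a definition needs at least a connector and an output step"]
    errors = []
    if types[0] not in CONNECTOR_TYPES:
        errors.append("first step must be a connector (load or external)")
    if types[-1] != "output":
        errors.append("final step must be of type output")
    # Every step between the first and the last must carry phenotype logic.
    for position, step_type in enumerate(types[1:-1], start=2):
        if step_type not in LOGIC_TYPES:
            errors.append(f"step {position} must describe phenotype logic")
    return errors


print(validate_step_types(["load", "logic", "boolean", "output"]))
```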
Other model benefits
Beyond standardising definitions (and thus improving phenotypic reproducibility), a CWL-based
model provides us with a number of other benefits:
• As we’ve already seen, we can have different implementations for the same definition.
  ◦ Often different sites will realise the same phenotype logic using different implementation
    units, and we want to map these to the original logic.
• Connecting, yet keeping separate, the definition and the implementation.
  ◦ Important when phenotype implementations and definitions are often conflated.
• Support for a wide range of definition types.
  ◦ From simple codelists (as seen) to trained classifiers. Implementation units can also differ
    across steps, if needed. Enabled via CWL’s Docker integration (more shortly).
2. Definition parsing and CWL generation
Phenoflow: Parsing and CWL generation architecture
Our architecture, Phenoflow, allows us to take non-standard phenotype definitions,
standardise them, and make them available for download in CWL.
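[Architecture diagram. Components: Web Portal/API, Generator, Visualiser, Implementation Units, VC server. Actors: Author(s), who author and expand definitions, and the User, who customises and receives a workflow, visualisation and implementation units. Other edge labels: data, workflow, visualisation.]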
Chapman, Martin, et al. “Phenoflow: A microservice architecture for portable workflow-based
phenotype definitions.” AMIA, 2021. https://kclhi.org/phenoflow
Parsing
When an author submits data relating to a definition (e.g. a codelist) via the API (or we pull
this data from existing libraries):
1. Key information that forms the logic of the definition (e.g. a ‘conceptid’ column in a codelist CSV) is identified.
2. This information is used to automatically determine the number of steps and their content (e.g. by grouping codes according to coding scheme).
3. Implementation units for each step are automatically created.
4. Implementation units are added to the Phenoflow library ready to be used within a workflow.
5. Subsequent definition edits are tracked using a Data Provenance Template server.
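Steps 1 and 2 above could look roughly like this for a codelist. Only the `conceptid` column name comes from the talk; the `system` column, the example codes and the `steps_from_codelist` function are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical codelist: one code per row, tagged with its coding scheme.
CODELIST_CSV = """conceptid,system
ICD-C19,icd10
SNOMED-C19,snomed
ICD-C19b,icd10
"""


def steps_from_codelist(csv_text):
    """Group codes by coding scheme and emit one logic step per scheme."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["system"]].append(row["conceptid"])
    # Number from 2, leaving position 1 for the connector (load) step.
    return [
        {"number": number, "id": system, "codes": codes}
        for number, (system, codes) in enumerate(sorted(grouped.items()), start=2)
    ]


steps = steps_from_codelist(CODELIST_CSV)
print(steps)
```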
Parsing: Implementation unit creation (Step 4) i
To support the creation of implementation units automatically, we developed templates with
placeholder values, that are then populated as a part of the parsing process.
• Our simplest template substitutes an array of codes, each of which can then be identified
within an EHR:
    codes = [[LIST]];
    ...
    with open(sys.argv[1], 'r') as file_in, \
         open('[PHENOTYPE]-potential-cases.csv', 'w', newline='') as file_out:
    ...
• Templates for more complex definition types are based upon existing phenotype
implementations (e.g. Python NLP phenotyping at KCL GSTT, clustering techniques).
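A sketch of how such a template might be populated. The template text is a shortened stand-in, and the assumption that `[[LIST]]` is replaced wholesale by a list literal is ours; the `populate` function is hypothetical.

```python
import json

# Shortened stand-in for a codelist template with two placeholders.
TEMPLATE = (
    "codes = [[LIST]]\n"
    "output_name = '[PHENOTYPE]-potential-cases.csv'\n"
)


def populate(template, phenotype, codes):
    """Fill the [[LIST]] and [PHENOTYPE] placeholders in a template."""
    # json.dumps renders a list of strings as a literal that is also valid Python.
    return (
        template
        .replace("[[LIST]]", json.dumps(sorted(codes)))
        .replace("[PHENOTYPE]", phenotype)
    )


unit_source = populate(TEMPLATE, "covid19", {"ICD-C19", "SNOMED-C19"})

# Execute the populated unit to confirm that it is valid Python.
namespace = {}
exec(unit_source, namespace)
print(unit_source)
```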
Parsing: Implementation unit creation (Step 4) ii
Each populated template will eventually be executed using a CommandLineTool in CWL.
• Support for different types of definitions (from simple codelists to trained classifiers) is
provided by creating custom Docker images, which provide specific language and package
support. These are then later used by each tool to execute the implementation unit.
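As an illustration, a CommandLineTool description for one step can be assembled as a plain dictionary and serialised (CWL documents may be written in JSON as well as YAML). The input names and the Docker image mirror the example tool shown later in this talk; the `command_line_tool` function itself is a hypothetical helper, not part of Phenoflow.

```python
import json


def command_line_tool(step_id, doc, image, base_command="python"):
    """Describe one step as a CWL CommandLineTool."""
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "id": step_id,
        "doc": doc,
        "baseCommand": base_command,
        "inputs": [
            # The implementation unit itself is passed as the first argument.
            {"id": "inputModule", "type": "File", "inputBinding": {"position": 1}},
            {"id": "potentialCases", "type": "File", "inputBinding": {"position": 2}},
        ],
        "outputs": [
            {"id": "output", "type": "File", "outputBinding": {"glob": "*.csv"}},
        ],
        # The image supplies the language and packages the unit needs.
        "requirements": {"DockerRequirement": {"dockerPull": image}},
    }


tool = command_line_tool("icd10", "Identify COVID-19 (ICD-10)", "kclhi/python:latest")
print(json.dumps(tool, indent=2))
```

Swapping the `image` and `base_command` arguments is what would let a different language or package set execute the same step.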
Parsing: Provenance (Step 5)
Our architecture includes a Data Provenance Template server, a piece of software that holds
structured fragments of provenance.
These fragments record the evolution of definitions within Phenoflow, as they are edited by users.
Designed to complement CWLProv, which records workflow execution.
[Provenance template diagram. Nodes and their attributes:
  var:updated (prov:type phenoflow#Updated; prov:end vvar:time; zone:id update)
  var:author (prov:type phenoflow#Author)
  var:phenotypeAfter (prov:type phenoflow#Phenotype; phenoflow:name vvar:name; phenoflow:description vvar:description)
  var:phenotypeBefore (prov:type phenoflow#Phenotype)
  var:step (prov:type phenoflow#Step; phenoflow:stepName vvar:stepName; phenoflow:doc vvar:doc; phenoflow:type vvar:type; phenoflow:position vvar:position; phenoflow:coding vvar:coding; zone:id update)
Relations shown: used (twice), wasAssociatedWith, wasGeneratedBy.]
Fairweather, Elliot, et al. “A delayed instantiation approach to template-driven provenance for
electronic health record phenotyping”. IPAW, 2020.
Phenoflow library
At the end of the parsing process, we effectively have a set of database entries (and a set of
implementation units), containing the information required to generate a workflow. This forms
the library:
Phenoflow library: Additional implementation units
After an initial import, other users can add further implementation units, which makes it
possible to customise the workflow that is downloaded.
Once a permutation is selected, a user can generate a CWL workflow for that permutation on
the fly.
Generation
When a user clicks download:
1. Pass information related to the chosen workflow permutation to the generator, and receive CWL files back in response.
2. Create a version of this workflow in a local Git server.
3. Pass a link to this versioned code to the visualiser, and receive a graphic back.
4. Combine the CWL files, implementation units and visualised workflow (to increase intelligibility) into a zip and push it to the user for download.
5. The user sets configuration details within the implementation units (e.g. database credentials) and then executes the workflow locally against their target datasource.
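The bundling in step 4 could be sketched with the standard library alone. The file names and contents below are hypothetical placeholders, and `bundle` is not a Phenoflow function.

```python
import io
import zipfile


def bundle(files):
    """Pack a mapping of archive path -> file content into an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


# Hypothetical files for one workflow permutation.
payload = bundle({
    "covid19.cwl": "cwlVersion: v1.0\n",
    "python/icd10.py": "codes = []\n",
    "abstract.svg": "<svg/>",
})

# Re-open the archive to list what a user would receive.
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
print(names)
```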
Generation: On-the-fly workflow generation (Step 1)
We created a lightweight service wrapper around the CWL generator so that it can be called
in real time and generate a workflow based on the information stored as a part of the parsing
process.
    @app.route('/generate', methods=['POST'])
    async def generate(request):
        try:
            steps = await request.json()
        except Exception:
            # Treat an unreadable request body as 'no steps supplied'.
            steps = None
        if steps:
            generatedWorkflow = generateWorkflow(steps)
            return JSONResponse({
                'workflow': yaml.dump(generatedWorkflow['workflow'] ... )
            })
        ...
    ...
https://github.com/kclhi/phenoflow/tree/master/generator
Going forward...
Impact
The use of CWL in this way has already had some impact:
1. We are connected to the HDRUK phenotype library
(https://phenotypes.healthdatagateway.org/), and automatically provide
implementations for their 1000+ phenotype definitions.
2. We are actively working with, or in conversation with, several sites in the US to
represent their definitions.
3. Phenoflow has been used to represent some recent complex phenotypes, e.g. Long
Covid (Mayor, Nikhil, et al. “Developing a Long COVID Phenotype for Postacute
COVID-19 in a National Primary Care Sentinel Cohort: Observational Retrospective
Database Analysis”. JMIR, 2022.).
More to do! We are always looking for new phenotype definitions to increase the
sophistication of our parsing process.
Things we could probably do better i
1. Generation overhead
We initially believed that the ‘on the fly’ style of generation used in Phenoflow was required
because of all the possible permutations of implementation units that could be selected.
In reality, we have determined that the overhead of generating the corresponding CWL for
these permutations in advance is less than the delay that on-the-fly generation imposes on a user.
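To give a feel for the number of workflows that would need pre-generating, the permutations for one definition can be enumerated as the Cartesian product of the units available per step. The library contents below are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical library state: implementation units available for each step.
units_per_step = {
    "load": ["load.py", "load.js"],
    "icd10": ["icd10.py", "icd10.js"],
    "snomed": ["snomed.py"],
    "output": ["output.py", "output.js"],
}

# One permutation picks exactly one implementation unit per step.
permutations = list(product(*units_per_step.values()))
expected = prod(len(units) for units in units_per_step.values())
print(len(permutations), expected)
```

The count grows multiplicatively with each additional unit, so whether pre-generation stays cheap depends on how many alternatives each step accumulates.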
Things we could probably do better ii
As such, we are now shifting our architecture to instead use GitHub as a store for
pre-generated workflows produced as a part of the parsing (or editing) process.
[Revised architecture diagram. Components: API, Generator, Visualiser, GitHub. Actors: Author(s), User. Edge labels: author, expand data; workflow; index workflows; query; link to workflow + implementation units and visualisation.]
We hope to progress a fork of CWL Viewer that effectively acts as the web portal by
visualising (and indexing) the available Git repositories.
Things we could probably do better iii
2. Generator version
We are using the original CWL generator (python-cwlgen), but should now, instead, be using
cwl-utils.
Things we could probably do better iv
3. Branch handling
As a part of our parsing process, we ‘flatten’ branches into individual steps if they are simple,
and into entire nested workflows if they are more complex.
Each branch evaluates to a boolean value, representing whether the logic it contains suggests
that a patient has the condition. Then, much like the simpler examples we’ve seen, if any of
the steps return true, the patient is deemed to have the condition.
There may well be a more sophisticated way to do this in CWL.
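A minimal sketch of this flattening, with each branch reduced to a function that returns a boolean. The branch contents (codes, age threshold) are invented for illustration and do not come from a real definition.

```python
def simple_branch(patient):
    """A simple branch flattened into one step: a single code check."""
    return "ICD-C19" in patient["codes"]


def nested_branch(patient):
    """A complex branch kept as a nested workflow of several checks."""
    checks = [
        "positive-test" in patient["codes"],
        patient["age"] >= 18,
    ]
    return all(checks)


BRANCHES = [simple_branch, nested_branch]


def has_condition(patient):
    """The patient is a case if any branch evaluates to True."""
    return any(branch(patient) for branch in BRANCHES)


print(has_condition({"codes": ["positive-test"], "age": 30}))
```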
Things we could probably do better v
4. The CWL itself!
    $namespaces:
      s: http://phenomics.kcl.ac.uk/phenoflow/
    baseCommand: python
    class: CommandLineTool
    cwlVersion: v1.0
    doc: Identify COVID-19 (ICD-10)
    id: icd10
    inputs:
    - doc: Python implementation unit
      id: inputModule
      inputBinding:
        position: 1
      type: File
    - doc: Potential cases of covid-19.
      id: potentialCases
      inputBinding:
        position: 2
      type: File
    outputs:
    - doc: Patients with ICD-10 COVID-19 codes
      id: output
      outputBinding:
        glob: '*.csv'
      type: File
    requirements:
      DockerRequirement:
        dockerPull: kclhi/python:latest
    s:type: logic
Summary
• We standardise existing phenotype definitions under a CWL-based model.
• These standardised definitions are presented to users as a part of the Phenoflow library.
• CWL files themselves are generated in real time when a user downloads a given definition
from the library.
Thank you! Things like CWL’s Docker integration and the generation and visualisation tools
have been invaluable.
Links
Links given throughout the presentation:
Live: https://kclhi.org/phenoflow
Source: https://github.com/kclhi/phenoflow
Wiki: https://github.com/kclhi/phenoflow/wiki
Principles of Health Informatics: Social networks, telehealth, and mobile healthMartin Chapman
 
Principles of Health Informatics: Communication systems in healthcare
Principles of Health Informatics: Communication systems in healthcarePrinciples of Health Informatics: Communication systems in healthcare
Principles of Health Informatics: Communication systems in healthcareMartin Chapman
 
Principles of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systemsPrinciples of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systemsMartin Chapman
 
Principles of Health Informatics: Representing medical knowledge
Principles of Health Informatics: Representing medical knowledgePrinciples of Health Informatics: Representing medical knowledge
Principles of Health Informatics: Representing medical knowledgeMartin Chapman
 
Principles of Health Informatics: Informatics skills - searching and making d...
Principles of Health Informatics: Informatics skills - searching and making d...Principles of Health Informatics: Informatics skills - searching and making d...
Principles of Health Informatics: Informatics skills - searching and making d...Martin Chapman
 
Principles of Health Informatics: Informatics skills - communicating, structu...
Principles of Health Informatics: Informatics skills - communicating, structu...Principles of Health Informatics: Informatics skills - communicating, structu...
Principles of Health Informatics: Informatics skills - communicating, structu...Martin Chapman
 
Principles of Health Informatics: Models, information, and information systems
Principles of Health Informatics: Models, information, and information systemsPrinciples of Health Informatics: Models, information, and information systems
Principles of Health Informatics: Models, information, and information systemsMartin Chapman
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Martin Chapman
 
Using Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research SoftwareUsing Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research SoftwareMartin Chapman
 
COVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvIt
COVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvItCOVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvIt
COVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvItMartin Chapman
 

Mais de Martin Chapman (20)

Principles of Health Informatics: Artificial intelligence and machine learning
Principles of Health Informatics: Artificial intelligence and machine learningPrinciples of Health Informatics: Artificial intelligence and machine learning
Principles of Health Informatics: Artificial intelligence and machine learning
 
Principles of Health Informatics: Clinical decision support systems
Principles of Health Informatics: Clinical decision support systemsPrinciples of Health Informatics: Clinical decision support systems
Principles of Health Informatics: Clinical decision support systems
 
Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...
Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...
Mechanisms for Integrating Real Data into Search Game Simulations: An Applica...
 
Technical Validation through Automated Testing
Technical Validation through Automated TestingTechnical Validation through Automated Testing
Technical Validation through Automated Testing
 
Scalable architectures for phenotype libraries
Scalable architectures for phenotype librariesScalable architectures for phenotype libraries
Scalable architectures for phenotype libraries
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
 
Using AI to autonomously identify diseases within groups of patients
Using AI to autonomously identify diseases within groups of patientsUsing AI to autonomously identify diseases within groups of patients
Using AI to autonomously identify diseases within groups of patients
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
 
Principles of Health Informatics: Evaluating medical software
Principles of Health Informatics: Evaluating medical softwarePrinciples of Health Informatics: Evaluating medical software
Principles of Health Informatics: Evaluating medical software
 
Principles of Health Informatics: Usability of medical software
Principles of Health Informatics: Usability of medical softwarePrinciples of Health Informatics: Usability of medical software
Principles of Health Informatics: Usability of medical software
 
Principles of Health Informatics: Social networks, telehealth, and mobile health
Principles of Health Informatics: Social networks, telehealth, and mobile healthPrinciples of Health Informatics: Social networks, telehealth, and mobile health
Principles of Health Informatics: Social networks, telehealth, and mobile health
 
Principles of Health Informatics: Communication systems in healthcare
Principles of Health Informatics: Communication systems in healthcarePrinciples of Health Informatics: Communication systems in healthcare
Principles of Health Informatics: Communication systems in healthcare
 
Principles of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systemsPrinciples of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systems
 
Principles of Health Informatics: Representing medical knowledge
Principles of Health Informatics: Representing medical knowledgePrinciples of Health Informatics: Representing medical knowledge
Principles of Health Informatics: Representing medical knowledge
 
Principles of Health Informatics: Informatics skills - searching and making d...
Principles of Health Informatics: Informatics skills - searching and making d...Principles of Health Informatics: Informatics skills - searching and making d...
Principles of Health Informatics: Informatics skills - searching and making d...
 
Principles of Health Informatics: Informatics skills - communicating, structu...
Principles of Health Informatics: Informatics skills - communicating, structu...Principles of Health Informatics: Informatics skills - communicating, structu...
Principles of Health Informatics: Informatics skills - communicating, structu...
 
Principles of Health Informatics: Models, information, and information systems
Principles of Health Informatics: Models, information, and information systemsPrinciples of Health Informatics: Models, information, and information systems
Principles of Health Informatics: Models, information, and information systems
 
Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...Using AI to understand how preventative interventions can improve the health ...
Using AI to understand how preventative interventions can improve the health ...
 
Using Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research SoftwareUsing Microservices to Design Patient-facing Research Software
Using Microservices to Design Patient-facing Research Software
 
COVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvIt
COVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvItCOVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvIt
COVID-19 Analytics in Jupyter: Intuitive Provenance Integration using ProvIt
 

Último

Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 

Último (20)

Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 

Using CWL to support EHR-based phenotyping

  • 1. Using CWL to support EHR-based phenotyping Martin Chapman King’s College London
  • 3. EHR-based phenotyping i
    Data mining rebranded for use with electronic health records (EHRs).
    Simplest example of an EHR-based phenotyping process: flag patients with a certain clinical code as having a disease (e.g. COVID-19).
    PatientID  Clinical Code  Has COVID-19?
    001        C19            ✓
    002        COPD           -
    003        C19            ✓
    004        -              -
    Figure 1: EHRs at a given clinic
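A minimal sketch of the single-code flagging described on this slide, assuming the records in Figure 1 are held as a list of dictionaries. The field names, the `flag_cases` helper and the `HasCondition` flag are hypothetical and are not taken from the talk.

```python
# Hypothetical records mirroring Figure 1; "-" marks a patient with no code.
ehrs = [
    {"PatientID": "001", "ClinicalCode": "C19"},
    {"PatientID": "002", "ClinicalCode": "COPD"},
    {"PatientID": "003", "ClinicalCode": "C19"},
    {"PatientID": "004", "ClinicalCode": "-"},
]


def flag_cases(records, target_code):
    """Return a copy of each record with a boolean case flag added."""
    return [
        {**record, "HasCondition": record["ClinicalCode"] == target_code}
        for record in records
    ]


flagged = flag_cases(ehrs, "C19")
case_ids = [record["PatientID"] for record in flagged if record["HasCondition"]]
print(case_ids)
```

The flag is added to a copy of each record, so the original EHR data is left untouched.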
  • 7. EHR-based phenotyping ii
    A slightly more complex phenotyping process might look at multiple criteria. It might consider a patient to have a given condition if any of these criteria are true.
    For example, if a patient has a code from one of a number of different coding schemes:
    PatientID  Clinical Code  ICD-10 code?  SNOMED code?  Has COVID-19?
    001        ICD-C19        ✓             -             ✓
    002        COPD           -             -             -
    003        SNOMED-C19     -             ✓             ✓
    004        -              -             -             -
    Figure 2: EHRs at a given clinic
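The "any criterion is true" rule can be sketched as a list of predicate functions, one per coding scheme. The data follows Figure 2; the function names and the use of `None` for a missing code are assumptions made for illustration.

```python
# Hypothetical records mirroring Figure 2; None marks a patient with no code.
ehrs = [("001", "ICD-C19"), ("002", "COPD"), ("003", "SNOMED-C19"), ("004", None)]


def has_icd10_code(code):
    return code == "ICD-C19"


def has_snomed_code(code):
    return code == "SNOMED-C19"


# Each criterion corresponds to one of the check columns in Figure 2.
criteria = [has_icd10_code, has_snomed_code]


def is_case(code, criteria):
    """A patient is a case if at least one criterion holds for their code."""
    return code is not None and any(criterion(code) for criterion in criteria)


results = {patient_id: is_case(code, criteria) for patient_id, code in ehrs}
print(results)
```

Adding a further coding scheme only means appending another predicate to `criteria`.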
  • 8. Phenotype definitions
    The EHR-based phenotyping process is captured as a phenotype definition (abstract and non-executable logic), and in turn implemented for use in practice as a computable phenotype (concrete and executable implementation).
    Phenotype definition (flowchart): [EHR → ICD-10 code? → SNOMED code?, with each "Yes" branch leading to CASE]
    Computable phenotype (SQL):
      SELECT UserID, Codes FROM Patients WHERE Codes IN ('ICD-C19', 'SNOMED-C19'); ...
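To make the computable phenotype concrete, the sketch below runs a query of this shape against an in-memory SQLite database. The `Patients` table layout and its rows are assumptions based on Figure 2, the code values follow that figure's spelling, and the `ORDER BY` clause is added only to make the result order predictable.

```python
import sqlite3

# In-memory database standing in for a clinic's EHR store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (UserID TEXT PRIMARY KEY, Codes TEXT)")
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?)",
    [("001", "ICD-C19"), ("002", "COPD"), ("003", "SNOMED-C19"), ("004", None)],
)

# The computable phenotype: select patients holding either COVID-19 code.
cases = conn.execute(
    "SELECT UserID, Codes FROM Patients "
    "WHERE Codes IN ('ICD-C19', 'SNOMED-C19') ORDER BY UserID"
).fetchall()
conn.close()
print(cases)
```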
  • 9. Challenges and CWL solutions i
    The wider phenotype definition and computable phenotype landscape is more complex; see, for example, the libraries at portal.caliberresearch.org and phekb.org.
  • 12. Challenges and CWL solutions ii
    • Phenotype definitions come in lots of different forms (flowcharts, text descriptions, weights for a classifier, etc.) and lack standardisation. This reduces intelligibility and thus phenotypic reproducibility (the ability to accurately implement the logic intended by the definition author).
      Solution: a new model to structure definitions based on CWL.
    • Computable phenotypes often don’t exist at all. This affects phenotypic portability (the effort associated with implementing a definition).
      Solution: an architecture, Phenoflow, to parse definitions under our new model and make them available to researchers to download in CWL.
  • 16. Why a workflow?
    All phenotype definitions can be considered as, or reduced to, a set of steps, which start with a patient population, apply a number of criteria to that population, and, depending on the relationship between those criteria, determine cases of the disease.
    1. For simpler definitions, considered here, if any of the criteria are met, a patient is considered a case, and this can be flagged by individual steps.
    2. For more complex definitions, where multiple criteria must all be met, or where meeting a criterion (e.g. being under a certain age) should actually exclude an individual from having a condition, this can be determined at the end of the workflow, before a final cohort is produced. Here, the sequential nature of a workflow is important.
    3. Nested workflows can be used to handle complex branches (more shortly).
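The second point can be sketched as a small pipeline in which each step only records a flag, and the relationship between the flags (required or excluding) is resolved in a final output step. The population, the age threshold of 18 and all function names are hypothetical.

```python
# Hypothetical patient population.
population = [
    {"id": "001", "code": "C19", "age": 54},
    {"id": "002", "code": "COPD", "age": 61},
    {"id": "003", "code": "C19", "age": 12},
]


def load(population):
    """Connector step: copy the data without processing it."""
    return [dict(person, flags={}) for person in population]


def criterion(name, predicate):
    """Build a logic step that records whether each patient meets a criterion."""
    def step(cohort):
        for person in cohort:
            person["flags"][name] = predicate(person)
        return cohort
    return step


def output(cohort, require_all, exclude_if_any):
    """Final step: resolve the relationships between the recorded flags."""
    return [
        person["id"]
        for person in cohort
        if all(person["flags"][name] for name in require_all)
        and not any(person["flags"][name] for name in exclude_if_any)
    ]


workflow = [
    criterion("has_code", lambda person: person["code"] == "C19"),
    criterion("under_18", lambda person: person["age"] < 18),
]

cohort = load(population)
for step in workflow:  # steps run strictly in sequence
    cohort = step(cohort)
final_cohort = output(cohort, require_all=["has_code"], exclude_if_any=["under_18"])
print(final_cohort)
```

Deferring the decision to the output step keeps each criterion step independent of the others.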
  • 17. CWL-based model i
    A new CWL-based model for the definition of a phenotype:
    [Figure 3: CWL-based definition model (step) and implementation units*. Abstract layer: step (number, group, id, description, type). Functional layer: Input (id, description) and Output (id, description, extension). Computational layer: implementation units A and B, each with path, language and params.]
    *The bits of code actually executed by definitions structured under this model; separate from the model itself.
  • 20. CWL-based model ii
    Model is separated into layers:
    • Abstract: Expresses the logic of a phenotype through a set of simple sequential, potentially nested steps, each of which is annotated with multiple descriptions. Emphasis on intelligibility.
    • Functional: Specifies the metadata of entities passed between the operations within the abstract layer, e.g., the format of an intermediate cohort.
    • Computational: Defines an environment for the execution of one or more implementation units (e.g. a script, data pipeline module, etc.) for each step in the abstract layer. Inherently supports implementation by providing a template for development.
  • 21. CWL-based model iii
    Abstract layer (step):
      number: 2; group: -; id: icd10; type: logic
      description: A case is identified in the presence of patients associated with the stated icd10 COVID-19 codes.
    Functional layer:
      Input: covid19 cohort (Potential covid19 cases.)
      Output: covid19 cases icd10 (covid19 cases, as identified by icd10 coding.); extension: csv
    Computational layer (implementation units):
      icd10.py (python, params: -)
        for row in csv_reader:
            newRow = row.copy()
            for cell in row:
                if [value for value in row[cell].split(",") if value in codes]:
                    newRow["covid19"] = "CASE"
        ...
      icd10.js (javascript, params: -)
        for (row of csvData) {
            newRow = row.slice();
            for (cell of row) {
                if (cell.split(",").filter(code => codes.indexOf(code) > -1).length) {
                    newRow.push("CASE");
        ...
    Figure 4: Individual step of COVID-19 phenotype definition and implementation units.
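One way to picture the three layers is as plain data classes, populated here with the values from the COVID-19 step on this slide. This is only an illustrative reading of the figure: the class names, the field layout and the `implementation_for` helper are assumptions and not Phenoflow's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Port:
    """Functional layer: metadata for data passed between steps."""
    id: str
    description: str
    extension: Optional[str] = None


@dataclass
class ImplementationUnit:
    """Computational layer: one executable realisation of a step."""
    path: str
    language: str
    params: Optional[str] = None


@dataclass
class Step:
    """Abstract layer: the described logic, linked to the other two layers."""
    number: int
    id: str
    description: str
    type: str
    input: Port
    output: Port
    implementations: List[ImplementationUnit] = field(default_factory=list)

    def implementation_for(self, language):
        """Return the unit written in the given language, or None."""
        return next((u for u in self.implementations if u.language == language), None)


icd10_step = Step(
    number=2,
    id="icd10",
    description="A case is identified in the presence of patients "
                "associated with the stated icd10 COVID-19 codes.",
    type="logic",
    input=Port("covid19 cohort", "Potential covid19 cases."),
    output=Port("covid19 cases icd10",
                "covid19 cases, as identified by icd10 coding.", "csv"),
    implementations=[
        ImplementationUnit("icd10.py", "python"),
        ImplementationUnit("icd10.js", "javascript"),
    ],
)
print(icd10_step.implementation_for("javascript").path)
```

Keeping the implementation units in a list attached to the step reflects the idea that one piece of logic can have several interchangeable implementations.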
  • 22. Relationship to CWL i
    ‘Informal subset’ of CWL, specified using step type metadata:
    • The first step must be of a connector type (currently load or external), designed to extract data from a datasource without performing any processing on that data, and pass it to the second step.
    • Other steps in a definition must describe the logic of the phenotype (types are currently boolean logic and generic logic, the latter supporting, for example, case exclusion).
    • The final step must be of an output type, outputting a final condition cohort to disc, taking into account any relationships between boolean steps (e.g. all must be true).
    More: https://github.com/kclhi/phenoflow/wiki/Model
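These ordering rules lend themselves to a simple check over the sequence of step types. The sketch below assumes the types are stored as the strings `load`, `external`, `boolean`, `logic` and `output`; those exact strings and the `validate_step_types` function are hypothetical.

```python
# Assumed string names for the step types described above.
CONNECTOR_TYPES = {"load", "external"}
LOGIC_TYPES = {"boolean", "logic"}
OUTPUT_TYPES = {"output"}


def validate_step_types(types):
    """Return a list of rule violations for an ordered list of step types."""
    if len(types) < 2:
        return ["a definition needs at least a connector step and an output step"]
    errors = []
    if types[0] not in CONNECTOR_TYPES:
        errors.append(f"step 1 must be a connector type, found '{types[0]}'")
    if types[-1] not in OUTPUT_TYPES:
        errors.append(f"final step must be an output type, found '{types[-1]}'")
    # Every step between the first and last must describe phenotype logic.
    for position, step_type in enumerate(types[1:-1], start=2):
        if step_type not in LOGIC_TYPES:
            errors.append(f"step {position} must be a logic type, found '{step_type}'")
    return errors


print(validate_step_types(["load", "logic", "boolean", "output"]))
print(validate_step_types(["logic", "load", "output"]))
```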
  • 24. Other model benefits
    Beyond standardising definitions (and thus improving phenotypic reproducibility), a CWL-based model provides us with a number of other benefits:
    • As we’ve already seen, we can have different implementations for the same definition.
      • Often different sites will realise the same phenotype logic using different implementation units, and we want to map these to the original logic.
    • Connecting, yet keeping separate, the definition and the implementation.
      • Important when phenotype implementations and definitions are often conflated.
    • Support for a wide range of definition types.
      • From simple codelists (as seen) to trained classifiers. Implementation units can also differ across steps, if needed. Enabled via CWL’s Docker integration (more shortly).
  • 25. 2. Definition parsing and CWL generation
  • 26. Phenoflow: Parsing and CWL generation architecture
    Our architecture, Phenoflow, allows us to take non-standard phenotype definitions, standardise them, and make them available for download in CWL.
    [Figure: Phenoflow architecture. Components: Web Portal/API, Generator, Visualiser, Implementation Units, VC server. Actors: Author(s), who author and expand definition data, and User, who customises and receives the workflow, its visualisation and its implementation units.]
    Chapman, Martin, et al. “Phenoflow: A microservice architecture for portable workflow-based phenotype definitions.” AMIA, 2021. https://kclhi.org/phenoflow
  • 28. Parsing
    When an author submits data relating to a definition (e.g. a codelist) via the API (or we pull this data from existing libraries):
    1. Key information that forms the logic of the definition (e.g. a ‘conceptid’ column in a codelist CSV) is identified.
    2. This information is used to automatically determine the number of steps and their content (e.g. by grouping codes according to coding scheme).
    3. Implementation units for each step are automatically created.
    4. Implementation units are added to the Phenoflow library ready to be used within a workflow.
    5. Subsequent definition edits are tracked using a Data Provenance Template server.
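Steps 1 and 2 can be sketched as reading a codelist CSV and grouping its codes by coding scheme, with one logic step per scheme. The `conceptid` column name comes from the slide; the `system` column, the code values and the `steps_from_codelist` function are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical codelist; the codes are placeholders, not real clinical codes.
codelist_csv = "conceptid,system\nA1,icd10\nA2,icd10\nB1,snomed\n"


def steps_from_codelist(text):
    """Group codes by coding scheme and emit one logic step per scheme."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row["system"]].append(row["conceptid"])
    # Numbering starts at 2 because step 1 is reserved for the connector step.
    return [
        {"number": number, "id": system, "codes": codes}
        for number, (system, codes) in enumerate(groups.items(), start=2)
    ]


steps = steps_from_codelist(codelist_csv)
print(steps)
```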
  • 29. Parsing: Implementation unit creation (Step 4) i
    To support the creation of implementation units automatically, we developed templates with placeholder values that are then populated as part of the parsing process.
    • Our simplest template substitutes an array of codes, each of which can then be identified within an EHR:
        codes = [[LIST]];
        ...
        with open(sys.argv[1], 'r') as file_in, open('[PHENOTYPE]-potential-cases.csv', 'w', newline='') as file_out:
        ...
    • Templates for more complex definition types are based upon existing phenotype implementations (e.g. Python NLP phenotyping at KCL GSTT, clustering techniques).
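Populating such a template amounts to string substitution. The sketch below assumes the whole `[[LIST]]` token is replaced by a list literal and `[PHENOTYPE]` by the phenotype name; the shortened template text, the `output_name` variable and the `populate` function are hypothetical stand-ins for the real template.

```python
import json

# Shortened, hypothetical template using the two placeholders from the slide.
TEMPLATE = (
    "codes = [[LIST]]\n"
    "output_name = '[PHENOTYPE]-potential-cases.csv'\n"
)


def populate(template, phenotype, codes):
    """Fill the placeholders to produce the source of an implementation unit."""
    # json.dumps yields a list literal that is also valid Python for string codes.
    return (
        template
        .replace("[[LIST]]", json.dumps(codes))
        .replace("[PHENOTYPE]", phenotype)
    )


unit_source = populate(TEMPLATE, "covid19", ["A1", "A2"])
print(unit_source)
```

Executing the generated source, for example with `exec`, shows that the populated unit is ordinary Python with the codes embedded in it.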
  • 30. Parsing: Implementation unit creation (Step 4) ii Each populated template will eventually be executed using a CommandLineTool in CWL.
      • Support for different types of definitions (from simple codelists to trained classifiers) is provided by custom Docker images, which supply the specific language and package support required. Each tool later uses one of these images to execute its implementation unit.
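As a rough sketch of what wrapping an implementation unit involves, a tool description can be assembled as a plain dictionary and serialised. The `command_line_tool` helper is hypothetical and is not how Phenoflow's generator is written; the field names follow the CWL v1.0 CommandLineTool structure, and JSON is used for output on the assumption that a JSON rendering of the document is acceptable to the CWL runner in use.

```python
import json


def command_line_tool(step_id, doc, docker_image):
    """Describe one implementation unit as a CWL v1.0 CommandLineTool (hypothetical helper)."""
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "id": step_id,
        "doc": doc,
        "baseCommand": "python",
        "inputs": [
            # The implementation unit itself is passed as the first argument.
            {"id": "inputModule", "type": "File",
             "doc": "Python implementation unit",
             "inputBinding": {"position": 1}},
            # The data to process is passed as the second argument.
            {"id": "potentialCases", "type": "File",
             "doc": "Potential cases",
             "inputBinding": {"position": 2}},
        ],
        "outputs": [
            {"id": "output", "type": "File",
             "outputBinding": {"glob": "*.csv"}},
        ],
        # The Docker image supplies the language and packages the unit needs.
        "requirements": {"DockerRequirement": {"dockerPull": docker_image}},
    }


tool = command_line_tool("icd10", "Identify COVID-19 (ICD-10)", "kclhi/python:latest")
tool_text = json.dumps(tool, indent=2)
print(tool_text)
```

Swapping the `docker_image` argument is the only change needed to run a unit that depends on a different language or package set.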
  • 31. Parsing: Provenance (Step 5) Our architecture includes a Data Provenance Template server, a piece of software that holds structured fragments of provenance. These fragments record the evolution of definitions within Phenoflow as they are edited by users. The server is designed to complement CWLProv, which records workflow execution. [Figure: provenance template for a definition update (zone:id update). Nodes: var:updated (prov:type phenoflow#Updated, prov:end vvar:time); var:author (prov:type phenoflow#Author); var:phenotypeBefore (prov:type phenoflow#Phenotype); var:phenotypeAfter (prov:type phenoflow#Phenotype, phenoflow:name vvar:name, phenoflow:description vvar:description); var:step (prov:type phenoflow#Step, phenoflow:stepName vvar:stepName, phenoflow:doc vvar:doc, phenoflow:type vvar:type, phenoflow:position vvar:position, phenoflow:coding vvar:coding). Relations: used, wasAssociatedWith, wasGeneratedBy.] Fairweather, Elliot, et al. “A delayed instantiation approach to template-driven provenance for electronic health record phenotyping”. IPAW, 2020.
  • 33. Phenoflow library At the end of the parsing process, we effectively have a set of database entries (and a set of implementation units) containing the information required to generate a workflow. Together, these form the library.
  • 34. Phenoflow library: Additional implementation units After an initial import, other users can add further implementation units, which makes it possible to customise the workflow before downloading it. Once a permutation of implementation units is selected, a user can generate a CWL workflow for that permutation on the fly.
  • 36. Generation When a user clicks download:
      1. Pass information related to the chosen workflow permutation to the generator, and receive CWL files back in response.
      2. Create a version of this workflow in a local Git server.
      3. Pass a link to this versioned code to the visualiser and receive a graphic back.
      4. Combine the CWL files, implementation units and visualised workflow (to increase intelligibility) into a zip and push it to the user for download.
      5. The user sets configuration details within the implementation units (e.g. database credentials) and then locally executes the workflow against their target datasource.
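Step 4 above, bundling everything into a single download, might look something like the following. The `bundle` helper, the file names and the file contents are all hypothetical; the sketch only shows the in-memory zip technique, not Phenoflow's own packaging code.

```python
import io
import zipfile


def bundle(files):
    """Pack {archive path: text content} into an in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Hypothetical file names and contents for one workflow permutation.
archive_bytes = bundle({
    "covid.cwl": "cwlVersion: v1.0\nclass: Workflow\n",
    "icd10.cwl": "cwlVersion: v1.0\nclass: CommandLineTool\n",
    "python/icd10.py": "codes = []\n",
    "abstract.svg": "<svg xmlns='http://www.w3.org/2000/svg'/>",
})

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = archive.namelist()
print(names)
```

Building the archive in memory avoids writing temporary files on the server before the response is sent.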
  • 37. Generation: On-the-fly workflow generation (Step 1) We created a lightweight service wrapper around the CWL generator so that it can be called in real time and generate a workflow from the information stored during the parsing process.
      @app.route('/generate', methods=['POST'])
      async def generate(request):
          try:
              steps = await request.json()
          except:
              steps = None
          if steps:
              generatedWorkflow = generateWorkflow(steps)
              return JSONResponse({'workflow': yaml.dump(generatedWorkflow['workflow'] ... )})
          ...
      ...
    Source: https://github.com/kclhi/phenoflow/tree/master/generator
  • 39. Impact The use of CWL in this way has already had some impact:
      1. We are connected to the HDRUK phenotype library (https://phenotypes.healthdatagateway.org/), and automatically provide implementations for their 1000+ phenotype definitions.
      2. We are actively working with and/or in conversation with several sites in the US to represent their definitions.
      3. Phenoflow has been used to represent some recent complex phenotypes, e.g. Long Covid (Mayor, Nikhil, et al. “Developing a Long COVID Phenotype for Postacute COVID-19 in a National Primary Care Sentinel Cohort: Observational Retrospective Database Analysis”. JMIR, 2022.).
    More to do! We are always looking for new phenotype definitions to increase the sophistication of our parsing process.
  • 40. Things we could probably do better i 1. Generation overhead It was first believed that the ‘on the fly’ style of generation used in Phenoflow was required because of the number of possible permutations of implementation units that could be selected. In reality, we have determined that the overhead of generating the corresponding CWL for these permutations in advance is smaller than the delay that on-the-fly generation imposes on a user.
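To give a feel for the permutation space involved, the sketch below enumerates every combination of implementation units for a small workflow. The step names and available languages are hypothetical, and `permutations_of_units` is an illustrative helper rather than part of Phenoflow.

```python
from itertools import product

# Hypothetical steps, each with the implementation units (by language)
# that authors have contributed for it.
step_units = {
    "read-potential-cases": ["python", "knime"],
    "icd10": ["python", "js"],
    "output-cases": ["python"],
}


def permutations_of_units(step_units):
    """Return every way of choosing one implementation unit per step."""
    names = list(step_units)
    choices = product(*(step_units[name] for name in names))
    return [dict(zip(names, choice)) for choice in choices]


all_permutations = permutations_of_units(step_units)
print(len(all_permutations), "workflows to pre-generate")
```

The count is the product of the number of units per step, so it grows quickly as units are added, but each workflow only needs regenerating when a definition is parsed or edited.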
  • 41. Things we could probably do better ii As such, we are now shifting our architecture to instead use GitHub as a store for pre-generated workflows produced as part of the parsing (or editing) process. [Figure: revised architecture diagram connecting Author(s) and User to the API, Generator, Visualiser and GitHub; labels include “index workflows” and “link to workflow + implementation units and visualisation”.] We hope to progress a fork of CWL Viewer that effectively acts as the web portal by visualising (and indexing) the available Git repositories.
  • 42. Things we could probably do better iii 2. Generator version We are using the original CWL generator (python-cwlgen), but should now be using cwl-utils instead.
  • 43. Things we could probably do better iv 3. Branch handling As part of our parsing process, we ‘flatten’ branches into individual steps if they are simple, and into entire nested workflows if they are more complex. Each branch evaluates to a boolean value, representing whether the logic it contains suggests that a patient has the condition. Then, much like the simpler examples we’ve seen, if any of the steps return true, the patient is deemed to have the condition. There may well be a more sophisticated way to do this in CWL.
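The ‘any branch is true’ logic described above can be sketched in a few lines. The branch functions, patient fields and code values are hypothetical (the codes echo the toy values used earlier in the deck), and the nested logic of the complex branch is an assumption made purely for illustration.

```python
def has_icd10_code(patient):
    """Simple branch, flattened into a single step."""
    return "ICD-C19" in patient.get("codes", [])


def has_snomed_code(patient):
    """Simple branch, flattened into a single step."""
    return "SNOMED-C19" in patient.get("codes", [])


def positive_test_with_symptoms(patient):
    """Complex branch standing in for a nested workflow (assumed logic)."""
    return bool(patient.get("positive_test")) and "cough" in patient.get("symptoms", [])


BRANCHES = [has_icd10_code, has_snomed_code, positive_test_with_symptoms]


def has_condition(patient, branches=BRANCHES):
    """A patient has the condition if any branch evaluates to True."""
    return any(branch(patient) for branch in branches)


patients = {
    "001": {"codes": ["ICD-C19"]},
    "002": {"codes": ["COPD"]},
    "003": {"codes": [], "positive_test": True, "symptoms": ["cough"]},
}
results = {patient_id: has_condition(p) for patient_id, p in patients.items()}
print(results)
```

Because every branch reduces to a boolean, simple steps and nested workflows can be combined by the same final disjunction.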
  • 44. Things we could probably do better v 4. The CWL itself!
      $namespaces:
        s: http://phenomics.kcl.ac.uk/phenoflow/
      baseCommand: python
      class: CommandLineTool
      cwlVersion: v1.0
      doc: Identify COVID-19 (ICD-10)
      id: icd10
      inputs:
      - doc: Python implementation unit
        id: inputModule
        inputBinding:
          position: 1
        type: File
      - doc: Potential cases of covid-19.
        id: potentialCases
        inputBinding:
          position: 2
        type: File
      outputs:
      - doc: Patients with ICD-10 COVID-19 codes
        id: output
        outputBinding:
          glob: '*.csv'
        type: File
      requirements:
        DockerRequirement:
          dockerPull: kclhi/python:latest
      s:type: logic
  • 45. Summary
      • We standardise existing phenotype definitions under a CWL-based model.
      • These standardised definitions are presented to users as part of the Phenoflow library.
      • CWL files themselves are generated in real time when a user downloads a given definition from the library.
    Thank you! Things like CWL’s Docker integration and the generation and visualisation tools have been invaluable.
  • 46. Links Links given throughout the presentation:
      Live: https://kclhi.org/phenoflow
      Source: https://github.com/kclhi/phenoflow
      Wiki: https://github.com/kclhi/phenoflow/wiki