1. Using computable phenotypes in point-of-care clinical trial recruitment
Martin Chapman, Jesús Domínguez, Elliot Fairweather, Brendan C. Delaney and Vasa Curcin
King’s College London
Imperial College
2. Definition: EHR-based phenotype definition i
An electronic health record (EHR)-based phenotype definition is an abstract outline of the
functionality required to extract a cohort of patients from a set of health records (e.g. a data
flow diagram, a code list, etc.).
Each definition is realised as a computable phenotype for a given dataset (e.g. an SQL
script, Python code, etc.).
3. Definition: EHR-based phenotype definition ii
A Type 2 Diabetes (T2DM) phenotype:
SELECT UserID, COUNT(DISTINCT AbnormalLab) AS abnormal_lab
FROM Patients
GROUP BY UserID
HAVING abnormal_lab > 0;
...
[Figure: a phenotype definition alongside its computable forms.]
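As a minimal sketch of one computable form, the query above can be run against an in-memory SQLite database using Python's standard library. The table contents are invented for illustration, and the sketch assumes that a NULL AbnormalLab means no abnormal result was recorded. The HAVING clause repeats the aggregate instead of the alias so that it does not depend on dialect-specific alias resolution.

```python
import sqlite3

# Hypothetical rows: (UserID, AbnormalLab). None means no abnormal result.
ROWS = [
    (1, "HbA1c"),
    (1, "fasting glucose"),
    (2, None),
    (3, "HbA1c"),
]

QUERY = """
SELECT UserID, COUNT(DISTINCT AbnormalLab) AS abnormal_lab
FROM Patients
GROUP BY UserID
HAVING COUNT(DISTINCT AbnormalLab) > 0
ORDER BY UserID;
"""


def t2dm_cohort(rows):
    """Return (UserID, abnormal_lab) for patients with at least one abnormal lab."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Patients (UserID INTEGER, AbnormalLab TEXT)")
        conn.executemany("INSERT INTO Patients VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


cohort = t2dm_cohort(ROWS)
print(cohort)
```

Patient 2 is excluded because COUNT(DISTINCT ...) ignores NULL values, so their count is zero.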
4. Definition: Clinical trial eligibility criteria
Eligibility criteria express the characteristics that a patient should have in order to be
considered for (or excluded from) participation in a clinical trial.
• In point-of-care clinical trial recruitment, patient eligibility is determined upon
presentation, typically during an appointment with a general practitioner.
5. Using computable phenotypes to determine trial eligibility
Computable phenotypes and eligibility criteria are both used to determine patient cohorts of
interest.
With large repositories of definitions emerging, it is therefore natural to consider the use of
phenotypes as the basis for determining trial eligibility.
However, for computable phenotypes to be suitable for this purpose, they must be:
• Portable, such that they can be used to identify similar condition cohorts for future trials.
• Compatible with existing trial infrastructure.
• Accurate, such that they effectively identify trial participants.
6. Defining an AOMd phenotype i
To explore these requirements, we focussed on a prior clinical trial, REST [1], which was
designed to compare treatments for acute otitis media with discharge (AOMd) in children.
In the original trial, criteria for AOMd patients – such as date of birth – were expressed using a
subset of openEHR-based archetypes.
<Archetype><![CDATA[ archetype (adl_version=1.4) concept [at0000]
definition EVENT[at0000] matches {
  attributes cardinality matches {1..1; ordered} matches {
    ATTRIBUTE[at0001] matches { value matches {|2003-12-01..2018-12-01|} } } }
ontology terminologies_available = <"CDIM", ...>
term_definitions = < ["en"] = < items = <
  ["at0000"] = < text = <"date of birth"> type = <"Event"> >
  ["at0001"] = < text = <"date of birth value"> type = <"Date"> > > > >
term_binding = < ["CDIM"] = < items = <
  ["at0001"] = <[CDIM::CDIM_000007]> > > > ]]></Archetype>
[1] https://www.isrctn.com/ISRCTN12873692
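The date-of-birth constraint in the archetype can be read as a closed date interval. A minimal sketch of the same check in Python follows; the function and constant names are hypothetical, and only the two bounds are taken from the archetype.

```python
from datetime import date

# Bounds copied from the archetype's value constraint (inclusive on both ends).
DOB_EARLIEST = date(2003, 12, 1)
DOB_LATEST = date(2018, 12, 1)


def satisfies_dob_criterion(date_of_birth: date) -> bool:
    """True if the date of birth lies within the archetype's interval."""
    return DOB_EARLIEST <= date_of_birth <= DOB_LATEST


print(satisfies_dob_criterion(date(2012, 5, 4)))
print(satisfies_dob_criterion(date(2001, 1, 1)))
```

A static interval like this has to be edited by hand as the trial progresses, which is one reason a phenotype that computes age at presentation may behave differently.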
7. Defining an AOMd phenotype ii
We constructed a phenotype definition with equivalent functionality using the Phenoflow
platform.
[Figure: step 2 of the Phenoflow definition, annotated with its layers (Abstract, Functional, Computational) and Implementation Units.]
Step: 2 - read codes (logic step). Determine whether the patient is annotated with a code indicating they are eligible for the trial cohort.
Input: otitis cohort. Potential cases of AOMd; correct age.
Output: otitis cases. Patients with AOMd.
Format: csv
Implementation unit: codes.py (python)
with urlopen("https://uts-ws.nlm.nih.gov/rest/search/current?string=otitis&sabs=RCD") as umlsCodes:
...
https://kclhi.org/phenoflow
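To illustrate what a step like "read codes" might do, the sketch below filters a CSV cohort down to patients carrying at least one eligible code. It is not the contents of codes.py: the column names, the semicolon-separated code field and the code set are all assumptions, and the code set is hard-coded here whereas the original step retrieves codes from a terminology service.

```python
import csv
import io

# Illustrative only: not the code list used in the trial.
ELIGIBLE_CODES = {"F520.", "F527."}


def read_codes_step(cohort_csv: str, eligible_codes=ELIGIBLE_CODES) -> str:
    """Filter an 'otitis cohort' CSV down to 'otitis cases'.

    Assumes hypothetical columns: patient_id, codes (semicolon-separated).
    """
    reader = csv.DictReader(io.StringIO(cohort_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        patient_codes = {c.strip() for c in row["codes"].split(";") if c.strip()}
        # Keep the patient if any of their codes is in the eligible set.
        if patient_codes & eligible_codes:
            writer.writerow(row)
    return out.getvalue()


cohort = "patient_id,codes\np1,F520.;X0001\np2,X0001\np3,F527.\n"
cases = read_codes_step(cohort)
print(cases)
```

Taking CSV in and producing CSV out mirrors the input and output format named in the step, so steps of this kind can be chained.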
8. TRANSFoRm
Recruitment in REST was handled using the TRANSFoRm e-source trial platform.
In the original trial, archetype-based criteria were translated to concrete implementations
(e.g. XPath queries) by TRANSFoRm in order to determine a patient’s eligibility from their
EHR.
We developed a new service (PhEM) that instead enables the execution of a computable
phenotype against an EHR, while retaining the knowledge of record structure held previously.
[Figure: TRANSFoRm architecture, showing the study system, archetype translation, the Data Node Connector (EHR integration, archetype execution), the GP, the patient record, and the phenotype execution microservice (PhEM).]
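The idea of executing a phenotype while reusing existing knowledge of record structure can be sketched as follows. This does not reflect PhEM's actual interface: the path mapping, the record layout, the function names and the example phenotype (including its code check) are all hypothetical.

```python
from datetime import date
from typing import Any, Callable, Dict, Tuple

# Hypothetical record-structure knowledge: phenotype input name -> path in the EHR.
RECORD_PATHS: Dict[str, Tuple[str, ...]] = {
    "date_of_birth": ("demographics", "dob"),
    "codes": ("encounter", "codes"),
}


def extract(record: Dict[str, Any], path: Tuple[str, ...]) -> Any:
    """Walk a nested record along the given path."""
    value: Any = record
    for key in path:
        value = value[key]
    return value


def execute_phenotype(record, phenotype: Callable[..., bool], paths=RECORD_PATHS) -> bool:
    """Pull the phenotype's inputs out of the record, then run the phenotype."""
    inputs = {name: extract(record, path) for name, path in paths.items()}
    return phenotype(**inputs)


def aomd_phenotype(date_of_birth: date, codes) -> bool:
    """Illustrative phenotype: a date-of-birth range plus an illustrative code check."""
    in_range = date(2003, 12, 1) <= date_of_birth <= date(2018, 12, 1)
    return in_range and "F520." in codes


record = {"demographics": {"dob": date(2012, 5, 4)}, "encounter": {"codes": ["F520."]}}
print(execute_phenotype(record, aomd_phenotype))
```

Keeping the path mapping separate from the phenotype means the same phenotype could, in principle, be run against a differently structured record by swapping only the mapping.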
9. Experiments
To compare the use of phenotypes with the original archetype-based criteria, we simulated the
recruitment of patients as a part of the REST trial using both configurations of
TRANSFoRm.
• Properties of the live trial were used to inform the simulation, including population size
and the types of AOMd present in the population.
• A population of 10258 patients was generated, exhibiting four forms of AOMd (plus one
form as a control) under real-world distributions (provided by Synthea [1]).
[1] https://github.com/synthetichealth/synthea
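One way to compare the two configurations in a simulation like this is to score each set of recruited patients against the set known to be eligible. The slides do not define how accuracy was measured, so the recall and precision measures below are an assumption, and the patient identifiers are invented.

```python
def recruitment_scores(eligible: set, recruited: set) -> dict:
    """Score a recruitment run against the known-eligible population.

    recall: share of eligible patients who were recruited.
    precision: share of recruited patients who were eligible.
    An empty denominator is scored as 1.0 (nothing to miss or to get wrong).
    """
    true_positives = len(eligible & recruited)
    return {
        "recall": true_positives / len(eligible) if eligible else 1.0,
        "precision": true_positives / len(recruited) if recruited else 1.0,
    }


# Invented example data.
eligible = {"p1", "p2", "p3", "p4"}
by_archetype = {"p1", "p2", "p5"}
by_phenotype = {"p1", "p2", "p3"}

print(recruitment_scores(eligible, by_archetype))
print(recruitment_scores(eligible, by_phenotype))
```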
10. Results
1. The use of phenotypes as eligibility criteria offers at least the same recruitment
accuracy as the use of archetypes.
2. In certain situations, improved accuracy is also shown:
[Figure: population (0 to 20) by age (1 to 17), with series Total, Eligible, Recruited by archetype and Recruited by PhEM. Caption: Leveraging an understanding of age to more accurately flag eligibility.]
[Figure: population (0 to 40) by code (F527., F520., F510.., Y20ff, F528., Other), with series Total, Eligible, Recruited by archetype and Recruited by PhEM. Caption: Leveraging knowledge of new condition codes to more accurately flag eligibility.]
11. Conclusion
For computable phenotypes to be used to determine trial eligibility, our requirements were that
they should be:
• Portable, such that they can be used to identify similar condition cohorts for future trials.
  • Current phenotype standards, such as Phenoflow, prioritise portability by design.
  • Phenoflow also enables technical portability, which is important in point-of-care
    recruitment.
• Compatible with existing trial infrastructure.
  • Our contribution of PhEM enables phenotype definitions to be plugged into TRANSFoRm,
    also supporting future trials.
  • This service, or the microservice structure adopted, can be used to integrate phenotypes
    with other platforms.
• Accurate, such that they effectively identify trial participants.
  • The use of phenotypes is shown to recruit at the same level of accuracy as established
    criteria formalisms, and more accurately in certain situations.
  • This is promising for other trial environments.
14. Discussion
• There is an execution time cost associated with executing a phenotype instead of using
a static archetype definition (22.22 (20.15; 24.30) seconds vs. 12.37 (11.74; 12.49)
seconds).
• Archetype definitions can offer comparable accuracy, but this requires expert, human
intervention.
• Utility is predicated on the continued development of high-quality phenotype definitions.