A complex ADaM dataset - three different ways to create one

A Complex ADaM dataset?
Three different ways to create one.

Disclaimer
Any views or opinions presented in this
presentation are solely those of the author
and do not necessarily represent those of the
company.

11/27/2013

Cytel Inc.


Agenda
• Introduction of ADaM dataset
• Three methods for a complex ADaM dataset
• Example
• Benefits of each method
• Limitation of each method
• Consideration
• Conclusion
• Questions & Answers

Introduction of ADaM
• ADaM (Analysis Data Model) is the CDISC standard for
analysis datasets.
• Purpose
• Analysis Ready (statistical analysis to be performed with
minimal programming)
• Traceability

• Type
• ADSL (Subject Level Analysis Dataset)
• BDS (Basic Data Structure)
− Special BDS (upcoming)
• ADTTE (Time to Event Analysis Dataset)
• ADAE (Adverse Event Analysis Dataset, upcoming)

A complex ADaM dataset
• Can require several algorithms
• Can require several data manipulation steps
• Can be derived from more than one SDTM
• Can be difficult to trace back
• Can be difficult to validate


Three Methods to create a complex
ADaM dataset

1. SDTM datasets to ADaM datasets
2. SDTM datasets through the intermediate
permanent datasets to final ADaM datasets
3. SDTM datasets through the intermediate ADaM
datasets to final ADaM datasets


Three Methods Diagram
Method 1: SDTM → ADaM
Method 2: SDTM → intermediate permanent datasets (SDTM+ → ADaM+) → ADaM
Method 3: SDTM → ADaM (intermediate) → ADaM (final)


Example 1
• A comparison of the average daily drinking rate in
the treatment period between placebo and study
drug.
• Baseline period: the average daily drinking
rate during the 21 days from the hospitalization date
• Treatment period: the average daily
drinking rate during the 42 days from the first
study dose.
• Baseline rate imputation is applied to the following:
− Subjects who discontinued early
− Any missing assessment


Key components in the example
• SDTM – SU (Substance Use)
• Final ADaM – ADDR (Drinking Rate Analysis
Dataset)
• Parameter – ADDRATE (Average Daily Drinking
Rate)


Algorithm of parameter of ADDRATE
• Rb (Baseline rate) = sum of all doses / number
of days with drinking data available in the baseline
period
• Ra (Actual treatment rate) = sum of all doses /
number of days with drinking data available in the
treatment period
• Rt (Imputed treatment rate) =
( Ra * DAYS + Rb * (42 - DAYS) ) / 42
where DAYS is the number of days with drinking data
available in the treatment period
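
The three rates above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the SAS code behind the deck; the function names, the `period_length` parameter, and the input values are assumptions made for the example.

```python
def average_daily_rate(total_dose, days_with_data):
    """Rb or Ra: sum of all doses divided by the number of days with drinking data."""
    if days_with_data <= 0:
        raise ValueError("at least one day with drinking data is required")
    return total_dose / days_with_data


def imputed_treatment_rate(ra, rb, days, period_length=42):
    """Rt: treatment-period days without data are filled in with the baseline rate."""
    if not 0 <= days <= period_length:
        raise ValueError("days must lie between 0 and period_length")
    return (ra * days + rb * (period_length - days)) / period_length


# Hypothetical subject, for illustration only (not taken from the deck):
# 60 drinks over 20 baseline days, 40 drinks over 40 treatment days.
rb = average_daily_rate(total_dose=60.0, days_with_data=20)
ra = average_daily_rate(total_dose=40.0, days_with_data=40)
rt = imputed_treatment_rate(ra, rb, days=40)
print(f"Rb={rb:.2f} Ra={ra:.2f} Rt={rt:.2f}")
```

When DAYS equals 42, the baseline term drops out and Rt equals Ra, so subjects with complete treatment data are not affected by the imputation.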

Three Methods for example

Method 1: SDTM(SU) → ADaM(ADDR)
Method 2: SDTM(SU) → SDTM+(_SU) → ADaM+(_ADDR) → ADaM(ADDR)
Method 3: SDTM(SU) → ADaM(ADSU) → ADaM(ADDR)


SDTM SU dataset
USUBJID    | SUSEQ | SUTRT   | SUSTAT   | SUDOSE | SUDOSU | SUSTDTC    | SUSTDY | VISIT
001-01-001 | 1     | ALCOHOL |          | 0      | DRINKS | 2011-02-08 | -21    | Screening
001-01-001 | 2     | ALCOHOL | NOT DONE |        | DRINKS | 2011-02-09 | -20    | Screening
001-01-001 | 3     | ALCOHOL |          | 5      | DRINKS | 2011-02-10 | -19    | Screening
….
001-01-001 | 21    | ALCOHOL |          | 0      | DRINKS | 2011-02-28 | -1     | Screening
001-01-001 | 22    | ALCOHOL |          | 0      | DRINKS | 2011-03-01 | 1      | Visit 1
001-01-001 | 23    | ALCOHOL | NOT DONE |        | DRINKS | 2011-03-02 | 2      | Visit 1
001-01-001 | 24    | ALCOHOL |          | 0      | DRINKS | 2011-03-03 | 3      | Visit 1
001-01-001 | 25    | ALCOHOL |          | 2      | DRINKS | 2011-03-04 | 4      | Visit 1
001-01-001 | 26    | ALCOHOL | NOT DONE |        | DRINKS | 2011-03-05 | 5      | Visit 1
….
001-01-001 | 58    | ALCOHOL | NOT DONE |        | DRINKS | 2011-04-06 | 37     | Visit 3
001-01-001 | 59    | ALCOHOL |          | 4      | DRINKS | 2011-04-07 | 38     | Visit 3
001-01-001 | 60    | ALCOHOL |          | 0      | DRINKS | 2011-04-08 | 39     | Visit 3
001-01-001 | 61    | ALCOHOL |          | 2      | DRINKS | 2011-04-09 | 40     | Visit 3
001-01-001 | 62    | ALCOHOL |          | 1      | DRINKS | 2011-04-10 | 41     | Visit 3
001-01-001 | 63    | ALCOHOL |          | 4      | DRINKS | 2011-04-11 | 42     | Visit 3


Analysis Dataset Metadata for ADDR
Dataset Name: ADDR
Dataset Description: Drinking Rate Analysis Data
Dataset Location: addr.xpt
Dataset Structure: one record per subject per parameter per analysis visit
Key Variables of Dataset: USUBJID, PARAMCD, AVISITN
Class of Dataset: BDS
Documentation: c-addr.txt

Analysis Variable Metadata including Analysis
Parameter value level Metadata for ADDR (1)
Dataset Name | Parameter Identifier | Variable Name | Variable Label | Variable Type | Display Format | Codelist / Controlled Terms | Source / Derivation
ADDR | *ALL* | USUBJID | Unique Subject Identifier | text | $20 | | ADSL.USUBJID
ADDR | *ALL* | SITEID | Site ID | text | $20 | | ADSL.SITEID
ADDR | *ALL* | SEX | Sex | text | $20 | M, F | ADSL.SEX
ADDR | *ALL* | FASFL | Full Analysis Set Population Flag | text | $1 | Y, N | ADSL.FASFL
ADDR | *ALL* | TRTPN | Planned Treatment (N) | integer | 1.0 | 1 = Placebo, 2 = Study Drug | ADSL.TRTPN
ADDR | *ALL* | TRTP | Planned Treatment | text | $20 | Placebo, Study Drug | ADSL.TRTP
ADDR | PARAMCD | PARAMCD | Parameter Code | text | $8 | ADDRATE |
ADDR | *ALL* | PARAM | Parameter | text | $50 | Average Daily Drinking Rate |

Analysis Variable Metadata including Analysis
Parameter value level Metadata for ADDR (2)
Dataset Name | Parameter Identifier | Variable Name | Variable Label | Variable Type | Display Format | Codelist / Controlled Terms | Source / Derivation
ADDR | *ALL* | PARAMTYP | Parameter Type | text | $20 | DERIVED |
ADDR | *ALL* | AVISITN | Analysis Visit (N) | integer | 3.0 | 1 = Baseline, 2 = Treatment Period |
ADDR | *ALL* | AVISIT | Analysis Visit | text | $20 | Baseline, Treatment Period | 'Baseline' when SU.VISIT = 'Screening'; 'Treatment Period' when SU.VISIT in ('VISIT 1', 'VISIT 2', 'VISIT 3')
ADDR | *ALL* | AVAL | Analysis Value | float | 8.2 | | Average Daily Drinking Rate within analysis visit. At Treatment Period, if a patient discontinues early or has missing records, impute with the baseline rate

Analysis Variable Metadata including Analysis
Parameter value level Metadata for ADDR (3)
Dataset Name | Parameter Identifier | Variable Name | Variable Label | Variable Type | Display Format | Codelist / Controlled Terms | Source / Derivation
ADDR | *ALL* | ABLFL | Baseline Record Flag | text | $1 | Y | 'Y' when AVISIT = "Baseline"
ADDR | *ALL* | BASE | Baseline Value | float | 8.2 | | AVAL of AVISIT = "Baseline"
ADDR | *ALL* | CHG | Change from Baseline | float | 8.2 | | AVAL - BASE

1st method : SDTM to ADaM

SDTM(SU) → ADaM(ADDR)

Final ADaM dataset of ADDR
USUBJID    | FASFL | TRTP       | PARAMCD | PARAM                       | AVISIT           | ABLFL | AVAL | BASE | CHG
001-01-001 | Y     | Study Drug | ADDRATE | Average Daily Drinking Rate | Baseline         | Y     | 4.40 |      |
001-01-001 | Y     | Study Drug | ADDRATE | Average Daily Drinking Rate | Treatment Period |       | 2.72 | 4.40 | -1.68
001-01-002 | Y     | Placebo    | ADDRATE | Average Daily Drinking Rate | Baseline         | Y     | 4.26 |      |
001-01-002 | Y     | Placebo    | ADDRATE | Average Daily Drinking Rate | Treatment Period |       | 3.10 | 4.26 | -1.16


Key points to note:
• Row 2: There are 3 missing assessments during the
treatment period for subject 001-01-001, so the baseline rate
imputation method was applied as follows:
(2.60*39 + 4.40*(42-39)) / 42 = 2.72
• Row 4: There are no missing assessments during the
treatment period for subject 001-01-002

2nd method : SDTM to intermediate
permanent datasets to ADaM

SDTM(SU) → SDTM+(_SU) → ADaM+(_ADDR) → ADaM(ADDR)

Intermediate permanent datasets of
SDTM plus _SU (1)
USUBJID    | SUSEQ | SUTRT   | SUSTAT   | SUDOSE | SUDOSU | SUSTDTC    | SUSTDY | VISIT     | _HOSEQ | _SDSEQ
001-01-001 | 1     | ALCOHOL |          | 0      | DRINKS | 2011-02-08 | -21    | Screening | 1      |
001-01-001 | 2     | ALCOHOL | NOT DONE |        | DRINKS | 2011-02-09 | -20    | Screening |        |
001-01-001 | 3     | ALCOHOL |          | 5      | DRINKS | 2011-02-10 | -19    | Screening | 2      |
….
001-01-001 | 21    | ALCOHOL |          | 0      | DRINKS | 2011-02-28 | -1     | Screening | 19     |
001-01-001 | 22    | ALCOHOL |          | 0      | DRINKS | 2011-03-01 | 1      | Visit 1   |        | 1
001-01-001 | 23    | ALCOHOL | NOT DONE |        | DRINKS | 2011-03-02 | 2      | Visit 1   |        |
001-01-001 | 24    | ALCOHOL |          | 0      | DRINKS | 2011-03-03 | 3      | Visit 1   |        | 2
001-01-001 | 25    | ALCOHOL |          | 2      | DRINKS | 2011-03-04 | 4      | Visit 1   |        | 3
001-01-001 | 26    | ALCOHOL | NOT DONE |        | DRINKS | 2011-03-05 | 5      | Visit 1   |        |
….
001-01-001 | 58    | ALCOHOL | NOT DONE |        | DRINKS | 2011-04-06 | 37     | Visit 3   |        |
001-01-001 | 59    | ALCOHOL |          | 4      | DRINKS | 2011-04-07 | 38     | Visit 3   |        | 35
001-01-001 | 60    | ALCOHOL |          | 0      | DRINKS | 2011-04-08 | 39     | Visit 3   |        | 36
001-01-001 | 61    | ALCOHOL |          | 2      | DRINKS | 2011-04-09 | 40     | Visit 3   |        | 37
001-01-001 | 62    | ALCOHOL |          | 1      | DRINKS | 2011-04-10 | 41     | Visit 3   |        | 38
001-01-001 | 63    | ALCOHOL |          | 4      | DRINKS | 2011-04-11 | 42     | Visit 3   |        | 39


Intermediate permanent datasets of
SDTM plus _SU (2)

• _HOSEQ is the sequence number of non‐
missing drinking assessment from the
hospitalization date (2011‐02‐08)
• _SDSEQ is the sequence number of non‐
missing drinking assessment from the first
dose date (2011‐03‐01)
• When SUSTAT = 'NOT DONE', _HOSEQ and _SDSEQ are left
missing and the counters are not incremented.
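
The two counters can be illustrated with a short Python sketch. It is not the SAS program behind the deck: the function name is hypothetical, and splitting the two periods on the sign of SUSTDY is an assumption based on the table, where _HOSEQ appears only on pre-dose records and _SDSEQ only from the first dose date onward.

```python
def add_sequence_numbers(records):
    """Return copies of SU-like records with _HOSEQ and _SDSEQ added.

    Records must already be sorted by date within one subject.
    Assumption: SUSTDY < 1 marks the baseline (hospitalization) period.
    """
    ho_count = 0
    sd_count = 0
    out = []
    for rec in records:
        row = dict(rec, _HOSEQ=None, _SDSEQ=None)
        if rec.get("SUSTAT") != "NOT DONE":
            if rec["SUSTDY"] < 1:
                ho_count += 1
                row["_HOSEQ"] = ho_count
            else:
                sd_count += 1
                row["_SDSEQ"] = sd_count
        out.append(row)
    return out


# The first records of each period for subject 001-01-001, as listed in the SU table.
su = [
    {"SUSEQ": 1, "SUSTAT": None, "SUDOSE": 0, "SUSTDY": -21},
    {"SUSEQ": 2, "SUSTAT": "NOT DONE", "SUDOSE": None, "SUDOSE_UNIT": "DRINKS", "SUSTDY": -20},
    {"SUSEQ": 3, "SUSTAT": None, "SUDOSE": 5, "SUSTDY": -19},
    {"SUSEQ": 22, "SUSTAT": None, "SUDOSE": 0, "SUSTDY": 1},
    {"SUSEQ": 23, "SUSTAT": "NOT DONE", "SUDOSE": None, "SUSTDY": 2},
    {"SUSEQ": 24, "SUSTAT": None, "SUDOSE": 0, "SUSTDY": 3},
]
plus = add_sequence_numbers(su)
for row in plus:
    print(row["SUSEQ"], row["_HOSEQ"], row["_SDSEQ"])
```

Because the input list keeps one output row per input row, the sketch also respects the rule that a plus dataset has the same number of records as its source.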


Intermediate permanent dataset – ADaM
plus _ADDR (1)
USUBJID    | TRTP       | PARAM                       | AVISIT           | ABLFL | AVAL | BASE | CHG   | _TOTAL | _DAYS | _AVAL
001-01-001 | Study Drug | Average Daily Drinking Rate | Baseline         | Y     | 4.40 |      |       | 83.6   | 19    | 4.40
001-01-001 | Study Drug | Average Daily Drinking Rate | Treatment Period |       | 2.72 | 4.40 | -1.68 | 101.2  | 39    | 2.60
001-01-002 | Placebo    | Average Daily Drinking Rate | Baseline         | Y     | 4.26 |      |       | 89.4   | 21    | 4.26
001-01-002 | Placebo    | Average Daily Drinking Rate | Treatment Period |       | 3.10 | 4.26 | -1.16 | 130.2  | 42    | 3.10


Plus variables
• _TOTAL (Sum of doses per analysis visit) = sum(SUDOSE)
• _DAYS (Number of non-missing drinking days per analysis visit) =
count of records with missing SUSTAT, or last._HOSEQ / last._SDSEQ within
AVISIT
• _AVAL (Actual treatment rate) = _TOTAL / _DAYS
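
As a rough illustration of how the three plus variables relate, the sketch below aggregates daily records by AVISIT in Python. The function name and the sample records are hypothetical; the deck's own implementation is a SAS program that is not shown.

```python
from collections import defaultdict


def summarize_by_avisit(records):
    """Return {AVISIT: {_TOTAL, _DAYS, _AVAL}} from daily SU-like records."""
    totals = defaultdict(float)
    days = defaultdict(int)
    for rec in records:
        if rec.get("SUSTAT") == "NOT DONE":
            continue  # missing assessments add neither dose nor day
        totals[rec["AVISIT"]] += rec["SUDOSE"]
        days[rec["AVISIT"]] += 1
    return {
        avisit: {
            "_TOTAL": totals[avisit],
            "_DAYS": days[avisit],
            "_AVAL": totals[avisit] / days[avisit],
        }
        for avisit in totals
    }


# Hypothetical daily records, for illustration only.
sample = [
    {"AVISIT": "Baseline", "SUSTAT": None, "SUDOSE": 4},
    {"AVISIT": "Baseline", "SUSTAT": "NOT DONE", "SUDOSE": None},
    {"AVISIT": "Baseline", "SUSTAT": None, "SUDOSE": 2},
    {"AVISIT": "Treatment Period", "SUSTAT": None, "SUDOSE": 1},
]
summary = summarize_by_avisit(sample)
print(summary["Baseline"])
```

Keeping _TOTAL and _DAYS next to _AVAL is what lets a validator recompute each rate from the intermediate dataset without rerunning the full derivation.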

Intermediate permanent dataset – ADaM
plus _ADDR (3)
USUBJID    | TRTP       | PARAM                       | AVISIT           | ABLFL | AVAL | BASE | CHG   | _TOTAL | _DAYS | _AVAL
001-01-001 | Study Drug | Average Daily Drinking Rate | Baseline         | Y     | 4.40 |      |       | 83.6   | 19    | 4.40
001-01-001 | Study Drug | Average Daily Drinking Rate | Treatment Period |       | 2.72 | 4.40 | -1.68 | 101.2  | 39    | 2.60
001-01-002 | Placebo    | Average Daily Drinking Rate | Baseline         | Y     | 4.26 |      |       | 89.4   | 21    | 4.26
001-01-002 | Placebo    | Average Daily Drinking Rate | Treatment Period |       | 3.10 | 4.26 | -1.16 | 130.2  | 42    | 3.10


Key points to note:
• Row 2 and 4: at the treatment period, the AVAL algorithm is
(_AVAL * _DAYS + BASE * (42 - _DAYS) ) / 42
• Row 2:
(2.60*39 + 4.40*(42-39)) / 42 = 2.72
• Row 4:
(3.10*42 + 4.26*(42-42)) / 42 = 3.10

3rd method: SDTM to intermediate ADaM
to ADaM

SDTM(SU) → ADaM(ADSU) → ADaM(ADDR)

Intermediate ADaM dataset of ADSU (1)
USUBJID    | PARAMCD | AVAL | ADT        | AVISIT           | VISIT     | DTYPE   | ASEQ | SUSEQ
001-01-001 | DDRATE  | 0    | 2011-02-08 | Baseline         | Screening |         | 1    | 1
001-01-001 | DDRATE  | 5    | 2011-02-10 | Baseline         | Screening |         | 2    | 3
….
001-01-001 | DDRATE  | 0    | 2011-02-28 | Baseline         | Screening |         | 19   | 21
001-01-001 | DDRATE  | 4.4  |            | Baseline         |           | AVERAGE | 20   |
001-01-001 | DDRATE  | 0    | 2011-03-01 | Treatment Period | Visit 1   |         | 21   | 22
001-01-001 | DDRATE  | 4.4  | 2011-03-02 | Treatment Period | Visit 1   | BLCF    | 22   | 23
001-01-001 | DDRATE  | 0    | 2011-03-03 | Treatment Period | Visit 1   |         | 23   | 24
001-01-001 | DDRATE  | 2    | 2011-03-04 | Treatment Period | Visit 1   |         | 24   | 25
001-01-001 | DDRATE  | 4.4  | 2011-03-05 | Treatment Period | Visit 1   | BLCF    | 25   | 26
….
001-01-001 | DDRATE  | 4.4  | 2011-04-06 | Treatment Period | Visit 3   | BLCF    | 57   | 58
001-01-001 | DDRATE  | 4    | 2011-04-07 | Treatment Period | Visit 3   |         | 58   | 59
001-01-001 | DDRATE  | 0    | 2011-04-08 | Treatment Period | Visit 3   |         | 59   | 60
001-01-001 | DDRATE  | 2    | 2011-04-09 | Treatment Period | Visit 3   |         | 60   | 61
001-01-001 | DDRATE  | 1    | 2011-04-10 | Treatment Period | Visit 3   |         | 61   | 62
001-01-001 | DDRATE  | 4    | 2011-04-11 | Treatment Period | Visit 3   |         | 62   | 63
001-01-001 | DDRATE  | 2.72 |            | Treatment Period |           | AVERAGE | 63   |


Intermediate ADaM dataset of ADSU (2)

• 'NOT DONE' data from SU were not included in
ADSU
• At the baseline visit, we include only 19 records for subject
001-01-001. We used DTYPE='AVERAGE' to hold the
average of the assessed doses at ASEQ = 20.
• At the treatment period visit, we include only 39 records.
We used DTYPE='AVERAGE' to hold the average of the
assessed doses at ASEQ = 63.
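
The idea of storing the average as an extra record with DTYPE='AVERAGE' can be sketched as follows. This is a simplified Python illustration, not the deck's SAS code: the function name and sample rows are hypothetical, and the daily rows passed in are assumed to be whatever records the analysis decides to average (observed rows, plus any imputed rows such as the BLCF rows shown in the ADSU table).

```python
def with_average_record(daily_rows, first_aseq):
    """Number the daily rows from first_aseq and append one DTYPE='AVERAGE' row."""
    if not daily_rows:
        raise ValueError("at least one daily row is required")
    out = []
    aseq = first_aseq
    for row in daily_rows:
        out.append(dict(row, ASEQ=aseq))
        aseq += 1
    mean = sum(row["AVAL"] for row in daily_rows) / len(daily_rows)
    out.append({
        "USUBJID": daily_rows[0]["USUBJID"],
        "PARAMCD": daily_rows[0]["PARAMCD"],
        "AVISIT": daily_rows[0]["AVISIT"],
        "AVAL": mean,
        "DTYPE": "AVERAGE",  # derived record; it has no SUSEQ of its own
        "ASEQ": aseq,
    })
    return out


# Hypothetical three-day baseline period, for illustration only.
rows = [
    {"USUBJID": "S1", "PARAMCD": "DDRATE", "AVISIT": "Baseline", "AVAL": 0, "SUSEQ": 1},
    {"USUBJID": "S1", "PARAMCD": "DDRATE", "AVISIT": "Baseline", "AVAL": 5, "SUSEQ": 3},
    {"USUBJID": "S1", "PARAMCD": "DDRATE", "AVISIT": "Baseline", "AVAL": 1, "SUSEQ": 4},
]
adsu = with_average_record(rows, first_aseq=1)
print(adsu[-1])
```

The ASEQ assigned to the average row is the value a downstream dataset could carry as SRCSEQ to point back to it.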


Final ADaM dataset of ADDR
USUBJID    | TRTP       | PARAM                       | AVISIT           | ABLFL | AVAL | BASE | CHG   | SRCDOM | SRCSEQ
001-01-001 | Study Drug | Average Daily Drinking Rate | Baseline         | Y     | 4.40 |      |       | ADSU   | 20
001-01-001 | Study Drug | Average Daily Drinking Rate | Treatment Period |       | 2.72 | 4.40 | -1.68 | ADSU   | 63
001-01-002 | Placebo    | Average Daily Drinking Rate | Baseline         | Y     | 4.26 |      |       | ADSU   | 22
001-01-002 | Placebo    | Average Daily Drinking Rate | Treatment Period |       | 3.10 | 4.26 | -1.16 | ADSU   | 65

Key points to note:
• All records come from ADSU (see SRCDOM and SRCSEQ).
• Great data point traceability.


Example 2 : Intermediate Time to Event
permanent ADaM plus dataset
USUBJID    | TRTP         | PARAM | AVAL | STARTDT    | ADT        | CNSR | EVNTDESC            | _DSDECOD            | _DSDTC     | _SVXSTDTC  | _AEXDT
001-01-001 | Study Drug 1 | Death | 157  | 2011-01-04 | 2011-06-10 | 1    | COMPLETED THE STUDY | COMPLETED THE STUDY | 2011-06-10 | 2011-06-10 | 2011-05-04
001-01-002 | Study Drug 2 | Death | 116  | 2011-02-01 | 2011-05-28 | 1    | LOST TO FOLLOW-UP   | LOST TO FOLLOW-UP   | 2011-05-28 | 2011-05-28 | 2011-05-01
001-01-003 | Study Drug 2 | Death | 88   | 2011-02-05 | 2011-05-04 | 0    | DEATH               | DEATH               | 2011-05-04 | 2011-05-04 | 2011-05-04
001-01-004 | Study Drug 1 | Death | 102  | 2011-03-20 | 2011-06-30 | 1    | ONGOING             |                     |            | 2011-06-30 | 2011-06-04
001-01-005 | Study Drug 1 | Death | 101  | 2011-03-26 | 2011-07-05 | 1    | ONGOING             |                     |            | 2011-07-01 | 2011-07-05

AVAL = ADT – STARTDT
Plus variables
• _DSDECOD = DS.DSDECOD when DS.DSCAT = “DISPOSITION EVENT”
• _DSDTC = DS.DSDTC when DS.DSCAT = “DISPOSITION EVENT”
• _SVXSTDTC = Last Study Visit date
• _AEXDT = Last AE date
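
The AVAL derivation can be checked with a small Python sketch. The helper names are hypothetical, and the rule used to pick ADT for ongoing subjects (the later of the last visit date and the last AE date) is an assumption inferred from the two ONGOING rows in the table, not a rule stated in the deck.

```python
from datetime import date


def latest_known_date(*dates):
    """Latest of the supplied dates, ignoring missing (None) values."""
    known = [d for d in dates if d is not None]
    if not known:
        raise ValueError("no date available")
    return max(known)


def time_to_event_days(startdt, adt):
    """AVAL = ADT - STARTDT, in days."""
    return (adt - startdt).days


# Subject 001-01-004 from the table: no disposition record (_DSDTC missing),
# so ADT is assumed to be the later of _SVXSTDTC and _AEXDT.
adt = latest_known_date(None, date(2011, 6, 30), date(2011, 6, 4))
aval = time_to_event_days(date(2011, 3, 20), adt)
print(adt.isoformat(), aval)
```

Keeping _DSDTC, _SVXSTDTC and _AEXDT on the intermediate dataset makes it possible to see which source date drove ADT for each subject.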

1st Method : SDTM to ADaM
The benefits are
• Simple process
The limitations are
• A lack of data point traceability (traceability
is provided through Define.xml)
• Difficult to troubleshoot when the development
SAS programmer and the validation SAS
programmer disagree about values in the
final ADaM dataset.


2nd Method : SDTM thru intermediate
permanent datasets to final ADaM

The benefits are
• Easy to follow each step and to validate
• Flexibility of the data structure of
intermediate datasets (A programmer does
not need to follow CDISC standards in the
intermediate permanent datasets)
The limitations are
• A lack of data point traceability, especially for
the reviewers.

Business rules for plus datasets
• Plus datasets
• Same SAS program: plus datasets are created by the program that
develops the final ADaM dataset. We do not have separate
programs for the intermediate permanent datasets.
• Same number of records: we keep the same number
of records between SDTM datasets and SDTM plus datasets,
and between ADaM datasets and ADaM plus datasets.
• Naming convention: the prefix '_' followed by the name of the
original SDTM or final ADaM dataset (e.g. _SU, _ADDR)

• Plus variables
• Temporary variables are named by adding the prefix '_'.
• No standard for plus variables: we assign labels, but
do not follow any CDISC standards.

3rd method : SDTM thru ADaM to
final ADaM

The benefits are
• Easy to follow each step
• Great data point traceability
The limitations are
• Need to create and validate all ADaM datasets,
including the intermediate ADaM datasets
• Less flexibility, because the intermediate datasets
must follow the ADaM structure

Consideration
Datasets which will be submitted
• SDTM to ADaM method
1. SDTM
2. final ADaM

• SDTM thru the intermediate permanent datasets to
ADaM method
1. SDTM
2. final ADaM

• SDTM thru ADaM to ADaM method
1. SDTM
2. intermediate ADaM
3. final ADaM

Conclusion
• Three methods for creating a complex ADaM dataset
1. SDTM datasets to ADaM datasets
2. SDTM datasets through the intermediate
permanent datasets to final ADaM datasets
3. SDTM datasets through the intermediate ADaM
datasets to final ADaM datasets

• More options for complex ADaM dataset
creation
• The analysis will dictate which method to use
