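How to Write Up an "Analysis of Diagnostic Accuracy"

Shinji Yamagata
National Center for University Entrance Examinations, Research Organization for Student Selection
shinji.yamagata[at]gmail.com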
Outline
• Analysis of diagnostic accuracy: a review
• Diagnostic accuracy studies and internal/external validity (Whiting et al., 2004; 2011)
   – A. Patient selection
   – B. Index test
   – C. Reference standard
   – D. Flow and timing
• QUADAS (Quality Assessment of Diagnostic Accuracy Studies)
• How to write it up
   – Diagnostic accuracy of the new non-invasive prenatal test (Chiu et al., 2011, BMJ)
• Epilogue
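Indices of diagnostic accuracy: summary

                                 True state
                   Diseased             Not diseased          Row total
Test positive      A. true positive     B. false positive     A + B (test positives)
Test negative      C. false negative    D. true negative      C + D (test negatives)
Column total       A + C (diseased)     B + D (not diseased)  N (total)

Sensitivity = true positives / diseased = A/(A + C)
Specificity = true negatives / not diseased = D/(B + D)
Positive predictive value (PPV) = true positives / test positives = A/(A + B)
Negative predictive value (NPV) = true negatives / test negatives = D/(C + D)
Positive rate = test positives / total = (A + B)/N
Accuracy = (true positives + true negatives) / total = (A + D)/N
Prevalence = diseased / total = (A + C)/N

Positive likelihood ratio = sensitivity/(1 − specificity): ∞ when specificity is 100% (0% false positives) … confirms the diagnosis
Negative likelihood ratio = (1 − sensitivity)/specificity: 0 when sensitivity is 100% (0% false negatives) … excludes the diagnosis
Odds ratio = AD/BC = positive likelihood ratio / negative likelihood ratio
Risk ratio = PPV/(1 − NPV)
κ = {(A + D) − Ne(A+D)}/{N − Ne(A+D)}, where Ne(A+D) = {(A + B)(A + C) + (C + D)(B + D)}/N is the number of agreements expected by chance
φ = (AD − BC)/√{(A + B)(C + D)(A + C)(B + D)}
χ² = Nφ²
Significance tests … χ² test, McNemar test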
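The formulas above can be collected into one small function. The following is a minimal Python sketch, not part of the original slides: the function name `accuracy_indices` and the dictionary keys are illustrative choices, and the example counts (81, 40, 19, 60) are the cells of the prevalence = .50 table based on Streiner (2003). It assumes every cell is positive so that no ratio divides by zero.

```python
from math import sqrt


def accuracy_indices(a: int, b: int, c: int, d: int) -> dict:
    """Indices for a 2x2 table: a = TP, b = FP, c = FN, d = TN.

    Assumes every cell is positive, so no ratio divides by zero.
    """
    n = a + b + c + d
    sens = a / (a + c)
    spec = d / (b + d)
    ppv = a / (a + b)
    npv = d / (c + d)
    # Number of agreements expected by chance, Ne(A+D) in the slide's notation.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "positive_rate": (a + b) / n,
        "accuracy": (a + d) / n,
        "prevalence": (a + c) / n,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "odds_ratio": (a * d) / (b * c),
        "risk_ratio": ppv / (1 - npv),
        "kappa": ((a + d) - expected) / (n - expected),
        "phi": phi,
        "chi2": n * phi ** 2,
    }


m = accuracy_indices(81, 40, 19, 60)
for name, value in m.items():
    print(f"{name:14s} {value:.3f}")
```

Running it on any table is a quick way to confirm the identity odds ratio = positive likelihood ratio / negative likelihood ratio numerically.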
Effect of prevalence (prior probability)
• Prevalence = .50

                                 True state
                   Diseased      Not diseased     Row total
Test positive      A = 81        B = 40           121
Test negative      C = 19        D = 60           79
Column total       100           100              N = 200

Sensitivity = 81/100 = .81
Specificity = 60/100 = .60
Positive predictive value = 81/121 = .67
Negative predictive value = 60/79 = .76
Positive rate = 121/200 = .61
Accuracy = (81 + 60)/200 = .71
Prevalence = 100/200 = .50

Based on Streiner (2003)
Effect of prevalence (prior probability)
• Prevalence = .05

                                 True state
                   Diseased      Not diseased     Row total
Test positive      A = 405       B = 3800         4205
Test negative      C = 95        D = 5700         5795
Column total       500           9500             N = 10000

Sensitivity = 405/500 = .81
Specificity = 5700/9500 = .60
Positive predictive value = 405/4205 = .10
Negative predictive value = 5700/5795 = .98
Positive rate = 4205/10000 = .42
Accuracy = (405 + 5700)/10000 = .61
Prevalence = 500/10000 = .05

• Bayes' theorem:
P(diseased | test positive) = P(test positive | diseased) P(diseased) / {P(test positive | diseased) P(diseased) + P(test positive | not diseased) P(not diseased)}
• As indices of the test itself, sensitivity and specificity are preferable to the positive (negative) predictive value, because the predictive values shift with prevalence.

Based on Streiner (2003)
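To make the dependence on prevalence concrete, Bayes' theorem can be applied directly to sensitivity, specificity, and prevalence. This is a minimal sketch rather than anything from the slides: the function names `ppv` and `npv` are illustrative, and the inputs (.81, .60, and the two prevalences) are the values from the tables based on Streiner (2003).

```python
def ppv(sens: float, spec: float, prev: float) -> float:
    """P(diseased | test positive) by Bayes' theorem."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens: float, spec: float, prev: float) -> float:
    """P(not diseased | test negative) by Bayes' theorem."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)


# Same test (sensitivity .81, specificity .60) at two prevalences.
for prev in (0.50, 0.05):
    print(f"prevalence {prev:.2f}: "
          f"PPV = {ppv(0.81, 0.60, prev):.2f}, "
          f"NPV = {npv(0.81, 0.60, prev):.2f}")
```

Sensitivity and specificity are held fixed here; only the prior changes, which is why the predictive values alone say little about the test itself.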
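Trade-off between sensitivity and specificity
[Figure: overlapping score distributions of the not-diseased and diseased groups on a screening scale, with two alternative cut-offs for "test positive". The lower cut-off gives high sensitivity and low specificity; the higher cut-off gives low sensitivity and high specificity.]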
ROC curve (receiver operating characteristic curve) and AUC (area under the curve)
[Figure: ROC curves for test 1 and test 2. Vertical axis: sensitivity (0 to 1). Horizontal axis: 1 − specificity (false positive rate, 0 to 1).]
• AUC
  Perfect test: sensitivity = specificity = 100% → 1.0
  Random test: the diagonal → 0.5
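An ROC curve can be traced by sweeping the cut-off over every observed score and recording (1 − specificity, sensitivity) at each step; the AUC is then the area under those points by the trapezoidal rule. The sketch below is illustrative only: `roc_points`, `auc`, and the six made-up scores are assumptions, not data from the slides, and a score at or above the cut-off is treated as test positive.

```python
def roc_points(scores, labels):
    """(false positive rate, sensitivity) pairs, sweeping the cut-off from high to low.

    labels: 1 = diseased, 0 = not diseased. Needs at least one of each.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical screening scores; higher scores suggest disease.
scores = [1, 2, 3, 4, 5, 6]
labels = [0, 0, 1, 0, 1, 1]
example_auc = auc(roc_points(scores, labels))
print(f"AUC = {example_auc:.3f}")
```

With perfectly separated groups the function returns 1.0, and with identical scores for everyone it returns 0.5, matching the two reference values above.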
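Comparing the accuracy of two tests (independent samples)
[Figure from Suzuki (2006)]
Comparing the accuracy of two tests (same sample)
[Figure from Suzuki (2006)]
[Figure from Suzuki (2006), with cells labeled "true positives of test Y relative to test X" and "false positives of test Y relative to test X"]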
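The figures from Suzuki (2006) are not reproduced in this transcript, so the following is only a generic sketch of one standard approach for the same-sample case: the McNemar test, which the summary slide lists among the applicable tests. It compares two tests applied to the same patients using only the discordant pairs. The function name `mcnemar` and the counts are hypothetical; the statistic is computed without a continuity correction, and the p-value uses the identity that the upper tail of a χ² distribution with 1 degree of freedom equals erfc(√(χ²/2)).

```python
from math import erfc, sqrt


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar chi-square (no continuity correction) and its p-value.

    b: patients positive on test X but negative on test Y
    c: patients negative on test X but positive on test Y
    """
    if b + c == 0:
        raise ValueError("no discordant pairs: the test is undefined")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical discordant counts among diseased patients
# (i.e. a comparison of the two tests' sensitivities).
chi2, p_value = mcnemar(b=15, c=5)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Restricting the pairs to diseased patients compares sensitivities; restricting them to non-diseased patients compares specificities.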
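Internal and external validity related to patient selection
(bias, variation/applicability)
• A1. Selection bias
・Ideally, the sample is drawn randomly or consecutively.
⇔ Hard-to-judge or atypical cases are excluded.
… threatens internal validity (accuracy is overestimated)

• A2. Spectrum bias
・Ideally, the sample matches the population seen in actual clinical practice.
⇔ A sample of more severe or unambiguously diseased cases is used.
⇔ A sample of healthier non-cases is used.
⇔ A case-control design is used.
… threatens external validity (accuracy is overestimated)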
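Trade-off between sensitivity and specificity
[Figure 1: score distributions of the not-diseased and diseased groups on a screening scale, with the cut-off above which the test is positive.]
[Figure 2: the same plot with the diseased group replaced by severe cases. Sensitivity is overestimated.]
[Figure 3: the same plot with severe cases and a healthy not-diseased group. Specificity and sensitivity are both overestimated.]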
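The effect of spectrum bias in these figures can be reproduced numerically. The sketch below assumes, purely for illustration, that screening scores are normally distributed with standard deviation 1 in every group; the group means and the cut-off are made-up values, not taken from the slides.

```python
from statistics import NormalDist

CUTOFF = 1.0  # scores at or above this value count as test positive (assumed)


def sensitivity(diseased: NormalDist, cutoff: float = CUTOFF) -> float:
    """Share of the diseased group scoring above the cut-off."""
    return 1 - diseased.cdf(cutoff)


def specificity(not_diseased: NormalDist, cutoff: float = CUTOFF) -> float:
    """Share of the not-diseased group scoring below the cut-off."""
    return not_diseased.cdf(cutoff)


# Hypothetical score distributions (mean, SD).
clinical_cases = NormalDist(1.5, 1.0)      # patients as seen in practice
severe_cases = NormalDist(2.5, 1.0)        # only severe, clear-cut patients
clinical_controls = NormalDist(0.0, 1.0)   # non-diseased as seen in practice
healthy_controls = NormalDist(-1.0, 1.0)   # unusually healthy volunteers

sens_clinical = sensitivity(clinical_cases)
sens_severe = sensitivity(severe_cases)
spec_clinical = specificity(clinical_controls)
spec_healthy = specificity(healthy_controls)

print(f"sensitivity: clinical {sens_clinical:.2f}, severe sample {sens_severe:.2f}")
print(f"specificity: clinical {spec_clinical:.2f}, healthy sample {spec_healthy:.2f}")
```

The cut-off never moves; only the sampled groups do, which is why this inflation reflects the study sample rather than the test.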
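Internal and external validity related to the index test

B1. Information bias
・Ideally, the index test is performed independently of the reference standard.
⇔ The person performing the index test knows the result of the reference standard.
… threatens internal validity (accuracy is overestimated)
B2. Bias from prior knowledge (clinical review bias; context bias)
・Ideally, the index test is interpreted with the same amount of information as in practice.
⇔ Knowledge of the prevalence or of the patient influences the interpretation.
… threatens internal and external validity (accuracy is overestimated)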
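Internal and external validity related to the index test (continued)

B3. Reproducibility of index test results (reproducibility, observer variability)
・Ideally, the index test has high reliability (test-retest, inter-rater).
⇔ Reliability is low … threatens internal validity (accuracy is under- or overestimated)
B4. Post hoc choice of threshold
・Ideally, the threshold of a quantitative index test is fixed in advance.
⇔ The threshold is chosen afterwards to maximize accuracy in the sample at hand.
… threatens external validity (accuracy is overestimated)
B5. Inadequate description of how the index test was performed
・Ideally, the procedure is described so that it can be fully reproduced.
⇔ Details of the procedure are unclear, or special equipment is used.
… threatens external validity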
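Internal and external validity related to the reference standard

C1. Misclassification bias
・Ideally, the reference standard is perfect (a gold standard).
⇔ The sensitivity or specificity of the reference standard itself is flawed.
… threatens internal validity (accuracy is over- or underestimated)

C2. Information bias
C3. Incorporation bias
・Ideally, the reference standard is performed independently of the index test.
⇔ The person performing the reference standard knows the result of the index test.
⇔ The index test forms part of a (composite) reference standard.
… threatens internal validity (accuracy is overestimated)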
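Internal and external validity related to the reference standard (continued)

C4. Inadequate description of how the reference standard was performed
・Ideally, the procedure is described so that it can be fully reproduced.
⇔ Details of the procedure are unclear, or special equipment is used.
… threatens external validity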
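Internal and external validity related to flow and timing

D1. Disease progression bias
・Ideally, the index test and the reference standard are performed without delay between them.
⇔ The disease progresses between the index test and the reference standard.
… threatens internal validity

D2. Partial verification bias
D3. Differential verification bias
・Ideally, the whole sample is judged by the same reference standard.
⇔ Only part of the sample (e.g., those at high risk on the index test) receives the reference standard.
⇔ Part of the sample receives a different (e.g., more accurate) reference standard.
… threatens internal validity (sensitivity is overestimated)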
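A small worked example shows why partial verification inflates sensitivity. The sketch below starts from the prevalence = .05 table based on Streiner (2003) and assumes, hypothetically, that every test-positive patient receives the reference standard but only 20% of test-negative patients do, chosen independently of their true state. The function name `verified_table` and the 20% figure are illustrative assumptions.

```python
def verified_table(a, b, c, d, frac_pos_verified=1.0, frac_neg_verified=0.2):
    """Expected 2x2 counts among patients who actually received the reference standard.

    a, b = true/false positives; c, d = false/true negatives in the full sample.
    Verification is assumed to depend only on the index test result.
    """
    return (a * frac_pos_verified, b * frac_pos_verified,
            c * frac_neg_verified, d * frac_neg_verified)


def sens_spec(a, b, c, d):
    """Sensitivity and specificity of a 2x2 table."""
    return a / (a + c), d / (b + d)


full = (405, 3800, 95, 5700)          # complete verification
partial = verified_table(*full)       # only 20% of test negatives verified

true_sens, true_spec = sens_spec(*full)
seen_sens, seen_spec = sens_spec(*partial)

print(f"complete verification: sensitivity {true_sens:.2f}, specificity {true_spec:.2f}")
print(f"partial verification:  sensitivity {seen_sens:.2f}, specificity {seen_spec:.2f}")
```

Under these assumptions the unverified false negatives drop out of the denominator of sensitivity, so it rises, while the unverified true negatives drop out of specificity, so it falls.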
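Internal and external validity related to flow and timing (continued)

D4. Exclusion of data
D5. Withdrawals
・Ideally, the handling of outliers, failed analyses and similar data, and all withdrawals from the sample, are fully reported.
⇔ They are not reported … threatens internal validity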
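What happens in actual studies? (Lijmer et al., 1999)

RDOR (relative diagnostic odds ratio) ≒ odds ratio [studies not meeting the criterion] / odds ratio [studies meeting the criterion]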
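As a rough illustration of the ratio above, the diagnostic odds ratio of each study is AD/BC, and the RDOR divides the value for studies with a design flaw by the value for studies without it. The sketch below compares just two single studies with made-up counts; Lijmer et al. (1999) estimated RDORs across many studies with a regression model, which this simplification does not attempt to reproduce.

```python
def diagnostic_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """AD/BC for a 2x2 table (a = TP, b = FP, c = FN, d = TN)."""
    return (a * d) / (b * c)


def relative_dor(flawed: tuple, sound: tuple) -> float:
    """Odds ratio of the study not meeting a criterion over the one meeting it."""
    return diagnostic_odds_ratio(*flawed) / diagnostic_odds_ratio(*sound)


# Hypothetical counts: a case-control study versus a consecutive-series study.
flawed_study = (90, 20, 10, 80)
sound_study = (81, 40, 19, 60)

rdor = relative_dor(flawed_study, sound_study)
print(f"RDOR = {rdor:.2f}")  # values above 1 mean the flawed design looks more accurate
```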
What is QUADAS?

• STARD (Standards for Reporting of Diagnostic Accuracy)
  • Guideline for improving the quality of individual study reports (Bossuyt et al., 2003)
• QUADAS (Quality Assessment of Diagnostic Accuracy Studies)
  • Tool for rating study quality in systematic reviews
  • "First, report it" (STARD) → "How good is the reported study?" (QUADAS)
  • Nine experts narrowed 28 candidate items down to 14 (Whiting et al., 2003)
  • Each item is rated Yes/No/Unclear
• QUADAS-2 (Whiting et al., 2011)
  • Four domains: patient selection, index test, reference standard, flow and timing
  • Each domain is assessed for sources of bias and for applicability
• QAREL (Quality Appraisal of Reliability Studies; Lucas et al., 2010)
  • Tool for rating the quality of studies of diagnostic reliability (as opposed to accuracy)
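Contents of QUADAS
[Figure: the 14 QUADAS items, each annotated with the code of the corresponding bias from the preceding slides. In slide order: item 1 = A2, item 2 = A1, item 3 = C1, item 4 = D1, item 5 = D2, item 6 = D3, item 7 = C3, item 8 = B5, item 9 = C4, item 10 = B1, item 11 = C2, item 12 = B2, item 13 = D4, item 14 = D5.]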
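Correspondence from QUADAS to STARD
[Figure: the 14 QUADAS items, each annotated with the corresponding STARD items. In slide order:]

QUADAS item     STARD items
Q1              3, 4, 15, 18
Q2              3, 5
Q3              7
Q4              17
Q5              16
Q6              16
Q7              9
Q8              8
Q9              8
Q10             11
Q11             11
Q12             11
Q13             16, 20, 22
Q14             16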
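The two correspondence slides are inverses of each other, which can be checked mechanically. The sketch below encodes the QUADAS ⇒ STARD labels as a dictionary, on the assumption that the labels on the slide run in QUADAS item order (1 to 14), and inverts it to obtain the STARD ⇒ QUADAS direction. The names `QUADAS_TO_STARD` and `invert` are illustrative.

```python
from collections import defaultdict

# QUADAS item -> STARD items, read off the slide in order (assumed item order 1..14).
QUADAS_TO_STARD = {
    1: [3, 4, 15, 18],
    2: [3, 5],
    3: [7],
    4: [17],
    5: [16],
    6: [16],
    7: [9],
    8: [8],
    9: [8],
    10: [11],
    11: [11],
    12: [11],
    13: [16, 20, 22],
    14: [16],
}


def invert(mapping: dict) -> dict:
    """Turn {quadas: [stard, ...]} into {stard: [quadas, ...]}, both sorted."""
    inverse = defaultdict(list)
    for quadas_item, stard_items in mapping.items():
        for stard_item in stard_items:
            inverse[stard_item].append(quadas_item)
    return {stard: sorted(items) for stard, items in sorted(inverse.items())}


STARD_TO_QUADAS = invert(QUADAS_TO_STARD)
for stard_item, quadas_items in STARD_TO_QUADAS.items():
    print(f"STARD {stard_item:2d} -> " + ", ".join(f"Q{q}" for q in quadas_items))
```

The inverted mapping agrees with the QUADAS codes annotated on the STARD checklist, for example STARD 16 ↔ Q5, Q6, Q13, Q14.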
Correspondence from STARD to QUADAS
(STARD checklist items; bracketed codes are the corresponding QUADAS items)

TITLE/ABSTRACT/KEYWORDS
   1. Identify the article as a study of diagnostic accuracy (recommend MeSH heading 'sensitivity and specificity').
INTRODUCTION
   2. State the research questions or study aims, such as estimating diagnostic accuracy or comparing accuracy between tests or across participant groups.
METHODS
  Participants
   3. The study population: the inclusion and exclusion criteria, setting and locations where data were collected. [Q1, Q2]
   4. Participant recruitment: Was recruitment based on presenting symptoms, results from previous tests, or the fact that the participants had received the index tests or the reference standard? [Q1]
   5. Participant sampling: Was the study population a consecutive series of participants defined by the selection criteria in items 3 and 4? If not, specify how participants were further selected. [Q2]
   6. Data collection: Was data collection planned before the index test and reference standard were performed (prospective study) or after (retrospective study)?
  Test methods
   7. The reference standard and its rationale. [Q3]
   8. Technical specifications of material and methods involved, including how and when measurements were taken, and/or cite references for index tests and reference standard. [Q8, Q9]
   9. Definition of and rationale for the units, cut-offs and/or categories of the results of the index tests and the reference standard. [Q7]
  10. The number, training and expertise of the persons executing and reading the index tests and the reference standard.
  11. Whether or not the readers of the index tests and reference standard were blind (masked) to the results of the other test, and describe any other clinical information available to the readers. [Q10, Q11, Q12]
  Statistical methods
  12. Methods for calculating or comparing measures of diagnostic accuracy, and the statistical methods used to quantify uncertainty (e.g. 95% confidence intervals).
  13. Methods for calculating test reproducibility, if done.
Correspondence from STARD to QUADAS (continued)

RESULTS
  Participants
  14. When study was performed, including beginning and end dates of recruitment.
  15. Clinical and demographic characteristics of the study population (at least information on age, gender, spectrum of presenting symptoms). [Q1]
  16. The number of participants satisfying the criteria for inclusion who did or did not undergo the index tests and/or the reference standard; describe why participants failed to undergo either test (a flow diagram is strongly recommended). [Q5, Q6, Q13, Q14]
  Test results
  17. Time interval between the index tests and the reference standard, and any treatment administered in between. [Q4]
  18. Distribution of severity of disease (define criteria) in those with the target condition; other diagnoses in participants without the target condition. [Q1]
  19. A cross tabulation of the results of the index tests (including indeterminate and missing results) by the results of the reference standard; for continuous results, the distribution of the test results by the results of the reference standard.
  20. Any adverse events from performing the index tests or the reference standard. [Q13]
  Estimates
  21. Estimates of diagnostic accuracy and measures of statistical uncertainty (e.g. 95% confidence intervals).
  22. How indeterminate results, missing data and outliers of the index tests were handled. [Q13]
  23. Estimates of variability of diagnostic accuracy between subgroups of participants, readers or centers, if done.
  24. Estimates of test reproducibility, if done.
DISCUSSION
  25. Discuss the clinical applicability of the study findings.
Example use of QUADAS (Verweij et al., 2012)
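・Invasive tests (reference standard / gold standard)
   ・Amniocentesis
   ・Chorionic villus sampling
  … 1% risk of miscarriage
・Existing non-invasive tests
   ・Maternal serum marker test
   ・Ultrasound
  … Imperfect; many women still need an invasive test
・Index test
   ・Cell-free DNA test
   ・Examines the fetal DNA contained in the mother's blood to gauge the amount of chromosome 21
・Aim
   To assess the diagnostic accuracy of the cell-free DNA test in pregnant women who took the usual non-invasive tests and were then referred for an invasive test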
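Describing the study participants (STARD 3-6, 14)
  ・When
  ・Where
  ・Who
  ・How
  … were they recruited?
Describing how the tests were performed (STARD 7-11, 17)
(STARD 8)
Describing how incomplete data were handled (STARD 16, 20, 22)
Flow diagram (STARD 16)
Describing the remaining data (STARD 15, 18, 19)
  ・Who remained in the end?
  ・Present it clearly in a table
Characteristics of the remaining data (STARD 15, 18)
Index test × reference standard (STARD 19)
Statistical methods and results for diagnostic accuracy (STARD 12, 13, 21, 23, 24)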
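STARD items 12 and 21 ask for a measure of statistical uncertainty such as a 95% confidence interval. One common choice for a proportion like sensitivity or specificity is the Wilson score interval; the sketch below is an illustration of that method, not the method of any particular paper discussed in the slides. The function name `wilson_interval` is an assumption, and the example counts (81 of 100) reuse the sensitivity from the prevalence = .50 table.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Sensitivity 81/100: true positives out of all diseased participants.
low, high = wilson_interval(81, 100)
print(f"sensitivity = 0.81, 95% CI {low:.3f} to {high:.3f}")
```

The same function applies to specificity by passing the true negatives and the number of non-diseased participants.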
http://link.springer.com/article/10.1007/s10897-012-9564-0
Response rate … 17%
Response rate … 29%
(Spina bifida) (Anencephaly) (X) (XXY)
http://j.people.com.cn/94475/8154067.html
Thank you!
References
Benyamin et al. (in press). Childhood intelligence is heritable, highly polygenic and associated with FNBP1L. Mol Psychiatry. http://www.nature.com/mp/journal/vaop/ncurrent/full/mp2012184a.html
Bianchi et al. (2012). Genome-wide fetal aneuploidy detection by maternal plasma DNA sequencing. Obstet Gynecol, 119, 890-901.
Bossuyt et al. (2003). The STARD statement for reporting studies of diagnostic accuracy: Explanation and elaboration. Ann Intern Med, 138, W1-W12.
Devers et al. (in press). Noninvasive prenatal testing/noninvasive prenatal diagnosis: the position of the National Society of Genetic Counselors. J Genet Couns. http://link.springer.com/article/10.1007/s10897-012-9564-0
Fan et al. (2012). Non-invasive prenatal measurement of the fetal genome. Nature, 487, 320-326.
Greely (2011). Get ready for the flood of fetal gene screening. Nature, 469, 289-291.
Hayden (2011). Fetal gene screening comes to market. Nature, 478, 440.
Lijmer et al. (1999). Empirical evidence of design-related bias in studies of diagnostic tests. JAMA, 282, 1061-1066.
Lucas et al. (2010). The development of a quality appraisal tool for studies of diagnostic reliability (QAREL). J Clin Epidemiol, 63, 854-861.
Mansfield et al. (1999). Termination rates after prenatal diagnosis of Down syndrome, spina bifida, anencephaly, and Turner and Klinefelter syndromes: a systematic literature review. Prenat Diagn, 19, 808-812.
Palomaki et al. (2011). DNA sequencing of maternal plasma to detect Down syndrome: An international clinical validation study. Genet Med, 13, 913-920.
Palomaki et al. (2012). DNA sequencing of maternal plasma reliably identifies trisomy 18 and trisomy 13 as well as Down syndrome: an international collaborative study. Genet Med, 14, 296-305.
Skotko (2009). With new prenatal testing, will babies with Down syndrome slowly disappear? Arch Dis Child, 94, 823-826.
Skotko (2011). Self-perceptions from people with Down syndrome. Am J Med Genet Part A, 155, 2360-2369.
Skotko et al. (2011). Having a son or daughter with Down syndrome: perspectives from mothers and fathers. Am J Med Genet Part A, 155, 2335-2347.
Streiner (2003). Diagnosing tests: using and misusing diagnostic and screening tests. J Pers Assess, 81, 209-219.
Suzuki, S. (2006). Evaluating and comparing the accuracy of diagnostic tests [in Japanese]. Gendai Igaku, 53, 513-518.
Verweij et al. (2012). Diagnostic accuracy of noninvasive detection of fetal trisomy 21 in maternal blood: A systematic review. Fetal Diagn Ther, 31, 81-86.
Whiting et al. (2003). The development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol, 3, 25.
Whiting et al. (2004). Sources of variation and bias in studies of diagnostic accuracy. Ann Intern Med, 140, 189-202.
Whiting et al. (2011). QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med, 155, 529-536.
Supplementary material (QUADAS-2)
http://www.bris.ac.uk/quadas/resources/quadas2.pdf
Supplementary material (QAREL)
http://qarel.s3.amazonaws.com/qarel-checklist.pdf
Supplementary material (QAREL data extraction form)
http://qarel.org/
… 21 items in total

Mais conteúdo relacionado

Destaque

PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 

Destaque (20)

Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 

書き方V1.4

  • 1. 「診断精度の分析」の書き方 山形伸二 大学入試センター 入学者選抜研究機構 shinji.yamagata[at]gmail.com
  • 2. 構成 • 診断精度の分析:おさらい • 診断精度の研究と内的・外的妥当性 (Whiting et al., 2004; 2011) –A. 標本選択 (patient selection) –B. 指標検査 (index test) –C. 参照基準 (reference standard) –D. 時間進行 (flow and timing) • QUADAS (Quality Assessment of Diagnostic Accuracy Studies) • 書き方 – 新型出生前診断の診断精度 (Chiu et al. 2011, BMJ) • エピローグ
  • 3. 構成 • 診断精度の分析:おさらい • 診断精度の研究と内的・外的妥当性 (Whiting et al., 2004; 2011) –A. 標本選択 (patient selection) –B. 指標検査 (index test) –C. 参照基準 (reference standard) –D. 時間進行 (flow and timing) • QUADAS (Quality Assessment of Diagnostic Accuracy Studies) • 書き方 – 新型出生前診断の診断精度 (Chiu et al. 2011, BMJ) • エピローグ
  • 4. 診断精度の指標:まとめ 真の状態 病気 病気でない 陽性的中率 A. 真陽性 B. 偽陽性 陽性 陽性数 (positive predictive value) (true positive) (false postive) = 真陽性数/陽性数 陽性率 検査 (positive rate) 陰性的中率 = 陽性数/総数 C. 偽陰性 D. 真陰性 陰性 陰性数 (negative predictive value) (false negative) (true negative) = 真陰性数/陰性数 病者数 非病者数 N. 総数 正診率(accuracy) = 感度 (sensitivity) = 特異度 (specificity) = (真陽性数+真陰性数) 真陽性数/病者数 真陰性数/非病者数 総数 有病率(prevalence) = 病者数/総数 陽性尤度比 = 感度/(1-特異度): 特異度100%(偽陽性0%)の時∞…確定診断 陰性尤度比 = (1-感度)/特異度: 感度100%(偽陰性0%)の時0…除外診断 オッズ比 = AD/BC = 陽性尤度比/陰性尤度比 リスク比 = 陽性的中率/(1-陰性的中率) κ = {(A+D)- Ne(A+D)}/{N – Ne(A+D)}; where Ne(A+D) = {(A + B)(A + C) + (C + D)(B + D)}/N φ = (AD - BC)/√{(A+B)(C+D)(A+C)(B+D)} χ2 = Nφ2 検定…χ2 検定,McNemar検定
  • 5. 有病率(事前確率)の影響 •有病率 = .50の場合 真の状態 病気 病気でない 陽性的中率 A. 真陽性 B. 偽陽性 陽性数 陽性 (positive predictive value) 81 40 121 陽性率 =81/121 = .67 (positive rate) 検査 = 121/200 陰性的中率 C. 偽陰性 D. 真陰性 陰性数 = .61 陰性 (negative predictive value) 19 60 79 = 19/60 = .32 病者数 = 100 非病者数 = 100 N= 200 感度 (sensitivity) = 特異度 (specificity) = 正診率(accuracy) = 81/100 = .81 60/100 = .60 (81+60)/200 = .71 有病率(prevalence) = 100/200 = .50 based on Streiner (2003)
  • 6. 有病率(事前確率)の影響 •有病率 = .05の場合 真の状態 病気 病気でない 陽性的中率 A. 真陽性 B. 偽陽性 陽性数 陽性 (positive predictive value) 405 3800 4205 陽性率 =405/4205 = .10 (positive rate) 検査 = 4205/10000 陰性的中率 C. 偽陰性 D. 真陰性 陰性数 = .42 陰性 (negative predictive value) 95 5700 5795 = 5700/5795 = .98 病者数 = 500 非病者数 = 9500 N=10000 感度 (sensitivity) = 特異度 (specificity) = 正診率(accuracy) = 405/500 = .81 5700/9500 = .60 (405+5700)/10000 =.61 有病率(prevalence) = 500/10000 = .05 •ベイズの定理: P(真陽性|検査陽性) = P(検査陽性|真陽性)P(真陽性) {P(検査陽性|真陽性)P(真陽性)+P(検査陽性|真陰性)P(真陰性)} •テストの指標としては,感度または特異度>陽性(陰性)的中率 based on Streiner (2003)
  • 7. 感度と特異度のトレードオフ 検査陽性(感度高,特異度低) 検査陽性(感度低,特異度高) 病気でない 病気 スクリーニング尺度の得点分布
  • 8. ROC曲線(Receiver Operating Characteristics curve) とAUC (Area Under Curve) 1 0.9 0.8 0.7 0.6 0.5 検査1 感度 検査2 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1-特異度 (偽陽性率) •AUC 完璧なテスト:感度=特異度100%→1.0 ランダムなテスト:対角線→0.5
  • 11. (検査Yの検査Xに対する真陽性) (検査Yの検査Xに対する偽陽性) 鈴木 (2006)
  • 12. 構成 • 診断精度の分析:おさらい • 診断精度の研究と内的・外的妥当性 (Whiting et al., 2004; 2011) –A. 標本選択 (patient selection) –B. 指標検査 (index test) –C. 参照基準 (reference standard) –D. 時間進行 (flow and timing) • QUADAS (Quality Assessment of Diagnostic Accuracy Studies) • 書き方 – 新型出生前診断の診断精度 (Chiu et al. 2011, BMJ) • エピローグ
  • 13. 標本選択に関わる内的・外的妥当性 (bias, variation/applicability) • A1. 選択バイアス (selection bias) ・本来,標本はランダム/連続的に抽出されるべき ⇔判断の難しいケース,非典型的なケースを除外 …×内的妥当性(精度の過大推定) • A2. スペクトラム・バイアス (spectrum bias) ・本来,標本は実際の臨床で扱う母集団と同じであるべき ⇔より重症,または明確に病気である標本の使用 ⇔より健康な標本の使用 ⇔症例対照(case-control)デザインの使用 …×外的妥当性 (精度の過大推定)
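  • 14. Trade-off between sensitivity and specificity
    [Figure: score distributions on a screening scale for the "not diseased" and "diseased" groups, with the "test positive" region marked.]
  • 15. Trade-off between sensitivity and specificity
    [Figure: the same distributions with the diseased group limited to severe cases: sensitivity is overestimated.]
  • 16. Trade-off between sensitivity and specificity
    [Figure: the same distributions with the non-diseased group limited to healthy people and the diseased group limited to severe cases: both specificity and sensitivity are overestimated.]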
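  • 17. Internal and external validity related to the index test
    • B1. Information bias
      – Ideal: the index test is carried out independently of the reference standard.
      – Problem: the person performing the index test knows the result of the reference standard.
      – Threatens internal validity (accuracy is overestimated).
    • B2. Bias from prior knowledge (clinical review bias; context bias)
      – Ideal: the index test is interpreted with the same amount of information as in practice.
      – Problem: knowledge of the prevalence or of the patient influences the interpretation.
      – Threatens internal and external validity (accuracy is overestimated).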
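  • 18. Internal and external validity related to the index test
    • B3. Reproducibility of index test results (reproducibility, observer variability)
      – Ideal: the index test has high reliability (test-retest, inter-rater).
      – Problem: reliability is low.
      – Threatens internal validity (accuracy is under- or overestimated).
    • B4. Post hoc choice of threshold
      – Ideal: the threshold of a quantitative index test is set in advance.
      – Problem: the threshold is chosen afterwards to maximize accuracy in the sample at hand.
      – Threatens external validity (accuracy is overestimated).
    • B5. Inadequate description of how the index test was performed
      – Ideal: the procedure is described so that it can be fully reproduced.
      – Problem: details of the procedure are unclear, or special equipment is used.
      – Threatens external validity.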
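  • 19. Internal and external validity related to the reference standard
    • C1. Misclassification bias
      – Ideal: the reference standard is perfect (a gold standard).
      – Problem: the sensitivity and specificity of the reference standard itself are imperfect.
      – Threatens internal validity (accuracy is over- or underestimated).
    • C2. Information bias
    • C3. Incorporation bias
      – Ideal: the reference standard is carried out independently of the index test.
      – Problem: the person performing the reference standard knows the result of the index test.
      – Problem: the index test forms part of a (composite) reference standard.
      – Threatens internal validity (accuracy is overestimated).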
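  • 21. Internal and external validity related to flow and timing
    • D1. Disease progression bias
      – Ideal: the index test and the reference standard are performed without delay between them.
      – Problem: the disease progresses between the index test and the reference standard.
      – Threatens internal validity.
    • D2. Partial verification bias
    • D3. Differential verification bias
      – Ideal: the whole sample is judged by the same reference standard.
      – Problem: only part of the sample (e.g., those at high risk on the index test) receives the reference standard.
      – Problem: part of the sample receives a different (e.g., more accurate) reference standard.
      – Threatens internal validity (sensitivity is overestimated).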
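    To make partial verification bias (D2) concrete, the sketch below takes the 2x2 table from slide 5 and assumes that every test-positive person but only 20% of test-negative people receive the reference standard. It is not from the original slides: the 20% figure and the function name are hypothetical, and verification is assumed to depend only on the index test result.

```python
def accuracy_after_partial_verification(tp, fp, fn, tn,
                                        frac_pos_verified, frac_neg_verified):
    """Return (sensitivity, specificity) computed from verified cases only.

    Assumes verification depends only on the index test result, so the
    verified cases are a random subset within test positives and negatives.
    """
    # Expected counts that actually receive the reference standard
    tp_v = tp * frac_pos_verified
    fp_v = fp * frac_pos_verified
    fn_v = fn * frac_neg_verified
    tn_v = tn * frac_neg_verified

    sensitivity = tp_v / (tp_v + fn_v)
    specificity = tn_v / (tn_v + fp_v)
    return sensitivity, specificity


# Slide 5 table: TP=81, FP=40, FN=19, TN=60
full = accuracy_after_partial_verification(81, 40, 19, 60, 1.0, 1.0)
partial = accuracy_after_partial_verification(81, 40, 19, 60, 1.0, 0.2)
print(f"all verified:        sensitivity={full[0]:.2f}, specificity={full[1]:.2f}")
print(f"20% of negatives:    sensitivity={partial[0]:.2f}, specificity={partial[1]:.2f}")
```

    Under these assumptions false negatives are underrepresented among verified cases, so sensitivity rises while specificity falls, consistent with the overestimation of sensitivity noted on the slide.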
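  • 23. What happens in actual studies? (Lijmer et al., 1999)
    RDOR (relative diagnostic odds ratio) ≈ diagnostic odds ratio [studies that do not meet the criterion] / diagnostic odds ratio [studies that meet the criterion]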
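    The ratio on slide 23 can be sketched for a single pair of 2x2 tables as follows. This is only an illustration of the arithmetic, not the meta-regression used by Lijmer et al. (1999); the function names are assumptions, the "sound" table is the one from slide 5, and the "flawed" table is invented.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """Diagnostic odds ratio: odds of a positive test in the diseased
    divided by the odds of a positive test in the non-diseased."""
    return (tp * tn) / (fp * fn)


def relative_dor(flawed_table, sound_table):
    """Ratio of the DOR in a study that fails a design criterion
    to the DOR in a study that meets it."""
    return diagnostic_odds_ratio(*flawed_table) / diagnostic_odds_ratio(*sound_table)


sound = (81, 40, 19, 60)    # TP, FP, FN, TN from slide 5
flawed = (90, 30, 10, 70)   # hypothetical study with a design flaw

print(f"DOR (sound)  = {diagnostic_odds_ratio(*sound):.2f}")
print(f"DOR (flawed) = {diagnostic_odds_ratio(*flawed):.2f}")
print(f"RDOR         = {relative_dor(flawed, sound):.2f}")
```

    An RDOR above 1 means the studies failing the criterion report higher accuracy than those meeting it.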
  • 25. What is QUADAS?
    • STARD (Standards for Reporting of Diagnostic Accuracy)
      – A guideline for improving the quality of individual study reports (Bossuyt et al., 2003)
    • QUADAS (Quality Assessment of Diagnostic Accuracy Studies)
      – A tool for rating study quality in systematic reviews
      – "Report it first" (STARD) → "What is the quality of the reported study?" (QUADAS)
      – Nine experts narrowed 28 items down to 14 (Whiting et al., 2003)
      – Each item is rated Yes/No/Unclear
    • QUADAS-2 (Whiting et al. 2011)
      – Patient selection, index test, reference standard, flow and timing
      – Classified into sources of bias and applicability
    • QAREL (Quality Appraisal of Reliability Studies; Lucas et al., 2010)
      – A tool for rating the quality of studies of diagnostic reliability (as opposed to accuracy)
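  • 27. Correspondence from QUADAS to STARD
    (STARD item numbers as listed on the slide, in QUADAS item order Q1 to Q14)
    Q1:  STARD 3, 4, 15, 18
    Q2:  STARD 3, 5
    Q3:  STARD 7
    Q4:  STARD 17
    Q5:  STARD 16
    Q6:  STARD 16
    Q7:  STARD 9
    Q8:  STARD 8
    Q9:  STARD 8
    Q10: STARD 11
    Q11: STARD 11
    Q12: STARD 11
    Q13: STARD 16, 20, 22
    Q14: STARD 16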
  • 28. Correspondence from STARD to QUADAS
    (STARD checklist items by section and topic; corresponding QUADAS items in brackets)
    TITLE/ABSTRACT/KEYWORDS
      1. Identify the article as a study of diagnostic accuracy (recommend MeSH heading 'sensitivity and specificity').
    INTRODUCTION
      2. State the research questions or study aims, such as estimating diagnostic accuracy or comparing accuracy between tests or across participant groups.
    METHODS: Participants
      3. The study population: The inclusion and exclusion criteria, setting and locations where data were collected. [Q1, Q2]
      4. Participant recruitment: Was recruitment based on presenting symptoms, results from previous tests, or the fact that the participants had received the index tests or the reference standard? [Q1]
      5. Participant sampling: Was the study population a consecutive series of participants defined by the selection criteria in item 3 and 4? If not, specify how participants were further selected. [Q2]
      6. Data collection: Was data collection planned before the index test and reference standard were performed (prospective study) or after (retrospective study)?
    METHODS: Test methods
      7. The reference standard and its rationale. [Q3]
      8. Technical specifications of material and methods involved including how and when measurements were taken, and/or cite references for index tests and reference standard. [Q8, Q9]
      9. Definition of and rationale for the units, cut-offs and/or categories of the results of the index tests and the reference standard. [Q7]
      10. The number, training and expertise of the persons executing and reading the index tests and the reference standard.
      11. Whether or not the readers of the index tests and reference standard were blind (masked) to the results of the other test and describe any other clinical information available to the readers. [Q10, Q11, Q12]
    METHODS: Statistical methods
      12. Methods for calculating or comparing measures of diagnostic accuracy, and the statistical methods used to quantify uncertainty (e.g. 95% confidence intervals).
      13. Methods for calculating test reproducibility, if done.
  • 29. Correspondence from STARD to QUADAS (continued)
    RESULTS: Participants
      14. When study was performed, including beginning and end dates of recruitment.
      15. Clinical and demographic characteristics of the study population (at least information on age, gender, spectrum of presenting symptoms). [Q1]
      16. The number of participants satisfying the criteria for inclusion who did or did not undergo the index tests and/or the reference standard; describe why participants failed to undergo either test (a flow diagram is strongly recommended). [Q5, Q6, Q13, Q14]
    RESULTS: Test results
      17. Time-interval between the index tests and the reference standard, and any treatment administered in between. [Q4]
      18. Distribution of severity of disease (define criteria) in those with the target condition; other diagnoses in participants without the target condition. [Q1]
      19. A cross tabulation of the results of the index tests (including indeterminate and missing results) by the results of the reference standard; for continuous results, the distribution of the test results by the results of the reference standard.
      20. Any adverse events from performing the index tests or the reference standard. [Q13]
    RESULTS: Estimates
      21. Estimates of diagnostic accuracy and measures of statistical uncertainty (e.g. 95% confidence intervals).
      22. How indeterminate results, missing data and outliers of the index tests were handled. [Q13]
      23. Estimates of variability of diagnostic accuracy between subgroups of participants, readers or centers, if done.
      24. Estimates of test reproducibility, if done.
    DISCUSSION
      25. Discuss the clinical applicability of the study findings.
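  • 38. Background of the example study
    • Invasive tests (reference standard / gold standard)
      – Amniocentesis
      – Chorionic villus sampling
      – Risk of miscarriage: 1%
    • Existing noninvasive tests
      – Maternal serum marker test
      – Ultrasound
      – Imperfect, so many women still need an invasive test
    • Index test
      – Cell-free DNA test
      – Examines fetal DNA contained in the mother's blood and assesses the amount of chromosome 21
    • Aim
      – To examine the diagnostic accuracy of the cell-free DNA test in pregnant women who had the usual noninvasive tests and were referred for invasive testing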
  • 39. Describe the study participants (STARD items 3-6, 14)
    • When, where, who, and how were they recruited?
  • 47. Describe the data that remained (STARD items 15, 18, 19)
    • Who remained in the end?
    • Present this in a clear table
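  • 67. Thank you very much!
    How to write up an "analysis of diagnostic accuracy"
    Shinji Yamagata (山形伸二)
    Research Organization for Admission Selection (入学者選抜研究機構), National Center for University Entrance Examinations
    shinji.yamagata[at]gmail.com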
  • 68. References
    Benyamin et al. (in press). Childhood intelligence is heritable, highly polygenic and associated with FNBP1L. Mol Psychiatry. http://www.nature.com/mp/journal/vaop/ncurrent/full/mp2012184a.html
    Bianchi et al. (2012). Genome-wide fetal aneuploidy detection by maternal plasma DNA sequencing. Obstet Gynecol, 119, 890-901.
    Bossuyt et al. (2003). The STARD statement for reporting studies of diagnostic accuracy: Explanation and elaboration. Ann Intern Med, 138, W1-W12.
    Devers et al. (in press). Noninvasive prenatal testing/noninvasive prenatal diagnosis: the position of the National Society of Genetic Counselors. J Genet Couns. http://link.springer.com/article/10.1007/s10897-012-9564-0
    Fan et al. (2012). Non-invasive prenatal measurement of the fetal genome. Nature, 487, 320-326.
    Greely (2011). Get ready for the flood of fetal gene screening. Nature, 469, 289-291.
    Hayden (2011). Fetal gene screening comes to market. Nature, 478, 440.
    Lijmer et al. (1999). Empirical evidence of design-related bias in studies of diagnostic tests. JAMA, 282, 1061-1066.
    Lucas et al. (2010). The development of a quality appraisal tool for studies of diagnostic reliability (QAREL). J Clin Epidemiol, 63, 854-861.
    Mansfield et al. (1999). Termination rates after prenatal diagnosis of Down syndrome, spina bifida, anencephaly, and Turner and Klinefelter syndromes: a systematic literature review. Prenat Diagn, 19, 808-812.
    Palomaki et al. (2011). DNA sequencing of maternal plasma to detect Down syndrome: An international clinical validation study. Genet Med, 13, 913-920.
    Palomaki et al. (2012). DNA sequencing of maternal plasma reliably identifies trisomy 18 and trisomy 13 as well as Down syndrome: an international collaborative study. Genet Med, 14, 296-305.
    Skotko (2009). With new prenatal testing, will babies with Down syndrome slowly disappear? Arch Dis Child, 94, 823-826.
    Skotko (2011). Self-perceptions from people with Down syndrome. Am J Med Genet Part A, 155, 2360-2369.
    Skotko et al. (2011). Having a son or daughter with Down syndrome: perspectives from mothers and fathers. Am J Med Genet Part A, 155, 2335-2347.
    Streiner (2003). Diagnosing tests: using and misusing diagnostic and screening tests. J Pers Assess, 81, 209-219.
    Suzuki, S. (鈴木貞夫) (2006). 診断検査の正確さの評価と比較 [Evaluation and comparison of the accuracy of diagnostic tests]. 現代医学, 53, 513-518.
    Verweij et al. (2012). Diagnostic accuracy of noninvasive detection of fetal trisomy 21 in maternal blood: A systematic review. Fetal Diagn Ther, 31, 81-86.
    Whiting et al. (2003). The development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol, 3, 25.
    Whiting et al. (2004). Sources of variation and bias in studies of diagnostic accuracy. Ann Intern Med, 140, 189-202.
    Whiting et al. (2011). QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med, 155, 529-536.
  • 72. Supplementary material (QAREL data extraction form) http://qarel.org/
  • 73. Supplementary material (QAREL data extraction form) http://qarel.org/ … 21 items in total