International Journal for Quality in Health Care 1999; Volume 11, Number 1: pp. 21–28

Development and application of a generic
methodology to assess the quality of
clinical guidelines
FRANÇOISE A. CLUZEAU¹, PETER LITTLEJOHNS¹, JEREMY M. GRIMSHAW², GENE FEDER³ AND SARAH E. MORAN¹
¹Health Care Evaluation Unit, St George’s Hospital Medical School, London, ²Health Services Research Unit, University of Aberdeen
and ³Department of General Practice and Primary Care, St Bartholomew’s and the Royal London School of Medicine and Dentistry,
London, UK

Abstract
Background. Despite clinical guidelines penetrating every aspect of clinical practice and health policy, doubts persist over
their ability to improve patient care. We have designed and tested a generic critical appraisal instrument that assesses whether
developers have minimized the biases inherent in creating guidelines, and addressed the requirements for effective
implementation.
Design. Thirty-seven items describing suggested predictors of guideline quality were grouped into three dimensions covering
the rigour of development, clarity of presentation (including the context and content) and implementation issues. The ease
of use, reliability and validity of the instrument were tested on a national sample of guidelines for the management of asthma,
breast cancer, depression and coronary heart disease, with 120 appraisers. A numerical score was derived to allow comparison
of guidelines within and between diseases.
Results. The instrument has acceptable reliability (Cronbach’s α coefficient, 0.68–0.84; intra-class correlation coefficient,
0.82–0.90). The results provided some evidence of validity (Pearson’s correlation coefficient between appraisers’ dimension
scores and their global assessment was 0.49 for dimension one, 0.63 for dimension two and 0.40 for dimension three). The
instrument could differentiate between national and local guidelines and was easy to apply. There was variation in the
performance of guidelines with most not achieving a majority of criteria in each dimension.
Conclusions. Use of this instrument should encourage developers to create guidelines that reflect relevant research evidence
more accurately. Potential users or groups adapting guidelines for local use could apply the instrument to help decide which
one to follow. The National Health Service Executive is using the instrument to assist in deciding which guidelines to
recommend to the UK National Health Service. This methodology forms the basis of a common approach to assessing
guideline quality in Europe.
Keywords: appraisal, clinical guidelines, instrument, quality, reliability, validity

Address correspondence to Françoise Cluzeau, Health Care Evaluation Unit, St George’s Hospital Medical School, Cranmer Terrace, London, SW17 0RE, UK. Tel: +44 181 725 2771. Fax: +44 181 725 3584. E-mail: F.Cluzeau@SGHMS.ac.uk

© 1999 International Society for Quality in Health Care and Oxford University Press

Clinical guidelines are now ubiquitous in every aspect of clinical practice and health policy. They are expected to fulfil a myriad of roles from increasing the uptake of research findings [1] to facilitating the rationing of health care [2]. Whilst there is evidence that guidelines can improve clinical practice, their successful introduction is dependent on many factors, including the clinical context, methods of development, dissemination and implementation [3]. Successfully addressing all of these issues in routine practice can prove difficult [4], but is necessary if guidelines are to improve the quality of health care [5].

An increasing concern is the number of disease-specific guidelines that offer inconsistent advice [6,7]. Many reasons have been put forward to explain this variability, ranging from lack (or differing interpretation) of underlying research findings, different values given to anticipated outcome (for example clinical versus economic), dubious achievement of consensus and possible bias introduced through conflicts of interest. Faced with this diversity, potential users will want to make an informed choice. However, the information on which to base this judgement is often lacking [8,9]. Ideally, data from a formal evaluation of the ability of the guidelines to bring about the anticipated health outcomes when adhered to (defined as validity [10]) would be available; in reality there is a virtual absence of this type of outcome data for most guidelines. Moreover, when results from carefully controlled randomized trials of guidelines implementation strategies are available they may not necessarily be generalizable to a routine clinical setting [11]. In the absence of appropriate outcome indicators on which to judge effectiveness, most assessments of clinical quality substitute process and structural criteria [12]. Indeed this is often the most practical way to assess quality of care on a routine basis [13]. Using this approach to the assessment of guidelines requires the determination of whether guideline developers have been rigorous in minimizing the potential biases in creating the guideline [14], in essence, critically appraising guidelines.

There is increasing published work on how to critically appraise primary research and reviews [15–17]. This work has been stimulated by the Cochrane Collaboration [18]. However, the application of this approach to guidelines is in its infancy. In 1992 the Institute of Medicine (IOM) started the process by developing a provisional, if unwieldy, appraisal instrument based on ‘desirable attributes’ of good guidelines [19]. Subsequently, shorter checklists were produced in Canada [20] and Australia [21] but their usefulness has never been formally assessed.

In June 1993, the UK National Health Service Management Executive organized a workshop to explore the issues around assessing the quality of guidelines. A research programme was initiated to produce a generic instrument to appraise guidelines. The instrument should be capable of being applied by anyone (general or specialist clinicians, health care managers, and researchers) interested in assessing guidelines and should allow comparison between guidelines. This paper describes the creation of the instrument, an assessment of its validity and reliability, and a description of the quantity and quality of UK guidelines for the management of coronary artery disease, depression, breast cancer and asthma.

Methods

Appraisal instrument

The purpose of the appraisal instrument is to assess the extent to which clinical guidelines are ‘systematically developed’ [22], and take into account known determinants of effective strategies for dissemination and implementation. Initially the reliability and face validity of the IOM instrument were tested on five UK guidelines with seven appraisers in a pilot study [7]. Based on these results potential questions for a simplified appraisal tool were circulated to individuals interested in guideline development for comments. The revised list contained 37 items (see Appendix). These address different aspects and are categorized into three conceptual dimensions which could be mapped to the IOM attributes. The first dimension, rigour of development, reflects the attributes necessary to enhance guideline validity and reproducibility. It contains 20 items and assesses responsibility and endorsement for the guidelines, the composition of the development group, identification and interpretation of evidence, the link between evidence and main recommendation, peer review and updating. The second dimension, context and content, contains 12 items addressing the attributes of guideline reliability, applicability, flexibility and clarity. It assesses the aims of the guidelines, the target population, circumstances for applying the recommendations, presentation and format of the guidelines and estimated benefits–harms and costs. The third dimension, application, contains five items addressing the implementation, dissemination and monitoring strategies. All three dimensions assess the adequacy of documentation.

Each item inquires whether information is present and then requires a judgement about the quality of the information. The specific questions demand ‘yes’, ‘no’, ‘not sure’ answers. An option for ‘not applicable’ answers is available for some items. To ensure that the questions were interpreted consistently and to minimize the need for judgement a user manual was designed; this contained a detailed explanation of the meaning of each question [23], and suggested circumstances where a ‘yes’ answer may be appropriate. In the study a global assessment of the guidelines was asked for, as a measure of overall quality: ‘strongly recommended’ (for use in practice without modifications); ‘recommended’ (for use in practice on condition of some alterations or with provisos); or ‘not recommended’ (not suitable for use in practice).

Selection of guidelines for appraisal

Sixty guidelines were selected from a national survey of UK guidelines between January 1991 and January 1996 on coronary heart disease, asthma, breast cancer and depression (15 guidelines per disease group) [24]. The size of the sample was based on Nunnally’s recommendation that at least 300 observations are needed for inter-rater test of reliability [25]. We hypothesized that national guidelines would be more systematically developed than local ones. All 12 guidelines produced by nationally recognized organizations or commissioned by the NHS Executive were selected. Forty-eight local guidelines were drawn through a random sample. Guideline authors were asked to provide copies of their guidelines and information on how their guidelines had been developed.

Appraisers

Each guideline was assessed independently by six appraisers (120 in total). Each assessed three guidelines. Each block of three guidelines (20 blocks altogether) was assessed by the same six appraisers (Figure 1). These included a national expert in the disease area, a general practitioner, a public health physician, a hospital consultant physician, a nurse specializing in the disease area, and a researcher on guideline methodology. They were recruited through UK cardiac units, asthma centres, the Royal College of General Practitioners, respondents to the survey, the Royal College of Nursing and research institutions and were randomly allocated guidelines.
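
The block design described under Appraisers can be checked with a short sketch. This is an illustration only, not the authors' allocation procedure: the identifiers, the sequential numbering and the function name `build_design` are assumptions, and the random allocation of appraisers to guidelines is not modelled. It simply confirms that 20 blocks of three guidelines, each read by six appraisers, give 120 appraisers and 360 planned appraisals, which is above the 300 observations cited from Nunnally.

```python
# Counts taken from the Methods text; everything else is illustrative.
DISEASES = ["coronary heart disease", "asthma", "breast cancer", "depression"]
GUIDELINES_PER_DISEASE = 15
BLOCK_SIZE = 3            # guidelines per block
APPRAISERS_PER_BLOCK = 6  # appraisers who all read the same block


def build_design():
    """Return planned (disease, guideline_id, appraiser_id) assignments."""
    assignments = []
    next_guideline = 1
    next_appraiser = 1
    for disease in DISEASES:
        for _ in range(GUIDELINES_PER_DISEASE // BLOCK_SIZE):
            guidelines = range(next_guideline, next_guideline + BLOCK_SIZE)
            appraisers = range(next_appraiser, next_appraiser + APPRAISERS_PER_BLOCK)
            # Every appraiser in the block assesses every guideline in the block.
            for guideline_id in guidelines:
                for appraiser_id in appraisers:
                    assignments.append((disease, guideline_id, appraiser_id))
            next_guideline += BLOCK_SIZE
            next_appraiser += APPRAISERS_PER_BLOCK
    return assignments


design = build_design()
n_guidelines = len({g for _, g, _ in design})
n_appraisers = len({a for _, _, a in design})
n_observations = len(design)
print(n_guidelines, n_appraisers, n_observations)
```

These are planned counts; the numbers of appraisals actually analysed are lower, as reported in the Results.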


[Figure 1: grid of appraisers (rows, 1–30) against guidelines (columns, 1–15); each block of three guidelines is assessed by the same six appraisers.]

Figure 1 Design for the assessment of coronary heart disease guidelines (design repeated for other three disease areas: asthma, breast cancer and depression).

Analysis

In order to allow comparison of guideline performance, dimension scores for each guideline were calculated. A ‘yes’ response was given a value of 1 and other responses (‘no’, ‘not sure’ and ‘not applicable’) a value of zero. Individual appraisers’ dimension scores were calculated by summing their scores for each item within a dimension. A guideline dimension score was obtained by calculating the mean of the appraisers’ scores. This was then expressed as a percentage of the maximum possible score for that dimension in order to compare scores across the three dimensions.

Item dimension

We calculated Pearson’s correlation coefficients between each item and dimension scores, omitting the index item, to check that each item was in the appropriate dimension [26].

Reliability

Reliability of the instrument was assessed in two ways: first, internal consistency was measured by calculating the correlation between all items within a dimension to test to what extent they measured the same underlying concept, using Cronbach’s α coefficient [27]. Second, inter-rater agreement was measured by calculating the intra-class correlation coefficient (ICC) for the dimension scores according to the criteria of Shrout and Fleiss [28]. Calculations were based on the assumption that each guideline was assessed by a different set of appraisers.

Validity

In the absence of a gold standard or a validated measure of guideline quality, evidence of criterion validity was sought by calculating Pearson’s correlation coefficients between appraisers’ dimension scores and their global assessment of a guideline. We predicted that dimension scores for national guidelines would be higher than those for local guidelines. In an attempt to investigate validity further, analysis of variance (ANOVA) was used to test this hypothesis. ANOVA was also used to examine the effect of year of publication, disease area and level of background information on guideline dimension scores. Year of publication was classified into three categories: pre 1994, 1994–1996 and unknown. These were chosen because a number of influential papers and recommendations had been published about the development of guidelines in 1993 [29,30]. A zero skewness log transformation was used in the ANOVA for dimensions one and three because the scores were not normally distributed. Mann–Whitney tests were used on individual appraisers’ scores to examine differences between professional groups.

Appraisers who omitted at least one question in a dimension were excluded from calculations of the ICCs and Pearson’s correlation coefficients for that dimension.

Results

Background information was received for 53 guidelines. Five guidelines (three national and two local) had a background document with details of their development process. Completed structured questionnaires were available for 46 guidelines and two authors provided information in a letter. No additional information was available for seven local guidelines. Thirty-eight guidelines had been published between 1994 and 1996, 14 between 1992 and 1993 and eight documents were undated. One appraiser had been closely associated with the development of one of the guidelines and therefore assessed only two guidelines.

Figure 2 shows the distribution of guideline scores for each dimension. Over two-thirds of guidelines scored less than 50 on dimension one, which means that less than 50% of criteria for rigorous development were met. The median for dimension one was 30.4 with a wide range of 0.8–85. The median score was higher for dimension two (47.9). Performance was poorest on dimension three (median 24.2). The distribution for this dimension was very skewed with scores ranging from 0 to 95.

Item dimension

Items were in the appropriate dimension as all but two correlated more highly with their dimension scores than with the other two dimensions’ scores (table of results available from the authors).

Reliability

All three dimensions had good internal consistency (Cronbach’s α, 0.68–0.84) and excellent inter-rater agreement (ICCs, 0.82–0.90) and narrow confidence intervals [31] (Table 1).
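
The scoring rule from the Analysis section and the internal-consistency statistic reported here can be sketched together as follows. This is a minimal illustration under stated assumptions, not the authors' software: the function names are hypothetical, answers are assumed to be stored as plain strings, and the textbook sample-variance form of Cronbach’s α is used because the paper does not give its exact computational formula.

```python
from statistics import mean, variance

# Item counts per dimension as stated in the Methods (20, 12 and 5 items).
ITEMS_PER_DIMENSION = {1: 20, 2: 12, 3: 5}


def item_value(answer):
    """'yes' scores 1; 'no', 'not sure' and 'not applicable' score 0."""
    return 1 if answer == "yes" else 0


def appraiser_score(answers):
    """Sum of item values for one appraiser within one dimension."""
    return sum(item_value(a) for a in answers)


def guideline_dimension_score(answers_by_appraiser, dimension):
    """Mean appraiser score as a percentage of the dimension maximum."""
    max_score = ITEMS_PER_DIMENSION[dimension]
    for answers in answers_by_appraiser:
        if len(answers) != max_score:
            raise ValueError("each appraiser must answer every item")
    raw = mean(appraiser_score(a) for a in answers_by_appraiser)
    return 100.0 * raw / max_score


def cronbach_alpha(item_matrix):
    """Cronbach's alpha; rows are appraisals, columns are 0/1 item values."""
    k = len(item_matrix[0])
    item_variances = [variance(column) for column in zip(*item_matrix)]
    total_variance = variance([sum(row) for row in item_matrix])
    if k < 2 or total_variance == 0:
        raise ValueError("alpha is undefined for this input")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from two appraisers on the five application items.
example_answers = [
    ["yes", "no", "not sure", "yes", "not applicable"],
    ["yes", "yes", "no", "no", "no"],
]
example_score = guideline_dimension_score(example_answers, dimension=3)
print(example_score)  # each appraiser scores 2 of 5, so this prints 40.0
```

Expressing the score as a percentage of the maximum is what makes a 5-item dimension comparable with a 20-item one.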



Validity
The Pearson’s correlation coefficients between appraisers’
dimension scores and their global assessment were 0.49 (n=
311) for dimension one, 0.63 (n=319) for dimension two
and 0.40 (n=315) for dimension three. All coefficients were
highly significant (P<0.0001), providing evidence of criterion
validity. However it should be noted that the appraisers made
their global assessment after completing the instrument, so
a significant correlation would be expected.
Mean standardized guideline scores are presented in Table
2. National guidelines had a significantly higher score than
local guidelines for the three dimensions (dimension one
P<0.001, dimension two P=0.0008, dimension three P=
0.04), confirming our a priori hypothesis, and hence providing
some further evidence of validity. Guidelines with a back-
ground document or a form performed significantly better
than others on dimension one (P<0.001), although numbers
were small and confidence intervals were wide.
Median scores for researchers were significantly lower than
those for nurses in all dimensions. The median scores for
consultant physicians, general practitioners and public health
physicians were significantly higher than those of the re-
searchers for dimension two (table of results available from
the authors).
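
The criterion-validity check, correlating each appraiser’s dimension score with their global assessment, can be sketched as below. The numeric coding of the three global categories (0, 1, 2) is an assumption made for illustration; the paper does not state how the categories were coded. The helper names are hypothetical, and a missing dimension score is represented by `None` to mirror the exclusion of appraisers who omitted a question.

```python
from math import sqrt

# Assumed ordinal coding of the global assessment (not given in the paper).
GLOBAL_CODES = {
    "not recommended": 0,
    "recommended": 1,
    "strongly recommended": 2,
}


def pearson_r(xs, ys):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def criterion_validity(appraisals):
    """Correlate dimension scores with coded global assessments.

    appraisals: iterable of (dimension_score or None, global_label).
    Appraisals with a missing dimension score are dropped.
    Returns (r, n) where n is the number of appraisals used.
    """
    complete = [(s, GLOBAL_CODES[g]) for s, g in appraisals if s is not None]
    scores = [s for s, _ in complete]
    codes = [c for _, c in complete]
    return pearson_r(scores, codes), len(complete)


# Hypothetical data: four appraisals, one with an omitted question.
r, n_used = criterion_validity([
    (4, "not recommended"),
    (9, "recommended"),
    (None, "recommended"),
    (14, "strongly recommended"),
])
print(round(r, 3), n_used)
```

Because the global assessment was made after completing the instrument, a positive correlation from such a calculation would be expected, as the authors note.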

Discussion
This study has shown that it is feasible to develop an instrument that can be used for appraising the methodological quality of clinical guidelines. The instrument has good reliability and there is suggestion of validity. Assessing the quality of any health care intervention is complex because of the multidimensional nature of the concept [32], and guidelines are no exception [33]. Although the linear separation of the creation process into development, dissemination, and implementation provides a useful framework [3], interactions between these stages and differing perceptions by the various participants (developer, user, patient and payer) of what are satisfactory outcomes create a more complicated picture. It is possible to use randomized controlled trials to assess whether guidelines can change practice in the required direction [34,35], but observational studies can also provide useful information on the performance of a guideline in practice [36]. However, even this level of evaluative data is rarely available for newly developed guidelines. Furthermore, researchers require reassurance that they have a ‘good enough’ guideline before embarking on a long and expensive evaluation. The development of a useable, valid and reliable generic instrument to assess the rigour with which guidelines are created provides an essential first step in the evaluative process. Aside from the use of guidelines in research, potential users of the guidelines (either for direct patient care or commissioning of health care) and groups adapting guidelines for local use need to make systematic and reliable judgements of their quality. This instrument provides a basis for these judgements.

Figure 2 Frequency distribution of guidelines’ standardized scores by dimension.

Table 1 Cronbach’s α and intraclass correlation coefficient (ICC)

Dimension                    Cronbach’s α    ICC (95% CI∗)
1. Rigour of development     0.84            0.90 (0.85–0.93)
2. Context and content       0.78            0.82 (0.74–0.89)
3. Application               0.68            0.84 (0.77–0.90)

∗ Confidence intervals (CI) were calculated using Wald’s Method [28].
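
For readers unfamiliar with the ICC column in Table 1, the following sketch computes a one-way random-effects intraclass correlation in the style of Shrout and Fleiss. It assumes the single-rater form, often written ICC(1,1), and a balanced design in which every guideline has the same number of appraisers; the paper states only that each guideline was treated as assessed by a different set of appraisers, so the exact variant used by the authors is not confirmed here. The function name is hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects ICC for a single rater, balanced data.

    scores: list of per-guideline lists, each holding the k appraiser
    scores for that guideline on one dimension.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 guidelines, each with the same k >= 2 scores")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-guideline and within-guideline mean squares.
    bms = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    wms = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    if bms + (k - 1) * wms == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (bms - wms) / (bms + (k - 1) * wms)


# Hypothetical dimension scores: three guidelines, three appraisers each.
perfect_agreement = icc_oneway([[1, 1, 1], [5, 5, 5], [9, 9, 9]])
print(perfect_agreement)
```

With perfect agreement the within-guideline mean square is zero and the coefficient reaches its maximum of 1; disagreement among appraisers of the same guideline pulls it down.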



Table 2 Mean standardized guidelines scores and their confidence intervals (CI) for each dimension according to type of guideline, level of information, disease area and year of production

                                   Dimension 1            Dimension 2            Dimension 3
                                   Mean     95% CI        Mean     95% CI        Mean     95% CI
All guidelines (n=60)              34.0     29.6–38.3     46.2     41.8–50.7     29.0     22.9–35.1
Type of guideline
  Local (n=48)¹                    29.2     25.3–33.1     42.3     37.5–47.0     26.5     19.9–33.2
  National (n=12)                  52.9³    42.7–63.1     62.0³    55.0–68.9     38.8²    23.0–54.7
Level of information
  Nothing (n=7)¹                   12.4     3.0–21.9      40.4     19.9–60.9     14.3     −4.6–33.1
  Letter (n=2)                     28.8     2.3–55.2      58.3     40.7–76.0     20.6     −88.5–129.6
  Form (n=46)                      35.8³    31.3–40.3     45.9     41.0–50.8     31.1     24.1–38.1
  Document (n=5)                   49.1³    27.7–70.5     52.5     29.5–75.5     34.0     −2.0–70.0
Disease area
  Cancer (n=15)¹                   33.6     25.3–42.0     48.1     40.0–56.2     29.6     16.5–42.7
  Depression (n=15)                33.1     26.8–39.3     46.2     35.2–57.2     29.5     20.4–38.6
  Asthma (n=15)                    34.1     21.4–46.8     46.0     35.6–56.3     27.3     13.7–40.9
  Coronary heart disease (n=15)    35.1     25.3–44.9     44.5     35.2–53.8     29.6     12.8–46.4
Year of publication
  Unknown (n=8)¹                   23.0     8.6–37.4      33.9     22.9–44.9     12.8     1.9–23.7
  1994–1996 (n=38)                 36.5     30.9–42.2     47.5     42.1–53.0     34.3²    26.0–42.5
  Pre-1994 (n=14)                  33.3     25.4–41.1     49.6     38.8–60.5     24.0     12.9–35.1

¹ Reference category.
² 0.001 < P < 0.05.
³ P ≤ 0.001.

To our knowledge this is the first time a study of its kind has been undertaken and there are a number of methodological issues that need to be addressed. First, the development of appraisal criteria is conceptually similar to the development of instruments or checklists for assessing the quality of randomized controlled trials [15,37]. This means that the instrument will require regular testing and revision. Further work on the instrument will need to be undertaken to examine issues of validity, item refinement and weighting of items. Second, an apparently rigorous development process can still hide aberrant clinical recommendations. Therefore detailed analysis and comparison of the clinical content of guidelines is also necessary to ensure that the recommendations are clinically sound before they can be adopted. For example, we will need to test the instrument against guidelines that have been shown to have predictive validity, such as the Ottawa ankle rules [38]. Third, it remains an assumption that the structural and process factors that are assessed by the instrument are true determinants of valid and effective guidelines. It would be reassuring if the validity of this approach itself (as opposed to the validity of the instrument) could be subjected to a formal evaluation. However, this can only happen if the instrument is widely used and the performance of the guidelines as described by the instrument is compared to an external standard. This is beginning to happen as the instrument has been translated into Italian and French and is being used by the NHS Executive in the UK to help to decide which guidelines are to be recommended to the NHS (a dichotomous outcome for each guideline) [39]. It has formed the basis of a BIOMED-2 research project involving 10 European countries and Canada to identify the reasons for differences in guideline recommendations across countries [40]; this will provide a further test of validity by assessing the usefulness of the scoring system in explaining inconsistencies of guidelines recommendations.

In practice, a key indicator of its value will be whether it is perceived to be useful in helping developers to address the issues necessary to produce good quality guidelines and potential users (individuals and organizations) to decide on which guidelines to use. The progression from a qualitative to quantitative assessment of guidelines should facilitate this process. Rather than splitting guidelines into good or bad, the instrument provides a numerical description of a guideline in three key dimensions. This allows potential users to relate a guideline to the whole population of guidelines and then to decide on the basis of their requirements. For example, a hospital wishing to introduce a guideline for the management of asthma may want to look at guideline X because of its rigour of development but also to review guideline Y because it scores high on clarity. This approach could provide a quality dimension to the databases of guidelines that are now emerging in Canada [41], America [42] and Germany [43].



Acknowledgements

The authors thank the guidelines authors for making their documents available, the Royal College of General Practitioners for recruiting the general practitioners, Professor Martin Bland for his helpful comments on earlier drafts, and the appraisers.

References

1. Haines A, Jones R. Implementing findings of research. Br Med J 1994; 308: 1488–1492.
2. Durand-Zaleski I, Colin C, Blum-Boisgard C. An attempt to save money by using mandatory practice guidelines in France. Br Med J 1997; 315: 943–946.
3. Grimshaw JM, Russell IT. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet 1993; 342: 1317–1322.
4. Paccaud F. Variation in guidelines. J Health Serv Res Policy 1997; 2: 53–55.
5. Brook RH. Implementing medical guidelines. Lancet 1995; 346: 132.
6. Swales JD. Guidelines on guidelines. J Hypertension 1993; 11: 899–903.
7. Thomson R, McElroy H, Sudlow M. Guidelines on anticoagulant treatment in atrial fibrillation in Great Britain: variation in content and implications for treatment. Br Med J 316: 509–513.
8. Cluzeau F, Littlejohns P, Grimshaw J, Hopkins A. Appraising clinical guidelines and the development of criteria: a pilot study. J Interprofes Care 1995; 9: 227–235.
9. Ward JE, Grieco V. Why we need guidelines for guidelines: a study of the quality of clinical practice guidelines in Australia. Med J Aust 1996; 165: 574–576.
10. Institute of Medicine. Field MJ, Lohr K. (eds) Guidelines for Clinical Practice. From Development to Use. Washington DC: National Academy Press, 1992.
11. Black N. Why we need observational studies to evaluate the effectiveness of health care. Br Med J 1996; 312: 1215–1218.
12. Donabedian A. The Criteria and Standards of Quality. Explorations in Quality Assessment and Monitoring. Ann Arbor, MI: Health Administration Press, 1982.
13. Ierodiakonou K, Vandenbroucke JP. Medicine as a stochastic art. Lancet 1993; 341: 542–543.
14. Baker R, Feder G. Clinical guidelines: where next? Int J Qual Health Care 1997; 9: 399–404.
15. Downs SH, Black N. The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Commun Health 1998; 52: 377–384.
16. Moher D, Pham B, Jones A et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet 1998; 352: 609–613.
17. Moher D, Jadad AR, Nichol G et al. Assessing the quality of randomised controlled trials: an annotated bibliography of scales and checklists. Control Clin Trials 1995; 16: 62–73.
18. Chalmers I, Haynes B. Reporting, updating, and correcting systematic reviews of the effects of health care. Br Med J 1994; 309: 862–865.
19. Lohr KN, Field MJ. A Provisional Instrument for Assessing Clinical Practice Guidelines. In Institute of Medicine, Field MJ, Lohr K, eds, Guidelines: from Practice to Use. Washington DC: National Academy Press, 1992. (Appendix B).
20. Hayward RS, Wilson MC, Tunis SR et al. More informative abstracts of articles describing clinical practice guidelines. Ann Intern Med 1993; 118: 731–737.
21. Liddle J, Williamson M, Irwig L. Method for Evaluating Research and Guideline Evidence. Sydney: NSW Health Department, 1996.
22. Institute of Medicine. Field MJ, Lohr K. (eds) Clinical Practice Guidelines: Directions for a New Program: 38. Washington DC: National Academy Press, 1990.
23. Ware Jr JE, Snow K, Kosinski M, Gandek B. SF-36 Health Survey: Manual and Interpretation Guide. Boston, MA: New England Medical Center, 1993.
24. Cluzeau F, Littlejohns P, Grimshaw J, Feder G. National survey of UK guidelines for the management of coronary heart disease, lung and breast cancer, asthma and depression. J Clin Effect 1997; 2: 120–123.
25. Nunnally JC. Psychometric Theory. New York, NY: McGraw-Hill, 1981.
26. Streiner DL, Norman GR. Health Measurement Scales. A Practical Guide to their Development and Use, 2nd edn. New York, NY: Oxford University Press, 1995.
27. Bland JM, Altman DG. Cronbach’s Alpha: Statistics Notes. Br Med J 1997; 314: 572.
28. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psycholog Bull 1979; 86: 420–428.
29. Grimshaw JM, Russell IT. Achieving health gain through clinical guidelines. I: Developing scientifically valid guidelines. Qual Health Care 1993; 2: 243–248.
30. Woolf SH. Practice guidelines, a new reality in medicine II. Methods of developing guidelines. Arch Intern Med 1992; 152: 946–952.
31. Burdick RK, Maqsood F, Graybill FA. Confidence intervals on the intraclass correlation in the unbalanced one-way classification. Commun Stats – Theory Methods 1986; 15: 3353–3378.
32. Maxwell RJ. Quality assessment in health. Br Med J 1984; 288: 1470–1472.
33. Margolis CZ. Methodology Matters – VII. Clinical Practice Guidelines: Methodological Considerations. Int J Qual Health Care 1997; 9: 303–306.
34. Effective Health Care: Implementing Clinical Practice Guidelines: can Guidelines be used to Improve Clinical Practice? Leeds: University of Leeds, 1994.
35. Worrall G, Chaulk P, Freake D. The effects of clinical practice guidelines on patients outcomes in primary care: a systematic review. Can Med Assoc J 1997; 156: 1705–1712.
36. Steinhoff MC, Abd El Khalek MK, Khallaf N et al. Effectiveness of clinical guidelines for the presumptive treatment of streptococcal pharyngitis in Egyptian children. Lancet 1997; 350: 918–921.
37. Jadad AR, Moore RA, Carroll D et al. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Control Clin Trials 1996; 17: 1–12.
38. Stiell IG, Greenberg GH, McKnight RD et al. Decision rules for the use of radiography in acute ankle injuries. Refinement and prospective validation. J Am Med Assoc 1993; 269: 1127–1132.
39. CMO Update 16. 1997: Number 8.
40. Littlejohns P, Cluzeau F. Promoting the rigorous development of clinical guidelines in Europe through the creation of a common appraisal instrument. Scientific Basis for Health Services, Amsterdam, 1997 (abstract).
41. Graham I, Beardall S, Carter A, Laupacis A. The state of the art of practice guidelines development, dissemination, and evaluation in Canada. Scientific Basis for Health Services, Amsterdam, 1997.
42. Stephenson J. Revitalized AHCPR pursues Research on Quality. J Am Med Assoc 1997; 278: 1557.
43. Lauterbach KW, Lubecki P, Oesingmann U et al. A concept for a clearing procedure for guidelines in Germany (in German). Zeitschrift Fur Arztliche Fortbildung Und Qualitatssicherung 1997; 91: 283–288.

Appendix. Appraisal instrument

Dimension one: rigour of development process

1. Is the agency responsible for the development of the guidelines clearly identified?
2. Was external funding or other support received for developing the guidelines?
3. If external funding or support was received, is there evidence that the potential biases of the funding body(ies) were taken into account?
4. Is there a description of the individuals (e.g. professionals, interest groups – including patients) who were involved in the guidelines development group?
5. If so, did the group contain representatives of all key disciplines?
6. Is there a description of the sources of information used to select the evidence on which the recommendations are based?
7. If so, are the sources of information adequate?
8. Is there a description of the method(s) used to interpret and assess the strength of evidence?
9. If so, is (are) the method(s) for rating the evidence adequate?
10. Is there a description of the methods used to formulate the recommendations?
11. If so, are the methods satisfactory?
12. Is there an indication of how the views of interested parties not on the panel were taken into account?
13. Is there an explicit link between the major recommendations and the level of supporting evidence?
14. Were the guidelines independently reviewed prior to publication/release?
15. If so, is explicit information given about the methods and how comments were addressed?
16. Were the guidelines piloted?
17. If so, is explicit information given about the methods used and the results adopted?
18. Is there a mention of a date for reviewing or updating the guidelines?
19. Is the body responsible for the reviewing and updating clearly identified?
20. Overall, have the potential biases of guideline development been adequately dealt with?

Dimension two: context and content

21. Are the reasons for developing the guidelines clearly stated?
22. Are the objectives of the guidelines clearly defined?
23. Is there a satisfactory description of the patients to which the guidelines are meant to apply?
24. Is there a description of the circumstances (clinical or non-clinical) in which exceptions might be made in using the guidelines?
25. Is there an explicit statement of how the patient’s preferences should be taken into account in applying the guidelines?
26. Do the guidelines describe the condition to be detected, treated, or prevented in unambiguous terms?
27. Are the different possible options for the management of the condition clearly stated in the guidelines?
28. Are the recommendations clearly presented?
29. Is there an adequate description of the health benefits that are likely to be gained from the recommended management?
30. Is there an adequate description of the potential harms or risks that may occur as a result of the recommended management?
31. Is there an estimate of the costs or expenditures likely to incur from the recommended management?
32. Are the recommendations supported by the estimated benefits, harms and costs of the intervention?



Dimension three: application of guidelines

33. Does the guideline document suggest possible methods for dissemination and implementation?
34. (National guidelines only) Does the guideline document identify key elements which need to be considered by local guideline groups?
35. Does the guideline document specify criteria for monitoring compliance?
36. Does the guideline document identify clear standards or targets?
37. Does the guideline document define measurable outcomes that can be monitored?

Accepted for publication 28 September 1998
