- 1. © 2008 WebMD, Inc. All rights reserved. ACS Surgery: Principles and Practice
ELEMENTS OF CONTEMPORARY PRACTICE 2 PERFORMANCE MEASURES IN SURGICAL PRACTICE — 1
2 PERFORMANCE MEASURES IN
SURGICAL PRACTICE
John D. Birkmeyer, M.D., F.A.C.S., and Justin B. Dimick, M.D., M.P.H.
With the growing recognition that the quality of surgical care varies widely, there is a rising demand for good measures of surgical performance. Patients and their families need to be able to make better-informed decisions about where to get their surgical care, and from whom.1 Employers and payers need data on which to base their contracting decisions and pay-for-performance initiatives.2 Finally, clinical leaders need tools that can help them identify “best practices” and guide their quality-improvement efforts. To meet these different needs, an ever-broadening array of performance measures is being developed.

The consensus about the general desirability of surgical performance measurement notwithstanding, there remains considerable uncertainty about which specific measures are most effective in measuring surgical quality. The measures currently in use are remarkably heterogeneous, encompassing a range of different elements. In broad terms, they can be grouped into three main categories: measures of health care structure, process-of-care measures, and measures reflecting patient outcomes. Although each of these three types of performance measure has its unique strengths, each is also associated with conceptual, methodological, or practical problems [see Table 1]. Obviously, the baseline risk and frequency of the procedure are important considerations in weighing the strengths and weaknesses of different measures.3 So too is the underlying purpose of performance measurement; for example, measures that work well when the primary intent is to steer patients to the best hospitals or surgeons (selective referral) may not be optimal for quality-improvement purposes.

Several reviews of performance measurement have been published in the past few years.3-5 In what follows, we expand on these reviews, providing an overview of the measures commonly used to assess surgical quality, considering their main strengths and limitations, and offering recommendations for selecting the optimal quality measure.

Overview of Current Performance Measures

The number of performance measures that have been developed for the assessment of surgical quality is already large and continues to grow. For present purposes, it should be sufficient to consider a representative list of commonly used quality indicators that have been endorsed by leading quality-measurement organizations or have already been applied in hospital accreditation, pay-for-performance, or public reporting efforts [see Tables 2 and 3]. A more exhaustive list of performance measures is available on the National Quality Measures Clearinghouse (NQMC) Web site (http://www.qualitymeasures.ahrq.gov), sponsored by the Agency for Healthcare Research and Quality (AHRQ).

Over the past few years, the National Quality Forum (NQF) has emerged as the leading organization endorsing quality measures. Many other organizations, including the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) and the Centers for Medicare and Medicaid Services (CMS), rely on the endorsement of the NQF before applying a measure to practice. The number of measures relevant to surgery that have been endorsed by the NQF has grown rapidly [see Table 2]. Many of these new measures were vetted as part of CMS’s Surgical Care Improvement Project (SCIP), which includes process measures related to prevention of surgical site infections (SSIs), postopera-
Table 1 Primary Strengths and Limitations of Structural, Process, and Outcome Measures

Type of Measure: Structural
  Examples: Procedure volume; intensivist-managed ICU
  Strengths: Measures are expedient and inexpensive. Measures are efficient (a single one may relate to several outcomes). For some procedures, measures predict subsequent performance better than process or outcome measures do.
  Limitations: Number of measures is limited. Measures are generally not actionable. Measures do not reflect individual performance and are considered unfair by providers.

Type of Measure: Process of care
  Examples: Appropriate use of prophylactic antibiotics
  Strengths: Measures reflect care that patients actually receive (hence, greater buy-in from providers). Measures are directly actionable for quality-improvement activities. For many measures, risk adjustment is unnecessary.
  Limitations: Many measures are hard to define with existing databases. Extent of linkage between measures and important patient outcomes is variable. High-leverage, procedure-specific measures are lacking.

Type of Measure: Direct outcome
  Examples: Risk-adjusted mortalities for CABG from state or national registries
  Strengths: Face validity. Measurement may improve outcomes in and of itself (Hawthorne effect).
  Limitations: Sample sizes are limited. Clinical data collection is expensive. Concerns exist about risk adjustment with administrative data.

CABG: coronary artery bypass grafting
tive cardiac events, venous thromboembolism (VTE), and respiratory complications.

Although the NQF is the central organization for evaluating candidate quality measures, many other organizations continue to create their own quality indicators [see Table 3]. The AHRQ has focused primarily on quality measures that take advantage of readily available administrative data. Because little information on process of care is available in these datasets, these measures are mainly structural (e.g., hospital procedure volume) or outcome-based (e.g., risk-adjusted mortality). The Leapfrog Group (http://www.leapfroggroup.org), a coalition of large employers and purchasers, has developed perhaps the most visible set of surgical quality indicators for its value-based purchasing initiative. The organization’s original (2000) standards focused exclusively on procedure volume, but their current (2006) standards include selected process variables (e.g., the use of beta blockers in patients undergoing abdominal aortic aneurysm repair) and outcome measures. In the near future, the Leapfrog Group may begin using a composite of operative mortality and hospital volume as the primary measure for their evidence-based hospital referral initiative. Such composite measures are discussed further elsewhere (see below).

Table 2 Clinical Performance Measures Relevant to Surgery That Have Been Endorsed by National Quality Forum*

Diagnosis or Procedure: Performance Measure
Coronary artery bypass grafting: Use of internal mammary artery; preoperative beta blocker; deep sternal wound infection rate; prolonged intubation; renal insufficiency; surgical reexploration; hospital volume
Aortic valve replacement: Risk-adjusted mortality; hospital volume
Mitral valve replacement: Risk-adjusted mortality; hospital volume
Any cardiac surgery: Use of antiplatelet agents, antilipid drugs, and beta blockers on discharge; participation in a cardiac surgery registry; preoperative beta blocker; renal insufficiency; prolonged intubation; stroke
Surgery for breast cancer: Radiation therapy after breast conservation surgery; adjuvant chemotherapy for appropriate candidates
Surgery for colon cancer: Adjuvant chemotherapy for appropriate candidates; at least 12 lymph nodes identified in surgical specimen
Surgery for rectal cancer: Adjuvant radiation therapy for patients with rectal cancer
Any surgical procedure: VTE prophylaxis; appropriate timing, selection, and discontinuance of prophylactic antibiotics
Any hospitalized patient, including all postoperative patients: Central venous catheter infection rate; urinary catheter–associated infection rate; ventilator-associated pneumonia rate

*As of August 2007.

Table 3 Other Performance Measures Currently Used in Surgical Practice

Diagnosis or Procedure: Performance Measure (Developer/Endorser)
Critical illness: Staffing with board-certified intensivists (LF)
Abdominal aneurysm repair: Hospital volume (AHRQ, LF); risk-adjusted mortality (AHRQ); prophylactic beta blockers (LF)
Carotid endarterectomy: Hospital volume (AHRQ)
Esophageal resection for cancer: Hospital volume (AHRQ)
Pancreatic resection: Hospital volume (AHRQ, LF); risk-adjusted mortality (AHRQ)
Pediatric cardiac surgery: Hospital volume (AHRQ); risk-adjusted mortality (AHRQ)
Hip replacement: Risk-adjusted mortality (AHRQ)
Craniotomy: Risk-adjusted mortality (AHRQ)
Cholecystectomy: Laparoscopic approach (AHRQ)
Appendectomy: Avoidance of incidental appendectomy (AHRQ)

AHRQ: Agency for Healthcare Research and Quality; LF: Leapfrog Group

Structural Measures

The term health care structure refers to the setting or system in which care is delivered. Many structural performance measures reflect hospital-level attributes, such as the physical plant and resources or the coordination and organization of the staff (e.g., the registered nurse–bed ratio and the designation of a hospital as a level I trauma center). Other structural measures reflect physician-level attributes (e.g., board certification, subspecialty training, and procedure volume).

STRENGTHS

Structural performance measures have several attractive features. One strength of such measures is that many of them are strongly related to outcomes. For example, with esophagectomy and pancreatic resection for cancer, operative mortality is as much as 10% lower, in absolute terms, at very high volume hospitals than at lower-volume centers.6,7 In some instances, structural measures (e.g., procedure volume) are better predictors of subsequent hospital performance than any known process or outcome measures are [see Figure 1].8

A second strength is efficiency. A single structural measure may be associated with numerous outcomes. For example, with some types of cancer surgery, higher hospital or surgeon procedure volume is associated not only with lower operative mortality but also with lower perioperative morbidity and improved late survival.9-11 Intensivist-staffed intensive care units are linked to shorter lengths of stay and reduced use of resources, as well as to lower mortality.12,13

The third, and perhaps most important, strength of structural measures is expediency. Many such measures can easily be assessed with readily available administrative data. Although some structural measures require surveying of hospitals or providers, such data are much less expensive to collect than data obtained through review of individual patients’ medical records.
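To make the phrase "in absolute terms" concrete, the volume-outcome gap cited under Structural Measures can be restated in relative terms and as the number of patients who would need to be treated at the higher-volume hospital to avert one operative death. The sketch below is illustrative only: the 10-percentage-point gap is the upper bound quoted in the text, but the two baseline mortality rates are hypothetical values chosen so that they differ by that amount, and `compare_mortality` is an illustrative helper, not a published method.

```python
def compare_mortality(lower_volume_rate, higher_volume_rate):
    """Summarize the gap between two operative mortality rates (as fractions)."""
    if not 0 <= higher_volume_rate <= lower_volume_rate <= 1:
        raise ValueError("expected 0 <= higher_volume_rate <= lower_volume_rate <= 1")
    absolute_difference = lower_volume_rate - higher_volume_rate
    if absolute_difference == 0:
        raise ValueError("rates are equal; there is no gap to summarize")
    return {
        # Difference in percentage points, expressed as a fraction.
        "absolute_difference": absolute_difference,
        # Share of the lower-volume mortality that the gap represents.
        "relative_reduction": absolute_difference / lower_volume_rate,
        # Patients treated at the higher-volume hospital per death averted.
        "patients_per_death_averted": 1 / absolute_difference,
    }


# Hypothetical rates: 15% at lower-volume centers, 5% at very high volume hospitals.
summary = compare_mortality(0.15, 0.05)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these assumed inputs, a 10-percentage-point absolute gap corresponds to about 10 patients treated at the higher-volume hospital per death averted; the relative reduction depends entirely on the assumed baseline rate.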
Figure 1 Illustrated is the relative ability of historical (1998–2001) measures of hospital volume and risk-adjusted mortality to predict subsequent (2002–2003) risk-adjusted mortality in U.S. Medicare patients. Panel a: esophagectomy. Panel b: pancreatic resection. In each panel, the vertical axis is observed mortality (%), 2002–2003, and the horizontal axis groups hospitals by quartiles of the historical (1998–2001) performance measures (hospital volume and historical mortality).
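The kind of comparison summarized in Figure 1 can be sketched in a few lines of Python: rank hospitals by a historical measure, split the ranking into quartiles, and compute pooled mortality in a later period for each quartile. The hospital records below are invented for illustration and are not the Medicare data behind the figure; the field names and the helper `mortality_by_quartile` are hypothetical, and the pooled rates are unadjusted rather than risk-adjusted.

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    name: str
    historical_volume: int  # cases in the earlier period
    later_cases: int        # cases in the later period
    later_deaths: int       # deaths in the later period


def mortality_by_quartile(hospitals, key):
    """Rank hospitals by `key` (ascending), split the ranking into four
    near-equal groups, and return pooled later-period mortality (%) per group."""
    ranked = sorted(hospitals, key=key)
    n = len(ranked)
    rates = []
    for q in range(4):
        group = ranked[q * n // 4:(q + 1) * n // 4]
        cases = sum(h.later_cases for h in group)
        deaths = sum(h.later_deaths for h in group)
        rates.append(100 * deaths / cases if cases else float("nan"))
    return rates


# Hypothetical hospitals, listed from lowest to highest historical volume.
hospitals = [
    Hospital("A", 2, 4, 1),
    Hospital("B", 3, 6, 1),
    Hospital("C", 6, 10, 2),
    Hospital("D", 8, 10, 1),
    Hospital("E", 15, 20, 2),
    Hospital("F", 20, 30, 3),
    Hospital("G", 40, 50, 3),
    Hospital("H", 60, 50, 2),
]

rates = mortality_by_quartile(hospitals, key=lambda h: h.historical_volume)
print(rates)  # lowest-volume quartile first: [20.0, 15.0, 10.0, 5.0]
```

The same function could rank on historical mortality instead of volume by passing a different `key`, which is how two candidate measures might be compared on their ability to predict later results.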
LIMITATIONS

One limitation of structural performance measures is that relatively few of them are strongly linked to patient outcomes and thus potentially useful as quality indicators. A second limitation is that most structural measures, unlike most process measures, are not readily actionable. For example, a small hospital can increase the percentage of its surgical patients who receive antibiotic prophylaxis, but it cannot easily make itself a high-volume center. Thus, although some structural measures may be useful for selective referral initiatives, they are of limited value for quality improvement.

A third limitation is that whereas some structural measures can identify groups of hospitals or providers that perform better on average, they are not adequate discriminators of performance among individuals. For example, in the aggregate, high-volume hospitals have a much lower operative mortality for pancreatic resection than lower-volume centers do. Nevertheless, some individual high-volume hospitals may have a high mortality, and some individual low-volume centers may have a low mortality (though the latter possibility may be difficult to confirm because of the smaller sample sizes involved).14 For this reason, many providers view structural performance measures as unfair.

Process Measures

Processes of care are the clinical interventions and services provided to patients. Process measures have long been the predominant quality indicators for both inpatient and outpatient medical care, and their popularity as quality measures for surgical care is growing rapidly. Perhaps the best example of the trend toward using process measures is SCIP, which, as noted (see above), focuses exclusively on processes related to prevention of SSIs, postoperative cardiac events, VTE, and respiratory complications.

STRENGTHS

A strength of process measures is their direct connection to patient management. Because they reflect the care that physicians actually deliver, they have substantial face validity and hence greater “buy-in” from providers. Such measures are usually directly actionable and thus are a good substrate for quality-improvement activities.

A second strength is that risk adjustment, though important for outcome measures, is not required for many process measures. For example, appropriate prophylaxis against postoperative VTE is one performance measure in CMS’s expanding pay-for-performance initiative and is part of SCIP. Because it is widely agreed that virtually all patients undergoing open abdominal procedures should be offered some form of prophylaxis, there is little need to collect detailed clinical data about illness severity for the purposes of risk adjustment.

Another strength is that process measures are generally less constrained by sample-size problems than outcome measures are. Important outcome measures (e.g., perioperative death) are relatively rare, but most targeted process measures are relevant to a much larger proportion of patients. Moreover, because process measures generally target aspects of general perioperative care, they can often be applied to patients who are undergoing numerous different procedures, thereby increasing sample sizes and, ultimately, improving the precision of the measurements.

LIMITATIONS

One practical limitation of process measures is the lack of a reliable infrastructure for collecting the necessary data. Administrative datasets do not have the clinical detail and specificity required for this task. Measurement systems based on clinical data, including that of the National Surgical Quality Improvement Program (NSQIP) of the Department of Veterans Affairs (VA),15 focus on patient characteristics and outcomes and do not collect information on processes of care. Currently, most pay-for-performance programs rely on self-reported information from hospitals, but the reliability of such data is uncertain (particularly when reimbursement is at stake).

Even if this first limitation were overcome, there remains a second limitation to be considered: namely, that process variables are limited in their ability to explain observed variations in mortality. There is a growing body of empirical data supporting this statement. Most of the data come from the literature on medical diagnoses (e.g., acute myocardial infarction), where the link between process and outcome is much stronger than it is in surgery.16,17 For example, the JCAHO/CMS process measures for acute myocardial infarction explained only 6% of the observed variation in risk-adjusted mortality for this condition.17

Although to date, no analogous study has been done in surgery,
there is some reason to believe that existing process measures explain very little of the variation in important surgical outcomes. First, most process measures currently used in surgery relate to secondary rather than primary outcomes. For example, although the value of antibiotic prophylaxis in reducing the risk of superficial SSI should not be underestimated, this complication is not among the most important adverse events of major surgery (including death). Second, process measures in surgery often relate to complications that are very rare. For example, there is a consensus that prophylaxis for VTE is necessary and important. Accordingly, the SCIP measures, endorsed by the NQF, include the use of appropriate prophylaxis. However, pulmonary embolism is very uncommon, and therefore, improving adherence to these measures will not avert many deaths. Until a better understanding is achieved regarding which details account for variations in the most important complications, especially those adverse events leading to death, process measures will continue to be of limited usefulness in surgical quality improvement.

Outcome Measures

Direct outcome measures reflect the end result of care, either from a clinical perspective or from the patient’s viewpoint. Mortality is by far the most commonly used surgical outcome measure, but there are other outcomes that could also be used as quality indicators, including complications, hospital readmission, and various patient-centered measures of satisfaction or health status.

Several large-scale initiatives involving direct outcome assessment in surgery are currently under way. For example, proprietary health care rating firms (e.g., Healthgrades) and state agencies are assessing risk-adjusted mortalities by using Medicare or state-level administrative datasets. Most of the current outcome-measurement initiatives, however, involve the use of large clinical registries, of which the cardiac surgery registries in New York, Pennsylvania, and a growing number of other states are perhaps the most visible examples. At the national level, the Society of Thoracic Surgeons and the American College of Cardiology have implemented systems for tracking the morbidity and mortality associated with cardiac surgery and percutaneous coronary interventions, respectively.

Although the majority of the outcome-measurement efforts to date have been procedure-specific (and largely limited to cardiac procedures), NSQIP has assessed hospital-specific morbidities and mortalities aggregated across surgical specialties and procedures. NSQIP is now working in conjunction with the American College of Surgeons (ACS) in an effort to apply the same measurement approach outside the VA.18 Currently, the ACS-NSQIP is being used in more than 170 private hospitals in the United States, of many different types and from all geographic regions.

STRENGTHS

Direct outcome measures have at least two major strengths. First, they have obvious face validity and thus are likely to garner a high degree of support from hospitals and surgeons. Second, outcome measurement, in and of itself, may improve performance (the so-called Hawthorne effect). For example, surgical morbidity and mortality in VA hospitals have fallen dramatically since the implementation of NSQIP in 1991.15 Undoubtedly, many surgical leaders at individual hospitals made specific organizational or process improvements after they began receiving feedback on their hospitals’ performance. However, it is very unlikely that even a full inventory of these specific changes would explain such broad-based and substantial improvements in morbidity and mortality.

LIMITATIONS

One limitation of hospital- or surgeon-specific outcome measures is that they are severely constrained by small sample sizes. For the large majority of surgical procedures, very few hospitals (or surgeons) have sufficient adverse events (numerators) and cases (denominators) to be able to generate meaningful, procedure-specific measures of morbidity or mortality. For example, a 2004 study used data from the Nationwide Inpatient Sample to study seven procedures for which mortality was advocated as a quality indicator by the AHRQ.19 For six of the seven procedures, only a very small proportion of hospitals in the United States had large enough caseloads to rule out a mortality that was twice the national average. Although identifying poor-quality outliers is an important function of outcome measurement, to focus on this goal alone is to underestimate the problems associated with small sample sizes. Distinguishing among individual hospitals with intermediate levels of performance is even more difficult.

Other limitations of direct outcome assessment depend on the measurement platform being used. The two most prevalent measurement platforms are the use of existing data, usually generated for administrative purposes, and the creation of a clinical registry specifically for quality improvement. For outcome measures based on clinical data, the major problem is expense. For example, it costs more than $100,000 annually for a private-sector hospital to participate in NSQIP. Because of the expense of data collection, the ACS-NSQIP currently collects data on only a sample of patients undergoing surgery at each hospital. Although this sampling strategy reduces the cost of data collection, it exacerbates the problem of small sample size with individual procedures.

With measurement systems that use administrative data, a major concern is the adequacy of risk adjustment. For outcome measures to have face validity with providers, high-quality risk adjustment may be essential. It may also be useful for discouraging gaming of the system (e.g., hospitals or providers avoiding high-risk patients to optimize their performance measures). It is unclear, however, to what extent the scientific validity of outcome measures is threatened by imperfect risk adjustment with administrative data. Although administrative data lack clinical detail on many variables related to baseline risk,20-23 the degree to which case mix varies systematically across hospitals or surgeons has not been determined. Among patients who are undergoing the same surgical procedure, there is often surprisingly little variation. For example, among patients undergoing CABG in New York State, unadjusted hospital mortality and adjusted hospital mortality (as derived from clinical registries) were nearly identical in most years (with correlations exceeding 0.90) [see Figure 2]. Moreover, hospital rankings based on unadjusted mortality and those based on adjusted mortality were equally useful in predicting subsequent hospital performance.

Matching Performance Measures to Underlying Goals

Performance measures will never be perfect. Certainly, over time, better analytic methods will be developed, and better access to higher-quality data may be gained with the addition of clinical elements to administrative datasets or the broader adoption of electronic medical records. There are, however, some problems with performance measurement (e.g., sample-size limitations) that are inherent and thus not fully correctable. Consequently, clinical leaders, patient advocates, payers, and policy makers will all have to make decisions about when imperfect measures are nonetheless good enough to act on.

A measure should be implemented only with the expectation
that acting on it will yield a net improvement in health quality. In other words, the direct benefits of implementing a particular measure must not be outweighed by the indirect harm. Unfortunately, benefits and harm are often difficult to measure. Moreover, measurement is heavily influenced by the specific context and by who (patients, payers, or providers) is doing the accounting. For this reason, the question of where to set the bar, so to speak, has no simple answer.

It is important to ensure a good match between the performance measure and the primary goal of measurement. It is particularly important to be clear about whether the underlying goal is (1) quality improvement or (2) selective referral (i.e., directing patients to higher-quality hospitals or providers). Although some pay-for-performance initiatives may have both goals, one usually predominates. For example, the ultimate objective of CMS’s pay-for-performance initiative with prophylactic antibiotics is to improve quality at all hospitals, not to direct patients to centers with high compliance rates. Conversely, the Leapfrog Group’s efforts in surgery are primarily aimed at selective referral, though they may indirectly provide incentives for quality improvement.

For the purposes of quality improvement, a good performance measure (most often, a process-of-care variable) must be actionable. Measurable improvements in the given process should translate into clinically meaningful improvements in patient outcomes. Although quality-improvement activities are rarely actually harmful, they do have potential downsides, mainly related to their opportunity cost. Initiatives that hinge on bad performance measures siphon away resources (e.g., time and focus) from more productive activities.

For the purposes of selective referral, a good performance measure is one that steers patients toward better hospitals or physicians (or away from worse ones). For example, a measure based on previous performance should reliably identify providers who are likely to have superior performance now and in the future. At the same time, a good performance measure should not provide incentives for perverse behaviors (e.g., carrying out unnecessary procedures to meet a specific volume standard) or negatively affect other domains of quality (e.g., patient autonomy, access, and satisfaction).

Measures that work well for quality improvement may not be particularly useful for selective referral; the converse is also true. For example, appropriate use of perioperative antibiotics in surgical patients is a good quality-improvement measure: it is clinically meaningful, linked to lower SSI rates, and directly actionable. This process of care would not, however, be particularly useful for selective referral purposes. In the first place, patients are unlikely to base their decision about where to undergo surgery on patterns of perioperative antibiotic use. Moreover, surgeons with high rates of appropriate antibiotic use do not necessarily do better with respect to more important outcomes (e.g., mortality). A physician’s performance on one quality indicator often correlates poorly with his or her performance on other indicators for the same or other clinical conditions.24

As a counterexample, the two main performance measures for pancreatic cancer surgery (hospital volume and operative mortality) are very informative in the context of selective referral: patients can markedly improve their chances of surviving surgery by selecting hospitals highly ranked on either measure [see Figure 1]. Neither of these measures, however, is particularly useful for quality-improvement purposes. Volume is not readily actionable, and mortality is too unstable at the level of individual hospitals (again, because of the small sample sizes) to serve as a means of identifying top performers, determining best practices, or evaluating the effects of improvement activities.

Many believe that a good performance measure must be capable of distinguishing levels of performance on an individual basis. From the perspective of providers in particular, a measure cannot be considered fair unless it reliably reflects the performance of individual hospitals or physicians. Unfortunately, as noted (see above), small caseloads (and, sometimes, variations in the case mix) make this degree of discrimination difficult or impossible to achieve with most procedures. Even so, information that at least improves the chances of a good outcome on average is still of real value to patients. Many performance measures can achieve this less demanding objective even if they do not reliably reflect individual performance.

For example, a 2002 study used clinical data from the Cooperative Cardiovascular Project to assess the usefulness of the Healthgrades hospital ratings for acute myocardial infarction (based primarily on risk-adjusted mortality from Medicare data).25

Figure 2 Shown are mortality figures from coronary artery bypass surgery in New York State hospitals, based on data from the state’s clinical outcomes registry. (a) Depicted is the correlation between adjusted and unadjusted mortality rates for all state hospitals in 2001 (correlation = 0.95); the axes are risk-adjusted mortality (%) and observed mortality (%). (b) Illustrated is the relative ability of adjusted mortality and unadjusted mortality to predict performance in the subsequent year; the vertical axis is mortality (%), 2002, and hospitals are grouped as best, middle, and worst by unadjusted mortality ratings and by risk-adjusted mortality ratings, New York State hospitals, 2001.
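Panel (a) of Figure 2 reports a correlation between unadjusted and risk-adjusted hospital mortality. As a rough sketch of how such a figure is computed, the Python snippet below calculates a Pearson correlation coefficient from paired hospital rates. The rates are made-up values for illustration only, not the New York State registry data, and `pearson_r` is a hypothetical helper written out by hand so that the arithmetic is visible.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hospital mortality rates (%), one pair per hospital.
unadjusted = [1.2, 1.8, 2.1, 2.4, 2.9, 3.5]
risk_adjusted = [1.4, 1.7, 2.3, 2.2, 3.0, 3.3]

r = pearson_r(unadjusted, risk_adjusted)
print(f"correlation between unadjusted and risk-adjusted rates: {r:.2f}")
```

A coefficient close to 1 means that ranking hospitals on unadjusted rates would order them almost the same way as ranking on risk-adjusted rates, which is the point the figure is making.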
Compared with the one-star (worst) hospitals, the five-star (best) ber of data elements could be reduced by creating more parsimo-
hospitals had a significantly lower mortality (16% versus 22%) nious risk-adjustment models. Second, the sampling strategy could
after risk adjustment with clinical data; they also discharged signif- be changed to sample 100% of the most important operations; this
icantly more patients on appropriate aspirin, beta-blocker, and change would allow assessment of procedure-specific outcomes.
angiotensin-converting enzyme inhibitor regimens. However, the Ultimately, participating hospitals would need procedure-specific
Healthgrades ratings proved not to be useful for discriminating outcome data to target specific operations for improvement. Third,
between any two individual hospitals. In only 3% of the head-to- clinical processes of care could be added to the data collection
head comparisons did five-star hospitals have a statistically lower process; this would allow hospitals to respond to national pay-
mortality than one-star hospitals. for-performance mandates, as well as to provide more actionable
Thus, some performance measures that clearly identify groups quality measures. This last change would require the ACS-NSQIP
of hospitals or providers exhibiting superior performance may be to manifest a level of flexibility that it has not exhibited to date.
limited in their ability to differentiate individual hospitals from one With the flexibility to change data measurement periodically, the
another. There may be no simple way of resolving the basic tension ACS-NSQIP would be able not only to add other measures that
implied by performance measures that are unfair to providers yet are used in national mandates (e.g., SCIP) but also to evaluate
informative for patients. This tension does, however, underscore their importance.
the importance of being clear about (1) what the primary purpose Another barrier to improving surgical performance is the lack of
of performance measurement is (quality improvement or selective good global measures of performance. With the proliferation of
referral) and (2) whose interests are receiving top priority (the pay-for-performance pilot programs, various stakeholders have
provider or the patient). been confronted with the problem of how to make sense of multi-
ple competing measures of quality. Most have responded by com-
bining multiple domains to create a composite measure of perfor-
Future of Performance Measurement mance. The Premier/Center for Medicare and Medicaid Services
Although great progress has been made, the science of surgical Hospital Quality Incentive Demonstration uses a composite of
quality improvement is still in its infancy.There are several barriers process and outcome as a quality measure for coronary artery
to improving the quality of surgical care. Perhaps the biggest bar- bypass surgery. The Society of Thoracic Surgeons’ Task Force on
rier is the lack of an accurate and affordable measurement infra- Quality Measurement advocates a composite score based on a set
structure. One practical solution that may reduce the expense of of outcome and process measures endorsed by the NQF. In these
detailed data collection with clinical registries is to create hybrid composite approaches, the different measures are essentially
systems that join data elements from administrative and clinical weighted equally, with no empiric determination of which ones are
datasets. Although administrative data are criticized for their lack the most important.There are, however, emerging techniques that
of accuracy in identifying coexisting diseases, they can reliably use empirically derived weighting to create a composite score that
identify the type of procedure performed, certain demographic optimally predicts future mortality for high-risk surgery. As such
variables (e.g., age, gender, and race), and some outcome variables methods become more fully developed, composite measures will
(e.g., vital status, discharge to a skilled nursing facility, and length no doubt continue to gain popularity.
of stay). This set of variables could then be linked to a limited set Given that most existing quality improvement efforts focus on
of clinical risk factors that would allow robust risk adjustment.This optimizing measurement of technical quality, it is important not to
solution will be even more attractive as administrative data come lose sight of the fact that many quality concerns arise upstream
to contain more accurate information (e.g., present-on-admission from the operation itself—that is, with the decision to operate in
codes to distinguish complications from coexisting problems).26 the first place.Wide variations in the use of surgery have long been
In addition to improving the efficiency of data collection, it recognized. Some of these variations are attributable to differences
would be worthwhile to rethink how existing registries are in disease prevalence and physician practice style. Some, however,
designed so as to make them less expensive and more useful. For arise from either overuse, underuse, or misuse of surgical manage-
example, although the ACS-NSQIP is in a key position to become ment. For a full accounting of surgical quality, it will be necessary
the leading measurement platform for surgical quality improve- to develop reliable means of measuring the appropriateness of sur-
ment, there are several changes that could be made to ensure its gical treatment and the extent to which patient preferences are
success. First, the burden of data collection could be reduced; this incorporated into clinical decisions, in addition to measures assess-
would substantially decrease the costs of participating. The num- ing how well patients do after surgery.
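The difference between the equally weighted composites and the weighted composites discussed in this section can be illustrated with a minimal sketch. It assumes every measure has already been rescaled to a common 0-to-1 scale on which higher is better. The measure names, the values, and the weights are hypothetical, chosen only for illustration; in particular, the weights are not empirically derived, and the text does not specify how such weights would be estimated.

```python
from typing import Dict, Optional


def composite_score(
    measures: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return a weighted average of quality measures.

    Assumes each measure is on the same 0-1 scale (higher is better).
    If no weights are supplied, every measure counts equally.
    """
    if not measures:
        raise ValueError("at least one measure is required")
    if weights is None:
        # Equal weighting: no measure is treated as more important.
        weights = {name: 1.0 for name in measures}
    total_weight = sum(weights[name] for name in measures)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(measures[name] * weights[name] for name in measures)
    return weighted_sum / total_weight


# Hypothetical hospital: one process measure and one outcome measure.
hospital = {"process_adherence": 0.90, "risk_adjusted_survival": 0.80}

# Equal weighting of the two measures.
equal = composite_score(hospital)

# Illustrative (not empirically derived) weights favoring the outcome measure.
weighted = composite_score(
    hospital, {"process_adherence": 1.0, "risk_adjusted_survival": 3.0}
)
```

With equal weights the two measures contribute the same amount to the score; with unequal weights the composite moves toward the more heavily weighted measure, so the choice of weights can change how hospitals compare.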
References
1. Lee TH, Meyer GS, Brennan TA: A middle ground on public accountability. N Engl J Med 350:2409, 2004
2. Galvin R, Milstein A: Large employers’ new strategies in health care. N Engl J Med 347:939, 2002
3. Birkmeyer JD, Birkmeyer NJ, Dimick JB: Measuring the quality of surgical care: structure, process, or outcomes? J Am Coll Surg 198:626, 2004
4. Landon BE, Normand SL, Blumenthal D, et al: Physician clinical performance assessment: prospects and barriers. JAMA 290:1183, 2003
5. Bird SM, Cox D, Farewell VT, et al: Performance indicators: good, bad, and ugly. J R Statist Soc 168:1, 2005
6. Halm EA, Lee C, Chassin MR: Is volume related to outcome in health care? A systematic review and methodologic critique of the literature. Ann Intern Med 137:511, 2002
7. Dudley RA, Johansen KL, Brand R, et al: Selective referral to high volume hospitals: estimating potentially avoidable deaths. JAMA 283:1159, 2000
8. Birkmeyer JD, Dimick JB, Staiger DO: Operative mortality and hospital volume as predictors of subsequent hospital performance. Ann Surg 243:411, 2006
9. Bach PB, Cramer LD, Schrag D, et al: The influence of hospital volume on survival after resection for lung cancer. N Engl J Med 345:181, 2001
10. Begg CB, Reidel ER, Bach PB, et al: Variations in morbidity after radical prostatectomy. N Engl J Med 346:1138, 2002
11. Finlayson EVA, Birkmeyer JD: Effects of hospital volume on life expectancy after selected cancer operations in older adults: a decision analysis. J Am Coll Surg 196:410, 2002
12. Pronovost PJ, Angus DC, Dorman T, et al: Physician staffing patterns and clinical outcomes in critically ill patients: a systematic review. JAMA 288:2151, 2002
13. Pronovost PJ, Needham DM, Waters H, et al: Intensive care unit physician staffing: financial modeling of the Leapfrog standard. Crit Care Med 32:1247, 2004
14. Shahian DM, Normand SL: The volume-outcome relationship: from Luft to Leapfrog. Ann Thorac Surg 75:1048, 2003
15. Khuri SF, Daley J, Henderson WG: The comparative assessment and improvement of quality of surgical care in the Department of Veterans Affairs. Arch Surg 137:20, 2002
16. Fonarow GC, Abraham WT, Albert NM, et al: Association between performance measures and clinical outcomes for patients hospitalized with heart failure. JAMA 297:61, 2007
17. Bradley EH, Herrin J, Elbel B, et al: Hospital quality for acute myocardial infarction: correlation among process measures and relationship with short-term mortality. JAMA 296:72, 2006
18. Fink A, Campbell DJ, Mentzer RJ, et al: The National Surgical Quality Improvement Program in non–Veterans Administration hospitals: initial demonstration of feasibility. Ann Surg 236:344, 2002
19. Dimick JB, Welch HG, Birkmeyer JD: Surgical mortality as an indicator of hospital quality: the problem with small sample size. JAMA 292:847, 2004
20. Finlayson EV, Birkmeyer JD, Stukel TA, et al: Adjusting surgical mortality rates for patient comorbidities: more harm than good? Surg 132:787, 2002
21. Fisher ES, Whaley FS, Krushat WM, et al: The accuracy of Medicare’s hospital claims data: progress, but problems remain. Am J Public Health 82:243, 1992
22. Iezzoni LI, Foley SM, Daley J, et al: Comorbidities, complications, and coding bias. Does the number of diagnosis codes matter in predicting in-hospital mortality? JAMA 267:2197, 1992
23. Iezzoni LI: The risks of risk adjustment. JAMA 278:1600, 1997
24. Palmer RH, Wright EA, Orav EJ, et al: Consistency in performance among primary care practitioners. Med Care 34(9 suppl):SS52, 1996
25. Krumholz HM, Rathore SS, Chen J, et al: Evaluation of a consumer-oriented internet health care report card: the risk of quality ratings based on mortality data. JAMA 287:1277, 2002
26. Fry DE, Pine M, Jordan HS, et al: Combining administrative and clinical data to stratify surgical risk. Ann Surg 246:875, 2007