1. School of Health and Related Research (ScHARR) The University of Sheffield Regent Court 30 Regent Street Sheffield, S1 4DA Website: www.sheffield.ac.uk/scharr
5. Main Focus of Research Areas Information resources; Bayesian Statistics in Health Economics; Evidence review and synthesis; Health Economics and Outcomes Research; Cost-effectiveness modelling to support Healthcare Decision Making
13. Evaluating the cost-effectiveness of diagnostic tests Dr Matt Stevenson Reader in Health Technology Assessment; NICE Appraisal Committee Member; Contributor to the NICE Diagnostic Assessment Programme manual
15. Cost-effectiveness analyses In the last 10 years there has been a considerable increase in the importance of cost-effectiveness analyses. This is due to relatively fixed budgets combined with ageing populations and emerging expensive interventions. It has led to the formation of funding agencies in England and Wales, Scotland, Australia and Canada.
16. Does a diagnostic test represent value for money? Diagnostic tests with high prices may be cost-effective (i.e. a worthwhile use of a limited budget). Conversely, diagnostic tests with low prices may not be cost-effective. The ‘gold standard’ approach for determining whether the price of a diagnostic test is justified is through an economic evaluation, or cost-effectiveness analysis.
17. Cost-effectiveness analyses The goal of funding agencies is to provide the greatest amount of health for society within the budget, and thus opportunity cost is a key principle: what health would be lost if money were diverted from one intervention in order to fund another? The process is typically to estimate the cost-effectiveness of an intervention through modelling, and to compare this result with a value assumed to represent opportunity cost.
19. Methods for evaluating diagnostic tests There have been, for some time, clear methods guides for undertaking evaluations of pharmaceutical interventions. Recently NICE has set up a Diagnostic Assessment Programme, which has issued an interim statement of the methods it expects to be followed in evaluating diagnostics. http://www.nice.org.uk/media/164/3C/DAPInterimMethodsStatementProgramme.pdf
20. Simplified Overview of the modelling required The following slides discuss the steps that would be required to generate an estimate of the cost-effectiveness of a diagnostic test (or series of diagnostic tests). The overview is a simplification. More detailed discussion is provided in the previously listed HTA reports (all free to download) and the Diagnostic Methods statement.
21. Estimating Test Accuracy The sensitivity and specificity of a diagnostic test must be estimated. These values would be combined with the estimated prevalence of the condition being tested for, to form an expectation of the number of true positives, true negatives, false positives and false negatives generated by the diagnostic test.
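The step above can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the cohort size of 1,000 and the input values (20% prevalence, 90% sensitivity, 80% specificity) are assumptions chosen for the example, not figures from the presentation.

```python
def expected_test_outcomes(prevalence, sensitivity, specificity, cohort=1000):
    """Expected numbers of TP, FN, TN and FP in a tested cohort."""
    diseased = cohort * prevalence        # patients who truly have the condition
    healthy = cohort - diseased           # patients who truly do not
    tp = diseased * sensitivity           # diseased and test positive
    fn = diseased - tp                    # diseased but test negative
    tn = healthy * specificity            # healthy and test negative
    fp = healthy - tn                     # healthy but test positive
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}


# Hypothetical inputs, for illustration only.
outcomes = expected_test_outcomes(prevalence=0.20, sensitivity=0.90, specificity=0.80)
print(outcomes)
```

The four counts always sum to the cohort size, which is a useful check before they are passed on to the patient-experience model.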
22. Modelling the patient experience For each of the four groups defined, an estimation of the events that would occur to the patient must be modelled. These may differ due to underlying risks and the chosen medical management. The modelling would include factors such as the risks of mortality, risk of morbidity, length of stay within hospital, costs for initial and subsequent care, treatment-related adverse-events and the quality of life for patients in each potential health state.
23. Modelling the patient experience Ultimately, an estimation of the life years, quality adjusted life years (QALYs*) and costs can be attributed to each of the four groups. These can be weighted by the proportions in each group to form a total cost and total QALY for patients post diagnosis. The costs of the diagnostic tests performed are then added. * The QALY is a combination of life years and patient utility. A person living for 10 years at a utility of 0.5 would gain 5 QALYs; a person living for 4 years at a utility of 0.75 would gain 3 QALYs
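A minimal sketch of the QALY arithmetic and the weighting step described above. The two `qalys` calls reproduce the examples in the footnote; the per-group proportions, costs, QALYs and the test cost are hypothetical numbers invented for illustration, and the function names are assumptions.

```python
def qalys(life_years, utility):
    """QALYs gained: life years multiplied by utility."""
    return life_years * utility


def expected_totals(groups, test_cost):
    """Weight per-group costs and QALYs by group proportion, then add the test cost.

    groups maps a label to (proportion, cost, qalys); proportions must sum to 1.
    """
    total_proportion = sum(p for p, _, _ in groups.values())
    if abs(total_proportion - 1.0) > 1e-9:
        raise ValueError("group proportions must sum to 1")
    total_cost = sum(p * c for p, c, _ in groups.values()) + test_cost
    total_qaly = sum(p * q for p, _, q in groups.values())
    return total_cost, total_qaly


# The two worked examples from the slide footnote.
print(qalys(10, 0.5))
print(qalys(4, 0.75))

# Hypothetical per-group values: (proportion, cost in GBP, QALYs).
groups = {
    "TP": (0.18, 12000.0, 9.0),
    "FN": (0.02, 30000.0, 6.0),
    "TN": (0.64, 1000.0, 12.0),
    "FP": (0.16, 5000.0, 11.5),
}
total_cost, total_qaly = expected_totals(groups, test_cost=250.0)
print(total_cost, total_qaly)
```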
24. Calculating an ICER* Assume that post diagnosis, an average patient was expected to gain 10 QALYs at a cost of £20,000 under current best practice. These values became 11 QALYs at a cost of £18,000 following a new diagnostic test, which costs £4,000 per patient. In this instance the increase in cost is £2,000 (£18,000 - £20,000 + £4,000) The increase in QALYs is 1. (11 – 10) * An Incremental Cost Effectiveness Ratio.
25. Calculating an ICER In this example, the ICER would be £2,000 per QALY gained (£2,000 / 1) This would be compared with an estimation of the cost of gaining a QALY in interventions that are likely to be replaced.* Thus if this were the result from a real technology appraisal the diagnostic test would be likely to be recommended for use. * NICE has estimated this to be in the region of £20,000-£30,000
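The worked example on slides 24 and 25 can be reproduced as a short Python sketch. The function and variable names are assumptions made for illustration; the figures are those given on the slides, and the threshold uses the lower end of the £20,000-£30,000 range quoted there.

```python
def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost-effectiveness ratio: change in cost per QALY gained."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("no difference in QALYs, so the ICER is undefined")
    return delta_cost / delta_qaly


# Current best practice: 10 QALYs at a cost of 20,000 GBP.
cost_current, qaly_current = 20_000, 10
# New diagnostic: 11 QALYs, 18,000 GBP downstream cost plus 4,000 GBP for the test.
cost_new, qaly_new = 18_000 + 4_000, 11

result = icer(cost_new, qaly_new, cost_current, qaly_current)
threshold = 20_000  # lower end of the range quoted on the slide
print(result, result <= threshold)
```

The simple comparison with the threshold is only meaningful when the new strategy gains QALYs; a ratio computed from a QALY loss needs separate interpretation.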
26. Implications for diagnostic pricing Where a new diagnostic test has a large impact on mortality or on the utility of a patient, then the QALY gained over the current diagnostic will be greater. ICER = Δ Cost / Δ QALY Thus, for a constant ICER, such a test would be able to command a higher price than a test with a smaller QALY gain.
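Rearranging ICER = Δ Cost / Δ QALY gives a rough ceiling on the price a test could command at a given threshold. The sketch below assumes that the test price is simply added to the change in downstream costs and that a single fixed threshold applies; the function name is hypothetical, and the inputs reuse the figures from the ICER example (a £2,000 downstream saving and a threshold of £20,000 per QALY).

```python
def max_price_at_threshold(delta_qaly, delta_downstream_cost, threshold):
    """Highest per-patient test price at which the ICER stays at or below the threshold.

    From (delta_downstream_cost + price) / delta_qaly <= threshold, with delta_qaly > 0.
    """
    if delta_qaly <= 0:
        raise ValueError("this rearrangement assumes a positive QALY gain")
    return threshold * delta_qaly - delta_downstream_cost


# Downstream costs fall by 2,000 GBP (18,000 - 20,000) in the slide example.
price_large_gain = max_price_at_threshold(1.0, -2_000, 20_000)
# Same downstream saving, but a hypothetical test that gains only half a QALY.
price_small_gain = max_price_at_threshold(0.5, -2_000, 20_000)
print(price_large_gain, price_small_gain)
```

Under these assumptions the test with the larger QALY gain supports the higher price, which is the point made on the slide.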
27. Sequences and subgroups Note that it is possible to evaluate sequences of tests, and strategies in which tests are used only on a subgroup of the population. The following slide shows the predicted optimal strategy for diagnosing whether a patient has deep vein thrombosis. The costs of diagnostic tests, the risks of death, morbidity, recurrence, treatment-related adverse-events and the costs of treating future events were all considered in the model.
32. Additional complications with evaluating diagnostic tests There are reasons why evaluating diagnostic tests is more difficult than evaluating pharmaceuticals. Due to time restrictions these will be mentioned very briefly under broad headings.
78. Costs
State in the model | Perspective: NHS & PSS | Perspective: Societal | Perspective: Employer
At work | £0 | £0 | £0
1 wk to 6 months sick leave | Cost of usual care and intervention incurred by NHS | NHS & PSS costs + Employer costs - Transfer costs | Cost of intervention incurred by employer + Cost of replacing employee + Production loss over friction period + Salary of replacement employee after friction period + Occupational sick pay + Employer’s NI contribution
6-12 months sick leave | Cost of usual care | Cost of usual care | Occupational sick pay + Employer’s NI contribution
12 months+ sick leave | Cost of usual care | Cost of usual care | (not stated)
88. Public health modelling: lessons learned from a contraception case study Hazel Squires, Jim Chilcott, Nick Payne, Lindsay Blank, Monica Hernandez, Louise Guillaume ScHARR, University of Sheffield
111. The decision problem To evaluate the cost-effectiveness of high dose statins (atorvastatin 80mg/d, rosuvastatin 40mg/d & simvastatin 80mg/d) versus simvastatin 40mg/d in individuals with acute coronary syndrome.
201. Systematic reviews of relevant data Myfanwy Lloyd Jones Senior Research Fellow ScHARR, University of Sheffield Email: m.lloydjones@sheffield.ac.uk
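240. AUROCs from key studies: test vs liver biopsy
Test | Degree of fibrosis | Study | AUROC (95% CI)
ELF | ‘Moderate/severe’ | Rosenberg | 0.94 (0.84-1.00)
FibroTest | F2-F4 | Naveau | 0.83 (0.81-0.87)
FibroTest | F2-F4 | Nguyen-Khac | 0.79 (0.69-0.90)
FibroTest | Cirrhosis (F4) | Naveau | 0.95 (0.94-0.96)
FibroTest | Cirrhosis (F4) | Nguyen-Khac | 0.84 (0.72-0.97)
FibroScan | Severe fibrosis (F3-F4) | Kim | 0.98 (0.94-1.02)
FibroScan | Severe fibrosis (F3-F4) | Mueller | 0.91 ± 0.03
FibroScan | Severe fibrosis (F3-F4) | Nahon | 0.94 (0.90-0.97)
FibroScan | Severe fibrosis (F3-F4) | Nguyen-Khac | 0.90 (0.82-0.97)
FibroScan | Cirrhosis | Kim | 0.97 (0.93-1.01)
FibroScan | Cirrhosis | Mueller | 0.92 (0.87-0.97)
FibroScan | Cirrhosis | Nahon | 0.87 (0.81-0.93)
FibroScan | Cirrhosis | Nguyen-Khac | 0.94 (0.87-0.98)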
241. Sensitivity and specificity: ELF Test (subgroup with ALD)
Degree of fibrosis | Study | Threshold score | Sensitivity | Specificity
‘Moderate/severe’ | Rosenberg | 0.087 | 100% | 16.7%
‘Moderate/severe’ | Rosenberg | 0.431 | 93.3% | 100%
256. Developing and testing condition-specific preference-based measures: Lessons learnt and policy implications John Brazier, Donna Rowen, Ifigeneia Mavranezouli, Aki Tsuchiya , Tracey Young, Yaling Yang, Michael Barkham
Editor's Notes
PDG & literature to derive
Switching from atorvastatin (10/20mg) to generic simvastatin (20/40mg) saves approx £1,000/patient over 5 years (Moon 2006) SEARCH
Scenario A: Adherence in clinical trials reported 80-90% - results represent cost-effectiveness for individuals who tolerate potent doses AND adhere to treatment Scenario B: Scenario C: assumed individuals who didn’t tolerate potent statins received simvastatin 40mg/d
For Scenarios A & C: all potent doses would be considered cost-effective. For Scenario B: rosuvastatin would be considered cost-effective; simvastatin 80mg/d would not.
Alcohol consumption (? within the last 2 months) affects the results of several of the tests used in ELF and FibroTest, and causes inflammation which influences FibroScan results. FibroMAX may be less affected as it combines FibroTest with additional tests for steatosis (SteatoTest), non-alcoholic steatohepatitis (NashTest), and severe alcoholic steatohepatitis (AshTest). In patients with risk factors for ALD, it simultaneously presents the FibroTest, SteatoTest and AshTest results.
Lack of clarity about numbers of patients with ALD; lack of independence of test manufacturer
Naveau: patients hospitalised for alcoholism or complications of cirrhosis; test accuracy (biopsy) (2005); survival at 5 and 10 years (2009) Nguyen-Khac 2008: patients requiring alcohol detoxification/rehabilitation; test accuracy (biopsy) – this is the only FT study which seems independent of the manufacturer Excluded Foucher 2006a: alcoholic patients (not further defined) because test accuracy (biopsy) in subset only, and criteria for biopsy not clear
Excluded Thabut 2003: HVPG comparison in patients with chronic liver disease, as data relating specifically to patients with ALD were neither published nor available from the study authors
Kim 2009: patients with ALD; test accuracy (biopsy). Mueller 2010: patients with ALD; test accuracy (biopsy). Nahon 2008: patients with suspected ALD; test accuracy (biopsy). Nguyen-Khac 2008: patients requiring alcohol detoxification/rehabilitation; test accuracy (biopsy). Janssens 2010: patients requiring alcohol detoxification/rehabilitation; test accuracy (biopsy, HVPG) assessed only in those with scores indicating severe fibrosis. Melin 2005: patients treated for alcohol withdrawal; test accuracy (biopsy) only in subset with FS score >13 kPa. Foucher 2006a: alcoholic patients (not further defined); test accuracy (biopsy) in subset only, criteria for biopsy not clear, therefore study excluded.
Foucher 2006b excluded because population overlapped with 2006a
Key studies – ie the only study of ELF, even though it is only a subgroup analysis of a mixed population, and the studies of FibroTest and FibroScan which focused on patients with known or suspected ALD, and biopsied all patients.
Low threshold: high sensitivity (in this case, all patients with moderate/severe fibrosis have a positive test result – no false negatives) but low specificity (i.e. a lot of patients without moderate/severe fibrosis have a positive result – false positives). High threshold: lower sensitivity (i.e. some patients with moderate/severe fibrosis have false negative test results) but high specificity (i.e. no patients without moderate/severe fibrosis have a positive result – no false positives).
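The trade-off described in this note can be illustrated with a small Python sketch. All scores and both thresholds below are invented purely for illustration and are not data from any of the studies; the function name and the rule that a score at or above the threshold counts as a positive test are assumptions.

```python
def sens_spec(scores_diseased, scores_healthy, threshold):
    """Sensitivity and specificity when a score >= threshold is a positive test."""
    true_pos = sum(s >= threshold for s in scores_diseased)
    true_neg = sum(s < threshold for s in scores_healthy)
    return true_pos / len(scores_diseased), true_neg / len(scores_healthy)


# Hypothetical test scores, for illustration only.
diseased = [0.35, 0.50, 0.62, 0.70, 0.45]        # patients with moderate/severe fibrosis
healthy = [0.05, 0.10, 0.20, 0.30, 0.38, 0.15]   # patients without

low = sens_spec(diseased, healthy, threshold=0.1)   # low threshold
high = sens_spec(diseased, healthy, threshold=0.4)  # high threshold
print(low, high)
```

With these made-up scores, the low threshold catches every diseased patient at the price of many false positives, while the high threshold removes the false positives but misses one diseased patient.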
Janssens found that the threshold scores recommended for Hepatitis C (9.6 kPa for F3-F4, and 12.5 kPa for F4) had a poor PPV in patients with ALD (65% for F3-F4) (presumably because of the effect of alcohol-related liver inflammation), so looked at the effect of different thresholds.
Alcoholic steatohepatitis: Mueller et al found that diagnostic accuracy improved when patients with laboratory signs of ASH (GOT >100 U/L) were excluded; exclusion of patients with mildly elevated GOT (>50 U/L) improved accuracy re F3-F4 fibrosis, but not cirrhosis alone – the specificity is affected more than the sensitivity (i.e. fewer false positives)
Either the whole study population is unrepresentative because of the disease severity (eg recruited from patients already scheduled for biopsy) Or the whole population is more representative, but biopsy is only performed in the subset with a NILT result suggesting severe disease
Naveau et al follow-up study of FibroTest found that only 21% were abstinent during follow-up period; 50% not abstinent, 29% unknown. No indication what the impact of the test result was. Some evidence that the ELF test and FibroTest may have some prognostic value, but without information about post-test drinking habits this is open to confounding
There are two major problems relating to the use of liver biopsy as the reference standard to assess the diagnostic accuracy of non-invasive liver tests. The first is the fact that it is an imperfect reference standard, and the second relates to ethical issues.
Because of the level of AEs, it would not be ethical to biopsy the full range of people with suspected ALD, including those who are unlikely to have fibrosis. So, the evidence clusters towards one end of the range, and we have much more evidence of the accuracy of the non-invasive tests in correctly identifying people who have severe fibrosis than in correctly identifying people who don’t have severe fibrosis
Care and management of dementia increasing concern
This is the development process that the team in Sheffield have developed and applied to a range of condition specific measures to create a health state classification system that is subsequently valued using a preference elicitation technique such as TTO
Why EFA
Sample size for DEMQOL-U was larger as the classification system describes more states and had a larger selected study design. using the AFD Names and Numbers version 3.1.25 database (AFD Software Limited, Ramsey, UK). The sample was balanced to the UK population according to geo-demographic profiles.
in which respondents valued states from one of the classification systems determined using a card block system. Interviewers worked systematically through blocks; odd and even blocks contained DEMQOL-U and DEMQOL-Proxy-U states respectively. This approach was used to try and ensure that there were no systematic differences across the geo-demographic profiles of the samples for each classification system. and this was done to help familiarise them with the classification system as there are concerns that naming the condition can affect elicited utility values (8). Face to face interviews This rank task further familiarised respondents with the classification system and the health states to be later valued using time trade-off (TTO).
18 to 78 observations with 306 and worst states. The range of mean values was 0.954 to 0.184 One or more states with mean value lower than worst state. There were a large proportion of TTO values at 1 for both measures (26.9% for DEMQOL-U the distribution of the data was negatively skewed)
287 valued worst states. The range of mean values was 0.961 to 0.331 (smaller than DEMQOL) one state with mean value lower than the worst state two states with a mean value higher than the best state. These apparent contradictions are most likely observed due to the much smaller number of observations for some states in comparison to worst state and best state 28.8% of values at 1 and distribution negatively skewed
Predicted range similar to observed range
Condition-specific preference-based measures have been derived from existing measures, including the Asthma Quality of Life Questionnaire, the Overactive Bladder Questionnaire, the EORTC QLQ-C30, the Sexual Quality of Life Questionnaire etc. We are also nearing completion of measures for common mental health problems, epilepsy, dementia (using both self-report and proxy-report measures), and diabetes. This involves 3 stages. Firstly, dimensions and items are selected using a combination of psychometric, factor and Rasch analysis. Secondly, a sample of health states is valued by members of the general population using, say, the time trade-off valuation technique (primary data collection). Thirdly, values are modelled using regression analysis to produce utility values for every health state defined by the measure. We’ve recently completed an MRC/NIHR funded study examining the methodology of the process and the HTA report is forthcoming. We have developed a measure suitable for children aged 7-11 years, the CHU-9D. (Current utility weights are from adults but work is ongoing to obtain preference weightings from children.) This measure is being used in many studies in the UK and Australia. The AMD project involved primary data collection of HUI, SF-6D and EQ-5D utility values from approx 1000 patients with AMD in order to estimate the relationship between visual acuity and the generic measures to populate a cost-effectiveness model. We have recently undertaken many reviews examining whether EQ-5D is appropriate (assessing validity, responsiveness) for a range of conditions including...
ADHD project involves (ongoing) primary data collection of patients with ADHD and is an observation study examining the wider effects of ADHD on the patient and the family. AMD project involved primary data collection of HUI, SF-6D, EQ-5D utility values from approx 1000 patients with AMD in order to estimate the relationship between visual acuity and the generic measures to populate a cost-effectiveness model. Can move beyond the NHS perspective to societal perspective and take into account wider effects such as carer and productivity effects. Are currently involved in research for DH on value based pricing, as this is going to be used in the UK from 2014 onwards... Literature reviews can also be used to obtain values from the literature for use in the cost-effectiveness model and we have also undertaken research looking at synthesising utility values from multiple sources.
Jigsaw method to find out about key types of evidence. Into 4 groups. Each of you is going to become an expert (in 10 mins) on a particular level of evidence. Going to give you some information relating to one type of evidence. Read through it and digest the key points. Try to establish a clear definition and the pros and cons of using this type of evidence. You are then going to get together with fellow experts. Your task is to teach your fellow group members about the evidence you have been researching. 3 mins to get them up to speed. Your fellow group members will help you to prepare the key facts for your 3 min teaching session. You will each get a turn in your groups to be the expert, teaching the other group members about your evidence. Aim at the end – all group members informed about the key sources of evidence in a very short space of time. So, 10 mins to digest your information. Then 15 mins with your fellow experts to plan your feedback session. Then return to your groups and take turns to teach each other. Group members need to listen attentively and ask questions if they are unclear. At the end, we are going to ask each group to feed back on what they have learned.
Jigsaw method to find out about key types of evidence. Into 4 groups. Each of you is going to become an expert (in 10 mins) on a particular level of evidence. Going to give you some information relating to one type of evidence. Read through it and digest the key points. Try to establish a clear definition and the pros and cons of using this type of evidence. You are then going to get together with fellow experts. Your task is to teach your fellow group members about the evidence you have been researching. 3 mins to get them up to speed. Your fellow group members will help you to prepare the key facts for your 3 min teaching session. You will each get a turn in your groups to be the expert, teaching the other group members about your evidence. Aim at the end – all group members informed about the key sources of evidence in a very short space of time. 10 mins to digest your information. Then 10 mins with your fellow experts to plan your feedback session. Then return to your groups and take turns to teach each other. 3 minute teaching session for each person. Group members need to listen attentively and ask questions if they are unclear. At the end, we are going to ask each group to feed back on what they have learned.
Supporting the Health Researcher of the Future In addition to supporting the key roles of basic education and continuing professional development, health libraries are increasingly occupying an essential position in providing support to those involved in health research. Whereas previously such a role involved stocking a few key journals in a discipline and providing access to a much wider selection of peer reviewed articles through well-utilised interlibrary loan networks, the emphasis has now shifted to a service “beyond the library walls”. Indeed, the challenge faced by many libraries is that of warding off increasing invisibility as researchers become accustomed to accessing resources from their own desktops. Faced with such a challenge, what can a health library that aims to meet the needs of its research community seek to do? One possibility is to reengineer the library's presence through a range of tailored services and virtual resources. This presentation will describe how a health library can utilise free or low cost technologies to deliver a suite of services that are based around the needs of particular programmes, projects or even individual researchers. It will describe the activities of the School of Health and Related Research at the University of Sheffield in moving forward its research support services through the use of wikis, RSS feeds, blogs and portals. The team will share lessons learnt and pointers for any other libraries seeking to extend their outreach to health researchers beyond the four walls of the library.
Supporting the Health Researcher of the Future (30 mins) The Context of Research Support (5 mins) The Potential of Web 2.0 (5 mins) Some Examples of Good Practice (4 mins) What we are doing in ScHARR/Y&H (9 mins) The Way Forward? (2 mins) Questions (5 mins)
Are you all familiar with CCs? Basically bibliographies identifying the key books in Health Subject areas to help librarians in Collection Development - either when setting up a collection from scratch or to help prioritise budgets, which in the new era is particularly pertinent. The process of updating these had however become very time-consuming: - Difficult for people to meet - Difficult to get people to contribute. - It was felt that an online version was now needed. So in this presentation I'll take you through some of the issues we needed to address, Helen will explain her scoping of various Web 2.0 tools and how we piloted the use of Library Thing with the Mental Health CC, and then we'll do a live demonstration to try and entice you all to go away and contribute to the new Medical CC! Have to begin with a disclaimer - we are not experts in Web 2.0 technologies. I for one work in an NHS Trust where frustratingly anything innovative is often firewalled. So we will be deferring all technical queries at the end to Frank
Core Collections have a long and fruitful history - over 17 years. MIWP developed them and, when the group was disbanded, HLG took on the role of updating them in partnership with Tomlinsons. But the process had become very time-consuming. When the issue of revising the CCs was raised at one HLG committee meeting with a call out for volunteers, there was a definite tumbleweed moment. We therefore set up a working group to look at alternative ways of revising these
HELEN Idea of using Web 2.0 application Not actually one but 3 challenges Team across the country Resource would be richer with more contributors Felt a need for the final lists to be available online These challenges need different solutions - I'm going to look at the last two and how Library Thing provided a full or part solution.
Consider 2nd challenge Increasing collaboration - we needed to make it easier and quicker for people to suggest titles for the collection. Considered 3 tools
HELEN We decided to pilot the use of Library Thing with the Mental Health CC, which was the smallest of the collections and also the one in most dire need of updating - the first edition had been published in 1999. It had been produced by one library service, so had had limited ... A Library was set up on LT with the data from the last edition for people to comment on. A briefing paper was sent out to librarians working in mental health via as many channels as possible and they were encouraged to get clinicians and university lecturers in their areas involved. People were encouraged to comment on the books - w
We created a Mental Health collection on LT and loaded all the books from the last edition onto LT, Tomlinsons having checked for new editions.
Each book was assigned a tag, or subject term, making it easier for people to select the specific subjects they were interested in focusing on. As I said, we sent out some publicity and guidance to mental health librarians via as many communication channels as possible (e-mail discussion lists, the PLCS scheme etc) and asked them to contribute by commenting on the books already on the system, either to endorse them or to suggest that they should be removed, and to suggest new books. They were also encouraged to try to get health professionals and lecturers in their area to contribute.
Excellent feedback received, LT facilitated discussion - people in some cases arguing as to why a particular book should still be included despite its date for example. People commented on whether books were on reading lists, how popular they were with their users, etc.
The Mental Health CC was published in December 2009. The Nursing CC, which Lori Harvard has co-ordinated, is literally hot off the press - collect your copy from Tomlinsons stand today if you haven't already. Lori was involved in the compilation of the 3rd and 4th editions - doing it through LibraryThing made it all so much easier.
An online version is available - either via the HLG website, where we have a read-only version of the Collection on LT, or via a new website which Tomlinsons have produced.
If you do contribute, your name will be included in the printed version of the 6th ed.
Sign in
Please don't delete anything!
Please add a brief comment to explain why you think the book is important and your name.