Reassurance bypass

May 28, 2006 | Created May 26, 2006 |

There is not one surgical centre in England and Wales dealing with coronary artery bypass grafts where the case mortality rate is worse than expected. That is the falsely reassuring claim implied by the Healthcare Commission’s website. The site, based on data from the Society for Cardiothoracic Surgery (SCTS), publishes centre-by-centre case mortality rates, as noted in the BMJ a few weeks ago. It systematically asserts every single centre to be performing at or above ‘expectation’, but this ‘expectation’ assumes an average mortality rate more than twice as high as the actual mortality rate. This is despite an explicit warning against 'false reassurance' from exactly this method of calculating expectation, recently published by one of the major groups involved in collecting the data (Bhatti et al 2006); they are not wrong.

Disclaimer: when I say below that almost every clinician has a horror story, I have carefully not asked any of those clinicians to whom I am related. Actually I haven't asked any at all. Though I work for a big multinational that likes to sell things to doctors, none of this is their doing, although I would miss the Full Text access if I didn't work there. Both of my children are above average. It gets rather flippant from now on, and mightn't if it was me that needed a bypass.

The release of this data is greatly to be welcomed and, were I to be facing surgery, I would, I think, find the website mostly and correctly reassuring. Although it claims to be intended to help patients discuss ‘the rates of survival for particular surgeons or hospitals with their GP, surgeon or cardiologist before making a decision’, the reassurance the site actually provides is not the possibility that one might pick the ‘best’ doctor in any meaningful way: the between-centre variability in the published data is small compared with other uncertainties. No, it’s reassuring because it should let you as a patient convince yourself you are not going to be the victim of some ongoing medical scandal whereby an incompetent doctor, one who really stands out from the crowd, is allowed to continue practising because of a lack of oversight – or worse, a lack of confrontation – from their peers. Though of course it’s no coincidence that it was cardiac surgeons who were involved in the Bristol Royal Infirmary case, I imagine such cases are rather rare, even if nearly every clinician can furnish an anecdotal horror story. It’s also reassuring in a more mundane but much more relevant way: it is evidence that a group of surgeons care enough about improving their own processes and learning from the best to collect this data for their own group audit purposes, despite the practical difficulties of collection and the risks of publication.

But there are risks of publication, and the biggest is I Want Surgery In Lake Wobegon: even if the differences between adjacent centres in a ranking are utterly inconsequential, half of all centres will be below the midpoint, and one can see why hardworking professionals are nervous about any data set that demonstrates that. How does the website avoid this statistical inevitability and move to Garrison Keillor’s Lake Wobegon, 'where all the children are above average'?
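For anyone who doubts the inevitability, here is a minimal sketch of it. Everything in it is an assumption of mine rather than anything from the website: 25 imaginary centres, 700 cases each, and an identical true mortality of 3% everywhere, so that any difference between them is pure chance.

```python
import random
import statistics

def simulate_centres(n_centres=25, cases=700, true_mortality=0.03, seed=1):
    """Simulate observed mortality rates for centres that all share one true risk.

    All defaults are illustrative assumptions, not figures from the website.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_centres):
        # Each case is an independent draw with the same underlying risk.
        deaths = sum(rng.random() < true_mortality for _ in range(cases))
        rates.append(deaths / cases)
    return rates

rates = simulate_centres()
midpoint = statistics.median(rates)
worse = sum(r > midpoint for r in rates)   # higher mortality than the midpoint
better = sum(r < midpoint for r in rates)  # lower mortality than the midpoint
print(f"median mortality {midpoint:.2%}: {worse} centres worse, {better} better")
```

The centres are identical by construction, yet a ranking still puts roughly half of them on the wrong side of the midpoint.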

Let’s look at the raw data. Not so easy, because they don’t supply it in a single table. To even get to the pages of case mortality rates for 25 centres you have to click 50 times, with another brace of cutting and pasting movements on each one making plenty of scope for error (errors probably present in the data I present below and which for this reason I’m unprepared to apologise for). If that was a device to prevent Daily Mail scare stories (“Health Expert Garrison Keillor Proves UK Surgeons Killing Us All”), it was possibly a bit short-sighted as the one thing journalists are good at is cutting and pasting from the internet.

The website presents the data as survival rates. While that’s a sensible and reassuring approach – I would I think rather hear of a 97% or 96% survival rate than a 3% or 4% mortality one – this choice to present ‘not much difference’ (97/96) instead of ‘a bit of difference’ (4/3) is also consistent with a general approach of minimising the impact of the important comparisons. That there is difficulty with this approach is signalled by the graphical presentation, where the data appears as a bar in a box. On the front page the bar goes from 0 to 100%, so that the visual signal sent is a large grey box for survival and a tiny one for mortality: reassuring again and indeed a legitimate reflection of the extraordinary ability of these people to rescue lives. But when we come to see the data for individual centres, the left hand side of the box magically shifts to 50%. Why not 0%? Well, because all the action is up on the right hand side in the 95-97% range, and the further away the left hand side is, the less detail you can see. (But then why stop at 50%? Why not 80%? 90%?) Let’s see what happens when you try to present all the data so laboriously clicked on, on this scale where it is all scrunched up on the right hand side:

JSCABG1
Figure 1. Mortality rates for coronary artery bypass grafts as reported by the Healthcare Commission, presented in the style of the website (but all on one graph) so that all the data is squashed up at one end. Performance is at or exceeds ‘expectation’ for every centre. Solid boxes: actual case survival rates; dotted lines: range of uncertainty in 'expected' performance, with midpoints. Not all centres provided three-year averages (red). Transcribed by hand from the website on 7th May 2006 and may contain errors. Vertical axis is centre, ordered by size of reported 2006 caseload, but not labelled as it's really not my point to have a go at anyone in particular.

But that’s just presentation. Now we come to the point where the website, consciously or not, flirts with false reassurance. Every single centre is shown as doing ‘as expected’ or ‘better than expected’: the solid boxes are always to the right of the open ones. How can this be? Is the failure to remark on this cause or consequence of the failure to show all the data together?

The answer of course lies in the definition of what is ‘expected’, and to its credit the website gives a link to the technical details underlying this definition. Different centres see a different mix of cases and it’s not possible to even contemplate this exercise unless you take some account of this. Here is, in full, the description the site provides of how this is done:

The expected rates of survival are calculated using logistic EuroSCORE. The model allows calculation of the likely rates of survival for heart operations, taking into account the age and health of the patients and the type of operation. It is a popular model because it is relatively simple. However, recent evidence from the UK and parts of the USA and Australia, show that, since the year 2002, hospitals, have tended to do better than predicted. This reflects improvements in technology and surgical and anaesthetic techniques.
Using the EuroSCORE provides a good starting point. However, all measures should be reviewed and revised as new techniques in surgery or better anaesthetics, for example, become available. The Society for Cardiothoracic Surgery (SCTS) and the Healthcare Commission will regularly review the techniques used for risk adjustment and modify them to take account of changes where appropriate.

So it’s clear that the creators of this site know that this ‘expected’ score is unduly pessimistic (good news, no doubt, if reflective of improvements in survival in the last few years). Although he is not cited on the website, a few PubMed clicks later I discover that one of the key players in the whole quality improvement process (& one I thoroughly applaud) is my local surgeon. And what’s his opinion? Along with his coauthors, he wrote in March this year that:

Our observations that the logistic EuroSCORE over-estimates observed mortality for isolated coronary artery surgery by a factor of 2 means that it would be easy to gain false reassurance by comparing observed mortality to that predicted by the algorithm. (Bhatti et al, Heart, Mar 17th 2006)
This ‘false reassurance’ is exactly what the Healthcare Commission website provides.
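The arithmetic of that false reassurance is easy to sketch. The numbers here are hypothetical (a 700-case centre, a 6% EuroSCORE prediction), and the verdict function is my own simplification: the website compares against a range of uncertainty, not a single number. Only the factor of 2 comes from the quotation above.

```python
def verdict(observed_deaths, expected_deaths):
    """Label a centre the way a plain observed-versus-expected comparison would.

    A simplification for illustration: it ignores the uncertainty range.
    """
    if observed_deaths <= expected_deaths:
        return "as expected or better"
    return "worse than expected"

# Hypothetical centre: 700 operations, EuroSCORE predicting 6% mortality.
cases = 700
euroscore_expected = 0.06 * cases          # about 42 predicted deaths

# With a factor-of-2 over-prediction, a typical centre sees about half that.
typical_observed = euroscore_expected / 2

# A centre losing 50% more patients than is typical still beats the prediction.
poor_observed = 1.5 * typical_observed

print(verdict(typical_observed, euroscore_expected))  # typical centre vs EuroSCORE
print(verdict(poor_observed, euroscore_expected))     # poor centre vs EuroSCORE
print(verdict(poor_observed, typical_observed))       # poor centre vs realistic expectation
```

Under these assumptions a centre would have to lose twice as many patients as its peers before an unscaled EuroSCORE comparison noticed anything.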

The issue is not that problems with the EuroSCORE model will produce unrealistic estimates that set unfair expectations for some centres rather than others; it is that the model systematically overestimates mortality, and so sets expectations too low, for every centre. In the absence of any other data, what happens if we modify the EuroSCORE prediction by the scaling necessary to make the right prediction for the group as a whole, in other words roughly halve predicted case mortality? This assumes that whatever is driving the difference, such as technology improvement, is the same for all the risk factors in the model. That assumption is untested (& untestable by me since they don’t publish the risk factor covariates), but making it isn’t going to produce any falser reassurance. On that assumption, and with the reasonable assumption that the model prediction errors will not be systematically changed by this scaling, we get the following estimate of excess mortality between ‘expectation’ and observation.

JSCAB2
Figure 2. Mortality reports by centre (vertical scale) expressed as a difference between unscaled actual cases and expected cases from EuroSCORE scaled by 0.45. Mortality in this data set was actually between 0.42 (2005) and 0.45 (2003-2005) of expected by EuroSCORE. In the absence of a properly rescaled mortality predictor this can only be indicative, but it suggests that there are no very substantial outliers, and the variability between centres is comparable to the variability in EuroSCORE prediction. I've done my own bit of visual manipulation in here: that conclusion is much clearer for the red line, the 2003 data. For 2003-2005, where the expectation ranges are smaller (larger sample sizes, presumably), there are a few more outliers, but the green is less bright and the green outliers are obscured by the red lines anyway. Since I wanted this all to be about methodology and not picking arguments with good people, and since the green excess is never more than a couple of cases a year out of over 18000 operations, let's leave it that way.
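For the curious, the rescaling behind Figure 2 amounts to something like the sketch below. It is in Python rather than the R I actually used, and the three centres and their rates are invented for illustration; only the 0.45 scale factor comes from the data described above.

```python
SCALE = 0.45  # observed / EuroSCORE-expected mortality, 2003-2005 (from the caption)

def excess_deaths(cases, observed_rate, euroscore_rate, scale=SCALE):
    """Observed deaths minus EuroSCORE-expected deaths after rescaling.

    Positive means more deaths than the rescaled expectation.
    """
    expected = cases * euroscore_rate * scale
    observed = cases * observed_rate
    return observed - expected

# Hypothetical centres: (name, cases, observed mortality, EuroSCORE-predicted mortality)
centres = [
    ("A", 800, 0.020, 0.050),
    ("B", 600, 0.030, 0.055),
    ("C", 700, 0.025, 0.060),
]

results = {}
for name, cases, obs, pred in centres:
    raw = excess_deaths(cases, obs, pred, scale=1.0)  # the website's comparison
    scaled = excess_deaths(cases, obs, pred)          # the Figure 2 comparison
    results[name] = (raw, scaled)
    print(f"{name}: unscaled excess {raw:+.1f}, rescaled excess {scaled:+.1f}")
```

Against the unscaled prediction all three invented centres come out comfortably ahead; after rescaling they fall on both sides of zero, by a handful of cases each.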

Now we have left Lake Wobegon, and centres line up both below and above average performance. But note how few of them are more different in performance than the noise of our prediction process; and note that the few outliers are visible here in ways completely obscure in Figure 1.

Just to finish, I applaud what these people have done. They have had the courage to publish data that no other group of clinicians has, and some of them even break down the centre data into individual surgeons. I haven’t looked at that data (too much clicking) but the same principle applies. Via audit, and the process improvement that follows, it will do good for them and for patients; and I don’t pretend my comments are primarily designed to help either of those causes very much.

(Here are the data and grungy R code used to generate the figures)

Posted by Jonathan at May 26, 2006 09:35 AM