Can We Measure a “Good Doctor?”


Danielle Ofri
New England Journal of Medicine

The quarterly “report card” sits on my desk. Only 33% of my patients with diabetes have glycated hemoglobin levels that are at goal. Only 44% have cholesterol levels at goal. A measly 26% have blood pressure at goal. All my grades are well below my institution’s targets.

It’s hard not to feel like a failure when the numbers are so abysmal. We’ve been getting these reports for more than 2 years now, and my numbers never budge. It’s wholly dispiriting.

When I voice concern about the reports, I’m told that these are simply data, not criticisms, and that any feedback of data to doctors is helpful. On the face of it, this seems logical. How can additional information be anything but helpful?

It’s easy, of course, to find scientific reasons why the data are less clinically meaningful than they seem. Success and failure in these measures tend to be presented as a binary function, although clinical risk is almost always a continuous variable. My patients whose blood pressure is 140/85 (quite near the 130/80 goal) are counted as failures equivalent to patients with a blood pressure of 210/110, even though their risks for adverse cardiovascular outcomes are vastly different.

And although these quality measures focus on diabetes in pristine isolation, my patients inconveniently carry at least five other diagnoses and routinely have medication lists in the double digits. Practicing clinicians know from experience that micromanagement of one condition frequently leads to fallout in another.1,2

What happens when my patients read these data? I wouldn’t blame them if they concluded that I’m a lousy doctor and switched to another physician who manages to get glycated hemoglobin levels at goal for 38% of her patients with diabetes.

The quarterly report card stokes a perennial fear: maybe I really am a substandard doctor, and these statistics simply shed light on what I’ve refused to accept. If I’m doing my patients a disservice, then I’m morally obliged to vacate my office to make room for a more competent practitioner.

I appreciate the efforts and good intentions behind the report cards, and I’m certainly not saying that we shouldn’t have any data at all. But I think we need good evidence that the data measure true quality and that providing data is actually helpful. For individual doctors — as opposed to institutions or countries or populations — the evidence is not convincing.3,4 The possible mandatory use of these quality measures for reimbursement raises a host of other concerns.

If the goal of providing reports to individual physicians is to help them improve their care, it’s critical to understand the baseline assumption about doctors’ performance. Are most doctors doing a reasonable job? If so, then our analytics should aim to weed out the few who are inept. Or are most doctors mediocre, with shoddy clinical skills that put patients at risk? If so, then our data-driven system must prod doctors as a group to up their game.

There isn’t a simple formula for distinguishing good doctors from second-rate ones, nor will there ever be. At least some evidence suggests that when doctors deviate from quality measures, they nearly always have medically valid reasons for doing so.5 I think we should be willing to consider the larger gestalt of medicine, rather than just the minutiae that fit more expediently into a spreadsheet.

Who are the people who choose to enter medicine, and what are their motivations and character? I have yet to meet a medical student, intern, nurse, or doctor who doesn’t feel a powerful sense of professional responsibility. Not every single one is lining up for a Nobel Prize, but overall it is a smart and dedicated group. If Winnicott were selecting a “good enough” cohort for the medical profession, this would be it. I think society accepts that the overwhelming majority of health care workers are in the profession to help patients and are doing a decent job.

Quantitative analysts will chafe at this line of reasoning. They will say that doctors are afraid of being judged on the basis of hard data. They will see it as a sign of medical arrogance that physicians insist that everyone simply trust us to do the right thing because we are such smart and noble people.

I’ve always wanted to ask these analysts how they choose a physician for their sick child or ailing parent. Do they go online and look up doctors’ glycated hemoglobin stats? Do they consult a magazine’s Best Doctor listing? Or do they ask friends and family to recommend a doctor they trust? That trust relies on a host of variables — experience, judgment, thoughtfulness, ethics, intelligence, diligence, compassion, perspective — that are entirely lost in current quality measures. These difficult-to-measure traits generally turn out to be the critical components in patient care.

I certainly want to know how my hospital is doing with an overwhelming disease like diabetes. The data do offer a snapshot of the clinical complexities of the disease, the challenges posed by our patients’ cases, and the limits of how much we can alter a disease that is affected by so many variables. And they could highlight fixable systemic impediments to good care.

But pinning the data on individual doctors is different. It purports to make a statement about comparative quality whose objectivity is a fallacy. When it weeds out the rare incompetent, it’s fine. But by and large, it serves only to demoralize doctors.

It offers patients a seductively scientific metric of doctors’ performance — but can easily lead them astray. Relying on these data is like trying to choose which car to purchase, armed with a metallurgic analysis of one square inch of the left rear fender of each car. The numbers are accurate, but they don’t tell you which car will run the best.

We all want our patients to achieve the best health possible, but most doctors don’t actually have control over the challenges of a complicated disease like diabetes — which is probably why my numbers haven’t budged in 2 years.

Sure, I can imagine a few changes that would no doubt improve my patients’ medical care: an hour-long visit instead of 15 minutes, weekly individual nutrition counseling, personal exercise trainers, glucose test strips that are covered by insurance, and medications that don’t cause diarrhea, heart failure, weight gain, or hypoglycemia. But report cards with my stats? So far they haven’t made me a better doctor. They just make me feel like a nihilist, bitterly watching primary care medicine grind down so many of its practitioners.

Doctors who actually practice medicine — as opposed to those who develop many of these benchmarks — know that these statistics cannot possibly capture the totality of what it means to take good care of your patients. They merely measure what is easy to measure.

We teach students and residents that tests that don’t alter clinical management can be harmful and should not be ordered. Regrettably, that is essentially what I’ve concluded about report cards for individual doctors. I don’t even bother checking the results anymore. I just quietly push the reports under my pile of unread journals, phone messages, insurance forms, and prior authorizations. It’s too disheartening, and it chips away at whatever is left of my morale. Besides, there are already five charts in my box — real patients waiting to be seen — and I need my energy for them.


    1. Tinetti ME, Bogardus ST Jr, Agostini JV. Potential pitfalls of disease-specific guidelines for patients with multiple conditions. N Engl J Med 2004;351:2870-2874
    2. Boyd CM, Darer J, Boult C, Fried LP, Boult L, Wu AW. Clinical practice guidelines and quality of care for older patients with multiple comorbid diseases: implications for pay for performance. JAMA 2005;294:716-724
    3. Fung CH, Lim YW, Mattke S, Damberg C, Shekelle PG. Systematic review: the evidence that publishing patient care performance data improves quality of care. Ann Intern Med 2008;148:111-123
    4. Werner RM, Asch DA. The unintended consequences of publicly reporting quality information. JAMA 2005;293:1239-1244
    5. Persell SD, Dolan NC, Friesema EM, Thompson JA, Kaiser D, Baker DW. Frequency of inappropriate medical exceptions to quality measures. Ann Intern Med 2010;152:225-231