Monday, September 30, 2019

Screening for autism spectrum disorder: the jury is still out

In 2007, the American Academy of Pediatrics (AAP) first recommended using a standardized autism-specific tool to screen all children for autism spectrum disorder at the 18- and 24-month well-child visits. In a recent national survey, most pediatricians reported following this guidance, but I suspect that screening rates are considerably lower among family physicians. In my practice, I don't use an autism-specific screening instrument unless either I or the child's parent or guardian has behavioral concerns, in which case it's no longer screening, but evaluation.

Why not? In 2016, the U.S. Preventive Services Task Force concluded that "current evidence is insufficient to assess the balance of benefits and harms of screening for autism spectrum disorder (ASD) in young children for whom no concerns of ASD have been raised by their parents or a clinician." The Task Force observed that most ASD treatment studies included children who were considerably older than those identified through screening, and that no controlled studies have looked at the comparative clinical outcomes of screening-identified children with ASD. That is what a guideline writer would definitely want to know before recommending universal screening, even if the AAP didn't think so.

Dr. Doug Campos-Outcalt, a longtime colleague who has served as the American Academy of Family Physicians' liaison to the USPSTF, wrote in American Family Physician that four critical questions needed to be answered before screening for ASD would be "ready for prime time":

1. What are the sensitivity and false-positive rate of the best screening test for ASDs available in an average clinical setting?

2. How much earlier can screening tests detect ASDs compared with an astute clinician who asks a few key questions about, and acts on, parental concerns regarding a child's communication and interactions?

3. What are the potential harms of testing?

4. Does earlier detection by screening result in meaningful and long-lasting improvements compared with detection through routine care?

Although the answers to the second and fourth questions are arguably the most important, until last week there was little evidence to answer the first and third, either. If the recommended screening test, the Modified Checklist for Autism in Toddlers, Revised, with Follow-Up (M-CHAT-R/F), can't reliably detect most children who will eventually develop symptoms of ASD in later life, or there are so many false positives that the harms of parent anxiety and unnecessary diagnostic evaluations would outweigh the benefits, then universal screening is unlikely to work. Unfortunately, the first large study (n=26,000) of near-universal screening for ASD in 31 primary care clinics affiliated with Children's Hospital of Philadelphia just provided disappointing results on both of these fronts. Using an older version of the M-CHAT, the sensitivity of screening was only 38.8%, and only 14.6% of children who screened positive ultimately received an ASD diagnosis, with even lower positive predictive value in children residing in lower-income households.
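To make those two percentages concrete, here is a minimal Python sketch of what they imply. The only inputs taken from the study are the reported sensitivity (38.8%) and positive predictive value (14.6%); the function names and the "per 100" framing are my own illustrative assumptions, not anything the study authors published.

```python
# Illustrative sketch only: the two rates below are the figures reported
# in the post; all function names are hypothetical.

REPORTED_SENSITIVITY = 0.388  # share of children with ASD whose screen was positive
REPORTED_PPV = 0.146          # share of positive screens that ended in an ASD diagnosis


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity = true positives / all children who have the condition."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """PPV = true positives / all children who screened positive."""
    return true_pos / (true_pos + false_pos)


def missed_per_100_cases(sens: float) -> float:
    """Of every 100 children with the condition, how many the screen misses."""
    return round(100 * (1 - sens), 1)


def false_alarms_per_100_positive_screens(ppv: float) -> float:
    """Of every 100 positive screens, how many do not end in a diagnosis."""
    return round(100 * (1 - ppv), 1)


print(missed_per_100_cases(REPORTED_SENSITIVITY))
print(false_alarms_per_100_positive_screens(REPORTED_PPV))
```

Read this way, the reported rates mean that the screen missed roughly 61 of every 100 children later diagnosed with ASD, and that about 85 of every 100 positive screens did not lead to an ASD diagnosis.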

The authors pointed out that nearly 60% of children with initial positive screens did not return for a follow-up interview that might have reduced false positives and improved predictive value, and that children with positive screens who were diagnosed with ASD were more likely to receive interventions at a younger age, potentially improving outcomes. But the former simply shows how a two-stage screening test performs in real life, rather than in a controlled research setting. As for the latter, outside of anecdotes from screening advocates, we still have no conclusive evidence that long-term outcomes turn out better in these children. The bottom line? For universal screening for ASD in toddlers, the jury is still out.

Monday, September 23, 2019

Using life expectancy and prognosis to support shared decision-making

Due to competing causes of death (e.g., heart disease, stroke, dementia), the benefits of most screening tests decline with increasing age; for example, screening for breast and colorectal cancers is not recommended in persons with a life expectancy of less than 10 years. However, estimating how much time an individual has left to live and incorporating that estimate into shared decision-making with patients is challenging. As a result, a 2014 U.S. population-based survey found that 31% to 55% of participants with a greater than 75% risk of death in the next 9 years were still receiving breast, colorectal, or prostate cancer screenings.

There are many reasons why physicians provide so many unnecessary and potentially harmful screening tests to older persons with limited life expectancies. In an editorial in the September 1 issue of American Family Physician, Dr. Emma Wallace and Norah Murphy observed that "barriers to discussing life expectancy include uncertainty in prognostic estimates, limited time to broach this sensitive topic, and concerns about upsetting the patient or getting negative reactions."

A systematic review of the prognostic value of the "Surprise Question" approach (which asks clinicians, "would you be surprised if this patient died in the next 12 months?") found that the answer has varying degrees of accuracy at identifying patients in their last year of life. The QMortality tool, in contrast, generates a more precise estimate of one-year mortality in persons age 65 to 99 years utilizing multiple clinical and demographic variables, and was found to have good predictive accuracy in 500,000 family practice patients in England.

Some patients may feel uncomfortable about stopping nonbeneficial screening tests even if they are objectively unlikely to benefit from them. In a mailed survey of patients age 50 years or older in the Veterans Affairs health system, nearly 30 percent reported being "not at all comfortable" with discontinuing screening colonoscopy in a hypothetical patient scenario where a colorectal cancer-specific risk calculator predicted a low likelihood of benefit. To help physicians sensitively incorporate prognostic information into discussions about continuing or discontinuing screening, the University of California San Francisco's ePrognosis website provides risk calculators and video examples demonstrating key communication skills.


This post first appeared on the AFP Community Blog.

Monday, September 16, 2019

Too much medicine: research spotlights tests and interventions to consider avoiding in practice

In the fourth installment of an annual series, Drs. Roland Grad and Mark Ebell presented the "Top POEMs of 2018 Consistent with the Principles of the Choosing Wisely Campaign" in the September 1 issue of American Family Physician. Unlike the official list of Choosing Wisely campaign recommendations produced by the American Academy of Family Physicians and many other medical organizations, these suggested clinical actions were generated from recent research studies whose findings were judged by members of the Canadian Medical Association to help reduce overdiagnosis and overtreatment in practice. Drs. Grad and Ebell reviewed 13 of these POEMs (patient-oriented evidence that matters) in a previous article on the top 20 research studies of 2018 for primary care physicians.

This year's Choosing Wisely review article covered musculoskeletal conditions, respiratory disease, infections, cardiovascular disease, and miscellaneous topics. Here is a handy "cheat sheet":

1. Subacromial decompression surgery does not work.

2. Amitriptyline has no long-term benefits for chronic lower back pain.

3. In adults with mild asthma, as-needed budesonide/formoterol is as effective as a daily inhaled steroid.

4. In children with acute respiratory infections, broad-spectrum antibiotics are not more effective, but cause more adverse events, than narrow-spectrum antibiotics.

5. For chronic sinusitis, saline irrigation helps, and irrigation plus an intranasal steroid may help a little more.

6. A lower threshold for defining high blood pressure may harm patients at low risk for cardiovascular disease.

7. Don't order a high-sensitivity troponin level for a patient with a low pretest likelihood of myocardial infarction.

8. For women with symptomatic postmenopausal atrophic vaginitis, a nonprescription nonhormonal lubricant may be as effective as a vaginal estrogen tablet.

9. In adults with type 2 diabetes, NPH insulin is a cost-effective alternative to insulin analogues.

10. Ibuprofen is as effective as oral morphine for pain relief in children after minor outpatient orthopedic surgery, and has fewer side effects.

11. Skip the bath oil in children with atopic dermatitis.

Many of these overused tests and interventions are based on faulty pathophysiologic reasoning (e.g., if lowering blood pressure somewhat is good, then lowering blood pressure more should be even better).

Another valuable review of other research studies published in 2018 that highlighted medical overuse and health care services of questionable benefit appeared in JAMA Internal Medicine last week.

In a recent commentary on overuse in BMJ Evidence-Based Medicine, Drs. David Slawson and Allen Shaughnessy argued that "reducing overuse begins with the recognition and acceptance of the potential for unintended harm of our best intentions." They provided five examples of unintended harms of making medical decisions based on "what ought to work" rather than "what does work": activism gone awry (believing that no one is harmed by screening); innocent bystanders (traumatized loved ones of newborns with false positive screening results); the worried well we create (prediabetes); the butterfly effect (higher motor vehicle accident rates in patients with diabetes due to medication-induced hypoglycemia); and out of Oz and back to Kansas (over-extrapolating from research studies performed in ideal circumstances to real-world practice).


This post first appeared on the AFP Community Blog.

Tuesday, September 10, 2019

Draft USPSTF statement on screening for illicit drug use requires major revisions

It may surprise some observers that for its first quarter century, the U.S. Preventive Services Task Force did not post draft research plans, recommendation statements or systematic reviews online for public comments. Instead, these documents were developed and discussed on private conference calls and voted on at invitation-only Agency for Healthcare Research and Quality meetings, which I attended as a medical officer from 2006 through 2010. This policy changed after the media uproar over the USPSTF's 2009 mammography recommendations, which included criticism of the Task Force's lack of transparency in guideline development. Reluctant to open their meetings to the public out of fear that it would stifle candid debates about politically sensitive subjects, the USPSTF chose, instead, to institute a one-month public comment period on draft documents before finalizing their recommendations.

For the first few years, public comments resulted in few significant changes to draft statements. However, there are now examples of the public comment period leading to substantial changes in recommended testing options and letter grades in high-profile topics such as screening for colorectal cancer and cervical cancer. That's a good thing, since the USPSTF draft statement on screening for illicit drug use, which recently closed to public comments*, requires major revisions.

In 2008, the USPSTF concluded that "the current evidence is insufficient to assess the balance of benefits and harms of screening adolescents, adults, and pregnant women for illicit drug use." What specific evidence gap prevented them from making a recommendation?

The most significant research gap identified by the USPSTF is the lack of studies to determine if interventions found effective for treatment-seeking individuals with symptoms of drug misuse are equally effective when applied to asymptomatic individuals identified through screening.

Consequently, the research plan finalized by the USPSTF in October 2016 to update their 2008 statement focused on summarizing evidence of the benefits and harms of counseling interventions to reduce drug use in "screen-detected persons." Focusing the systematic review on this population recognized that their willingness and motivation to change their drug use behavior in response to an intervention likely differs from those who actively seek medical treatment.

The draft review produced by the team that carried out this research plan determined that a great deal more applicable evidence was published in the past decade: 27 randomized, controlled trials with a total of 8,705 participants. The studies' findings, however, were disappointing for advocates of screening:

Across all 27 trials, in general, there was no consistent effect of the interventions on rates of self-reported or biologically confirmed drug use at 3- to 12-month followup. Likewise, across 13 trials reporting the effects of the interventions on health, social, or legal outcomes, none of the trials found a statistically significant difference between intervention and control groups on any of these measures at 3- to 12-month followup.

In other words, interventions for persons who had illicit drug use detected by screening didn't reduce drug use, improve physical health, or lead to fewer brushes with the law. No benefit + no harm (though only 4 studies reported on potential screening harms) = no net benefit. So the appropriate response to the evidence would be to either recommend against primary care screening for illicit drug use (since it adds burden to practices without benefiting patients), or, if the studies were considered too heterogeneous to make that definitive a conclusion, to declare the evidence insufficient to determine the balance of benefits and harms.

Here, though, is where the Task Force appears to have gone off the rails. Rather than draw one of these two evidence-based conclusions, they instead commissioned a second systematic review from a completely different team (without posting a new research plan for public comment) seeking evidence on interventions in treatment-seeking populations. This draft review concluded that psychosocial interventions increase the likelihood of abstinence from drug use for up to 12 months, and that there are effective medications for opioid use disorder in persons who desire treatment (nice to confirm, but hardly a novel finding). The USPSTF relied on this second review (and apparently ignored the first one) to support their draft "B" recommendation to screen for illicit drug use in adults age 18 years or older.

Don't primary care clinicians already ask their patients about illicit drug use? We certainly do, as part of taking the social history of a new patient, but not in the methodical, intensive way that the USPSTF is now recommending. Perhaps the Task Force felt compelled by the pressure of the opioid epidemic to offer something more in terms of clinical prevention than an "I" statement or a politically unpalatable "D" recommendation against screening. Regardless of their rationale, by bypassing their published methods and processes to produce a statement that the evidence clearly doesn't yet support, the USPSTF has ventured onto dangerous ground, raising questions about their scientific credibility at a time when evidence-based institutions need to be defended more than ever.


* - A summary of my assertions in this post was submitted to the USPSTF during the public comment process.

Thursday, September 5, 2019

What we choose to name a disease matters

A few years ago around this time, I was dealing with a series of minor health problems. I developed a sinus infection that took several weeks to resolve. I twisted one of my knees ice skating, and for a while I feared that I had torn a meniscus. Occasionally after eating a heavy meal, I had the sensation that food was getting stuck on the way to my stomach - so along with an x-ray and MRI for my knee, my doctor also sent me for an upper GI series. Finally, my blood tests for a new life insurance policy came back with a slightly high hemoglobin A1c level. The A1c test was once used only to monitor glucose control in patients with established diabetes, but in 2010 the American Diabetes Association changed their diagnostic criteria to classify an A1c level of 6.5% or greater as consistent with diabetes, 5.7% to 6.4% as prediabetes, and 5.6% or lower as normal. So on top of knee tendinitis and gastroesophageal reflux disease (GERD), I also found out that I had prediabetes.
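For readers who like to see cutoffs written out, the A1c categories described above can be sketched in a few lines of Python. The thresholds are the ones quoted in this post; the function name and the category labels are my own hypothetical choices, and this is an illustration of the cutoffs rather than a diagnostic tool.

```python
def classify_a1c(a1c_percent: float) -> str:
    """Map a hemoglobin A1c value (in percent) to the categories quoted above.

    Hypothetical helper for illustration: 6.5% or greater is labeled
    "diabetes", 5.7% up to (but not including) 6.5% is "prediabetes",
    and anything below 5.7% is "normal".
    """
    if a1c_percent >= 6.5:
        return "diabetes"
    if a1c_percent >= 5.7:
        return "prediabetes"
    return "normal"


for value in (5.6, 5.7, 6.4, 6.5):
    print(value, classify_a1c(value))
```

A difference of one-tenth of a percentage point is enough to move someone from one label to the next.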

Intellectually, I knew that there was no evidence that screening for prediabetes is beneficial (the life insurance company, not my doctor, had ordered the test), and that a screen-and-treat approach to diabetes prevention leads to lots of overdiagnosis. Emotionally, it was a different story. I had recently turned 40 and was feeling old. It had been years since I had gotten the recommended amount of physical activity for adults, and now I was doing even less because my knee hurt. It didn't help that the afternoon I found out about my A1c level, my wife called and asked me to pick up some Burger King sandwiches and fries to bring home for dinner. Not exactly what a pre-diabetic adult with GERD should be eating.

Would I have felt less sick if I had instead been told that I had "slightly high blood sugar"? In recent years, oncologists have recommended re-naming slow-growing lesions that we currently call cancer, such as "ductal carcinoma in situ" of the breast, as indolent lesions of epithelial origin (IDLE), hoping that a less scary term will discourage patients from pursuing unnecessarily aggressive (and potentially harmful) treatment. Similarly, a study showed that telling patients that they have a "chest cold" rather than "acute bronchitis" will help them feel more satisfied when they don't receive an antibiotic prescription.

A systematic review published in BMJ Open supported the notion that what clinicians choose to name a disease influences patients' management preferences. Some study examples: women who were told they had "polycystic ovary syndrome" were more likely to want a pelvic ultrasound than those who were told they had a "hormone imbalance." Women were more likely to want surgery if they had "pre-invasive breast cancer cells" versus "abnormal cells" or a "breast lesion." Patients were more likely to expect surgery or casting of a "broken bone" or "greenstick fracture" than a "hairline fracture" or "crack in the bone." In each of these cases, the use of a more medicalized or precise term led patients to prefer invasive management options that were no better than more conservative choices.

How will I apply this knowledge to my daily practice? Although I already use the term "prediabetes" sparingly (preferring "increased risk for diabetes"), I'm going to start telling more patients with A1c levels similar to mine that they have high blood sugar instead. That they have heartburn rather than GERD. That they have overuse knee strains instead of tendinitis. And certain medical terms, such as "advanced maternal age" (i.e., pregnancy after the age of 35, or my wife's age when she gave birth to 3 of our 4 children), I will strive to eliminate from my vocabulary entirely.


A slightly different version of this post first appeared on Common Sense Family Doctor on October 5, 2017.