Wednesday, April 24, 2024

How artificial intelligence will make my work easier

A recent article in the Pittsburgh Post-Gazette outlined the various ways that artificial intelligence (AI) is improving health care in Pennsylvania. For example, AI software can serve as a "virtual scribe," listening to the doctor-patient conversation during an office visit and drafting a note, freeing the doctor to focus on the patient for 100% of the time. AI can "draft letters to health insurers on behalf of patients who need specialty medications, medical equipment or other care that's not standard in their insurance benefits," saving time for doctors and office staff. In the future, AI could respond to patient portal messages, triage phone calls, or even suggest diagnoses.

That all sounds great, but a lot of people thought that electronic health records would make clinicians' work easier when they were implemented, too, and we know how that worked out (or didn't). So what's the evidence that AI will actually deliver on its promise in health care?

A case study published in NEJM Catalyst described Kaiser Permanente (KP) Medical Group's implementation of AI scribes using smartphone microphones to document more than 300,000 patient encounters across all medical specialties:

The response from physicians who have used the ambient AI scribe service has been favorable; they cite the technology’s capability to facilitate more personal, meaningful, and effective patient interactions and to reduce the burden of after-hours clerical work. In addition, early assessments of patient feedback have been positive, with some describing improved interaction with their physicians. Early evaluation metrics, based on an existing tool that evaluates the quality of human-generated scribe notes, find that ambient AI use produces high-quality clinical documentation for physicians’ editing. Further statistical analyses after AI scribe implementation also find that usage is linked with reduced time spent in documentation and in the EHR.

How about the electronic inbox and the increasing burden of responding to patient portal messages? One approach to streamlining this workload is making sure that requests are routed to the right person in the practice, often front office staff or nurses rather than physicians. A research letter in JAMA Network Open described the content of nearly 5 million electronic messages from patients that KP Northern California received between April and August 2023 and classified using real-time natural language processing. In a pilot quality improvement study in primary care and gastroenterology practices at Stanford Health Care, responses to messages were drafted by a large language model (LLM), and clinicians (physicians, advanced practice providers, nurses, and clinical pharmacists) were surveyed pre- and post-program implementation. Although the LLM drafted responses to 75% of messages, the average clinician used a draft only 20% of the time, with primary care clinical pharmacists using drafts the most (44%). There was no change in the amount of time clinicians spent managing their inboxes. However, task load and work exhaustion scores declined in the post-survey, and many clinicians appreciated that editing a draft required less effort than writing a response from scratch.

As a medical editor and author of hundreds of published papers, I think that last point makes sense - leading me finally to the use of AI outside of the clinic to draft scientific review articles. Currently, most journals either prohibit AI use or require authors to describe exactly how AI was used to develop a manuscript. A recent study compared papers on 3 topics related to bone health (Alzheimer's disease, fracture healing regulation, and effects of COVID-19) that were written by 1) a human only; 2) ChatGPT only; and 3) a human and ChatGPT working together ("AI-assisted"). Unsurprisingly, the most accurate papers that required the least amount of time to write were AI-assisted (human-supervised?) where the AI was given not only a prompt but an outline and references. I'm still waiting to see the first American Family Physician submission where the authors were assisted by AI. It's only a matter of time - unless, of course, it's already happened and I just didn't realize it.

Saturday, April 20, 2024

Reducing harms associated with PSA screening

In the U.K. Cluster Randomized Trial for PSA Testing for Prostate Cancer (CAP), more than 400,000 men in primary care practices between 2001 and 2009 either were invited to receive a single PSA screening test or received usual care. After a median follow-up of 10 years, there were more prostate cancer diagnoses in the screening group, but no effect on prostate cancer mortality. (Men diagnosed with localized prostate cancer were invited to participate in a separate trial comparing active monitoring, surgery, and radiotherapy, which Dr. Middleton discussed previously on the AFP Community Blog.) In a secondary analysis of the CAP trial after 5 more years of follow-up, researchers found a small difference in prostate cancer mortality favoring the screening group (absolute reduction = 0.09%, number needed to screen = 1,111 to prevent one prostate cancer death). However, the screening group was at greater risk of detection of low-grade (Gleason score <=6) cancers that are likely to be clinically unimportant and represent overdiagnosis.
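
The number needed to screen quoted above is simply the reciprocal of the absolute risk reduction. As a minimal sketch of that arithmetic in Python (the function name is my own, chosen for illustration, and is not taken from the trial report):

```python
def number_needed_to_screen(absolute_risk_reduction_percent: float) -> int:
    """Return the number needed to screen for a given absolute risk reduction.

    The input is a percentage (0.09 means 0.09%), as reported in the text.
    The result is rounded to the nearest whole person.
    """
    if absolute_risk_reduction_percent <= 0:
        raise ValueError("absolute risk reduction must be positive")
    # Convert the percentage to a proportion, then take the reciprocal.
    proportion = absolute_risk_reduction_percent / 100
    return round(1 / proportion)


# CAP secondary analysis: absolute reduction of 0.09%
print(number_needed_to_screen(0.09))
```

With the 0.09% reduction reported in the CAP secondary analysis, this gives the 1,111 figure cited above.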

Magnetic resonance imaging (MRI) is increasingly being used as a triage strategy for men with suspected prostate cancer to avoid unnecessary biopsies while still detecting clinically significant cancers at curable stages. A 2024 systematic review and meta-analysis of 72 studies (n=36,366) examined associations between MRI Prostate Imaging Reporting & Data System (PI-RADS) findings, clinical data, and clinically significant prostate cancer. Compared to performing prostate biopsies on all patients, avoiding biopsies in patients with PI-RADS category 3 or lower lesions and PSA density of 0.10 or less reduced unnecessary biopsies by 30% and missed 1 in 17 significant tumors. Increasing the PSA density threshold to 0.15 reduced unnecessary biopsies by 48% and missed 1 in 15 significant tumors.

Several randomized trials are evaluating the effectiveness of a screening strategy combining MRI and a PSA-based biomarker risk score (e.g., 4-Kallikrein Panel) to determine which patients with abnormal PSA levels should be biopsied. The ProScreen trial, involving more than 60,000 Finnish men aged 50 through 63 years, recently reported preliminary results from its baseline screening round. Researchers found that compared to the usual care group, men invited for screening were more likely to have high-grade prostate cancer detected (1 per 196 men) at the cost of also being more likely to have low-grade prostate cancer detected (1 per 909 men). Whether these small differences will lead to meaningful improvements in prostate cancer mortality will not be known for at least several years.

A systematic review published on April 7 in JAMA reiterated the importance of continuing to use cancer-specific mortality as the primary outcome in randomized trials of cancer screening. The authors evaluated the strength of correlations between reductions in stages 3 and 4 cancer (a proposed surrogate outcome for trials of multicancer screening tests) and reductions in cancer-specific mortality in 41 published randomized trials of screening for breast, colorectal, lung, ovarian, prostate, and other cancers. They found high correlations for ovarian and lung cancers, but only a moderate correlation for breast cancer, and weak correlations for colorectal and prostate cancers.

**

This post first appeared on the AFP Community Blog.

Saturday, April 6, 2024

Should race be incorporated into weight management decisions?

I have a personal stake in the answer to this question. For most of my adult life, my body mass index (BMI) has ranged between 22 and 25 kg/m2, which is considered to be in the normal range (the threshold for overweight is a BMI of 25, and obesity a BMI of 30). But it turns out that I've been overweight for most of that time if one applies a race-specific definition of overweight (BMI greater than 23) for individuals of Asian descent. Where did this race-based cutpoint come from, and is it still relevant in an era when we generally frown on using race as a surrogate for social determinants of health in making clinical decisions?

The story starts more than two decades ago, when an expert committee convened by the World Health Organization (WHO) examined associations between BMI, body fat percentage, and risk factors for type 2 diabetes and cardiovascular disease in studies of Asian populations. They found that at similar BMI levels, Asian adults have higher body fat percentages and more metabolic risk factors than White adults. Although the WHO declined to formally establish different BMI thresholds for overweight and obesity in Asian populations, it suggested "additional trigger points for public health action": BMI greater than 23 represents "increased risk" and BMI greater than 27.5 represents "high risk."
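
To make the two sets of cutpoints concrete, here is a small Python sketch that labels a BMI under the standard thresholds (25 for overweight, 30 for obesity) and under the WHO's suggested trigger points for Asian populations (23 and 27.5). The function names, the label for values under 25, and the handling of values that fall exactly on a cutpoint are my own assumptions for illustration, not part of any guideline.

```python
def standard_category(bmi: float) -> str:
    """Label a BMI (kg/m2) using the standard overweight and obesity thresholds."""
    if bmi >= 30:
        return "obesity"
    if bmi >= 25:
        return "overweight"
    return "not overweight"


def who_asian_trigger_point(bmi: float) -> str:
    """Label a BMI (kg/m2) using the WHO's suggested trigger points for Asian populations."""
    if bmi > 27.5:
        return "high risk"
    if bmi > 23:
        return "increased risk"
    return "below trigger points"


# A BMI of 24 is labeled differently depending on which set of cutpoints is applied.
for bmi in (22.0, 24.0, 28.0):
    print(bmi, standard_category(bmi), who_asian_trigger_point(bmi))
```

A BMI of 24, for example, is not overweight by the standard thresholds but falls in the WHO's "increased risk" range.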

In 2015, the American Diabetes Association (ADA) examined evidence from 4 cohort studies in Asian American populations and concluded that Asian American adults should be considered for diabetes screening if they have a BMI greater than 23, based on the prevalence of type 2 diabetes in this population being roughly equivalent to that in White Americans with a BMI greater than 25. (The U.S. Preventive Services Task Force recommends screening for prediabetes and diabetes in nonpregnant adults aged 35 to 70 with a BMI of 25 or greater, but it alludes to the ADA's lower threshold for Asian Americans in its practice considerations.) Notably, the studies cited by the ADA included virtually no persons of Chinese descent, despite Chinese being the largest Asian American subgroup. So this guideline does not necessarily apply to me.

However, a 2009 study of a large cohort (n=36,386) of Taiwanese civil servants and schoolteachers over age 40 found that all-cause mortality increased significantly at BMIs greater than 25, analogous to the increase in mortality seen in White populations with obesity (BMI > 30). Taiwanese adults with BMIs from 23 to 24.9 had no difference in all-cause mortality compared to persons with lower BMIs but showed a nonsignificant trend toward increased cardiovascular mortality that was not modified by smoking status. That this study suggested a risk threshold nearly identical to the one found in studies of Asian American populations would argue that I am not exempt.

A more recent comparative study of minority populations living in England found that South Asians with lower BMIs had the highest risk of developing diabetes, followed by Arab, Chinese, Black, and finally White populations. Presumably, race and ethnicity were self-identified. Similarly, a 2023 scientific statement from the American Heart Association found that the risk of coronary artery disease appears to be highest among South Asian and Filipino Americans and lowest among Chinese, Japanese, and Korean Americans, but cautioned that limited disaggregated data precluded making clinical recommendations based on race or ethnicity. As a JAMA news article recently noted, the common practice of national surveys lumping diverse ethnic groups into a single "Asian" category obscures disparities within those groups and frustrates efforts to achieve health equity.

My admittedly selective review of the data leads me to believe there is probably some value to considering more intensive lifestyle counseling and metabolic screening in Asian patients with BMIs between 23 and 25, like me. But what do we do about the rising numbers of American adults of mixed race? Perhaps "precision medicine" will eventually find a way to integrate genetic and environmental risks and let clinicians dispense entirely with numeric thresholds and race categories, but I would be surprised if this occurs before the end of my career in medicine.