Amy K. Saenger, PhD, DABCC, is a Director of Cardiovascular Laboratory Medicine and Assistant Professor of Laboratory Medicine and Pathology, College of Medicine, Mayo Clinic, Rochester, Minnesota. Dr. Saenger presents an update on test utilization strategies in the clinical chemistry laboratory.
Contact us: MMLHotTopics@mayo.edu.
Thank you for the introduction. I will be presenting Part 2 in a series dedicated to Laboratory Test Utilization Strategies, specifically focusing on examples from the Clinical Chemistry Laboratory. My colleagues, Dr. Curt Hanson and Dr. Bobbi Pritt, are presenting Part 1 and Part 3 in the series.
I have no conflicts or financial disclosures to make.
Laboratory test utilization can be defined as a strategy for performing appropriate laboratory and pathology testing with the goal of providing high-quality, cost-effective patient care. However, first I believe it is essential to succinctly define what an inappropriate or unnecessary laboratory test really is. In fact, the definition may be quite simple: it is any test where the results are not likely to be “medically necessary” for appropriate clinical management of the patient. It should not be defined as any laboratory test which is unlikely to be reimbursed. Strategies which have motives focused solely on financial incentives have been shown to be ineffective and less sustainable than those which center on value and quality of patient care. The frequency of inappropriate testing is difficult to quantitate, although the prevalence is estimated at somewhere between 25% and 40% of all laboratory orders. Potentially even more difficult to grasp is quantitation of under-testing. What we do know is that there exists a wide variability in test request patterns between institutions, within 1 institution, and even within specialty practices, thereby suggesting the prevalence of inappropriate testing is likely much greater than we are even aware of.
There are some things that we as laboratorians, physicians, and health care professionals can be certain of when a brand new test is introduced to a laboratory’s test menu. In general, once we introduce that assay onto our test menu, the clinical specificity of the test will decrease over time. For example, Mayo Medical Laboratories offers 2 new serum biomarkers called galectin-3 and soluble ST2. These assays are targeted towards a specific population and have demonstrated utility for prognosis and risk assessment only in patients already diagnosed with heart failure. Therefore, the test has a specific utility in a specific clinical condition and we would not want this assay ordered in everybody; this would decrease the specificity of the assay although the clinical sensitivity would likely improve. However, with a brand new test often comes a level of excitement and propensity towards ordering the test. Without proper education and availability of relevant real-time information, ST2 and galectin-3 assays could become as commonplace as vitamin D testing.
This slide was shared with me by Dr. Mike Astion from the University of Washington and Seattle Children’s Hospital and shows data that he acquired in working with an insurance company. Although the information is from a few years back, the data are still fairly representative with chemistry testing comprising approximately 20% to 25% of the total dollar amount spent on laboratory tests. Laboratory costs are increasing at an annual rate of 20% to 25% with much of that increase due to significant growth in the arena of molecular testing and genetic studies. Therefore, contemporary data might show a similar mix of laboratory spending by discipline except for a likely escalation in the molecular and genetic testing areas.
When dealing with laboratory utilization issues and traditional chemistry/immunology/endocrinology assays, I believe chemistry assay utilization can be almost more difficult to control than, for example, molecular and genetic tests. Why is that? Well, these tests are often deeply embedded into almost every clinical practice. Therefore, you have physicians, nurses, and physician assistants who were taught historical practices (or “best practices”) in medical school which may now be outdated or inappropriate. You may encounter the statement that they’ve always had standing orders for blood gases, magnesium, liver enzymes, or just about anything. There is also the inherent attitude that it’s “just” a glucose, iron, BNP, or whatever test they want to order, and the individuals may not realize that the sedimentation rate they are ordering twice a day requires some bit of manual intervention and testing. Chemistry assays are also just a piece of the whole clinical puzzle, providing a hint at a clinical condition, prognosis, or outcome but rarely yielding an answer which is black or white. Therefore, everyone becomes or is the expert (self-defined or not) when it comes to ordering and interpreting chemistry tests.
There are several ways to look at examples of waste and nonstandard chemistry tests. First, you have legitimate tests which are ordered and/or utilized in an inappropriate clinical setting. These could include ordering a prostate specific antigen (PSA) tumor marker assay to screen for prostate cancer in a 90-year-old man. They may include daily orders for almost any analyte, which are rarely needed or justified, or your laboratory might perform widespread 25-hydroxyvitamin D testing for everyone, because physicians and patients have heard that everyone seems to be vitamin D deficient in some way, shape, or form. The guidelines, evidence, and actual ordering practices are not lining up in this scenario. Second, you may have tests available or sent out which have weak evidence to support their clinical utility and/or whose utility is not well defined. These include IgG allergy testing, the Vertical Auto Profile (VAP) assay, and massive cardiovascular risk prediction panels or order sets. Odds are not in the patient’s favor that all of those results are going to come back “normal”, which further feeds the loop of inappropriate testing and potential additional unnecessary procedures/tests for the patient. Finally, there are the laboratories which offer tests based on “junk science”. These are the tests which have odd or scary-sounding names and no data to support their use, things like an adrenal stress panel, salivary hormone tests, or assays which can somehow measure how well your liver is able to detoxify your body. Physicians or patients themselves may order these tests regardless, but in the end they create downstream waste in the medical and health care system.
So, how do we go about changing these things or how tests are ordered? Trying to change fundamental behavior and behavioral patterns is difficult. If we look at the whole life cycle of a test, the actual critical point begins at the time the test is ordered; it essentially dictates what happens during the rest of the cycle. If you order the wrong test, you may not even realize it until you receive an unanticipated result and/or follow up on the results with the patient. If you order the right test but don’t have all the required information (such as in a timed creatinine clearance test), you create issues downstream. Studies have demonstrated that in order to effectively change physician behavior, both education and direct feedback are crucial for the success of any test utilization initiative. Furthermore, feedback has the most impact if given to the individual as close to the decision-making time as possible.
One of the many successful strategies which have been utilized by Massachusetts General Hospital involves controlling the visibility of the test within the orders system. They actually displayed the price of the assay within their electronic orders system and, postimplementation, demonstrated a significant difference in the overall number of duplicate and inappropriate test orders. In that case, giving a “price or cost” to a laboratory test somehow made it more tangible, knowing that someone is going to have to pay for the cost of the test. This strategy might be particularly effective within institutions where residents or fellows are the primary individuals placing the order. Other institutions have implemented similar strategies, although instead of displaying direct costs the tests may be color coded like a stoplight (where green would equal good and red would equal bad); the colors could indicate either cost or appropriateness of the order. There are many successful stories out there, each unique, based on the diversity of the practice and/or availability of resources.
One strategy which is quite simple to implement if you utilize computerized provider order entry (or CPOE) is to perform a thorough audit of your test menu in the system. As an example, we found some inaccuracies with how our hCG test was built in our electronic orders system. The hCG assay is commonly utilized to diagnose and monitor pregnancy. However, hCG may be secreted by abnormal germ cell, placental, or embryonal tissues, and therefore also has utility as a tumor marker (in particular for ovarian germ cell tumors and/or testicular tumors). Because of these distinct uses for hCG we have 2 separate test codes, reference ranges, and sets of interpretive information available for hCG. The laboratory was acutely aware that specimens were being received with orders for the quantitative pregnancy hCG assay for patients who were geriatric and/or male. In addition, women being seen in our OB practice had orders placed for the hCG tumor marker test. By simply evaluating how these tests were defined in CPOE we noted the test naming convention and keywords were defined incorrectly. We modified the test name and provided “useful for” information which was embedded right in the order set, at the time the physician was ordering the hCG assay. We have not had a similar instance since that time.
Often it can be as simple as providing the useful information embedded as a comment within the electronic medical record. One such analyte that I commonly received questions about was NT-Pro BNP, due to the various cutoffs reported and clinical scenarios which may affect interpretation of NT-Pro BNP (including renal failure, obesity, age and gender). The first sentence of the following comment is appended to all NT-Pro BNP results to yield relevant information on the negative predictive value of the analyte; the rest of the information is automatically added based on the age of the patient. This strategy has prevented many phone calls because the providers have all of the relevant information right next to the result and there is often no need to seek additional interpretation in a separate location.
A similar scenario was implemented for cardiac troponin when we altered our cardiac biomarker panel to remove CK-MB and require serial sampling, consistent with the international guidelines for diagnosis of acute myocardial infarction (AMI). In this case it is crucial to define a change or a delta and so we report a baseline, a 3-hour and a 6-hour result, along with a delta, which is interpreted as significant or not significant depending on how much it is changing between time points.
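As a rough illustration of the delta reporting just described, the following Python sketch computes the change from baseline at each later time point and labels it. The function name, the use of an absolute change from baseline, and the `threshold` parameter are all assumptions made for illustration; the actual delta criteria are not stated here, so the threshold is a required input and has no default.

```python
def interpret_serial_troponin(baseline, three_hour, six_hour, threshold):
    """Return the delta from baseline and an interpretation for each time point.

    `threshold` is a hypothetical, laboratory-defined absolute change; it is
    not taken from the talk and must be supplied by the caller.
    """
    report = {}
    for label, value in (("3-hour", three_hour), ("6-hour", six_hour)):
        delta = value - baseline
        # A rise or a fall counts, so compare the magnitude of the change.
        significant = abs(delta) >= threshold
        report[label] = {
            "result": value,
            "delta": delta,
            "interpretation": "significant" if significant else "not significant",
        }
    return report
```

Whether the comparison should be against baseline or against the previous time point, and whether the change should be absolute or relative, are design choices a laboratory would have to settle with its clinicians and its assay; this sketch picks one option only to show the shape of the report.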
Now I will discuss a couple of examples of test utilization initiatives that we’ve undertaken to reduce inappropriate or duplicate testing. The first example I’m going to discuss is looking at red cell folate versus serum folate testing.
So, folate is a water-soluble B vitamin that is essential for adequate health. Folate, along with vitamin B12, is essential for DNA synthesis, repair, and methylation. Folate deficiency manifests clinically in a variety of ways; notably it is strongly linked to an increased risk of neural tube defects, and several observational and controlled trials have demonstrated that neural tube defects are significantly reduced with periconceptional folic acid supplementation. Individuals with folate deficiency may present clinically with unexplained, nonspecific neurological symptoms including dementia, weakness, and headaches. Folate and vitamin B12 deficiencies are both associated with a reduction in hemoglobin and megaloblastic changes in the bone marrow or other tissues. Megaloblastic anemia is the primary manifestation of folate deficiencies, where erythrocytes become abnormally large and nucleated due to the lack of folate necessary for DNA synthesis and cell division. Recognition of the significant relationship between folate and neural tube defects, cancer, and cardiovascular disease led to FDA-mandated fortification of breads, cereals, flours, pasta, and other grain products. Complete fortification was fully implemented in the United States in 1998, with the primary goal of reducing neural tube defects. Overall, fortification has been successful and the prevalence of low serum folate among women of childbearing age declined from 20.6% between 1988 and 1994 to just 0.8% in 2005-2006.
Suspicion of folate deficiency may originate if patient history reveals any relevant clinical conditions and/or from results of a routine complete blood count, where a low hemoglobin and high MCV are observed. The MCV is a nonspecific indicator when used by itself, as patients with concurrent iron and folate deficiency will not have the characteristic macrocytosis seen in folate deficiency alone. Laboratory diagnosis of folate deficiency further includes measurement of serum folate, and less often red blood cell (RBC) folate. There are a number of methods which can be utilized to quantitate serum or red cell folate, including microbiologic assays, competitive protein-binding assays, or chromatography. A majority of laboratories use a protein-binding assay with chemiluminescent detection for both serum and red blood cell folate, where the red cell hemolysate is prepared by manually lysing the specimen with ascorbic acid, which releases the intracellular folate, primarily in the 5-methyltetrahydrofolate form. Folate is taken up only by the developing erythrocyte; therefore RBC folate has historically been regarded as the better indicator of long-term folate storage. Serum folate concentrations reflect recent dietary intake of folate, so measurements need to be conducted after the patient fasts. Thus, while RBC folate is theoretically less susceptible to rapid changes in dietary intake, analytically the assays are plagued with imprecision issues.
We evaluated utilization and ordering patterns of serum and red cell folate to determine the frequency with which both assays were ordered and to determine necessity. The following table shows the annual volumes, both internal and external, for serum and red cell folate. The high external but not internal volume of RBC folate testing is likely due to its manual nature which is not amenable to high-throughput automated analyzers and workflows. You can also note the Medicare reimbursement for both assays is fairly similar while the direct costs to the laboratory are quite different.
Results from our laboratory and other studies suggest that American populations have indeed attained adequate concentrations of folate and that modern folate deficiency has essentially been eliminated. This graph shows results from Mayo Clinic patients (in Rochester, MN) who had serum folate ordered over a 2-year time period. Of the almost 25,000 serum folates performed, only half a percent were less than 3.0 ng/mL, which is the deficient concentration as defined by NHANES and the CDC. Thus, a huge majority of the samples being tested are normal. Not many other routine laboratory tests have this type of distribution; normally a more Gaussian distribution of results is observed.
We further undertook a 10-year retrospective analysis of red blood cell and serum folate results to examine ordering patterns and evaluate the clinical utility of RBC folate in our own patient population. Results were retrieved from the laboratory information system at Mayo for all serum and red cell folate tests ordered on inpatients and outpatients between 1999 and 2009. Data for patients who had simultaneous orders for serum and red cell folate were analyzed, and chart reviews were conducted on those patients with normal serum folate but low red cell folate; these are the individuals who have the potential for misdiagnosis if screened with a serum folate alone. Abnormal values were defined by the NHANES/CDC criteria for folate deficiency (serum folate <3.0 ng/mL and RBC folate <140 ng/mL). A total of 152,166 serum and 15,708 RBC folate assays were performed over the decade. The prevalence of folate deficiency was 0.39% when using only serum folate and 0.27% when using only RBC folate. There were 1082 patients in whom serum and red cell folate were ordered concurrently, as seen in the table. Only 1 individual, or 0.09%, had both an abnormal serum and an abnormal red cell folate using traditional definitions for deficiency. Only 4 individuals had a normal serum folate but abnormal RBC folate; 1 of these was nonfasting and 3 were considered “difficult” interpretations (1 of whom had an MTHFR mutation, and the other 2 results were not reproducible).
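The paired comparison above can be sketched in a few lines of Python. This is only an illustration of the tabulation, not the code used for the study: the function names and the input format (a list of serum and RBC folate values in ng/mL) are assumptions, while the two cutoffs are the NHANES/CDC values quoted above.

```python
from collections import Counter

# Deficiency cutoffs quoted in the talk (NHANES/CDC criteria), in ng/mL.
SERUM_CUTOFF_NG_ML = 3.0
RBC_CUTOFF_NG_ML = 140.0


def classify_pair(serum, rbc):
    """Classify one concurrent serum/RBC folate order as (serum status, RBC status)."""
    serum_status = "low" if serum < SERUM_CUTOFF_NG_ML else "normal"
    rbc_status = "low" if rbc < RBC_CUTOFF_NG_ML else "normal"
    return serum_status, rbc_status


def tabulate(pairs):
    """Count concurrent orders in each cell of the 2x2 table."""
    return Counter(classify_pair(serum, rbc) for serum, rbc in pairs)


def percent(count, total):
    """Express a cell count as a percentage of all concurrent orders."""
    return round(100.0 * count / total, 2)
```

The cell of interest for chart review is `("normal", "low")`, the patients who could be missed if only serum folate were measured. As a check on the arithmetic, `percent(1, 1082)` gives the 0.09% reported for patients with both results abnormal.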
In a recent CAP survey, RBC folate measured by the most prevalent methods had higher percent CVs than serum folate measured by the same methods; at low concentrations, the serum folate CVs were lower by almost half. This suggests that if folate deficiency is truly rare, and the 2 assays available give clinically equivalent results, the one with a manual processing step and greater imprecision has limited value and may be discontinued (again, if clinically warranted).
To discontinue RBC folate or any other test, it is important to seek and obtain buy-in from your clinicians. To do this, it is helpful to gather your own institutional data from your own patients to make a case for why the test has limited diagnostic utility. Present these data to the relevant clinical practice committees; for RBC folate the most frequent consumers of the test were physicians in hematology, gastroenterology, and endocrinology. It was anticipated that the decision to discontinue the test was going to be difficult but, in reality, the clinical practice groups were all champions for removing the assay from the test menu and from hot buttons within CPOE, and for producing educational materials for the rest of the practice. It became equally important to communicate this change practice-wide, which can be accomplished by a variety of mechanisms, depending on your institution, including newsletters, memos, grand rounds, or other modalities.
To conclude this example, routine ordering of serum and RBC folate together is unnecessary. Folic acid supplementation is a reasonable approach without actually testing for deficiency. Finally, red cell folate provides equivalent diagnostic information to serum folate in almost every clinical situation, and laboratories in general, with very little exception, should not be ordering RBC folate.
A second initiative regarding test utilization focuses on the inflammatory markers erythrocyte sedimentation rate, or ESR, and C-reactive protein, or CRP. This is a project spearheaded by my colleague Dr. Darci Block. The project involved analyzing the ordering patterns for the 2 tests and identifying their discordance, the indications for ordering, and the clinicians’ reactions to the discordant results. Ideally, our goal is to reduce the volume of ESR tests performed in our laboratory.
Inflammation results from the body’s immune response, involves both the vascular and immune systems, and causes several nonspecific symptoms including pain, redness, heat, and swelling. During the inflammatory response (both acute and chronic), several proteins and molecules known as acute phase reactants are produced. CRP is an acute phase reactant made by the liver and is involved in complement activation. Its primary clinical utility lies in detection and monitoring of a wide variety of inflammatory diseases. Assays for CRP are automated and the turnaround time is typically short. Production and the presence of CRP in serum are nonspecific, but it does offer the advantage of being a sensitive marker of early inflammation. Erythrocyte sedimentation rate, ESR, is an indirect measure of inflammation that is dependent on the formation of rouleaux. These are RBCs that aggregate to an extent that depends on plasma concentrations of acute phase reactants. ESR in our laboratory is measured by the manual Westergren method, which involves measuring the distance the RBCs settle in a tube within 60 minutes (or watching the blood fall in the tube). There are automated platforms available to perform the measurement, although we were hesitant to implement an automated test for a poor assay without first attempting to reduce inappropriate utilization. ESR has more specific indications for diagnosing and monitoring a subset of rheumatologic diseases. However, it too is nonspecific and has the additional disadvantages of being influenced by age, gender, anemia, and protein abnormalities. Overall, in a side-by-side comparison, if given the choice for most patients and most diseases, CRP is the preferred method for detecting and monitoring inflammation.
We evaluated the ordering patterns at Mayo Clinic, Rochester over a 1-year time period. The results demonstrated over 50,000 requests for ESR, which outnumbered orders for CRP. The data also showed that three-quarters of CRP requests were made in combination with ESR, totaling about 30,000 simultaneous orders.
We then evaluated the reasons or indications for sed rate and CRP orders. In this data set there were over 900 unique ICD-9 codes associated with ESR or CRP orders. Each of the orders had up to 3 indications for testing, and 95% of the orders were made for the same indication. This information caused us to further question the necessity of these duplicate orders. We looked at the discordance of the results by defining abnormal according to our normal reference intervals. Overall, concordance between the 2 assays was 81%, which was surprisingly very good! Amongst the discordant results, CRP was elevated a majority of the time owing to its increased sensitivity in detecting inflammation.
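For readers who want to run the same kind of check on their own data, here is a minimal Python sketch of a concordance calculation. It assumes each simultaneous order has already been classified as normal or abnormal against the laboratory's own reference intervals (which are not given here and, for ESR, typically vary by age and gender); the function name and input format are illustrative assumptions.

```python
def summarize_concordance(pairs):
    """Summarize agreement between paired ESR and CRP results.

    `pairs` is a list of (esr_abnormal, crp_abnormal) booleans, one per
    simultaneous order, classified beforehand against the laboratory's own
    reference intervals (hypothetical input format).
    """
    total = len(pairs)
    if total == 0:
        raise ValueError("at least one paired result is required")
    # Concordant means both normal or both abnormal.
    concordant = sum(1 for esr, crp in pairs if esr == crp)
    crp_only = sum(1 for esr, crp in pairs if crp and not esr)
    esr_only = sum(1 for esr, crp in pairs if esr and not crp)
    return {
        "percent_concordant": 100.0 * concordant / total,
        "crp_abnormal_only": crp_only,
        "esr_abnormal_only": esr_only,
    }
```

Splitting the discordant pairs into "CRP abnormal only" and "ESR abnormal only" is what allows the kind of follow-up observation made above, namely which marker accounts for most of the disagreement.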
The most discordant panels were identified, and significance in discordance was defined using altered abnormal cutoffs chosen to account for both biological and analytical variability. The clinician’s reaction to the lab result was documented following chart review. Overall, rheumatoid arthritis had mixed results, with some cases where ESR and CRP correlated with disease activity and other cases where they did not, and based on Dr. Block’s review it was unclear which test should be recommended. In the polymyalgia rheumatica and giant cell arteritis cases, CRP was by far the marker that best reflected disease activity, while elevations of ESR in asymptomatic patients were explained by anemia, hypergammaglobulinemia, and other causes, suggesting CRP is the marker of choice in these groups. Finally, the inflammation group showed a similar result, suggesting CRP could be performed alone in these patients. In conclusion, these data show that with an 81% concordance and acceptable performance of CRP in many different disease groups, an isolated CRP result is adequate for many indications. Overall, this suggests we have the potential to reduce ESR orders by about 50% or more. Our next steps involve meeting with clinicians to discuss these results and develop strategies to curtail ESR requests, similar to what was done with RBC folate testing.
One final note about dealing with pseudoscience and waste in test utilization. These are often send-out tests, which are referred to external reference laboratories. They can be easier to deal with than standard tests because you can just say, “No, they’re not on our send-out list”. They could be CLIA laboratories that actually claim their tests rule in syndromes that are not well accepted, such as emotional problems, vaccine injuries, or dysbiosis. These may be laboratories that advertise or claim their tests rule in syndromes where there’s no specific lab test that can diagnose the disorder, such as chronic fatigue syndrome, fibromyalgia, autism, irritable bowel syndrome, or chemical sensitivity. These tests often involve huge panels of testing which cost $1000 and upward. Again, many of the results come back positive, leading to other unnecessary tests being ordered, and if you look in the peer-reviewed literature these tests are often not referenced in publications.
In dealing with these pseudoscience tests, structural interventions are the best. Having formularies or laboratories/tests on a preapproved list are effective modalities, as well as periodic education or reminders to the practice, report cards, and requiring “real time” authorization of tests before sending them out.
In conclusion, chemistry tests are highly embedded within a majority of clinical practices and can often be more difficult to get a hold on when attempting to control test utilization. It may be optimal to first focus on things like immunoassays which have expensive reagents and tests which have manual handling or special processing steps, like both RBC folate and ESR. In addition, formularies are a very effective modality for eliminating or reducing assays and laboratories which are questionable, often expensive, and those which have limited to no clinical utility. Finally, it is critical for us to always remember that evidence-based medicine should be complemented by evidence-based implementation.
I would also like to acknowledge my colleagues within the Mayo Clinic Department of Laboratory Medicine and Pathology, many of whom were involved in the specific test utilization projects I mentioned.
Thank you for the opportunity to speak to you today and hopefully I have been able to discuss some specific examples related to laboratory test utilization in the clinical chemistry laboratory. In part 3 of this series, Dr. Bobbi Pritt will continue this discussion and will focus on the role of test algorithms in laboratory test utilization.