Use Of A Binomial Model To Predict A Lower Confidence Limit For Copper Deficiency Prevalence In Feeder Calves

Ronald K. Tessman, DVM^*

Jeff W. Tyler, DVM, PhD^*

Stan W. Casteel, DVM, PhD^†

Robert L. Larson, DVM, PhD^*

Jeff Lakritz, DVM, PhD^*

Gary F. Krause, PhD‡

James E. Williams, PhD§

Departments of

^*Veterinary Medicine and Surgery, ^†Veterinary Pathobiology, ‡Statistics, §Animal Science
University of Missouri, Columbia, Missouri

The described research was supported in part by grants from The Committee On Research, College of Veterinary Medicine, University of Missouri-Columbia; the University of Missouri Agricultural Experiment Station; and USDA Formula Funds. Additional support was provided by the Minority Biomedical Researchers Training Initiative and the University of Missouri Chancellor’s Gus T. Ridgel Fellowship for Underrepresented Minority Americans.

KEY WORDS: prevalence, binomial,
copper deficiency, cattle, confidence limits,
epidemiology

Abstract

The purpose of this study was to investigate a novel approach, based on the binomial distribution, to estimating a lower confidence limit for copper deficiency prevalence from serum copper concentrations. Paired liver and serum samples were collected from 33 calves. The binomial distribution was used to calculate the probability of k positive test results in n trials at varying prevalences. This process provided a de facto hypothesis test for the lower 95% confidence limit of prevalence. For model validation, random samples of either 10 or 15 were drawn from the 33 serum samples. These samples were used to compare the lower limit of prevalence using the developed model and traditional methods. Results of the model verification trials are as follows. Mean and standard deviation of the 95% lower confidence limit of prevalence of the 10- and 15-sample trials were 0.49 ± 0.24 and 0.64 ± 0.19, respectively. Using the Z-distribution for population proportions, the mean and standard deviation of the 95% lower confidence limit of the 10- and 15-sample trials were 0.95 ± 0.14 and 0.99 ± 0.06, respectively. The binomial model provided a more satisfactory method to interpret imperfect test results than the Z-distribution for population proportions. The described technique has merit beyond the topic of copper deficiency. Rather than discount imperfect tests, we envision using this procedure to develop confidence limits for population prevalence in those instances in which test performance (either sensitivity or specificity) of diagnostic tests is sub-optimal.

INTRODUCTION

Ideally, diagnostic tests have high sensitivity and specificity. When tests are used to direct the medical management of individuals, the accuracy of individual test results is paramount. Additionally, researchers, clinicians, and public health professionals are often asked to provide estimates of disease prevalence based on results of imperfect tests. Reasonable conclusions regarding the population disease behavior may be made using tests with less-than-optimal sensitivity and specificity. Traditionally, real prevalence is calculated using the following formula:

Equation (Eq) 1:^1
$$RP = \frac{AP + S_p - 1}{S_e + S_p - 1}$$

where RP is real prevalence, AP is the apparent prevalence or proportion of positive test results, S_p is the specificity or likelihood of a negative test in a disease negative individual, and S_e is the sensitivity or likelihood of a positive test in a disease positive individual. It is possible to use the Z-distribution for population proportions to construct lower confidence limits for calculated real prevalence using the following formula:^2

$$\text{95\% lower confidence limit of prevalence} = RP - 1.96\sqrt{\frac{RP(1 - RP)}{n}}$$

where RP = calculated real prevalence, and n = sample size. It should be noted that small sample sizes produce broad confidence intervals that have little value in describing population prevalence.
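The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the 3-of-10 example is invented, and the sensitivity (0.53) and specificity (0.89) are the values reported for serum copper in this paper. The guard against values outside [0, 1] is an added assumption; the source does not say how such values were handled.

```python
from math import sqrt


def real_prevalence(ap, se, sp):
    """Eq 1: correct apparent prevalence for sensitivity and specificity."""
    return (ap + sp - 1) / (se + sp - 1)


def z_lower_limit(rp, n, z=1.96):
    """Lower 95% limit from the Z-distribution for a population proportion."""
    # Assumption: refuse to continue when Eq 1 yields an impossible proportion,
    # because rp * (1 - rp) would be negative and the square root undefined.
    if not 0.0 <= rp <= 1.0:
        raise ValueError("calculated real prevalence lies outside [0, 1]")
    return rp - z * sqrt(rp * (1 - rp) / n)


# Hypothetical example: 3 of 10 calves test positive (AP = 0.30).
rp = real_prevalence(0.30, se=0.53, sp=0.89)
print(round(rp, 3), round(z_lower_limit(rp, n=10), 3))
```

With a sample this small, the lower limit falls far below the point estimate, which illustrates the breadth of the interval noted above.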

The purpose of this study was to develop an alternate approach based on the binomial probability distribution. For illustrative purposes, we examined detection of copper deficiency in cattle using serum copper concentrations. This procedure has previously been reported to have imperfect sensitivity (0.53) and specificity (0.89).^3 True copper status was determined by liver copper concentrations.

MATERIALS AND METHODS

Theoretical Reasoning

The hypothesis statement developed to guide model development was as follows:

H₀: Real prevalence < an a priori hypothesized prevalence

H_a: Real prevalence ≥ an a priori hypothesized prevalence

It is intuitive that apparent prevalence equals the probability of a positive test result: AP = Pr{T+}.^4 Eq 1 can be solved for apparent prevalence to obtain equation 2:

Equation (Eq) 2:
AP = RP*S_e + RP*S_p – RP – S_p +1.
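As a quick check that Eq 2 is the algebraic inverse of Eq 1, the following sketch computes apparent prevalence over a few real prevalences and converts each value back. The function names are hypothetical; the sensitivity and specificity are the reported values for serum copper (0.53 and 0.89).

```python
def apparent_prevalence(rp, se, sp):
    """Eq 2: expected proportion of positive tests at real prevalence rp."""
    return rp * se + rp * sp - rp - sp + 1


def real_prevalence(ap, se, sp):
    """Eq 1: real prevalence recovered from apparent prevalence."""
    return (ap + sp - 1) / (se + sp - 1)


SE, SP = 0.53, 0.89  # reported sensitivity and specificity of serum copper

for rp in (0.0, 0.25, 0.5, 0.75, 1.0):
    ap = apparent_prevalence(rp, SE, SP)
    # Round trip: Eq 1 applied to the output of Eq 2 returns the input.
    assert abs(real_prevalence(ap, SE, SP) - rp) < 1e-9
    print(f"RP = {rp:.2f} -> AP = {ap:.3f}")
```

Under these test characteristics, apparent prevalence ranges only from 1 - S_p = 0.11 (no deficient calves) to S_e = 0.53 (all calves deficient).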

The binomial distribution requires an outcome which is either positive or negative and mutually exclusive: Pr{T+} = 1 – Pr{T-}, where Pr{T+} is the probability of a positive test and Pr{T-} is the probability of a negative test. The reported sensitivity and specificity were used to calculate the Pr{T+} at all possible prevalences. These probabilities were used with various combinations of trials and successes in the binomial distribution equation to calculate the probability of k positive test results (serum copper ≤ 0.45 µg/mL) when n subjects are sampled at varying prevalences. The binomial distribution may be defined as follows:

Equation (Eq) 3:^5
$$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$$

where P(X=k) is the probability of exactly k positive outcomes, n is the number of trials, k is the number of positive outcomes, and p is the probability of a success in each trial. In our example, n was the number of calves sampled, k was the number of animals with a serum copper concentration less than or equal to 0.45 µg/g, and p was the hypothesized apparent prevalence as calculated using Eq 2. The event of interest was that the number of positive tests was greater than or equal to k when n calves were sampled in a population of defined prevalence. If the calculated probability of this event was very low, we concluded that the prevalence was higher than that which was hypothesized. The probability statement may be expressed as Pr{RP ≥ x} ≤ P value. Because this model was based on apparent prevalence, an additional probability statement was required. This statement may be expressed in the form Pr{AP ≥ y} ≤ P value. Therefore, the final probability statement required to reject hypotheses regarding prevalence based on randomly sampled data sets was as follows:

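Equation (Eq) 4:
$$\Pr\{RP \ge x\} = \Pr\{AP \ge y\} = \sum_{i=k}^{n} \binom{n}{i} p^{i} (1 - p)^{n - i} \le P\ \text{value}$$

where the sum is taken over all outcomes with at least k positive tests (i = k, ..., n).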

When the calculated Pr{AP ≥ y} ≤ 0.025, we rejected the null hypothesis that real prevalence is less than an a priori hypothesized prevalence. We chose a P value ≤ 0.025 because it is equivalent to the lower limit of a two-sided 95% confidence interval.
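The decision rule can be sketched as follows. This is an illustrative reading of Eq 2 through Eq 4, not the authors' code: the function names are hypothetical, the upper-tail sum over i = k, ..., n follows the event described in the text, and the sensitivity and specificity are the reported 0.53 and 0.89.

```python
from math import comb

SENSITIVITY = 0.53  # reported for serum copper
SPECIFICITY = 0.89


def apparent_prevalence(rp, se=SENSITIVITY, sp=SPECIFICITY):
    """Eq 2 in factored form: true positives plus false positives."""
    return rp * se + (1 - rp) * (1 - sp)


def upper_tail(n, k, p):
    """Probability of k or more positive tests in n trials (sum of Eq 3 terms)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rejects(n, k, rp, alpha=0.025):
    """True if k of n positives is too unlikely at hypothesized prevalence rp."""
    return upper_tail(n, k, apparent_prevalence(rp)) <= alpha


# Hypothetical check: 4 positive tests among 5 calves.
print(rejects(5, 4, 0.40), rejects(5, 4, 0.45))
```

Because apparent prevalence increases with real prevalence whenever S_e + S_p > 1, the tail probability also increases with the hypothesized prevalence, so the rejected prevalences form a contiguous range starting at zero.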

With the aid of a computer program (Microsoft Excel 2000, Microsoft Corp., Redmond, WA), a large table was produced giving the probability of all possible outcomes for trials of 5 to 20 samples at all possible real prevalences.
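A table of this kind could be generated along the following lines. The sketch is an assumption-laden stand-in for the spreadsheet: the 5% prevalence grid is borrowed from the column headings of Figure 1, the function names are hypothetical, and each cell stores only the highest rejected prevalence rather than every probability.

```python
from math import comb

SE, SP, ALPHA = 0.53, 0.89, 0.025  # reported test characteristics; one-sided alpha


def tail_probability(n, k, rp):
    """Probability of at least k positive tests in n calves at real prevalence rp."""
    p = rp * SE + (1 - rp) * (1 - SP)  # apparent prevalence (Eq 2)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lower_limit_percent(n, k, step=5):
    """Highest hypothesized prevalence (%) that is rejected, or None if none is."""
    limit = None
    for pct in range(0, 101, step):
        # The tail probability rises with prevalence, so stop at the first
        # hypothesized value that can no longer be rejected.
        if tail_probability(n, k, pct / 100) > ALPHA:
            break
        limit = pct
    return limit


table = {
    (n, k): lower_limit_percent(n, k)
    for n in range(5, 21)
    for k in range(0, n + 1)
}
print(table[(5, 4)], table[(5, 3)], table[(5, 2)])
```

For 5 calves, this sketch returns 40 for 4 positive tests, 5 for 3 positive tests, and no rejected prevalence for 2 positive tests, which agrees with the examples given in the Discussion.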

Model Validation

Serum copper determinations were performed on samples taken from calves ranging in age from 6 to 9 months from a single herd. Paired liver and serum samples were collected from 33 calves. Blood was collected into evacuated tubes specifically manufactured for trace mineral determinations (Becton Dickinson and Company, Franklin Lakes, NJ). Liver biopsies were collected by trans-thoracic technique using a 16-gauge biopsy needle (Jorgensen Laboratories, Loveland, CO).^6 All copper determinations were made through the use of atomic absorption spectrophotometry (Perkin-Elmer 2380, Norwalk, CT) (wavelength, 324.7 nm) with a previously described method.^3

The apparent prevalence (AP) was calculated by dividing the number of positive tests, those with a serum copper concentration less than or equal to 0.45 µg/g, by the total number of calves sampled at each time period.

Sample sets of 10 and 15 serum copper results were randomly drawn without replacement from the 33 serum copper determinations. For samples of each size (n = 10 or 15), 1000 random sampling iterations were performed using a computer software program (S-Plus 2000, Mathsoft Inc., Seattle, WA). The number of positive test results (serum copper ≤ 0.45 µg/g) was determined for each sample set. Given the number of positive tests (k) and the sample size (n = 10 or 15), the 95% lower confidence limit for prevalence was calculated for each sample. The mean and standard deviation were then calculated for the 1000 sample sets of 10 and 15 observations (PROC MEANS, SAS Institute, Cary, NC). These results were then compared to the real prevalence of copper deficiency in this population of 33 calves. Real prevalence was calculated by dividing the number of calves with a liver copper concentration less than 25 µg/g by the total number of calves. A liver copper concentration less than 25 µg/g is the accepted test endpoint for determining copper deficiency.^7
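The resampling procedure can be approximated with the sketch below. Several details are assumptions and not taken from the source: the individual serum results are not published here, so a stand-in population of 22 positive and 11 negative results (22/33 = 0.67, the reported apparent prevalence) is used; the prevalence grid has 1% steps; a sample for which no prevalence is rejected is assigned a lower limit of 0; and Python replaces S-Plus and SAS. The output is therefore not expected to reproduce the published means exactly.

```python
import random
from functools import lru_cache
from math import comb
from statistics import mean, stdev

SE, SP, ALPHA = 0.53, 0.89, 0.025  # reported test characteristics; one-sided alpha


def tail_probability(n, k, rp):
    """Probability of at least k positive tests in n calves at real prevalence rp."""
    p = rp * SE + (1 - rp) * (1 - SP)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


@lru_cache(maxsize=None)
def lower_limit(n, k, step=1):
    """Highest rejected prevalence as a fraction; 0.0 if none is rejected (assumption)."""
    limit = 0.0
    for pct in range(0, 101, step):
        if tail_probability(n, k, pct / 100) > ALPHA:
            break
        limit = pct / 100
    return limit


def simulate(results, n, iterations=1000, seed=0):
    """Mean and SD of the lower limit over random samples drawn without replacement."""
    rng = random.Random(seed)
    limits = [lower_limit(n, sum(rng.sample(results, n))) for _ in range(iterations)]
    return mean(limits), stdev(limits)


# Assumed stand-in data: 1 = positive serum test, 0 = negative serum test.
serum_results = [1] * 22 + [0] * 11
for n in (10, 15):
    m, s = simulate(serum_results, n)
    print(f"n = {n}: mean = {m:.2f}, SD = {s:.2f}")
```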

For comparison purposes, the estimates of the lower confidence limit calculated using the binomial model were compared with the lower 95% confidence limit calculated using the Z-distribution. Using the previously described 2 sets of 1000 randomly selected samples, the real prevalence of copper deficiency was calculated for each sample of 10 or 15 serum copper determinations using Eq 1. Thereafter, for each sample the lower 95% confidence limit of calculated real prevalence was calculated using the Z-distribution:

$$\text{95\% lower confidence limit} = RP - 1.96\sqrt{\frac{RP(1 - RP)}{n}}$$

where RP = calculated real prevalence, and n = sample size (either 10 or 15).

Thereafter, the mean and standard deviation of the 95% lower confidence limit were calculated for the 1000 randomly selected sample sets containing either 10 or 15 observations.

RESULTS

Apparent prevalence based on serum copper concentration was 0.67. Real prevalence based on liver copper concentration was 0.67. The equality of these numbers was coincidental. Results of the model verification trials are as follows. Mean and standard deviation of the 95% lower confidence limit of prevalence calculated using the binomial model of the 10 and 15 sample trials were 0.49 ± 0.24 and 0.64 ± 0.19, respectively. Using the population proportion estimation of the 95% lower confidence limit, the mean and standard deviation of the 10 and 15 sample trials were 0.95 ± 0.14 and 0.99 ± 0.06, respectively.

DISCUSSION

Application of this model is straightforward. If we sample 5 calves and 4 calves have serum copper concentrations less than 0.45 µg/g, we can confidently assume that herd prevalence of copper deficiency exceeds 40% (Fig. 1). Three positive tests assure us that prevalence is greater than 5%. Two positive tests are consistent with a population with a 0% prevalence of copper deficiency. Cursory appraisal of these results suggests that sample sizes this small should be avoided. As sample size increases, the proportion of positive tests required to reject the hypothesis that prevalence is less than or equal to the hypothesized prevalence becomes smaller. For example, 4 of 5 tests must be positive to reject the hypothesis that prevalence is less than 40%; however, only 5 of 7 or 6 of 9 tests must be positive to reject the same hypothesis (Fig. 1).

The estimates of lower confidence limits generated using the binomial model and the Z-distribution differed substantially (Fig. 2). The Z-distribution produced narrow confidence limits; however, these confidence limits are clearly flawed. Lower limits calculated using the Z-distribution exceeded real prevalence for both sets analyzed, whether they contained 10 or 15 observations. In contrast, lower confidence limits calculated using the binomial model were less than the real prevalence and hence are more plausible and accurate. Comparison of the results of the two methods indicates that use of the traditional method to calculate real prevalence, when the test under evaluation is substantially imperfect, is ill advised. These results reinforce the need for novel approaches to interpret imperfect test results. The binomial model suggested here provides a satisfactory method to interpret imperfect test results.

These results highlight some interesting relationships between test results and individual test sensitivities and specificities. In particular, they illustrate the potential for erroneous conclusions when results of imperfect tests are taken at face value. A large proportion of the calves are copper-deficient based upon apparent prevalence. If we were to assume a herd had a real prevalence of copper deficiency of 100%, we could use Eq 2 to calculate the extreme value of apparent prevalence one could expect. In this instance, AP = 1*0.53 + 1*0.89 – 1 – 0.89 + 1 = 0.53. This illustrates that any apparent prevalence above 53% based on serum copper concentrations yields, through Eq 1, a calculated real prevalence above 100%.
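As a worked illustration, substituting the apparent prevalence observed in this herd (0.67) and the reported sensitivity and specificity into Eq 1 gives a calculated real prevalence well above 1. The arithmetic below is an added example and does not appear in the original analysis.

```latex
RP = \frac{AP + S_p - 1}{S_e + S_p - 1}
   = \frac{0.67 + 0.89 - 1}{0.53 + 0.89 - 1}
   = \frac{0.56}{0.42}
   \approx 1.33
```

A proportion of 1.33 is impossible, and it also makes the term RP(1 - RP) under the square root of the Z-distribution formula negative.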

In summary, the described procedure provides a de facto hypothesis test for prevalence. When the calculated probability is less than 0.025, we have established that the probability of the observed pattern of test results at the hypothesized population prevalence is less than 2.5%. In essence, we reject the null hypothesis that population prevalence is less than the hypothesized prevalence. In this manner, we are constructing a lower confidence limit for herd prevalence.

The described technique has merit beyond the topic of copper deficiency. Rather than discount imperfect tests as invalid, we envision using this procedure to develop confidence limits for population prevalence in those instances in which test performance, either sensitivity or specificity, of diagnostic tests is sub-optimal. These confidence limits will be appropriate for use in the development of disease control strategies.

REFERENCES

1. Martin SW: Estimating disease prevalence and the interpretation of screening test results. Prev Vet Med 2: 463–472, 1984.

2. Daniel WW: Biostatistics: A Foundation for Analysis in the Health Sciences 7^th ed. New York: John Wiley & Sons, Inc, pp. 176–177, 1999.

3. Tessman RK, Lakritz J, Tyler JW, et al: Sensitivity and specificity of serum copper determination for detection of copper deficiency in feeder calves. J Am Vet Med Assoc 218:756–760, 2001.

4. Martin SW: The evaluation of tests. Can J Comp Med 41:19–25, 1977.

5. Moore DS, McCabe GP: Introduction to the Practice of Statistics 2^nd ed. New York: W.H. Freeman, pp. 372–378, 1993.

6. Pearce SG, Firth EC, Grace ND, et al: Liver biopsy techniques for adult horses and neonatal foals to assess copper status. Aust Vet J 75:194–198, 1997.

7. Puls R: Copper. In: Puls R, ed. Mineral Levels in Animal Health 2^nd ed. Clearbrook, BC, Canada: Sherpa International, pp. 82–109, 1994.

[Figure 1 grid. Columns: hypothesized prevalence (%), from 5 to 95 in steps of 5. Rows: combinations of tests performed and positive test results, listed below. Each row carries one asterisk; its column position is not recoverable from the extracted text.]

Tests performed    Positive test results
5                  5, 4, 3
6                  6, 5, 4
7                  7, 6, 5, 4
8                  8, 7, 6, 5, 4
9                  9, 8, 7, 6, 5, 4
10                 10, 9, 8, 7, 6, 5

Figure 1. Interpretation of test outcomes for various combinations of number of tests and number of positive test results. Cells designated by an asterisk (*) denote the calculated lower limit of copper deficiency prevalence.

Figure 2. Results of 2 trials of randomly generated serum copper concentration sample selection outcomes for estimation of the lower 95% confidence limit of prevalence of copper deficiency. The • represents the mean of each trial and the solid horizontal line represents the standard deviation of the proposed binomial model. The ∇ represents the mean of each trial and the solid horizontal line represents the standard deviation of the traditional method of determination. The vertical dashed line represents the real prevalence of the population from which the samples were taken as determined by liver copper concentrations.