Modeling the potential risk factors of bovine viral diarrhea prevalence in Egypt using univariable and multivariable logistic regression analyses

Aim: The present cross-sectional study was conducted to determine the seroprevalence of, and potential risk factors associated with, bovine viral diarrhea virus (BVDV) infection in cattle and buffaloes in Egypt, to model these risk factors using logistic regression (LR) models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected between November 2012 and March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with an indirect ELISA test for antibody detection. Data were analyzed with the Chi-square test, odds ratios (OR), and univariable and multivariable LR models. Results: The association between BVDV seropositivity and the risk factors was non-significant for all factors except animal species. Seroprevalence was 40% in cattle and 23% in buffaloes. The ORs for all categories were close to one; the highest OR, for cattle relative to buffaloes, was 2.237. Likelihood ratio tests showed a significant drop in the −2 log likelihood (−2LL) from the univariable to the multivariable LR models. Conclusion: There was evidence of a higher seroprevalence of BVDV among cattle than among buffaloes, with the possibility of infection in different age groups of animals. In addition, the multivariable LR model provided more information for association and prediction purposes than univariable LR models and Chi-square tests when more than one predictor is available.


Introduction
Bovine viral diarrhea virus (BVDV), the causal agent of BVD and mucosal disease complex, is classified in the genus Pestivirus in the family Flaviviridae. Although cattle are the primary host for BVDV, several reports suggest that most even-toed ungulates are also susceptible. The virus causes important economic losses in cattle breeding. Infection is characterized by depression, fever, mild diarrhea, and transient leukopenia [1].
Serologic surveys indicate that BVDV is distributed worldwide. The prevalence of antiviral antibody in cattle varies among countries and may vary between geographic regions within a country. Prevalence of antiviral antibody may be >90% if vaccination is practiced commonly in a geographic region. Although cattle of all ages are susceptible, most cases of the overt clinical disease are seen in cattle between 6 months and 2 years old [2].
Cattle that are persistently infected (PI) with noncytopathic BVDV serve as a natural reservoir for virus. Persistent infection develops when noncytopathic BVDV is transmitted transplacentally during the first 4 months of fetal development. The calf is born infected with virus, remains infected for life, and usually is immunotolerant to the resident noncytopathic virus [3]. Transplacental infection that occurs later in gestation results in abortion, congenital malformations, or birth of normal calves that have antibody against BVDV. The prevalence of persistent infection varies among countries and between regions within a country [4].
PI animals result from infection of the bovine fetus with a noncytopathic (NCP) BVDV biotype early in gestation. These animals show specific immunological tolerance to the carrier virus and may be born apparently healthy. PI animals are the main source of virus transmission because they continuously shed large amounts of virus into the environment. Acutely infected animals excrete smaller amounts of virus, and only for a few days during the acute infection [5].
Available at www.veterinaryworld.org/Vol.11/March-2018/1.pdf
Early enzyme-linked immunosorbent assays (ELISAs) for antibody detection were unreliable, and attaching the appropriate antigen was difficult [6]. However, this has been overcome, resulting in antibody (Ab) ELISAs with high specificity and sensitivity of up to 99% and 98%, respectively, when compared with the serum neutralization test (SNT) [7,8]. ELISAs can be applied to various types of samples and are an efficient and economical alternative to SNT [9]. SNT is more sensitive than ELISA and can detect more antibodies following vaccination [10]. Furthermore, samples that have undergone prolonged storage or repeated freeze-thawing may show a low SNT titer or test negative with ELISA [11]. The objectives of this study were to determine the prevalence of BVDV in cattle and buffaloes in some localities in Egypt, to model the potential risk factors associated with BVDV prevalence using logistic regression (LR), and to fit the best predictive model for the current data.

Ethical approval
This study was conducted according to ethical guidelines approved by the Ethics of Scientific Research Committee, Faculty of Veterinary Medicine, Suez Canal University, Ismailia, Egypt.

Animal and sampling
A total of 480 and 260 blood samples were collected from cattle and buffaloes, respectively, in four governorates (Kalubia, Giza, Menofia, and Gharbia) in Egypt. Samples for examination of BVDV antibodies were collected from animals aged between 6 months and 3 years. The samples were collected from apparently healthy and diseased animals without a history of vaccination. All blood samples were collected between November 2012 and March 2013. The age, sex, species, and location of the animal were studied as potential risk factors for BVD seropositivity.

Indirect ELISA
All collected serum samples were examined with an indirect ELISA test kit (Svanova BVDV antibody ELISA, Svanova Biotech AB). The tests were performed following the manufacturer's instructions. The antibody titer was interpreted on the basis of the percentage positivity, calculated by dividing the optical density (OD) value of the sample by the OD value of the positive reference sample. The cutoff value was set at 14%.
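The percentage-positivity calculation described above can be sketched as follows. This is a minimal illustration rather than the kit's official procedure: the function names, the example OD readings, and the choice to count a value exactly at the cutoff as positive are assumptions; only the ratio to the positive reference OD and the 14% cutoff come from the text.

```python
CUTOFF_PP = 14.0  # cutoff (%) stated in the text


def percent_positivity(sample_od: float, positive_control_od: float) -> float:
    """Sample OD expressed as a percentage of the positive reference OD."""
    if positive_control_od <= 0:
        raise ValueError("positive control OD must be greater than zero")
    return 100.0 * sample_od / positive_control_od


def is_seropositive(sample_od: float, positive_control_od: float,
                    cutoff: float = CUTOFF_PP) -> bool:
    # Assumption: a value at or above the cutoff is read as positive.
    return percent_positivity(sample_od, positive_control_od) >= cutoff


# Hypothetical OD readings, for illustration only.
print(percent_positivity(0.25, 1.0))   # 25.0
print(is_seropositive(0.25, 1.0))      # True
print(is_seropositive(0.125, 1.0))     # False (12.5% is below 14%)
```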

Nonparametric analysis
Data were analyzed statistically to test the potential association between BVDV occurrence and each of the predictors (gender, age, species, and herd location) using nonparametric Chi-square tests. Although the Chi-square test allows these relationships to be tested, it has some limitations. First, it does not account for the potential effect of other independent variables on the relationship. Second, it cannot provide a model for future prediction of the outcome. Third, it cannot assess the relationship between a categorical dependent variable and several predictors at the same time. Fourth, it cannot quantify the magnitude and contribution of each predictor in explaining the outcome. Therefore, another approach was needed, and binary LR analysis was used as an alternative statistical method for the analysis of categorical outcomes.

Fitting the LR models
Data were also analyzed using univariable and multivariable LR techniques to model the potential risk factors related to BVD and to explain practical and statistical facts that cannot be shown by the ordinary Chi-square test. Each univariable LR model was fitted with a single explanatory variable to examine its relationship with the BVD outcome. The multivariable LR model was applied to assess the relationship between the dichotomous outcome and several explanatory variables. The contribution of each predictor variable in explaining the outcome was measured by the LR coefficients and odds ratio (OR). In univariable LR, the natural log odds of BVDV was fitted as a linear function of the predictor [12] as follows:

$$\text{Logit}(y) = \ln(\text{odds}) = \ln\left(\frac{p}{1-p}\right) = \alpha + \beta X \quad (1)$$

where p is the probability of the outcome variable, X is the predictor variable, and α and β are the LR coefficients. An equation for predicting the probability of BVD occurrence was obtained by taking the antilog of both sides of Equation (1) as follows:

$$p = P(y = \text{BVD outcome} \mid X = x, \text{a specific value}) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}} \quad (2)$$

where p is the probability of the BVD outcome, α is the Y-intercept, β is the regression coefficient, and e is the base of the natural logarithm (e=2.71828). The relationship between logit (Y) and X in Equation (1) is linear, while the relationship between the probability of Y and X in Equation (2) is nonlinear. Hence, the natural log transformation of the odds of BVD was necessary to obtain a linear relationship between the dependent and independent variables. Similarly, the multivariable LR model was fitted as follows:

$$\text{Logit}(y) = \ln(\text{odds}) = \ln\left(\frac{p}{1-p}\right) = \alpha + \beta_1 X_1 + \dots + \beta_k X_k \quad (3)$$

Therefore,

$$p = P(y = \text{outcome of interest} \mid X_1 = x_1, \dots, X_k = x_k) = \frac{e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}{1 + e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}$$

The deviance of the null model (−2 log likelihood [−2LL] of the null model) was compared with the corresponding term for a given model with explanatory variables (−2LL of the given model). The difference between the two −2LL values gives the Chi-square statistic (χ²) with k degrees of freedom [13]:

$$\chi^2 = -2LL_{\text{null}} - \left(-2LL_{\text{given model}}\right)$$

A probability level <0.05 suggests that the given model performs better than the null model, that is, at least one of the predictor variables contributes to explaining and predicting the outcome.
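The link between Equations (1) and (2) can be illustrated with a short sketch. The coefficient values below are hypothetical and chosen only to show that the log odds change linearly with X while the probability does not; they are not estimates from this study.

```python
import math


def logit(p: float) -> float:
    """Equation (1): natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def inv_logit(alpha: float, beta: float, x: float) -> float:
    """Equation (2): probability implied by the linear predictor alpha + beta * x."""
    z = alpha + beta * x
    return math.exp(z) / (1.0 + math.exp(z))


# Hypothetical coefficients, for illustration only.
alpha, beta = -1.0, 0.5
for x in (0, 1, 2, 3):
    p = inv_logit(alpha, beta, x)
    # The log odds rise by exactly beta per unit of x; the probability does not.
    print(x, round(logit(p), 3), round(p, 3))
```

Applying `logit` to the output of `inv_logit` returns the linear predictor, which is the sense in which the log transformation makes the relationship linear.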
The Hosmer-Lemeshow test was applied to assess the goodness of fit of the LR models. The test relies on the same principle as the Chi-square test, namely testing the differences between observed and predicted frequencies. The test statistic [14] was calculated as:

$$\chi^2_{HL} = \sum_{g=1}^{10} \frac{(O_g - E_g)^2}{E_g}$$

where $O_g$ and $E_g$ are the observed and expected values for the gth decile. The Hosmer-Lemeshow statistic follows a χ² distribution with 8 degrees of freedom (df) (number of groups − 2). A p>0.05 suggests a good fit of the data by the LR model.

Testing the significance of LR coefficients
The Wald statistic was used to test the significance of the contribution of each predictor in the given LR model. The Wald statistic follows a Chi-square distribution and was calculated as the ratio of the square of the regression slope to the square of the standard error of that coefficient [13]:

$$W = \frac{\beta^2}{SE(\beta)^2}$$

For the present data, the Wald statistic was tested at degrees of freedom equal to the number of categories of the X variable −1. A predictor variable was considered to significantly affect the BVD outcome if p≤0.05. All statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS version 20) and Statistical Analysis System (SAS Institute) software. Results were considered significant at p≤0.05.
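As a rough illustration of the Wald test described above, the sketch below computes the statistic for a single coefficient (1 df) and its p-value. The coefficient and standard error are hypothetical, and the sketch covers only the 1 df case; predictors with more than two categories would need the multi-df version, which is not shown here.

```python
import math


def wald_test(beta: float, se: float):
    """Wald chi-square (1 df) and its p-value for a single coefficient."""
    w = (beta / se) ** 2
    # For 1 df, the chi-square upper tail equals erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical coefficient and standard error, for illustration only.
w, p = wald_test(beta=-0.80, se=0.20)
print(round(w, 2), p <= 0.05)
```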

Results and Discussion
Association between BVD prevalence and potential risk factors using Chi-square test
Table-1 shows the association between BVDV status and gender. The Chi-square test (0.480) revealed a non-significant (p>0.05) association between BVDV infection and gender. The odds of infection were 0.553 for males and 0.495 for females. That is, the OR for males relative to females was 1.117, indicating the absence of an association (the OR was close to one). Table-2 gives the relationship between age and the seroprevalence of BVDV. The age group <6 months was used as the baseline category because it was coded as zero in the dataset. Infection with BVDV was non-significantly (p>0.05) associated with age (Chi-square=0.239). The results showed similar odds of BVDV infection with increasing age of the animal. The ORs of the two age groups other than the baseline were close to one, suggesting that animals of all age groups are at risk of BVDV infection. A similar study by Talafha et al. [15] reported non-significant differences in BVDV seroprevalence between age groups.
When the sex was determined or provided, approximately two-thirds of the specimens were from females. This likely reflects that bovine females outnumber males and, especially in the case of dairy cattle, are often of greater financial value. The non-significant difference in BVD prevalence between the sexes in this study agrees with previous studies such as Wilson et al. [16], in which the infection rate was 2.3% in males and 2.4% in females. In contrast, some studies reported a significant difference in BVD prevalence between males and females; for example, Bello et al. [17] reported a prevalence of 75% in females and 59.3% in males.
In the present study, BVD was more prevalent in animals aged <6 months and in those older than 12 months. This finding is in agreement with [20], which found a higher prevalence rate in juvenile than in adult animals. The prevalence rate increases in young animals at risk [21,22]. In contrast, Vásquez et al. [23] previously reported a higher prevalence rate of BVD in animals older than 3 years (79%) than in young animals aged <1 year (70%). The lower prevalence rate in calves may be because some calves are persistently infected and immunotolerant to BVD virus; such animals do not produce antibodies and cannot be detected by serological tests [24-26].
The prevalence of BVD was higher in cattle (40%) than in buffaloes (23.0%). The odds of infection were 0.298 for buffaloes compared with 0.667 for cattle. This implies that the OR for cattle versus buffaloes was 2.237 (1/0.447, where 0.447=0.298/0.667 is the OR for buffaloes relative to cattle); that is, the odds of BVDV for cattle were 2.237 times the odds for buffaloes. This result was confirmed by the Chi-square value (21.648), which indicates a highly significant (p≤0.01) association between the prevalence of BVD and species (Table-3). The higher prevalence in cattle may be because cattle are reared more widely than buffaloes in Egypt owing to their high milk yield. This result agrees with previous studies [17,19,27].
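The arithmetic behind these odds and the OR can be retraced with a few lines. The sketch uses the rounded seroprevalence percentages given above, so its results differ slightly from the reported values (0.298 and 2.237), which were presumably computed from the unrounded counts.

```python
def odds(prevalence: float) -> float:
    """Convert a proportion (0-1) to odds."""
    return prevalence / (1.0 - prevalence)


cattle_odds = odds(0.40)    # seroprevalence reported for cattle
buffalo_odds = odds(0.23)   # seroprevalence reported for buffaloes
odds_ratio = cattle_odds / buffalo_odds  # cattle relative to buffaloes

print(round(cattle_odds, 3), round(buffalo_odds, 3), round(odds_ratio, 2))
```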
Herd location had a non-significant (p>0.05) effect on BVD prevalence in Egypt according to the Chi-square test (Table-4). This result is in accordance with the finding of Talafha et al. [15]. The highest BVD prevalence was recorded for Kalubia province (38.9%), although the odds of infection differed only slightly among localities. Kalubia was therefore selected as the baseline category and coded as zero in the dataset. The ORs were all less than one and were estimated to be 0.675, 0.786, and 0.792 for Giza, Menofia, and Gharbia provinces, respectively, compared with Kalubia. In other words, the odds of BVDV occurrence for Kalubia were 1.48 (1/0.675), 1.27 (1/0.786), and 1.26 (1/0.792) times the odds for Giza, Menofia, and Gharbia, respectively.
The present findings agree with Ghazi et al. [27] and El-Bagoury et al. [28]. In areas with very high BVD seroprevalence, like those observed in this study, and with high cattle density, herds that clear the infection are at higher risk of reinfection because they lose immune protection and become naive to the virus. Therefore, vaccination of susceptible herds, in combination with removal of PI animals, would overcome the problem of reinfection by preventing intrauterine infection in pregnant dams at risk of contact with undetected PI animals [29,30].

LR analysis
In the previous results, the Chi-square test did not account for the influence of other independent variables on each relationship. Hence, univariable and multivariable LR models were fitted to model and predict the potential risk factors associated with BVDV. A univariable LR model was created for each single predictor, followed by a multivariable LR model controlling for the other independent variables. Moreover, LR models were needed to quantify the coefficient estimates (the change in the log odds of Y for a one-unit change in X) and to determine the direction of the relationship between each predictor and the logit of the outcome. The summary of the univariable (Models 1, 2, 3, and 4) and multivariable (Model 5) LR models is presented in Table-

Testing the significance of predictors
According to Model 1, the Wald Chi-square statistic revealed that gender was a non-significant (p>0.05) predictor of BVD prevalence. The intercept was significant (p<0.01), supporting its inclusion in the model. The log odds that an animal of a given gender would be infected with BVD can be predicted from Model 1. For a male animal (gender=1), the log odds of being infected with BVD were −0.592. In practice, the value of log odds (−0.592) … age (1), the dummy variable for the second category of age (6-12 months), and age (2), the dummy variable for the third category of age (>12 months), which were compared with the age group <6 months. The Wald statistic revealed that age was not a significant (p>0.05) predictor of BVD occurrence (Table-5). Specifically, the Wald statistics for the age groups 6-12 months and >12 months were 0.187 and 0.019, respectively, indicating no significant difference (p>0.05) between either of these age groups and the baseline group (<6 months). The Wald statistic for the intercept was 10.935, which was significant (p<0.01), reflecting the important contribution of the intercept term in Model 2. Using the same calculations as for Model 1, the predicted probabilities of BVD infection for the three age groups were 35.2%, 33%, and 34.5%, respectively. The percentages estimated by the univariable logit Model 2 were similar to those in the Chi-square association table (Table-2). Moreover, the OR for the age group 6-12 months relative to the baseline group was 0.906 (=0.492/0.542), while the OR for the age group >12 months relative to the reference group was 0.970 (=0.526/0.542). The two estimated ORs were the same as those calculated in Table-2. Although all Wald statistics for the effect of age were non-significant (p>0.05), the estimated ORs were <1, indicating a negative relationship between age categories and the odds of BVD prevalence. This conclusion is confirmed by the negative signs of the coefficients in LR Model 2.
The univariable LR model for species (Model 3) showed a highly significant (p<0.01) relationship between species and the BVDV outcome. The model predicted that 23% (=0.298/1.298) of buffaloes were seropositive for BVD (as in Table-3). Furthermore, Model 3 predicted that 40% (=0.667/1.667) of dairy cattle were seropositive for BVD. The OR of infection for buffaloes relative to cattle was 0.447 (=0.298/0.667). The reciprocal of that value (1/0.447=2.237) is the OR of cattle relative to buffaloes.
In Model 4, all the model coefficients were negative, suggesting negative associations with the studied outcome (Table-5). In the estimated model, area (1) was the dummy variable for Giza, area (2) was the dummy variable for Menofia, and area (3) was the dummy variable for Gharbia. All these localities were compared with the reference area, Kalubia. The Wald statistic (3.035) for herd location was non-significant (p>0.05), indicating no association between the seroprevalence of BVDV and locality. In addition, the Wald statistics for all dummy variables were non-significant (p>0.05). The intercept in Model 4 was significant (p<0.01). Using the formula odds/(odds+1), the probabilities of seropositivity predicted by Model 4 were 38.9%, 30%, 33.3%, and 33.5% for Kalubia, Giza, Menofia, and Gharbia, respectively. The ORs for Giza, Menofia, and Gharbia relative to Kalubia were 0.673, 0.786, and 0.792, respectively.
The previous results addressed the association and prediction of the BVD outcome from a single predictor only. Before constructing the multivariable LR model (Model 5), multicollinearity among the independent variables was checked using multicollinearity diagnostic tests. The values of the variance inflation factor (VIF=1/tolerance) were all <10, indicating the absence of collinearity between explanatory variables. If the VIF exceeds 5 or 10, the estimated coefficients become unreliable because multicollinearity inflates their variances [32], leading to inaccurate estimates and unsound inferences about the relationship between the explanatory variables and the outcome [33]. Moreover, the absolute correlation coefficients were low (0.006-0.083) and the condition indices were <15.
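To give a sense of scale for the VIF, the sketch below applies the definition VIF=1/tolerance with tolerance taken as 1 − R², where R² comes from regressing one predictor on the others. Treating the largest reported correlation (0.083) as if only two predictors were involved is a simplifying assumption made for illustration; it is not how the diagnostics in this study were computed.

```python
def vif_from_r2(r_squared: float) -> float:
    """VIF = 1 / tolerance, with tolerance = 1 - R^2 of one predictor
    regressed on the remaining predictors."""
    tolerance = 1.0 - r_squared
    return 1.0 / tolerance


# Simplification: with only two predictors, R^2 is the squared pairwise
# correlation. 0.083 is the largest absolute correlation given in the text.
print(round(vif_from_r2(0.083 ** 2), 4))

# A hypothetical R^2 of 0.9 would reach the conventional threshold of 10.
print(round(vif_from_r2(0.9), 1))
```

Under this simplification, correlations as low as those reported correspond to VIF values barely above one, far from the thresholds of 5 or 10.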
Unlike in the univariable logit models, the intercept in the full model (Model 5) was non-significant (p>0.05), possibly because all predictors and their dummy variables were included in the model. Furthermore, the negative values of the regression coefficients indicate that the odds and the probability of BVD decrease relative to the reference category, because the OR is the exponential of the regression coefficient and a negative coefficient therefore corresponds to an OR below one. The estimated 95% confidence intervals (CIs) for the ORs included one for all studied factors except species, for which the 95% CIs were 0.317-0.630 and 0.312-0.624 in Model 3 and Model 5, respectively; the association between BVD positivity and species was therefore statistically significant at the 0.05 level. A 95% CI that includes the value one indicates the absence of a significant association, as reported by Szumilas [34]. According to the estimates of Model 5, the multivariable logit model had very little effect on the ORs and their interpretation. Furthermore, the direction of the relationship between the BVD outcome and the predictors was unchanged. In conclusion, the multivariable LR model yielded stable estimates, consistent with the absence of multicollinearity.
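The reading of these intervals can be checked with a small sketch. It assumes a Wald-type interval, exp(β ± z·SE), which the text does not state explicitly. The standard error is back-calculated here from the reported limits for species in Model 3, since the SE itself is not given in the text.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def or_confidence_interval(beta: float, se: float, z: float = Z_95):
    """Wald-type CI for an odds ratio: exp(beta - z*se), exp(beta + z*se)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when the interval does not contain an OR of one."""
    return not (lower <= 1.0 <= upper)


# Reported for species in Model 3: OR = 0.447, 95% CI 0.317-0.630.
beta = math.log(0.447)
# Assumption: SE recovered from the width of the reported CI on the log scale.
se = (math.log(0.630) - math.log(0.317)) / (2 * Z_95)

lower, upper = or_confidence_interval(beta, se)
print(round(lower, 3), round(upper, 3), ci_excludes_one(lower, upper))
```

The recovered limits match the reported ones to three decimals, which is consistent with an interval that is symmetric around the log OR.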

Overall evaluation and goodness of fit of models
Predicted model relative to null model
LR models rely on maximum likelihood (ML) estimation to assess the relationship between the outcome and predictor(s). The ML estimator is valued for its ability to provide a model that predicts the outcome with a high degree of precision [35]. A better LR model is one that shows an improvement over the null model [36]. The null model is a good reference because it includes no independent variables. In this study, two tests were used to evaluate the LR models: the likelihood ratio test (LRT) and the score test. Both tests compare the given model with the null model; for the LRT, the difference between the −2LL estimates of the two models is the Chi-square statistic, and the two tests have the same df. The model with the lowest −2LL is considered the best for the fitted dataset. The two tests usually lead to the same conclusions and are supported by many authors [14,37-39]. The results of the LRT, the score test, and the values of −2LL are shown in Table-6. The LRT and score test results were highly significant (p<0.01) for Model 3 (univariable logit model of species) and Model 5 (multivariable LR model). This implies that adding species in Model 3 produced highly significant (p<0.01) Chi-square values (χ²=22.429, df=1 for the LRT and χ²=21.648, df=1 for the score test), indicating that a model with species was more effective than the null model. Similarly, the multivariable LR model was significantly more effective than the null model, with the greatest increase in χ² values (χ²=26.330, df=7 for the LRT and χ²=25.449, df=7 for the score test). The LRT and score test results for the other models were non-significant (p>0.05), suggesting that these models (Models 1, 2, and 4) were not an improvement over the intercept-only model and that their predictors had no influence on BVDV seroprevalence.
Another approach to assessing the studied models was the −2LL value estimated by each model. The multivariable logit model (Model 5), which had not only the smallest −2LL value (919.113) but also the largest drop in −2LL (from 945.442 to 919.113), was the best for predicting BVDV … Table-7), because of data restriction, the small number of groups (2) in contingency tables, and consequently zero degrees of freedom.

Multivariable versus univariable LR models
To assess the causative effects and relationships between more than one predictor and the outcome, a multiple LR model should be constructed, controlling for the other independent variables [40-42]. The multivariable LR model was statistically evaluated not only in relation to the null model but also in relation to the other models. Table-8 shows a comparison of the multivariable LR model with the other logit models. The results revealed highly significant differences (p<0.01) between the multivariable LR model and the three single-predictor models for sex, age, and locality, suggesting that the addition of more predictors leads to a significant improvement. The only non-significant difference in −2LL was observed between the multiple-predictor model and the single-predictor model with species, indicating that adding independent variables other than species did not significantly (p>0.05) improve the predictive ability of the model. In conclusion, the current multivariable LR model was more suitable for predicting BVD prevalence than the other models. Table-7 presents the classification table for the multivariable LR model, which is recommended as the best for the present BVD dataset. At a cutoff point of 0.5, the overall percentage of correct classification was 61.7%, which implies an improvement made by the model. The sensitivity, specificity, false-positive, and false-negative percentages were 43.8%, 71%, 29%, and 56.18%, respectively.
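The model comparisons described above can be retraced from the reported figures. The sketch below is illustrative only: `chi2_sf` is a hand-written helper (not part of any statistics package) that uses the closed-form chi-square tail for integer df. The 6 df for the comparison of Model 5 with Model 3 is derived here as 7 − 1 and is not stated in the text.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2.0) / i
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^(j-1) / (1*3*...*(2j-1))
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * total)


def likelihood_ratio_test(neg2ll_reduced: float, neg2ll_full: float, df: int):
    """LRT statistic (drop in -2LL) and its chi-square p-value."""
    stat = neg2ll_reduced - neg2ll_full
    return stat, chi2_sf(stat, df)


# Model 5 versus the null model, using the -2LL values reported in the text.
stat_null, p_null = likelihood_ratio_test(945.442, 919.113, df=7)

# Model 5 versus Model 3 (species only): difference of their reported LRT
# chi-squares against the null model, with 7 - 1 = 6 df (derived, not reported).
stat_species = 26.330 - 22.429
p_species = chi2_sf(stat_species, df=6)

print(round(stat_null, 3), p_null < 0.01)
print(round(stat_species, 3), p_species > 0.05)
```

Under these assumptions the first comparison is significant at the 0.01 level and the second is not significant at 0.05, in line with the pattern described for Table-8.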

Conclusions
The results of this study show that BVD is prevalent in the cattle and buffalo populations of Egypt. Animal species appears to be a significant risk factor for BVD infection, while the other risk factors, i.e., age, sex, and herd location, had no significant impact on BVD seroprevalence. In addition, the statistical methods presented here gave highly compatible results. This study therefore indicates that multivariable LR is recommended as an alternative to the Chi-square test for both association and predictive statistics for BVD seroprevalence, and that it may aid in studying the epidemiology of BVD virus, particularly if there is no previous history of infection in closely located farms or regions.