The present study uses data from the Jordan Healthcare Utilization and Expenditure Survey (JHUES) collected in 2000. The survey contained a sample of 8,800 households and obtained information on each individual within the household, the head of the household, and detailed health care utilization and expenditure information on a randomly selected household member. Essentially, the JHUES consists of four samples: (i) demographic and background data on 49,543 individuals, (ii) insurance coverage on 49,456 individuals, (iii) household data on 8,306 heads of households, and (iv) health care utilization and spending data, and self-assessed health (SAH) on 8,306 randomly selected individuals of all ages. Individuals from this latter sub-sample are used in the subsequent analysis.

The survey instrument was divided into seven sections: the household schedule, health insurance data, outpatient care utilization (two-week reference period), health status, inpatient episodes, mortality, and household conditions, including income and assets. The survey used a complex sample design with seven strata. In the first stage, primary sampling units (PSUs) were selected within each stratum with probability proportional to size (PPS). In the second stage, 20 households were randomly selected within each sampling unit. Finally, as noted, one individual was chosen at random within each household. Sampling weights were then calculated in two steps. First, each observation was weighted by the inverse of its probability of selection; second, the weights were normalized by dividing these first-step probability weights by their average. The sample design and sampling weights are taken into account in all estimates in the ensuing analysis, which ensures robust standard errors for all estimated coefficients.
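The two-step weighting described above can be sketched as follows. This is a minimal illustration, not the survey's actual procedure: the selection probabilities are hypothetical and the function name `normalized_weights` is introduced here only for the example.

```python
def normalized_weights(selection_probs):
    """Inverse-probability weights rescaled so that their average is 1."""
    # Step 1: weight each observation by the inverse of its selection probability.
    raw = [1.0 / p for p in selection_probs]
    # Step 2: divide each raw weight by the average raw weight.
    avg = sum(raw) / len(raw)
    return [w / avg for w in raw]


# Hypothetical selection probabilities for four sampled individuals.
probs = [0.01, 0.02, 0.05, 0.02]
weights = normalized_weights(probs)

for p, w in zip(probs, weights):
    print(f"selection probability {p:.2f} -> normalized weight {w:.3f}")
```

After normalization the weights sum to the sample size, so weighted totals stay on the scale of the sample, while individuals with a low probability of selection still count for more.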

Descriptive statistics for the main variables are presented in Table 1. During the two-week recall period prior to the survey, a little less than 20 percent of the randomly interviewed individuals reported having had at least one spell of illness. Of those individuals, around 63 percent visited a health care provider for outpatient care; the vast majority of them made only one visit, and only a handful made more than three visits.

As previously noted, around 64 percent of the Jordanian population has access to some type of health insurance. The most common insurance program is the Royal Medical Services (RMS), which provides health insurance for 43 percent of the insured (28 percent of all randomly interviewed individuals; not shown). The program organized by the Ministry of Health (CIP) provides insurance for another 28 percent (18 percent) of the insured, while the Jordanian University Hospital (JUH) program reaches only a little more than one percent (less than 1 percent) of the insured individuals. Some 14 percent (9 percent) of those with insurance have some form of private health insurance, and finally, the UN special program for the Palestinian refugees in Jordan (UNRWA) provides health insurance for around 13 percent (8 percent) of all those insured.
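The figures in parentheses follow from scaling each program's share of the insured by the overall coverage rate of about 64 percent. A quick check using only the percentages quoted in the text (the variable names are illustrative):

```python
INSURED_SHARE_PCT = 64  # share of the population with any insurance (from the text)

# Share of the insured covered by each program, in percent (from the text).
share_of_insured_pct = {"RMS": 43, "CIP": 28, "Private": 14, "UNRWA": 13}

# Share of all individuals = share of insured x overall coverage rate.
share_of_all_pct = {
    program: pct * INSURED_SHARE_PCT / 100
    for program, pct in share_of_insured_pct.items()
}

for program, pct in share_of_all_pct.items():
    print(f"{program}: about {pct:.1f} percent of all individuals")
```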

Information on health insurance status is provided in two places in the questionnaire: first, for all individuals in the survey (n = 49,456) regardless of possible health care utilization during the recall period, and second, for all of the randomly interviewed persons given that they visited a health care service provider at least once during the recall period. Below, the first source of information is used in order to gauge the effect of insurance on utilization regardless of whether a provider was visited during illness. Moreover, the Jordan survey collected information on health care utilization for specific illnesses, as opposed to any illness in the recall period. This means that the potential problem of independence between illness and utilization is taken care of, which, in turn, speaks in favor of applying the two-part approach to the analysis of health care demand [7].

### Econometric considerations

In the subsequent analysis, there are at least three specific econometric considerations that require special attention. First, as mentioned earlier, using interaction terms in logit and probit models calls for the application of extended estimation commands in some standard econometric software programs, including Stata [15, 16].

Second, the coefficients of the various explanatory variables may be biased and inconsistent if some of these variables correlate with the error term in the econometric models below, that is, if they are endogenous. In particular, the health insurance status indicator may be suspected of being endogenous due to unobserved heterogeneity if individuals self-select into their insurance status on account of some factor that is not controlled for and is thus contained in the error term of the estimation models (see, for example, [20]).

There are several reasons, not directly observed, why individuals may opt to buy or obtain (a specific type of) health insurance. For example, to the extent that obtaining insurance is costly, richer individuals may choose to buy insurance expecting to utilize health care in the future. Also, health insurance may be associated with a certain type of employment in either the private or the public (including military) sector. Individuals may then seek certain types of jobs partly with a view to obtaining health insurance in anticipation of future health care needs. Finally, people with poorer health may choose to obtain health insurance because they expect to utilize more health care.

By including measures of these and other factors in the models, attempts are made to control for the potential endogeneity of the insurance variable. In addition, a test for endogeneity of the insurance variable is performed using a version of the Durbin-Wu-Hausman class of tests for endogeneity [21]. Rejection of the null hypothesis of no correlation between the error term and the regressors in such a test would suggest that some of the regressors are endogenous, which, in turn, would suggest a failure to properly identify the regression equation. One way of handling endogeneity would be to use instrumental variable (IV) estimation techniques. Obtaining valid instruments for the endogenous variables is no easy task, and using poor instruments may be inferior to using the possibly endogenous variable and accounting for the bias. Nevertheless, some instruments that have been suggested in the literature on the demand for health and insurance include the relationship of the individual to the head of household and the mean rate of affiliation of the insurance type in the community [13]. This type of information is available from the JHUES 2000 survey, although whether these variables provide valid instruments in this case needs to be tested formally.
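The logic of a regression-based Durbin-Wu-Hausman test can be sketched on simulated data. This is a minimal sketch under strong assumptions and not the version applied in the study: it uses a linear outcome and a continuous suspect regressor, whereas the models here are nonlinear and the insurance indicator is binary. The names `ols` and `dwh_t_stat`, the single instrument `z`, and the simulated values are all hypothetical.

```python
import math
import random


def ols(X, y):
    """OLS via the normal equations; returns (coefficients, std. errors, residuals)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(xtx)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))  # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return beta, se, resid


def dwh_t_stat(y, x, z):
    """t-statistic on the first-stage residual in the augmented regression."""
    # First stage: regress the suspect regressor on the instrument.
    _, _, v_hat = ols([[1.0, zi] for zi in z], x)
    # Augmented regression: outcome on the regressor and the first-stage residual.
    beta, se, _ = ols([[1.0, xi, vi] for xi, vi in zip(x, v_hat)], y)
    return beta[2] / se[2]


# Simulated data in which x is endogenous: it shares the unobserved term u with y.
random.seed(0)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + 0.8 * ui + random.gauss(0, 0.6) for zi, ui in zip(z, u)]
y = [1.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]

t_stat = dwh_t_stat(y, x, z)
print(f"t-statistic on first-stage residual: {t_stat:.2f}")
```

A large absolute t-statistic leads to rejection of the null hypothesis of exogeneity, which is the case that signals an endogeneity problem.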

The final issue concerns the applicability of the count data model. While the number of visits to a provider may be assumed to follow a Poisson distribution, estimating the count model by Poisson methods may be inappropriate because of the rather restrictive assumptions of the Poisson model. In particular, the model assumes that the expected number of counts (or events) is equal to its variance. In empirical applications, this assumption is frequently violated due to overdispersion. A formal test of overdispersion may be applied, the outcome of which may suggest an alternative estimation method, such as a negative binomial (negbin) approach [20, 21]. Also, the Poisson model assumes that visits are independent, which clearly is a strong assumption given that referrals most likely require the active decision of a GP or similar provider. Finally, the count data on the number of visits to an outpatient provider are truncated, which has additional implications for the choice of estimation method [20].
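A simple descriptive check of the equal mean and variance assumption is the ratio of the sample variance to the sample mean, which is close to one for Poisson data. The sketch below is not the formal test referred to above; the visit counts are hypothetical and untruncated, and the function name is illustrative.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; roughly 1 under a Poisson model."""
    return variance(counts) / mean(counts)


# Hypothetical (untruncated) visit counts for twelve individuals.
visits = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0]
index = dispersion_index(visits)
print(f"mean = {mean(visits):.2f}, variance = {variance(visits):.2f}, "
      f"dispersion index = {index:.2f}")
```

An index well above one points to overdispersion and would motivate a formal test and, possibly, a negative binomial specification.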

### Econometric models

The current set of analyses attempts to assess the effect of insurance on health care utilization and on subsequent payments for care by econometric estimation of three separately specified models, as outlined below. The key variable of interest is thus 'insurance status', a dummy variable taking the value 1 if the individual has health insurance and 0 otherwise. To further gauge the insurance effect, the study allows for an interaction between insurance status and an indicator of individual (or household) socioeconomic status. That indicator is total monthly household consumption expenditure divided by the square root of household size, which yields a per capita equivalent measure that controls for household size.
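The socioeconomic indicator and its interaction with insurance status can be illustrated as follows. The household values and function names are hypothetical, and the interaction is shown as a plain product, which is an assumption about how the term enters the models.

```python
import math


def equivalized_expenditure(total_expenditure, household_size):
    """Monthly household consumption expenditure divided by sqrt(household size)."""
    return total_expenditure / math.sqrt(household_size)


def interaction_term(insured, ses):
    """Product of the insurance dummy (0 or 1) and the socioeconomic indicator."""
    return insured * ses


# Hypothetical household: monthly expenditure of 400 shared by four members.
ses = equivalized_expenditure(400.0, 4)
print(ses)                       # 200.0
print(interaction_term(1, ses))  # 200.0 for an insured individual
print(interaction_term(0, ses))  # 0.0 for an uninsured individual
```

Dividing by the square root of household size, rather than by household size itself, allows for economies of scale in household consumption.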

Based on theory and empirical experience, insurance status is expected to improve access to care and to reduce payments for services. While the data generating process (DGP) underlying these decisions is admittedly complex, the econometric approach essentially depends on the nature of the data and the information they contain about the dependent variables. This issue has been discussed in the literature [22], and a taxonomy to help choose the correct model has been suggested. Based on the discussion, this study adopts the approach of the 'two-part model' (TPM) to assess the effect of insurance on utilization and expenditures.
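The idea behind a two-part model can be summarized by the standard decomposition of the expected outcome into a participation part and a conditional-intensity part. The notation below is generic and introduced here for illustration: *y* stands for either visits or expenditures and *X* for the covariates.

```latex
\mathrm{E}[\,y \mid X\,]
  = \Pr(y > 0 \mid X)\;\times\;\mathrm{E}[\,y \mid y > 0,\ X\,]
```

The first factor is estimated from all ill individuals, the second only from those with positive utilization.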

Formally, the econometric models are the following, where the individual subscript *i* = *(1,...,N)* is suppressed for notational simplicity. The probability of a health care visit conditional on being ill is estimated by the probability model

Prob(utilization>0|ill) = *βX* + *ε* (1)

where *X* is a set of covariates, including health insurance status. This model is linear in the log of the odds (logit) of the event, in this case visiting a provider when ill.
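Written out, a logit specification consistent with this description takes the form below. This is a sketch assuming the standard logistic link; the shorthand *p* for the probability of a visit is introduced here for illustration.

```latex
p \equiv \Pr(\text{utilization} > 0 \mid \text{ill}, X)
  = \frac{\exp(\beta X)}{1 + \exp(\beta X)},
\qquad
\ln\!\left(\frac{p}{1 - p}\right) = \beta X
```

The second expression makes explicit that the model is linear in the log odds rather than in the probability itself.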

The intensity of health care utilization is analyzed by means of a 0-truncated negative binomial (0-negbin) model since the focus of analysis is the number of visits conditional on at least one visit to a service provider. The negative binomial takes the following general form:

where *y* is the number of health care visits, ranging from one to, in this case, six, and Γ(·) is the gamma function.
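A standard parameterization consistent with the symbols described here is the so-called NB2 form, sketched below together with its zero-truncated counterpart. The dispersion parameter α > 0 is an assumption introduced for illustration; the text does not name it, and the authors' exact form may differ. Here μ denotes the expected number of visits.

```latex
\Pr(Y = y \mid X)
  = \frac{\Gamma(y + \alpha^{-1})}{\Gamma(y + 1)\,\Gamma(\alpha^{-1})}
    \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu}\right)^{\alpha^{-1}}
    \left(\frac{\mu}{\alpha^{-1} + \mu}\right)^{y},
  \qquad y = 0, 1, 2, \ldots

\Pr(Y = y \mid Y > 0,\ X)
  = \frac{\Pr(Y = y \mid X)}{1 - (1 + \alpha\mu)^{-1/\alpha}},
  \qquad y = 1, 2, \ldots
```

In this form the variance is μ + αμ², so the Poisson model is the limiting case as α approaches zero.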

Finally, out-of-pocket expenditures conditional on positive utilization of outpatient care are estimated by OLS of the log-linear model

Log(OOP|utilization > 0) = *γX* + *μ* (3)

Again, *X* is a set of explanatory variables, including insurance status. In models (1) and (3), *β* and *γ* are the coefficients to be estimated and *ε* and *μ* are error terms; note that in model (2), the symbol *μ* instead denotes the expected number of counts or, here, outpatient visits.
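For model (3), the interpretation of a dummy coefficient in a log-linear OLS regression can be illustrated with a toy example. The sketch assumes a single regressor (the insurance dummy), no sampling weights, and hypothetical payment data, so it is far simpler than the specification estimated in the study.

```python
import math
from statistics import mean

# Hypothetical (insured, out-of-pocket payment) pairs; positive payments only.
records = [(1, 2.0), (1, 4.0), (1, 3.0), (0, 6.0), (0, 9.0), (0, 5.0), (0, 8.0)]

insured = [float(d) for d, _ in records]
log_oop = [math.log(p) for _, p in records]

# Closed-form OLS for a regression of log(OOP) on a constant and the dummy.
d_bar, y_bar = mean(insured), mean(log_oop)
slope = (sum((d - d_bar) * (v - y_bar) for d, v in zip(insured, log_oop))
         / sum((d - d_bar) ** 2 for d in insured))
intercept = y_bar - slope * d_bar

print(f"intercept = {intercept:.3f}, coefficient on insurance = {slope:.3f}")
print(f"implied proportional difference = {math.exp(slope) - 1:.1%}")
```

With only a dummy regressor, the coefficient equals the difference in mean log payments between the insured and the uninsured, and exp(coefficient) − 1 gives the proportional difference in their geometric mean payments.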

In order to draw a ceteris paribus conclusion as to the insurance effect, the study controls for a number of individual, household, and community factors. The individual factors include age and gender; educational, civil, and occupational status; and nationality. There are few convincing a priori reasons to expect these variables to affect utilization and payments in any particular direction, as these effects are largely empirical. The study does, however, also control for the health status of the individual, and it is expected that, all other things being equal, poorer health status leads to larger health care needs and possibly larger health care expenditures.

Household factors include income (as measured by consumption expenditure) and other living conditions of the household. Although these factors are suggested to influence the dependent variables in models (1) to (3), the expected sign of their estimated coefficients is ambiguous, again due to the empirical nature of these effects. Community factors include the presence of a health care provider nearby and a regional dummy variable. Living close to a health service provider is expected to increase the probability of utilization.