Fits a linear model with empirical likelihood.
el_lm(
  formula,
  data,
  weights = NULL,
  na.action,
  offset,
  control = el_control(),
  ...
)
An object of class formula (or one that can be coerced to that class) for a
symbolic description of the model to be fitted.
An optional data frame, list or environment (or object coercible by
as.data.frame() to a data frame) containing the variables in formula. If not
found in data, the variables are taken from environment(formula).
An optional numeric vector of weights to be used in the fitting process.
Defaults to NULL, corresponding to identical weights. If non-NULL, weighted
empirical likelihood is computed.
A function which indicates what should happen when the data contain NAs. The
default is set by the na.action setting of options, and is na.fail if that is
unset.
An optional expression for specifying an a priori known component to be
included in the linear predictor during fitting. This should be NULL or a
numeric vector or matrix of extents matching those of the response. One or more
offset terms can be included in the formula instead or as well, and if more
than one are specified their sum is used.
An object of class ControlEL constructed by el_control().
Additional arguments to be passed to the low level regression fitting functions. See ‘Details’.
An object of class LM.
Suppose that we observe \(n\) independent random variables
\({Z_i} \equiv {(X_i, Y_i)}\) from a common distribution, where \(X_i\)
is the \(p\)-dimensional covariate (including the intercept if any) and
\(Y_i\) is the response. We consider the following linear model:
$$Y_i = X_i^\top \theta + \epsilon_i,$$
where \(\theta = (\theta_0, \dots, \theta_{p-1})\) is an unknown
\(p\)-dimensional parameter and the errors \(\epsilon_i\) are
independent random variables that satisfy
\(\textrm{E}(\epsilon_i | X_i) = 0\). We assume that the errors have
finite conditional variances. Then the least squares estimator of
\(\theta\) solves the following estimating equations:
$$\sum_{i = 1}^n(Y_i - X_i^\top \theta)X_i = 0.$$
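As a quick numerical check of these estimating equations, the following sketch fits a least squares model with lm.fit() and evaluates the left-hand side at the estimate. The built-in mtcars data and the names x, y, theta_hat and score are illustrative assumptions, not part of the package.

```r
# Illustrative data (assumption): regress mpg on wt with an intercept.
x <- cbind(1, mtcars$wt)   # n x p design matrix, first column is the intercept
y <- mtcars$mpg

# Least squares estimate of theta.
theta_hat <- lm.fit(x, y)$coefficients

# Left-hand side of the estimating equations: sum_i (Y_i - X_i' theta) X_i.
# Multiplying the matrix by a length-n vector scales row i by residual i.
score <- colSums(x * as.vector(y - x %*% theta_hat))

# Zero up to floating point error.
max(abs(score))
```

Each component of score should vanish up to rounding, which is the sense in which the least squares estimator solves the estimating equations.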
Given a value of \(\theta\), let
\({g(Z_i, \theta)} = {(Y_i - X_i^\top \theta)X_i}\). The (profile)
empirical likelihood ratio is then defined by
$$R(\theta) =
\max_{p_i}\left\{\prod_{i = 1}^n np_i :
\sum_{i = 1}^n p_i g(Z_i, \theta) = 0,\
p_i \geq 0,\
\sum_{i = 1}^n p_i = 1
\right\}.$$
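By a standard Lagrange multiplier argument, the maximizing weights take the form \(p_i = 1 / \{n(1 + \lambda^\top g(Z_i, \theta))\}\), so that \(\log R(\theta) = -\sum_{i = 1}^n \log(1 + \lambda^\top g(Z_i, \theta))\), where \(\lambda\) solves \(\sum_{i = 1}^n g(Z_i, \theta) / (1 + \lambda^\top g(Z_i, \theta)) = 0\). The sketch below evaluates this quantity with a damped Newton iteration. The helper el_log_ratio() and the mtcars data are hypothetical illustrations only; this is not the optimization routine used by the package.

```r
# Hypothetical helper (not part of the package): log empirical likelihood
# ratio for g(Z_i, theta) = (Y_i - X_i' theta) X_i via the dual problem.
el_log_ratio <- function(theta, x, y, maxit = 50, tol = 1e-8) {
  g <- x * as.vector(y - x %*% theta)  # n x p matrix, row i is g(Z_i, theta)
  # Convex dual objective; Inf where some 1 + lambda' g_i is not positive.
  f <- function(l) {
    w <- 1 + as.vector(g %*% l)
    if (any(w <= 0)) Inf else -sum(log(w))
  }
  lambda <- numeric(ncol(g))
  for (iter in seq_len(maxit)) {
    w <- 1 + as.vector(g %*% lambda)
    grad <- -colSums(g / w)                  # gradient of f at lambda
    if (sqrt(sum(grad^2)) < tol) break
    step <- -solve(crossprod(g / w), grad)   # Newton direction
    s <- 1
    # Halve the step until the objective does not increase.
    while (f(lambda + s * step) > f(lambda) && s > 1e-8) s <- s / 2
    lambda <- lambda + s * step
  }
  f(lambda)  # equals log R(theta) at the solution
}

x <- cbind(1, mtcars$wt)
y <- mtcars$mpg
theta_hat <- lm.fit(x, y)$coefficients

lr_at_estimate <- el_log_ratio(theta_hat, x, y)           # about zero
lr_elsewhere <- el_log_ratio(theta_hat + c(0.5, 0), x, y) # negative
```

At the least squares estimate the constraint holds with equal weights \(p_i = 1/n\), so the log ratio is zero; moving \(\theta\) away from it makes the log ratio negative.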
el_lm() first computes the parameter estimates by calling lm.fit() (with ... if
any) with the model.frame and model.matrix obtained from the formula. Note that
the maximum empirical likelihood estimator is the same as the quasi-maximum
likelihood estimator in our model. Next, it tests hypotheses based on
asymptotic chi-square distributions of the empirical likelihood ratio
statistics. The tests include an overall test with
$$H_0: \theta_1 = \theta_2 = \cdots = \theta_{p-1} = 0,$$
and significance tests for each parameter with
$$H_{0j}: \theta_j = 0,\ j = 0, \dots, p-1.$$
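The two steps described above can be sketched with base R alone. The first part mirrors the estimation step on the built-in mtcars data, which is an illustrative assumption. The second part shows how a chi-square statistic and p-value follow from a log empirical likelihood ratio, using the rounded values logLR = -33.38 and Chisq = 1.978 copied from the example output on this page. The single degree of freedom for the per-coefficient test is assumed from the single constraint in \(H_{0j}\). None of this reproduces the package's internal code.

```r
# Step 1 (assumed outline): build the model frame and design matrix from a
# formula, then obtain the estimates with lm.fit().
mf <- model.frame(mpg ~ wt, data = mtcars)
x <- model.matrix(attr(mf, "terms"), mf)
y <- model.response(mf)
est <- lm.fit(x, y)$coefficients

# Step 2: calibrate an EL ratio statistic against a chi-square distribution.
# The statistic is -2 times the log EL ratio.
log_lr <- -33.38          # overall test, rounded value from the example output
chisq <- -2 * log_lr      # 66.76
# Overall test: p - 1 = 3 constraints in the four-parameter example.
p_overall <- pchisq(chisq, df = 3, lower.tail = FALSE)
# Test for a single coefficient (trtSpray in the example), assumed df = 1.
p_spray <- pchisq(1.978, df = 1, lower.tail = FALSE)
```

Because the inputs are rounded, p_overall only approximates the printed Pr(>Chisq), while p_spray agrees with the printed 0.160 to three decimals.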
Owen A (1991). "Empirical Likelihood for Linear Models." The Annals of Statistics, 19(4), 1725-1747. doi:10.1214/aos/1176348368.
EL, LM, el_glm(), elt(), el_control()
## Linear model
data("thiamethoxam")
fit <- el_lm(fruit ~ trt, data = thiamethoxam)
summary(fit)
#>
#> Empirical Likelihood
#>
#> Model: lm
#>
#> Call:
#> el_lm(formula = fruit ~ trt, data = thiamethoxam)
#>
#> Number of observations: 165
#> Number of parameters: 4
#>
#> Parameter values under the null hypothesis:
#> (Intercept) trtSpray trtFurrow trtSeed
#> 6.016 0.000 0.000 0.000
#>
#> Lagrange multipliers:
#> [1] -0.03994 -0.29622 0.59777 -0.15872
#>
#> Maximum EL estimates:
#> (Intercept) trtSpray trtFurrow trtSeed
#> 5.6667 -0.7167 2.7798 -0.3452
#>
#> logL: -875.9 , logLR: -33.38
#> Chisq: 66.76, df: 3, Pr(>Chisq): 2.113e-14
#> Constrained EL: converged
#>
#> Coefficients:
#> Estimate Chisq Pr(>Chisq)
#> (Intercept) 5.6667 413.766 < 2e-16 ***
#> trtSpray -0.7167 1.978 0.160
#> trtFurrow 2.7798 19.259 1.14e-05 ***
#> trtSeed -0.3452 0.416 0.519
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
## Weighted data
wfit <- el_lm(fruit ~ trt, data = thiamethoxam, weights = visit)
summary(wfit)
#>
#> Weighted Empirical Likelihood
#>
#> Model: lm
#>
#> Call:
#> el_lm(formula = fruit ~ trt, data = thiamethoxam, weights = visit)
#>
#> Number of observations: 165
#> Number of parameters: 4
#>
#> Parameter values under the null hypothesis:
#> (Intercept) trtSpray trtFurrow trtSeed
#> 6.358 0.000 0.000 0.000
#>
#> Lagrange multipliers:
#> [1] -0.06482 -0.21994 0.51894 -0.19381
#>
#> Maximum EL estimates:
#> (Intercept) trtSpray trtFurrow trtSeed
#> 5.71940 -0.10153 2.94806 -0.07815
#>
#> logL: -847.5 , logLR: -31.59
#> Chisq: 63.18, df: 3, Pr(>Chisq): 1.229e-13
#> Constrained EL: converged
#>
#> Coefficients:
#> Estimate Chisq Pr(>Chisq)
#> (Intercept) 5.71940 415.486 < 2e-16 ***
#> trtSpray -0.10153 0.028 0.867
#> trtFurrow 2.94806 19.830 8.46e-06 ***
#> trtSeed -0.07815 0.020 0.887
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
## Missing data
fit2 <- el_lm(fruit ~ trt + scb, data = thiamethoxam,
  na.action = na.omit, offset = NULL
)
summary(fit2)
#>
#> Empirical Likelihood
#>
#> Model: lm
#>
#> Call:
#> el_lm(formula = fruit ~ trt + scb, data = thiamethoxam, na.action = na.omit,
#> offset = NULL)
#>
#> Number of observations: 162
#> Number of parameters: 5
#>
#> Parameter values under the null hypothesis:
#> (Intercept) trtSpray trtFurrow trtSeed scb
#> 6.043 0.000 0.000 0.000 0.000
#>
#> Lagrange multipliers:
#> [1] -0.017410 -0.301618 0.595467 -0.153307 -0.008558
#>
#> Maximum EL estimates:
#> (Intercept) trtSpray trtFurrow trtSeed scb
#> 5.62981 -0.74024 2.82886 -0.24460 0.01551
#>
#> logL: -857 , logLR: -32.79
#> Chisq: 65.58, df: 4, Pr(>Chisq): 1.939e-13
#> Constrained EL: converged
#>
#> Coefficients:
#> Estimate Chisq Pr(>Chisq)
#> (Intercept) 5.62981 405.317 < 2e-16 ***
#> trtSpray -0.74024 2.160 0.142
#> trtFurrow 2.82886 23.410 1.31e-06 ***
#> trtSeed -0.24460 0.235 0.628
#> scb 0.01551 0.141 0.707
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> (3 observations deleted due to missingness)
#>