This function is a wrapper that calls `specify()`, `hypothesize()`, and `calculate()` consecutively. It can be used to calculate observed statistics from data. `hypothesize()` will only be called if a point null hypothesis parameter is supplied.

Learn more in `vignette("infer")`.
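The chaining described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: `observe_sketch()` is a hypothetical helper that supports only the formula interface and a point null on the mean, and it assumes the `infer` package (with its `gss` data) is installed.

```r
library(infer)

# Hypothetical helper for illustration only; not infer's real observe().
# Supports only a formula and an optional point null on the mean (mu).
observe_sketch <- function(x, formula, stat, mu = NULL, order = NULL) {
  out <- specify(x, formula = formula)

  # hypothesize() is skipped unless a point null parameter is supplied
  if (!is.null(mu)) {
    out <- hypothesize(out, null = "point", mu = mu)
  }

  calculate(out, stat = stat, order = order)
}

# No null parameter: specify() -> calculate()
obs_mean <- observe_sketch(gss, hours ~ NULL, stat = "mean")

# Point null supplied: specify() -> hypothesize() -> calculate()
obs_t <- observe_sketch(gss, hours ~ NULL, stat = "t", mu = 40)
```

The real `observe()` also accepts `response`, `explanatory`, `success`, and the other null parameters listed under Arguments; the sketch leaves them out to keep the control flow visible.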

## Usage

```
observe(
  x,
  formula,
  response = NULL,
  explanatory = NULL,
  success = NULL,
  null = NULL,
  p = NULL,
  mu = NULL,
  med = NULL,
  sigma = NULL,
  stat = c("mean", "median", "sum", "sd", "prop", "count", "diff in means",
    "diff in medians", "diff in props", "Chisq", "F", "slope", "correlation", "t", "z",
    "ratio of props", "odds ratio"),
  order = NULL,
  ...
)
```

## Arguments

- x
A data frame that can be coerced into a tibble.

- formula
A formula with the response variable on the left and the explanatory on the right. Alternatively, a `response` and `explanatory` argument can be supplied.

- response
The variable name in `x` that will serve as the response. This is an alternative to using the `formula` argument.

- explanatory
The variable name in `x` that will serve as the explanatory variable. This is an alternative to using the `formula` argument.

- success
The level of `response` that will be considered a success, as a string. Needed for inference on one proportion, a difference in proportions, and corresponding z stats.

- null
The null hypothesis. Options include `"independence"`, `"point"`, and `"paired independence"`.

  - `independence`: Should be used with both a `response` and `explanatory` variable. Indicates that the values of the specified `response` variable are independent of the associated values in `explanatory`.
  - `point`: Should be used with only a `response` variable. Indicates that a point estimate based on the values in `response` is associated with a parameter. Sometimes requires supplying one of `p`, `mu`, `med`, or `sigma`.
  - `paired independence`: Should be used with only a `response` variable giving the pre-computed difference between paired observations. Indicates that the order of subtraction between paired values does not affect the resulting distribution.

- p
The true proportion of successes (a number between 0 and 1). To be used with point null hypotheses when the specified response variable is categorical.

- mu
The true mean (any numerical value). To be used with point null hypotheses when the specified response variable is continuous.

- med
The true median (any numerical value). To be used with point null hypotheses when the specified response variable is continuous.

- sigma
The true standard deviation (any numerical value). To be used with point null hypotheses.

- stat
A string giving the type of the statistic to calculate. Current options include `"mean"`, `"median"`, `"sum"`, `"sd"`, `"prop"`, `"count"`, `"diff in means"`, `"diff in medians"`, `"diff in props"`, `"Chisq"` (or `"chisq"`), `"F"` (or `"f"`), `"t"`, `"z"`, `"ratio of props"`, `"slope"`, `"odds ratio"`, `"ratio of means"`, and `"correlation"`. `infer` only supports theoretical tests on one or two means via the `"t"` distribution and one or two proportions via the `"z"`.

- order
A string vector specifying the order in which the levels of the explanatory variable should be ordered for subtraction (or division for ratio-based statistics), where `order = c("first", "second")` means `("first" - "second")`, or the analogue for ratios. Needed for inference on difference in means, medians, proportions, ratios, t, and z statistics.

- ...
To pass options like `na.rm = TRUE` into functions like `mean()`, `sd()`, etc. Can also be used to supply hypothesized null values for the `"t"` statistic or additional arguments to `stats::chisq.test()`.
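To make the `order` argument concrete, here is a minimal base R sketch of the arithmetic it describes. The data frame `toy` and the variable names are made up for illustration; the code does not call `infer` and only mirrors the "first minus second" (or "first divided by second") convention stated above.

```r
# Made-up data for illustration only (not from the gss dataset)
toy <- data.frame(
  score = c(10, 12, 14, 7, 9, 11),
  group = factor(c("first", "first", "first", "second", "second", "second"))
)

# Hypothetical stand-in for the `order` argument
lvl_order <- c("first", "second")

# Mean of the response within each level, taken in the requested order
mean_first <- mean(toy$score[toy$group == lvl_order[1]])
mean_second <- mean(toy$score[toy$group == lvl_order[2]])

# Difference-type statistics subtract the second level from the first
diff_in_means <- mean_first - mean_second

# Ratio-type statistics divide the first level by the second
ratio_of_means <- mean_first / mean_second

# Swapping the order flips the sign of the difference
diff_reversed <- mean_second - mean_first
```

With these toy values the group means are 12 and 9, so the difference is 3 in one order and -3 in the other.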

## See also

Other wrapper functions: `chisq_stat()`, `chisq_test()`, `prop_test()`, `t_stat()`, `t_test()`

Other functions for calculating observed statistics: `chisq_stat()`, `t_stat()`

## Examples

```
# load the package that provides observe() and the gss data
library(infer)

# calculating the observed mean number of hours worked per week
gss %>%
  observe(hours ~ NULL, stat = "mean")
#> Response: hours (numeric)
#> # A tibble: 1 × 1
#>    stat
#>   <dbl>
#> 1  41.4

# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")
#> Response: hours (numeric)
#> # A tibble: 1 × 1
#>    stat
#>   <dbl>
#> 1  41.4

# calculating a t statistic for hypothesized mu = 40 hours worked/week
gss %>%
  observe(hours ~ NULL, stat = "t", null = "point", mu = 40)
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1 × 1
#>    stat
#>   <dbl>
#> 1  2.09

# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1 × 1
#>    stat
#>   <dbl>
#> 1  2.09

# similarly for a difference in means in age based on whether
# the respondent has a college degree
observe(
  gss,
  age ~ college,
  stat = "diff in means",
  order = c("degree", "no degree")
)
#> Response: age (numeric)
#> Explanatory: college (factor)
#> # A tibble: 1 × 1
#>    stat
#>   <dbl>
#> 1 0.941

# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(age ~ college) %>%
  calculate("diff in means", order = c("degree", "no degree"))
#> Response: age (numeric)
#> Explanatory: college (factor)
#> # A tibble: 1 × 1
#>    stat
#>   <dbl>
#> 1 0.941

# for a more in-depth explanation of how to use the infer package
if (FALSE) {
  vignette("infer")
}
```
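For readers who want to see what the `"t"` statistic with a point null on `mu` measures, the following base R sketch computes the standard one-sample t statistic by hand and cross-checks it against `stats::t.test()`. The sample `hours_toy` and the value `mu0` are made up for illustration, and the sketch assumes the usual one-sample formula rather than quoting the package's internals.

```r
# Made-up sample for illustration only (not the gss data)
hours_toy <- c(38, 45, 40, 50, 42, 36, 44, 41)

# Hypothesized mean under the point null
mu0 <- 40

# One-sample t statistic: (sample mean - mu0) / standard error of the mean
n <- length(hours_toy)
t_manual <- (mean(hours_toy) - mu0) / (sd(hours_toy) / sqrt(n))

# Cross-check against base R's t test
t_builtin <- unname(stats::t.test(hours_toy, mu = mu0)$statistic)
```

The two values agree up to floating-point error, which is a quick way to sanity-check a hand-computed statistic before comparing it with the output of `observe()`.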