Declare a null hypothesis about variables selected in specify().
Learn more in vignette("infer").
hypothesize(x, null, p = NULL, mu = NULL, med = NULL, sigma = NULL)
hypothesise(x, null, p = NULL, mu = NULL, med = NULL, sigma = NULL)
Argument | Description |
---|---|
x | A data frame that can be coerced into a tibble. |
null | The null hypothesis. Options include "independence" and "point". |
p | The true proportion of successes (a number between 0 and 1). To be used with point null hypotheses when the specified response variable is categorical. |
mu | The true mean (any numerical value). To be used with point null hypotheses when the specified response variable is continuous. |
med | The true median (any numerical value). To be used with point null hypotheses when the specified response variable is continuous. |
sigma | The true standard deviation (any numerical value). To be used with point null hypotheses. |
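The `p` and `med` arguments follow the same pattern as `mu`. The sketch below is a minimal illustration rather than part of the official examples; it assumes the `gss` data set bundled with infer has a `sex` column with a `"female"` level, and the object names `null_prop` and `null_med` are made up for this example.

```r
library(infer)  # provides gss, specify(), hypothesize(), and the %>% pipe

# Point null on a proportion: categorical response, so `p` is used.
# Assumption: gss has a `sex` column with a "female" level.
null_prop <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = 0.5)

# Point null on a median: continuous response, so `med` is used.
null_med <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", med = 40)
```

Printing either object should show the response variable and the declared null hypothesis above the data, as in the examples further down.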
A tibble containing the response (and explanatory, if specified) variable data, with the hypothesized parameter information stored alongside it.
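One way to see what is stored is to look at the returned object directly. The sketch below assumes the hypothesis details are kept as attributes named `"null"` and `"params"`; these names are internal details rather than documented API, so they may differ between infer versions. The object name `null_mu` is made up for this example.

```r
library(infer)

null_mu <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40)

# The data columns are the variables chosen in specify()
names(null_mu)

# Assumption: hypothesis details live in these attributes (internal names,
# not a documented interface, so treat this as exploratory only)
attr(null_mu, "null")
attr(null_mu, "params")
```

For everyday use, printing the object is the safer route, since the print method reports the null hypothesis without relying on attribute names.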
# hypothesize independence of two variables
gss %>%
  specify(college ~ partyid, success = "degree") %>%
  hypothesize(null = "independence")
#> Dropping unused factor levels DK from the supplied explanatory variable 'partyid'.
#> Response: college (factor)
#> Explanatory: partyid (factor)
#> Null Hypothesis: independence
#> # A tibble: 500 × 2
#>    college   partyid
#>    <fct>     <fct>
#>  1 degree    ind
#>  2 no degree rep
#>  3 degree    ind
#>  4 no degree ind
#>  5 degree    rep
#>  6 no degree rep
#>  7 no degree dem
#>  8 degree    ind
#>  9 degree    rep
#> 10 no degree dem
#> # … with 490 more rows
# hypothesize a mean number of hours worked per week of 40
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40)
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 500 × 1
#>    hours
#>    <dbl>
#>  1    50
#>  2    31
#>  3    40
#>  4    40
#>  5    40
#>  6    53
#>  7    32
#>  8    20
#>  9    40
#> 10    40
#> # … with 490 more rows
# more in-depth explanation of how to use the infer package
if (FALSE) {
  vignette("infer")
}