Compute a p-value from a null distribution and an observed statistic. Currently, only simulation-based methods are supported.

Learn more in vignette("infer").

get_p_value(x, obs_stat, direction)

get_pvalue(x, obs_stat, direction)

Arguments

x

Data frame of calculated statistics, as returned by calculate() applied to the output of generate().

obs_stat

A numeric value or a 1x1 data frame giving the observed statistic. The p-value is the proportion of statistics in x that are as extreme as or more extreme than this value.

direction

A character string. Options are "less", "greater", or "two-sided". Can also use "left", "right", "both", "two_sided", or "two sided".
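To make the role of direction concrete, here is a minimal base R sketch of how a simulation-based p-value can be computed from a vector of null statistics. The helper sim_p_value() is hypothetical and not part of infer, and the two-sided rule used here (twice the smaller tail, capped at 1) is one common convention assumed for illustration; this page does not state which rule get_p_value() uses internally.

```r
# Hypothetical helper (not part of infer): proportion of simulated
# statistics at least as extreme as the observed one.
sim_p_value <- function(null_stats, obs_stat, direction = "two-sided") {
  p_left  <- mean(null_stats <= obs_stat)  # share at or below the observed value
  p_right <- mean(null_stats >= obs_stat)  # share at or above the observed value
  switch(
    direction,
    "less"      = p_left,
    "greater"   = p_right,
    # Assumed convention: double the smaller tail and cap at 1
    "two-sided" = min(2 * min(p_left, p_right), 1),
    stop("unknown direction: ", direction)
  )
}

# Toy null distribution of 10 simulated statistics
null_stats <- c(38, 39, 40, 40, 41, 42, 43, 44, 45, 46)

sim_p_value(null_stats, 45, "greater")    # 0.2
sim_p_value(null_stats, 45, "less")       # 0.9
sim_p_value(null_stats, 45, "two-sided")  # 0.4
```

With only 10 simulated statistics, the result can only be a multiple of 0.1 (or 0.2 for the two-sided case), which is why the number of reps limits the resolution of the reported p-value.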

Value

A 1x1 tibble containing a p-value between 0 and 1.

Aliases

get_pvalue() is an alias of get_p_value(). p_value() is a deprecated alias of get_p_value().

Zero p-value

Though a true p-value of 0 is impossible, get_p_value() may return 0 in some cases. This is due to the simulation-based nature of the {infer} package; the output of this function is an approximation based on the number of reps chosen in the generate() step. When the observed statistic is very unlikely given the null hypothesis and only a small number of reps have been generated to form a null distribution, the observed statistic may be more extreme than every test statistic in that distribution, resulting in an approximate p-value of 0. In this case, the true p-value is a small value, likely less than 3/reps (based on a Poisson approximation).

When a p-value of zero is reported, a warning is raised to caution the user against reporting a p-value exactly equal to 0.
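The 3/reps figure can be reproduced with a short calculation. The sketch below assumes a 95% confidence level, which this page does not state explicitly. If the true p-value is p, the number of simulated statistics at least as extreme as the observed one is approximately Poisson with mean reps * p, so the chance of seeing none is exp(-reps * p). Setting that chance to 0.05 and solving for p gives the bound.

```r
reps <- 1000

# Assumed 95% level: solve exp(-reps * p) = 0.05 for p
poisson_bound <- -log(0.05) / reps
poisson_bound          # about 0.003, i.e. roughly 3 / reps

# Exact binomial counterpart: solve (1 - p)^reps = 0.05 for p
binomial_bound <- 1 - 0.05^(1 / reps)
binomial_bound         # slightly below the Poisson bound

# The numerator -log(0.05) is what rounds to the "3" in 3 / reps
-log(0.05)             # about 2.996
```

So with reps = 1000 and a reported p-value of 0, a statement such as "p < 0.003" is more defensible than "p = 0".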

Examples

# find the point estimate---mean number of hours worked per week
point_estimate <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean") %>%
  dplyr::pull()

# starting with the gss dataset
gss %>%
  # ...we're interested in the number of hours worked per week
  specify(response = hours) %>%
  # hypothesizing that the mean is 40
  hypothesize(null = "point", mu = 40) %>%
  # generating data points for a null distribution
  generate(reps = 1000, type = "bootstrap") %>%
  # finding the null distribution
  calculate(stat = "mean") %>%
  get_p_value(obs_stat = point_estimate, direction = "two-sided")
#> # A tibble: 1 x 1
#>   p_value
#>     <dbl>
#> 1   0.042

# More in-depth explanation of how to use the infer package
if (FALSE) {
  vignette("infer")
}