samplingbook package


Sampling procedures from the book ‘Stichproben - Methoden und praktische Umsetzung mit R’ by Goeran Kauermann and Helmut Kuechenhoff (2010).

The book introduces the basic principles for sampling and corresponding statistical analysis. The following topics are covered:

  1. Introduction of Basic Sampling Principles
  2. Simple Sampling Methods
  3. Model-based Sampling
  4. Design-based Sampling: Horvitz-Thompson Estimate
  5. Grouping of Populations: Stratified and Cluster Sampling
  6. Methods with Multiple Phases
  7. Problems in Real-World Applications

Each chapter concludes with real-world example applications in R that use the functions from this package.

Installation

You can install the latest production version from CRAN

install.packages("samplingbook", dependencies = TRUE)

or the current development version from GitHub

library("devtools")
install_github("jmanitz/samplingbook")

Then, load the package

library("samplingbook")

Example: Simple Sample and Estimation of Proportions

In a company with N=300 employees, a survey was conducted on how to improve working conditions. A random sample of n=100 employees was drawn and asked two questions: whether they support more flexible working hours, and whether they support child care within the company.

The questions were answered with “yes” by 45 and 2 employees, respectively. Using samplingbook::Sprop(), we can estimate the proportions of support in the entire company:

Sprop(m=45, n=100, N=300)
## 
## Sprop object: Sample proportion estimate
## With finite population correction: N = 300 
## 
## Proportion estimate:  0.45 
## Standard error:  0.0408 
## 
## 95% approximate confidence interval: 
##  proportion: [0.37,0.53]
##  number in population: [111,159]
## 95% exact hypergeometric confidence interval: 
##  proportion: [0.3667,0.5367]
##  number in population: [110,161]

Thus, between 37% and 53% of all employees, that is, between 111 and 159 persons, are estimated to support more flexible working hours.
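The approximate interval printed above can be retraced by hand in base R. The following is a minimal sketch, assuming that the standard error is computed as sqrt(p(1-p)/(n-1) * (1 - n/N)) and that the interval uses a normal quantile. This formula is an assumption chosen because it reproduces the printed standard error; it is not taken from the package source, which remains the authority.

```r
# Sketch under an assumed formula (not copied from samplingbook itself)
m <- 45    # "yes" answers in the sample
n <- 100   # sample size
N <- 300   # population size

p_hat <- m / n

# Variance estimate with finite population correction (1 - n/N)
se <- sqrt(p_hat * (1 - p_hat) / (n - 1) * (1 - n / N))

# Normal-approximation 95% interval for the proportion
z <- qnorm(0.975)
ci_prop <- p_hat + c(-1, 1) * z * se

# Scale to the population to get an interval for the number of supporters
ci_count <- round(N * ci_prop)

print(round(se, 4))
print(round(ci_prop, 2))
print(ci_count)
```

If the assumed formula is right, the printed values should agree with the standard error and the approximate interval reported by Sprop() above.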

Sprop(m=2, n=100, N=300)
## 
## Sprop object: Sample proportion estimate
## With finite population correction: N = 300 
## 
## Proportion estimate:  0.02 
## Standard error:  0.0115 
## 
## 95% approximate confidence interval: 
##  proportion: [-0.0025,0.0425]
##  number in population: [0,12]
## 95% exact hypergeometric confidence interval: 
##  proportion: [0.0067,0.0633]
##  number in population: [2,19]

In contrast, the exact interval suggests that only between 2 and 19 employees would support child care within the company.

The example shows that, particularly for small proportion estimates, calculating exact confidence intervals from the hypergeometric distribution is more appropriate: the approximate interval here even extends below zero.
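One way to see why the normal approximation struggles for small proportions is to look at the sampling distribution directly. The sketch below uses only base R (dhyper() and pnorm() from the stats package). The value M = 6 is a hypothetical number of supporters in the population, chosen for illustration as 2% of N; it is not an estimate from the survey.

```r
N <- 300   # population size
n <- 100   # sample size
M <- 6     # hypothetical number of supporters in the population (2% of N)

# Exact distribution of the number of supporters in the sample
x <- 0:M
pmf <- dhyper(x, m = M, n = N - M, k = n)

# Mean and variance of the hypergeometric distribution
mean_x <- n * M / N
var_x <- n * (M / N) * (1 - M / N) * (N - n) / (N - 1)

# Probability mass that a matching normal curve places below zero counts,
# a region that is impossible for the actual sample count
mass_below_zero <- pnorm(-0.5, mean = mean_x, sd = sqrt(var_x))

print(round(pmf, 4))
print(mass_below_zero)
```

The exact distribution is skewed and bounded at zero, while the normal curve is symmetric and assigns some probability to negative counts, which is consistent with the negative lower limit of the approximate interval above.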

Some remarks on exact hypergeometric confidence intervals for proportion estimates can be found in the vignette.

Contributions

Citation

Juliane Manitz, Mark Hempelmann, Goeran Kauermann, Helmut Kuechenhoff, Shuai Shao, Cornelia Oberhauser, Nina Westerheide and Manuel Wiesenfarth (2020). samplingbook: Survey Sampling Procedures. R package version 1.2.4. https://CRAN.R-project.org/package=samplingbook

Use toBibtex(citation("samplingbook")) in R to extract BibTeX references.