The sample method of a CmdStanModel object runs the default MCMC algorithm in CmdStan (algorithm=hmc engine=nuts) to produce a set of draws from the posterior distribution of a model conditioned on some data.

Usage

$sample(
  num_chains = 1,
# num_cores = NULL, # not yet available
  data = NULL,
  num_warmup = NULL,
  num_samples = NULL,
  save_warmup = FALSE,
  thin = NULL,
  refresh = NULL,
  init = NULL,
  seed = NULL,
  max_depth = NULL,
  metric = NULL,
  stepsize = NULL,
  adapt_engaged = NULL,
  adapt_delta = NULL
)

Arguments shared by all fitting methods

The following arguments can be specified for any of the fitting methods (sample, optimize, variational). Any argument left at NULL takes the default value of the installed version of CmdStan.

  • data: (multiple options) The data to use. One of the following:

    • A named list of R objects, as in RStan;

    • A path to a data file compatible with CmdStan (R dump or JSON). See the appendices in the CmdStan manual for details on using these formats.

  • seed: (positive integer) A seed for the (P)RNG to pass to CmdStan.

  • refresh: (non-negative integer) The number of iterations between screen updates.

  • init: (multiple options) The initialization method:

    • A real number x>0 initializes randomly in the interval [-x,x] (on the unconstrained parameter space);

    • 0 initializes all parameters to 0 (on the unconstrained parameter space);

    • A character vector of paths (one per chain) to initialization files.
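As a minimal sketch of how the shared arguments fit together, the snippet below builds a named data list and one initialization file per chain. It assumes the Bernoulli example model (a single parameter named theta) and that the installed CmdStan accepts JSON initialization files; the file names and the wrapper fit_with_inits are hypothetical and only serve the illustration.

```r
# Data as a named list; the names must match the model's data block.
standata <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

# One initialization file per chain, written by hand as minimal JSON.
# Assumption: the model has a single parameter called theta.
num_chains <- 2
init_values <- c(0.2, 0.8)
init_files <- file.path(tempdir(), paste0("init-", seq_len(num_chains), ".json"))
for (i in seq_len(num_chains)) {
  writeLines(sprintf('{"theta": %s}', format(init_values[i])), init_files[i])
}

# Hypothetical wrapper: `mod` is assumed to be a compiled CmdStanModel object.
fit_with_inits <- function(mod) {
  mod$sample(
    data = standata,       # named list (a path to a data file also works)
    seed = 123,            # seed passed on to CmdStan
    refresh = 500,         # print progress every 500 iterations
    init = init_files,     # one file per chain
    num_chains = num_chains
  )
}
```

Calling fit_with_inits(mod) would then run two chains, each starting from the value stored in its own file. Replacing init = init_files with init = 0 or init = 0.5 switches to the other two initialization options.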

Arguments unique to the sample method

In addition to the arguments above, the sample method has its own set of arguments. They are described briefly here and in greater detail in the CmdStan manual. Any argument left at NULL takes the default value of the installed version of CmdStan.

  • num_samples: (positive integer) The number of sampling iterations.

  • num_warmup: (positive integer) The number of warmup iterations.

  • save_warmup: (logical) Should warmup iterations also be streamed to the output?

  • thin: (positive integer) The period between saved samples. This should typically be left at its default (no thinning).

  • adapt_engaged: (logical) Do warmup adaptation?

  • adapt_delta: (real in (0,1)) The adaptation target acceptance statistic.

  • stepsize: (positive real) The initial step size for the discrete approximation to continuous Hamiltonian dynamics. This is further tuned during warmup.

  • metric: (character) The geometry of the base manifold. One of the following:

    • A single string from among "diag_e", "dense_e", "unit_e";

    • A character vector containing paths to files (one per chain) compatible with CmdStan that contain precomputed metrics. Each path must be to a JSON or Rdump file that contains an entry inv_metric whose value is either the diagonal vector or the full covariance matrix.

    To turn off adaptation when using a precomputed metric, set adapt_engaged=FALSE; otherwise the precomputed metric is used only as an initial guess during adaptation. See the Euclidean Metric section of the CmdStan manual for more details on these options.

  • max_depth: (positive integer) The maximum allowed tree depth. See the Tree Depth section of the CmdStan manual for more details.
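The sampler arguments above can be combined in a single call. The following is a sketch only: the wrapper name sample_carefully and the particular values are illustrative assumptions, not recommendations, and `mod` is assumed to be a compiled CmdStanModel object.

```r
# Hypothetical wrapper around the sample method with non-default tuning.
# `mod` is assumed to be a compiled CmdStanModel object and `standata`
# a named list (or a path to a data file).
sample_carefully <- function(mod, standata) {
  mod$sample(
    data = standata,
    num_chains = 4,
    num_warmup = 1000,
    num_samples = 1000,
    adapt_delta = 0.95,  # higher target than the 0.8 default shown in the Examples output
    max_depth = 12,      # allow deeper trees than the default of 10
    metric = "dense_e"   # one of "diag_e", "dense_e", "unit_e"
  )
}
```

Arguments that are not listed in the call (for example thin or stepsize) stay at NULL and therefore fall back to the CmdStan defaults.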

Value

The sample method returns a CmdStanMCMC object.
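A CmdStanMCMC object exposes the paths of the CSV files written by CmdStan through its output_files() method (used in the Examples together with rstan::read_stan_csv()). As a rough sketch, and assuming the files follow CmdStan's usual CSV layout in which every non-data line starts with #, the draws can also be read with base R. The helper name read_draws is hypothetical.

```r
# Hypothetical helper: read the draws of every chain into one data frame.
# `fit` is assumed to be a CmdStanMCMC object (anything with an
# output_files() method that returns CSV paths works).
read_draws <- function(fit) {
  # Lines starting with "#" hold configuration and timing, not draws.
  draws <- lapply(fit$output_files(), read.csv, comment.char = "#")
  # Record which chain each row came from before stacking the chains.
  for (i in seq_along(draws)) {
    draws[[i]]$chain <- i
  }
  do.call(rbind, draws)
}
```

With a fitted object, mean(read_draws(fit_mcmc)$theta) would give the posterior mean of theta across all chains. This bypasses the convergence diagnostics reported by the summary method, so it is best treated as a convenience for quick inspection.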

See also

The CmdStanR website (mc-stan.org/cmdstanr) for online documentation and tutorials.

The Stan and CmdStan documentation.

Other CmdStanModel methods: CmdStanModel-method-compile, CmdStanModel-method-optimize, CmdStanModel-method-variational

Examples

# \dontrun{
# Set path to cmdstan
# Note: if you installed CmdStan via install_cmdstan() with default settings
# then default below should work. Otherwise use the `path` argument to
# specify the location of your CmdStan installation.
set_cmdstan_path(path = NULL)
#> CmdStan path set to: /Users/jgabry/.cmdstanr/cmdstan
# Create a CmdStan model object from a Stan program,
# here using the example model that comes with CmdStan
stan_program <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(stan_program)
mod$print()
#> data {
#>   int<lower=0> N;
#>   int<lower=0,upper=1> y[N];
#> }
#> parameters {
#>   real<lower=0,upper=1> theta;
#> }
#> model {
#>   theta ~ beta(1,1);
#>   for (n in 1:N)
#>     y[n] ~ bernoulli(theta);
#> }

# Compile to create executable
mod$compile()
#> Running make /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli
#> make: `/Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli' is up to date.

# Run sample method (MCMC via Stan's dynamic HMC/NUTS),
# specifying data as a named list (like RStan)
standata <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))
fit_mcmc <- mod$sample(data = standata, seed = 123, num_chains = 2)
#> method = sample (Default)
#>   sample
#>     num_samples = 1000 (Default)
#>     num_warmup = 1000 (Default)
#>     save_warmup = 0 (Default)
#>     thin = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       gamma = 0.050000000000000003 (Default)
#>       delta = 0.80000000000000004 (Default)
#>       kappa = 0.75 (Default)
#>       t0 = 10 (Default)
#>       init_buffer = 75 (Default)
#>       term_buffer = 50 (Default)
#>       window = 25 (Default)
#>     algorithm = hmc (Default)
#>       hmc
#>         engine = nuts (Default)
#>           nuts
#>             max_depth = 10 (Default)
#>         metric = diag_e (Default)
#>         metric_file = (Default)
#>         stepsize = 1 (Default)
#>         stepsize_jitter = 0 (Default)
#> id = 1
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02199f3176.data.R
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#>
#> Gradient evaluation took 1.7e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> Iteration:    1 / 2000 [  0%]  (Warmup)
#> Iteration:  100 / 2000 [  5%]  (Warmup)
#> Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Iteration:  300 / 2000 [ 15%]  (Warmup)
#> Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Iteration:  500 / 2000 [ 25%]  (Warmup)
#> Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Iteration:  700 / 2000 [ 35%]  (Warmup)
#> Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Iteration:  900 / 2000 [ 45%]  (Warmup)
#> Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Iteration: 1100 / 2000 [ 55%]  (Sampling)
#> Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Iteration: 1300 / 2000 [ 65%]  (Sampling)
#> Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Iteration: 1500 / 2000 [ 75%]  (Sampling)
#> Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Iteration: 1700 / 2000 [ 85%]  (Sampling)
#> Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Iteration: 1900 / 2000 [ 95%]  (Sampling)
#> Iteration: 2000 / 2000 [100%]  (Sampling)
#>
#>  Elapsed Time: 0.012149 seconds (Warm-up)
#>                0.018859 seconds (Sampling)
#>                0.031008 seconds (Total)
#>
#> method = sample (Default)
#>   sample
#>     num_samples = 1000 (Default)
#>     num_warmup = 1000 (Default)
#>     save_warmup = 0 (Default)
#>     thin = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       gamma = 0.050000000000000003 (Default)
#>       delta = 0.80000000000000004 (Default)
#>       kappa = 0.75 (Default)
#>       t0 = 10 (Default)
#>       init_buffer = 75 (Default)
#>       term_buffer = 50 (Default)
#>       window = 25 (Default)
#>     algorithm = hmc (Default)
#>       hmc
#>         engine = nuts (Default)
#>           nuts
#>             max_depth = 10 (Default)
#>         metric = diag_e (Default)
#>         metric_file = (Default)
#>         stepsize = 1 (Default)
#>         stepsize_jitter = 0 (Default)
#> id = 2
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b02199f3176.data.R
#> init = 2 (Default)
#> random
#>   seed = 124
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#>
#> Gradient evaluation took 1.8e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> Iteration:    1 / 2000 [  0%]  (Warmup)
#> Iteration:  100 / 2000 [  5%]  (Warmup)
#> Iteration:  200 / 2000 [ 10%]  (Warmup)
#> Iteration:  300 / 2000 [ 15%]  (Warmup)
#> Iteration:  400 / 2000 [ 20%]  (Warmup)
#> Iteration:  500 / 2000 [ 25%]  (Warmup)
#> Iteration:  600 / 2000 [ 30%]  (Warmup)
#> Iteration:  700 / 2000 [ 35%]  (Warmup)
#> Iteration:  800 / 2000 [ 40%]  (Warmup)
#> Iteration:  900 / 2000 [ 45%]  (Warmup)
#> Iteration: 1000 / 2000 [ 50%]  (Warmup)
#> Iteration: 1001 / 2000 [ 50%]  (Sampling)
#> Iteration: 1100 / 2000 [ 55%]  (Sampling)
#> Iteration: 1200 / 2000 [ 60%]  (Sampling)
#> Iteration: 1300 / 2000 [ 65%]  (Sampling)
#> Iteration: 1400 / 2000 [ 70%]  (Sampling)
#> Iteration: 1500 / 2000 [ 75%]  (Sampling)
#> Iteration: 1600 / 2000 [ 80%]  (Sampling)
#> Iteration: 1700 / 2000 [ 85%]  (Sampling)
#> Iteration: 1800 / 2000 [ 90%]  (Sampling)
#> Iteration: 1900 / 2000 [ 95%]  (Sampling)
#> Iteration: 2000 / 2000 [100%]  (Sampling)
#>
#>  Elapsed Time: 0.011733 seconds (Warm-up)
#>                0.019533 seconds (Sampling)
#>                0.031266 seconds (Total)
#>
# Call CmdStan's bin/stansummary
fit_mcmc$summary()
#> Running bin/stansummary \
#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-1.csv \
#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-sample-2.csv
#> Inference for Stan model: bernoulli_model
#> 2 chains: each with iter=(1000,1000); warmup=(0,0); thin=(1,1); 2000 iterations saved.
#>
#> Warmup took (0.012, 0.012) seconds, 0.024 seconds total
#> Sampling took (0.019, 0.020) seconds, 0.038 seconds total
#>
#>                Mean     MCSE   StdDev     5%   50%   95%  N_Eff  N_Eff/s    R_hat
#> lp__           -7.3  2.7e-02  7.3e-01   -8.9  -7.0  -6.8    738    19235  1.0e+00
#> accept_stat__  0.92  3.1e-03  1.3e-01   0.64  0.97   1.0   1670    43487  1.0e+00
#> stepsize__     0.92  1.7e-03  1.7e-03   0.92  0.92  0.92    1.0       26  1.4e+12
#> treedepth__     1.3  1.1e-02  4.7e-01    1.0   1.0   2.0   1968    51255  1.0e+00
#> n_leapfrog__    2.4  2.5e-02  1.0e+00    1.0   3.0   3.0   1640    42721  1.0e+00
#> divergent__    0.00  0.0e+00  0.0e+00   0.00  0.00  0.00   1000    26047      nan
#> energy__        7.8  4.0e-02  1.0e+00    6.8   7.5   9.8    647    16846  1.0e+00
#> theta          0.24  4.6e-03  1.2e-01  0.077  0.22  0.47    720    18757  1.0e+00
#>
#> Samples were drawn using hmc with nuts.
#> For each parameter, N_Eff is a crude measure of effective sample size,
#> and R_hat is the potential scale reduction factor on split chains (at
#> convergence, R_hat=1).
#>
# Run optimization method (default is Stan's LBFGS algorithm)
# and also demonstrate specifying data as a path to a file (readable by CmdStan)
my_data_file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.data.R")
fit_optim <- mod$optimize(data = my_data_file, seed = 123)
#> Warning: Optimization method is experimental and the structure of returned object may change.
#> method = optimize
#>   optimize
#>     algorithm = lbfgs (Default)
#>       lbfgs
#>         init_alpha = 0.001 (Default)
#>         tol_obj = 9.9999999999999998e-13 (Default)
#>         tol_rel_obj = 10000 (Default)
#>         tol_grad = 1e-08 (Default)
#>         tol_rel_grad = 10000000 (Default)
#>         tol_param = 1e-08 (Default)
#>         history_size = 5 (Default)
#>     iter = 2000 (Default)
#>     save_iterations = 0 (Default)
#> id = 1
#> data
#>   file = /Users/jgabry/.cmdstanr/cmdstan/examples/bernoulli/bernoulli.data.R
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-optimize-1.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#> Initial log joint probability = -9.51104
#>     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes
#>        6      -5.00402   0.000103557   2.55661e-07           1           1        9
#> Optimization terminated normally:
#>   Convergence detected: relative gradient magnitude is below tolerance
# Print estimates
fit_optim$summary()
#> Estimates from optimization:
#>    theta     lp__
#>  0.20000 -5.00402
# Run variational Bayes method (default is meanfield ADVI)
fit_vb <- mod$variational(data = standata, seed = 123)
#> Warning: Variational inference method is experimental and the structure of returned object may change.
#> method = variational
#>   variational
#>     algorithm = meanfield (Default)
#>       meanfield
#>     iter = 10000 (Default)
#>     grad_samples = 1 (Default)
#>     elbo_samples = 100 (Default)
#>     eta = 1 (Default)
#>     adapt
#>       engaged = 1 (Default)
#>       iter = 50 (Default)
#>     tol_rel_obj = 0.01 (Default)
#>     eval_elbo = 100 (Default)
#>     output_samples = 1000 (Default)
#> id = 1
#> data
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/Rtmpy8TKSY/standata-b022268471e.data.R
#> init = 2 (Default)
#> random
#>   seed = 123
#> output
#>   file = /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv
#>   diagnostic_file = (Default)
#>   refresh = 100 (Default)
#>
#> ------------------------------------------------------------
#> EXPERIMENTAL ALGORITHM:
#>   This procedure has not been thoroughly tested and may be unstable
#>   or buggy. The interface is subject to change.
#> ------------------------------------------------------------
#>
#>
#>
#> Gradient evaluation took 2.1e-05 seconds
#> 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
#> Adjust your expectations accordingly!
#>
#>
#> Begin eta adaptation.
#> Iteration:   1 / 250 [  0%]  (Adaptation)
#> Iteration:  50 / 250 [ 20%]  (Adaptation)
#> Iteration: 100 / 250 [ 40%]  (Adaptation)
#> Iteration: 150 / 250 [ 60%]  (Adaptation)
#> Iteration: 200 / 250 [ 80%]  (Adaptation)
#> Success! Found best value [eta = 1] earlier than expected.
#>
#> Begin stochastic gradient ascent.
#>   iter      ELBO   delta_ELBO_mean   delta_ELBO_med   notes
#>    100    -6.258             1.000            1.000
#>    200    -6.475             0.517            1.000
#>    300    -6.228             0.358            0.040
#>    400    -6.220             0.269            0.040
#>    500    -6.379             0.220            0.034
#>    600    -6.195             0.188            0.034
#>    700    -6.262             0.163            0.030
#>    800    -6.345             0.144            0.030
#>    900    -6.201             0.131            0.025
#>   1000    -6.307             0.119            0.025
#>   1100    -6.290             0.020            0.023
#>   1200    -6.238             0.017            0.017
#>   1300    -6.182             0.014            0.013
#>   1400    -6.167             0.014            0.013
#>   1500    -6.219             0.012            0.011
#>   1600    -6.164             0.010            0.009   MEDIAN ELBO CONVERGED
#>
#> Drawing a sample of size 1000 from the approximate posterior...
#> COMPLETED.
# Call CmdStan's bin/stansummary
fit_vb$summary()
#> Running bin/stansummary \
#> /var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T//Rtmpy8TKSY/bernoulli-stan-variational-1.csv
#> Warning: non-fatal error reading adapation data
#> Inference for Stan model: bernoulli_model
#> 1 chains: each with iter=(1001); warmup=(0); thin=(0); 1001 iterations saved.
#>
#> Warmup took (0.00) seconds, 0.00 seconds total
#> Sampling took (0.00) seconds, 0.00 seconds total
#>
#>           Mean     MCSE  StdDev     5%    50%       95%  N_Eff  N_Eff/s    R_hat
#> lp__      0.00  0.0e+00    0.00   0.00   0.00   0.0e+00    500      inf      nan
#> log_p__   -7.2  2.5e-02    0.72   -8.6   -7.0  -6.8e+00    789      inf  1.0e+00
#> log_g__  -0.54  2.9e-02    0.76   -2.1  -0.27  -1.5e-03    679      inf  1.0e+00
#> theta     0.26  4.2e-03    0.12  0.091   0.23   4.9e-01    823      inf  1.0e+00
#>
#> Samples were drawn using meanfield with .
#> For each parameter, N_Eff is a crude measure of effective sample size,
#> and R_hat is the potential scale reduction factor on split chains (at
#> convergence, R_hat=1).
#>
# For models fit using MCMC, if you like working with RStan's stanfit objects
# then you can create one with rstan::read_stan_csv()
if (require(rstan, quietly = TRUE)) {
  stanfit <- rstan::read_stan_csv(fit_mcmc$output_files())
  print(stanfit)
}
#> Inference for Stan model: bernoulli-stan-sample-1.
#> 2 chains, each with iter=2000; warmup=1000; thin=1;
#> post-warmup draws per chain=1000, total post-warmup draws=2000.
#>
#>         mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
#> theta   0.24    0.00 0.12  0.06  0.14  0.22  0.32  0.52   720    1
#> lp__   -7.32    0.03 0.73 -9.37 -7.53 -7.05 -6.81 -6.75   737    1
#>
#> Samples were drawn using NUTS(diag_e) at Mon Oct 14 21:41:31 2019.
#> For each parameter, n_eff is a crude measure of effective sample size,
#> and Rhat is the potential scale reduction factor on split chains (at
#> convergence, Rhat=1).
# }