StanSample.jl wraps cmdstan's sample method to generate draws from a Stan Language Program. It is the primary workhorse in the StanJulia ecosystem.
StanSample.jl v7 supports InferenceObjects.jl as a package extension. Use
inferencedata(model) to create an InferenceData object. See also note 1 below. An example Pluto notebook can be found here
Using both InferenceObjects.jl and the read_samples() output_format options :dimarray and :dimarrays (based on DimensionalData.jl) creates a conflict. Hence these output_format options are no longer included. See the example Pluto notebook test_dimarray.jl in StanExampleNotebooks.jl for an example of how to still use that option. At some point InferenceObjects.jl might provide an alternative way to create a stacked DataFrame and/or DimensionalData object.
I've removed BridgeStan.jl from StanSample.jl. Example Pluto notebooks in StanExampleNotebooks.jl, including bridgestan_stansample_example.jl, demonstrate how BridgeStan can be used.
You need a working installation of Stan's cmdstan, the path of which you should specify in either CMDSTAN or JULIA_CMDSTAN_HOME, e.g. in your ~/.julia/config/startup.jl include a line like:
# CmdStan setup
ENV["CMDSTAN"] = expanduser("~/.../cmdstan/") # replace with your path
Or you can define and export CMDSTAN in your .profile, .bashrc, .zshrc, etc.
For more details see this file.
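As a quick sanity check of the setup above, the sketch below looks up the two environment variables and warns if neither is set or the path does not exist. The helper name `find_cmdstan` is hypothetical (it is not part of StanSample.jl), and checking CMDSTAN before JULIA_CMDSTAN_HOME is an assumption made for illustration only.

```julia
# Hypothetical helper, not part of StanSample.jl: return the first of the two
# environment variables that is set, together with its expanded path.
# The lookup order (CMDSTAN first) is an assumption of this sketch.
function find_cmdstan(env = ENV)
    for key in ("CMDSTAN", "JULIA_CMDSTAN_HOME")
        path = get(env, key, "")
        isempty(path) || return (key, expanduser(path))
    end
    return nothing
end

found = find_cmdstan()
if found === nothing
    @warn "Neither CMDSTAN nor JULIA_CMDSTAN_HOME is set"
else
    key, path = found
    isdir(path) || @warn "$key points to a directory that does not exist" path
end
```

Running this once in a fresh Julia session, before loading StanSample.jl, shows whether the startup.jl or shell profile setting was picked up.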
See example/bernoulli.jl for a basic example. Many more examples and test scripts are available in this package and also in Stan.jl.
Multi-threading and multi-chaining behavior.
From StanSample.jl v6 onwards, two mechanisms are supported for drawing samples from multiple chains in parallel: at the C++ level (using threads) and at the Julia level (by spawning a Julia process for each chain).
The use_cpp_chains keyword argument in the call to stan_sample() determines whether chains are executed at the C++ level or at the Julia level. By default, use_cpp_chains = false.
From cmdstan-2.28.0 onwards it is possible to use C++ threads to run multiple chains by setting use_cpp_chains=true in the call to stan_sample():
rc = stan_sample(_your_model_; use_cpp_chains=true, [ data | init | ...])
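The two chain mechanisms can be compared side by side with a small wrapper. This is only a sketch: `sample_both_ways` is a hypothetical helper name, and `sm` and `data` are assumed to be an existing SampleModel and its input data. The C++ variant additionally assumes cmdstan 2.28.0 or later built with STAN_THREADS=true.

```julia
using StanSample

# Hypothetical helper: run the same model once with Julia-level chains
# (the default) and once with C++-level chains, and report whether each
# run succeeded.
function sample_both_ways(sm, data; num_chains = 4)
    # Julia level: one spawned process per chain (use_cpp_chains = false).
    rc_julia = stan_sample(sm; data, num_chains)

    # C++ level: chains run as threads inside a single cmdstan process.
    rc_cpp = stan_sample(sm; data, num_chains, use_cpp_chains = true)

    return (julia = success(rc_julia), cpp = success(rc_cpp))
end
```

Timing the two calls separately (for example with `@time`) gives a rough indication of which mechanism suits a given model and machine.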
To enable multithreading in
cmdstan specify this before the build process of
cmdstan, i.e. before running
make -j9 build. I typically create a
path_to_my_cmdstan_directory/make/local file containing
STAN_THREADS=true. You can see an example in
A default number of chains applies in either case; see ??stan_sample for all keyword arguments. Internally, num_chains will be copied to either num_cpp_chains or num_julia_chains.
Currently I do not suggest using both C++ and Julia level chains. Based on the value of use_cpp_chains (true or false), the stan_sample() method will set either num_cpp_chains=num_chains; num_julia_chains=1 or num_julia_chains=num_chains; num_cpp_chains=1.
This default behavior can be disabled through the positional check_num_chains argument in the call to stan_sample().
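The rule for distributing num_chains can be sketched as a small pure function. This illustrates the behavior described above and is not the package's internal code; the name `split_chains` is hypothetical, and the Julia-level branch (num_julia_chains=num_chains with a single C++ chain) is inferred by symmetry from the C++ case.

```julia
# Hypothetical illustration of how num_chains is copied to exactly one of
# the two chain counts, depending on use_cpp_chains.
function split_chains(num_chains::Integer, use_cpp_chains::Bool)
    if use_cpp_chains
        return (num_cpp_chains = num_chains, num_julia_chains = 1)
    else
        # Assumed mirror image of the C++ case.
        return (num_cpp_chains = 1, num_julia_chains = num_chains)
    end
end

split_chains(4, true)   # (num_cpp_chains = 4, num_julia_chains = 1)
split_chains(4, false)  # (num_cpp_chains = 1, num_julia_chains = 4)
```

Either way, the product of the two counts equals num_chains, which is why mixing both levels is not needed to get the requested number of chains.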
Threads on C++ level can be used in multiple ways, e.g. to run separate chains and to speed up certain operations. By default StanSample.jl's SampleModel sets the C++ num_threads to 4.
See the RedCardsStudy example graphs (updated for cmdstan-2.29.0) in Stan.jl and here for more details, in particular on enabling threads with or without TBB on Intel, and for some indications of the performance on Apple's M1/ARM processor running natively (not using Rosetta and without Intel's TBB).
In some cases I have seen performance advantages using both Julia threads and C++ threads, but too many combined threads certainly do not help. Note that if you only want 1000 draws (using 1000 warmup samples for tuning), multiple chains (C++ or Julia) do not help.
This package is registered. It can be installed with:
pkg> add StanSample.jl
Use this package like this:
See the docstrings (in particular
??StanSample) for more help.
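A minimal end-to-end sketch, assuming cmdstan is installed and its path is configured as described earlier. The Bernoulli program and the data values are illustrative choices for this sketch, not taken from this README; the Stan program uses the `array[N]` syntax of recent cmdstan releases.

```julia
using StanSample
using Statistics

# Illustrative Stan program: a Bernoulli likelihood with a flat Beta prior.
bernoulli_model = "
data {
  int<lower=1> N;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  theta ~ beta(1, 1);
  y ~ bernoulli(theta);
}
";

# Illustrative data: 3 successes in 10 trials.
data = Dict("N" => 10, "y" => [0, 1, 0, 1, 0, 0, 0, 0, 0, 1])

# Create (and compile) the model, then draw samples.
sm = SampleModel("bernoulli", bernoulli_model)
rc = stan_sample(sm; data)

if success(rc)
    # Read the draws into a DataFrame, one column per parameter.
    df = read_samples(sm, :dataframe)
    println("Posterior mean of theta: ", mean(df.theta))
end
```

Other output formats can be requested through the second argument of read_samples(); see its docstring for the options available in the installed version.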
- Switch to cmdstan.2.32.0 for testing
- Removed BridgeStan extension
- Updated column types for sample_stats (NamedTuples and DataFrames)
- InferenceObjects.jl support.
- Conditional support for BridgeStan.
- Reduced support for the :dimarray and :dimarrays options in read_samples().
- Support for InferenceObjects v0.3.
- tmp directories created during testing have been removed from the repo.
- Support for BridgeStan v1.0 has been dropped.
- Moved InferenceObjects behind Requires
- Added inferencedata3()
- Added option to enable logging in the terminal (thanks to @FelixNoessler)
Version 6.13.0 - 6.13.5
- Many more (minor and a bit more) updates to
- Updates to BridgeStan (more to be expected soon)
- Fix for chain numbering when using CPP threads (thanks to @apinter)
- Switched to use cmdstan-2.32.0 for testing
- Updates to Examples_Notebooks (in particular now using both
- Dropped support for read_samples(m, :dimarray) as this conflicted with InferenceData
- Added experimental version of inferencedata(). See example in ./test/test_inferencedata.jl
- Added InferenceObjects.jl as a dependency
- Dropped MonteCarloMeasurements.jl as a dependency (still supported using Requires)
- Dropped MCMCChains.jl as a dependency (still supported using Requires)
- Dropped AxisKeys.jl as a dependency
- Add sig_figs field to SampleModel (thanks to Andrew Radcliffe).
This change enables the user to control the number of significant digits which are preserved in the output. sig_figs=6 is the default cmdstan option, which is what StanSample has been defaulting to.
Typically, a user should prefer to generate outputs with sig_figs=18 so that the f64's are uniquely identified. It might be wise to make such a recommendation in the documentation, but I suppose that casual users would complain about the correspondingly increased .csv sizes (and subsequent read times).
- Dropped conversion to Symbols in read_csv_files() if internals are requested (
- Added InferenceObjects as a dependency.
This is part of the work with Seth Axen to enable working with InferenceData objects in a future release (probably v6.12).
- Fix bridge_path in SampleModel.
- Support for BridgeStan as a dependency of StanSample.jl (Thanks to Seth Axen)
- Support for the updated version of BridgeStan.
- A much better test has been added for multidimensional input arrays thanks to Andy Pohl (
- More general handling of Array input data to cmdstan if the Array has more than 2 dimensions.
- Experimental support for BridgeStan.
- For chains read in as either a :dataframe or a :nesteddataframe the function matrix(...) has been replaced by array(...). Depending on the eltype of the requested column, array will return a Vector, a Matrix or an Array with 3 dimensions.
- The function describe() has been added, which returns a DataFrame with results based on Stan's stansummary executable.
- A new method has been added to DataFrames.getindex to extract cells in stansummary DataFrame, e.g. ss1_1[:a, :ess].
Version 6.8.0 (nesteddataframe is experimental!)
- Added :nesteddataframe option to read_samples(). Maybe useful if cmdstan returns vectors or matrices.
- Extended the matrix() function to matrix(df, Symbol).
- Drops support for creating R files.
- Requires StanBase 4.7.0
- Updated Redcardsstudy results for cmdstan-2.29.0
- Switch to cmdstan-2.29.0 testing.
- Better handling of .csv chain retrieval in read_csv_files.
- Revert back to by default use Julia level chains.
- Documentation improvements.
- Modified (simplified?) use of num_chains to define the number of chains at either the C++ or the Julia level, based on the use_cpp_chains keyword argument to stan_sample().
- Switch to C++ threads by default.
- Use JSON3.jl for data.json and init.json as replacement for data.r and init.r files.
- The function read_generated_quantities() has been dropped.
- The function stan_generate_quantities() now returns a DataFrame.
Version 5.4 - 5.6
- Full usage of num_threads and num_cpp_threads
Version 5.3.1 & 5.3.2
- Drop the use of the STAN_NUM_THREADS environment variable in favor of the keyword num_threads in stan_sample(). Default value is 4.
- Enable local multithreading. Local as cmdstan needs to be built with STAN_THREADS=true (see make/local examples).
- Switch to using the CMDSTAN environment variable
- Testing with conda based install (Windows, but also other platforms)
- Docs updates.
- Fix for DimensionalData v0.19.1 (@dim no longer exported)
- Added DataFrame parameter blocking option.
- Keyword based SampleModel and stan_sample().
- Dropped dependency on StanBase.
- Needs cmdstan 2.28.1 (for num_threads).
- tmpdir is now a positional argument in SampleModel.
- Refactor src dir (add
- stan_sample() is now an alias for stan_run().
- Added keywords seed and n_chains to stan_sample().
- SampleModel no longer uses shared fields (prep work for v5).
- Minor updates
- Added test for MCMCChains
- The addition of :dimarray and :dimarrays output_format (see ?read_samples).
- No longer re-exporting many previously exported packages.
- The use of Requires.jl to enable most output_format options.
- All example scripts have been moved to Stan.jl (because of item 3).
Version 4.0.0 (BREAKING RELEASE!)
- Make KeyedArray chains the read_samples() default output.
- Drop the output_format kwarg, e.g.:
- Default output format is KeyedArray chains, i.e.:
chns = read_samples(model).
- Introduction of Tables.jl interface as an output_format option (
- Overloading Tables.matrix to group a variable in Stan's output file as a matrix.
- Re-used code in read_csv_files() for generated_quantities.
- The read_samples() method now consistently applies keyword arguments start and chains.
- The table for each chain output_format is :tables.
- Thanks to the help of John Wright (@jwright11), all StanJulia packages have been tested on Windows. Most functionality works, with one exception: Stansummary.exe fails on Windows if warmup samples have been saved.
- By default read_samples(model) will return a NamedTuple with all chains appended. output_format=:namedtuples will provide a NamedTuple with separate chains.
- Thanks to @yiyuezhuo, a function extract has been added to simplify grouping variables into a NamedTuple.
- The read_samples() output_format argument has been extended with an option to request conversion to a NamedTuple.
- Dropped the use of pmap in StanBase