RegressionAndOtherStories.jl v0.10
Purpose (once completed)
RegressionAndOtherStories.jl contains supporting (Julia) functions and the data files used in "Regression and Other Stories" by Andrew Gelman, Jennifer Hill and Aki Vehtari.
The package is also used in project SR2StanPluto.jl v9+, a revised implementation of the Statistical Rethinking support functions using Makie.jl, CausalInference.jl and GraphViz.jl.
Contents
The supporting functions are intended for use in (currently) three Julia projects, all also under development: ROSStanPluto.jl, ROSTuringPluto.jl and SR2StanPluto.jl.
All data files are in .csv format and located in the data directory.
If RegressionAndOtherStories.jl is loaded, the files can be read in as a DataFrame using:
hibbs = CSV.read(ros_datadir("ElectionsEconomy", "hibbs.csv"), DataFrame)
For that purpose ros_datadir() is exported.
If needed, Stata files (.dat) have been converted to .csv files using the scripts in the scripts directory, e.g. see scripts/hdi.jl. To access the Stata files in the R package ROS-Examples, RegressionAndOtherStories.jl expects the environment variable JULIA_ROS_HOME to be defined, e.g.:
ENV["JULIA_ROS_HOME"] = expanduser("~/Projects/R/ROS-Examples")
R itself does not necessarily need to be installed for this to work.
If desired, the Stata files can also be used directly, as the Stata to .csv conversion scripts mentioned above show.
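As a rough sketch of how the environment variable can be used to build paths into a local ROS-Examples checkout: the helper `ros_examples_path` below is hypothetical (it is not part of RegressionAndOtherStories.jl), and the `HDI/data/hdi.dat` location is an assumption about the layout of the ROS-Examples repository.

```julia
# Point JULIA_ROS_HOME at a local copy of ROS-Examples (path taken from the text above).
ENV["JULIA_ROS_HOME"] = expanduser("~/Projects/R/ROS-Examples")

# Hypothetical helper, not exported by the package: join path components
# onto the directory stored in JULIA_ROS_HOME.
function ros_examples_path(parts::AbstractString...)
    home = get(ENV, "JULIA_ROS_HOME", "")
    isempty(home) && error("JULIA_ROS_HOME is not set")
    return joinpath(home, parts...)
end

# Assumed file location inside ROS-Examples; adjust to the file you need.
path = ros_examples_path("HDI", "data", "hdi.dat")
isfile(path) || @warn "File not found; check JULIA_ROS_HOME" path
```

Only Base functions are used here, so the check works whether or not R is installed.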
Approach
RegressionAndOtherStories.jl v0.9+ uses Julia's package extension mechanism. In particular, Turing, Stan, Makie, GraphViz and CausalInference are included as extensions and loaded only when needed.
Over time I might minimize the use of AlgebraOfGraphics.jl. It is a nice package but also a bit more difficult to tailor (compared to Makie/GLMakie).
In working on this I will move over (and likely update) several important functions from StatisticalRethinking.jl as well, e.g. link().
I expect I can use ParetoSmoothedImportanceSampling.jl as is but will take another look at PSIS.jl and ParetoSmooth.jl when revising the relevant chapters.
Project maintenance for Pluto notebooks
The file src/Maintenance/reset_notebooks.jl contains a function I use in the Pluto notebook projects (SR2StanPluto, ROSStanPluto, etc.). The function can make the following changes to selected notebooks:
- If it finds a line starting with Pkg.activate, it disables that line if reset_activate = true.
- If it finds a line starting with #Pkg.activate, it enables that line if set_activate = true.
- It removes the Project and Manifest sections of all notebooks selected for reset.
See the maintenance notebooks in projects such as SR2StanPluto and ROSStanPluto.
Using Pkg.activate(...) is useful if your workflow uses many different notebooks.
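The enabling and disabling of Pkg.activate lines described above can be pictured as a line-by-line transformation. This is a minimal illustration, not the code in reset_notebooks.jl: the function name `toggle_activate` and its signature are hypothetical, only the keyword names reset_activate and set_activate come from the description above, and removal of the Project and Manifest sections is left out.

```julia
# Hypothetical sketch; the actual maintenance function may work differently.
function toggle_activate(lines::Vector{String};
                         reset_activate::Bool=false,
                         set_activate::Bool=false)
    map(lines) do line
        if reset_activate && startswith(line, "Pkg.activate")
            "#" * line        # disable: comment the line out
        elseif set_activate && startswith(line, "#Pkg.activate")
            line[2:end]       # enable: drop the leading '#'
        else
            line              # leave every other line untouched
        end
    end
end

notebook = [
    "using Pkg",
    "Pkg.activate(expanduser(\"~/.julia/dev/SR2StanPluto\"))",
    "using DataFrames",
]

disabled = toggle_activate(notebook; reset_activate=true)
enabled  = toggle_activate(disabled; set_activate=true)
```

With both keywords left at false the lines are returned unchanged, so a notebook is only modified when a change is explicitly requested.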
Issues, comments and questions
Please file issues, comments and questions here.
Pull requests are also welcome.
Versions
Version 0.11
- Reworked the integration with CausalInference.jl.
- Added introductory CausalInference notebooks.
Version 0.10
- Reworked the DAG struct.
- Use GraphViz with CairoMakie.
- Switched to CairoMakie instead of GLMakie.
Version 0.9
- Switch to extensions.
- Added simulate function.
- Added scale_df_cols! (scale! conflicted with Makie and other packages).
- Switching to CausalInference.jl as a replacement for StructuralCausalModels.jl.
- Possibly switching to either PSIS.jl or ParetoSmooth.jl as a replacement for ParetoSmoothedImportanceSampling.jl.
- Switched to Makie.jl and GLMakie.jl as back-end.
- Use of GraphViz.jl to display DAGs.
Versions 0.7 and 0.8
- Primarily following package updates.
Version 0.6.1
- Changed back to use DataFrames directly as basis for summaries.
- Use getindex to access single elements in summary DataFrames (first argument taken from the parameters column in df).
- For Stan use array() to group nested columns into a matrix. For Turing continue to use nested_column_to_array.
Release 0.5.0
- Added DataFrame operator function (not exported).
- Added errorbars_mean and errorbars_draws.
- Added nested_column_to_array.
- Made model_summary String/Symbol agnostic.
Release 0.4.5
- Doc fixes by Pietro Monticone
- Added model_summary(::SampleModel).
Release 0.4.x
- Model_summary and plot_chains (accept both Symbol and Strings)
- Focus on Appendices A and B.
- Focus on chapters 4, 5, 6, 7
Versions 0.3.6 - 0.3.10
- Fine tuning working with ros_functions and ros_notebooks.
Release 0.3.5
- Added maintenance functions for a (large) set of notebooks.
Release 0.3.4
- Is tagging using JuliaHub with setting branch name working?
Version 0.3.3
- Add initial version of notebook maintenance routines.
- Tag this version (if not done by TagBot)
Version 0.3.2
- Fix Makie and AoG glue scripts.
Version 0.3.1
- StatsFuns compat entry to 1.0.
Version 0.3.0 (under development)
- Switch back to using Requires.jl
- Switch to using eachindex() where appropriate.
- Experimental versions for chapter 3.
Version 0.2.4
- Chapter 2 mostly done
- Added trankplot function
Version 0.2.0
- Support for the 5 examples from chapter 1 done.
- Added plot_chains() and model_summary() functions.
- Added Makie and AlgebraOfGraphics as dependencies.
Note: Source files for Makie/AoG are all in src/Makie/ to simplify moving those to a separate repo (not my intention right now, but still).
- In sync with both ROS[Turing|Stan]Pluto projects tagged 2.3 and up.
Version 0.1.0
- Initial commit (to register the package for use in projects).
References
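Of course this package is focused on:
- Gelman, Hill, Vehtari: Regression and Other Stories
which in a sense is a major update to Gelman and Hill's Data Analysis Using Regression and Multilevel/Hierarchical Models, listed below.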
There is no shortage of other good books on Bayesian statistics. A few of my favorites are:
- Gelman, Hill: Data Analysis Using Regression and Multilevel/Hierarchical Models
- Betancourt: A Conceptual Introduction to Hamiltonian Monte Carlo
- Pearl, Glymour, Jewell: Causal Inference in Statistics: A Primer
A good book to understand most of the Julia constructs used in this book is: