Extended StatsModels.jl @formula syntax for regression modeling.
Note that the functionality in this package is very new: please verify that the resulting schematized formulae and model coefficient names are what you expect, especially if you are combining multiple "advanced" formula features.
a / b expands to a + fulldummy(a) & b.
Numeric constants are special-cased so that / performs division, making it possible to, e.g., rescale a time variable from milliseconds to seconds in the formula:
julia> fit(MyModelType, @formula(time_in_milliseconds / 1000 ~ 1 + x), my_data)
The ^ operator generates all main effects and interactions up to the specified order. For instance, (a+b+c)^2 generates a + b + c + a&b + a&c + b&c, but not a&b&c.
NB: The presence of interaction terms within the base will result in redundant terms and is currently unsupported.
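As a rough illustration of the expansion described above, the sketch below applies a schema to a formula containing ^ and prints the resulting coefficient names. The toy table and its column names (y, a, b, c) are made up for this example, and the sketch assumes that RegressionFormulae, StatsModels, and StatsAPI are installed in mutually compatible versions.

```julia
using RegressionFormulae, StatsModels
using StatsAPI: RegressionModel

# Hypothetical toy data: one response and three continuous predictors.
data = (y = rand(10), a = rand(10), b = rand(10), c = rand(10))

f = @formula(y ~ 1 + (a + b + c)^2)

# The ^ extension is only applied when a RegressionModel context is passed.
f_applied = apply_schema(f, schema(f, data), RegressionModel)

# Should list the intercept, the three main effects, and the three two-way
# interactions, with no a & b & c term.
println(StatsModels.coefnames(f_applied.rhs))
```

Inspecting the coefficient names this way is also a convenient check that the schematized formula matches what you intended.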
Extended syntax is supported at two levels. First, RegressionFormulae.jl
defines apply_schema methods that capture calls within a @formula to the
special syntax (^, /, etc.). Second, we define methods for the
corresponding functions in Base (Base.:(^), Base.:(/), etc.) for arguments
that are <:AbstractTerm which implement the special behavior, returning the
appropriate terms. This allows the syntax to be used both within a @formula
and for constructing terms programmatically at run-time.
If using apply_schema directly, please note that you need to pass an appropriate model type as context.
Currently, the extensions here are defined for StatsAPI.RegressionModel and subtypes:
f = apply_schema(f, s, RegressionModel)
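To make the role of the context argument more concrete, here is a minimal end-to-end sketch using the nesting syntax. The toy table and its column names (y, a, b) are hypothetical, and the sketch assumes that RegressionFormulae, StatsModels, and StatsAPI are installed in mutually compatible versions.

```julia
using RegressionFormulae, StatsModels
using StatsAPI: RegressionModel

# Hypothetical toy data: `a` and `b` are categorical (string) columns.
data = (y = rand(6),
        a = repeat(["u", "v"], 3),
        b = repeat(["p", "q", "r"], 2))

f = @formula(y ~ 1 + a / b)
s = schema(f, data)

# Passing RegressionModel as the context selects the extended apply_schema
# methods, so a / b is treated as nesting rather than as a call to division.
f = apply_schema(f, s, RegressionModel)

# Print the coefficient names to check the expansion.
println(StatsModels.coefnames(f.rhs))
```

Without the model-type context, apply_schema falls back to the default StatsModels behavior and the special syntax is not expanded.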