Douglass.jl is a package for manipulating DataFrames in Julia using a syntax that is very similar to Stata.

    using Douglass, RDatasets
    df = dataset("datasets", "iris")
    # set the active DataFrame
    Douglass.set_active_df(:df)

    # create a variable `z` that is the sum of `SepalLength` and `SepalWidth`, for each row
    d"gen :z = :SepalLength + :SepalWidth"
    # replace `z` by the row index for the first 10 observations
    d"replace :z = _n if _n <= 10"
    # drop a variable
    d"drop :z"
    # construct the within-group mean for a subset of the observations
    d"bysort :Species : egen :z = mean(:SepalLength) if :SepalWidth .> 3.0"

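For readers who think in plain DataFrames.jl, the four commands in the example above can be read roughly as the operations sketched below. This is an illustration of the described behaviour and not Douglass's implementation: the tiny table stands in for the iris data with made-up values, the row cutoff is 2 instead of 10, and leaving rows that fail the `if` condition as `missing` is an assumption carried over from Stata.

```julia
using DataFrames, Statistics

# Small stand-in for the iris data (values are made up for illustration).
df = DataFrame(Species     = ["a", "a", "b", "b"],
               SepalLength = [5.0, 6.0, 7.0, 8.0],
               SepalWidth  = [3.5, 2.9, 3.2, 3.1])

# gen :z = :SepalLength + :SepalWidth
df.z = df.SepalLength .+ df.SepalWidth

# replace :z = _n if _n <= 2   (the column keeps its Float64 element type)
n = 1:nrow(df)
first_rows = n .<= 2
df.z[first_rows] .= n[first_rows]
z_after_replace = copy(df.z)

# drop :z
select!(df, Not(:z))

# bysort :Species : egen :z = mean(:SepalLength) if :SepalWidth .> 3.0
# Assumption: rows that fail the `if` condition stay `missing`, as in Stata.
df.z = Vector{Union{Missing, Float64}}(missing, nrow(df))
for g in groupby(df, :Species)
    keep = g.SepalWidth .> 3.0
    # `g.z` is a view into the parent column, so this writes into `df`.
    g.z[keep] .= mean(g.SepalLength[keep])
end
```

The one-line Douglass commands replace the explicit masks and the group loop, which is where most of the verbosity in the plain version comes from.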
- `generate` -- Creates a new variable and assigns the output from an expression to it.
- `replace` -- Replaces the content of a variable, but does not change the type.
- `egen` (short form) -- Creates a new variable. Operates on vectors.
- `erep` (short form) -- Analogous to `egen`, but replaces values of existing variables.
- `drop` -- Drops the specified observations (if used in conjunction with `if`) or variables (without `if`).
- `rename` -- Renames a variable.
- `sort` -- Sorts the rows of the active `DataFrame` by the specified columns.
- `reshape` -- Reshapes the active `DataFrame` between wide and long format.
- `merge` -- Merges the active `DataFrame` with another one in the local scope.
- `duplicates_drop` -- Deletes duplicate rows, optionally considering only a subset of columns.
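As a rough orientation, the table-level commands in this list have close counterparts in plain DataFrames.jl. The sketch below uses only DataFrames.jl functions and invented data. The mapping is an assumption, not a description of how Douglass implements these commands. In particular, `outerjoin` is chosen because Stata's `merge` keeps unmatched rows from both tables by default, and whether Douglass does the same is not stated here.

```julia
using DataFrames

# Invented example data with one duplicated row.
df = DataFrame(id   = [2, 1, 1, 3],
               year = [2001, 2000, 2000, 2000],
               x    = [0.2, 0.1, 0.1, 0.3])

# duplicates_drop: delete repeated rows (comparing all columns)
unique!(df)

# sort: order the rows by the given columns
sort!(df, [:id, :year])

# rename: give a column a new name
rename!(df, :x => :value)

# merge: match against a second table on `id`, keeping unmatched rows
regions = DataFrame(id = [1, 2, 3], region = ["north", "south", "east"])
merged = outerjoin(df, regions, on = :id)

# reshape to wide: one column per year, `missing` where no value exists
wide = unstack(df, :id, :year, :value)

# reshape to long: back to one row per (id, year) pair
long = stack(wide, Not(:id))
```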
See the commands documentation page for more details on the syntax of these commands.
Press the backtick (`` ` ``) to switch between the normal Julia REPL and the Douglass REPL mode.
- Better documentation of the interface will come when the package is a bit more stable. In the meantime, the Test script is probably the best introduction to the interface for those who know Stata.
- Keep in mind that this is not Stata. Here are some notable differences.
Roadmap / Todo's
- Implement more commands
- If other people find the package useful, it may be worth making the package extensible, so that other commands can be added in separate packages
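One common Julia pattern for the kind of extensibility mentioned in the last item is to dispatch on the command name wrapped in `Val`, so that another package can add a command by defining one more method. The sketch below is purely hypothetical: `MiniCommands`, `run_command`, and `execute` are invented names, and nothing here reflects how Douglass parses or runs commands.

```julia
# Hypothetical registry; none of these names come from Douglass.jl.
module MiniCommands

# Fallback method: reached when no package has defined the command.
run_command(::Val{C}, args::AbstractString) where {C} =
    error("unknown command: $C")

# Entry point: split "name rest-of-line" and dispatch on the command name.
function execute(line::AbstractString)
    parts = split(strip(line), limit = 2)
    name = Symbol(parts[1])
    args = length(parts) == 2 ? String(parts[2]) : ""
    return run_command(Val(name), args)
end

end # module

# A separate package could add a command by defining one more method.
MiniCommands.run_command(::Val{:echo}, args::AbstractString) = "echo: " * args

result = MiniCommands.execute("echo hello world")
```

The appeal of this design is that the core package needs no central list of commands. The cost is that a misspelled command is only detected at run time, when the fallback method is reached.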
Douglass.jl is named in honour of the economic historian Douglass North.