TAXSIM is a program of the National Bureau of Economic Research (NBER) which calculates liabilities under US federal and state income tax laws. It can be accessed by uploading tax filer information to the NBER's TAXSIM server. The program then computes a number of variables (income taxes, tax credits, etc.) and returns them.
Taxsim.jl
exchanges data between the Julia workspace and the server. Its function taxsim32
supports the latest TAXSIM version 32. Future versions will be included.
Here is a short video tutorial I gave at JuliaCon2021. It illustrates how to use Taxsim.jl
with Census and American Community Survey data (downloaded from IPUMS).
Daniel Feenberg develops and maintains TAXSIM. He and his collaborators provide helpful materials including codes to prepare input files from household datasets (CPS, SCF, PSID).
Reach out to Daniel with questions on TAXSIM and follow his request on citation (see bottom of this webpage).
Taxsim.jl
can be installed via Julia's package manager using one of two options:
- REPL:
] add Taxsim
- Pkg functions:
using Pkg; Pkg.add("Taxsim")
Before using taxsim32
, please make yourself familiar with Internet TAXSIM 32. Submit a few individual observations and upload an entire csv file.
taxsim32(df; kwargs...)
df
has to be a DataFrame object with at least one observation.- Included columns have to be named exactly as in the Internet TAXSIM 32 variable list (bold names after boxes) but can be in any order.
taxsim32
returns typos and case errors. - Non-provided input variables are set to zero by the TAXSIM server but
" "
(blanks as strings) ormissing
lead to non-response as the server only accepts Integers or Floats.taxsim32
returns type errors.
- Included columns have to be named exactly as in the Internet TAXSIM 32 variable list (bold names after boxes) but can be in any order.
connection
: choose either"SSH"
or"FTP"
."SSH"
issues a system curl command while"FTP"
uses the FTPClient Package. Defaults to"SSH"
(which is faster).full
: request the full list of TAXSIM return variables v1 to v45. Defaults tofalse
which returns v1 to v9.long_names
: name all return variables with their long TAXSIM names. Defaults tofalse
which returns abbreviated names for v1 to v9 and no names for v10 to v45.
- Data frame with requested TAXSIM return variables. Column types are either Integer or Float.
- If
df
does not includestate
or ifstate = 0
, the data frame returned by afull
request does not include v30 to v41. - Observations are ordered as in
df
sohcat(df, df_output, makeunique=true)
merges all variables of the input and output data frames.
using DataFrames, Taxsim
df_small_input = DataFrame(year=1980, mstat=2, ltcg=100000)
1×3 DataFrame
│ Row │ year │ mstat │ ltcg │
│ │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼────────┤
│ 1 │ 1980 │ 2 │ 100000 │
df_small_output_default = taxsim32(df_small_input)
1×9 DataFrame
│ Row │ taxsimid │ year │ state │ fiitax │ siitax │ fica │ frate │ srate │ ficar │
│ │ Float64 │ Int64 │ Int64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼──────────┼───────┼───────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ 1 │ 0.0 │ 1980 │ 0 │ 10920.0 │ 0.0 │ 0.0 │ 20.0 │ 0.0 │ 12.0 │
df_small_output_full = taxsim32(df_small_input, connection="FTP", full=true)
1×29 DataFrame
│ Row │ taxsimid │ year │ state │ fiitax │ siitax │ fica │ frate │ srate │ ficar │ v10 │ v11 │ ... | v29 │ v42 │ ... | v45 │
│ │ Float64 │ Int64 │ Int64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ ... │ Float64 │ Float64 | ... | Float64 |
├─────┼──────────┼───────┼───────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────┼─────────┼─────────┼─────┼─────────┼
│ 1 │ 0.0 │ 1980 │ 0 │ 10920.0 │ 0.0 │ 0.0 │ 20.0 │ 0.0 │ 12.26 │ 40000.0 │ 0.0 │ ... | 0.0 │ 0.0 | ... | 0.0 |
df_small_output_names = taxsim32(df_small_input, long_names=true)
1×9 DataFrame
│ Row │ Case ID │ Year │ State │ Federal income tax liability including capital gains rates, surtaxes, AMT and refundable and non-refundable credits │ ... │ federal marginal rate │
│ │ Float64 │ Int64 │ Int64 │ Float64 │ ... │ Float64 │
├─────┼─────────┼───────┼───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────┼─
│ 1 │ 0.0 │ 1980 │ 0 │ 10920.0 │ ... │ 20.0 │
N = 10000
df_small_stateN = DataFrame(year=repeat([1980],inner=N), mstat=repeat([2],inner=N), ltcg=repeat([100000],inner=N), state=repeat([1],inner=N))
df_small_stateN_out = taxsim32(df_small_stateN)
10000×9 DataFrame
Row │ taxsimid year state fiitax siitax fica frate srate ficar
│ Float64 Int64 Int64 Float64 Float64 Float64 Float64 Float64 Float64
───────┼──────────────────────────────────────────────────────────────────────────────
1 │ 1.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
2 │ 2.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
3 │ 3.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
4 │ 4.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
9998 │ 9998.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
9999 │ 9999.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
10000 │ 10000.0 1980 1 10920.0 1119.0 0.0 20.0 4.0 12.0
Expect three different kinds of errors
- Input Error Adjust
df
so it meets the required column types and names. - Connection Error Indicates that
taxsim32
cannot connect to the TAXSIM server. Try a different connection option. If this does not help, check your internet and network settings and contact your network administrator - you're probably behind a restrictive firewall. - Server Error Returned from the TAXSIM server (error message begins with "TAXSIM: ... "). Either a faulty
df
passed the input tests or TAXSIM cannot compute the tax variables for some other reason which the error message hopefully helps to identify. Example: "TAXSIM: Non-joint return with 2 wage-earners"
Please file an issue if you experience problems with large input data frames (server non-response, truncated return data frames, etc).
taxsim32
currently returns marginal tax rates computed with respect to taxpayer earnings. Marginal rates for "Wage Income", "Spouse Earning", etc. will be included as keyword options in future releases.- HTTP connection will be included as another connection option in future releases.