This package is a simple port of the data repo created by @alexanderrobitzsch as part of their CDM R package.
To install this package, enter the package manager mode and run:

add CDMrdata

Once you have installed the package, you can use it just like any other Julia package:

using CDMrdata

Currently the package has only two functions:

- `load_data(data_name)`: takes `data_name` as an argument and loads the dataset as a `Dict`.
- `list_datasets()`: lists all the datasets available in the package.

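
For scripts or CI jobs where the interactive package manager mode is not available, the same installation step can be sketched with the functional `Pkg` API. This is a minimal sketch, assuming the package resolves under the name `CDMrdata` exactly as in the `add` command above; the return value of `list_datasets()` is not documented here, so the call is shown without assuming what it produces.

```julia
using Pkg

# Functional equivalent of typing `add CDMrdata` in package manager mode.
Pkg.add("CDMrdata")

using CDMrdata

# Show which datasets ship with the package.
list_datasets()
```
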
Suppose we want to load the ecpe dataset:

dat = load_data("ecpe");

To see the fields available for that dataset:

keys(dat)
KeySet for a Dict{String, Int64} with 2 entries. Keys:
  "data"
  "q.matrix"

And to access one of those fields:

dat["q.matrix"]
28×3 DataFrame
 Row │ skill1  skill2  skill3 
     │ Int32   Int32   Int32  
─────┼────────────────────────
   1 │      1       1       0
   2 │      0       1       0
   3 │      1       0       1
   4 │      0       0       1
   5 │      0       0       1
   ⋮ │      ⋮        ⋮       ⋮
  24 │      0       1       0
  25 │      1       0       0
  26 │      0       0       1
  27 │      1       0       0
  28 │      0       0       1
               18 rows omitted

| Dataset Name | Description (From CDM R Package Dev) |
|---|---|
| cdm01 | A multiple choice dataset. |
| cdm02 | Multiple choice dataset with a Q-matrix designed for polytomous attributes. |
| cdm03 | Resimulated dataset from Chiu, Koehn and Wu (2016) where the data generating model is a reduced RUM model. |
| cdm04 | Simulated dataset for the sequential DINA model (as described in Ma & de la Torre, 2016). The dataset contains 1000 persons and 12 items which measure 2 skills. |
| cdm05 | Example dataset used in Philipp, Strobl, de la Torre and Zeileis (2018). This dataset is a sub-dataset of the probability dataset in the pks package (Heller & Wickelmaier, 2013). |
| cdm06 | Resimulated example dataset from Chen and Chen (2017). |
| cdm07 | This is a resimulated dataset from the social anxiety disorder data concerning social phobia which involve 13 dichotomous questions (Fang, Liu & Ling, 2017). The simulation was based on a latent class model with five classes. The dataset was also used in Chen, Li, Liu and Ying (2017). |
| cdm08 | This is a simulated dataset involving four skills and three misconceptions for the model for simultaneously identifying skills and misconceptions (SISM; Kuo, Chen & de la Torre, 2018). The Q-matrix follows the specification in their simulation study. |
| cdm09 | This is a simulated dataset involving polytomous skills which is adapted from the empirical example (proportional reasoning data) of Chen and de la Torre (2013). |
| cdm10 | This is a simulated dataset involving a hierarchical skill structure. Skill A has four levels, skill B possesses two levels and skill C has three levels. |
| dcm | Dataset from the book 'Diagnostic Measurement' by Rupp, Templin and Henson (2010). |
| dtmr | DTMR Fraction Data (Bradshaw et al., 2014). |
| ecpe | The dataset has been used in Templin and Hoffman (2013), and Templin and Bradshaw (2014). |
| fraction1 | The dataset has been used in de la Torre (2009). |
| fraction2 | The dataset has been used in de la Torre (2009) and Henson, Templin and Willse (2009). |
| fraction3 | The dataset has been used in de la Torre (2011). |
| fraction4 | The dataset has been used in de la Torre and Douglas (2004) and Chen, Liu, Xu and Ying (2015). |
| fraction5 | This dataset was used as an example for the multiple strategy DINA model in de la Torre and Douglas (2008) and Hou and de la Torre (2014). |
| hr | Simulated data according to Ravand et al. (2013). |
| jang | Simulated dataset according to the Jang (2005) L2 reading comprehension study. |
| melab | This is a simulated dataset according to the MELAB reading study (Li, 2011; Li & Suen, 2013). Li (2011) investigated the Fusion model (RUM model) for calibrating this dataset. The dataset in this package is simulated assuming the reduced RUM model (RRUM). |
| mg | Large-scale dataset with multiple groups, survey weights and 11 polytomous items. |
| pgdina | Dataset for the estimation of the polytomous GDINA model. |
| pisa00R.ct | PISA 2000 of German students including 26 items of the reading test [Chen and de la Torre (2014)]. |
| pisa00R.cc | PISA 2000 of German students including 20 items of the reading test [Chen and Chen (2016)]. |
| sda6 | This is a simulated dataset of the SDA6 study according to information given in Jurich and Bradshaw (2014). |
| Students | This dataset contains item responses of students on scales of cultural activities (act), mathematics self concept (sc) and mathematics enjoyment (mj) from an Austrian survey of 8th grade students. |
| timss03.G8.su | This is a dataset with a subset of 23 Mathematics items from TIMSS 2003 used in Su, Choi, Lee, Choi and McAninch (2013). |
| timss07.G4.lee | This dataset is a list containing dichotomous item responses (data; information on booklet and gender included), the Q-matrix (q.matrix) and descriptions of the skills (skillinfo) used in Lee et al. (2011). |
| timss07.G4.py | This dataset uses the same items as timss07.G4.lee but employs a simplified Q-matrix with 7 skills. This Q-matrix was used in Park and Lee (2014) and Park et al. (2018). |
| timss07.G4.Qdomains | This Q-matrix data is a simplification of timss07.G4.py$q.matrix to 3 domains and involves a simple structure of skills. |
| timss11.G4.AUT | TIMSS 2011 dataset of 4668 Austrian fourth-graders. |
| timss11.G4.AUT.part | Part of timss11.G4.AUT and contains only the first three booklets (with N=1010 students). |
| timss11.G4.sa | Contains the Q-matrix used in Sedat and Arican (2015). |
| fraction.subtraction.data | Tatsuoka's (1984) fraction subtraction data set comprises responses to 𝐽=20 fraction subtraction test items from 𝑁=536 middle school students. |
| fraction.subtraction.qmatrix | The Q-matrix corresponding to Tatsuoka's (1984) fraction subtraction data set. |
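
Any name from the table is passed to `load_data` as a string. As a rough sketch of what can be done once a dataset is loaded, the snippet below continues the ecpe example shown earlier and inspects its Q-matrix. It assumes the `"q.matrix"` field comes back as a `DataFrame`, as in that output; other datasets may expose different fields, so check `keys(dat)` first. The variable names `Q`, `Qmat` and `skills_per_item` are illustrative only.

```julia
using CDMrdata

dat = load_data("ecpe")
Q = dat["q.matrix"]        # the 28×3 DataFrame from the ecpe example

size(Q)                    # (number of items, number of skills)
names(Q)                   # column names as strings

# Convert to a plain matrix for numerical work.
Qmat = Matrix(Q)

# Number of skills each item measures (row sums of the Q-matrix).
skills_per_item = vec(sum(Qmat; dims=2))
```

`size`, `names` and `Matrix` are Base functions that DataFrames.jl extends, so this sketch should not need an explicit `using DataFrames`, provided CDMrdata loads DataFrames itself.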