Table Interface
This page provides further details on the interface of ReadStatTable.
ReadStatTables.ReadStatTable — Type
ReadStatTable{Cols} <: Tables.AbstractColumns
A Tables.jl-compatible column table that collects data read from or written to a Stata, SAS or SPSS file processed with the ReadStat C library. File-level and variable-level metadata can be retrieved and modified via methods compatible with DataAPI.jl. For a ReadStatTable constructed by readstat, Cols is either ReadStatColumns or ChainedReadStatColumns depending on whether multiple threads are used for parsing the data file. For a ReadStatTable constructed for writestat, Cols is allowed to be a column table type for any Tables.jl-compatible table. See also ReadStatMeta and ReadStatColMeta for the included metadata.
Data Columns
Because ReadStatTable is a subtype of Tables.AbstractColumns, it implements the commonly used methods for column tables, including those defined in Tables.jl.
julia> using ReadStatTables, Tables

julia> tb = readstat("data/sample.dta")
5×7 ReadStatTable:
 Row │  mychar    mynum      mydate                dtime         mylabl        ⋯
     │ String3  Float64       Date?            DateTime?  Labeled{Int8}  Label ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │       a      1.1  2018-05-06  2018-05-06T10:10:10           Male        ⋯
   2 │       b      1.2  1880-05-06  1880-05-06T10:10:10         Female        ⋯
   3 │       c  -1000.3  1960-01-01  1960-01-01T00:00:00           Male        ⋯
   4 │       d     -1.4  1583-01-01  1583-01-01T00:00:00         Female        ⋯
   5 │       e   1000.3     missing              missing           Male        ⋯
                                                               2 columns omitted
A column can be accessed either by name or by position via multiple methods:
julia> tb.mynum
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> tb[:mynum]
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> Tables.getcolumn(tb, :mynum)
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> tb[2]
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> Tables.getcolumn(tb, 2)
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3
To check whether a column is in a ReadStatTable:
julia> haskey(tb, :mynum)
true

julia> haskey(tb, 2)
true
To check the number of rows in a ReadStatTable:
julia> Tables.rowcount(tb)
5

julia> size(tb, 1)
5
To check the number of columns in a ReadStatTable:
julia> length(tb)
7

julia> size(tb, 2)
7
Iterating a ReadStatTable directly results in iteration across columns:
julia> for col in tb
           println(eltype(col))
       end
String3
Float64
Union{Missing, Date}
Union{Missing, DateTime}
LabeledValue{Union{Missing, Int8}, Union{Char, Int32}}
LabeledValue{Union{Missing, Int8}, Union{Char, Int32}}
DateTime
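Since iteration yields only the columns, it can be convenient to pair each column with its name through the generic Tables.jl accessors. The following is a minimal sketch rather than part of ReadStatTables: the helper name column_eltypes is hypothetical, and a NamedTuple of vectors stands in for a ReadStatTable so that the example runs without a data file. Because ReadStatTable is a Tables.AbstractColumns, the same helper should accept it as well.

```julia
using Tables

# Hypothetical helper (not part of ReadStatTables or Tables.jl):
# returns name => element type pairs for any Tables.jl column table.
function column_eltypes(tbl)
    cols = Tables.columns(tbl)
    return [name => eltype(Tables.getcolumn(cols, name))
            for name in Tables.columnnames(cols)]
end

# Stand-in data (an assumption): a NamedTuple of vectors is itself
# a valid Tables.jl column table.
demo = (mychar = ["a", "b"], mynum = [1.1, 1.2])

schema = column_eltypes(demo)
println(schema)
```

Relying on Tables.columnnames and Tables.getcolumn instead of type-specific indexing keeps the helper usable with any Tables.jl-compatible source.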
Data Values
In addition to retrieving the data columns, it is possible to directly retrieve and modify individual data values via getindex and setindex!:
julia> tb[1,1]
"a"

julia> tb[1,1] = "f"
"f"

julia> tb[1,1]
"f"

julia> tb[1,:mylabl]
1

julia> tb[1,:mylabl] = 2
2

julia> tb[1,:mylabl]
2

julia> tb[1,:mydate]
21310.0

julia> tb[1,:dtime]
1.84122061e12
Note that for data columns with value labels, these methods only deal with the underlying values and disregard the value labels. Similarly, for data columns with a date/time format, the underlying numerical values are returned rather than the converted Date/DateTime values.
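The raw numbers returned for mydate and dtime can be related back to the displayed Date/DateTime values by hand. The sketch below uses only the Dates standard library and assumes the usual Stata conventions for this .dta file: daily dates count days since 1960-01-01 and datetimes count milliseconds since 1960-01-01T00:00:00. The names STATA_EPOCH, stata_date and stata_datetime are hypothetical, this is not how ReadStatTables performs the conversion internally, and other formats (or SAS and SPSS files) use different units and epochs.

```julia
using Dates

# Assumed Stata epoch for the daily and millisecond datetime formats
const STATA_EPOCH = Date(1960, 1, 1)

# Hypothetical helpers: convert raw numerical values to Date/DateTime.
# Missing values are not handled in this sketch.
stata_date(days::Real) = STATA_EPOCH + Day(round(Int, days))
stata_datetime(ms::Real) = DateTime(STATA_EPOCH) + Millisecond(round(Int64, ms))

# Raw values taken from tb[1,:mydate] and tb[1,:dtime] above
d = stata_date(21310.0)
dt = stata_datetime(1.84122061e12)

println(d)   # 2018-05-06
println(dt)  # 2018-05-06T10:10:10
```

Both results agree with the first row of the table as displayed by readstat.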