Table Interface

This page provides further details on the interface of ReadStatTable.

ReadStatTables.ReadStatTable (Type)
ReadStatTable{Cols} <: Tables.AbstractColumns

A Tables.jl-compatible column table that collects data read from or written to a Stata, SAS or SPSS file processed with the ReadStat C library. File-level and variable-level metadata can be retrieved and modified via methods compatible with DataAPI.jl.

For a ReadStatTable constructed by readstat, Cols is either ReadStatColumns or ChainedReadStatColumns, depending on whether multiple threads are used for parsing the data file. For a ReadStatTable constructed for writestat, Cols may be the column table type of any Tables.jl-compatible table. See also ReadStatMeta and ReadStatColMeta for the included metadata.


Data Columns

As a subtype of Tables.AbstractColumns, commonly used methods including those defined in Tables.jl are implemented for ReadStatTable.

julia> using ReadStatTables, Tables
julia> tb = readstat("data/sample.dta")
5×7 ReadStatTable:
 Row │  mychar    mynum      mydate                dtime         mylabl        ⋯
     │ String3  Float64       Date?            DateTime?  Labeled{Int8}  Label ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │       a      1.1  2018-05-06  2018-05-06T10:10:10           Male        ⋯
   2 │       b      1.2  1880-05-06  1880-05-06T10:10:10         Female        ⋯
   3 │       c  -1000.3  1960-01-01  1960-01-01T00:00:00           Male        ⋯
   4 │       d     -1.4  1583-01-01  1583-01-01T00:00:00         Female        ⋯
   5 │       e   1000.3     missing              missing           Male        ⋯
                                                               2 columns omitted

A column can be accessed either by name or by position via multiple methods:

julia> tb.mynum
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> tb[:mynum]
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> Tables.getcolumn(tb, :mynum)
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> tb[2]
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

julia> Tables.getcolumn(tb, 2)
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3

To check whether a column is in a ReadStatTable:

julia> haskey(tb, :mynum)
true

julia> haskey(tb, 2)
true

To check the number of rows in a ReadStatTable:

julia> Tables.rowcount(tb)
5

julia> size(tb, 1)
5

To check the number of columns in a ReadStatTable:

julia> length(tb)
7

julia> size(tb, 2)
7

Iterating a ReadStatTable directly results in iteration across columns:

julia> for col in tb
           println(eltype(col))
       end
String3
Float64
Union{Missing, Date}
Union{Missing, DateTime}
LabeledValue{Union{Missing, Int8}, Union{Char, Int32}}
LabeledValue{Union{Missing, Int8}, Union{Char, Int32}}
DateTime

Data Values

In addition to retrieving the data columns, it is possible to directly retrieve and modify individual data values via getindex and setindex!:

julia> tb[1,1]
"a"

julia> tb[1,1] = "f"
"f"

julia> tb[1,1]
"f"

julia> tb[1,:mylabl]
1

julia> tb[1,:mylabl] = 2
2

julia> tb[1,:mylabl]
2

julia> tb[1,:mydate]
21310.0

julia> tb[1,:dtime]
1.84122061e12

Notice that for data columns with value labels, these methods only deal with the underlying values and disregard the value labels. Similarly, for data columns with a date/time format, the numerical values instead of the converted Date/DateTime values are returned.
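
To see how the raw numbers relate to the displayed dates, the conversion can be reproduced by hand. The sketch below is illustrative only and is not how ReadStatTables performs the conversion internally. It assumes the usual Stata conventions for a .dta file: date values count days since 1960-01-01, and datetime values count milliseconds since 1960-01-01T00:00:00. The variable names are hypothetical, and the raw values are copied from the output above.

```julia
using Dates

# Assumption: Stata's epoch for both date and datetime formats is 1960-01-01.
stata_epoch = Date(1960, 1, 1)

# Raw values as returned by tb[1,:mydate] and tb[1,:dtime] above.
raw_date  = 21310.0        # days since the epoch
raw_dtime = 1.84122061e12  # milliseconds since the epoch

# Both raw values are whole numbers, so the integer conversions are exact.
d  = stata_epoch + Day(Int(raw_date))
dt = DateTime(stata_epoch) + Millisecond(Int64(raw_dtime))

println(d)   # 2018-05-06
println(dt)  # 2018-05-06T10:10:10
```

The results match the first row of the mydate and dtime columns as displayed for tb. Other file formats or date/time formats may use a different epoch or unit, so the constants here should not be reused without checking the format of the column.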