Table Interface
This page provides further details on the interface of ReadStatTable.
ReadStatTables.ReadStatTable — Type
ReadStatTable{Cols} <: Tables.AbstractColumns
A Tables.jl-compatible column table that collects data read from or written to a Stata, SAS or SPSS file processed with the ReadStat C library. File-level and variable-level metadata can be retrieved and modified via methods compatible with DataAPI.jl.
For a ReadStatTable constructed by readstat, Cols is either ReadStatColumns or ChainedReadStatColumns depending on whether multiple threads are used for parsing the data file. For a ReadStatTable constructed for writestat, Cols is allowed to be a column table type for any Tables.jl-compatible table.
See also ReadStatMeta and ReadStatColMeta for the included metadata.
Data Columns
As a subtype of Tables.AbstractColumns, ReadStatTable implements the commonly used methods for column tables, including those defined in Tables.jl.
julia> using ReadStatTables, Tables
julia> tb = readstat("data/sample.dta")
5×7 ReadStatTable:
 Row │  mychar    mynum      mydate                dtime         mylabl    ⋯
     │ String3  Float64       Date?            DateTime?  Labeled{Int8}  Label ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │       a      1.1  2018-05-06  2018-05-06T10:10:10           Male        ⋯
   2 │       b      1.2  1880-05-06  1880-05-06T10:10:10         Female        ⋯
   3 │       c  -1000.3  1960-01-01  1960-01-01T00:00:00           Male        ⋯
   4 │       d     -1.4  1583-01-01  1583-01-01T00:00:00         Female        ⋯
   5 │       e   1000.3     missing              missing           Male        ⋯
                                                               2 columns omitted
A column can be accessed either by name or by position via multiple methods:
julia> tb.mynum
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3
julia> tb[:mynum]
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3
julia> Tables.getcolumn(tb, :mynum)
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3
julia> tb[2]
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3
julia> Tables.getcolumn(tb, 2)
5-element Vector{Float64}:
     1.1
     1.2
 -1000.3
    -1.4
  1000.3
To check whether a column is in a ReadStatTable:
julia> haskey(tb, :mynum)
true
julia> haskey(tb, 2)
true
To check the number of rows in a ReadStatTable:
julia> Tables.rowcount(tb)
5
julia> size(tb, 1)
5
To check the number of columns in a ReadStatTable:
julia> length(tb)
7
julia> size(tb, 2)
7
Iterating a ReadStatTable directly iterates over its columns:
julia> for col in tb
           println(eltype(col))
       end
String3
Float64
Union{Missing, Date}
Union{Missing, DateTime}
LabeledValue{Union{Missing, Int8}, Union{Char, Int32}}
LabeledValue{Union{Missing, Int8}, Union{Char, Int32}}
DateTime
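To make the column-oriented behavior above concrete, here is a minimal sketch of a toy column table written in plain Julia. ToyColumns and everything defined on it are hypothetical names made up for illustration; this is not how ReadStatTable is implemented, and it does not use Tables.jl. It only mimics the access patterns shown above: columns by name or position, haskey, length, size, and iteration over columns.

```julia
# Hypothetical toy type for illustration only; NOT the ReadStatTable implementation.
struct ToyColumns
    names::Vector{Symbol}
    cols::Vector{AbstractVector}
end

# Column access by position and by name.
# getfield is used internally because getproperty is overloaded below.
Base.getindex(t::ToyColumns, i::Int) = getfield(t, :cols)[i]
function Base.getindex(t::ToyColumns, n::Symbol)
    i = findfirst(==(n), getfield(t, :names))
    i === nothing && throw(KeyError(n))
    return getfield(t, :cols)[i]
end

# t.colname forwards to name-based lookup.
Base.getproperty(t::ToyColumns, n::Symbol) = t[n]

# Membership checks by name or by position.
Base.haskey(t::ToyColumns, n::Symbol) = n in getfield(t, :names)
Base.haskey(t::ToyColumns, i::Int) = 1 <= i <= length(getfield(t, :cols))

# length counts columns; size reports (rows, columns).
Base.length(t::ToyColumns) = length(getfield(t, :cols))
Base.size(t::ToyColumns) = (length(first(getfield(t, :cols))), length(t))
Base.size(t::ToyColumns, d::Int) = size(t)[d]

# Iteration yields one column at a time.
Base.iterate(t::ToyColumns, state::Int=1) =
    state > length(t) ? nothing : (t[state], state + 1)

t = ToyColumns([:mychar, :mynum], AbstractVector[["a", "b"], [1.1, 1.2]])
for col in t
    println(eltype(col))
end
```

With these definitions, t.mynum, t[:mynum] and t[2] all return the same vector, which is the equivalence the examples above demonstrate for ReadStatTable.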
Data Values
In addition to retrieving the data columns, it is possible to directly retrieve and modify individual data values via getindex and setindex!:
julia> tb[1,1]
"a"
julia> tb[1,1] = "f"
"f"
julia> tb[1,1]
"f"
julia> tb[1,:mylabl]
1
julia> tb[1,:mylabl] = 2
2
julia> tb[1,:mylabl]
2
julia> tb[1,:mydate]
21310.0
julia> tb[1,:dtime]
1.84122061e12
Note that for data columns with value labels, these methods operate only on the underlying values and disregard the value labels. Similarly, for data columns with a date/time format, the underlying numerical values are returned instead of the converted Date/DateTime values.
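The numerical values shown above can be related to the converted Date/DateTime values by hand. The following is a minimal sketch using only the Dates standard library. It assumes the Stata convention that dates are stored as days since 1960-01-01 and date-times as milliseconds since 1960-01-01T00:00:00; the helper names stata_days_to_date and stata_ms_to_datetime are hypothetical and are not part of ReadStatTables. Other file formats and date/time formats use different epochs and units, so this applies only to the example values from the .dta file.

```julia
using Dates

# Assumption: Stata date/time values count from 1960-01-01.
const STATA_EPOCH = DateTime(1960, 1, 1)

# Hypothetical helpers, not part of ReadStatTables.
# Daily dates are stored as a number of days since the epoch.
stata_days_to_date(x::Real) = Date(STATA_EPOCH) + Day(round(Int64, x))
# Date-times are stored as a number of milliseconds since the epoch.
stata_ms_to_datetime(x::Real) = STATA_EPOCH + Millisecond(round(Int64, x))

d = stata_days_to_date(21310.0)             # value of tb[1,:mydate] above
dt = stata_ms_to_datetime(1.84122061e12)    # value of tb[1,:dtime] above
println(d)
println(dt)
```

Under these assumptions, the two results match the first row of the mydate and dtime columns displayed earlier (2018-05-06 and 2018-05-06T10:10:10).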