Metadata
File-level metadata associated with a data file are collected in a ReadStatMeta, while variable-level metadata associated with each data column are collected in ReadStatColMetas. These metadata objects are stored in a ReadStatTable along with the data columns and can be accessed via methods compatible with DataAPI.jl.
File-Level Metadata
Each ReadStatTable contains a ReadStatMeta for file-level metadata.
ReadStatTables.ReadStatMeta — Type
ReadStatMeta <: AbstractMetaDict
A collection of file-level metadata associated with a data file processed with ReadStat.
Metadata can be retrieved and modified from the associated ReadStatTable via methods compatible with DataAPI.jl. A dictionary-like interface is also available for directly working with ReadStatMeta.
Fields
row_count::Int: number of rows returned by ReadStat parser; being -1 if not available in metadata; may reflect the value set with the row_limit parser option instead of the actual number of rows in the data file.
var_count::Int: number of data columns returned by ReadStat parser.
creation_time::DateTime: timestamp for file creation.
modified_time::DateTime: timestamp for file modification.
file_format_version::Int: version number of file format.
file_format_is_64bit::Bool: indicator for 64-bit file format; only relevant to SAS.
compression::readstat_compress_t: file compression mode; only relevant to certain file formats.
endianness::readstat_endian_t: endianness of data file.
table_name::String: name of the data table; only relevant to .xpt format.
file_label::String: label of data file.
file_encoding::String: character encoding of data file.
notes::Vector{String}: notes attached to data file.
file_ext::String: file extension of data file.
To retrieve the ReadStatMeta from the ReadStatTable:
julia> metadata(tb)
ReadStatMeta:
  row count           => 5
  var count           => 7
  modified time       => 2021-04-22T21:36:00
  file format version => 118
  file label          => A test file
  file extension      => .dta
The value associated with a specific metadata key can be retrieved via:
julia> metadata(tb, "file_label")
"A test file"

julia> metadata(tb, "file_label", style=true)
("A test file", :note)
To obtain a complete list of metadata keys:
julia> metadatakeys(tb)
("row_count", "var_count", "creation_time", "modified_time", "file_format_version", "file_format_is_64bit", "compression", "endianness", "table_name", "file_label", "file_encoding", "notes", "file_ext")
Metadata contained in a ReadStatMeta can be modified, optionally with a metadata style set at the same time:
julia> metadata!(tb, "file_label", "A file label", style=:default)
ReadStatMeta:
  row count           => 5
  var count           => 7
  modified time       => 2021-04-22T21:36:00
  file format version => 118
  file label          => A file label
  file extension      => .dta
Since ReadStatMeta has a dictionary-like interface, one can also directly work with it:
julia> m = metadata(tb)
ReadStatMeta:
  row count           => 5
  var count           => 7
  modified time       => 2021-04-22T21:36:00
  file format version => 118
  file label          => A file label
  file extension      => .dta

julia> keys(m)
KeySet for a ReadStatMeta with 13 entries. Keys:
  "row_count"
  "var_count"
  "creation_time"
  "modified_time"
  "file_format_version"
  "file_format_is_64bit"
  "compression"
  "endianness"
  "table_name"
  "file_label"
  "file_encoding"
  "notes"
  "file_ext"

julia> m["file_label"]
"A file label"

julia> m["file_label"] = "A new file label"
"A new file label"

julia> copy(m)
Dict{String, Any} with 13 entries:
  "file_ext"             => ".dta"
  "file_encoding"        => ""
  "file_label"           => "A new file label"
  "var_count"            => 7
  "row_count"            => 5
  "modified_time"        => DateTime("2021-04-22T21:36:00")
  "file_format_version"  => 118
  "file_format_is_64bit" => true
  "table_name"           => ""
  "creation_time"        => DateTime("2021-04-22T21:36:00")
  "endianness"           => READSTAT_ENDIAN_LITTLE
  "compression"          => READSTAT_COMPRESS_NONE
  "notes"                => String[]
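Because copy(m) returns an ordinary Dict{String, Any}, the result can be processed with functions from Base alone. The following is a minimal sketch under stated assumptions: the dictionary is a hand-written stand-in for the copy(m) output above (the two enum-valued entries and creation_time are left out so the code runs without ReadStatTables.jl), and the helpers is_unset, informative and known_row_count are hypothetical names that are not part of the package.

```julia
using Dates

# Stand-in for the Dict returned by `copy(m)`; values are taken from the output above.
meta = Dict{String, Any}(
    "file_ext" => ".dta",
    "file_encoding" => "",
    "file_label" => "A new file label",
    "var_count" => 7,
    "row_count" => 5,
    "modified_time" => DateTime("2021-04-22T21:36:00"),
    "file_format_version" => 118,
    "file_format_is_64bit" => true,
    "table_name" => "",
    "notes" => String[],
)

# Hypothetical helper: true for empty strings and empty vectors.
is_unset(v) = (v isa Union{AbstractString, AbstractVector}) && isempty(v)

# Hypothetical helper: keep only the entries that carry information.
informative(d::AbstractDict) = filter(p -> !is_unset(p.second), d)

# Hypothetical helper: map the -1 sentinel documented for row_count to `nothing`.
known_row_count(d::AbstractDict) = d["row_count"] == -1 ? nothing : d["row_count"]

kept = informative(meta)
```

Here kept drops "file_encoding", "table_name" and "notes", which are empty in the example file. Working on a copy leaves the metadata stored in the table untouched.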
Variable-Level Metadata
A ReadStatColMeta is associated with each data column for variable-level metadata.
ReadStatTables.ReadStatColMeta — Type
ReadStatColMeta <: AbstractMetaDict
A collection of variable-level metadata associated with a data column processed with ReadStat.
Metadata can be retrieved and modified from the associated ReadStatTable via methods compatible with DataAPI.jl. A dictionary-like interface is also available for directly working with ReadStatColMeta, but it does not allow modifying metadata values. An alternative way to retrieve and modify the metadata is via colmetavalues.
Fields
label::String: variable label.
format::String: variable format.
type::readstat_type_t: original variable type recognized by ReadStat.
vallabel::Symbol: name of the dictionary of value labels associated with the variable; see also getvaluelabels for the effect of modifying this field.
storage_width::Csize_t: variable storage width in data file.
display_width::Cint: width for display.
measure::readstat_measure_t: measure type of the variable; only relevant to SPSS.
alignment::readstat_alignment_t: variable display alignment.
To retrieve the ReadStatColMeta for a specified data column contained in a ReadStatTable:
julia> colmetadata(tb, :mylabl)
ReadStatColMeta:
  label         => labeled
  format        => %16.0f
  type          => READSTAT_TYPE_INT8
  value label   => mylabl
  storage width => 1
  display width => 16
  measure       => READSTAT_MEASURE_UNKNOWN
  alignment     => READSTAT_ALIGNMENT_RIGHT
The value associated with a specific metadata key can be retrieved via:
julia> colmetadata(tb, :mylabl, "label")
"labeled"

julia> colmetadata(tb, :mylabl, "label", style=true)
("labeled", :note)
To obtain a complete list of metadata keys:
julia> colmetadatakeys(tb, :mylabl)
("label", "format", "type", "vallabel", "storage_width", "display_width", "measure", "alignment")
Metadata contained in a ReadStatColMeta can be modified, optionally with a metadata style set at the same time:
julia> colmetadata!(tb, :mylabl, "label", "A variable label", style=:default)
ColMetaIterator{ReadStatColMeta} with 7 entries:
  :mychar => ReadStatColMeta(character, %-1s)
  :mynum  => ReadStatColMeta(numeric, %16.2f)
  :mydate => ReadStatColMeta(date, %td)
  :dtime  => ReadStatColMeta(datetime, %tc)
  :mylabl => ReadStatColMeta(A variable label, %16.0f)
  :myord  => ReadStatColMeta(ordinal, %16.0f)
  :mytime => ReadStatColMeta(time, %tcHH:MM:SS)
A ReadStatColMeta also has a dictionary-like interface:
julia> m = colmetadata(tb, :mylabl)
ReadStatColMeta:
  label         => A variable label
  format        => %16.0f
  type          => READSTAT_TYPE_INT8
  value label   => mylabl
  storage width => 1
  display width => 16
  measure       => READSTAT_MEASURE_UNKNOWN
  alignment     => READSTAT_ALIGNMENT_RIGHT

julia> keys(m)
KeySet for a ReadStatColMeta with 8 entries. Keys:
  "label"
  "format"
  "type"
  "vallabel"
  "storage_width"
  "display_width"
  "measure"
  "alignment"

julia> m["label"]
"A variable label"

julia> copy(m)
Dict{String, Any} with 8 entries:
  "label"         => "A variable label"
  "format"        => "%16.0f"
  "display_width" => 16
  "measure"       => READSTAT_MEASURE_UNKNOWN
  "alignment"     => READSTAT_ALIGNMENT_RIGHT
  "type"          => READSTAT_TYPE_INT8
  "storage_width" => 0x0000000000000001
  "vallabel"      => :mylabl
However, it cannot be modified directly via setindex!:
julia> m["label"] = "A new label"
ERROR: MethodError: no method matching setindex!(::ReadStatColMeta, ::String, ::String)
The function `setindex!` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  setindex!(::AbstractDict, ::Any, ::Any, ::Any, ::Any...)
   @ Base abstractdict.jl:556
Instead, since the metadata associated with each key are stored consecutively in arrays internally, one may directly access the underlying array for a given metadata key:
ReadStatTables.colmetavalues — Function
colmetavalues(tb::ReadStatTable, key)
Return an array of metadata values associated with key for all columns in tb.
julia> v = colmetavalues(tb, "label")
7-element Vector{String}:
 "character"
 "numeric"
 "date"
 "datetime"
 "A variable label"
 "ordinal"
 "time"
Notice that changing any value in the array returned above will affect the corresponding ReadStatColMeta:
julia> colmetadata(tb, :mychar, "label")
"character"

julia> v[1] = "char"
"char"

julia> colmetadata(tb, :mychar, "label")
"char"
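The aliasing behavior shown above follows from keeping one array per metadata key and returning that array itself instead of a copy. The sketch below illustrates the idea with a toy layout in plain Julia. It is not the package's implementation: ToyColMetas, toy_colmetavalues and toy_colmetadata are hypothetical names, and only two of the keys are modeled.

```julia
# Hypothetical stand-in for the internal storage: one vector per metadata key,
# each holding the values for all columns in column order.
struct ToyColMetas
    names::Vector{Symbol}
    label::Vector{String}
    format::Vector{String}
end

# Hypothetical analogue of `colmetavalues`: return the stored vector itself, not a copy.
toy_colmetavalues(m::ToyColMetas, key::Symbol) = getfield(m, key)

# Hypothetical analogue of `colmetadata(tb, col, key)`: look up the value for one column.
function toy_colmetadata(m::ToyColMetas, col::Symbol, key::Symbol)
    i = findfirst(==(col), m.names)
    i === nothing && throw(ArgumentError("column $col not found"))
    return getfield(m, key)[i]
end

metas = ToyColMetas([:mychar, :mynum], ["character", "numeric"], ["%-1s", "%16.2f"])

v = toy_colmetavalues(metas, :label)
v[1] = "char"                               # mutate through the returned vector
after = toy_colmetadata(metas, :mychar, :label)   # the change is visible here

w = copy(v)                                 # an independent copy
w[2] = "num"                                # does not affect the stored labels
```

If the values need to be edited without touching the table, taking a copy first, as with w, keeps the stored metadata unchanged.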
Metadata Styles
Metadata styles provide additional information on how the metadata should be processed in certain scenarios. ReadStatTables.jl does not require such information. However, specifying metadata styles can be useful when the metadata need to be transferred to some other object (e.g., DataFrame from DataFrames.jl). Packages that implement metadata-related methods compatible with DataAPI.jl are able to recognize the metadata contained in ReadStatTable.
By default, metadata on labels and notes have the :note style; all other metadata have the :default style. Keys for metadata with user-specified styles, along with those that have the :note style by default, are recorded in a dictionary:
julia> metastyle(tb)
Dict{Symbol, Symbol} with 4 entries:
  :label      => :default
  :vallabel   => :note
  :notes      => :note
  :file_label => :default
All metadata associated with keys not listed above are of :default style. To modify the metadata style for those associated with a given key:
julia> metastyle!(tb, "modified_time", :note)
Dict{Symbol, Symbol} with 5 entries:
  :label         => :default
  :vallabel      => :note
  :notes         => :note
  :modified_time => :note
  :file_label    => :default
The same method is also used for variable-specific metadata. However, since the styles are only determined by the metadata keys, metadata associated with the same key always have the same style and hence are not distinguished across different columns.
julia> metastyle!(tb, "label", :default)
Dict{Symbol, Symbol} with 5 entries:
  :label         => :default
  :vallabel      => :note
  :notes         => :note
  :modified_time => :note
  :file_label    => :default

julia> colmetadata(tb, :mychar, "label", style=true)
("char", :default)

julia> colmetadata(tb, :mynum, "label", style=true)
("numeric", :default)
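The key-only lookup described above can be pictured as a dictionary from metadata keys to styles with :default as the fallback. The sketch below is a toy model in plain Julia, not the package's code: toy_metastyle and toy_metastyle! are hypothetical names, and the initial dictionary assumes the default styles described in the text (labels and notes as :note).

```julia
# Hypothetical stand-in for the style dictionary kept by a table.
styles = Dict{Symbol, Symbol}(
    :label => :note,
    :vallabel => :note,
    :notes => :note,
    :file_label => :note,
)

# Hypothetical analogue of `metastyle(tb, key)`: keys not recorded fall back to :default.
toy_metastyle(styles::Dict{Symbol, Symbol}, key) = get(styles, Symbol(key), :default)

# Hypothetical analogue of `metastyle!(tb, key, style)`: record the style under the key.
function toy_metastyle!(styles::Dict{Symbol, Symbol}, key, style::Symbol)
    styles[Symbol(key)] = style
    return styles
end

# Labels of two columns; the style is looked up by key only, never by column.
labels = Dict(:mychar => "char", :mynum => "numeric")

toy_metastyle!(styles, "label", :default)
styled = Dict(col => (lab, toy_metastyle(styles, "label")) for (col, lab) in labels)
```

Since the column name never enters the lookup, a single call to toy_metastyle! changes the style reported for every column at once, which mirrors the behavior of the two colmetadata calls above.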
ReadStatTables.metastyle — Function
metastyle(tb::ReadStatTable, [key::Union{Symbol, AbstractString}])
Return the specified style(s) of all metadata for table tb. If a metadata key is specified, only the style for the associated metadata is returned. By default, metadata on labels and notes have the :note style; all other metadata have the :default style.
The style of metadata is only determined by key and hence is not distinguished across different columns.
ReadStatTables.metastyle! — Function
metastyle!(tb::ReadStatTable, key::Union{Symbol, AbstractString}, style::Symbol)
Set the style of all metadata associated with key to style for table tb.
The style of metadata is only determined by key and hence is not distinguished across different columns.