`octa --schema`¶

Prints the column schema of a tabular file: each column's name and data type, nothing else.

Synopsis¶

octa --schema FILE [-f tsv|json|csv]

Flag	Required	Default	Meaning
`--schema FILE`	yes	(no default)	Path to the file to inspect.
`-f`, `--format`	no	`tsv`	Output format. See CLI overview.

What it prints¶

Two-column output:

Column	Meaning
`name`	The column's name from the file
`type`	The column's data type, in Octa's type system

Octa's type strings are Arrow-derived: Int8, Int16, Int32, Int64, Float32, Float64, Utf8, LargeUtf8, Boolean, Date32, Timestamp(Microsecond, None), Binary, LargeBinary, and so on. Most of them have a direct counterpart in other type systems, such as SQL column types.
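As an illustration of translating these type strings, here is a minimal Python sketch that turns them into generic SQL column types. The function name `to_sql_type`, the `SQL_TYPES` table, and the SQL names chosen on the right-hand side are assumptions made for this example; Octa itself does not emit them.

```python
# Hypothetical mapping from Octa's Arrow-derived type strings to generic SQL types.
# The SQL names are illustrative choices, not part of Octa's output.
SQL_TYPES = {
    "Int8": "SMALLINT",
    "Int16": "SMALLINT",
    "Int32": "INTEGER",
    "Int64": "BIGINT",
    "Float32": "REAL",
    "Float64": "DOUBLE PRECISION",
    "Utf8": "TEXT",
    "LargeUtf8": "TEXT",
    "Boolean": "BOOLEAN",
    "Date32": "DATE",
    "Binary": "BLOB",
    "LargeBinary": "BLOB",
}


def to_sql_type(octa_type: str) -> str:
    """Return a SQL type name for an Octa type string, defaulting to TEXT."""
    # Parameterized types such as "Timestamp(Microsecond, None)" are matched by prefix.
    if octa_type.startswith("Timestamp("):
        return "TIMESTAMP"
    # Anything not listed above falls back to TEXT rather than raising.
    return SQL_TYPES.get(octa_type, "TEXT")
```

Falling back to TEXT keeps the sketch tolerant of type strings it does not know about; a stricter caller might prefer to raise an error instead.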

Examples¶

TSV (default)¶

$ octa --schema sales.parquet
name      type
region    Utf8
quarter   Utf8
amount    Float64
order_id  Int64

JSON¶

$ octa --schema sales.parquet -f json
[
  { "name": "region", "type": "Utf8" },
  { "name": "quarter", "type": "Utf8" },
  { "name": "amount", "type": "Float64" },
  { "name": "order_id", "type": "Int64" }
]

Piping into jq works as you'd expect:

octa --schema sales.parquet -f json | jq -r '.[] | "\(.name): \(.type)"'
# region: Utf8
# quarter: Utf8
# amount: Float64
# order_id: Int64
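The JSON output can also be checked from a script, for example to catch schema drift in a pipeline. The sketch below is one possible approach in Python; `schema_from_json` and `schema_mismatches` are hypothetical helper names, and the sample text is copied from the JSON example above rather than read from a live `octa` call.

```python
import json


def schema_from_json(text: str) -> dict:
    """Turn the JSON array of {"name", "type"} objects into a name -> type dict."""
    return {col["name"]: col["type"] for col in json.loads(text)}


def schema_mismatches(expected: dict, actual: dict) -> list:
    """List human-readable differences between an expected and an actual schema."""
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"missing column: {name}")
        elif got != want:
            problems.append(f"{name}: expected {want}, got {got}")
    for name in actual:
        if name not in expected:
            problems.append(f"unexpected column: {name}")
    return problems


# Sample output copied from the JSON example; in practice this text would
# come from the command's standard output.
SAMPLE = """[
  { "name": "region", "type": "Utf8" },
  { "name": "quarter", "type": "Utf8" },
  { "name": "amount", "type": "Float64" },
  { "name": "order_id", "type": "Int64" }
]"""

actual = schema_from_json(SAMPLE)
```

An empty list from `schema_mismatches` means the two schemas agree on both column names and types.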

CSV¶

$ octa --schema sales.parquet -f csv
name,type
region,Utf8
quarter,Utf8
amount,Float64
order_id,Int64
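The CSV form is convenient for Python's standard `csv` module. The following sketch groups column names by type; `columns_by_type` is a hypothetical helper. It assumes that type strings containing a comma, such as Timestamp(Microsecond, None), arrive as quoted CSV fields, which this page does not confirm.

```python
import csv
import io
from collections import defaultdict


def columns_by_type(csv_text: str) -> dict:
    """Group column names by their type string, preserving file order."""
    grouped = defaultdict(list)
    # DictReader uses the "name,type" header line for the field names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["type"]].append(row["name"])
    return dict(grouped)


# Sample output copied from the CSV example; normally read from stdout.
SAMPLE_CSV = "name,type\nregion,Utf8\nquarter,Utf8\namount,Float64\norder_id,Int64\n"

by_type = columns_by_type(SAMPLE_CSV)
```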

Notes¶

Multi-table sources (SQLite, DuckDB, GeoPackage with more than one table) currently print the first table's schema. Cross-table schema listing isn't exposed via the CLI yet; the MCP server's list_tables tool covers that case.
Streaming formats (Parquet, CSV, TSV) load the standard initial-row batch (5 million rows by default; override with --rows N|all) and then project the schema out, so the cost is that of reading up to the cap, not the whole file. For schema-only inspection of multi-GB files, this is usually still sub-second on Parquet. Parquet files with very many row groups fall back to a DuckDB-backed reader automatically.
Text formats (CSV, JSON, etc.) infer types from the header row, following the same rules the GUI uses.
Read-only formats are fully supported: --schema works for SAS, RDS, HDF5, NetCDF, EPUB, and GeoJSON the same way it does for Parquet.
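For scripted use, the notes above suggest passing a smaller --rows cap when only the schema is needed. The sketch below builds such a command line in Python. `schema_argv` and `read_schema_text` are hypothetical wrappers, and the placement of --rows after the format flag is an assumption, since the synopsis on this page does not show it.

```python
import subprocess


def schema_argv(path, fmt="json", rows=None):
    """Build the argument list for a schema query (flag order is assumed)."""
    argv = ["octa", "--schema", path, "-f", fmt]
    if rows is not None:
        # rows may be an integer cap or the string "all", per the notes above.
        argv += ["--rows", str(rows)]
    return argv


def read_schema_text(path, fmt="json", rows=None):
    """Run octa and return its standard output; requires octa on PATH."""
    result = subprocess.run(
        schema_argv(path, fmt=fmt, rows=rows),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.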
