Skip to content

Man Page Reference

The full man-page reference for octa(1). The source of truth is docs/cli/octa.1.adoc (AsciiDoc); this page mirrors that content as Markdown for the docs site.

On Linux (with the man page installed) man octa gives you the same content at a terminal. See Installation for how to get the man page on disk.

Name

octa: multi-format tabular data viewer, editor, CLI tool, and MCP server.

Synopsis

octa [FILE...]
octa --schema FILE [-f FORMAT] [--rows N|all]
octa --head FILE [-n N] [-f FORMAT] [--rows N|all]
octa --tail FILE [-n N] [-f FORMAT] [--rows N|all]
octa --sample FILE [-n N] [--seed N] [-f FORMAT] [--rows N|all]
octa --convert IN OUT [--rows N|all]
octa --sql FILE -q QUERY [-f FORMAT] [--rows N|all]
     [--sql-table NAME=PATH ...] [--sql-attach ALIAS=PATH ...]
     [--sql-write-to PATH --sql-write-table TABLE
       [--sql-write-schema SCHEMA] [--sql-write-mode create|append|replace]]
octa --export-schema FILE [-t TARGET]
octa --compare-schemas FILE_A FILE_B [--table-a NAME] [--table-b NAME] [-f FORMAT]
octa --diff FILE_A FILE_B [-f FORMAT]
octa --describe FILE [--table NAME] [--sample-rows N] [-f FORMAT]
octa --validate-schema FILE --expect-schema SCHEMA_FILE [--table NAME] [-f FORMAT]
octa --unique-columns FILE [--table NAME] [--max-combo N] [-f FORMAT]
octa --mcp

Description

octa is a desktop application for viewing and editing tabular data files. It opens Parquet, CSV, JSON, SQLite, DuckDB, Excel, and roughly twenty more formats in a spreadsheet-like view with sorting, filtering, full-text search, inline editing, SQL queries, and file comparison.

When invoked with no flags, it launches the graphical interface, optionally opening the supplied FILE(s) in tabs. When invoked with one of the action flags (--schema, --head, --tail, --sample, --convert, --sql, --export-schema, --compare-schemas, --diff, --describe, --validate-schema, --unique-columns, --mcp), it performs that action and exits.

Action flags are mutually exclusive. Trailing FILE arguments are ignored (with a warning) when an action flag is set.

Action Flags

--schema FILE
Print the column schema of FILE as a two-column table (column name, data type). For streaming formats (Parquet, CSV, TSV) the reader loads the initial-row batch (5,000,000 rows by default) and projects the schema from that. See octa --schema for the dedicated page.
--head FILE
Print the first N rows of FILE to standard output. N defaults to 20 and is set with -n / --lines. For streaming formats, the reader stops at the initial-load cap and N is a slice off that. See octa --head.
--tail FILE
Print the last N rows of FILE. N defaults to 20 (-n / --lines). Streaming formats load with the initial-row cap, so the tail reflects the end of the loaded window; raise --rows to tail the true end of a very large file.
--sample FILE
Print a random N-row sample of FILE (without replacement, original row order preserved). N defaults to 20 (-n / --lines); the sample is reproducible for a given --seed (default 0).
--convert IN OUT
Convert IN to OUT. Both formats are inferred from each path's extension and routed through the shared format registry. Read-only output formats (SAS, R datasets, HDF5, NetCDF, EPUB, GeoJSON) are rejected with a clear error. Conversion is bounded by the initial-load cap (5 M rows by default); pass --rows all to convert the full file. See octa --convert.
--sql FILE
Run a SQL query against FILE. The query is supplied via -q / --query. FILE is exposed to DuckDB as a temporary table called data. Extra tables can be loaded with --sql-table NAME=PATH and whole DuckDB / SQLite databases can be attached with --sql-attach ALIAS=PATH, so a single invocation can JOIN across formats. The SELECT result can be persisted to a DuckDB or SQLite file via --sql-write-to. Mutations (INSERT / UPDATE / DELETE) on data itself do not persist back to FILE, because the in-memory DuckDB connection is discarded at exit. See octa --sql.
--export-schema FILE
Render FILE's column schema as SQL DDL, a Pydantic model, a TypeScript interface, a JSON Schema document, or a Rust struct, and print it to standard output. The target is chosen with -t / --target (default postgres); only the column list is read. See octa --export-schema.
--compare-schemas FILE_A FILE_B
Diff the column schemas of two files. Prints a four-column table (status / column / type_a / type_b). Matching is by exact, case-sensitive column name. Use --table-a / --table-b to pick a specific table on multi-table sources. See octa --compare-schemas.
--diff FILE_A FILE_B
Row-level diff of two files. Compares rows by whole-row content (every column, positionally) and prints the rows present in only one side, tagged by a leading status column (only_in_a / only_in_b); a summary line (shared / only-in-A / only-in-B counts) goes to standard error. The two files should share the same column order. Complements --compare-schemas, which diffs only the column metadata.
--describe FILE
Print a one-shot orientation snapshot of FILE: format, file size, row count, column schema, and a small sample of rows. Use --sample-rows N to change the preview size (default 5, max 100). See octa --describe.
--validate-schema FILE
Check FILE's column schema against the JSON Schema given by --expect-schema SCHEMA_FILE. Exit code is 0 on a clean match and 1 otherwise, which is CI-pipeable. Schemas produced by --export-schema -t json-schema round-trip cleanly. See octa --validate-schema.
--unique-columns FILE
Find columns (and optional small combinations) whose values are unique across FILE. Useful for primary-key reconnaissance. Use --max-combo N (clamped to [1, 3]) to test pairs / triples. See octa --unique-columns.
--mcp
Start a Model Context Protocol (MCP) server on standard input / output. Twenty tools are exposed: read_table, tail, sample, schema, list_tables, count_rows, run_sql, convert, export_schema, profile, find_duplicates, value_frequency, search, compare_schemas, diff_tables, describe_file, validate_against_schema, unique_columns, write_table, edit_table. Defaults for the row limit and per-cell byte cap come from the user's Octa settings (Settings → MCP). See the MCP server guide for setup.

Options

-n N, --lines N
Row count for --head, --tail, and --sample. Default 20.
--seed N
Seed for --sample, for reproducible output. Default 0.
-q QUERY, --query QUERY
SQL query string for --sql. Always reference the file's data as the table data.
--sql-table NAME=PATH
For --sql only. Register an extra file as a workspace table named NAME. Any supported format. Repeatable.
--sql-attach ALIAS=PATH
For --sql only. ATTACH a DuckDB or SQLite database under ALIAS. Repeatable. After attachment every inner table is queryable as alias.schema.tbl (DuckDB) or alias.tbl (SQLite when the DuckDB sqlite extension is bundled, otherwise fallback registration under alias__table).
--sql-write-to PATH
For --sql only. Persist the SELECT result to PATH instead of printing it. PATH's extension picks DuckDB (.duckdb, .ddb) or SQLite (everything else). The file is created if missing. Requires --sql-write-table; --sql-write-schema and --sql-write-mode are optional.
--sql-write-table TABLE
Target table name for --sql-write-to.
--sql-write-schema SCHEMA
Target schema for --sql-write-to. DuckDB only; defaults to main. SQLite has no schemas; pass main or leave unset.
--sql-write-mode MODE
create (default; errors if the table already exists), replace (drop + recreate), or append (INSERT into the existing table). Column count and order must match in append mode.
-t TARGET, --target TARGET
Output target for --export-schema. TARGET is one of postgres (default), mysql, sqlite, databricks, snowflake, pydantic, typescript, json-schema, or rust.
--table-a NAME, --table-b NAME
For --compare-schemas only: pick a specific table on each side when the source is multi-table (SQLite, DuckDB, GeoPackage).
--table NAME
For --describe, --validate-schema, and --unique-columns: pick a specific table on the file when the source is multi-table.
--expect-schema SCHEMA_FILE
Path to the expected JSON Schema for --validate-schema. Required by that action.
--sample-rows N
Number of preview rows for --describe (default 5, clamped to 100).
--max-combo N
Max combo size for --unique-columns (default 1, clamped to [1, 3]).
--rows N|all
Override the initial-load row cap for this invocation. Streaming formats (Parquet, CSV, TSV) honour a process-wide cap (default 5,000,000 rows); --rows 10,000,000 raises it, --rows all disables it entirely. Applies to --schema, --head, --convert, and --sql. Commas / underscores in the number are allowed for readability.
-f FORMAT, --format FORMAT

Output format for actions that print a table. FORMAT is one of:

  • tsv (default): tab-separated values, one row per line, header row first. TAB and newline characters in cells are replaced with spaces (TSV has no escape mechanism).
  • json: pretty-printed JSON array of {column: value} objects. Numeric and boolean cells keep their native JSON types; dates, blobs, and nested values become strings.
  • csv: RFC 4180 CSV. Fields with comma, quote, or newline are properly quoted; embedded quotes are doubled.

--format has no effect for --convert (output format is taken from the output path's extension), --export-schema (which emits source code chosen by -t), or --mcp.

-h, --help
Print the full help text (worked examples for every action) and exit. -h and --help produce the same long-form output, because Octa intentionally wires both flags to the same help text rather than using clap's default short/long split.
--version
Print the Octa version and exit.

Output Streams

Tabular data is written to stdout. Status messages, warnings, and errors are written to stderr. This means octa --sql FILE -q QUERY -f json | jq ... is safe even when an error occurs, since the data stream stays clean.

Exit code is 0 on success and 1 on any error (invalid arguments, file-not-found, parse failure, write rejection, etc.). --validate-schema also exits 1 on a successful read where the schemas differ, so CI pipelines can gate on the schema directly.

Examples

Open multiple files in the GUI:

octa file1.csv file2.parquet file3.json

Print the schema of a Parquet file:

octa --schema sales.parquet

Print the first 5 rows of a CSV as JSON:

octa --head data.csv -n 5 -f json

Print the last rows / a reproducible random sample:

octa --tail data.csv -n 5
octa --sample data.parquet -n 20 --seed 1

Convert formats:

octa --convert in.csv out.parquet
octa --convert workbook.xlsx tidy.sqlite

Group-by aggregation:

octa --sql sales.parquet -q 'SELECT region, SUM(amount) FROM data GROUP BY region'

Read every row of a huge file:

octa --sql huge.parquet -q 'SELECT count(*) FROM data' --rows all
octa --head huge.parquet -n 100 --rows 10,000,000

Pipe a SQL result through jq:

octa --sql users.parquet -q 'SELECT email FROM data WHERE active' -f json \
  | jq -r '.[].email'

JOIN across formats:

octa --sql sales.parquet \
     --sql-table customers=customers.csv \
     -q 'SELECT c.name, SUM(d.amount) FROM data d
         JOIN customers c ON d.cid = c.cid GROUP BY c.name'

ATTACH a DuckDB warehouse and JOIN against it:

octa --sql sales.parquet \
     --sql-attach wh=warehouse.duckdb \
     -q 'SELECT count(*) FROM data d
         JOIN wh.main.products p ON d.cid = p.cid'

Write a SQL result back to a DuckDB schema:

octa --sql sales.parquet -q '
  SELECT region, SUM(amount) AS total FROM data GROUP BY region
' --sql-write-to analytics.duckdb \
  --sql-write-schema reports \
  --sql-write-table q4_summary

Export a schema as Snowflake DDL or a Pydantic model:

octa --export-schema sales.parquet -t snowflake
octa -e users.parquet -t pydantic > users_model.py

Diff the schemas of two files:

octa --compare-schemas v1.parquet v2.parquet
octa --compare-schemas a.sqlite b.sqlite --table-a users --table-b users -f json

Diff two files' rows:

octa --diff v1.csv v2.csv
octa --diff a.parquet b.parquet -f json

One-shot file snapshot:

octa --describe data.parquet --sample-rows 3
octa --describe users.sqlite --table customers -f json

Validate a file against a JSON Schema in CI:

octa --export-schema sales.parquet -t json-schema > sales.schema.json
octa --validate-schema sales.parquet --expect-schema sales.schema.json

Find primary-key candidates:

octa --unique-columns users.csv
octa --unique-columns orders.parquet --max-combo 2 -f json

Start the MCP server:

octa --mcp

Files

$XDG_CONFIG_HOME/octa/settings.toml
Linux. User settings. Created on first launch with defaults. See Settings reference for every key.
$HOME/Library/Application Support/Octa/settings.toml
macOS. Same purpose.
%APPDATA%\Octa\settings.toml
Windows. Same purpose.

MCP Server

When invoked with --mcp, Octa speaks the Model Context Protocol over JSON-RPC on stdin/stdout. Fifteen tools are exposed:

Defaults (the response row cap of 1000 rows, per-cell byte cap of 64 KiB, and file-loader cap of 5,000,000 rows) are configurable under Settings → MCP and Settings → Performance. They are read once at server startup; changes require a restart. Per-call, pass limit: 0 to lift the response cap and unlimited: true to lift the file-loader cap so the tool sees every row on disk. Parquet files with very many row groups fall back to a DuckDB-backed reader automatically. See Limits & truncation for the full mechanics.

See Also

man(1), jq(1), duckdb(1), parquet-tools(1)

Bugs / Feedback

Report bugs at https://github.com/thorstenfoltz/octa/issues.

Author

Thorsten Foltz

Copyright © 2026 Thorsten Foltz. Licensed under the MIT license.