Sora

Sora helps you keep game configuration data understandable while still giving runtime code typed access.

You write a schema that describes table shapes, fill the table rows in Excel, CSV, TOML, JSON, or YAML, and let Sora validate the data. After validation, Sora writes a runtime data bundle and generates code that knows how to load that bundle.

The schema is the contract. Excel, CSV, TOML, generated code, and exported runtime bundles are all projections of that contract. A designer can edit rows in a workbook, while game code consumes strongly typed generated APIs.

For a small project, the file flow looks like this:

project.toml
  -> schema/items.toml
  -> data/Item.xlsx
  -> generated/config.sora
  -> generated/rust

You normally hand-write project.toml and schema files. Designers or tools edit files under data/. Files under generated/ are Sora outputs.

What Sora Does

schema modules -> Excel/CSV/TOML/JSON/YAML data -> validation
                                      |-> runtime bundle
                                      |-> generated code

Sora currently focuses on these stages:

describe tables, records, enums, unions, references, indexes, and validation rules in schema files;
inspect and edit schema modules in the embedded Sora Studio UI;
generate Excel templates from the schema so spreadsheet headers stay consistent;
load table data from TOML, JSON, YAML, CSV, or Excel .xlsx;
validate data against the normalized schema and cross-table references;
export data as Sora binary, debug JSON, JSON bundle, CBOR bundle, or Sora Protobuf bundle;
generate language runtimes that load those exported bundles.

Common Terms

Sora uses the word format in a few different places:

Term	Meaning	Example
Schema format	The file format used to write schema/project files.	TOML, YAML, JSON, Lua
Source format	The editable table data format.	Excel `.xlsx`, CSV, TOML, JSON, YAML
Export format	The data bundle written after validation.	`binary`, `json`, `cbor`
Runtime format	The bundle format generated code expects to load.	`sora`, `json`, `cbor`

For example, Rust codegen with runtime_format = "sora" needs a matching binary export. The source data can still come from Excel.

When This Fits

Sora is intended for game configuration and similar data-heavy applications where:

designers or tools edit tabular data;
runtime code wants typed access instead of loose dictionaries;
schema changes should be reviewed in source control;
generated language support should be extendable by downstream users.

The project is still early, so the public API can change. The design goal is to keep the core schema and IR independent from individual language backends, so downstream users can add generators or exporters without patching the core pipeline.

Projects that need stable output should pin the sora CLI version. Runtime and export format versions are bumped only when a change is actually incompatible with previously generated runtimes; Sora does not currently keep old schema semantics available behind edition flags. See Versioning and Compatibility.

Core Concepts

Project

A project manifest declares the package name, schema modules, build outputs, codegen targets, and export targets. It is the entry point used by sora check, sora build, sora gen, and sora export.

Schema

Schema files describe the shape of configuration data. They define enums, structs, unions, tables, indexes, references, and field rules. Sora normalizes schema files into an IR before validation, export, or code generation.

Table

A table is a named collection of rows. Tables can be list-like, keyed by one field, or singleton. Source metadata tells Sora where the editable data comes from.

The table schema is also used to generate editor projections such as Excel headers. The spreadsheet is not the contract; it is one way to edit rows that conform to the contract.

Value

Sora parses and validates source values into a common value tree before export. Generated runtimes read that same shape from different runtime formats, so a target language can switch between sora, json, cbor, or sora-protobuf without changing the schema.

Runtime Format

A runtime format is the wire format that generated code loads. It is selected per language target with runtime_format.

Generator

A generator is a language backend registered in the codegen registry. Built-in generators are ordinary registry entries, which keeps the pipeline open to downstream extensions.

Exporter

An exporter writes validated data into a runtime bundle. The exporter registry is separate from code generation so data formats and language targets can evolve independently.

Scope

Schemas, fields, and tables can declare a scope. A build can select a scope to generate or export only the pieces needed by one runtime environment.

Quick Start

This guide builds a minimal item table, generates an Excel template, exports a runtime bundle, and generates Rust code that can load it.

Install the CLI from the GitHub Releases page by downloading the archive for your platform and placing the sora binary on your PATH.

If you already have a Rust toolchain, you can also install the published package from crates.io:

cargo install sora-cli

For local development from a checkout:

cargo install --path crates/sora-cli

1. Create a Project

The fastest path is to let the CLI scaffold the minimal project used in this guide:

sora init --out my-config --schema-format toml
cd my-config

--schema-format accepts toml, yaml, json, or lua. The scaffold creates this layout:

Path	Who edits it	Purpose
`project.toml`	You	Project entry point, build outputs, default data location.
`schema/items.toml`	You	Schema for the `Item` table.
`data/Item.xlsx`	Designers or tools	Editable row data.
`generated/`	Sora	Schema lock, Excel templates, generated code, exported data.

The rest of this section shows the generated files so you can understand the project shape. project.toml looks like this:

package = "game_config"
includes = ["schema/items.toml"]

[build]
default_source_format = "xlsx"
data_root = "data"
schema_lock = "generated/schema.lock"
excel_templates = "generated/excel"

[[build.codegen]]
target = "rust"
out = "generated/rust"
format = "auto"

[[build.exports]]
format = "binary"
out = "generated/config.sora"

In this file, default_source_format = "xlsx" means table sources default to Excel, and data_root = "data" means Item.xlsx is read from data/Item.xlsx during export and build. Because Rust codegen defaults to runtime_format = "sora", the binary export is the runtime bundle that the generated Rust code will load.

excel_templates = "generated/excel" is only the output directory for generated templates. It is where Sora writes fresh workbooks with schema headers; it is not the source data directory. Keep it separate from data so regenerating templates cannot overwrite edited row data.

Create schema/items.toml:

[[enums]]
name = "ItemType"
values = ["Weapon", "Armor", "Material", "Consumable"]

[[tables]]
name = "Item"
mode = "map"
key = "id"

[tables.source]
format = "xlsx"
file = "Item.xlsx"
sheet = "Item"

[[tables.fields]]
name = "id"
type = "i32"
comment = "Item id"

[[tables.fields]]
name = "name"
type = "string"
comment = "Display name"

[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"
comment = "Item category"

[[tables.fields]]
name = "max_stack"
type = "i32"
default = "1"
range = [1, 9999]
comment = "Stack limit"

2. Generate the Excel Template

The workbook header is generated from the schema:

sora excel-template --project project.toml --out generated/excel

This creates generated/excel/Item.xlsx. Treat that file as a template artifact that can be regenerated after schema changes. For a new table, copy it to data/Item.xlsx and fill rows below the generated header:

id	name	item_type	max_stack
1001	Iron Sword	Weapon	1
2001	Health Potion	Consumable	99

After you have real data in data/Item.xlsx, do not run excel-template --out data unless you intentionally want to replace those files. Keep generating empty templates into generated/excel, and use excel-sync to update existing data workbooks in place when the schema changes.

For existing data workbooks, preview the header sync first, then apply it with --write:

sora excel-sync --project project.toml --data-root data
sora excel-sync --project project.toml --data-root data --write

The first command is a preview that shows added fields and legacy columns. The second, with --write, refreshes the generated header rows while preserving data rows; fields removed from the schema stay in Excel as legacy columns that Sora ignores.

3. Check, Export, and Generate

Validate the schema without reading row data:

sora check --project project.toml

Run every output declared in [build]. This also loads and validates source data before writing exports:

sora build --project project.toml

You can also open the project in Sora Studio, the schema editor embedded in the CLI:

sora studio --project project.toml

The command prints a local URL. Open it in a browser to visualize schema relationships, edit schema modules, preview the generated changes, and save them back to the project.

Instead of sora build, you can also run code generation and export as separate steps:

sora gen --target rust --project project.toml --out generated/rust

sora export \
  --format binary \
  --default-source-format xlsx \
  --project project.toml \
  --data-root data \
  --out generated/config.sora

4. Next Steps

Read Sora Studio if you want to edit schemas visually. Read First Config for the same example with the generated runtime usage, or inspect examples/showcase/project.toml for a larger multi-language setup.

Tutorials

Tutorials walk through Sora from an application user’s point of view.

Start with First Config to build a minimal table end to end. Then read Excel Workflow to understand generated spreadsheet templates and Load Generated Code to connect exported data to runtime code.

First Config

This tutorial creates a small item configuration table. The same pattern scales to larger game data: define the schema, generate an editable workbook, fill rows, export a runtime bundle, and generate code.

Project Layout

project.toml
schema/items.toml
data/Item.xlsx
generated/

Project Manifest

package = "game_config"
includes = ["schema/items.toml"]

[build]
default_source_format = "xlsx"
data_root = "data"
schema_lock = "generated/schema.lock"
excel_templates = "generated/excel"

[[build.codegen]]
target = "rust"
out = "generated/rust"
format = "auto"

[[build.exports]]
format = "binary"
out = "generated/config.sora"

schema_lock captures the normalized schema, excel_templates writes workbooks with generated headers, build.codegen declares language output, and build.exports declares runtime data output.

Schema

[[enums]]
name = "ItemType"
values = ["Weapon", "Armor", "Material", "Consumable"]

[[tables]]
name = "Item"
mode = "map"
key = "id"

[tables.source]
format = "xlsx"
file = "Item.xlsx"
sheet = "Item"

[[tables.fields]]
name = "id"
type = "i32"
comment = "Item id"

[[tables.fields]]
name = "name"
type = "string"
comment = "Display name"

[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"
comment = "Item category"

[[tables.fields]]
name = "max_stack"
type = "i32"
default = "1"
range = [1, 9999]
comment = "Stack limit"

This table uses mode = "map", so the generated runtime exposes keyed lookup by id.

Excel Template

Generate a workbook:

sora excel-template --project project.toml --out generated/excel

The generated sheet has metadata rows above the editable data area:

#field	id	name	item_type	max_stack
#type	i32	string	`enum<ItemType>`	i32
#input	key			range=1..9999
#desc	Item id	Display name	Item category	Stack limit

Rows start after the generated header:

id	name	item_type	max_stack
1001	Iron Sword	Weapon	1
2001	Health Potion	Consumable	99

Copy the workbook to data/Item.xlsx after generating it, or point your source file at the generated location during experiments.

Build

Run the configured outputs:

sora build --project project.toml

Expected artifacts:

generated/schema.lock
generated/excel/Item.xlsx
generated/rust
generated/config.sora

Use sora check --project project.toml when you only want schema validation.

Excel Workflow

Excel support is designed around generated templates. The schema owns the table shape; Excel is an editable projection of that schema.

Generate Templates

There are two ways to generate Excel templates.

The direct command only writes templates:

sora excel-template --project project.toml --out generated/excel

This reads the schema from project.toml and writes generated workbooks under generated/excel. The directory is safe to delete and regenerate because it should contain template artifacts, not hand-edited source data.

The build workflow can do the same thing when excel_templates is configured:

[build]
excel_templates = "generated/excel"

sora build --project project.toml

Both paths generate the same kind of template files. The direct command only writes Excel templates. sora build runs the template output together with the other configured build outputs such as schema locks, code generation, and exports.

Template Directory vs Data Directory

excel_templates is an output directory for templates. It is not the runtime data input directory. Data input normally comes from [build].data_root or the --data-root command option.

The usual layout keeps these paths separate:

Path	Role	Can be regenerated
`generated/excel`	Generated workbook templates with schema headers.	Yes
`data`	Edited table rows used by export and build.	No

Do not point excel-template --out or [build].excel_templates at a directory that already contains edited data workbooks unless replacing those files is intentional. Use generated templates for new workbooks; use excel-sync for workbooks that already contain real data.

Sync Existing Workbooks

For real projects with existing data, use excel-sync instead of copying rows into a fresh template. It updates workbook headers from the current schema while preserving data rows:

sora excel-sync --project project.toml --data-root data

Without --write, the command only previews what would change. To write the updated workbook files:

sora excel-sync --project project.toml --data-root data --write

When writing an existing workbook, Sora first copies the old file under data/.sora-backup/<timestamp>/.

Sync matches columns by the #field row, not by column position:

existing schema fields keep their data;
new schema fields are added as empty columns;
changes to type, parser, scope, range, length, comments, or table metadata refresh the generated header rows;
fields removed from the schema are not deleted from Excel; they are kept as legacy columns that Sora ignores, so designers can delete them manually when they are ready;
non-schema sheets in the same workbook are preserved as value-only sheets.
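
The matching rules above can be sketched in a few lines of Rust. This is only an illustration of matching by field name instead of column position; SyncPlan and plan_sync are hypothetical names, not part of Sora's API, and the real command also handles header rows, backups, and extra sheets.

```rust
/// Result of comparing a workbook's `#field` row with the current schema.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct SyncPlan {
    kept: Vec<String>,   // in both workbook and schema: data is preserved
    added: Vec<String>,  // in schema only: added as empty columns
    legacy: Vec<String>, // in workbook only: left in place and ignored
}

/// Match columns by field name, never by position.
fn plan_sync(workbook_fields: &[&str], schema_fields: &[&str]) -> SyncPlan {
    let mut plan = SyncPlan { kept: vec![], added: vec![], legacy: vec![] };
    for field in schema_fields {
        if workbook_fields.contains(field) {
            plan.kept.push(field.to_string());
        } else {
            plan.added.push(field.to_string());
        }
    }
    for column in workbook_fields {
        if !schema_fields.contains(column) {
            plan.legacy.push(column.to_string());
        }
    }
    plan
}
```

For a workbook with columns id, name, old_icon and a schema with fields id, name, max_stack, this plan keeps id and name, adds max_stack, and reports old_icon as a legacy column.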

The workbook and sheet for each table come from that table’s source:

[[tables]]
name = "Item"

[tables.source]
format = "xlsx"
file = "Core.xlsx"
sheet = "Item"

[[tables]]
name = "Quest"

[tables.source]
format = "xlsx"
file = "Core.xlsx"
sheet = "Quest"

This writes two sheets, Item and Quest, into generated/excel/Core.xlsx.

A table with a different source file goes into a different workbook:

[tables.source]
format = "xlsx"
file = "Battle.xlsx"
sheet = "Skill"

This writes the Skill sheet into generated/excel/Battle.xlsx.
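
The grouping described above, where tables that share a source file end up as sheets in one workbook, can be pictured with a small Rust sketch. TableSource and sheets_by_workbook are hypothetical names used only for this example; they mirror the file and sheet keys of [tables.source] and are not Sora types.

```rust
use std::collections::BTreeMap;

/// Hypothetical view of the `file` and `sheet` keys of `[tables.source]`.
struct TableSource {
    file: &'static str,
    sheet: &'static str,
}

/// Group sheets by workbook file so tables sharing a `file` share one workbook.
fn sheets_by_workbook(sources: &[TableSource]) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut workbooks: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for source in sources {
        workbooks.entry(source.file).or_default().push(source.sheet);
    }
    workbooks
}

/// The three tables used in the examples above.
fn example_sources() -> Vec<TableSource> {
    vec![
        TableSource { file: "Core.xlsx", sheet: "Item" },
        TableSource { file: "Core.xlsx", sheet: "Quest" },
        TableSource { file: "Battle.xlsx", sheet: "Skill" },
    ]
}
```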

Header Rows

Generated sheets include several header rows:

Row	Purpose
`@table` metadata	Table name, mode, key, scope, and schema hash.
`#name`	Display name row for the spreadsheet.
`#field`	Stable schema field names read by Sora.
`#type`	Type hints such as `i32`, `enum<ItemType>`, or `struct<Cost>(kind: enum<ResourceKind>, id: i32, count: i32)`.
`#scope`	Scope information for each field.
`#input`	Input hints such as key, parser, range, length, or derived-field source.
`#desc`	Field comments for designers and reviewers.

Data rows start after the generated header.

What Users Should Edit

Users should edit data rows. They should not hand-maintain field names, types, key metadata, input hints, or validation rules in Excel. Those header rows are regenerated from the schema whenever it changes.

If a column’s #input cell starts with from=, that field is derived from another table. Leave the generated placeholder in that column and edit the child table rows instead.

When the schema changes, run sora excel-sync --project project.toml --data-root data to preview header changes, then rerun with --write after reviewing them. This keeps spreadsheet editing convenient without making Excel a second schema language.

Common Field Shapes

Simple fields map directly to cells:

id	name	max_stack
1001	Iron Sword	1

Structured values use parsers when a cell needs a compact representation:

[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "tuple" }
comment = "Tuple: kind,id,count"

Example cell:

Item,1001,3

Collections can use JSON or map-style parsers:

[[tables.fields]]
name = "tags"
type = "set<string>"
parser = { kind = "json" }
default = "[\"misc\"]"

[[tables.fields]]
name = "attributes"
type = "map<string,i32>"
parser = { kind = "map" }
comment = "Map pairs: key,value|key,value"

Example cells:

["starter","melee"]
attack,12|speed,2
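
To make the map cell format concrete, here is a minimal Rust sketch that reads a key,value|key,value cell into a map<string,i32>. It is not Sora's parser: parse_map_cell is a hypothetical helper, and trimming whitespace and rejecting duplicate keys are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Parse a `key,value|key,value` cell into string keys and i32 values.
/// Assumption: whitespace is trimmed and duplicate keys are rejected.
fn parse_map_cell(cell: &str) -> Result<BTreeMap<String, i32>, String> {
    let mut map = BTreeMap::new();
    for pair in cell.split('|').filter(|p| !p.trim().is_empty()) {
        let (key, value) = pair
            .split_once(',')
            .ok_or_else(|| format!("expected key,value but found `{pair}`"))?;
        let value: i32 = value
            .trim()
            .parse()
            .map_err(|_| format!("`{value}` is not a valid i32"))?;
        if map.insert(key.trim().to_string(), value).is_some() {
            return Err(format!("duplicate key `{}`", key.trim()));
        }
    }
    Ok(map)
}
```

Calling parse_map_cell("attack,12|speed,2") yields a map with attack = 12 and speed = 2, while a cell such as attack,twelve is reported as an error.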

Load Generated Code

Generated code contains strongly typed row models, table containers, and a config loader for the selected runtime format.

Choose a Runtime Format

[codegen.rust]
runtime_format = "sora"

The runtime format selected by code generation must match an exported bundle:

[[build.exports]]
format = "binary"
out = "generated/config.sora"

runtime_format = "sora" corresponds to the binary export. json, cbor, and sora-protobuf correspond to their matching export formats.
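
If you want to sanity-check a manifest by hand, the pairing above can be written as a small lookup. The Rust sketch below is illustrative only: both function names are hypothetical, and the export name used for sora-protobuf is assumed to be the same string as the runtime format.

```rust
/// Export format needed by a given `runtime_format`, per the mapping above.
fn required_export_format(runtime_format: &str) -> Option<&'static str> {
    match runtime_format {
        "sora" => Some("binary"),
        "json" => Some("json"),
        "cbor" => Some("cbor"),
        // Assumption: the export is named like the runtime format.
        "sora-protobuf" => Some("sora-protobuf"),
        _ => None,
    }
}

/// True when at least one configured export can feed the chosen runtime format.
fn exports_cover_runtime(runtime_format: &str, export_formats: &[&str]) -> bool {
    required_export_format(runtime_format)
        .map_or(false, |needed| export_formats.contains(&needed))
}
```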

Rust Example

mod generated;

use generated::SoraConfig;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let bytes = std::fs::read("generated/config.sora")?;
    let config = SoraConfig::from_sora_bytes(&bytes)?;

    if let Some(item) = config.items.get(&1001) {
        println!("{} stacks to {}", item.name, item.max_stack);
    }

    Ok(())
}

Exact names are derived from schema names and target language conventions. For example, a table named Item generally becomes an item row type plus an item table accessor.

Adapter Targets

Some targets expose adapter hooks for formats where the ecosystem dependency should be supplied by the application. For example, Lua, Erlang, and Dart can accept decode_cbor or decode_sora_protobuf functions instead of embedding a specific third-party decoder.

See Runtime Adapters for examples.

Schema

A schema module is a TOML, YAML, JSON, or Lua file included by a project manifest.

package = "game_config"
includes = ["schema/items.toml", "schema/skills.toml"]

Schema modules are the source of truth for Sora. They describe the stable data contract; source files such as Excel workbooks contain row values that are checked against that contract.

See Schema Formats for the supported file formats and equivalent TOML/YAML/JSON/Lua shapes.

Enums

[[enums]]
name = "ItemType"
values = ["Weapon", "Armor", "Material"]

Enums are stored by symbolic value in editable data and generated as native enum-like constructs when the target language supports them.

Structs

[[structs]]
name = "Cost"

[[structs.fields]]
name = "gold"
type = "i32"

Structs model repeated object shapes. They are useful for costs, rewards, coordinates, stat modifiers, and other nested values.

Unions

[[unions]]
name = "RewardAction"
tag = "type"

[[unions.variants]]
name = "AddItem"

[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"

Unions model tagged variants. The tag field is the discriminator name used in source data and runtime values.

Tables

[[tables]]
name = "Item"
mode = "map"
key = "id"

[tables.source]
format = "xlsx"
file = "Item.xlsx"
sheet = "Item"

Tables define source-backed row collections. See Tables for modes, keys, sources, indexes, and derived fields.

Field Types

Common field types include primitives, enums, structs, unions, references, lists, sets, fixed arrays, maps, and optionals:

i32
string
enum<ItemType>
struct<Cost>
union<Reward>
ref<Item.id>
list<i32>
set<string>
array<i32,3>
map<string,i32>
optional<string>

See Types for the full list and examples.

See Cell Parsers for compact Excel/CSV cell formats and column projections such as split, tuple, columns, tuple_list, map, and json.

Schema Formats

Sora schema files can be written as TOML, YAML, JSON, or Lua. All formats load into the same schema model and produce the same IR, generated code, Excel templates, exports, and schema locks.

The file extension selects the parser:

Extension	Format
`.toml`	TOML
`.yaml`, `.yml`	YAML
`.json`	JSON
`.lua`	Lua

Includes are parsed by their own file extension, so a YAML project can include TOML, JSON, or Lua modules, and any supported project format can mix supported module formats.

TOML

package = "game_config"
includes = ["schema/items.toml"]

[[enums]]
name = "ItemType"
values = ["Weapon", "Armor"]

[[tables]]
name = "Item"
mode = "map"
key = "id"

[[tables.fields]]
name = "id"
type = "i32"

YAML

package: game_config
includes:
  - schema/items.yaml

enums:
  - name: ItemType
    values: [Weapon, Armor]

tables:
  - name: Item
    mode: map
    key: id
    fields:
      - name: id
        type: i32

JSON

{
  "package": "game_config",
  "includes": ["schema/items.json"],
  "enums": [
    { "name": "ItemType", "values": ["Weapon", "Armor"] }
  ],
  "tables": [
    {
      "name": "Item",
      "mode": "map",
      "key": "id",
      "fields": [
        { "name": "id", "type": "i32" }
      ]
    }
  ]
}

Lua

Lua schema files must return one table. The returned table uses the same field names as the TOML/YAML/JSON shapes. Lua schema loading is data-oriented; package, io, os, and debug are not available.

return {
  package = "game_config",
  includes = { "schema/items.lua" },

  enums = {
    { name = "ItemType", values = { "Weapon", "Armor" } },
  },

  tables = {
    {
      name = "Item",
      mode = "map",
      key = "id",
      fields = {
        { name = "id", type = "i32" },
      },
    },
  },
}

Project Build Config

The project file can also use YAML, JSON, or Lua for build:

package: game_config
includes:
  - schema/items.yaml

build:
  default_source_format: xlsx
  data_root: data
  schema_lock: generated/schema.lock
  excel_templates: generated/excel
  codegen:
    - target: rust
      out: generated/rust
      format: auto
  exports:
    - format: binary
      out: generated/config.sora

{
  "package": "game_config",
  "includes": ["schema/items.json"],
  "build": {
    "default_source_format": "xlsx",
    "data_root": "data",
    "schema_lock": "generated/schema.lock",
    "excel_templates": "generated/excel",
    "codegen": [
      { "target": "rust", "out": "generated/rust", "format": "auto" }
    ],
    "exports": [
      { "format": "binary", "out": "generated/config.sora" }
    ]
  }
}

return {
  package = "game_config",
  includes = { "schema/items.lua" },
  build = {
    default_source_format = "xlsx",
    data_root = "data",
    schema_lock = "generated/schema.lock",
    excel_templates = "generated/excel",
    codegen = {
      { target = "rust", out = "generated/rust", format = "auto" },
    },
    exports = {
      { format = "binary", out = "generated/config.sora" },
    },
  },
}

Tables

Tables are source-backed row collections. A table schema declares the table mode, source location, fields, and optional indexes.

Modes

Mode	Shape	Typical Use
`map`	Rows keyed by one field.	Items, quests, levels, buffs.
`list`	Ordered rows without keyed lookup.	Drop entries, weighted pools, ordered steps.
`singleton`	One row.	Global settings, tuning constants.

[[tables]]
name = "Item"
mode = "map"
key = "id"

[[tables.fields]]
name = "id"
type = "i32"

For map tables, key names the table’s primary key field. Sora uses it for row uniqueness, generated lookup APIs, Excel template hints, and ref<Table.key> validation.

Source

[tables.source]
format = "xlsx"
file = "Core.xlsx"
sheet = "Item"

format can be omitted when the project or command provides a default source format. file is resolved under the data root, which comes from [build].data_root or the command's --data-root option, during export and validation.

Built-in source formats are xlsx, csv, toml, json, and yaml. JSON and YAML table files are arrays of row objects:

[
  { "id": 1001, "name": "Iron Sword" },
  { "id": 1002, "name": "Health Potion" }
]

For JSON and YAML, file can also point to a directory. In that case Sora recursively reads every matching .json, .yaml, or .yml file as one row object, sorted by path.
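
The directory behaviour can be approximated with the standard library alone. The Rust sketch below only collects and orders the candidate files; collect_row_files is a hypothetical helper, and ordering by Rust's PathBuf comparison is an assumption about what sorted by path means in practice.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect `.json`, `.yaml`, and `.yml` files under `root`,
/// then sort them by path. Each file would hold one row object.
fn collect_row_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if matches!(
                path.extension().and_then(|ext| ext.to_str()),
                Some("json" | "yaml" | "yml")
            ) {
                files.push(path);
            }
        }
    }
    files.sort();
    Ok(files)
}
```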

Indexes

Indexes are extra lookup paths on a table. They are different from the key of a mode = "map" table:

Concept	Purpose
table `key`	The primary key. A map table uses it to keep rows unique and to generate the main `get(id)` lookup.
`[[tables.indexes]]`	Additional lookup paths, such as lookup by name, grouping by type, or finding drops by stage.

For example, an Item table can use id as its primary key:

[[tables]]
name = "Item"
mode = "map"
key = "id"

[[tables.fields]]
name = "id"
type = "i32"

[[tables.fields]]
name = "name"
type = "string"

[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"

Add a unique index when another field should also identify at most one row:

[[tables.indexes]]
name = "by_name"
fields = ["name"]
unique = true

Example data:

id	name	item_type
1001	Iron Sword	Weapon
1002	Wood Shield	Armor

unique = true means name cannot repeat. Generated code for targets that support the index can expose a helper similar to get_by_name("Iron Sword"), returning one row or no row.

Use a non-unique index when a key can match many rows:

[[tables.indexes]]
name = "by_item_type"
fields = ["item_type"]
unique = false

Example data:

id	name	item_type
1001	Iron Sword	Weapon
1002	Bronze Axe	Weapon
2001	Wood Shield	Armor

unique = false means one key can match several rows. Generated code for targets that support the index can expose a helper similar to get_by_item_type(ItemType::Weapon), returning the matching rows.
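
To show the difference between the two index kinds in code, here is a hand-written Rust stand-in for a table with both indexes. It is not the code Sora generates: ItemTable, its fields, and the item helper are hypothetical, and only the helper names get_by_name and get_by_item_type follow the text above.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ItemType { Weapon, Armor }

#[derive(Debug, Clone, PartialEq)]
struct Item { id: i32, name: String, item_type: ItemType }

/// Hand-written stand-in for a table with two extra indexes.
struct ItemTable {
    rows: Vec<Item>,
    by_name: HashMap<String, usize>,             // unique: one row per name
    by_item_type: HashMap<ItemType, Vec<usize>>, // non-unique: many rows per type
}

impl ItemTable {
    /// Build both indexes; a repeated name violates `unique = true`.
    fn new(rows: Vec<Item>) -> Result<Self, String> {
        let mut by_name = HashMap::new();
        let mut by_item_type: HashMap<ItemType, Vec<usize>> = HashMap::new();
        for (pos, row) in rows.iter().enumerate() {
            if by_name.insert(row.name.clone(), pos).is_some() {
                return Err(format!("duplicate name `{}` in by_name", row.name));
            }
            by_item_type.entry(row.item_type).or_default().push(pos);
        }
        Ok(Self { rows, by_name, by_item_type })
    }

    /// Unique index: one row or no row.
    fn get_by_name(&self, name: &str) -> Option<&Item> {
        self.by_name.get(name).map(|&pos| &self.rows[pos])
    }

    /// Non-unique index: every matching row, in source order.
    fn get_by_item_type(&self, item_type: ItemType) -> Vec<&Item> {
        self.by_item_type
            .get(&item_type)
            .map(|positions| positions.iter().map(|&pos| &self.rows[pos]).collect())
            .unwrap_or_default()
    }
}

/// Convenience constructor for the example rows.
fn item(id: i32, name: &str, item_type: ItemType) -> Item {
    Item { id, name: name.to_string(), item_type }
}
```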

fields is a list, so a unique index can also express combined uniqueness:

[[tables.indexes]]
name = "by_world_stage"
fields = ["world", "stage"]
unique = true

This requires each (world, stage) pair to be unique. For example, (1, 1) can appear once, while (1, 2) is a different key. Current generated lookup helpers mainly support single-field indexes on non-singleton tables; combined indexes are most useful for validation today.
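
As a validation rule, combined uniqueness amounts to checking that no tuple of index fields repeats. A minimal Rust sketch, assuming world and stage are both u32 (the schema above does not state their types) and using a hypothetical function name:

```rust
use std::collections::HashSet;

/// Return the first repeated (world, stage) pair, if any.
fn first_duplicate_pair(rows: &[(u32, u32)]) -> Option<(u32, u32)> {
    let mut seen = HashSet::new();
    // `insert` returns false when the pair was already present.
    rows.iter().copied().find(|pair| !seen.insert(*pair))
}
```

With rows (1, 1), (1, 2), (2, 1) the check passes; adding a second (1, 1) makes that pair the reported duplicate.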

Validation

Sora validates table rows after loading source data:

non-optional fields must be present unless a default exists;
key fields must be unique for map tables;
enum values must be valid;
references must point to existing rows;
numeric ranges and length ranges must pass;
parser output must match the declared field type.

Types

Sora type expressions are written as strings in schema fields.

Primitive Types

Type	Meaning
`bool`	Boolean value.
`i8`	8-bit signed integer.
`u8`	8-bit unsigned integer.
`i16`	16-bit signed integer.
`u16`	16-bit unsigned integer.
`i32`	32-bit signed integer.
`u32`	32-bit unsigned integer.
`i64`	64-bit signed integer.
`f32`	32-bit floating point value.
`f64`	64-bit floating point value.
`string`	UTF-8 string.
`duration`	Non-negative duration written as units such as `500ms`, `30s`, `15m`, `2h`, `7d`, or `1h 30m`. Units must be ordered from largest to smallest: `d`, `h`, `m`, `s`, `ms`. Runtime data stores milliseconds.
`text`	Localization text key. See Localization.
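
The duration syntax in the table above can be illustrated with a short Rust sketch that converts a cell into milliseconds. It is not Sora's parser: parse_duration_ms is a hypothetical name, and it assumes that parts are separated by whitespace and that each unit appears at most once.

```rust
/// Parse a duration such as `1h 30m` into milliseconds.
/// Assumptions: parts are whitespace-separated; each unit appears at most once.
fn parse_duration_ms(text: &str) -> Result<u64, String> {
    // Units from largest to smallest with their size in milliseconds.
    const UNITS: [(&str, u64); 5] = [
        ("d", 86_400_000),
        ("h", 3_600_000),
        ("m", 60_000),
        ("s", 1_000),
        ("ms", 1),
    ];
    let mut total: u64 = 0;
    let mut next_allowed = 0; // index of the largest unit still allowed
    let mut saw_part = false;
    for part in text.split_whitespace() {
        let split = part
            .find(|c: char| !c.is_ascii_digit())
            .ok_or_else(|| format!("`{part}` has no unit"))?;
        let (digits, unit) = part.split_at(split);
        let amount: u64 = digits
            .parse()
            .map_err(|_| format!("`{part}` has no valid amount"))?;
        let index = UNITS
            .iter()
            .position(|(name, _)| *name == unit)
            .ok_or_else(|| format!("unknown unit `{unit}`"))?;
        if index < next_allowed {
            return Err(format!("unit `{unit}` is out of order or repeated"));
        }
        next_allowed = index + 1;
        total = amount
            .checked_mul(UNITS[index].1)
            .and_then(|ms| total.checked_add(ms))
            .ok_or("duration overflows u64 milliseconds")?;
        saw_part = true;
    }
    if saw_part { Ok(total) } else { Err("empty duration".to_string()) }
}
```

Here 1h 30m becomes 5400000 milliseconds, while 30m 1h is rejected because the units are not ordered from largest to smallest.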

Integer widths are validated by Sora before export. Some target languages do not have unsigned small integer types, so generated code may use a wider signed type while preserving the schema range.

[[tables.fields]]
name = "level"
type = "u16"
range = [1, 100]
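
For the level field above, validation combines two checks: the value must fit the declared width, and it must fall inside the inclusive range. A minimal Rust sketch of that idea, using a hypothetical validate_level helper and assuming the raw cell value has already been read as an i64:

```rust
use std::convert::TryFrom;

/// Check a raw integer against the `u16` width and an inclusive `range`.
fn validate_level(raw: i64, range: (u16, u16)) -> Result<u16, String> {
    // Width check: negative or too-large values do not fit in u16.
    let value = u16::try_from(raw).map_err(|_| format!("{raw} does not fit in u16"))?;
    // Range check: both bounds are inclusive.
    if value < range.0 || value > range.1 {
        return Err(format!("{value} is outside [{}, {}]", range.0, range.1));
    }
    Ok(value)
}
```

A target language without u16 could store the result in a wider signed type, since the checks have already constrained the value.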

Named Types

Type	Example
Enum	`enum<ItemType>`
Struct	`struct<ResourceCost>`
Union	`union<RewardAction>`
Reference	`ref<Item.id>`

References must point to the primary key of a mode = "map" table. Containers can wrap references, for example list<ref<Item.id>>.

[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"

[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "tuple" }
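
The reference rule above boils down to a membership test against the target table's primary keys. The Rust sketch below shows the idea for a list<ref<Item.id>> field; dangling_refs is a hypothetical helper and assumes the Item keys have already been loaded.

```rust
use std::collections::HashSet;

/// Return every referenced id that is missing from the target table's keys.
fn dangling_refs(item_keys: &[i32], refs: &[i32]) -> Vec<i32> {
    let known: HashSet<i32> = item_keys.iter().copied().collect();
    refs.iter().copied().filter(|id| !known.contains(id)).collect()
}
```

An empty result means every reference points to an existing row; any id in the result would be reported as a validation error.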

Collections

Type	Meaning
`list<T>`	Ordered repeated values.
`set<T>`	Unique repeated values.
`array<T,N>`	Fixed-length repeated values.
`map<K,V>`	Key/value pairs.
`optional<T>`	Nullable or absent value.

[[tables.fields]]
name = "tags"
type = "set<string>"
parser = { kind = "json" }
default = "[\"misc\"]"

[[tables.fields]]
name = "attributes"
type = "map<string,i32>"
parser = { kind = "map" }

Cell Examples

These examples show what a designer would put in an Excel or CSV cell:

Field type	Parser	Cell value
`u16`	none	`1001`
`enum<ItemType>`	none	`Weapon`
`list<i32>`	none or `split`	`1,2,3`
`duration`	none	`1h 30m`
`text`	none	`quest.1001.title`
`set<string>`	`json`	`["starter","melee"]`
`struct<ResourceCost>`	`tuple`	`Gold,0,100`
`struct<ResourceCost>`	`columns`	spread across `cost_kind`, `cost_id`, `cost_count` columns
`map<string,i32>`	`map`	`atk,10|hp,20`
`union<EventCondition>`	`json`	`{"type":"QuestCompleted","quest_id":5002}`
`optional<ref<Item.id>>`	none	empty cell or `1001`

Field Rules

[[tables.fields]], [[structs.fields]], and [[unions.variants.fields]] share a common set of field properties. Table fields have extra table-only properties for derived values; those properties are invalid on struct fields and union variant fields. A table primary key is declared once on the table itself with key = "field_name".

Field presence is part of the type: optional<T> means the value may be absent or null, while every other type is required unless a default fills the missing value.

For TOML/JSON/YAML-style object inputs, a field can be absent from the object. For Excel and CSV, the column must exist in the header; an omitted cell, blank cell, or short CSV record is treated as an empty cell.

Schema field	Object field absent	Excel/CSV cell empty
`type = "i32"`	Validation error.	Validation error.
`type = "optional<i32>"`	`null`.	`null`.
`type = "i32"` plus `default = "1"`	`1`.	`1`.
`type = "optional<i32>"` plus `default = "1"`	`1`.	`null`.
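
The four rows of the table above can be encoded as a single decision function. This Rust sketch is only a restatement of the table; Input, Resolved, and resolve_missing are hypothetical names, and the default is kept as text because it still has to be parsed as the field type.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Input {
    ObjectFieldAbsent, // TOML/JSON/YAML object without the field
    EmptyCell,         // Excel/CSV cell that is blank or omitted
}

#[derive(Debug, PartialEq)]
enum Resolved {
    Null,
    Default(String), // default text, still to be parsed as the field type
}

/// Decide what a missing value becomes, following the table above.
fn resolve_missing(
    optional: bool,
    default: Option<&str>,
    input: Input,
) -> Result<Resolved, String> {
    match (optional, default, input) {
        // An optional field with an empty cell stays null even with a default.
        (true, _, Input::EmptyCell) => Ok(Resolved::Null),
        // Otherwise a default fills the missing value.
        (_, Some(text), _) => Ok(Resolved::Default(text.to_string())),
        // Optional without a default is null.
        (true, None, _) => Ok(Resolved::Null),
        // Required without a default is a validation error.
        (false, None, _) => Err("required field has no value and no default".to_string()),
    }
}
```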

Property	Applies To	Purpose
`name`	all fields	Field name used in source data, validation errors, generated code, and exported runtime data.
`type`	all fields	Type expression such as `i32`, `struct<ResourceCost>`, or `list<union<RewardAction>>`.
`default`	all fields except derived fields	String value used when the source object field is absent or a required Excel/CSV cell is empty.
`comment`	all fields	Description used in generated Excel headers.
`range`	numeric fields, `duration`, and collection elements of those types	Inclusive numeric range, written as `[min, max]`. Duration ranges are milliseconds.
`length`	`string`, `list`, `set`, `array`, `map`	Inclusive length range, written as `[min, max]`.
`parser`	cell-based inputs and defaults	Cell parser hint. See Cell Parsers.
`scope`	all fields	Includes the field only for selected generation/export scopes. Defaults to `all`.
`from`	table fields only	Optional child-table source for a derived field.

Defaults are written as strings because they are parsed through the same type-aware conversion path as source data.

from describes a field derived from matching rows in another table; see References and Derived Fields. Derived fields can be list<T>, T, or optional<T> and cannot declare default.

Enums, Structs, and Unions

These definitions let schemas model more than flat tables.

Enums

[[enums]]
name = "Rarity"
values = ["Common", "Uncommon", "Rare", "Epic", "Legendary"]

Enums keep source data readable while generated code receives a constrained type.

Aliases can keep imported or legacy names readable:

[[enums.aliases]]
name = "Purple"
alias = "Epic"

Structs

[[structs]]
name = "ResourceCost"

[[structs.fields]]
name = "kind"
type = "enum<ResourceKind>"

[[structs.fields]]
name = "id"
type = "i32"

[[structs.fields]]
name = "count"
type = "i32"
range = [1, 999999]

Use structs for nested values that appear in many places. A field can reference a struct with type = "struct<ResourceCost>".

Struct fields use the same field properties as table fields, including name, type, default, comment, range, length, parser, and scope. Table-specific properties such as key and from are not meaningful for normal struct fields. See Types for the full field reference.

In cell-based inputs, a struct field can be written as JSON object text by default:

{"kind":"Gold","id":0,"count":100}

For compact cells, declare parser = { kind = "tuple" } on the field that references the struct. Tuple values follow the struct field order:

Gold,0,100

Unions

Use a union when one field can contain different shapes. For example, an event condition might be either “quest completed” or “player has item”:

{"type":"QuestCompleted","quest_id":5002}

{"type":"HasItem","item_id":1001,"count":2}

The type value selects which variant is present. The rest of the fields depend on that variant.

[[unions]]
name = "RewardAction"
tag = "type"

[[unions.variants]]
name = "AddItem"

[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"

[[unions.variants.fields]]
name = "count"
type = "i32"

[[unions.variants]]
name = "UnlockStage"

[[unions.variants.fields]]
name = "stage_id"
type = "ref<Stage.id>"

Typical uses include conditions, rewards, triggers, and scripted actions.

The union tag defaults to type if omitted. Source data must include that tag with the variant name. The remaining fields must match the selected variant; unknown fields and missing non-optional variant fields are validation errors.
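The rules in the previous paragraph can be sketched in Python. This is a rough model, not Sora's validator: validate_union and the variants dictionary layout are hypothetical, field types are not checked, and the sketch assumes that only optional<...> fields may be left out (it ignores default).

```python
def validate_union(value, variants, tag="type"):
    """Return a list of error strings for one union value.

    variants maps variant name -> {field name: type expression}.
    """
    if tag not in value:
        return [f"missing tag field '{tag}'"]
    variant = value[tag]
    if variant not in variants:
        return [f"unknown variant '{variant}'"]

    fields = variants[variant]
    errors = []
    for name in value:
        if name != tag and name not in fields:
            errors.append(f"unknown field '{name}' for variant '{variant}'")
    for name, type_expr in fields.items():
        # Assumption: only optional<...> fields may be absent.
        if name not in value and not type_expr.startswith("optional<"):
            errors.append(f"missing field '{name}' for variant '{variant}'")
    return errors


# The RewardAction union from the schema above, written as plain data.
REWARD_ACTION = {
    "AddItem": {"item_id": "ref<Item.id>", "count": "i32"},
    "UnlockStage": {"stage_id": "ref<Stage.id>"},
}

print(validate_union({"type": "AddItem", "item_id": 1001}, REWARD_ACTION))
```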

The most direct Excel or CSV form is JSON object text in one cell:

Field type	Cell value
`union<RewardAction>`	`{"type":"AddItem","item_id":1001,"count":2}`

For a list of union values, declare parser = { kind = "json" } and write a JSON array:

[[tables.fields]]
name = "actions"
type = "list<union<RewardAction>>"
parser = { kind = "json" }

[
  {"type":"AddItem","item_id":1001,"count":2},
  {"type":"UnlockStage","stage_id":9002}
]

If you do not want JSON in Excel or CSV cells, a single union<T> field can be expanded into several columns. This action field is one union value:

[[tables.fields]]
name = "action"
type = "union<RewardAction>"
parser = { kind = "tagged_columns" }

The Excel sheet then has columns like this:

A	B	C	D	E	F
`id`	`name`	`action.type`	`action.item_id`	`action.count`	`action.stage_id`
`1`	`Give Sword`	`AddItem`	`1001`	`2`
`2`	`Open Stage`	`UnlockStage`			`9002`

action.type contains the variant name. An AddItem row fills only item_id and count; an UnlockStage row fills only stage_id. Columns for other variants stay empty.

tagged_columns is only valid on a field whose type is exactly union<T>; it cannot be applied directly to list<union<T>>. When a parent field needs several union values, put each union value in a child row and derive the parent list from that child table:

[[tables]]
name = "EventRule"
mode = "map"
key = "id"

[[tables.fields]]
name = "actions"
type = "list<union<RewardAction>>"
from = { table = "EventActionEntry", parent_key = "id", child_key = "event_id", field = "value", order_by = "seq" }

[[tables]]
name = "EventActionEntry"
mode = "list"

[[tables.fields]]
name = "event_id"
type = "ref<EventRule.id>"

[[tables.fields]]
name = "seq"
type = "i32"

[[tables.fields]]
name = "value"
type = "union<RewardAction>"
parser = { kind = "tagged_columns", prefix = "" }

The parent EventRule sheet keeps ordinary columns:

A	B
`id`	`name`
`1`	`First Event`

The child EventActionEntry sheet stores one action per row:

A	B	C	D	E	F
`event_id`	`seq`	`type`	`item_id`	`count`	`stage_id`
`1`	`1`	`AddItem`	`1001`	`2`
`1`	`2`	`UnlockStage`			`9002`

On export, EventRule.actions receives two union values ordered by seq. The prefix = "" option makes the child table columns use plain names such as type, item_id, count, and stage_id; do not use an empty prefix if those names conflict with other fields on the same table.

See Cell Parsers for the exact column rules.

In TOML data files, unions can be written as normal nested tables:

[[rows]]
id = 1
condition = { type = "QuestCompleted", quest_id = 5002 }
actions = [
  { type = "AddItem", item_id = 1001, count = 2 },
  { type = "UnlockStage", stage_id = 9002 },
]

References and Derived Fields

References let one table point to another table’s primary key. Derived fields copy or assemble data from matching rows in another table.

Feature	What source data stores	What runtime model gets
`ref<Item.id>`	The target row id, such as `1001`.	The id value or a target-specific wrapper.
`from = { ... }`	Rows stay in a child table.	The parent row receives a copied/nested value.

Use ref when the relationship itself should remain an id. Use from when exported data should contain a convenient nested field.

The target of a ref must be a mode = "map" table, and the referenced field must be that table’s key.

References

[[tables.fields]]
name = "required_item"
type = "ref<Item.id>"

Sora validates that every value points to an existing row in the referenced table.

References are still stored as values in source data. The generated runtime can expose them as key values or target-specific wrapper types depending on the language backend.

References can be nested in containers such as list<ref<Item.id>>, set<ref<Item.id>>, or optional<ref<Item.id>>. The same primary-key rule applies to the inner ref.
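As a rough picture of the reference check, the Python sketch below collects the target table's key values and reports every reference that has no matching row, including references nested in a list, set, or optional. The function names and the row layout (plain dictionaries) are assumptions made for illustration, not Sora internals.

```python
def ref_values(value):
    """Yield the ids inside a ref, optional<ref>, list<ref>, or set<ref> value."""
    if value is None:  # optional<ref<...>> with no value
        return
    if isinstance(value, (list, set, tuple, frozenset)):
        for item in value:
            yield from ref_values(item)
    else:
        yield value


def dangling_refs(rows, field, target_rows, target_key="id"):
    """Return (row index, id) pairs whose id has no row in the target table."""
    known = {row[target_key] for row in target_rows}
    return [
        (index, ref)
        for index, row in enumerate(rows)
        for ref in ref_values(row.get(field))
        if ref not in known
    ]


items = [{"id": 1001}, {"id": 1002}]
rows = [
    {"starter_items": [1001, 1002]},
    {"starter_items": [1001, 1003]},  # 1003 does not exist in Item
    {"starter_items": None},
]
print(dangling_refs(rows, "starter_items", items))
```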

Derived Fields

A derived field is not read from the current table’s cell. It is built from matching rows in another table.

This keeps editable data normalized while generated runtime models can expose convenient nested values. For example, quest rewards can be stored as two tables:

Quest:

id	name
1001	First Quest
1002	Second Quest

QuestReward:

quest_id	sort_order	item_id	count
1001	1	2001	10
1001	2	2002	1
1002	1	2003	5

At runtime, Quest may want a direct rewards: list<Reward> field. Declare that the field comes from QuestReward:

[[structs]]
name = "Reward"

[[structs.fields]]
name = "item_id"
type = "ref<Item.id>"

[[structs.fields]]
name = "count"
type = "i32"

[[tables]]
name = "Quest"
mode = "map"
key = "id"

[[tables.fields]]
name = "id"
type = "i32"

[[tables.fields]]
name = "name"
type = "string"

[[tables.fields]]
name = "rewards"
type = "list<struct<Reward>>"
from = { table = "QuestReward", parent_key = "id", child_key = "quest_id", order_by = "sort_order" }

[[tables]]
name = "QuestReward"
mode = "list"

[[tables.fields]]
name = "quest_id"
type = "ref<Quest.id>"

[[tables.fields]]
name = "sort_order"
type = "i32"

[[tables.fields]]
name = "item_id"
type = "ref<Item.id>"

[[tables.fields]]
name = "count"
type = "i32"

This means:

from.table = "QuestReward": read matching rows from the QuestReward child table.
from.parent_key = "id": use the parent row’s Quest.id value for matching.
from.child_key = "quest_id": match child rows where QuestReward.quest_id equals the parent key.
from.order_by = "sort_order": when several child rows match, sort them by the child table’s sort_order field in ascending order.

With the example data above, Quest.id = 1001 receives two reward rows, ordered as 2001, then 2002.

The exported parent row is shaped as if rewards had been written directly on Quest:

{
  "id": 1001,
  "name": "First Quest",
  "rewards": [
    {"item_id": 2001, "count": 10},
    {"item_id": 2002, "count": 1}
  ]
}

The field type controls how many child rows may match:

Field type	Match count	Result when no row matches
`list<T>`	zero or more	empty list
`optional<T>`	zero or one	`null`
`T`	exactly one	validation error

If T or optional<T> matches more than one child row, Sora reports an error.
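The matching, ordering, and match-count rules above can be sketched in a few lines of Python. This is a model of the described behavior under stated assumptions, not Sora's implementation: derive, the spec dictionary, and the kind argument are hypothetical names, and the child rows are deliberately listed out of order to show the effect of order_by.

```python
def derive(parent_row, child_rows, spec, struct_fields, kind="list"):
    """Build one derived value for parent_row.

    spec mirrors the from object: parent_key, child_key, optional order_by.
    kind is "list", "optional", or "required" for list<T>, optional<T>, and T.
    """
    key = parent_row[spec["parent_key"]]
    matches = [row for row in child_rows if row[spec["child_key"]] == key]
    if "order_by" in spec:
        # sorted() is stable, so in this sketch ties keep the read order.
        matches = sorted(matches, key=lambda row: row[spec["order_by"]])

    # Assemble a struct from child fields that share the struct field names.
    values = [{name: row[name] for name in struct_fields} for row in matches]

    if kind == "list":
        return values
    if len(values) > 1:
        raise ValueError(f"{len(values)} child rows match key {key!r}")
    if values:
        return values[0]
    if kind == "optional":
        return None
    raise ValueError(f"no child row matches key {key!r}")


QUEST_REWARD = [
    {"quest_id": 1001, "sort_order": 2, "item_id": 2002, "count": 1},
    {"quest_id": 1001, "sort_order": 1, "item_id": 2001, "count": 10},
    {"quest_id": 1002, "sort_order": 1, "item_id": 2003, "count": 5},
]
SPEC = {"parent_key": "id", "child_key": "quest_id", "order_by": "sort_order"}

rewards = derive({"id": 1001, "name": "First Quest"}, QUEST_REWARD, SPEC, ["item_id", "count"])
print(rewards)
```

The child rows are only read, never removed, so calling derive again for another parent row sees the same QUEST_REWARD list.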

Copying One Child Field

Without from.field, Sora assembles a struct from child table fields with the same names as the struct fields.

When the parent should receive one field from the child row instead, set from.field:

[[unions]]
name = "EventCondition"
tag = "type"

[[unions.variants]]
name = "QuestCompleted"

[[unions.variants.fields]]
name = "quest_id"
type = "ref<Quest.id>"

[[unions.variants]]
name = "HasItem"

[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"

[[unions.variants.fields]]
name = "count"
type = "i32"

[[tables]]
name = "Event"
mode = "map"
key = "id"

[[tables.fields]]
name = "condition"
type = "union<EventCondition>"
from = { table = "EventConditionEntry", parent_key = "id", child_key = "event_id", field = "value" }

[[tables]]
name = "EventConditionEntry"
mode = "list"

[[tables.fields]]
name = "event_id"
type = "ref<Event.id>"

[[tables.fields]]
name = "value"
type = "union<EventCondition>"
parser = { kind = "tagged_columns", prefix = "" }

This means Event.condition receives EventConditionEntry.value for the child row whose event_id matches Event.id. The child table may still contain helper columns such as id, event_id, notes, or sort fields; only the value field named by from.field is copied into the parent field.

In Excel, EventConditionEntry can look like this:

A	B	C	D	E
`event_id`	`type`	`quest_id`	`item_id`	`count`
`1`	`QuestCompleted`	`5002`
`2`	`HasItem`		`1001`	`2`

From Options

The from object has these options:

Option	Required	Meaning
`table`	yes	Child table name. Sora scans this table for matching rows.
`parent_key`	yes	Field name on the parent table. Each parent row uses this field value for matching.
`child_key`	yes	Field name on the child table. A child row is selected when this value equals the parent key.
`field`	no	Field name on the child table. When present, Sora copies this field’s value instead of assembling a struct from the child row.
`order_by`	no	Field name on the child table. When present, matched child rows are sorted by this field in ascending order.

order_by is a field name, not an expression. There is no desc, multi-field ordering, filtering, or custom sort syntax. If order_by is omitted, matched rows keep the source table read order.

The order_by field must exist on the child table. It is usually an i32 ordering field such as sort_order, seq, or rank.

Without from.field, the derived value type must be a struct, either list<struct<...>>, struct<...>, or optional<struct<...>>. Struct fields are copied from child table fields with the same names:

[[structs]]
name = "Reward"

[[structs.fields]]
name = "item_id"
type = "ref<Item.id>"

[[structs.fields]]
name = "count"
type = "i32"

Here Reward.item_id and Reward.count must both exist as compatible fields on QuestReward.

With from.field, the derived value type must be compatible with that child field. For example, type = "union<EventCondition>" can derive from a child field value whose type is also union<EventCondition>.

A derived field cannot also declare default. Its value comes from matched child rows.
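The schema-level constraints in this section can be collected into one small check. The Python sketch below is a hypothetical helper (check_from is not a Sora function) that takes a parsed field definition plus the child table's field names; it covers only name existence and the default rule, not type compatibility.

```python
def check_from(field, child_fields, struct_fields=None):
    """Return schema errors for one derived field definition.

    field is the parsed field table; child_fields are the child table's
    field names; struct_fields are the target struct's field names and
    are only used when from.field is not set.
    """
    spec = field["from"]
    errors = []
    if "default" in field:
        errors.append("derived field cannot declare default")
    for option in ("child_key", "order_by", "field"):
        name = spec.get(option)
        if name is not None and name not in child_fields:
            errors.append(f"from.{option} '{name}' is not a field on {spec['table']}")
    if "field" not in spec:
        for name in struct_fields or []:
            if name not in child_fields:
                errors.append(f"struct field '{name}' is missing on {spec['table']}")
    return errors


rewards_field = {
    "name": "rewards",
    "type": "list<struct<Reward>>",
    "from": {"table": "QuestReward", "parent_key": "id",
             "child_key": "quest_id", "order_by": "sort_order"},
}
quest_reward_fields = ["quest_id", "sort_order", "item_id", "count"]
print(check_from(rewards_field, quest_reward_fields, ["item_id", "count"]))
```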

Multiple Derived Fields from One Child Table

Several parent tables can derive fields from the same child table. This does not consume or move child rows. It reads the child table and copies matching values into each parent field.

For example, both Quest and QuestPreview can receive rewards from QuestReward:

[[tables]]
name = "Quest"
mode = "map"
key = "id"

[[tables.fields]]
name = "rewards"
type = "list<struct<Reward>>"
from = { table = "QuestReward", parent_key = "id", child_key = "quest_id", order_by = "sort_order" }

[[tables]]
name = "QuestPreview"
mode = "map"
key = "id"

[[tables.fields]]
name = "rewards"
type = "list<struct<Reward>>"
from = { table = "QuestReward", parent_key = "id", child_key = "quest_id", order_by = "sort_order" }

If both Quest.id = 1001 and QuestPreview.id = 1001 exist, both parent rows receive the reward list from QuestReward.quest_id = 1001. Sora does not mark the child row as already used by Quest, and it does not remove the row from QuestReward.

Cell Parsers

Parsers are only for cell-based inputs such as Excel and CSV. Most parsers tell Sora how to turn one cell into a typed value; projection parsers such as columns and tagged_columns tell Sora how one field maps to several input columns. String default values use the same parser path for single-cell parsers. TOML row data can usually use native TOML arrays and tables instead.

Use a parser when the default cell format is too verbose or ambiguous:

[[tables.fields]]
name = "tags"
type = "list<string>"
parser = { kind = "split", separator = "|" }

With that schema, the cell value is:

starter|melee|weapon

Parser options are string values. Unknown parser kinds, unsupported options, and empty option values fail during schema normalization. The exception is projection prefixes such as columns.prefix and tagged_columns.prefix, where "" is meaningful.

Custom Lua Parsers

Projects can load project-local Lua parser scripts from project.toml:

[parsers]
scripts = ["tools/parsers.lua"]

Script paths are resolved relative to the project file. After that, every command that reads the project can use the custom parsers without repeating command-line flags:

sora build --project project.toml
sora export --project project.toml --data-root data --format json --out generated/config.json

CLI commands can also load temporary parser scripts with the global --parser-script option:

sora --parser-script tools/parsers.lua build --project project.toml
sora --parser-script tools/parsers.lua export --project project.toml --data-root data --format json --out generated/config.json

The option can be repeated and is appended after project-configured scripts. Custom parsers are trusted project code. Sora loads them with a limited Lua standard library and does not expose io, os, package, or debug.

A parser script returns a table with a parsers entry that maps each parser name to its definition. Each parser must define parse(cell, ctx). options lists the parser options the parser accepts. validate(field) is optional and runs during schema normalization.

return {
  parsers = {
    slug = {
      options = { "prefix" },
      validate = function(field)
        if field.type ~= "string" then
          error("slug parser requires string")
        end
      end,
      parse = function(cell, ctx)
        local text = string.lower(string.gsub(cell.text, "%s+", "-"))
        if ctx.options.prefix ~= nil then
          return ctx.options.prefix .. text
        end
        return text
      end,
    },
  },
}

Schema fields use the custom parser by name:

[[tables.fields]]
name = "tag"
type = "string"
parser = { kind = "slug", prefix = "item-" }

cell contains kind, text, and value where applicable. ctx contains field, type, options, path, and location fields such as row, column, and sheet for worksheets. Lua return values map to Sora data values: nil, booleans, integers, floats, strings, array-like tables, and string-keyed tables.

Custom Lua parsers are single-cell parsers. They do not replace projection parsers such as columns or tagged_columns, cannot read neighboring cells, and do not change schema, source loading, or generated runtime behavior.
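To see what the slug parser above produces for a given cell, here is a Python equivalent of its parse function. It is only a way to try the transformation outside Sora and assumes ASCII cell text, since Lua's string.lower and %s operate on the C locale by default.

```python
import re


def slug(text, prefix=None):
    """Mirror the Lua slug parser: whitespace runs become '-', then lower-case."""
    # re.ASCII keeps \s close to Lua's %s in the default C locale.
    result = re.sub(r"\s+", "-", text, flags=re.ASCII).lower()
    return result if prefix is None else prefix + result


# With parser = { kind = "slug", prefix = "item-" } on the field:
print(slug("Iron Sword", prefix="item-"))
```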

Default Parsing

If a field has no parser, Sora uses type-aware default parsing:

Type	Cell format
`bool`	Boolean cells, `true`, `false`, or numeric cells where zero is false and non-zero is true.
`i32`, `i64`, `ref<Table.key>`	Integer cells, integer text, or whole-number float cells.
`duration`	Duration text using `d`, `h`, `m`, `s`, or `ms`, for example `500ms`, `30s`, or `1h 30m`. Units must be ordered from largest to smallest.
`f32`, `f64`	Numeric cells or numeric text.
`string`, `enum<Name>`	Cell display text.
`struct<Name>`, `union<Name>`	JSON object text.
`list<T>`, `set<T>`, `array<T,N>`	Comma-separated text. Use `json` for JSON arrays.
`map<K,V>`	JSON array of two-item pairs, for example `[["atk",10],["hp",20]]`.
`optional<T>`	Empty cell becomes `null`; otherwise the inner `T` is parsed.

Default collection parsing is intentionally simple. Primitive items are parsed by type. Struct and union collection items must be JSON object text. Nested collections cannot be represented safely with one separator; use parser = { kind = "json" }.
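The duration row in the table above is the least obvious default format, so here is a Python sketch of one way to read it. It is not Sora's parser. The assumptions: the result is in milliseconds, whitespace between components is optional, and each unit may appear at most once.

```python
import re

UNIT_MS = {"d": 86_400_000, "h": 3_600_000, "m": 60_000, "s": 1_000, "ms": 1}
UNIT_ORDER = ["d", "h", "m", "s", "ms"]
# "ms" is listed before "m" so that "500ms" is not read as 500 minutes.
TOKEN = re.compile(r"\s*(\d+)(ms|d|h|m|s)")


def parse_duration_ms(text):
    """Convert text such as '1h 30m' to milliseconds."""
    text = text.strip()
    if not text:
        raise ValueError("empty duration")
    total, pos, last_rank = 0, 0, -1
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse duration at offset {pos}: {text!r}")
        rank = UNIT_ORDER.index(match.group(2))
        if rank <= last_rank:
            raise ValueError(f"units must go from largest to smallest: {text!r}")
        total += int(match.group(1)) * UNIT_MS[match.group(2)]
        pos, last_rank = match.end(), rank
    return total


for sample in ("500ms", "30s", "1h 30m"):
    print(sample, parse_duration_ms(sample))
```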

Parser Summary

Parser	Valid target types	Cell shape
`split`	`list<T>`, `set<T>`, `array<T,N>`, or `optional` around those types	`a,b,c`
`tuple`	`struct<T>` or `optional<struct<T>>`	`Gold,0,100`
`columns`	`struct<T>` or `optional<struct<T>>`	Multiple columns
`tuple_list`	`list<struct<T>>`, `set<struct<T>>`, `array<struct<T>,N>`, or `optional` around those types	`Gold,0,100|Gem,0,5`
`map`	`map<K,V>` or `optional<map<K,V>>`	`atk,10|hp,20`
`tagged_columns`	`union<T>` only	Multiple columns
`json`	Any type	JSON value matching the field type

array<T,N> checks the parsed item count. tuple checks the value count against the referenced struct’s field count.

split

Use split for a flat collection of primitive values, enums, refs, or simple values that can be separated reliably.

[[tables.fields]]
name = "starter_items"
type = "list<ref<Item.id>>"
parser = { kind = "split" }

Cell:

1001,1002,1003

Parsed value:

[1001,1002,1003]

Use separator when comma is not a good separator:

[[tables.fields]]
name = "tags"
type = "set<string>"
parser = { kind = "split", separator = "|" }

Cell:

starter|melee|weapon
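The two split examples behave like the Python sketch below. It is a simplified model, not Sora's parser: parse_split is a hypothetical name, item stands in for the type-aware conversion of each element, and empty cells or whitespace trimming are not handled.

```python
def parse_split(cell, item, separator=",", as_set=False):
    """Split one cell and convert every part with item (for example int or str)."""
    parts = [item(part) for part in cell.split(separator)]
    return set(parts) if as_set else parts


# list<ref<Item.id>> with the default comma separator
print(parse_split("1001,1002,1003", int))
# set<string> with separator = "|"
print(sorted(parse_split("starter|melee|weapon", str, separator="|", as_set=True)))
```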

tuple

Use tuple when a single struct is small enough to fit naturally in one cell. Values follow the referenced struct’s field declaration order.

[[structs]]
name = "ResourceCost"

[[structs.fields]]
name = "kind"
type = "enum<ResourceKind>"

[[structs.fields]]
name = "id"
type = "i32"

[[structs.fields]]
name = "count"
type = "i32"

[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "tuple" }

Cell:

Gold,0,100

Parsed value:

{"kind":"Gold","id":0,"count":100}

Use separator if struct values themselves commonly contain commas:

parser = { kind = "tuple", separator = "|" }

Cell:

Gold|0|100
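A Python sketch of the tuple behavior, including the value-count check against the struct's field count. parse_tuple and the (name, converter) list are illustrative assumptions; the converters stand in for Sora's type-aware parsing of each struct field.

```python
def parse_tuple(cell, fields, separator=","):
    """Parse one cell into a struct-shaped dict.

    fields is an ordered list of (name, converter) pairs in struct
    declaration order.
    """
    parts = cell.split(separator)
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(parts)}: {cell!r}")
    return {name: convert(part) for (name, convert), part in zip(fields, parts)}


# ResourceCost declares kind, id, count in that order.
RESOURCE_COST = [("kind", str), ("id", int), ("count", int)]

print(parse_tuple("Gold,0,100", RESOURCE_COST))
print(parse_tuple("Gold|0|100", RESOURCE_COST, separator="|"))
```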

columns

Use columns when one struct should be edited as normal Excel or CSV columns instead of as JSON or one compact tuple cell. It is valid on struct<T> and optional<struct<T>> table fields.

[[structs]]
name = "ResourceCost"

[[structs.fields]]
name = "kind"
type = "enum<ResourceKind>"

[[structs.fields]]
name = "id"
type = "i32"

[[structs.fields]]
name = "count"
type = "i32"

[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "columns", prefix = "price_" }

CSV headers and row:

id,name,price_kind,price_id,price_count
1,Iron Sword,Gold,0,100

Parsed price value:

{"kind":"Gold","id":0,"count":100}

With the default prefix, a field named price projects columns such as price.kind, price.id, and price.count. Use prefix = "" only when the struct field names should live at the table’s top level. Sora rejects projected column name conflicts.
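The header naming and the conflict rule can be illustrated with a short Python sketch. Both function names are hypothetical, and the conflict check shown here is only a simple duplicate-header test, which may be narrower than what Sora actually rejects.

```python
def projected_columns(field_name, struct_fields, prefix=None):
    """Return the column headers that one columns field expands to."""
    if prefix is None:
        prefix = field_name + "."  # default: price.kind, price.id, price.count
    return [prefix + name for name in struct_fields]


def column_conflicts(plain_columns, projected):
    """Return headers that would appear more than once in the sheet."""
    seen, conflicts = set(), []
    for name in list(plain_columns) + list(projected):
        if name in seen and name not in conflicts:
            conflicts.append(name)
        seen.add(name)
    return conflicts


fields = ["kind", "id", "count"]
print(projected_columns("price", fields))
print(projected_columns("price", fields, prefix="price_"))
# With prefix = "" the struct's id field collides with the table's own id column.
print(column_conflicts(["id", "name"], projected_columns("price", fields, prefix="")))
```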

columns does not recursively project nested structs or unions. If a projected struct field is itself complex, either give that child field a single-cell parser such as tuple, split, map, or json, or move the nested data into a dedicated table and connect it with ref or a derived field. This keeps the spreadsheet narrow and keeps complex records reusable.

For generated XLSX templates, columns projected from the same columns field share the same header color.

tuple_list

Use tuple_list for a list of small structs. separator splits fields inside one struct item. item_separator splits items in the list.

[[tables.fields]]
name = "materials"
type = "list<struct<ResourceCost>>"
parser = { kind = "tuple_list" }

Cell:

Item,2003,4|Gold,0,1000

Parsed value:

[
  {"kind":"Item","id":2003,"count":4},
  {"kind":"Gold","id":0,"count":1000}
]

Custom separators:

parser = { kind = "tuple_list", separator = ":", item_separator = ";" }

Cell:

Item:2003:4;Gold:0:1000
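The two separators nest as in the Python sketch below: item_separator is applied first, then separator inside each item. parse_tuple_list is a hypothetical name, and the defaults "," and "|" are inferred from the example cells above rather than taken from a stated rule.

```python
def parse_tuple_list(cell, fields, separator=",", item_separator="|"):
    """Parse a cell such as 'Item,2003,4|Gold,0,1000' into struct-shaped dicts.

    fields is an ordered list of (name, converter) pairs in struct
    declaration order.
    """
    items = []
    for chunk in cell.split(item_separator):
        parts = chunk.split(separator)
        if len(parts) != len(fields):
            raise ValueError(f"expected {len(fields)} values, got {len(parts)}: {chunk!r}")
        items.append({name: convert(part) for (name, convert), part in zip(fields, parts)})
    return items


RESOURCE_COST_FIELDS = [("kind", str), ("id", int), ("count", int)]

materials = parse_tuple_list("Item,2003,4|Gold,0,1000", RESOURCE_COST_FIELDS)
custom = parse_tuple_list("Item:2003:4;Gold:0:1000", RESOURCE_COST_FIELDS,
                          separator=":", item_separator=";")
print(materials == custom)
```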

map

Use map when a map is simple enough to write as repeated key/value pairs. separator splits key from value. item_separator splits map entries.

[[tables.fields]]
name = "attributes"
type = "map<string,i32>"
parser = { kind = "map" }

Cell:

atk,10|hp,20

Parsed value:

[["atk",10],["hp",20]]

Sora exports maps as pair arrays so non-string keys remain unambiguous. If you prefer JSON cell syntax, use parser = { kind = "json" } and write the same pair-array shape:

[["atk",10],["hp",20]]
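A Python sketch of the map cell format and the pair-array shape it produces. parse_map is a hypothetical helper with key and value converters standing in for Sora's typed parsing; the default separators are inferred from the example cell, and duplicate keys are not checked.

```python
import json


def parse_map(cell, key, value, separator=",", item_separator="|"):
    """Parse 'atk,10|hp,20' into a list of [key, value] pairs."""
    pairs = []
    for entry in cell.split(item_separator):
        parts = entry.split(separator)
        if len(parts) != 2:
            raise ValueError(f"expected key{separator}value, got {entry!r}")
        pairs.append([key(parts[0]), value(parts[1])])
    return pairs


attributes = parse_map("atk,10|hp,20", str, int)
print(json.dumps(attributes, separators=(",", ":")))

# Integer keys stay integers in a pair array; a JSON object would turn them into strings.
print(parse_map("1,10|2,20", int, int))
```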

tagged_columns

Use tagged_columns when one union<T> value should be edited across multiple Excel or CSV columns. It is only valid on a table field whose type is exactly union<T>. It is intentionally not valid for optional<union<T>>, list<union<T>>, set<union<T>>, or other containers.

[[unions]]
name = "EventCondition"
tag = "type"

[[unions.variants]]
name = "QuestCompleted"

[[unions.variants.fields]]
name = "quest_id"
type = "ref<Quest.id>"

[[unions.variants]]
name = "HasItem"

[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"

[[unions.variants.fields]]
name = "count"
type = "i32"

[[tables.fields]]
name = "value"
type = "union<EventCondition>"
parser = { kind = "tagged_columns", prefix = "" }

CSV headers and rows:

id,type,quest_id,item_id,count
1,QuestCompleted,5002,,
2,HasItem,,1001,2

The tag column contains the union variant name. Only fields for the selected variant may contain values. With the default prefix, a field named condition projects columns such as condition.type, condition.quest_id, and condition.item_id. Use prefix = "" only when the projected columns should live at the table’s top level.

Sora rejects projected column name conflicts, for example a normal table field named type plus prefix = "" for a union whose tag is also type.
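How one row turns into a union value can be sketched in Python. This is a simplified model rather than Sora's loader: read_tagged_columns is a hypothetical name, rows are dictionaries keyed by header, empty cells are empty strings, and cell values are left as text instead of being converted to their field types.

```python
def read_tagged_columns(row, field_name, variants, tag="type", prefix=None):
    """Collect one union value from a row dict keyed by column header.

    variants maps variant name -> list of field names.
    """
    if prefix is None:
        prefix = field_name + "."  # default: condition.type, condition.quest_id, ...
    variant = row.get(prefix + tag, "")
    if variant not in variants:
        raise ValueError(f"unknown variant {variant!r} in column {prefix + tag}")

    selected = variants[variant]
    value = {tag: variant}
    for fields in variants.values():
        for field in fields:
            cell = row.get(prefix + field, "")
            if cell == "":
                continue
            if field not in selected:
                # Only fields of the selected variant may contain values.
                raise ValueError(f"column {prefix + field} is not part of variant {variant}")
            value[field] = cell
    return value


EVENT_CONDITION = {"QuestCompleted": ["quest_id"], "HasItem": ["item_id", "count"]}

row = {"id": "2", "type": "HasItem", "quest_id": "", "item_id": "1001", "count": "2"}
print(read_tagged_columns(row, "value", EVENT_CONDITION, prefix=""))
```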

tagged_columns also does not recursively project nested structs or nested unions inside variant fields. Variant fields can still use single-cell parsers such as tuple, split, map, or json. If a variant needs a large nested object or repeated nested objects, model that data as a dedicated table and reference or derive it instead of widening the union row.

For generated XLSX templates, columns projected from the same tagged_columns field share the same header color. The tag column uses the same color group with stronger emphasis.

json

Use json for nested values, unions inside containers, nested collections, and any shape that needs explicit escaping.

[[tables.fields]]
name = "actions"
type = "list<union<RewardAction>>"
parser = { kind = "json" }

Cell:

[
  {"type":"AddItem","item_id":1007,"count":3},
  {"type":"UnlockStage","stage_id":9002}
]

For one union value:

[[tables.fields]]
name = "condition"
type = "union<EventCondition>"
parser = { kind = "json" }

Cell:

{"type":"QuestCompleted","quest_id":5002}

For map<K,V>, JSON uses an array of pairs, not a JSON object:

[["atk",10],["hp",20]]

Choosing a Parser

Need	Prefer
Flat list of primitive values	`split`
One compact struct	`tuple`
One struct spread across columns	`columns`
Repeated compact structs	`tuple_list`
Simple key/value pairs	`map`
One union spread across columns	`tagged_columns`
Nested values, unions in containers, escaping, or JSON-shaped cells	`json`

Project Config

The project manifest can be used as a simple schema root or as a full build description. It can be written as TOML, YAML, JSON, or Lua; examples on this page use TOML.

package = "game_config"
includes = ["schema/items.toml"]

[parsers]
scripts = ["tools/parsers.lua"]

[type_mappings]
scripts = ["tools/type_mappings.lua"]

[build]
default_source_format = "xlsx"
data_root = "data"
schema_lock = "generated/schema.lock"
excel_templates = "generated/excel"

[[build.codegen]]
target = "rust"
out = "rust/src/generated"
format = "auto"

[[build.exports]]
format = "binary"
out = "generated/config.sora"

Run every configured output:

sora build --project project.toml

data_root and excel_templates serve different purposes. data_root is the input directory used by export and build, so it contains edited table rows. excel_templates is an output directory for generated workbook templates, so it can be deleted and regenerated after schema changes. Do not point excel_templates at your edited data directory unless replacing those workbooks is intentional.

[parsers].scripts lists custom Lua cell parser scripts used by CLI commands that read the project. Paths are relative to the project file. See Cell Parsers for the script API.

[type_mappings].scripts lists Lua scripts that customize generated language types. Paths are relative to the project file. Type mappings are codegen-only: the schema still uses language-neutral Sora types such as struct<Vec3>, while the mapping script can map that named type to a target-specific type.

Localization is declared at the project root with [localization]. Its sources are independent from normal [[tables]]; see Localization.

Run one configured codegen target:

sora build --project project.toml --target rust

Target Options

Language-specific options live under [codegen.<target>]:

[codegen.rust]
runtime_format = "sora"

[codegen.typescript]
runtime_format = "json"
enum_repr = "string"

[codegen.lua]
runtime_format = "cbor"
lua_version = "5.4"

These options are consumed by the selected generator. The normalized IR stays language-neutral.

Type mapping scripts return a table with type_mappings. Each mapping targets one language and one named schema type:

return {
  type_mappings = {
    {
      target = "csharp",
      schema_type = "Vec3",
      type_name = "Vector3",
      nullable_type_name = "Vector3?",
      decode = "GameMappings.ToVector3({value})",
      value_decode = "GameMappings.ToVector3({value})",
      imports = { "UnityEngine" },
    },
  },
}

nullable_type_name is optional. Use it when optional<schema_type> needs a different target-language type expression from the backend’s default nullable wrapper.

decode wraps the normal binary runtime decode expression, and value_decode wraps JSON/CBOR/protobuf-style value decode. The {value} placeholder is replaced with the generated default expression.
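The placeholder substitution amounts to a plain text replacement, as in this Python sketch. apply_mapping is a hypothetical helper, and reader.ReadStruct() is a made-up default expression used only to show the shape of the result; the real generated expression depends on the backend.

```python
def apply_mapping(template, default_expr):
    """Substitute the generated default expression for the {value} placeholder."""
    return template.replace("{value}", default_expr)


# "reader.ReadStruct()" is a made-up expression, not real generator output.
print(apply_mapping("GameMappings.ToVector3({value})", "reader.ReadStruct()"))
```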

The C target uses write-into decode functions, so C mappings should use decode_into instead of decode. The {target} placeholder is replaced with the output pointer expression. C mappings can also provide free, where {target} is replaced with the pointer that should be released:

{
  target = "c",
  schema_type = "Vec3",
  type_name = "game_vector3",
  decode_into = "game_vector3_decode(reader, {target})",
  free = "game_vector3_free({target});",
  imports = { "#include \"vector3.h\"" },
}

imports is target-specific and is only emitted by language generators that need it. C#, Java, Kotlin, and Scala expect an import namespace/path without the leading keyword. Go expects an import spec such as "example.com/game/vector". Python, TypeScript, JavaScript, Dart, Godot, C, C++, and Rust expect a complete import/include/use/preload line.

runtime_format can be sora, json, cbor, or sora-protobuf, but not every target supports every runtime format. See Runtime Formats for the support matrix.

Built-In Target Options

Target	Options
`rust`	`runtime_format` default `sora`; `map_type = "std"` or `"fx_hash_map"` default `std`; `string_storage = "owned"` or `"arc"` default `owned`.
`kotlin`	`runtime_format` default `sora`.
`csharp`	`runtime_format` default `sora`.
`java`	`runtime_format` default `sora`; `nullable_annotation` default `SoraNullable`; set it to an annotation class such as `org.jetbrains.annotations.Nullable`, or to `""` to disable annotations.
`scala`	`runtime_format` default `sora`; `scala_version = "2.12"`, `"2.13"`, or `"3"` default `3`.
`go`	`runtime_format` default `sora`.
`dart`	`runtime_format = "json"`, `"cbor"`, or `"sora-protobuf"`. Set this explicitly; `sora` is not supported for Dart.
`godot`	`runtime_format = "json"`. Set this explicitly; it is the only supported Godot runtime format.
`c`	`runtime_format = "sora"`; `c_standard = "c99"`, `"c11"`, `"c17"`, or `"c23"` default `c11`; `prefix` optional symbol prefix.
`cpp`	`runtime_format = "sora"`; `cpp_standard = "c++11"`, `"c++14"`, `"c++17"`, `"c++20"`, or `"c++23"` default `c++17`; `namespace` optional C++ namespace.
`typescript`	`runtime_format` default `sora`; `enum_repr = "string"` or `"integer"` default `string`.
`javascript`	`runtime_format` default `sora`; `enum_repr = "string"` or `"integer"` default `string`; `emit_dts` boolean default `true`.
`erlang`	`runtime_format` default `sora`; `enum_repr = "atom"` or `"integer"` default `atom`.
`lua`	`runtime_format` default `sora`; `module` optional require/import prefix; `lua_version = "5.1"`, `"5.2"`, `"5.3"`, `"5.4"`, or `"luajit"` default `5.4`; `enum_repr = "string"` or `"integer"` default `string`.
`python`	`runtime_format` default `sora`.
`proto-schema`	No target options. Generates `.proto` schema files instead of a runtime loader.

Example with several language-specific options:

[codegen.rust]
runtime_format = "sora"
map_type = "fx_hash_map"
string_storage = "arc"

[codegen.cpp]
runtime_format = "sora"
cpp_standard = "c++20"
namespace = "game::config"

[codegen.javascript]
runtime_format = "json"
enum_repr = "integer"
emit_dts = true

Localization

Sora treats translated text as a separate locale catalog, not as a normal config table.

Business config stores text keys with the text type. Locale source sheets provide translations for those keys. Runtime code loads the normal config bundle and mounts one or more locale packs separately.

business tables -> config bundle
localization sources -> LocaleCatalog -> i18n locale packs

Text Keys

Use text for fields that point to localized copy:

[[tables.fields]]
name = "title_key"
type = "text"

[[tables.fields]]
name = "body_keys"
type = "list<text>"

text is a key, not the translated text itself. Source data should contain values such as quest.1001.title or ui.confirm. Generated code exposes this as a TextKey where the target language has a distinct generated runtime type.

The catalog validator checks every text value in business data. A missing key or empty translation is a build error.

Catalog Sources

Declare localization at the project schema root:

[localization]
locales = ["zh_cn", "en_us"]
default_locale = "zh_cn"
fallback_locale = "en_us"

[[localization.sources]]
name = "ui"
file = "Core.xlsx"
sheet = "UILocalization"

[[localization.sources]]
name = "quest"
file = "Quest.xlsx"
sheet = "QuestLocalization"

Each source is a wide table. The default key column is key:

key	zh_cn	en_us	note
`ui.confirm`	确认	Confirm	button label
`quest.1001.title`	第一章	Chapter One	quest title

Locale columns named in locales are exported into locale packs. Other columns, such as note, are editor-only metadata and are ignored by runtime packs.

Rules:

Rule	Behavior
`source.name`	Must be an ASCII identifier. It is used for diagnostics and organization, not as a key prefix.
`key` values	May use dotted names such as `quest.1001.title`.
Multiple sources	All sources merge into one logical catalog.
Duplicate keys	Build error. Keys are globally unique across all sources.
Missing locale column	Build error.
Empty translation	Build error.

Use key = "id" on a source if the key column is not named key:

[[localization.sources]]
name = "ui"
file = "Core.xlsx"
sheet = "UILocalization"
key = "id"
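The catalog rules in this section can be modeled in Python as follows. This is a sketch under assumptions, not Sora's catalog builder: sources are given as already-loaded row dictionaries, the function names are hypothetical, and the second source uses key = "id" purely to exercise that option.

```python
def build_catalog(sources, locales, default_key="key"):
    """Merge localization sources into {key: {locale: text}}; raise on the first error."""
    catalog = {}
    for source in sources:
        key_column = source.get("key", default_key)
        for row in source["rows"]:
            key = row[key_column]
            if key in catalog:
                raise ValueError(f"duplicate key {key!r} in source {source['name']}")
            entry = {}
            for locale in locales:
                if locale not in row:
                    raise ValueError(f"source {source['name']} has no {locale} column")
                if row[locale] == "":
                    raise ValueError(f"empty {locale} translation for {key!r}")
                entry[locale] = row[locale]
            catalog[key] = entry  # other columns, such as note, are dropped here
    return catalog


def locale_pack(catalog, locale):
    """Select one locale's translations from the merged catalog."""
    return {key: texts[locale] for key, texts in catalog.items()}


SOURCES = [
    {"name": "ui", "rows": [
        {"key": "ui.confirm", "zh_cn": "确认", "en_us": "Confirm", "note": "button label"},
    ]},
    {"name": "quest", "key": "id", "rows": [
        {"id": "quest.1001.title", "zh_cn": "第一章", "en_us": "Chapter One", "note": "quest title"},
    ]},
]

catalog = build_catalog(SOURCES, ["zh_cn", "en_us"])
print(locale_pack(catalog, "en_us"))
```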

Export Locale Packs

Normal exports (binary, json, cbor, sora-protobuf, proto) contain business data and text keys only. They do not include translation text.

Add i18n exports in the build manifest:

[[build.exports]]
format = "binary"
out = "generated/config.sora"

[[build.exports]]
format = "i18n-binary"
out = "generated/i18n/zh_cn.sora-i18n"
locale = "zh_cn"

[[build.exports]]
format = "i18n-json"
out = "generated/i18n/en_us.json"
locale = "en_us"

Use i18n-binary for production locale packs. Use i18n-json for inspection, external translation handoff, or tests.

Runtime Mounting

Generated runtimes load config and locale packs separately. In Rust:

let config = SoraConfig::from_bytes(config_bytes)?;
let pack = generated::runtime::LocalePack::from_bytes(locale_bytes)?;

let mut i18n = generated::SoraI18n::new();
i18n.mount(&config, pack)?;
i18n.set_locale("zh_cn")?;

let mail = config.mail_template().get(&1001).unwrap();
let title = i18n.text(&mail.title_key);
let body = i18n.format(&mail.body_key, [("count", 100)])?;

Mounting validates:

Check	Purpose
`schema_fingerprint`	Prevents loading a locale pack generated for a different schema.
locale declaration	Rejects packs for locales not declared in `[localization].locales`.
text keys	Rejects packs that miss keys used by this config or contain empty text.
mounted locale	`set_locale` fails until a pack for that locale has been mounted.
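
To make the order of these checks concrete, here is a minimal sketch of a mount routine. It is not the generated runtime: `Config`, `LocalePack`, `I18n`, and every field name are hypothetical stand-ins, and the fingerprint is modelled as a plain `u64` purely for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the loaded config; field names are assumptions.
pub struct Config {
    pub schema_fingerprint: u64,
    pub declared_locales: BTreeSet<String>,
    pub used_text_keys: BTreeSet<String>,
}

/// Hypothetical stand-in for a decoded locale pack.
pub struct LocalePack {
    pub schema_fingerprint: u64,
    pub locale: String,
    pub texts: BTreeMap<String, String>,
}

#[derive(Default)]
pub struct I18n {
    packs: BTreeMap<String, LocalePack>,
    current: Option<String>,
}

impl I18n {
    /// Accept a pack only if it passes the fingerprint, locale, and key checks.
    pub fn mount(&mut self, config: &Config, pack: LocalePack) -> Result<(), String> {
        if pack.schema_fingerprint != config.schema_fingerprint {
            return Err("pack was built for a different schema".to_string());
        }
        if !config.declared_locales.contains(&pack.locale) {
            return Err(format!("locale `{}` is not declared", pack.locale));
        }
        if pack.texts.values().any(|text| text.is_empty()) {
            return Err("pack contains empty text".to_string());
        }
        if let Some(key) = config.used_text_keys.iter().find(|key| !pack.texts.contains_key(*key)) {
            return Err(format!("pack is missing text key `{}`", key));
        }
        self.packs.insert(pack.locale.clone(), pack);
        Ok(())
    }

    /// Switching locale only works once a pack for it has been mounted.
    pub fn set_locale(&mut self, locale: &str) -> Result<(), String> {
        if !self.packs.contains_key(locale) {
            return Err(format!("no pack mounted for `{}`", locale));
        }
        self.current = Some(locale.to_string());
        Ok(())
    }

    /// Look up text in the current locale, if one is selected.
    pub fn text(&self, key: &str) -> Option<&str> {
        let pack = self.packs.get(self.current.as_ref()?)?;
        pack.texts.get(key).map(String::as_str)
    }
}
```

Failing at mount time means a bad pack is rejected when it is loaded, not when a missing key is first looked up.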

Business code does not know which source sheet a key came from. It looks up TextKey values with the mounted i18n runtime.

Sora Studio

Sora Studio is the browser-based schema editor embedded in the sora CLI. It is meant for inspecting and editing project schemas without running a separate frontend server.

Start it with a project file:

sora studio --project project.toml

By default Studio binds to 127.0.0.1:5174 and prints the local URL. Use --host or --port when that address is not suitable:

sora studio --project project.toml --port 5180

What It Edits

Studio loads the project file and every schema module listed in includes. Project files and schema modules can be TOML, YAML, JSON, or Lua, and a project can mix those formats.

The editor can update:

project package name and schema include list;
schema module files, including creating and removing included files;
tables, structs, enums, and unions;
table fields, struct fields, enum values, and union variants;
table mode, primary key, source settings, parser settings, defaults, comments, range and length constraints;
reference fields and derived child-table fields.

Studio is a schema editor, not a row-data editor. Excel, CSV, TOML, JSON, and YAML table rows are still edited in their source files and validated by sora check, sora export, or sora build.

Visualization

The main canvas shows schema nodes and their relationships:

type edges for fields that use enums, structs, or unions;
reference edges for ref<Table> fields;
derived edges for child-table fields assembled from another table.

The sidebar filters schemas by name, shows project summary counts, and groups nodes by kind. Diagnostics are shown in the UI, so an invalid schema can be located and fixed from Studio rather than making the whole editor unusable.

Preview and Save

Use preview before saving to review the files Studio will write. Studio renders each changed project or schema file in its own format:

.toml files are written as TOML;
.yaml and .yml files are written as YAML;
.json files are written as pretty JSON;
.lua files are written as data-returning Lua tables.

Saving normalizes the touched files through Studio’s renderer. This is intentional: Studio keeps the schema data model stable, but it does not preserve comments, exact whitespace, or hand-written ordering inside the edited files. Review the preview before committing.
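
The extension-to-renderer rule above is small enough to state as code. This is a sketch under the assumption that the choice depends only on the file extension; `RenderFormat` and `render_format` are hypothetical names, not Studio internals.

```rust
use std::path::Path;

/// Output syntax Studio would render for a file (illustrative enum).
#[derive(Debug, PartialEq)]
pub enum RenderFormat {
    Toml,
    Yaml,
    Json,
    Lua,
}

/// Pick the renderer from the file extension; unknown extensions yield `None`.
pub fn render_format(path: &Path) -> Option<RenderFormat> {
    match path.extension()?.to_str()? {
        "toml" => Some(RenderFormat::Toml),
        "yaml" | "yml" => Some(RenderFormat::Yaml),
        "json" => Some(RenderFormat::Json),
        "lua" => Some(RenderFormat::Lua),
        _ => None,
    }
}
```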

Delivery

Release builds embed the Studio frontend assets into the sora binary. End users only need the CLI from GitHub Releases or crates.io; they do not need Node.js or a local Vite server.

For release maintainers, build the frontend before building the CLI binary:

cd apps/studio
npm run build
cd ../..
cargo build -p sora-cli --release

If the embedded assets are missing, sora studio reports that apps/studio needs to be built before the CLI.

CLI Reference

Use sora --help for the installed binary’s exact help text, and sora <command> --help for command-specific options. This page summarizes the common workflow commands, aliases, and short flags.

Global Options

Global options can be placed before or after the subcommand.

Option	Description
`-j, --jobs <N>`	Maximum worker threads. Must be greater than zero.
`--serial`	Disable parallel execution.
`--parser-script <PATH>`	Load a custom Lua cell parser script. Can be repeated. Project-level parser scripts can also be configured in `[parsers].scripts` in `project.toml`.
`--type-mapping-script <PATH>`	Load a custom Lua type mapping script. Can be repeated. Project-level scripts can also be configured in `[type_mappings].scripts` in `project.toml`.
`-h, --help`	Print help.
`-V, --version`	Print the CLI version.

Command Aliases

Command	Aliases
`build`	`b`
`check`	`c`
`init`	`i`
`gen`	`g`
`export`	`e`
`diff`	`d`
`excel-template`	`template`, `et`
`excel-sync`	`sync`, `es`
`schema-lock`	`lock`, `sl`
`studio`	`st`

Common Short Flags

Short	Long	Used by
`-p`	`--project`	Project-reading commands.
`-o`	`--out`	`init`, `gen`, `export`, `diff`, `excel-template`, `schema-lock`.
`-s`	`--scope`	`build`, `gen`, `export`, `diff`, `excel-template`, `excel-sync`, `schema-lock`.
`-t`	`--target`	`build`, `gen`.
`-f`	`--format`	`export`.
`-d`	`--data-root`	`build`, `export`, `excel-sync`.
`-l`	`--lock`, `--left-root`	`check`, `diff`.
`-r`	`--right-root`	`diff`.
`-c`	`--clean`	`build`.
`-w`	`--write`	`excel-sync`.

Commands

`init`

Create a new project scaffold.

sora init --out my-config --schema-format toml
sora i -o my-config --schema-format yaml

Option	Description
`-o, --out <DIR>`	Output directory for the scaffold.
`--schema-format <toml|yaml|…>`	Schema format for the scaffold.
`--force`	Allow writing into an existing scaffold path.

`check`

Validate a project schema, optionally against a schema lock.

sora check --project project.toml
sora c -p project.toml -l generated/schema.lock

Option	Description
`-p, --project <PATH>`	Project manifest path.
`-l, --lock <PATH>`	Existing schema lock to verify against.

`build`

Run outputs declared in [build] in project.toml, such as schema locks, Excel templates, codegen, and exports.

sora build --project project.toml
sora b -p project.toml -t rust -c

Option	Description
`-p, --project <PATH>`	Project manifest path.
`--default-source-format <csv|json|…>`	Default source format for table data.
`-d, --data-root <DIR>`	Data input root. Overrides `[build].data_root`.
`-s, --scope <NAME>`	Build only schema items included in a scope.
`-t, --target <NAME>`	Codegen target to run. Can be repeated.
`-c, --clean`	Delete selected generated outputs before rebuilding.

`gen`

Generate code for one target directly, without using [build.codegen].

sora gen --target rust --project project.toml --out generated/rust
sora g -t typescript -p project.toml -o generated/typescript

Option	Description
`-t, --target <NAME>`	Codegen target, such as `rust`, `typescript`, or `python`.
`-p, --project <PATH>`	Project manifest path.
`-o, --out <DIR>`	Output directory.
`--format-code <never|auto|required>`	Formatter mode for generated code.
`-s, --scope <NAME>`	Generate only schema items included in a scope.

`export`

Load table data and export runtime data.

sora export --project project.toml --data-root data --format json --out generated/config.json
sora e -p project.toml -d data -f binary -o generated/config.sora

Option	Description
`-f, --format <NAME>`	Export format, such as `binary`, `json`, `json-debug`, `cbor`, `sora-protobuf`, or `proto`.
`--default-source-format <csv|json|…>`	Default source format for table data.
`-p, --project <PATH>`	Project manifest path.
`-d, --data-root <DIR>`	Data input root.
`-o, --out <PATH>`	Output file or directory, depending on export format.
`-s, --scope <NAME>`	Export only schema items included in a scope.
`--compression <none|zstd>`	Compression for the exported bundle.
`--compression-level <N>`	Compression level for compressed exports.

`diff`

Compare two data roots using the same project schema.

sora diff --project project.toml --left-root old-data --right-root data --out generated/diff.json
sora d -p project.toml -l old-data -r data -o generated/diff.json

Option	Description
`--default-source-format <csv|json|…>`	Default source format for table data.
`-p, --project <PATH>`	Project manifest path.
`-l, --left-root <DIR>`	Baseline data root.
`-r, --right-root <DIR>`	Changed data root.
`-o, --out <PATH>`	Diff output path.
`-s, --scope <NAME>`	Diff only schema items included in a scope.

`excel-template`

Generate empty Excel workbooks from the schema. Use this for new workbooks, not for existing data files.

sora excel-template --project project.toml --out generated/excel
sora et -p project.toml -o generated/excel

Option	Description
`-p, --project <PATH>`	Project manifest path.
`-o, --out <DIR>`	Output directory for generated workbooks.
`-s, --scope <NAME>`	Generate templates only for schema items included in a scope.

`excel-sync`

Preview or apply schema header updates to existing Excel data workbooks while preserving data rows. Removed schema fields stay as ignored legacy columns.

sora excel-sync --project project.toml --data-root data
sora es -p project.toml -d data -w

Option	Description
`-p, --project <PATH>`	Project manifest path.
`-d, --data-root <DIR>`	Data workbook root.
`-s, --scope <NAME>`	Sync only schema items included in a scope.
`-w, --write`	Write workbook changes. Without this flag, the command previews changes only.

`schema-lock`

Write a schema lock for the current normalized schema.

sora schema-lock --project project.toml --out generated/schema.lock
sora sl -p project.toml -o generated/schema.lock

Option	Description
`-p, --project <PATH>`	Project manifest path.
`-o, --out <PATH>`	Schema lock output path.
`-s, --scope <NAME>`	Lock only schema items included in a scope.

`studio`

Start the embedded Sora Studio schema editor.

sora studio --project project.toml
sora st -p project.toml --port 5180

Option	Description
`-p, --project <PATH>`	Project manifest path.
`--host <IP>`	Bind address. Defaults to `127.0.0.1`.
`--port <PORT>`	Port. Defaults to `5174`.

Data Export

Sora separates data export from language code generation.

The exporter receives validated data and writes a runtime bundle. Generated code then reads one of those bundle formats. This lets the same schema and data feed several languages or runtime storage choices.

The short version:

source data -> export format -> generated code runtime_format

For example, if generated Rust code uses runtime_format = "sora", the build must also write a binary export. Code generation decides how to read; export writes the file that will be read.

Built-in Exports

Format	Purpose
`binary`	Native sectioned Sora binary bundle.
`json-debug`	Human-readable debug output for inspection.
`json`	Runtime JSON bundle.
`cbor`	Runtime CBOR bundle.
`sora-protobuf`	Runtime Protobuf bundle using Sora’s value model.
`proto`	Typed Protobuf bundle using a generated game-specific schema.
`i18n-binary`	Binary locale pack for one locale.
`i18n-json`	JSON locale pack for one locale.

Generated code loads the binary export when its codegen options set runtime_format = "sora".

Command Example

sora export \
  --format binary \
  --default-source-format xlsx \
  --project project.toml \
  --data-root data \
  --out generated/config.sora

Build Manifest Example

Build manifests can declare multiple exports:

[[build.exports]]
format = "binary"
out = "generated/config.sora"

[[build.exports]]
format = "json-debug"
out = "generated/debug-json"

[[build.exports]]
format = "i18n-binary"
out = "generated/i18n/zh_cn.sora-i18n"
locale = "zh_cn"

When sora build runs, it checks that configured codegen targets have a matching export for their selected runtime format.

Localization packs are separate runtime assets and are mounted by the generated i18n runtime. See Localization.

Export Formats

Export formats are runtime bundle formats. They are independent from source formats such as Excel, CSV, TOML, JSON, or YAML.

Export	Codegen Runtime Format	Output Shape	Use When
`binary`	`sora`	Native sectioned binary bundle.	You want a compact self-contained Sora runtime.
`json`	`json`	Runtime JSON bundle.	You want easy inspection or simple platform integration.
`cbor`	`cbor`	Runtime CBOR bundle.	You want a compact general-purpose binary value format.
`sora-protobuf`	`sora-protobuf`	Sora value model encoded with Protobuf.	You want Protobuf-based transport without per-game `.proto` models.
`proto`	none	Typed Protobuf bundle using the generated game-specific schema.	You want a business `.proto` contract for external tooling.
`json-debug`	none	Per-table debug JSON.	You want reviewable output for inspection and tests.
`i18n-binary`	none	Native binary locale pack for one locale.	You want production localization packs mounted separately from config.
`i18n-json`	none	Debug JSON locale pack for one locale.	You want reviewable text for translation handoff or tests.

Example build outputs:

[[build.exports]]
format = "binary"
out = "generated/config.sora"

[[build.exports]]
format = "json"
out = "generated/config.json"

[[build.exports]]
format = "json-debug"
out = "generated/debug-json"

[[build.exports]]
format = "i18n-binary"
out = "generated/i18n/zh_cn.sora-i18n"
locale = "zh_cn"

Generated runtimes only load runtime formats they support. json-debug is for humans and tools, not generated runtime loading.

Localization exports require [localization] and a locale in the build manifest. See Localization.

Code Generation

Code generation turns the normalized schema IR into target-language row types, table containers, and config loaders.

It is driven by a registry of language generators.

Each generator declares:

a target id and aliases;
display metadata;
supported runtime formats;
optional formatter integration;
a CodeGenerator implementation.

This lets built-in languages and downstream generators use the same pipeline shape.

schema files -> schema model -> normalized IR -> generator registry -> target generator -> files

Generate a target directly:

sora gen --target typescript --project project.toml --out generated/typescript

Or declare it in the build manifest:

[[build.codegen]]
target = "typescript"
out = "typescript/generated"
format = "auto"

format can be never, auto, or required. auto runs a known formatter when it is available. required fails if the formatter is missing or returns an error.
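
The three modes can be sketched as a small decision function. The names `FormatMode`, `Outcome`, and `apply` are hypothetical. The text does not say what auto does when an installed formatter fails, so the sketch assumes the output is simply left unformatted in that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FormatMode {
    Never,
    Auto,
    Required,
}

/// Parse the `format` value from a `[[build.codegen]]` entry.
pub fn parse_mode(value: &str) -> Option<FormatMode> {
    match value {
        "never" => Some(FormatMode::Never),
        "auto" => Some(FormatMode::Auto),
        "required" => Some(FormatMode::Required),
        _ => None,
    }
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Unformatted,
    Formatted,
}

/// `run_formatter` returns `None` when no known formatter is installed,
/// otherwise the result of running it. It is not called in `Never` mode.
pub fn apply<F>(mode: FormatMode, run_formatter: F) -> Result<Outcome, String>
where
    F: FnOnce() -> Option<Result<(), String>>,
{
    if mode == FormatMode::Never {
        return Ok(Outcome::Unformatted);
    }
    match (mode, run_formatter()) {
        (_, Some(Ok(()))) => Ok(Outcome::Formatted),
        (FormatMode::Required, None) => Err("formatter is required but missing".to_string()),
        (FormatMode::Required, Some(Err(e))) => Err(format!("formatter failed: {}", e)),
        // Assumption: `auto` tolerates a missing or failing formatter.
        _ => Ok(Outcome::Unformatted),
    }
}
```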

Runtime Format

Each target can choose a runtime format:

[codegen.typescript]
runtime_format = "json"

The selected runtime format controls the loader code emitted for that target. It does not change the schema or the source data.

Generated Shape

Generated code generally contains:

enums for schema enums;
record types for structs, union variants, and table rows;
table containers for map, list, and singleton tables;
lookup helpers for keys and indexes where supported;
a top-level config loader for the selected runtime format.

Generated identifiers follow target-language conventions while runtime data lookup keeps using the original schema names. See Identifier Naming.

Schema optional<T> is mapped to the target language’s strongest available nullability representation. See Nullability.

Identifier Naming

Schema names are the source of truth. Generated code adapts those names to each target language’s naming style, while runtime data lookup keeps using the original schema names.

For example, a schema field named max_stack may become maxStack in TypeScript, MaxStack in C#, and max_stack in Rust. The generated decoder still reads the field named max_stack from the runtime bundle.

Naming Pipeline

Sora derives common name forms from each schema name before language generation:

Form	Example from `max_stack`	Common use
Raw	`max_stack`	Runtime table names, field names, enum text values, union tags.
Pascal	`MaxStack`	Types, classes, enum variants, exported symbols.
Camel	`maxStack`	Fields, properties, parameters, methods in camel-case languages.
Snake	`max_stack`	Files, modules, fields, functions in snake-case languages.

Language generators choose from these forms and may apply additional language-specific sanitization for invalid identifiers or reserved words.
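
As a rough illustration of how the four forms relate, the following sketch derives them from a snake_case name. It assumes ASCII snake_case input such as `max_stack`; `NameForms` and `name_forms` are hypothetical names, and Sora's own derivation may handle more input styles and reserved words.

```rust
/// Name forms derived from one schema name (illustrative struct).
#[derive(Debug, PartialEq)]
pub struct NameForms {
    pub raw: String,
    pub pascal: String,
    pub camel: String,
    pub snake: String,
}

/// Uppercase the first ASCII letter of a word and keep the rest as-is.
fn capitalize(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_ascii_uppercase().to_string() + chars.as_str(),
        None => String::new(),
    }
}

/// Derive the Pascal, camel, and snake forms; the raw form is kept untouched.
pub fn name_forms(raw: &str) -> NameForms {
    let words: Vec<&str> = raw.split('_').filter(|w| !w.is_empty()).collect();
    let pascal: String = words.iter().map(|w| capitalize(w)).collect();
    let camel = match words.split_first() {
        Some((first, rest)) => {
            first.to_string() + &rest.iter().map(|w| capitalize(w)).collect::<String>()
        }
        None => String::new(),
    };
    NameForms {
        raw: raw.to_string(),
        pascal,
        camel,
        snake: words.join("_"),
    }
}
```

A generator would pick one derived form for each identifier it emits and keep `raw` for anything that touches runtime data.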

Language Conventions

The built-in generators follow the target language’s normal public API style:

Target	Types	Fields and accessors	Files/modules
Rust	`PascalCase`	`snake_case`	`snake_case.rs`
C	prefixed `snake_case`	`snake_case`	`snake_case.h`, `snake_case.c`
C++	`PascalCase`	`snake_case`	`snake_case.hpp`
C#	`PascalCase`	`PascalCase` properties	`PascalCase.cs`
Go	`PascalCase` exported names	`PascalCase` exported fields	`snake_case.go`
Java	`PascalCase`	`lowerCamelCase`	`PascalCase.java`
Kotlin	`PascalCase`	`lowerCamelCase`	target layout
Scala	`PascalCase`	`lowerCamelCase`	`PascalCase.scala`
TypeScript	`PascalCase`	`lowerCamelCase`	`snake_case.ts`
JavaScript	`PascalCase`	`lowerCamelCase`	`snake_case.js`, `snake_case.d.ts`
Python	`PascalCase`	`snake_case`	`snake_case.py`
Dart	`PascalCase`	`lowerCamelCase`	`snake_case.dart`
Lua	`PascalCase` table-like types	`lowerCamelCase`	`snake_case.lua`
Erlang	`snake_case` modules	`snake_case` map keys/functions	`snake_case.erl`
Godot	`PascalCase` classes	`snake_case`	`snake_case.gd`

This table describes generated code identifiers, not runtime data names.

Runtime Names Stay Raw

The following values keep the original schema spelling:

table names in bundles and table metadata;
field names read from runtime rows;
enum string values;
union variant tag values;
schema lock and fingerprint input.

Changing a schema name changes the data contract. Changing only a target language’s generated identifier style should not.

Custom Type Mappings

Custom type mappings do not rename generated schema identifiers. They only control target-language type expressions, imports/includes, and optional conversion hooks.

Mapping functions are native code written by the user for that target language, so their names should follow that language’s own naming convention. The mapping key remains the named schema type, such as Vec3.

Nullability

Schema nullability is expressed with optional<T>. Code generators map that schema type to the strongest nullability representation available in each target language.

Runtime bundles encode optional values with explicit presence. Generated code should preserve that distinction in its public API instead of relying on undocumented null conventions.

Built-In Representations

Target	`optional<T>` representation
Rust	`Option<T>`
C#	`T?` with nullable reference types enabled
Kotlin	`T?`
Dart	`T?`
Scala	`Option[T]`
TypeScript	`T | …`
JavaScript d.ts	`T | …`
Python	`T | …`
C++	`std::optional<T>` for C++17 and newer; `SoraOptional<T>` for older standards
C	generated optional wrapper type with presence state
Go	`*T`
Erlang	`T | …`
Lua	`T?` EmmyLua annotation
Godot	`Variant` with `null`
Java	nullable value type plus annotation

Dynamic targets such as JavaScript, Lua, and Godot can only document nullability for tooling. Statically typed targets expose it in the generated type whenever the language supports that.

Java Annotations

Java has no standard nullable type syntax. Sora emits nullable Java fields, constructor parameters, and nullable lookup results with an annotation.

By default, Java generation uses a self-contained package-local SoraNullable annotation:

@SoraNullable
public final String nickname;

Projects that use a specific annotation package can configure it:

[codegen.java]
nullable_annotation = "org.jetbrains.annotations.Nullable"

Set nullable_annotation = "" to emit nullable Java values without annotations.

Custom Type Mappings

Type mapping scripts can provide nullable_type_name when the target language needs a different type expression for optional<YourType>:

{
  target = "java",
  schema_type = "UserId",
  type_name = "int",
  nullable_type_name = "Integer",
}

This only changes the generated type expression. The backend still controls how optional presence is decoded.

Runtime Formats

Select a runtime format per codegen target:

[codegen.rust]
runtime_format = "sora"

Runtime formats are the formats generated code can load. They correspond to export formats:

Codegen `runtime_format`	Required Export
`sora`	`binary`
`json`	`json`
`cbor`	`cbor`
`sora-protobuf`	`sora-protobuf`

This setting does not change Excel, CSV, TOML, JSON, YAML, or schema files. It only changes the loader generated for the target language. The selected runtime format must have a matching export in the project build.
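
The pairing in the table can be expressed as a lookup plus a consistency check over the build manifest. This is a sketch of the idea only: `required_export` and `check_exports` are hypothetical helpers, and the actual `sora build` check may report problems differently.

```rust
/// Export format that a codegen `runtime_format` needs, per the table.
pub fn required_export(runtime_format: &str) -> Option<&'static str> {
    match runtime_format {
        "sora" => Some("binary"),
        "json" => Some("json"),
        "cbor" => Some("cbor"),
        "sora-protobuf" => Some("sora-protobuf"),
        _ => None,
    }
}

/// `targets` holds (target, runtime_format) pairs; `exports` holds the
/// `format` values of the configured `[[build.exports]]` entries.
/// Returns one message per target that has no matching export.
pub fn check_exports(targets: &[(&str, &str)], exports: &[&str]) -> Vec<String> {
    let mut problems = Vec::new();
    for (target, runtime_format) in targets {
        match required_export(runtime_format) {
            Some(export) if exports.contains(&export) => {}
            Some(export) => problems.push(format!(
                "{}: runtime_format `{}` needs a `{}` export",
                target, runtime_format, export
            )),
            None => problems.push(format!(
                "{}: unknown runtime_format `{}`",
                target, runtime_format
            )),
        }
    }
    problems
}
```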

Support Matrix

Target	`sora`	`json`	`cbor`	`sora-protobuf`
Rust	self-contained	managed dependency	managed dependency	managed dependency
Kotlin	self-contained	managed dependency	managed dependency	managed dependency
C#	self-contained	managed dependency	managed dependency	managed dependency
Java	self-contained	managed dependency	managed dependency	managed dependency
Scala	self-contained	managed dependency	managed dependency	managed dependency
Go	self-contained	managed dependency	managed dependency	managed dependency
TypeScript	self-contained	managed dependency	managed dependency	managed dependency
JavaScript	self-contained	managed dependency	managed dependency	managed dependency
Python	self-contained	managed dependency	managed dependency	managed dependency
Dart	not supported	standard library	user adapter	user adapter
Godot	not supported	standard library	not supported	not supported
C	self-contained	not supported	not supported	not supported
C++	self-contained	not supported	not supported	not supported
Erlang	self-contained	user adapter	user adapter	user adapter
Lua	self-contained	user adapter	user adapter	user adapter

Dependency meanings:

Kind	Meaning
self-contained	Generated runtime includes the decoder.
standard library	Generated runtime uses the language standard library.
managed dependency	Generated runtime expects normal package dependencies for that ecosystem.
user adapter	Generated runtime exposes an adapter hook and the application supplies the concrete decoder.

Choosing a Format

Use sora when you want the native Sora binary bundle and the target supports it.

Use json when inspectability, tooling, or platform simplicity matters more than compactness.

Use cbor when you want a compact general-purpose binary value format and your runtime already has a CBOR dependency.

Use sora-protobuf when your environment prefers Protobuf transport but you still want Sora’s schema-driven value model.

The CI runtime matrix generates every supported combination in this table and syntax-checks languages where the check is lightweight.

Runtime Adapters

Some languages do not have a built-in dependency story for every runtime format. Those targets use adapter hooks instead of embedding a third-party decoder.

The generated runtime owns the Sora value model and table loading logic. The application supplies a small function that turns bytes into the decoded value tree expected by the runtime.

This keeps generated code independent from dependency choices. A game can use the CBOR, Protobuf, or compression library it already trusts.

Lua

local config = SoraConfig.from_cbor(bytes, {
  decode_cbor = function(payload)
    return my_cbor.decode(payload)
  end,
})

Erlang

Options = #{
    decode_cbor => fun my_cbor:decode/1
},
Config = sora_config:from_cbor(Bytes, Options).

Dart

final config = SoraConfig.fromCbor(
  bytes,
  decodeCbor: (payload) => myCborDecode(payload),
);

In all three targets the exported bundle is the same; only the decoder function supplied by the application differs.

What the Adapter Must Return

The adapter should return the language-specific Sora value tree expected by the generated runtime. It is not responsible for constructing typed rows; generated code handles that after decoding.

If a target has a self-contained decoder for a format, no adapter is needed.
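
The division of labour between adapter and runtime can be sketched in Rust as well, purely for illustration (the Rust target itself does not need adapters according to the support matrix). `Value`, `Item`, and `load_items` are hypothetical; the real value tree is target-specific.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a decoded value tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Text(String),
    List(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

/// A typed row the runtime builds after decoding.
#[derive(Debug, PartialEq)]
pub struct Item {
    pub id: i64,
    pub name: String,
}

/// Runtime side: accepts any decoder, then constructs typed rows itself.
pub fn load_items<D>(bytes: &[u8], decode: D) -> Result<Vec<Item>, String>
where
    D: Fn(&[u8]) -> Result<Value, String>,
{
    // The adapter only turns bytes into a value tree.
    let rows = match decode(bytes)? {
        Value::List(rows) => rows,
        other => return Err(format!("expected a list of rows, got {:?}", other)),
    };
    // Typed row construction stays in the runtime.
    rows.into_iter()
        .map(|row| match row {
            Value::Map(fields) => match (fields.get("id"), fields.get("name")) {
                (Some(Value::Int(id)), Some(Value::Text(name))) => Ok(Item {
                    id: *id,
                    name: name.clone(),
                }),
                _ => Err("row is missing `id` or `name`".to_string()),
            },
            _ => Err("row is not a map".to_string()),
        })
        .collect()
}
```

Because the decoder is just a function argument, swapping CBOR libraries changes the call site and nothing in the row-building code.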

Versioning and Compatibility

Sora is still early. The project does not provide Rust-style editions or compatibility modes for old schema semantics. A project that needs stable output should pin the sora CLI version it uses, and treat a CLI upgrade as an explicit migration step.

What To Pin

Pin the CLI binary or crate version in the project tooling:

download a specific GitHub Release asset and keep using that version in CI;
install a specific crates.io version with cargo install sora-cli --version X.Y.Z;
record the expected sora --version in project setup docs or build scripts.

Generated code, generated Excel templates, schema locks, and exported runtime bundles should be produced by the same pinned CLI version for a given project build.

Runtime Bundle Versions

Exported runtime bundles carry a format version. The Sora binary bundle also has a file header version, and generated runtimes reject bundles with unsupported versions.

Sora only bumps these runtime/export format versions when the generated runtime can no longer safely read data written by an older layout. Examples include:

changing the .sora binary section layout;
changing the manifest fields required by generated runtimes;
changing JSON, CBOR, or Protobuf bundle structure in a way that old generated code cannot read;
changing value encoding rules in exported runtime bundles.

During the early development stage, ordinary implementation changes do not automatically bump format_version. Version bumps are manual and reserved for actual runtime/export incompatibility.
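
A loader that rejects unsupported versions might look roughly like the sketch below. The header layout here (four magic bytes followed by a little-endian `u16` version) is invented for the example and is not the real `.sora` layout; `MAGIC`, `SUPPORTED_VERSION`, and `check_header` are hypothetical.

```rust
/// Hypothetical magic bytes; the real bundle header is not described here.
const MAGIC: &[u8; 4] = b"SORA";
/// The single version this imaginary runtime understands.
const SUPPORTED_VERSION: u16 = 1;

/// Validate the header before touching any section data.
pub fn check_header(bytes: &[u8]) -> Result<u16, String> {
    if bytes.len() < 6 || &bytes[..4] != &MAGIC[..] {
        return Err("not a Sora bundle".to_string());
    }
    let version = u16::from_le_bytes([bytes[4], bytes[5]]);
    if version != SUPPORTED_VERSION {
        return Err(format!("unsupported bundle version {}", version));
    }
    Ok(version)
}
```

Rejecting early keeps a runtime from misreading a layout it was not generated for.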

Schema and Codegen Semantics

Schema syntax, parser behavior, validation rules, Studio rendering, and generated language APIs may still change while the project is young. Sora does not keep old behavior behind an edition flag or any other compatibility mode.

If a newer CLI changes schema or codegen semantics, users should:

upgrade the CLI intentionally;
regenerate schema locks, templates, exports, and code;
review diffs;
update schema/data/project files as needed.

Schema fingerprints and schema locks help detect mismatches between generated code, schema, and data, but they are not migration tools. They prevent silent incompatibility; they do not preserve old semantics.

Extending Sora

Sora is designed to be used as a library by projects that need their own language or data format support.

The extension boundary is intentionally split:

input adapter -> schema model -> normalized IR -> data validation
                                      |-> exporter
                                      |-> code generator

Add a Code Generator

Implement the generator trait:

pub trait CodeGenerator: Send + Sync {
    fn generate(&self, context: CodegenContext<'_>, out_dir: &Path) -> Result<()>;
}

See Generators for a longer walkthrough.

Keep the IR Neutral

Language-specific settings belong in target options and generator code. The normalized IR should describe schema semantics only: packages, tables, fields, types, keys, indexes, unions, and validation metadata.

Project-specific language type mappings should use codegen type mapping providers, not schema fields. This keeps data semantics separate from target-language representation choices such as mapping struct<Vec3> to UnityEngine.Vector3.

Add an Exporter

Exporters are separate from generators. Add a data exporter when you need a new runtime bundle format. Add a code generator when you need a new language target.

See Exporters for the expected boundary.

Generators

A generator turns the normalized IR into files for one language target.

Registration

Generators are registered with:

a canonical target id;
aliases;
display metadata;
supported runtime formats;
optional formatter integration;
a CodeGenerator implementation.

This lets built-in generators and downstream generators use the same pipeline.
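
One way to picture registration is a list of metadata records that can be resolved by id or alias. The sketch below is an assumption about shape, not Sora's registry API: `GeneratorInfo`, `Registry`, and the `ts` alias are hypothetical, and display metadata, formatter integration, and the generator itself are left out for brevity.

```rust
/// Registration metadata for one generator (illustrative field names).
pub struct GeneratorInfo {
    pub id: &'static str,
    pub aliases: &'static [&'static str],
    pub runtime_formats: &'static [&'static str],
}

#[derive(Default)]
pub struct Registry {
    generators: Vec<GeneratorInfo>,
}

impl Registry {
    /// Reject a registration whose id or alias is already taken.
    pub fn register(&mut self, info: GeneratorInfo) -> Result<(), String> {
        let clash = self.resolve(info.id).is_some()
            || info.aliases.iter().any(|alias| self.resolve(alias).is_some());
        if clash {
            return Err(format!("target `{}` clashes with a registered name", info.id));
        }
        self.generators.push(info);
        Ok(())
    }

    /// Find a generator by canonical id or by any of its aliases.
    pub fn resolve(&self, name: &str) -> Option<&GeneratorInfo> {
        self.generators
            .iter()
            .find(|g| g.id == name || g.aliases.contains(&name))
    }
}
```

Built-in and downstream generators would go through the same `register` call, which is what keeps the pipeline uniform.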

Implementation Shape

pub trait CodeGenerator: Send + Sync {
    fn generate(&self, context: CodegenContext<'_>, out_dir: &Path) -> Result<()>;
}

The generator receives:

the normalized IR;
parsed target options;
the registered type mapping providers;
the output directory;
runtime format selection.

It should not mutate the IR or rely on language-specific fields being present in the IR.
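
A toy implementation may help show the contract. To keep the sketch compilable on its own, `Table`, `Ir`, and `CodegenContext` below are minimal stand-ins with invented fields, and the trait is redeclared with `io::Result`; the real types come from Sora and carry much more. `TableListGenerator` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Stand-ins so the sketch compiles alone; not Sora's real IR or context.
pub struct Table {
    pub name: String,
}
pub struct Ir {
    pub tables: Vec<Table>,
}
pub struct CodegenContext<'a> {
    pub ir: &'a Ir,
    pub runtime_format: &'a str,
}

pub trait CodeGenerator: Send + Sync {
    fn generate(&self, context: CodegenContext<'_>, out_dir: &Path) -> io::Result<()>;
}

/// Toy generator: writes one file that lists the table names.
pub struct TableListGenerator;

impl CodeGenerator for TableListGenerator {
    fn generate(&self, context: CodegenContext<'_>, out_dir: &Path) -> io::Result<()> {
        fs::create_dir_all(out_dir)?;
        let mut text = format!("// runtime_format: {}\n", context.runtime_format);
        // The IR is only read through a shared reference, never mutated.
        for table in &context.ir.tables {
            text.push_str(&format!("// table: {}\n", table.name));
        }
        fs::write(out_dir.join("tables.txt"), text)
    }
}
```

Taking the IR by shared reference makes the "do not mutate" rule something the compiler enforces in this sketch.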

Type Mappings

Language generators can consult context.type_mappings before falling back to their built-in type mapping. A provider maps a target plus a named schema type, such as struct<Vec3>, to a generated type name, optional nullable type name, and optional decode wrappers. Container and optional types should recurse through the same mapper so list<struct<Vec3>> and optional<struct<Vec3>> automatically use the mapped target type.
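
The recursion described above can be sketched as a single function that consults custom mappings for named types and wraps containers around the result. `SchemaType` and `rust_type` are hypothetical, the built-in choices (`i64`, `String`, `Vec`, `Option`) are assumptions for the example, and `glam::Vec3` appears only as a sample mapped type name.

```rust
use std::collections::HashMap;

/// A tiny subset of schema types, enough to show the recursion.
#[derive(Debug, Clone, PartialEq)]
pub enum SchemaType {
    Int,
    Text,
    Struct(String),
    List(Box<SchemaType>),
    Optional(Box<SchemaType>),
}

/// Map a schema type to a Rust type expression. Custom mappings are
/// checked for named types first; containers recurse through this same
/// function so mapped types propagate into `list` and `optional`.
pub fn rust_type(ty: &SchemaType, mappings: &HashMap<String, String>) -> String {
    match ty {
        SchemaType::Int => "i64".to_string(),
        SchemaType::Text => "String".to_string(),
        SchemaType::Struct(name) => mappings
            .get(name)
            .cloned()
            .unwrap_or_else(|| name.clone()),
        SchemaType::List(inner) => format!("Vec<{}>", rust_type(inner, mappings)),
        SchemaType::Optional(inner) => format!("Option<{}>", rust_type(inner, mappings)),
    }
}
```

If the container arms formatted the inner name directly instead of recursing, `list<struct<Vec3>>` would miss the mapping.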

The schema remains language-neutral. Project-specific mappings belong in library registration code or CLI Lua type mapping scripts, not in field definitions.

Target Options

Language-specific options live under [codegen.<target>]:

[codegen.rust]
runtime_format = "sora"
map_type = "btree"
string_storage = "owned"

The generator owns the interpretation of these options.

Exporters

An exporter writes validated configuration data into a runtime bundle.

Exporters are separate from code generators because the same exported data can be consumed by many languages.

When to Add an Exporter

Add an exporter when you need:

a new runtime wire format;
a platform-specific asset package;
a different compression or section layout;
an inspection format for tooling.

Do not add an exporter just to support a new programming language. Add a code generator for that.

Expected Boundary

An exporter should consume:

the normalized schema IR;
validated config data;
exporter options;
an output target.

It should not depend on a specific language generator.

Design Notes

These notes explain the architectural choices behind Sora.

The short version is that schema files are the source of truth. Excel headers, runtime bundles, generated code, and extension points are projections of the normalized schema and validated data.

Schema as Source of Truth

Sora is schema-first. The schema is the contract for configuration data, whichever schema format it is written in; source files and generated outputs are projections of that contract.

schema modules
  -> normalized IR
  -> Excel headers
  -> validation
  -> runtime exports
  -> generated language code

This design avoids the common problem where a spreadsheet, a hand-written parser, and runtime code all define slightly different versions of the same data shape.

Consequences

Field names, types, keys, defaults, references, and validation rules live in schema.
Excel and CSV files provide values, not a second schema.
Runtime export formats do not change the data model.
Language options belong to codegen targets, not to the IR.
Downstream users can add generators or exporters without changing schema semantics.

The schema can still include editing hints such as comment, parser hints, ranges, and length limits. Those hints are part of the data contract because they affect validation or generated projections.

Excel Header Projection

Excel templates are generated from the normalized schema. The header is a projection, not an independent format definition.

Why Generate Headers

Manually maintained spreadsheet headers tend to drift from code:

a field is renamed in code but not in Excel;
a type changes but old rows still look valid;
a designer adds a column that no runtime reads;
validation rules are documented in comments instead of enforced.

Sora avoids this by generating the workbook structure from schema.

What the Header Contains

Generated rows include:

table metadata: table name, mode, key, scope, and schema hash;
stable field names;
type hints;
scope hints;
validation and parser rules;
comments for editors.

Only row data should be treated as authored content. Header rows can be regenerated whenever the schema changes.

Practical Workflow

Change the schema.
Regenerate Excel templates for new workbooks, or run sora excel-sync --write to update headers in existing workbooks.
If you regenerated a template, move or paste existing data rows into it.
Run sora export or sora build to validate values and references.
Run sora build to produce exports and generated code.

This keeps Excel useful for editing while keeping the schema authoritative.

IR Boundaries

The normalized IR describes schema semantics. It should not encode language-specific codegen choices.

Belongs in IR

packages and included schema modules;
enums, structs, unions, tables, fields, and indexes;
table modes and keys;
source metadata;
field types, defaults, parsers, ranges, lengths, and comments;
references and derived child-table field metadata;
scopes.

Does Not Belong in IR

Rust map implementation choices;
TypeScript enum representation choices;
Lua module names;
runtime decoder dependency choices;
formatter settings;
target-specific file layout.

Those settings belong in [codegen.<target>] or in generator registration metadata.

Extension Boundary

schema input -> normalized IR -> validation
                              |-> exporter registry
                              |-> codegen registry

A new language generator should consume the IR and its own target options. A new runtime data format should be added as an exporter. Neither should require changing the schema model unless the actual data semantics need to change.
