Sora
Sora helps you keep game configuration data understandable while still giving runtime code typed access.
You write a schema that describes table shapes, fill the table rows in Excel, CSV, TOML, JSON, or YAML, and let Sora validate the data. After validation, Sora writes a runtime data bundle and generates code that knows how to load that bundle.
The schema is the contract. Excel, CSV, TOML, generated code, and exported runtime bundles are all projections of that contract. A designer can edit rows in a workbook, while game code consumes strongly typed generated APIs.
For a small project, the file flow looks like this:
project.toml
-> schema/items.toml
-> data/Item.xlsx
-> generated/config.sora
-> generated/rust
You normally hand-write project.toml and schema files. Designers or tools edit files under data/. Files under generated/ are Sora outputs.
What Sora Does
schema modules -> Excel/CSV/TOML/JSON/YAML data -> validation
|-> runtime bundle
|-> generated code
Sora currently focuses on these stages:
- describe tables, records, enums, unions, references, indexes, and validation rules in schema files;
- inspect and edit schema modules in the embedded Sora Studio UI;
- generate Excel templates from the schema so spreadsheet headers stay consistent;
- load table data from TOML, JSON, YAML, CSV, or Excel .xlsx;
- validate data against the normalized schema and cross-table references;
- export data as Sora binary, debug JSON, JSON bundle, CBOR bundle, or Sora Protobuf bundle;
- generate language runtimes that load those exported bundles.
Common Terms
Sora uses the word format in a few different places:
| Term | Meaning | Example |
|---|---|---|
| Schema format | The file format used to write schema/project files. | TOML, YAML, JSON, Lua |
| Source format | The editable table data format. | Excel .xlsx, CSV, TOML, JSON, YAML |
| Export format | The data bundle written after validation. | binary, json, cbor |
| Runtime format | The bundle format generated code expects to load. | sora, json, cbor |
For example, Rust codegen with runtime_format = "sora" needs a matching binary export. The source data can still come from Excel.
When This Fits
Sora is intended for game configuration and similar data-heavy applications where:
- designers or tools edit tabular data;
- runtime code wants typed access instead of loose dictionaries;
- schema changes should be reviewed in source control;
- generated language support should be extendable by downstream users.
The project is still early, so the public API can change. The design goal is to keep the core schema and IR independent from individual language backends, so downstream users can add generators or exporters without patching the core pipeline.
Projects that need stable output should pin the sora CLI version. Runtime/export format versions are bumped only for actual generated-runtime incompatibility; Sora does not currently maintain old schema semantics behind edition flags. See Versioning and Compatibility.
Suggested Reading Order
Start with Quick Start, then read Sora Studio, First Config, and Excel Workflow. After that, the most useful reference pages are Types, Tables, Cell Parsers, References and Derived Fields, and Versioning and Compatibility.
Design notes and extension pages are meant for readers who already understand the basic build flow.
Core Concepts
Project
A project manifest declares the package name, schema modules, build outputs, codegen targets, and export targets. It is the entry point used by sora check, sora build, sora gen, and sora export.
Schema
Schema files describe the shape of configuration data. They define enums, structs, unions, tables, indexes, references, and field rules. Sora normalizes schema files into an IR before validation, export, or code generation.
Table
A table is a named collection of rows. Tables can be list-like, keyed by one field, or singleton. Source metadata tells Sora where the editable data comes from.
The table schema is also used to generate editor projections such as Excel headers. The spreadsheet is not the contract; it is one way to edit rows that conform to the contract.
Value
Sora validates source cells into a common value tree before export. Generated runtimes read that same shape from different runtime formats, so a target language can switch between sora, json, cbor, or sora-protobuf without changing the schema.
Runtime Format
A runtime format is the wire format that generated code loads. It is selected per language target with runtime_format.
Generator
A generator is a language backend registered in the codegen registry. Built-in generators are ordinary registry entries, which keeps the pipeline open to downstream extensions.
Exporter
An exporter writes validated data into a runtime bundle. The exporter registry is separate from code generation so data formats and language targets can evolve independently.
Scope
Schemas, fields, and tables can declare a scope. A build can select a scope to generate or export only the pieces needed by one runtime environment.
Quick Start
This guide builds a minimal item table, generates an Excel template, exports a runtime bundle, and generates Rust code that can load it.
Install the CLI from the GitHub Releases page by downloading the archive for your platform and placing the sora binary on your PATH.
If you already have a Rust toolchain, you can also install the published package from crates.io:
cargo install sora-cli
For local development from a checkout:
cargo install --path crates/sora-cli
1. Create a Project
The fastest path is to scaffold the minimal project used in this guide:
sora init --out my-config --schema-format toml
cd my-config
--schema-format accepts toml, yaml, json, or lua. The scaffold creates this layout:
| Path | Who edits it | Purpose |
|---|---|---|
| project.toml | You | Project entry point, build outputs, default data location. |
| schema/items.toml | You | Schema for the Item table. |
| data/Item.xlsx | Designers or tools | Editable row data. |
| generated/ | Sora | Schema lock, Excel templates, generated code, exported data. |
The rest of this section shows the generated files so you can understand the project shape. project.toml looks like this:
package = "game_config"
includes = ["schema/items.toml"]
[build]
default_source_format = "xlsx"
data_root = "data"
schema_lock = "generated/schema.lock"
excel_templates = "generated/excel"
[[build.codegen]]
target = "rust"
out = "generated/rust"
format = "auto"
[[build.exports]]
format = "binary"
out = "generated/config.sora"
In this file, default_source_format = "xlsx" means table sources default to Excel. data_root = "data" means Item.xlsx is read from data/Item.xlsx during export and build. excel_templates = "generated/excel" is only the generated template output directory. It is where Sora writes fresh workbooks with schema headers; it is not the source data directory. Keep it separate from data so regenerating templates cannot overwrite edited row data. The binary export writes the runtime bundle that Rust code will load because Rust defaults to runtime_format = "sora".
Create schema/items.toml:
[[enums]]
name = "ItemType"
values = ["Weapon", "Armor", "Material", "Consumable"]
[[tables]]
name = "Item"
mode = "map"
key = "id"
[tables.source]
format = "xlsx"
file = "Item.xlsx"
sheet = "Item"
[[tables.fields]]
name = "id"
type = "i32"
comment = "Item id"
[[tables.fields]]
name = "name"
type = "string"
comment = "Display name"
[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"
comment = "Item category"
[[tables.fields]]
name = "max_stack"
type = "i32"
default = "1"
range = [1, 9999]
comment = "Stack limit"
2. Generate the Excel Template
The workbook header is generated from the schema:
sora excel-template --project project.toml --out generated/excel
This creates generated/excel/Item.xlsx. Treat that file as a template artifact that can be regenerated after schema changes. For a new table, copy it to data/Item.xlsx and fill rows below the generated header:
| id | name | item_type | max_stack |
|---|---|---|---|
| 1001 | Iron Sword | Weapon | 1 |
| 2001 | Health Potion | Consumable | 99 |
After you have real data in data/Item.xlsx, do not run excel-template --out data unless you intentionally want to replace those files. Keep generating empty templates into generated/excel, and use excel-sync to update existing data workbooks in place when the schema changes.
For existing data workbooks, prefer syncing headers in place:
sora excel-sync --project project.toml --data-root data
sora excel-sync --project project.toml --data-root data --write
The preview command shows added fields and legacy columns. The --write command refreshes generated header rows while preserving data rows; fields removed from schema stay in Excel as legacy columns that Sora ignores.
3. Check, Export, and Generate
Validate the schema without reading row data:
sora check --project project.toml
Run every output declared in [build]. This also loads and validates source data before writing exports:
sora build --project project.toml
You can also open the project in Sora Studio, the schema editor embedded in the CLI:
sora studio --project project.toml
The command prints a local URL. Open it in a browser to visualize schema relationships, edit schema modules, preview the generated changes, and save them back to the project.
Or run the steps separately:
sora gen --target rust --project project.toml --out generated/rust
sora export \
--format binary \
--default-source-format xlsx \
--project project.toml \
--data-root data \
--out generated/config.sora
4. Next Steps
Read Sora Studio if you want to edit schemas visually. Read First Config for the same example with the generated runtime usage, or inspect examples/showcase/project.toml for a larger multi-language setup.
Tutorials
Tutorials walk through Sora from an application user’s point of view.
Start with First Config to build a minimal table end to end. Then read Excel Workflow to understand generated spreadsheet templates and Load Generated Code to connect exported data to runtime code.
First Config
This tutorial creates a small item configuration table. The same pattern scales to larger game data: define the schema, generate an editable workbook, fill rows, export a runtime bundle, and generate code.
Project Layout
project.toml
schema/items.toml
data/Item.xlsx
generated/
Project Manifest
package = "game_config"
includes = ["schema/items.toml"]
[build]
default_source_format = "xlsx"
data_root = "data"
schema_lock = "generated/schema.lock"
excel_templates = "generated/excel"
[[build.codegen]]
target = "rust"
out = "generated/rust"
format = "auto"
[[build.exports]]
format = "binary"
out = "generated/config.sora"
schema_lock captures the normalized schema, excel_templates writes workbooks with generated headers, build.codegen declares language output, and build.exports declares runtime data output.
Schema
[[enums]]
name = "ItemType"
values = ["Weapon", "Armor", "Material", "Consumable"]
[[tables]]
name = "Item"
mode = "map"
key = "id"
[tables.source]
format = "xlsx"
file = "Item.xlsx"
sheet = "Item"
[[tables.fields]]
name = "id"
type = "i32"
comment = "Item id"
[[tables.fields]]
name = "name"
type = "string"
comment = "Display name"
[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"
comment = "Item category"
[[tables.fields]]
name = "max_stack"
type = "i32"
default = "1"
range = [1, 9999]
comment = "Stack limit"
This table uses mode = "map", so the generated runtime exposes keyed lookup by id.
Excel Template
Generate a workbook:
sora excel-template --project project.toml --out generated/excel
The generated sheet has metadata rows above the editable data area:
| #field | id | name | item_type | max_stack |
|---|---|---|---|---|
| #type | i32 | string | enum<ItemType> | i32 |
| #input | key | | | range=1..9999 |
| #desc | Item id | Display name | Item category | Stack limit |
Rows start after the generated header:
| id | name | item_type | max_stack |
|---|---|---|---|
| 1001 | Iron Sword | Weapon | 1 |
| 2001 | Health Potion | Consumable | 99 |
Copy the workbook to data/Item.xlsx after generating it, or point your source file at the generated location during experiments.
Build
Run the configured outputs:
sora build --project project.toml
Expected artifacts:
- generated/schema.lock
- generated/excel/Item.xlsx
- generated/rust
- generated/config.sora
Use sora check --project project.toml when you only want schema validation.
Excel Workflow
Excel support is designed around generated templates. The schema owns the table shape; Excel is an editable projection of that schema.
Generate Templates
There are two ways to generate Excel templates.
The direct command only writes templates:
sora excel-template --project project.toml --out generated/excel
This reads the schema from project.toml and writes generated workbooks under generated/excel. The directory is safe to delete and regenerate because it should contain template artifacts, not hand-edited source data.
The build workflow can do the same thing when excel_templates is configured:
[build]
excel_templates = "generated/excel"
sora build --project project.toml
Both paths generate the same kind of template files. The direct command only writes Excel templates. sora build runs the template output together with the other configured build outputs such as schema locks, code generation, and exports.
Template Directory vs Data Directory
excel_templates is an output directory for templates. It is not the runtime data input directory. Data input normally comes from [build].data_root or the --data-root command option.
The usual layout keeps these paths separate:
| Path | Role | Can be regenerated |
|---|---|---|
| generated/excel | Generated workbook templates with schema headers. | Yes |
| data | Edited table rows used by export and build. | No |
Do not point excel-template --out or [build].excel_templates at a directory that already contains edited data workbooks unless replacing those files is intentional. Use generated templates for new workbooks; use excel-sync for workbooks that already contain real data.
Sync Existing Workbooks
For real projects with existing data, use excel-sync instead of copying rows into a fresh template. It updates workbook headers from the current schema while preserving data rows:
sora excel-sync --project project.toml --data-root data
Without --write, the command only previews what would change. To write the updated workbook files:
sora excel-sync --project project.toml --data-root data --write
When writing an existing workbook, Sora first copies the old file under data/.sora-backup/<timestamp>/.
Sync matches columns by the #field row, not by column position:
- existing schema fields keep their data;
- new schema fields are added as empty columns;
- changed type, parser, scope, range, length, comments, and table metadata refresh the generated header rows;
- fields removed from schema are not deleted from Excel. They are kept as legacy columns ignored by Sora, so designers can delete them manually when they are ready;
- non-schema sheets in the same workbook are preserved as value-only sheets.
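The name-based matching described above can be sketched in a few lines. This is only an illustration of the rule, not Sora's implementation: SyncPlan and plan_sync are hypothetical names, and the sketch ignores header metadata, backups, and column order.

```rust
/// Hypothetical summary of what a header sync would do to one sheet.
#[derive(Debug, Default, PartialEq)]
pub struct SyncPlan {
    /// Schema fields already present in the `#field` row; their data is kept.
    pub kept: Vec<String>,
    /// Schema fields missing from the workbook; they become empty columns.
    pub added: Vec<String>,
    /// Workbook columns no longer in the schema; kept as legacy columns.
    pub legacy: Vec<String>,
}

/// Compare fields by name only, never by column position.
pub fn plan_sync(schema_fields: &[&str], workbook_fields: &[&str]) -> SyncPlan {
    let mut plan = SyncPlan::default();
    for &field in schema_fields {
        if workbook_fields.contains(&field) {
            plan.kept.push(field.to_string());
        } else {
            plan.added.push(field.to_string());
        }
    }
    for &field in workbook_fields {
        if !schema_fields.contains(&field) {
            plan.legacy.push(field.to_string());
        }
    }
    plan
}
```

With schema fields id, name, max_stack and a workbook whose #field row reads name, id, old_price, the plan keeps id and name, adds max_stack, and marks old_price as legacy, regardless of the column order in the sheet.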
The workbook and sheet for each table come from that table’s source:
[[tables]]
name = "Item"
[tables.source]
format = "xlsx"
file = "Core.xlsx"
sheet = "Item"
[[tables]]
name = "Quest"
[tables.source]
format = "xlsx"
file = "Core.xlsx"
sheet = "Quest"
This writes two sheets, Item and Quest, into generated/excel/Core.xlsx.
A table with a different source file goes into a different workbook:
[tables.source]
format = "xlsx"
file = "Battle.xlsx"
sheet = "Skill"
This writes the Skill sheet into generated/excel/Battle.xlsx.
Header Rows
Generated sheets include several header rows:
| Row | Purpose |
|---|---|
| @table metadata | Table name, mode, key, scope, and schema hash. |
| #name | Display name row for the spreadsheet. |
| #field | Stable schema field names read by Sora. |
| #type | Type hints such as i32, enum<ItemType>, or struct<Cost>(kind: enum<ResourceKind>, id: i32, count: i32). |
| #scope | Scope information for each field. |
| #input | Input hints such as key, parser, range, length, or derived-field source. |
| #desc | Field comments for designers and reviewers. |
Data rows start after the generated header.
What Users Should Edit
Users should edit data rows. They should not hand-maintain field names, types, key metadata, input hints, or validation rules in Excel. Those rows are regenerated from schema changes.
If a column’s #input cell starts with from=, that field is derived from another table. Leave the generated placeholder in that column and edit the child table rows instead.
When the schema changes, run sora excel-sync --project project.toml --data-root data to preview header changes, then rerun with --write after reviewing them. This keeps spreadsheet editing convenient without making Excel a second schema language.
Common Field Shapes
Simple fields map directly to cells:
| id | name | max_stack |
|---|---|---|
| 1001 | Iron Sword | 1 |
Structured values use parsers when a cell needs a compact representation:
[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "tuple" }
comment = "Tuple: kind,id,count"
Example cell:
Item,1001,3
Collections can use JSON or map-style parsers:
[[tables.fields]]
name = "tags"
type = "set<string>"
parser = { kind = "json" }
default = "[\"misc\"]"
[[tables.fields]]
name = "attributes"
type = "map<string,i32>"
parser = { kind = "map" }
comment = "Map pairs: key,value|key,value"
Example cells:
["starter","melee"]
attack,12|speed,2
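To make the map cell format concrete, here is a minimal sketch of how a key,value|key,value cell could be turned into a map<string,i32>. The function name parse_map_cell is hypothetical, and treating an empty cell as an empty map and rejecting duplicate keys are assumptions of this sketch, not documented Sora behavior.

```rust
use std::collections::BTreeMap;

/// Parse a cell such as `attack,12|speed,2` into a string-to-i32 map.
pub fn parse_map_cell(cell: &str) -> Result<BTreeMap<String, i32>, String> {
    let mut out = BTreeMap::new();
    let cell = cell.trim();
    if cell.is_empty() {
        // Assumption: an empty cell means an empty map.
        return Ok(out);
    }
    for pair in cell.split('|') {
        // Each pair must contain a comma separating key and value.
        let (key, value) = pair
            .split_once(',')
            .ok_or_else(|| format!("expected key,value but found `{}`", pair))?;
        let value: i32 = value
            .trim()
            .parse()
            .map_err(|e| format!("bad value in `{}`: {}", pair, e))?;
        // Assumption: repeating a key is an error rather than an overwrite.
        if out.insert(key.trim().to_string(), value).is_some() {
            return Err(format!("duplicate key `{}`", key.trim()));
        }
    }
    Ok(out)
}
```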
Load Generated Code
Generated code contains strongly typed row models, table containers, and a config loader for the selected runtime format.
Choose a Runtime Format
[codegen.rust]
runtime_format = "sora"
The runtime format selected by code generation must match an exported bundle:
[[build.exports]]
format = "binary"
out = "generated/config.sora"
runtime_format = "sora" corresponds to the binary export. json, cbor, and sora-protobuf correspond to their matching export formats.
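The pairing rule can be written down as a small lookup. This is a sketch for illustration only: the function names are hypothetical, and it assumes the export format identifiers for json, cbor, and sora-protobuf are spelled the same as the runtime format names.

```rust
/// Return the export format that a given runtime_format needs.
pub fn export_format_for(runtime_format: &str) -> Option<&'static str> {
    match runtime_format {
        // The sora runtime format loads the binary export.
        "sora" => Some("binary"),
        // Assumption: these export identifiers match the runtime names.
        "json" => Some("json"),
        "cbor" => Some("cbor"),
        "sora-protobuf" => Some("sora-protobuf"),
        _ => None,
    }
}

/// Check that the configured exports include the bundle the codegen target expects.
pub fn has_matching_export(runtime_format: &str, exports: &[&str]) -> bool {
    match export_format_for(runtime_format) {
        Some(needed) => exports.contains(&needed),
        None => false,
    }
}
```

A check like this is useful in a build script: Rust codegen with runtime_format = "sora" and only a json export configured would be flagged before the game tries to load a bundle it cannot read.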
Rust Example
mod generated;

use generated::SoraConfig;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read the bundle written by the binary export.
    let bytes = std::fs::read("generated/config.sora")?;
    let config = SoraConfig::from_sora_bytes(&bytes)?;

    // Item is a map table, so rows are looked up by their id key.
    if let Some(item) = config.items.get(&1001) {
        println!("{} stacks to {}", item.name, item.max_stack);
    }
    Ok(())
}
Exact names are derived from schema names and target language conventions. For example, a table named Item generally becomes an item row type plus an item table accessor.
Adapter Targets
Some targets expose adapter hooks for formats where the ecosystem dependency should be supplied by the application. For example, Lua, Erlang, and Dart can accept decode_cbor or decode_sora_protobuf functions instead of embedding a specific third-party decoder.
See Runtime Adapters for examples.
Schema
A schema module is a TOML, YAML, JSON, or Lua file included by a project manifest.
package = "game_config"
includes = ["schema/items.toml", "schema/skills.toml"]
Schema modules are the source of truth for Sora. They describe the stable data contract; source files such as Excel workbooks contain row values that are checked against that contract.
See Schema Formats for the supported file formats and equivalent TOML/YAML/JSON/Lua shapes.
Enums
[[enums]]
name = "ItemType"
values = ["Weapon", "Armor", "Material"]
Enums are stored by symbolic value in editable data and generated as native enum-like constructs when the target language supports them.
Structs
[[structs]]
name = "Cost"
[[structs.fields]]
name = "gold"
type = "i32"
Structs model repeated object shapes. They are useful for costs, rewards, coordinates, stat modifiers, and other nested values.
Unions
[[unions]]
name = "RewardAction"
tag = "type"
[[unions.variants]]
name = "AddItem"
[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"
Unions model tagged variants. The tag field is the discriminator name used in source data and runtime values.
Tables
[[tables]]
name = "Item"
mode = "map"
key = "id"
[tables.source]
format = "xlsx"
file = "Item.xlsx"
sheet = "Item"
Tables define source-backed row collections. See Tables for modes, keys, sources, indexes, and derived fields.
Field Types
Common field types include primitives, enums, structs, unions, references, lists, sets, fixed arrays, maps, and optionals:
i32
string
enum<ItemType>
struct<Cost>
union<Reward>
ref<Item.id>
list<i32>
set<string>
array<i32,3>
map<string,i32>
optional<string>
See Types for the full list and examples.
See Cell Parsers for compact Excel/CSV cell formats and column projections such as split, tuple, columns, tuple_list, map, and json.
Schema Formats
Sora schema files can be written as TOML, YAML, JSON, or Lua. All formats load into the same schema model and produce the same IR, generated code, Excel templates, exports, and schema locks.
The file extension selects the parser:
| Extension | Format |
|---|---|
| .toml | TOML |
| .yaml, .yml | YAML |
| .json | JSON |
| .lua | Lua |
Includes are parsed by their own file extension, so a YAML project can include TOML, JSON, or Lua modules, and any supported project format can mix supported module formats.
TOML
package = "game_config"
includes = ["schema/items.toml"]
[[enums]]
name = "ItemType"
values = ["Weapon", "Armor"]
[[tables]]
name = "Item"
mode = "map"
key = "id"
[[tables.fields]]
name = "id"
type = "i32"
YAML
package: game_config
includes:
  - schema/items.yaml
enums:
  - name: ItemType
    values: [Weapon, Armor]
tables:
  - name: Item
    mode: map
    key: id
    fields:
      - name: id
        type: i32
JSON
{
  "package": "game_config",
  "includes": ["schema/items.json"],
  "enums": [
    { "name": "ItemType", "values": ["Weapon", "Armor"] }
  ],
  "tables": [
    {
      "name": "Item",
      "mode": "map",
      "key": "id",
      "fields": [
        { "name": "id", "type": "i32" }
      ]
    }
  ]
}
Lua
Lua schema files must return one table. The returned table uses the same field names as the TOML/YAML/JSON shapes. Lua schema loading is data-oriented; package, io, os, and debug are not available.
return {
  package = "game_config",
  includes = { "schema/items.lua" },
  enums = {
    { name = "ItemType", values = { "Weapon", "Armor" } },
  },
  tables = {
    {
      name = "Item",
      mode = "map",
      key = "id",
      fields = {
        { name = "id", type = "i32" },
      },
    },
  },
}
Project Build Config
The project file can also use YAML, JSON, or Lua for build:
package: game_config
includes:
  - schema/items.yaml
build:
  default_source_format: xlsx
  data_root: data
  schema_lock: generated/schema.lock
  excel_templates: generated/excel
  codegen:
    - target: rust
      out: generated/rust
      format: auto
  exports:
    - format: binary
      out: generated/config.sora
{
  "package": "game_config",
  "includes": ["schema/items.json"],
  "build": {
    "default_source_format": "xlsx",
    "data_root": "data",
    "schema_lock": "generated/schema.lock",
    "excel_templates": "generated/excel",
    "codegen": [
      { "target": "rust", "out": "generated/rust", "format": "auto" }
    ],
    "exports": [
      { "format": "binary", "out": "generated/config.sora" }
    ]
  }
}
return {
  package = "game_config",
  includes = { "schema/items.lua" },
  build = {
    default_source_format = "xlsx",
    data_root = "data",
    schema_lock = "generated/schema.lock",
    excel_templates = "generated/excel",
    codegen = {
      { target = "rust", out = "generated/rust", format = "auto" },
    },
    exports = {
      { format = "binary", out = "generated/config.sora" },
    },
  },
}
Tables
Tables are source-backed row collections. A table schema declares the table mode, source location, fields, and optional indexes.
Modes
| Mode | Shape | Typical Use |
|---|---|---|
| map | Rows keyed by one field. | Items, quests, levels, buffs. |
| list | Ordered rows without keyed lookup. | Drop entries, weighted pools, ordered steps. |
| singleton | One row. | Global settings, tuning constants. |
[[tables]]
name = "Item"
mode = "map"
key = "id"
[[tables.fields]]
name = "id"
type = "i32"
For map tables, key names the table’s primary key field. Sora uses it for row uniqueness, generated lookup APIs, Excel template hints, and ref<Table.key> validation.
Source
[tables.source]
format = "xlsx"
file = "Core.xlsx"
sheet = "Item"
format can be omitted when the project or command provides a default source format. file is resolved under the command’s --data-root during export and validation.
Built-in source formats are xlsx, csv, toml, json, and yaml. JSON and YAML table files are arrays of row objects:
[
{ "id": 1001, "name": "Iron Sword" },
{ "id": 1002, "name": "Health Potion" }
]
For JSON and YAML, file can also point to a directory. In that case Sora recursively reads every matching .json, .yaml, or .yml file as one row object, sorted by path.
Indexes
Indexes are extra lookup paths on a table. They are different from the key of a mode = "map" table:
| Concept | Purpose |
|---|---|
| table key | The primary key. A map table uses it to keep rows unique and to generate the main get(id) lookup. |
| [[tables.indexes]] | Additional lookup paths, such as lookup by name, grouping by type, or finding drops by stage. |
For example, an Item table can use id as its primary key:
[[tables]]
name = "Item"
mode = "map"
key = "id"
[[tables.fields]]
name = "id"
type = "i32"
[[tables.fields]]
name = "name"
type = "string"
[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"
Add a unique index when another field should also identify at most one row:
[[tables.indexes]]
name = "by_name"
fields = ["name"]
unique = true
Example data:
| id | name | item_type |
|---|---|---|
| 1001 | Iron Sword | Weapon |
| 1002 | Wood Shield | Armor |
unique = true means name cannot repeat. Generated code for targets that support the index can expose a helper similar to get_by_name("Iron Sword"), returning one row or no row.
Use a non-unique index when a key can match many rows:
[[tables.indexes]]
name = "by_item_type"
fields = ["item_type"]
unique = false
Example data:
| id | name | item_type |
|---|---|---|
| 1001 | Iron Sword | Weapon |
| 1002 | Bronze Axe | Weapon |
| 2001 | Wood Shield | Armor |
unique = false means one key can match several rows. Generated code for targets that support the index can expose a helper similar to get_by_item_type(ItemType::Weapon), returning the matching rows.
fields is a list, so a unique index can also express combined uniqueness:
[[tables.indexes]]
name = "by_world_stage"
fields = ["world", "stage"]
unique = true
This requires each (world, stage) pair to be unique. For example, (1, 1) can appear once, while (1, 2) is a different key. Current generated lookup helpers mainly support single-field indexes on non-singleton tables; combined indexes are most useful for validation today.
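The difference between a unique and a non-unique index can be sketched with two hash maps over the example rows. This is a hand-written illustration, not the code Sora generates: the Item struct, the item helper, and both builder functions are hypothetical, and item_type is kept as a plain string to keep the sketch short.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub id: i32,
    pub name: String,
    pub item_type: String,
}

/// Small constructor used only to keep the examples short.
pub fn item(id: i32, name: &str, item_type: &str) -> Item {
    Item { id, name: name.to_string(), item_type: item_type.to_string() }
}

/// Unique index: every name maps to exactly one row position.
/// A repeated name is reported as an error, mirroring unique = true.
pub fn build_unique_by_name(rows: &[Item]) -> Result<HashMap<String, usize>, String> {
    let mut index = HashMap::new();
    for (pos, row) in rows.iter().enumerate() {
        if index.insert(row.name.clone(), pos).is_some() {
            return Err(format!("duplicate name `{}`", row.name));
        }
    }
    Ok(index)
}

/// Non-unique index: one item_type maps to all matching row positions.
pub fn build_by_item_type(rows: &[Item]) -> HashMap<String, Vec<usize>> {
    let mut index: HashMap<String, Vec<usize>> = HashMap::new();
    for (pos, row) in rows.iter().enumerate() {
        index.entry(row.item_type.clone()).or_default().push(pos);
    }
    index
}
```

A combined index such as by_world_stage would follow the same pattern with a tuple key, for example HashMap<(i32, i32), usize>.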
Validation
Sora validates table rows after loading source data:
- non-optional fields must be present unless a default exists;
- key fields must be unique for map tables;
- enum values must be valid;
- references must point to existing rows;
- numeric ranges and length ranges must pass;
- parser output must match the declared field type.
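Two of the checks above, key uniqueness and inclusive numeric ranges, can be sketched as follows. The Row struct, the validate_rows function, and the error wording are hypothetical; Sora's real validator covers more rules and reports errors in its own format.

```rust
use std::collections::HashSet;

/// A cut-down Item row with only the fields this sketch validates.
pub struct Row {
    pub id: i32,
    pub max_stack: i32,
}

/// Collect all problems instead of stopping at the first one,
/// so a designer can fix a whole sheet in one pass.
pub fn validate_rows(rows: &[Row], range: (i32, i32)) -> Vec<String> {
    let mut errors = Vec::new();
    let mut seen = HashSet::new();
    for (n, row) in rows.iter().enumerate() {
        // Map tables require a unique key per row.
        if !seen.insert(row.id) {
            errors.push(format!("row {}: duplicate key {}", n, row.id));
        }
        // Ranges are inclusive on both ends, as in range = [1, 9999].
        if row.max_stack < range.0 || row.max_stack > range.1 {
            errors.push(format!(
                "row {}: max_stack {} outside [{}, {}]",
                n, row.max_stack, range.0, range.1
            ));
        }
    }
    errors
}
```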
Types
Sora type expressions are written as strings in schema fields.
Primitive Types
| Type | Meaning |
|---|---|
| bool | Boolean value. |
| i8 | 8-bit signed integer. |
| u8 | 8-bit unsigned integer. |
| i16 | 16-bit signed integer. |
| u16 | 16-bit unsigned integer. |
| i32 | 32-bit signed integer. |
| u32 | 32-bit unsigned integer. |
| i64 | 64-bit signed integer. |
| f32 | 32-bit floating point value. |
| f64 | 64-bit floating point value. |
| string | UTF-8 string. |
| duration | Non-negative duration written as units such as 500ms, 30s, 15m, 2h, 7d, or 1h 30m. Units must be ordered from largest to smallest: d, h, m, s, ms. Runtime data stores milliseconds. |
| text | Localization text key. See Localization. |
Integer widths are validated by Sora before export. Some target languages do not have unsigned small integer types, so generated code may use a wider signed type while preserving the schema range.
[[tables.fields]]
name = "level"
type = "u16"
range = [1, 100]
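The duration syntax from the primitive type table can be illustrated with a small parser that converts text to milliseconds. This is a sketch under stated assumptions, not Sora's parser: parse_duration_ms is a hypothetical name, parts are assumed to be separated by whitespace, and each unit is assumed to appear at most once.

```rust
/// Parse text such as `1h 30m` or `500ms` into milliseconds.
pub fn parse_duration_ms(text: &str) -> Result<u64, String> {
    // Units from largest to smallest with their size in milliseconds.
    const UNITS: [(&str, u64); 5] = [
        ("d", 86_400_000),
        ("h", 3_600_000),
        ("m", 60_000),
        ("s", 1_000),
        ("ms", 1),
    ];
    let mut total: u64 = 0;
    let mut next_allowed: usize = 0;
    let mut saw_token = false;
    for token in text.split_whitespace() {
        // Split the token into its leading digits and trailing unit.
        let split = token
            .find(|c: char| !c.is_ascii_digit())
            .ok_or_else(|| format!("missing unit in `{}`", token))?;
        let (digits, unit) = token.split_at(split);
        let value: u64 = digits
            .parse()
            .map_err(|_| format!("missing or invalid number in `{}`", token))?;
        let pos = UNITS
            .iter()
            .position(|(name, _)| *name == unit)
            .ok_or_else(|| format!("unknown unit `{}`", unit))?;
        // Enforce largest-to-smallest ordering: d, h, m, s, ms.
        if pos < next_allowed {
            return Err(format!("unit `{}` is repeated or out of order", unit));
        }
        next_allowed = pos + 1;
        let millis = value
            .checked_mul(UNITS[pos].1)
            .ok_or_else(|| "duration overflow".to_string())?;
        total = total
            .checked_add(millis)
            .ok_or_else(|| "duration overflow".to_string())?;
        saw_token = true;
    }
    if saw_token { Ok(total) } else { Err("empty duration".to_string()) }
}
```

Because the parser only accepts leading ASCII digits, a negative value such as -5s is rejected, which matches the non-negative rule.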
Named Types
| Type | Example |
|---|---|
| Enum | enum<ItemType> |
| Struct | struct<ResourceCost> |
| Union | union<RewardAction> |
| Reference | ref<Item.id> |
References must point to the primary key of a mode = "map" table. Containers can wrap references, for example list<ref<Item.id>>.
[[tables.fields]]
name = "item_type"
type = "enum<ItemType>"
[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "tuple" }
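The reference rule stated above, that a ref<Item.id> value must match the primary key of an existing Item row, amounts to a set membership test. The sketch below shows that idea for a field of type list<ref<Item.id>>; dangling_refs is a hypothetical helper, not part of Sora's API.

```rust
use std::collections::HashSet;

/// Return every referenced id that has no matching row in the target table.
pub fn dangling_refs(item_ids: &[i32], refs: &[i32]) -> Vec<i32> {
    // Primary keys of the referenced map table.
    let known: HashSet<i32> = item_ids.iter().copied().collect();
    refs.iter().copied().filter(|id| !known.contains(id)).collect()
}
```

An empty result means every reference resolves; any returned id would be reported as a validation error.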
Collections
| Type | Meaning |
|---|---|
| list<T> | Ordered repeated values. |
| set<T> | Unique repeated values. |
| array<T,N> | Fixed-length repeated values. |
| map<K,V> | Key/value pairs. |
| optional<T> | Nullable or absent value. |
[[tables.fields]]
name = "tags"
type = "set<string>"
parser = { kind = "json" }
default = "[\"misc\"]"
[[tables.fields]]
name = "attributes"
type = "map<string,i32>"
parser = { kind = "map" }
Cell Examples
These examples show what a designer would put in an Excel or CSV cell:
| Field type | Parser | Cell value |
|---|---|---|
| u16 | none | 1001 |
| enum<ItemType> | none | Weapon |
| list<i32> | none or split | 1,2,3 |
| duration | none | 1h 30m |
| text | none | quest.1001.title |
| set<string> | json | ["starter","melee"] |
| struct<ResourceCost> | tuple | Gold,0,100 |
| struct<ResourceCost> | columns | spread across cost_kind, cost_id, cost_count columns |
| map<string,i32> | map | atk,10\|hp,20 |
| union<EventCondition> | json | {"type":"QuestCompleted","quest_id":5002} |
| optional<ref<Item.id>> | none | empty cell or 1001 |
Field Rules
[[tables.fields]], [[structs.fields]], and [[unions.variants.fields]] share the common field properties. Table fields have extra table-only properties for derived values; those properties are invalid on struct fields and union variant fields. A table primary key is declared once on the table itself with key = "field_name".
Field presence is part of the type: optional<T> means the value may be absent or null, while every other type is required unless a default fills the missing value.
For TOML/JSON/YAML-style object inputs, a field can be absent from the object. For Excel and CSV, the column must exist in the header; an omitted cell, blank cell, or short CSV record is treated as an empty cell.
| Schema field | Object field absent | Excel/CSV cell empty |
|---|---|---|
| type = "i32" | Validation error. | Validation error. |
| type = "optional<i32>" | null. | null. |
| type = "i32" plus default = "1" | 1. | 1. |
| type = "optional<i32>" plus default = "1" | 1. | null. |
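The presence table can be read as a small decision function. The sketch below encodes exactly the four rows shown; Missing, Resolved, and resolve_missing are hypothetical names chosen for this illustration, and the default is kept as an unparsed string.

```rust
/// How the value went missing in the source data.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Missing {
    /// TOML/JSON/YAML object without the field.
    ObjectFieldAbsent,
    /// Excel/CSV cell that is blank or omitted.
    CellEmpty,
}

/// What the loader produces for a missing value.
#[derive(Debug, PartialEq)]
pub enum Resolved {
    Null,
    Default(String),
    Error,
}

pub fn resolve_missing(optional: bool, default: Option<&str>, missing: Missing) -> Resolved {
    match (optional, default, missing) {
        // An empty cell in an optional column is null even if a default exists.
        (true, _, Missing::CellEmpty) => Resolved::Null,
        // Otherwise a declared default fills the gap.
        (_, Some(value), _) => Resolved::Default(value.to_string()),
        // Optional without a default is null.
        (true, None, _) => Resolved::Null,
        // Required without a default is a validation error.
        (false, None, _) => Resolved::Error,
    }
}
```

The only asymmetric case is optional<i32> with a default: an absent object field takes the default, while an empty Excel or CSV cell stays null.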
| Property | Applies To | Purpose |
|---|---|---|
| name | all fields | Field name used in source data, validation errors, generated code, and exported runtime data. |
| type | all fields | Type expression such as i32, struct<ResourceCost>, or list<union<RewardAction>>. |
| default | all fields except derived fields | String value used when the source object field is absent or a required Excel/CSV cell is empty. |
| comment | all fields | Description used in generated Excel headers. |
| range | numeric fields, duration, and collection elements of those types | Inclusive numeric range, written as [min, max]. Duration ranges are milliseconds. |
| length | string, list, set, array, map | Inclusive length range, written as [min, max]. |
| parser | cell-based inputs and defaults | Cell parser hint. See Cell Parsers. |
| scope | all fields | Includes the field only for selected generation/export scopes. Defaults to all. |
| from | table fields only | Optional child-table source for a derived field. |
Defaults are written as strings because they are parsed through the same type-aware conversion path as source data.
from describes a field derived from matching rows in another table; see References and Derived Fields. Derived fields can be list<T>, T, or optional<T> and cannot declare default.
Enums, Structs, and Unions
These definitions let schemas model more than flat tables.
Enums
[[enums]]
name = "Rarity"
values = ["Common", "Uncommon", "Rare", "Epic", "Legendary"]
Enums keep source data readable while generated code receives a constrained type.
Aliases can keep imported or legacy names readable:
[[enums.aliases]]
name = "Purple"
alias = "Epic"
Structs
[[structs]]
name = "ResourceCost"
[[structs.fields]]
name = "kind"
type = "enum<ResourceKind>"
[[structs.fields]]
name = "id"
type = "i32"
[[structs.fields]]
name = "count"
type = "i32"
range = [1, 999999]
Use structs for nested values that appear in many places. A field can reference a struct with type = "struct<ResourceCost>".
Struct fields use the same field properties as table fields, including name, type, default, comment, range, length, parser, and scope. Table-specific properties such as key and from are not meaningful for normal struct fields. See Types for the full field reference.
In cell-based inputs, a struct field can be written as JSON object text by default:
{"kind":"Gold","id":0,"count":100}
For compact cells, declare parser = { kind = "tuple" } on the field that references the struct. Tuple values follow the struct field order:
Gold,0,100
Unions
Use a union when one field can contain different shapes. For example, an event condition might be either “quest completed” or “player has item”:
{"type":"QuestCompleted","quest_id":5002}
{"type":"HasItem","item_id":1001,"count":2}
The type value selects which variant is present. The rest of the fields depend on that variant.
[[unions]]
name = "RewardAction"
tag = "type"
[[unions.variants]]
name = "AddItem"
[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"
[[unions.variants.fields]]
name = "count"
type = "i32"
[[unions.variants]]
name = "UnlockStage"
[[unions.variants.fields]]
name = "stage_id"
type = "ref<Stage.id>"
The schema above declares a RewardAction union with two variants, AddItem and UnlockStage. Unions suit conditions, rewards, triggers, and scripted actions.
The union tag defaults to type if omitted. Source data must include that tag with the variant name. The remaining fields must match the selected variant; unknown fields and missing non-optional variant fields are validation errors.
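The tag and field rules above can be sketched as a small check. This Rust sketch assumes a hypothetical Variant description in which every listed field is non-optional, and a row given as key/value text pairs; it illustrates the rules and is not Sora's validator.

```rust
/// Hypothetical variant description: name plus its non-optional field names.
struct Variant<'a> {
    name: &'a str,
    fields: &'a [&'a str],
}

fn check_union(tag: &str, variants: &[Variant], row: &[(&str, &str)]) -> Result<(), String> {
    // The tag field must be present and must name a declared variant.
    let variant_name = row
        .iter()
        .find(|(key, _)| *key == tag)
        .map(|(_, value)| *value)
        .ok_or_else(|| format!("missing tag field `{}`", tag))?;
    let variant = variants
        .iter()
        .find(|v| v.name == variant_name)
        .ok_or_else(|| format!("unknown variant `{}`", variant_name))?;
    // Every other field must belong to the selected variant.
    for (key, _) in row {
        if *key != tag && !variant.fields.contains(key) {
            return Err(format!("unknown field `{}` for variant `{}`", key, variant_name));
        }
    }
    // Every non-optional field of the selected variant must be present.
    for field in variant.fields {
        if !row.iter().any(|(key, _)| key == field) {
            return Err(format!("missing field `{}` for variant `{}`", field, variant_name));
        }
    }
    Ok(())
}
```

With the RewardAction schema, an AddItem row without count and an UnlockStage row that also carries count would both be rejected.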
The most direct Excel or CSV form is JSON object text in one cell:
| Field type | Cell value |
|---|---|
| union<RewardAction> | {"type":"AddItem","item_id":1001,"count":2} |
For a list of union values, declare parser = { kind = "json" } and write a JSON array:
[[tables.fields]]
name = "actions"
type = "list<union<RewardAction>>"
parser = { kind = "json" }
[
{"type":"AddItem","item_id":1001,"count":2},
{"type":"UnlockStage","stage_id":9002}
]
If you do not want JSON in Excel or CSV cells, a single union<T> field can be expanded into several columns. This action field is one union value:
[[tables.fields]]
name = "action"
type = "union<RewardAction>"
parser = { kind = "tagged_columns" }
The Excel sheet then has columns like this:
| A | B | C | D | E | F |
|---|---|---|---|---|---|
| id | name | action.type | action.item_id | action.count | action.stage_id |
| 1 | Give Sword | AddItem | 1001 | 2 | |
| 2 | Open Stage | UnlockStage | | | 9002 |
action.type contains the variant name. An AddItem row fills only item_id and count; an UnlockStage row fills only stage_id. Columns for other variants stay empty.
tagged_columns is only valid on a field whose type is exactly union<T>; it cannot be applied directly to list<union<T>>. When a parent field needs several union values, put each union value in a child row and derive the parent list from that child table:
[[tables.fields]]
name = "actions"
type = "list<union<RewardAction>>"
from = { table = "EventActionEntry", parent_key = "id", child_key = "event_id", field = "value", order_by = "seq" }
[[tables]]
name = "EventActionEntry"
mode = "list"
[[tables.fields]]
name = "event_id"
type = "ref<EventRule.id>"
[[tables.fields]]
name = "seq"
type = "i32"
[[tables.fields]]
name = "value"
type = "union<RewardAction>"
parser = { kind = "tagged_columns", prefix = "" }
The parent EventRule sheet keeps ordinary columns:
| A | B |
|---|---|
| id | name |
| 1 | First Event |
The child EventActionEntry sheet stores one action per row:
| A | B | C | D | E | F |
|---|---|---|---|---|---|
| event_id | seq | type | item_id | count | stage_id |
| 1 | 1 | AddItem | 1001 | 2 | |
| 1 | 2 | UnlockStage | | | 9002 |
On export, EventRule.actions receives two union values ordered by seq. The prefix = "" option makes the child table columns use plain names such as type, item_id, count, and stage_id; do not use an empty prefix if those names conflict with other fields on the same table.
See Cell Parsers for the exact column rules.
In TOML data files, unions can be written as normal nested tables:
[[rows]]
id = 1
condition = { type = "QuestCompleted", quest_id = 5002 }
actions = [
{ type = "AddItem", item_id = 1001, count = 2 },
{ type = "UnlockStage", stage_id = 9002 },
]
References and Derived Fields
References let one table point to another table’s primary key. Derived fields copy or assemble data from matching rows in another table.
| Feature | What source data stores | What runtime model gets |
|---|---|---|
| ref<Item.id> | The target row id, such as 1001. | The id value or a target-specific wrapper. |
| from = { ... } | Rows stay in a child table. | The parent row receives a copied/nested value. |
Use ref when the relationship itself should remain an id. Use from when exported data should contain a convenient nested field.
The target of a ref must be a mode = "map" table, and the referenced field must be that table’s key.
References
[[tables.fields]]
name = "required_item"
type = "ref<Item.id>"
Sora validates that every value points to an existing row in the referenced table.
References are still stored as values in source data. The generated runtime can expose them as key values or target-specific wrapper types depending on the language backend.
References can be nested in containers such as list<ref<Item.id>>, set<ref<Item.id>>, or optional<ref<Item.id>>. The same primary-key rule applies to the inner ref.
Derived Fields
A derived field is not read from the current table’s cell. It is built from matching rows in another table.
This keeps editable data normalized while generated runtime models can expose convenient nested values. For example, quest rewards can be stored as two tables:
Quest:
| id | name |
|---|---|
| 1001 | First Quest |
| 1002 | Second Quest |
QuestReward:
| quest_id | sort_order | item_id | count |
|---|---|---|---|
| 1001 | 1 | 2001 | 10 |
| 1001 | 2 | 2002 | 1 |
| 1002 | 1 | 2003 | 5 |
At runtime, Quest may want a direct rewards: list<Reward> field. Declare that the field comes from QuestReward:
[[structs]]
name = "Reward"
[[structs.fields]]
name = "item_id"
type = "ref<Item.id>"
[[structs.fields]]
name = "count"
type = "i32"
[[tables]]
name = "Quest"
mode = "map"
key = "id"
[[tables.fields]]
name = "id"
type = "i32"
[[tables.fields]]
name = "name"
type = "string"
[[tables.fields]]
name = "rewards"
type = "list<struct<Reward>>"
from = { table = "QuestReward", parent_key = "id", child_key = "quest_id", order_by = "sort_order" }
[[tables]]
name = "QuestReward"
mode = "list"
[[tables.fields]]
name = "quest_id"
type = "ref<Quest.id>"
[[tables.fields]]
name = "sort_order"
type = "i32"
[[tables.fields]]
name = "item_id"
type = "ref<Item.id>"
[[tables.fields]]
name = "count"
type = "i32"
This means:
- from.table = "QuestReward": read matching rows from the QuestReward child table.
- from.parent_key = "id": use the parent row’s Quest.id value for matching.
- from.child_key = "quest_id": match child rows where QuestReward.quest_id equals the parent key.
- from.order_by = "sort_order": when several child rows match, sort them by the child table’s sort_order field in ascending order.
With the example data above, Quest.id = 1001 receives two reward rows, ordered as 2001, then 2002.
The exported parent row is shaped as if rewards had been written directly on Quest:
{
"id": 1001,
"name": "First Quest",
"rewards": [
{"item_id": 2001, "count": 10},
{"item_id": 2002, "count": 1}
]
}
The field type controls how many child rows may match:
| Field type | Match count | Result when no row matches |
|---|---|---|
| list<T> | zero or more | empty list |
| optional<T> | zero or one | null |
| T | exactly one | validation error |
If T or optional<T> matches more than one child row, Sora reports an error.
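The matching, ordering, and cardinality rules can be sketched as three small functions over the QuestReward rows. This is an illustrative Rust model only: RewardRow and the derive_* functions are hypothetical names, and the use of a stable sort for equal sort_order values is an assumption.

```rust
/// One QuestReward child row (hypothetical in-memory form).
#[derive(Debug, Clone, PartialEq)]
struct RewardRow {
    quest_id: i32,
    sort_order: i32,
    item_id: i32,
    count: i32,
}

/// list<T>: zero or more matches, sorted ascending by the order_by field.
fn derive_list(parent_id: i32, children: &[RewardRow]) -> Vec<RewardRow> {
    let mut matched: Vec<RewardRow> = children
        .iter()
        .filter(|row| row.quest_id == parent_id)
        .cloned()
        .collect();
    matched.sort_by_key(|row| row.sort_order); // stable sort (assumption)
    matched
}

/// optional<T>: zero matches is null, more than one match is an error.
fn derive_optional(parent_id: i32, children: &[RewardRow]) -> Result<Option<RewardRow>, String> {
    let mut matched = derive_list(parent_id, children);
    match matched.len() {
        0 => Ok(None),
        1 => Ok(matched.pop()),
        n => Err(format!("{} child rows match parent {}", n, parent_id)),
    }
}

/// T: exactly one match is required.
fn derive_required(parent_id: i32, children: &[RewardRow]) -> Result<RewardRow, String> {
    derive_optional(parent_id, children)?
        .ok_or_else(|| format!("no child row matches parent {}", parent_id))
}
```

For the example data, derive_list(1001, ...) yields items 2001 then 2002, while derive_optional(1001, ...) fails because two rows match.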
Copying One Child Field
Without from.field, Sora assembles a struct from child table fields with the same names as the struct fields.
When the parent should receive one field from the child row instead, set from.field:
[[unions]]
name = "EventCondition"
tag = "type"
[[unions.variants]]
name = "QuestCompleted"
[[unions.variants.fields]]
name = "quest_id"
type = "ref<Quest.id>"
[[unions.variants]]
name = "HasItem"
[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"
[[unions.variants.fields]]
name = "count"
type = "i32"
[[tables.fields]]
name = "condition"
type = "union<EventCondition>"
from = { table = "EventConditionEntry", parent_key = "id", child_key = "event_id", field = "value" }
[[tables]]
name = "EventConditionEntry"
mode = "list"
[[tables.fields]]
name = "event_id"
type = "ref<Event.id>"
[[tables.fields]]
name = "value"
type = "union<EventCondition>"
parser = { kind = "tagged_columns", prefix = "" }
This means Event.condition receives EventConditionEntry.value for the child row whose event_id matches Event.id. The child table may still contain helper columns such as id, event_id, notes, or sort fields; only the value field named by from.field is copied into the parent field.
In Excel, EventConditionEntry can look like this:
| A | B | C | D | E |
|---|---|---|---|---|
| event_id | type | quest_id | item_id | count |
| 1 | QuestCompleted | 5002 | | |
| 2 | HasItem | | 1001 | 2 |
From Options
The from object has these options:
| Option | Required | Meaning |
|---|---|---|
| table | yes | Child table name. Sora scans this table for matching rows. |
| parent_key | yes | Field name on the parent table. Each parent row uses this field value for matching. |
| child_key | yes | Field name on the child table. A child row is selected when this value equals the parent key. |
| field | no | Field name on the child table. When present, Sora copies this field’s value instead of assembling a struct from the child row. |
| order_by | no | Field name on the child table. When present, matched child rows are sorted by this field in ascending order. |
order_by is a field name, not an expression. There is no desc, multi-field ordering, filtering, or custom sort syntax. If order_by is omitted, matched rows keep the source table read order.
The order_by field must exist on the child table. It is usually an i32 ordering field such as sort_order, seq, or rank.
Without from.field, the derived value type must be a struct, either list<struct<...>>, struct<...>, or optional<struct<...>>. Struct fields are copied from child table fields with the same names:
[[structs]]
name = "Reward"
[[structs.fields]]
name = "item_id"
type = "ref<Item.id>"
[[structs.fields]]
name = "count"
type = "i32"
Here Reward.item_id and Reward.count must both exist as compatible fields on QuestReward.
With from.field, the derived value type must be compatible with that child field. For example, type = "union<EventCondition>" can derive from a child field value whose type is also union<EventCondition>.
A derived field cannot also declare default. Its value comes from matched child rows.
Multiple Derived Fields from One Child Table
Several parent tables can derive fields from the same child table. This does not consume or move child rows. It reads the child table and copies matching values into each parent field.
For example, both Quest and QuestPreview can receive rewards from QuestReward:
[[tables]]
name = "Quest"
mode = "map"
key = "id"
[[tables.fields]]
name = "rewards"
type = "list<struct<Reward>>"
from = { table = "QuestReward", parent_key = "id", child_key = "quest_id", order_by = "sort_order" }
[[tables]]
name = "QuestPreview"
mode = "map"
key = "id"
[[tables.fields]]
name = "rewards"
type = "list<struct<Reward>>"
from = { table = "QuestReward", parent_key = "id", child_key = "quest_id", order_by = "sort_order" }
If both Quest.id = 1001 and QuestPreview.id = 1001 exist, both parent rows receive the reward list from QuestReward.quest_id = 1001. Sora does not mark the child row as already used by Quest, and it does not remove the row from QuestReward.
Cell Parsers
Parsers are only for cell-based inputs such as Excel and CSV. Most parsers tell Sora how to turn one cell into a typed value; projection parsers such as columns and tagged_columns tell Sora how one field maps to several input columns. String default values use the same parser path for single-cell parsers. TOML row data can usually use native TOML arrays and tables instead.
Use a parser when the default cell format is too verbose or ambiguous:
[[tables.fields]]
name = "tags"
type = "list<string>"
parser = { kind = "split", separator = "|" }
With that schema, the cell value is:
starter|melee|weapon
Parser options are string values. Unknown parser kinds, unsupported options, and empty option values fail during schema normalization. The exception is projection prefixes such as columns.prefix and tagged_columns.prefix, where "" is meaningful.
Custom Lua Parsers
Projects can load project-local Lua parser scripts from project.toml:
[parsers]
scripts = ["tools/parsers.lua"]
Script paths are resolved relative to the project file. After that, every command that reads the project can use the custom parsers without repeating command-line flags:
sora build --project project.toml
sora export --project project.toml --data-root data --format json --out generated/config.json
CLI commands can also load temporary parser scripts with the global --parser-script option:
sora --parser-script tools/parsers.lua build --project project.toml
sora --parser-script tools/parsers.lua export --project project.toml --data-root data --format json --out generated/config.json
The option can be repeated; scripts passed this way are loaded after the project-configured scripts. Custom parsers are trusted project code. Sora loads them with a limited Lua standard library and does not expose io, os, package, or debug.
A parser script returns a table with a parsers entry. Each parser must define parse(cell, ctx). options lists the parser options it supports. validate(field) is optional and runs during schema normalization.
return {
parsers = {
slug = {
options = { "prefix" },
validate = function(field)
if field.type ~= "string" then
error("slug parser requires string")
end
end,
parse = function(cell, ctx)
local text = string.lower(string.gsub(cell.text, "%s+", "-"))
if ctx.options.prefix ~= nil then
return ctx.options.prefix .. text
end
return text
end,
},
},
}
Schema fields use the custom parser by name:
[[tables.fields]]
name = "tag"
type = "string"
parser = { kind = "slug", prefix = "item-" }
cell contains kind, text, and value where applicable. ctx contains field, type, options, path, and location fields such as row, column, and sheet for worksheets. Lua return values map to Sora data values: nil, booleans, integers, floats, strings, array-like tables, and string-keyed tables.
Custom Lua parsers are single-cell parsers. They do not replace projection parsers such as columns or tagged_columns, cannot read neighboring cells, and do not change schema, source loading, or generated runtime behavior.
Default Parsing
If a field has no parser, Sora uses type-aware default parsing:
| Type | Cell format |
|---|---|
| bool | Boolean cells, true, false, or numeric cells where zero is false and non-zero is true. |
| i32, i64, ref<Table.key> | Integer cells, integer text, or whole-number float cells. |
| duration | Duration text using d, h, m, s, or ms, for example 500ms, 30s, or 1h 30m. Units must be ordered from largest to smallest. |
| f32, f64 | Numeric cells or numeric text. |
| string, enum<Name> | Cell display text. |
| struct<Name>, union<Name> | JSON object text. |
| list<T>, set<T>, array<T,N> | Comma-separated text. Use json for JSON arrays. |
| map<K,V> | JSON array of two-item pairs, for example [["atk",10],["hp",20]]. |
| optional<T> | Empty cell becomes null; otherwise the inner T is parsed. |
Default collection parsing is intentionally simple. Primitive items are parsed by type. Struct and union collection items must be JSON object text. Nested collections cannot be represented safely with one separator; use parser = { kind = "json" }.
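As an illustration of the duration rule in the table above, here is a minimal Rust sketch that converts duration text to milliseconds and enforces largest-to-smallest unit order. It assumes parts are separated by whitespace, as in 1h 30m, and that a unit may not repeat; parse_duration_ms is a hypothetical name and Sora's actual parser may accept more forms.

```rust
/// Parse text such as "500ms", "30s", or "1h 30m" into milliseconds.
fn parse_duration_ms(text: &str) -> Result<u64, String> {
    // (unit suffix, milliseconds per unit), ordered from largest to smallest.
    const UNITS: [(&str, u64); 5] = [
        ("d", 86_400_000),
        ("h", 3_600_000),
        ("m", 60_000),
        ("s", 1_000),
        ("ms", 1),
    ];
    let mut total: u64 = 0;
    let mut next_allowed = 0usize; // smallest UNITS index the next part may use
    let mut seen_any = false;
    for part in text.split_whitespace() {
        // Split "30m" into its digits and its unit suffix.
        let digits_end = part
            .find(|c: char| !c.is_ascii_digit())
            .ok_or_else(|| format!("missing unit in `{}`", part))?;
        let (digits, suffix) = part.split_at(digits_end);
        let amount: u64 = digits
            .parse()
            .map_err(|_| format!("bad number in `{}`", part))?;
        let index = UNITS
            .iter()
            .position(|(name, _)| *name == suffix)
            .ok_or_else(|| format!("unknown unit `{}`", suffix))?;
        if index < next_allowed {
            return Err(format!("unit `{}` is out of order", suffix));
        }
        next_allowed = index + 1;
        total = amount
            .checked_mul(UNITS[index].1)
            .and_then(|ms| total.checked_add(ms))
            .ok_or_else(|| "duration overflows u64 milliseconds".to_string())?;
        seen_any = true;
    }
    if seen_any {
        Ok(total)
    } else {
        Err("empty duration".to_string())
    }
}
```

Under these assumptions 1h 30m parses to 5400000 milliseconds, and 30m 1h is rejected because the units are out of order.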
Parser Summary
| Parser | Valid target types | Cell shape |
|---|---|---|
| split | list<T>, set<T>, array<T,N>, or optional around those types | a,b,c |
| tuple | struct<T> or optional<struct<T>> | Gold,0,100 |
| columns | struct<T> or optional<struct<T>> | Multiple columns |
| tuple_list | list<struct<T>>, set<struct<T>>, array<struct<T>,N>, or optional around those types | Gold,0,100\|Gem,0,5 |
| map | map<K,V> or optional<map<K,V>> | atk,10\|hp,20 |
| tagged_columns | union<T> only | Multiple columns |
| json | Any type | JSON value matching the field type |
array<T,N> checks the parsed item count. tuple checks the value count against the referenced struct’s field count.
split
Use split for a flat collection of primitive values, enums, refs, or simple values that can be separated reliably.
[[tables.fields]]
name = "starter_items"
type = "list<ref<Item.id>>"
parser = { kind = "split" }
Cell:
1001,1002,1003
Parsed value:
[1001,1002,1003]
Use separator when comma is not a good separator:
[[tables.fields]]
name = "tags"
type = "set<string>"
parser = { kind = "split", separator = "|" }
Cell:
starter|melee|weapon
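The split behaviour can be modelled in a few lines. This Rust sketch assumes that items are trimmed and that an empty cell produces an empty list; both are assumptions, and split_cell and split_i32 are hypothetical names rather than Sora functions.

```rust
/// Split one cell into raw items using the configured separator.
fn split_cell<'a>(cell: &'a str, separator: &str) -> Vec<&'a str> {
    // Assumption: a blank cell means an empty collection.
    if cell.trim().is_empty() {
        return Vec::new();
    }
    cell.split(separator).map(str::trim).collect()
}

/// Split a cell and parse each item as an integer, as for list<ref<Item.id>>.
fn split_i32(cell: &str, separator: &str) -> Result<Vec<i32>, String> {
    split_cell(cell, separator)
        .into_iter()
        .map(|item| {
            item.parse::<i32>()
                .map_err(|_| format!("`{}` is not an integer", item))
        })
        .collect()
}
```

Each item is then parsed by the element type, so a cell such as 1001,abc fails for an integer or ref element type.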
tuple
Use tuple when a single struct is small enough to fit naturally in one cell. Values follow the referenced struct’s field declaration order.
[[structs]]
name = "ResourceCost"
[[structs.fields]]
name = "kind"
type = "enum<ResourceKind>"
[[structs.fields]]
name = "id"
type = "i32"
[[structs.fields]]
name = "count"
type = "i32"
[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "tuple" }
Cell:
Gold,0,100
Parsed value:
{"kind":"Gold","id":0,"count":100}
Use separator if struct values themselves commonly contain commas:
parser = { kind = "tuple", separator = "|" }
Cell:
Gold|0|100
columns
Use columns when one struct should be edited as normal Excel or CSV columns instead of as JSON or one compact tuple cell. It is valid on struct<T> and optional<struct<T>> table fields.
[[structs]]
name = "ResourceCost"
[[structs.fields]]
name = "kind"
type = "enum<ResourceKind>"
[[structs.fields]]
name = "id"
type = "i32"
[[structs.fields]]
name = "count"
type = "i32"
[[tables.fields]]
name = "price"
type = "struct<ResourceCost>"
parser = { kind = "columns", prefix = "price_" }
CSV headers and row:
id,name,price_kind,price_id,price_count
1,Iron Sword,Gold,0,100
Parsed price value:
{"kind":"Gold","id":0,"count":100}
With the default prefix, a field named price projects columns such as price.kind, price.id, and price.count. Use prefix = "" only when the struct field names should live at the table’s top level. Sora rejects projected column name conflicts.
columns does not recursively project nested structs or unions. If a projected struct field is itself complex, either give that child field a single-cell parser such as tuple, split, map, or json, or move the nested data into a dedicated table and connect it with ref or a derived field. This keeps the spreadsheet narrow and keeps complex records reusable.
For generated XLSX templates, columns projected from the same columns field share the same header color.
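The prefix rule and the conflict check described above can be sketched as follows. This is a hypothetical Rust model: projected_columns and first_conflict are illustrative names, and Sora's real conflict detection may consider more than the table's plain column names.

```rust
/// Column names produced by a `columns` projection.
/// `prefix = None` models the default prefix, which is the field name plus a dot.
fn projected_columns(field_name: &str, prefix: Option<&str>, struct_fields: &[&str]) -> Vec<String> {
    let default_prefix = format!("{}.", field_name);
    let prefix = prefix.unwrap_or(default_prefix.as_str());
    struct_fields
        .iter()
        .map(|name| format!("{}{}", prefix, name))
        .collect()
}

/// First projected column that collides with an existing table column, if any.
fn first_conflict(table_columns: &[&str], projected: &[String]) -> Option<String> {
    projected
        .iter()
        .find(|column| table_columns.contains(&column.as_str()))
        .cloned()
}
```

With prefix = "" on the price field, the projected id column collides with the table's own id column, which is the kind of conflict Sora rejects.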
tuple_list
Use tuple_list for a list of small structs. separator splits fields inside one struct item. item_separator splits items in the list.
[[tables.fields]]
name = "materials"
type = "list<struct<ResourceCost>>"
parser = { kind = "tuple_list" }
Cell:
Item,2003,4|Gold,0,1000
Parsed value:
[
{"kind":"Item","id":2003,"count":4},
{"kind":"Gold","id":0,"count":1000}
]
Custom separators:
parser = { kind = "tuple_list", separator = ":", item_separator = ";" }
Cell:
Item:2003:4;Gold:0:1000
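The two separators nest: item_separator splits the cell into struct items, and separator splits each item into field values in struct declaration order. The Rust sketch below models this with untyped text values; parse_tuple and parse_tuple_list are hypothetical names, and typed conversion of each value is left out.

```rust
/// Pair one tuple cell's values with the struct's field names, in declaration order.
fn parse_tuple<'a>(
    cell: &'a str,
    fields: &[&'a str],
    separator: &str,
) -> Result<Vec<(&'a str, &'a str)>, String> {
    let values: Vec<&str> = cell.split(separator).collect();
    // The value count must equal the referenced struct's field count.
    if values.len() != fields.len() {
        return Err(format!(
            "expected {} values, found {} in `{}`",
            fields.len(),
            values.len(),
            cell
        ));
    }
    Ok(fields.iter().copied().zip(values).collect())
}

/// Split a cell into items, then parse each item as one tuple.
fn parse_tuple_list<'a>(
    cell: &'a str,
    fields: &[&'a str],
    separator: &str,
    item_separator: &str,
) -> Result<Vec<Vec<(&'a str, &'a str)>>, String> {
    cell.split(item_separator)
        .map(|item| parse_tuple(item, fields, separator))
        .collect()
}
```

The same parse_tuple helper models the single-struct tuple parser, including its value-count check.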
map
Use map when a map is simple enough to write as repeated key/value pairs. separator splits key from value. item_separator splits map entries.
[[tables.fields]]
name = "attributes"
type = "map<string,i32>"
parser = { kind = "map" }
Cell:
atk,10|hp,20
Parsed value:
[["atk",10],["hp",20]]
Sora exports maps as pair arrays so non-string keys remain unambiguous. If you prefer JSON cell syntax, use parser = { kind = "json" } and write the same pair-array shape:
[["atk",10],["hp",20]]
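A map cell can be modelled as an ordered list of pairs, matching the pair-array export shape. This Rust sketch handles a map<string,i32> field only; parse_map_cell is a hypothetical name, and trimming of keys and values is an assumption.

```rust
/// Parse a cell such as "atk,10|hp,20" into ordered (key, value) pairs.
fn parse_map_cell(
    cell: &str,
    separator: &str,
    item_separator: &str,
) -> Result<Vec<(String, i32)>, String> {
    let mut pairs = Vec::new();
    for entry in cell.split(item_separator) {
        // Each entry must contain the key/value separator.
        let (key, value) = entry
            .split_once(separator)
            .ok_or_else(|| format!("entry `{}` has no `{}` separator", entry, separator))?;
        let value: i32 = value
            .trim()
            .parse()
            .map_err(|_| format!("value `{}` is not an i32", value))?;
        pairs.push((key.trim().to_string(), value));
    }
    Ok(pairs)
}
```

Keeping a list of pairs instead of a hash map preserves source order and mirrors the [["atk",10],["hp",20]] shape shown above.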
tagged_columns
Use tagged_columns when one union<T> value should be edited across multiple Excel or CSV columns. It is only valid on a table field whose type is exactly union<T>. It is intentionally not valid for optional<union<T>>, list<union<T>>, set<union<T>>, or other containers.
[[unions]]
name = "EventCondition"
tag = "type"
[[unions.variants]]
name = "QuestCompleted"
[[unions.variants.fields]]
name = "quest_id"
type = "ref<Quest.id>"
[[unions.variants]]
name = "HasItem"
[[unions.variants.fields]]
name = "item_id"
type = "ref<Item.id>"
[[unions.variants.fields]]
name = "count"
type = "i32"
[[tables.fields]]
name = "value"
type = "union<EventCondition>"
parser = { kind = "tagged_columns", prefix = "" }
CSV headers and rows:
id,type,quest_id,item_id,count
1,QuestCompleted,5002,,
2,HasItem,,1001,2
The tag column contains the union variant name. Only fields for the selected variant may contain values. With the default prefix, a field named condition projects columns such as condition.type, condition.quest_id, and condition.item_id. Use prefix = "" only when the projected columns should live at the table’s top level.
Sora rejects projected column name conflicts, for example a normal table field named type plus prefix = "" for a union whose tag is also type.
tagged_columns also does not recursively project nested structs or nested unions inside variant fields. Variant fields can still use single-cell parsers such as tuple, split, map, or json. If a variant needs a large nested object or repeated nested objects, model that data as a dedicated table and reference or derive it instead of widening the union row.
For generated XLSX templates, columns projected from the same tagged_columns field share the same header color. The tag column uses the same color group with stronger emphasis.
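The row rule for tagged_columns (the tag selects a variant, and only that variant's columns may hold values) can be sketched like this. The Rust model below is hypothetical: assemble_union is not a Sora function, the row map is assumed to contain only the projected columns with any prefix already stripped, and every variant field is treated as non-optional.

```rust
use std::collections::BTreeMap;

/// Assemble one union value from projected cells.
/// `variants` maps a variant name to its field names.
fn assemble_union(
    variants: &BTreeMap<&str, Vec<&str>>,
    tag: &str,
    row: &BTreeMap<&str, &str>,
) -> Result<BTreeMap<String, String>, String> {
    // The tag column must hold a declared variant name.
    let variant = row
        .get(tag)
        .copied()
        .filter(|name| !name.is_empty())
        .ok_or_else(|| format!("tag column `{}` is empty", tag))?;
    let fields = variants
        .get(variant)
        .ok_or_else(|| format!("unknown variant `{}`", variant))?;
    let mut value = BTreeMap::new();
    value.insert(tag.to_string(), variant.to_string());
    for (column, cell) in row {
        if *column == tag || cell.is_empty() {
            continue;
        }
        // Columns that belong to other variants must stay empty.
        if !fields.contains(column) {
            return Err(format!("column `{}` must be empty for variant `{}`", column, variant));
        }
        value.insert(column.to_string(), cell.to_string());
    }
    // Assumption: all fields of the selected variant are required.
    for field in fields {
        if !value.contains_key(*field) {
            return Err(format!("missing `{}` for variant `{}`", field, variant));
        }
    }
    Ok(value)
}
```

For the CSV rows above, row 1 assembles to a QuestCompleted value with quest_id 5002, and a HasItem row that also fills quest_id would be rejected.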
json
Use json for nested values, unions inside containers, nested collections, and any shape that needs explicit escaping.
[[tables.fields]]
name = "actions"
type = "list<union<RewardAction>>"
parser = { kind = "json" }
Cell:
[
{"type":"AddItem","item_id":1007,"count":3},
{"type":"UnlockStage","stage_id":9002}
]
For one union value:
[[tables.fields]]
name = "condition"
type = "union<EventCondition>"
parser = { kind = "json" }
Cell:
{"type":"QuestCompleted","quest_id":5002}
For map<K,V>, JSON uses an array of pairs, not a JSON object:
[["atk",10],["hp",20]]
Choosing a Parser
| Need | Prefer |
|---|---|
| Flat list of primitive values | split |
| One compact struct | tuple |
| One struct spread across columns | columns |
| Repeated compact structs | tuple_list |
| Simple key/value pairs | map |
| One union spread across columns | tagged_columns |
| Nested values, unions in containers, escaping, or JSON-shaped cells | json |
Project Config
The project manifest can be used as a simple schema root or as a full build description. It can be written as TOML, YAML, JSON, or Lua; examples on this page use TOML.
package = "game_config"
includes = ["schema/items.toml"]
[parsers]
scripts = ["tools/parsers.lua"]
[type_mappings]
scripts = ["tools/type_mappings.lua"]
[build]
default_source_format = "xlsx"
data_root = "data"
schema_lock = "generated/schema.lock"
excel_templates = "generated/excel"
[[build.codegen]]
target = "rust"
out = "rust/src/generated"
format = "auto"
[[build.exports]]
format = "binary"
out = "generated/config.sora"
Run every configured output:
sora build --project project.toml
data_root and excel_templates serve different purposes. data_root is the input directory used by export and build, so it contains edited table rows. excel_templates is an output directory for generated workbook templates, so it can be deleted and regenerated after schema changes. Do not point excel_templates at your edited data directory unless replacing those workbooks is intentional.
[parsers].scripts lists custom Lua cell parser scripts used by CLI commands that read the project. Paths are relative to the project file. See Cell Parsers for the script API.
[type_mappings].scripts lists Lua scripts that customize generated language types. Paths are relative to the project file. Type mappings are codegen-only: the schema still uses language-neutral Sora types such as struct<Vec3>, while the mapping script can map that named type to a target-specific type.
Localization is declared at the project root with [localization]. Its sources are independent from normal [[tables]]; see Localization.
Run one configured codegen target:
sora build --project project.toml --target rust
Target Options
Language-specific options live under [codegen.<target>]:
[codegen.rust]
runtime_format = "sora"
[codegen.typescript]
runtime_format = "json"
enum_repr = "string"
[codegen.lua]
runtime_format = "cbor"
lua_version = "5.4"
These options are consumed by the selected generator. The normalized IR stays language-neutral.
Type mapping scripts return a table with type_mappings. Each mapping targets one language and one named schema type:
return {
type_mappings = {
{
target = "csharp",
schema_type = "Vec3",
type_name = "Vector3",
nullable_type_name = "Vector3?",
decode = "GameMappings.ToVector3({value})",
value_decode = "GameMappings.ToVector3({value})",
imports = { "UnityEngine" },
},
},
}
nullable_type_name is optional. Use it when optional<schema_type> needs a different target-language type expression from the backend’s default nullable wrapper.
decode wraps the normal binary runtime decode expression, and value_decode wraps JSON/CBOR/protobuf-style value decode. The {value} placeholder is replaced with the generated default expression.
The C target uses write-into decode functions, so C mappings should use decode_into instead of decode. The {target} placeholder is replaced with the output pointer expression. C mappings can also provide free, where {target} is replaced with the pointer that should be released:
{
target = "c",
schema_type = "Vec3",
type_name = "game_vector3",
decode_into = "game_vector3_decode(reader, {target})",
free = "game_vector3_free({target});",
imports = { "#include \"vector3.h\"" },
}
imports is target-specific and is only emitted by language generators that need it. C#, Java, Kotlin, and Scala expect an import namespace/path without the leading keyword. Go expects an import spec such as "example.com/game/vector". Python, TypeScript, JavaScript, Dart, Godot, C, C++, and Rust expect a complete import/include/use/preload line.
runtime_format can be sora, json, cbor, or sora-protobuf, but not every target supports every runtime format. See Runtime Formats for the support matrix.
Built-In Target Options
| Target | Options |
|---|---|
| rust | runtime_format default sora; map_type = "std" or "fx_hash_map" default std; string_storage = "owned" or "arc" default owned. |
| kotlin | runtime_format default sora. |
| csharp | runtime_format default sora. |
| java | runtime_format default sora; nullable_annotation defaults to SoraNullable, set an annotation class such as org.jetbrains.annotations.Nullable, or set "" to disable annotations. |
| scala | runtime_format default sora; scala_version = "2.12", "2.13", or "3" default 3. |
| go | runtime_format default sora. |
| dart | runtime_format = "json", "cbor", or "sora-protobuf". Set this explicitly; sora is not supported for Dart. |
| godot | runtime_format = "json". Set this explicitly; it is the only supported Godot runtime format. |
| c | runtime_format = "sora"; c_standard = "c99", "c11", "c17", or "c23" default c11; prefix optional symbol prefix. |
| cpp | runtime_format = "sora"; cpp_standard = "c++11", "c++14", "c++17", "c++20", or "c++23" default c++17; namespace optional C++ namespace. |
| typescript | runtime_format default sora; enum_repr = "string" or "integer" default string. |
| javascript | runtime_format default sora; enum_repr = "string" or "integer" default string; emit_dts boolean default true. |
| erlang | runtime_format default sora; enum_repr = "atom" or "integer" default atom. |
| lua | runtime_format default sora; module optional require/import prefix; lua_version = "5.1", "5.2", "5.3", "5.4", or "luajit" default 5.4; enum_repr = "string" or "integer" default string. |
| python | runtime_format default sora. |
| proto-schema | No target options. Generates .proto schema files instead of a runtime loader. |
Example with several language-specific options:
[codegen.rust]
runtime_format = "sora"
map_type = "fx_hash_map"
string_storage = "arc"
[codegen.cpp]
runtime_format = "sora"
cpp_standard = "c++20"
namespace = "game::config"
[codegen.javascript]
runtime_format = "json"
enum_repr = "integer"
emit_dts = true
Localization
Sora treats translated text as a separate locale catalog, not as a normal config table.
Business config stores text keys with the text type. Locale source sheets provide translations for those keys. Runtime code loads the normal config bundle and mounts one or more locale packs separately.
business tables -> config bundle
localization sources -> LocaleCatalog -> i18n locale packs
Text Keys
Use text for fields that point to localized copy:
[[tables.fields]]
name = "title_key"
type = "text"
[[tables.fields]]
name = "body_keys"
type = "list<text>"
text is a key, not the translated text itself. Source data should contain values such as quest.1001.title or ui.confirm. Generated code exposes this as a TextKey where the target language has a distinct generated runtime type.
The catalog validator checks every text value in business data. A missing key or empty translation is a build error.
Catalog Sources
Declare localization at the project schema root:
[localization]
locales = ["zh_cn", "en_us"]
default_locale = "zh_cn"
fallback_locale = "en_us"
[[localization.sources]]
name = "ui"
file = "Core.xlsx"
sheet = "UILocalization"
[[localization.sources]]
name = "quest"
file = "Quest.xlsx"
sheet = "QuestLocalization"
Each source is a wide table. The default key column is key:
| key | zh_cn | en_us | note |
|---|---|---|---|
| ui.confirm | 确认 | Confirm | button label |
| quest.1001.title | 第一章 | Chapter One | quest title |
Locale columns named in locales are exported into locale packs. Other columns, such as note, are editor-only metadata and are ignored by runtime packs.
Rules:
| Rule | Behavior |
|---|---|
| source.name | Must be an ASCII identifier. It is used for diagnostics and organization, not as a key prefix. |
| key values | May use dotted names such as quest.1001.title. |
| Multiple sources | All sources merge into one logical catalog. |
| Duplicate keys | Build error. Keys are globally unique across all sources. |
| Missing locale column | Build error. |
| Empty translation | Build error. |
Use key = "id" on a source if the key column is not named key:
[[localization.sources]]
name = "ui"
file = "Core.xlsx"
sheet = "UILocalization"
key = "id"
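The catalog rules in the table above (one merged catalog, globally unique keys, no missing locale column, no empty translation) can be sketched as a merge function. This Rust model is illustrative only: SourceRow and merge_catalog are hypothetical names, and each row is represented as a key plus (locale column, text) pairs rather than a real worksheet.

```rust
use std::collections::BTreeMap;

/// One row of a localization source sheet (hypothetical in-memory form).
struct SourceRow<'a> {
    key: &'a str,
    translations: &'a [(&'a str, &'a str)], // (locale column, text)
}

/// Merge all sources into one catalog: key -> locale -> text.
fn merge_catalog(
    locales: &[&str],
    sources: &[(&str, Vec<SourceRow>)],
) -> Result<BTreeMap<String, BTreeMap<String, String>>, String> {
    let mut catalog = BTreeMap::new();
    for (source_name, rows) in sources {
        for row in rows {
            let mut texts = BTreeMap::new();
            for locale in locales {
                // Every declared locale needs a column in the source.
                let text = row
                    .translations
                    .iter()
                    .find(|(column, _)| column == locale)
                    .map(|(_, text)| *text)
                    .ok_or_else(|| format!("{}: `{}` has no `{}` column", source_name, row.key, locale))?;
                // Empty translations are build errors.
                if text.is_empty() {
                    return Err(format!("{}: `{}` has an empty `{}` translation", source_name, row.key, locale));
                }
                texts.insert(locale.to_string(), text.to_string());
            }
            // Keys are globally unique across all sources.
            if catalog.insert(row.key.to_string(), texts).is_some() {
                return Err(format!("duplicate key `{}` in source `{}`", row.key, source_name));
            }
        }
    }
    Ok(catalog)
}
```

The source name appears only in error messages here, reflecting that it is used for diagnostics and not as a key prefix.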
Export Locale Packs
Normal exports (binary, json, cbor, sora-protobuf, proto) contain business data and text keys only. They do not include translation text.
Add i18n exports in the build manifest:
[[build.exports]]
format = "binary"
out = "generated/config.sora"
[[build.exports]]
format = "i18n-binary"
out = "generated/i18n/zh_cn.sora-i18n"
locale = "zh_cn"
[[build.exports]]
format = "i18n-json"
out = "generated/i18n/en_us.json"
locale = "en_us"
Use i18n-binary for production locale packs. Use i18n-json for inspection, external translation handoff, or tests.
Runtime Mounting
Generated runtimes load config and locale packs separately. In Rust:
// Load the config bundle and a locale pack from separately exported bytes.
let config = SoraConfig::from_bytes(config_bytes)?;
let pack = generated::runtime::LocalePack::from_bytes(locale_bytes)?;
// Mount the pack against the config, then select the active locale.
let mut i18n = generated::SoraI18n::new();
i18n.mount(&config, pack)?;
i18n.set_locale("zh_cn")?;
// Resolve localized text through the TextKey fields of a config row.
let mail = config.mail_template().get(&1001).unwrap();
let title = i18n.text(&mail.title_key);
let body = i18n.format(&mail.body_key, [("count", 100)])?;
Mounting validates:
| Check | Purpose |
|---|---|
| schema_fingerprint | Prevents loading a locale pack generated for a different schema. |
| locale declaration | Rejects packs for locales not declared in [localization].locales. |
| text keys | Rejects packs that miss keys used by this config or contain empty text. |
| mounted locale | set_locale fails until a pack for that locale has been mounted. |
Business code does not know which source sheet a key came from. It looks up TextKey values with the mounted i18n runtime.
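The mounting checks in the table can be sketched as follows. This is an illustration under stated assumptions, not the generated runtime: the `Config`, `LocalePack`, and `I18n` types here are hypothetical stand-ins, the fingerprint is modeled as a plain `u64`, and errors are plain strings.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical view of the loaded config, reduced to what mounting needs.
pub struct Config {
    pub schema_fingerprint: u64,
    pub declared_locales: BTreeSet<String>,
    pub used_text_keys: BTreeSet<String>,
}

/// Hypothetical locale pack for a single locale.
pub struct LocalePack {
    pub schema_fingerprint: u64,
    pub locale: String,
    pub texts: BTreeMap<String, String>,
}

#[derive(Default)]
pub struct I18n {
    packs: BTreeMap<String, LocalePack>,
    current: Option<String>,
}

impl I18n {
    /// Runs the fingerprint, locale, and text-key checks before accepting a pack.
    pub fn mount(&mut self, config: &Config, pack: LocalePack) -> Result<(), String> {
        if pack.schema_fingerprint != config.schema_fingerprint {
            return Err("locale pack was built for a different schema".to_string());
        }
        if !config.declared_locales.contains(&pack.locale) {
            return Err(format!("locale `{}` is not declared", pack.locale));
        }
        for key in &config.used_text_keys {
            match pack.texts.get(key) {
                Some(text) if !text.is_empty() => {}
                _ => return Err(format!("missing or empty text for key `{}`", key)),
            }
        }
        self.packs.insert(pack.locale.clone(), pack);
        Ok(())
    }

    /// Fails until a pack for `locale` has been mounted.
    pub fn set_locale(&mut self, locale: &str) -> Result<(), String> {
        if !self.packs.contains_key(locale) {
            return Err(format!("no pack mounted for locale `{}`", locale));
        }
        self.current = Some(locale.to_string());
        Ok(())
    }

    /// Looks up a key in the active locale; the caller never names a source sheet.
    pub fn text(&self, key: &str) -> Option<&str> {
        let pack = self.packs.get(self.current.as_deref()?)?;
        pack.texts.get(key).map(String::as_str)
    }
}
```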
Sora Studio
Sora Studio is the browser-based schema editor embedded in the sora CLI. It is meant for inspecting and editing project schemas without running a separate frontend server.
Start it with a project file:
sora studio --project project.toml
By default Studio binds to 127.0.0.1:5174 and prints the local URL. Use --host or --port when that address is not suitable:
sora studio --project project.toml --port 5180
What It Edits
Studio loads the project file and every schema module listed in includes. Project files and schema modules can be TOML, YAML, JSON, or Lua, and a project can mix those formats.
The editor can update:
- project package name and schema include list;
- schema module files, including creating and removing included files;
- tables, structs, enums, and unions;
- table fields, struct fields, enum values, and union variants;
- table mode, primary key, source settings, parser settings, defaults, comments, range and length constraints;
- reference fields and derived child-table fields.
Studio is a schema editor, not a row-data editor. Excel, CSV, TOML, JSON, and YAML table rows are still edited in their source files and validated by sora check, sora export, or sora build.
Visualization
The main canvas shows schema nodes and their relationships:
- type edges for fields that use enums, structs, or unions;
- reference edges for ref<Table> fields;
- derived edges for child-table fields assembled from another table.
The sidebar can filter schemas by name, shows project summary counts, and groups nodes by kind. Diagnostics are shown in the UI so an invalid schema can be identified from Studio instead of making the whole editor unusable.
Preview and Save
Use preview before saving to review the files Studio will write. Studio renders each changed project or schema file in its own format:
- .toml files are written as TOML;
- .yaml and .yml files are written as YAML;
- .json files are written as pretty JSON;
- .lua files are written as data-returning Lua tables.
Saving normalizes the touched files through Studio’s renderer. This is intentional: Studio keeps the schema data model stable, but it does not preserve comments, exact whitespace, or hand-written ordering inside the edited files. Review the preview before committing.
Delivery
Release builds embed the Studio frontend assets into the sora binary. End users only need the CLI from GitHub Releases or crates.io; they do not need Node.js or a local Vite server.
For release maintainers, build the frontend before building the CLI binary:
cd apps/studio
npm run build
cd ../..
cargo build -p sora-cli --release
If the embedded assets are missing, sora studio reports that apps/studio needs to be built before the CLI.
CLI Reference
Use sora --help for the installed binary’s exact help text, and sora <command> --help for command-specific options. This page summarizes the common workflow commands, aliases, and short flags.
Global Options
Global options can be placed before or after the subcommand.
| Option | Description |
|---|---|
| -j, --jobs <N> | Maximum worker threads. Must be greater than zero. |
| --serial | Disable parallel execution. |
| --parser-script <PATH> | Load a custom Lua cell parser script. Can be repeated. Project-level parser scripts can also be configured in [parsers].scripts in project.toml. |
| --type-mapping-script <PATH> | Load a custom Lua type mapping script. Can be repeated. Project-level scripts can also be configured in [type_mappings].scripts in project.toml. |
| -h, --help | Print help. |
| -V, --version | Print the CLI version. |
Command Aliases
| Command | Aliases |
|---|---|
| build | b |
| check | c |
| init | i |
| gen | g |
| export | e |
| diff | d |
| excel-template | template, et |
| excel-sync | sync, es |
| schema-lock | lock, sl |
| studio | st |
Common Short Flags
| Short | Long | Used by |
|---|---|---|
| -p | --project | Project-reading commands. |
| -o | --out | init, gen, export, diff, excel-template, schema-lock. |
| -s | --scope | build, gen, export, diff, excel-template, excel-sync, schema-lock. |
| -t | --target | build, gen. |
| -f | --format | export. |
| -d | --data-root | build, export, excel-sync. |
| -l | --lock, --left-root | check, diff. |
| -r | --right-root | diff. |
| -c | --clean | build. |
| -w | --write | excel-sync. |
Commands
init
Create a new project scaffold.
sora init --out my-config --schema-format toml
sora i -o my-config --schema-format yaml
| Option | Description |
|---|---|
| -o, --out <DIR> | Output directory for the scaffold. |
| `–schema-format <toml | yaml |
| --force | Allow writing into an existing scaffold path. |
check
Validate a project schema, optionally against a schema lock.
sora check --project project.toml
sora c -p project.toml -l generated/schema.lock
| Option | Description |
|---|---|
| -p, --project <PATH> | Project manifest path. |
| -l, --lock <PATH> | Existing schema lock to verify against. |
build
Run outputs declared in [build] in project.toml, such as schema locks, Excel templates, codegen, and exports.
sora build --project project.toml
sora b -p project.toml -t rust -c
| Option | Description |
|---|---|
| -p, --project <PATH> | Project manifest path. |
| `–default-source-format <csv | json |
| -d, --data-root <DIR> | Data input root. Overrides [build].data_root. |
| -s, --scope <NAME> | Build only schema items included in a scope. |
| -t, --target <NAME> | Codegen target to run. Can be repeated. |
| -c, --clean | Delete selected generated outputs before rebuilding. |
gen
Generate code for one target directly, without using [build.codegen].
sora gen --target rust --project project.toml --out generated/rust
sora g -t typescript -p project.toml -o generated/typescript
| Option | Description |
|---|---|
| -t, --target <NAME> | Codegen target, such as rust, typescript, or python. |
| -p, --project <PATH> | Project manifest path. |
| -o, --out <DIR> | Output directory. |
| `–format-code <never | auto |
| -s, --scope <NAME> | Generate only schema items included in a scope. |
export
Load table data and export runtime data.
sora export --project project.toml --data-root data --format json --out generated/config.json
sora e -p project.toml -d data -f binary -o generated/config.sora
| Option | Description |
|---|---|
| -f, --format <NAME> | Export format, such as binary, json, json-debug, cbor, sora-protobuf, or proto. |
| `–default-source-format <csv | json |
| -p, --project <PATH> | Project manifest path. |
| -d, --data-root <DIR> | Data input root. |
| -o, --out <PATH> | Output file or directory, depending on export format. |
| -s, --scope <NAME> | Export only schema items included in a scope. |
| `–compression <none | zstd>` |
| --compression-level <N> | Compression level for compressed exports. |
diff
Compare two data roots using the same project schema.
sora diff --project project.toml --left-root old-data --right-root data --out generated/diff.json
sora d -p project.toml -l old-data -r data -o generated/diff.json
| Option | Description |
|---|---|
| `–default-source-format <csv | json |
| -p, --project <PATH> | Project manifest path. |
| -l, --left-root <DIR> | Baseline data root. |
| -r, --right-root <DIR> | Changed data root. |
| -o, --out <PATH> | Diff output path. |
| -s, --scope <NAME> | Diff only schema items included in a scope. |
excel-template
Generate empty Excel workbooks from the schema. Use this for new workbooks, not for existing data files.
sora excel-template --project project.toml --out generated/excel
sora et -p project.toml -o generated/excel
| Option | Description |
|---|---|
| -p, --project <PATH> | Project manifest path. |
| -o, --out <DIR> | Output directory for generated workbooks. |
| -s, --scope <NAME> | Generate templates only for schema items included in a scope. |
excel-sync
Preview or apply schema header updates to existing Excel data workbooks while preserving data rows. Removed schema fields stay as ignored legacy columns.
sora excel-sync --project project.toml --data-root data
sora es -p project.toml -d data -w
| Option | Description |
|---|---|
| -p, --project <PATH> | Project manifest path. |
| -d, --data-root <DIR> | Data workbook root. |
| -s, --scope <NAME> | Sync only schema items included in a scope. |
| -w, --write | Write workbook changes. Without this flag, the command previews changes only. |
schema-lock
Write a schema lock for the current normalized schema.
sora schema-lock --project project.toml --out generated/schema.lock
sora sl -p project.toml -o generated/schema.lock
| Option | Description |
|---|---|
| -p, --project <PATH> | Project manifest path. |
| -o, --out <PATH> | Schema lock output path. |
| -s, --scope <NAME> | Lock only schema items included in a scope. |
studio
Start the embedded Sora Studio schema editor.
sora studio --project project.toml
sora st -p project.toml --port 5180
| Option | Description |
|---|---|
| -p, --project <PATH> | Project manifest path. |
| --host <IP> | Bind address. Defaults to 127.0.0.1. |
| --port <PORT> | Port. Defaults to 5174. |
Data Export
Sora separates data export from language code generation.
The exporter receives validated data and writes a runtime bundle. Generated code then reads one of those bundle formats. This lets the same schema and data feed several languages or runtime storage choices.
The short version:
source data -> export format -> generated code runtime_format
For example, if generated Rust code uses runtime_format = "sora", the build must also write a binary export. Code generation decides how to read; export writes the file that will be read.
Built-in Exports
| Format | Purpose |
|---|---|
| binary | Native sectioned Sora binary bundle. |
| json-debug | Human-readable debug output for inspection. |
| json | Runtime JSON bundle. |
| cbor | Runtime CBOR bundle. |
| sora-protobuf | Runtime Protobuf bundle using Sora’s value model. |
| proto | Typed Protobuf bundle using a generated game-specific schema. |
| i18n-binary | Binary locale pack for one locale. |
| i18n-json | JSON locale pack for one locale. |
The binary export is selected by runtime_format = "sora" in codegen options.
Command Example
sora export \
--format binary \
--default-source-format xlsx \
--project project.toml \
--data-root data \
--out generated/config.sora
Build Manifest Example
Build manifests can declare multiple exports:
[[build.exports]]
format = "binary"
out = "generated/config.sora"
[[build.exports]]
format = "json-debug"
out = "generated/debug-json"
[[build.exports]]
format = "i18n-binary"
out = "generated/i18n/zh_cn.sora-i18n"
locale = "zh_cn"
When sora build runs, it checks that configured codegen targets have a matching export for their selected runtime format.
Localization packs are separate runtime assets and are mounted by the generated i18n runtime. See Localization.
Export Formats
Export formats are runtime bundle formats. They are independent from source formats such as Excel, CSV, TOML, JSON, or YAML.
| Export | Codegen Runtime Format | Output Shape | Use When |
|---|---|---|---|
| binary | sora | Native sectioned binary bundle. | You want a compact self-contained Sora runtime. |
| json | json | Runtime JSON bundle. | You want easy inspection or simple platform integration. |
| cbor | cbor | Runtime CBOR bundle. | You want a compact general-purpose binary value format. |
| sora-protobuf | sora-protobuf | Sora value model encoded with Protobuf. | You want Protobuf-based transport without per-game .proto models. |
| proto | none | Typed Protobuf bundle using the generated game-specific schema. | You want a business .proto contract for external tooling. |
| json-debug | none | Per-table debug JSON. | You want reviewable output for inspection and tests. |
| i18n-binary | none | Native binary locale pack for one locale. | You want production localization packs mounted separately from config. |
| i18n-json | none | Debug JSON locale pack for one locale. | You want reviewable text for translation handoff or tests. |
Example build outputs:
[[build.exports]]
format = "binary"
out = "generated/config.sora"
[[build.exports]]
format = "json"
out = "generated/config.json"
[[build.exports]]
format = "json-debug"
out = "generated/debug-json"
[[build.exports]]
format = "i18n-binary"
out = "generated/i18n/zh_cn.sora-i18n"
locale = "zh_cn"
Generated runtimes only load runtime formats they support. json-debug is for humans and tools, not generated runtime loading.
Localization exports require [localization] and a locale in the build manifest. See Localization.
Code Generation
Code generation turns the normalized schema IR into target-language row types, table containers, and config loaders.
It is driven by a registry of language generators.
Each generator declares:
- a target id and aliases;
- display metadata;
- supported runtime formats;
- optional formatter integration;
- a CodeGenerator implementation.
This lets built-in languages and downstream generators use the same pipeline shape.
schema files -> schema model -> normalized IR -> generator registry -> target generator -> files
Generate a target directly:
sora gen --target typescript --project project.toml --out generated/typescript
Or declare it in the build manifest:
[[build.codegen]]
target = "typescript"
out = "typescript/generated"
format = "auto"
format can be never, auto, or required. auto runs a known formatter when it is available. required fails if the formatter is missing or returns an error.
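The three modes can be read as a small decision table. The sketch below is an assumption-laden illustration: `FormatMode`, `FormatOutcome`, and `apply_format` are hypothetical names, and the choice to leave files unformatted when a formatter fails under auto is a guess, since the text above only states when required fails.

```rust
/// The three values accepted by the `format` option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FormatMode {
    Never,
    Auto,
    Required,
}

impl FormatMode {
    /// Parses the manifest string; unknown values yield `None`.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "never" => Some(FormatMode::Never),
            "auto" => Some(FormatMode::Auto),
            "required" => Some(FormatMode::Required),
            _ => None,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum FormatOutcome {
    Skipped,
    Formatted,
    Failed(String),
}

/// `formatter` is `None` when no known formatter is available.
/// The formatter is only invoked when the mode allows it.
pub fn apply_format(
    mode: FormatMode,
    formatter: Option<&dyn Fn() -> Result<(), String>>,
) -> FormatOutcome {
    match (mode, formatter) {
        (FormatMode::Never, _) => FormatOutcome::Skipped,
        (FormatMode::Auto, None) => FormatOutcome::Skipped,
        (FormatMode::Auto, Some(run)) => match run() {
            Ok(()) => FormatOutcome::Formatted,
            // Assumption: auto leaves the output unformatted instead of failing.
            Err(_) => FormatOutcome::Skipped,
        },
        (FormatMode::Required, None) => {
            FormatOutcome::Failed("formatter is not available".to_string())
        }
        (FormatMode::Required, Some(run)) => match run() {
            Ok(()) => FormatOutcome::Formatted,
            Err(error) => FormatOutcome::Failed(error),
        },
    }
}
```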
Runtime Format
Each target can choose a runtime format:
[codegen.typescript]
runtime_format = "json"
The selected runtime format controls the loader code emitted for that target. It does not change the schema or the source data.
Generated Shape
Generated code generally contains:
- enums for schema enums;
- record types for structs, union variants, and table rows;
- table containers for map, list, and singleton tables;
- lookup helpers for keys and indexes where supported;
- a top-level config loader for the selected runtime format.
Generated identifiers follow target-language conventions while runtime data lookup keeps using the original schema names. See Identifier Naming.
Schema optional<T> is mapped to the target language’s strongest available nullability representation. See Nullability.
Identifier Naming
Schema names are the source of truth. Generated code adapts those names to each target language’s naming style, while runtime data lookup keeps using the original schema names.
For example, a schema field named max_stack may become maxStack in TypeScript, MaxStack in C#, and max_stack in Rust. The generated decoder still reads the field named max_stack from the runtime bundle.
Naming Pipeline
Sora derives common name forms from each schema name before language generation:
| Form | Example from max_stack | Common use |
|---|---|---|
| Raw | max_stack | Runtime table names, field names, enum text values, union tags. |
| Pascal | MaxStack | Types, classes, enum variants, exported symbols. |
| Camel | maxStack | Fields, properties, parameters, methods in camel-case languages. |
| Snake | max_stack | Files, modules, fields, functions in snake-case languages. |
Language generators choose from these forms and may apply additional language-specific sanitization for invalid identifiers or reserved words.
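The derivation of these forms can be sketched in a few lines. This is a minimal illustration that assumes schema names are ASCII and already separated by underscores; the function names are hypothetical, and language-specific sanitization for reserved words is left out.

```rust
/// Splits a schema name into lowercase words on `_`.
fn words(raw: &str) -> Vec<String> {
    raw.split('_')
        .filter(|word| !word.is_empty())
        .map(|word| word.to_ascii_lowercase())
        .collect()
}

/// Uppercases the first character of a word and keeps the rest.
fn capitalize(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_ascii_uppercase().to_string() + chars.as_str(),
        None => String::new(),
    }
}

/// Pascal form, for example `MaxStack`.
pub fn pascal(raw: &str) -> String {
    words(raw).iter().map(|word| capitalize(word)).collect()
}

/// Camel form, for example `maxStack`.
pub fn camel(raw: &str) -> String {
    let pascal_name = pascal(raw);
    let mut chars = pascal_name.chars();
    match chars.next() {
        Some(first) => first.to_ascii_lowercase().to_string() + chars.as_str(),
        None => String::new(),
    }
}

/// Snake form, for example `max_stack`.
pub fn snake(raw: &str) -> String {
    words(raw).join("_")
}
```

The raw form needs no function: it is the schema name itself, which is why runtime lookups keep working regardless of which derived form a generator picks.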
Language Conventions
The built-in generators follow the target language’s normal public API style:
| Target | Types | Fields and accessors | Files/modules |
|---|---|---|---|
| Rust | PascalCase | snake_case | snake_case.rs |
| C | prefixed snake_case | snake_case | snake_case.h, snake_case.c |
| C++ | PascalCase | snake_case | snake_case.hpp |
| C# | PascalCase | PascalCase properties | PascalCase.cs |
| Go | PascalCase exported names | PascalCase exported fields | snake_case.go |
| Java | PascalCase | lowerCamelCase | PascalCase.java |
| Kotlin | PascalCase | lowerCamelCase | target layout |
| Scala | PascalCase | lowerCamelCase | PascalCase.scala |
| TypeScript | PascalCase | lowerCamelCase | snake_case.ts |
| JavaScript | PascalCase | lowerCamelCase | snake_case.js, snake_case.d.ts |
| Python | PascalCase | snake_case | snake_case.py |
| Dart | PascalCase | lowerCamelCase | snake_case.dart |
| Lua | PascalCase table-like types | lowerCamelCase | snake_case.lua |
| Erlang | snake_case modules | snake_case map keys/functions | snake_case.erl |
| Godot | PascalCase classes | snake_case | snake_case.gd |
This table describes generated code identifiers, not runtime data names.
Runtime Names Stay Raw
The following values keep the original schema spelling:
- table names in bundles and table metadata;
- field names read from runtime rows;
- enum string values;
- union variant tag values;
- schema lock and fingerprint input.
Changing a schema name changes the data contract. Changing only a target language’s generated identifier style should not.
Custom Type Mappings
Custom type mappings do not rename generated schema identifiers. They only control target-language type expressions, imports/includes, and optional conversion hooks.
Mapping functions are native code written by the user for the target language, so their names should follow that language’s own naming convention. The mapping key remains the named schema type, such as Vec3.
Nullability
Schema nullability is expressed with optional<T>. Code generators map that schema type to the strongest nullability representation available in each target language.
Runtime bundles encode optional values with explicit presence. Generated code should preserve that distinction in its public API instead of relying on undocumented null conventions.
Built-In Representations
| Target | optional<T> representation |
|---|---|
| Rust | Option<T> |
| C# | T? with nullable reference types enabled |
| Kotlin | T? |
| Dart | T? |
| Scala | Option[T] |
| TypeScript | `T |
| JavaScript d.ts | `T |
| Python | `T |
| C++ | std::optional<T> for C++17 and newer; SoraOptional<T> for older standards |
| C | generated optional wrapper type with presence state |
| Go | *T |
| Erlang | `T |
| Lua | T? EmmyLua annotation |
| Godot | Variant with null |
| Java | nullable value type plus annotation |
Dynamic targets such as JavaScript, Lua, and Godot can only document nullability for tooling. Statically typed targets expose it in the generated type whenever the language supports that.
Java Annotations
Java has no standard nullable type syntax. Sora emits nullable Java fields, constructor parameters, and nullable lookup results with an annotation.
By default, Java generation uses a self-contained package-local SoraNullable annotation:
@SoraNullable
public final String nickname;
Projects that use a specific annotation package can configure it:
[codegen.java]
nullable_annotation = "org.jetbrains.annotations.Nullable"
Set nullable_annotation = "" to emit nullable Java values without annotations.
Custom Type Mappings
Type mapping scripts can provide nullable_type_name when the target language needs a different type expression for optional<YourType>:
{
target = "java",
schema_type = "UserId",
type_name = "int",
nullable_type_name = "Integer",
}
This only changes the generated type expression. The backend still controls how optional presence is decoded.
Runtime Formats
Select a runtime format per codegen target:
[codegen.rust]
runtime_format = "sora"
Runtime formats are the formats generated code can load. They correspond to export formats:
| Codegen runtime_format | Required Export |
|---|---|
| sora | binary |
| json | json |
| cbor | cbor |
| sora-protobuf | sora-protobuf |
This setting does not change Excel, CSV, TOML, JSON, YAML, or schema files. It only changes the loader generated for the target language. The selected runtime format must have a matching export in the project build.
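The pairing in the table, and the requirement that each selected runtime format has a matching export, can be sketched as a lookup plus a check. The function names and message wording below are hypothetical; only the four format pairs come from the table.

```rust
/// Maps a codegen `runtime_format` to the export format it needs.
pub fn required_export(runtime_format: &str) -> Option<&'static str> {
    match runtime_format {
        "sora" => Some("binary"),
        "json" => Some("json"),
        "cbor" => Some("cbor"),
        "sora-protobuf" => Some("sora-protobuf"),
        _ => None,
    }
}

/// Returns one message per codegen target whose runtime format has no matching export.
/// `targets` holds (target, runtime_format) pairs; `exports` holds configured export formats.
pub fn missing_exports(targets: &[(&str, &str)], exports: &[&str]) -> Vec<String> {
    let mut problems = Vec::new();
    for (target, runtime_format) in targets {
        match required_export(runtime_format) {
            None => problems.push(format!(
                "{}: unknown runtime_format `{}`",
                target, runtime_format
            )),
            Some(export) if !exports.contains(&export) => problems.push(format!(
                "{}: runtime_format `{}` needs a `{}` export",
                target, runtime_format, export
            )),
            Some(_) => {}
        }
    }
    problems
}
```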
Support Matrix
| Target | sora | json | cbor | sora-protobuf |
|---|---|---|---|---|
| Rust | self-contained | managed dependency | managed dependency | managed dependency |
| Kotlin | self-contained | managed dependency | managed dependency | managed dependency |
| C# | self-contained | managed dependency | managed dependency | managed dependency |
| Java | self-contained | managed dependency | managed dependency | managed dependency |
| Scala | self-contained | managed dependency | managed dependency | managed dependency |
| Go | self-contained | managed dependency | managed dependency | managed dependency |
| TypeScript | self-contained | managed dependency | managed dependency | managed dependency |
| JavaScript | self-contained | managed dependency | managed dependency | managed dependency |
| Python | self-contained | managed dependency | managed dependency | managed dependency |
| Dart | not supported | standard library | user adapter | user adapter |
| Godot | not supported | standard library | not supported | not supported |
| C | self-contained | not supported | not supported | not supported |
| C++ | self-contained | not supported | not supported | not supported |
| Erlang | self-contained | user adapter | user adapter | user adapter |
| Lua | self-contained | user adapter | user adapter | user adapter |
Dependency meanings:
| Kind | Meaning |
|---|---|
| self-contained | Generated runtime includes the decoder. |
| standard library | Generated runtime uses the language standard library. |
| managed dependency | Generated runtime expects normal package dependencies for that ecosystem. |
| user adapter | Generated runtime exposes an adapter hook and the application supplies the concrete decoder. |
Choosing a Format
Use sora when you want the native Sora binary bundle and the target supports it.
Use json when inspectability, tooling, or platform simplicity matters more than compactness.
Use cbor when you want a compact general-purpose binary value format and your runtime already has a CBOR dependency.
Use sora-protobuf when your environment prefers Protobuf transport but you still want Sora’s schema-driven value model.
The CI runtime matrix generates every supported combination in this table and syntax-checks languages where the check is lightweight.
Runtime Adapters
Some languages do not have a built-in dependency story for every runtime format. Those targets use adapter hooks instead of embedding a third-party decoder.
The generated runtime owns the Sora value model and table loading logic. The application supplies a small function that turns bytes into the decoded value tree expected by the runtime.
This keeps generated code independent from dependency choices. A game can use the CBOR, Protobuf, or compression library it already trusts.
Lua
local config = SoraConfig.from_cbor(bytes, {
decode_cbor = function(payload)
return my_cbor.decode(payload)
end,
})
Erlang
Options = #{
decode_cbor => fun my_cbor:decode/1
},
Config = sora_config:from_cbor(Bytes, Options).
Dart
final config = SoraConfig.fromCbor(
bytes,
decodeCbor: (payload) => myCborDecode(payload),
);
Adapters keep generated code independent from dependency choices while still allowing the same exported data formats to be used.
What the Adapter Must Return
The adapter should return the language-specific Sora value tree expected by the generated runtime. It is not responsible for constructing typed rows; generated code handles that after decoding.
If a target has a self-contained decoder for a format, no adapter is needed.
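The division of work between adapter and generated runtime can be sketched in Rust. Rust itself does not use adapter hooks in the support matrix, so this only illustrates the pattern: the `Value` tree, the `Item` row, and `load_items` are hypothetical and do not mirror any generated API.

```rust
use std::collections::BTreeMap;

/// Hypothetical decoded value tree handed from the adapter to the runtime.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Text(String),
    List(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

/// Hypothetical typed row built by the generated side.
#[derive(Debug, PartialEq)]
pub struct Item {
    pub id: i64,
    pub name: String,
}

/// The adapter (`decode`) only turns bytes into a `Value`;
/// building typed rows stays in this function.
pub fn load_items<D>(bytes: &[u8], decode: D) -> Result<Vec<Item>, String>
where
    D: Fn(&[u8]) -> Result<Value, String>,
{
    let rows = match decode(bytes)? {
        Value::List(rows) => rows,
        other => return Err(format!("expected a list of rows, got {:?}", other)),
    };
    let mut items = Vec::new();
    for row in rows {
        let fields = match row {
            Value::Map(fields) => fields,
            _ => return Err("row is not a map".to_string()),
        };
        let id = match fields.get("id") {
            Some(Value::Int(id)) => *id,
            _ => return Err("missing integer field `id`".to_string()),
        };
        let name = match fields.get("name") {
            Some(Value::Text(name)) => name.clone(),
            _ => return Err("missing text field `name`".to_string()),
        };
        items.push(Item { id, name });
    }
    Ok(items)
}
```

Swapping the decoding library then means changing only the closure passed as `decode`.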
Versioning and Compatibility
Sora is still early. The project does not provide Rust-style editions or compatibility modes for old schema semantics. A project that needs stable output should pin the sora CLI version it uses, and treat a CLI upgrade as an explicit migration step.
What To Pin
Pin the CLI binary or crate version in the project tooling:
- download a specific GitHub Release asset and keep using that version in CI;
- install a specific crates.io version with cargo install sora-cli --version X.Y.Z;
- record the expected sora --version in project setup docs or build scripts.
Generated code, generated Excel templates, schema locks, and exported runtime bundles should be produced by the same pinned CLI version for a given project build.
Runtime Bundle Versions
Exported runtime bundles carry a format version. The Sora binary bundle also has a file header version, and generated runtimes reject bundles with unsupported versions.
Sora only bumps these runtime/export format versions when the generated runtime can no longer safely read data written by an older layout. Examples include:
- changing the .sora binary section layout;
- changing the manifest fields required by generated runtimes;
- changing JSON, CBOR, or Protobuf bundle structure in a way that old generated code cannot read;
- changing value encoding rules in exported runtime bundles.
During the early development stage, ordinary implementation changes do not automatically bump format_version. Version bumps are manual and reserved for actual runtime/export incompatibility.
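A runtime-side version gate might look like the sketch below. The header layout is entirely hypothetical: the four-byte `SORA` magic, the little-endian `u16` version field, and the constant values are assumptions made for illustration, not the real `.sora` format.

```rust
/// Hypothetical magic bytes; the real header layout is not described here.
pub const MAGIC: &[u8] = b"SORA";
/// Hypothetical version that this runtime knows how to read.
pub const SUPPORTED_FORMAT_VERSION: u16 = 1;

#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion { found: u16, supported: u16 },
}

/// Checks the assumed header and returns the payload that follows it.
pub fn check_header(bytes: &[u8]) -> Result<&[u8], HeaderError> {
    let header_len = MAGIC.len() + 2;
    if bytes.len() < header_len {
        return Err(HeaderError::TooShort);
    }
    if !bytes.starts_with(MAGIC) {
        return Err(HeaderError::BadMagic);
    }
    let found = u16::from_le_bytes([bytes[MAGIC.len()], bytes[MAGIC.len() + 1]]);
    if found != SUPPORTED_FORMAT_VERSION {
        // Reject instead of guessing at a layout this runtime was not built for.
        return Err(HeaderError::UnsupportedVersion {
            found,
            supported: SUPPORTED_FORMAT_VERSION,
        });
    }
    Ok(&bytes[header_len..])
}
```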
Schema and Codegen Semantics
Schema syntax, parser behavior, validation rules, Studio rendering, and generated language APIs may still change while the project is young. Sora does not keep old behavior behind an edition flag or any other compatibility mode.
If a newer CLI changes schema or codegen semantics, users should:
- upgrade the CLI intentionally;
- regenerate schema locks, templates, exports, and code;
- review diffs;
- update schema/data/project files as needed.
Schema fingerprints and schema locks help detect mismatches between generated code, schema, and data, but they are not migration tools. They prevent silent incompatibility; they do not preserve old semantics.
Extending Sora
Sora is designed to be used as a library by projects that need their own language or data format support.
The extension boundary is intentionally split:
input adapter -> schema model -> normalized IR -> data validation
|-> exporter
|-> code generator
Add a Code Generator
Implement the generator trait:
use std::path::Path;
// CodegenContext and Result are Sora library types; their import paths are not shown here.
pub trait CodeGenerator: Send + Sync {
    fn generate(&self, context: CodegenContext<'_>, out_dir: &Path) -> Result<()>;
}
Register it with an id, aliases, runtime capabilities, and optional formatter configuration.
See Generators for a longer walkthrough.
Keep the IR Neutral
Language-specific settings belong in target options and generator code. The normalized IR should describe schema semantics only: packages, tables, fields, types, keys, indexes, unions, and validation metadata.
Project-specific language type mappings should use codegen type mapping providers, not schema fields. This keeps data semantics separate from target-language representation choices such as mapping struct<Vec3> to UnityEngine.Vector3.
Add an Exporter
Exporters are separate from generators. Add a data exporter when you need a new runtime bundle format. Add a code generator when you need a new language target.
See Exporters for the expected boundary.
Generators
A generator turns the normalized IR into files for one language target.
Registration
Generators are registered with:
- a canonical target id;
- aliases;
- display metadata;
- supported runtime formats;
- optional formatter integration;
- a CodeGenerator implementation.
This lets built-in generators and downstream generators use the same pipeline.
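A registry keyed by target id and aliases could be shaped roughly like this. The sketch is hypothetical throughout: `GeneratorEntry` and `Registry` are illustrative types, the `ts` alias in the usage is invented for the example, and the `CodeGenerator` implementation and formatter settings are left out to keep it short.

```rust
use std::collections::BTreeMap;

/// Hypothetical registration record for one language target.
pub struct GeneratorEntry {
    pub id: &'static str,
    pub aliases: &'static [&'static str],
    pub display_name: &'static str,
    pub runtime_formats: &'static [&'static str],
}

#[derive(Default)]
pub struct Registry {
    entries: Vec<GeneratorEntry>,
    /// Maps every id and alias to an index into `entries`.
    names: BTreeMap<&'static str, usize>,
}

impl Registry {
    /// Registers an entry, rejecting ids or aliases that are already taken.
    pub fn register(&mut self, entry: GeneratorEntry) -> Result<(), String> {
        let index = self.entries.len();
        let names: Vec<&'static str> = std::iter::once(entry.id)
            .chain(entry.aliases.iter().copied())
            .collect();
        for name in &names {
            if self.names.contains_key(name) {
                return Err(format!("target name `{}` is already registered", name));
            }
        }
        for name in names {
            self.names.insert(name, index);
        }
        self.entries.push(entry);
        Ok(())
    }

    /// Resolves a canonical id or an alias to its entry.
    pub fn resolve(&self, name: &str) -> Option<&GeneratorEntry> {
        self.names.get(name).map(|&index| &self.entries[index])
    }

    /// Reports whether the named target declared support for a runtime format.
    pub fn supports(&self, name: &str, runtime_format: &str) -> bool {
        self.resolve(name)
            .map_or(false, |entry| entry.runtime_formats.contains(&runtime_format))
    }
}
```

Because built-in and downstream targets go through the same `register` call, neither kind needs special handling at lookup time.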
Implementation Shape
use std::path::Path;
// CodegenContext and Result are Sora library types; their import paths are not shown here.
pub trait CodeGenerator: Send + Sync {
    fn generate(&self, context: CodegenContext<'_>, out_dir: &Path) -> Result<()>;
}
The generator receives:
- the normalized IR;
- parsed target options;
- the registered type mapping providers;
- the output directory;
- runtime format selection.
It should not mutate the IR or rely on language-specific fields being present in the IR.
Type Mappings
Language generators can consult context.type_mappings before falling back to their built-in type mapping. A provider maps a target plus a named schema type, such as struct<Vec3>, to a generated type name, optional nullable type name, and optional decode wrappers. Container and optional types should recurse through the same mapper so list<struct<Vec3>> and optional<struct<Vec3>> automatically use the mapped target type.
The schema remains language-neutral. Project-specific mappings belong in library registration code or CLI Lua type mapping scripts, not in field definitions.
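The recursion through container and optional types can be sketched as one function. This is an illustration only: `SchemaType` is a reduced, hypothetical model of the IR, the custom mappings are a plain name-to-type table, and `glam::Vec3` appears purely as an example string.

```rust
use std::collections::BTreeMap;

/// Reduced, hypothetical model of schema types.
#[derive(Debug, Clone, PartialEq)]
pub enum SchemaType {
    I32,
    Str,
    /// A named schema type such as Vec3.
    Named(String),
    List(Box<SchemaType>),
    Optional(Box<SchemaType>),
}

/// Produces a Rust type expression, consulting custom mappings before built-in rules.
/// Containers recurse through the same function, so mapped names propagate.
pub fn rust_type(ty: &SchemaType, custom: &BTreeMap<String, String>) -> String {
    match ty {
        SchemaType::I32 => "i32".to_string(),
        SchemaType::Str => "String".to_string(),
        SchemaType::Named(name) => custom
            .get(name)
            .cloned()
            .unwrap_or_else(|| name.clone()),
        SchemaType::List(inner) => format!("Vec<{}>", rust_type(inner, custom)),
        SchemaType::Optional(inner) => format!("Option<{}>", rust_type(inner, custom)),
    }
}
```

With a mapping from `Vec3` to `glam::Vec3`, a list of that struct comes out as `Vec<glam::Vec3>` without any list-specific mapping entry.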
Target Options
Language-specific options live under [codegen.<target>]:
[codegen.rust]
runtime_format = "sora"
map_type = "btree"
string_storage = "owned"
The generator owns the interpretation of these options.
Exporters
An exporter writes validated configuration data into a runtime bundle.
Exporters are separate from code generators because the same exported data can be consumed by many languages.
When to Add an Exporter
Add an exporter when you need:
- a new runtime wire format;
- a platform-specific asset package;
- a different compression or section layout;
- an inspection format for tooling.
Do not add an exporter just to support a new programming language. Add a code generator for that.
Expected Boundary
An exporter should consume:
- the normalized schema IR;
- validated config data;
- exporter options;
- an output target.
It should not depend on a specific language generator.
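The boundary above could be expressed as a trait with four inputs. Everything in this sketch is hypothetical: the `Exporter` trait, the simplified `SchemaIr`, `ConfigData`, and `ExportOptions` types, and the toy `row-count` format exist only to show the shape, not Sora's library API.

```rust
use std::io::{self, Write};

/// Hypothetical, heavily simplified stand-ins for the exporter inputs.
pub struct SchemaIr {
    pub tables: Vec<String>,
}

pub struct ConfigData {
    /// (table name, rendered rows) pairs of already validated data.
    pub rows: Vec<(String, Vec<String>)>,
}

pub struct ExportOptions {
    pub pretty: bool,
}

/// An exporter sees the IR, validated data, its options, and an output target.
/// Nothing here refers to a language generator.
pub trait Exporter {
    fn format_id(&self) -> &'static str;
    fn export(
        &self,
        ir: &SchemaIr,
        data: &ConfigData,
        options: &ExportOptions,
        out: &mut dyn Write,
    ) -> io::Result<()>;
}

/// Toy inspection format that writes one row count per table.
pub struct RowCountExporter;

impl Exporter for RowCountExporter {
    fn format_id(&self) -> &'static str {
        "row-count"
    }

    fn export(
        &self,
        ir: &SchemaIr,
        data: &ConfigData,
        options: &ExportOptions,
        out: &mut dyn Write,
    ) -> io::Result<()> {
        let separator = if options.pretty { ": " } else { "=" };
        for table in &ir.tables {
            let count: usize = data
                .rows
                .iter()
                .filter(|(name, _)| name == table)
                .map(|(_, rows)| rows.len())
                .sum();
            writeln!(out, "{}{}{}", table, separator, count)?;
        }
        Ok(())
    }
}
```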
Design Notes
These notes explain the architectural choices behind Sora.
The short version is that schema files are the source of truth. Excel headers, runtime bundles, generated code, and extension points are projections of the normalized schema and validated data.
Schema as Source of Truth
Sora is schema-first. The schema, whether written in TOML, YAML, JSON, or Lua, is the contract for configuration data; source files and generated outputs are projections of that contract.
schema modules
-> normalized IR
-> Excel headers
-> validation
-> runtime exports
-> generated language code
This design avoids the common problem where a spreadsheet, a hand-written parser, and runtime code all define slightly different versions of the same data shape.
Consequences
- Field names, types, keys, defaults, references, and validation rules live in schema.
- Excel and CSV files provide values, not a second schema.
- Runtime export formats do not change the data model.
- Language options belong to codegen targets, not to the IR.
- Downstream users can add generators or exporters without changing schema semantics.
The schema can still include editing hints such as comment, parser hints, ranges, and length limits. Those hints are part of the data contract because they affect validation or generated projections.
Excel Header Projection
Excel templates are generated from the normalized schema. The header is a projection, not an independent format definition.
Why Generate Headers
Manually maintained spreadsheet headers tend to drift from code:
- a field is renamed in code but not in Excel;
- a type changes but old rows still look valid;
- a designer adds a column that no runtime reads;
- validation rules are documented in comments instead of enforced.
Sora avoids this by generating the workbook structure from schema.
What the Header Contains
Generated rows include:
- table metadata: table name, mode, key, scope, and schema hash;
- stable field names;
- type hints;
- scope hints;
- validation and parser rules;
- comments for editors.
Only row data should be treated as authored content. Header rows can be regenerated whenever the schema changes.
Practical Workflow
- Change the schema.
- Regenerate Excel templates.
- Move or paste existing data rows into the updated template.
- Run sora build or sora export to validate values and references.
- Run sora build to produce exports and generated code.
This keeps Excel useful for editing while keeping the schema authoritative.
IR Boundaries
The normalized IR describes schema semantics. It should not encode language-specific codegen choices.
Belongs in IR
- packages and included schema modules;
- enums, structs, unions, tables, fields, and indexes;
- table modes and keys;
- source metadata;
- field types, defaults, parsers, ranges, lengths, and comments;
- references and derived child-table field metadata;
- scopes.
Does Not Belong in IR
- Rust map implementation choices;
- TypeScript enum representation choices;
- Lua module names;
- runtime decoder dependency choices;
- formatter settings;
- target-specific file layout.
Those settings belong in [codegen.<target>] or in generator registration metadata.
Extension Boundary
schema input -> normalized IR -> validation
|-> exporter registry
|-> codegen registry
A new language generator should consume the IR and its own target options. A new runtime data format should be added as an exporter. Neither should require changing the schema model unless the actual data semantics need to change.