# Schema Engine
The schema engine transforms, coerces, and merges data from multiple sources into your defined output shape.
## Overview
The schema engine is the core of Braids. It takes raw records from connectors, applies field mapping expressions, coerces values to the target types, and merges records from multiple sources. You define the output shape — Braids maps upstream data into it.
This inverts the typical integration model: instead of conforming to an aggregator's fixed schema, you declare exactly the fields and types you need, and the engine handles the rest.
## Field Mapping DSL
The mapping DSL supports three expression types for transforming upstream fields into your schema.
### 1. Direct field reference

A bare field name extracts that field's value from the upstream record:

```yaml
mapping:
  email: email
  name: customer_name
```

This maps `record["email"]` → `email` and `record["customer_name"]` → `name`.
### 2. String literals

Quoted strings produce static values:

```yaml
mapping:
  source: "'stripe'"
  prefix: "'usr_'"
```

The single-quoted string inside the expression is the literal value; the outer double quotes are YAML quoting.
### 3. Concatenation

Join fields and literals with `+`:

```yaml
mapping:
  id: "'stripe_' + id"
  name: first_name + ' ' + last_name
  label: "'[' + status + '] ' + title"
```
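An evaluator for these three expression forms can be sketched in a few lines of Python. This is a hypothetical illustration, not the engine's actual parser: the function name `eval_mapping_expr` is invented here, and naively splitting on `+` would break on a literal that itself contains a `+`.

```python
def eval_mapping_expr(expr: str, record: dict) -> str:
    """Evaluate a mapping expression against one upstream record.

    Supports the three documented forms:
      - 'literal'         -> static string
      - field_name        -> direct field reference
      - a + 'b' + c       -> concatenation of fields and literals
    """
    parts = []
    for token in (t.strip() for t in expr.split("+")):
        if token.startswith("'") and token.endswith("'"):
            parts.append(token[1:-1])                  # string literal
        else:
            parts.append(str(record.get(token, "")))   # field lookup
    return "".join(parts)


record = {"first_name": "Alice", "last_name": "Smith", "id": 42}
print(eval_mapping_expr("'stripe_' + id", record))               # stripe_42
print(eval_mapping_expr("first_name + ' ' + last_name", record)) # Alice Smith
print(eval_mapping_expr("'stripe'", record))                     # stripe
```

Note that a missing field evaluates to an empty string in this sketch; the real engine's handling of absent fields may differ.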
### Complete example

Here is a full endpoint configuration using all three expression types across two sources:

```yaml
endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping:
          id: "'stripe_' + id"    # literal + field
          email: email            # direct reference
          name: name              # direct reference
          source: "'stripe'"      # literal only
          created_at: created     # direct reference
      - connector: shopify
        resource: customers
        mapping:
          id: "'shopify_' + id"
          email: email
          name: first_name + ' ' + last_name   # multi-field concat
          source: "'shopify'"
          created_at: created_at
```
## Type Coercion
The schema engine coerces every mapped value to the type declared in the schema. This ensures a consistent output shape regardless of upstream quirks.
| Target Type | Input | Behavior | Example |
|---|---|---|---|
| `string` | any | Formats using Go's `%v` | `42` → `"42"`, `true` → `"true"` |
| `int` | float64 | Truncates decimal | `19.99` → `19` |
| `int` | string | Parses integer | `"42"` → `42` |
| `int` | int | Passthrough | `42` → `42` |
| `float` | int | Converts to float | `42` → `42.0` |
| `float` | string | Parses float | `"19.99"` → `19.99` |
| `float` | float64 | Passthrough | `19.99` → `19.99` |
| `datetime` | Unix int/float | Converts to RFC 3339 | `1710158400` → `"2024-03-11T12:00:00Z"` |
| `datetime` | ISO string | Normalizes to RFC 3339 | `"2024-03-11"` → `"2024-03-11T00:00:00Z"` |
| any | nil | Returns nil | `nil` → `nil` |
If a string cannot be parsed to the target numeric type, the original value is returned unchanged.
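The coercion rules in the table above can be modeled with a small Python sketch. The function name `coerce` is illustrative; the real engine is written in Go, and only the string/int/float rows are covered here (datetime handling is described in the next section).

```python
def coerce(value, target: str):
    """Coerce a mapped value to the target schema type (sketch of the
    documented coercion table; not the engine's actual implementation)."""
    if value is None:
        return None                               # nil passes through
    if target == "string":
        if isinstance(value, bool):
            return "true" if value else "false"   # mimic Go's %v for bools
        return str(value)
    if target == "int":
        if isinstance(value, float):
            return int(value)                     # truncates the decimal
        if isinstance(value, str):
            try:
                return int(value)
            except ValueError:
                return value                      # unparseable: unchanged
        return value                              # int passthrough
    if target == "float":
        if isinstance(value, str):
            try:
                return float(value)
            except ValueError:
                return value                      # unparseable: unchanged
        return float(value)                       # int -> float, float passthrough
    return value


print(coerce(19.99, "int"))   # 19
print(coerce("42", "int"))    # 42
print(coerce(42, "float"))    # 42.0
```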
## Datetime Handling

All datetime fields are normalized to RFC 3339 format (Go layout `2006-01-02T15:04:05Z07:00`). The engine accepts a variety of input formats and converts them to a consistent representation.
### Unix timestamps
Both integer and floating-point Unix timestamps are converted. Sub-second precision from floats is preserved internally, but the RFC 3339 output format truncates to whole seconds.
### Accepted string formats

- RFC 3339 — `2024-03-11T12:00:00Z`
- ISO 8601 with timezone — `2024-03-11T12:00:00+05:00`
- ISO 8601 without timezone — `2024-03-11T12:00:00` (treated as UTC)
- Date only — `2024-03-11` (midnight UTC)
### Before/after examples

```text
Input                        → Output
1710158400                   → "2024-03-11T12:00:00Z"
1710158400.5                 → "2024-03-11T12:00:00Z"
"2024-03-11"                 → "2024-03-11T00:00:00Z"
"2024-03-11T12:00:00"        → "2024-03-11T12:00:00Z"
"2024-03-11T12:00:00Z"       → "2024-03-11T12:00:00Z"
"2024-03-11T12:00:00+05:00"  → "2024-03-11T12:00:00+05:00"
```
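The normalization rules above can be reproduced with a short Python sketch. The function name `to_rfc3339` is invented for illustration; the actual engine is written in Go and may handle more input formats than shown here.

```python
from datetime import datetime, timezone


def to_rfc3339(value):
    """Normalize the documented datetime inputs to RFC 3339 (sketch)."""
    if isinstance(value, (int, float)):
        # Unix timestamp; the output truncates to whole seconds
        dt = datetime.fromtimestamp(int(value), tz=timezone.utc)
    else:
        s = value
        if len(s) == 10:                   # date only -> midnight UTC
            s += "T00:00:00"
        dt = datetime.fromisoformat(s.replace("Z", "+00:00"))
        if dt.tzinfo is None:              # no timezone -> treat as UTC
            dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat().replace("+00:00", "Z")


print(to_rfc3339(1710158400))    # 2024-03-11T12:00:00Z
print(to_rfc3339("2024-03-11"))  # 2024-03-11T00:00:00Z
```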
## Merge Strategies
When an endpoint pulls from multiple sources, the merge strategy controls how records are combined.
### 1. No merge key

When `merge_on` is not set, records from all sources are concatenated in source order. No deduplication is performed. If Stripe returns 25 records and Shopify returns 18, the response contains 43 records.

```yaml
endpoints:
  /customers:
    schema: customers
    # no merge_on — records are concatenated
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...
```
### 2. Merge with default resolution

When `merge_on` is set but no `conflict_resolution` is specified, records are grouped by the merge key. When multiple sources provide a record with the same key, the last source wins — fields from later sources overwrite earlier ones.

```yaml
schemas:
  customers:
    merge_on: email
    fields:
      email:
        type: string
      # ...
endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...
```

Before and after:

```text
Source 1 (stripe):  { email: "[email protected]", name: "Alice",    source: "stripe" }
Source 2 (shopify): { email: "[email protected]", name: "Alice S.", source: "shopify" }
Result (default):   { email: "[email protected]", name: "Alice S.", source: "shopify" }
```
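Default resolution is a last-write-wins merge keyed on `merge_on`. A Python sketch (the function name `merge_last_wins` is invented for illustration):

```python
def merge_last_wins(sources, key):
    """Group records by the merge key; later sources overwrite earlier
    ones field by field (sketch of the default resolution)."""
    merged = {}
    for records in sources:            # sources in declaration order
        for rec in records:
            merged.setdefault(rec[key], {}).update(rec)
    return list(merged.values())


stripe = [{"email": "[email protected]", "name": "Alice", "source": "stripe"}]
shopify = [{"email": "[email protected]", "name": "Alice S.", "source": "shopify"}]
print(merge_last_wins([stripe, shopify], "email"))
# [{'email': '[email protected]', 'name': 'Alice S.', 'source': 'shopify'}]
```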
### 3. Merge with prefer_latest

When `conflict_resolution: prefer_latest` is set, records are grouped by the merge key. The record with the most recent `created_at` timestamp is used as the base, and fields from the remaining records are merged in, but an empty or nil value never overwrites a non-empty one.

```yaml
schemas:
  customers:
    merge_on: email
    conflict_resolution: prefer_latest
    fields:
      email:
        type: string
      # ...
endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...
```

Before and after:

```text
Source 1 (stripe):      { email: "[email protected]", name: "Alice",    created_at: "2024-03-10T..." }
Source 2 (shopify):     { email: "[email protected]", name: "Alice S.", created_at: "2024-03-11T..." }
Result (prefer_latest): { email: "[email protected]", name: "Alice S.", created_at: "2024-03-11T..." }
```

The Shopify record is newer, so it is the base and its fields take precedence.
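A Python sketch of `prefer_latest`, under the assumptions stated here: the function name `merge_prefer_latest` is invented, `created_at` values are RFC 3339 strings in the same timezone (so lexicographic order matches chronological order), and older records only fill fields the base left empty or nil.

```python
def merge_prefer_latest(sources, key):
    """Group by merge key; the record with the newest created_at is the
    base, and other records fill in only its empty/nil fields (sketch)."""
    groups = {}
    for records in sources:
        for rec in records:
            groups.setdefault(rec[key], []).append(rec)
    merged = []
    for recs in groups.values():
        recs.sort(key=lambda r: r.get("created_at") or "")
        base = dict(recs[-1])                  # most recent record wins
        for rec in recs[:-1]:                  # older records fill gaps
            for field, val in rec.items():
                if base.get(field) in (None, "") and val not in (None, ""):
                    base[field] = val
        merged.append(base)
    return merged


stripe = [{"email": "[email protected]", "name": "Alice",
           "created_at": "2024-03-10T12:00:00Z", "phone": "555-0100"}]
shopify = [{"email": "[email protected]", "name": "Alice S.",
            "created_at": "2024-03-11T12:00:00Z"}]
print(merge_prefer_latest([stripe, shopify], "email"))
# The base is the shopify record; "phone" is filled from the older
# stripe record because the base has no value for it.
```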
## Full Pipeline

Every request to a multi-source endpoint flows through five stages:

1. Fetch — All sources are queried in parallel. HTTP connectors and extension processes run concurrently, with circuit breakers and timeouts protecting against slow upstreams.
2. Map — Field mapping expressions are evaluated per record per source. Each source's mapping configuration is applied independently.
3. Coerce — Each mapped value is coerced to the schema field type. Strings become integers, Unix timestamps become RFC 3339, and so on.
4. Merge — If `merge_on` is set, records are grouped by the merge key and combined according to the conflict resolution strategy. Otherwise, records are concatenated.
5. Respond — Final records are wrapped in the response envelope and returned as JSON.
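The parallel-fetch stage can be sketched with a thread pool. This is an illustrative Python model, not the engine's Go implementation: `fetch_all` is an invented name, circuit breakers are omitted, and each fetcher is a stand-in for an HTTP connector or extension process that returns a list of records.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetchers, timeout=5.0):
    """Stage 1 sketch: query all sources concurrently, with a per-source
    timeout guarding against slow upstreams."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(f) for f in fetchers]
        return [fut.result(timeout=timeout) for fut in futures]


# Stand-in fetchers returning pre-mapped records (illustrative data).
batches = fetch_all([
    lambda: [{"email": "[email protected]", "source": "stripe"}],
    lambda: [{"email": "[email protected]", "source": "shopify"}],
])

# With no merge_on, stage 4 reduces to concatenation in source order.
records = [rec for batch in batches for rec in batch]
print(len(records))  # 2
```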