Schema Engine

The schema engine transforms, coerces, and merges data from multiple sources into your defined output shape.

Overview

The schema engine is the core of Braids. It takes raw records from connectors, applies field mapping expressions, coerces values to the target types, and merges records from multiple sources. You define the output shape — Braids maps upstream data into it.

This inverts the typical integration model: instead of conforming to an aggregator's fixed schema, you declare exactly the fields and types you need, and the engine handles the rest.

Field Mapping DSL

The mapping DSL supports three expression types for transforming upstream fields into your schema.

1. Direct field reference

Use the field name from the upstream source to extract its value directly:

mapping:
  email: email
  name: customer_name

This extracts the field value directly: record["email"] → email, record["customer_name"] → name.

2. String literals

Quoted strings produce static values:

mapping:
  source: "'stripe'"
  prefix: "'usr_'"

Single-quoted strings inside the expression are literal values.

3. Concatenation

Join fields and literals with +:

mapping:
  id: "'stripe_' + id"
  name: first_name + ' ' + last_name
  label: "'[' + status + '] ' + title"
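For intuition, evaluating a concatenation expression can be sketched as a split on " + " where single-quoted terms are literals and bare terms are record lookups. This is a hypothetical helper (evalExpr is not the engine's actual parser), shown only to make the three expression types concrete:

```go
package main

import (
	"fmt"
	"strings"
)

// evalExpr is a hypothetical sketch, not the engine's actual parser:
// terms are split on " + "; single-quoted terms are literals, bare
// terms are field references resolved against the upstream record.
func evalExpr(expr string, record map[string]any) string {
	var out strings.Builder
	for _, term := range strings.Split(expr, " + ") {
		term = strings.TrimSpace(term)
		if len(term) >= 2 && strings.HasPrefix(term, "'") && strings.HasSuffix(term, "'") {
			out.WriteString(term[1 : len(term)-1]) // quoted literal
		} else {
			out.WriteString(fmt.Sprintf("%v", record[term])) // field reference
		}
	}
	return out.String()
}

func main() {
	rec := map[string]any{"id": 42, "first_name": "Alice", "last_name": "Smith"}
	fmt.Println(evalExpr("'stripe_' + id", rec))               // stripe_42
	fmt.Println(evalExpr("first_name + ' ' + last_name", rec)) // Alice Smith
}
```

A direct field reference is just the degenerate one-term case, and a literal-only mapping like "'stripe'" is the zero-field case.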

Complete example

Here is a full endpoint configuration using all three expression types across two sources:

braids.yaml
endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping:
          id: "'stripe_' + id"         # literal + field
          email: email                  # direct reference
          name: name                    # direct reference
          source: "'stripe'"            # literal only
          created_at: created          # direct reference
      - connector: shopify
        resource: customers
        mapping:
          id: "'shopify_' + id"
          email: email
          name: first_name + ' ' + last_name  # multi-field concat
          source: "'shopify'"
          created_at: created_at

Type Coercion

The schema engine coerces every mapped value to the type declared in the schema. This ensures a consistent output shape regardless of upstream quirks.

Target Type   Input            Behavior                 Example
string        any              Formats using Go's %v    42 → "42", true → "true"
int           float64          Truncates decimal        19.99 → 19
int           string           Parses integer           "42" → 42
int           int              Passthrough              42 → 42
float         int              Converts to float        42 → 42.0
float         string           Parses float             "19.99" → 19.99
float         float64          Passthrough              19.99 → 19.99
datetime      Unix int/float   Converts to RFC 3339     1710158400 → "2024-03-11T12:00:00Z"
datetime      ISO string       Normalizes to RFC 3339   "2024-03-11" → "2024-03-11T00:00:00Z"
any           nil              Returns nil              nil → nil
Note: If a string cannot be parsed to the target numeric type, the original value is returned unchanged.
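As a rough illustration of the rules in the table, the coercion logic can be sketched in Go. The coerce function and its signature are assumptions for this example, not the engine's real API:

```go
package main

import (
	"fmt"
	"strconv"
)

// coerce sketches the coercion table: nil passes through, strings are
// formatted with %v, numerics are truncated/parsed/converted, and an
// unparseable string is returned unchanged.
func coerce(v any, target string) any {
	if v == nil {
		return nil
	}
	switch target {
	case "string":
		return fmt.Sprintf("%v", v)
	case "int":
		switch x := v.(type) {
		case float64:
			return int(x) // truncates the decimal part
		case string:
			if n, err := strconv.Atoi(x); err == nil {
				return n
			}
			return x // unparseable: original value unchanged
		case int:
			return x
		}
	case "float":
		switch x := v.(type) {
		case int:
			return float64(x)
		case string:
			if f, err := strconv.ParseFloat(x, 64); err == nil {
				return f
			}
			return x // unparseable: original value unchanged
		case float64:
			return x
		}
	}
	return v
}

func main() {
	fmt.Println(coerce(42, "string"))     // 42 as a string
	fmt.Println(coerce(19.99, "int"))     // 19
	fmt.Println(coerce("19.99", "float")) // 19.99
}
```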

Datetime Handling

All datetime fields are normalized to RFC 3339 format (2006-01-02T15:04:05Z07:00). The engine accepts a variety of input formats and converts them to a consistent representation.

Unix timestamps

Both integer and floating-point Unix timestamps are converted. Sub-second precision from floats is preserved internally, but the RFC 3339 output format truncates to whole seconds.

Accepted string formats

Date-only values (2006-01-02), datetimes without a timezone (2006-01-02T15:04:05), and full RFC 3339 timestamps with Z or a numeric offset are all accepted; values without an explicit zone are treated as UTC.

Before/after examples

Input                           Output
1710158400                      "2024-03-11T12:00:00Z"
1710158400.5                    "2024-03-11T12:00:00Z"
"2024-03-11"                    "2024-03-11T00:00:00Z"
"2024-03-11T12:00:00"           "2024-03-11T12:00:00Z"
"2024-03-11T12:00:00Z"          "2024-03-11T12:00:00Z"
"2024-03-11T12:00:00+05:00"    "2024-03-11T12:00:00+05:00"

Merge Strategies

When an endpoint pulls from multiple sources, the merge strategy controls how records are combined.

1. No merge key

When merge_on is not set, records from all sources are concatenated in source order. No deduplication is performed. If Stripe returns 25 records and Shopify returns 18, the response contains 43 records.

braids.yaml
endpoints:
  /customers:
    schema: customers
    # no merge_on — records are concatenated
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...

2. Merge with default resolution

When merge_on is set but no conflict_resolution is specified, records are grouped by the merge key. When multiple sources provide a record with the same key, the last source wins — fields from later sources overwrite earlier ones.

braids.yaml
schemas:
  customers:
    merge_on: email
    fields:
      email:
        type: string
      # ...

endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...

Before and after:

Source 1 (stripe):   { email: "alice@example.com", name: "Alice", source: "stripe" }
Source 2 (shopify):  { email: "alice@example.com", name: "Alice S.", source: "shopify" }

Result (default):    { email: "alice@example.com", name: "Alice S.", source: "shopify" }
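The default last-source-wins behavior can be sketched as a group-by over the merge key, with later sources overwriting earlier fields. mergeLastWins is a hypothetical helper, and the example records are illustrative:

```go
package main

import "fmt"

type record = map[string]any

// mergeLastWins groups records by the merge key; when several sources
// share a key, fields from later sources overwrite earlier ones.
// Output preserves the order in which keys were first seen.
func mergeLastWins(key string, sources ...[]record) []record {
	var order []any
	merged := map[any]record{}
	for _, src := range sources {
		for _, r := range src {
			k := r[key]
			if _, seen := merged[k]; !seen {
				merged[k] = record{}
				order = append(order, k)
			}
			for f, v := range r {
				merged[k][f] = v // later sources overwrite
			}
		}
	}
	out := make([]record, 0, len(order))
	for _, k := range order {
		out = append(out, merged[k])
	}
	return out
}

func main() {
	stripe := []record{{"email": "alice@example.com", "name": "Alice", "source": "stripe"}}
	shopify := []record{{"email": "alice@example.com", "name": "Alice S.", "source": "shopify"}}
	fmt.Println(mergeLastWins("email", stripe, shopify)) // one record; shopify fields win
}
```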

3. Merge with prefer_latest

When conflict_resolution: prefer_latest is set, records are grouped by merge key. The record with the most recent created_at timestamp is used as the base. All other records' fields are overlaid, but empty or nil fields do not overwrite existing values.

braids.yaml
schemas:
  customers:
    merge_on: email
    conflict_resolution: prefer_latest
    fields:
      email:
        type: string
      # ...

endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...

Before and after:

Source 1 (stripe):   { email: "alice@example.com", name: "Alice", created_at: "2024-03-10T..." }
Source 2 (shopify):  { email: "alice@example.com", name: "Alice S.", created_at: "2024-03-11T..." }

Result (prefer_latest): { email: "alice@example.com", name: "Alice S.", created_at: "2024-03-11T..." }
# Shopify record is newer, so it's the base. All fields from shopify take precedence.
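The prefer_latest overlay can be sketched as picking the newest record in each key group as the base, then filling only its empty or missing fields from the others. mergePreferLatest is hypothetical, and it assumes created_at values are RFC 3339 strings, which compare correctly as plain text:

```go
package main

import "fmt"

type record = map[string]any

// mergePreferLatest: the record with the most recent created_at becomes
// the base; other records only fill fields that are missing, nil, or
// empty on the base. Empty values never overwrite existing ones.
func mergePreferLatest(group []record) record {
	base := group[0]
	for _, r := range group[1:] {
		// RFC 3339 strings sort chronologically as text
		if fmt.Sprintf("%v", r["created_at"]) > fmt.Sprintf("%v", base["created_at"]) {
			base = r
		}
	}
	out := record{}
	for f, v := range base {
		out[f] = v
	}
	for _, r := range group {
		for f, v := range r {
			if existing, ok := out[f]; !ok || existing == nil || existing == "" {
				if v != nil && v != "" {
					out[f] = v // fill gaps only; never clobber real values
				}
			}
		}
	}
	return out
}

func main() {
	group := []record{
		{"email": "alice@example.com", "name": "Alice", "phone": "555-0100", "created_at": "2024-03-10T00:00:00Z"},
		{"email": "alice@example.com", "name": "Alice S.", "phone": "", "created_at": "2024-03-11T00:00:00Z"},
	}
	// base is the newer record; its empty phone is filled from the older one
	fmt.Println(mergePreferLatest(group))
}
```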

Full Pipeline

Every request to a multi-source endpoint flows through five stages:

Fetch → Map → Coerce → Merge → Respond
  1. Fetch — All sources are queried in parallel. HTTP connectors and extension processes run concurrently, with circuit breakers and timeouts protecting against slow upstreams.
  2. Map — Field mapping expressions are evaluated per record per source. Each source's mapping configuration is applied independently.
  3. Coerce — Each mapped value is coerced to the schema field type. Strings become integers, Unix timestamps become RFC 3339, and so on.
  4. Merge — If merge_on is set, records are grouped by the merge key and combined according to the conflict resolution strategy. Otherwise, records are concatenated.
  5. Respond — Final records are wrapped in the response envelope and returned as JSON.
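The fan-out in the Fetch stage can be sketched with a wait group. fetchAll and the inline fetchers are hypothetical stand-ins for connectors; real connectors add circuit breakers and timeouts around each call:

```go
package main

import (
	"fmt"
	"sync"
)

type record = map[string]any

// fetchAll queries every source concurrently and collects results in
// source order, so later stages see a deterministic ordering.
func fetchAll(sources []func() []record) [][]record {
	results := make([][]record, len(sources))
	var wg sync.WaitGroup
	for i, fetch := range sources {
		wg.Add(1)
		go func(i int, fetch func() []record) {
			defer wg.Done()
			results[i] = fetch() // circuit breakers/timeouts would wrap this
		}(i, fetch)
	}
	wg.Wait()
	return results
}

func main() {
	stripe := func() []record { return []record{{"id": "stripe_1"}} }
	shopify := func() []record { return []record{{"id": "shopify_1"}} }

	batches := fetchAll([]func() []record{stripe, shopify})

	// Map and Coerce would run per record here; with no merge_on set,
	// the Merge stage simply concatenates batches in source order.
	var out []record
	for _, b := range batches {
		out = append(out, b...)
	}
	fmt.Println(len(out)) // 2
}
```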