Schema Engine

The schema engine transforms, coerces, and merges data from multiple sources into your defined output shape.

Overview

The schema engine is the core of Braids. It takes raw records from connectors, applies field mapping expressions, coerces values to the target types, and merges records from multiple sources. You define the output shape — Braids maps upstream data into it.

This inverts the typical integration model: instead of conforming to an aggregator's fixed schema, you declare exactly the fields and types you need, and the engine handles the rest.

Field Mapping DSL

The mapping DSL supports three expression types for transforming upstream fields into your schema.

1. Direct field reference

Use the field name from the upstream source to extract its value directly:

mapping:
  email: email
  name: customer_name

This extracts the field value directly: record["email"] → email, record["customer_name"] → name.

2. String literals

Quoted strings produce static values:

mapping:
  source: "'stripe'"
  prefix: "'usr_'"

Single-quoted strings inside the expression are literal values.

3. Concatenation

Join fields and literals with +:

mapping:
  id: "'stripe_' + id"
  name: first_name + ' ' + last_name
  label: "'[' + status + '] ' + title"
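For intuition, evaluating a concatenation expression can be sketched as a split on " + " where single-quoted terms are literals and bare terms are record lookups. This is a hypothetical helper (evalExpr is not the engine's actual parser), shown only to make the three expression types concrete:

```go
package main

import (
	"fmt"
	"strings"
)

// evalExpr is a hypothetical sketch, not the engine's actual parser:
// terms are split on " + "; single-quoted terms are literals, bare
// terms are field references resolved against the upstream record.
func evalExpr(expr string, record map[string]any) string {
	var out strings.Builder
	for _, term := range strings.Split(expr, " + ") {
		term = strings.TrimSpace(term)
		if len(term) >= 2 && strings.HasPrefix(term, "'") && strings.HasSuffix(term, "'") {
			out.WriteString(term[1 : len(term)-1]) // quoted literal
		} else {
			out.WriteString(fmt.Sprintf("%v", record[term])) // field reference
		}
	}
	return out.String()
}

func main() {
	rec := map[string]any{"id": 42, "first_name": "Alice", "last_name": "Smith"}
	fmt.Println(evalExpr("'stripe_' + id", rec))               // stripe_42
	fmt.Println(evalExpr("first_name + ' ' + last_name", rec)) // Alice Smith
}
```

A direct field reference is just the degenerate one-term case, and a literal-only mapping like "'stripe'" is the zero-field case.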

Complete example

Here is a full endpoint configuration using all three expression types across two sources:

braids.yaml
endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping:
          id: "'stripe_' + id"         # literal + field
          email: email                  # direct reference
          name: name                    # direct reference
          source: "'stripe'"            # literal only
          created_at: created          # direct reference
      - connector: shopify
        resource: customers
        mapping:
          id: "'shopify_' + id"
          email: email
          name: first_name + ' ' + last_name  # multi-field concat
          source: "'shopify'"
          created_at: created_at

Type Coercion

The schema engine coerces every mapped value to the type declared in the schema. This ensures a consistent output shape regardless of upstream quirks.

Target Type   Input            Behavior                 Example
string        any              Formats using Go's %v    42 → "42", true → "true"
int           float64          Truncates decimal        19.99 → 19
int           string           Parses integer           "42" → 42
int           int              Passthrough              42 → 42
float         int              Converts to float        42 → 42.0
float         string           Parses float             "19.99" → 19.99
float         float64          Passthrough              19.99 → 19.99
datetime      Unix int/float   Converts to RFC 3339     1710158400 → "2024-03-11T12:00:00Z"
datetime      ISO string       Normalizes to RFC 3339   "2024-03-11" → "2024-03-11T00:00:00Z"
any           nil              Returns nil              nil → nil
Note: If a string cannot be parsed to the target numeric type, the original value is returned unchanged.
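As a rough illustration of the rules in the table, the coercion logic can be sketched in Go. The coerce function and its signature are assumptions for this example, not the engine's real API:

```go
package main

import (
	"fmt"
	"strconv"
)

// coerce sketches the coercion table: nil passes through, strings are
// formatted with %v, numerics are truncated/parsed/converted, and an
// unparseable string is returned unchanged.
func coerce(v any, target string) any {
	if v == nil {
		return nil
	}
	switch target {
	case "string":
		return fmt.Sprintf("%v", v)
	case "int":
		switch x := v.(type) {
		case float64:
			return int(x) // truncates the decimal part
		case string:
			if n, err := strconv.Atoi(x); err == nil {
				return n
			}
			return x // unparseable: original value unchanged
		case int:
			return x
		}
	case "float":
		switch x := v.(type) {
		case int:
			return float64(x)
		case string:
			if f, err := strconv.ParseFloat(x, 64); err == nil {
				return f
			}
			return x // unparseable: original value unchanged
		case float64:
			return x
		}
	}
	return v
}

func main() {
	fmt.Println(coerce(42, "string"))     // 42 as a string
	fmt.Println(coerce(19.99, "int"))     // 19
	fmt.Println(coerce("19.99", "float")) // 19.99
}
```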

Datetime Handling

All datetime fields are normalized to RFC 3339 format (2006-01-02T15:04:05Z07:00). The engine accepts a variety of input formats and converts them to a consistent representation.

Unix timestamps

Both integer and floating-point Unix timestamps are converted. Sub-second precision from floats is preserved internally, but the RFC 3339 output format truncates to whole seconds.

Accepted string formats

Date-only values (2006-01-02), datetimes without a timezone (2006-01-02T15:04:05), and full RFC 3339 timestamps with Z or a numeric offset are all accepted; values without an explicit zone are treated as UTC.

Before/after examples

Input                           Output
1710158400                      "2024-03-11T12:00:00Z"
1710158400.5                    "2024-03-11T12:00:00Z"
"2024-03-11"                    "2024-03-11T00:00:00Z"
"2024-03-11T12:00:00"           "2024-03-11T12:00:00Z"
"2024-03-11T12:00:00Z"          "2024-03-11T12:00:00Z"
"2024-03-11T12:00:00+05:00"    "2024-03-11T12:00:00+05:00"

Merge Strategies

When an endpoint pulls from multiple sources, the merge strategy controls how records are combined.

1. No merge key

When merge_on is not set, records from all sources are concatenated in source order. No deduplication is performed. If Stripe returns 25 records and Shopify returns 18, the response contains 43 records.

braids.yaml
endpoints:
  /customers:
    schema: customers
    # no merge_on — records are concatenated
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...

2. Merge with default resolution

When merge_on is set but no conflict_resolution is specified, records are grouped by the merge key. When multiple sources provide a record with the same key, the last source wins — fields from later sources overwrite earlier ones.

braids.yaml
schemas:
  customers:
    merge_on: email
    fields:
      email:
        type: string
      # ...

endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...

Before and after:

Source 1 (stripe):   { email: "alice@example.com", name: "Alice", source: "stripe" }
Source 2 (shopify):  { email: "alice@example.com", name: "Alice S.", source: "shopify" }

Result (default):    { email: "alice@example.com", name: "Alice S.", source: "shopify" }
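The default last-source-wins behavior can be sketched as a group-by over the merge key, with later sources overwriting earlier fields. mergeLastWins is a hypothetical helper, and the example records are illustrative:

```go
package main

import "fmt"

type record = map[string]any

// mergeLastWins groups records by the merge key; when several sources
// share a key, fields from later sources overwrite earlier ones.
// Output preserves the order in which keys were first seen.
func mergeLastWins(key string, sources ...[]record) []record {
	var order []any
	merged := map[any]record{}
	for _, src := range sources {
		for _, r := range src {
			k := r[key]
			if _, seen := merged[k]; !seen {
				merged[k] = record{}
				order = append(order, k)
			}
			for f, v := range r {
				merged[k][f] = v // later sources overwrite
			}
		}
	}
	out := make([]record, 0, len(order))
	for _, k := range order {
		out = append(out, merged[k])
	}
	return out
}

func main() {
	stripe := []record{{"email": "alice@example.com", "name": "Alice", "source": "stripe"}}
	shopify := []record{{"email": "alice@example.com", "name": "Alice S.", "source": "shopify"}}
	fmt.Println(mergeLastWins("email", stripe, shopify)) // one record; shopify fields win
}
```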

3. Merge with prefer_latest

When conflict_resolution: prefer_latest is set, records are grouped by merge key. The record with the most recent created_at timestamp is used as the base. All other records' fields are overlaid, but empty or nil fields do not overwrite existing values.

braids.yaml
schemas:
  customers:
    merge_on: email
    conflict_resolution: prefer_latest
    fields:
      email:
        type: string
      # ...

endpoints:
  /customers:
    schema: customers
    sources:
      - connector: stripe
        resource: customers
        mapping: ...
      - connector: shopify
        resource: customers
        mapping: ...

Before and after:

Source 1 (stripe):   { email: "alice@example.com", name: "Alice", created_at: "2024-03-10T..." }
Source 2 (shopify):  { email: "alice@example.com", name: "Alice S.", created_at: "2024-03-11T..." }

Result (prefer_latest): { email: "alice@example.com", name: "Alice S.", created_at: "2024-03-11T..." }
# Shopify record is newer, so it's the base. All fields from shopify take precedence.
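The prefer_latest overlay can be sketched as picking the newest record in each key group as the base, then filling only its empty or missing fields from the others. mergePreferLatest is hypothetical, and it assumes created_at values are RFC 3339 strings, which compare correctly as plain text:

```go
package main

import "fmt"

type record = map[string]any

// mergePreferLatest: the record with the most recent created_at becomes
// the base; other records only fill fields that are missing, nil, or
// empty on the base. Empty values never overwrite existing ones.
func mergePreferLatest(group []record) record {
	base := group[0]
	for _, r := range group[1:] {
		// RFC 3339 strings sort chronologically as text
		if fmt.Sprintf("%v", r["created_at"]) > fmt.Sprintf("%v", base["created_at"]) {
			base = r
		}
	}
	out := record{}
	for f, v := range base {
		out[f] = v
	}
	for _, r := range group {
		for f, v := range r {
			if existing, ok := out[f]; !ok || existing == nil || existing == "" {
				if v != nil && v != "" {
					out[f] = v // fill gaps only; never clobber real values
				}
			}
		}
	}
	return out
}

func main() {
	group := []record{
		{"email": "alice@example.com", "name": "Alice", "phone": "555-0100", "created_at": "2024-03-10T00:00:00Z"},
		{"email": "alice@example.com", "name": "Alice S.", "phone": "", "created_at": "2024-03-11T00:00:00Z"},
	}
	// base is the newer record; its empty phone is filled from the older one
	fmt.Println(mergePreferLatest(group))
}
```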

Full Pipeline

Every request to a multi-source endpoint flows through five stages:

Fetch → Map → Coerce → Merge → Respond
  1. Fetch — All sources are queried in parallel. HTTP connectors and extension processes run concurrently, with circuit breakers and timeouts protecting against slow upstreams.
  2. Map — Field mapping expressions are evaluated per record per source. Each source's mapping configuration is applied independently.
  3. Coerce — Each mapped value is coerced to the schema field type. Strings become integers, Unix timestamps become RFC 3339, and so on.
  4. Merge — If merge_on is set, records are grouped by the merge key and combined according to the conflict resolution strategy. Otherwise, records are concatenated.
  5. Respond — Final records are wrapped in the response envelope and returned as JSON.
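The fan-out in the Fetch stage can be sketched with a wait group. fetchAll and the inline fetchers are hypothetical stand-ins for connectors; real connectors add circuit breakers and timeouts around each call:

```go
package main

import (
	"fmt"
	"sync"
)

type record = map[string]any

// fetchAll queries every source concurrently and collects results in
// source order, so later stages see a deterministic ordering.
func fetchAll(sources []func() []record) [][]record {
	results := make([][]record, len(sources))
	var wg sync.WaitGroup
	for i, fetch := range sources {
		wg.Add(1)
		go func(i int, fetch func() []record) {
			defer wg.Done()
			results[i] = fetch() // circuit breakers/timeouts would wrap this
		}(i, fetch)
	}
	wg.Wait()
	return results
}

func main() {
	stripe := func() []record { return []record{{"id": "stripe_1"}} }
	shopify := func() []record { return []record{{"id": "shopify_1"}} }

	batches := fetchAll([]func() []record{stripe, shopify})

	// Map and Coerce would run per record here; with no merge_on set,
	// the Merge stage simply concatenates batches in source order.
	var out []record
	for _, b := range batches {
		out = append(out, b...)
	}
	fmt.Println(len(out)) // 2
}
```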