Schema Evolution with Avro & Kafka

Managing backward-compatible changes in trading event schemas across Kafka topics

  • Avro
  • Kafka
  • Schema Registry
  • Backend
  • OMS

In a modular, event-driven OMS, each Kafka topic acts as a contract between producers and consumers. As trading workflows evolve — new fields for algo routing, compliance tags, or enriched audit data — schemas must evolve without breaking downstream services.

This article documents how I used Avro + Schema Registry to manage schema evolution across the OMS, ensuring backward compatibility, governance, and resilience in a high-performance trading environment.


Why Schema Evolution?

  • Backward compatibility: Older consumers must continue processing messages even as producers evolve.
  • Safe rollout of new features: Add fields for algo routing, risk flags, or market metadata without disrupting existing flows.
  • Governance: Centralize schema validation and versioning across multiple teams.
  • Cross-language support: Enable Java (OMS core), Rust (low-latency modules), and Python (analytics) to share message formats.
  • Auditability: Maintain a clear history of schema changes for compliance and debugging.

Evolution Strategy

We enforce backward-compatible evolution as the default policy. This ensures that new producers can publish messages without breaking older consumers.

✅ Safe Changes

  • Add new optional fields (with default values)
  • Rename fields using aliases (sketched after this section)
  • Reorder fields (Avro schema resolution matches fields by name, not position)

❌ Breaking Changes

  • Remove existing fields
  • Change field types (e.g., int → string)
  • Rename fields without alias

When breaking changes are unavoidable (e.g., regulatory-driven field type changes), we version the schema explicitly (OrderEventV1, OrderEventV2) and migrate consumers gradually.
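
To illustrate the alias-based rename from the safe-changes list, here is a minimal sketch; the qty → quantity rename and both schema literals are hypothetical. It uses Avro's SchemaCompatibility utility to confirm the evolved reader schema can still decode data written with the old one:

import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;

public class AliasRenameCheck {

    // v1 writer schema: the quantity field is still called "qty" (hypothetical earlier name)
    private static final String V1 =
        "{\"type\":\"record\",\"name\":\"OrderEvent\",\"fields\":["
      + "{\"name\":\"orderId\",\"type\":\"string\"},"
      + "{\"name\":\"qty\",\"type\":\"int\"}]}";

    // v2 reader schema: "qty" renamed to "quantity"; the alias preserves compatibility
    private static final String V2 =
        "{\"type\":\"record\",\"name\":\"OrderEvent\",\"fields\":["
      + "{\"name\":\"orderId\",\"type\":\"string\"},"
      + "{\"name\":\"quantity\",\"aliases\":[\"qty\"],\"type\":\"int\"}]}";

    public static void main(String[] args) {
        Schema writer = new Schema.Parser().parse(V1);
        Schema reader = new Schema.Parser().parse(V2);

        // Can data written with the old schema be read with the new one?
        SchemaCompatibility.SchemaPairCompatibility result =
            SchemaCompatibility.checkReaderWriterCompatibility(reader, writer);

        System.out.println(result.getType()); // COMPATIBLE, because the alias bridges the rename
    }
}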


Workflow

  1. Define the schema in an .avsc file or as a Java POJO.
  2. Register schema with Confluent Schema Registry.
  3. Serialize using KafkaAvroSerializer (a producer-side sketch follows this list).
  4. Deserialize using KafkaAvroDeserializer.
  5. Validate compatibility via Schema Registry API in CI/CD.
  6. Document schema history in devlogs for traceability.
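
Steps 2–4 in practice: a minimal producer-side sketch, assuming Confluent's kafka-avro-serializer is on the classpath. The broker address, registry URL, topic name, and schema path are placeholders, and the consumer side mirrors this with KafkaAvroDeserializer plus the same schema.registry.url:

import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class OrderEventProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");                    // placeholder broker
        props.put("schema.registry.url", "http://schema-registry:8081"); // placeholder registry
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");

        // Load the OrderEvent schema from the classpath (placeholder path);
        // a generated SpecificRecord class would work the same way
        Schema schema = new Schema.Parser().parse(
            OrderEventProducer.class.getResourceAsStream("/avro/OrderEvent.avsc"));

        GenericRecord event = new GenericData.Record(schema);
        event.put("orderId", "ORD-1001");
        event.put("symbol", "AAPL");
        event.put("quantity", 100);

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            // The serializer registers/looks up the schema for the topic's value subject
            // and embeds the schema ID in every message it produces
            producer.send(new ProducerRecord<>("orders", "ORD-1001", event)).get();
        }
    }
}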

Example: OrderEvent Evolution

v1 – Initial schema

{
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "symbol", "type": "string"},
    {"name": "quantity", "type": "int"}
  ]
}

v2 – Added optional field

{
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "symbol", "type": "string"},
    {"name": "quantity", "type": "int"},
    {"name": "algoType", "type": ["null", "string"], "default": null}
  ]
}

This change is backward-compatible; the sketch after this list demonstrates the resolution behaviour directly.

  • Consumers using v1 ignore algoType.
  • Newer consumers can leverage it for smart order routing.
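
A broker-free sketch of the same guarantee using Avro schema resolution directly (the embedded schema strings are the v1 and v2 documents above, and the record contents are illustrative): an event written with v2 is read back with the v1 reader schema, and the new field is simply resolved away:

import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class BackwardCompatDemo {

    // The v1 and v2 OrderEvent schemas shown above, inlined as strings
    private static final String V1_JSON =
        "{\"type\":\"record\",\"name\":\"OrderEvent\",\"fields\":["
      + "{\"name\":\"orderId\",\"type\":\"string\"},"
      + "{\"name\":\"symbol\",\"type\":\"string\"},"
      + "{\"name\":\"quantity\",\"type\":\"int\"}]}";

    private static final String V2_JSON =
        "{\"type\":\"record\",\"name\":\"OrderEvent\",\"fields\":["
      + "{\"name\":\"orderId\",\"type\":\"string\"},"
      + "{\"name\":\"symbol\",\"type\":\"string\"},"
      + "{\"name\":\"quantity\",\"type\":\"int\"},"
      + "{\"name\":\"algoType\",\"type\":[\"null\",\"string\"],\"default\":null}]}";

    public static void main(String[] args) throws Exception {
        Schema v1 = new Schema.Parser().parse(V1_JSON);
        Schema v2 = new Schema.Parser().parse(V2_JSON);

        // A v2 producer publishes an event with the new field populated
        GenericRecord v2Event = new GenericData.Record(v2);
        v2Event.put("orderId", "ORD-1001");
        v2Event.put("symbol", "AAPL");
        v2Event.put("quantity", 100);
        v2Event.put("algoType", "VWAP");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(v2).write(v2Event, encoder);
        encoder.flush();

        // An old consumer still reads with the v1 schema: algoType is resolved away
        GenericDatumReader<GenericRecord> oldReader = new GenericDatumReader<>(v2, v1);
        GenericRecord seenByV1 = oldReader.read(null,
            DecoderFactory.get().binaryDecoder(out.toByteArray(), null));

        System.out.println(seenByV1); // only orderId, symbol and quantity survive
    }
}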

CI/CD Integration

To prevent accidental breaking changes, schema validation is part of the pipeline:

  • Pre-commit hook → runs avro-tools to validate schema syntax.
  • CI job → checks compatibility against the latest registered schema in Schema Registry (sketched below).
  • Fail-fast policy → pipeline blocks merges if compatibility fails.
  • Schema changelog → auto-generated markdown documenting each schema version.
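
The CI compatibility gate itself can be as small as the sketch below. The subject name, registry URL, and schema path are placeholders, and the exact client API varies slightly across Confluent versions (this assumes the 5.5+ ParsedSchema style):

import java.nio.file.Files;
import java.nio.file.Paths;

import io.confluent.kafka.schemaregistry.avro.AvroSchema;
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;

public class SchemaCompatGate {
    public static void main(String[] args) throws Exception {
        String registryUrl = "http://schema-registry:8081";               // placeholder registry
        String subject = "orders-value";                                  // placeholder subject
        String candidate = Files.readString(Paths.get("src/main/avro/OrderEvent.avsc")); // placeholder path

        CachedSchemaRegistryClient client = new CachedSchemaRegistryClient(registryUrl, 10);

        // Ask the registry whether the candidate is compatible with the latest
        // registered version, under the subject's configured compatibility level
        boolean compatible = client.testCompatibility(subject, new AvroSchema(candidate));

        if (!compatible) {
            System.err.println("Schema change for " + subject + " breaks compatibility");
            System.exit(1); // fail-fast: the pipeline blocks the merge
        }
        System.out.println("Schema change for " + subject + " is compatible");
    }
}

A non-zero exit code is what lets the fail-fast policy above block the merge.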

Monitoring & Governance

  • Schema Registry UI → visualize versions and compatibility.
  • Prometheus metrics → track serialization/deserialization errors (wiring sketched after this list).
  • Kafka topic README → each topic has a schema history and usage guide.
  • Audit logs → every schema update is tied to a Git commit and Jira ticket.
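
A sketch of how the error metric might be wired; Micrometer with a Prometheus registry, the metric name, and the topic tag are assumptions rather than the exact production setup:

import io.micrometer.core.instrument.Counter;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import org.apache.kafka.common.errors.SerializationException;

public class SerdeErrorMetrics {
    private final PrometheusMeterRegistry registry =
        new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

    private final Counter deserErrors = Counter.builder("oms_deserialization_errors_total")
        .description("Avro deserialization failures per topic")
        .tag("topic", "orders")                    // placeholder topic name
        .register(registry);

    public void onPollFailure(SerializationException e) {
        // Called when KafkaAvroDeserializer rejects a message (e.g. unknown schema ID)
        deserErrors.increment();
    }

    public String scrape() {
        // Exposed on a /metrics endpoint for Prometheus to scrape
        return registry.scrape();
    }
}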

Capital Markets Context

In trading systems, schema evolution is not just technical — it’s business-critical:

  • 🛡️ Risk checks: Adding new fields like riskCategory or marginRequirement.
  • ⚖️ Compliance: Regulatory changes may require new audit fields (MiFID II, CAT reporting).
  • ⚡ Performance: Schema bloat can increase serialization latency, so optional fields are carefully managed.
  • 🔄 Algo trading: New algo parameters (TWAP, VWAP, Iceberg) can be added without breaking existing order flows.

Optimizations & Learnings

  • 🗂️ Always version schemas explicitly (OrderEventV1, OrderEventV2).
  • ⚠️ Avoid nested unions or deeply nested records — they complicate evolution.
  • 📖 Document schema changes in devlogs for traceability.
  • 🧰 Use Avro codegen to auto-generate POJOs and avoid manual errors.
  • 🔍 Validate schema compatibility in CI before deployment.
  • 📊 Benchmark serialization/deserialization latency under load (important for trading spikes; a micro-benchmark sketch follows).
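
On the last point, even a crude micro-benchmark catches accidental schema bloat before it hits production. A minimal sketch follows; the iteration count, record contents, and schema path are illustrative, and a proper harness (e.g. JMH) would give more reliable numbers:

import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class SerializationBenchmark {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(
            SerializationBenchmark.class.getResourceAsStream("/avro/OrderEvent.avsc")); // placeholder path

        GenericRecord event = new GenericData.Record(schema);
        event.put("orderId", "ORD-1001");
        event.put("symbol", "AAPL");
        event.put("quantity", 100);

        GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = null;

        int iterations = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            out.reset();
            encoder = EncoderFactory.get().binaryEncoder(out, encoder); // reuse the encoder buffer
            writer.write(event, encoder);
            encoder.flush();
        }
        long elapsed = System.nanoTime() - start;

        System.out.printf("avg serialize latency: %.0f ns, payload size: %d bytes%n",
            (double) elapsed / iterations, out.size());
    }
}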

Final Thoughts

Schema evolution is critical for maintaining modular, resilient trading systems. By using Avro and Schema Registry, I ensured that my OMS can evolve safely — adding new features without breaking existing consumers.

This approach reinforces the principle that contracts between services must be versioned and governed, especially in high-performance, multi-team environments like capital markets.

With this foundation, future modules — like Algo Routing, Trade Enrichment, or Compliance Reporting — can be added confidently and cleanly.