KB article

Default Aggregation: When SUM Is the Wrong Assumption

Default aggregations can distort results when a sum is not meaningful.

arf-kb  semantic-integrity  default-aggregation  deterministic-query  semantic-contract

TL;DR

  • SUM is the wrong default for non-additive values such as rates, averages, and balances.
  • Set explicit aggregations or measures.

The problem

  • Columns default to SUM even when the right behavior is average, max, or distinct count.
  • AI-generated queries often rely on the default aggregation instead of an explicit measure.

Why it matters

  • A wrong aggregation produces wrong answers that are presented with high confidence.
  • Errors are subtle and often missed, because the result still looks like a plausible number.

Symptoms

  • Rates or percentages are summed.
  • Balances are summed across time periods instead of being taken at a point in time.
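
Both symptoms can be reproduced with a few lines of arithmetic. The sketch below uses made-up sample data (the regions, field names, and figures are all hypothetical) to contrast what an implicit SUM returns with what the metric should return.

```python
# Hypothetical sample data: per-region conversion figures.
rows = [
    {"region": "North", "orders": 50, "visits": 1000, "conversion_rate": 0.05},
    {"region": "South", "orders": 30, "visits": 200, "conversion_rate": 0.15},
]

# Implicit SUM on a rate column: adds 5% and 15%, giving roughly 20%,
# a figure that describes neither region nor the total.
summed_rate = sum(r["conversion_rate"] for r in rows)

# Recomputing the ratio from its additive parts: 80 orders / 1200 visits.
true_rate = sum(r["orders"] for r in rows) / sum(r["visits"] for r in rows)

# Hypothetical month-end balances for a single account.
balances = [("2024-01", 100.0), ("2024-02", 120.0), ("2024-03", 90.0)]

# Implicit SUM across time: an amount the account never held.
summed_balance = sum(amount for _, amount in balances)

# Point-in-time alternative: take the balance of the latest period.
closing_balance = max(balances, key=lambda b: b[0])[1]

print(f"summed rate: {summed_rate:.2%}, true rate: {true_rate:.2%}")
print(f"summed balance: {summed_balance}, closing balance: {closing_balance}")
```

The summed rate is about three times the true rate here, yet nothing about it looks obviously broken, which is why this class of error tends to go unnoticed.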

Root causes

  • Default aggregation left untouched in the model.
  • No explicit measures for key fields.

What good looks like

  • Explicit measures for critical metrics.
  • Columns set to correct aggregation or “Do not summarize.”
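
One way to picture this is as a small contract in which every field declares its aggregation and non-additive fields refuse to be aggregated directly. The following is a minimal sketch in plain Python, not any modeling tool's API; the field names, the `AGGREGATIONS` table, and the `aggregate` and `discount_rate` helpers are all assumptions made for illustration.

```python
from statistics import mean


def distinct_count(values):
    """Count unique values rather than adding them up."""
    return len(set(values))


# Hypothetical contract: each field states its aggregation explicitly.
# None plays the role of "Do not summarize".
AGGREGATIONS = {
    "amount": sum,
    "unit_price": mean,
    "customer_id": distinct_count,
    "discount_pct": None,
}


def aggregate(field, values):
    """Aggregate a field only if the contract declares how."""
    if field not in AGGREGATIONS:
        raise KeyError(f"No aggregation declared for field {field!r}")
    fn = AGGREGATIONS[field]
    if fn is None:
        raise ValueError(
            f"Field {field!r} is marked 'Do not summarize'; use a measure"
        )
    return fn(values)


def discount_rate(discount_amounts, list_amounts):
    """Explicit measure: a ratio of sums, never a sum of ratios."""
    return sum(discount_amounts) / sum(list_amounts)


print(aggregate("amount", [10, 20, 5]))
print(aggregate("customer_id", ["a", "b", "a"]))
print(discount_rate([5, 15], [100, 100]))
```

Under this design, a query that asks for `discount_pct` without going through a measure fails loudly instead of silently returning a sum.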

How to fix

  • Audit columns used by AI queries and reports.
  • Define measures for rates and ratios.
  • Set default aggregations in the model.
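
The audit step can be partly automated. The sketch below assumes the model's column metadata has been exported into a list of dictionaries; that structure, the column names, and the name-based pattern are hypothetical, and the pattern is only a heuristic for finding candidates to review by hand.

```python
import re

# Hypothetical export of model metadata; real tools expose this differently.
columns = [
    {"table": "Sales", "column": "Amount", "default_agg": "sum"},
    {"table": "Sales", "column": "DiscountPct", "default_agg": "sum"},
    {"table": "Accounts", "column": "EndingBalance", "default_agg": "sum"},
    {"table": "Sales", "column": "UnitPrice", "default_agg": "none"},
]

# Heuristic only: name fragments that often indicate non-additive values.
NON_ADDITIVE_HINT = re.compile(
    r"rate|ratio|pct|percent|balance|price|avg|average", re.IGNORECASE
)


def flag_suspect_columns(cols):
    """Return 'Table.Column' names that still default to SUM but look non-additive."""
    return [
        f'{c["table"]}.{c["column"]}'
        for c in cols
        if c["default_agg"] == "sum" and NON_ADDITIVE_HINT.search(c["column"])
    ]


suspects = flag_suspect_columns(columns)
for name in suspects:
    print(f"Review default aggregation: {name}")
```

A name match can produce false positives and will miss poorly named columns, so the output is a starting list for review rather than a verdict.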

Pitfalls

  • Relying on report visuals to override defaults; the override applies only to that visual, not to the model.
  • Leaving raw columns exposed to AI queries.

Checklist

  • All ratio metrics use measures.
  • Default aggregation settings reviewed.
  • No critical metric depends on implicit SUM.

Framework placement

Primary ARF layer: Semantic Integrity. Diagnostic bridge: semantic-reliability, change-reliability.