Characterization Tests

Characterization tests are a special type of test designed for legacy code. Unlike traditional tests that verify correct behavior, characterization tests document current behavior—whatever the code actually does right now. This distinction is crucial when working with legacy systems where the "correct" behavior may be unclear, undocumented, or encoded only in the running code itself.

These tests act as a safety net, detecting when refactoring accidentally changes behavior.

What Are Characterization Tests?

Definition

A characterization test captures the actual behavior of existing code without making judgments about whether that behavior is correct.

Traditional test: "Given valid input, the function should return the correct calculated discount."

Characterization test: "Given this specific input, the function currently returns 0.23. We don't know if that's 'correct,' but it's what happens now."

Why They Matter

When working with legacy code:

You don't fully understand the requirements
Original developers are gone
Documentation is missing or outdated
The running code is the only reliable source of truth
You need to refactor without changing behavior

Characterization tests let you refactor confidently by alerting you when behavior changes—even if you don't yet understand whether that behavior is correct.

Writing Characterization Tests

The Process

1. Write a test with a guess at the output

describe('calculateLoyaltyPoints', () => {
  it('calculates points for $100 purchase', () => {
    const points = calculateLoyaltyPoints(100, 'gold');
    expect(points).toBe(0); // We don't know the real value
  });
});

2. Run the test and let it fail

Expected: 0
Received: 150

3. Update the test with the actual value

it('calculates points for $100 purchase', () => {
  const points = calculateLoyaltyPoints(100, 'gold');
  expect(points).toBe(150); // Now matches actual behavior
});

4. Run the test again to verify it passes

5. Repeat for different inputs to characterize behavior

describe('calculateLoyaltyPoints', () => {
  it('calculates points for gold member', () => {
    expect(calculateLoyaltyPoints(100, 'gold')).toBe(150);
  });

  it('calculates points for silver member', () => {
    expect(calculateLoyaltyPoints(100, 'silver')).toBe(100);
  });

  it('calculates points for regular member', () => {
    expect(calculateLoyaltyPoints(100, 'regular')).toBe(50);
  });

  it('handles edge case: zero purchase', () => {
    expect(calculateLoyaltyPoints(0, 'gold')).toBe(0);
  });

  it('handles edge case: negative amount', () => {
    expect(calculateLoyaltyPoints(-50, 'gold')).toBe(0);
  });
});
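The guess, run, and record loop above can be partly automated. The following is a minimal sketch, not part of any test framework: `characterize` is a hypothetical helper name, and the `calculateLoyaltyPoints` shown here is only a stand-in that reproduces the values recorded above, since the real legacy function is not shown. The helper runs a function over a list of inputs and records each return value, or the error message if the call throws.

```typescript
// Outcome of one call: either the returned value or the thrown error message.
type Observation<R> =
  | { kind: "returned"; value: R }
  | { kind: "threw"; message: string };

type Row<A, R> = { args: A; observed: Observation<R> };

// Hypothetical helper: call `fn` with every argument tuple and record what happened.
function characterize<A extends unknown[], R>(
  fn: (...args: A) => R,
  inputs: A[],
): Array<Row<A, R>> {
  return inputs.map((args): Row<A, R> => {
    try {
      return { args, observed: { kind: "returned", value: fn(...args) } };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { args, observed: { kind: "threw", message } };
    }
  });
}

// Stand-in for the legacy function (assumption: it only mimics the recorded values).
function calculateLoyaltyPoints(amount: number, tier: string): number {
  if (amount <= 0) return 0;
  const rate = tier === "gold" ? 1.5 : tier === "silver" ? 1 : 0.5;
  return amount * rate;
}

const inputs: Array<[number, string]> = [
  [100, "gold"],
  [100, "silver"],
  [0, "gold"],
  [-50, "gold"],
];

const rows = characterize(calculateLoyaltyPoints, inputs);
for (const row of rows) {
  console.log(JSON.stringify(row.args), "->", JSON.stringify(row.observed));
}
```

Each recorded row can then be copied into an assertion by hand, which keeps a human in the loop to notice surprising values.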

Focus on Inputs and Outputs

Characterization tests should:

Test observable behavior - What callers can observe (return values and side effects), not internal implementation
Cover representative cases - Common scenarios and edge cases
Be specific and concrete - Use real example data, not random values
Document discovered behavior - Add comments explaining unexpected results

it('truncates the total instead of rounding', () => {
  // Note: The total is truncated (2.99 + 3.99 = 6.98 becomes 6), not rounded.
  // This looks like a bug, but it is the behavior currently in production.
  expect(calculateTotal([2.99, 3.99])).toBe(6);
});

Real-World Applications

Example 1: Characterizing a Pricing Engine

// Legacy pricing function - unclear how it works
function calculatePrice(product: Product, customer: Customer): number {
  let price = product.basePrice;

  if (customer.membershipLevel === 'premium') {
    price *= 0.9;
  }

  if (product.category === 'electronics' && customer.lastPurchase) {
    const daysSinceLastPurchase = getDaysSince(customer.lastPurchase);
    if (daysSinceLastPurchase < 30) {
      price *= 0.95;
    }
  }

  if (product.stock < 5) {
    price *= 1.1;
  }

  return Math.round(price * 100) / 100;
}

// Characterization tests
describe('calculatePrice', () => {
  const baseProduct = {
    id: '1',
    basePrice: 100,
    category: 'electronics',
    stock: 10,
  };

  const regularCustomer = {
    id: '1',
    membershipLevel: 'regular',
    lastPurchase: null,
  };

  it('returns base price for regular customer, normal stock', () => {
    const price = calculatePrice(baseProduct, regularCustomer);
    expect(price).toBe(100);
  });

  it('applies 10% premium discount', () => {
    const premiumCustomer = { ...regularCustomer, membershipLevel: 'premium' };
    const price = calculatePrice(baseProduct, premiumCustomer);
    expect(price).toBe(90); // 10% off
  });

  it('applies recent purchase discount on electronics', () => {
    const recentCustomer = {
      ...regularCustomer,
      lastPurchase: new Date(Date.now() - 15 * 24 * 60 * 60 * 1000), // 15 days ago
    };
    const price = calculatePrice(baseProduct, recentCustomer);
    expect(price).toBe(95); // 5% repeat customer discount
  });

  it('stacks premium and repeat customer discounts', () => {
    const premiumRecentCustomer = {
      ...regularCustomer,
      membershipLevel: 'premium',
      lastPurchase: new Date(Date.now() - 15 * 24 * 60 * 60 * 1000),
    };
    const price = calculatePrice(baseProduct, premiumRecentCustomer);
    // 10% premium discount, then 5% repeat discount
    expect(price).toBe(85.5);
  });

  it('increases price 10% when stock is low', () => {
    const lowStockProduct = { ...baseProduct, stock: 3 };
    const price = calculatePrice(lowStockProduct, regularCustomer);
    expect(price).toBe(110); // 10% urgency premium
  });

  it('applies complex interaction: premium + low stock', () => {
    const lowStockProduct = { ...baseProduct, stock: 3 };
    const premiumCustomer = { ...regularCustomer, membershipLevel: 'premium' };
    const price = calculatePrice(lowStockProduct, premiumCustomer);
    // First 10% discount (premium), then 10% increase (low stock)
    expect(price).toBe(99); // 100 * 0.9 * 1.1 = 99
  });

  it('does not apply repeat discount if last purchase > 30 days', () => {
    const oldCustomer = {
      ...regularCustomer,
      lastPurchase: new Date(Date.now() - 45 * 24 * 60 * 60 * 1000), // 45 days
    };
    const price = calculatePrice(baseProduct, oldCustomer);
    expect(price).toBe(100); // No discount
  });

  it('rounds to 2 decimal places', () => {
    const product = { ...baseProduct, basePrice: 99.999 };
    const price = calculatePrice(product, regularCustomer);
    expect(price).toBe(100); // Rounded
  });
});
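The date-based tests above compute "15 days ago" from `Date.now()`, so the inputs change on every run, and they never probe the exact 30-day boundary. The sketch below pins that boundary with a fixed clock. It rests on assumptions: `getDaysSince` is not shown in this document, so `daysBetween` is a hypothetical stand-in that counts whole elapsed days, and `qualifiesForRepeatDiscount` only mirrors the `daysSinceLastPurchase < 30` condition from `calculatePrice`.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed stand-in for getDaysSince, with the current time passed in explicitly.
function daysBetween(earlier: Date, now: Date): number {
  return Math.floor((now.getTime() - earlier.getTime()) / MS_PER_DAY);
}

// Mirrors the `lastPurchase` and `daysSinceLastPurchase < 30` checks in calculatePrice.
function qualifiesForRepeatDiscount(lastPurchase: Date | null, now: Date): boolean {
  return lastPurchase !== null && daysBetween(lastPurchase, now) < 30;
}

// A fixed "today" keeps every run identical: 2024-03-01 at UTC midnight.
const now = new Date(Date.UTC(2024, 2, 1));
const daysAgo = (n: number): Date => new Date(now.getTime() - n * MS_PER_DAY);

// Probe one day on either side of the boundary.
const boundary = [29, 30, 31].map((n) => ({
  daysAgo: n,
  discounted: qualifiesForRepeatDiscount(daysAgo(n), now),
}));

// 29 days ago qualifies; 30 and 31 do not, because the comparison is strict.
console.log(boundary);
```

If the legacy signature cannot accept a clock parameter, Jest's fake timers (`jest.useFakeTimers()` followed by `jest.setSystemTime(...)`) are one way to freeze `Date.now()` from inside the test instead.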

Example 2: Characterizing Date Logic

// Legacy function with unclear date handling
function getInvoiceDueDate(invoiceDate: Date, terms: string): Date {
  const date = new Date(invoiceDate);

  switch (terms) {
    case 'NET30':
      date.setDate(date.getDate() + 30);
      break;
    case 'NET60':
      date.setDate(date.getDate() + 60);
      break;
    case 'EOM':
      date.setMonth(date.getMonth() + 1);
      date.setDate(0); // Day 0 of the following month, i.e. the last day of the invoice's month
      break;
    case 'immediate':
      // No change
      break;
    default:
      date.setDate(date.getDate() + 30);
  }

  // Skip weekends? This behavior is unclear
  while (date.getDay() === 0 || date.getDay() === 6) {
    date.setDate(date.getDate() + 1);
  }

  return date;
}

// Characterization tests document actual behavior.
// Dates use the local-time constructor (months are 0-based) because the function
// works in local time; new Date('2024-01-15') would be parsed as UTC midnight.
describe('getInvoiceDueDate', () => {
  it('NET30: adds 30 days', () => {
    const invoice = new Date(2024, 0, 15); // Jan 15, 2024
    const due = getInvoiceDueDate(invoice, 'NET30');
    expect(due).toEqual(new Date(2024, 1, 14)); // Feb 14, 2024 (Wednesday)
  });

  it('NET30: skips weekend', () => {
    const invoice = new Date(2024, 0, 12); // Friday
    const due = getInvoiceDueDate(invoice, 'NET30');
    // 30 days later is Feb 11 (Sunday), which moves to Monday
    expect(due).toEqual(new Date(2024, 1, 12));
  });

  it('EOM: end of month from mid-month', () => {
    const invoice = new Date(2024, 0, 15);
    const due = getInvoiceDueDate(invoice, 'EOM');
    expect(due).toEqual(new Date(2024, 0, 31));
  });

  it('EOM: handles February (non-leap year)', () => {
    const invoice = new Date(2023, 1, 15); // Feb 15, 2023
    const due = getInvoiceDueDate(invoice, 'EOM');
    expect(due).toEqual(new Date(2023, 1, 28)); // Tuesday
  });

  it('EOM: handles February (leap year)', () => {
    const invoice = new Date(2024, 1, 15); // Feb 15, 2024
    const due = getInvoiceDueDate(invoice, 'EOM');
    expect(due).toEqual(new Date(2024, 1, 29)); // Thursday
  });

  it('EOM: end of month falls on weekend, moves to Monday', () => {
    const invoice = new Date(2024, 5, 15); // Jun 15, 2024
    const due = getInvoiceDueDate(invoice, 'EOM');
    // Jun 30, 2024 is a Sunday, so the due date rolls over to Monday, Jul 1
    expect(due).toEqual(new Date(2024, 6, 1));
  });

  it('immediate: returns same date if weekday', () => {
    const invoice = new Date(2024, 0, 15); // Monday
    const due = getInvoiceDueDate(invoice, 'immediate');
    expect(due).toEqual(new Date(2024, 0, 15));
  });

  it('immediate: moves to Monday if invoice date is weekend', () => {
    const invoice = new Date(2024, 0, 13); // Saturday
    const due = getInvoiceDueDate(invoice, 'immediate');
    expect(due).toEqual(new Date(2024, 0, 15)); // Monday
  });

  it('unknown terms: defaults to NET30', () => {
    const invoice = new Date(2024, 0, 15);
    const due = getInvoiceDueDate(invoice, 'UNKNOWN');
    expect(due).toEqual(new Date(2024, 1, 14));
  });
});
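Characterizing inputs near the end of a month exposes behavior that a mid-month test does not. The sketch below repeats `getInvoiceDueDate` exactly as listed above so that it runs on its own; the `ymd` formatter and the chosen invoice dates are additions made here for illustration. Dates are built with `new Date(year, monthIndex, day)` because the function reads and writes local-time fields, whereas a date-only string such as `'2024-01-31'` is parsed as UTC midnight.

```typescript
function getInvoiceDueDate(invoiceDate: Date, terms: string): Date {
  const date = new Date(invoiceDate);

  switch (terms) {
    case "NET30":
      date.setDate(date.getDate() + 30);
      break;
    case "NET60":
      date.setDate(date.getDate() + 60);
      break;
    case "EOM":
      date.setMonth(date.getMonth() + 1);
      date.setDate(0);
      break;
    case "immediate":
      break;
    default:
      date.setDate(date.getDate() + 30);
  }

  while (date.getDay() === 0 || date.getDay() === 6) {
    date.setDate(date.getDate() + 1);
  }

  return date;
}

// Illustrative formatter: local year-month-day without zero padding.
const ymd = (d: Date): string =>
  [d.getFullYear(), d.getMonth() + 1, d.getDate()].join("-");

// Mid-month invoice: due on the last day of the invoice's own month.
const midMonth = getInvoiceDueDate(new Date(2024, 0, 15), "EOM");
console.log(ymd(midMonth)); // 2024-1-31

// Invoice on Jan 31: setMonth(1) overflows "Feb 31" into Mar 2,
// then setDate(0) steps back to the last day of February.
const lastDay = getInvoiceDueDate(new Date(2024, 0, 31), "EOM");
console.log(ymd(lastDay)); // 2024-2-29
```

Whether a Jan 31 invoice should be due on Jan 31 or Feb 29 is a question for stakeholders. The characterization test only records that the current answer is Feb 29.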

Example 3: Characterizing Validation Logic

// Legacy email validation with unclear rules
function isValidEmail(email: string): boolean {
  if (!email || email.length === 0) return false;
  if (email.length > 254) return false;
  if (!email.includes('@')) return false;

  const parts = email.split('@');
  if (parts.length !== 2) return false;

  const [local, domain] = parts;
  if (local.length === 0 || local.length > 64) return false;
  if (domain.length === 0 || domain.length > 255) return false;

  if (local.startsWith('.') || local.endsWith('.')) return false;
  if (local.includes('..')) return false;

  if (!domain.includes('.')) return false;

  return true;
}

// Characterization tests
describe('isValidEmail', () => {
  // Valid cases
  it('accepts standard email', () => {
    expect(isValidEmail('user@example.com')).toBe(true);
  });

  it('accepts email with dots in local part', () => {
    expect(isValidEmail('first.last@example.com')).toBe(true);
  });

  it('accepts email with plus sign', () => {
    // Discovered: Plus signs ARE allowed
    expect(isValidEmail('user+tag@example.com')).toBe(true);
  });

  it('accepts email with subdomain', () => {
    expect(isValidEmail('user@mail.example.com')).toBe(true);
  });

  // Invalid cases
  it('rejects email without @', () => {
    expect(isValidEmail('userexample.com')).toBe(false);
  });

  it('rejects email with multiple @', () => {
    expect(isValidEmail('user@@example.com')).toBe(false);
  });

  it('rejects email starting with dot', () => {
    expect(isValidEmail('.user@example.com')).toBe(false);
  });

  it('rejects email ending with dot', () => {
    expect(isValidEmail('user.@example.com')).toBe(false);
  });

  it('rejects email with consecutive dots', () => {
    expect(isValidEmail('user..name@example.com')).toBe(false);
  });

  it('rejects email with no domain extension', () => {
    expect(isValidEmail('user@example')).toBe(false);
  });

  it('rejects email over 254 characters', () => {
    // Short local part, so only the total-length check can reject this (255 chars)
    const longEmail = 'a@' + 'b'.repeat(249) + '.com';
    expect(isValidEmail(longEmail)).toBe(false);
  });

  it('rejects email with local part over 64 characters', () => {
    const longLocal = 'a'.repeat(65) + '@example.com';
    expect(isValidEmail(longLocal)).toBe(false);
  });

  // Edge cases discovered during characterization
  it('rejects empty string', () => {
    expect(isValidEmail('')).toBe(false);
  });

  it('rejects null', () => {
    expect(isValidEmail(null as any)).toBe(false);
  });

  it('accepts a domain with a non-public TLD', () => {
    // Discovered: The TLD is not validated; any domain containing a dot passes
    expect(isValidEmail('user@localhost.local')).toBe(true);
  });
});
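Probing beyond the obvious cases tends to surface behavior nobody specified. The sketch below repeats `isValidEmail` as listed above so that it runs on its own, then checks a small table of inputs. The `observed` table and its entries are illustrative choices made here, not cases from an existing suite; each expected value records what this implementation does, not what an email standard requires.

```typescript
function isValidEmail(email: string): boolean {
  if (!email || email.length === 0) return false;
  if (email.length > 254) return false;
  if (!email.includes("@")) return false;

  const parts = email.split("@");
  if (parts.length !== 2) return false;

  const [local, domain] = parts;
  if (local.length === 0 || local.length > 64) return false;
  if (domain.length === 0 || domain.length > 255) return false;

  if (local.startsWith(".") || local.endsWith(".")) return false;
  if (local.includes("..")) return false;

  if (!domain.includes(".")) return false;

  return true;
}

// Each entry pairs an input with the result this implementation currently returns.
const observed: Array<[string, boolean]> = [
  ["user@.com", true], // a domain may start with a dot
  ["user@example..com", true], // consecutive dots are only checked in the local part
  ["user name@example.com", true], // whitespace is not rejected
  ["a".repeat(64) + "@example.com", true], // local part of exactly 64 characters
  ["a".repeat(65) + "@example.com", false], // one character over the limit
];

// Any entry listed here means the behavior no longer matches the recorded table.
const mismatches = observed.filter(
  ([email, expected]) => isValidEmail(email) !== expected,
);
console.log(mismatches.length === 0 ? "behavior unchanged" : mismatches);
```

A table like this maps directly onto individual `it` blocks, with each comment becoming the test name.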

Best Practices

1. Start with Happy Path, Then Edge Cases

// Order of characterization
describe('calculateShipping', () => {
  // 1. Happy path
  it('calculates standard shipping for normal order', () => {
    expect(calculateShipping({ weight: 5, distance: 100 })).toBe(15);
  });

  // 2. Common variations
  it('calculates express shipping', () => {
    expect(calculateShipping({ weight: 5, distance: 100, express: true })).toBe(30);
  });

  // 3. Edge cases
  it('handles zero weight', () => {
    expect(calculateShipping({ weight: 0, distance: 100 })).toBe(10); // Minimum fee
  });

  it('handles international shipping', () => {
    expect(calculateShipping({ weight: 5, distance: 5000 })).toBe(125);
  });

  // 4. Unexpected behavior
  it('clamps maximum shipping cost', () => {
    // Discovered: Never charges more than $200
    expect(calculateShipping({ weight: 1000, distance: 10000 })).toBe(200);
  });
});

2. Document Surprising Behavior

it('truncates totals instead of rounding', () => {
  // Note: This truncates instead of rounding, which looks like a bug,
  // but changing it would affect all historical orders
  // TODO: Fix in v2 of pricing engine
  expect(calculateTotal([99.99])).toBe(99);
});

3. Use Descriptive Test Names

// Bad
it('test1', () => { /* ... */ });

// Good
it('applies senior discount for customers age 65+', () => { /* ... */ });
it('waives shipping fee for orders over $50', () => { /* ... */ });
it('rejects orders with more than 100 line items', () => { /* ... */ });

// Group related behaviors with nested describe blocks
describe('calculateTax', () => {
  describe('when customer is in California', () => {
    it('applies state tax rate of 7.25%', () => { /* ... */ });
    it('adds local tax for LA county', () => { /* ... */ });
    it('exempts groceries from tax', () => { /* ... */ });
  });

  describe('when customer is in Oregon', () => {
    it('applies no sales tax', () => { /* ... */ });
  });

  describe('when customer is international', () => {
    it('applies no US tax', () => { /* ... */ });
    it('includes VAT ID in invoice', () => { /* ... */ });
  });
});

Moving from Characterization to Specification

Characterization tests are a starting point, not an end goal:

1. Start with characterization - Document what code does now

2. Understand the domain - Talk to stakeholders, read docs, analyze business logic

3. Identify bugs vs. features - Some "characterized" behavior is actually bugs

4. Refactor safely - With tests in place, you can now refactor

5. Update to specification tests - Gradually replace characterization with tests of correct behavior

// Characterization test
it('calculates discount', () => {
  expect(calculateDiscount(100, 'VIP')).toBe(23);
});

// After understanding requirements, becomes specification test
it('applies 20% VIP discount plus 3% loyalty bonus', () => {
  const discount = calculateDiscount(100, 'VIP');
  expect(discount).toBe(23); // 20% + 3% = 23%
});

// Then you might discover and fix a bug
it('applies 20% VIP discount plus 5% loyalty bonus', () => {
  const discount = calculateDiscount(100, 'VIP');
  expect(discount).toBe(25); // Fixed: Should be 25%, not 23%
});
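The "refactor safely" step can also be checked directly by running the old and new implementations side by side over the characterized inputs. This is a minimal sketch under stated assumptions: `findBehaviorChanges` is a hypothetical helper, and `legacyDiscount` and `refactoredDiscount` are invented stand-ins that differ only in how they round, not the real `calculateDiscount`.

```typescript
type Change<I, O> = { input: I; before: O; after: O };

// Hypothetical helper: list the inputs on which two implementations disagree.
function findBehaviorChanges<I, O>(
  legacy: (input: I) => O,
  refactored: (input: I) => O,
  inputs: I[],
): Array<Change<I, O>> {
  const changes: Array<Change<I, O>> = [];
  for (const input of inputs) {
    const before = legacy(input);
    const after = refactored(input);
    // Comparing JSON text is a simplification that suits plain data,
    // not Dates, Maps, or functions.
    if (JSON.stringify(before) !== JSON.stringify(after)) {
      changes.push({ input, before, after });
    }
  }
  return changes;
}

// Invented stand-ins: a 23% discount, truncated in the legacy version
// and rounded in the refactored one.
const legacyDiscount = (amount: number): number => Math.floor((amount * 23) / 100);
const refactoredDiscount = (amount: number): number => Math.round((amount * 23) / 100);

const changes = findBehaviorChanges(legacyDiscount, refactoredDiscount, [100, 10, 50]);
console.log(changes); // only input 50 differs: 11 before, 12 after
```

An empty result suggests the refactoring preserved behavior for the inputs tried. It says nothing about inputs that were never characterized.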

Key Takeaways

Characterization tests document current behavior without judging correctness
They provide a safety net for refactoring legacy code
Write each test by guessing the output, running it to see the actual output, then updating the assertion
Focus on observable inputs and outputs, not internal implementation
Cover happy paths, common variations, and edge cases
Document surprising or questionable behavior with comments
Characterization tests are a means to an end; eventually replace them with specification tests
They're essential when requirements are unclear or lost to time