Characterization Tests
Characterization tests are a special type of test designed for legacy code. Unlike traditional tests that verify correct behavior, characterization tests document current behavior—whatever the code actually does right now. This distinction is crucial when working with legacy systems where the "correct" behavior may be unclear, undocumented, or encoded only in the running code itself.
These tests act as a safety net, detecting when refactoring accidentally changes behavior.
What Are Characterization Tests?
Definition
A characterization test captures the actual behavior of existing code without making judgments about whether that behavior is correct.
Traditional test: "Given valid input, the function should return the correct calculated discount."
Characterization test: "Given this specific input, the function currently returns 0.23. We don't know if that's 'correct,' but it's what happens now."
Why They Matter
When working with legacy code:
- You don't fully understand the requirements
- Original developers are gone
- Documentation is missing or outdated
- The running code is the only reliable source of truth
- You need to refactor without changing behavior
Characterization tests let you refactor confidently by alerting you when behavior changes—even if you don't yet understand whether that behavior is correct.
Writing Characterization Tests
The Process
1. Write a test with a guess at the output
describe('calculateLoyaltyPoints', () => {
it('calculates points for $100 purchase', () => {
const points = calculateLoyaltyPoints(100, 'gold');
expect(points).toBe(0); // We don't know the real value
});
});
2. Run the test and let it fail
Expected: 0
Received: 150
3. Update test with actual value
it('calculates points for $100 purchase', () => {
const points = calculateLoyaltyPoints(100, 'gold');
expect(points).toBe(150); // Now matches actual behavior
});
4. Run test again to verify it passes
5. Repeat for different inputs to characterize behavior
describe('calculateLoyaltyPoints', () => {
it('calculates points for gold member', () => {
expect(calculateLoyaltyPoints(100, 'gold')).toBe(150);
});
it('calculates points for silver member', () => {
expect(calculateLoyaltyPoints(100, 'silver')).toBe(100);
});
it('calculates points for regular member', () => {
expect(calculateLoyaltyPoints(100, 'regular')).toBe(50);
});
it('handles edge case: zero purchase', () => {
expect(calculateLoyaltyPoints(0, 'gold')).toBe(0);
});
it('handles edge case: negative amount', () => {
expect(calculateLoyaltyPoints(-50, 'gold')).toBe(0);
});
});
Focus on Inputs and Outputs
Characterization tests should:
- Test observable behavior - What users see, not internal implementation
- Cover representative cases - Common scenarios and edge cases
- Be specific and concrete - Use real example data, not random values
- Document discovered behavior - Add comments explaining unexpected results
it('applies strange rounding that we discovered', () => {
// Note: This rounds down even for .99, seems like a bug but shipping it
expect(calculateTotal([2.99, 3.99])).toBe(6);
});
Real-World Applications
Example 1: Characterizing a Pricing Engine
// Legacy pricing function - unclear how it works
function calculatePrice(product: Product, customer: Customer): number {
let price = product.basePrice;
if (customer.membershipLevel === 'premium') {
price *= 0.9;
}
if (product.category === 'electronics' && customer.lastPurchase) {
const daysSinceLastPurchase = getDaysSince(customer.lastPurchase);
if (daysSinceLastPurchase < 30) {
price *= 0.95;
}
}
if (product.stock < 5) {
price *= 1.1;
}
return Math.round(price * 100) / 100;
}
// Characterization tests
describe('calculatePrice', () => {
const baseProduct = {
id: '1',
basePrice: 100,
category: 'electronics',
stock: 10,
};
const regularCustomer = {
id: '1',
membershipLevel: 'regular',
lastPurchase: null,
};
it('returns base price for regular customer, normal stock', () => {
const price = calculatePrice(baseProduct, regularCustomer);
expect(price).toBe(100);
});
it('applies 10% premium discount', () => {
const premiumCustomer = { ...regularCustomer, membershipLevel: 'premium' };
const price = calculatePrice(baseProduct, premiumCustomer);
expect(price).toBe(90); // 10% off
});
it('applies recent purchase discount on electronics', () => {
const recentCustomer = {
...regularCustomer,
lastPurchase: new Date(Date.now() - 15 * 24 * 60 * 60 * 1000), // 15 days ago
};
const price = calculatePrice(baseProduct, recentCustomer);
expect(price).toBe(95); // 5% repeat customer discount
});
it('stacks premium and repeat customer discounts', () => {
const premiumRecentCustomer = {
...regularCustomer,
membershipLevel: 'premium',
lastPurchase: new Date(Date.now() - 15 * 24 * 60 * 60 * 1000),
};
const price = calculatePrice(baseProduct, premiumRecentCustomer);
// 10% premium discount, then 5% repeat discount
expect(price).toBe(85.5);
});
it('increases price 10% when stock is low', () => {
const lowStockProduct = { ...baseProduct, stock: 3 };
const price = calculatePrice(lowStockProduct, regularCustomer);
expect(price).toBe(110); // 10% urgency premium
});
it('applies complex interaction: premium + low stock', () => {
const lowStockProduct = { ...baseProduct, stock: 3 };
const premiumCustomer = { ...regularCustomer, membershipLevel: 'premium' };
const price = calculatePrice(lowStockProduct, premiumCustomer);
// First 10% discount (premium), then 10% increase (low stock)
expect(price).toBe(99); // 100 * 0.9 * 1.1 = 99
});
it('does not apply repeat discount if last purchase > 30 days', () => {
const oldCustomer = {
...regularCustomer,
lastPurchase: new Date(Date.now() - 45 * 24 * 60 * 60 * 1000), // 45 days
};
const price = calculatePrice(baseProduct, oldCustomer);
expect(price).toBe(100); // No discount
});
it('rounds to 2 decimal places', () => {
const product = { ...baseProduct, basePrice: 99.999 };
const price = calculatePrice(product, regularCustomer);
expect(price).toBe(100); // Rounded
});
});
Example 2: Characterizing Date Logic
// Legacy function with unclear date handling
function getInvoiceDueDate(invoiceDate: Date, terms: string): Date {
const date = new Date(invoiceDate);
switch (terms) {
case 'NET30':
date.setDate(date.getDate() + 30);
break;
case 'NET60':
date.setDate(date.getDate() + 60);
break;
case 'EOM':
date.setMonth(date.getMonth() + 1);
date.setDate(0); // Last day of current month
break;
case 'immediate':
// No change
break;
default:
date.setDate(date.getDate() + 30);
}
// Skip weekends? This behavior is unclear
while (date.getDay() === 0 || date.getDay() === 6) {
date.setDate(date.getDate() + 1);
}
return date;
}
// Characterization tests document actual behavior
describe('getInvoiceDueDate', () => {
it('NET30: adds 30 days', () => {
const invoice = new Date('2024-01-15');
const due = getInvoiceDueDate(invoice, 'NET30');
expect(due).toEqual(new Date('2024-02-14'));
});
it('NET30: skips weekend', () => {
const invoice = new Date('2024-01-12'); // Friday
const due = getInvoiceDueDate(invoice, 'NET30');
// 30 days = Feb 11 (Sunday), should move to Monday
expect(due).toEqual(new Date('2024-02-12'));
});
it('EOM: end of month from mid-month', () => {
const invoice = new Date('2024-01-15');
const due = getInvoiceDueDate(invoice, 'EOM');
expect(due).toEqual(new Date('2024-01-31'));
});
it('EOM: handles February (non-leap year)', () => {
const invoice = new Date('2023-01-15');
const due = getInvoiceDueDate(invoice, 'EOM');
expect(due).toEqual(new Date('2023-01-31'));
});
it('EOM: handles February (leap year)', () => {
const invoice = new Date('2024-01-15');
const due = getInvoiceDueDate(invoice, 'EOM');
expect(due).toEqual(new Date('2024-01-31'));
});
it('EOM: end of month falls on weekend, moves to Monday', () => {
const invoice = new Date('2024-02-15');
const due = getInvoiceDueDate(invoice, 'EOM');
// Feb 29, 2024 is Thursday, so no weekend adjustment
expect(due).toEqual(new Date('2024-02-29'));
});
it('immediate: returns same date if weekday', () => {
const invoice = new Date('2024-01-15'); // Monday
const due = getInvoiceDueDate(invoice, 'immediate');
expect(due).toEqual(new Date('2024-01-15'));
});
it('immediate: moves to Monday if invoice date is weekend', () => {
const invoice = new Date('2024-01-13'); // Saturday
const due = getInvoiceDueDate(invoice, 'immediate');
expect(due).toEqual(new Date('2024-01-15')); // Monday
});
it('unknown terms: defaults to NET30', () => {
const invoice = new Date('2024-01-15');
const due = getInvoiceDueDate(invoice, 'UNKNOWN');
expect(due).toEqual(new Date('2024-02-14'));
});
});
Example 3: Characterizing Validation Logic
// Legacy email validation with unclear rules
function isValidEmail(email: string): boolean {
if (!email || email.length === 0) return false;
if (email.length > 254) return false;
if (!email.includes('@')) return false;
const parts = email.split('@');
if (parts.length !== 2) return false;
const [local, domain] = parts;
if (local.length === 0 || local.length > 64) return false;
if (domain.length === 0 || domain.length > 255) return false;
if (local.startsWith('.') || local.endsWith('.')) return false;
if (local.includes('..')) return false;
if (!domain.includes('.')) return false;
return true;
}
// Characterization tests
describe('isValidEmail', () => {
// Valid cases
it('accepts standard email', () => {
expect(isValidEmail('user@example.com')).toBe(true);
});
it('accepts email with dots in local part', () => {
expect(isValidEmail('first.last@example.com')).toBe(true);
});
it('accepts email with plus sign', () => {
// Discovered: Plus signs ARE allowed
expect(isValidEmail('user+tag@example.com')).toBe(true);
});
it('accepts email with subdomain', () => {
expect(isValidEmail('user@mail.example.com')).toBe(true);
});
// Invalid cases
it('rejects email without @', () => {
expect(isValidEmail('userexample.com')).toBe(false);
});
it('rejects email with multiple @', () => {
expect(isValidEmail('user@@example.com')).toBe(false);
});
it('rejects email starting with dot', () => {
expect(isValidEmail('.user@example.com')).toBe(false);
});
it('rejects email ending with dot', () => {
expect(isValidEmail('user.@example.com')).toBe(false);
});
it('rejects email with consecutive dots', () => {
expect(isValidEmail('user..name@example.com')).toBe(false);
});
it('rejects email with no domain extension', () => {
expect(isValidEmail('user@example')).toBe(false);
});
it('rejects email over 254 characters', () => {
const longEmail = 'a'.repeat(250) + '@example.com';
expect(isValidEmail(longEmail)).toBe(false);
});
it('rejects email with local part over 64 characters', () => {
const longLocal = 'a'.repeat(65) + '@example.com';
expect(isValidEmail(longLocal)).toBe(false);
});
// Edge cases discovered during characterization
it('rejects empty string', () => {
expect(isValidEmail('')).toBe(false);
});
it('rejects null', () => {
expect(isValidEmail(null as any)).toBe(false);
});
it('accepts technically invalid but commonly used emails', () => {
// Discovered: Doesn't validate TLD properly
expect(isValidEmail('user@localhost.local')).toBe(true);
});
});
Best Practices
1. Start with Happy Path, Then Edge Cases
// Order of characterization
describe('calculateShipping', () => {
// 1. Happy path
it('calculates standard shipping for normal order', () => {
expect(calculateShipping({ weight: 5, distance: 100 })).toBe(15);
});
// 2. Common variations
it('calculates express shipping', () => {
expect(calculateShipping({ weight: 5, distance: 100, express: true })).toBe(30);
});
// 3. Edge cases
it('handles zero weight', () => {
expect(calculateShipping({ weight: 0, distance: 100 })).toBe(10); // Minimum fee
});
it('handles international shipping', () => {
expect(calculateShipping({ weight: 5, distance: 5000 })).toBe(125);
});
// 4. Unexpected behavior
it('clamps maximum shipping cost', () => {
// Discovered: Never charges more than $200
expect(calculateShipping({ weight: 1000, distance: 10000 })).toBe(200);
});
});
2. Document Surprising Behavior
it('rounds prices in unexpected way', () => {
// Note: This truncates instead of rounding, seems like a bug
// but changing it would affect all historical orders
// TODO: Fix in v2 of pricing engine
expect(calculatePrice(99.99)).toBe(99);
});
3. Use Descriptive Test Names
// Bad
it('test1', () => { /* ... */ });
// Good
it('applies senior discount for customers age 65+', () => { /* ... */ });
it('waives shipping fee for orders over $50', () => { /* ... */ });
it('rejects orders with more than 100 line items', () => { /* ... */ });
4. Group Related Behaviors
describe('calculateTax', () => {
describe('when customer is in California', () => {
it('applies state tax rate of 7.25%', () => { /* ... */ });
it('adds local tax for LA county', () => { /* ... */ });
it('exempts groceries from tax', () => { /* ... */ });
});
describe('when customer is in Oregon', () => {
it('applies no sales tax', () => { /* ... */ });
});
describe('when customer is international', () => {
it('applies no US tax', () => { /* ... */ });
it('includes VAT ID in invoice', () => { /* ... */ });
});
});
Moving from Characterization to Specification
Characterization tests are a starting point, not an end goal:
1. Start with characterization - Document what code does now
2. Understand the domain - Talk to stakeholders, read docs, analyze business logic
3. Identify bugs vs. features - Some "characterized" behavior is actually bugs
4. Refactor safely - With tests in place, you can now refactor
5. Update to specification tests - Gradually replace characterization with tests of correct behavior
// Characterization test
it('calculates discount', () => {
expect(calculateDiscount(100, 'VIP')).toBe(23);
});
// After understanding requirements, becomes specification test
it('applies 20% VIP discount plus 3% loyalty bonus', () => {
const discount = calculateDiscount(100, 'VIP');
expect(discount).toBe(23); // 20% + 3% = 23%
});
// Then you might discover and fix a bug
it('applies 20% VIP discount plus 5% loyalty bonus', () => {
const discount = calculateDiscount(100, 'VIP');
expect(discount).toBe(25); // Fixed: Should be 25%, not 23%
});
Key Takeaways
- Characterization tests document current behavior without judging correctness
- They provide a safety net for refactoring legacy code
- Write tests by guessing output, running to see actual output, then updating test
- Focus on observable inputs and outputs, not internal implementation
- Cover happy paths, common variations, and edge cases
- Document surprising or questionable behavior with comments
- Characterization tests are a means to an end—eventually replace with specification tests
- They're essential when requirements are unclear or lost to time
Further Reading
- Working Effectively with Legacy Code - Chapter 13 on characterization tests
- The Art of Unit Testing - Testing strategies
- Legacy Code: Change Algorithm - Michael Feathers' blog post
- Refactoring - Safe code transformations after tests are in place