Financial Month-End Close Pipeline
Design a month end financial close and reporting pipeline for a multinational corporation with operations in 47 countries. The company's primary ERP is SAP, but 12 subsidiaries use local GL systems that export data vi...
Design a month-end financial close and reporting pipeline for a multinational corporation with operations in 47 countries. The company's primary ERP is SAP, but 12 subsidiaries use local GL systems that export data via nightly file drops. At the end of each month, finance teams in all 47 countries submit and approve journal entries, intercompany transactions between subsidiaries must be detected and eliminated before consolidation, all local currency amounts must be converted to USD at the correct FX rate (different rates for P&L vs Balance Sheet per ASC 830 / IAS 21), and consolidated financial statements (P&L, Balance Sheet, Cash Flow) must be produced in compliance with US GAAP. The entire close must complete within 5 business days of month-end. The CFO needs a live dashboard showing close progress across all entities. Auditors need to be able to trace any line item back to the originating journal entry. Design this data pipeline end to end.
How to Approach This Problem
What Makes This Problem Unique Month end close is one of the most domain specific data engineering design questions. Candidates without finance domain knowledge treat it as "just another ETL pipeline" and produce a generic answer. The interviewers are evaluating whether you understand the business process constraints — specifically, the three financial engineering hard problems that have no analogue in typical data pipelines. Hard problem 1: Intercompany elimination at scale When Entity A (US subsidiary) sells services to Entity B (UK subsidiary), Entity A records revenue and Entity B records an expense. For the consolidated group, this transaction should not exist — it is internal. Before p
Clarifying Questions
Functional Requirements Which GAAP standard: US GAAP (ASC 830) or IFRS (IAS 21)? (Determines FX rate selection rules) What level of reporting is required? Entity level statutory reports in local GAAP + group level consolidated under US GAAP? Are there intercompany transactions beyond just sales — loans, royalties, management fees? What is the entity hierarchy? (Is it a simple parent subsidiaries tree, or a complex ownership structure with minority interests?) Should minority interest (non controlling interest) be calculated and presented separately? Are there segment level reports required (geographic, business unit, product line)? Does the system need to support XBRL tagging for SEC filings
Envelope Estimation
Scale Estimates GL Journal Volume Component Volume Total entities 47 Journal lines per entity per month (avg) ~1.06M Total journal lines per month ~50M Average journal line size ~500 bytes (account, entity, amount, currency, description, posting date) Raw storage per month 50M × 500B = ~25GB/month With Parquet + Snappy compression ~3–5GB/month Intercompany Transaction Volume Intercompany (IC) transactions: typically 10–20% of total journal volume in a multinational 50M × 15% = ~7.5M IC journal lines per month IC pairs to detect: 47 entities × 46 counterparties / 2 = 1,081 potential entity pairs In practice, 80% of IC volume concentrated in ~50 active entity pairs FX Conversion Volume Unique
Architecture: The 5-Phase Close Pipeline
Five Phases of Month End Close The close pipeline has five distinct phases with clear dependencies and approval gates. Phase 1: Extract (Day 1, 08:00–12:00) Airflow DAG triggers on the first business day after period end. For each of 47 entities: SAP entities: extract via SAP BAPI (Business Application Programming Interface) or an Airflow SAP connector Local GL entities: wait for file drop in landing zone (S3/GCS/ADLS) Each entity extraction writes to the raw zone: Validation gate: Before the IC step, each entity's trial balance must balance: Phase 2: Intercompany Elimination (Day 1, 12:00–14:00) Detect all IC transactions, generate elimination journals: 1. Tag each journal line with interco
Component Deep Dive
Intercompany Elimination Engine The IC elimination engine is the most complex and most finance specific component. Detection Algorithm FX Conversion Engine
Data Modeling
Core Financial Data Models journal lines (source of truth — append only) fx rates (the FX rate table) financial snapshots (versioned, immutable per close) period locks (enforcing the close calendar)