2025

Document Purchasing API with batch + webhooks

Python · Django · AWS · Webhooks

Internal names, providers, and exact numbers have been abstracted or generalized for confidentiality — the architecture patterns and trade-offs described are accurate.

Context

The credit product depends on aggregated regulatory and financial documents tied to a company (CNPJ) or person (CPF) — debt history, tax compliance, judicial records, and similar artifacts. These come from many providers, each with its own API, rate limits, auth scheme, and billing model. Documents are needed during onboarding and continuously through the lifetime of the credit relationship.

The original flow was synchronous and per-document: a calling service kicked off an analysis, looped over providers in order, and blocked. That had three problems:

  1. Latency. Sequential calls to slow providers blocked the analysis path.
  2. Cost. No deduplication — the same CNPJ could be re-fetched within minutes by different consumers.
  3. No batch path. When operations needed to refresh thousands of CNPJs overnight, there was nothing purpose-built.

The Document Purchasing API was extracted as a dedicated service to fix all three.

Architecture

caller service ──► Document Purchasing API ──┬─► provider adapter A
                          │                  ├─► provider adapter B
                          │                  └─► provider adapter ...
                          │
                          ├─► document store
                          │
                          └─► webhook dispatcher ──► caller callback

Surface

Two endpoints cover both consumption styles:

  • POST /purchases — accepts a list of {document_type, target} items, returns a purchase_id, and registers a callback URL.
  • GET /purchases/{id} — synchronous read for callers that prefer polling.
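A round trip through the two endpoints might look like the sketch below. All field names, IDs, and statuses here are illustrative, not the service's actual schema:

```python
# Hypothetical request/response shapes for the two endpoints.
purchase_request = {
    "idempotency_key": "batch-overnight-refresh-001",
    "callback_url": "https://caller.example.com/webhooks/documents",
    "items": [
        {"document_type": "tax_compliance", "target": "12345678000195"},  # CNPJ
        {"document_type": "judicial_records", "target": "12345678901"},   # CPF
    ],
}

# POST /purchases -> accepted; work continues in the background
purchase_response = {"purchase_id": "pur_abc123", "status": "processing"}

# GET /purchases/pur_abc123 -> per-item status for callers that poll
status_response = {
    "purchase_id": "pur_abc123",
    "status": "partial",
    "items": [
        {"target": "12345678000195", "status": "done"},
        {"target": "12345678901", "status": "pending"},
    ],
}
```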

Async core

Each batch fans out into per-document jobs on a queue. Workers pull jobs, dispatch through the right provider adapter, persist the response, and signal completion. The webhook dispatcher fires the callback once all jobs in a batch reach a terminal state.
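The fan-out can be sketched roughly as follows, with a plain queue and dicts standing in for the real broker, adapter registry, and document store — a minimal illustration of the shape, not the production code:

```python
import queue

jobs = queue.Queue()  # stand-in for the real job queue

def enqueue_batch(purchase_id, items):
    # Fan out: one job per document; the batch is done when all jobs are terminal.
    for item in items:
        jobs.put({"purchase_id": purchase_id, **item})

def handle_job(job, adapters, store, on_job_done):
    # One unit of work: dispatch via the right adapter, persist, signal.
    adapter = adapters[job["document_type"]]       # provider adapter lookup
    document = adapter.fetch(job["target"])        # provider-specific call
    store[(job["purchase_id"], job["target"])] = document
    on_job_done(job["purchase_id"])                # completion signal; the webhook
                                                   # dispatcher fires once the whole
                                                   # batch is terminal
```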

Idempotency

Two layers:

  • Batch-level: a caller-supplied idempotency key short-circuits repeated batches within a TTL.
  • Document-level: a document fetched within the last N hours is served from cache instead of re-billing the provider.
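The two layers can be sketched with in-memory dicts standing in for the real store; the TTL values below are placeholder assumptions, not the service's tuned windows:

```python
import time

BATCH_TTL = 24 * 3600  # idempotency-key window (assumed value)
DOC_TTL = 6 * 3600     # the "N hours" document freshness window (assumed value)

batches = {}    # idempotency_key -> (purchase_id, created_at)
documents = {}  # (document_type, target) -> (payload, fetched_at)

def get_or_create_batch(idempotency_key, create):
    # Batch layer: replaying the same key within the TTL returns the
    # original purchase instead of creating a new one.
    hit = batches.get(idempotency_key)
    if hit and time.time() - hit[1] < BATCH_TTL:
        return hit[0]
    purchase_id = create()
    batches[idempotency_key] = (purchase_id, time.time())
    return purchase_id

def fetch_document(document_type, target, provider_fetch):
    # Document layer: a fresh-enough document is served from cache,
    # so the provider is not re-billed.
    key = (document_type, target)
    hit = documents.get(key)
    if hit and time.time() - hit[1] < DOC_TTL:
        return hit[0]
    payload = provider_fetch(target)
    documents[key] = (payload, time.time())
    return payload
```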

Failure handling

Per-job exponential backoff with a cap, a dead-letter queue for terminal failures, and a separate replay endpoint so operators can manually retrigger DLQ items after a provider incident.
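A capped-backoff retry loop that parks terminal failures for later replay might look like this; the retry count, base delay, and cap are illustrative defaults, not the service's tuned values:

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    # Exponential backoff with a cap and full jitter.
    return random.uniform(0, min(cap, base * 2 ** attempt))

def run_job(job, execute, dead_letter, max_attempts=5, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return execute(job)
        except Exception:
            sleep(backoff_delay(attempt))
    # Terminal failure: park the job so operators can replay it
    # after a provider incident.
    dead_letter.append(job)
    return None
```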

Webhook delivery

Each callback is signed with HMAC and retried on 5xx responses with exponential backoff, and a manual replay endpoint lets receivers request redelivery. Delivery state is tracked per attempt.
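The signing scheme can be sketched with Python's standard library; the secret, header format, and hash choice (HMAC-SHA256) are assumptions about the wire contract, not its actual details:

```python
import hashlib
import hmac

SECRET = b"shared-webhook-secret"  # provisioned per receiver (assumed)

def sign(body: bytes) -> str:
    # Sender side: signature computed over the raw request body.
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()

def verify(body: bytes, signature: str) -> bool:
    # Receiver side: constant-time comparison to avoid timing attacks.
    return hmac.compare_digest(sign(body), signature)
```

Verifying against the raw bytes (before any JSON parsing) matters: re-serializing the payload on the receiver's side can reorder keys or change whitespace and break the signature.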

Trade-offs

Async-first instead of sync API. Every caller now has to handle webhooks or poll. In return, slow providers no longer dictate the latency of the calling service, and ops batch jobs reuse the same code path as live traffic.

Cache by document, not by request. A single batch can split into partial-cache, partial-fetch — more complex hit/miss accounting. The benefit is significant provider spend reduction for hot CNPJs (the same large company gets analyzed across many credit operations).

HMAC-signed webhooks instead of mTLS. Receivers must verify signatures correctly, which pushed some education onto consumers. The benefit was avoiding cert lifecycle management across the receiver fleet, which was the larger operational cost.

Outcome

  • Latency. Analyses that depend on slow providers no longer block the calling service — the caller keeps moving while the API works in the background.
  • Provider spend. Hot CNPJs (large companies analyzed across many credit operations) get served from cache rather than re-billing the provider on every request.
  • Operability. Mass refreshes that previously required custom scripts became self-serve through the same replay/retrigger surface used for incident recovery.
  • Footprint. A single API now backs onboarding, ongoing monitoring, and ops batch jobs — collapsing what used to be three different code paths.