How a statement becomes a report
A short, honest note on what we read, what we send to the model, what the model infers, what we keep, and where the limits sit.
What we read
The uploaded PDF is parsed entirely within the server process that handles the request. We use unpdf, which ships a serverless build of PDF.js, to extract the document's text content. Images, scanned pages, and embedded fonts are ignored; only the text layer, which carries the transaction log, is used.
For most digital bank statements (Revolut, Wise, N26, Erste, PBZ, Zaba, OTP, and the major US incumbents), this yields a clean stream of lines with date, description, and amount. Scanned statements without OCR will fail with a clear error.
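As a rough illustration of what "a clean stream of lines" means in practice, the sketch below turns extracted text into date, description, and amount rows. It is not our production parser: the `DD.MM.YYYY description amount` layout, the `ParsedLine` type, and both function names are assumptions made for the example, and real statements differ from bank to bank.

```typescript
// Hypothetical shape for one ledger row; the names are illustrative only.
interface ParsedLine {
  date: string; // ISO date, e.g. "2024-03-14"
  description: string;
  amount: number; // negative for debits
}

// Assumed layout: "DD.MM.YYYY  description  -1,234.56".
const LINE_PATTERN = /^(\d{2})\.(\d{2})\.(\d{4})\s+(.+?)\s+(-?\d[\d,]*\.\d{2})$/;

// Returns null for lines that are not transactions (headers, footers, blanks).
function parseStatementLine(line: string): ParsedLine | null {
  const match = LINE_PATTERN.exec(line.trim());
  if (!match) return null;
  const [, day, month, year, description, rawAmount] = match;
  return {
    date: `${year}-${month}-${day}`,
    description: description.trim(),
    amount: Number(rawAmount.replace(/,/g, "")),
  };
}

// A scanned PDF without OCR yields no text, so fail loudly instead of
// returning an empty report.
function parseStatementText(text: string): ParsedLine[] {
  if (text.trim().length === 0) {
    throw new Error("No text layer found: scanned statements need OCR first.");
  }
  return text
    .split("\n")
    .map((line) => parseStatementLine(line))
    .filter((row): row is ParsedLine => row !== null);
}
```

Lines that do not match the assumed layout are skipped rather than guessed at, which is why unusual formatting can drop or merge rows.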
What we send
Only the extracted text is sent to OpenAI's API. The original PDF never leaves the server process that parsed it. We pass the text to gpt-4o together with a JSON schema describing transactions, totals, category aggregates, and insights.
The schema is enforced through the model's structured-output mode, so every response arrives in the same shape and can be validated on the server before rendering.
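To make the schema idea concrete, here is a minimal sketch of one transaction schema and a server-side check of the parsed response. Everything named here is an assumption for illustration: the five categories stand in for the real seventeen, the `transactions` wrapper key is hypothetical, and the real schema also covers totals, aggregates, and insights. The schema follows the strict structured-output convention of listing every property as required and disallowing extras.

```typescript
// Illustrative subset; the real category list is longer and may differ.
const CATEGORIES = ["groceries", "rent", "subscriptions", "income", "other"] as const;
type Category = (typeof CATEGORIES)[number];

interface Transaction {
  date: string;
  description: string;
  amount: number;
  category: Category;
}

// JSON Schema for one transaction, written the way strict mode expects.
const transactionSchema = {
  type: "object",
  properties: {
    date: { type: "string" },
    description: { type: "string" },
    amount: { type: "number" },
    category: { type: "string", enum: CATEGORIES },
  },
  required: ["date", "description", "amount", "category"],
  additionalProperties: false,
};

// Runtime guard: the response is checked again before anything is rendered.
function isTransaction(value: unknown): value is Transaction {
  if (typeof value !== "object" || value === null) return false;
  const row = value as Record<string, unknown>;
  return (
    typeof row.date === "string" &&
    typeof row.description === "string" &&
    typeof row.amount === "number" &&
    Number.isFinite(row.amount) &&
    typeof row.category === "string" &&
    (CATEGORIES as readonly string[]).includes(row.category)
  );
}

// Assumes the model returns { "transactions": [...] } (hypothetical key).
function parseModelResponse(json: string): Transaction[] {
  const parsed: unknown = JSON.parse(json);
  const rows = (parsed as { transactions?: unknown } | null)?.transactions;
  if (!Array.isArray(rows) || !rows.every(isTransaction)) {
    throw new Error("Model response does not match the expected schema.");
  }
  return rows;
}
```

Structured output fixes the shape of the response, not the correctness of its values, which is why a separate validation step is still worth having.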
What we infer
The model categorises each line into one of seventeen categories, infers the statement's period, computes income / spending / net, and writes three to five concrete insights — each tied to a specific pattern with an annualised savings figure where one can be estimated.
Insights are ranked by potential annual impact, not chronology.
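The arithmetic behind totals, annualised figures, and ranking can be sketched as follows. In the service the model produces these values; this example only shows the calculations they should agree with. The sign convention (positive is income), the 365-day scaling, and all type and function names are assumptions for the example.

```typescript
// Assumed convention: positive amounts are income, negative are spending.
interface AmountRow {
  amount: number;
}

// annualSavings is null when no estimate can be made for the insight.
interface Insight {
  title: string;
  annualSavings: number | null;
}

function computeTotals(rows: AmountRow[]): { income: number; spending: number; net: number } {
  let income = 0;
  let spending = 0;
  for (const { amount } of rows) {
    if (amount >= 0) income += amount;
    else spending += -amount;
  }
  return { income, spending, net: income - spending };
}

// Scale a saving observed over the statement period to a full year.
function annualise(savingInPeriod: number, periodDays: number): number {
  if (periodDays <= 0) throw new Error("periodDays must be positive");
  return (savingInPeriod / periodDays) * 365;
}

// Highest estimated annual impact first; insights without an estimate are
// treated as zero impact. Returns a new array and leaves the input untouched.
function rankInsights(insights: Insight[]): Insight[] {
  const impact = (insight: Insight): number => insight.annualSavings ?? 0;
  return [...insights].sort((a, b) => impact(b) - impact(a));
}
```

For example, a 30-unit saving seen in a 30-day statement scales to 365 units per year under this linear assumption, which overstates one-off charges and is one reason the figures are estimates.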
What we keep
For signed-in users: your email, a counter of analyses used, and the structured analysis result for each statement you process — so you can re-open past reports in your account. Each report is stored as JSON (categorised transactions, totals, insights), never the original PDF.
Reports are visible only to you (enforced at the database layer via row-level security). You can delete any report from your dashboard at any time — deletion is immediate and permanent.
OpenAI's API retains requests for up to 30 days for abuse monitoring under its default policy. If this is a constraint, self-host or use a zero-retention API agreement.
Where the limits are
- Scanned PDFs without an OCR layer cannot be read.
- Multi-currency statements are coerced to the dominant currency; conversions are approximate.
- The model occasionally splits or merges transaction lines with unusual formatting. Spot-check the ledger if a total looks wrong.
- Insights are heuristic. They are prompts for reflection, not financial advice.
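One way to spot-check a ledger is to sum the rows and compare the result with the total shown on the statement. The sketch below is an illustration under assumptions, not a feature of the report: the `CentRow` type, the `reconcile` function, and the use of integer cents (to avoid floating-point drift) are all choices made for the example.

```typescript
// Amounts are held as integer cents so sums compare exactly.
interface CentRow {
  description: string;
  amountCents: number; // negative for spending
}

interface ReconcileResult {
  ledgerNetCents: number;
  differenceCents: number; // ledger minus reported
  matches: boolean;
}

// Compare the summed ledger with the net figure printed on the statement.
function reconcile(
  rows: CentRow[],
  reportedNetCents: number,
  toleranceCents = 0
): ReconcileResult {
  const ledgerNetCents = rows.reduce((sum, row) => sum + row.amountCents, 0);
  const differenceCents = ledgerNetCents - reportedNetCents;
  return {
    ledgerNetCents,
    differenceCents,
    matches: Math.abs(differenceCents) <= toleranceCents,
  };
}
```

A non-zero difference equal to a single row's amount usually points at a split or merged line; for multi-currency statements a small tolerance is more realistic, because conversions are approximate.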
Ready to try it?
Three analyses free. No credit card required.