UK finance professional · AI engineer

I build production AI systems for UK finance.

A finance professional who ships real AI: accounting automation, LLM evaluation, and agent tooling. Most AI builders cannot do finance, and most finance people cannot build. I do both - and the work is adopted, shipped, and measured.

Bilal Khizar - finance professional and AI engineer
  • ~350 PyPI downloads a month, on a library I authored
  • 166 UK nominal codes, fully documented
  • 95% median frontier-model score on a benchmark I authored for a Google DeepMind competition
  • AWS AIdeas semifinalist: top ~1,000 of 10,000+ entrants

Who you would be working with

About

I am a UK finance professional running automation and AI at a UK accountancy practice, and I build and ship production AI systems. I hold a BSc in Banking and International Finance from Bayes Business School.

What makes the work different is the overlap: I combine deep UK financial domain knowledge - accounting, VAT, HMRC, double-entry, payroll - with hands-on AI engineering. That means I can take a messy finance problem, model it correctly, and build the system that solves it. I am a published open-source author and a hackathon-winning builder.

What I can build for you

Services

Four things I deliver end to end, where deep UK financial domain knowledge and hands-on AI engineering are both required.

  • UK accounting & bookkeeping automation

    Transaction categorisation, compliance tracking, and Companies House API integration - grounded in real double-entry, VAT, and the UK Chart of Accounts that sits behind Sage, Xero, and HMRC filings.

  • AI & LLM system development

    End-to-end systems and integrations across OpenAI, Anthropic, Gemini, Bedrock, and Ollama - including agent tooling, OCR pipelines, and structured, auditable outputs.

  • LLM evaluation & red-teaming

    Benchmarking, evaluation harnesses, and adversarial testing - with a rare specialism in the finance and accounting domain, where correctness is not optional.

  • Full-stack product development

    React and TypeScript on the front, Python and FastAPI on the back, AWS serverless underneath. Shipped to the web, to iOS, and to live production APIs.

Selected work

Each piece answers one question: what could I deliver for you, and what proves I can?

Open source · adopted in production

uk-chart-of-accounts

A zero-dependency Python library providing the complete UK Chart of Accounts: 166 nominal codes, VAT treatments, HMRC box mappings, double-entry rules, and an LLM prompt export. It fills a genuine gap in the Python ecosystem - and real developers use it.

  • ~350 downloads / month (PyPI)
  • 166 / 166 codes documented
  • 0 dependencies

End-to-end AI system, inside finance

LedgerAgent

A multi-client AI bookkeeping system: LLM transaction categorisation against the UK Chart of Accounts, OCR receipt processing, confidence scoring with reasoning for audit trails, a double-entry ledger, and isolated client workspaces. Built on AWS with Bedrock, DynamoDB, S3, and Cognito.

  • AWS AIdeas semifinalist (top ~1,000 / 10,000+)
  • Audit-grade confidence and reasoning
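
A minimal sketch of how confidence scoring and a double-entry ledger can fit together. It is illustrative only and not LedgerAgent's actual code: the class and function names, the 0.80 review threshold, and the "RENT" and "BANK" account labels are all assumptions made for this example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional

# Hypothetical cut-off: anything below this goes to a human reviewer.
REVIEW_THRESHOLD = Decimal("0.80")


@dataclass(frozen=True)
class Categorisation:
    """Model output kept alongside the posting as an audit trail."""
    account: str          # placeholder label, not a real nominal code
    confidence: Decimal   # 0.00 to 1.00
    reasoning: str


@dataclass(frozen=True)
class JournalLine:
    account: str
    debit: Decimal = Decimal("0")
    credit: Decimal = Decimal("0")


def post_payment(amount: Decimal, result: Categorisation,
                 bank_account: str = "BANK") -> Optional[List[JournalLine]]:
    """Return a balanced two-line journal for a bank payment,
    or None when the categorisation is too uncertain to auto-post."""
    if result.confidence < REVIEW_THRESHOLD:
        return None
    lines = [
        JournalLine(result.account, debit=amount),   # expense goes up
        JournalLine(bank_account, credit=amount),    # bank balance goes down
    ]
    # Double-entry invariant: total debits must equal total credits.
    if sum(l.debit for l in lines) != sum(l.credit for l in lines):
        raise ValueError("journal does not balance")
    return lines


rent = Categorisation("RENT", Decimal("0.93"), "Recurring property overhead")
journal = post_payment(Decimal("1200.00"), rent)
```

Using Decimal rather than float keeps pence exact, and returning None instead of posting means a low-confidence guess never reaches the ledger without review.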

LLM evaluation · specialist domain

UK bookkeeping benchmark - Google DeepMind

I authored and submitted a UK bookkeeping benchmark to Google DeepMind's "Measuring Progress Toward AGI" competition (Kaggle Community Benchmarks, Learning track). It probes where frontier models hold up - and where they quietly break - on real UK accounting tasks.

  • 59 / 62 (95%) median frontier-model score
  • 62 graded tasks across 5 difficulty tiers

Automation in a live practice

NK & Co automation

As Head of Systems & Automation at a UK accountancy practice, I built three systems. The first is an automated compliance-tracking dashboard (Python plus the Companies House API) covering confirmation statements, statutory accounts, and corporation-tax deadlines. The second is an ML-powered bookkeeping categoriser against UK nominal codes, combining rule matching, scikit-learn, and LLM inference. The third is a multi-provider desktop AI agent.

  • Deployed in a working practice
  • Companies House API integration

Shipped end to end, at scale

Sky Score

A property environmental-quality tool and B2B API: noise, air quality, and liveability scoring across London neighbourhoods, combining 10+ live government data sources. Fully serverless on AWS, with a live API and an iOS app shipped to the App Store.

  • 10+ live government data sources
  • Live API and App Store product

Also shipped: Noor - a multi-platform consumer app (15 modules, right-to-left and multi-language support, offline-first, full CI/CD across web, iOS, and Android); and Sterling - a voice-first assistant entered into the Gemini 3 Hackathon (~34,000 participants).

See it, don't just read it

Categorise a transaction

This is the judgement my tools automate. Pick the UK nominal code - the same taxonomy my uk-chart-of-accounts library encodes for machines to use.

Worked example

Transaction: Monthly office rent paid via BACS, £1,200

Which nominal code?

Reasoning: Property rental is a recurring operating overhead - expensed to the P&L as incurred, never capitalised.

Codes from uk-chart-of-accounts
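
The judgement in the worked example can be sketched as a simple rule-matching step. This is a hedged illustration, not the uk-chart-of-accounts API: the RULES table, the categorise function, and the "RENT" and "WAGES" labels are hypothetical stand-ins, and real nominal codes would come from the library itself.

```python
import re
from typing import Optional, Tuple

# Hypothetical rules: (trigger words, placeholder category label, reasoning).
RULES = [
    ({"rent", "rental"}, "RENT",
     "Property rental is a recurring operating overhead, "
     "expensed to the P&L as incurred."),
    ({"salary", "salaries", "payroll"}, "WAGES",
     "Staff pay is an employment cost, expensed in the period it relates to."),
]


def categorise(description: str) -> Optional[Tuple[str, str]]:
    """Return (label, reasoning) for the first matching rule, else None."""
    # Match whole words so that "rent" does not fire on "current".
    words = set(re.findall(r"[a-z]+", description.lower()))
    for triggers, label, reasoning in RULES:
        if words & triggers:
            return label, reasoning
    return None  # no rule matched: hand over to a model or a human


match = categorise("Monthly office rent paid via BACS")
print(match[0] if match else "needs review")
```

Returning the reasoning with the label keeps every decision explainable, and a None result leaves room for a later ML or LLM step rather than forcing a guess.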

Available now

Let's talk about your project

I take on contract and freelance work in finance automation, AI and LLM development, and LLM evaluation. If that is what you need, get in touch. I read every serious enquiry and reply within a day - no pitch, just an honest answer on whether I can help.

Or reach me directly: bilalkhizar@hotmail.co.uk