Reranking is the cheapest big win in legal retrieval

Adding a reranker was the single largest accuracy gain in the pipeline, and it cost almost nothing to add

Illia Prokopiev, Co-Founder and CEO (4 min read)

We are building a system that answers questions about EU financial and AI regulation and backs every statement with the exact article of the actual law. The part that decides whether that works is retrieval: if the right provision never surfaces, nothing downstream can save the answer. While choosing our retrieval stack we ran a controlled test on a real corpus, and one result was clear enough that it is worth sharing, with all the caveats attached.

The short version: adding a reranker was the single largest accuracy gain in the pipeline, and it cost almost nothing to add.

What a reranker does

Modern semantic search has two stages. First, an embedding model casts a wide net and pulls back a shortlist of candidate provisions by meaning, fast but approximate. Second, a reranker re-reads each shortlisted provision directly against the question and reorders them. The embedder is good at finding the right article somewhere in the top fifty; the reranker is good at moving it to the top. For a citation-grounded answer, the top is what matters, because the answer model should be reading the most relevant provision first.
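The two stages can be sketched in a few lines. This is an illustration only, not our production code: `bow_cosine` is a toy word-overlap score standing in for a real embedding model, `toy_reranker` stands in for a real reranker, and the three-provision corpus and all names are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Dict, List

Scorer = Callable[[str, str], float]  # (question, provision text) -> relevance score


def bow_cosine(a: str, b: str) -> float:
    """Toy stand-in for embedding similarity: cosine over word counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def retrieve_then_rerank(
    query: str,
    corpus: Dict[str, str],
    first_stage: Scorer,
    reranker: Scorer,
    pool_size: int = 50,
) -> List[str]:
    # Stage 1: cheap, approximate scoring over the whole corpus; keep a shortlist.
    pool = sorted(corpus, key=lambda pid: first_stage(query, corpus[pid]), reverse=True)
    pool = pool[:pool_size]
    # Stage 2: score each shortlisted provision against the question and reorder.
    return sorted(pool, key=lambda pid: reranker(query, corpus[pid]), reverse=True)


# Hypothetical three-provision corpus, for illustration only.
corpus = {
    "art-1": "crypto asset service providers shall obtain authorisation",
    "art-2": "providers of crypto asset services shall hold client funds safely",
    "art-3": "high risk ai systems shall undergo conformity assessment",
}
query = "must crypto asset service providers keep client money safe"


def toy_reranker(q: str, d: str) -> float:
    # Placeholder: rewards provisions that mention "client" when the question does.
    return float("client" in q and "client" in d)


ranked = retrieve_then_rerank(query, corpus, bow_cosine, toy_reranker, pool_size=2)
print(ranked)
```

In this toy case the first stage puts the right provision in the pool but in second place, and the second stage moves it to the top, which is the division of labour described above.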

How we measured

The corpus is the consolidated text of seven EU instruments (MiCA, DORA, the Transfer of Funds Regulation, the AML Regulation, AML Directive 6, the AI Act, and the AMLA Regulation), split into roughly 56,000 article- and paragraph-level units. We wrote 151 questions across four difficulty bands, from clean keyword-style queries to messy, colloquial ones, each labelled with the article that should be retrieved. We retrieved a 50-candidate pool with a single embedder, then reordered it with each reranker, and scored with Mean Reciprocal Rank (MRR): the higher the correct article ranks, the higher the score, where 1.0 means it was first every time.
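For readers who have not worked with the metric, here is a minimal sketch of MRR. It assumes one labelled article per question and scores a question as 0 when the labelled article is missing from the ranked list; both are assumptions of this sketch, and the function names are illustrative.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], gold_id: str) -> float:
    """1/rank of the labelled article, or 0.0 if it does not appear at all."""
    for rank, pid in enumerate(ranked_ids, start=1):
        if pid == gold_id:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], str]]) -> float:
    """Average the reciprocal rank over (ranked list, labelled article) pairs."""
    return sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / len(runs)


# Three made-up questions: correct article ranked first, second, and not retrieved.
runs = [
    (["a", "b", "c"], "a"),  # reciprocal rank 1.0
    (["a", "b", "c"], "b"),  # reciprocal rank 0.5
    (["a", "b", "c"], "z"),  # reciprocal rank 0.0
]
print(mean_reciprocal_rank(runs))  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```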

The result

Mean MRR, by question band, with no reranker versus three rerankers, on our corpus:

[Figure: Results]

On the hard, colloquially phrased questions, the kind real users actually ask, reranking raised MRR from 0.55 to 0.70 and the share of questions with a correct article in the top five from 76% to 87%. That is a large gain from a component that adds milliseconds and a fraction of a cent per query.
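The top-five share quoted above is a hit rate at k = 5, reported per question band. A minimal sketch of that calculation follows; the band names and rankings are made up for illustration, and a single labelled article per question is assumed.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def hit_at_k(ranked_ids: Sequence[str], gold_id: str, k: int = 5) -> bool:
    """True if the labelled article appears among the first k results."""
    return gold_id in ranked_ids[:k]


def hit_rate_by_band(
    results: List[Tuple[str, Sequence[str], str]], k: int = 5
) -> Dict[str, float]:
    """results holds (band, ranked ids, labelled article); returns hit rate per band."""
    hits = defaultdict(list)
    for band, ranked, gold in results:
        hits[band].append(hit_at_k(ranked, gold, k))
    return {band: sum(flags) / len(flags) for band, flags in hits.items()}


# Made-up rankings: two questions per band.
results = [
    ("clean", ["a", "b", "c", "d", "e", "f"], "a"),  # hit: rank 1
    ("clean", ["a", "b", "c", "d", "e", "f"], "e"),  # hit: rank 5
    ("hard", ["a", "b", "c", "d", "e", "f"], "f"),   # miss: rank 6
    ("hard", ["a", "b", "c", "d", "e", "f"], "b"),   # hit: rank 2
]
print(hit_rate_by_band(results))
```

Reporting the hit rate alongside MRR helps because the two move differently: MRR rewards moving the correct article from third to first, while the top-five share only changes when an article crosses the cutoff.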

On vendors, and stated only for our corpus and our questions: Voyage rerank-2.5 led every band, which is why we adopted it. Cohere rerank-v3.5 was close to flat here, even with full-length documents and its token budget set fairly. We did not see it help on this material, which surprised us, and we would not generalise that beyond this corpus.

We included Kanon 2 Reranker from Isaacus, a reranker built specifically for legal text, because it is exactly the kind of tool you would expect to win here, and because its published benchmark reports an advantage over Voyage. On our corpus it came second or third in every band and helped only on the hard set. This is not a refutation of anyone's benchmark: Isaacus reports results on a different dataset of mixed legal material, and benchmarks measure what they measure. What we can say is narrower and, we think, more useful: on consolidated EU financial and AI regulation, with our questions and a general-purpose first-stage retriever, a strong general reranker outperformed the domain-specialist one. We saw the same pattern earlier with embedding models. For clean, well-structured statutory text, domain specialisation is worth testing rather than assuming.

Limitations

These numbers are ours, not a public benchmark. The 151 questions were written and labelled by us, not adjudicated by independent counsel; the test used one candidate-pool size and one first-stage retriever; and the field of rerankers is larger than three. Treat the vendor ordering as what worked in our setup, and the rerank lift as the durable, generalisable finding.

Takeaway

If you are building retrieval over statutory or regulatory text and you have not added a reranker, that is probably the highest-return change available to you, especially for the natural-language questions real users type. Measure it on your own corpus, with your own questions, before you trust anyone's leaderboard, including ours.
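One way to run that measurement is to fix the candidate pool per question and compare orderings, as we did. The harness below is a sketch under stated assumptions: each reranker is wrapped as a plain function from (question, candidate ids) to reordered ids, questions carry a single labelled article, and the question list is non-empty. The `Reranker` type and `compare_rerankers` are hypothetical names, not any vendor's API.

```python
from typing import Callable, Dict, List, Tuple

# (question, candidate ids from the first stage) -> reordered candidate ids
Reranker = Callable[[str, List[str]], List[str]]


def compare_rerankers(
    questions: List[Tuple[str, List[str], str]],
    rerankers: Dict[str, Reranker],
) -> Dict[str, float]:
    """Return MRR per system over the same pools, including a no-reranker baseline.

    questions holds (question text, first-stage pool in retrieval order, labelled id).
    """
    systems: Dict[str, Reranker] = {"no reranker": lambda q, pool: pool, **rerankers}
    scores: Dict[str, float] = {}
    for name, rerank in systems.items():
        total = 0.0
        for question, pool, gold in questions:
            ranked = rerank(question, list(pool))  # copy so rerankers cannot mutate the pool
            total += 1.0 / (ranked.index(gold) + 1) if gold in ranked else 0.0
        scores[name] = total / len(questions)
    return scores


# Made-up data: a "reranker" that simply reverses the pool.
questions = [
    ("q1", ["a", "b"], "b"),
    ("q2", ["c", "d"], "d"),
]
scores = compare_rerankers(questions, {"reverse": lambda q, pool: pool[::-1]})
print(scores)
```

Holding the pool fixed means any difference in score comes from the reordering alone, not from the first-stage retriever.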


Written by

Illia Prokopiev

Co-Founder and CEO

Illia is the Managing Partner and founder of Licentium. With over 11 years of practice, he has guided innovators through cross-border M&A deals and the disputes that follow, combining transactional skill with courtroom resolve. Admitted to the bar in 2017, he pivoted early to Web3, serving as legal advisor to prominent crypto projects and carrying AML/MLRO duties that anchored complex token, DAO, and compliance questions on solid regulatory ground. Certified in money laundering prevention and an active crypto investor, Illia blends market intuition with a global network of specialists, enabling Licentium to untangle licensing knots for crypto and AI ventures anywhere in the world.
