AI & Natural Language Processing

RAG System Audit

Retrieval-quality assessment with concrete failure examples, chunking and embedding review, and a path to evals you can actually run in CI.

base €6,000 • 2–3 weeks
Qualification

Is this for you?

  • You built RAG, launched it, and it's "mostly working".
  • Retrieval quality is the suspected problem but you haven't proven it.
  • No eval pipeline in place — you rely on users complaining.
  • Chunking was done once and never revisited.
  • You're considering swapping embedding models or adding a reranker and want a read first.
Deliverables

What you get

  • Retrieval quality assessment with specific failure examples from your corpus.
  • Chunking strategy review with tested alternatives.
  • Embedding model evaluation — current vs two candidates on your data.
  • Reranker evaluation if relevant.
  • Eval pipeline recommendation — what to measure, how, where to put it (a minimal sketch follows this list).
  • Prioritized remediation roadmap with effort estimates.
  • 60-minute walkthrough with your engineering team.
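
To give a flavor of the eval pipeline deliverable, here is a minimal sketch of a retrieval eval that can gate CI: recall@k and MRR over a small labeled query set. The `retrieve` stub, the sample query, and the 0.80 floor are illustrative placeholders, not a prescribed API — the real version is wired to your retriever and your quality bar.

```python
# Minimal CI-gating retrieval eval: recall@k and MRR over labeled queries.
# `retrieve` below is a stub standing in for your retriever's entry point.

def recall_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of relevant chunks that show up in the top-k results."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def mrr(retrieved_ids, relevant_ids):
    """Reciprocal rank of the first relevant chunk; 0 if none retrieved."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def run_eval(retrieve, labeled_queries, k=5):
    """labeled_queries: list of {"query": str, "relevant_ids": [str]}."""
    recalls, rrs = [], []
    for case in labeled_queries:
        retrieved = retrieve(case["query"], top_k=k)
        recalls.append(recall_at_k(retrieved, case["relevant_ids"], k))
        rrs.append(mrr(retrieved, case["relevant_ids"]))
    return sum(recalls) / len(recalls), sum(rrs) / len(rrs)

if __name__ == "__main__":
    def retrieve(query, top_k=5):
        # Stub: replace with your vector store / hybrid search call.
        return ["chunk-42", "chunk-7", "chunk-3", "chunk-19", "chunk-5"]

    cases = [{"query": "how do refunds work?", "relevant_ids": ["chunk-7"]}]
    mean_recall, mean_rr = run_eval(retrieve, cases)
    print(f"recall@5={mean_recall:.3f}  MRR={mean_rr:.3f}")
    # Fail the CI job if retrieval regresses below an agreed floor.
    assert mean_recall >= 0.80, f"recall@5 regressed: {mean_recall:.3f}"
```
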
In and out

Scope

What's in

  • Retrieval pipeline assessment
  • Chunking and embedding strategy
  • Reranker and hybrid-search evaluation
  • Eval pipeline design

What's out

  • Corpus curation and labeling
  • Frontend or product UX
  • Implementing the fixes
  • Training custom embedding models
How it runs

Process

  1. Intake

    Day 1

    Kickoff, access to repo, corpus sample, retrieval dashboards, intro to team.

  2. Discovery

    Days 2–7

    Pipeline read-through, representative-query generation, failure-mode inventory.

  3. Experiments

    Days 8–12

    Run chunking/embedding/reranker comparisons on your data (the experiment loop is sketched after this timeline).

  4. Report

    Days 13–17

    Draft, review, finalize.

  5. Walkthrough

    End of engagement

    60-minute call with your team.
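
For a sense of how the experiments phase runs, here is a sketch of the comparison loop: every chunking and embedding configuration is scored against the same labeled query set, so results compare like for like. `build_index` and `evaluate` are stubs standing in for whatever your stack provides; in a real run, `evaluate` is the same recall@k/MRR eval sketched above, and the grid is extended with reranker on/off where relevant.

```python
# Experiment-loop sketch: build one index per configuration, score each
# against the same eval, rank the results. The stubs return dummy values
# so the loop is runnable; real versions re-chunk, re-embed, and retrieve.
import random
from itertools import product

CHUNK_SIZES = [256, 512, 1024]  # tokens, with fixed overlap
EMBED_MODELS = ["current", "candidate-a", "candidate-b"]

def build_index(chunk_size, embed_model):
    # Stub: re-chunk the corpus sample and embed it with `embed_model`.
    return (chunk_size, embed_model)

def evaluate(index):
    # Stub: returns (recall@5, MRR). The real version runs the labeled
    # query set against the freshly built index.
    rng = random.Random(repr(index))
    return rng.uniform(0.5, 0.95), rng.uniform(0.3, 0.8)

results = {
    (size, model): evaluate(build_index(size, model))
    for size, model in product(CHUNK_SIZES, EMBED_MODELS)
}

# Rank configurations by recall@5, ties broken by MRR.
for config, (recall, mrr) in sorted(results.items(),
                                    key=lambda kv: kv[1], reverse=True):
    print(f"{config}: recall@5={recall:.3f}  MRR={mrr:.3f}")
```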

Fixed, upfront

Pricing

base €6,000
2–3 weeks

50% to start, 50% on report delivery. Includes one 30-minute follow-up call within 30 days of delivery.

One fixed price. No surprises, no “starting at” language. If we agree on scope and you pay the deposit, the engagement is locked in.

FAQ

Questions

Do you need our full corpus?

No. A representative sample (typically 1–10%, depending on corpus diversity) and a set of real user queries are enough.

We use LangChain / LlamaIndex / custom. Does it matter?

No — the review is framework-agnostic.

Can you implement the fixes?

Separately, yes. This engagement is advisory.

Will you sign an NDA?

Yes.

We haven't done an LLM Integration Review. Should we start there?

If your broader LLM integration is also a question mark, yes. If RAG is the specific concern, start here.

Who you're working with

About

Cornell NLP certified, with hands-on RAG and retrieval work that predates the term becoming fashionable. The review is grounded in what actually improves retrieval quality, not vendor pitches.

More about the studio

Ready to start?

Book an intro call. If we're not a fit, I'll tell you on the call.

Based in Cluj-Napoca • Available Worldwide