Data

Search & Discovery Audit

Audit the content ingestion pipeline that gets content into the searchable store, or the Elasticsearch relevance tuning that ranks it. Take one scope or the end-to-end bundle.

from €8,000
2–3 weeks (per scope)

Qualification

Is this for you?

  • Users complain that search "just doesn't find things."
  • You run an audio, music, or media platform with messy catalog data.
  • Ingestion is a pile of cron jobs nobody fully trusts.
  • Relevance was tuned once years ago and nobody remembers by whom.
  • You're about to bolt on LLM or RAG features and want retrieval solid first.
  • Downstream teams routinely complain about data quality or search results.

Scope option

Content Ingestion Pipeline Audit

Deep review of how content gets from source to searchable, for platforms where assets and metadata are a mess.

€10,000
2–3 weeks

What you get

  • Current-state architecture diagram.
  • Data quality assessment with specific defect categories and frequency.
  • Normalization and deduplication strategy recommendation.
  • Observability recommendations — what to monitor, where to alert.
  • Prioritized remediation plan with effort estimates.
  • 60-minute walkthrough.

Good fit: audio/music/media platforms with messy catalog data, ingestion pipelines that are a pile of cron jobs nobody fully trusts, or downstream teams (search, recommendations, billing) complaining about data quality.

Scope option

Elasticsearch Relevance Audit

A written audit of why your Elasticsearch results are wrong and a prioritized plan to fix them.

€8,000
2 weeks

What you get

  • Written report (15–25 pages) with findings grouped by severity.
  • Annotated mapping and analyzer review.
  • Query review with 10–20 representative queries, scored and diagnosed.
  • Prioritized remediation roadmap with effort estimates.
  • 60-minute walkthrough with your engineering team.

Good fit: users complain search "just doesn't find things"; relevance was tuned once and nobody remembers by whom; planning to add LLM/RAG features and want retrieval solid first. OpenSearch is supported.

In and out

Scope

What's in

  • Content ingestion architecture and data-quality assessment
  • Normalization, deduplication, and observability design
  • Elasticsearch/OpenSearch mapping and analyzer review
  • Query structure and relevance scoring review
  • Prioritized remediation roadmap for either or both

What's out

  • Implementing the remediation (separate engagement)
  • Rights and licensing logic beyond what affects ingestion
  • Cluster infrastructure tuning and ES version upgrades
  • Hybrid search / vector integration beyond scoping
  • Full platform rearchitecture

How it runs

Process

  1. Intake

    Day 1

    Kickoff, access to repo, sample data, pipeline dashboards or index settings, sample queries, intro to the team.

  2. Discovery

    Days 2–7

    Pipeline trace or mapping/analyzer audit, sample data analysis, representative-query generation with stakeholders, scoring baseline.

  3. Analysis

    Days 8–12

    Remediation design and observability planning, or fix design and ranking experiments where possible.

  4. Report

    Days 13–17

    Draft, review, finalize.

  5. Walkthrough

    End of engagement

    60-minute call with your team.

Fixed, upfront

Pricing

from €8,000
2–3 weeks (per scope)
Full-scope bundle
€16,000

End-to-end search & discovery: ingestion pipeline + Elasticsearch relevance together, vs. €18,000 if purchased separately.

50% to start, 50% on report delivery. Includes one 30-minute follow-up call within 30 days of delivery. End-to-end bundle invoiced as one engagement.

One fixed price. No surprises, no “starting at” language. If we agree on scope and you pay the deposit, the engagement is locked in.

FAQ

Questions

Which scope should I pick?

If content is getting into the system reliably but results are bad, take the Relevance scope. If ingestion itself is unreliable (duplicates, missing metadata, broken normalization), take the Ingestion scope. If both are in doubt, take the end-to-end bundle: it covers the full "content enters the system → content is found by users" path, which is closer to how most teams experience the problem than the technology-centric split.

Do you work on OpenSearch?

Yes.

Do you work on non-audio platforms?

Yes — the patterns translate. Audio and music are where my deepest experience is; the methodology is the same.

Do you need production access?

No. Repo read access, sample data (sanitized is fine), and a representative environment are enough.

Do you implement the fixes?

Separately, yes.

Will you sign an NDA?

Yes.

Who you're working with

About

I've built and scaled content ingestion and Elasticsearch-based discovery for audio and content platforms — catalog ingestion, normalization, rights-aware delivery, relevance tuning. The audit is domain-specific, not generic data-pipeline or config-checklist advice.

More about Paper Scissors & Glue

Ready to start?

Book an intro call. If we're not a fit, I'll tell you on the call.

Based in Cluj-Napoca, Romania. Available across EU and US time zones.