Casting agents were spending 60-90 minutes per search. That’s what the admin team told us a few months into the platform’s life. They had filters: role type, location, experience level. But roles in entertainment don’t reduce to dropdown menus. “Athletic build, strong improv background, comfortable with physical performance, prior on-camera experience” is a combination no filter set handles well.
That conversation is what pushed us to add the AI matching layer. Here’s how we built it.
What We’d Already Built
The core platform came first: a talent management system for the entertainment industry with two portals.
The Talent Portal is where individuals create profiles, list their experience, upload media, and mark their availability. The Admin Portal is where casting professionals post opportunities, review talent, manage applications, and run the end-to-end matching process.
We built this on Next.js, PostgreSQL, and GCP Cloud Run. The schema supports multiple roles: Talent, Talent Managers, Admins, and Super Admins, each with distinct permissions and workflows. Getting that right took most of the first two months. By the time the AI matching question came up, the platform had hundreds of active talent profiles and a working application workflow. The matching problem was real: too many profiles, not enough filtering granularity.
The Simple Search Problem
Our first approach to smarter discovery was PostgreSQL full-text search combined with structured filters. It worked for clear-cut cases: search for “stunt performer” and you’d get stunt performers.
It broke on nuanced requirements. “Urban style, strong improvisation background, physically expressive” returned results that matched individual words across three unrelated profiles. The system had no way to understand that the casting agent wanted all of those attributes together, in one person, weighted against each other.
We needed something that could reason about semantic similarity, not just keyword overlap. That led us to pgvector on the PostgreSQL instance we were already running.
Building the Semantic Layer
The approach: embed each talent profile as a vector, embed each search query as a vector, then find the closest profiles to the query in embedding space.
We’d solved a similar problem on a unified workspace search project before. Adding vector search to an existing PostgreSQL database works cleanly. Same pattern here, different domain.
We used OpenAI’s text-embedding-3-small for embeddings. It’s well-documented, reasonably priced at about $0.02 per million tokens, and performs well for multi-attribute matching tasks. Embeddings are generated when a profile is saved or updated, then stored in a profile_embedding vector(1536) column in PostgreSQL. At search time, we embed the casting agent’s query and run a cosine similarity search:
-- Find top 10 talent matches for a given opportunity
SELECT
  t.id,
  t.display_name,
  t.role_types,
  1 - (t.profile_embedding <=> $1::vector) AS similarity_score
FROM talent_profiles t
WHERE
  t.status = 'available'
  AND t.location = ANY($2::text[])
  AND t.role_types && $3::text[]
ORDER BY t.profile_embedding <=> $1::vector
LIMIT 10;
The <=> operator is pgvector’s cosine distance. 1 - cosine_distance gives you cosine similarity. The && is PostgreSQL array overlap for role type matching. Hard filters run in the WHERE clause before the similarity sort (more on why that ordering matters below).
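For intuition, here's the arithmetic behind `<=>` and the similarity score, sketched in TypeScript. pgvector computes this inside PostgreSQL; this is just the math, not code we run in production:

```typescript
// Cosine distance, as pgvector's <=> operator computes it:
// 1 - (a · b) / (|a| * |b|). Similarity is then 1 - distance.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function similarity(a: number[], b: number[]): number {
  return 1 - cosineDistance(a, b);
}
```

Vectors pointing the same direction score 1, orthogonal vectors score 0. Magnitude drops out entirely, which is why cosine is the standard choice for comparing embeddings.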
One thing that’s not obvious from reading about pgvector: you need the right index before your dataset gets large. We started without one, which is fine for a few hundred profiles. As the dataset grew, we added an HNSW index:
CREATE INDEX ON talent_profiles
  USING hnsw (profile_embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
At query time, HNSW generally outperforms IVFFlat for datasets up to a few million rows. We should've added it earlier.
The Profile Representation Problem
Here’s what we got wrong on the first attempt.
We embedded raw profile text directly: bio, listed skills, self-described experience, everything the talent user had typed into their profile. The search results were immediately biased by profile length.
A talent who’d written a 400-word bio ranked above a more experienced performer who’d written three sentences. The embedding captured the density of relevant vocabulary, and vocabulary density correlates with writing effort, not with actual fit. We caught this after several test searches where the top results were consistently the most verbose profiles, not the most relevant ones.
The fix: a normalized profile summary generated at save time. Instead of embedding whatever the user wrote, we run a structured extraction step first. We prompt GPT-4o to produce a standardized summary from the raw profile:
Extract the following from this talent profile:
- Primary role types (max 5, use standard terms)
- Performance style descriptors (max 8 words)
- Physical characteristics relevant to casting (age range, build type)
- Years of experience by medium (stage, screen, commercial)
- Notable skills or specializations
Return JSON. If a field is unclear, omit it rather than guessing.
We embed the structured output, not the raw bio. Results improved immediately: shorter profiles with specific attributes started ranking above verbose but vague ones. The semantic search returned matches that the casting agents recognized as plausible.
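Before embedding, we flatten that JSON into a stable, labeled string so every profile hits the embedding model in the same shape. A sketch of what that serialization step can look like; the field names here are illustrative, not our production schema:

```typescript
// Hypothetical shape of the normalized summary returned by the
// extraction step. Every field is optional because the prompt tells
// the model to omit unclear fields rather than guess.
interface TalentSummary {
  roleTypes?: string[];
  styleDescriptors?: string[];
  physical?: string;
  experienceYears?: Record<string, number>; // medium -> years
  specializations?: string[];
}

// Flatten the structured summary into a stable, labeled string so
// every profile embeds in the same format regardless of bio length.
function summaryToEmbeddingText(s: TalentSummary): string {
  const parts: string[] = [];
  if (s.roleTypes?.length) parts.push(`Roles: ${s.roleTypes.join(", ")}`);
  if (s.styleDescriptors?.length) parts.push(`Style: ${s.styleDescriptors.join(", ")}`);
  if (s.physical) parts.push(`Physical: ${s.physical}`);
  if (s.experienceYears) {
    const exp = Object.entries(s.experienceYears)
      .map(([medium, years]) => `${medium}: ${years}y`)
      .join(", ");
    parts.push(`Experience: ${exp}`);
  }
  if (s.specializations?.length) parts.push(`Specializations: ${s.specializations.join(", ")}`);
  return parts.join("\n");
}
```

The fixed labels matter: they keep the token distribution consistent across profiles, so the embedding reflects the attributes rather than how the attributes happened to be phrased.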
We still don’t have a clean solution for profiles where the user provides almost no information. We’ve added profile completion prompts in the UI, but some talent just won’t fill things out, and a sparse normalized summary embeds poorly. It’s an ongoing problem.
Hybrid Scoring: Filters Before Semantics
We tried running semantic similarity across the full profile database first, then filtering. It didn't work well. Remote talent with strong semantic fits outranked equally strong local talent whose profiles didn't happen to mention their region explicitly. Casting agents started ignoring results because too many were geographically wrong.
The right order: hard filters in the WHERE clause first, semantic similarity ranking on the remaining candidates.
Hard filters handle the non-negotiables: availability window, location, role type compatibility. These are binary match criteria and SQL handles them efficiently. Semantic similarity only ranks the pool that’s already passed the filters. This also makes the queries faster: cosine similarity on 200 filtered candidates is significantly faster than on 5,000 unfiltered profiles.
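An in-memory sketch of that ordering, to make it concrete. In production the hard filters run in the SQL WHERE clause; the candidate shape and function names here are illustrative:

```typescript
interface Candidate {
  id: string;
  location: string;
  roleTypes: string[];
  embedding: number[]; // unit-length, as OpenAI embeddings are
}

function searchCandidates(
  candidates: Candidate[],
  queryEmbedding: number[],
  allowedLocations: string[],
  requiredRoles: string[],
  limit = 10
): { id: string; score: number }[] {
  return candidates
    // 1. Hard filters first: binary match criteria, no ranking involved.
    .filter(
      (c) =>
        allowedLocations.includes(c.location) &&
        c.roleTypes.some((r) => requiredRoles.includes(r))
    )
    // 2. Semantic ranking on the filtered pool only. For unit-length
    // vectors, the dot product equals cosine similarity.
    .map((c) => ({
      id: c.id,
      score: c.embedding.reduce((sum, v, i) => sum + v * queryEmbedding[i], 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

A perfect semantic fit in the wrong city never reaches the ranking step, which is the whole point of the ordering.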
We also added a recency boost to the similarity score for talent who had updated their profile in the last 30 days. Stale profiles tend to carry outdated availability information, and casting agents trusted fresh profiles more. A small weight adjustment, but it reduced complaints about unavailable talent appearing at the top of results.
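The boost itself is simple. A sketch with illustrative numbers; the 0.05 weight and 30-day window stand in for our tuned values:

```typescript
// Similarity plus a flat recency boost for recently updated profiles.
// The weight and window here are illustrative, not the production values.
const RECENCY_BOOST = 0.05;
const RECENCY_WINDOW_DAYS = 30;
const MS_PER_DAY = 86_400_000;

function boostedScore(similarity: number, lastUpdated: Date, now: Date): number {
  const ageDays = (now.getTime() - lastUpdated.getTime()) / MS_PER_DAY;
  return ageDays <= RECENCY_WINDOW_DAYS ? similarity + RECENCY_BOOST : similarity;
}
```

Keeping the boost flat rather than decaying made it easy to explain to the admin team: a profile either counts as fresh or it doesn't.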
What I’d Do Differently
Build the profile normalization template before writing any embedding code. We went straight to embeddings, hit the quality problem, and had to retrofit the normalization step. If we’d designed the structured extraction format first and enforced it at profile creation, we’d have saved a week of debugging mediocre search results and avoided a retroactive re-embedding job for all existing profiles.
Also: get the HNSW index in place from the start. Adding it to a table that already has data requires a full index build. On a small dataset it’s quick. On a larger one it’s an operation you’d rather not run during production hours. The pgvector indexing documentation has a clear breakdown of IVFFlat vs HNSW trade-offs at different dataset sizes. Worth reading before you pick one.
The platform is still active, still growing. The full case study is on our talent platform case study page.
FAQ
How does AI matching work for a talent platform?
The core idea: represent talent profiles as vectors (numeric representations that capture semantic meaning) and represent casting requirements the same way. The system finds profiles that are mathematically close to the requirement in that embedding space. On its own, this isn’t enough. You need structured profile normalization so all profiles are comparable, and hard filters before semantic ranking so you’re not comparing talent across incompatible locations or role types.
How much does it cost to add AI matching to an existing platform?
Depends on dataset size and complexity. For a PostgreSQL-backed platform with existing profile data, adding pgvector is a PostgreSQL extension install, with no new database to manage. The main costs are embedding API calls for the initial bulk job (for a few thousand profiles, typically under $5 at current OpenAI rates) and ongoing per-profile costs as new profiles are created. Search query embeddings cost fractions of a cent each. The larger cost is engineering time: the profile normalization step and the hybrid scoring logic are where the real work is.
What’s the difference between keyword search and semantic search for talent matching?
Keyword search finds profiles that contain specific words or phrases. It works when the casting agent and the talent use the same vocabulary. Semantic search finds profiles that are conceptually similar to the query, even when different words are used. For entertainment roles, this handles the cases that matter most: “physically expressive with a comedic edge” doesn’t share words with “trained in physical comedy and movement theater,” but semantic similarity will find the match. Keyword search misses it entirely.
How long does it take to build a talent platform with AI matching from scratch?
For the core platform (two portals, application workflow, role management, profile system), we took roughly two months with two engineers. The AI matching layer added about three weeks: one week for pgvector setup and the initial embedding pipeline, one week for the profile normalization extraction step, and one week for the hybrid scoring logic and testing. If you're starting from scratch, budget 3-4 months for something production-ready. The portals and permission system take longer than the AI does.
Do you need a lot of data for AI matching to work?
No. The semantic search works on any dataset size. The HNSW index starts to pay off at around 1,000 profiles. Below that, a sequential scan is fast enough and you can skip the index entirely. The profile normalization is worth doing from day one regardless of size. It's much easier to enforce a consistent profile format early than to normalize inconsistent data across hundreds of existing profiles after the fact.
Building a platform that connects people and needs intelligent matching? Book a 30-minute call. We’ve shipped this pattern across industries and can usually scope it in one conversation.