Taste Labs emerges from stealth with $18.5m to sell judgment to an AI industry drowning in its own output

17 Jun

Taste Labs has emerged from stealth with an $18.5 million seed round co-led by CRV and Amplify Partners, betting that the scarcest resource in artificial intelligence is no longer the ability to generate output but the judgment to tell good output from bad. The New York company, founded by former Exa growth lead Thais Castello Branco, sells preference datasets, reasoning data, rubrics, and evaluation environments to frontier model labs, alongside verification software for application companies. It is doing so on a contrarian premise: that the standard method for aligning AI models with human preference has been degrading their aesthetic quality and manufacturing the very uniformity the industry now calls slop.

The round, which the company confirmed on emerging from stealth, drew its first cheque from Latitud, the pre-seed firm behind Latin American founders, before Amplify and CRV co-led the seed. Taste Labs is now a team of 20, with founding engineers drawn from Exa, Palantir, Mercado Libre, and research institutions, and a founding designer who spent 15 years as a VP of Design and serves as the company's internal arbiter of quality. The first domain the company is addressing is design.

The averaging problem sits at the centre of why this company exists

The technical argument behind Taste Labs is more interesting than the funding figure. Reinforcement learning from human feedback, the technique that made products such as ChatGPT viable, works by having people compare model outputs, training a reward model on those comparisons, and then optimising the model to score highly against that learned function. When the preferences come from a broad population of users, the reward model learns to approximate the average of that population. For objective tasks, this works well. For subjective ones, it produces a predictable failure: the model drifts toward output that is agreeable, inoffensive, and bland, because the average of many tastes is no taste at all.
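The averaging effect can be illustrated with a toy calculation. The sketch below is not drawn from Taste Labs or any lab's pipeline; it assumes a Bradley-Terry preference model (a common choice for reward models trained on pairwise comparisons), and the group shares and preference rates are invented purely for illustration. It shows how pooling comparisons from a small expert group and a large general group yields a reward gap that follows the majority.

```python
import math

def pooled_win_rate(groups):
    """Share-weighted probability that output A is preferred over output B.

    groups: list of (population_share, P(group prefers A over B)).
    """
    total = sum(share for share, _ in groups)
    return sum(share * p for share, p in groups) / total

def bradley_terry_gap(win_rate):
    """Reward gap r_A - r_B that best fits a single pair under Bradley-Terry,
    where P(A preferred) = sigmoid(r_A - r_B). This is the logit of the win rate."""
    return math.log(win_rate / (1.0 - win_rate))

# Hypothetical numbers: experts are 10% of raters and pick design A 90% of the time;
# general users are 90% of raters and pick A only 30% of the time.
experts_only = [(1.0, 0.90)]
everyone = [(0.10, 0.90), (0.90, 0.30)]

expert_gap = bradley_terry_gap(pooled_win_rate(experts_only))
pooled_gap = bradley_terry_gap(pooled_win_rate(everyone))

print(f"expert-only reward gap: {expert_gap:+.2f}")  # positive: A is rewarded
print(f"pooled reward gap:      {pooled_gap:+.2f}")  # negative: B is rewarded
```

Under these assumed numbers the expert signal is outvoted: a model optimised against the pooled reward would be pushed toward the output experts rejected.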

Sarah Catanzaro, the Amplify partner who co-led the round, said she had heard versions of this complaint repeatedly from teams at companies including Runway, Figma, Character AI, and Adobe, several of which found that feeding large volumes of general-user preference data into their models improved alignment with that feedback while degrading performance on expert-driven evaluations. The model got better at satisfying the reward function and worse by the standards that actually mattered to the product. What those teams wanted, she explained, was not more feedback from more users but better feedback from the right ones, the people with unusually strong judgment and the ability to articulate why something works.

Why traditional annotation cannot solve a problem it was never built for

This is where the company's positioning becomes a genuine bet on a new category rather than a feature. The annotation industry has historically optimised for scale and agreement, recruiting large numbers of workers to complete tasks with relatively objective answers, and selling speed and efficiency rather than depth of judgment. Subjective evaluation inverts the goal. The aim is not to estimate what the average person thinks but to identify the people whose judgments are most predictive of quality, which raises questions that resist clean answers.

How do you find someone with exceptional taste, distinguish genuine judgment from idiosyncratic preference, and do it at scale without diluting the signal?

Taste Labs answers this with what Catanzaro described as a trust-propagation model, seeding its annotator community with people vetted for strong aesthetic judgment and then allowing those tastemakers to nominate others, so that taste is socially validated rather than crowdsourced. Most annotation companies scale by adding annotators. Taste Labs scales by increasing the leverage of a small curated network through software, which is why it describes itself as infrastructure for operationalising judgment rather than an annotation vendor. The distinction matters commercially because it is the argument for why the company should command better margins and deeper defensibility than the labour-arbitrage businesses it sits beside.
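The article does not describe how the trust-propagation model is implemented, so the following is only a speculative sketch of the general idea. It assumes that vetted seed members start with full trust, that each nomination passes on a discounted share of the nominator's trust, and that people below a cutoff are not admitted. The function name, the `decay` and `threshold` parameters, and the sample names are all hypothetical.

```python
from collections import deque

def propagate_trust(seeds, nominations, decay=0.5, threshold=0.2):
    """Spread trust outward from vetted seeds along nomination links.

    seeds: iterable of vetted members, each given trust 1.0.
    nominations: dict mapping a member to the people they nominated.
    decay: assumed fraction of trust passed on per nomination hop.
    threshold: assumed minimum trust needed to join the network.
    Returns a dict of admitted members and their trust scores.
    """
    trust = {s: 1.0 for s in seeds}
    queue = deque(trust)
    while queue:
        person = queue.popleft()
        passed = trust[person] * decay
        if passed < threshold:
            continue  # too far from a seed to vouch for anyone
        for nominee in nominations.get(person, []):
            # Keep the strongest endorsement a nominee has received so far.
            if passed > trust.get(nominee, 0.0):
                trust[nominee] = passed
                queue.append(nominee)
    return trust

# Hypothetical nomination chain: ana (vetted seed) -> ben -> cy -> dee
network = propagate_trust(
    seeds=["ana"],
    nominations={"ana": ["ben"], "ben": ["cy"], "cy": ["dee"]},
)
print(network)
```

In this toy version, distance from a vetted seed limits how far the network can grow, which is one way to read the claim that taste is socially validated rather than crowdsourced.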

The timing rests on a shift in where model competition is heading

The market context explains why investors moved now. The vibe-coding and coding-agent market, populated by tools such as Lovable, v0, Bolt, and Replit, has put software-building in the hands of people who have never considered design, and the combined valuations in that space already exceed $55 billion. At the same time, foundation labs are hitting diminishing returns on objective domains and beginning to compete on subjective quality, the tone, style, and design sensibility that makes an output feel deliberate rather than generated. Catanzaro noted that subjective quality also decays in a way objective accuracy does not, observing that the right answer to a maths problem in 2020 remains correct today, while great design in 2016 does not resemble great design in 2026, which means a taste dataset is not a one-time asset but a continuously refreshed one.

Castello Branco set out the company's purpose directly in a post on X marking the launch. “Our mission is to end AI slop,” she said, adding that the company was “building the data and infrastructure layer to give AI models and agents taste.” The work that remains, she argued, is harder than generation: “AI has nailed objective domains and made it easy to generate anything. But it still feels off. Now, the challenge is judgment.” She described that challenge as turning a fuzzy, subjective domain into something that could be measured and codified, and said the company was starting with design.

Castello Branco said on X that cracking the problem required work on two layers. The company had been working with frontier labs to evaluate and improve their models by crafting the right post-training data and reinforcement learning environments, and with application-layer companies to build the context and verification tools their agents needed to produce better and more on-brand outputs. “We want a future where AI feels right,” she said in the post.

What the round signals for the wider AI stack

The significance for the sector runs beyond a single seed round. For the past three years, the assumption underpinning model improvement has been that more data and more compute yield better results, an assumption that holds for verifiable domains and breaks down for aesthetic ones.

If subjective quality becomes the axis on which frontier models differentiate, as the diminishing returns on objective benchmarks suggest it might, then the supply of high-quality judgment becomes a strategic input rather than a nicety, and whoever controls that supply occupies a position analogous to the one the annotation and data-labelling giants held during the supervised-learning era. Catanzaro argued that the stakes are cultural as much as technical, that generative media will come to dominate the images, copy, and films people encounter, and that an outcome in which most of that output is mediocre is not one the industry should accept.

Whether Taste Labs can hold that position is the open question. The trust-propagation model is elegant in theory but unproven at the scale frontier labs require, and the company is selling into customers, the labs themselves, that have both the resources and the incentive to build curated evaluation networks in-house, as Figma has reportedly attempted with its own designers. The $18.5 million gives the company room to establish itself before that competition intensifies. The larger validation is that two of the more technically literate firms in the venture market, alongside a roster of frontier-lab and application-layer customers acquired during stealth, have concluded that judgment is now a thing worth paying for, and that the firms teaching models to have taste ought to possess some themselves.

Sindhu V Kashyap

Global Technology Journalist & Multimedia Storyteller | Covering Founders, Investors & Leaders Reshaping Tech | Writer · Interviewer · Moderator · Editor
