Technical white paper

Product embedding integrity in AI-driven catalog advertising

Why overwriting additional product images can collapse product-side context in Meta ads, and how overlay workflows preserve it.

A technical white paper on overlay vs. overwrite workflows, false creative diversity, and representation-driven ad delivery.

Waterbucket Research · July 2026 · ~28 min read

Abstract

Meta ad delivery is no longer best understood as simple audience targeting. It is better understood as a large-scale representation-matching system. Meta builds learned representations of people, ads, products, creatives, events, and context, then retrieves and ranks ad candidates against those representations.

Meta's public engineering work supports this high-level model. The Scaling User Modeling paper states that effective user representations are pivotal in personalized advertising and describes a system that synthesizes user embeddings from large amounts of user features for downstream ads ranking models. Meta's Andromeda post describes ads retrieval as the first step in a multi-stage recommendation system, narrowing tens of millions of ad candidates to a few thousand before ranking. Meta's sequence-learning work describes ads models using event embeddings, attention mechanisms, and multimodal content embeddings. Meta's GEM post describes an LLM-scale ads foundation model trained on ad content, user engagement, user and ad attributes, ad format, and creative representation.

This paper argues that in catalog advertising, the product feed is not merely a merchandising file. It is a product-side context layer. Fields such as title, description, category, brand, price, attributes, image_link, and additional_image_link help define what a product is to the advertising system.

The central failure mode of overwrite workflows is false creative diversity, which can trigger product-side representation collapse. The catalog appears to contain more creative variants, but those variants may be near-duplicates in representation space because they reuse the same product view, the same layout structure, and the same template logic.

The core axiom

In AI-driven catalog advertising, overwrite workflows can weaken product-side matching by erasing or compressing crucial multimodal product signals, causing false creative diversity and product-side representation collapse. Overlay workflows preserve product embedding integrity by keeping the original visual context intact while adding commercial context, giving Meta's delivery system richer product evidence for more accurate product-market matching.

Section 01

The old mental model: audience targeting

For years, advertisers described Meta performance in the language of audiences: interests, lookalikes, retargeting pools, demographic filters, exclusions, and placements. In that world, creative was often treated as the message delivered to a preselected audience.

That framing is increasingly incomplete. Meta's own engineering language points to a different system: representation learning, sequence learning, embeddings, retrieval, ranking, multimodal signals, and user-ad interaction modeling. Meta says AI plays a fundamental role in creating valuable connections between people and advertisers, and its sequence-learning post describes a shift from human-engineered feature aggregations toward event-based learning and learned representations from engagement and conversion events.

The old question

Which audience did the advertiser choose?

The newer question

Given the user's representation, which product representation is likely to create the most valuable outcome?

The advertiser still sees campaigns, ads, product feeds, creatives, and reports. But under the surface, those inputs are converted into machine-readable signals.

The strategic implication is that creative and catalog data are no longer separate from targeting. They are part of matching.

Section 02

The modern model: representation matching

A simplified model of modern ads delivery pairs two learned representations and lets a delivery system retrieve, rank, and serve the highest-value match.

User side

behavior sequences · engagement events · conversion history · session context · interests · fresh actions
user representation / embedding

Product / ad side

title · description · category · brand · price · primary image · additional images · creative format · ad history
product / ad representation / embedding

Delivery system

retrieve candidates · rank candidates · serve the highest-value match

This is not a claim that Meta exposes one literal "product embedding" per SKU to advertisers. It does not. The point is that Meta's public work shows ads delivery depends on learned representations across people, ads, events, features, creatives, and products.
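
The simplified retrieve, rank, and serve flow above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not Meta's implementation: the three-dimensional vectors, the `bid` field, and the `value` function are all hypothetical, and cosine similarity stands in for whatever learned scoring a production system uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_and_rank(user_vec, candidates, k_retrieve, value_fn):
    """Keep the k candidates closest to the user, then order them by value."""
    retrieved = sorted(
        candidates, key=lambda c: cosine(user_vec, c["vec"]), reverse=True
    )[:k_retrieve]
    return sorted(retrieved, key=lambda c: value_fn(user_vec, c), reverse=True)

def value(user_vec, cand):
    # Hypothetical value score: similarity weighted by an illustrative bid.
    return cosine(user_vec, cand["vec"]) * cand["bid"]

# Hypothetical toy representations (not real embeddings).
user = [0.9, 0.1, 0.3]
candidates = [
    {"id": "sofa-gray", "vec": [0.8, 0.2, 0.4], "bid": 1.0},
    {"id": "drill-kit", "vec": [0.1, 0.9, 0.2], "bid": 3.0},
    {"id": "armchair", "vec": [0.7, 0.1, 0.6], "bid": 1.5},
]

ranked = retrieve_and_rank(user, candidates, k_retrieve=2, value_fn=value)
print([c["id"] for c in ranked])  # ['armchair', 'sofa-gray']
```

In this toy run the highest-bid candidate never reaches ranking because it is filtered out at retrieval. The product-side representation acts as a gate before any value comparison happens.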

Meta's Scaling User Modeling paper says SUM is widely deployed in Meta's ads ranking system and synthesizes user embeddings from massive amounts of user features, which serve as inputs to downstream online ads ranking models. Meta's Andromeda post describes a personalized ads retrieval engine that improves personalization at the retrieval stage, the first step in Meta's multi-stage system, narrowing tens of millions of candidates to a few thousand before larger ranking models determine the final ads shown.

Meta's sequence-learning post says its next-generation engine incorporates advances from natural language understanding and computer vision, uses event embeddings, and scales toward richer semantic signals through multimodal content embeddings. Meta's GEM post describes a Generative Ads Recommendation Model trained at LLM scale, learning from ad content and user engagement data, deriving sequence and non-sequence features, and including user and ad attributes such as ad format and creative representation.

Meta's Lattice work describes a recommendation framework for industry-scale ads that addresses data fragmentation through cross-domain knowledge sharing, data consolidation, model unification, distillation, and system optimizations, processing mixed-format, multimodal inputs such as categorical features, dense features, and sequence features.

Taken together, the public record supports a high-level view of Meta ads delivery as representation-driven, multimodal, and interaction-based.

Section 03

The catalog as a product-side context layer

A Meta product catalog is often treated operationally as a feed: a CSV, XML file, API payload, Google Sheet, or platform sync. But strategically, it is better understood as a structured product document. A catalog item may contain:

Catalog item · structured fields

id · title · description · brand · product_type · google_product_category · price · sale_price · availability · condition · color · size · material · pattern · item_group_id · image_link · additional_image_link · custom_label_0-4

Those fields are not equally important in every auction, and Meta does not disclose every downstream use of every field. But they are structured inputs that define the product record.

Publicly accessible Meta catalog documentation defines additional_image_link as a field containing URLs for up to 20 additional images of an item, following the same specifications as image_link. It also states that additional images can be displayed in ads and that the field is supported by supplementary feeds.

This matters because additional_image_link is not an informal creative dumping ground. It is a structured catalog field for additional product imagery.
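
To make the "structured product document" framing concrete, a catalog item can be modeled as a typed record. The sketch below is illustrative only: the `CatalogItem` class, its field subset, the sample values, and the example.com URLs are assumptions for this paper, not Meta's schema or API. The only rule enforced is the 20-image limit cited above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limit taken from the catalog documentation cited in this section.
MAX_ADDITIONAL_IMAGES = 20

@dataclass
class CatalogItem:
    """Hypothetical subset of catalog fields for one product."""
    id: str
    title: str
    description: str
    brand: str
    price: str
    availability: str
    image_link: str
    additional_image_link: List[str] = field(default_factory=list)
    item_group_id: Optional[str] = None

    def __post_init__(self):
        # Reject records that exceed the documented additional-image limit.
        if len(self.additional_image_link) > MAX_ADDITIONAL_IMAGES:
            raise ValueError(
                f"{self.id}: more than {MAX_ADDITIONAL_IMAGES} additional images"
            )

# Illustrative record; all values are made up.
sofa = CatalogItem(
    id="sofa-001",
    title="Three-seat sofa, gray",
    description="Three-seat fabric sofa with removable covers.",
    brand="ExampleBrand",
    price="899.00 USD",
    availability="in stock",
    image_link="https://example.com/img/sofa-001-front.jpg",
    additional_image_link=[
        "https://example.com/img/sofa-001-living-room.jpg",
        "https://example.com/img/sofa-001-side.jpg",
        "https://example.com/img/sofa-001-fabric.jpg",
    ],
)
print(len(sofa.additional_image_link))  # 3
```

Treating the image fields as part of the same record as title and price makes the later argument easier to follow: changing `additional_image_link` changes the product record, not just an ad.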

Google's Merchant Center documentation, while not Meta-specific, reinforces the broader product-feed convention: additional product images are commonly used to show products from different angles, with staging elements, in use, or with detail that helps customers understand the product.

Additional images are product evidence. They help describe what the product is, how it looks, how it is used, and how it differs from other products.

Section 04

What additional images contribute

Additional images can carry product meaning that the primary image does not. A primary image might identify the product. Additional images can provide:

alternate angle · lifestyle context · scale · texture · material · fit · variant / color · use case · environment · human context · compatibility · detail view · bundle / accessory context

Sofa

How the piece looks in a living room, its scale, fabric texture, rear construction, or one shade of gray versus another.

Apparel

Fit on a body, drape, material, pattern detail, movement, or styling context.

Equipment

Scale, use case, attachments, mechanical features, compatibility, or installed orientation.

A human merchandiser understands this intuitively. A multimodal model may also extract useful signals from these differences. Meta's public materials do not say that every additional_image_link image is used in every ads ranking event; a claim that specific would go beyond the public record. They do show that Meta's ads systems use multimodal content embeddings, creative representations, user and ad attributes, sequence features, and precomputed ad embeddings.

Additional product images should be treated as part of the product-side evidence available to an AI-driven ads system, not merely as optional display assets.

Section 05

The overwrite workflow

Many catalog creative workflows operate by creating rendered ad treatments and writing them back into catalog image fields through primary or supplementary feeds. In the most common form, a tool takes the primary product image and generates multiple treated versions, then inserts those assets into image_link or additional_image_link.

Operationally, this creates more image URLs, and may create more visible ad formats for Meta to display. But strategically, it may replace product evidence with ad treatments.

[Figure: a catalog tool replacing a product's original context with a generated background. Rendering ad treatments back into image fields swaps product context for generated context.]

Before overwrite

  • product cutout
  • lifestyle scene
  • alternate angle
  • material / detail shot
  • variant image
  • in-use image

After overwrite

  • product cutout
  • same cutout + sale badge
  • same cutout + review stars
  • same cutout + financing
  • same cutout + promo frame
  • same cutout + template

The catalog gained ad files.
It lost product evidence.

When overwrite tools replace diverse additional_image_link assets with repetitive graphic templates, the system may lose visual evidence required for richer product-side matching. Consequently, distinct SKUs can be pushed toward artificial similarity, not because the products are similar, but because the templates applied to them are similar.

Section 06

False creative diversity

The central failure mode of overwrite workflows is false creative diversity, which can trigger product-side representation collapse. It occurs when a catalog contains many rendered image variants that are operationally different files but remain visually or semantically similar to the ad system.

A human advertiser may count six images. A representation-driven delivery system may see one product view repeated with minor promotional modifications. This distinction matters because representation-driven systems do not value variation simply because filenames or URLs differ. They value useful distinction.

Meta's Andromeda post says the retrieval stage processes tens of millions of ad candidates and reduces them to a few thousand relevant candidates before ranking, describing hierarchical indexing, jointly trained index representations, and precomputed ad embeddings. That does not prove a specific advertiser-facing "creative similarity" rule, but it supports the broader mechanism: at Meta scale, ad delivery systems must compress, index, retrieve, and rank candidates according to learned representations. Superficial variants that stay close together in embedding space are unlikely to create the same value as meaningfully distinct concepts.

Not the question

Did we create more assets?

The question

Did we create more machine-recognized distinction?

The issue is not creative volume. The issue is machine-recognized distinction.
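
The gap between asset count and machine-recognized distinction can be shown with a small calculation. This is a sketch under loud assumptions: the four-dimensional vectors below are hand-made stand-ins for image representations, not outputs of any real model, and mean pairwise cosine distance is just one possible diversity measure.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_pairwise_distance(vectors):
    """Average cosine distance (1 - similarity) over all pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(1.0 - cosine(a, b) for a, b in pairs) / len(pairs)

# Hypothetical toy vectors: each distinct product view adds its own direction.
original_set = {
    "cutout": [1.0, 0.0, 0.0, 0.0],
    "lifestyle": [0.3, 1.0, 0.0, 0.0],
    "side angle": [0.5, 0.0, 1.0, 0.0],
    "fabric closeup": [0.2, 0.0, 0.0, 1.0],
}

# Hypothetical toy vectors: the same cutout with small promotional tweaks.
overwritten_set = {
    "cutout": [1.0, 0.0, 0.0, 0.0],
    "cutout + sale badge": [1.0, 0.1, 0.0, 0.0],
    "cutout + review stars": [1.0, 0.0, 0.1, 0.0],
    "cutout + financing": [1.0, 0.0, 0.0, 0.1],
}

original_diversity = mean_pairwise_distance(list(original_set.values()))
overwritten_diversity = mean_pairwise_distance(list(overwritten_set.values()))

print(f"files: {len(original_set)} vs {len(overwritten_set)}")
print(f"diversity: {original_diversity:.3f} vs {overwritten_diversity:.3f}")
```

Both sets contain four files, but under these assumed vectors the overwritten set has a mean pairwise distance below 0.01 while the original set is well above 0.5. The file count is identical; the distinction is not.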

Section 07

Product-side representation collapse

Product-side representation collapse occurs when diverse product evidence is replaced by repetitive or template-dominant signals, reducing the distinctiveness of individual product representations in embedding space.

This is not the same as saying Meta cannot target after images are overwritten. Meta still has many signals: product IDs, titles, descriptions, categories, prices, events, clicks, conversions, user behavior, campaign objectives, and more. The claim is narrower: overwriting additional product images may reduce one important class of product-side signal, visual context diversity.

If 10,000 SKUs are rendered into the same five template systems, each SKU may appear to have more creative variants. But across the catalog, many products now share the same visual grammar:

sale badge · review badge · financing callout · same frame · same background treatment · same layout system · same promotional structure

The template becomes a repeated feature across products. This creates two risks:

1. Loss of product context

Lifestyle imagery, alternate angles, material details, variant imagery, and use-case context are removed or demoted.

2. Template-dominant similarity

Products may become more visually similar to each other because the template layer is repeated across the catalog.

The result is a catalog that is more ad-like but potentially less product-rich.
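
The second risk, template-dominant similarity, can be sketched numerically. The model here is a deliberate simplification and an assumption of this paper: each image vector is a product-specific direction plus a shared template direction scaled by a hypothetical `weight`. Nothing in Meta's public materials states that representations decompose this way.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def with_template(product_vec, template_vec, weight):
    """Add a shared template component, scaled by weight, to a product vector."""
    return [p + weight * t for p, t in zip(product_vec, template_vec)]

# Hypothetical toy vectors: two unrelated products and one shared template.
SOFA = [1.0, 0.0, 0.0]
DRILL = [0.0, 1.0, 0.0]
TEMPLATE = [0.0, 0.0, 1.0]

def cross_sku_similarity(weight):
    """Similarity between two different SKUs rendered with the same template."""
    return cosine(
        with_template(SOFA, TEMPLATE, weight),
        with_template(DRILL, TEMPLATE, weight),
    )

for weight in (0.0, 1.0, 3.0):
    print(f"template weight {weight}: cross-SKU similarity "
          f"{cross_sku_similarity(weight):.2f}")
# template weight 0.0: cross-SKU similarity 0.00
# template weight 1.0: cross-SKU similarity 0.50
# template weight 3.0: cross-SKU similarity 0.90
```

With orthogonal unit product vectors, the similarity works out to w² / (1 + w²), so two unrelated products converge as the shared template takes up more of the image. The products did not change; the shared layer did.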

Section 08

Why this matters more in catalog ads than in static ads

In static ad testing, near-duplicate creative is inefficient. It may waste budget, create fatigue, or fail to reach meaningfully different audience pockets. In catalog advertising, the overwrite problem is more structural.

Static ad problem

We created too many similar ads.

Catalog overwrite problem

We used similar ads to replace the product evidence that made each SKU distinct.

Catalog ads are product-driven. The product record is the source from which Meta dynamically renders and delivers product advertising. Publicly accessible Meta catalog documentation treats additional images as product-level fields and says additional_image_link is supported by supplementary feeds.

When additional product images are overwritten, the change is not isolated to one ad test. It changes the catalog item itself.

Section 09

The overlay alternative

Preserve product context first. Add commercial context second.

Instead of replacing additional product images with rendered ad treatments, an overlay system keeps the original product image set intact and adds information on top:

price · sale message · discount · review proof · star rating · BNPL / payment message · urgency · shipping · availability · category-specific offer · margin-aware promotion

The point is not that design treatments are bad. The point is that the treatment should not erase the product evidence. Overlay is not anti-creative. It is a signal-preserving creative method.

Overwrite

Replace product context with ad context.

Changes the product-side input surface.

Overlay

Preserve product context and add ad context.

Enriches the visible creative while preserving original product-side evidence.
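
The difference between the two workflows can be expressed as a difference in which base image each render uses. The sketch below is hypothetical throughout: `RenderJob`, `overwrite_plan`, `overlay_plan`, and the file names are invented for illustration and do not describe any specific tool, including Waterbucket's. It plans renders only and does not draw pixels.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class RenderJob:
    source_image: str  # product image used as the base layer
    message: str       # commercial context drawn on top of it

def overwrite_plan(primary_image: str, messages: List[str]) -> List[RenderJob]:
    """Every treatment is rendered on the same primary image."""
    return [RenderJob(primary_image, m) for m in messages]

def overlay_plan(original_images: List[str], messages: List[str]) -> List[RenderJob]:
    """Each original image is kept as a base; messages cycle across them."""
    if not messages:
        raise ValueError("at least one message is required")
    return [
        RenderJob(img, messages[i % len(messages)])
        for i, img in enumerate(original_images)
    ]

def distinct_sources(jobs: List[RenderJob]) -> int:
    """Count how many different product views survive in the output."""
    return len({job.source_image for job in jobs})

# Hypothetical inputs.
originals = ["cutout.jpg", "lifestyle.jpg", "side-angle.jpg", "fabric-closeup.jpg"]
messages = ["20% Off", "4.8 Stars", "Free Shipping"]

overwrite = overwrite_plan(originals[0], messages)
overlay = overlay_plan(originals, messages)

print(distinct_sources(overwrite), distinct_sources(overlay))  # 1 4
```

Both plans carry the same commercial messages. Only the overlay plan keeps all four product views in the output set.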

Section 10

Architectural comparison: overwrite vs. overlay

This contrast is the center of the argument.

Dimension | Overwrite workflow | Overlay workflow
Core action | Replaces catalog image fields with rendered creative treatments | Preserves original imagery and appends commercial context
Typical method | Writes new assets into image_link / additional_image_link via feeds | Adds price, promo, review, BNPL, urgency, or proof on top of existing product images
Product signal | Product evidence may be replaced, obscured, or compressed | Product evidence remains intact
Visual diversity | Often multiple treatments of the same primary image | Preserves alternate angles, lifestyle, detail, variants, use-case context
Machine-readable effect | Can create false creative diversity: more files, less product-side distinction | Signal enrichment: product context plus commercial context
Representation risk | Template-dominant drift and product-side representation collapse | Product embedding integrity and context preservation
Algorithmic concern | Distinct SKUs may converge in representation space via shared templates | Distinct SKUs stay separable because original evidence is preserved
Best use case | When original imagery is poor, duplicate, noncompliant, or unusable | When original imagery has valuable context worth keeping available
Core thesis | Replaces product context with ad context | Adds ad context without erasing product context

Overwrite workflows replace or compress product context to create ad context. Overlay workflows preserve product context and append ad context. Therefore, overlay workflows are better aligned with product embedding integrity.

Section 11

A practical example

[Figure: a real overlay ad. The product photo stays, and color variants plus "4 easy payments" BNPL pricing are added on top.]

Consider a furniture retailer with one sofa SKU. The original image set tells a rich product story, the overwritten set trades that story for offer mechanics, and the overlay version keeps both.

The example ad shows the overlay principle in practice: the product photograph is preserved as evidence, while color options and real "as low as" pricing ride on top as commercial context.

Original catalog imagery · a rich product story

image_link

front-facing product cutout

additional_image_link

living-room lifestyle photo · side angle · fabric closeup · scale photo with person · rear construction · color variant

Overwritten catalog imagery · cutout + offer mechanics only

image_link

front-facing product cutout

additional_image_link

cutout + "20% Off" · cutout + "As low as $99/mo" · cutout + "4.8 Stars" · cutout + "Free Shipping" · cutout + branded frame · cutout + sale template

Overlay-preserved version · product-rich and conversion-aware

image_link

front-facing product cutout

additional_image_link

living-room lifestyle + "20% Off" · side angle + "As low as $99/mo" · fabric closeup + "4.8 Stars" · scale photo + "Free Shipping" · rear construction + warranty · color variant + sale message

The overwritten set keeps the offer mechanics but removes lifestyle, material, scale, and variant context. The product becomes more template-rich and less product-rich. The overlay version keeps the product evidence and adds commercial intent, so the catalog stays product-rich while the creative becomes conversion-aware.

Section 12

Evidence ledger

This paper intentionally separates public facts from technical inference.

1

Meta uses user embeddings in ads personalization

The Scaling User Modeling paper says effective user representations are pivotal in personalized advertising and describes SUM as a framework, deployed in Meta's ads ranking system, that synthesizes user embeddings from massive user features and feeds them into downstream online ranking models.

2

Meta uses ads retrieval before ranking

The Andromeda post says retrieval is the first step in Meta's multi-stage recommendation system, narrowing tens of millions of candidates to a few thousand before larger ranking models determine the final ads shown.

3

Meta stores precomputed ad embeddings and features

The Andromeda post says precomputed ad embeddings and features are stored in the local memory of the NVIDIA Grace Hopper Superchip used by the system.

4

Ads systems use event embeddings and multimodal semantic signals

The sequence-learning post says event models synthesize event embeddings from event attributes, and that the next-generation system scales toward richer semantic signals through multimodal content embeddings.

5

GEM uses ad content, engagement, attributes, format, and creative representation

The GEM post says the model is trained on ad content and user engagement data from ads and organic interactions, deriving sequence features such as activity history and non-sequence features such as user and ad attributes, including ad format and creative representation.

6

Lattice processes mixed-format, multimodal inputs

The Lattice work describes a framework for cost-effective industry-scale ads recommendations, with networks processing categorical features, dense features, and sequence features.

7

additional_image_link is a structured catalog field

Publicly accessible Meta catalog documentation defines it as URLs for up to 20 additional images, following the same specifications as image_link, notes additional images can be displayed in ads, and states the field is supported by supplementary feeds.

8

Additional images are commonly used to add product context

Google Merchant Center documentation describes additional images as a way to show different angles, product staging, product use, details, bundles, and lifestyle context.

Section 13

Technical inference

The public facts support the following inference:

  1. Meta's ads systems use learned representations, embeddings, user-ad interactions, creative representations, multimodal inputs, and retrieval/ranking models.
  2. Meta catalogs contain structured product fields, including primary and additional product images.
  3. Additional images can encode product context that is not present in the primary image.
  4. Replacing those images with repetitive treatments of the same primary image changes the product-side input surface.
  5. If those treatments are visually or semantically similar, they may create false creative diversity: more files, but not more meaningful distinction.
  6. In a representation-driven delivery system, reduced product-side distinction can plausibly weaken the system's ability to understand and match products to relevant users.

This is a reasoned technical model, not a claim of direct access to Meta's proprietary production logic.

Section 14

The Waterbucket thesis

Waterbucket's thesis is that catalog creative should enrich the product record without collapsing it.

Do not

  • Replace the product evidence.
  • Overwrite diverse additional images with repetitive templates.
  • Confuse asset count with representational diversity.
  • Turn every SKU into the same promotional visual system.

Instead

  • Preserve the original product imagery.
  • Preserve lifestyle, detail, variant, and use-case context.
  • Add offer logic as an overlay: proof, price, review, urgency, financing.
  • Keep the catalog product-rich and the ad conversion-aware.

The strategic difference is not simply "overlay vs. overwrite." It is:

embedding preservation vs. embedding collapse
product context vs. template dominance
true distinction vs. false creative diversity
signal enrichment vs. signal replacement

Waterbucket's position is not that advertisers should avoid commercial creative signals. It is that commercial signals should be added without erasing the product signals that help the system understand what the product is.

Section 15

Recommended terminology

Product embedding integrity
The degree to which catalog inputs preserve the original product evidence needed to represent an item accurately in machine-readable space.
Product-side representation
The learned or derived representation of a product/ad candidate based on catalog fields, creative inputs, attributes, historical performance, and other product/ad signals.
Catalog context preservation
Keeping product images, attributes, and contextual imagery intact while adding commercial messaging in a non-destructive way.
False creative diversity
A condition where many rendered creative variants exist operationally but remain visually or semantically similar in representation space.
Product-side representation collapse
The reduction of product-level distinctiveness caused by replacing diverse product evidence with repetitive or template-dominant creative signals.
Template-dominant embedding drift
A condition where repeated design templates become a stronger shared signal across many SKUs than the product-specific imagery those templates replaced.
Product-market matching
The process by which a delivery system matches learned user-side representations to learned product/ad-side representations.

Section 16

Implications for advertisers

The old question

Can this tool generate more catalog creative?

The better question

Does this tool preserve or replace the product-side evidence in my catalog?

The tooling audit

  • Does the tool overwrite image_link or additional_image_link?
  • Does it use supplementary feeds to replace original product images?
  • Does it preserve lifestyle, detail, variant, and use-case images?
  • Does it create multiple treatments of the same primary image?
  • Does it make many SKUs visually similar through repeated templates?
  • Does it add commercial context without erasing product context?
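
The first two audit questions can be partly answered by diffing the feed before and after a tool runs. The sketch below assumes both feeds have already been parsed into dictionaries keyed by item id. The `audit_feed` function, its output keys, and the URLs are hypothetical. It compares URLs only, so it flags changes for human review rather than judging whether a new image preserves product context.

```python
def audit_feed(original, modified):
    """Report, per item, which image fields changed between two feed snapshots."""
    findings = {}
    for item_id, before in original.items():
        after = modified.get(item_id)
        if after is None:
            continue  # item missing from the modified feed; nothing to compare
        before_extra = set(before["additional_image_link"])
        after_extra = set(after["additional_image_link"])
        findings[item_id] = {
            "primary_replaced": before["image_link"] != after["image_link"],
            "removed": sorted(before_extra - after_extra),
            "added": sorted(after_extra - before_extra),
        }
    return findings

# Hypothetical snapshots of the same two items before and after a tool runs.
original_feed = {
    "sofa-001": {
        "image_link": "https://example.com/sofa-front.jpg",
        "additional_image_link": [
            "https://example.com/sofa-living-room.jpg",
            "https://example.com/sofa-fabric.jpg",
        ],
    },
    "lamp-002": {
        "image_link": "https://example.com/lamp-front.jpg",
        "additional_image_link": ["https://example.com/lamp-desk.jpg"],
    },
}
modified_feed = {
    "sofa-001": {
        "image_link": "https://example.com/sofa-front.jpg",
        "additional_image_link": [
            "https://example.com/render/sofa-sale-badge.jpg",
            "https://example.com/render/sofa-financing.jpg",
        ],
    },
    "lamp-002": {
        "image_link": "https://example.com/lamp-front.jpg",
        "additional_image_link": ["https://example.com/lamp-desk.jpg"],
    },
}

report = audit_feed(original_feed, modified_feed)
for item_id, result in report.items():
    print(item_id, "removed:", len(result["removed"]), "added:", len(result["added"]))
```

An item whose original additional images all appear under `removed` is a candidate for closer inspection, since that is the pattern an overwrite workflow would produce.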

Performance should also be evaluated beyond blended ROAS. Advertisers should examine:

long-tail SKU delivery · product-set-level performance · new customer acquisition · category-level lift · SKU-level impression distribution · incrementality by product group · feed health · creative fatigue · coverage of product variants

If overwriting additional images improves performance for a small subset of hero SKUs but reduces discovery across the long tail, blended account metrics may hide the damage.
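
A worked example shows how a blended metric can mask a long-tail decline. All numbers below are invented for illustration, and the split into "hero" and "long_tail" segments is an assumption about how an advertiser might group SKUs.

```python
def blended_roas(segments):
    """Total revenue divided by total spend across all segments."""
    spend = sum(s["spend"] for s in segments.values())
    revenue = sum(s["revenue"] for s in segments.values())
    return revenue / spend

def impression_share(segments, name):
    """Fraction of all impressions delivered to one segment."""
    total = sum(s["impressions"] for s in segments.values())
    return segments[name]["impressions"] / total

# Hypothetical account data before and after an overwrite workflow.
before = {
    "hero": {"spend": 1000, "revenue": 4000, "impressions": 60000},
    "long_tail": {"spend": 1000, "revenue": 3000, "impressions": 40000},
}
after = {
    "hero": {"spend": 1600, "revenue": 6400, "impressions": 90000},
    "long_tail": {"spend": 400, "revenue": 800, "impressions": 10000},
}

for label, data in (("before", before), ("after", after)):
    print(
        f"{label}: blended ROAS {blended_roas(data):.2f}, "
        f"long-tail impression share {impression_share(data, 'long_tail'):.0%}"
    )
# before: blended ROAS 3.50, long-tail impression share 40%
# after: blended ROAS 3.60, long-tail impression share 10%
```

In this invented scenario blended ROAS rises from 3.50 to 3.60 while long-tail impression share falls from 40% to 10%. That is why SKU-level distribution needs to be read alongside account-level results.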

Section 17

Implications for creative testing

In a representation-driven system, variation is only valuable when it creates useful distinction. A small copy change may not create a new concept. A badge swap may not create a new visual concept. A color change in the template may not create meaningful product-side diversity. The strongest testing framework protects both dimensions:

Product distinction

  • What is the product?
  • How is it used?
  • What does it look like in context?
  • What makes this SKU different?

Commercial context

  • Why buy now?
  • What is the offer?
  • What proof exists?
  • What payment option is available?
  • What urgency exists?

Overwrite workflows often trade product distinction for commercial context. Overlay workflows can preserve both. Therefore, the goal is not more creative variants. The goal is more distinct product-market matches.

Section 18

Limitations

This paper does not claim access to Meta's proprietary production models.
It does not claim that every additional_image_link is always used in every auction, retrieval pass, ranking pass, or delivery decision.
It does not claim that all overwrites are always harmful. Where original images are poor, duplicate, low-quality, noncompliant, or commercially ineffective, replacing them may improve performance.
It does not claim that overlays automatically outperform overwrites in every account. Performance depends on category, feed quality, creative quality, auction conditions, conversion data, pixel/app events, price competitiveness, and campaign structure.

The narrow claim

In a representation-driven ads system, replacing diverse product imagery with repetitive rendered treatments should be understood as a change to the product-side input surface, not merely as a creative formatting choice.

Section 19

Conclusion

Modern catalog advertising operates inside AI-driven recommendation systems that increasingly depend on learned representations, embeddings, multimodal signals, user-ad interactions, and retrieval/ranking pipelines. Meta's public engineering work supports this high-level model, and publicly accessible catalog documentation confirms that additional product images are structured catalog fields.

The implication is that catalog images are not just creative assets. They are product signals. When additional product images are overwritten with repetitive treatments of the same primary image, the catalog may appear to gain creative volume, but it can lose visual context, product distinction, and SKU-level separability. That is false creative diversity.

The better objective is product embedding integrity. Preserve the product evidence. Preserve the visual context. Preserve the distinction between SKUs. Then add commercial context through overlays.

The goal

Not more creative files. More distinct product-market matches.

Core axiom

In AI-driven catalog advertising, overwrite workflows can weaken product-side matching by erasing or compressing crucial multimodal product signals, causing false creative diversity and product-side representation collapse. Overlay workflows preserve product embedding integrity by keeping the original visual context intact while adding commercial context, giving Meta's delivery system richer product evidence for more accurate product-market matching.

One-sentence thesis

In AI-driven catalog advertising, overwriting additional product images is not just a creative workflow; it can collapse product-side context by replacing diverse product evidence with repetitive template treatments, while overlay workflows preserve product embedding integrity by adding commercial context without erasing the signals that help the system understand the product.

References

Public sources

  1. Meta AI. Scaling User Modeling: Large-Scale Online User Representations for Ads Ranking in Meta. arXiv:2311.09544.
  2. Engineering at Meta. Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine. December 2024.
  3. Engineering at Meta. Sequence learning: A paradigm shift for personalized ads recommendations. November 2024.
  4. Engineering at Meta. Meta's Generative Ads Model (GEM): The central brain accelerating ads recommendation AI innovation. November 2025.
  5. Meta AI. New AI advancements drive Meta's ads system performance and efficiency (Meta Lattice).
  6. Meta for Business. AI innovation in Meta's ads ranking driving advertiser performance.
  7. Meta Business Help Center. Supported fields for catalogs, including additional_image_link.
  8. Google Merchant Center Help. Additional image link [additional_image_link].

Product names, engineering system names, and documentation referenced above are the property of their respective owners. This paper cites public materials for analysis and does not represent an affiliation or endorsement.