VISION · SEGMENTATION · SAM

SAM 3.1.

Segment anything. Object Multiplex.

Meta's latest segmentation model — find, segment, and track objects across images and video from a single prompt. Object Multiplex lets you pull many subjects in one pass instead of one-call-per-object. Open weights, fine-tunable, sub-cent per frame.
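
For a sense of the call shape, here is a minimal sketch of a single Multiplex request over HTTP. The endpoint path, payload fields, and response keys are illustrative assumptions, not the documented Infer API:

```python
import base64
import requests

# Hypothetical endpoint and payload -- illustration only, not the
# documented Infer API.
API_URL = "https://api.infer.example/v1/sam-3.1/segment"

with open("street_scene.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    API_URL,
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"image": image_b64, "mode": "multiplex"},  # one pass, every subject
    timeout=30,
)
resp.raise_for_status()

# Assumed response shape: one entry per detected subject.
for i, obj in enumerate(resp.json()["masks"]):
    print(i, obj["bbox"], obj["area"])
```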

AVG LATENCY · ~0.4s / FRAME
STARTING AT · $0.002 / IMG
Video subject tracking demo — click a subject on frame 0, get per-frame masks across the entire clip.
Background removal: catalog-scale subject isolation, preserving fabric edges and contact shadows.
FathomNet collaboration: underwater species segmentation across video frames.
SAM 3.1 Object Multiplex: segment every distinct object in this image in a single pass.

Where SAM 3.1 shines.

Video editing tools

Subject isolation for masking, color grading, and effects work. SAM 3.1's video tracking holds across cuts and occlusion.

EXAMPLE · Click a subject on frame 0, get per-frame masks across the entire clip.
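
A hedged sketch of that flow against a hosted video endpoint; the URL, payload fields, and response shape are assumptions for illustration:

```python
import requests

API_URL = "https://api.infer.example/v1/sam-3.1/track"  # hypothetical

resp = requests.post(
    API_URL,
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "video_url": "https://example.com/clip.mp4",
        # One point prompt on frame 0; the model propagates the mask
        # through the rest of the clip.
        "prompt": {"frame": 0, "point": [640, 360]},
    },
    timeout=120,
)
resp.raise_for_status()

frames = resp.json()["frames"]  # assumed: one mask per frame, stable subject id
print(f"{len(frames)} per-frame masks returned")
```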


E-commerce background removal

Catalog-scale background isolation with higher accuracy than commercial alternatives at a fraction of the cost.

EXAMPLE · Remove background from this product photo, preserving fine fabric edges and contact shadows.
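
Once the subject mask comes back, compositing the transparent cutout is a few lines of NumPy and Pillow. This sketch assumes the mask is already decoded to a boolean HxW array; the file names are placeholders:

```python
import numpy as np
from PIL import Image

# Assumes the returned mask is already decoded to a boolean HxW array
# (True = product pixels) and saved as mask.npy; file names are placeholders.
product = np.asarray(Image.open("product.jpg").convert("RGB"))
mask = np.load("mask.npy").astype(bool)

alpha = np.where(mask, 255, 0).astype(np.uint8)  # opaque subject, clear bg
rgba = np.dstack([product, alpha])
Image.fromarray(rgba, "RGBA").save("product_cutout.png")
```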

Scientific imagery

Microscopy, astronomy, medical imaging — anywhere precise segmentation matters and labels are scarce. The FathomNet collaboration shows the underwater pipeline.

EXAMPLE · Segment all visible species in this underwater video frame (FathomNet pipeline).


Dataset labeling

Bootstrap large-scale segmentation datasets at sub-cent cost per frame. Useful for in-house ML training data prep.

EXAMPLE · Generate polygon segmentation for every distinct object in this image.
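
If your training pipeline wants COCO-style polygons rather than raster masks, a standard OpenCV contour pass does the conversion. This is generic post-processing, not part of the SAM or Infer API:

```python
import cv2
import numpy as np

def mask_to_polygons(mask: np.ndarray) -> list:
    """Convert a binary HxW mask to COCO-style flat polygon lists."""
    contours, _ = cv2.findContours(
        mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    polygons = []
    for c in contours:
        c = c.reshape(-1, 2)
        if len(c) >= 3:  # skip degenerate (point/line) contours
            polygons.append(c.flatten().astype(float).tolist())
    return polygons
```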

Generated with SAM 3.1.

A live cross-section of the model's range — crowded scenes, video tracking, product shots, underwater research. Hover any tile to pause and read its prompt.

Object Multiplex: every distinct object segmented in a single pass — crowded scene baseline.
Video subject tracking — click a subject on frame 0, get per-frame masks across the entire clip with consistent identity.
Catalog-scale background removal preserving fine edges and contact shadows for e-commerce.
FathomNet collaboration — underwater species segmentation across video frames for marine research datasets.

By the numbers.

Best in class · Object Multiplex (multi-subject)
Top 1 · Video tracking IoU
Apache 2.0 · License

Commercial bg-removal SaaS · Substantially cheaper per frame; fully self-hostable when you scale.
FLUX.1 Kontext [pro] · SAM produces masks; Kontext consumes them — pair them in a pipeline rather than compete.
Nano Banana 2 · Different jobs — SAM is segmentation, Nano Banana is generation. Pair in a creative pipeline.

$0.002 / image

Pay only for successful generations. No idle, no minimums, no per-seat. Volume discounts kick in at 10K req/mo.
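
A quick back-of-envelope at the listed rate:

```python
# Back-of-envelope at the listed rate: 500K frames/month.
frames_per_month = 500_000
print(f"${frames_per_month * 0.002:,.0f}/mo before volume discounts")  # $1,000/mo
```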

VS NATIVE · Cheaper than every commercial alternative on the market — SAM is open and lightweight, and Infer prices on compute pass-through.
VS SELF-HOST · Apache-licensed, fully self-hostable. Use Infer to skip the GPU ops; switch to on-prem at scale.

Things teams ask.

Q.01 · What's Object Multiplex?
SAM 3.1's marquee new capability: segment many distinct subjects in a single prompt rather than calling once per object. Massive cost and latency win for crowded scenes.
Q.02 · Can it track across video?
Yes — SAM 3.1 holds subject identity across cuts and partial occlusion. The video API takes a click or box on frame 0 and returns per-frame masks across the entire clip.
Q.03 · How is this prompted?
Three modes: a click (point), a bounding box, or a coarse mask. Text prompting (semantic segmentation) is on the roadmap but not in this version.
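
Illustrative payload shapes for the three modes; the field names are assumptions, not the documented API:

```python
# Illustrative prompt payloads for the three modes -- field names are
# assumptions, not the documented API.
point_prompt = {"type": "point", "coords": [412, 230]}           # a click
box_prompt   = {"type": "box",   "coords": [100, 80, 520, 440]}  # x0, y0, x1, y1
mask_prompt  = {"type": "mask",  "rle": "<coarse mask, run-length encoded>"}
```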
Q.04 · Is it really $0.002 per image?
Yes. SAM is open-weights and lightweight — the inference cost is genuinely low. Infer prices on compute pass-through.
Q.05 · Can I use the outputs commercially?
Yes. Apache 2.0 license. You can also self-host the weights if you outgrow our hosted endpoint.
Q.06 · What are the rate limits?
Default tier is 60 requests per minute, with burst capacity to 120/min. SAM is fast enough that this rarely binds in practice.
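
If a batch job does brush the limit, a client-side exponential backoff on HTTP 429 keeps it resilient. A generic sketch, not tied to Infer's actual rate-limit headers:

```python
import time
import requests

def post_with_backoff(url, payload, headers, max_retries=5):
    """Retry on HTTP 429 with exponential backoff. A generic client-side
    sketch; not tied to Infer's actual rate-limit headers."""
    for attempt in range(max_retries):
        resp = requests.post(url, json=payload, headers=headers, timeout=30)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("still rate-limited after retries")
```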
Q.07 · How is this different from running it myself?
Infer handles GPU provisioning, queueing, and updates as new SAM checkpoints land. You get the open weights' flexibility without owning infrastructure.

Ship with SAM 3.1.

One key. One bill. One SDK shape — across 100+ models. Free credits on signup, no card required.