AI Research from NVIDIA

LocateAnything

LocateAnything is NVIDIA's 3B-parameter vision-language model. It locates any object, UI target, or text in images and videos from a natural-language prompt.

⚙️ Advanced parameters
Temperature 0.7
Top P 0.9
Top K 20
Max Video Frames 4
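
The three sampling parameters above usually act together when a model picks its next token. The sketch below is a generic illustration in plain Python, not LocateAnything's actual decoding code: the function names `filter_distribution` and `sample_token` are hypothetical, and the order of operations (temperature, then Top K, then Top P over the original probabilities) is an assumption that differs between inference libraries. The defaults mirror the values shown above.

```python
import math
import random


def filter_distribution(logits, temperature=0.7, top_k=20, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering.

    Hypothetical helper for illustration; the filter order is an assumption.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: divide logits before softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top K: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so their probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.7, top_k=20, top_p=0.9, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, temperature, top_k, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

Lowering Temperature, Top K, or Top P narrows the set of candidate tokens and makes output more repeatable; raising them allows more varied output. For example, `sample_token([2.0, 1.0, 0.0, -1.0])` draws from a four-token toy vocabulary using the default values.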