interlocute.ai beta
Coming Soon

Image Intelligence

Three-layer image analysis that scales from an instant structural fingerprint to semantic intelligence and full forensic verification. The structural layer is free and deterministic. Semantic adds LLM-powered understanding. Forensic adds adversarial verification and manipulation detection.

Structural fingerprint — instant local analysis

Deterministic, zero-API-call analysis that runs locally: structural properties (dimensions, format, aspect ratio), visual quality metrics (blur, noise, exposure), colour analysis (dominant colours, luminance, palette), and metadata extraction (EXIF, ICC profiles). Free, instant, and always available.

Semantic intelligence — LLM-powered understanding

A multimodal LLM produces structured semantic analysis: scene classification (photo, screenshot, meme, chart, product shot), entity detection, intent analysis (informational, persuasive, transactional), emotional tone (valence–arousal model), and design assessment scores (clarity, hierarchy, readability, trust, professionalism). Includes dense captions and content tags from vision APIs.

Forensic verification — adversarial analysis

Multi-pass adversarial analysis adds OCR evidence anchors with bounding boxes, claim validation against extracted text, contradiction detection, manipulation signals (editing detection, lighting inconsistencies, AI artifact likelihood), contextual risk assessment, and a tamper-evident hash of the evidence report.
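The page does not say how the evidence report hash is computed. One common approach, shown here purely as an assumption, is to hash a canonical JSON serialisation of the report with SHA-256, so that any later change to the report produces a different digest. The report fields and function names below are hypothetical.

```python
import hashlib
import hmac
import json


def evidence_hash(report: dict) -> str:
    """SHA-256 over canonical JSON (an assumed scheme, not confirmed by the source)."""
    # sort_keys and fixed separators make the encoding independent of key
    # order and whitespace, so equal reports always produce the same digest.
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(report: dict, expected_hash: str) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return hmac.compare_digest(evidence_hash(report), expected_hash)


if __name__ == "__main__":
    # Hypothetical report shape, for illustration only.
    report = {
        "claims": [{"text": "Price is $19.99", "supported": True}],
        "ocr_anchors": [{"text": "$19.99", "bbox": [120, 48, 210, 80]}],
        "risk": "low",
    }
    digest = evidence_hash(report)
    tampered = {**report, "risk": "high"}
    print(verify(report, digest))    # unchanged report matches its digest
    print(verify(tampered, digest))  # an edited report does not
```

A bare digest only reveals changes if it is stored somewhere the report's editor cannot also alter; stronger guarantees would need a signature or an external timestamp.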

Object detection & vision features

Independent vision feature flags — object detection with bounding boxes, people detection, background removal, safe search, web detection (reverse image search), logo detection, and landmark detection — can be combined with any analysis layer or used standalone.

Built-in profiles

Four built-in profiles: Quick Scan (structural fingerprint only — free), Full Intelligence (structural + semantic with dense captions), Forensic (full pipeline with safe search and web detection), and Object Detection (standalone bounding-box detection). Custom profiles can specify any combination of layers and vision features.
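The four built-in profiles can be pictured as named combinations of layers and vision feature flags. The sketch below is a hypothetical data model: the type names, field names, dictionary keys, and the exact feature set attached to each profile are assumptions inferred from the descriptions above, not a documented schema.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Layer(Flag):
    NONE = 0
    STRUCTURAL = auto()
    SEMANTIC = auto()
    FORENSIC = auto()


class Feature(Flag):
    NONE = 0
    OBJECTS = auto()
    PEOPLE = auto()
    BACKGROUND_REMOVAL = auto()
    SAFE_SEARCH = auto()
    WEB_DETECTION = auto()
    LOGOS = auto()
    LANDMARKS = auto()
    DENSE_CAPTIONS = auto()


@dataclass(frozen=True)
class Profile:
    name: str
    layers: Layer
    features: Feature = Feature.NONE


ALL_LAYERS = Layer.STRUCTURAL | Layer.SEMANTIC | Layer.FORENSIC

BUILT_IN = {
    "quick_scan": Profile("Quick Scan", Layer.STRUCTURAL),
    "full_intelligence": Profile("Full Intelligence",
                                 Layer.STRUCTURAL | Layer.SEMANTIC,
                                 Feature.DENSE_CAPTIONS),
    "forensic": Profile("Forensic", ALL_LAYERS,
                        Feature.SAFE_SEARCH | Feature.WEB_DETECTION),
    # Modelled as layer-free because the page calls it standalone; how it
    # interacts with the structural pass is an assumption here.
    "object_detection": Profile("Object Detection", Layer.NONE,
                                Feature.OBJECTS | Feature.PEOPLE),
}

# A custom profile is just another combination ("Brand Check" is made up).
brand_check = Profile("Brand Check", Layer.STRUCTURAL | Layer.SEMANTIC,
                      Feature.LOGOS | Feature.SAFE_SEARCH)


def uses_llm(profile: Profile) -> bool:
    """True when the profile includes a model-backed layer (semantic or forensic)."""
    return bool(profile.layers & (Layer.SEMANTIC | Layer.FORENSIC))
```

Using flag enums keeps "any combination of layers and vision features" a one-line expression, and membership checks such as Feature.LOGOS in brand_check.features stay readable.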

Confidence-driven escalation

When the semantic model reports low confidence, the pipeline automatically escalates to forensic verification. High-stakes mode forces forensic verification regardless of confidence — ideal for content moderation, legal evidence, and brand safety workflows.
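The escalation rule can be sketched as a small decision function. The default threshold of 0.65 and the highStakes setting come from the FAQ further down this page; the function name, parameter names, the string labels for layers, and the assumption that confidence lies between 0 and 1 are illustrative, not the platform's real interface.

```python
DEFAULT_ESCALATION_THRESHOLD = 0.65  # default stated in the FAQ


def layers_to_run(semantic_confidence: float,
                  high_stakes: bool = False,
                  threshold: float = DEFAULT_ESCALATION_THRESHOLD) -> list[str]:
    """Decide which layers run for one image (hypothetical helper)."""
    # Assumption: confidence is a score in the range [0, 1].
    if not 0.0 <= semantic_confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    layers = ["structural", "semantic"]
    # Escalate when forced, or when confidence falls strictly below the threshold.
    if high_stakes or semantic_confidence < threshold:
        layers.append("forensic")
    return layers


if __name__ == "__main__":
    print(layers_to_run(0.90))                    # ['structural', 'semantic']
    print(layers_to_run(0.40))                    # ['structural', 'semantic', 'forensic']
    print(layers_to_run(0.90, high_stakes=True))  # ['structural', 'semantic', 'forensic']
```

In this sketch a score exactly equal to the threshold does not escalate, following the FAQ's wording that escalation happens when confidence is below the threshold.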

Frequently Asked Questions

Image Intelligence

What image formats are supported?
Interlocute supports JPEG, PNG, WebP, TIFF, BMP, and GIF. Images can be submitted as raw bytes, a public URL, a blob SAS URI, or a provider file ID.
What are the three analysis layers?
The structural fingerprint runs locally with zero API calls — structural properties, quality metrics, colour analysis, and metadata. Semantic intelligence adds understanding from a multimodal LLM — scene classification, entities, intent, emotion, and design scores. Forensic verification adds adversarial analysis — OCR evidence anchors, claim validation, manipulation detection, and risk assessment.
Is the structural fingerprint really free?
Yes. The structural fingerprint is fully deterministic and runs locally. It makes no external API calls and incurs no provider costs. It is always executed as the first step of any analysis.
How does confidence-driven escalation work?
When semantic intelligence produces a confidence score below the profile's escalation threshold (default 0.65), the pipeline automatically runs forensic verification. You can also force forensic verification by setting highStakes to true.
What is adversarial verification?
Forensic verification uses a second model pass that cross-examines claims made by the semantic analysis against OCR-extracted evidence. It detects contradictions and manipulation signals (edited images, AI-generated artifacts, lighting inconsistencies) and produces a contextual risk assessment with a tamper-evident evidence hash.
Can I detect objects without running the full pipeline?
Yes. The Object Detection profile runs independently of the layered pipeline. It detects objects and people with bounding boxes, extracts content tags, and optionally removes the background — all without invoking semantic intelligence or forensic verification.
How is image analysis billed?
The structural fingerprint is free. Semantic intelligence incurs LLM token costs plus a platform premium. Forensic verification incurs additional LLM costs for the adversarial pass and vision API costs for OCR and feature extraction. All costs are attributed per-request in your usage ledger.

Ready to build with Image Intelligence?

Deploy your node in seconds and start using Image Intelligence today.