🔥 Major breakthrough · Beta testing

Vision LLM

Grounding

Timeline

Local-first

Real Multimodal Video Understanding is now operational in beta

The latest beta can extract real video frames, analyze them with Vision models, align results to the timeline, and fuse visual understanding with speech, OCR, metadata and semantic memory.

Download the latest beta LM Studio + Gemma Vision setup Join the closed beta

What this means for beta testers

This is the biggest Vision Intelligence upgrade so far — an early operational breakthrough, not a marketing promise. The system separates what is visible, what is interpreted, and what remains uncertain.

Visual observation

Frame-grounded evidence from extracted video frames.

Interpretation

Fused timeline events from Vision LLM, speech and OCR.

Uncertain assumptions

Flagged separately with grounding scores and diagnostics.

Real frame analysis

Vision LLMs analyze extracted video frames — multimodal payloads with actual image data.

Timeline grounding

Scene understanding aligned to searchable multimodal timeline events.

Evidence-based fusion

Hallucination control with grounding scores, confidence and evidence sources.

BLIP / CLIP / OCR / speech

Multi-signal fusion combined with Vision LLM output and .vtag export.

Local-first

LM Studio + Gemma Vision on your machine — no cloud upload required.

Semantic Memory

Multimodal timeline events indexed for search across your library.

Operational in current beta builds

Real video frame extraction and Vision LLM analysis
Multimodal payloads with actual image frames
Timeline-aligned scene understanding
Grounded cinematic scene descriptions
Visual observation / interpretation / uncertainty separation
Vision diagnostics with confidence and grounding scores
BLIP, CLIP, OCR, speech and metadata fusion
Local-first via LM Studio + Gemma Vision
Searchable multimodal timeline events in Semantic Memory

Interested in testing the latest build?

Contact Bomark Analys for beta access — or download the latest build if you are already approved.

Request beta access Contact Bomark Analys

Early operational breakthrough — results vary by model, hardware and media. Grounding scores surface evidence but do not guarantee perfection. Fallback to text-based analysis when vision payloads fail.

VisionaryAI Suite 1.5.2 Closed Beta Now Open

A new way to search, understand and navigate your media

Fully local. Fully private. Fully intelligent.

Version 1.5.2 deepens semantic memory, analysis quality, benchmarking and end-to-end media intelligence. People-based premium capabilities—including face recognition where your programme allows—are licence-controlled: they are not enabled in the public Trial by default and may only be activated for selected beta, pilot or commercial agreements.

Closed Beta access may include selected premium capabilities such as people-based video intelligence and face recognition, depending on approval, licence configuration and intended use. These features require responsible use and may involve biometric data.

This is a major step towards turning unstructured media into something you can actually work with.

Same program for trial and beta: the Windows package on our download page is the same build either way. You can apply for a closed-beta license key to run it for a longer evaluation period than the default trial—coordinated with onboarding in the closed beta community.

Join the Closed Beta Apply for Closed Beta

Download is live: get the Closed Beta portable ZIP on the beta tester download page — VisionaryAI_Suite_Portable_1.5.2_ClosedBeta.zip (~5.8 GiB).

Access, onboarding, and license keys for extended evaluation are coordinated through our private Facebook group. The link opens in a new tab. Prefer not to use Facebook? Reach us via the contact page.

For closed-beta access and longer-trial licence keys, use the Facebook group or contact us—Pilot Portal onboarding is paused for now.

Live program — actively developed New features ship frequently Community feedback shapes the roadmap Direct line to the creator

Meet the creator

Robert Bomark · Bomark Analys AB

I build VisionaryAI Suite as a local-first AI platform for intelligent media analysis, semantic memory, transcription, and metadata you can keep beside your files.

Testers get personal support when onboarding gets sticky—this is a real beta, not a hands-off drop.

LinkedIn Join the Closed Beta

Robert Bomark · Founder & creator, VisionaryAI Suite

Getting started is easier than you think

Four concise steps—most people finish setup in under a quarter of an hour.

1
Download VisionaryAI Suite

Install the Windows package from download. It is the same build used for the public trial and the closed beta. Once you are approved, activate your beta license key in the app to keep evaluating beyond the default trial window.
2
Run the setup wizard

Paths, GPU checks, and first-run calibration are spelled out—no guesswork.
3
Select or download AI models

Pick the pack that matches your VRAM; the UI explains trade-offs in plain language.
4
Analyze media locally

Queue a folder, watch semantic memory populate, and search locally—people-based features only where your licence enables them.

Estimated setup: 10–15 minutes

Stuck? Message robert.bomark@bomarkanalys.se or use contact—I answer beta issues personally.

What’s new in 1.5.2

Core features in the VisionaryAI Suite 1.5.2 closed beta—built for large media libraries, without sending your files to the cloud.

Licence-controlled face recognition

Where enabled for your approved licence, the suite can support local detection and recognition workflows for images and video—always under your control on your hardware.

Face databases and identity linking are premium, licence-governed capabilities. You must have a valid legal basis and rights for any identifiable persons in your media.

Fully local processing when configured
Manual + automatic identity confirmation (where licenced)
Improvement from your own confirmed data
Images and video where your licence allows

Semantic Memory Engine

Your media is no longer just files.

VisionaryAI builds a semantic memory of your content, making it searchable even without manually tagging anything. For positioning against ordinary filename search—and how semantic media search fits your archive—see AI file search with VisionaryAI Suite.

Search using natural language
Understands context, not just keywords
Works across images, video, and metadata
Persistent memory database

People search (licensed workflows)

Where people intelligence is enabled for your licence, find matching images and segments across large libraries.

Integrated with semantic search; identity features follow your licence terms and legal posture—not a free-for-all public capability.

Integrated with semantic search
Works in images and video where enabled
Supports confirmed identities under policy
Designed for controlled, licenced use

Video timeline intelligence

When people features are enabled by licence, navigate by detected faces and timestamps instead of manual scrubbing alone.

Use timeline cues as editorial or review aids—subject to your rights in the underlying material.

Frame-based detection
Timeline visualization
Jump to exact moments
Segment grouping

Face-focused analysis (licensed)

Optional lighter face-only passes where your licence and policy allow—without implying this is a default public Trial feature.

Useful when building structured review workflows under an approved configuration.

Lightweight processing
Batch support
Right-click integration
Instant results

Identity suggestions (licensed)

Where enabled, the system can propose matches from prior confirmations—always under your review and legal responsibility.

Automation is constrained by licence and intended use; biometric sensitivity requires careful governance.

Confidence-based suggestions
Auto-confirm workflows
Batch operations
Continuous improvement

Your media never leaves your environment

VisionaryAI is engineered for teams that cannot normalize cloud upload. Source media stays on machines you operate, models run locally, and metadata lands in open formats you can index—ideal for archives, researchers, journalists, and studios.

You decide what touches the internet. No forced pipeline to someone else’s servers just to understand your own library.

Air-gap friendly posture for confidential collections.
Full custody of rushes, stills, and AI outputs.
Professional accountability—Swedish company, active development, direct beta support.

See the workflow before you install

Motion beats mystery—watch a clip, then grab the build.

Getting started

Subtitle and translation automation—fast proof that the desktop stack is real.

Open gallery

Quick demo

Identity, layout, and the “feel” of VisionaryAI Suite in one take.

Open gallery

Feature walkthrough

Reserve this card for a deep semantic-memory session—link your newest recording here.

Placeholder — add video

Where the platform is heading

Semantic Memory
Licence-controlled people & face capabilities
Timeline Search
.vtag metadata
AI Shell overlay
Advanced forensics
Cross-media intelligence

Join the Closed Beta

We are opening a limited number of spots for early users. You run the same program version as everyone who downloads the public trial—and you can apply for a license key that lets you try it for longer while you explore features and share feedback.

Join the Closed Beta Apply for Closed Beta

Limited availability

Beta software can change quickly; use non-production or well-backed-up archives. The public trial and closed beta use the same Windows build; licensing and evaluation length differ. See also pricing.

What this means for beta testers

Real frame analysis

Timeline grounding

Evidence-based fusion

BLIP / CLIP / OCR / speech

Local-first

Semantic Memory

Operational in current beta builds

Interested in testing the latest build?

VisionaryAI Suite 1.5.2 Closed Beta Now Open

Robert Bomark · Bomark Analys AB

Getting started is easier than you think

Download VisionaryAI Suite

Run the setup wizard

Select or download AI models

Analyze media locally

What’s new in 1.5.2

Licence-controlled face recognition

Semantic Memory Engine

People search (licensed workflows)

Video timeline intelligence

Face-focused analysis (licensed)

Identity suggestions (licensed)

Your media never leaves your environment

See the workflow before you install

Getting started

Quick demo

Feature walkthrough

Where the platform is heading

Join the Closed Beta