Frame-grounded evidence from extracted video frames.
🔥 Major breakthrough · Beta testing
Real Multimodal Video Understanding is now operational in beta
The latest beta can extract real video frames, analyze them with Vision models, align results to the timeline, and fuse visual understanding with speech, OCR, metadata and semantic memory.
What this means for beta testers
This is the biggest Vision Intelligence upgrade so far — an early operational breakthrough, not a marketing promise. The system separates what is visible, what is interpreted, and what remains uncertain.
Fused timeline events from Vision LLM, speech and OCR.
Flagged separately with grounding scores and diagnostics.
Real frame analysis
Vision LLMs analyze extracted video frames — multimodal payloads with actual image data.
Timeline grounding
Scene understanding aligned to searchable multimodal timeline events.
Evidence-based fusion
Hallucination control with grounding scores, confidence and evidence sources.
BLIP / CLIP / OCR / speech
Multi-signal fusion combined with Vision LLM output and .vtag export.
Local-first
LM Studio + Gemma Vision on your machine — no cloud upload required.
Semantic Memory
Multimodal timeline events indexed for search across your library.
Operational in current beta builds
- Real video frame extraction and Vision LLM analysis
- Multimodal payloads with actual image frames
- Timeline-aligned scene understanding
- Grounded cinematic scene descriptions
- Visual observation / interpretation / uncertainty separation
- Vision diagnostics with confidence and grounding scores
- BLIP, CLIP, OCR, speech and metadata fusion
- Local-first via LM Studio + Gemma Vision
- Searchable multimodal timeline events in Semantic Memory
Interested in testing the latest build?
Contact Bomark Analys for beta access — or download the latest build if you are already approved.
Early operational breakthrough — results vary by model, hardware and media. Grounding scores surface evidence but do not guarantee perfection. Fallback to text-based analysis when vision payloads fail.
VisionaryAI Suite 1.5.2 Closed Beta Now Open
A new way to search, understand and navigate your media
Fully local. Fully private. Fully intelligent.
Version 1.5.2 deepens semantic memory, analysis quality, benchmarking and end-to-end media intelligence. People-based premium capabilities—including face recognition where your programme allows—are licence-controlled: they are not enabled in the public Trial by default and may only be activated for selected beta, pilot or commercial agreements.
Closed Beta access may include selected premium capabilities such as people-based video intelligence and face recognition, depending on approval, licence configuration and intended use. These features require responsible use and may involve biometric data.
This is a major step towards turning unstructured media into something you can actually work with.
Same program for trial and beta: the Windows package on our download page is the same build either way. You can apply for a closed-beta license key to run it for a longer evaluation period than the default trial—coordinated with onboarding in the closed beta community.
Download is live: get the Closed Beta portable ZIP on the beta tester download page — VisionaryAI_Suite_Portable_1.5.2_ClosedBeta.zip (~5.8 GiB).
Access, onboarding, and license keys for extended evaluation are coordinated through our private Facebook group. The link opens in a new tab. Prefer not to use Facebook? Reach us via the contact page.
For closed-beta access and longer-trial licence keys, use the Facebook group or contact us—Pilot Portal onboarding is paused for now.
Meet the creator
Robert Bomark · Bomark Analys AB
I build VisionaryAI Suite as a local-first AI platform for intelligent media analysis, semantic memory, transcription, and metadata you can keep beside your files.
Testers get personal support when onboarding gets sticky—this is a real beta, not a hands-off drop.
Getting started is easier than you think
Four concise steps—most people finish setup in under a quarter of an hour.
-
1
Download VisionaryAI Suite
Install the Windows package from download. It is the same build used for the public trial and the closed beta. Once you are approved, activate your beta license key in the app to keep evaluating beyond the default trial window.
-
2
Run the setup wizard
Paths, GPU checks, and first-run calibration are spelled out—no guesswork.
-
3
Select or download AI models
Pick the pack that matches your VRAM; the UI explains trade-offs in plain language.
-
4
Analyze media locally
Queue a folder, watch semantic memory populate, and search locally—people-based features only where your licence enables them.
Estimated setup: 10–15 minutes
Stuck? Message robert.bomark@bomarkanalys.se or use contact—I answer beta issues personally.
What’s new in 1.5.2
Core features in the VisionaryAI Suite 1.5.2 closed beta—built for large media libraries, without sending your files to the cloud.
Licence-controlled face recognition
Where enabled for your approved licence, the suite can support local detection and recognition workflows for images and video—always under your control on your hardware.
Face databases and identity linking are premium, licence-governed capabilities. You must have a valid legal basis and rights for any identifiable persons in your media.
- Fully local processing when configured
- Manual + automatic identity confirmation (where licenced)
- Improvement from your own confirmed data
- Images and video where your licence allows
Semantic Memory Engine
Your media is no longer just files.
VisionaryAI builds a semantic memory of your content, making it searchable even without manually tagging anything. For positioning against ordinary filename search—and how semantic media search fits your archive—see AI file search with VisionaryAI Suite.
- Search using natural language
- Understands context, not just keywords
- Works across images, video, and metadata
- Persistent memory database
People search (licensed workflows)
Where people intelligence is enabled for your licence, find matching images and segments across large libraries.
Integrated with semantic search; identity features follow your licence terms and legal posture—not a free-for-all public capability.
- Integrated with semantic search
- Works in images and video where enabled
- Supports confirmed identities under policy
- Designed for controlled, licenced use
Video timeline intelligence
When people features are enabled by licence, navigate by detected faces and timestamps instead of manual scrubbing alone.
Use timeline cues as editorial or review aids—subject to your rights in the underlying material.
- Frame-based detection
- Timeline visualization
- Jump to exact moments
- Segment grouping
Face-focused analysis (licensed)
Optional lighter face-only passes where your licence and policy allow—without implying this is a default public Trial feature.
Useful when building structured review workflows under an approved configuration.
- Lightweight processing
- Batch support
- Right-click integration
- Instant results
Identity suggestions (licensed)
Where enabled, the system can propose matches from prior confirmations—always under your review and legal responsibility.
Automation is constrained by licence and intended use; biometric sensitivity requires careful governance.
- Confidence-based suggestions
- Auto-confirm workflows
- Batch operations
- Continuous improvement
Your media never leaves your environment
VisionaryAI is engineered for teams that cannot normalize cloud upload. Source media stays on machines you operate, models run locally, and metadata lands in open formats you can index—ideal for archives, researchers, journalists, and studios.
You decide what touches the internet. No forced pipeline to someone else’s servers just to understand your own library.
- Air-gap friendly posture for confidential collections.
- Full custody of rushes, stills, and AI outputs.
- Professional accountability—Swedish company, active development, direct beta support.
See the workflow before you install
Motion beats mystery—watch a clip, then grab the build.
Getting started
Subtitle and translation automation—fast proof that the desktop stack is real.
Open galleryFeature walkthrough
Reserve this card for a deep semantic-memory session—link your newest recording here.
Placeholder — add videoWhere the platform is heading
- Semantic Memory
- Licence-controlled people & face capabilities
- Timeline Search
- .vtag metadata
- AI Shell overlay
- Advanced forensics
- Cross-media intelligence
Join the Closed Beta
We are opening a limited number of spots for early users. You run the same program version as everyone who downloads the public trial—and you can apply for a license key that lets you try it for longer while you explore features and share feedback.
Limited availability
Beta software can change quickly; use non-production or well-backed-up archives. The public trial and closed beta use the same Windows build; licensing and evaluation length differ. See also pricing.