Windows · Professional Evaluation Program · 1.5.3
Evaluation access & setup guide
VisionaryAI Suite is currently available through a controlled evaluation program. Access is provided to selected testers, creators, researchers and enterprise users—with onboarding, support and controlled deployment during the current beta phase.
Approved applicants receive a Windows evaluation build and a signed license.json. The installer opens in Awaiting Activation until you import your approved license. This page covers installation for approved users.
Windows only — today
This download is Windows-only. There is no native Mac build yet. We are working toward similar software for macOS, but current engineering focus stays on Windows—and on virtual environments (such as a Windows VM) for teams that need to run the suite on non-Windows hardware.
Latest release · May 2026
VisionaryAI Suite 1.5.3 is here
A major milestone in local-first AI media intelligence — stronger multimodal analysis, improved OCR, deeper semantic memory and richer .vtag metadata.
VisionaryAI Suite 1.5.3 strengthens multimodal analysis, OCR intelligence, semantic memory and richer .vtag metadata workflows — making the platform more capable than ever at understanding images, video, audio, text and context.
This is no longer just a media tagging tool. Version 1.5.3 moves VisionaryAI Suite toward a local-first AI platform for deep media understanding — with professional benchmarking and diagnostics built in.
What’s new in 1.5.3
- Stronger multimodal AI analysis across vision, speech and metadata
- Improved OCR intelligence — visible text as an intelligence layer
- Deeper semantic memory with OCR-aware search
- Richer .vtag sidecar metadata workflows
- Evidence-based fusion and timeline grounding (where your build supports it)
- Benchmark Dashboard, Smart Whisper profiles and latency reporting included
Try the latest VisionaryAI Suite 1.5.3 evaluation build — request access to explore multimodal analysis, semantic memory and improved OCR on your own archive.
Before you install
License required: Evaluation builds open in Awaiting Activation until you import a signed license.json sent after approval. Face recognition and biometric person identification are licence-controlled premium features—see pricing and closed beta.
- Platform: VisionaryAI Suite runs only on Windows. Mac is not supported yet—we aim for a macOS offering later; until then, expect best results on Windows itself or inside a Windows virtual machine.
- You are installing an approved evaluation build—a real workflow preview, not a slideshow demo.
- The first launch may show a Windows SmartScreen notice—common for newer desktop apps.
- AI models download during first setup (allow time, disk space, and a stable connection).
- A free Hugging Face account is usually required to authenticate model downloads.
- For advanced reasoning and AI-assisted workflows, VisionaryAI Suite also uses an AI backend—typically LM Studio (local) or OpenRouter (cloud).
- Your media stays on your computer for core local analysis.
Typical first-time setup: 10–20 minutes
Most of the setup process is guided directly inside VisionaryAI Suite.
VisionaryAI Trial 1.5.3 may require downloading or configuring local models, or—depending on your choices—using an external AI provider. For professional use, treat outputs as assisted drafts and review them.
Your setup journey
Seven calm steps from download to your first analysis—nothing here requires a developer background.
-
Download VisionaryAI Suite
Use the Trial package below (or the primary button at the bottom of this page). Keep Wi‑Fi stable—the archive is large.
-
Launch the application
Extract if needed, then run the suite and follow on-screen prompts.
-
Approve the SmartScreen warning
Windows may ask once—see the SmartScreen section for exact clicks.
-
Complete Hugging Face setup
Sign in, paste your READ token, and approve any gated model prompts so weights can download.
-
Download Hugging Face models
Let first-time model downloads finish—they cache locally for vision, OCR, transcription, and more.
-
Choose your AI backend
Connect LM Studio for private local LLMs or OpenRouter for powerful cloud models—see AI backend setup.
-
Start analyzing media
Run the in-app connection test, then explore images, audio, or video within Trial limits—see contact or closed beta if you need more.
Windows SmartScreen warning
Because VisionaryAI Suite may be new or uncommon on consumer PCs, Windows can show a SmartScreen security prompt the first time you launch the app.
That does not mean the file is unofficial—only that Windows has not yet built “reputation” for this installer profile.
Exact steps (wording may vary by Windows version):
- Click More information
- Click Run anyway
Screenshot placeholder — SmartScreen warning screen
Screenshot placeholder — “More information” highlighted
Screenshot placeholder — “Run anyway” highlighted
This usually happens only once per machine under typical Windows settings.
Hugging Face AI model setup
VisionaryAI Suite uses AI models from Hugging Face for capabilities such as:
- Transcription
- Image understanding
- OCR
- Semantic analysis
- AI-assisted workflows
Models download to your PC. VisionaryAI Suite does not upload your personal media to Hugging Face for analysis—tokens authenticate downloads.
A free Hugging Face account is required.
Create your Hugging Face account
Use an email you monitor—you’ll reuse this login inside VisionaryAI Suite.
Open huggingface.co/joinCreate a READ access token
Under Settings → Access Tokens, create a token with read permission.
Open token settingsPaste token into VisionaryAI Suite
When guided setup asks, paste the token once—it stays on your machine.
Approve gated models when prompted
If a browser window opens on Hugging Face, accept access once per gated model.
Only a READ token is required. Never publish tokens or commit them to repositories.
Choose Your AI Backend
VisionaryAI Suite uses AI backends for advanced analysis, contextual understanding, semantic reasoning, and AI-assisted workflows. Hugging Face supplies many specialist weights; your backend supplies the conversational “brain” that ties advanced steps together.
Option 1
LM Studio
Local AI running on your computer
Option 2
OpenRouter
Cloud AI using online models
AI Backend Setup (LM Studio or OpenRouter)
VisionaryAI Suite requires a backend connection for the advanced reasoning layers above standard CV/audio checkpoints.
Install VisionaryAI Suite
Download and launch the Trial build from this page.
Set up Hugging Face
Account, READ token, gated approvals.
Choose your AI backend
Decide between LM Studio or OpenRouter using the comparison below.
Connect LM Studio or OpenRouter
Follow the guided steps, then run the connection test inside the suite.
Start analyzing media
Confirm “AI backend ready”, then run your first archive job.
LM Studio
- Runs locally
- Maximum privacy
- Works offline once models are cached
- Benefits from stronger hardware
- Slower first-time setup
- No API usage bills
- Best for privacy-focused teams
OpenRouter
- Cloud-based
- Faster setup
- Access to frontier-class models
- Requires internet
- API usage costs may apply
- Best for the easiest first experience
Which should I choose?
- Weak hardware or tight VRAM → start with OpenRouter
- Maximum privacy / sensitive archives → choose LM Studio
- Fastest path to “it just works” → choose OpenRouter
- Offline-capable AI after downloads → choose LM Studio
Approximate hints from your browser
Recommended for you
Lean toward LM Studio
Recommended for you
Lean toward OpenRouter
VisionaryAI Suite can detect LM Studio, running local servers, GPU tier, and VRAM on your PC. If you use Ollama or another local gateway, the same principles apply—point the suite at your compatible local endpoint.
LM Studio guided setup
Download LM Studio
LM Studio lets VisionaryAI Suite run compatible chat models locally.
Open lmstudio.aiInstall LM Studio
Complete the installer, then open LM Studio once so it can finish its own setup.
Screenshot placeholder — LM Studio installer / first launch
Download a recommended model
Pick a preset that matches your VRAM. Names change often—search inside LM Studio for these families.
| Preset | VRAM guide | RAM guide | Example families |
|---|---|---|---|
| Lightweight | ~6–8 GB | 16 GB+ | Quantised Gemma, small Qwen, compact Llama |
| Balanced | ~10–14 GB | 24 GB+ | Mistral, Llama 3.x mid-size quants |
| High-quality | 16 GB+ | 32 GB+ | Larger Llama / Qwen / Mistral builds |
VisionaryAI Suite surfaces smarter presets after it reads your actual GPU/VRAM—use those hints first.
Start the local server
Open the Developer tab → Start server. LM Studio defaults to port 1234 for local HTTP—keep it unless you know another app conflicts.
Screenshot placeholder — Developer tab & Start server
Connect VisionaryAI Suite
Inside VisionaryAI Suite open backend settings, choose LM Studio, and run Detect server / Test connection. The suite validates latency and chat readiness automatically.
OpenRouter guided setup
Create an OpenRouter account
Use a mailbox you monitor—you’ll manage billing alerts here.
Open openrouter.aiCreate an API key
Keys stay private—never paste them into chats or screenshots.
Open OpenRouter keysPaste the key into VisionaryAI Suite
Use the secure field in settings. Hit Validate, then Test connection—only metadata about success/failure returns to you.
Pick a starter model
Models and pricing move quickly—start small, then upgrade.
| Preset | Goal | Example IDs / vendors |
|---|---|---|
| Fast setup | Low latency trials | x-ai/grok-*-style fast routes, compact GPT-class listings |
| Balanced | Everyday archives | Anthropic Claude family, Google Gemini listings |
| Highest quality | Deepest reasoning | Frontier Claude / Gemini / DeepSeek / Qwen flagship SKUs |
OpenRouter shows live price estimates—set spend alerts and watch the dashboard during first tests.
Run your first AI test
After saving settings, VisionaryAI Suite sends a tiny prompt, confirms a response, then surfaces:
- Latency
- Active provider (LM Studio vs OpenRouter)
- Model name
What are gated models?
Some models need a one-time approval on Hugging Face before download. That is normal—model authors set the rules.
You usually click “Access repository”, “Agree and access repository”, or similar wording.
This is Hugging Face’s official flow. Afterwards, VisionaryAI Suite downloads weights straight to your disk.
Installation & setup video
A beginner-friendly walkthrough is coming soon and will appear here as an embedded player.
Video embed placeholder
Replace this block with a YouTube iframe (16:9) when the tutorial is ready.
Setup FAQ
Answers for first-time installs—tap a question to expand.
Why people trust this workflow
Local-first platform
Designed so analysis happens on hardware you control.
Your media stays with you
Source files are not shipped to our servers as part of the default workflow.
Models cached locally
Download once, reuse offline-capable stacks where your build allows.
Transparent expectations
We explain SmartScreen, Hugging Face, choosing an AI backend, and disk reality before you commit time.
Active product development
Trial 1.5.3 continues rapid iteration toward measurable, inspectable AI pipelines—distinct from the invite-only Closed Beta package.
Release history
Previously · VisionaryAI Suite Trial 1.5.2
Release history for the prior public Trial drop. Current downloads on this page use VisionaryAI_Suite_Portable_1.5.3.zip.
Trial 1.5.2 introduced a professional benchmarking and diagnostics layer: Benchmark Dashboard, Smart Whisper Optimization Profiles, Fusion Smart Whisper support, regression detection and exportable latency reports.
Previously · VisionaryAI Suite Trial 1.5.1
Trial 1.5.1 is deliberately not the Closed Beta ZIP: both tracks evolve in parallel with different installers, licensing and onboarding. That build focused on the public Trial archive VisionaryAI_Suite_Portable_1.5.1.zip.
Major emphasis in 1.5.1:
- Improved benchmarking history
- Better Whisper transcription tracing and clarity when runs complete or skip
- Enhanced Fusion Smart Whisper visibility and pipeline fingerprints
- Richer audit-style context across analysis sessions
- Sharper comparisons between estimated and actual throughput
It remains part of the roadmap that led into the 1.5.2 benchmarking work and the 1.5.3 deep media intelligence milestone above.
What VisionaryAI Suite does
VisionaryAI Suite is a local-first AI system that helps you analyse, tag, describe, transcribe, and organise media files on your own machine. The aim is simple: make large libraries searchable, understandable, and structured—so teams can find what they need without drowning in filenames and folders alone.
The platform can work with:
- Images, video, and audio
- Embedded and companion metadata
- AI-generated descriptions and tags
- Transcriptions where supported
- Semantic search across generated knowledge—not only filenames
- Structured .vtag sidecar metadata you can keep beside originals
- Workflows that stay local-first where your configuration allows
What you can try in Trial 1.5.3
Below is a practical checklist of what the trial lets you explore. Features may vary slightly depending on models installed and how you configure local or provider-backed AI.
Core media intelligence
- AI analysis of images
- Object and scene understanding (where models support it)
- Automatic tag generation
- AI-generated descriptions
- OCR / text extraction when available for your content
Advanced analysis
- Fusion-based analysis that combines multiple AI signals
- Deep Analysis with LLM-assisted interpretation
- Stronger overall stability and analysis flow vs earlier trials
- UI and workflow refinements you can feel day to day
Memory, metadata, archive
- Semantic Memory and searchable .vtag metadata
- Local generation of rich metadata (with local models where configured)
- Preview of the end-to-end VisionaryAI media intelligence workflow
- Foundations for video, audio, and broader archive intelligence work
Fusion & Deep Analysis
Fusion
Instead of relying on a single model output, Fusion combines multiple AI signals—such as visual cues, captions, tags, OCR text, and optional LLM reasoning—into a richer, more coherent interpretation of each media file. It is designed to make results feel more “complete” than isolated model runs, while still being something you can inspect and validate.
Deep Analysis
Deep Analysis goes further: using the signals already extracted, the system can attempt a more human-readable summary and interpretation of what is in the file. This supports exploration and documentation workflows—but it is not a guarantee of perfection. For archival, editorial, or legal-adjacent use, treat outputs as assisted drafts that benefit from professional review.
Semantic Memory
Semantic Memory is one of VisionaryAI’s most important ideas. It helps the application treat your archive as more than a pile of paths: it can connect generated metadata, descriptions, tags, and structured .vtag records so you can search by meaning, not only by filename. For how this contrasts with typical desktop search tools, see AI-powered file search.
In plain terms: “Semantic Memory allows VisionaryAI to understand your archive beyond filenames. It can search through generated metadata, descriptions, tags and structured .vtag files, making it possible to find relevant media even when you do not remember the exact filename.”
- Search by meaning, not only by filename
- Build a searchable layer on top of generated metadata
- Reuse existing .vtag files where you already have them
- Prepare archives for future AI-assisted workflows with portable structure
.vtag metadata
.vtag is VisionaryAI’s structured metadata format—typically stored as a sidecar file next to your original media, not unlike how an .xmp file travels with an image. It is how the suite keeps AI-generated knowledge portable, inspectable, and reusable.
A .vtag record can hold (depending on analysis settings):
- Tags, descriptions, transcriptions
- Object and scene information
- Timeline-oriented cues where applicable
- AI analysis results and semantic fields
- Room to grow for future forensic and intelligence-oriented layers
“.vtag is designed to make AI-generated metadata portable, searchable and reusable across the VisionaryAI ecosystem.” Learn more on the dedicated .vtag page.
Trial limits
The Trial version is limited to 20 analyses per device. It is designed to give you a hands-on preview of the real platform—Fusion, Deep Analysis, Semantic Memory, and .vtag workflows—before you move into a pilot, closed beta (with a longer evaluation key), or a commercial arrangement.
If you outgrow the cap quickly, contact us or explore closed beta or contact us for paths with longer evaluation.
Who is this Trial for?
VisionaryAI Trial 1.5.3 is a strong fit if you:
- Work as a media professional, archivist, researcher, or journalist
- Produce or manage video or audio at scale
- Are an AI-curious creator who wants real desktop software, not a slideshow demo
- Maintain large image, video, or audio libraries inside your organisation
- Want to test local-first AI analysis before committing to cloud-centric tools
- Are exploring smarter metadata workflows ahead of catalog, MAM, or archive upgrades
Local-first, privacy-aware analysis
VisionaryAI is built around a local-first philosophy. The goal is to help you analyse and structure sensitive media while keeping as much of the workflow as possible on your own machine—or inside environments you control—especially when you configure local models.
Depending on your choices, some advanced steps may still use external AI services if you enable them. Think of the trial as flexible: you can bias toward offline-capable stacks where supported, or blend in online models when that matches your policy. If you need a strictly air-gapped posture, treat that as a conversation with our team—we can help you map what is realistic for your configuration.
Why Trial 1.5.3 matters
Public Trial 1.5.3 is about deep media understanding into how VisionaryAI Suite processes your media—with stronger multimodal analysis, OCR, semantic memory and richer .vtag metadata.
- Benchmark discipline anchored in real latency and media identity metadata
- Regression awareness when hardware, drivers or models shift
- Whisper tuning through optimization profiles and adaptive recommendations
- Fusion flexibility including Smart Whisper paths and visual-first video analysis
- Operational documentation via TXT, JSON, HTML, CSV and PNG exports
- Still the same Semantic Memory / .vtag story—now with sharper diagnostics
- An honest waypoint for enterprise pilots that need measurable AI pipelines
Models, downloads, and getting started
Some AI features need models to be downloaded or configured before first use. That is intentional: it gives you more control over weights, VRAM use, and whether work stays local. First launches may take longer while caches fill—plan enough disk and bandwidth.
We are expanding onboarding guidance and walkthrough video material so you are never guessing alone. Until then, start with How it works and the Watch & learn clips on the gallery, and read the notes bundled with your download.
Deep hardware detail lives on System requirements; storage expectations are summarised below.
Evaluation builds
Evaluation builds are distributed to approved applicants only—not as a public download. After approval, you receive a secure link and signed license.json. Need access? Submit an application or contact us.
Request Trial AccessSystem requirements (summary)
Detailed minimum vs recommended hardware: system requirements for VisionaryAI Suite.
Windows (required)
The trial is a Windows desktop application. Use a 64-bit Windows version supported by the build you install. Run on bare metal or a well-provisioned VM with GPU pass-through if you use discrete acceleration.
NVIDIA GPU (recommended)
A modern NVIDIA GPU with sufficient VRAM for your chosen models is strongly recommended for practical throughput. CPU-only mode may be possible in some scenarios but is not the reference experience for large archives. AMD and integrated GPUs are not guaranteed for every pipeline—check the release notes for the build you run.
Memory and CPU
Expect requirements to scale with resolution, model size and concurrent jobs. Treat published minimums in your installer or readme as the binding numbers; the marketing site only sets expectations: 16 GB+ system RAM is a reasonable floor for many workloads, and more helps.
Network
The product is local-first: your source media is processed on the machine. Some builds may still contact the network for licence validation, model downloads, or updates. Treat air-gapped use as a deployment question for your edition, not a promise from this static page.
Install size and storage
Application and AI models
The download and install footprint is large and depends on which AI models you select and download. A minimal install is smaller; a full set of high-capability models can reach many gigabytes to tens of gigabytes or more on disk. Plan free space on a fast drive (NVMe or SSD) for both the app and the model cache you enable.
Your media and .vtag output
Local AI analysis of video and high-resolution stills can require a significant amount of additional storage for working data, model caches, and structured metadata written beside your files. A serious archive may need terabytes of free space in total. Scale hardware to your collection—this site does not give a one-number guarantee.
Request evaluation access
Experience benchmark dashboards, Smart Whisper profiles, upgraded Fusion and the full local-first workflow—within a controlled evaluation window, on hardware you control.
Applications are reviewed manually. See Privacy and your licence terms in the product.