Windows · Professional Evaluation Program · 1.5.3

Evaluation access & setup guide

VisionaryAI Suite is currently available through a controlled evaluation program. Access is provided to selected testers, creators, researchers and enterprise users—with onboarding, support and controlled deployment during the current beta phase.

Approved applicants receive a Windows evaluation build and a signed license.json. The installer opens in Awaiting Activation until you import your approved license. This page covers installation for approved users.

Windows only — today

This download is Windows-only. There is no native Mac build yet. We are working toward similar software for macOS, but current engineering focus stays on Windows—and on virtual environments (such as a Windows VM) for teams that need to run the suite on non-Windows hardware.

Latest release · May 2026

VisionaryAI Suite 1.5.3 is here

A major milestone in local-first AI media intelligence — stronger multimodal analysis, improved OCR, deeper semantic memory and richer .vtag metadata.

VisionaryAI Suite 1.5.3 strengthens multimodal analysis, OCR intelligence, semantic memory and richer .vtag metadata workflows — making the platform more capable than ever at understanding images, video, audio, text and context.

This is no longer just a media tagging tool. Version 1.5.3 moves VisionaryAI Suite toward a local-first AI platform for deep media understanding — with professional benchmarking and diagnostics built in.

What’s new in 1.5.3

  • Stronger multimodal AI analysis across vision, speech and metadata
  • Improved OCR intelligence — visible text as an intelligence layer
  • Deeper semantic memory with OCR-aware search
  • Richer .vtag sidecar metadata workflows
  • Evidence-based fusion and timeline grounding (where your build supports it)
  • Benchmark Dashboard, Smart Whisper profiles and latency reporting included

Try the latest VisionaryAI Suite 1.5.3 evaluation build — request access to explore multimodal analysis, semantic memory and improved OCR on your own archive.

Recommended first read

Before you install

License required: Evaluation builds open in Awaiting Activation until you import a signed license.json sent after approval. Face recognition and biometric person identification are licence-controlled premium features—see pricing and closed beta.

  • Platform: VisionaryAI Suite runs only on Windows. Mac is not supported yet—we aim for a macOS offering later; until then, expect best results on Windows itself or inside a Windows virtual machine.
  • You are installing an approved evaluation build—a real workflow preview, not a slideshow demo.
  • The first launch may show a Windows SmartScreen notice—common for newer desktop apps.
  • AI models download during first setup (allow time, disk space, and a stable connection).
  • A free Hugging Face account is usually required to authenticate model downloads.
  • For advanced reasoning and AI-assisted workflows, VisionaryAI Suite also uses an AI backend—typically LM Studio (local) or OpenRouter (cloud).
  • Your media stays on your computer for core local analysis.

Typical first-time setup: 10–20 minutes

Most of the setup process is guided directly inside VisionaryAI Suite.

VisionaryAI Trial 1.5.3 may require downloading or configuring local models, or—depending on your choices—using an external AI provider. For professional use, treat outputs as assisted drafts and review them.

Your setup journey

Seven calm steps from download to your first analysis—nothing here requires a developer background.

  1. Download VisionaryAI Suite

    Use the Trial package below (or the primary button at the bottom of this page). Keep Wi‑Fi stable—the archive is large.

  2. Launch the application

    Extract if needed, then run the suite and follow on-screen prompts.

  3. Approve the SmartScreen warning

    Windows may ask once—see the SmartScreen section for exact clicks.

  4. Complete Hugging Face setup

    Sign in, paste your READ token, and approve any gated model prompts so weights can download.

  5. Download Hugging Face models

    Let first-time model downloads finish—they cache locally for vision, OCR, transcription, and more.

  6. Choose your AI backend

    Connect LM Studio for private local LLMs or OpenRouter for powerful cloud models—see AI backend setup.

  7. Start analyzing media

    Run the in-app connection test, then explore images, audio, or video within Trial limits—see contact or closed beta if you need more.

Windows SmartScreen warning

Because VisionaryAI Suite may be new or uncommon on consumer PCs, Windows can show a SmartScreen security prompt the first time you launch the app.

That does not mean the file is unofficial—only that Windows has not yet built “reputation” for this installer profile.

Exact steps (wording may vary by Windows version):

  1. Click More information
  2. Click Run anyway
Windows SmartScreen prompt (first launch)
Show “More information”
Then choose “Run anyway”

This usually happens only once per machine under typical Windows settings.

Hugging Face AI model setup

VisionaryAI Suite uses AI models from Hugging Face for capabilities such as:

  • Transcription
  • Image understanding
  • OCR
  • Semantic analysis
  • AI-assisted workflows

Models download to your PC. VisionaryAI Suite does not upload your personal media to Hugging Face for analysis—tokens authenticate downloads.

A free Hugging Face account is required.

Step 1

Create your Hugging Face account

Use an email you monitor—you’ll reuse this login inside VisionaryAI Suite.

Open huggingface.co/join
Step 2

Log in

Confirm you can sign in before creating tokens.

Open huggingface.co/login
Step 3

Create a READ access token

Under Settings → Access Tokens, create a token with read permission.

Open token settings
Step 4

Paste token into VisionaryAI Suite

When guided setup asks, paste the token once—it stays on your machine.

Step 5

Approve gated models when prompted

If a browser window opens on Hugging Face, accept access once per gated model.

Only a READ token is required. Never publish tokens or commit them to repositories.

Choose Your AI Backend

VisionaryAI Suite uses AI backends for advanced analysis, contextual understanding, semantic reasoning, and AI-assisted workflows. Hugging Face supplies many specialist weights; your backend supplies the conversational “brain” that ties advanced steps together.

Option 1

LM Studio

Local AI running on your computer

Option 2

OpenRouter

Cloud AI using online models

Both options are fully supported inside VisionaryAI Suite.

AI Backend Setup (LM Studio or OpenRouter)

VisionaryAI Suite requires a backend connection for the advanced reasoning layers above standard CV/audio checkpoints.

  1. Install VisionaryAI Suite

    Download and launch the Trial build from this page.

  2. Set up Hugging Face

    Account, READ token, gated approvals.

  3. Choose your AI backend

    Decide between LM Studio or OpenRouter using the comparison below.

  4. Connect LM Studio or OpenRouter

    Follow the guided steps, then run the connection test inside the suite.

  5. Start analyzing media

    Confirm “AI backend ready”, then run your first archive job.

LM Studio

  • Runs locally
  • Maximum privacy
  • Works offline once models are cached
  • Benefits from stronger hardware
  • Slower first-time setup
  • No API usage bills
  • Best for privacy-focused teams

OpenRouter

  • Cloud-based
  • Faster setup
  • Access to frontier-class models
  • Requires internet
  • API usage costs may apply
  • Best for the easiest first experience

Which should I choose?

  • Weak hardware or tight VRAM → start with OpenRouter
  • Maximum privacy / sensitive archives → choose LM Studio
  • Fastest path to “it just works” → choose OpenRouter
  • Offline-capable AI after downloads → choose LM Studio

Approximate hints from your browser

VisionaryAI Suite can detect LM Studio, running local servers, GPU tier, and VRAM on your PC. If you use Ollama or another local gateway, the same principles apply—point the suite at your compatible local endpoint.

LM Studio guided setup

Step 1

Download LM Studio

LM Studio lets VisionaryAI Suite run compatible chat models locally.

Open lmstudio.ai
Step 2

Install LM Studio

Complete the installer, then open LM Studio once so it can finish its own setup.

Installer welcome screen
Step 3

Download a recommended model

Pick a preset that matches your VRAM. Names change often—search inside LM Studio for these families.

PresetVRAM guideRAM guideExample families
Lightweight~6–8 GB16 GB+Quantised Gemma, small Qwen, compact Llama
Balanced~10–14 GB24 GB+Mistral, Llama 3.x mid-size quants
High-quality16 GB+32 GB+Larger Llama / Qwen / Mistral builds

VisionaryAI Suite surfaces smarter presets after it reads your actual GPU/VRAM—use those hints first.

Step 4

Start the local server

Open the Developer tab → Start server. LM Studio defaults to port 1234 for local HTTP—keep it unless you know another app conflicts.

Developer tab · server running
Step 5

Connect VisionaryAI Suite

Inside VisionaryAI Suite open backend settings, choose LM Studio, and run Detect server / Test connection. The suite validates latency and chat readiness automatically.

ConnectedGreen check · ready
Not detectedServer stopped
Wrong portFix URL/port
Server offlineLaunch LM Studio

OpenRouter guided setup

Step 1

Create an OpenRouter account

Use a mailbox you monitor—you’ll manage billing alerts here.

Open openrouter.ai
Step 2

Create an API key

Keys stay private—never paste them into chats or screenshots.

Open OpenRouter keys
Step 3

Paste the key into VisionaryAI Suite

Use the secure field in settings. Hit Validate, then Test connection—only metadata about success/failure returns to you.

Step 4

Pick a starter model

Models and pricing move quickly—start small, then upgrade.

PresetGoalExample IDs / vendors
Fast setupLow latency trialsx-ai/grok-*-style fast routes, compact GPT-class listings
BalancedEveryday archivesAnthropic Claude family, Google Gemini listings
Highest qualityDeepest reasoningFrontier Claude / Gemini / DeepSeek / Qwen flagship SKUs

OpenRouter shows live price estimates—set spend alerts and watch the dashboard during first tests.

Run your first AI test

After saving settings, VisionaryAI Suite sends a tiny prompt, confirms a response, then surfaces:

  • Latency
  • Active provider (LM Studio vs OpenRouter)
  • Model name

AI backend ready. You can start analyzing media immediately.

What are gated models?

Some models need a one-time approval on Hugging Face before download. That is normal—model authors set the rules.

You usually click “Access repository”, “Agree and access repository”, or similar wording.

This is Hugging Face’s official flow. Afterwards, VisionaryAI Suite downloads weights straight to your disk.

Installation & setup video

A beginner-friendly walkthrough is coming soon and will appear here as an embedded player.

Setup FAQ

Answers for first-time installs—tap a question to expand.

Why people trust this workflow

Local-first platform

Designed so analysis happens on hardware you control.

Your media stays with you

Source files are not shipped to our servers as part of the default workflow.

Models cached locally

Download once, reuse offline-capable stacks where your build allows.

Transparent expectations

We explain SmartScreen, Hugging Face, choosing an AI backend, and disk reality before you commit time.

Active product development

Trial 1.5.3 continues rapid iteration toward measurable, inspectable AI pipelines—distinct from the invite-only Closed Beta package.

Human support paths

Questions? Visit FAQ or contact for pilots and licensing.

Release history

Previously · VisionaryAI Suite Trial 1.5.2

Release history for the prior public Trial drop. Current downloads on this page use VisionaryAI_Suite_Portable_1.5.3.zip.

Trial 1.5.2 introduced a professional benchmarking and diagnostics layer: Benchmark Dashboard, Smart Whisper Optimization Profiles, Fusion Smart Whisper support, regression detection and exportable latency reports.

Previously · VisionaryAI Suite Trial 1.5.1

Trial 1.5.1 is deliberately not the Closed Beta ZIP: both tracks evolve in parallel with different installers, licensing and onboarding. That build focused on the public Trial archive VisionaryAI_Suite_Portable_1.5.1.zip.

Major emphasis in 1.5.1:

  • Improved benchmarking history
  • Better Whisper transcription tracing and clarity when runs complete or skip
  • Enhanced Fusion Smart Whisper visibility and pipeline fingerprints
  • Richer audit-style context across analysis sessions
  • Sharper comparisons between estimated and actual throughput

It remains part of the roadmap that led into the 1.5.2 benchmarking work and the 1.5.3 deep media intelligence milestone above.

What VisionaryAI Suite does

VisionaryAI Suite is a local-first AI system that helps you analyse, tag, describe, transcribe, and organise media files on your own machine. The aim is simple: make large libraries searchable, understandable, and structured—so teams can find what they need without drowning in filenames and folders alone.

The platform can work with:

  • Images, video, and audio
  • Embedded and companion metadata
  • AI-generated descriptions and tags
  • Transcriptions where supported
  • Semantic search across generated knowledge—not only filenames
  • Structured .vtag sidecar metadata you can keep beside originals
  • Workflows that stay local-first where your configuration allows

What you can try in Trial 1.5.3

Below is a practical checklist of what the trial lets you explore. Features may vary slightly depending on models installed and how you configure local or provider-backed AI.

Core media intelligence

  • AI analysis of images
  • Object and scene understanding (where models support it)
  • Automatic tag generation
  • AI-generated descriptions
  • OCR / text extraction when available for your content

Advanced analysis

  • Fusion-based analysis that combines multiple AI signals
  • Deep Analysis with LLM-assisted interpretation
  • Stronger overall stability and analysis flow vs earlier trials
  • UI and workflow refinements you can feel day to day

Memory, metadata, archive

  • Semantic Memory and searchable .vtag metadata
  • Local generation of rich metadata (with local models where configured)
  • Preview of the end-to-end VisionaryAI media intelligence workflow
  • Foundations for video, audio, and broader archive intelligence work

Fusion & Deep Analysis

Fusion

Instead of relying on a single model output, Fusion combines multiple AI signals—such as visual cues, captions, tags, OCR text, and optional LLM reasoning—into a richer, more coherent interpretation of each media file. It is designed to make results feel more “complete” than isolated model runs, while still being something you can inspect and validate.

Deep Analysis

Deep Analysis goes further: using the signals already extracted, the system can attempt a more human-readable summary and interpretation of what is in the file. This supports exploration and documentation workflows—but it is not a guarantee of perfection. For archival, editorial, or legal-adjacent use, treat outputs as assisted drafts that benefit from professional review.

Semantic Memory

Semantic Memory is one of VisionaryAI’s most important ideas. It helps the application treat your archive as more than a pile of paths: it can connect generated metadata, descriptions, tags, and structured .vtag records so you can search by meaning, not only by filename. For how this contrasts with typical desktop search tools, see AI-powered file search.

In plain terms: “Semantic Memory allows VisionaryAI to understand your archive beyond filenames. It can search through generated metadata, descriptions, tags and structured .vtag files, making it possible to find relevant media even when you do not remember the exact filename.”

  • Search by meaning, not only by filename
  • Build a searchable layer on top of generated metadata
  • Reuse existing .vtag files where you already have them
  • Prepare archives for future AI-assisted workflows with portable structure

.vtag metadata

.vtag is VisionaryAI’s structured metadata format—typically stored as a sidecar file next to your original media, not unlike how an .xmp file travels with an image. It is how the suite keeps AI-generated knowledge portable, inspectable, and reusable.

A .vtag record can hold (depending on analysis settings):

  • Tags, descriptions, transcriptions
  • Object and scene information
  • Timeline-oriented cues where applicable
  • AI analysis results and semantic fields
  • Room to grow for future forensic and intelligence-oriented layers

“.vtag is designed to make AI-generated metadata portable, searchable and reusable across the VisionaryAI ecosystem.” Learn more on the dedicated .vtag page.

Trial limits

The Trial version is limited to 20 analyses per device. It is designed to give you a hands-on preview of the real platform—Fusion, Deep Analysis, Semantic Memory, and .vtag workflows—before you move into a pilot, closed beta (with a longer evaluation key), or a commercial arrangement.

If you outgrow the cap quickly, contact us or explore closed beta or contact us for paths with longer evaluation.

Who is this Trial for?

VisionaryAI Trial 1.5.3 is a strong fit if you:

  • Work as a media professional, archivist, researcher, or journalist
  • Produce or manage video or audio at scale
  • Are an AI-curious creator who wants real desktop software, not a slideshow demo
  • Maintain large image, video, or audio libraries inside your organisation
  • Want to test local-first AI analysis before committing to cloud-centric tools
  • Are exploring smarter metadata workflows ahead of catalog, MAM, or archive upgrades

Local-first, privacy-aware analysis

VisionaryAI is built around a local-first philosophy. The goal is to help you analyse and structure sensitive media while keeping as much of the workflow as possible on your own machine—or inside environments you control—especially when you configure local models.

Depending on your choices, some advanced steps may still use external AI services if you enable them. Think of the trial as flexible: you can bias toward offline-capable stacks where supported, or blend in online models when that matches your policy. If you need a strictly air-gapped posture, treat that as a conversation with our team—we can help you map what is realistic for your configuration.

Why Trial 1.5.3 matters

Public Trial 1.5.3 is about deep media understanding into how VisionaryAI Suite processes your media—with stronger multimodal analysis, OCR, semantic memory and richer .vtag metadata.

  • Benchmark discipline anchored in real latency and media identity metadata
  • Regression awareness when hardware, drivers or models shift
  • Whisper tuning through optimization profiles and adaptive recommendations
  • Fusion flexibility including Smart Whisper paths and visual-first video analysis
  • Operational documentation via TXT, JSON, HTML, CSV and PNG exports
  • Still the same Semantic Memory / .vtag story—now with sharper diagnostics
  • An honest waypoint for enterprise pilots that need measurable AI pipelines

Models, downloads, and getting started

Some AI features need models to be downloaded or configured before first use. That is intentional: it gives you more control over weights, VRAM use, and whether work stays local. First launches may take longer while caches fill—plan enough disk and bandwidth.

We are expanding onboarding guidance and walkthrough video material so you are never guessing alone. Until then, start with How it works and the Watch & learn clips on the gallery, and read the notes bundled with your download.

Deep hardware detail lives on System requirements; storage expectations are summarised below.

Evaluation builds

Evaluation builds are distributed to approved applicants only—not as a public download. After approval, you receive a secure link and signed license.json. Need access? Submit an application or contact us.

Request Trial Access

System requirements (summary)

Detailed minimum vs recommended hardware: system requirements for VisionaryAI Suite.

Windows (required)

The trial is a Windows desktop application. Use a 64-bit Windows version supported by the build you install. Run on bare metal or a well-provisioned VM with GPU pass-through if you use discrete acceleration.

NVIDIA GPU (recommended)

A modern NVIDIA GPU with sufficient VRAM for your chosen models is strongly recommended for practical throughput. CPU-only mode may be possible in some scenarios but is not the reference experience for large archives. AMD and integrated GPUs are not guaranteed for every pipeline—check the release notes for the build you run.

Memory and CPU

Expect requirements to scale with resolution, model size and concurrent jobs. Treat published minimums in your installer or readme as the binding numbers; the marketing site only sets expectations: 16 GB+ system RAM is a reasonable floor for many workloads, and more helps.

Network

The product is local-first: your source media is processed on the machine. Some builds may still contact the network for licence validation, model downloads, or updates. Treat air-gapped use as a deployment question for your edition, not a promise from this static page.

Install size and storage

Application and AI models

The download and install footprint is large and depends on which AI models you select and download. A minimal install is smaller; a full set of high-capability models can reach many gigabytes to tens of gigabytes or more on disk. Plan free space on a fast drive (NVMe or SSD) for both the app and the model cache you enable.

Your media and .vtag output

Local AI analysis of video and high-resolution stills can require a significant amount of additional storage for working data, model caches, and structured metadata written beside your files. A serious archive may need terabytes of free space in total. Scale hardware to your collection—this site does not give a one-number guarantee.

Request evaluation access

Experience benchmark dashboards, Smart Whisper profiles, upgraded Fusion and the full local-first workflow—within a controlled evaluation window, on hardware you control.

Applications are reviewed manually. See Privacy and your licence terms in the product.

Learn more