Backing Up Your Local AI Workspace

Author: Michael J. Leaver, 2BrightSparks Pte. Ltd.

Local AI tools have a habit of quietly filling drives. A few months of Ollama, LM Studio and ComfyUI use can leave you with several hundred gigabytes of model weights, plus fine-tunes you spent days creating, chat histories, embeddings, custom workflows and configuration spread across half a dozen folders. Some of this is replaceable. Some of it is not.

Re-downloading a public 70B model is mostly an inconvenience: a few hours of bandwidth and some patience. A custom LoRA you trained on your own photographs, or a vector database built from years of private notes, is a different matter. If the drive holding it fails, that work is gone.

This article covers what is actually worth backing up from a local AI workspace, where common tools put their data on Windows, and how to build a SyncBackPro profile that copies the valuable parts without dragging in dozens of gigabytes of cache files you do not need.

TL;DR

Split your AI workspace into two profiles. Run a small, fast profile over configs, chats and vector databases every hour or two, with versioning enabled. Run a larger profile over model files weekly, or after you add or change a model, with versioning and compression both off. Filter aggressively to skip caches, partial downloads and Python virtual environments.

Quick AI Backup Checklist

Back up prompts, chats, vector databases and workflows.
Back up custom LoRAs, fine-tunes and trained models.
Exclude caches, temporary files and Python virtual environments.
Use versioning for configs and chats.
Disable versioning for large model files.
Test a restore every few months.

Why AI Workspaces Are Different
What is Worth Backing Up
Where AI Tools Store Data on Windows
Building the SyncBackPro Profile
Splitting Into Two Profiles
Schedule and Destination
Restore Considerations
Conclusion

Why AI Workspaces Are Different

Traditional backups focus on documents and photos. AI workspaces are different because they combine large, replaceable model files with small but irreplaceable assets such as prompts, vector databases, workflows and fine-tunes. Treating them all the same often leads to slow backups and wasted storage.

Imagine spending two months building a private RAG knowledge base containing thousands of documents. Recreating the vector database may take days, while restoring it from backup takes minutes. The whole point of separating the workspace into "easy to replace" and "hard or impossible to replace" is to make sure the second category is always somewhere safe.

The rest of this article works from that starting point: separate what is replaceable from what is not, and let each side of the workspace use the settings that suit it.

What is Worth Backing Up

Think about each item in your AI workspace in one of three categories: irreplaceable, expensive to replace, and disposable.

Irreplaceable. Always include these in a backup.

Custom fine-tunes, LoRAs and merges that you created yourself.
System prompts, agent definitions, character cards and any prompt library you have built up.
Conversation history from tools such as LM Studio, Open WebUI, Jan and Cherry Studio.
Vector databases and embeddings created from your own documents.
RAG document corpora (the source files you fed into a vector database).
Custom ComfyUI workflows, ControlNet models and trained textual inversions or embeddings.
Stable Diffusion outputs you want to keep, especially anything where you no longer have the prompt and seed to reproduce them.

Expensive to replace. Worth backing up if your bandwidth is limited or the model is no longer published.

Public model weights downloaded from Hugging Face or Ollama. A 70B model is typically 40 to 130 GB, depending on quantisation. On a 100 Mbps line that is roughly one to three hours per model at full line speed, often longer in practice, and the original publisher may pull the file at any time.
Older quantisations of models that have since been superseded. If you depend on a specific version, treat it as irreplaceable.
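
The download-time estimate above is simple arithmetic. A minimal sketch in Python follows; the function name is illustrative, and it assumes decimal gigabytes and a line that sustains its full rated speed, which real connections rarely do.

```python
def download_hours(size_gb: float, line_mbps: float) -> float:
    """Estimate download time in hours for a file of size_gb gigabytes."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = size_megabits / line_mbps
    return seconds / 3600


# The two ends of the 70B size range quoted above, on a 100 Mbps line.
for size in (40, 130):
    print(f"{size} GB at 100 Mbps: about {download_hours(size, 100):.1f} hours")
```

Treat the result as a lower bound. Protocol overhead, a busy server or a shared connection all stretch the real figure.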

Disposable. Exclude these from any backup.

Python virtual environments (venv, .venv, conda envs). They are large, full of small files, and trivially recreated from a requirements.txt or, for conda, an environment.yml.
node_modules folders.
__pycache__ folders and .pyc files.
Partial downloads, lock files and temporary files. Hugging Face in particular litters its cache with .lock, .metadata and incomplete-* files.
Build artefacts and compiled binaries from any tool you have built from source.

Where AI Tools Store Data on Windows

The single biggest reason AI backups go wrong is targeting the wrong folder. Most tools split their data between a user profile folder for configuration, an AppData folder for state, and somewhere on a chosen data drive for model weights. The table below covers the common defaults at the time of writing. Always check the application's settings, because the model directory in particular is often moved to a larger drive.

Tool	Default location on Windows	What lives there
Ollama	%USERPROFILE%\.ollama\	Model blobs under models\blobs\, manifests under models\manifests\, plus a small history file.
LM Studio	%USERPROFILE%\.cache\lm-studio\ and %APPDATA%\LM Studio\	Models in the .cache path; configuration, chat history and presets under AppData.
Hugging Face cache	%USERPROFILE%\.cache\huggingface\hub\	Shared by many tools (transformers, diffusers, sentence-transformers). Uses content-addressed blobs and symbolic links.
ComfyUI	Inside the install folder, typically ComfyUI\models\, ComfyUI\custom_nodes\, ComfyUI\output\, ComfyUI\user\workflows\	Checkpoints, LoRAs, VAEs, ControlNet, custom nodes, your saved workflows, and rendered outputs.
Automatic1111 and Forge	Inside the install folder, under models\, outputs\, embeddings\	Same shape as ComfyUI but laid out under stable-diffusion-webui.
AnythingLLM	%APPDATA%\anythingllm-desktop\storage\	Workspaces, vector store data, uploaded documents and SQLite databases. Often the most irreplaceable folder on disk.
Jan	%USERPROFILE%\jan\	Models, threads (conversations), assistants and extensions.
Cherry Studio	%APPDATA%\CherryStudio\	Configuration, knowledge bases, conversation history.
Open WebUI (Docker)	Inside the Docker volume open-webui	Use docker cp or mount the volume on a backed-up path. Includes SQLite database, user accounts, chats and uploaded files.

The simplest reliable approach is to point each application's "models directory" or "data directory" setting at a single AI drive (for example D:\AI\) with subfolders per tool. The cleaner the layout, the cleaner the backup profile.
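
Before consolidating, it helps to know which of the default folders actually exist on your machine and how large they are. The following Python sketch is one way to check. The function names are illustrative, the paths are copied from the table above and may not match your installation, and the APPDATA fallback is an assumption about a standard Windows profile layout.

```python
from __future__ import annotations

import os
from pathlib import Path


def folder_size_bytes(root: Path) -> int:
    """Total size of regular files under root; symbolic links are not followed."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):  # followlinks defaults to False
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if not path.is_symlink():
                    total += path.stat().st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def inventory(locations: dict[str, Path]) -> dict[str, int | None]:
    """Map each tool name to its folder size in bytes, or None if the folder is missing."""
    return {
        tool: folder_size_bytes(path) if path.is_dir() else None
        for tool, path in locations.items()
    }


if __name__ == "__main__":
    home = Path.home()
    appdata = Path(os.environ.get("APPDATA", home / "AppData" / "Roaming"))
    # Defaults copied from the table above; adjust to match your own settings.
    defaults = {
        "Ollama": home / ".ollama",
        "LM Studio (models)": home / ".cache" / "lm-studio",
        "LM Studio (config)": appdata / "LM Studio",
        "Hugging Face cache": home / ".cache" / "huggingface" / "hub",
        "AnythingLLM": appdata / "anythingllm-desktop" / "storage",
        "Jan": home / "jan",
        "Cherry Studio": appdata / "CherryStudio",
    }
    for tool, size in inventory(defaults).items():
        status = "not found" if size is None else f"{size / 1e9:.1f} GB"
        print(f"{tool:22} {status}")
```

A folder reported as "not found" usually means the tool is not installed or its data directory has been moved, which is exactly the case the application's own settings page will confirm.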

Building the SyncBackPro Profile

Once the locations are pinned down, the profile itself is straightforward, but a few settings are worth getting right before the first run.

Filters and Excludes

Filtering matters more here than in a typical document backup, because AI installs include large amounts of regeneratable junk. Suggested exclude patterns to add to your filter:

__pycache__\ and *.pyc
venv\, .venv\, env\, conda-meta\
node_modules\
*.tmp, *.lock, *.partial, incomplete-*
logs\ and *.log (unless you specifically want diagnostic history)
.git\, .svn\, .hg\ if you have cloned repositories of model code

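For the bulk model side of the profile, you can also limit included files by extension so that nothing else slips in. The relevant extensions are .gguf, .safetensors, .bin, .ckpt, .pt, .pth, .onnx and .tflite. Adding .json alongside captures the small manifest and configuration files that sit beside model weights. An extension filter does skip files that have no extension, which includes the blobs Ollama stores under models\blobs\, so include that folder by path instead.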
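
To preview how rules like these would sort a folder before committing them to a profile, a short dry run can help. The Python sketch below uses ordinary glob matching to approximate the exclude list and the extension list. It is not SyncBackPro's own filter syntax, and the function name, category labels and sample paths are all illustrative.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# Patterns mirror the suggestions above; this is plain Python matching,
# not SyncBackPro's own filter syntax.
EXCLUDED_DIRS = {"__pycache__", "venv", ".venv", "env", "conda-meta",
                 "node_modules", "logs", ".git", ".svn", ".hg"}
EXCLUDED_FILES = ["*.pyc", "*.tmp", "*.lock", "*.partial", "incomplete-*", "*.log"]
MODEL_EXTENSIONS = {".gguf", ".safetensors", ".bin", ".ckpt", ".pt", ".pth",
                    ".onnx", ".tflite", ".json"}


def classify(relative_path: str) -> str:
    """Return 'excluded', 'bulk' or 'unlisted' for a path relative to the backup root."""
    path = PurePath(relative_path.replace("\\", "/"))
    # Any excluded folder name anywhere above the file rules the file out.
    if any(part.lower() in EXCLUDED_DIRS for part in path.parts[:-1]):
        return "excluded"
    name = path.name.lower()
    if any(fnmatchcase(name, pattern) for pattern in EXCLUDED_FILES):
        return "excluded"
    if path.suffix.lower() in MODEL_EXTENSIONS:
        return "bulk"
    return "unlisted"  # neither excluded nor matched by the extension list


# Hypothetical sample paths, for illustration only.
for sample in ("models\\llama.gguf", "ComfyUI\\venv\\Lib\\site.py",
               "cache\\model.bin.lock", "notes\\readme.txt"):
    print(f"{sample:32} {classify(sample)}")
```

Anything that comes back as "unlisted" is worth a second look. It is either junk that needs another exclude pattern or a file the extension list would silently leave out of the bulk profile.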

Compression and Delta Copy

Turn compression off for any profile that backs up model weights. GGUF, safetensors and similar formats store dense numerical weights, often already quantised, and that kind of data leaves little redundancy for a compressor to find. Running them through Zip-compatible compression typically saves only a few percent while making the backup substantially slower. The CPU time is better spent finishing the run quickly.
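
You can check this claim against your own files rather than take it on trust. The sketch below uses Python's zlib module, which implements the same DEFLATE method that Zip uses, to measure how far a sample of data shrinks. The function names and the 64 MB sample size are arbitrary choices, and random bytes serve here only as a worst-case stand-in for packed weights.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (1.0 means no saving)."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, level)) / len(data)


def sample_ratio(path: str, sample_bytes: int = 64 * 1024 * 1024) -> float:
    """Compress only the first sample_bytes of a file to keep the check quick."""
    with open(path, "rb") as handle:
        return compression_ratio(handle.read(sample_bytes))


if __name__ == "__main__":
    # Stand-ins: random bytes as a worst case, repeated text as a best case.
    print(f"random bytes:  {compression_ratio(os.urandom(1_000_000)):.2f}")
    print(f"repeated text: {compression_ratio(b'hello world ' * 100_000):.2f}")
```

Point sample_ratio at one of your own model files to get a real figure. The start of a model file often holds metadata that compresses better than the weights that follow, so a sample taken from the beginning may flatter the result slightly.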

Delta copy behaves differently depending on what is being backed up. Model files are written and replaced as whole units rather than edited in place, so delta copy has no opportunity to skip blocks. Vector databases and SQLite-backed application state are the opposite: they change in place, often only in small regions, and delta copy works well on them. Keep delta copy on for the small, fast-changing profile and accept that it does little for the bulk profile.
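
The difference between the two cases is easy to see with a toy comparison. The Python sketch below splits two versions of a file into fixed-size blocks and counts how many blocks differ. It illustrates the general idea only and makes no claim about how SyncBackPro's delta copy is implemented; the function names and the 4096-byte block size are assumptions chosen for the example.

```python
from __future__ import annotations

import hashlib
import os


def block_digests(data: bytes, block_size: int) -> list[str]:
    """SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> int:
    """Count blocks in new that differ from the block at the same position in old."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    changed = sum(1 for a, b in zip(old_digests, new_digests) if a != b)
    # Blocks beyond the end of the old file are new, so they count as changed.
    return changed + max(0, len(new_digests) - len(old_digests))


if __name__ == "__main__":
    database = bytes(4096 * 100)              # stand-in for a 100-block database
    edited = bytearray(database)
    edited[5000:5010] = b"x" * 10             # a small in-place update
    print("in-place edit:", changed_blocks(database, bytes(edited)), "blocks changed")

    old_model = os.urandom(4096 * 100)        # stand-in for a model file
    new_model = os.urandom(4096 * 100)        # a wholly replaced model file
    print("replaced file:", changed_blocks(old_model, new_model), "blocks changed")
```

The in-place edit touches a single block, while the replaced file shares nothing with its predecessor, which is why delta copy pays off for the first kind of file and not the second.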

Versioning

Versioning is one of the strongest features in SyncBackPro for protecting irreplaceable work, but it cuts the other way for large model files. If you update a 40 GB model and the backup is configured to keep three versions, you can end up spending 120 GB on old copies of a file that you can re-download.

The pragmatic split:

Configs and chats profile: enable versioning. Keep several versions over a reasonable time window. The data is small and the history is worth more than the space.
Bulk models profile: set the profile to replace existing files without versioning. If a model gets replaced, you do not want a stale 40 GB copy hanging around.
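
The space cost behind this split is worth working through once. The Python sketch below assumes every retained version is stored as a full copy, which is a simplifying assumption rather than a statement about how any particular profile stores versions; the function name and the chat database figures are illustrative.

```python
def version_storage_gb(file_size_gb: float, versions_kept: int) -> float:
    """Worst-case space used by old versions once every version slot is filled."""
    return file_size_gb * versions_kept


# One 40 GB model with three versions kept, as in the example above.
print(f"model: {version_storage_gb(40, 3):.1f} GB")
# A hypothetical 50 MB chat database with thirty versions kept.
print(f"chats: {version_storage_gb(0.05, 30):.1f} GB")
```

Thirty versions of a small database cost a fraction of what three versions of a single model do, and only the database history is something you cannot download again.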

Splitting Into Two Profiles

A single profile can technically cover the whole AI workspace, but the two halves have such different requirements that it is cleaner to use two.

AI configs (small, fast-changing): conversation history, prompts, character cards, vector databases, RAG documents, ComfyUI workflows, application settings. A few hundred megabytes to a few gigabytes in total. Set this profile to run every hour or two with versioning and delta copy enabled.
AI bulk (large, slow-changing): model weights, LoRAs, VAEs, ControlNet models, ONNX exports. Tens to hundreds of gigabytes. Set this profile to run weekly, or trigger it manually after you download or train something new. Versioning and compression off.

If you already use group queues, put both profiles into a single AI group so they run in a defined order without overlapping with your other backup jobs.

Schedule and Destination

For the configs profile, a fast local destination such as a NAS, a USB SSD or even a separate internal drive is a good first stop, with a second copy going to cloud storage on a daily schedule. The data is small enough that even modest cloud plans cope easily.

The bulk profile is the more expensive one. A few options work well in practice:

A dedicated external HDD that lives on the desk and is connected only during the weekly backup run. Cheap, fast, and an effective ransomware defence, since the drive is only attached during the backup window.
A NAS with enough headroom to hold the workspace plus a generation or two of changes. Pairs nicely with the configs profile pointing at the same NAS.
Object storage for offsite redundancy. Egress costs matter when restoring a 200 GB workspace, so consider providers with predictable egress pricing if you take this route.

If your home or office network is the bottleneck, schedule the bulk profile overnight and let it finish at its own pace.

Restore Considerations

A few things tend to trip up AI workspace restores that are worth knowing in advance.

Hardcoded paths. ComfyUI workflows and some Python scripts contain absolute paths to model files. If you restore to a different drive letter, expect to fix these manually or run a search-and-replace over the workflow JSON.
App version mismatches. Vector database formats and chat database schemas occasionally change between application versions. Keep a copy of the installer or installer link alongside the backup so you can restore the exact version that wrote the files.
Symbolic links. The Hugging Face cache uses symbolic links from the model directory into a content-addressed blob store. SyncBackPro can be configured to follow or preserve symbolic links, depending on whether you want a portable copy or a faithful clone. For a portable copy, choose to dereference links so that the destination contains the actual files.
GPU and runtime dependencies. A restored model may still fail if the required CUDA version, Python package, or inference runtime is missing. Keep a copy of environment notes alongside the backup.
Test a restore. Once a quarter, restore one model and one workflow into a scratch folder and confirm they load. This is the single best protection against silently broken backups.
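
The search-and-replace for hardcoded paths is safer when done on the parsed JSON than on the raw text, because backslashes are escaped inside a JSON file and a plain text replace can easily miss or mangle them. The Python sketch below walks a parsed workflow and swaps one path prefix for another. The function names are illustrative, and nothing here assumes a particular workflow schema: it simply rewrites any string value that starts with the old prefix.

```python
import json
from typing import Any


def remap_paths(node: Any, old_prefix: str, new_prefix: str) -> Any:
    """Return a copy of a parsed JSON structure with old_prefix swapped for new_prefix
    at the start of every string value (case-insensitive, as Windows paths are)."""
    if isinstance(node, dict):
        return {key: remap_paths(value, old_prefix, new_prefix) for key, value in node.items()}
    if isinstance(node, list):
        return [remap_paths(item, old_prefix, new_prefix) for item in node]
    if isinstance(node, str) and node.lower().startswith(old_prefix.lower()):
        return new_prefix + node[len(old_prefix):]
    return node  # numbers, booleans, None and non-matching strings pass through


def remap_workflow_file(source: str, target: str, old_prefix: str, new_prefix: str) -> None:
    """Read a workflow JSON file, remap its paths, and write the result to a new file."""
    with open(source, encoding="utf-8") as handle:
        workflow = json.load(handle)
    with open(target, "w", encoding="utf-8") as handle:
        json.dump(remap_paths(workflow, old_prefix, new_prefix), handle, indent=2)
```

A call such as remap_workflow_file("portrait.json", "portrait_fixed.json", "D:\\AI\\", "E:\\AI\\") writes a corrected copy and leaves the restored original untouched, so a mistake in the prefix costs nothing. The file names in that call are hypothetical.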

Conclusion

A local AI workspace looks like a backup problem at first glance, but it really is two problems stitched together: a small, fast-changing set of irreplaceable configuration and chat history, and a large, slow-changing set of model files that are mostly recoverable from the internet. Treat the two halves with the settings each one needs, filter aggressively, and the result is a backup that finishes in reasonable time without leaving holes in the data that matters.

If you are setting this up for the first time, the best starting move is to consolidate every tool's data under a single drive letter or path. From there, a pair of SyncBackPro profiles, sensible filters, and a weekly bulk run will keep the work you cannot easily replace safely copied somewhere else.

Ready to protect your local AI workspace? Create two SyncBackPro profiles today: one for irreplaceable AI data and one for large model files. Verify that you can restore both before you need them.