Onboarding Overview
In this guide you will learn:
- The four stages of onboarding: Discovery, Manifesting, Review, and Sync
- What each stage does and what you need to do at each step
- How local and cloud-stored data are handled differently
What onboarding does
Onboarding is the process of getting your existing image collection into finwave. The desktop app guides you through a structured pipeline that organizes your raw files into encounters, verifies the data, and uploads everything to the server where the ML pipeline processes it.
The pipeline
1. Discovery
Scan your image directories to build a file inventory. Discovery extracts metadata from every image (EXIF dates, GPS, camera info, IPTC creator) and analyzes folder structures and spreadsheets for additional encounter data.
- Local directories — Select folders on your machine or external drives. Scanning runs locally.
- Azure blob storage — Connect a storage account and scan blob containers remotely. A cloud worker processes the files without downloading them to your machine.
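As a rough illustration of what a file inventory for a local directory might look like, here is a minimal sketch. It is not the app's actual scanner: the names `InventoryEntry`, `build_inventory`, and `IMAGE_EXTENSIONS` are hypothetical, the extension list is an assumption, and the sketch records only path, folder, and size. The EXIF and IPTC extraction described above is left out.

```python
import os
from dataclasses import dataclass

# Assumed set of image extensions; the real list is not documented here.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


@dataclass
class InventoryEntry:
    path: str        # full path to the image file
    folder: str      # parent folder, a possible hint for grouping later
    size_bytes: int  # file size on disk


def build_inventory(root: str) -> list[InventoryEntry]:
    """Walk `root` recursively and record every file with an image extension."""
    entries = []
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            extension = os.path.splitext(name)[1].lower()
            if extension in IMAGE_EXTENSIONS:
                full_path = os.path.join(folder, name)
                entries.append(
                    InventoryEntry(full_path, folder, os.path.getsize(full_path))
                )
    return entries
```

Calling `build_inventory("/path/to/photos")` returns one entry per image found under that folder, skipping non-image files such as spreadsheets or notes.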
2. Manifesting
Configure how your files are grouped into encounters. A manifest defines the rules for extracting date, location, photographer, and individual IDs from your file metadata.
- Choose a grouping strategy (folder, filename, time proximity, or fusion)
- Configure field source mappings
- Resolve photographer names to finwave user accounts
- Preview the results and refine until the grouping looks correct
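To make the time proximity strategy more concrete, the sketch below groups capture times so that a new group starts whenever the gap since the previous image exceeds a threshold. This is one plausible reading of the strategy, not finwave's documented algorithm. The function name `group_by_time_proximity` and the 30-minute default are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def group_by_time_proximity(timestamps, max_gap=timedelta(minutes=30)):
    """Split capture times into groups separated by gaps larger than `max_gap`.

    The 30-minute default is an illustrative assumption, not a finwave setting.
    """
    groups = []
    for ts in sorted(timestamps):
        # Compare against the most recent timestamp in the current group.
        if groups and ts - groups[-1][-1] <= max_gap:
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups


shots = [
    datetime(2024, 5, 1, 10, 0),
    datetime(2024, 5, 1, 10, 10),
    datetime(2024, 5, 1, 11, 30),
]
groups = group_by_time_proximity(shots)
```

Here the first two shots fall into one group and the third, taken 80 minutes later, starts a second group. Tightening or loosening `max_gap` is the kind of adjustment you would expect to make while previewing results.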
3. Pre-Sync Review
After you approve a manifest, it is materialized into concrete encounter records. The pre-sync review lets you inspect each encounter:
- Verify data completeness (date, location, GPS, photographer)
- Assign licenses (Public, Public+Attribution, or Private)
- Assign organizations
- Approve or deny individual encounters
- Edit encounter data inline if corrections are needed
Learn more about Pre-Sync Review
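The completeness check in the review can be pictured as a simple test for empty fields. The sketch below assumes an encounter is represented as a dictionary with the keys `date`, `location`, `gps`, and `photographer`. That representation and the name `missing_fields` are hypothetical and are not taken from the app.

```python
# Field names mirror the completeness checklist; the dict layout is assumed.
REQUIRED_FIELDS = ("date", "location", "gps", "photographer")


def missing_fields(encounter: dict) -> list[str]:
    """Return the required fields that are absent or empty in `encounter`."""
    return [
        field for field in REQUIRED_FIELDS
        if encounter.get(field) in (None, "")
    ]


incomplete = {"date": "2024-05-01", "location": "", "photographer": "A. Rivera"}
gaps = missing_fields(incomplete)
```

For the example record, `gaps` lists `location` and `gps`, which are the fields you would fill in with inline editing before approving the encounter.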
4. Sync
Upload approved encounters and their images to finwave. The sync engine:
- Creates encounter records on the server (or finds existing ones via dedup)
- Uploads images via SAS URIs (local files) or blob-to-blob copy (Azure files)
- Tracks progress per encounter and per image
- Handles retries on transient errors automatically
- Triggers the ML pipeline for each uploaded image
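Automatic retries on transient errors are commonly implemented with exponential backoff. The sketch below shows that general pattern under stated assumptions: `TransientError`, `with_retries`, the attempt limit, and the delays are all hypothetical, and the guide does not say which retry policy the sync engine actually uses.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying, such as timeouts."""


def with_retries(operation, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run `operation`, retrying on TransientError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, and so on.
    The last failure is re-raised once `max_attempts` is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. Errors that are not transient, such as a rejected record, propagate immediately instead of being retried.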
Resumability
Each stage saves its state. You can close the app and resume later:
- Scans persist in the local database
- Manifests are saved as you edit them
- Materialized encounters persist across sessions
- Sync progress is tracked per image — interrupted syncs resume where they left off
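Per-image progress tracking is what makes an interrupted sync resumable. The sketch below illustrates the idea with SQLite from the Python standard library. The table name `image_sync`, its columns, and the helper functions are assumptions made for this example and do not describe the app's real database schema.

```python
import sqlite3


def open_progress_db(path=":memory:"):
    """Open (or create) a progress database with one row per image."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS image_sync ("
        "image_path TEXT PRIMARY KEY, "
        "uploaded INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def register_images(db, image_paths):
    """Add images to track; rows that already exist keep their status."""
    db.executemany(
        "INSERT OR IGNORE INTO image_sync (image_path) VALUES (?)",
        [(p,) for p in image_paths],
    )
    db.commit()


def mark_uploaded(db, image_path):
    """Record that one image finished uploading."""
    db.execute(
        "UPDATE image_sync SET uploaded = 1 WHERE image_path = ?", (image_path,)
    )
    db.commit()


def pending_images(db):
    """Return the images that still need uploading, in a stable order."""
    rows = db.execute(
        "SELECT image_path FROM image_sync WHERE uploaded = 0 ORDER BY image_path"
    )
    return [row[0] for row in rows]
```

Because each completed upload is committed individually, restarting after an interruption only requires calling `pending_images` again: images already marked as uploaded are skipped.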
Local vs. cloud data
| | Local directories | Azure blob storage |
|---|---|---|
| Discovery | Scans locally, fast | Cloud worker, may take hours for large datasets |
| Metadata | Full EXIF/IPTC extraction | Header-only download (128 KB per image) |
| Sync upload | Desktop uploads via SAS URI | Server-side blob-to-blob copy (fast, no local bandwidth) |
Related
- Azure Blob Storage — Connecting cloud storage
- Ensure Individuals — Pre-sync individual creation
- Upload Process — Technical upload details
- Getting Help — Support and diagnostics