09 - Data Management Guide

💾 Backup, Export, and Manage Your Data
⏱️ Time Estimate: 15 minutes
📋 What You’ll Learn: Export options, backup strategies, data recovery, storage management



Selfoss provides three levels of export:

| Level | Scope | Use Case | Feature |
|---|---|---|---|
| Individual | Single transcript | Share specific meeting | F006 |
| Project | All transcripts in project | Archive completed project | F006 |
| Full Backup | Entire application | Device migration, disaster recovery | F013 |

Transcript data:

  • 📄 JSON: Raw structured data
  • 📊 PDF: Professional report
  • 📝 TXT: Plain text transcript
  • 🖼️ PNG/SVG: Individual visualizations

Audio files:

  • 🎵 WebM: Original recording format
  • 📦 ZIP: Bundled with transcript

Database:

  • 🗄️ SQLite: Complete database backup
  • 📦 ZIP: Compressed archive

Step 1: Select Transcript

Click transcript in project list
↓
Right-click → Export Options

Step 2: Choose Format

┌─────────────────────────────┐
│ Export Transcript           │
├─────────────────────────────┤
│ ☐ JSON (raw data)           │
│ ☐ PDF (formatted report)    │
│ ☐ TXT (plain text)          │
│ ☐ Audio file (if recording) │
│                             │
│ [Cancel]           [Export] │
└─────────────────────────────┘

Step 3: Select Destination

  • File picker opens
  • Choose save location
  • Name the file

A JSON export contains:

{
  "metadata": {
    "title": "Q4 Planning Meeting",
    "date": "2024-10-16",
    "participants": ["John", "Sarah", "Mike"],
    "upload_date": "2024-10-16T14:30:00Z"
  },
  "decisions": [
    {
      "id": "D1",
      "text": "Approve Q4 marketing budget: $500K",
      "timestamp": "00:15:30",
      "flows_to": ["A1", "A2"]
    }
  ],
  "actions": [
    {
      "id": "A1",
      "task": "Finalize vendor contracts",
      "owner": "John",
      "deadline": "2024-10-20",
      "priority": "high",
      "source_decision": "D1"
    }
  ],
  "concepts": [...],
  "temporal_data": {...},
  "processing_metadata": {
    "model_used": "gpt-4o-mini",
    "provider": "openai",
    "tokens_used": 5234,
    "processing_cost": 0.0023
  }
}

Use cases:

  • Integration with other tools
  • Custom analysis scripts
  • Data migration
  • Backup reference
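
For custom analysis scripts, an exported JSON file can be read with any JSON library. The sketch below assumes the field names shown in the sample export (`decisions`, `actions`, `source_decision`); the helper name `actions_by_decision` is illustrative and not part of Selfoss.

```python
import json
from collections import defaultdict


def actions_by_decision(export: dict) -> dict:
    """Group action tasks under the text of the decision they came from."""
    decision_text = {d["id"]: d["text"] for d in export.get("decisions", [])}
    grouped = defaultdict(list)
    for action in export.get("actions", []):
        # Fall back to the raw ID if the decision is missing from the export.
        source = action.get("source_decision")
        grouped[decision_text.get(source, source)].append(action["task"])
    return dict(grouped)


# Trimmed copy of the sample export shown above.
sample = json.loads("""
{
  "decisions": [{"id": "D1", "text": "Approve Q4 marketing budget: $500K"}],
  "actions": [{"id": "A1", "task": "Finalize vendor contracts",
               "owner": "John", "source_decision": "D1"}]
}
""")
print(actions_by_decision(sample))
```

To run it against a real export, load the file you saved in Step 3 with `json.load` instead of using the inline `sample`.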

A PDF export contains:

┌────────────────────────────┐
│ Q4 Planning Meeting        │
│ October 16, 2024           │
├────────────────────────────┤
│ Participants:              │
│ • John Smith               │
│ • Sarah Johnson            │
│ • Mike Chen                │
├────────────────────────────┤
│ Decisions:                 │
│ 1. Approve Q4 marketing... │
│                            │
│ [Decision Flowchart PNG]   │
│                            │
│ Actions:                   │
│ [Action Matrix Table]      │
│                            │
│ Concepts:                  │
│ [Mind Map PNG]             │
└────────────────────────────┘

Use cases:

  • Sharing with executives
  • Meeting minutes documentation
  • Archival records
  • Stakeholder reports

For recorded transcripts:

  1. Right-click transcript
  2. Select “Download Audio”
  3. Original WebM file downloads

File details:

  • Format: WebM (original)
  • Size: ~1MB per minute
  • Includes: Full audio recording
  • Location: Choose save destination

💡 Pro Tip: Export audio before deleting transcripts if you might need it later.


Step 1: Right-Click Project

Projects Sidebar
↓
Right-click project → Export Project

Step 2: Choose Options

┌─────────────────────────────────┐
│ Export Project: Engineering     │
├─────────────────────────────────┤
│ ☑ Transcripts (12 files)        │
│ ☑ Audio recordings (8 files)    │
│ ☑ Visualizations (36 images)    │
│ ☐ Processing metadata           │
│                                 │
│ Format: [ZIP Archive ▼]         │
│                                 │
│ [Cancel]       [Export Project] │
└─────────────────────────────────┘

Step 3: Wait for Processing

  • Progress bar shows status
  • Large projects may take 1-2 minutes
  • Audio files are the largest component

Archive structure:

Engineering_Project_2024-10-16.zip
├── project_metadata.json
├── transcripts/
│   ├── meeting_001.json
│   ├── meeting_002.json
│   └── ...
├── audio_recordings/
│   ├── recording_001.webm
│   ├── recording_002.webm
│   └── ...
└── visualizations/
    ├── meeting_001_flowchart.png
    ├── meeting_001_mindmap.svg
    └── ...
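
A quick way to sanity-check a project archive is to count the files in each top-level folder. This is a minimal sketch using Python's standard `zipfile` module; it assumes the folder layout shown above, and `summarize_archive` is a hypothetical helper name.

```python
import io
import zipfile
from collections import Counter


def summarize_archive(zip_file) -> Counter:
    """Count files per top-level folder in a project archive."""
    counts = Counter()
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            top = name.split("/", 1)[0] if "/" in name else "(root)"
            counts[top] += 1
    return counts


# Build a tiny in-memory archive that mimics the layout above.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("project_metadata.json", "{}")
    archive.writestr("transcripts/meeting_001.json", "{}")
    archive.writestr("audio_recordings/recording_001.webm", b"")
    archive.writestr("visualizations/meeting_001_flowchart.png", b"")
demo.seek(0)
print(summarize_archive(demo))
```

Passing the path of a real exported ZIP instead of `demo` lets you compare the counts with the numbers shown in the export dialog.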

Project completion:

  • Archive finished initiatives
  • Free up database space
  • Create handoff packages

Client deliverables:

  • Package all meeting records
  • Include audio for reference
  • Professional presentation

Compliance:

  • Legal record keeping
  • Audit trail documentation
  • Historical reference

Create a complete snapshot of all Selfoss data.

Step 1: Open Settings

Settings → Data Management → Export All Data

Step 2: Configure Backup

┌─────────────────────────────────┐
│ Full Application Backup         │
├─────────────────────────────────┤
│ Include:                        │
│ ☑ All projects (5)              │
│ ☑ All transcripts (47)          │
│ ☑ All audio recordings (23)     │
│ ☑ LLM settings                  │
│ ☑ Application preferences       │
│ ☑ License information           │
│                                 │
│ Estimated size: 2.3 GB          │
│ Destination: Downloads folder   │
│                                 │
│ [Cancel]        [Create Backup] │
└─────────────────────────────────┘

Step 3: Wait for Completion

  • Progress indicator shows stages:
    • Exporting database…
    • Archiving audio files…
    • Compressing data…
    • Saving to disk…

⏱️ Time estimate:

  • Small (< 100 transcripts): 30 seconds
  • Medium (100-500 transcripts): 2-3 minutes
  • Large (500+ transcripts): 5-10 minutes

Backup structure:

selfoss_backup_2024-10-16_14-30-00.zip
├── metadata.json
├── projects.json
├── transcripts.json
├── llm_settings.json
├── visualizations.json
└── audio_recordings/
    ├── project_1/
    │   ├── recording_001.webm
    │   └── recording_002.webm
    ├── project_2/
    │   └── recording_003.webm
    └── ...

The metadata.json file contains:

{
  "backup_version": "1.0",
  "backup_date": "2024-10-16T14:30:00Z",
  "selfoss_version": "1.5.2",
  "total_projects": 5,
  "total_transcripts": 47,
  "total_audio_files": 23,
  "total_size_bytes": 2400000000,
  "includes_audio": true,
  "includes_settings": true
}

Use cases:

  • Validate backup integrity
  • Track backup versions
  • Verify completeness
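
One way to use `metadata.json` for an integrity check is to compare its counts with what the archive actually holds. The sketch below is an assumption about how such a check could look, not a description of Selfoss internals: it assumes `total_audio_files` counts the `.webm` files under `audio_recordings/`, and `check_backup` is a hypothetical name.

```python
import io
import json
import zipfile


def check_backup(zip_file) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    with zipfile.ZipFile(zip_file) as archive:
        bad_member = archive.testzip()  # CRC check of every member
        if bad_member is not None:
            problems.append(f"corrupt member: {bad_member}")
        names = archive.namelist()
        if "metadata.json" not in names:
            return problems + ["metadata.json is missing"]
        meta = json.loads(archive.read("metadata.json"))
        audio = [n for n in names
                 if n.startswith("audio_recordings/") and n.endswith(".webm")]
        expected = meta.get("total_audio_files")
        if meta.get("includes_audio") and len(audio) != expected:
            problems.append(f"expected {expected} audio files, found {len(audio)}")
    return problems


# Demo: an in-memory backup whose metadata matches its contents.
demo_backup = io.BytesIO()
with zipfile.ZipFile(demo_backup, "w") as archive:
    archive.writestr("metadata.json",
                     json.dumps({"includes_audio": True, "total_audio_files": 1}))
    archive.writestr("audio_recordings/project_1/recording_001.webm", b"x")
print(check_backup(demo_backup))
```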

Step 1: Upload JSON File

Upload & Process → Select JSON file
↓
Selfoss detects JSON format
↓
"Import processed transcript?" prompt

Step 2: Choose Import Options

┌─────────────────────────────────┐
│ Import Transcript               │
├─────────────────────────────────┤
│ Title: Q4 Planning Meeting      │
│ Date: Oct 16, 2024              │
│                                 │
│ Destination:                    │
│ ○ New Project: [___________]    │
│ ● Existing: [Engineering ▼]     │
│                                 │
│ ☐ Overwrite if exists           │
│                                 │
│ [Cancel]               [Import] │
└─────────────────────────────────┘

Step 3: Verify Import

  • Transcript appears in project
  • All visualizations recreated
  • Metadata preserved

Step 1: Select Backup File

Settings → Data Management → Import Data
↓
File picker → Select .zip backup

Step 2: Conflict Resolution

┌─────────────────────────────────┐
│ Import Backup                   │
├─────────────────────────────────┤
│ Backup contains:                │
│ • 5 projects                    │
│ • 47 transcripts                │
│ • 23 audio recordings           │
│                                 │
│ Conflicts detected:             │
│ • 2 projects already exist      │
│ • 8 transcripts already exist   │
│                                 │
│ ☑ Overwrite existing data       │
│ ☐ Skip duplicates               │
│ ☐ Merge (keep both)             │
│                                 │
│ [Cancel]               [Import] │
└─────────────────────────────────┘

Step 3: Wait for Import

  • Progress shows:
    • Extracting archive…
    • Importing projects…
    • Restoring transcripts…
    • Copying audio files…
    • Rebuilding database…

⏱️ Time estimate: Similar to backup time

Step 4: Review Results

Example result with “Skip duplicates” selected:

✅ Import Complete
Imported:
• 3 projects
• 39 transcripts
• 23 audio files
Skipped:
• 2 duplicate projects
• 8 duplicate transcripts
Errors:
• None

Overwrite Mode:

  • Replaces existing data
  • Use for: Device migration, clean restore

Skip Duplicates:

  • Keeps existing, ignores conflicts
  • Use for: Selective restore, adding new data

Merge Mode:

  • Renames imported items (adds suffix)
  • Use for: Comparing versions, keeping both
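
The three modes differ only in what happens when an imported item has the same name as an existing one. The sketch below models that on plain dictionaries to make the difference concrete; it is not Selfoss code, and the `" (imported)"` suffix is an assumed format since the actual suffix is not specified here.

```python
def resolve_conflicts(existing: dict, incoming: dict, mode: str) -> dict:
    """Combine two {name: data} mappings according to the chosen import mode."""
    result = dict(existing)
    for name, data in incoming.items():
        if name not in result:
            result[name] = data      # no conflict: always imported
        elif mode == "overwrite":
            result[name] = data      # imported copy replaces the existing one
        elif mode == "skip":
            continue                 # existing copy is kept untouched
        elif mode == "merge":
            # Keep both copies; the suffix format is an assumption.
            new_name = f"{name} (imported)"
            counter = 2
            while new_name in result:
                new_name = f"{name} (imported {counter})"
                counter += 1
            result[new_name] = data
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


existing = {"Engineering": "old"}
incoming = {"Engineering": "new", "Marketing": "m"}
for mode in ("overwrite", "skip", "merge"):
    print(mode, resolve_conflicts(existing, incoming, mode))
```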

💡 Pro Tip: Test imports on a backup device first to verify data integrity.


Windows:

C:\Users\{Username}\AppData\Roaming\selfoss\
├── selfoss.db (SQLite database)
├── audio_recordings/
│   ├── project_1/
│   └── project_2/
└── cache/
    └── whisper_models/

macOS:

~/Library/Application Support/selfoss/
├── selfoss.db
├── audio_recordings/
└── cache/

Linux:

~/.local/share/selfoss/
├── selfoss.db
├── audio_recordings/
└── cache/
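
Scripts that need to find the data directory can branch on the platform. This is a minimal sketch that simply encodes the three default paths listed above; `selfoss_data_dir` is a hypothetical helper, and a customized installation may store data elsewhere.

```python
import os
import sys
from pathlib import Path


def selfoss_data_dir(platform: str = sys.platform) -> Path:
    """Return the default data directory listed above for the given platform."""
    home = Path.home()
    if platform.startswith("win"):
        # APPDATA normally points at the user's AppData/Roaming folder.
        appdata = os.environ.get("APPDATA", str(home / "AppData" / "Roaming"))
        return Path(appdata) / "selfoss"
    if platform == "darwin":
        return home / "Library" / "Application Support" / "selfoss"
    return home / ".local" / "share" / "selfoss"


print(selfoss_data_dir())
```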

Structure:

audio_recordings/
├── project_1/
│   ├── recording_2024-10-16_14-30-25.webm
│   └── recording_2024-10-16_15-45-10.webm
├── project_2/
│   └── recording_2024-10-17_09-15-30.webm
└── ...

Naming convention:

recording_YYYY-MM-DD_HH-MM-SS.webm

Location:

cache/whisper_models/
├── tiny.en.bin (75MB)
├── base.en.bin (140MB)
└── small.en.bin (460MB)

Purpose:

  • Local transcription models
  • Downloaded via Ollama
  • Shared across all transcripts

SQLite file: selfoss.db

Key tables:

  • projects - Project metadata
  • transcripts - Transcript data and processed JSON
  • llm_settings - AI provider configurations
  • visualizations - Generated images
  • api_usage - Cost tracking (planned)

Typical 100-transcript installation:

Database (SQLite): 50 MB
Audio recordings: 6 GB (100 hours @ 1 MB/min)
Whisper models: 1 GB
Visualizations cache: 200 MB
Total: ~7.25 GB
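
The total is simple arithmetic over the components. As a worked sketch: at roughly 1 MB per minute, 6 GB of audio corresponds to about 100 hours of recordings. The function name and default values below are illustrative and just restate the figures from the breakdown, using 1 GB = 1000 MB.

```python
def estimate_storage_mb(audio_hours: float,
                        database_mb: float = 50,
                        models_mb: float = 1000,
                        cache_mb: float = 200,
                        audio_mb_per_min: float = 1.0) -> float:
    """Estimate total disk use in MB from the per-component figures above."""
    audio_mb = audio_hours * 60 * audio_mb_per_min
    return database_mb + audio_mb + models_mb + cache_mb


total = estimate_storage_mb(audio_hours=100)
print(f"{total / 1000:.2f} GB")  # prints 7.25 GB
```

Audio dominates the estimate, so changing `audio_hours` is the quickest way to see how your own usage would scale.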

Windows:

Settings → System → Storage
↓
Look for Selfoss in app list

macOS:

du -sh ~/Library/Application\ Support/selfoss

Linux:

du -sh ~/.local/share/selfoss
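
For a per-folder breakdown that works the same way on all three platforms, the directory can be walked from a script. A minimal sketch, assuming only the standard library; `folder_sizes` is a hypothetical helper and the demo uses a temporary directory rather than a real Selfoss installation.

```python
import tempfile
from pathlib import Path


def folder_sizes(data_dir: Path) -> dict:
    """Return {entry name: size in bytes} for each top-level entry in data_dir."""
    sizes = {}
    for entry in data_dir.iterdir():
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
        else:
            # Sum every file below this folder, however deeply nested.
            sizes[entry.name] = sum(
                f.stat().st_size for f in entry.rglob("*") if f.is_file()
            )
    return sizes


# Demo on a throwaway directory that mimics the data layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "selfoss.db").write_bytes(b"x" * 10)
    (root / "audio_recordings" / "project_1").mkdir(parents=True)
    (root / "audio_recordings" / "project_1" / "a.webm").write_bytes(b"x" * 30)
    print(folder_sizes(root))
```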

1. Delete Old Recordings

Settings → Data Management → Audio Cleanup
↓
Select recordings older than: [6 months ▼]
↓
Preview: 45 recordings, 2.7 GB
↓
[Delete Selected]
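
Because recording file names embed their timestamp (`recording_YYYY-MM-DD_HH-MM-SS.webm`), a similar preview can be produced outside the app. The sketch below only lists candidates and deletes nothing; the function names are hypothetical, and "6 months" is approximated as 180 days in the usage shown in the comment.

```python
import re
from datetime import datetime, timedelta

NAME_PATTERN = re.compile(r"recording_(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.webm$")


def recorded_at(filename: str):
    """Parse the timestamp from a recording file name, or return None."""
    match = NAME_PATTERN.search(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d_%H-%M-%S")


def older_than(filenames, days: int, now: datetime) -> list:
    """Return the file names whose embedded timestamp is more than `days` old."""
    cutoff = now - timedelta(days=days)
    old = []
    for name in filenames:
        timestamp = recorded_at(name)
        if timestamp is not None and timestamp < cutoff:
            old.append(name)
    return old


names = [
    "recording_2024-10-16_14-30-25.webm",
    "recording_2024-01-05_09-00-00.webm",
    "notes.txt",  # ignored: does not match the naming convention
]
# Roughly six months before 17 October 2024.
print(older_than(names, days=180, now=datetime(2024, 10, 17)))
```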

2. Export and Delete Projects

1. Export project archive
2. Verify backup is complete
3. Delete project from Selfoss
4. Move archive to external storage

3. Clean Whisper Models

Keep only models you use:
☑ whisper:base (140MB) ← Keep
☐ whisper:tiny (75MB) ← Delete
☐ whisper:small (460MB) ← Delete
Saves: 535 MB

4. Database Optimization

Settings → Advanced → Optimize Database
↓
Vacuum: Reclaim unused space
Reindex: Improve query performance
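
Vacuum and reindex are standard SQLite operations, so the same maintenance can be sketched with Python's built-in `sqlite3` module. Whether the menu item runs exactly these statements is an assumption. Quit Selfoss and make a backup before running anything like this against `selfoss.db`.

```python
import sqlite3


def optimize_database(db_path: str) -> None:
    """Run VACUUM and REINDEX on a SQLite database file."""
    # isolation_level=None means autocommit; VACUUM cannot run inside a transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")   # rebuild the file, releasing unused pages
        conn.execute("REINDEX")  # rebuild all indexes
    finally:
        conn.close()


# Harmless demo on a temporary in-memory database.
optimize_database(":memory:")
```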

Selfoss monitors disk space:

⚠️ Low Disk Space Warning
Available: 850 MB
Required: 2 GB (recommended)
Suggestions:
• Delete old recordings (1.5 GB)
• Export and remove projects (500 MB)
• Move backups to external drive
[Manage Storage] [Dismiss]
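
The same kind of check can be scripted, for example before starting a large export. This is a sketch using `shutil.disk_usage` from the standard library; the 2 GB threshold mirrors the recommended figure in the warning, and `low_disk_space` is a hypothetical name rather than anything Selfoss exposes.

```python
import shutil

RECOMMENDED_FREE_BYTES = 2 * 1000**3  # 2 GB, the recommended figure shown above


def low_disk_space(path: str = ".", required: int = RECOMMENDED_FREE_BYTES) -> bool:
    """Return True when the volume holding `path` has less free space than required."""
    free = shutil.disk_usage(path).free
    return free < required


print("Low disk space" if low_disk_space() else "Enough free space")
```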

Scenario: Device crash, need to restore everything

Steps:

  1. Install Selfoss on new device
  2. Open Settings → Data Management → Import Data
  3. Select full backup ZIP file
  4. Choose “Overwrite” mode
  5. Wait for import (5-10 minutes)
  6. ✅ All data restored

Scenario: Accidentally deleted project

Steps:

  1. Locate most recent backup
  2. Import backup with “Skip duplicates” mode
  3. Deleted project is restored
  4. Existing data unchanged

Symptoms:

  • App won’t open
  • SQLite errors
  • Missing data

Solutions:

Option 1: Restore from backup

1. Rename corrupted selfoss.db
2. Import latest backup
3. Verify data integrity

Option 2: Database repair

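# Linux (on macOS, cd to ~/Library/Application\ Support/selfoss/ instead)
cd ~/.local/share/selfoss/
# Quit Selfoss first, then check the database for corruption
sqlite3 selfoss.db "PRAGMA integrity_check;"
# Copy whatever is readable into a new file (.recover needs SQLite 3.29 or later)
sqlite3 selfoss.db ".recover" | sqlite3 selfoss_recovered.db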
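
If the `sqlite3` command-line tool is not installed, the integrity check can also be run from Python, whose standard library bundles SQLite. A minimal sketch with a hypothetical helper name; note that a badly damaged file may raise `sqlite3.DatabaseError` instead of returning messages.

```python
import sqlite3


def integrity_problems(db_path: str) -> list:
    """Return the messages from PRAGMA integrity_check; empty means healthy."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    # A healthy database reports a single row containing "ok".
    return [] if messages == ["ok"] else messages


# Demo on an empty in-memory database, which is always healthy.
print(integrity_problems(":memory:"))
```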

Option 3: Manual export

If database accessible:
1. Export all projects individually
2. Reinstall Selfoss (fresh database)
3. Import exported projects

Scenario: Audio files deleted but database intact

If you have backup:

1. Extract backup ZIP
2. Copy audio_recordings/ folder
3. Paste into Selfoss data directory
4. Restart Selfoss
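
Steps 1 to 3 can be combined by extracting only the `audio_recordings/` folder straight into the data directory. The sketch below assumes the backup layout shown earlier in this guide; `restore_audio` is a hypothetical helper, and the demo writes into a temporary directory rather than a real installation.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def restore_audio(backup_zip, data_dir: Path) -> int:
    """Extract only audio_recordings/ from a backup into data_dir; return file count."""
    with zipfile.ZipFile(backup_zip) as archive:
        members = [n for n in archive.namelist()
                   if n.startswith("audio_recordings/")]
        archive.extractall(data_dir, members=members)
    return sum(1 for n in members if not n.endswith("/"))


# Demo with an in-memory backup and a temporary "data directory".
backup = io.BytesIO()
with zipfile.ZipFile(backup, "w") as archive:
    archive.writestr("metadata.json", "{}")
    archive.writestr("audio_recordings/project_1/recording_001.webm", b"audio")
backup.seek(0)
with tempfile.TemporaryDirectory() as tmp:
    print(restore_audio(backup, Path(tmp)))
```

Close Selfoss before copying files into its data directory, then restart it as in step 4.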

If no backup:

  • Audio is lost
  • Transcripts still accessible
  • Can re-record if needed

Recommended:

Daily: None (auto-save handles this)
Weekly: Quick export of changed projects
Monthly: Full application backup
Quarterly: Archive old projects, clean storage

Automation (planned):

  • Scheduled automatic backups
  • Incremental backups (changes only)
  • Cloud sync options

Local:

  • ✅ Fast access
  • ✅ No internet needed
  • ❌ Vulnerable to device failure

External drive:

  • ✅ Safe from device issues
  • ✅ Large capacity
  • ❌ Must remember to back up

Cloud storage:

  • ✅ Off-site protection
  • ✅ Accessible anywhere
  • ❌ Privacy concerns (encrypt first)

Recommended approach:

Primary: Local SSD/HDD
Secondary: External drive (monthly)
Tertiary: Encrypted cloud (quarterly)

Manual versioning:

selfoss_backup_2024-10-16.zip ← Latest
selfoss_backup_2024-10-01.zip ← Last month
selfoss_backup_2024-09-01.zip ← Archive

Keep:

  • Latest backup (always)
  • Monthly backups (last 6 months)
  • Quarterly backups (last 2 years)
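
The retention rule can be turned into a small selection routine. The interpretation below is an assumption: "monthly" is read as the newest backup in each calendar month from roughly the last 6 months (183 days), and "quarterly" as the newest backup in each calendar quarter from roughly the last 2 years (730 days). `backups_to_keep` is a hypothetical helper.

```python
from datetime import date


def backups_to_keep(backup_dates, today: date) -> set:
    """Pick backup dates to keep: latest, monthly for ~6 months, quarterly for ~2 years."""
    keep = set()
    if not backup_dates:
        return keep
    keep.add(max(backup_dates))  # always keep the latest backup
    newest_per_month = {}
    newest_per_quarter = {}
    for d in backup_dates:
        age_days = (today - d).days
        month_key = (d.year, d.month)
        quarter_key = (d.year, (d.month - 1) // 3)
        if age_days <= 183:
            newest_per_month[month_key] = max(d, newest_per_month.get(month_key, d))
        if age_days <= 730:
            newest_per_quarter[quarter_key] = max(d, newest_per_quarter.get(quarter_key, d))
    keep.update(newest_per_month.values())
    keep.update(newest_per_quarter.values())
    return keep


dates = [
    date(2024, 10, 16), date(2024, 10, 1), date(2024, 9, 1),
    date(2023, 11, 15), date(2023, 11, 1), date(2021, 1, 1),
]
print(sorted(backups_to_keep(dates, today=date(2024, 10, 16))))
```

Anything not in the returned set is a candidate for deletion; review the list by hand before removing files.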

Verify backups work:

1. Create test project
2. Add test transcript
3. Export full backup
4. Delete test data
5. Restore from backup
6. Verify test data returned

Frequency: Test quarterly

Before sharing backups:

  • Remove sensitive transcripts
  • Redact participant names
  • Clear API keys (Settings export)
  • Encrypt ZIP file with password

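Encrypted backups:

# Linux/macOS
# -e prompts for a password; -r includes subfolders such as audio_recordings/
zip -er selfoss_backup.zip selfoss_backup/
# Note: zip -e uses the legacy ZipCrypto scheme, which is weak by modern standards
# Extract (prompts for the password)
unzip selfoss_backup.zip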

🎉 You’re now a data management expert!

  1. 💾 Create first full backup - Test the process
  2. 📅 Set backup schedule - Calendar reminders
  3. 🗂️ Organize backup storage - External drive or cloud
  4. 🧪 Test restore process - Verify backups work
  5. 🔒 Consider encryption → 14_PRIVACY_SECURITY_GUIDE.md

  • Automated backup scripts
  • Database optimization
  • Selective sync for large datasets
  • Disaster recovery planning

💾 Your data, protected and portable.