09 - Data Management Guide
💾 Backup, Export, and Manage Your Data
⏱️ Time Estimate: 15 minutes
What You’ll Learn: Export options, backup strategies, data recovery, storage management
Table of Contents
- Export Options Overview
- Individual Transcript Export
- Project-Level Export
- Full Application Backup
- Import Workflows
- Storage Locations
- Disk Space Management
- Data Recovery
- Best Practices
Export Options Overview
Feature: F006 & F013 - Export/Backup System
Selfoss provides three levels of export:
| Level | Scope | Use Case | Feature |
|---|---|---|---|
| Individual | Single transcript | Share specific meeting | F006 |
| Project | All transcripts in project | Archive completed project | F006 |
| Full Backup | Entire application | Device migration, disaster recovery | F013 |
Export Formats
Transcript data:
- JSON: Raw structured data
- PDF: Professional report
- TXT: Plain text transcript
- PNG/SVG: Individual visualizations
Audio files:
- WebM: Original recording format
- ZIP: Bundled with transcript
Database:
- SQLite: Complete database backup
- ZIP: Compressed archive
Individual Transcript Export
Exporting Single Transcripts
Step 1: Select Transcript

    Click transcript in project list
      ↓
    Right-click → Export Options

Step 2: Choose Format

    ┌────────────────────────────────┐
    │ Export Transcript              │
    ├────────────────────────────────┤
    │ [ ] JSON (raw data)            │
    │ [ ] PDF (formatted report)     │
    │ [ ] TXT (plain text)           │
    │ [ ] Audio file (if recording)  │
    │                                │
    │              [Cancel] [Export] │
    └────────────────────────────────┘

Step 3: Select Destination
- File picker opens
- Choose save location
- Name the file
JSON Export Format
Contains:

    {
      "metadata": {
        "title": "Q4 Planning Meeting",
        "date": "2024-10-16",
        "participants": ["John", "Sarah", "Mike"],
        "upload_date": "2024-10-16T14:30:00Z"
      },
      "decisions": [
        {
          "id": "D1",
          "text": "Approve Q4 marketing budget: $500K",
          "timestamp": "00:15:30",
          "flows_to": ["A1", "A2"]
        }
      ],
      "actions": [
        {
          "id": "A1",
          "task": "Finalize vendor contracts",
          "owner": "John",
          "deadline": "2024-10-20",
          "priority": "high",
          "source_decision": "D1"
        }
      ],
      "concepts": [...],
      "temporal_data": {...},
      "processing_metadata": {
        "model_used": "gpt-4o-mini",
        "provider": "openai",
        "tokens_used": 5234,
        "processing_cost": 0.0023
      }
    }

Use cases:
- Integration with other tools
- Custom analysis scripts
- Data migration
- Backup reference
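As an illustration of the "custom analysis scripts" use case, the sketch below groups action items by owner. It assumes the export has the `actions` fields shown in the example above (`owner`, `task`, `deadline`); the function names `load_transcript` and `actions_by_owner` are hypothetical helpers, not part of Selfoss.

```python
import json
from collections import defaultdict


def load_transcript(path: str) -> dict:
    """Read one exported transcript JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def actions_by_owner(transcript: dict) -> dict:
    """Group the export's action items by their "owner" field."""
    grouped = defaultdict(list)
    for action in transcript.get("actions", []):
        # Fall back to a placeholder if an action has no owner recorded.
        grouped[action.get("owner", "unassigned")].append(action)
    return dict(grouped)


# Inline sample that mirrors the fields of the example export above.
sample = {
    "actions": [
        {
            "id": "A1",
            "task": "Finalize vendor contracts",
            "owner": "John",
            "deadline": "2024-10-20",
            "priority": "high",
            "source_decision": "D1",
        }
    ]
}

for owner, items in actions_by_owner(sample).items():
    for item in items:
        print(f"{owner}: {item['task']} (due {item['deadline']})")
```

For a real export, replace `sample` with `load_transcript("path/to/export.json")`.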
PDF Export Format
Contains:

    ┌──────────────────────────────┐
    │ Q4 Planning Meeting          │
    │ October 16, 2024             │
    ├──────────────────────────────┤
    │ Participants:                │
    │ • John Smith                 │
    │ • Sarah Johnson              │
    │ • Mike Chen                  │
    ├──────────────────────────────┤
    │ Decisions:                   │
    │ 1. Approve Q4 marketing...   │
    │                              │
    │ [Decision Flowchart PNG]     │
    │                              │
    │ Actions:                     │
    │ [Action Matrix Table]        │
    │                              │
    │ Concepts:                    │
    │ [Mind Map PNG]               │
    └──────────────────────────────┘

Use cases:
- Sharing with executives
- Meeting minutes documentation
- Archival records
- Stakeholder reports
Exporting Audio Files
For recorded transcripts:
- Right-click transcript
- Select “Download Audio”
- Original WebM file downloads
File details:
- Format: WebM (original)
- Size: ~1 MB per minute
- Includes: Full audio recording
- Location: Choose save destination
💡 Pro Tip: Export audio before deleting transcripts if you might need it later.
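The size figure above translates into a quick estimate of how much space a recording will need. This is a minimal sketch that treats "~1 MB per minute" as a fixed rate; real WebM sizes depend on the audio content and encoder settings, and `estimate_audio_mb` is a hypothetical helper name.

```python
def estimate_audio_mb(duration_minutes: float, mb_per_minute: float = 1.0) -> float:
    """Approximate WebM size in MB from the guide's ~1 MB per minute figure."""
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    return duration_minutes * mb_per_minute


# A 45-minute meeting and a 2-hour workshop at the default rate.
print(estimate_audio_mb(45), "MB")
print(estimate_audio_mb(2 * 60), "MB")
```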
Project-Level Export
Exporting Entire Projects
Step 1: Right-Click Project

    Projects Sidebar
      ↓
    Right-click project → Export Project

Step 2: Choose Options

    ┌──────────────────────────────────┐
    │ Export Project: Engineering      │
    ├──────────────────────────────────┤
    │ [x] Transcripts (12 files)       │
    │ [x] Audio recordings (8 files)   │
    │ [x] Visualizations (36 images)   │
    │ [x] Processing metadata          │
    │                                  │
    │ Format: [ZIP Archive ▼]          │
    │                                  │
    │        [Cancel] [Export Project] │
    └──────────────────────────────────┘

Step 3: Wait for Processing
- Progress bar shows status
- Large projects may take 1-2 minutes
- Audio files are largest component
Project Archive Structure

    Engineering_Project_2024-10-16.zip
    ├── project_metadata.json
    ├── transcripts/
    │   ├── meeting_001.json
    │   ├── meeting_002.json
    │   └── ...
    ├── audio_recordings/
    │   ├── recording_001.webm
    │   ├── recording_002.webm
    │   └── ...
    └── visualizations/
        ├── meeting_001_flowchart.png
        ├── meeting_001_mindmap.svg
        └── ...

Use Cases
Project completion:
- Archive finished initiatives
- Free up database space
- Create handoff packages
Client deliverables:
- Package all meeting records
- Include audio for reference
- Professional presentation
Compliance:
- Legal record keeping
- Audit trail documentation
- Historical reference
Full Application Backup
Feature: F013 - Complete Data Export
Create a complete snapshot of all Selfoss data.
Creating Full Backup
Step 1: Open Settings

    Settings → Data Management → Export All Data

Step 2: Configure Backup

    ┌──────────────────────────────────┐
    │ Full Application Backup          │
    ├──────────────────────────────────┤
    │ Include:                         │
    │ [x] All projects (5)             │
    │ [x] All transcripts (47)         │
    │ [x] All audio recordings (23)    │
    │ [x] LLM settings                 │
    │ [x] Application preferences      │
    │ [x] License information          │
    │                                  │
    │ Estimated size: 2.3 GB           │
    │ Destination: Downloads folder    │
    │                                  │
    │         [Cancel] [Create Backup] │
    └──────────────────────────────────┘

Step 3: Wait for Completion
- Progress indicator shows stages:
  - Exporting database…
  - Archiving audio files…
  - Compressing data…
  - Saving to disk…
⏱️ Time estimate:
- Small (< 100 transcripts): 30 seconds
- Medium (100-500): 2-3 minutes
- Large (500+): 5-10 minutes
Backup Archive Structure

    selfoss_backup_2024-10-16_14-30-00.zip
    ├── metadata.json
    ├── projects.json
    ├── transcripts.json
    ├── llm_settings.json
    ├── visualizations.json
    └── audio_recordings/
        ├── project_1/
        │   ├── recording_001.webm
        │   └── recording_002.webm
        ├── project_2/
        │   └── recording_003.webm
        └── ...

Backup Metadata
The metadata.json file contains:

    {
      "backup_version": "1.0",
      "backup_date": "2024-10-16T14:30:00Z",
      "selfoss_version": "1.5.2",
      "total_projects": 5,
      "total_transcripts": 47,
      "total_audio_files": 23,
      "total_size_bytes": 2400000000,
      "includes_audio": true,
      "includes_settings": true
    }

Use cases:
- Validate backup integrity
- Track backup versions
- Verify completeness
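One way to act on these use cases is to compare `metadata.json` against what the archive really contains. The sketch below assumes the layout shown above (`metadata.json` at the archive root, recordings under `audio_recordings/`); `check_backup` is a hypothetical helper and is not how Selfoss itself validates backups.

```python
import json
import zipfile


def check_backup(zip_path: str) -> list:
    """Return a list of problems found in a backup archive (empty if none)."""
    problems = []
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() re-reads every member and returns the first corrupt name.
        corrupt = archive.testzip()
        if corrupt is not None:
            problems.append(f"corrupt member: {corrupt}")

        names = archive.namelist()
        if "metadata.json" not in names:
            problems.append("metadata.json is missing")
            return problems

        metadata = json.loads(archive.read("metadata.json"))
        audio_files = [
            name for name in names
            if name.startswith("audio_recordings/") and name.endswith(".webm")
        ]
        expected = metadata.get("total_audio_files")
        # Only compare counts when the backup claims to include audio.
        if metadata.get("includes_audio") and expected != len(audio_files):
            problems.append(
                f"metadata lists {expected} audio files, "
                f"archive has {len(audio_files)}"
            )
    return problems
```

An empty list means the checks passed; it does not prove the backup will import cleanly, so a periodic test restore is still worthwhile.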
Import Workflows
Importing Individual Transcripts
Step 1: Upload JSON File

    Upload & Process → Select JSON file
      ↓
    Selfoss detects JSON format
      ↓
    "Import processed transcript?" prompt

Step 2: Choose Import Options

    ┌──────────────────────────────────┐
    │ Import Transcript                │
    ├──────────────────────────────────┤
    │ Title: Q4 Planning Meeting       │
    │ Date: Oct 16, 2024               │
    │                                  │
    │ Destination:                     │
    │ ( ) New Project: [___________]   │
    │ ( ) Existing: [Engineering ▼]    │
    │                                  │
    │ [ ] Overwrite if exists          │
    │                                  │
    │                [Cancel] [Import] │
    └──────────────────────────────────┘

Step 3: Verify Import
- Transcript appears in project
- All visualizations recreated
- Metadata preserved
Importing Full Backups
Step 1: Select Backup File

    Settings → Data Management → Import Data
      ↓
    File picker → Select .zip backup

Step 2: Conflict Resolution

    ┌──────────────────────────────────┐
    │ Import Backup                    │
    ├──────────────────────────────────┤
    │ Backup contains:                 │
    │ • 5 projects                     │
    │ • 47 transcripts                 │
    │ • 23 audio recordings            │
    │                                  │
    │ Conflicts detected:              │
    │ • 2 projects already exist       │
    │ • 8 transcripts already exist    │
    │                                  │
    │ ( ) Overwrite existing data      │
    │ ( ) Skip duplicates              │
    │ ( ) Merge (keep both)            │
    │                                  │
    │                [Cancel] [Import] │
    └──────────────────────────────────┘

Step 3: Wait for Import
- Progress shows:
  - Extracting archive…
  - Importing projects…
  - Restoring transcripts…
  - Copying audio files…
  - Rebuilding database…
⏱️ Time estimate: Similar to backup time
Step 4: Review Results

    ✅ Import Complete

    Imported:
    • 5 projects
    • 47 transcripts
    • 23 audio files

    Skipped:
    • 2 duplicate projects
    • 8 duplicate transcripts

    Errors:
    • None

Import Strategies
Overwrite Mode:
- Replaces existing data
- Use for: Device migration, clean restore
Skip Duplicates:
- Keeps existing, ignores conflicts
- Use for: Selective restore, adding new data
Merge Mode:
- Renames imported items (adds suffix)
- Use for: Comparing versions, keeping both
💡 Pro Tip: Test imports on a backup device first to verify data integrity.
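The difference between the three import strategies is easiest to see on a small example. The sketch below is an illustrative model only, not Selfoss's import code: `resolve_import`, the mode strings, and the "(imported N)" suffix are all hypothetical, since the guide only says that Merge Mode "adds suffix".

```python
def resolve_import(existing: dict, incoming: dict, mode: str) -> dict:
    """Combine two name -> item mappings using one of the three import modes."""
    if mode not in {"overwrite", "skip", "merge"}:
        raise ValueError(f"unknown mode: {mode}")

    result = dict(existing)  # leave the caller's mapping untouched
    for name, item in incoming.items():
        if name not in result:
            result[name] = item  # no conflict: always imported
        elif mode == "overwrite":
            result[name] = item  # imported copy replaces the existing one
        elif mode == "merge":
            # Keep both: find an unused suffixed name for the imported copy.
            counter = 1
            while f"{name} (imported {counter})" in result:
                counter += 1
            result[f"{name} (imported {counter})"] = item
        # mode == "skip": the existing copy wins, nothing to do
    return result


existing = {"Engineering": {"transcripts": 12}}
incoming = {"Engineering": {"transcripts": 15}, "Sales": {"transcripts": 4}}
for mode in ("overwrite", "skip", "merge"):
    print(mode, sorted(resolve_import(existing, incoming, mode)))
```

In every mode the non-conflicting "Sales" project is imported; the modes differ only in what happens to "Engineering".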
Storage Locations
Application Data Directory
Windows:

    C:\Users\{Username}\AppData\Roaming\selfoss\
    ├── selfoss.db (SQLite database)
    ├── audio_recordings/
    │   ├── project_1/
    │   └── project_2/
    └── cache/
        └── whisper_models/

macOS:

    ~/Library/Application Support/selfoss/
    ├── selfoss.db
    ├── audio_recordings/
    └── cache/

Linux:

    ~/.local/share/selfoss/
    ├── selfoss.db
    ├── audio_recordings/
    └── cache/

Audio File Storage
Structure:

    audio_recordings/
    ├── project_1/
    │   ├── recording_2024-10-16_14-30-25.webm
    │   └── recording_2024-10-16_15-45-10.webm
    ├── project_2/
    │   └── recording_2024-10-17_09-15-30.webm
    └── ...

Naming convention:

    recording_YYYY-MM-DD_HH-MM-SS.webm

Whisper Model Cache
Location:

    cache/whisper_models/
    ├── tiny.en.bin (75MB)
    ├── base.en.bin (140MB)
    └── small.en.bin (460MB)

Purpose:
- Local transcription models
- Downloaded via Ollama
- Shared across all transcripts
Database Schema
SQLite file: selfoss.db
Key tables:
- projects: Project metadata
- transcripts: Transcript data and processed JSON
- llm_settings: AI provider configurations
- visualizations: Generated images
- api_usage: Cost tracking (planned)
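To see which of these tables exist in your own installation, the database can be inspected read-only. This is a minimal sketch that assumes nothing about column layouts (the guide does not document them) and only queries SQLite's built-in `sqlite_master` catalog; `table_row_counts` is a hypothetical helper name. Running it while Selfoss is closed, or against a copy of the file, is the cautious choice.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path: str) -> dict:
    """Return {table_name: row_count} for every user table in the database."""
    # mode=ro opens the file read-only, so inspection cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )]
        counts = {}
        for name in tables:
            # Names come from the catalog itself; quote them as identifiers.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

For example, `table_row_counts("/path/to/selfoss.db")` returns one entry per table, which is a quick sanity check after a restore.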
Disk Space Management
Storage Usage Breakdown
Typical 100-transcript installation:

    Database (SQLite):     50 MB
    Audio recordings:      6 GB (100 hours @ 1 MB/min)
    Whisper models:        1 GB
    Visualizations cache:  200 MB
    Total:                 ~7.25 GB

Checking Disk Usage
Windows:

    Settings → System → Storage
      ↓
    Look for Selfoss in app list

macOS:

    du -sh ~/Library/Application\ Support/selfoss

Linux:

    du -sh ~/.local/share/selfoss

Cleanup Strategies
1. Delete Old Recordings

    Settings → Data Management → Audio Cleanup
      ↓
    Select recordings older than: [6 months ▼]
      ↓
    Preview: 45 recordings, 2.7 GB
      ↓
    [Delete Selected]

2. Export and Delete Projects

    1. Export project archive
    2. Verify backup is complete
    3. Delete project from Selfoss
    4. Move archive to external storage

3. Clean Whisper Models

    Keep only models you use:
    ✓ whisper:base (140MB) → Keep
    ✗ whisper:tiny (75MB) → Delete
    ✗ whisper:small (460MB) → Delete

    Saves: 535 MB

4. Database Optimization

    Settings → Advanced → Optimize Database
      ↓
    Vacuum: Reclaim unused space
    Reindex: Improve query performance

Low Disk Space Warnings
Selfoss monitors disk space:

    ⚠️ Low Disk Space Warning

    Available: 850 MB
    Required: 2 GB (recommended)

    Suggestions:
    • Delete old recordings (1.5 GB)
    • Export and remove projects (500 MB)
    • Move backups to external drive

    [Manage Storage] [Dismiss]

Data Recovery
Backup Restoration
Scenario: Device crash, need to restore everything
Steps:
- Install Selfoss on new device
- Open Settings → Data Management → Import
- Select full backup ZIP file
- Choose “Overwrite” mode
- Wait for import (5-10 minutes)
- ✅ All data restored
Partial Data Loss
Scenario: Accidentally deleted project
Steps:
- Locate most recent backup
- Import backup with “Skip duplicates” mode
- Deleted project is restored
- Existing data unchanged
Database Corruption
Symptoms:
- App won’t open
- SQLite errors
- Missing data
Solutions:
Option 1: Restore from backup

    1. Rename corrupted selfoss.db
    2. Import latest backup
    3. Verify data integrity

Option 2: Database repair

    # Linux path shown; on macOS use ~/Library/Application\ Support/selfoss/
    cd ~/.local/share/selfoss/
    # Report any corruption SQLite can detect
    sqlite3 selfoss.db "PRAGMA integrity_check;"
    # Salvage readable rows into a new database file
    sqlite3 selfoss.db ".recover" | sqlite3 selfoss_recovered.db

Option 3: Manual export

    If database accessible:
    1. Export all projects individually
    2. Reinstall Selfoss (fresh database)
    3. Import exported projects

Audio File Recovery
Scenario: Audio files deleted but database intact
If you have backup:

    1. Extract backup ZIP
    2. Copy audio_recordings/ folder
    3. Paste into Selfoss data directory
    4. Restart Selfoss

If no backup:
- Audio is lost
- Transcripts still accessible
- Can re-record if needed
Best Practices
Backup Schedule
Recommended:

    Daily:     None (auto-save handles this)
    Weekly:    Quick export of changed projects
    Monthly:   Full application backup
    Quarterly: Archive old projects, clean storage

Automation (planned):
- Scheduled automatic backups
- Incremental backups (changes only)
- Cloud sync options
Backup Storage
Local:
- ✅ Fast access
- ✅ No internet needed
- ❌ Vulnerable to device failure
External drive:
- ✅ Safe from device issues
- ✅ Large capacity
- ❌ Must remember to back up
Cloud storage:
- ✅ Off-site protection
- ✅ Accessible anywhere
- ❌ Privacy concerns (encrypt first)
Recommended approach:

    Primary:   Local SSD/HDD
    Secondary: External drive (monthly)
    Tertiary:  Encrypted cloud (quarterly)

Version Control
Manual versioning:

    selfoss_backup_2024-10-16.zip ← Latest
    selfoss_backup_2024-10-01.zip ← Last month
    selfoss_backup_2024-09-01.zip ← Archive

Keep:
- Latest backup (always)
- Monthly backups (last 6 months)
- Quarterly backups (last 2 years)
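The retention rules above can be applied mechanically to a folder of dated backups. The sketch below is one possible reading of them, with assumptions stated loudly: "monthly" keeps the first backup of each calendar month from roughly the last 6 months (183 days), "quarterly" keeps the first backup of each calendar quarter from the last 2 years (730 days), and dates are taken from the `selfoss_backup_YYYY-MM-DD` filename prefix. `backups_to_keep` is a hypothetical helper that only reports what to keep; it deletes nothing.

```python
import re
from datetime import date, timedelta

NAME_RE = re.compile(r"selfoss_backup_(\d{4})-(\d{2})-(\d{2})")


def backups_to_keep(filenames, today):
    """Return the set of backup filenames the retention rules would keep."""
    dated = []
    for name in filenames:
        match = NAME_RE.search(name)
        if match:
            year, month, day = map(int, match.groups())
            dated.append((date(year, month, day), name))
    if not dated:
        return set()
    dated.sort()  # oldest first

    keep = {dated[-1][1]}  # the latest backup is always kept
    months_seen, quarters_seen = set(), set()
    for taken, name in dated:
        age = today - taken
        month_key = (taken.year, taken.month)
        quarter_key = (taken.year, (taken.month - 1) // 3)
        # Monthly tier: first backup of each month within about 6 months.
        if age <= timedelta(days=183) and month_key not in months_seen:
            months_seen.add(month_key)
            keep.add(name)
        # Quarterly tier: first backup of each quarter within 2 years.
        if age <= timedelta(days=730) and quarter_key not in quarters_seen:
            quarters_seen.add(quarter_key)
            keep.add(name)
    return keep
```

Anything not in the returned set is a candidate for deletion or for moving to cold storage. Review the list by hand before removing files.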
Testing Backups
Verify backups work:

    1. Create test project
    2. Add test transcript
    3. Export full backup
    4. Delete test data
    5. Restore from backup
    6. Verify test data returned

Frequency: Test quarterly
Privacy Considerations
Before sharing backups:
- Remove sensitive transcripts
- Redact participant names
- Clear API keys (Settings export)
- Encrypt ZIP file with password
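Redacting participant names in an exported transcript JSON can be partly automated. The sketch below assumes the names are listed under `metadata.participants`, as in the JSON export example earlier in this guide, and replaces each whole-word occurrence anywhere in the document with a neutral alias. `redact_participants` and the "Participant N" aliases are hypothetical. Nicknames, surnames, or misspellings that are not in the participants list will not be caught, so a manual read-through is still needed.

```python
import re


def redact_participants(transcript: dict) -> dict:
    """Return a copy with each listed participant name replaced by an alias."""
    names = transcript.get("metadata", {}).get("participants", [])
    aliases = {
        name: f"Participant {index}"
        for index, name in enumerate(names, start=1)
        if name  # skip empty strings, which would match everywhere
    }

    def scrub(value):
        if isinstance(value, str):
            for name, alias in aliases.items():
                # \b keeps "John" from matching inside "Johnson".
                value = re.sub(rf"\b{re.escape(name)}\b", alias, value)
            return value
        if isinstance(value, list):
            return [scrub(item) for item in value]
        if isinstance(value, dict):
            return {key: scrub(item) for key, item in value.items()}
        return value  # numbers, booleans, None are left as-is

    # scrub() builds new containers, so the input is not modified.
    return scrub(transcript)
```

Write the returned dictionary to a new file with `json.dump` and share that file rather than the original export.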
Encrypted backups:

    # Linux/macOS
    # -e encrypts; -r recurses into subfolders such as audio_recordings/
    zip -er selfoss_backup.zip selfoss_backup/
    # Enter password when prompted

    # Extract (prompts for the password)
    unzip selfoss_backup.zip

Next Steps
You’re now a data management expert!
Recommended Actions:
- Create first full backup - Test the process
- Set backup schedule - Calendar reminders
- Organize backup storage - External drive or cloud
- Test restore process - Verify backups work
- Consider encryption → 14_PRIVACY_SECURITY_GUIDE.md
Advanced Topics:
- Automated backup scripts
- Database optimization
- Selective sync for large datasets
- Disaster recovery planning
💾 Your data, protected and portable.