14 - Privacy & Security Guide
🔒 Data Protection & Security Best Practices
⏱️ Time Estimate: 15 minutes
📋 What You’ll Learn: Data handling, security features, privacy compliance, best practices
Table of Contents
- Privacy Philosophy
- Local-Only Storage Architecture
- Encrypted API Key Storage
- No Telemetry Policy
- Cloud Provider Privacy
- API Key Security
- GDPR Compliance
- Enterprise Deployment
- Secure Backup Strategies
Privacy Philosophy
Selfoss Core Principles
Privacy by Design:
- 🔒 Local-first: All data stored on your device
- 🚫 No tracking: Zero telemetry or analytics
- 🔐 Encrypted keys: Secure API key storage
- 📍 Data control: You own all your data
- 🌐 Offline capable: Works without internet (with local models)
User Control:
- Choose your AI providers
- Decide what goes to cloud
- Full export/delete capabilities
- Transparent data handling
Local-Only Storage Architecture
Where Your Data Lives
Everything stays local:

Your Device
├── Selfoss Database (SQLite)
│   ├── Projects
│   ├── Transcripts
│   ├── Processed data
│   └── Settings
│
├── Audio Recordings
│   └── WebM files by project
│
└── Whisper Models
    └── Local AI models

What this means:
- ✅ No Selfoss cloud servers
- ✅ No data uploaded to us
- ✅ Complete data sovereignty
- ✅ Works offline with local models
Data Storage Locations
Windows:
C:\Users\{Username}\AppData\Roaming\selfoss\
macOS:
~/Library/Application Support/selfoss/
Linux:
~/.local/share/selfoss/
Security features:
- User-specific directories (OS-level isolation)
- File system permissions (read/write by user only)
- No shared storage
- No system-wide data
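To check these locations on your own machine, a small script can resolve the directory for the current OS and confirm that only your user has access. This is a minimal sketch: the function names are hypothetical, the paths are the ones listed above, and the permission test uses POSIX mode bits, so it applies to macOS and Linux only.

```python
import os
import stat
import sys
from pathlib import Path


def selfoss_data_dir(platform: str = sys.platform) -> Path:
    """Return the per-user data directory listed in this guide for a platform."""
    home = Path.home()
    if platform.startswith("win"):
        # %APPDATA% normally points at C:\Users\{Username}\AppData\Roaming
        appdata = os.environ.get("APPDATA", str(home / "AppData" / "Roaming"))
        return Path(appdata) / "selfoss"
    if platform == "darwin":
        return home / "Library" / "Application Support" / "selfoss"
    return home / ".local" / "share" / "selfoss"


def is_user_only(path: Path) -> bool:
    """True if group and others have no permission bits set (POSIX only)."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return mode & 0o077 == 0


if __name__ == "__main__":
    data_dir = selfoss_data_dir()
    if data_dir.exists() and os.name == "posix":
        print(data_dir, "user-only:", is_user_only(data_dir))
    else:
        print(data_dir, "(not found, or permission bits not applicable)")
```

If the check reports that the directory is not user-only, `chmod 700` on the directory tightens it on macOS and Linux.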
Network Activity
Selfoss ONLY connects to:
- License server (LemonSqueezy)
  - Purpose: License validation
  - Frequency: On activation only
  - Data sent: License key, device ID
- AI providers (if configured)
  - OpenAI, Google, Ollama (if remote)
  - Purpose: Transcription/analysis
  - Frequency: Per transcript processing
  - Data sent: Audio/text for processing
Selfoss NEVER connects to:
- ❌ Analytics servers
- ❌ Tracking services
- ❌ Ad networks
- ❌ Selfoss company servers (except license)
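If you capture outbound connections with a monitoring tool, a short allowlist check can flag anything outside the two categories above. The sketch below is illustrative only: the hostnames and ports are the ones given in the firewall rules in this guide, the function name is hypothetical, and the `(host, port)` pairs are assumed to come from your own capture export.

```python
from typing import Iterable, List, Tuple

# Hostnames and ports as listed in this guide's firewall rules.
EXPECTED = {
    ("license.lemonsqueezy.com", 443),
    ("api.openai.com", 443),
    ("generativelanguage.googleapis.com", 443),
}
# Ollama's default port; the host depends on your deployment.
OLLAMA_PORT = 11434


def unexpected_connections(
    observed: Iterable[Tuple[str, int]],
    ollama_hosts: Iterable[str] = ("localhost", "127.0.0.1"),
) -> List[Tuple[str, int]]:
    """Return observed (host, port) pairs that are not on the allowlist."""
    allowed = set(EXPECTED) | {(host, OLLAMA_PORT) for host in ollama_hosts}
    seen = {(host.lower(), port) for host, port in observed}
    return sorted(seen - allowed)


if __name__ == "__main__":
    capture = [("api.openai.com", 443), ("tracker.example.com", 443)]
    print(unexpected_connections(capture))
```

An empty result means every captured connection matched the list; anything returned deserves a closer look.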
Encrypted API Key Storage
Current Implementation
Storage method:
SQLite database with application-level encryption
Location: selfoss.db (encrypted fields)
Algorithm: AES-256 (planned: OS keyring)
How it works:
- User enters API key in Settings
- Key encrypted before database storage
- Decrypted only when needed for API calls
- Never logged or exposed
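The "never logged" step can be illustrated with a logging filter that masks key-like strings before a message is written. This is a sketch of the general technique, not Selfoss's actual code: the class and function names are hypothetical, and the key shapes (an `sk-` prefix and an `AIza` prefix) are assumptions about common provider formats.

```python
import logging
import re

# Assumed key shapes: "sk-" prefix (OpenAI-style) and "AIza" prefix (Google-style).
KEY_PATTERN = re.compile(r"\b(sk-[A-Za-z0-9_-]{8,}|AIza[A-Za-z0-9_-]{8,})")


def mask_keys(text: str) -> str:
    """Keep the first four characters of each key-like string, hide the rest."""
    return KEY_PATTERN.sub(lambda m: m.group(0)[:4] + "...[redacted]", text)


class KeyMaskingFilter(logging.Filter):
    """Rewrite each log record so key-like strings never reach a handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so keys passed as arguments are masked too.
        record.msg = mask_keys(record.getMessage())
        record.args = ()
        return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("demo")
    logger.addFilter(KeyMaskingFilter())
    logger.info("calling provider with key %s", "sk-abcdefgh12345678")
```

A filter like this is a safety net; the stronger guarantee is simply never passing the decrypted key to a logging call.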
Future Enhancement: OS Keyring
Planned migration (F007):
Windows:
- Windows Credential Manager
- System-level encryption
- Requires user authentication
macOS:
- Keychain Access
- Secure Enclave (on supported devices)
- Touch ID or password required
Linux:
- Secret Service API (libsecret)
- GNOME Keyring / KWallet
- User password required
Benefits:
- ✅ OS-level security
- ✅ Separate from app data
- ✅ Hardware-backed encryption (where available)
- ✅ Better protection if device stolen
Security Best Practices
For API keys:
DO:
- ✅ Use unique keys per application
- ✅ Set spending limits on provider dashboards
- ✅ Rotate keys every 3-6 months
- ✅ Store backup in password manager (encrypted)
- ✅ Delete keys before device disposal
DON’T:
- ❌ Share keys via email/chat
- ❌ Commit to version control
- ❌ Use same key on shared devices
- ❌ Screenshot keys
- ❌ Leave keys on public computers
No Telemetry Policy
What We DON’T Collect
Zero data collection:
- ❌ Usage statistics
- ❌ Feature usage
- ❌ Error reports (unless you submit)
- ❌ Analytics
- ❌ Crash reports
- ❌ Device information
- ❌ IP addresses
Why This Matters
Your privacy:
- No user profiling
- No behavior tracking
- No data monetization
- No third-party sharing (because there’s no data!)
Transparency:
- Open source codebase
- Auditable network calls
- Clear privacy policy
- Community verification
Verification
How to verify:
1. Use a network monitoring tool:
   - Wireshark (desktop)
   - Charles Proxy (mobile)
   - Browser DevTools (web version)
2. Monitor outbound connections:
   - Should only see AI provider calls
   - License validation on activation
   - No other network activity
3. Review source code:
   - GitHub: shobankr/selfoss
   - Search for analytics calls
   - None found!
Cloud Provider Privacy
When You Use Cloud AI
What leaves your device:
For transcription:
- 🎵 Audio file (OpenAI, Gemini)
- ⏱️ Duration metadata
- 🔤 Returned: Text transcript
For analysis:
- 📄 Text transcript
- 🤖 Returned: Structured JSON (decisions, actions, concepts)
What doesn’t leave:
- ❌ Project names
- ❌ Other transcripts
- ❌ Database contents
- ❌ API keys (used for auth header only)
- ❌ Personal information (unless in transcript)
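Because personal information inside a transcript does leave the device, it can help to redact the text before sending it to a cloud provider. The following is a minimal sketch, assuming you supply your own mapping of names to neutral labels; the `redact` function is hypothetical, and the digit rule is deliberately blunt (it also catches years and other long numbers).

```python
import re
from typing import Dict


def redact(text: str, replacements: Dict[str, str]) -> str:
    """Swap known names for neutral labels, then hide long digit runs."""
    # Replace longer terms first so "John Smith" wins over "John".
    for term in sorted(replacements, key=len, reverse=True):
        label = replacements[term]
        text = re.sub(re.escape(term), lambda _m, label=label: label, text,
                      flags=re.IGNORECASE)
    # Redact runs of 4+ digits, with an optional leading "#".
    return re.sub(r"#?\d{4,}", "redacted", text)


if __name__ == "__main__":
    names = {"John Smith": "Team member", "Acme Corp": "Company A"}
    print(redact("John Smith at Acme Corp, account #12345", names))
    # prints: Team member at Company A, account redacted
```

Always review the redacted text by hand before sending; pattern-based redaction misses names and numbers it was not told about.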
Provider Data Policies
OpenAI:
Data retention: 30 days (API data)
Training: Not used (as of 2024)
Policy: api.openai.com/data-usage
Google Gemini:
Data retention: Per user agreement
Training: Not used for improvement (standard)
Policy: cloud.google.com/terms/aup
Ollama (Local):
Data retention: N/A (never leaves device)
Training: N/A
Policy: Runs on your machine
Minimizing Cloud Exposure
Strategy 1: Sanitize before sending
Before processing:
1. Remove names (replace with roles)
2. Redact sensitive numbers
3. Generalize places and companies
4. Remove context not needed for analysis

Example:
Before: "John Smith at Acme Corp, account #12345"
After: "Team member at Company A, account redacted"
Strategy 2: Local transcription only
Audio → Ollama Whisper → Text (local)
Text → Review → Redact → Send to cloud analysis
Strategy 3: Fully local
Audio → Ollama Whisper → Text (local)
Text → Ollama Llama → Analysis (local)
Zero cloud exposure
API Key Security
Secure Key Management
Best practices:
Generation:
1. Use provider's dashboard (official only)
2. Give descriptive name ("Selfoss Desktop")
3. Set permissions (read-only where possible)
4. Set spending limits
5. Note creation date for rotation
Storage:
Primary: Selfoss app (encrypted)
Backup: Password manager (1Password, Bitwarden)
Never: Plain text files, screenshots, email
Rotation:
Schedule:
- Every 3 months: Routine rotation
- Immediately: If compromised
- Before: Selling/giving away device
- After: Shared device usage

Process:
1. Generate new key (keep old active)
2. Update Selfoss settings
3. Test new key works
4. Revoke old key
If Key Compromised
Immediate actions:
1. Revoke key on provider dashboard
   - OpenAI: platform.openai.com/api-keys
   - Gemini: console.cloud.google.com
2. Generate new key
   - Different name
   - New permissions
3. Update Selfoss
   - Settings → API keys
   - Test connection
4. Monitor usage
   - Check for unauthorized calls
   - Verify billing
5. Report if fraudulent
   - Contact provider support
   - Dispute unauthorized charges
Key Permissions
Principle of least privilege:
For transcription:
OpenAI:
- Required: Whisper API access
- Not needed: GPT models, DALL-E, etc.

Gemini:
- Required: Generative Language API
- Not needed: Other Google Cloud services
For analysis:
OpenAI:
- Required: GPT API access
- Not needed: Whisper, DALL-E, etc.

Gemini:
- Required: Generative Language API
GDPR Compliance
Self-Hosting Benefits
GDPR principles:
1. Data Minimization:
- ✅ Only stores necessary data
- ✅ No telemetry/tracking
- ✅ User controls what’s processed
2. Purpose Limitation:
- ✅ Data used only for transcription/analysis
- ✅ Not shared with third parties
- ✅ Not used for other purposes
3. Storage Limitation:
- ✅ User controls retention
- ✅ Easy deletion (project/transcript level)
- ✅ Complete data export
4. Right to Access:
- ✅ All data accessible locally
- ✅ Full export capabilities
- ✅ No barriers to data access
5. Right to Erasure:
- ✅ Delete projects/transcripts
- ✅ Uninstall = complete removal
- ✅ No cloud data to delete
6. Data Portability:
- ✅ Export in standard formats (JSON, PDF, CSV)
- ✅ No vendor lock-in
- ✅ Easy migration
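Because the data lives in a local SQLite file, a generic export is also possible outside the app when you need it for an access or portability request. The sketch below makes several assumptions: it dumps whatever tables it finds (the real Selfoss schema is not described in this guide), the function names are hypothetical, and encrypted fields are exported exactly as stored. Prefer the app's built-in export where it covers your needs, and work on a copy of the database.

```python
import json
import sqlite3
from typing import Any, Dict, List


def export_tables(conn: sqlite3.Connection) -> Dict[str, List[Dict[str, Any]]]:
    """Return every user table as a list of row dictionaries."""
    conn.row_factory = sqlite3.Row
    names = [
        row["name"]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    dump: Dict[str, List[Dict[str, Any]]] = {}
    for table in names:
        # Quote the identifier so unusual table names cannot break the query.
        quoted = '"' + table.replace('"', '""') + '"'
        rows = conn.execute(f"SELECT * FROM {quoted}").fetchall()
        dump[table] = [dict(row) for row in rows]
    return dump


def to_json(dump: Dict[str, List[Dict[str, Any]]]) -> str:
    """Serialize the dump; binary columns are written as hex strings."""
    def encode(value: Any) -> str:
        if isinstance(value, bytes):
            return value.hex()
        raise TypeError(f"cannot serialize {type(value).__name__}")
    return json.dumps(dump, indent=2, default=encode)


if __name__ == "__main__":
    # In-memory demo with a made-up table; point connect() at a copy of your database.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")
    demo.execute("INSERT INTO projects (name) VALUES ('Demo')")
    print(to_json(export_tables(demo)))
```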
For Organizations
GDPR considerations:
Data Controller:
- Organization using Selfoss
- Controls what data is processed
- Responsible for compliance
Data Processor (AI Providers):
- OpenAI, Google (if used)
- Process data on behalf of controller
- Have their own GDPR compliance
Recommendations:
1. Use local-only mode for sensitive data
   - Ollama for transcription
   - Ollama for analysis
   - Zero data to processors
2. Data Processing Agreements (DPA):
   - OpenAI: Available in dashboard
   - Google: Available in Cloud Console
   - Review and sign before use
3. Employee training:
   - What data can be processed
   - Redaction procedures
   - Handling sensitive information
4. Audit logs:
   - Track what was processed
   - Who processed it
   - When and why
5. Regular reviews:
   - Quarterly data audits
   - Provider policy updates
   - Compliance verification
Enterprise Deployment
Deployment Models
Model 1: Individual Installations
Each user's device:
├── Own Selfoss installation
├── Own database (isolated)
├── Own API keys (or shared)
└── Own backups

Pros:
✅ Maximum isolation
✅ No shared infrastructure
✅ User-specific settings

Cons:
❌ No centralized management
❌ Individual license per user
Model 2: Shared Ollama Server
Corporate network:
├── Central Ollama server
│   └── All models cached
└── User devices
    └── Selfoss pointing to server

Pros:
✅ Shared model downloads
✅ GPU-powered processing
✅ Consistent performance

Cons:
❌ Network dependency
❌ Potential bottleneck
Model 3: Air-Gapped Deployment
Secure environment:
├── No internet access
├── Local Ollama only
└── Manual model transfer

Pros:
✅ Maximum security
✅ No data exfiltration risk
✅ Compliance-friendly

Cons:
❌ No cloud models
❌ Manual updates
Network Configuration
Firewall rules:
Outbound allowed (if using cloud):
- api.openai.com (443)
- generativelanguage.googleapis.com (443)
- license.lemonsqueezy.com (443)

Inbound: None required

For Ollama server:
- Internal network: Port 11434
Proxy configuration:
Selfoss → Settings → Advanced → Proxy
HTTP Proxy: http://proxy.company.com:8080
HTTPS Proxy: https://proxy.company.com:8443
Centralized License Management
Volume licensing:
Contact for enterprise:
- Multiple seat licenses
- Centralized billing
- Admin dashboard (planned)
- SSO integration (future)
Secure Backup Strategies
Backup Security
Threat model:
- Device theft
- Hardware failure
- Accidental deletion
- Ransomware
Protection layers:
Layer 1: Encryption
Encrypt backups before storing:
# Linux/macOS
zip -e -r backup.zip selfoss_backup/
# Enter password when prompted
# Note: zip -e uses the legacy ZipCrypto scheme, which is weak; prefer 7-Zip (AES-256) for sensitive data

# Windows (7-Zip)
7z a -p -mhe=on backup.7z selfoss_backup\
# -p prompts for a password; -mhe=on also encrypts file names
Layer 2: Off-site Storage
Store in multiple locations:
1. Local drive (primary)
2. External drive (secondary)
3. Cloud storage (encrypted) (tertiary)
Layer 3: Access Control
Limit backup access:
- Password-protected archives
- Encrypted cloud storage
- OS-level file permissions
Cloud Backup Privacy
If using cloud storage (Dropbox, Google Drive, etc.):
DO:
- ✅ Encrypt locally before upload (see above)
- ✅ Use strong password (password manager)
- ✅ Enable 2FA on cloud account
- ✅ Regularly test restore
- ✅ Rotate encryption passwords
DON’T:
- ❌ Upload unencrypted backups
- ❌ Share backup links
- ❌ Use weak passwords
- ❌ Store password with backup
Recommended services:
- Tresorit (end-to-end encrypted)
- ProtonDrive (privacy-focused)
- Cryptomator (client-side encryption)
Security Checklist
Daily Operations
☑ Keep Selfoss updated
- Check for updates monthly
- Apply security patches promptly
☑ Monitor API usage
- Unusual activity = potential compromise
- Review monthly
☑ Lock device when away
- Password/biometric required
- Auto-lock after 5 minutes
Monthly Review
☑ Review API keys
- Still needed?
- Still secure?
- Time to rotate?
☑ Check backups
- Backup exists?
- Restore test passed?
- Stored securely?
☑ Audit processed data
- Any sensitive data needs removal?
- Old projects to archive?
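The "Restore test passed?" check above can be partly automated by comparing checksums of the original data folder against a restored copy. This is a minimal sketch using SHA-256 from the standard library; the function names are hypothetical, and it assumes you have already decrypted and extracted the backup into a separate folder.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large recordings do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def restore_matches(original: Path, restored: Path) -> bool:
    """True when both folders hold the same files with the same contents."""
    return manifest(original) == manifest(restored)
```

A call such as `restore_matches(Path("selfoss_backup"), Path("restore_test"))` returns `True` only if every file survived the round trip unchanged.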
Quarterly Security Audit
☑ Rotate API keys
- Generate new keys
- Update Selfoss
- Revoke old keys
☑ Review provider policies
- Privacy policy changes?
- Terms of service updates?
- Data retention changes?
☑ Update security practices
- New threats?
- Better encryption?
- Enhanced procedures?
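To keep the quarterly rotation item from slipping, the creation date you noted for each key can be turned into a due date. The sketch below assumes a 90-day interval as an approximation of "every 3 months"; the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumed approximation of the quarterly rotation schedule.
ROTATION_INTERVAL = timedelta(days=90)


def next_rotation(created: date, interval: timedelta = ROTATION_INTERVAL) -> date:
    """Date on which a key created on `created` should be rotated."""
    return created + interval


def rotation_due(created: date, today: date,
                 interval: timedelta = ROTATION_INTERVAL) -> bool:
    """True once the rotation date has been reached or passed."""
    return today >= next_rotation(created, interval)


if __name__ == "__main__":
    created = date(2024, 1, 1)
    print(next_rotation(created))                   # prints: 2024-03-31
    print(rotation_due(created, date(2024, 4, 2)))  # prints: True
```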
Next Steps
🔒 Your data is now secure!
Implement Security:
- 🔐 Enable encryption - Encrypted backups now, OS keyring once available
- 🔄 Rotate API keys - Quarterly schedule
- 📋 Document procedures - For team/organization
- 🧪 Test recovery - Verify backups work
- 📚 Train users - Security best practices
Stay Secure:
- Monitor for updates
- Review security logs
- Test disaster recovery
- Keep informed of threats
🔒 Privacy-first, security-focused, user-controlled.