* Fix VPN routing by adding output interface to NAT rules

  On multi-homed systems (servers with multiple network interfaces, or multiple IPs on one interface), MASQUERADE rules need to specify which interface to use for NAT. Without the output interface, packets may not be NAT'd correctly, so VPN clients could connect but not route traffic to the internet (for example, DigitalOcean droplets with private networking enabled). This fix adds the output interface to all NAT rules:

      -A POSTROUTING -s [vpn_subnet] -o eth0 -j MASQUERADE

  Changes:
  - Added -o {{ ansible_default_ipv4['interface'] }} to all NAT rules in roles/common/templates/rules.v4.j2 and rules.v6.j2 (IPv4 and IPv6)
  - Added tests to verify the output interface is present in NAT rules
  - Added ansible_default_ipv4/ipv6 variables to test fixtures

  For deployments on providers like DigitalOcean where MASQUERADE still fails due to multiple IPs on the same interface, the existing alternative_ingress_ip option in config.cfg enables explicit SNAT. Verified on live servers; all unit tests pass (67/67); mutation testing confirms test coverage. Backward compatible with single-interface deployments.

* Fix dnscrypt-proxy not listening on VPN service IPs

  Problem: dnscrypt-proxy on Ubuntu uses systemd socket activation by default, which overrides the configured listen_addresses in dnscrypt-proxy.toml. The socket only listens on 127.0.2.1:53, preventing VPN clients from resolving DNS queries through the configured service IPs. Solution: disable and mask the dnscrypt-proxy.socket unit so dnscrypt-proxy binds directly to the VPN service IPs specified in its configuration file. Fixes DNS resolution for VPN clients on Ubuntu 20.04+.

* Apply Python linting and formatting

  Ran ruff check --fix to fix linting issues and ruff format for consistent formatting; all tests still pass.

* Restrict DNS access to VPN clients only

  Security fix: the firewall rule for DNS accepted traffic from any source (0.0.0.0/0) to the local DNS resolver. While the service IP is on the loopback interface (which normally isn't routable externally), this could be a security risk if misconfigured. The INPUT rule now includes -s {{ subnets }} to restrict source IPs, applied to both IPv4 and IPv6, with a test verifying the restriction. The resolver is now only accessible to connected VPN clients, not the entire internet.

* Fix dnscrypt-proxy service startup with masked socket

  dnscrypt-proxy.service depends on dnscrypt-proxy.socket, so masking the socket before starting the service fails with "Unit dnscrypt-proxy.socket is masked." The fix overrode the service to remove the socket dependency, reloaded the systemd daemon immediately after the override change, started the service, and only then disabled and masked the socket.

* Fix dnscrypt-proxy by not masking the socket

  Masking dnscrypt-proxy.socket prevents the service from starting because of its Requires=dnscrypt-proxy.socket dependency. Instead, simply stop and disable the socket without masking it: this prevents socket activation while allowing the service to start and bind directly to the configured IPs.

* Use systemd socket activation properly for dnscrypt-proxy

  Instead of fighting systemd socket activation, configure it to listen on the correct VPN service IPs: create a socket override that clears the default listeners and adds the VPN service IPs, use empty listen_addresses in dnscrypt-proxy.toml, keep the socket enabled, and add a handler to restart the socket when its config changes. This works with systemd instead of against it, survives package updates better, and avoids dependency conflicts.

* Document debugging lessons learned in CLAUDE.md

  Added debugging guidance from this troubleshooting session: VPN connectivity troubleshooting order (DNS first!), systemd socket activation best practices, common deployment failures and solutions, time wasters to avoid, multi-homed system considerations, and DigitalOcean testing notes.

* Fix DNS resolution for VPN clients by enabling route_localnet

  dnscrypt-proxy listens on a special loopback IP (randomly generated in the 172.16.0.0/12 range) that wasn't accessible from VPN clients. This fix enables the net.ipv4.conf.all.route_localnet sysctl so traffic from other interfaces can route to loopback IPs, ensures the dnscrypt-proxy socket is restarted when its configuration changes, and flushes handlers after socket configuration updates.

* Improve security by using interface-specific route_localnet

  Enabled route_localnet only on the interfaces that need it (wg0 for WireGuard clients, the main network interface for IPsec clients) rather than globally, restricting loopback routing to the VPN interfaces and reducing the potential attack surface.

* Revert to global route_localnet to fix deployment failure

  The interface-specific approach failed because wg0 doesn't exist until the WireGuard service starts; setting the sysctl earlier failed with "No such file or directory." Reverted to the global setting because it always works regardless of interface creation timing, VPN users are trusted (they hold our credentials), firewall rules still restrict access to port 53 only, and the marginal security benefit wasn't worth the complexity.

* Fix dnscrypt-proxy socket restart and remove problematic BPF hardening

  Two fixes: (1) the dnscrypt-proxy socket wasn't restarting when its override config changed, leaving DNS listening on 127.0.2.1 instead of local_service_ip; the socket is now restarted directly after an explicit daemon reload. (2) Removed the net.core.bpf_jit_enable sysctl hardening, which isn't available on all kernels and caused "Invalid argument" errors during deployment for minimal security benefit.

* Update CLAUDE.md with comprehensive debugging lessons learned

  Documented the DNS architecture (the local_service_ip design and why it requires route_localnet), a step-by-step DNS debugging methodology, systemd socket activation complexities and common mistakes, architectural decisions and trade-offs (loopback service IP, iptables-legacy vs iptables-nft), and new time-wasters to avoid (interface-specific route_localnet, DNAT for loopback addresses, BPF JIT hardening).

Co-authored-by: Claude <noreply@anthropic.com>
CLAUDE.md - LLM Guidance for Algo VPN
This document provides essential context and guidance for LLMs working on the Algo VPN codebase. It captures important learnings, patterns, and best practices discovered through extensive work with this project.
Project Overview
Algo is an Ansible-based tool that sets up a personal VPN in the cloud. It's designed to be:
- Security-focused: Creates hardened VPN servers with minimal attack surface
- Easy to use: Automated deployment with sensible defaults
- Multi-platform: Supports various cloud providers and operating systems
- Privacy-preserving: No logging, minimal data retention
Core Technologies
- VPN Protocols: WireGuard (preferred) and IPsec/IKEv2
- Configuration Management: Ansible (currently v11.x, pinned in pyproject.toml)
- Languages: Python, YAML, Shell, Jinja2 templates
- Supported Providers: AWS, Azure, DigitalOcean, GCP, Vultr, Hetzner, local deployment
Architecture and Structure
Directory Layout
```
algo/
├── main.yml            # Primary playbook
├── users.yml           # User management playbook
├── server.yml          # Server-specific tasks
├── config.cfg          # Main configuration file
├── pyproject.toml      # Python project configuration and dependencies
├── uv.lock             # Exact dependency versions lockfile
├── requirements.yml    # Ansible collections
├── roles/              # Ansible roles
│   ├── common/         # Base system configuration
│   ├── wireguard/      # WireGuard VPN setup
│   ├── strongswan/     # IPsec/IKEv2 setup
│   ├── dns/            # DNS configuration (dnsmasq, dnscrypt)
│   ├── ssh_tunneling/  # SSH tunnel setup
│   └── cloud-*/        # Cloud provider specific roles
├── library/            # Custom Ansible modules
├── playbooks/          # Supporting playbooks
└── tests/              # Test suite
    └── unit/           # Python unit tests
```
Key Roles
- common: Firewall rules, system hardening, package management
- wireguard: WireGuard server/client configuration
- strongswan: IPsec server setup with certificate generation
- dns: DNS encryption and ad blocking
- cloud-*: Provider-specific instance creation
Critical Dependencies and Version Management
Current Versions (MUST maintain compatibility)
```
ansible==11.8.0   # Stay current to get the latest security, performance, and bug fixes
jinja2~=3.1.6     # Security fix for CVE-2025-27516
netaddr==1.3.0    # Network address manipulation
```
Version Update Guidelines
- Be Conservative: Prefer minor version bumps over major ones
- Security First: Always prioritize security updates (CVEs)
- Test Thoroughly: Run all tests before updating
- Document Changes: Explain why each update is necessary
Ansible Collections
Currently unpinned in `requirements.yml`, but key ones include:
- community.general
- ansible.posix
- openstack.cloud
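If pinning were ever introduced, a `requirements.yml` entry might look like this (a sketch; the version shown is illustrative, not a project decision):

```yaml
# requirements.yml
collections:
  - name: community.general
    version: ">=8.0.0"   # illustrative pin, not the project's choice
  - name: ansible.posix
  - name: openstack.cloud
```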
Development Practices
Code Style and Linting
Python (ruff)
```toml
# pyproject.toml configuration
[tool.ruff]
target-version = "py311"
line-length = 120

[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "C4", "UP"]
```
YAML (yamllint)
- Document start markers (`---`) required
- No trailing spaces
- Newline at end of file
- Quote `'on':` in GitHub workflows (truthy value)
Shell Scripts (shellcheck)
- Quote all variables: `"${var}"`
- Use `set -euo pipefail` for safety
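A minimal sketch of both conventions in one place (an illustrative script, not one from this repo; the argument and paths are hypothetical):

```bash
#!/usr/bin/env bash
# Fail on errors, unset variables, and failures anywhere in a pipeline
set -euo pipefail

config_dir="${1:-configs}"   # hypothetical positional argument with a default
for file in "${config_dir}"/*.conf; do
  echo "checking ${file}"    # every expansion stays quoted
done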
PowerShell Scripts (PSScriptAnalyzer)
- Use approved verbs (Get-, Set-, New-, etc.)
- Avoid positional parameters in functions
- Use proper error handling with try/catch
- Note: Algo's PowerShell script is a WSL wrapper since Ansible doesn't run natively on Windows
Ansible (ansible-lint)
- Many warnings are suppressed in `.ansible-lint`
- Focus on errors, not warnings
- Common suppressions: `name[missing]`, `risky-file-permissions`
Documentation Style
- Avoid excessive header nesting (prefer 2-3 levels maximum)
- Don't overuse bold formatting in lists - use sparingly for emphasis only
- Write flowing paragraphs instead of choppy bullet-heavy sections
- Keep formatting clean and readable - prefer natural text over visual noise
- Use numbered lists for procedures, simple bullets for feature lists
- Example: "Navigate to Network → Interfaces" not "Navigate to **Network** → **Interfaces**"
Git Workflow
- Create feature branches from `master`
- Make atomic commits with clear messages
- Run all linters before pushing
- Update PR description with test results
- Squash commits if requested
Testing Requirements
Before pushing any changes:
```bash
# Python tests
pytest tests/unit/ -v

# Ansible syntax
ansible-playbook main.yml --syntax-check
ansible-playbook users.yml --syntax-check

# Linters
ansible-lint
yamllint .
ruff check .
shellcheck *.sh

# PowerShell (if available)
pwsh -Command "Invoke-ScriptAnalyzer -Path ./algo.ps1"
```
Writing Effective Tests - Mutation Testing Approach
When writing tests, always verify that your test actually detects the failure case. This is a form of lightweight mutation testing that ensures tests add real value:
1. Write the test for the bug/issue you're preventing
2. Temporarily introduce the bug to verify the test fails
3. Fix the bug and verify the test passes
4. Document what specific issue the test prevents
Example from our codebase:
```python
def test_regression_openssl_inline_comments():
    """Tests that we detect inline comments in Jinja2 expressions."""
    # This pattern SHOULD fail (has inline comments)
    problematic = "{{ ['DNS:' + id, # comment ] }}"
    assert not validate(problematic), "Should detect inline comments"

    # This pattern SHOULD pass (no inline comments)
    fixed = "{{ ['DNS:' + id] }}"
    assert validate(fixed), "Should pass without comments"
```
This practice ensures:
- Tests aren't just checking happy paths
- Tests will actually catch regressions
- The test's purpose is clear to future maintainers
- We avoid false confidence from tests that always pass
Common Issues and Solutions
1. Ansible-lint "name[missing]" Warnings
- Added to skip_list in `.ansible-lint`
- Too many tasks to fix immediately (113+)
- Focus on new code having proper names
2. DNS Architecture and Common Issues
Understanding local_service_ip
- Algo uses a randomly generated IP in the 172.16.0.0/12 range on the loopback interface
- This IP (`local_service_ip`) is where dnscrypt-proxy should listen
- Requires the `net.ipv4.conf.all.route_localnet=1` sysctl for VPN clients to reach loopback IPs (sketched below)
- This is by design, for consistency across VPN types (WireGuard + IPsec)
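A sketch of the moving parts, assuming an example service IP (172.24.117.23 stands in for the randomly generated `local_service_ip`):

```bash
# The service IP lives on the loopback interface
ip addr add 172.24.117.23/32 dev lo
# Allow traffic arriving on other interfaces to route to loopback IPs
sysctl -w net.ipv4.conf.all.route_localnet=1
# dnscrypt-proxy should then be reachable here
ss -ulnp | grep :53
```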
dnscrypt-proxy Service Failures
Problem: "Unit dnscrypt-proxy.socket is masked" or the service won't start
- The service has a `Requires=dnscrypt-proxy.socket` dependency
- Masking the socket prevents the service from starting
- Solution: configure the socket properly instead of fighting it
DNS Not Accessible to VPN Clients
Symptoms: VPN connects but no internet/DNS access
- First check what's listening: `sudo ss -ulnp | grep :53`
  - Should show `local_service_ip:53` (e.g., 172.24.117.23:53)
  - If it shows only 127.0.2.1:53, the socket override didn't apply
- Check socket status: `systemctl status dnscrypt-proxy.socket`
  - Look for "configuration has changed while running"; that means the socket needs a restart
- Verify route_localnet: `sysctl net.ipv4.conf.all.route_localnet`
  - Must be 1 for VPN clients to reach loopback IPs
- Check the firewall: ensure it allows VPN subnets: `-A INPUT -s {{ subnets }} -d {{ local_service_ip }}` (example rules below)
- Never allow DNS from all sources (0.0.0.0/0): security risk!
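The restricted rule shape looks roughly like this; the subnet and service IP are example values standing in for `{{ subnets }}` and `{{ local_service_ip }}`:

```
# Accept DNS only from VPN client subnets, only to the local service IP
-A INPUT -s 10.49.0.0/16 -d 172.24.117.23 -p udp --dport 53 -j ACCEPT
-A INPUT -s 10.49.0.0/16 -d 172.24.117.23 -p tcp --dport 53 -j ACCEPT
# A rule without -s would expose the resolver to 0.0.0.0/0; never do that
```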
3. Multi-homed Systems and NAT
DigitalOcean and other providers with multiple IPs:
- Servers may have both public and private IPs on the same interface
- MASQUERADE needs the output interface: `-o {{ ansible_default_ipv4['interface'] }}` (see the example below)
- Don't overengineer with SNAT; MASQUERADE with an output interface works fine
- Use the `alternative_ingress_ip` option only when truly needed
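For illustration, the two rule shapes side by side; the subnet, interface, and public IP are example values:

```
# Default approach: MASQUERADE pinned to the output interface
-A POSTROUTING -s 10.49.0.0/16 -o eth0 -j MASQUERADE
# With alternative_ingress_ip enabled: explicit SNAT to a chosen public IP
-A POSTROUTING -s 10.49.0.0/16 -o eth0 -j SNAT --to-source 203.0.113.10
```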
4. iptables Backend Changes (nft vs legacy)
Critical: Switching between iptables-nft and iptables-legacy can break subtle behaviors
- Ubuntu 22.04+ defaults to iptables-nft which may have implicit NAT behaviors
- Algo forces iptables-legacy for consistent rule ordering
- This switch can break DNS routing that "just worked" before
- Always test thoroughly after backend changes
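To check which backend a host is actually using (standard commands, shown for reference):

```bash
# Shows whether the iptables alternative points at legacy or nft
update-alternatives --display iptables
# Prints the active backend, e.g. "iptables v1.8.7 (legacy)" or "(nf_tables)"
iptables --version
```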
5. systemd Socket Activation Gotchas
- Interface-specific sysctls (e.g., `net.ipv4.conf.wg0.route_localnet`) fail if the interface doesn't exist yet
- The WireGuard interface is only created when the service starts
- Use global sysctls, or apply settings after the service starts
- Socket configuration changes require an explicit restart (not just a reload)
6. Jinja2 Template Complexity
- Many templates use Ansible-specific filters
- Test templates with `tests/unit/test_template_rendering.py`
- Mock Ansible filters when testing
7. OpenSSL Version Compatibility
```jinja2
# Check the version and use appropriate flags
{{ (openssl_version is version('3', '>=')) | ternary('-legacy', '') }}
```
8. IPv6 Endpoint Formatting
- WireGuard configs must bracket IPv6 addresses
- Template logic:
  ```jinja2
  {% if ':' in IP %}[{{ IP }}]:{{ port }}{% else %}{{ IP }}:{{ port }}{% endif %}
  ```
Security Considerations
Always Priority One
- Never expose secrets: No passwords/keys in commits
- CVE Response: Update immediately when security issues found
- Least Privilege: Minimal permissions, dropped capabilities
- Secure Defaults: Strong crypto, no logging, firewall rules
Certificate Management
- Elliptic curve cryptography (secp384r1)
- Proper CA password handling
- Certificate revocation support
- Secure storage in `/etc/ipsec.d/`
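For reference, generating a secp384r1 key with the openssl CLI looks like this (a standalone illustration; Algo's actual certificate generation is driven by its Ansible roles, and the file names are examples):

```bash
# Generate an EC private key on the secp384r1 curve
openssl ecparam -name secp384r1 -genkey -noout -out ca-key.pem
# Extract the corresponding public key
openssl ec -in ca-key.pem -pubout -out ca-pub.pem
```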
Network Security
- Strict firewall rules (iptables/ip6tables)
- No IP forwarding except for VPN
- DNS leak protection
- Kill switch implementation
Platform Support
Operating Systems
- Primary: Ubuntu 20.04/22.04 LTS
- Secondary: Debian 11/12
- Clients: Windows, macOS, iOS, Android, Linux
Cloud Providers
Each has specific requirements:
- AWS: Requires boto3, specific AMI IDs
- Azure: Complex networking setup
- DigitalOcean: Simple API, good for testing (watch for multiple IPs on eth0)
- Local: KVM/Docker for development
Testing Note: DigitalOcean droplets often have both public and private IPs on the same interface, making them excellent test cases for multi-IP scenarios and NAT issues.
Architecture Considerations
- Support both x86_64 and ARM64
- Some providers have limited ARM support
- Performance varies by instance type
CI/CD Pipeline
GitHub Actions Workflows
- lint.yml: Runs ansible-lint on all pushes
- main.yml: Tests cloud provider configurations
- smart-tests.yml: Selective test running based on changes
- integration-tests.yml: Full deployment tests (currently disabled)
Test Categories
- Unit Tests: Python-based, test logic and templates
- Syntax Checks: Ansible playbook validation
- Linting: Code quality enforcement
- Integration: Full deployment testing (needs work)
Maintenance Guidelines
Dependency Updates
- Check for security vulnerabilities monthly
- Update conservatively (minor versions)
- Test on multiple platforms
- Document in PR why updates are needed
Issue Triage
- Security issues: Priority 1
- Broken functionality: Priority 2
- Feature requests: Priority 3
- Check issues for duplicates
Pull Request Standards
- Clear description of changes
- Test results included
- Linter compliance
- Conservative approach
Time Wasters to Avoid (Lessons Learned)
Don't spend time on these unless absolutely necessary:
- Converting MASQUERADE to SNAT - MASQUERADE works fine for Algo's use case
- Fighting systemd socket activation - Configure it properly instead of trying to disable it
- Debugging NAT before checking DNS - Most "routing" issues are DNS issues
- Complex IPsec policy matching: keep NAT rules simple and avoid `-m policy --pol none`
- Testing on existing servers - Always test on fresh deployments
- Interface-specific route_localnet - WireGuard interface doesn't exist until service starts
- DNAT for loopback addresses - Packets to local IPs don't traverse PREROUTING
- Removing BPF JIT hardening - It's optional and causes errors on many kernels
Working with Algo
Local Development Setup
```bash
# Install dependencies
uv sync
uv run ansible-galaxy install -r requirements.yml

# Run local deployment
ansible-playbook main.yml -e "provider=local"
```
Common Tasks
Adding a New User
```bash
ansible-playbook users.yml -e "server=SERVER_NAME"
```
Updating Dependencies
- Create a new branch
- Update pyproject.toml conservatively
- Run `uv lock` to update the lockfile
- Run all tests
- Document security fixes
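A hedged sketch of that workflow as shell commands (the branch name and edited package are examples):

```bash
git checkout -b deps/update-jinja2   # new branch for the bump
"${EDITOR:-vi}" pyproject.toml       # update versions conservatively
uv lock                              # regenerate uv.lock
uv sync                              # install the updated versions
pytest tests/unit/ -v                # all tests must pass before pushing
```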
Debugging Deployment Issues
- Check `ansible-playbook -vvv` output
- Verify cloud provider credentials
- Check firewall rules
- Review generated configs in `configs/`
Troubleshooting VPN Connectivity
Debugging Methodology
When the VPN connects but traffic doesn't work, follow this exact order (learned from painful experience):

1. Check DNS listening addresses first
   ```bash
   ss -lnup | grep :53
   # Should show local_service_ip:53 (e.g., 172.24.117.23:53)
   # If showing 127.0.2.1:53, the socket override didn't apply
   ```
2. Check both socket AND service status
   ```bash
   systemctl status dnscrypt-proxy.socket dnscrypt-proxy.service
   # Look for "configuration has changed while running" warnings
   ```
3. Verify route_localnet is enabled
   ```bash
   sysctl net.ipv4.conf.all.route_localnet
   # Must be 1 for VPN clients to reach loopback IPs
   ```
4. Test DNS resolution from the server
   ```bash
   dig @172.24.117.23 google.com   # Use the actual local_service_ip
   # Should return results if DNS is working
   ```
5. Check firewall counters
   ```bash
   iptables -L INPUT -v -n | grep -E '172.24|10.49|10.48'
   # Look for increasing packet counts
   ```
6. Verify NAT is happening
   ```bash
   iptables -t nat -L POSTROUTING -v -n
   # Check for MASQUERADE rules with packet counts
   ```

Key insight: 90% of "routing" issues are actually DNS issues. Always check DNS first!
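The checks above can be strung into a quick one-shot diagnostic (a sketch; LOCAL_SERVICE_IP must be replaced with the deployment's actual value):

```bash
#!/usr/bin/env bash
# Run the troubleshooting checks in the recommended order
set -euo pipefail
LOCAL_SERVICE_IP="172.24.117.23"   # example value; use the deployed local_service_ip

ss -lnup | grep :53 || echo "nothing listening on port 53"
systemctl status dnscrypt-proxy.socket dnscrypt-proxy.service --no-pager || true
sysctl net.ipv4.conf.all.route_localnet
dig @"${LOCAL_SERVICE_IP}" google.com +short || echo "DNS resolution failed"
iptables -L INPUT -v -n | grep "${LOCAL_SERVICE_IP}" || true
iptables -t nat -L POSTROUTING -v -n
```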
systemd and dnscrypt-proxy (Critical for Ubuntu/Debian)
Background: Ubuntu's dnscrypt-proxy package uses systemd socket activation, which completely overrides the `listen_addresses` setting in the config file.
How it works:
- The default socket listens on 127.0.2.1:53 (hardcoded in the package)
- Socket activation means systemd opens the port, not dnscrypt-proxy
- The config file's `listen_addresses` is ignored when socket activation is used
- You must configure the socket, not just the service
Correct approach:
```ini
# /etc/systemd/system/dnscrypt-proxy.socket.d/10-algo-override.conf
# (systemd unit files don't allow inline comments, so notes go on their own lines)
[Socket]
# Clear ALL defaults first, both TCP and UDP
ListenStream=
ListenDatagram=
# Add TCP and UDP listeners on the VPN service IP
ListenStream=172.x.x.x:53
ListenDatagram=172.x.x.x:53
```
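Deploying that override from Ansible might look like the following sketch (the template and handler names are illustrative, not necessarily the role's actual ones):

```yaml
- name: Create dnscrypt-proxy socket override directory
  ansible.builtin.file:
    path: /etc/systemd/system/dnscrypt-proxy.socket.d
    state: directory
    mode: "0755"

- name: Install socket override listening on the VPN service IP
  ansible.builtin.template:
    src: 10-algo-override.conf.j2          # hypothetical template name
    dest: /etc/systemd/system/dnscrypt-proxy.socket.d/10-algo-override.conf
    mode: "0644"
  notify: restart dnscrypt-proxy socket    # handler must daemon-reload, then restart
```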
Config requirements:
- Use empty `listen_addresses = []` in dnscrypt-proxy.toml for socket activation
- The socket must be restarted (not just reloaded) after config changes
- Check `systemctl status dnscrypt-proxy.socket` for warnings
- Verify with `ss -lnup | grep :53` to see the actual listening addresses
Common mistakes:
- Trying to disable/mask the socket (breaks service with Requires= dependency)
- Only setting ListenStream (need ListenDatagram for UDP)
- Forgetting to clear defaults first (results in listening on both IPs)
- Not restarting socket after configuration changes
Architectural Decisions and Trade-offs
DNS Service IP Design
Algo uses a randomly generated IP in the 172.16.0.0/12 range on the loopback interface for DNS (`local_service_ip`). This design has trade-offs:
Why it's done this way:
- Provides a consistent DNS IP across both WireGuard and IPsec
- Avoids binding to VPN gateway IPs which differ between protocols
- Survives interface changes and restarts
- Works the same way across all cloud providers
The cost:
- Requires the `route_localnet=1` sysctl (a minor security consideration)
- Adds complexity with systemd socket activation
- Can be confusing to debug
- Can be confusing to debug
Alternatives considered but rejected:
- Binding to VPN gateway IPs directly (breaks unified configuration)
- Using dummy interface instead of loopback (non-standard, more complex)
- DNAT redirects (doesn't work with loopback destinations)
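For intuition, picking a random IP in 172.16.0.0/12 could look like this in shell (illustrative only; Algo computes `local_service_ip` in Ansible, not with this snippet):

```bash
# 172.16.0.0/12 spans 172.16.0.0 through 172.31.255.255,
# so the second octet is 16..31
octet2=$((16 + RANDOM % 16))
echo "172.${octet2}.$((RANDOM % 256)).$((RANDOM % 254 + 1))"
```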
iptables Backend Choice
Algo forces iptables-legacy instead of iptables-nft on Ubuntu 22.04+ because:
- nft reorders rules unpredictably, breaking VPN traffic
- Legacy backend provides consistent, predictable behavior
- Trade-off: Lost some implicit NAT behaviors that nft provided
Important Context for LLMs
What Makes Algo Special
- Simplicity: One command to deploy
- Security: Hardened by default
- No Bloat: Minimal dependencies
- Privacy: No telemetry or logging
User Expectations
- It should "just work"
- Security is non-negotiable
- Backwards compatibility matters
- Clear error messages
Common User Profiles
- Privacy Advocates: Want secure communications
- Travelers: Need reliable VPN access
- Small Teams: Shared VPN for remote work
- Developers: Testing and development
Maintenance Philosophy
- Stability over features
- Security over convenience
- Clarity over cleverness
- Test everything
Final Notes
When working on Algo:
- Think Security First: Every change should maintain or improve security
- Test Thoroughly: Multiple platforms, both VPN types
- Document Clearly: Users may not be technical
- Be Conservative: This is critical infrastructure
- Respect Privacy: No tracking, minimal logging
Remember: People trust Algo with their privacy and security. Every line of code matters.