mirror of
https://github.com/trailofbits/algo.git
synced 2025-09-08 04:53:08 +02:00
CLAUDE.md - LLM Guidance for Algo VPN
This document provides essential context and guidance for LLMs working on the Algo VPN codebase. It captures important learnings, patterns, and best practices discovered through extensive work with this project.
Project Overview
Algo is an Ansible-based tool that sets up a personal VPN in the cloud. It's designed to be:
- Security-focused: Creates hardened VPN servers with minimal attack surface
- Easy to use: Automated deployment with sensible defaults
- Multi-platform: Supports various cloud providers and operating systems
- Privacy-preserving: No logging, minimal data retention
Core Technologies
- VPN Protocols: WireGuard (preferred) and IPsec/IKEv2
- Configuration Management: Ansible (currently v11.x)
- Languages: Python, YAML, Shell, Jinja2 templates
- Supported Providers: AWS, Azure, DigitalOcean, GCP, Vultr, Hetzner, local deployment
Architecture and Structure
Directory Layout
algo/
├── main.yml # Primary playbook
├── users.yml # User management playbook
├── server.yml # Server-specific tasks
├── config.cfg # Main configuration file
├── requirements.txt # Python dependencies
├── requirements.yml # Ansible collections
├── roles/ # Ansible roles
│ ├── common/ # Base system configuration
│ ├── wireguard/ # WireGuard VPN setup
│ ├── strongswan/ # IPsec/IKEv2 setup
│ ├── dns/ # DNS configuration (dnsmasq, dnscrypt)
│ ├── ssh_tunneling/ # SSH tunnel setup
│ └── cloud-*/ # Cloud provider specific roles
├── library/ # Custom Ansible modules
├── playbooks/ # Supporting playbooks
└── tests/ # Test suite
└── unit/ # Python unit tests
Key Roles
- common: Firewall rules, system hardening, package management
- wireguard: WireGuard server/client configuration
- strongswan: IPsec server setup with certificate generation
- dns: DNS encryption and ad blocking
- cloud-*: Provider-specific instance creation
Critical Dependencies and Version Management
Current Versions (MUST maintain compatibility)
ansible==11.8.0 # Stay current to get latest security, performance and bugfixes
jinja2~=3.1.6 # Security fix for CVE-2025-27516
netaddr==1.3.0 # Network address manipulation
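The jinja2 pin uses the compatible-release operator (~=), which accepts later patch releases but not a new minor series. The following is a minimal sketch of that rule for purely numeric versions; the function name satisfies_compatible_release is hypothetical and is not part of Algo or pip.

```python
def satisfies_compatible_release(candidate: str, pinned: str) -> bool:
    """Approximate PEP 440 '~=' for dotted numeric versions.

    '~=3.1.6' means: at least 3.1.6, and still within the 3.1 series.
    Assumes versions contain only integers separated by dots.
    """
    cand = tuple(int(part) for part in candidate.split("."))
    pin = tuple(int(part) for part in pinned.split("."))
    prefix = pin[:-1]  # every pinned component except the last must match
    return cand >= pin and cand[: len(prefix)] == prefix


# With the pin jinja2~=3.1.6:
print(satisfies_compatible_release("3.1.7", "3.1.6"))  # later patch release
print(satisfies_compatible_release("3.2.0", "3.1.6"))  # new minor series
```

An exact pin such as ansible==11.8.0 accepts only that one version, so updating it is always an explicit edit to requirements.txt.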
Version Update Guidelines
- Be Conservative: Prefer minor version bumps over major ones
- Security First: Always prioritize security updates (CVEs)
- Test Thoroughly: Run all tests before updating
- Document Changes: Explain why each update is necessary
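One way to apply the "Be Conservative" guideline during review is to label each proposed version change before deciding how much testing it needs. The sketch below is illustrative only: classify_bump is a hypothetical helper, and it assumes dotted numeric versions with at most three components that matter.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Turn '11.8.0' into (11, 8, 0), padding missing components with zeros."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def classify_bump(current: str, proposed: str) -> str:
    """Label a version change as major, minor, patch, unchanged, or downgrade."""
    old, new = parse_version(current), parse_version(proposed)
    if new == old:
        return "unchanged"
    if new < old:
        return "downgrade"
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    return "patch"


print(classify_bump("11.8.0", "11.9.0"))
print(classify_bump("11.8.0", "12.0.0"))
```

A "major" result is the case the guidelines ask you to justify explicitly in the PR description.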
Ansible Collections
Pinned in requirements.yml; key ones include:
community.general
ansible.posix
openstack.cloud
Development Practices
Code Style and Linting
Python (ruff)
# pyproject.toml configuration
[tool.ruff]
target-version = "py311"
line-length = 120
[tool.ruff.lint]
select = ["E", "W", "F", "I", "B", "C4", "UP"]
YAML (yamllint)
- Document start markers (---) required
- No trailing spaces
- Newline at end of file
- Quote 'on': in GitHub workflows (truthy value)
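The quoting rule exists because YAML 1.1 parsers can read a bare on as a boolean, which is what yamllint's truthy rule flags. As a rough illustration, the sketch below scans workflow text for unquoted keys of that kind. It is a simplified stand-in for yamllint, not its implementation, and the word list is a deliberately small subset.

```python
import re

# Subset of words that YAML 1.1 loaders may interpret as booleans.
YAML11_BOOL_WORDS = {"yes", "no", "on", "off"}

# Matches an unquoted alphabetic mapping key at the start of a line.
KEY_PATTERN = re.compile(r"^\s*([A-Za-z]+)\s*:")


def unquoted_truthy_keys(text: str) -> list[tuple[int, str]]:
    """Return (line number, key) pairs for bare keys such as on: or yes:."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_PATTERN.match(line)
        if match and match.group(1).lower() in YAML11_BOOL_WORDS:
            hits.append((number, match.group(1)))
    return hits


bare = "---\nname: Lint\non:\n  push:\n"
quoted = "---\nname: Lint\n'on':\n  push:\n"
print(unquoted_truthy_keys(bare))
print(unquoted_truthy_keys(quoted))
```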
Shell Scripts (shellcheck)
- Quote all variables: "${var}"
- Use set -euo pipefail for safety
- FreeBSD rc scripts will show false positives (ignore)
Ansible (ansible-lint)
- Many warnings are suppressed in .ansible-lint
- Focus on errors, not warnings
- Common suppressions: name[missing], risky-file-permissions
Git Workflow
- Create feature branches from master
- Make atomic commits with clear messages
- Run all linters before pushing
- Update PR description with test results
- Squash commits if requested
Testing Requirements
Before pushing any changes:
# Python tests
pytest tests/unit/ -v
# Ansible syntax
ansible-playbook main.yml --syntax-check
ansible-playbook users.yml --syntax-check
# Linters
ansible-lint
yamllint .
ruff check .
shellcheck *.sh
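If you prefer a single entry point, the commands above can be wrapped in a small runner that reports every failing check rather than stopping at the first. This is a sketch, not a script that ships with Algo: build_commands and run_checks are hypothetical names, and the runner is injectable so the logic can be tested without the tools installed.

```python
import glob
import subprocess


def build_commands(shell_scripts: list[str]) -> list[list[str]]:
    """Return the pre-push checks listed above as argument lists."""
    commands = [
        ["pytest", "tests/unit/", "-v"],
        ["ansible-playbook", "main.yml", "--syntax-check"],
        ["ansible-playbook", "users.yml", "--syntax-check"],
        ["ansible-lint"],
        ["yamllint", "."],
        ["ruff", "check", "."],
    ]
    if shell_scripts:  # shellcheck needs at least one file argument
        commands.append(["shellcheck", *shell_scripts])
    return commands


def run_checks(commands: list[list[str]], run=subprocess.run) -> list[str]:
    """Run every command and return the ones that exited non-zero."""
    failed = []
    for command in commands:
        result = run(command)
        if result.returncode != 0:
            failed.append(" ".join(command))
    return failed


# Typical use from the repository root (glob replaces the shell's *.sh expansion):
#   failures = run_checks(build_commands(sorted(glob.glob("*.sh"))))
#   raise SystemExit(1 if failures else 0)
```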
Common Issues and Solutions
1. Ansible-lint "name[missing]" Warnings
- Added to skip_list in .ansible-lint
- Too many tasks to fix immediately (113+)
- Focus on new code having proper names
2. FreeBSD rc Script Warnings
- Variables like rcvar, start_cmd appear unused to shellcheck
- These are used by the rc.subr framework
- Safe to ignore these specific warnings
3. Jinja2 Template Complexity
- Many templates use Ansible-specific filters
- Test templates with tests/unit/test_template_rendering.py
- Mock Ansible filters when testing
4. OpenSSL Version Compatibility
# Check version and use appropriate flags
{{ (openssl_version is version('3', '>=')) | ternary('-legacy', '') }}
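When checking this logic outside Ansible, for example in a unit test, the same decision can be written in plain Python. The sketch below assumes the version string begins with a numeric major component such as 3.0.13 or 1.1.1w; legacy_flag is a hypothetical helper name.

```python
def legacy_flag(openssl_version: str) -> str:
    """Mirror the template expression: '-legacy' on OpenSSL 3 and later, else ''."""
    major = int(openssl_version.split(".")[0])
    return "-legacy" if major >= 3 else ""


print(repr(legacy_flag("3.0.13")))
print(repr(legacy_flag("1.1.1w")))
```

Only the major component is compared, which matches the version('3', '>=') test for ordinary release strings but does not handle unusual version formats.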
5. IPv6 Endpoint Formatting
- WireGuard configs must bracket IPv6 addresses
- Template logic: {% if ':' in IP %}[{{ IP }}]:{{ port }}{% else %}{{ IP }}:{{ port }}{% endif %}
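The bracketing rule can be expressed in Python as follows. Note one deliberate difference: the template checks for a ':' character, while this sketch uses the standard ipaddress module to decide whether the host is an IPv6 literal, so hostnames pass through unchanged. The function name format_endpoint and the port 51820 are illustrative assumptions.

```python
import ipaddress


def format_endpoint(host: str, port: int) -> str:
    """Return host:port, wrapping IPv6 literals in square brackets."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_ipv6 = False  # not an IP literal, e.g. a DNS name
    return f"[{host}]:{port}" if is_ipv6 else f"{host}:{port}"


print(format_endpoint("2001:db8::1", 51820))
print(format_endpoint("192.0.2.10", 51820))
```

Without the brackets, the final colon-separated group of an IPv6 address cannot be told apart from the port.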
Security Considerations
Always Priority One
- Never expose secrets: No passwords/keys in commits
- CVE Response: Update immediately when security issues found
- Least Privilege: Minimal permissions, dropped capabilities
- Secure Defaults: Strong crypto, no logging, firewall rules
Certificate Management
- Elliptic curve cryptography (secp384r1)
- Proper CA password handling
- Certificate revocation support
- Secure storage in /etc/ipsec.d/
Network Security
- Strict firewall rules (iptables/ip6tables)
- No IP forwarding except for VPN
- DNS leak protection
- Kill switch implementation
Platform Support
Operating Systems
- Primary: Ubuntu 20.04/22.04 LTS
- Secondary: Debian 11/12
- Special: FreeBSD (requires platform-specific code)
- Clients: Windows, macOS, iOS, Android, Linux
Cloud Providers
Each has specific requirements:
- AWS: Requires boto3, specific AMI IDs
- Azure: Complex networking setup
- DigitalOcean: Simple API, good for testing
- Local: KVM/Docker for development
Architecture Considerations
- Support both x86_64 and ARM64
- Some providers have limited ARM support
- Performance varies by instance type
CI/CD Pipeline
GitHub Actions Workflows
- lint.yml: Runs ansible-lint on all pushes
- main.yml: Tests cloud provider configurations
- smart-tests.yml: Selective test running based on changes
- integration-tests.yml: Full deployment tests (currently disabled)
Test Categories
- Unit Tests: Python-based, test logic and templates
- Syntax Checks: Ansible playbook validation
- Linting: Code quality enforcement
- Integration: Full deployment testing (needs work)
Maintenance Guidelines
Dependency Updates
- Check for security vulnerabilities monthly
- Update conservatively (minor versions)
- Test on multiple platforms
- Document in PR why updates are needed
Issue Triage
- Security issues: Priority 1
- Broken functionality: Priority 2
- Feature requests: Priority 3
- Check issues for duplicates
Pull Request Standards
- Clear description of changes
- Test results included
- Linter compliance
- Conservative approach
Working with Algo
Local Development Setup
# Install dependencies
pip install -r requirements.txt
ansible-galaxy install -r requirements.yml
# Run local deployment
ansible-playbook main.yml -e "provider=local"
Common Tasks
Adding a New User
ansible-playbook users.yml -e "server=SERVER_NAME"
Updating Dependencies
- Create a new branch
- Update requirements.txt conservatively
- Run all tests
- Document security fixes
Debugging Deployment Issues
- Check ansible-playbook -vvv output
- Verify cloud provider credentials
- Check firewall rules
- Review generated configs in configs/
Important Context for LLMs
What Makes Algo Special
- Simplicity: One command to deploy
- Security: Hardened by default
- No Bloat: Minimal dependencies
- Privacy: No telemetry or logging
User Expectations
- It should "just work"
- Security is non-negotiable
- Backwards compatibility matters
- Clear error messages
Common User Profiles
- Privacy Advocates: Want secure communications
- Travelers: Need reliable VPN access
- Small Teams: Shared VPN for remote work
- Developers: Testing and development
Maintenance Philosophy
- Stability over features
- Security over convenience
- Clarity over cleverness
- Test everything
Final Notes
When working on Algo:
- Think Security First: Every change should maintain or improve security
- Test Thoroughly: Multiple platforms, both VPN types
- Document Clearly: Users may not be technical
- Be Conservative: This is critical infrastructure
- Respect Privacy: No tracking, minimal logging
Remember: People trust Algo with their privacy and security. Every line of code matters.