algo/tests
Jack Ivanov 4289db043a
Refactor StrongSwan PKI tasks to use Ansible crypto modules and remove legacy OpenSSL scripts (#14809)
* Refactor StrongSwan PKI automation with Ansible crypto modules

- Replace shell-based OpenSSL commands with community.crypto modules
- Remove custom OpenSSL config template and manual file management
- Upgrade Ansible to 11.8.0 in requirements.txt
- Improve idempotency, maintainability, and security of certificate and CRL handling

* Enhance nameConstraints with comprehensive exclusions

- Add email domain exclusions (.com, .org, .net, .gov, .edu, .mil, .int)
- Include private IPv4 network exclusions
- Add IPv6 null route exclusion
- Preserve all security constraints from original openssl.cnf.j2
- Note: Complex IPv6 conditional logic simplified for Ansible compatibility

Security: Maintains defense-in-depth certificate scope restrictions

* Refactor StrongSwan PKI with comprehensive security enhancements and hybrid testing

## StrongSwan PKI Modernization
- Migrated from shell-based OpenSSL commands to Ansible community.crypto modules
- Simplified complex Jinja2 templates while preserving all security properties
- Added clear, concise comments explaining security rationale and Apple compatibility

## Enhanced Security Implementation (Issues #75, #153)
- **Name constraints**: CA certificates restricted to specific IP/email domains
- **EKU role separation**: Server certs (serverAuth only) vs client certs (clientAuth only)
- **Domain exclusions**: Blocks public domains (.com, .org, etc.) and private IP ranges
- **Apple compatibility**: SAN extensions and PKCS#12 compatibility2022 encryption
- **Certificate revocation**: Automated CRL generation for removed users

## Comprehensive Test Suite
- **Hybrid testing**: Validates real certificates when available, config validation for CI
- **Security validation**: Verifies name constraints, EKU restrictions, role separation
- **Apple compatibility**: Tests SAN extensions and PKCS#12 format compliance
- **Certificate chain**: Validates CA signing and certificate validity periods
- **CI-compatible**: No deployment required, tests Ansible configuration directly

## Configuration Updates
- Updated CLAUDE.md: Ansible version rationale (stay current for security/performance)
- Streamlined comments: Removed duplicative explanations while preserving technical context
- Maintained all Issue #75/#153 security enhancements with modern Ansible approach

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix linting issues across the codebase

## Python Code Quality (ruff)
- Fixed import organization and removed unused imports in test files
- Replaced `== True` comparisons with direct boolean checks
- Added noqa comments for intentional imports in test modules

## YAML Formatting (yamllint)
- Removed trailing spaces in openssl.yml comments
- All YAML files now pass yamllint validation (except one pre-existing long regex line)

## Code Consistency
- Maintained proper import ordering in test files
- Ensured all code follows project linting standards
- Ready for CI pipeline validation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Replace magic number with configurable certificate validity period

## Maintainability Improvement
- Replaced hardcoded `+3650d` (10 years) with configurable variable
- Added `certificate_validity_days: 3650` in vars section with clear documentation
- Applied consistently to both server and client certificate signing

## Benefits
- Single location to modify certificate validity period
- Supports compliance requirements for shorter certificate lifespans
- Improves code readability and maintainability
- Eliminates magic number duplication

## Backwards Compatibility
- Default remains 10 years (3650 days) - no behavior change
- Organizations can now easily customize certificate validity as needed

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update test to validate configurable certificate validity period

## Test Update
- Fixed test failure after replacing magic number with configurable variable
- Now validates both variable definition and usage patterns:
  - `certificate_validity_days: 3650` (configurable parameter)
  - `ownca_not_after: "+{{ certificate_validity_days }}d"` (variable usage)

## Improved Test Coverage
- Better validation: checks that validity is configurable, not hardcoded
- Maintains backwards compatibility verification (10-year default)
- Ensures proper Ansible variable templating is used

## Verified
- Config validation mode: All 6 tests pass ✓
- Validates the maintainability improvement from previous commit

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update to Python 3.11 minimum and fix IPv6 constraint format

- Update Python requirement from 3.10 to 3.11 to align with Ansible 11
- Pin Ansible collections in requirements.yml for stability
- Fix invalid IPv6 constraint format causing deployment failure
- Update ruff target-version to py311 for consistency

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix x509_crl mode parameter and auto-fix Python linting

- Remove deprecated 'mode' parameter from x509_crl task
- Add separate file task to set CRL permissions (0644)
- Auto-fix Python datetime import (use datetime.UTC alias)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix final IPv6 constraint format in defaults template

- Update nameConstraints template in defaults/main.yml
- Change malformed IP:0:0:0:0:0:0:0:0/0:0:0:0:0:0:0:0 to correct IP:::/0
- This ensures both Ansible crypto modules and OpenSSL template use consistent IPv6 format

* Fix critical certificate generation issues for macOS/iOS VPN compatibility

This commit addresses multiple certificate generation bugs in the Ansible crypto
module implementation that were causing VPN authentication failures on Apple devices.

Fixes implemented:

1. **Basic Constraints Extension**: Added missing `CA:FALSE` constraints to both
   server and client certificate CSRs. This was causing certificate chain validation
   errors on macOS/iOS devices.

2. **Subject Key Identifier**: Added `create_subject_key_identifier: true` to CA
   certificate generation to enable proper Authority Key Identifier creation in
   signed certificates.

3. **Complete Name Constraints**: Fixed missing DNS and IPv6 constraints in CA
   certificate that were causing size differences compared to legacy shell-based
   generation. Now includes:
   - DNS constraints for the deployment-specific domain
   - IPv6 permitted addresses when IPv6 support is enabled
   - Complete IPv6 exclusion ranges (fc00::/7, fe80::/10, 2001:db8::/32)

These changes bring the certificate format much closer to the working shell-based
implementation and should resolve most macOS/iOS VPN connectivity issues.

**Outstanding Issue**: Authority Key Identifier still incomplete - missing DirName
and serial components. The community.crypto module limitation may require
additional investigation or alternative approaches.

Certificate size improvements: Server certificates increased from ~750 to ~775 bytes,
CA certificates from ~1070 to ~1250 bytes, bringing them closer to the expected
~3000 byte target size.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix certificate generation and improve version parsing

This commit addresses multiple issues found during macOS certificate validation:

Certificate Generation Fixes:
- Add Basic Constraints (CA:FALSE) to server and client certificates
- Generate Subject Key Identifier for proper AKI creation
- Improve Name Constraints implementation for security
- Update community.crypto to version 3.0.3 for latest fixes

Code Quality Improvements:
- Clean up certificate comments and remove obsolete references
- Fix server certificate identification in tests
- Update datetime comparisons for cryptography library compatibility
- Fix Ansible version parsing in main.yml with proper regex handling

Testing:
- All certificate validation tests pass
- Ansible syntax checks pass
- Python linting (ruff) clean
- YAML linting (yamllint) clean

These changes restore macOS/iOS certificate compatibility while maintaining
security best practices and improving code maintainability.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Enhance security documentation with comprehensive inline comments

Add detailed technical explanations for critical PKI security features:

- Name Constraints: Defense-in-depth rationale and attack prevention
- Public domain/network exclusions: Impersonation attack prevention
- RFC 1918 private IP blocking: Lateral movement prevention
- IPv6 constraint strategy: ULA/link-local/documentation range handling
- Role separation enforcement: Server vs client EKU restrictions
- CA delegation prevention: pathlen:0 security implications
- Cross-deployment isolation: UUID-based certificate scope limiting

These comments provide essential context for maintainers to understand
the security importance of each configuration without referencing
external issue numbers, ensuring long-term maintainability.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix CI test failures in PKI certificate validation

Resolve Smart Test Selection workflow failures by fixing test validation logic:

**Certificate Configuration Fixes:**
- Remove unnecessary serverAuth/clientAuth EKUs from CA certificate
- CA now only has IPsec End Entity EKU for VPN-specific certificate issuance
- Maintains proper role separation between server and client certificates

**Test Validation Improvements:**
- Fix domain exclusion detection to handle both single and double quotes in YAML
- Improve EKU validation to check actual configuration lines, not comments
- Server/client certificate tests now correctly parse YAML structure
- Tests pass in both CI mode (config validation) and local mode (real certificates)

**Root Cause:**
The CI failures were caused by overly broad test assertions that:
1. Expected double-quoted strings but found single-quoted YAML
2. Detected EKU keywords in comments rather than actual configuration
3. Failed to properly parse YAML list structures

All security constraints remain intact - no actual security issues were present.
The certificate generation produces properly constrained certificates for VPN use.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix trailing space in openssl.yml for yamllint compliance

---------

Co-authored-by: Dan Guido <dan@trailofbits.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-05 05:40:28 -07:00
..
fixtures feat: Add comprehensive performance optimizations to reduce deployment time by 30-60% 2025-08-03 16:42:17 -07:00
integration feat: Add comprehensive performance optimizations to reduce deployment time by 30-60% 2025-08-03 16:42:17 -07:00
legacy-lxd Optimize GitHub Actions workflows for security and performance (#14769) 2025-08-02 23:31:54 -04:00
unit Refactor StrongSwan PKI tasks to use Ansible crypto modules and remove legacy OpenSSL scripts (#14809) 2025-08-05 05:40:28 -07:00
README.md Optimize GitHub Actions workflows for security and performance (#14769) 2025-08-02 23:31:54 -04:00
test-aws-credentials.yml feat: Add comprehensive performance optimizations to reduce deployment time by 30-60% 2025-08-03 16:42:17 -07:00
test-local-config.sh Optimize GitHub Actions workflows for security and performance (#14769) 2025-08-02 23:31:54 -04:00
test-wireguard-async.yml Fix IPv6 address selection on BSD systems (#14786) 2025-08-03 17:15:27 -07:00
test-wireguard-fix.yml Fix IPv6 address selection on BSD systems (#14786) 2025-08-03 17:15:27 -07:00
test-wireguard-real-async.yml Fix IPv6 address selection on BSD systems (#14786) 2025-08-03 17:15:27 -07:00
test_bsd_ipv6.yml Fix IPv6 address selection on BSD systems (#14786) 2025-08-03 17:15:27 -07:00
test_cloud_init_template.py feat: Add comprehensive performance optimizations to reduce deployment time by 30-60% 2025-08-03 16:42:17 -07:00
test_package_preinstall.py feat: Add comprehensive performance optimizations to reduce deployment time by 30-60% 2025-08-03 16:42:17 -07:00

Algo VPN Test Suite

Current Test Coverage

What We Test Now

  1. Basic Sanity (test_basic_sanity.py)

    • Python version >= 3.10
    • requirements.txt exists
    • config.cfg is valid YAML
    • Ansible playbook syntax
    • Shell scripts pass shellcheck
    • Dockerfile exists and is valid
  2. Docker Build (test_docker_build.py)

    • Docker image builds successfully
    • Container can start
    • Ansible is available in container
  3. Configuration Generation (test-local-config.sh)

    • Ansible templates render without errors
    • Basic configuration can be generated
  4. Config Validation (test_config_validation.py)

    • WireGuard config format validation
    • Base64 key format checking
    • IP address and CIDR notation
    • Mobile config XML validation
    • Port range validation
  5. Certificate Validation (test_certificate_validation.py)

    • OpenSSL availability
    • Certificate subject formats
    • Key file permissions (600)
    • Password complexity
    • IPsec cipher suite security
  6. User Management (test_user_management.py) - Addresses #14745, #14746, #14738, #14726

    • User list parsing from config
    • Server selection string parsing
    • SSH key preservation
    • CA password handling
    • User config path generation
    • Duplicate user detection
  7. OpenSSL Compatibility (test_openssl_compatibility.py) - Addresses #14755, #14718

    • OpenSSL version detection
    • Legacy flag support detection
    • Apple device key format compatibility
    • Certificate generation compatibility
    • PKCS#12 export for mobile devices
  8. Cloud Provider Configs (test_cloud_provider_configs.py) - Addresses #14752, #14730, #14762

    • Cloud provider configuration validation
    • Hetzner server type updates (cx11 → cx22)
    • Azure dependency compatibility
    • Region format validation
    • Server size naming conventions
    • OS image naming validation

What We DON'T Test Yet

1. VPN Functionality

  • WireGuard configuration validation
    • Private/public key generation
    • Client config file format
    • QR code generation
    • Mobile config profiles
  • IPsec configuration validation
    • Certificate generation and validation
    • StrongSwan config format
    • Apple profile generation
  • SSH tunnel configuration
    • Key generation
    • SSH config file format

2. Cloud Provider Integrations

  • DigitalOcean API interactions
  • AWS EC2/Lightsail deployments
  • Azure deployments
  • Google Cloud deployments
  • Other providers (Vultr, Hetzner, etc.)

3. User Management

  • Adding new users
  • Removing users
  • Updating user configurations

4. Advanced Features

  • DNS ad-blocking configuration
  • On-demand VPN settings
  • MTU calculations
  • IPv6 configuration

5. Security Validations

  • Certificate constraints
  • Key permissions
  • Password generation
  • Firewall rules

Potential Improvements

Short Term (Easy Wins)

  1. Add job names to fix zizmor warnings

  2. Test configuration file generation without deployment:

    def test_wireguard_config_format():
        # Generate a test config
        # Validate it has required sections
        # Check key format with regex
    
  3. Test user management scripts in isolation:

    # Test that update-users generates valid YAML
    ./algo update-users --dry-run
    
  4. Add XML validation for mobile configs:

    xmllint --noout generated_configs/*.mobileconfig
    

Medium Term

  1. Mock cloud provider APIs to test deployment logic
  2. Container-based integration tests using Docker Compose
  3. Test certificate generation without full deployment
  4. Validate generated configs against schemas

Long Term

  1. End-to-end tests with actual VPN connections (using network namespaces)
  2. Performance testing for large user counts
  3. Upgrade path testing (old configs → new configs)
  4. Multi-platform client testing

Security Improvements (from zizmor)

Current status: No security issues found

Recommendations:

  1. Add explicit job names for better workflow clarity
  2. Consider pinning Ubuntu runner versions to specific releases
  3. Add GITHUB_TOKEN with minimal permissions when needed for API checks

Test Philosophy

Our approach focuses on:

  1. Fast feedback - Tests run in < 3 minutes
  2. No flaky tests - Avoid complex networking setups
  3. Test what matters - Config generation, not VPN protocols
  4. Progressive enhancement - Start simple, add coverage gradually