mirror of
https://github.com/trailofbits/algo.git
synced 2025-09-29 23:25:18 +02:00
* Fix DigitalOcean cloud-init compatibility issue causing SSH timeout on port 4160

This commit addresses the issue described in GitHub issue #14800 where DigitalOcean deployments fail during the "Wait until SSH becomes ready..." step due to cloud-init not processing the write_files directive correctly.

## Problem

- DigitalOcean's cloud-init shows "Unhandled non-multipart (text/x-not-multipart) userdata" warning
- write_files module gets skipped, leaving SSH on default port 22 instead of port 4160
- Algo deployment times out when trying to connect to port 4160

## Solution

Added proactive detection and remediation to the DigitalOcean role:

1. Check if SSH is listening on the expected port (4160) after droplet creation
2. If not, automatically apply the SSH configuration manually via SSH on port 22
3. Verify SSH is now listening on the correct port before proceeding

## Changes

- Added SSH port check with 30-second timeout
- Added fallback remediation block that:
  - Connects via SSH on port 22 to apply Algo's SSH configuration
  - Backs up the original sshd_config
  - Applies the correct SSH settings (port 4160, security hardening)
  - Restarts the SSH service
  - Verifies the fix worked

This ensures DigitalOcean deployments succeed even when cloud-init fails to process the user_data correctly, maintaining backward compatibility and reliability.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Implement cleaner fix for DigitalOcean cloud-init encoding issue

This replaces the previous workaround with two targeted fixes that address the root cause of the "Unhandled non-multipart (text/x-not-multipart) userdata" issue that prevents write_files from being processed.

## Root Cause

Cloud-init receives user_data as binary/bytes instead of UTF-8 string, causing it to fail parsing and skip the write_files directive that configures SSH on port 4160.

## Cleaner Solutions Implemented

### Fix 1: String Encoding (user_data | string)

- Added explicit string conversion to user_data template lookup
- Ensures DigitalOcean API receives proper UTF-8 string, not bytes
- Minimal change with maximum compatibility

### Fix 2: Use runcmd Instead of write_files

- Replaced write_files approach with runcmd shell commands
- Bypasses the cloud-init parsing issue entirely
- More reliable as it executes direct shell commands
- Includes automatic SSH config backup for safety

## Changes Made

- `roles/cloud-digitalocean/tasks/main.yml`: Added | string filter to user_data
- `files/cloud-init/base.yml`: Replaced write_files with runcmd approach
- Removed complex SSH detection/remediation workaround (no longer needed)

## Benefits

- ✅ Fixes root cause instead of working around symptoms
- ✅ Much simpler and more maintainable code
- ✅ Backward compatible - no API changes required
- ✅ Handles both potential failure modes (encoding + parsing)
- ✅ All tests pass, linters clean

This should resolve DigitalOcean SSH timeout issues while being much cleaner than the previous workaround approach.

* Fix cloud-init header format for DigitalOcean compatibility

The space in '# cloud-config' (introduced in PR #14775) breaks cloud-init YAML parsing on DigitalOcean, causing SSH configuration to be skipped. Cloud-init documentation requires '#cloud-config' without a space.

Fixes #14800

* Revert to write_files approach for SSH configuration

Using write_files is more maintainable and Ansible-native than runcmd. The root cause was the cloud-config header format, not write_files itself.

* Fix Ansible deprecation and variable warnings

- Replace deprecated network filters with ansible.utils equivalents:
  - ipaddr → ansible.utils.ipaddr
  - ipmath → ansible.utils.ipmath
  - ipv4 → ansible.utils.ipv4
  - ipv6 → ansible.utils.ipv6
  - next_nth_usable → ansible.utils.next_nth_usable
- Fix reserved variable name: no_log → algo_no_log
- Fix SSH user groups warning by explicitly specifying groups parameter

Addresses deprecation warnings that would become errors after 2024-01-01. All linter checks pass with only cosmetic warnings remaining.

* Add comprehensive protection for cloud-config header format

- Add inline documentation explaining critical #cloud-config format requirement
- Exclude files/cloud-init/ from yamllint and ansible-lint to prevent automatic 'fixes'
- Create detailed README.md documenting the issue and protection measures
- Reference GitHub issue #14800 for future maintainers

This prevents regression of the critical cloud-init header format that causes deployment failures when changed from '#cloud-config' to '# cloud-config'.

* Add test for cloud-init header format to prevent regression

This test ensures the cloud-init header remains exactly '#cloud-config' without a space. The regression in PR #14775 that added a space broke DigitalOcean deployments by causing cloud-init YAML parsing to fail, resulting in SSH timeouts on port 4160.

Co-authored-by: Dan Guido <dguido@users.noreply.github.com>

* Refactor SSH config template and fix MOTD task permissions

- Use dedicated sshd_config template instead of inline content
- Add explicit become: true to MOTD task to fix permissions warning

* Fix no_log variable references after renaming to algo_no_log

Update all remaining references from old 'no_log' variable to 'algo_no_log' in WireGuard, SSH tunneling, and StrongSwan roles. This fixes deployment failures caused by undefined variable references.

* fix: Correct YAML indentation in cloud-init template for DigitalOcean

The indent filter was not indenting the first line of the sshd_config content, causing invalid YAML structure that cloud-init couldn't parse. This resulted in SSH timeouts during deployment as the port was never changed from 22 to 4160.

- Add first=True parameter to indent filter to ensure all lines are indented
- Remove extra indentation in base template to prevent double-indentation
- Add comprehensive test suite to validate template rendering and prevent regressions

Fixes deployment failures where cloud-init would show: "Invalid format at line X: expected <block end>, but found '<scalar>'"

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Dan Guido <dguido@users.noreply.github.com>
233 lines
8.4 KiB
INI
---

# This is the list of users to generate.
# Every device must have a unique user.
# You can add up to 65,534 new users over the lifetime of an AlgoVPN.
# User names with leading 0's or containing only numbers should be escaped in double quotes, e.g. "000dan" or "123".
# Email addresses are not allowed.
users:
  - phone
  - laptop
  - desktop
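# Illustrative sketch only (the extra names are the examples from the comment
# above, not defaults): numeric-only names and names with leading zeros are
# quoted so that YAML reads them as strings.
# users:
#   - phone
#   - "000dan"
#   - "123"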

### Review these options BEFORE you run Algo, as they are very difficult/impossible to change after the server is deployed.

# Change default SSH port for the cloud roles only
# It doesn't apply if you deploy to your existing Ubuntu Server
ssh_port: 4160

# Deploy StrongSwan to enable IPsec support
ipsec_enabled: true

# Deploy WireGuard
# WireGuard will listen on 51820/UDP. You might need to change to another port
# if your network blocks this one. Be aware that 53/UDP (DNS) is blocked on some
# mobile data networks.
wireguard_enabled: true
wireguard_port: 51820

# This feature allows you to configure the Algo server to send outbound traffic
# through a different external IP address than the one you are establishing the VPN connection with.
# More info https://trailofbits.github.io/algo/cloud-alternative-ingress-ip.html
# Available for the following cloud providers:
# - DigitalOcean
alternative_ingress_ip: false

# Reduce the MTU of the VPN tunnel
# Some cloud and internet providers use a smaller MTU (Maximum Transmission
# Unit) than the normal value of 1500 and if you don't reduce the MTU of your
# VPN tunnel some network connections will hang. Algo will attempt to set this
# automatically based on your server, but if connections hang you might need to
# adjust this yourself.
# See: https://github.com/trailofbits/algo/blob/master/docs/troubleshooting.md#various-websites-appear-to-be-offline-through-the-vpn
reduce_mtu: 0

# Algo will use the following lists to block ads. You can add new block lists
# after deployment by modifying the line starting "BLOCKLIST_URLS=" at:
# /usr/local/sbin/adblock.sh
# If you load very large blocklists, you may also have to modify resource limits:
# /etc/systemd/system/dnsmasq.service.d/100-CustomLimitations.conf
adblock_lists:
  - "https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts"

# Enable DNS encryption.
# If 'false', 'dns_servers' should be specified below.
# DNS encryption cannot be disabled if DNS adblocking is enabled
dns_encryption: true

# Block traffic between connected clients. Change this to false to enable
# connected clients to reach each other, as well as other computers on the
# same LAN as your Algo server (i.e. the "road warrior" setup). In this
# case, you may also want to enable SMB/CIFS and NETBIOS traffic below.
BetweenClients_DROP: true

# Block SMB/CIFS traffic
block_smb: true

# Block NETBIOS traffic
block_netbios: true

# Your Algo server will automatically install security updates. Some updates
# require a reboot to take effect but your Algo server will not reboot itself
# automatically unless you change 'enabled' below from 'false' to 'true', in
# which case a reboot will take place if necessary at the time specified (as
# HH:MM) in the time zone of your Algo server. The default time zone is UTC.
unattended_reboot:
  enabled: false
  time: 06:00
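# Illustrative sketch only (the time is an arbitrary example, not a
# recommendation): to let the server reboot itself at 03:30 in its own time
# zone when an update requires it, the block above could become:
# unattended_reboot:
#   enabled: true
#   time: 03:30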

### Advanced users only below this line ###

# DNS servers which will be used if 'dns_encryption' is 'true'. Multiple
# providers may be specified, but avoid mixing providers that filter results
# (like Cisco) with those that don't (like Cloudflare) or you could get
# inconsistent results. The list of available public providers can be found
# here:
# https://github.com/DNSCrypt/dnscrypt-resolvers/blob/master/v2/public-resolvers.md
dnscrypt_servers:
  ipv4:
    - cloudflare
#    - google
#    - <YourCustomServer>  # E.g., if using NextDNS, this will be something like NextDNS-abc123.
                           # You must also fill in custom_server_stamps below. You may specify
                           # multiple custom servers.
  ipv6:
    - cloudflare-ipv6

custom_server_stamps:
#  YourCustomServer: 'sdns://...'
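# Illustrative sketch only, reusing the NextDNS-style placeholder name from the
# comment above; 'sdns://...' stands in for the real stamp and is left elided.
# A custom resolver would be listed under dnscrypt_servers and given a
# matching stamp entry:
# dnscrypt_servers:
#   ipv4:
#     - NextDNS-abc123
# custom_server_stamps:
#   NextDNS-abc123: 'sdns://...'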

# DNS servers which will be used if 'dns_encryption' is 'false'.
# Fallback resolvers for systemd-resolved
# The default is to use Cloudflare.
dns_servers:
  ipv4:
    - 1.1.1.1
    - 1.0.0.1
  ipv6:
    - 2606:4700:4700::1111
    - 2606:4700:4700::1001

# Store the PKI in a ram disk. Enabled only if store_pki (retain the PKI) is set to false
# Supported on macOS and Linux only (including Windows Subsystem for Linux)
pki_in_tmpfs: true

# Set this to 'true' when running './algo update-users' if you want ALL users to get new certs, not just new users.
keys_clean_all: false

# StrongSwan log level
# https://wiki.strongswan.org/projects/strongswan/wiki/LoggerConfiguration
strongswan_log_level: 2

# rightsourceip for ipsec
# ipv4
strongswan_network: 10.48.0.0/16
# ipv6
strongswan_network_ipv6: '2001:db8:4160::/48'

# If you're behind NAT or a firewall and you want to receive incoming connections long after network traffic has gone silent,
# this option will keep the "connection" open in the eyes of NAT.
# See: https://www.wireguard.com/quickstart/#nat-and-firewall-traversal-persistence
wireguard_PersistentKeepalive: 0
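# Illustrative sketch only: the value is an interval in seconds, and 0 leaves
# keepalives off. The WireGuard quick start linked above suggests 25 seconds
# as an interval that works with a wide variety of firewalls:
# wireguard_PersistentKeepalive: 25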

# WireGuard network configuration
wireguard_network_ipv4: 10.49.0.0/16
wireguard_network_ipv6: 2001:db8:a160::/48

# Randomly generated IP address for the local dns resolver
local_service_ip: "{{ '172.16.0.1' | ansible.utils.ipmath(1048573 | random(seed=algo_server_name + ansible_fqdn)) }}"
local_service_ipv6: "{{ 'fd00::1' | ansible.utils.ipmath(1048573 | random(seed=algo_server_name + ansible_fqdn)) }}"

# Hide sensitive data
algo_no_log: true

congrats:
  common: |
    "# Congratulations! #"
    "# Your Algo server is running. #"
    "# Config files and certificates are in the ./configs/ directory. #"
    "# Go to https://whoer.net/ after connecting #"
    "# and ensure that all your traffic passes through the VPN. #"
    "# Local DNS resolver {{ local_service_ip }}{{ ', ' + local_service_ipv6 if ipv6_support else '' }} #"
  p12_pass: |
    "# The p12 and SSH keys password for new users is {{ p12_export_password }} #"
  ca_key_pass: |
    "# The CA key password is {{ CA_password|default(omit) }} #"
  ssh_access: |
    "# Shell access: ssh -F configs/{{ ansible_ssh_host|default(omit) }}/ssh_config {{ algo_server_name }} #"

SSH_keys:
  comment: algo@ssh
  private: configs/algo.pem
  private_tmp: /tmp/algo-ssh.pem
  public: configs/algo.pem.pub

cloud_providers:
  azure:
    size: Standard_B1S
    osDisk:
      # The storage account type to use for the OS disk. Possible values:
      # 'Standard_LRS', 'Premium_LRS', 'StandardSSD_LRS', 'UltraSSD_LRS',
      # 'Premium_ZRS', 'StandardSSD_ZRS', 'PremiumV2_LRS'.
      type: Standard_LRS
    image:
      publisher: Canonical
      offer: 0001-com-ubuntu-minimal-jammy-daily
      sku: minimal-22_04-daily-lts
      version: latest
  digitalocean:
    # See docs for extended droplet options, pricing, and availability.
    # Possible values: 's-1vcpu-512mb-10gb', 's-1vcpu-1gb', ...
    size: s-1vcpu-1gb
    image: "ubuntu-22-04-x64"
  ec2:
    # Change the encrypted flag to "false" to disable AWS volume encryption.
    encrypted: true
    # Set use_existing_eip to "true" if you want to use a pre-allocated Elastic IP
    # Additional prompt will be raised to determine which IP to use
    use_existing_eip: false
    size: t2.micro
    image:
      name: "ubuntu-jammy-22.04"
      arch: x86_64
      owner: "099720109477"
    # Change instance_market_type from "on-demand" to "spot" to launch a spot
    # instance. See deploy-from-ansible.md for spot's additional IAM permission
    instance_market_type: on-demand
  gce:
    size: e2-micro
    image: ubuntu-2204-lts
    external_static_ip: false
  lightsail:
    size: nano_2_0
    image: ubuntu_22_04
  scaleway:
    size: DEV1-S
    image: Ubuntu 22.04 Jammy Jellyfish
    arch: x86_64
  hetzner:
    server_type: cpx11
    image: ubuntu-22.04
  openstack:
    flavor_ram: ">=512"
    image: Ubuntu-22.04
  cloudstack:
    size: Micro
    image: Linux Ubuntu 22.04 LTS 64-bit
    disk: 10
  vultr:
    os: Ubuntu 22.04 LTS x64
    size: vc2-1c-1gb
  linode:
    type: g6-nanode-1
    image: linode/ubuntu22.04
  local:

fail_hint:
  - Sorry, but something went wrong!
  - Please check the troubleshooting guide.
  - https://trailofbits.github.io/algo/troubleshooting.html

booleans_map:
  Y: true
  y: true