- Add comprehensive VL1 (ZeroTier protocol) fragmentation metrics:
  * Track fragmented packets, fragments, reassembly failures
  * Monitor fragment ordering issues and duplicates
  * Histogram for fragments per packet distribution
- Add VL2 (TAP/Ethernet) fragmentation metrics for virtual ethernet interfaces:
  * Track oversized frames from TAP devices
  * Monitor frames that would fragment or drop
  * Histogram for frame size distribution with common MTU buckets
- Integration across all TAP implementations (Linux, Mac, BSD, Windows)

This allows monitoring of fragmentation patterns for nodes participating
as members in ZeroTier networks, helping identify MTU mismatches and
optimize virtual ethernet performance.
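
For illustration only, the frame size histogram with MTU buckets described above could be sketched roughly as follows. This is not the code in this change: `FrameSizeHistogram`, `kFrameSizeBounds`, and the bucket boundaries are hypothetical names and values chosen for the example, and the sketch keeps plain counters instead of using the Prometheus library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical bucket upper bounds in bytes (assumed for this sketch, not
// taken from the ZeroTier source): 576 (conservative IPv4), 1280 (IPv6
// minimum MTU), 1500 (Ethernet), 2800 (ZeroTier default network MTU),
// 9000 (jumbo frames).
constexpr std::array<std::uint32_t, 5> kFrameSizeBounds = {576, 1280, 1500, 2800, 9000};

struct FrameSizeHistogram {
    // One counter per bound plus a final overflow bucket.
    // Counts are per bucket here, not cumulative as in Prometheus exposition.
    std::array<std::uint64_t, kFrameSizeBounds.size() + 1> counts{};
    // Frames larger than the network MTU passed to observe().
    std::uint64_t oversized = 0;

    // Record one frame of frameLen bytes read from a TAP device.
    void observe(std::uint32_t frameLen, std::uint32_t mtu) {
        std::size_t i = 0;
        // Find the first bucket whose upper bound is >= frameLen.
        while (i < kFrameSizeBounds.size() && frameLen > kFrameSizeBounds[i]) {
            ++i;
        }
        ++counts[i];
        if (frameLen > mtu) {
            ++oversized;
        }
    }
};
```

In this sketch a TAP read path would call `observe(len, mtu)` once per frame; a growing `oversized` count alongside a populated 9000-byte bucket would hint at an MTU mismatch between the host interface and the virtual network.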
* internal db metrics
* use shared mutexes for read/write locks
* remove this lock. only used for a metric
* more metrics
* remove exploratory metrics
place controller request benchmarks behind ifdef
* add new metrics for network config request queue size and sso expirations
* move sso expiration to its own thread in the controller
* fix potential undefined behavior when modifying a set
* Rename zt_packet_incoming -> zt_packet
Also consolidate zt_peer_packets into a single metric with tx and rx labels. Same for ztc_tcp_data and ztc_udp_data
* Further collapse tcp & udp into metric labels for zt_data
* Fix zt_data metric description
* zt_peer_packets description fix
* Consolidate incoming/outgoing network packets to a single metric
* zt_incoming_packet_error -> zt_packet_error
* Disable peer metrics for central controllers
Can change in the future if needed, but given the traffic our controllers serve, that's going to be a *lot* of data
* Disable peer metrics for controllers pt 2
* Adding peer metrics
still need to be wired up for use
* per peer packet metrics
* Fix crash from bad instantiation of histogram
* separate alive & dead path counts
* Add peer metric update block
* add peer latency values in doPingAndKeepalive
* prevent deadlock
* peer latency histogram actually works now
* cleanup
* capture counts of packets to specific peers
---------
Co-authored-by: Joseph Henry <joseph.henry@zerotier.com>
* allow specifying authtoken in central startup
* set allowManagedFrom
* move redis_mem_notification to the correct place
* add node checkins metric
* wire up min/max connection pool size metrics
* add header-only prometheus lib to ext
* rename folder
* Undo rename directory
* prometheus simpleapi included on mac & linux
* wip
* wire up some controller stats
* Get windows building with prometheus
* bsd build flags for prometheus
* Fix multiple network join from environment entrypoint.sh.release (#1961)
* _bond_m guards _bond, not _paths_m (#1965)
* Fix: warning: mutex '_aqm_m' is not held on every path through here [-Wthread-safety-analysis] (#1964)
* Serve prom metrics from /metrics endpoint
* Add prom metrics for Central controller specific things
* reorganize metric initialization
* testing out a labled gauge on Networks
* increment error counter on throw
* Consolidate metrics definitions
Put all metric definitions into node/Metrics.hpp. Accessed as needed
from there.
* Revert "testing out a labled gauge on Networks"
This reverts commit 499ed6d95e.
* still blows up but adding to the record for completeness right now
* Fix runtime issues with metrics
* Add metrics files to visual studio project
* Missed an "extern"
* add copyright headers to new files
* Add metrics for sent/received bytes (total)
* put /metrics endpoint behind auth
* sendto returns int on Win32
---------
Co-authored-by: Leonardo Amaral <leleobhz@users.noreply.github.com>
Co-authored-by: Brenton Bostick <bostick@gmail.com>