14 commits

Author SHA1 Message Date
Aaron Johnson
8285e0f45b Add Prometheus metrics for packet fragmentation monitoring in nodes
- Add comprehensive VL1 (ZeroTier protocol) fragmentation metrics:
  * Track fragmented packets, fragments, reassembly failures
  * Monitor fragment ordering issues and duplicates
  * Histogram for fragments per packet distribution

- Add VL2 (TAP/Ethernet) fragmentation metrics for virtual ethernet interfaces:
  * Track oversized frames from TAP devices
  * Monitor frames that would fragment or drop
  * Histogram for frame size distribution with common MTU buckets

- Integration across all TAP implementations (Linux, Mac, BSD, Windows)

This allows monitoring of fragmentation patterns for nodes participating
as members in ZeroTier networks, helping identify MTU mismatches and
optimize virtual ethernet performance.
2025-07-15 10:41:03 -07:00
Grant Limberg
6d532944bd
stop clang-format from breaking the build by sorting headers here 2025-07-03 08:38:21 -07:00
Adam Ierymenko
ba2a4a605c
clang-format 2025-07-03 11:26:23 -04:00
Adam Ierymenko
1982071d46 1.14.0 version bump for Linux and macOS, date update. 2024-03-19 14:38:48 -07:00
Grant Limberg
17f6b3a10b
central controller metrics & request path updates (#2012)
* internal db metrics

* use shared mutexes for read/write locks

* remove this lock. only used for a metric

* more metrics

* remove exploratory metrics

place controller request benchmarks behind ifdef
2023-05-23 12:11:26 -07:00
Grant Limberg
adfbbc3fb0
Controller Metrics & Network Config Request Fix (#2003)
* add new metrics for network config request queue size and sso expirations
* move sso expiration to its own thread in the controller
* fix potential undefined behavior when modifying a set
2023-05-16 11:56:58 -07:00
Grant Limberg
00d55fc4b4
Metrics consolidation (#1997)
* Rename zt_packet_incoming -> zt_packet

Also consolidate zt_peer_packets into a single metric with tx and rx labels.  Same for ztc_tcp_data and ztc_udp_data

* Further collapse tcp & udp into metric labels for zt_data

* Fix zt_data metric description

* zt_peer_packets description fix

* Consolidate incoming/outgoing network packets to a single metric

* zt_incoming_packet_error -> zt_packet_error

* Disable peer metrics for central controllers

Can change in the future if needed, but given the traffic our controllers serve, that's going to be a *lot* of data

* Disable peer metrics for controllers pt 2
2023-05-04 11:12:55 -07:00
Grant Limberg
74dc41c7c7
Peer metrics (#1995)
* Adding peer metrics

still need to be wired up for use

* per peer packet metrics

* Fix crash from bad instantiation of histogram

* separate alive & dead path counts

* Add peer metric update block

* add peer latency values in doPingAndKeepalive

* prevent deadlock

* peer latency histogram actually works now

* cleanup

* capture counts of packets to specific peers

---------

Co-authored-by: Joseph Henry <joseph.henry@zerotier.com>
2023-05-04 07:58:02 -07:00
Grant Limberg
925599cab0
Network-metrics (#1994)
* Add a couple quick functions for converting a uint64_t network ID/node ID into std::string

* Network metrics
2023-05-03 13:43:45 -07:00
Grant Limberg
06b487119d
More packet metrics (#1982)
* found path negotiation sends that weren't accounted for

* Fix histogram so it will actually compile

* Found more places for packet metrics
2023-05-02 11:16:55 -07:00
Grant Limberg
595e033776
Outgoing Packet Metrics (#1980)
add tx/rx labels to packet counters and add metrics for outgoing packets
2023-04-28 14:24:19 -07:00
Grant Limberg
411e54023a
adding incoming zt packet type metrics (#1976) 2023-04-26 08:49:54 -07:00
Grant Limberg
e4cb74896b
Central startup update (#1973)
* allow specifying authtoken in central startup

* set allowManagedFrom

* move redis_mem_notification to the correct place

* add node checkins metric

* wire up min/max connection pool size metrics
2023-04-25 12:44:18 -07:00
Grant Limberg
8e6e4ede6d
Add prometheus metrics for Central controllers (#1969)
* add header-only prometheus lib to ext

* rename folder

* Undo rename directory

* prometheus simpleapi included on mac & linux

* wip

* wire up some controller stats

* Get windows building with prometheus

* bsd build flags for prometheus

* Fix multiple network join from environment entrypoint.sh.release (#1961)

* _bond_m guards _bond, not _paths_m (#1965)

* Fix: warning: mutex '_aqm_m' is not held on every path through here [-Wthread-safety-analysis] (#1964)

* Serve prom metrics from /metrics endpoint

* Add prom metrics for Central controller specific things

* reorganize metric initialization

* testing out a labeled gauge on Networks

* increment error counter on throw

* Consolidate metrics definitions

Put all metric definitions into node/Metrics.hpp.  Accessed as needed
from there.

* Revert "testing out a labeled gauge on Networks"

This reverts commit 499ed6d95e.

* still blows up but adding to the record for completeness right now

* Fix runtime issues with metrics

* Add metrics files to visual studio project

* Missed an "extern"

* add copyright headers to new files

* Add metrics for sent/received bytes (total)

* put /metrics endpoint behind auth

* sendto returns int on Win32

---------

Co-authored-by: Leonardo Amaral <leleobhz@users.noreply.github.com>
Co-authored-by: Brenton Bostick <bostick@gmail.com>
2023-04-21 12:12:43 -07:00