{{tag>Brouillon Réseau Linux Kernel TCP CA}}
= Linux networking: the TCP/IP stack
See also:
* MPTCP, SCTP, DCCP
See:
* https://docs.kernel.org/networking/ip-sysctl.html
* https://www.inetdoc.net/guides/lartc/lartc.kernel.obscure.html
* https://man7.org/linux/man-pages/man7/tcp.7.html
* https://frsag.frsag.narkive.com/IYrhTt32/incidence-des-tcp-timestamps-sur-les-connexions
* hping2
man 7 tcp
== Conntrack
See:
* /proc/net/nf_conntrack
* /proc/sys/net/nf_conntrack_max
apt-get install conntrack
Flush the connection-tracking table:
conntrack -F
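As a quick health check, the table fill level can be computed from the entry count and the limit. A minimal sketch (assumption: the modern ''/proc/sys/net/netfilter/'' paths, which supersede the ''/proc/sys/net/nf_conntrack_max'' path above; the ''conntrack_usage'' name is mine):

```shell
# Conntrack table fill ratio, given current and max entry counts.
# With no arguments, the counts are read from /proc (root not required).
conntrack_usage() {
  local count=${1:-$(cat /proc/sys/net/netfilter/nf_conntrack_count)}
  local max=${2:-$(cat /proc/sys/net/netfilter/nf_conntrack_max)}
  echo "conntrack: $count / $max entries ($(( count * 100 / max ))%)"
}
```

When the table fills up, the kernel drops new connections and logs "nf_conntrack: table full, dropping packet", so watching this ratio is worthwhile on busy gateways.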
=== /proc/sys/net/ipv4/tcp_syn_retries
$ sysctl net.ipv4.tcp_syn_retries
net.ipv4.tcp_syn_retries = 6
With the default of 6 retries and an initial 1 s timeout that doubles after each unanswered SYN, this effectively takes 1+2+4+8+16+32+64 = 127 s before the connection attempt finally aborts.
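The arithmetic above generalises: with a 1 s initial timeout doubling on each retry, n retries give 2^(n+1) - 1 seconds before abort. A small sketch of that formula (the ''syn_timeout'' name is mine):

```shell
# Total time (seconds) before connect() gives up for a given
# net.ipv4.tcp_syn_retries value, assuming the default 1 s initial
# timeout that doubles after every unanswered SYN.
syn_timeout() {
  echo $(( (1 << ($1 + 1)) - 1 ))   # 1 + 2 + 4 + ... + 2^n = 2^(n+1) - 1
}

syn_timeout 6   # default: 127 s
syn_timeout 3   # 15 s
```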
=== /proc/sys/net/ipv4/tcp_synack_retries
=== /proc/sys/net/ipv4/tcp_retries2
See:
* https://stackoverflow.com/questions/5227520/how-many-times-will-tcp-retransmit
* https://pracucci.com/linux-tcp-rto-min-max-and-tcp-retries2.html
See also:
* /proc/sys/net/ipv4/tcp_retries1
* /proc/sys/net/ipv4/tcp_syn_retries
* /proc/sys/net/ipv4/tcp_synack_retries
==== Cluster
In a High Availability (HA) situation, consider decreasing the setting to 3.
RFC 1122 recommends at least 100 seconds for the timeout, which corresponds to a value of at least 8.
Oracle suggests a value of 3 for a RAC configuration.
Source : https://access.redhat.com/solutions/726753
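A sketch of making the HA value persistent, assuming a distribution that reads ''/etc/sysctl.d/'' at boot (the file name is arbitrary):

```shell
# Persist tcp_retries2=3 for an HA cluster node (run as root).
echo 'net.ipv4.tcp_retries2 = 3' > /etc/sysctl.d/90-ha-tcp.conf
sysctl -p /etc/sysctl.d/90-ha-tcp.conf

# Verify the running value.
sysctl net.ipv4.tcp_retries2
```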
==== Number of retransmissions vs. time
An experiment confirms that (on a recent Linux at least) the timeout is more like 13s with the suggested net.ipv4.tcp_retries2=5
“Windows defaults to just 5 retransmissions which corresponds with a timeout of around 6 seconds.”
tcp_retries2=5 means the timeout covers the first transmission plus 5 retransmissions: 12.6 s = (2^6 - 1) × 0.2.
tcp_retries2=15: 924.6 s = (2^10 - 1) × 0.2 + (16 - 10) × 120.
Source : https://github.com/elastic/elasticsearch/issues/102788
Voir aussi : https://www.elastic.co/guide/en/elasticsearch/reference/current/system-config-tcpretries.html#_related_configuration
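The two computations above can be reproduced for any tcp_retries2 value. A sketch assuming the stock 200 ms minimum RTO and the 120 s TCP_RTO_MAX cap (the ''retries2_timeout_ms'' name is mine):

```shell
# Total retransmission timeout (milliseconds) for a given tcp_retries2:
# the RTO starts at 200 ms, doubles after every retransmission, and is
# capped at 120 s (TCP_RTO_MAX). Integer math throughout.
retries2_timeout_ms() {
  local retries=$1 rto=200 total=0 i
  for (( i = 0; i <= retries; i++ )); do
    total=$(( total + rto ))
    rto=$(( rto * 2 ))
    [ "$rto" -gt 120000 ] && rto=120000
  done
  echo "$total"
}

retries2_timeout_ms 5    # 12600 ms = 12.6 s
retries2_timeout_ms 15   # 924600 ms = 924.6 s
```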
=== F-RTO (net.ipv4.tcp_frto)
https://access.redhat.com/solutions/4978771
== TCP keepalive
See:
* https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die
* Python Scripts https://github.com/cloudflare/cloudflare-blog/tree/master/2019-09-tcp-keepalives
tcp_keepalive_time
https://www.veritas.com/support/en_US/article.100028680
=== Configuring TCP/IP keepalive parameters for high availability clients (JDBC)
* tcp_keepalive_probes - the number of probes that are sent and unacknowledged before the client considers the connection broken and notifies the application layer
* tcp_keepalive_time - the interval between the last data packet sent and the first keepalive probe
* tcp_keepalive_intvl - the interval between subsequent keepalive probes
* tcp_retries2 - the maximum number of times a packet is retransmitted before giving up
echo "6" > /proc/sys/net/ipv4/tcp_keepalive_time
echo "1" > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo "10" > /proc/sys/net/ipv4/tcp_keepalive_probes
echo "3" > /proc/sys/net/ipv4/tcp_retries2
Source : https://www.ibm.com/docs/en/db2/9.7?topic=ctkp-configuring-operating-system-tcpip-keepalive-parameters-high-availability-clients
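With the IBM values above, a dead peer is detected after at most tcp_keepalive_time + tcp_keepalive_probes × tcp_keepalive_intvl seconds. A quick sketch of that arithmetic (the ''keepalive_detection_s'' name is mine):

```shell
# Worst-case dead-peer detection time (seconds) for a set of keepalive
# parameters: idle time before the first probe, plus one interval per probe.
keepalive_detection_s() {
  local time=$1 intvl=$2 probes=$3
  echo $(( time + probes * intvl ))
}

keepalive_detection_s 6 1 10      # IBM values above: 16 s
keepalive_detection_s 7200 75 9   # kernel defaults: 7875 s (about 2 h 11 min)
```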
Check per-socket timers (including keepalive countdowns):
ss -o
== Process / diag tools
See:
* https://access.redhat.com/solutions/30453
== Tools
=== TCP retransmissions
See:
* http://arthurchiao.art/blog/tcp-retransmission-may-be-misleading/
* net.ipv4.tcp_early_retrans
Tools:
* tcpretrans.bt (bpftrace)
* tcpretrans ([[https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans|perf-tools]])
* tcpretrans.py ([[https://github.com/iovisor/bcc/blob/master/tools/tcpretrans.py|bpfcc-tools - iovisor/bcc]])
Find the current rto_min and rto_max:
# grep ^Tcp /proc/net/snmp |column -t |cut -c1-99
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets
Tcp: 1 200 120000 -1 6834 964 161 4614
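Rather than counting characters with cut, the same two fields can be picked out by matching the header columns against the value line. A sketch (the ''rto_minmax'' name is mine):

```shell
# Extract RtoMin and RtoMax (milliseconds) from /proc/net/snmp by pairing
# the "Tcp:" header line's column names with the following "Tcp:" value line.
rto_minmax() {   # usage: rto_minmax [snmp-file]
  awk '/^Tcp:/ {
         if (!header_seen) { for (i = 1; i <= NF; i++) col[$i] = i; header_seen = 1 }
         else print "RtoMin=" $col["RtoMin"] " RtoMax=" $col["RtoMax"]
       }' "${1:-/proc/net/snmp}"
}

[ -r /proc/net/snmp ] && rto_minmax || true   # e.g. RtoMin=200 RtoMax=120000
```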
yum install bpftrace
/usr/share/bcc/tools/tcpretrans
timeout 60 ./tcpretrans | nl
sar -n ETCP
sar -n TCP
# netstat -s |egrep 'segments retransmited|segments send out'
107428604792 segments send out
47511527 segments retransmited
(The misspellings "send out" and "retransmited" are in the net-tools output itself.)
# echo "$(( 47511527 * 10000 / 107428604792 ))"
4
i.e. roughly 0.04 % of sent segments were retransmitted (the quotient is in hundredths of a percent).
https://www.ibm.com/support/pages/tracking-tcp-retransmissions-linux
''tcpretransmits.sh''
#!/usr/bin/env bash
# Record the TCP retransmission rate before and after a tcpretrans capture.
test -x /usr/sbin/tcpretrans.bt && TCPRETRANS=/usr/sbin/tcpretrans.bt
test -x /usr/share/bpftrace/tools/tcpretrans.bt && TCPRETRANS=/usr/share/bpftrace/tools/tcpretrans.bt
# https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans
test -x ./tcpretrans.pl && TCPRETRANS=./tcpretrans.pl
OUT=/tmp/tcpretransmits.log
if [ -z "$TCPRETRANS" ]; then
    echo "No tcpretrans tool found: install bpftrace or fetch perf-tools' tcpretrans" >&2
else
    date > "$OUT"
    # Retransmitted/sent ratio, in percent. The patterns "sen. out" and
    # "retransmit+ed" tolerate the net-tools spellings "send out" / "retransmited".
    netstat -s |awk '/segments sen. out$/ { R=$1; } /segments retransmit+ed$/ { printf("%.4f\n", ($1/R)*100); }' >> "$OUT"
    "$TCPRETRANS" | tee -a "$OUT"
    netstat -s |awk '/segments sen. out$/ { R=$1; } /segments retransmit+ed$/ { printf("%.4f\n", ($1/R)*100); }' >> "$OUT"
fi
Per the IBM page above, under "Resolving The Problem": \\
TCP retransmissions are almost exclusively caused by failing network hardware, not applications or middleware. Report the failing IP pairs to a network administrator.
== Miscellaneous
**TCP timestamps**
https://access.redhat.com/documentation/fr-fr/red_hat_enterprise_linux/9/html/monitoring_and_managing_system_status_and_performance/benefits-of-tcp-timestamps_tuning-the-network-performance
tcp_low_latency (Boolean; default: disabled; since Linux 2.4.21/2.6; obsolete since Linux 4.14)
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_moderate_rcvbuf = 1
----
The minimum RTO can also be overridden per route with ''ip route change ... rto_min'' (here lowered to 8 ms on the virbr1 network):
# ip route get 192.168.100.11
192.168.100.11 dev virbr1 src 192.168.100.1 uid 1000
cache
# ip route show dev virbr1
192.168.100.0/24 proto kernel scope link src 192.168.100.1
# ip route change dev virbr1 192.168.100.0/24 proto kernel scope link src 192.168.100.1 rto_min 8ms