Linux - Slow shutdown problem
journalctl -rb -1 #journalctl -b -1 -e --no-pager
Check whether a systemd service sets TimeoutStopSec=infinity
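One way to run that check is to search the unit files directly. This is a minimal sketch, not a definitive method: the helper name `find_infinite_stop_timeouts` is hypothetical, and it only sees values written in the files under the directories you pass to it.

```shell
#!/bin/bash
# Hypothetical helper: list unit files (and drop-ins) under the given
# directories that set TimeoutStopSec=infinity.
find_infinite_stop_timeouts() {
    # -r recurse, -s hide unreadable/missing paths, -l print file names only
    grep -rslE '^[[:space:]]*TimeoutStopSec[[:space:]]*=[[:space:]]*infinity' "$@"
}
```

Typical usage would be `find_infinite_stop_timeouts /etc/systemd/system /run/systemd/system /usr/lib/systemd/system`; the function returns a non-zero status when nothing matches.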
Other
/etc/systemd/system.conf
#DefaultTimeoutStopSec=90s
For the boot duration
systemd-analyze
Linux - File Descriptor - deleted files
Freeing up space
See also:
lsfd
Source https://access.redhat.com/solutions/2316
Identify the process and find its PID
lsof | grep -E "deleted|COMMAND" #lsof +L1
Note : COMMAND in grep is for lsof headers
Truncate the file
echo > /proc/pid/fd/fd_number
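The truncation can be tried safely on the current shell before touching a real process. The sketch below assumes Linux with /proc mounted and GNU `stat`; the temporary file and descriptor number 9 are arbitrary choices for the demonstration.

```shell
#!/bin/bash
# Demonstration on the current shell: open a file, delete it,
# then truncate it through /proc while the descriptor stays open.
tmp=$(mktemp)
exec 9>>"$tmp"          # fd 9 is arbitrary; append mode, like most loggers
printf 'data' >&9
rm -- "$tmp"            # the name is gone, the blocks are still allocated

size_before=$(stat -L -c %s "/proc/$$/fd/9")
: > "/proc/$$/fd/9"     # truncate to 0 bytes without closing the descriptor
size_after=$(stat -L -c %s "/proc/$$/fd/9")

echo "before=${size_before} after=${size_after}"
exec 9>&-               # close the descriptor; the inode is now released
```

Note that `echo >` leaves a one-byte file (the newline), whereas `: >` leaves it empty; either way the space is reclaimed.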
Using gdb
-bash-4.1# lsof +L1
java 21568 root 24w REG 253,2 23000046 18 /var/log/plop_2022-03-23_09.2.log (deleted)
-bash-4.1# gdb -p 21568
(gdb) p close(24)
$1 = 0
(gdb) quit
A debugging session is active.
Inferior 1 [process 21568] will be detached.
Quit anyway? (y or n) y
Detaching from program: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-0.b15.el6_8.x86_64/jre/bin/java, process 21568
-bash-4.1# lsof | grep "(deleted)"
Limiting CPU resources for a given process
Restricting CPU usage
How to limit CPU Usage of a process
See:
nice -n 19 COMMAND
or
renice +19 1234 #ionice -c3 -p 1234
Or else
systemd-run -p IOWeight=10 updatedb
Example
# best effort, highest priority
sudo ionice -c2 -n0 -p `pgrep etcd`
cgcreate -g cpu:/cpulimit
cgset -r cpu.cfs_period_us=1000000 cpulimit
cgset -r cpu.cfs_quota_us=100000 cpulimit
check
cgget -g cpu:cpulimit
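To read these values: the CFS quota is the CPU time the group may consume per period, so quota divided by period gives the share of one CPU. A minimal sketch of that arithmetic (the function name `cfs_quota_percent` is hypothetical):

```shell
#!/bin/bash
# Hypothetical helper: express a CFS quota as a percentage of one CPU.
cfs_quota_percent() {
    local quota_us=$1 period_us=$2
    echo $(( quota_us * 100 / period_us ))
}

# Values from the cgset commands: 100000 us of CPU time per 1000000 us period.
cfs_quota_percent 100000 1000000
```

With these settings the group is capped at 10% of one CPU; a quota larger than the period allows more than one full CPU.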
cgexec -g cpu:cpulimit COMMAND
or
echo 1234 > /sys/fs/cgroup/cpu/cpulimit/tasks
or
#cpulimit -p <PID> -l <%CPU>
cpulimit -p 1234 -l 80
Problem
write error: No space left on device
Error
# echo 1234 > /sys/fs/cgroup/cpuset/cpulimit/tasks
-bash: echo: write error: No space left on device
Solution
echo 0 > /sys/fs/cgroup/cpuset/cpulimit/cpuset.mems
echo 0 > /sys/fs/cgroup/cpuset/cpulimit/cpuset.cpus
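This error typically appears when a cgroup v1 cpuset has no CPUs or memory nodes assigned yet. A small pre-check can make the cause explicit before writing a PID to `tasks`. This is a sketch under that assumption; `cpuset_ready` is a hypothetical helper name, and it takes the cgroup directory as its argument.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the cpuset cgroup directory has both
# cpuset.cpus and cpuset.mems populated.
cpuset_ready() {
    local cg=$1 f val
    for f in cpuset.cpus cpuset.mems; do
        # Read the content rather than the size: cgroup files report size 0.
        val=$(cat "$cg/$f" 2>/dev/null | tr -d '[:space:]')
        if [ -z "$val" ]; then
            echo "$f is empty in $cg" >&2
            return 1
        fi
    done
    return 0
}
```

For example: `cpuset_ready /sys/fs/cgroup/cpuset/cpulimit && echo 1234 > /sys/fs/cgroup/cpuset/cpulimit/tasks`.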
How to limit a process to one CPU core in Linux - CPU affinity
start a command with the given affinity
taskset -c 0 mycommand --option
set the affinity of a running process
taskset -c -pa 0 1234
Ansible sudo su become method
Not allowed:
sudo -u testplop ls
But this is allowed:
sudo su - testplop
/etc/sudoers.d/userc1
User_Alias USER_T_USERC1=userc1
Cmnd_Alias CMND_USERC1=/bin/su - oracle, \
                       /bin/su - testplop
Defaults:CMND_USERC1 !requiretty
USER_T_USERC1 ALL= EXEC: NOPASSWD: CMND_USERC1
Whereas it would be so much cleaner to write:
Runas_Alias RUNAS_DBA_ALL = oracle, testplop
#USER_T_USERC1 ALL= (testplop) EXEC: NOPASSWD: ALL
USER_T_USERC1 ALL= (RUNAS_DBA_ALL) EXEC: NOPASSWD: ALL
Solution 1
Use the community.general.sudosu become plugin
Not applicable in our case: we get the following error:
fatal: [test-ansible]: FAILED! => {"msg": "Missing community.general.sudosu password"}
Because although it is possible to run:
sudo su - testplop
it is not possible to run:
sudo su -l testplop -c 'ls'
The following sudoers configuration would be needed:
Cmnd_Alias CMND_USERC1=/bin/su -l oracle *, \
                       /bin/su -l testplop *
This is not without security risks, since the * wildcard matches any trailing arguments.
Here is the configuration
ansible-galaxy collection install community.general
play.yml
#!/usr/bin/ansible-playbook
---
- name: test sudosu
  hosts: srvtest
  gather_facts: false
  become_method: community.general.sudosu
  become_user: testplop
  become: true
  tasks:
    - name: test
      command: id
      register: cmd_ls

    - name: test
      debug:
        var: cmd_ls.stdout_lines
Solution 2
Source : https://github.com/ansible/ansible/issues/12686
/usr/local/bin/sudosu.sh
#!/bin/bash
#
#sudosu.sh "user" -c "cmd"

if [ $# -lt 3 ]; then
    echo 'Not enough arguments: sudosu.sh "user" -c "cmd"' >&2
    exit 1
fi

if [ x"-c" != x"$2" ]; then
    echo 'Wrong 2nd arg: sudosu.sh "user" -c "cmd"' >&2
    exit 1
fi

printf '%s\n' "$3" | sudo su - "$1"
play.yml
#!/usr/bin/ansible-playbook
---
- name: test
  hosts: test-ansible
  gather_facts: false
  become_method: su
  # become_flags: "su -c"
  # become_flags: "-H -S -n" # default value
  become_exe: /usr/local/bin/sudosu.sh
  become_user: testplop
  become: true
  tasks:
    - name: test
      command: id
      register: cmd_ls
Other
ansible-doc -t become -l
AAP Diagnostics, monitoring and metrics
See:
Diag:
Official standard
Via the API:
Via Prometheus:
Logs:
- /var/log/tower/
- ERROR
- FATAL
Service statuses reported by supervisorctl
Plan 'podman run --rm' tests (as awx) on the execution nodes
Diagnosing a job
sudo -u awx -i awx-manage shell_plus --ipython
j=UnifiedJob.objects.filter(id=211242)
j[0].status
j.values_list()[0]
UnifiedJob.objects.filter(id=211242).values_list()[0]
UnifiedJob.objects.filter(id=211242)[0].result_stdout

for j in UnifiedJob.objects.filter(name='Template_name' , status='failed').values_list():
    print( ';'.join([ str(j[0]), str(j[2].isoformat(timespec='minutes')), j[12], j[27] ] ) )

UnifiedJob.objects.filter(name='Template_name' , status='error').values_list()
See also
/api/v2/jobs/211242/
/api/v2/jobs/211242/stdout/
Monitoring
awx-manage --help | grep -E "check|test"
Generic monitoring
https://www.redhat.com/sysadmin/monitor-users-linux
- Check that read-only mount points are properly reported as errors
- NTP time
- DNS resolution test and timing
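Two of these checks can be sketched in a few lines of shell. This is only an illustration under assumptions: the helper names `ro_mounts` and `dns_lookup_ms` are hypothetical, the mount check reads the /proc/mounts format, and the timing relies on GNU `date` supporting `%N`.

```shell
#!/bin/bash
# Hypothetical helper: print mount points whose option list contains "ro".
# Reads /proc/mounts by default, or a file in the same format.
ro_mounts() {
    awk '$4 ~ /(^|,)ro(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}

# Hypothetical helper: print how long one lookup through the system
# resolver takes, in milliseconds (requires GNU date for %N).
dns_lookup_ms() {
    local start end
    start=$(date +%s%N)
    getent hosts "$1" > /dev/null || true
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
```

Some mounts are read-only by design, so the output of `ro_mounts` would need to be compared against an expected list before raising an alert. `dns_lookup_ms some.host.name` can be compared against a threshold chosen for the site.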
Problem
502 error
/etc/nginx/nginx.conf
upstream daphne {
    server unix:/var/run/tower/daphne.sock;
}
/etc/nginx/nginx.conf
location /websocket {
    # Pass request to the upstream alias
    proxy_pass http://daphne;
    # Require http version 1.1 to allow for upgrade requests
    proxy_http_version 1.1;
    # We want proxy_buffering off for proxying to websockets.
    proxy_buffering off;
    # http://en.wikipedia.org/wiki/X-Forwarded-For
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    # enable this if you use HTTPS:
    proxy_set_header X-Forwarded-Proto https;
    # pass the Host: header from the client for the sake of redirects
    proxy_set_header Host $http_host;
    # We've set the Host header, so we don't need Nginx to muddle
    # about with redirects
    proxy_redirect off;
    # Depending on the request value, set the Upgrade and
    # connection headers
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;
}
/var/log/nginx/access.log
192.168.6.57 - - [16/Oct/2023:09:03:31 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.213.56 - - [16/Oct/2023:09:03:32 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"
192.168.6.57 - - [16/Oct/2023:09:03:36 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.6.57 - - [16/Oct/2023:09:03:41 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.213.56 - - [16/Oct/2023:09:03:42 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"
192.168.6.57 - - [16/Oct/2023:09:03:46 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.6.57 - - [16/Oct/2023:09:03:51 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.213.56 - - [16/Oct/2023:09:03:52 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"
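To see at a glance which paths return 502, the access log can be summarized. A small sketch, assuming the log format shown above, where whitespace-separated field 7 is the request path and field 9 the status; `count_502_by_path` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: count 502 responses per request path in an nginx access log.
count_502_by_path() {
    awk '$9 == 502 { n[$7]++ } END { for (p in n) print n[p], p }' "$1" | sort -k2
}
```

For example: `count_502_by_path /var/log/nginx/access.log`. Each output line is a count followed by the path.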
# fuser -v /var/run/tower/daphne.sock
USER PID ACCESS COMMAND
/run/tower/daphne.sock:
awx 452944 F.... python3
# ps -f -p 452944
UID PID PPID C STIME TTY TIME CMD
awx 452944 1641 0 08:20 ? 00:01:13 python3 /var/lib/awx/venv/awx/bin/daphne -u /var/run/tower/daphne.sock awx.asgi:channel_layer
Solution
The problem was caused by slow DNS resolution.
