{{tag>Brouillon AAP AWX Nginx Supervision Socket CA}}
= AAP: diagnostics, monitoring and metrology
See:
* [[AAP - Metrics - Prometheous]]
Diagnostics:
* https://www.ansible.com/hubfs/AnsibleFest%20ATL%20Slide%20Decks/Troubleshooting%20Tips%20and%20Tricks%20.pdf
* https://docs.ansible.com/ansible-tower/latest/html/administration/troubleshooting.html
Official standard:
* https://access.redhat.com/documentation/fr-fr/reference_architectures/2023/html/deploying_ansible_automation_platform_2_on_red_hat_openshift/monitoring_your_ansible_automation_platform
Via the API:
* https://stackoverflow.com/questions/74935350/detect-all-cancelled-jobs-in-ansible-awx-tower
Via Prometheus:
* https://www.redhat.com/sysadmin/introduction-prometheus-metrics-and-performance-monitoring
* https://docs.ansible.com/automation-controller/4.0.0/html/administration/metrics.html
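A minimal sketch of reading metrics in the Prometheus text format, such as what the metrics endpoint described in the linked documentation returns. The sample text and the metric names ''example_jobs_running'' and ''example_jobs_pending'' are hypothetical, not real controller output, and the parser is deliberately simplified (no timestamps, no label values containing spaces).

```python
def parse_prometheus_text(text):
    """Parse Prometheus exposition text into {(name, labels): value}.

    Simplified: comment lines are skipped, sample timestamps and label
    values containing spaces or '}' are not supported.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field of the line.
        series, _, value = line.rpartition(" ")
        name, _, labels = series.partition("{")
        samples[(name, labels.rstrip("}"))] = float(value)
    return samples


# Hypothetical sample for illustration only.
SAMPLE = """\
# HELP example_jobs_running Illustrative gauge
# TYPE example_jobs_running gauge
example_jobs_running{node="node1"} 3.0
example_jobs_pending 0.0
"""

metrics = parse_prometheus_text(SAMPLE)
for (name, labels), value in sorted(metrics.items()):
    print(name, labels, value)
```

In practice a Prometheus server scrapes the endpoint directly; a parser like this is only useful for quick ad hoc checks.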
Logs:
* /var/log/tower/
* ERROR
* FATAL
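The log check above could be sketched as follows: count ''ERROR'' and ''FATAL'' lines per file under ''/var/log/tower/''. The ''*.log'' file pattern and the function names are assumptions made for illustration; adjust them to the files actually present.

```python
import re
from collections import Counter
from pathlib import Path

# Match the two severities of interest as whole words.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL)\b")


def count_levels(lines):
    """Count lines containing ERROR or FATAL (first match per line)."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def scan_log_dir(log_dir="/var/log/tower/"):
    """Return {file name: Counter} for *.log files with at least one hit.

    The *.log pattern is an assumption about how the files are named.
    """
    results = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(errors="replace") as handle:
            counts = count_levels(handle)
        if counts:
            results[path.name] = counts
    return results
```

Calling ''scan_log_dir()'' as a user allowed to read the directory returns one counter per file, which can then be compared against an alert threshold.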
The service statuses reported by supervisorctl
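A minimal sketch of such a check, assuming the usual ''supervisorctl status'' layout where the first column is the process name and the second its state. The process names in the sample are hypothetical.

```python
import subprocess


def parse_supervisor_status(output):
    """Return {process name: state} from `supervisorctl status` text."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[1]
    return states


def not_running(states):
    """Return the sorted names of processes whose state is not RUNNING."""
    return sorted(name for name, state in states.items() if state != "RUNNING")


def read_supervisor_status():
    """Run supervisorctl and parse its output.

    check=True is deliberately not used: supervisorctl may exit with a
    non-zero status when a process is not running, and the output is
    still wanted in that case.
    """
    result = subprocess.run(
        ["supervisorctl", "status"], capture_output=True, text=True
    )
    return parse_supervisor_status(result.stdout)


# Hypothetical sample for illustration only.
SAMPLE = """\
example-group:worker-a    RUNNING   pid 1234, uptime 1:02:03
example-group:worker-b    FATAL     Exited too quickly (process log may have details)
"""

print(not_running(parse_supervisor_status(SAMPLE)))
```

A monitoring probe would call ''not_running(read_supervisor_status())'' and raise an alert when the list is not empty.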
Plan ''podman run --rm'' tests (as the awx user) on the execution nodes
== Diagnosing a job
sudo -u awx -i
awx-manage shell_plus --ipython
j=UnifiedJob.objects.filter(id=211242)
j[0].status
j.values_list()[0]
UnifiedJob.objects.filter(id=211242).values_list()[0]
UnifiedJob.objects.filter(id=211242)[0].result_stdout
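# values_list() without arguments returns tuples in model field order,
# so the positional indexes below depend on the installed version.
for j in UnifiedJob.objects.filter(name='Template_name', status='failed').values_list():
    print(
        ';'.join([
            str(j[0]),
            str(j[2].isoformat(timespec='minutes')),
            j[12],
            j[27]
        ])
    )
UnifiedJob.objects.filter(name='Template_name', status='error').values_list()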
See also
* ''/api/v2/jobs/211242/''
* ''/api/v2/jobs/211242/stdout/''
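The two endpoints above can also be queried from a script. This is a sketch under assumptions: the host ''controller.example.com'' is a placeholder, and authentication with an OAuth2 token sent as a ''Bearer'' header is assumed to be enabled on the controller.

```python
import json
import urllib.parse
import urllib.request


def job_url(base_url, job_id, stdout=False):
    """Build the URL of a job, or of its stdout when stdout=True."""
    path = f"/api/v2/jobs/{job_id}/"
    if stdout:
        path += "stdout/"
    return urllib.parse.urljoin(base_url, path)


def fetch_job(base_url, job_id, token):
    """Fetch the job record as a dict (token authentication assumed)."""
    request = urllib.request.Request(
        job_url(base_url, job_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Placeholder host, for illustration only.
print(job_url("https://controller.example.com", 211242))
print(job_url("https://controller.example.com", 211242, stdout=True))
```

For example, ''fetch_job(base_url, 211242, token)["status"]'' should give the same value as ''j[0].status'' in the shell session.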
== Monitoring
awx-manage --help | egrep "check|test"
== Generic monitoring
https://www.redhat.com/sysadmin/monitor-users-linux
* Check that a mount point switching to read-only is actually reported as an error
* NTP time synchronization
* DNS resolution test and response time
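As an illustration of the read-only mount check, here is a sketch that parses text in the ''/proc/mounts'' format (device, mount point, filesystem type, options). The list of ignored filesystem types is an assumption and the sample mounts are hypothetical.

```python
def read_only_mounts(mounts_text, ignored_fs=("squashfs", "iso9660")):
    """Return mount points whose options include 'ro'.

    mounts_text uses the /proc/mounts layout. Filesystem types listed in
    ignored_fs (an assumed default) are skipped because they are
    read-only by design.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type in ignored_fs:
            continue
        if "ro" in options.split(","):
            found.append(mount_point)
    return found


# Hypothetical sample for illustration only.
SAMPLE = """\
/dev/sda1 / xfs rw,relatime 0 0
/dev/sdb1 /data ext4 ro,relatime 0 0
/dev/loop0 /snap/core squashfs ro,nodev 0 0
"""

print(read_only_mounts(SAMPLE))
```

On a live host the input would come from ''open("/proc/mounts").read()''. A non-empty result for a filesystem that is expected to be writable is what should raise the alert.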
== Problems
=== 502 error
''/etc/nginx/nginx.conf''
upstream daphne {
    server unix:/var/run/tower/daphne.sock;
}
''/etc/nginx/nginx.conf''
location /websocket {
    # Pass request to the upstream alias
    proxy_pass http://daphne;
    # Require http version 1.1 to allow for upgrade requests
    proxy_http_version 1.1;
    # We want proxy_buffering off for proxying to websockets.
    proxy_buffering off;
    # http://en.wikipedia.org/wiki/X-Forwarded-For
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    # enable this if you use HTTPS:
    proxy_set_header X-Forwarded-Proto https;
    # pass the Host: header from the client for the sake of redirects
    proxy_set_header Host $http_host;
    # We've set the Host header, so we don't need Nginx to muddle
    # about with redirects
    proxy_redirect off;
    # Depending on the request value, set the Upgrade and
    # connection headers
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;
}
''/var/log/nginx/access.log''
192.168.6.57 - - [16/Oct/2023:09:03:31 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.213.56 - - [16/Oct/2023:09:03:32 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"
192.168.6.57 - - [16/Oct/2023:09:03:36 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.6.57 - - [16/Oct/2023:09:03:41 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.213.56 - - [16/Oct/2023:09:03:42 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"
192.168.6.57 - - [16/Oct/2023:09:03:46 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.6.57 - - [16/Oct/2023:09:03:51 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"
192.168.213.56 - - [16/Oct/2023:09:03:52 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"
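To see which clients and paths are affected, the access log can be summarized by status code. The sketch below assumes the log line layout shown above; the function name and the regular expression are illustrative.

```python
import re
from collections import Counter

# client - user [time] "METHOD path protocol" status ...
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def count_status(lines, status="502"):
    """Count requests with the given status, keyed by (client IP, path)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("status") == status:
            counts[(match.group("ip"), match.group("path"))] += 1
    return counts


# Lines copied from the log excerpt above.
SAMPLE = [
    '192.168.213.56 - - [16/Oct/2023:09:03:32 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"',
    '192.168.6.57 - - [16/Oct/2023:09:03:36 +0000] "GET /websocket/ HTTP/1.1" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" "-"',
    '192.168.213.56 - - [16/Oct/2023:09:03:42 +0000] "GET /websocket/broadcast/ HTTP/1.1" 502 150 "-" "Python/3.9 aiohttp/3.7.4" "-"',
]

for key, count in count_status(SAMPLE).most_common():
    print(key, count)
```

On the full file, ''count_status(open("/var/log/nginx/access.log"))'' gives the same breakdown.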
# fuser -v /var/run/tower/daphne.sock
USER PID ACCESS COMMAND
/run/tower/daphne.sock:
awx 452944 F.... python3
# ps -f -p 452944
UID PID PPID C STIME TTY TIME CMD
awx 452944 1641 0 08:20 ? 00:01:13 python3 /var/lib/awx/venv/awx/bin/daphne -u /var/run/tower/daphne.sock awx.asgi:channel_layer
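The ''fuser'' output shows that a process holds the socket, but not that it accepts connections. A minimal probe is sketched below; the function name and the timeout are assumptions.

```python
import socket


def unix_socket_accepts(path, timeout=2.0):
    """Return True if a connection to the Unix socket at path succeeds."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        try:
            client.connect(path)
        except OSError:
            # Covers a missing file, permission errors, connection
            # refused and timeouts.
            return False
    return True
```

Running ''unix_socket_accepts("/var/run/tower/daphne.sock")'' as a user allowed to access the socket separates the case where nothing is listening from the case where the backend is listening but responds too slowly.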
=== Solution
The problem was caused by slow DNS resolution.
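To catch this earlier, the resolution time can be measured and compared to a threshold. This is a sketch: the one-second default threshold and the function names are assumptions, and the ''resolver'' parameter exists only so the timing logic can be exercised without a network.

```python
import socket
import time


def time_dns_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (resolved, elapsed seconds) for one lookup of hostname."""
    start = time.perf_counter()
    try:
        resolver(hostname, None)
        resolved = True
    except socket.gaierror:
        resolved = False
    return resolved, time.perf_counter() - start


def is_slow(elapsed_seconds, threshold_seconds=1.0):
    """True when a lookup took longer than the assumed threshold."""
    return elapsed_seconds > threshold_seconds
```

For example, ''time_dns_lookup(socket.getfqdn())'' run on each node gives a value to graph or alert on. Repeated lookups can be answered from a local cache, so a single fast result does not prove the resolver is healthy.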