The commands are provided for informational purposes only; it is recommended to try them out in a test environment first.
- List all containers and their statuses
docker ps -a --format "table {{.Names}}\t{{.Status}}"
- Display containers that are NOT in "Up" status (if all are running successfully, only the table header row is printed).
docker ps -a --format "table {{.Names}}\t{{.Status}}\t{{.State}}" | grep -v "Up"
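Filtering on the text "Up" keeps the table header and also hides any container whose name happens to contain "Up". A minimal sketch of an alternative that compares the State column exactly; not_running is a hypothetical helper name, and it assumes a Docker version that supports the {{.State}} placeholder:

```shell
# Print "name state" for every container whose state is not exactly "running".
# Input: lines of "<name> <state>" on stdin.
# Assumption: not_running is a hypothetical helper, not part of Docker.
not_running() {
  awk '$2 != "running" { print $1, $2 }'
}

# Usage:
#   docker ps -a --format '{{.Names}} {{.State}}' | not_running
```

With this variant, empty output really does mean that every container is running.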
- Container resource usage (CPU, RAM, network I/O, block I/O)
watch docker stats --no-stream
- Display container logs for a recent time period
(e.g., for the proxy, haproxy, backend-api, postgres, minio, kafka, rabbitmq containers, and others)
Last hour
docker logs --since 1h backend-api
Last 10 minutes
docker logs --since 10m backend-api
- View container logs in real time starting from the current moment
docker logs -fn 0 postgres
- Output logs between two timestamps
docker logs --since "2024-07-26T11:30:00" --until "2024-07-26T12:30:00" postgres
- Docker container disk usage (OverlayFS), show top 5 largest
docker ps -as --format "{{.Names}}: {{.Size}}" | sort -hr -k2 | head -n 5
- Save logs from the last hour:
Create a directory to store logs
mkdir logs-"$(hostname -s)-$(date +%Y-%m-%d)"
Navigate into the created directory using cd and run the command to generate a log file for each container
docker ps -a --format '{{.Names}}' | xargs -I {} sh -c 'docker logs --since "1h" {} > {}.log 2>&1'
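The same steps can be sketched as a single function that loops over container names instead of substituting them into an sh -c string. save_recent_logs and its two optional arguments are hypothetical names introduced for illustration:

```shell
# Save recent logs of every container into one directory and print its name.
# $1: value passed to "docker logs --since" (default: 1h)
# $2: target directory (default: logs-<short hostname>-<date>, as above)
# Assumption: save_recent_logs is a hypothetical helper, not part of Docker.
save_recent_logs() {
  since="${1:-1h}"
  dir="${2:-logs-$(hostname -s)-$(date +%Y-%m-%d)}"
  mkdir -p "$dir" || return 1
  docker ps -a --format '{{.Names}}' | while IFS= read -r name; do
    # stdout and stderr both go to the file: containers may log to either
    docker logs --since "$since" "$name" > "$dir/$name.log" 2>&1
  done
  printf '%s\n' "$dir"
}

# Usage:
#   save_recent_logs 1h
```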
- Save all available logs for a container
docker logs -t indication-calculator > logs.txt 2>&1
Server Computing Resources (CPU/RAM)
- Interactive monitoring of processes and load
htop
- RAM and SWAP size and utilization
free -hw
- Information about the processor and number of cores
lscpu
- Information about server block devices (hard drives, storage devices, partitions, logical volumes)
lsblk
- Display filesystem usage and type information
df -Th
- Size of specific directories
du -sh /data/postgres/pgdata/pgroot/data/pg_wal/
du -sh /data/kafka*
- Size of all files and directories in the current directory (./), sorted in descending order
du -sh * | sort -hr
- Analyze disk space usage interactively with ncdu (the -r flag enables read-only mode, so critical data cannot be deleted by accident)
ncdu -r /
- Docker container disk usage (OverlayFS), show top 5 largest
docker ps -as --format "{{.Names}}: {{.Size}}" | sort -hr -k2 | head -n 5
- Extended per-second I/O statistics for disks
iostat -x 1
(Install with: sudo apt install sysstat or sudo dnf install sysstat)
w_await (Write Await): average write wait time in milliseconds
r_await (Read Await): average read wait time in milliseconds
%iowait (I/O Wait): percentage of time the CPU waits for I/O operations to complete (critical if > 15%)
%util (Utilization): percentage of time the device was busy (critical if > 80%)
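The %util threshold above can also be checked automatically. A rough sketch, assuming busy_devices as a hypothetical helper name; it looks up the %util column in the header row because its position differs between sysstat versions:

```shell
# Read "iostat -x" output on stdin and print devices whose %util exceeds a limit.
# $1: threshold in percent (default: 80, the "critical" value quoted above)
# Assumption: busy_devices is a hypothetical helper, not part of sysstat.
busy_devices() {
  awk -v limit="${1:-80}" '
    # Header row: remember which column holds %util
    $1 ~ /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
    # Data rows: print device name and %util when above the limit
    col && NF >= col && $col + 0 > limit { print $1, $col }
  '
}

# Usage (three one-second samples; LC_ALL=C keeps the decimal point a dot):
#   LC_ALL=C iostat -x 1 3 | busy_devices 80
```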
- Monitor processes performing I/O
sudo iotop -o -d 1 -a -k
(Install with: sudo apt install -y iotop or sudo dnf install -y iotop)
- Replication slot lag in the DB cluster
docker exec -i postgres psql -U postgres -c "SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS replicationSlotLag, active, database, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;"
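For reference, an LSN such as 16/B374D848 is a 64-bit byte position in the WAL written as two hexadecimal halves, and pg_wal_lsn_diff returns the distance between two such positions in bytes. The arithmetic can be sketched in shell; lsn_to_bytes and lsn_diff are hypothetical helper names, not PostgreSQL functions:

```shell
# Convert an LSN of the form HI/LO (both hexadecimal) to a byte position.
# Assumption: lsn_to_bytes and lsn_diff are hypothetical helpers for illustration.
lsn_to_bytes() {
  hi=${1%%/*}   # part before the slash: upper 32 bits
  lo=${1##*/}   # part after the slash: lower 32 bits
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Byte distance between two LSNs (first minus second).
lsn_diff() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Usage:
#   lsn_diff 0/3000060 0/3000000   # prints 96
```

A slot whose restart_lsn falls far behind the current LSN forces the server to retain that many bytes of WAL, which is why the lag is worth watching together with the size of pg_wal.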
- Size of pg_wal (database write-ahead log)
docker exec -i $(docker ps -aq --filter name=postgres --filter expose=5432) psql -U postgres -c "select pg_size_pretty(sum(size)) as \"SIZE_WAL\" from pg_ls_waldir();"
- Check size of specific databases in the cluster
docker exec -it $(docker ps -aq --filter name=postgres --filter expose=5432) psql -U postgres -c "\\l+"
- Diagnose table sizes in the DB (top 10 largest tables):
SELECT schemaname || '.' || relname AS table_name, pg_size_pretty(pg_total_relation_size(relid)) AS total_size, pg_total_relation_size(relid) AS bytes FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC LIMIT 10;
- Check server response and certificate by domain name from outside:
curl -v https://stage001-rusakov.simpleone.ru/
- Show proxy container volumes (where certificates are stored on the server and in the container)
docker inspect proxy | jq '.[].Mounts'
- Certificate details (domain, validity dates); run from the directory containing the certificates (the proxy container's bind mount):
openssl x509 -in public.crt -text -noout | grep -E "Subject:|Not Before|Not After"
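- Which certificate is currently being served by proxy on the server (stdin is redirected from /dev/null so that s_client exits instead of waiting for input):
openssl s_client -connect localhost:443 -servername localhost </dev/null 2>/dev/null | openssl x509 -noout -dates -subject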
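To turn the expiration date into a yes/no check, openssl x509 offers the -checkend option, which takes a number of seconds. A minimal sketch; cert_expires_within is a hypothetical helper name, and public.crt is the file name used above:

```shell
# Exit status 0 if the certificate file expires within the given number of days,
# 1 if it stays valid longer, 2 if the file cannot be read.
# $1: path to a PEM certificate, $2: number of days (default: 30)
# Assumption: cert_expires_within is a hypothetical helper, not part of OpenSSL.
cert_expires_within() {
  [ -r "$1" ] || return 2
  seconds=$(( ${2:-30} * 86400 ))
  # -checkend exits non-zero when the certificate WILL expire in that window
  if openssl x509 -in "$1" -noout -checkend "$seconds" >/dev/null 2>&1; then
    return 1
  fi
  return 0
}

# Usage:
#   cert_expires_within public.crt 30 && echo "renew soon"
```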
- Check internet connectivity from the server
ping 1.1.1.1
- Server network interfaces and routes
ip a
ip r
- Public IP address the server uses to reach the internet (also confirms that domain names resolve from the server)
curl 2ip.ru
- Configuration files that determine how domain names are resolved on the server
cat /etc/resolv.conf
cat /etc/hosts
- Check packet loss and availability from outside, ping by domain
ping stage001-rusakov.simpleone.ru
- Network trace by domain
traceroute stage001-rusakov.simpleone.ru
mtr stage001-rusakov.simpleone.ru
- Resolve the domain externally to see the IP address of the server or of the load balancer in front of it
host stage001-rusakov.simpleone.ru
nslookup stage001-rusakov.simpleone.ru
* We would greatly appreciate it if you shared your own tools for high-quality server diagnostics or gave feedback on the commands presented above!