Kubernetes POD vs VM

May 16, 2021

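As a Kubernetes beginner on the operations side, I've written up the differences I noticed between a Kubernetes POD and a VM. A POD feels like a VM or an OS, yet isn't quite either. The right answer is probably that it is a single application process.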

Access: ssh vs exec

You "connect" (sort of) to a Kubernetes POD with exec, not ssh.

Current PODs:

[spkr@erdia22 99. ETC (spkn02:nginx)]$ kgp    # alias for: k get pod -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
cent-tools-75795cfbbb-49zjm    1/1     Running   0          18h   10.233.90.70   node1   <none>           <none>
nginx-hello-6c74c6f84d-65wl2   1/1     Running   0          19h   10.233.92.47   node3   <none>           <none>
nginx-hello-6c74c6f84d-wjsz5   1/1     Running   0          19h   10.233.96.64   node2   <none>           <none>

Let's get into a POD.

[spkr@erdia22 ~ (spkn02:nginx)]$ k exec -it cent-tools-75795cfbbb-49zjm -- bash
[root@cent-tools-75795cfbbb-49zjm /]# ps aux
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.0  0.0   4360   360 ?        Ss   05:07   0:00 sleep inf
root         33  2.0  0.0  14448  3104 pts/0    Ss   23:56   0:00 bash
root         56  0.0  0.0  53340  1872 pts/0    R+   23:56   0:00 ps aux

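POD access does not go through an sshd daemon. As shown above, it runs (execs) a bash process inside the remote POD. (The command is literally exec, as in execute, followed by bash.) Because you land in a bash shell, the experience is much the same as an ssh session.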

The NGINX POD cannot run bash.

[spkr@erdia22 99. ETC (spkn02:nginx)]$ k exec -it nginx-hello-6c74c6f84d-65wl2 -- bash
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "d91befe2dd1dc0e08d657408b00859608d7c8ac4ee03b426a759c9abf106db1d": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"bash\": executable file not found in $PATH": unknown

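The error is not about a running process: the bash executable does not exist in this image (it is not found in $PATH). The image ships only sh, so for nginx you have to exec sh instead of bash.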

[spkr@erdia22 99. ETC (spkn02:nginx)]$ k exec -it nginx-hello-6c74c6f84d-65wl2 -- sh
/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 nginx: master process nginx -g daemon off;
    6 nginx      0:00 nginx: worker process
    7 root       0:00 sh
   20 root       0:00 sh
   26 root       0:00 ps aux

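Interestingly, two sh processes are running. This is unrelated to nginx's master and worker processes: each exec starts its own shell, so the extra sh (PID 7) is most likely left over from an earlier exec session, and PID 20 is the current one.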
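Rather than guessing which shell an image ships, you can probe for one. The sketch below is only an illustration, not part of the original session: pick_shell is a hypothetical helper name, and <pod-name> in the comment is a placeholder.

```shell
# pick_shell (hypothetical helper): print the first candidate that exists on PATH.
# Returns non-zero when none of the candidates is installed.
pick_shell() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# The same idea as a one-liner against a POD (needs a cluster; placeholder name):
#   kubectl exec -it <pod-name> -- sh -c 'command -v bash >/dev/null 2>&1 && exec bash || exec sh'
```

The one-liner starts sh first, which practically every image has, and only switches to bash when the image actually contains it.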

A VM, of course, is accessed over ssh.

[spkr@erdia22 tmp (spkn02:nginx)]$ ssh node1
Last login: Thu Apr 22 02:54:03 2021 from 172.17.18.2
[spkr@node1 ~]$ ps aux|grep ssh
root       1091  0.0  0.0 112936  4368 ?        Ss   Apr20   0:00 /usr/sbin/sshd -D
root     127133  0.0  0.0 161036  5792 ?        Ss   08:53   0:00 sshd: spkr [priv]
spkr     127137  0.0  0.0 161036  2536 ?        S    08:53   0:00 sshd: spkr@pts/0
spkr     133229  0.0  0.0 112816   968 pts/0    S+   08:58   0:00 grep --color=auto ssh

Naturally, the ssh daemon is running.

As an aside, I have actually seen a request to run an ssh daemon in a POD purely so that people could log in to it. Unless there is a very special reason, there is no need to install or run sshd in a POD.

Basic commands

Now let's run a few commands in a POD, the same handful you run by reflex whenever you log in to a server.

[spkr@erdia22 ~ (spkcluster:default)]$ k exec -it nginx-f89759699-zk9jh -- sh
# ping www.google.com
sh: 1: ping: not found

Absurdly, even ping doesn't work.

# ifconfig
sh: 2: ifconfig: not found
# ip a show
sh: 3: ip: not found
# netstat -lntp
sh: 4: netstat: not found
# free -h
sh: 5: free: not found
# uptime
sh: 6: uptime: not found
# w
sh: 7: w: not found
# top
sh: 8: top: not found

It makes you wonder what does work.

Fortunately, the POD called cent-tools is in somewhat better shape. (As an aside, I often use a cent-tools or busybox POD to troubleshoot problems in other PODs.)

[spkr@erdia22 ~ (spkcluster:default)]$ k exec -it cent-tools-697fc7d5fc-mzv6n -- bash
[root@cent-tools-697fc7d5fc-mzv6n /]# ping www.google.com
ping: socket: Operation not permitted
[root@cent-tools-697fc7d5fc-mzv6n /]# nc -vz www.google.com 443
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 216.58.220.100:443.
Ncat: 0 bytes sent, 0 bytes received in 0.10 seconds.
[root@cent-tools-697fc7d5fc-mzv6n /]# ip a show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: mgmt0@if127: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 52:ed:b1:2a:d2:08 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.20.0.43/24 scope global mgmt0
       valid_lft forever preferred_lft forever
4: eth0@enp129s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1496 qdisc noqueue state UP group default qlen 1000
    link/ether 8e:a2:00:28:60:0e brd ff:ff:ff:ff:ff:ff
    inet 10.10.100.30/24 scope global eth0
       valid_lft forever preferred_lft forever
41: enp129s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 8e:a2:00:28:60:0e brd ff:ff:ff:ff:ff:ff
[root@cent-tools-697fc7d5fc-mzv6n /]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
[root@cent-tools-697fc7d5fc-mzv6n /]# free -h
              total        used        free      shared  buff/cache   available
Mem:           125G         25G         57G         26M         42G         90G
Swap:           63G         22M         63G
[root@cent-tools-697fc7d5fc-mzv6n /]# w
 01:11:11 up 3 days, 21:10,  0 users,  load average: 2.80, 3.16, 3.11
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
[root@cent-tools-697fc7d5fc-mzv6n /]# top
top - 01:11:18 up 3 days, 21:10,  0 users,  load average: 2.82, 3.16, 3.11
Tasks:   5 total,   1 running,   4 sleeping,   0 stopped,   0 zombie
%Cpu(s): 10.8 us,  4.3 sy,  0.0 ni, 84.6 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
KiB Mem : 13173772+total, 60208320 free, 26475372 used, 45054036 buff/cache
KiB Swap: 67108860 total, 67086076 free,    22784 used. 94828840 avail Mem

As you can see, the commands available differ from POD to POD.

[spkr@erdia22 ~ (spkcluster:default)]$ k exec -it nginx-f89759699-zk9jh -- sh
# apt-get update
Get:1 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
# apt-get -y install netcat
Reading package lists... Done
# nc -vz www.google.com 443
DNS fwd/rev mismatch: www.google.com != nrt20s19-in-f4.1e100.net
www.google.com [172.217.175.36] 443 (?) open

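Of course, you can install whatever packages you need in the nginx POD. However, this is not recommended for reasons of manageability, startup speed, and security. Troubleshoot a POD from another POD, from its logs, and so on.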
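When installing packages is off the table, some of what uptime and free report can still be read straight from /proc with nothing but shell builtins. This is a minimal sketch that assumes a Linux container; the function names are hypothetical, and the optional file argument exists only so the helpers can be tried against sample files. As with free, the values are the node's, not the POD's own limits.

```shell
# uptime_seconds (hypothetical helper): whole seconds since boot.
# /proc/uptime holds two fields, e.g. "335412.51 1200.00"; only the first is used.
uptime_seconds() {
  read -r up _idle < "${1:-/proc/uptime}"
  printf '%s\n' "${up%%.*}"
}

# meminfo_kb (hypothetical helper): value in kB of one /proc/meminfo field,
# for example MemTotal or MemFree. Returns non-zero if the field is absent.
meminfo_kb() {
  field="$1"
  while read -r key value _unit; do
    if [ "$key" = "$field:" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done < "${2:-/proc/meminfo}"
  return 1
}
```

Pasted into an sh session inside the POD, `uptime_seconds` and `meminfo_kb MemTotal` work even where uptime and free are "not found".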

OS and Kernel

A POD and its host VM can run different OS userlands, but they share the kernel.

POD

[spkr@erdia22 99. ETC (spkn02:nginx)]$ k exec -it nginx-hello-6c74c6f84d-65wl2 -- sh
/ # cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.5.2
PRETTY_NAME="Alpine Linux v3.5"
HOME_URL="http://alpinelinux.org"
BUG_REPORT_URL="http://bugs.alpinelinux.org"
/ # uname -r
3.10.0-1062.12.1.el7.x86_64

VM

[spkr@node1 ~]$ cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
[spkr@node1 ~]$ uname -r
3.10.0-1160.15.2.el7.x86_64

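As shown above, the POD uses Alpine Linux, unlike the host OS, while the kernel is the host's 3.10.0 kernel. (The patch level differs from node1's presumably because this POD runs on node3; a POD always reports the kernel of the node it runs on.) For reference, minimal base images are tiny: the busybox image below is only 1.24MB.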

[spkr@erdia22 ~ (dz-saas2:default)]$ docker images
REPOSITORY                            TAG       IMAGE ID       CREATED        SIZE
busybox                               latest    c55b0f125dc6   9 days ago     1.24MB
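The "different OS, shared kernel" point can be checked from any shell by printing the two facts side by side. This is a rough sketch: os_id and kernel_release are hypothetical helper names, and the optional file argument is only there for trying the parser on sample data.

```shell
# os_id (hypothetical helper): print the ID= value from an os-release file.
# This describes the userland, which comes from the container image.
os_id() {
  while IFS='=' read -r key value; do
    if [ "$key" = "ID" ]; then
      value="${value#\"}"   # strip an optional leading quote
      value="${value%\"}"   # strip an optional trailing quote
      printf '%s\n' "$value"
      return 0
    fi
  done < "${1:-/etc/os-release}"
  return 1
}

# kernel_release (hypothetical helper): the running kernel, which a container
# shares with its host node.
kernel_release() {
  uname -r
}
```

Run in the POD and then on the node it is scheduled on, `os_id` should differ (alpine versus centos in this article) while `kernel_release` should match.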

Process List

Looking at the list of running processes, the difference is enormous.

POD

[spkr@erdia22 99. ETC (spkn02:nginx)]$ k exec -it nginx-hello-6c74c6f84d-65wl2 -- sh
/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 nginx: master process nginx -g daemon off;
    6 nginx      0:00 nginx: worker process
    7 root       0:00 sh
   20 root       0:00 sh
   26 root       0:00 ps aux

VM

[spkr@node1 ~]$ ps aux
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          1  0.1  0.0 194772  7964 ?        Ss   Apr20  53:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root          2  0.0  0.0      0     0 ?        S    Apr20   0:03 [kthreadd]
root          4  0.0  0.0      0     0 ?        S<   Apr20   0:00 [kworker/0:0H]
root          6  0.0  0.0      0     0 ?        S    Apr20   6:34 [ksoftirqd/0]
root          7  0.0  0.0      0     0 ?        S    Apr20   1:22 [migration/0]
root          8  0.0  0.0      0     0 ?        S    Apr20   0:00 [rcu_bh]
root          9  0.1  0.0      0     0 ?        S    Apr20  38:32 [rcu_sched]
root         10  0.0  0.0      0     0 ?        S<   Apr20   0:00 [lru-add-drain]
root         11  0.0  0.0      0     0 ?        S    Apr20   0:09 [watchdog/0]
root         12  0.0  0.0      0     0 ?        S    Apr20   0:07 [watchdog/1]
root         13  0.0  0.0      0     0 ?        S    Apr20   1:23 [migration/1]
root         14  0.0  0.0      0     0 ?        S    Apr20   2:21 [ksoftirqd/1]
root         16  0.0  0.0      0     0 ?        S<   Apr20   0:00 [kworker/1:0H]
root         17  0.0  0.0      0     0 ?        S    Apr20   0:08 [watchdog/2]
(rest omitted)

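A POD runs only the one application it was built for (here, the nginx master and its worker); nothing else runs alongside it. Think of it as a minimal OS install with only the bare-minimum application on top. (Container images are also commonly built on a minimal base such as Alpine Linux rather than a full CentOS, for security and to save resources.)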

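As the figure above shows, a POD (container) carries only a lightweight userland with just the libraries and binaries the application needs; unlike a VM, it has no guest kernel of its own. Checking with 'ps aux' makes the difference plain: a single application versus dozens of processes.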
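The same comparison can be made without ps by looking at /proc directly. The following is a sketch under the assumption of a Linux /proc layout; pid1_name and process_count are hypothetical names, and the optional path arguments are only for experimenting with a sample directory.

```shell
# pid1_name (hypothetical helper): name of PID 1.
# In a POD this is the application itself (nginx, sleep, ...);
# on a VM it is the init system (systemd on the nodes shown here).
pid1_name() {
  read -r name < "${1:-/proc/1/comm}"
  printf '%s\n' "$name"
}

# process_count (hypothetical helper): number of processes visible in this
# PID namespace, counted as the numeric directories under /proc.
process_count() {
  count=0
  for dir in "${1:-/proc}"/[0-9]*; do
    if [ -d "$dir" ]; then
      count=$((count + 1))
    fi
  done
  printf '%s\n' "$count"
}
```

Inside the nginx POD above, `process_count` would come out in the single digits, while on the node it runs well past what fits on one screen.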

Hardware

Let's look at hardware-related information: filesystem, CPU, memory, and so on.

POD

[spkr@erdia22 ~ (spkcluster:default)]$ k exec -it cent-tools-697fc7d5fc-mzv6n -- bash
[root@cent-tools-697fc7d5fc-mzv6n /]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
overlay                          447G  9.0G  438G   3% /
tmpfs                             64M     0   64M   0% /dev
tmpfs                             63G     0   63G   0% /sys/fs/cgroup
shm                               64M     0   64M   0% /dev/shm
tmpfs                             63G   12M   63G   1% /run/secrets
/dev/mapper/centos_diamanti-var   64G  8.7G   56G  14% /etc/hosts
tmpfs                             63G   12K   63G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                             63G     0   63G   0% /proc/acpi
tmpfs                             63G     0   63G   0% /proc/scsi
tmpfs                             63G     0   63G   0% /sys/firmware

Host Node

[diamanti@dia01 ~]$ df -h
Filesystem                                 Size  Used Avail Use% Mounted on
/dev/mapper/centos_diamanti-root            40G   15G   26G  36% /
(omitted)
/dev/mapper/centos_diamanti-var_log_audit  4.0G   74M  4.0G   2% /var/log/audit
/dev/mapper/crio--vg-crio--lv              447G  6.4G  441G   2% /var/lib/containers
/dev/mapper/centos_diamanti-var_tmp        4.0G  158M  3.9G   4% /var/tmp
[root@dia01 overlay]# ls /var/lib/containers/storage/
libpod  mounts  overlay  overlay-containers  overlay-images  overlay-layers  storage.lock  tmp  userns.lock

As shown above, the POD's root filesystem (overlay, 447G) lives on the host node's /dev/mapper/crio--vg-crio--lv filesystem. I am currently using CRI-O as the container runtime, and you can see that the POD-related directory on the host node is /var/lib/containers.

[diamanti@hci1-dzn1 ~]$ df -h
Filesystem                                 Size  Used Avail Use% Mounted on
/dev/mapper/centos_diamanti-root            40G   12G   29G  28% /
(omitted)
/dev/mapper/docker--vg-docker--lv          447G  156G  292G  35% /var/lib/docker
[root@hci1-dzn1 overlay2]# ls /var/lib/docker
containers  image  network  overlay2  plugins  swarm  tmp  trust  volumes

Many of you probably use Docker; as shown above, Docker and CRI-O keep their data in different directories.
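A quick way to tell a container's root filesystem from a VM's is its type: overlay in the POD output above, a regular device-mapper volume on the node. Here is a small sketch that parses /proc/mounts; fs_type is a hypothetical helper, and the second argument is only for trying it on a sample file.

```shell
# fs_type (hypothetical helper): filesystem type of a mount point, parsed from
# /proc/mounts (fields: device, mount point, type, options, ...).
# When a mount point is listed more than once, the last entry wins.
fs_type() {
  target="$1"
  result=""
  while read -r _dev mnt fstype _rest; do
    if [ "$mnt" = "$target" ]; then
      result="$fstype"
    fi
  done < "${2:-/proc/mounts}"
  if [ -z "$result" ]; then
    return 1
  fi
  printf '%s\n' "$result"
}
```

In a POD, `fs_type /` would be expected to print overlay, matching the first line of the df -h output.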

Now let's check memory and CPU information. The output is in POD, then VM order, and as you can see they are (nearly) identical.

[root@cent-tools-697fc7d5fc-mzv6n /]# free -h
              total        used        free      shared  buff/cache   available
Mem:           125G        8.5G         89G         24M         27G        107G
Swap:           63G          0B         63G
[root@dia01 overlay]# free -h
              total        used        free      shared  buff/cache   available
Mem:           125G        7.8G         91G         25M         26G        108G
Swap:           63G          0B         63G
[root@cent-tools-697fc7d5fc-mzv6n /]# cat /proc/cpuinfo
(omitted)
processor       : 39
vendor_id       : GenuineIntel
cpu family      : 6
model           : 79
model name      : Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
[root@dia01 overlay]# cat /proc/cpuinfo
(omitted)
processor       : 39
vendor_id       : GenuineIntel
cpu family      : 6
model           : 79
model name      : Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz

Because a POD shares the node's kernel, it sees the resource totals of the whole node. That obviously does not mean it may use all of them. Setting resources.limits in the manifest YAML file caps what the POD can actually use.
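The original manifest is not reproduced in this text, so the following is only a sketch of what such a setting could look like, using the values discussed here (cpu 2, memory 4Gi). The POD name, container name, and image are illustrative placeholders, and write_limited_pod is a hypothetical helper.

```shell
# write_limited_pod (hypothetical helper): write a minimal Pod manifest with
# resource limits to the given path. Names and image are placeholders.
write_limited_pod() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo
spec:
  containers:
    - name: app
      image: nginx
      resources:
        limits:
          cpu: "2"
          memory: 4Gi
EOF
}

# Usage (needs a cluster):
#   write_limited_pod pod.yaml && kubectl apply -f pod.yaml
```

When only limits are given, Kubernetes uses the same values as the requests, so this single stanza is enough for a first experiment.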

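With resources.limits set to cpu 2 and memory 4Gi, the POD can use up to 2 cores and 4Gi of memory. If a container exceeds its memory limit, it is terminated (OOMKilled) and restarted; CPU usage above the limit is throttled rather than killed.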

Pets vs Cattle

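You have probably come across the Pets vs Cattle analogy at some point. Unlike a traditional VM, a POD is not managed in detail (high touch). Its hostname is generated randomly, and so is its IP. Above all, when a POD goes down, it is not revived; a new POD is started to replace it. (There is no funeral.)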

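It may be a bit of a leap, but as you have seen, a POD differs from a traditional VM in many ways. It is better not to plan on logging in and doing things the way you would on a VM. Just think of it as a single application process and check that it is running well. When a failure or other problem occurs, look at events, logs, and so on, and solve it by other means.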
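A first pass over events and logs can be wrapped in a small function. This is just one possible routine, not a prescribed procedure: pod_triage is a hypothetical name, and the KUBECTL variable is an assumption added so the commands can be previewed with KUBECTL=echo before running them against a cluster.

```shell
# pod_triage (hypothetical helper): collect the usual first-look information
# for one POD. Usage: pod_triage <pod-name> [namespace]
pod_triage() {
  pod="$1"
  ns="${2:-default}"
  kubectl_bin="${KUBECTL:-kubectl}"

  # State, restart count, and recent events for the POD itself.
  "$kubectl_bin" describe pod "$pod" --namespace "$ns"
  # Logs of the current container.
  "$kubectl_bin" logs "$pod" --namespace "$ns" --tail=50
  # Logs of the previous container, if it was restarted (fails harmlessly otherwise).
  "$kubectl_bin" logs "$pod" --namespace "$ns" --previous --tail=50 || true
  # Namespace-wide events, oldest first.
  "$kubectl_bin" get events --namespace "$ns" --sort-by=.lastTimestamp
}
```

None of this requires a shell inside the POD, which is the point: the POD is observed from outside rather than logged in to.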


Written by Jerry(이정훈)