
Docker - no outbound traffic / bridge only works in promiscuous mode

  • February 17, 2021

For the past week I have been trying to solve a very strange networking problem. In short: my containers cannot reach the internet unless I run tcpdump -i br-XXXXX (which puts the bridge into promiscuous mode).
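You can confirm whether tcpdump has flipped an interface into promiscuous mode by reading the flags word that the kernel exposes in sysfs; IFF_PROMISC is bit 0x100. A minimal sketch (the sysfs path uses the bridge name from this setup, and the flags value shown is a hypothetical example, not taken from this host):

```shell
# On a live host you would read the flags from sysfs, e.g.:
#   flags=$(cat /sys/class/net/br-d21cb7ba8ee4/flags)
# Hypothetical example value with IFF_PROMISC (0x100) set:
flags=0x1103
# Test bit 0x100 (IFF_PROMISC) of the interface flags word.
if [ $(( flags & 0x100 )) -ne 0 ]; then
  state=promiscuous
else
  state=normal
fi
echo "$state"
```

With the example value above the bit test reports `promiscuous`; once tcpdump exits, the kernel clears the flag again (unless something else also requested promiscuity).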

I have two containers brought up with Compose:

version: '3'
services:
 seafile:
   build: ./seafile/build
   container_name: seafile
   restart: always
   ports:
     - 8080:80
   networks:
     seafile_net:
       ipv4_address: 192.168.0.2
   volumes:
     - /mnt/gluster/files/redacted/data:/shared
   environment:
     - DB_HOST=10.200.7.100
     - DB_PASSWD=redacted
     - TIME_ZONE=America/Chicago
   depends_on:
     - seafile-memcached
 seafile-memcached:
   image: memcached:1.5.6
   container_name: seafile-memcached
   restart: always
   networks:
     seafile_net:
       ipv4_address: 192.168.0.3
   entrypoint: memcached -m 256
networks:
 seafile_net:
   driver: bridge
   ipam:
     driver: default
     config:
       - subnet: 192.168.0.0/24

The running containers:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
93b1b773ad4e        docker_seafile      "/sbin/my_init -- /s…"   2 minutes ago       Up 2 minutes        0.0.0.0:8080->80/tcp   seafile
1f6b124c3be4        memcached:1.5.6     "memcached -m 256"       2 minutes ago       Up 2 minutes        11211/tcp              seafile-memcached

Network info:

$ docker network ls
NETWORK ID          NAME                 DRIVER              SCOPE
f67b015c4b84        bridge               bridge              local
d21cb7ba8ee4        docker_seafile_net   bridge              local
d0eb86ca57fa        host                 host                local
01f03fcfa103        none                 null                local

$ docker inspect d21cb7ba8ee4
[
   {
       "Name": "docker_seafile_net",
       "Id": "d21cb7ba8ee4a477497a7d343ea1a5f9b109237dce878a40605a281e1a2db1e9",
       "Created": "2020-09-24T15:03:46.39761472-04:00",
       "Scope": "local",
       "Driver": "bridge",
       "EnableIPv6": false,
       "IPAM": {
           "Driver": "default",
           "Options": null,
           "Config": [
               {
                   "Subnet": "192.168.0.0/24"
               }
           ]
       },
       "Internal": false,
       "Attachable": true,
       "Ingress": false,
       "ConfigFrom": {
           "Network": ""
       },
       "ConfigOnly": false,
       "Containers": {
           "1f6b124c3be414040a6def3b3bc3e9f06e2af6a28afd6737823d1da65d5ab047": {
               "Name": "seafile-memcached",
               "EndpointID": "ab3e3c4aa216d158473fa3dde3f87e654422ffeca6ebb7626d072da10ba9a5cf",
               "MacAddress": "02:42:c0:a8:00:03",
               "IPv4Address": "192.168.0.3/24",
               "IPv6Address": ""
           },
           "93b1b773ad4e3685aa8ff2db2f342c617c42f1c5ab4ce693132c1238e73e705d": {
               "Name": "seafile",
               "EndpointID": "a895a417c22a4755df15b180d1c38b712c36047b01596c370815964a212f7105",
               "MacAddress": "02:42:c0:a8:00:02",
               "IPv4Address": "192.168.0.2/24",
               "IPv6Address": ""
           }
       },
       "Options": {},
       "Labels": {
           "com.docker.compose.network": "seafile_net",
           "com.docker.compose.project": "docker",
           "com.docker.compose.version": "1.27.4"
       }
   }
]

$ ip link show master br-d21cb7ba8ee4
18: veth8fd88c9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-d21cb7ba8ee4 state UP mode DEFAULT group default
   link/ether b6:37:9e:fd:9e:da brd ff:ff:ff:ff:ff:ff
20: vetheb84e16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-d21cb7ba8ee4 state UP mode DEFAULT group default
   link/ether ca:90:c8:a6:2e:9b brd ff:ff:ff:ff:ff:ff

Once the containers are up, they cannot reach the internet or any other resources on the host's network. The following curl command was run from inside one of the containers; on the host server itself the same command works fine:

root@93b1b773ad4e:/opt/seafile# curl -viLk http://1.1.1.1
* Rebuilt URL to: http://1.1.1.1/
*   Trying 1.1.1.1...
* TCP_NODELAY set
**hangs**

Here is a tcpdump of the bridge (run on the host) without putting it into promiscuous mode, captured while I was attempting the curl command above:

$ tcpdump --no-promiscuous-mode -lnni br-d21cb7ba8ee4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-d21cb7ba8ee4, link-type EN10MB (Ethernet), capture size 262144 bytes
14:15:42.447055 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:43.449058 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:45.448787 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:46.451049 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:47.453058 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:49.449789 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:15:50.451048 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28

But if I let tcpdump put the bridge into promiscuous mode, things start working:

$ tcpdump -lnni br-d21cb7ba8ee4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-d21cb7ba8ee4, link-type EN10MB (Ethernet), capture size 262144 bytes
14:16:05.457844 ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
14:16:05.457863 ARP, Reply 192.168.0.2 is-at 02:42:c0:a8:00:02, length 28
**traffic continues**

Docker info:

$ docker info
Client:
Debug Mode: false
Server:
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 6
Server Version: 19.03.13
Storage Driver: devicemapper
 Pool Name: docker-8:3-3801718-pool
 Pool Blocksize: 65.54kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Udev Sync Supported: true
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 1.695GB
 Data Space Total: 107.4GB
 Data Space Available: 80.76GB
 Metadata Space Used: 3.191MB
 Metadata Space Total: 2.147GB
 Metadata Space Available: 2.144GB
 Thin Pool Minimum Free Space: 10.74GB
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.164-RHEL7 (2019-08-27)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-123.9.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.704GiB
Name: redacted.novalocal
ID: redacted
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
        Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
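In hindsight, the two bridge-nf-call warnings in that output were the smoking gun: those sysctls control whether bridged traffic is passed through iptables, and they do not even exist until br_netfilter is available. On a healthy host they can be pinned persistently with a drop-in like the following (a sketch; the file name is hypothetical, and the keys only appear once the module is loaded):

```
# /etc/sysctl.d/99-bridge-nf.conf  (hypothetical file name)
# These keys are provided by br_netfilter; Docker expects both enabled.
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
```

Apply with `sysctl --system` (or reboot) after the module is loaded.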

Host info:

$ docker --version
Docker version 19.03.13, build 4484c46d9d

$ docker-compose --version
docker-compose version 1.27.4, build 40524192

$ cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)

$ getenforce
Disabled

$ free -h
             total        used        free      shared  buff/cache   available
Mem:           3.7G        1.1G        826M        109M        1.8G        2.2G
Swap:          1.0G        292M        731M

$ nproc
2

$ uptime
10:39:49 up 3 days, 19:56,  1 user,  load average: 0.00, 0.01, 0.05

$ iptables-save
# Generated by iptables-save v1.4.21 on Mon Sep 28 10:41:22 2020
*filter
:INPUT ACCEPT [17098775:29231856941]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [15623889:13475217196]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o br-d21cb7ba8ee4 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-d21cb7ba8ee4 -j DOCKER
-A FORWARD -i br-d21cb7ba8ee4 ! -o br-d21cb7ba8ee4 -j ACCEPT
-A FORWARD -i br-d21cb7ba8ee4 -o br-d21cb7ba8ee4 -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 192.168.0.2/32 ! -i br-d21cb7ba8ee4 -o br-d21cb7ba8ee4 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i br-d21cb7ba8ee4 ! -o br-d21cb7ba8ee4 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o br-d21cb7ba8ee4 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
# Completed on Mon Sep 28 10:41:22 2020
# Generated by iptables-save v1.4.21 on Mon Sep 28 10:41:22 2020
*nat
:PREROUTING ACCEPT [408634:24674574]
:INPUT ACCEPT [380413:22825327]
:OUTPUT ACCEPT [520596:31263683]
:POSTROUTING ACCEPT [711734:42731963]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 192.168.0.0/24 ! -o br-d21cb7ba8ee4 -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 192.168.0.2/32 -d 192.168.0.2/32 -p tcp -m tcp --dport 80 -j MASQUERADE
-A DOCKER -i br-d21cb7ba8ee4 -j RETURN
-A DOCKER -i docker0 -j RETURN
-A DOCKER ! -i br-d21cb7ba8ee4 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 192.168.0.2:80
COMMIT
# Completed on Mon Sep 28 10:41:22 2020

Thanks to @AB's comment, I found the solution.

I believe the main problem was that the br_netfilter module was not loaded:

$ lsmod | grep br_netfilter
$

On another CentOS 7 Docker host (which does not have this problem), the module is loaded:

$ lsmod | grep br_netfilter
br_netfilter           22256  0
bridge                146976  1 br_netfilter
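A quick way to classify a host is to check both lsmod and the kernel build config, since br_netfilter may be loaded as a module, compiled into the kernel, or simply absent. A sketch, assuming the usual /boot/config-$(uname -r) location for the build config:

```shell
# Classify br_netfilter availability on this host:
#   "loaded"        - visible in lsmod
#   "built-in"      - CONFIG_BRIDGE_NETFILTER=y in the kernel config
#   "not available" - neither (as on the broken host in this post)
# Note: CONFIG_BRIDGE_NETFILTER=m means it exists as a loadable module;
# this sketch only distinguishes the three cases above.
if lsmod 2>/dev/null | grep -q '^br_netfilter'; then
  status="loaded"
elif grep -qs 'CONFIG_BRIDGE_NETFILTER=y' "/boot/config-$(uname -r)"; then
  status="built-in"
else
  status="not available"
fi
echo "br_netfilter: $status"
```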

Manually loading the module did not work for me:

$ modprobe br_netfilter
modprobe: FATAL: Module br_netfilter not found.

I read that br_netfilter was part of the bridge module itself until kernel version 3.18, when it was split out into a separate module.

I then discovered that we exclude the kernel from updates (I did not set up this server, so this was news to me):

$ grep exclude /etc/yum.conf
exclude=kernel*

Because of this exclusion, my earlier yum updates never touched the kernel. I believe the br_netfilter split was never backported to the kernel build we were running.

After running an update without the kernel exclusion (yum --disableexcludes=all update kernel) and rebooting, everything started working!
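On kernels where br_netfilter is a separate module, it is also worth making sure it gets loaded on every boot, or bridge networking can silently regress after the next reboot. A sketch using systemd's modules-load mechanism (the file name is hypothetical):

```
# /etc/modules-load.d/br_netfilter.conf  (hypothetical file name)
br_netfilter
```

After the next reboot, `lsmod | grep br_netfilter` should show the module loaded without any manual modprobe.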

The kernel update took me from 3.10.0-123.9.2.el7.x86_64 to 3.10.0-1127.19.1.el7.

Quoted from: https://serverfault.com/questions/1035608