Firewall

Azure Windows 虛擬機上的 Oracle 數據庫的連接問題 (ORA-03135)

  • September 15, 2020

我們在 Azure Windows VM 上執行 Oracle 19c 數據庫。我們的配置如下:

  • Azure Windows 虛擬機

    • 標準 B4ms(4 vcpus,16 GiB 記憶體)
    • “空閒超時”為 30 分鐘(允許的最大值)的公共 IP 地址。
    • 甲骨文數據庫 19c

從 VM 執行時,不會斷開“空閒 SQL 會話”,但從公共 IP 地址訪問時,會在 30 分鐘後斷開“空閒 SQL 連接”。

ORA-03135: connection lost contact

我們希望我們的“空閒 SQL 連接”是持久的,這是我們迄今為止所嘗試的:

  1. 將 sqlnet.ora 文件編輯如下(客戶端和伺服器):
SQLNET.EXPIRE_TIME = 1000000
SQLNET.INBOUND_CONNECT_TIMEOUT = 1000000
  1. 編輯 tnsnames.ora 文件如下(客戶端和伺服器):
(ENABLE=BROKEN)

目前我正在嘗試實現 Oracle keep-alive 包,但我沒有成功。

有什麼方向或想法嗎?謝謝

$$ UPDATE $$ 我找到了一個解決方案,這就是我為了解決它所做的:

  1. 將“sqlnet.ora”恢復正常(客戶端和伺服器)
  2. 在“tnsnames.ora”文件上配置(ENABLE=BROKEN)(僅在客戶端上)
  3. 使用以下參數在 Regedit 上設置 TCP KeepAlive(僅在客戶端)(https://docs.microsoft.com/en-us/sql/connect/jdbc/connecting-to-an-azure-sql-database?view =sql-server-ver15 )
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"KeepAliveTime"=dword:00007530
"KeepAliveInterval"=dword:000003e8
"TcpMaxDataRetransmissions"=dword:00000010
  1. 重啟客戶端

您需要在客戶端機器上更改一兩個核心可調參數。使 TCP keepalive (ENABLE=BROKEN) 有用。預設情況下,Linux 核心在 2 小時後發送第一個 TCP keepalive 探測:

Linux核心:

# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
Delay before the 1st probe is sent - after 2 hours !!!

# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
Interval between probes.

# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
The number of "lost" probes indicating that the connection is dead.

當將 keepavlive 時間和 keepalive 間隔設置為 5 秒(僅用於測試。30 秒對於 prod 來說應該綽綽有餘),那麼您應該在 TCP 轉儲中看到類似這樣的內容:

[~]# tcpdump -pni ens224 -v "tcp port 1521 and ( tcp[tcpflags] & tcp-ack != 0 and ( (ip[2:2] - ((ip[0]&0xf)<<2) ) - ((tcp[12]&0xf0)>>2) ) == 0 )"
tcpdump: listening on ens224, link-type EN10MB (Ethernet), capture size 262144 bytes
14:56:49.994675 IP (tos 0x0, ttl 64, id 12096, offset 0, flags [DF], proto TCP (6), length 52)
   192.168.8.200.37196 > 192.168.8.212.ncube-lm: Flags [.], cksum 0x9313 (incorrect -> 0x4b94), ack 2263270845, win 359, options [nop,nop,TS val 78140944 ecr 7133273], length 0
14:56:49.994829 IP (tos 0x0, ttl 64, id 55327, offset 0, flags [DF], proto TCP (6), length 52)
   192.168.8.212.ncube-lm > 192.168.8.200.37196: Flags [.], cksum 0xadcf (correct), ack 1, win 207, options [nop,nop,TS val 7138281 ecr 78110940], length 0
14:56:55.002668 IP (tos 0x0, ttl 64, id 12097, offset 0, flags [DF], proto TCP (6), length 52)
   192.168.8.200.37196 > 192.168.8.212.ncube-lm: Flags [.], cksum 0x9313 (incorrect -> 0x2474), ack 1, win 359, options [nop,nop,TS val 78145952 ecr 7138281], length 0
14:56:55.002809 IP (tos 0x0, ttl 64, id 55328, offset 0, flags [DF], proto TCP (6), length 52)
   192.168.8.212.ncube-lm > 192.168.8.200.37196: Flags [.], cksum 0x9a3f (correct), ack 1, win 207, options [nop,nop,TS val 7143289 ecr 78110940], length 0
14:57:00.010680 IP (tos 0x0, ttl 64, id 12098, offset 0, flags [DF], proto TCP (6), length 52)
   192.168.8.200.37196 > 192.168.8.212.ncube-lm: Flags [.], cksum 0x9313 (incorrect -> 0xfd53), ack 1, win 359, options [nop,nop,TS val 78150960 ecr 7143289], length 0
14:57:00.011296 IP (tos 0x0, ttl 64, id 55329, offset 0, flags [DF], proto TCP (6), length 52)
   192.168.8.212.ncube-lm > 192.168.8.200.37196: Flags [.], cksum 0x86af (correct), ack 1, win 207, options [nop,nop,TS val 7148297 ecr 78110940], length 0

6 packets captured
6 packets received by filter
0 packets dropped by kernel

引用自:https://serverfault.com/questions/1033844