前言
我們在處理數據庫的問題的時候,有時在數據庫端看不到異常的等待,但是應用卻報連接異常。
這種情況,有可能是在壓力還沒有傳遞到數據庫,在網絡這層已經發生問題了。如連接風暴,此時的數據庫listener的進程已經忙不過來,將連接丟棄。(在主機層面可以看到tcpListeneDrop增加)。
我們如果去看普通的listener日志,看到的東西是這樣的:
……24-FEB-2017 09:11:03 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=sqlplus@mynwdb)(HOST=mynwdb)(USER=appl))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65184)) * establish * mynwdb * 024-FEB-2017 09:11:06 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=STANDARD)(HOST=mynwdb)(USER=appl))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65186)) * establish * mynwdb * 024-FEB-2017 09:11:10 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=STANDARD)(HOST=mynwdb)(USER=appl))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65188)) * establish * mynwdb * 024-FEB-2017 09:11:10 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=STANDARD)(HOST=mynwdb)(USER=appl))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65189)) * establish * mynwdb * 0Fri Feb 24 09:11:14 201724-FEB-2017 09:11:14 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65194)) * establish * mynwdb * 024-FEB-2017 09:11:14 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65195)) * establish * mynwdb * 024-FEB-2017 09:11:15 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65196)) * establish * mynwdb * 024-FEB-2017 09:11:15 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65197)) * establish * mynwdb * 024-FEB-2017 09:11:15 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65198)) * establish * mynwdb * 024-FEB-2017 09:11:15 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65199)) * establish * mynwdb * 024-FEB-2017 09:11:15 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65200)) * establish * mynwdb * 024-FEB-2017 09:11:15 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65201)) * establish * mynwdb * 024-FEB-2017 09:11:16 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65202)) * establish * mynwdb * 024-FEB-2017 09:11:16 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65203)) * establish * mynwdb * 024-FEB-2017 09:11:16 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65204)) * establish * mynwdb * 024-FEB-2017 09:11:16 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65205)) * establish * mynwdb * 024-FEB-2017 09:11:16 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65206)) * establish * mynwdb * 024-FEB-2017 09:11:17 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65207)) * establish * mynwdb * 024-FEB-2017 09:11:17 * (CONNECT_DATA=(SID=mynwdb)(CID=(PROGRAM=INVMGR)(HOST=mynwdb)(USER=nusr))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.22.3.123)(PORT=65208)) * establish * mynwdb * 0……
這樣我們會看的比較暈,不知道里面的連接數變化。
有一個listener日志的解析利器,可以分析到這種類型的問題。下載見文末。
用法:lsnr_analyzer.pl -t ‘time rexge' [-h] filename
如,我們要分析24-FEB-2017 09:30~09:59的連接情況,可以如下操作:
[ora@mynwdb trace]$ lsnr_analyzer.pl -t '24-FEB-2017 09:[3-5][0-9]' mynwdb_cut1.log24-FEB-2017 09:30 => Total:3724-FEB-2017 09:31 => Total:5024-FEB-2017 09:32 => Total:1724-FEB-2017 09:33 => Total:1624-FEB-2017 09:34 => Total:1924-FEB-2017 09:35 => Total:1624-FEB-2017 09:36 => Total:1724-FEB-2017 09:37 => Total:1624-FEB-2017 09:38 => Total:1324-FEB-2017 09:39 => Total:2924-FEB-2017 09:40 => Total:4324-FEB-2017 09:41 => Total:3824-FEB-2017 09:42 => Total:1724-FEB-2017 09:43 => Total:1624-FEB-2017 09:44 => Total:4224-FEB-2017 09:45 => Total:2124-FEB-2017 09:46 => Total:1624-FEB-2017 09:47 => Total:1524-FEB-2017 09:48 => Total:1224-FEB-2017 09:49 => Total:2524-FEB-2017 09:50 => Total:4724-FEB-2017 09:51 => Total:4124-FEB-2017 09:52 => Total:2124-FEB-2017 09:53 => Total:1224-FEB-2017 09:54 => Total:1724-FEB-2017 09:55 => Total:1624-FEB-2017 09:56 => Total:2024-FEB-2017 09:57 => Total:1724-FEB-2017 09:58 => Total:1724-FEB-2017 09:59 => Total:21[ora@mynwdb trace]$
可以看到,在09:31的時候,有個連接數的增加,到了一分鐘50個連接。
我們可以加-h的參數,breakdown具體是哪些主機連接過來的:
[ora@mynwdb trace]$ lsnr_analyzer.pl -t '24-FEB-2017 09:[3-5][0-9]' -h mynwdb_cut1.log24-FEB-2017 09:30 => Total:3711.22.3.123 3724-FEB-2017 09:31 => Total:5022.3.4.111 111.22.3.123 4924-FEB-2017 09:32 => Total:1710.8.1.218 111.22.3.123 1624-FEB-2017 09:33 => Total:1611.22.3.123 1624-FEB-2017 09:34 => Total:1911.22.3.123 1924-FEB-2017 09:35 => Total:1611.22.3.123 1624-FEB-2017 09:36 => Total:1722.3.4.111 111.22.3.123 1624-FEB-2017 09:37 => Total:1612.5.6.221 111.22.3.123 1524-FEB-2017 09:38 => Total:1311.22.3.123 1324-FEB-2017 09:39 => Total:2911.22.3.123 2924-FEB-2017 09:40 => Total:4311.22.3.123 4324-FEB-2017 09:41 => Total:3822.3.4.111 111.22.3.123 3724-FEB-2017 09:42 => Total:1712.5.6.221 111.22.3.123 1624-FEB-2017 09:43 => Total:1611.22.3.123 1624-FEB-2017 09:44 => Total:4211.22.3.123 4224-FEB-2017 09:45 => Total:2111.22.3.123 2124-FEB-2017 09:46 => Total:1622.3.4.111 111.22.3.123 1524-FEB-2017 09:47 => Total:1512.5.6.221 111.22.3.123 1424-FEB-2017 09:48 => Total:1211.22.3.123 1224-FEB-2017 09:49 => Total:2511.22.3.123 2524-FEB-2017 09:50 => Total:4711.22.3.123 4724-FEB-2017 09:51 => Total:4122.3.4.111 111.22.3.123 4024-FEB-2017 09:52 => Total:2112.5.6.221 111.22.3.123 2024-FEB-2017 09:53 => Total:1211.22.3.123 1224-FEB-2017 09:54 => Total:1711.22.3.123 1724-FEB-2017 09:55 => Total:1611.22.3.123 1624-FEB-2017 09:56 => Total:2022.3.4.111 111.22.3.123 1924-FEB-2017 09:57 => Total:1712.5.6.221 111.22.3.123 1624-FEB-2017 09:58 => Total:1711.22.3.123 1724-FEB-2017 09:59 => Total:2111.22.3.123 21[ora@mynwdb trace]$
可以看到09:30的時候,是來自11.22.3.123有49個連接,來自22.3.4.111有1個連接。
注1:lsnr_analyzer.pl 是用來分析已經establish的連接,如果是其他連接,不統計在內。
注2:lsnr_analyzer.pl 的統計比較消耗cpu,30萬行的listener日志分析大約1秒出結果,300萬行的listener日志分析大約6秒出結果。所以如果listener日志比較大,建議tail剪切后統計。
對于已經出現tcpListenDrop的情況,我們分幾種情況分析:
(1)如果是突然的連接風暴,需要控制前端的連接,如weblogic中間件逐臺啟動,而不是同時啟動;
(2)如果一直連接數很高,導致listener無法處理,就需要拆分listener,如按照端口拆分。
點擊下載:lsnr_analyzer.pl
總結
以上就是這篇文章的全部內容了,希望本文的內容對大家的學習或者工作能帶來一定的幫助,如果有疑問大家可以留言交流,謝謝大家對VeVb武林網的支持。
新聞熱點
疑難解答