WebSphere

Identifying a thread deadlock in WebSphere

  • July 28, 2021

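On WebSphere 8.5.5.13 I ran into some out-of-memory errors, and the database connections hit their limit. It looks to me like thread starvation: I have some processes that try to do something with a 10-second timeout, and other tasks that normally take about 200 ms are actually taking about 10200 ms. But I think the last one might even be a deadlock. I have about 100 threads waiting like this: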

3XMTHREADINFO      "WorkManager.DefaultWorkManager : 648" J9VMThread:0x000000000F2AA300, omrthread_t:0x00007FE38D060D78, java/lang/Thread:0x000000018ACD99E8, state:B, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x68C86, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0xF8DE, native priority:0x5, native policy:UNKNOWN, vmstate:B, vm thread flags:0x00000201)
3XMTHREADINFO2            (native stack address range from:0x00007FE09C92F000, to:0x00007FE09C96F000, size:0x40000)
3XMCPUTIME               CPU usage total: 2.131995383 secs, current category="Application"
3XMTHREADBLOCK     Blocked on: com/ibm/ws/util/ThreadPool@0x000000011CD4B888 Owned by: "WorkManager.DefaultWorkManager : 689" (J9VMThread:0x00000000011B3000, java/lang/Thread:0x00000001B148B9A8)
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:1083(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1916(Compiled Code))

The stack of WorkManager.DefaultWorkManager : 689 looks like this:

3XMTHREADINFO      "WorkManager.DefaultWorkManager : 689" J9VMThread:0x00000000011B3000, omrthread_t:0x00007FE1A41A70D0, java/lang/Thread:0x00000001B148B9A8, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x68CCD, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x11410, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00001001)
3XMTHREADINFO2            (native stack address range from:0x00007FE1EFF3E000, to:0x00007FE1EFF7E000, size:0x40000)
3XMCPUTIME               CPU usage total: 1.663139688 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:502(Compiled Code))
4XESTACKTRACE                at java/lang/ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:311(Compiled Code))
4XESTACKTRACE                at java/lang/ThreadLocal.setInitialValue(ThreadLocal.java:197(Compiled Code))
4XESTACKTRACE                at java/lang/ThreadLocal.get(ThreadLocal.java:183(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/objectpool/TwoTierObjectPool.purgeThreadLocal(TwoTierObjectPool.java:264(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/buffermgmt/impl/WsByteBufferPool.purgeThreadLocal(WsByteBufferPool.java:173(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/buffermgmt/impl/WsByteBufferPoolManagerImpl.purgeThreadLocals(WsByteBufferPoolManagerImpl.java:1169(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/runtime/component/WSBBPoolListener.threadDestroyed(WSBBPoolListener.java:62(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/runtime/component/ThreadPoolMgrImpl.threadDestroyed(ThreadPoolMgrImpl.java:459(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool.fireThreadDestroyed(ThreadPool.java:1593(Compiled Code))
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool.workerDone(ThreadPool.java:1005(Compiled Code))
5XESTACKTRACE                   (entered lock: com/ibm/ws/util/ThreadPool@0x000000011CD4B888, entry count: 1)
4XESTACKTRACE                at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1929(Compiled Code))

For reference, an idle thread (one that is not waiting for something to be released) looks like this:

 at sun/misc/Unsafe.park(Native Method)
 at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:222)
 at java/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2127)
 at com/ibm/ws/util/BoundedBuffer$GetQueueLock.await(BoundedBuffer.java:285)
 at com/ibm/ws/util/BoundedBuffer.waitGet_(BoundedBuffer.java:424)
 at com/ibm/ws/util/BoundedBuffer.take(BoundedBuffer.java:817)
 at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:934)
 at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1704)

or

 at java/lang/Object.wait(Native Method)
 at java/lang/Object.wait(Object.java:231)
 at com/ibm/ws/util/BoundedBuffer.waitGet_(BoundedBuffer.java:192)
 at com/ibm/ws/util/BoundedBuffer.take(BoundedBuffer.java:543)
 at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:819)
 at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1544)

None of my threads look like either of those.

Thanks!

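The general pattern of a deadlock is as follows:

  • Thread 1 holds resource A and needs resource B to continue
  • Thread 2 holds resource B and needs resource A to continue

In this situation neither thread can make progress, so there is a deadlock.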
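The pattern above can be reproduced in a few lines. The following is a minimal sketch, not code from WebSphere: the class name `DeadlockDemo`, the thread names, and the two lock objects are all hypothetical. It starts two daemon threads that take two monitors in opposite order, then asks the JVM's standard `ThreadMXBean` whether it sees them in a monitor deadlock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {

    /**
     * Starts two daemon threads that lock two monitors in opposite order,
     * then returns how many of those two threads the JVM reports as deadlocked.
     */
    public static int countDeadlockedThreads() {
        Object resourceA = new Object(); // hypothetical "resource A"
        Object resourceB = new Object(); // hypothetical "resource B"
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread t1 = lockInOrder("demo-thread-1", resourceA, resourceB, bothHoldFirstLock);
        Thread t2 = lockInOrder("demo-thread-2", resourceB, resourceA, bothHoldFirstLock);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int found = 0;
        // Poll for up to about 5 seconds until both threads show up in a cycle.
        for (int i = 0; i < 100 && found < 2; i++) {
            found = 0;
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock
            if (ids != null) {
                for (long id : ids) {
                    if (id == t1.getId() || id == t2.getId()) {
                        found++;
                    }
                }
            }
            if (found < 2) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return found;
    }

    private static Thread lockInOrder(String name, Object first, Object second,
                                      CountDownLatch bothHoldFirstLock) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                // Wait until the other thread also holds its first lock.
                bothHoldFirstLock.countDown();
                try {
                    bothHoldFirstLock.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds this monitor.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        return t;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked demo threads: " + countDeadlockedThreads());
    }
}
```

The latch is only there to make the cycle deterministic; without it, one thread could finish before the other starts and no deadlock would occur.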

The snippets you posted do not match that pattern, so I don't think this is a deadlock.

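With the caveat that I'm not familiar with the specific code shown in the posted snippets, it looks to me like the first thread shown is simply waiting to get a task from the WorkManager queue, which may be empty.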

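As an aside, in an IBM Java thread dump (which your snippets appear to come from), deadlocked threads are detected while the dump is being created and are flagged with a DEADLOCK tag. So you can search the thread dump for that tag, which saves you the time of matching up every possible thread/resource combination to find a deadlock by hand.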
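That search can be scripted. Here is a rough sketch, assuming the dump has been saved as a text file and that blocked threads appear on `3XMTHREADBLOCK` lines in the format shown in the question. The class name `JavacoreScan` and its methods are hypothetical, and the case-insensitive match on "deadlock" is an assumption about how the marker is spelled in a given dump. It counts how many threads are blocked behind each owning thread and reports whether the dump mentions a deadlock at all.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavacoreScan {

    // Matches lines of the form seen in the question:
    // 3XMTHREADBLOCK     Blocked on: <monitor> Owned by: "<thread name>" (...)
    private static final Pattern BLOCKED = Pattern.compile(
            "^3XMTHREADBLOCK\\s+Blocked on: (\\S+) Owned by: \"([^\"]+)\"");

    /** Counts, per owning thread name, how many threads are blocked on a monitor it holds. */
    public static Map<String, Integer> blockedCountByOwner(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = BLOCKED.matcher(line);
            if (m.find()) {
                counts.merge(m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** True if any line mentions a deadlock (the exact marker text is an assumption). */
    public static boolean mentionsDeadlock(List<String> lines) {
        for (String line : lines) {
            if (line.toLowerCase().contains("deadlock")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // ISO-8859-1 accepts any byte sequence, so odd characters in the dump cannot abort the read.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        System.out.println("Mentions deadlock: " + mentionsDeadlock(lines));
        blockedCountByOwner(lines).forEach((owner, count) ->
                System.out.println(count + " thread(s) blocked behind \"" + owner + "\""));
    }
}
```

A plain substring match can also hit class or method names that contain the word, so any hit is worth checking by eye. Many threads blocked behind one owner, with no deadlock reported, points more toward contention on that monitor than toward a true deadlock.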

Hope this helps.

Source: https://serverfault.com/questions/1022760