Note | Modified | Modified by |
Initial version | 2021-08-26 17:41:46 [current version] | System administrator |
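One day the jump server suddenly could not be logged in to. After logging in to the Alibaba Cloud console and checking memory usage, I found 11 Tomcat processes running. That is a serious bug.
My manager asked how I would handle this if it happened in production, and I was stumped.
The stopgap first: run kill -9 PID to force-kill the process.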
(JDK11+TOMCAT9+SSM+MAVEN3.5.4+MYSQL5.7+DRUID)
Step 1: change the last line of the shutdown script as follows.
Original:
exec "$PRGDIR"/"$EXECUTABLE" stop "$@"
Modified:
exec "$PRGDIR"/"$EXECUTABLE" stop -force "$@"
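Step 2: add the following lines to the catalina script, after line 139. The -force option only works when CATALINA_PID points at a PID file.
if [ -z "$CATALINA_PID" ]; then
    CATALINA_PID="$PRGDIR"/CATALINA_PID
    cat "$CATALINA_PID"
fi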
The steps above amount to force-killing the process, so they are not a long-term fix.
Tomcat's catalina log reports the following warnings:
22-Feb-2019 11:02:40.692 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesJdbc The web application [ROOT] registered the JDBC driver [com.alibaba.druid.proxy.DruidDriver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.
22-Feb-2019 11:02:40.695 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesJdbc The web application [ROOT] registered the JDBC driver [com.mysql.jdbc.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.
22-Feb-2019 11:02:40.697 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [Abandoned connection cleanup thread] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.1/java.lang.Object.wait(Native Method)
 java.base@11.0.1/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
 com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:43)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:339)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:167)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
 org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:340)
 org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
 org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
 java.base@11.0.1/java.lang.Thread.run(Thread.java:834)
26-Feb-2019 09:45:50.596 警告 [localhost-startStop-1] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [I/O dispatcher 1] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:339)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:167)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
 org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
 org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
 org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
 java.base@11.0.1/java.lang.Thread.run(Thread.java:834)
26-Feb-2019 09:45:50.600 警告 [localhost-startStop-1] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [I/O dispatcher 2] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:339)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:167)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
 org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
 org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
 org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
 java.base@11.0.1/java.lang.Thread.run(Thread.java:834)
26-Feb-2019 09:45:50.603 警告 [localhost-startStop-1] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [I/O dispatcher 3] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:339)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:167)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
 org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
 org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
 org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
 java.base@11.0.1/java.lang.Thread.run(Thread.java:834)
26-Feb-2019 09:45:50.608 警告 [localhost-startStop-1] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [I/O dispatcher 4] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:339)
 java.base@11.0.1/sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:167)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
 java.base@11.0.1/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
 org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
 org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
 org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
 java.base@11.0.1/java.lang.Thread.run(Thread.java:834)
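Conclusions from analyzing the warnings:
Problem 1: a JDBC error. My initial understanding is that the JDBC drivers were not deregistered when Tomcat shut down (the log names the Druid and MySQL drivers).
Problem 2: a memory leak. My initial understanding is that some threads were not stopped when Tomcat shut down, so Tomcat could not exit.
Fix:
Problem 1:
Add a listener to web.xml of the Spring application that deregisters the drivers when the application stops.
1. The listener class: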
package tbcloud.admin.until;

import com.mysql.jdbc.AbandonedConnectionCleanupThread;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

/**
 * @author: Dmm
 * @date: 2019/2/25 17:44
 */
public class MyServletContextListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        // If the web application connects to several databases, all of their drivers are deregistered here
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            try {
                Driver driver = drivers.nextElement();
                DriverManager.deregisterDriver(driver);
            } catch (SQLException ex) {
                ex.printStackTrace();
            }
        }
        try {
            // Stop the MySQL driver's "Abandoned connection cleanup thread"
            AbandonedConnectionCleanupThread.shutdown();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
2. Register the listener class in web.xml:
<listener>
    <listener-class>tbcloud.admin.until.MyServletContextListener</listener-class>
</listener>
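The listener above deregisters every driver that DriverManager shows it. A common refinement is to remove only the drivers whose classes were loaded by the web application's own class loader, so that drivers belonging to the container or to other applications stay registered. The following is a minimal sketch of that idea using only java.sql. The class name JdbcDriverCleanup and the method deregisterDriversLoadedBy are hypothetical and not part of the original project.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original project. */
final class JdbcDriverCleanup {

    private JdbcDriverCleanup() {
    }

    /**
     * Deregisters every JDBC driver whose class was loaded by the given class loader
     * and returns the class names of the drivers that were removed.
     */
    static List<String> deregisterDriversLoadedBy(ClassLoader webappLoader) {
        List<String> removed = new ArrayList<>();
        // Collections.list turns the Enumeration into a List that can be looped over.
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() != webappLoader) {
                continue; // Loaded elsewhere (container or another application): leave it alone.
            }
            try {
                DriverManager.deregisterDriver(driver);
                removed.add(driver.getClass().getName());
            } catch (SQLException ex) {
                // Keep going: one failing driver should not block the others.
                System.err.println("Could not deregister " + driver + ": " + ex);
            }
        }
        return removed;
    }
}
```

Inside contextDestroyed this could be called as JdbcDriverCleanup.deregisterDriversLoadedBy(getClass().getClassLoader()), assuming the listener class itself is packaged inside the web application. If the driver jar lives in Tomcat's shared lib directory instead, it is loaded by a different class loader and this sketch would skip it.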
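Problem 2:
First I looked at the thread dump, shown below: non-daemon threads were still running and had not been stopped.
This was taken after shutdown had been run on Tomcat.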
"pool-1-thread-1" #21 prio=5 os_prio=0 cpu=331.44ms elapsed=5696.46s tid=0x00007f2855649800 nid=0x7260 runnable [0x00007f281484e000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@11.0.1/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.1/EPollSelectorImpl.java:120)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.1/SelectorImpl.java:124)
    - locked <0x00000000ff213660> (a sun.nio.ch.Util$2)
    - locked <0x00000000ff213510> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@11.0.1/SelectorImpl.java:136)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:340)
    at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
    at java.lang.Thread.run(java.base@11.0.1/Thread.java:834)

"I/O dispatcher 1" #22 prio=5 os_prio=0 cpu=279.80ms elapsed=5696.41s tid=0x00007f2828009800 nid=0x7261 runnable [0x00007f27ffdfe000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@11.0.1/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.1/EPollSelectorImpl.java:120)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.1/SelectorImpl.java:124)
    - locked <0x00000000fecb2680> (a sun.nio.ch.Util$2)
    - locked <0x00000000fecb2530> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@11.0.1/SelectorImpl.java:136)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
    at java.lang.Thread.run(java.base@11.0.1/Thread.java:834)

"I/O dispatcher 2" #23 prio=5 os_prio=0 cpu=278.50ms elapsed=5696.41s tid=0x00007f2828012800 nid=0x7262 runnable [0x00007f27ffcfd000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPoll.wait(java.base@11.0.1/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.1/EPollSelectorImpl.java:120)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.1/SelectorImpl.java:124)
    - locked <0x00000000fecb2a20> (a sun.nio.ch.Util$2)
    - locked <0x00000000fecb28d0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@11.0.1/SelectorImpl.java:136)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
    at java.lang.Thread.run(java.base@11.0.1/Thread.java:834)
Commands:
ps -ef|grep **
top -Hp PID
jstack PID
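jstack inspects the JVM from the outside. A similar check can be made from inside the application, for example logged while the application is shutting down, by listing the live threads that are not daemon threads, since those are the ones that keep a JVM from exiting. The following is a rough sketch using only the JDK. The class name NonDaemonThreads is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper for illustration; not part of the original project. */
final class NonDaemonThreads {

    private NonDaemonThreads() {
    }

    /** Returns the names of all live threads in this JVM that are not daemon threads. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        // Thread.getAllStackTraces() is keyed by every live thread in the JVM.
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.isAlive() && !thread.isDaemon()) {
                result.add(thread.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println("non-daemon thread still running: " + name);
        }
    }
}
```

In a dump like the one above, names such as "pool-1-thread-1" and "I/O dispatcher 1" would be expected to show up in this list as long as the client that owns them has not been closed.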
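I thought about which features had been added recently and found the culprit: the ES (Elasticsearch) utility class. After my own research and discussion with others, I realized that I was not shutting down its threads when the web container was destroyed, that is, close() was never called. That was the root cause.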
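To illustrate why a missing close() keeps the process alive, here is a small stand-in written with the JDK only. It is not the ES client: BackgroundClient is a hypothetical class whose single worker thread plays the role of the I/O dispatcher threads in the dump. The worker is a non-daemon thread, so the JVM cannot exit until close() shuts the executor down.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a client that owns background threads. */
final class BackgroundClient implements AutoCloseable {

    // The default thread factory creates non-daemon threads.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    BackgroundClient() {
        // Submitting a task starts the worker thread, which then idles waiting for more work.
        worker.submit(() -> { });
    }

    /** True once the worker thread has finished and the executor is terminated. */
    boolean isStopped() {
        return worker.isTerminated();
    }

    /** Stops the worker thread; calling it more than once is harmless. */
    @Override
    public void close() {
        worker.shutdown();
        try {
            // Give running tasks a few seconds, then interrupt whatever is left.
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                worker.shutdownNow();
            }
        } catch (InterruptedException e) {
            worker.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // try-with-resources calls close(); without it this program would never exit.
        try (BackgroundClient client = new BackgroundClient()) {
            System.out.println("stopped before close: " + client.isStopped());
        }
    }
}
```

In a web application there is no try-with-resources around the whole lifetime of the bean, so the close() call has to be tied to the container's shutdown callback instead.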
(Using the @PostConstruct and @PreDestroy annotations)
Step 1: add the jar that provides @PostConstruct and @PreDestroy
<dependency>
    <groupId>javax.annotation</groupId>
    <artifactId>javax.annotation-api</artifactId>
    <version>1.2</version>
</dependency>
Step 2: configure component scanning in the Spring configuration file:
<context:component-scan base-package="tbcloud.admin.service.impl"/>
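Step 3: annotate the methods
/**
 * Initialize tbcloudEsClientUntil
 */
@PostConstruct
public void init() {
    try {
        // Load the client settings from the properties file, e.g. elastic.properties
        tbcloudEsClientUntil = new TbcloudEsClientUntil(
                HttpProxyRecordServiceImpl.class.getResourceAsStream("/****.properties"));
    } catch (IOException e) {
        // Pass the exception as the last argument so the logger prints its stack trace
        logger.error("config file fail", e);
    }
}

/**
 * Shut down the client's threads
 */
@PreDestroy
public void close() {
    // Guard against init() having failed and left the client unset
    if (tbcloudEsClientUntil != null) {
        tbcloudEsClientUntil.close();
    }
}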
With that, the problem was fully resolved.
After this battle I feel my troubleshooting skills are poor: it cost me a week.