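1. OBProxy's timeout detection and keep-alive mechanism based on TCP keep-alive
- While analysing a packet capture, we noticed that on idle TCP connections the obproxy server sends special TCP packets every 5 minutes. Wireshark labels these packets [TCP KEEP-ALIVE], as in the following example:
- These packets are the well-known TCP keep-alive probes. obproxy sends them because it enables the SO_KEEPALIVE socket option (keep-alive packets are sent only when the SO_KEEPALIVE socket option is enabled) and uses the socket-level timeout detection and keep-alive mechanism that Linux builds on TCP keep-alive. That mechanism is described in detail later in this article.
- The relevant obproxy parameters and how to configure them are as follows: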
2. Linux's timeout detection and keep-alive mechanism based on TCP keep-alive
In Linux, timeout detection and keep-alive based on TCP keep-alive works at two levels: the operating-system level and the socket level.
2.1. The operating-system-level mechanism in Linux
The operating-system-level mechanism is governed by the following kernel parameters, which can be viewed and changed system-wide with the sysctl command:
- /proc/sys/net/ipv4/tcp_keepalive_intvl: default 75 seconds. The number of seconds between TCP keep-alive probes;
- /proc/sys/net/ipv4/tcp_keepalive_probes: default 9. The maximum number of TCP keep-alive probes to send before giving up and killing the connection if no response is obtained from the other end;
- /proc/sys/net/ipv4/tcp_keepalive_time: default 7200 seconds (2 hours). The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes. Keep-alives are sent only when the SO_KEEPALIVE socket option is enabled. An idle connection is terminated after approximately an additional 11 minutes (9 probes an interval of 75 seconds apart) when keep-alive is enabled;
- sysctl net.ipv4.tcp_keepalive_time
- sysctl net.ipv4.tcp_keepalive_intvl
- sysctl net.ipv4.tcp_keepalive_probes
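The arithmetic behind these three parameters can be sketched in Java. This is a minimal illustration, not part of obproxy or the kernel: the class and method names are hypothetical, and the calculation assumes the peer never answers any probe, so the connection is dropped after the idle time plus probes × interval.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class KeepAliveTimeline {

    /** Idle time before the first probe, plus one interval per unanswered probe. */
    static long secondsUntilDeclaredDead(long idleSeconds, long intervalSeconds, long probes) {
        return idleSeconds + intervalSeconds * probes;
    }

    /** Reads one value under /proc/sys/net/ipv4, or returns the fallback if it cannot be read. */
    static long readSysctl(String name, long fallback) {
        Path path = Paths.get("/proc/sys/net/ipv4", name);
        try {
            return Long.parseLong(new String(Files.readAllBytes(path)).trim());
        } catch (IOException | NumberFormatException e) {
            return fallback; // e.g. not running on Linux
        }
    }

    public static void main(String[] args) {
        // With the documented defaults: 7200 + 75 * 9 = 7875 seconds.
        System.out.println("defaults: " + secondsUntilDeclaredDead(7200, 75, 9) + " s");

        // With whatever this machine is configured to use (falls back to the defaults).
        long idle = readSysctl("tcp_keepalive_time", 7200);
        long interval = readSysctl("tcp_keepalive_intvl", 75);
        long probes = readSysctl("tcp_keepalive_probes", 9);
        System.out.println("this host: " + secondsUntilDeclaredDead(idle, interval, probes) + " s");
    }
}
```

With the defaults, a dead peer is therefore detected only after roughly 2 hours and 11 minutes, which is why applications often shorten these values.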
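2.2. The socket-level mechanism in Linux
The socket-level mechanism requires the application to set the following socket options in its own code, and this is the mechanism obproxy uses. Each option overrides the corresponding system-wide kernel parameter for that one socket:
- TCP_KEEPIDLE: the amount of time until the first keepalive packet is sent (per-socket counterpart of tcp_keepalive_time);
- TCP_KEEPCNT: the number of probes to send (per-socket counterpart of tcp_keepalive_probes);
- TCP_KEEPINTVL: the interval between keepalive packets (per-socket counterpart of tcp_keepalive_intvl);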
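3. How to configure the socket-level mechanism in Java
JDK 11 and later support configuring the socket-level timeout detection and keep-alive mechanism based on TCP keep-alive. Later JDK 8 update releases also support it through a backport. The relevant classes and upstream descriptions are: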
- java.net.StandardSocketOptions
- java.net.StandardSocketOptions#SO_KEEPALIVE
- jdk.net.ExtendedSocketOptions
- When the SO_KEEPALIVE option is enabled, TCP probes a connection that has been idle for some amount of time. The default value for this idle period is 2 hours which is too long for most applications. The TCP_KEEPIDLE, TCP_KEEPCOUNT, TCP_KEEPINTERVAL option can be used to affect this value for a given socket.
- The default idle time for SO_KEEPALIVE is 2 hours, too long for most applications. Some operating systems have support to configure the idle time on a per connection basis (Linux has TCP_KEEPIDLE, Windows has SIO_KEEPALIVE_VALS). We should consider exposing an extended socket option to configure this.
- TCP_KEEPIDLE, TCP_KEEPCOUNT, and TCP_KEEPINTERVAL are non-standard socket options supported on several platforms to provide fine control over the TCP/IP keep alive mechanism. It should be possible to set these socket options via the setOption method defined by java.net.Socket and java.nio.channels.SocketChannel.
- Add a JDK-specific socket option that supports setting TCP_KEEPIDLE, TCP_KEEPCOUNT, TCP_KEEPINTERVAL, on platforms that support it. The option can be set/get through the existing set/getOption methods on Socket and NetworkChannel.
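The options described above can be set through Socket.setOption, as in the following minimal sketch. It assumes JDK 11 or later on a platform that supports these options (Linux, for example). The class name, method names, and the values 300/30/3 are illustrative only; 300 seconds simply mirrors the 5-minute interval seen in the capture and is not taken from obproxy's configuration.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.StandardSocketOptions;
import jdk.net.ExtendedSocketOptions;

public class KeepAliveSocketDemo {

    /** Enables keep-alive on the socket and overrides the three per-socket timers. */
    static void enableKeepAlive(Socket socket, int idleSeconds, int intervalSeconds, int probeCount)
            throws IOException {
        socket.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
        socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
        socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSeconds);
        socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, probeCount);
    }

    /** Applies the settings to a fresh, unconnected socket and reads them back. */
    static int[] applyAndReadBack(int idleSeconds, int intervalSeconds, int probeCount)
            throws IOException {
        try (Socket socket = new Socket()) {
            enableKeepAlive(socket, idleSeconds, intervalSeconds, probeCount);
            return new int[] {
                socket.getOption(ExtendedSocketOptions.TCP_KEEPIDLE),
                socket.getOption(ExtendedSocketOptions.TCP_KEEPINTERVAL),
                socket.getOption(ExtendedSocketOptions.TCP_KEEPCOUNT)
            };
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative values: first probe after 300 s idle, then every 30 s, give up after 3.
        int[] values = applyAndReadBack(300, 30, 3);
        System.out.printf("idle=%ds interval=%ds count=%d%n", values[0], values[1], values[2]);
    }
}
```

In a real client, enableKeepAlive would be called on the socket that is actually connected to the server. The Java names TCP_KEEPINTERVAL and TCP_KEEPCOUNT correspond to the Linux options TCP_KEEPINTVL and TCP_KEEPCNT.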
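Note: if a Java program fails with the error Exception in thread "main" java.lang.NoSuchFieldError: TCP_KEEPIDLE, the usual cause is that the JDK it runs on does not support TCP_KEEPIDLE and the other keep-alive options in jdk.net.ExtendedSocketOptions, typically because the program was compiled against a newer JDK than the one executing it. These keep-alive options were introduced in JDK 11, and only some Java 8 builds include them.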
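One way to avoid this error is to probe for the fields by reflection before touching them directly. The sketch below is only an illustration with hypothetical class and method names. It checks whether the running JDK defines a given field in jdk.net.ExtendedSocketOptions, and it does not check whether the operating system supports the option.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class KeepAliveSupportCheck {

    /** Returns true if the running JDK defines the named static field in ExtendedSocketOptions. */
    static boolean hasExtendedOption(String fieldName) {
        try {
            Field field = Class.forName("jdk.net.ExtendedSocketOptions").getField(fieldName);
            return Modifier.isStatic(field.getModifiers());
        } catch (ClassNotFoundException | NoSuchFieldException e) {
            return false; // class or field is absent on this JDK
        }
    }

    public static void main(String[] args) {
        String[] names = {"TCP_KEEPIDLE", "TCP_KEEPINTERVAL", "TCP_KEEPCOUNT"};
        for (String name : names) {
            System.out.println(name + " defined by this JDK: " + hasExtendedOption(name));
        }
    }
}
```

Because the check references the class only by name, it compiles and runs on JDKs that lack the fields. On Java 9 and later, Socket.supportedOptions() can additionally report whether a particular socket accepts an option.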