Monday, January 14, 2013

Ubuntu 12.10: UDP buffer monitoring (kernel 3.5.0-21)

Some time ago I was involved in the analysis of a networking problem whose symptom was client-side timeouts (no answer received). The protocol was standardized and based on UDP.

Let's first analyze the flow of information from the network adapter towards the application. There is an excellent document provided by Red Hat as part of their OS documentation - it can be found here. I recommend reading chapter 8, which covers networking. In general the flow looks as presented in the figure below:

Figure 1. Packet reception diagram (from the Red Hat Performance Tuning Guide)


Hardware/interrupt buffers

These are internal HW buffers inside the network interface card (NIC). If you see HW buffer overruns on your system, the only thing you will know is that the transfer rate is too high for this interface. The document mentioned above indicates how to monitor them using ethtool and what you can do to minimize the impact.

You can also monitor this using tools that read statistics from the NIC, such as ifconfig. Below you can find an example from my Ubuntu box (the overruns counter indicates how many times a HW buffer overrun happened):

$ ifconfig eth0
eth0      Link encap:Ethernet  HWaddr b4:99:ba:e8:d3:6c 
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:20 Memory:d7500000-d7520000





UDP socket buffers

The document mentioned above explains a lot about what can be tuned here. I will only try to show how it can happen that the socket buffer queue grows. To do that I prepared a sample Java application (a modification of a UDP echo client/server app), consisting of a server application and a client application.

Test application

The client will be sending a UDP datagram every 100 milliseconds, containing a 64-byte string (the word "test" repeated 16 times).

The server, on the other hand, will be a single-threaded application reading from the socket. After retrieving a datagram from the socket, the reading thread will sleep for 1 second - the 1 s sleep is chosen only to show the effect in a shorter timeframe, but note that any server-side delay longer than the client's 100 ms send interval will cause the same behaviour. In general such a problem might happen in a real-life scenario if the same thread that reads from the socket also does some processing.

Code for the UDP echo client:
package org.krystianek.udp.test.echo;

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketException;

public class EchoClient {

        public static void main(String[] args) throws Exception {
                String hostname = "localhost";
                InetAddress ia = InetAddress.getByName(hostname);
                SenderThread sender = new SenderThread(ia, 9999);
                sender.start();
        }

}

class SenderThread extends Thread {

        private InetAddress server;
        private DatagramSocket socket;
        private volatile boolean stopped = false;
        private int port;

        public SenderThread(InetAddress address, int port) throws SocketException {
                this.server = address;
                this.port = port;
                this.socket = new DatagramSocket();
                this.socket.connect(server, port);
        }

        public void halt() {
                this.stopped = true;
        }

        public DatagramSocket getSocket() {
                return this.socket;
        }

        public void run() {

                try {
                        // 64-byte payload: the word "test" repeated 16 times
                        String theLine = "testtesttesttesttesttesttesttesttesttesttesttesttesttesttesttest";
                        while (!stopped) {
                                byte[] data = theLine.getBytes();
                                DatagramPacket output = new DatagramPacket(data, data.length, server, port);

                                System.out.println("Sending : " + output.getData().length + " bytes ");
                                socket.send(output);
                                // one datagram every 100 ms, i.e. roughly 10 per second
                                Thread.sleep(100);
                        }

                } catch (Exception ex) {
                        System.err.println(ex);
                }
        }
}


Code for the UDP echo server:

package org.krystianek.udp.test.echo;

import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

/**
 * @author krychu
 *
 */
public class EchoServer {

        public static void main(String[] args) throws Exception {
                DatagramChannel channel = DatagramChannel.open();
                DatagramSocket socket = channel.socket();
                SocketAddress address = new InetSocketAddress(9999);
                socket.bind(address);
                ByteBuffer buffer = ByteBuffer.allocateDirect(65507);
                while (true) {
                        SocketAddress client = channel.receive(buffer);
                        try {
                                // artificial delay simulating slow processing in the reading thread
                                Thread.sleep(1000);
                        } catch (Exception ex) {
                                ex.printStackTrace();
                        }

                        buffer.flip();
//                      System.out.println("Server received : " + buffer);
//                      channel.send(buffer, client);
                        // reset the buffer so the next datagram can be received in full
                        buffer.clear();

                }
        }

}


For the test the classes have been packaged into echo-0.0.1-SNAPSHOT.jar using the Maven build system.

Monitoring the system

The next step was to identify how to monitor the buffer utilization. There are different ways to do this - in this article I will use the statistics available in the proc filesystem. A quite good description can be found in the man page:


$ man proc
...




/proc/net/udp
Holds a dump of the UDP socket table. Much of the information is not of use apart from debugging. The "sl" value is the kernel hash slot for the socket, the "local_address" is the local address and port number pair. The "rem_address" is the remote address and port number pair (if connected). "St" is the internal status of the socket. The "tx_queue" and "rx_queue" are the outgoing and incoming data queue in terms of kernel memory usage. The "tr", "tm->when", and "rexmits" fields are not used by UDP. The "uid" field holds the effective UID of the creator of the socket. The format is:
sl  local_address rem_address   st tx_queue rx_queue tr rexmits  tm->when uid
1: 01642C89:0201 0C642C89:03FF 01 00000000:00000001 01:000071BA 00000000 0
1: 00000000:0801 00000000:0000 0A 00000000:00000000 00:00000000 6F000100 0
1: 00000000:0201 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0
...

There are two files available, one for IPv4 (/proc/net/udp) and one for IPv6 (/proc/net/udp6) - example output can be found below:

$ cat /proc/net/udp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops            
  564: 00000000:0801 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13276 2 0000000000000000 0        
  601: 00000000:E026 00000000:0000 07 00000000:00000000 00:00000000 00000000   118        0 12564 2 0000000000000000 0        
...
 1833: 00000000:9CF6 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13808 2 0000000000000000 0        
 1884: 00000000:ED29 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 1145241 2 0000000000000000 0      
$

$ cat /proc/net/udp6
  sl  local_address                         remote_address                        st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
    2: 00000000000000000000000000000000:ADCF 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13287 2 0000000000000000 0
   37: 00000000000000000000000000000000:9DF2 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000   104        0 10786 2 0000000000000000 0
...
 1820: 00000000000000000000000000000000:14E9 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000   104        0 10784 2 0000000000000000 0
 2024: 00000000000000000000000000000000:B5B5 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13796 2 0000000000000000 0
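
The interesting columns (rx_queue, tx_queue and drops) are hex- or decimal-encoded and awkward to read by eye. As a quick illustration, the minimal Java sketch below extracts them from /proc/net/udp; it is not the tool used later in this article, and the field positions are simply taken from the format shown above:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Minimal sketch: print local port, rx_queue, tx_queue and drops for every IPv4 UDP socket.
public class UdpQueueDump {

        public static void main(String[] args) throws IOException {
                BufferedReader reader = new BufferedReader(new FileReader("/proc/net/udp"));
                try {
                        reader.readLine();                                        // skip the header line
                        String line;
                        while ((line = reader.readLine()) != null) {
                                String[] f = line.trim().split("\\s+");
                                int port = Integer.parseInt(f[1].split(":")[1], 16); // local port is hex encoded
                                String[] queues = f[4].split(":");                   // "tx_queue:rx_queue", both hex
                                long txQueue = Long.parseLong(queues[0], 16);
                                long rxQueue = Long.parseLong(queues[1], 16);
                                long drops = Long.parseLong(f[f.length - 1]);        // last column: dropped datagrams
                                System.out.printf("port=%d rx_queue=%d tx_queue=%d drops=%d%n",
                                                port, rxQueue, txQueue, drops);
                        }
                } finally {
                        reader.close();
                }
        }
}

The same approach works for /proc/net/udp6; only the address field is wider.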


In addition I developed a simple Ruby script for parsing the UDP stats.
The full parameter set of the utility:

$ udpstat
Usage: udpstat [options]

Specific options:
    -6, --ipv6                       Enable IPv6 stat parsing
    -K, --kilobytes                  Present values in kilobytes
    -M, --megabytes                  Present values in megabytes
    -p, --ppid pid                   PID of the process to be monitored

Common options:
    -h, --help                       Show this message
    -v, --version                    Show version

    <interval>                       interval in which statistics are to be reported
    <count>                          the number of statistics samples to be shown

@2013 Krystian Brachmanski, Wroclaw


The example dump can be found below:

$ ./measurements/perftools/udpstat -6 1 2
   Timestamp       Local IP:port        Remote IP:port    rx[B]   rbuf[B]     tx[B]   tbuf[B]   drops
  1353956875 00000000000000000000000000000000:44495 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:40434 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:24105 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:36499 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:51186 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:2049  00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:111   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000001000000:123   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 000080FE00000000FFD724029C55BDFE:123   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:123   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:45593 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:612   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:5353  00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956875 00000000000000000000000000000000:46517 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:44495 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:40434 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:24105 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:36499 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:51186 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:2049  00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:111   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000001000000:123   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 000080FE00000000FFD724029C55BDFE:123   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:123   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:45593 00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:612   00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:5353  00000000000000000000000000000000:0            0  1048576        0  1048576       0
  1353956876 00000000000000000000000000000000:46517 00000000000000000000000000000000:0            0  1048576        0  1048576       0


You can find the utility on Google Code as part of the yapcg tool: http://code.google.com/p/yapcg/source/browse/measurements/perftools/udpstat

Testing the UDP buffers

Configuration of the test application (it is hard-coded in the code):
- UDP port: 9999
- IPv6 socket listening
- system settings (how these relate to the socket buffer size is illustrated in the sketch after this list):
     net.core.rmem_default = 1048576
     net.core.rmem_max = 1048576
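
How big a receive buffer the server socket actually gets can also be checked from Java. The snippet below is a minimal sketch of mine (the class name is arbitrary, port 9999 as in the test setup): it prints the default SO_RCVBUF, which should normally correspond to net.core.rmem_default (this is also what the rbuf column of udpstat shows), and then requests a larger buffer, which the kernel limits according to net.core.rmem_max:

import java.net.DatagramSocket;

public class RcvBufCheck {

        public static void main(String[] args) throws Exception {
                DatagramSocket socket = new DatagramSocket(9999);
                // default receive buffer - normally reflects net.core.rmem_default
                System.out.println("default SO_RCVBUF: " + socket.getReceiveBufferSize());
                // ask for 8 MB - on Linux the kernel limits the request based on
                // net.core.rmem_max (the value read back may be adjusted by the kernel)
                socket.setReceiveBufferSize(8 * 1024 * 1024);
                System.out.println("granted SO_RCVBUF: " + socket.getReceiveBufferSize());
                socket.close();
        }
}

If the granted size is smaller than expected, raising net.core.rmem_max and then requesting the buffer explicitly in the application is the usual remedy.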

1. Starting the Echo UDP server in the background:
$ java -cp ./target/echo-0.0.1-SNAPSHOT.jar org.krystianek.udp.test.echo.EchoServer &

2. Start udpstat in the background to monitor the IPv6 UDP sockets, reporting the statistics in kilobytes:
$ ./udpstat -6 -KB 1 100000 > udpstat.out 2>&1 &
...

3. Starting the client:
$ java -cp ./target/echo-0.0.1-SNAPSHOT.jar org.krystianek.udp.test.echo.EchoClient

4. Analysis of the results:

- if we take a look at the generated output:
$ cat udpstat.out
   Timestamp       Local IP:port        Remote IP:port     rx[KB]  rbuf[KB]    tx[KB]  tbuf[KB]   drops
...
  1358196897 00000000000000000000000000000000:9999  00000000000000000000000000000000:0            0     1024        0     1024       0
...
  1358196898 00000000000000000000000000000000:9999  00000000000000000000000000000000:0            8     1024        0     1024       0

...
  1358196902 00000000000000000000000000000000:9999  00000000000000000000000000000000:0           38     1024        0     1024       0

...
  1358196906 00000000000000000000000000000000:9999  00000000000000000000000000000000:0           68     1024        0     1024       0

...
  1358196909 00000000000000000000000000000000:9999  00000000000000000000000000000000:0           90     1024        0     1024       0

...
  1358196910 00000000000000000000000000000000:9999  00000000000000000000000000000000:0           97     1024        0     1024       0

...
  1358196917 00000000000000000000000000000000:9999  00000000000000000000000000000000:0          149     1024        0     1024       0

...
  1358196928 00000000000000000000000000000000:9999  00000000000000000000000000000000:0          230     1024        0     1024       0
...

  1358197036 00000000000000000000000000000000:9999  00000000000000000000000000000000:0         1024     1024        0     1024       9
  1358197037 00000000000000000000000000000000:9999  00000000000000000000000000000000:0         1024     1024        0     1024       8
  1358197038 00000000000000000000000000000000:9999  00000000000000000000000000000000:0         1024     1024        0     1024       7
  1358197039 00000000000000000000000000000000:9999  00000000000000000000000000000000:0         1024     1024        0     1024       7
...

As expected, the receive socket buffer starts to grow as soon as the UDP echo client starts, since the client transmits much faster than the server consumes (the server reads one datagram per second, while the client sends ten 64-byte datagrams per second). Note that rx_queue grows faster than the raw payload rate would suggest, because it accounts for the full kernel memory of each queued datagram (skb overhead included), not just the 64 bytes of payload. As soon as the receive buffer utilization reaches the maximum defined by the kernel parameters, packet drops start to appear (visible in the drops column of the last rows above).

Everything seems to be working as expected...

Conclusions

When designing an application that processes high volumes of data, always make sure that the thread reading from the socket is not doing any additional, unnecessary work, since that may cause the socket buffer utilization to grow and eventually lead to packet drops. If additional processing/decoding is necessary, handing it off to a thread pool can still solve the buffer issue.

Anyway, in my opinion the general rule of thumb should be to make the socket-reading layer as lightweight as possible. Starting from the single-threaded approach, one could introduce a queue to which the reader thread puts the payload, with a pool of worker threads reading from the queue and continuing with the processing; a sketch of this idea follows below.
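
Here is a minimal sketch of that reader/worker split (the class name, queue capacity and pool size are arbitrary choices of mine, not part of the test application): one thread only reads from the socket and enqueues payloads, while a small pool does the slow, 1-second "processing". Whether the kernel receive buffer actually stays empty of course depends on sizing the pool against the real processing time.

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical variant of the echo server: the reading thread only drains the socket
// and hands the payload to a bounded queue; the slow work happens in a worker pool.
public class QueuedEchoServer {

        public static void main(String[] args) throws Exception {
                final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<byte[]>(10000);
                DatagramSocket socket = new DatagramSocket(9999);

                ExecutorService workers = Executors.newFixedThreadPool(4);
                for (int i = 0; i < 4; i++) {
                        workers.execute(new Runnable() {
                                public void run() {
                                        try {
                                                while (true) {
                                                        byte[] payload = queue.take(); // blocks until data arrives
                                                        Thread.sleep(1000);            // the "slow" processing step
                                                        System.out.println("processed " + payload.length + " bytes");
                                                }
                                        } catch (InterruptedException ex) {
                                                Thread.currentThread().interrupt();
                                        }
                                }
                        });
                }

                byte[] buf = new byte[65507];
                while (true) {
                        DatagramPacket packet = new DatagramPacket(buf, buf.length);
                        socket.receive(packet);                        // fast path: read and hand off only
                        byte[] payload = new byte[packet.getLength()];
                        System.arraycopy(buf, 0, payload, 0, packet.getLength());
                        queue.put(payload);                            // blocks (back-pressure) if the workers fall behind
                }
        }
}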


