JGroups / JGRP-510

NAKACK: adjust retransmission times based on statistics


    • Type: Feature Request
    • Status: Resolved
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 2.6


      NAKACK can maintain a rolling average of the time it takes for a missing message to get retransmitted (time between sending the XMIT-REQ and reception of the missing message). This can be done per sender.
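
      A minimal sketch of such a per-sender rolling average is shown below. It is not JGroups code: the class names RollingAverage and XmitTimeTracker, the string sender key, and the fixed sample window are all assumptions made for illustration. The sketch also assumes the clock starts at the first retransmission request for a given message.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: rolling average over the last `window` samples. */
final class RollingAverage {
    private final long[] samples;
    private int count;   // number of valid samples (at most samples.length)
    private int next;    // slot that the next sample overwrites
    private long sum;    // sum of the valid samples

    RollingAverage(int window) {
        if (window <= 0) throw new IllegalArgumentException("window must be positive");
        samples = new long[window];
    }

    void add(long millis) {
        if (count == samples.length) sum -= samples[next]; // evict the oldest sample
        else count++;
        samples[next] = millis;
        sum += millis;
        next = (next + 1) % samples.length;
    }

    /** Returns NaN until at least one sample has been recorded. */
    double average() {
        return count == 0 ? Double.NaN : (double) sum / count;
    }
}

/** Hypothetical tracker: time from retransmission request to receipt, per sender. */
final class XmitTimeTracker {
    private final int window;
    private final Map<String, RollingAverage> perSender = new HashMap<>();
    private final Map<String, Long> pending = new HashMap<>(); // "sender#seqno" -> request time

    XmitTimeTracker(int window) { this.window = window; }

    /** Call when a retransmission request for (sender, seqno) is sent. */
    void requestSent(String sender, long seqno, long nowMillis) {
        // Assumption: keep the time of the first request if it is repeated.
        pending.putIfAbsent(sender + "#" + seqno, nowMillis);
    }

    /** Call when the missing message finally arrives. */
    void messageReceived(String sender, long seqno, long nowMillis) {
        Long sentAt = pending.remove(sender + "#" + seqno);
        if (sentAt == null) return; // the message was never requested
        perSender.computeIfAbsent(sender, s -> new RollingAverage(window))
                 .add(nowMillis - sentAt);
    }

    double averageFor(String sender) {
        RollingAverage avg = perSender.get(sender);
        return avg == null ? Double.NaN : avg.average();
    }
}
```

      A real implementation would key on the member address rather than a string and would need to be thread-safe; both are left out here to keep the sketch short.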

      The retransmit_timeout values can be derived from the rolling average, which makes them completely dynamic: instead of a fixed set of retransmit timeouts, we define only a single initial retransmit_timeout, e.g. 30 ms.

      If a missing message has not arrived after the first retransmission request, we simply double the timeout (exponential backoff) until a maximum limit is reached. For each successfully retransmitted message, we reduce the timeout linearly. This is similar to slow start in TCP.
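
      The doubling and linear reduction described above could be sketched as follows. The class BackoffTimeout and its method names are hypothetical, and the maximum (200 ms) and linear step (10 ms) used in the usage note are assumed values; the issue only gives 30 ms as an example initial timeout.

```java
/** Hypothetical sketch: exponential backoff on failure, linear decrease on success. */
final class BackoffTimeout {
    private final long min;   // lower bound in ms
    private final long max;   // upper bound in ms
    private final long step;  // linear decrement in ms per successful retransmission
    private long current;     // current timeout in ms

    BackoffTimeout(long initial, long min, long max, long step) {
        if (min <= 0 || max < min || step <= 0 || initial < min || initial > max) {
            throw new IllegalArgumentException("inconsistent timeout bounds");
        }
        this.current = initial;
        this.min = min;
        this.max = max;
        this.step = step;
    }

    long current() { return current; }

    /** The missing message did not arrive within the timeout: double, capped at max. */
    void onRetransmitFailed() {
        current = Math.min(max, current * 2);
    }

    /** The missing message arrived: shrink by one step, floored at min. */
    void onRetransmitSucceeded() {
        current = Math.max(min, current - step);
    }
}
```

      With new BackoffTimeout(30, 30, 200, 10), three consecutive failures move the timeout from 30 to 60, 120 and then 200 (capped), and one success afterwards lowers it to 190.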

      When we discover that the rolling average is less than 30 ms (e.g. 4 ms), we lower the retransmission timeout to that value plus possibly a safety margin (say 2 ms), for a total of 6 ms. This makes message retransmission faster, since we no longer have to wait 30 ms before asking for retransmission.

      OTOH, if the rolling average increases, we can also increase our retransmission timeout, to avoid overloading the network with spurious retransmission requests.
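
      Deriving the timeout from the rolling average, in both directions, could look like the sketch below. The helper AdaptiveTimeout.fromAverage is hypothetical, and the clamping bounds are an assumption added so that the timeout can neither collapse to zero nor grow without limit.

```java
/** Hypothetical sketch: retransmission timeout = rolling average + safety margin, clamped. */
final class AdaptiveTimeout {
    private AdaptiveTimeout() {}

    static long fromAverage(double avgMillis, long safetyMillis, long minMillis, long maxMillis) {
        // Round the average up so the timeout never undercuts the observed time.
        long candidate = (long) Math.ceil(avgMillis) + safetyMillis;
        return Math.max(minMillis, Math.min(maxMillis, candidate));
    }
}
```

      For the numbers in the description, fromAverage(4.0, 2, 1, 1000) yields 6 ms, while an average of 45 ms with the same margin yields 47 ms.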

      RESULT: message retransmission is more dynamic:

      • Retransmit timeout is per sender
      • When retransmission response times are low, we ask for retransmission sooner, which increases message rates
      • When they are high, we lengthen retransmission timeouts (and use exponential backoff), which reduces traffic

              • Assignee:
                belaban Bela Ban


                • Created: