Enlarge TCP window to boost network transfer speed

Procedure for raising network limits under Solaris

All system TCP parameters are set with the ‘ndd’ tool (man 1 ndd). Parameter values can be read with:

  ndd /dev/tcp [parameter]

and set with: 

  ndd -set /dev/tcp [parameter] [value]

RFC1323 timestamps, window scaling and RFC2018 SACK should be enabled by default. You can double check that these are correct:

  ndd /dev/tcp tcp_wscale_always  #(should be 1)
  ndd /dev/tcp tcp_tstamp_if_wscale  #(should be 1)
  ndd /dev/tcp tcp_sack_permitted  #(should be 2)
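
The three checks above can be wrapped in a small loop that flags any value differing from the expected one. This is only a sketch: the helper name `report` is made up for illustration, and the loop runs only where an `ndd` binary is found (i.e. on Solaris).

```shell
# Hypothetical helper: compare one value against the expected setting.
# $1 = parameter name, $2 = expected value, $3 = value actually read
report() {
    if [ "$2" = "$3" ]; then
        echo "$1 ok ($3)"
    else
        echo "$1 MISMATCH: expected $2, got ${3:-nothing}"
    fi
}

# Only query the kernel where ndd exists.
if command -v ndd >/dev/null 2>&1; then
    for pair in tcp_wscale_always:1 tcp_tstamp_if_wscale:1 tcp_sack_permitted:2; do
        name=${pair%%:*}    # text before the colon
        want=${pair##*:}    # text after the colon
        report "$name" "$want" "$(ndd /dev/tcp "$name" 2>/dev/null)"
    done
fi
```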

Set the maximum (send or receive) TCP buffer size an application can request:

  ndd -set /dev/tcp tcp_max_buf 4000000

Set the maximum congestion window:

  ndd -set /dev/tcp tcp_cwnd_max 4000000

Set the default send and receive buffer sizes:

  ndd -set /dev/tcp tcp_xmit_hiwat 4000000
  ndd -set /dev/tcp tcp_recv_hiwat 4000000

Procedure for raising network limits for Windows XP (and Windows 2000)

The easiest way to tune TCP under Windows XP (and many earlier versions of Windows) is to get DrTCP from “DSL Reports” [download page]. Set the “Tcp receive window” to your computed BDP (e.g. 400000), turn on “Window Scaling” and “Selective Acks”. If you expect to use 90 Mb/s or faster, you should also turn on “Time Stamping”. You must restart for the changes to take effect.
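
The BDP (bandwidth-delay product) is the path bandwidth multiplied by the round-trip time. A minimal sketch of the arithmetic follows. The helper name `bdp_bytes` is hypothetical, and the 100 Mb/s and 32 ms figures are illustrative values chosen because they reproduce the 400000-byte example.

```shell
# Hypothetical helper: bandwidth-delay product in bytes.
# $1 = path bandwidth in Mb/s, $2 = round-trip time in milliseconds
bdp_bytes() {
    # Mb/s -> bytes/s, then multiply by the RTT expressed in seconds
    echo $(( $1 * 1000000 / 8 * $2 / 1000 ))
}

bdp_bytes 100 32     # 100 Mb/s path with a 32 ms round trip: prints 400000
```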

If you need to get down in the details, you have to use the ‘regedit’ utility to read and set system parameters. If you are not familiar with regedit you may want to follow the step-by-step instructions [here].

BEWARE: Mistakes with regedit can have very serious consequences that are difficult to correct. You are strongly encouraged to back up the entire registry before you start (use the backup utility) and to export the Tcpip\Parameters subtree to a file, so you can put things back if you need to (use “export” under regedit).

The primary TCP tuning parameters appear in the registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters.

To enable high performance TCP you must turn on RFC1323 features (create REG_DWORD key “Tcp1323Opts” with value 3) and set the maximum TCP buffer size (create REG_DWORD key “GlobalMaxTcpWindowSize” with an appropriate value such as 4000000, decimal).
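
One way to keep the two settings reviewable is to write them out as a .reg fragment before importing anything. The sketch below only prints text and does not touch a registry. The helper name `tcp_reg_fragment` is hypothetical, and the “Version 5.00” header is an assumption about what regedit on Windows 2000/XP expects. Note that .reg files store DWORD values in hexadecimal, so the decimal 4000000 appears as 003d0900.

```shell
# Hypothetical helper: print a .reg fragment for the two values named above.
# $1 = GlobalMaxTcpWindowSize in bytes (decimal)
tcp_reg_fragment() {
    printf '%s\n' 'Windows Registry Editor Version 5.00' ''
    printf '%s\n' '[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]'
    printf '"Tcp1323Opts"=dword:%08x\n' 3          # window scaling + timestamps
    printf '"GlobalMaxTcpWindowSize"=dword:%08x\n' "$1"
}

tcp_reg_fragment 4000000
```

Treat the output as a draft to inspect, and export the existing subtree first as advised above.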

If you want to set the system wide default buffer size, create REG_DWORD key “TcpWindowSize” with an appropriate value. This parameter can also be set per interface at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\interfaceGUID, which may help to protect interactive applications that are using different interfaces from the effects of overbuffering.

For the most up to date detailed technical information, go to the Microsoft knowledge base (at support.microsoft.com) and search product “Windows XP” for “TCP/IP performance tuning”.

Speedguide summarizes this material with an intermediate level of detail; however, its target audience is users of relatively low data rates.

There is also a very good page on tuning Windows XP, by Carl Harris at Virginia Tech.

Procedure for raising network limits under FreeBSD

All system parameters can be read or set with ‘sysctl’. E.g.:

sysctl [parameter]
sysctl -w [parameter]=[value]

You can raise the maximum socket buffer size by, for example:

	sysctl -w kern.ipc.maxsockbuf=4000000

FreeBSD 7.0 implements automatic receive and send buffer tuning, which is enabled by default. The default maximum value is 256KB, which is likely too small. It should likely be increased, e.g. as follows:
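
A minimal sketch of raising the autotuning ceiling, assuming the limits are the net.inet.tcp.sendbuf_max and net.inet.tcp.recvbuf_max sysctl variables and that 16 MB is a suitable value (neither is stated above). The helper `autotune_cmds` is hypothetical and only prints the commands, so they can be reviewed before being run as root on the FreeBSD host.

```shell
# Assumed value: a 16 MB ceiling. Pick one that fits your BDP.
max_buf=16777216

# Hypothetical helper: print the commands rather than executing them.
# $1 = new maximum buffer size in bytes
autotune_cmds() {
    for v in net.inet.tcp.sendbuf_max net.inet.tcp.recvbuf_max; do
        echo "sysctl -w $v=$1"
    done
}

autotune_cmds "$max_buf"
```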


You can also set the TCP and UDP default buffer sizes using the variables
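
On a stock FreeBSD system the variables in question are most likely net.inet.tcp.sendspace, net.inet.tcp.recvspace and net.inet.udp.recvspace. That is an assumption, since the names are not listed above. The sketch below reads each one where it exists; `show_sysctl` is a hypothetical helper.

```shell
# Hypothetical helper: print one sysctl value, or note that it is missing.
show_sysctl() {
    val=$(sysctl -n "$1" 2>/dev/null) || val=
    echo "$1 = ${val:-unavailable}"
}

# Assumed variable names (not listed above).
for v in net.inet.tcp.sendspace net.inet.tcp.recvspace net.inet.udp.recvspace; do
    show_sysctl "$v"
done
```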


When using larger socket buffers, you probably need to make sure that the TCP window scaling option is enabled. (The default is not enabled!) Check for tcp_extensions="YES" in /etc/rc.conf and ensure it’s enabled via the sysctl variable:
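
A short check, assuming the variable meant here is net.inet.tcp.rfc1323 (the name is not given above). It reports the current state and, when the feature is off, prints the command that would turn it on.

```shell
# Assumed variable name: net.inet.tcp.rfc1323
val=$(sysctl -n net.inet.tcp.rfc1323 2>/dev/null) || val=
case "$val" in
    1) status="enabled" ;;
    0) status="disabled; enable with: sysctl -w net.inet.tcp.rfc1323=1" ;;
    *) status="unknown (read '${val}')" ;;
esac
echo "RFC1323 window scaling and timestamps: $status"
```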


FreeBSD’s TCP has a feature called “inflight limiting” turned on by default, which can be detrimental to TCP throughput in some situations. If you want “normal” TCP behavior you should disable it:

	sysctl -w net.inet.tcp.inflight_enable=0

You may also want to confirm that SACK is enabled (working since FreeBSD 5.3):
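
A one-line check, assuming the variable is net.inet.tcp.sack.enable (the name is not given above). A value of 1 means SACK is on.

```shell
# Assumed variable name: net.inet.tcp.sack.enable
val=$(sysctl -n net.inet.tcp.sack.enable 2>/dev/null) || val=
sack=${val:-unavailable}
echo "net.inet.tcp.sack.enable = $sack"
```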


MTU discovery is on by default in FreeBSD. If you wish to disable MTU discovery, you can toggle it with the sysctl variable:
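
A sketch of the toggle, assuming the variable is net.inet.tcp.path_mtu_discovery (the name is not given above). The change is applied only when run as root on FreeBSD. Anywhere else the script just prints the command it would run.

```shell
# Assumed variable name: net.inet.tcp.path_mtu_discovery
pmtud=net.inet.tcp.path_mtu_discovery

if [ "$(uname -s)" = "FreeBSD" ] && [ "$(id -u)" -eq 0 ]; then
    sysctl -w "$pmtud=0"          # 0 disables; set back to 1 to re-enable
else
    echo "would run: sysctl -w $pmtud=0"
fi
```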


Tuning TCP for Linux 2.4 and 2.6

NB: Recent versions of Linux (version 2.6.17 and later) have full autotuning with 4 MB maximum buffer sizes. Except in some rare cases, manual tuning is unlikely to substantially improve the performance of these kernels over most network paths, and is not generally recommended.

Since autotuning and large default buffer sizes were released progressively over a succession of different kernel versions, it is best to inspect and only adjust the tuning as needed. When you upgrade kernels, you may want to consider removing any local tuning.

All system parameters can be read or set by accessing special files in the /proc file system. E.g.:

	cat /proc/sys/net/ipv4/tcp_moderate_rcvbuf

If the parameter tcp_moderate_rcvbuf is present and has value 1 then autotuning is in effect. With autotuning, the receiver buffer size (and TCP window size) is dynamically updated (autotuned) for each connection. (Sender side autotuning has been present and unconditionally enabled for many years now).
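
The “present and has value 1” test can be scripted as below. This is a minimal sketch that reads the file without changing anything; the variable name `autotune` is just for illustration.

```shell
f=/proc/sys/net/ipv4/tcp_moderate_rcvbuf

# Autotuning is in effect only if the file exists and contains 1.
if [ -r "$f" ] && [ "$(cat "$f")" = "1" ]; then
    autotune=on
else
    autotune=off_or_absent
fi
echo "receiver autotuning: $autotune"
```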

The per connection memory space defaults are set with two three-element arrays:

	/proc/sys/net/ipv4/tcp_rmem       - memory reserved for TCP rcv buffers
	/proc/sys/net/ipv4/tcp_wmem       - memory reserved for TCP snd buffers

These are arrays of three values: minimum, initial and maximum buffer size. They are used to set the bounds on autotuning and balance memory usage while under memory stress. Note that these are controls on the actual memory usage (not just TCP window size) and include memory used by the socket data structures as well as memory wasted by short packets in large buffers. The maximum values have to be larger than the BDP of the path by some suitable overhead.
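
To see the three fields with labels rather than as a bare row of numbers, something like the following can help. It is a read-only sketch, and `show_triple` is a hypothetical helper name.

```shell
# Hypothetical helper: label the three fields of a "min initial max" string.
# $1 = label, $2 = the triple as one whitespace-separated string
show_triple() {
    set -- "$1" $2            # re-split: $2..$4 become the three numbers
    echo "$1: min=$2 initial=$3 max=$4"
}

for name in tcp_rmem tcp_wmem; do
    f=/proc/sys/net/ipv4/$name
    if [ -r "$f" ]; then
        show_triple "$name" "$(cat "$f")"
    fi
done
```

The maximum printed here is the figure to compare against your BDP plus overhead.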

With autotuning, the middle value just determines the initial buffer size. It is best to set it to some optimal value for typical small flows. With autotuning, excessively large initial buffers waste memory and can even hurt performance.

If autotuning is not present (Linux 2.4 before 2.4.27 or Linux 2.6 before 2.6.7), you may want to get a newer kernel. Alternatively, you can adjust the default socket buffer size for all TCP connections by setting the middle tcp_rmem value to the calculated BDP. This is NOT recommended for kernels with autotuning. Since the sending side is autotuned, this is never recommended for tcp_wmem.

The maximum buffer size that applications can request (the maximum acceptable values for SO_SNDBUF and SO_RCVBUF arguments to the setsockopt() system call) can be limited with /proc variables:

	/proc/sys/net/core/rmem_max       - maximum receive window
	/proc/sys/net/core/wmem_max       - maximum send window

The kernel sets the actual memory limit to twice the requested value (effectively doubling rmem_max and wmem_max) to provide for sufficient memory overhead. You do not need to adjust these unless you are planning to use some form of application tuning.
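
The effect of the cap and the doubling can be worked through with plain arithmetic. The sketch assumes the request is first capped at rmem_max (or wmem_max) and then doubled; that ordering matches my understanding of the kernel but is not spelled out above. It also ignores any kernel-imposed minimum. `effective_limit` is a hypothetical helper.

```shell
# Hypothetical helper: memory limit the kernel ends up with for one socket.
# $1 = size requested via setsockopt(), $2 = rmem_max or wmem_max
effective_limit() {
    req=$1
    if [ "$req" -gt "$2" ]; then
        req=$2                 # request is capped at the configured maximum
    fi
    echo $(( req * 2 ))        # then doubled to allow for overhead
}

effective_limit 4000000 108544   # capped: prints 217088
effective_limit 65536 108544     # not capped: prints 131072
```

This is why an application asking for a 4 MB buffer gets far less unless rmem_max and wmem_max are raised first.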

NB: Manually adjusting socket buffer sizes with setsockopt() disables autotuning. Applications that are optimized for other operating systems may implicitly defeat Linux autotuning.

The following values (which are the defaults for 2.6.17 with more than 1 GByte of memory) would be reasonable for all paths with a 4MB BDP or smaller (you must be root):

	echo 1 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf
	echo 108544 > /proc/sys/net/core/wmem_max
	echo 108544 > /proc/sys/net/core/rmem_max
	echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
	echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem

Do not adjust tcp_mem unless you know exactly what you are doing. This array (in units of pages) determines how the system balances the total network buffer space against all other LOWMEM memory usage. The three elements are initialized at boot time to appropriate fractions of the available system memory.

You do not need to adjust rmem_default or wmem_default (at least not for TCP tuning). These are the default buffer sizes for non-TCP sockets (e.g. unix domain and UDP sockets).

All standard advanced TCP features are on by default. You can check them by:

	cat /proc/sys/net/ipv4/tcp_timestamps
	cat /proc/sys/net/ipv4/tcp_window_scaling
	cat /proc/sys/net/ipv4/tcp_sack

Linux supports both /proc and sysctl (using alternate forms of the variable names, e.g. net.core.rmem_max) for inspecting and adjusting network tuning parameters. The following is a useful shortcut for inspecting all tcp parameters:

sysctl -a | fgrep tcp
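
The two naming forms map onto each other mechanically: drop the /proc/sys/ prefix and turn the slashes into dots. A small sketch follows; `proc_to_sysctl` is a hypothetical helper, and it does not handle names that themselves contain dots, such as some interface names.

```shell
# Hypothetical helper: convert a /proc/sys path to its sysctl name.
proc_to_sysctl() {
    echo "${1#/proc/sys/}" | tr '/' '.'
}

proc_to_sysctl /proc/sys/net/core/rmem_max    # prints net.core.rmem_max
proc_to_sysctl /proc/sys/net/ipv4/tcp_rmem    # prints net.ipv4.tcp_rmem
```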

For additional information on kernel variables, look at the documentation included with your kernel source, typically in some location such as /usr/src/linux-<version>/Documentation/networking/ip-sysctl.txt. There is a very good (but slightly out of date) tutorial on network sysctls at http://ipsysctl-tutorial.frozentux.net/ipsysctl-tutorial.html.

If you would like these changes to be preserved across reboots, you can add the tuning commands to the file /etc/rc.d/rc.local.

Autotuning was prototyped under the Web100 project. Web100 also provides complete TCP instrumentation and some additional features to improve performance on paths with very large BDP.

Tuning TCP for Mac OS X

Mac OS X has a single sysctl parameter, kern.ipc.maxsockbuf, to set the maximum combined buffer size for both sides of a TCP (or other) socket. In general, it can be set to at least twice the BDP. E.g.:

sysctl -w kern.ipc.maxsockbuf=8000000

The default send and receive buffer sizes can be set using the following sysctl variables:

sysctl -w net.inet.tcp.sendspace=4000000
sysctl -w net.inet.tcp.recvspace=4000000

If you would like these changes to be preserved across reboots you can edit /etc/sysctl.conf.
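
Entries in /etc/sysctl.conf are plain name=value lines, without the `sysctl -w` prefix. The sketch below prints the three lines for a given BDP, applying the “twice the BDP” rule to kern.ipc.maxsockbuf. The helper `macos_sysctl_conf` is hypothetical, and the output is meant to be reviewed before it is added to the file by hand.

```shell
# Hypothetical helper: print sysctl.conf lines for a given BDP in bytes.
macos_sysctl_conf() {
    bdp=$1
    echo "kern.ipc.maxsockbuf=$(( bdp * 2 ))"   # combined limit: twice the BDP
    echo "net.inet.tcp.sendspace=$bdp"
    echo "net.inet.tcp.recvspace=$bdp"
}

macos_sysctl_conf 4000000
```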

RFC1323 features are supported and on by default. SACK is present and enabled by default in OS X version 10.4.6.

Although we have never tested it, there is a commercial product to tune TCP on Macintoshes. The URL is http://www.sustworks.com/products/prod_ottuner.html. I don’t endorse the product they are selling (since I’ve never tried it). However, it is available for a free trial, and they appear to do an excellent job of describing perf-tune issues for Macs.

Pasted from http://www.psc.edu/networking/projects/tcptune/
