[Elecraft] Why the internet (and the elecraft list) is slow

[email protected] [email protected]
Wed Jan 29 15:31:17 2003



On Wed, 29 Jan 2003, Robert C. Abell wrote:

> What has happened to the E-mail list for the last two days?
> Bob  VE3XM

As many of you may know, at 0530GMT on Saturday, 25 January 2003, an
internet worm attack that has been dubbed "SQL Slammer" and "Sapphire" was
launched.  This attack exploited a buffer overflow vulnerability in
Microsoft SQL Server 2000 with a patch level prior to SP3. (Not to be
confused with Windows 2000 SP3, "Service Pack 3", which is a patch for the
Windows 2000 operating system but NOT for SQL Server 2000 - the source of
much confusion as delinquent SQL server operators tried to patch their now
compromised servers, something that they should have done MONTHS earlier
when the vulnerability was announced.)

Members of the NSP-SEC (Network Service Provider Security) community
identified this attack very quickly and many providers had filters in
place as soon as 0550GMT. UUNet officially announced to the NSP-SEC
community that they had implemented filters to mitigate this attack at
0625GMT. These filters, in their simplest form, will block ANY traffic
destined for ANY location with a destination port of 1434 UDP.  By Sunday
afternoon, all but the most inept of ISPs/NSPs had implemented some
permutation of the filter described above in an effort to (a) protect their
clients from the rest of the world, and (b) protect the rest of the world
from their clients.

The problems with latency, inability to access websites, and mailing list
delay that some people have noted are caused both by the attack itself
(which is still ongoing) and by a side effect of some of the filters being
used.

The attack itself consists of a single 404-byte packet destined for
UDP:[VICTIM IP]:1434.  If received by a vulnerable host (one running
Microsoft SQL Server 2000 at a patch level prior to SP3), that host is
infected and will immediately start trying to spread itself to other
servers.  It does this by sending the same 404-byte UDP packet to random
IP addresses at a rate of about 4700 per second for the first second,
settling down to around 2600 per second thereafter.

The sheer number of packets per second produced by this worm was enough to
cause some core routing and switching equipment to fail under load.  This
was more typical of flow-based switching systems than packet-switched
systems because, as a result of the completely random target addresses
generated by the worm, the flow switching systems had to track at minimum
2000+ new flows per second.
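
To make the flow switching problem concrete, here is a minimal Python
sketch.  It is not how any particular vendor's equipment works: the flow
key (source, destination, protocol, destination port), the example
addresses and the function names are all assumptions made for
illustration.  It only shows that random destinations turn nearly every
packet into a brand new flow entry.

```python
import random

def count_new_flows(packets):
    """Count how many packets force a new entry into a flow table."""
    flow_table = set()
    new_flows = 0
    for flow_key in packets:
        if flow_key not in flow_table:
            flow_table.add(flow_key)
            new_flows += 1
    return new_flows

def worm_packets(n, src_ip="192.0.2.10", seed=0):
    """n packets from one infected host to random 32-bit destinations."""
    rng = random.Random(seed)
    return [(src_ip, rng.getrandbits(32), "udp", 1434) for _ in range(n)]

def normal_packets(n, src_ip="192.0.2.10"):
    """n packets that all belong to one ordinary conversation."""
    return [(src_ip, "198.51.100.7", "tcp", 80)] * n

if __name__ == "__main__":
    # One second of traffic at the sustained rate quoted above.
    print("normal:", count_new_flows(normal_packets(2600)))
    print("worm:  ", count_new_flows(worm_packets(2600)))
```

The ordinary conversation costs the switch one flow entry no matter how
many packets it carries, while the worm traffic costs roughly one entry
per packet.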

Beyond the problem outlined with flow switching, providers, or rather their
networks, had to deal with both extremely high packets-per-second compared
to typical operation and extremely high bits-per-second.  Each
single-attached (single network interface) SQL server that was infected was
generating on average 8.4Mb/s of traffic.  For those of you unfamiliar
with data rates, it would require 6 T1 circuits to carry that traffic, all
from a _single_ infected host.  With a search rate of about 2600 attempts
per second from each infected host, the worm spread very quickly.  It is
estimated that as of Sunday morning, there were as many as 200,000
infected hosts.
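
For anyone who wants to check the per-host figures, here is a short
Python sketch of the arithmetic.  The packet size and packet rate come
from the description above; the T1 line rate of 1.544Mb/s is not stated
in this email and is supplied here as an assumption, as are the function
names.

```python
import math

PACKET_BYTES = 404               # size of the worm packet, from the text
PACKETS_PER_SECOND = 2600        # sustained send rate, from the text
T1_BITS_PER_SECOND = 1_544_000   # assumption: standard T1 line rate

def host_bits_per_second(packet_bytes=PACKET_BYTES, pps=PACKETS_PER_SECOND):
    """Traffic generated by one infected host, in bits per second."""
    return packet_bytes * 8 * pps

def t1_circuits_needed(bits_per_second):
    """Whole T1 circuits needed to carry the given rate."""
    return math.ceil(bits_per_second / T1_BITS_PER_SECOND)

if __name__ == "__main__":
    bps = host_bits_per_second()
    print(bps)                      # 8403200, i.e. about 8.4Mb/s
    print(t1_circuits_needed(bps))  # 6
```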

200,000 infected hosts x 8.4Mb/s of traffic each = 1.68 Terabits per
second of traffic generated by this attack.  That is traffic IN ADDITION
TO the normal traffic like little Suzy downloading her favorite album in
MP3 format or little Joey surfing PR0N until his little 13-y/o heart goes
into arrest.  As you may have guessed, this type of additional traffic
causes serious problems.
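
The aggregate number is simple multiplication.  Here it is as a small
Python sketch; both inputs are the estimates quoted above and the
function name is only illustrative.

```python
INFECTED_HOSTS = 200_000              # upper estimate from the text
BITS_PER_SECOND_PER_HOST = 8_400_000  # 8.4Mb/s per host, from the text

def aggregate_terabits_per_second(hosts=INFECTED_HOSTS,
                                  per_host_bps=BITS_PER_SECOND_PER_HOST):
    """Total worm traffic in terabits per second (1 Tb = 10**12 bits)."""
    return hosts * per_host_bps / 1e12

if __name__ == "__main__":
    print(aggregate_terabits_per_second())  # 1.68
```

Keep in mind this is a ceiling that assumes every infected host could
actually push its full rate into the network, which congested links
would not allow.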

Now, as if the packets-per-second and bits-per-second overload were not
enough, we have an additional problem to deal with that is a side effect
of some of the filters in place to mitigate the attack.

In TCP/IP, you have Privileged Ports in the range of 0 - 1023 and
Non-Privileged Ports in the range of 1024 - 65535. "Services" or processes
necessary to normal operation generally use Privileged Ports while "user"
programs will use Non-Privileged Ports.  This presents a problem when
"Service Providers" (NSPs/ISPs) attempt to use very coarse-grained filters
to mitigate an attack such as the one we are discussing.
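
As a quick illustration of the split, here is a tiny Python sketch.  The
function name is made up for this example.  The point is that 1434 falls
on the "user" side of the line, so ordinary client programs can end up
using it as a local port.

```python
FIRST_NON_PRIVILEGED_PORT = 1024  # ports below this are privileged

def port_class(port):
    """Classify a TCP or UDP port number as described above."""
    if port < FIRST_NON_PRIVILEGED_PORT:
        return "privileged"
    return "non-privileged"

if __name__ == "__main__":
    print(53, port_class(53))      # DNS server port
    print(1434, port_class(1434))  # the port the filters block
```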

As discussed above, a very simple filter (and the one in use at a great
majority of NSPs worldwide) simply blocks ANY UDP packet destined for
port 1434.

This filter may block legitimate communications though.  In the most
common case, the user's PC does a DNS query to find the IP address of their
favorite website (www.elecraft.com for example).  Their PC picks a
Non-Privileged LOCAL port, 1434 for instance, and sends a DNS query from
that port to UDP port 53 of their configured DNS server.  The DNS
server replies with the address of www.elecraft.com (63.249.74.212) with a
UDP packet from port 53 of the server to port 1434 of the user's machine.
OOPS!  That packet will never make it back to the user because we're
blocking all packets destined for UDP/1434.  Now the user's machine has to
retry its query to the DNS server and hopefully, it will pick a new local
port to query from. (It will in 90% of the cases.) This is only the most
common of problems that will affect the largest set of users.  Many other
common internet services use UDP as well.
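
Here is the DNS case as a minimal Python sketch.  It is a toy model, not
real router configuration: the Packet type and the function names are
invented for this example, and the filter is the simplest form described
above (drop anything with a UDP destination port of 1434).

```python
from typing import NamedTuple

class Packet(NamedTuple):
    proto: str
    src_port: int
    dst_port: int

def coarse_filter_drops(pkt):
    """Simplest form of the filter: drop ANY UDP packet to port 1434."""
    return pkt.proto == "udp" and pkt.dst_port == 1434

def dns_reply_for(query):
    """The reply swaps the ports: from server port 53 back to the client."""
    return Packet("udp", src_port=query.dst_port, dst_port=query.src_port)

def lookup_succeeds(local_port):
    """True if both the query and its reply get past the filter."""
    query = Packet("udp", src_port=local_port, dst_port=53)
    if coarse_filter_drops(query):
        return False
    return not coarse_filter_drops(dns_reply_for(query))

if __name__ == "__main__":
    print(lookup_succeeds(1434))  # False: the reply is headed for UDP/1434
    print(lookup_succeeds(1435))  # True: a retry from another port works
```

The query itself gets out fine because it is destined for port 53.  It
is only the reply, which is destined for the client's local port, that
the filter eats.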

On Monday morning (and ever since), many people found that they could no
longer communicate with their remote SQL servers.  This is because every
provider worth anything is blocking UDP/1434, which MS SQL Server 2000
clients use to ask the server how to connect (which instance, port and
protocol) for the session.  OOPS!  No more database server until either
they convince their
provider (and every other provider between them and their server) to drop
the filters or they implement a VPN from their location to the location of
their SQL server. (This is something that should have been done in the
first place but, then again, we don't have a licensing requirement to use
the internet, YET, so we have to deal with fools and idiots who wouldn't
know proper network security if it came up and bit them in the
rump! <sigh>)

The inability to communicate with the SQL server by a front-end server
(like mailman.qth.net) will lead to problems in authentication, content
delivery, mailing list function and management, etc.

A combination of all of the above, along with other factors not mentioned
for a variety of reasons, is the cause of probably 90% of the slowdowns
or "not working" issues that people have reported since Saturday.

I hope that at least someone will have taken the time to read through this
email and it has helped explain what some of the current problems
are.  Perhaps that one person won't beat up their NSP/ISP the next time
they can't load a webpage realizing that perhaps the provider is up to
their armpits trying to deal with an attack.

BTW: Pretty much 24x7x365, there is an attack ongoing.  Some are larger
than others, as the "SQL Slammer" has demonstrated.  ALL of the attacks
require the tireless efforts of the people on the front lines.  It ain't
all fun and games.  Not even 50% of it.

73 de K4WTF

---
John Fraizer              | High-Security Datacenter Services |
President                 | Dedicated circuits 64k - 155M OC3 |
EnterZone, Inc            | Virtual, Dedicated, Colocation    |
http://www.enterzone.net/ | Network Consulting Services       |