Thursday, September 8, 2011

Visualizing and Investigating a Distributed Scan

I was checking my logs and trying out a new app I created for Splunk. Normally my graph looks like the one below, with the exception of one host that I recently added that was wildly dropping packets on the left. At the bottom-right, there are typically some bursts, but for the most part the averages are low. The scatterplots are fairly tight and low volume. For this packet length average vs. summation plot, most things cluster at the bottom left (low volume), occasionally blip into one of the upper-left (lots of low, slow) or lower-right (a few large packets) corners, and rarely pop-up in the upper-right (lots of big packets, blasting) quadrant. Here is an example:

Normal Dashboard

The Bird is the Word

Today I check the dashboard and the left side (ingress) looked remarkably different. The averages were pretty normal, but there was a steady baseline at the bottom. Also, the summation of dropped packets was much higher than the 'normal' sampling. The scatterplot also diverged more in the ~40 byte range, with a sample of hosts clearly engaged in a lot of small transactions.  This peaqued my interest.

Abby Normal

And Now It's Time for Breakdown...

Investigating the increase in ingress summation (the chart at the bottom-right), I clicked the 'View Results' link on that chart. I then modified it slightly to reduce the number of extracted fields. This improved the search speed remarkably. I also eliminated the aggregate summing the 'timechart' command performs by default after the top 10, which returned all of the hosts. The 'limit=0' parameter achieved this. Now I had a chart that I could graph on a line chart and analyze more closely.


Prep for launch...

I adjusted the default settings slightly to treat null values as zero and added some axis labels. TIP: If you hover the mouse over one of the IP addresses in legend, Splunk highlights the line for that selection. I noticed the 58.218.199.0/24 IP addresses all have that matching bumpity-bump along the bottom, low and slow. This is pretty good evidence of an undiversified distributed scan. I decided to generate some better metrics.

Launch! Graph of FW drops over a timespan.

Which of these is Not Like the Others?

Looking at the rejected packet meta, I wanted to determine just how alike the packets were that were dropping from these hosts. I limited the search to just that subnet with a straight text search. Again, I limited the selected fields to improve the extraction speed. Then I created a timechart to bin by day the average packet length of each server involved. Once the search got underway, I clicked 'Show Report'.


In the Report Builder, I selected an area chart, treating nulls as zero, and stacked the results. As you see in this screenshot, the report showed the distribution of the packet sizes between the servers was even. The scanning cluster also ramped up just after a preliminary poke from a single server, 58.218.199.49.

Poke then plunge. This stacked area graph shows even traffic from the scanning cluster.

Peek-a-boo, I See You

Having identified the scanning cluster, I wanted to see what ports the aggressor was looking to find. During my investigation, I came across this comment at the Internet Storm Center website, followed by many other confirmatory posts at other sites, regarding the nature of the attacker. It appeared they have been active for a number of years.


After identifying the affected ports, I prettied it up a bit with a lookup table to provide the common uses of those ports. It became plain that the scanner was looking for open proxies, whether legitimate, ill-configured, left by malware, or just plain hacked. There were many supporting statements for all of these in the  resources culled to identify these ports.

Proxy ports, proxy ports... Oh, more proxy ports. No secrets here.

Playing Favorites?

Now that I had an inkling what they were up to, I was interested in seeing if they favor certain ports. The data showed that the scanners were coordinated to test the target port set just about evenly.

Even distribution of scanned ports.

Summing It Up

It seems the aggressors utilizing these Chinese servers really like to scan for open proxies. Intent for such a collection, who can say? It could be folks looking for a way to circumvent Chinese proxy filters. However, judging from the technical coordination and longevity of this scanning behavior, it would seem more coordinated than that. In the least, better funded, and possibly sanctioned. Regardless of speculation, it is without doubt that the aggressors seek open proxy servers without permission, even trolling for botnet proxy remnants, and at the expense of those they locate when they turn on the tubes.

4 comments:

Michael Wilde said...

One word. WOW! Excellent example on how to take data, turn it into information, and massage to become knowledge with Splunk. Email ninja[at]splunk[dot]com and i will have a pizza of your choice delivered to your home or office.

jaymcjay said...

+1 to Michael. Good work with log visualization!

jack@monkeynoodle.org said...

very nice work, this is an excellent use of Splunk.

Unknown said...

it's really a information full post. thanks to shear . this post has removed my some mistaken thing . i thing if you bring on your acctivetice you will achive much popularety.. at last..thanks.

Information visualization Low