Yearly Archives: 2011

Tech Advent Calendars

Several awesome tech and programming communities create advent calendars each year with a different article or demo for each day of December. Here are the ones I’m following.

Update: See my updated 2012 Advent Calendar list. It uses the same RSS feed as before.

I’ve also created a Yahoo Pipe to combine all of these RSS feeds into one: 2011 Tech Advent Feed.
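For readers who would rather combine the feeds locally, the idea behind the Pipe can be sketched in Python with only the standard library. This is a minimal sketch, not a description of how the Pipe itself works: the function names `parse_rss_items` and `merge_feeds` are made up for illustration, and it only handles RSS 2.0 `<item>` entries whose `<pubDate>` includes a time zone (an Atom feed such as the Perl one would need its own parsing).

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_rss_items(xml_text, source):
    """Return (published, source, title, link) tuples from one RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if not pub_date:
            continue  # skip entries without a usable date
        items.append((
            parsedate_to_datetime(pub_date),  # RFC 822 date -> datetime
            source,
            item.findtext("title", default="").strip(),
            item.findtext("link", default="").strip(),
        ))
    return items


def merge_feeds(feeds):
    """Merge {source_name: rss_xml_text} into one list, newest entry first."""
    merged = []
    for source, xml_text in feeds.items():
        merged.extend(parse_rss_items(xml_text, source))
    merged.sort(key=lambda entry: entry[0], reverse=True)
    return merged
```

Fetching the XML text (for example with `urllib.request.urlopen`) is left out on purpose, so the merge step can be tried on saved copies of the feeds.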

Performance

website: http://calendar.perfplanet.com/2011/
feed: http://calendar.perfplanet.com/feed/

Performance Advent Calendar 2011

24ways

website: http://24ways.org/
feed: http://feeds.feedburner.com/24ways

24ways Advent Calendar 2011

Perl

website: http://perladvent.org/2011/
feed: http://perladvent.org/2011/atom.xml

Perl Advent Calendar 2011

PHP

website: http://phpadvent.org/2011
feed: http://feeds.feedburner.com/phpadvent

PHP Advent Calendar 2011

Clean Up Android Downloaded Files

Today I noticed a “Downloads” app on my HTC Evo that takes you directly to all of your downloaded files and attachments.

Android Downloads Icon

Look for the Downloads application

I didn’t realize there was a separate app for this, but had wondered if there was a way to return to files downloaded through the browser. Within Downloads, I found over 65MB of files, some of which I downloaded over a year ago. With a few taps I cleaned them all out. These files are stored separately from the browser cache, so if you download a lot you should periodically clean them up.

Android Downloads Screenshot

The Downloads app shows how much space is used by everything you’ve ever downloaded

Effective Android Screenshots

Taking screenshots from an Android device is similar to other platforms, although a bit more setup is needed. With a little bit of additional editing, your screenshots can look clean and professional.

To get your PC ready for taking Android screenshots, refer to addictive tips for a good set of instructions; setting up a Mac is a similar process.

In the instructions below, we’ll clean up the extra icons that appear in the notification area (the top left corner of the screen) when the device is connected via a USB cable. In this example screenshot, notice the extra icons that we want to clean up:

Android Screenshot 'Before' Example

Follow these instructions on your Mac to take better screenshots and clean those up:

  1. Prepare the Android device with the screen or application you’re taking a shot of
  2. Connect to your PC/Mac via USB cable
  3. Take the screenshot and save as a PNG file (refer to addictive tips for instructions)
  4. Open the PNG file in the Preview Mac app
  5. Zoom in once or twice and scroll to the upper left corner
  6. Select a rectangular area that is “clean” (showing just the header background)
  7. Copy
  8. Paste
  9. Use the arrow keys to move the pasted block to the left (using the keys keeps the copied rectangle in the proper vertical position)
  10. Repeat as needed to cover the undesired icons

This close-up screenshot shows the copy area from the clean background:

Android Screenshot Select

And here we have the clean area being pasted over the icons we are hiding:

Android Screenshot Copy Paste

Finally, the end result:

Android Screenshot 'After' Example

Using Splunk to Analyze Apache Logs

Splunk is an enterprise-grade software tool for collecting and analyzing log files and other data. Splunk itself uses the broader term “machine data”:

At Splunk we talk a lot about machine data, and by that we mean all of the data generated by the applications, servers, network devices, security devices and other systems that run your business.

Certainly, log files do fall under that umbrella, and are probably the easiest way to understand Splunk’s capabilities. The company offers a free license which has some limitations compared to a paid enterprise license, the most significant limitation being a maximum of 500 MB/day of indexed data. (For more details, see the differences between free and enterprise licenses.) To learn Splunk, or to use it for personal or smaller sites, the limitations are manageable and the free product is a great option.

In this example I’ve uploaded logs from a couple of my websites and let Splunk index them. I also explain the process I used to identify a rogue user agent which I later blocked.

To get started with Splunk, visit the download page and get the appropriate version for your platform. Follow the installation manual (from the excellent documentation site) to get the software installed and configured.

There are several ways to get data into Splunk; for this case I told it to monitor a local directory for files and manually told it the host name to expect. Then I copied about 6 months’ worth of compressed Apache logs into that target directory. You can repeat this for each site, using a separate directory and separate hostname.

Splunk will quickly index your data, which you’ll then see in the Search application. I suggest going through their quick overview to help learn what’s going on. Click on a hostname to start viewing data that Splunk has indexed. Because Splunk automatically recognizes the Apache log file format, it already knows how to pull out the common fields, which you can use for searching and filtering, as shown in this screenshot:

Splunk Fields Example

In my case, after poking around a bit, I noticed a fairly high volume of traffic fetching my RSS feed file (/rss.xml). The screenshot below shows the number of daily requests, normally hovering around 400 but peaking at about 2,000 per day:

RSS File Accesses Over Time
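The same daily count can be approximated without Splunk. Below is a rough Python sketch, assuming the logs use Apache's common or combined format; the names `LOG_LINE` and `daily_requests` are invented for this example, and the request path is compared exactly, so `/rss.xml?x=1` would not be counted.

```python
import re
from collections import Counter
from datetime import datetime

# Leading fields of Apache's common/combined log format:
# host ident user [time] "METHOD path protocol" status ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def daily_requests(lines, path="/rss.xml"):
    """Count requests for `path` per day, keyed by ISO date (log-local time)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("path") != path:
            continue  # unparseable line, or a request for another file
        # The timestamp looks like "10/Aug/2011:06:25:01 -0700"; keep the date part.
        day_text = match.group("time").split(":", 1)[0]
        day = datetime.strptime(day_text, "%d/%b/%Y").date()
        counts[day.isoformat()] += 1
    return dict(sorted(counts.items()))
```

Since the logs in this post were compressed, the lines could come from something like `gzip.open(log_path, "rt")`, which yields text lines that can be passed straight to `daily_requests`.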

By clicking on the useragent field, I found that an agent named “NewsOnFeedsBot” accounted for over 60% of the total requests:

User Agent Breakout Chart

Once I filtered on just the NewsOnFeedsBot useragent, some more details emerged:

  • The HTTP status code for every request was 200, meaning the bot downloaded the full 36KB file each time. (A well-behaved bot would send a conditional request, for example with an If-Modified-Since header, and receive a short 304 Not Modified response when nothing had changed.)
  • All requests were coming from a single IP address
  • The bot was fetching the RSS file almost continuously, several times a minute
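The observations above (share of requests, status codes, source addresses, request rate) can also be checked with a short script. This is an illustrative Python sketch rather than anything Splunk does internally: `profile_agent` and `LOG_LINE` are hypothetical names, the combined log format is assumed, and the agent is matched by substring because the bot's full user agent string is not given in this post.

```python
import re
from collections import Counter

# Apache combined log format, capturing only the fields used below.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"\S+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def profile_agent(lines, agent_substring, path="/rss.xml"):
    """Summarize how one user agent behaves when requesting `path`."""
    total = 0            # all requests for the path, any agent
    hits = 0             # requests for the path from the matching agent
    statuses = Counter()
    ips = Counter()
    per_minute = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("path") != path:
            continue
        total += 1
        if agent_substring not in match.group("agent"):
            continue
        hits += 1
        statuses[match.group("status")] += 1
        ips[match.group("ip")] += 1
        # "10/Aug/2011:06:25:01 -0700"[:17] -> "10/Aug/2011:06:25"
        per_minute[match.group("time")[:17]] += 1
    return {
        "share": hits / total if total else 0.0,
        "statuses": dict(statuses),
        "ips": dict(ips),
        "max_per_minute": max(per_minute.values(), default=0),
    }
```

A result with only status 200, a single IP address, and several requests in the same minute would match the pattern described in the list above.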

Blocking this poorly behaved bot was just a matter of checking for the useragent string and returning a 403 Forbidden response. After I made the change, the bot made a handful of further requests, received the 403, then stopped. At least it had some logic telling it to give up trying to fetch the file.
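The post does not say where the block was implemented, and on an Apache site a server configuration rule is the likelier choice. Purely to illustrate the logic (check the user agent, answer 403), here is a hedged Python sketch written as WSGI middleware; `block_user_agents` and `BLOCKED_AGENTS` are hypothetical names, and matching is by case-insensitive substring.

```python
# Substrings of user agents to refuse; only the bot named in this post is listed.
BLOCKED_AGENTS = ("NewsOnFeedsBot",)


def block_user_agents(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so requests from blocked user agents get a 403."""
    lowered = tuple(name.lower() for name in blocked)

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in lowered):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Any other client is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Wrapping an existing application would then look like `application = block_user_agents(application)`. A request with no User-Agent header at all is passed through unchanged.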

It’s been about a month since I blocked this bot, so I wanted to see an overview of the results. Splunk has a nice built-in charting capability which I used to stack the most popular useragents (again, just for the rss.xml file) and show their portion of visits over the past few months. You can see in the picture below that NewsOnFeedsBot was by far the biggest contributor over the summer, but now it’s gone:

User Agents Over Time

Debug HTTP Traffic From Android Tablets Using Fiddler

Update 2013-11-06: See my newer guide for connecting Android 4.x devices with Fiddler: Capture Android Mobile Web Traffic With Fiddler

Having recently upgraded my Samsung Galaxy Tab 10.1 to the latest Android “Honeycomb” 3.1 release, I wanted to take a closer look at the HTTP traffic to and from the device. Using the Fiddler web debugging tool on Windows, I was able to set this up rather quickly with these steps:

On the PC, install Fiddler if needed. After installing, configure Fiddler by opening the options panel (menu Tools | Fiddler Options). Select the Connections tab and enable the ‘Allow remote computers to connect’ option. Note the ‘Fiddler listens on port’ option (normally 8888), and close the dialog. Exit and restart Fiddler.

Fiddler Options

On the PC, determine its current IP address (open a command prompt, then type ipconfig).

On the Android tablet, install the HTTP Proxy Settings app. This app simply brings up the “HTTP Proxy” setting panel so you can make changes. Start the HTTP Proxy Settings app and enter your PC’s IP address as the host, and 8888 as the port number.

Update 2012-02-20: With the latest updates on my Galaxy Tab 10.1, there is no longer a need for the 3rd-party Proxy Settings app. See this updated note for details: Galaxy Tab 10.1 HTTP Proxy Settings.

HTTP Proxy Settings
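Before pointing the tablet at the proxy, it can help to confirm from a second computer on the same network that Fiddler really accepts remote connections. The following Python sketch does that with the standard library. The function names are made up for this example, `192.168.1.50` in the usage note is a placeholder for whatever address ipconfig reported, and only plain HTTP is routed through the proxy.

```python
import urllib.request


def fiddler_proxy_url(pc_ip, port=8888):
    """Build the proxy URL for Fiddler running on the PC (8888 is its usual port)."""
    return f"http://{pc_ip}:{port}"


def build_proxied_opener(pc_ip, port=8888):
    """Return an opener that sends plain-HTTP requests through the Fiddler proxy."""
    handler = urllib.request.ProxyHandler({"http": fiddler_proxy_url(pc_ip, port)})
    return urllib.request.build_opener(handler)


def fetch_via_fiddler(url, pc_ip, port=8888, timeout=10):
    """Fetch `url` through Fiddler and return the HTTP status code."""
    opener = build_proxied_opener(pc_ip, port)
    with opener.open(url, timeout=timeout) as response:
        return response.status
```

Calling `fetch_via_fiddler("http://example.com/", "192.168.1.50")` should make the request appear in Fiddler's session list. A connection error instead usually points at the ‘Allow remote computers to connect’ option or a firewall on the PC.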

Now run the Android Browser and you should see HTTP traffic routed through Fiddler on the PC. Below is an example of visiting Yahoo’s “tablet” home page:

Fiddler Log Results

When you’re done, don’t forget to run the HTTP Proxy Settings app again to clear out the host and port fields. (Otherwise your tablet browser will become unusable when Fiddler is no longer running.)