Inexpensive (semi-)persistent hit counting
September 24th, 2008 by admin
The vast majority of database queries (at least, in a web application) are SELECT statements. In my experience, unless you get Digg’ed, somewhere between 80% and 90% of queries are reads rather than writes. The exception to this rule is web forums and other software that tracks hits. For every page view, you have to increment the thread’s ‘hit’ column. This involves opening a database connection, submitting your update query, and closing the socket once the transaction has completed. As you can imagine, this quickly becomes very expensive. vBulletin’s method of dealing with this is simply not reporting accurate hit stats. Unacceptable, in my opinion.
Alternatively, one could keep the hit counts in an in-memory array and read and update that memory directly on every request. Any page render queries this array instead of the database. In practice, I’d probably store this in memcached. At a given interval, a job reads the memcached store and commits the number of hits accumulated since the last run to the database. Since you’re only storing the increments, this should scale pretty well across application servers if necessary. If memcached happens to crash, in most cases no important data is lost: only the hits counted since the last commit.
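To make the idea concrete, here is a minimal sketch in Python. It is only an illustration under some assumptions: the class name HitBuffer, its methods record_hit and flush, and the threads table with a hits column are all hypothetical, an in-process Counter stands in for the memcached store, and SQLite stands in for the real database. With an actual memcached server you would presumably lean on its atomic incr command rather than a lock.

```python
import sqlite3
import threading
from collections import Counter


class HitBuffer:
    """Collects hit increments in memory and writes them out in batches.

    Hypothetical helper: the Counter below stands in for memcached, and
    the `threads` table with a `hits` column is an assumed schema.
    """

    def __init__(self, conn):
        self.conn = conn
        self._pending = Counter()      # thread id -> hits not yet committed
        self._lock = threading.Lock()  # guards _pending

    def record_hit(self, thread_id):
        """Called on every page view; touches memory only, not the database."""
        with self._lock:
            self._pending[thread_id] += 1

    def flush(self):
        """Commit the accumulated increments; returns the number of rows touched.

        Meant to be called at a fixed interval (cron job, timer, ...).
        """
        # Swap the buffer out under the lock so new hits are not blocked
        # while the database write is in progress.
        with self._lock:
            deltas, self._pending = self._pending, Counter()
        if not deltas:
            return 0
        # One transaction for the whole batch instead of one per page view.
        with self.conn:
            self.conn.executemany(
                "UPDATE threads SET hits = hits + ? WHERE id = ?",
                [(count, thread_id) for thread_id, count in deltas.items()],
            )
        return len(deltas)
```

Each page view calls record_hit, and a timer calls flush every so often, so a thread that is viewed a thousand times between flushes costs one UPDATE instead of a thousand. Because only increments are stored and applied with hits = hits + ?, several application servers can flush independently without overwriting each other's counts.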
