The Internet is not for People

Johnny Five is a jive!

Not good, decent, people like you and me, anyway.

Maybe it used to be. Maybe, back in the old days of akebono.stanford.edu and hit counters and free porn, you could find an actual, true, honest-to-goodness person on the Internet. But not anymore. Nope, not anymore.

You never count your money when you're sitting at the table.

What am I talking about?

Robots, man, robots.

Just like in the future, we are living in a world of robots. Or, as they prefer to be called, “bots”. Or, as they prefer us to get used to calling them, “overlords.”

I blame the Japanese.

“Why do I bring this up?” you ask. Why now, when robots have been building our cars, walking on Mars, and marrying our daughters for decades?

I’ll give you a clue. It has something to do with that new DreamHost PS service I mentioned a scant one post ago.

Give up?

Good, I win! Now let me explain.

The only reason you’d want your own Private Server is to be isolated from the other sites on your shared server. And the reason you’d want to be isolated from them is so nobody but YOU can crash your server. And the REASON sites crash their servers is that they’re getting more visits than they can handle.

It seems to me, though, that “more visits than they can handle” is a hugely varying value. For some sites, just one visit is too many. For others, say, a nice static html page, there is pretty much no limit.

Nevertheless, most sites on one of our shared servers, even the really poorly coded ones, have no problem handling a few thousand visitors a day. Problems only start when a completely dynamic site gets tens of thousands of daily visitors.

In fact, one of the sites we used to test out DreamHost PS fell in exactly this category. It was a frequently updated, decently popular blog (and for SOME reason, blogs just can’t be static html, can they? oh nooooo….), and on an average day, it got over 10,000 unique page visits (that’s not counting images, css files, etc..).

The blog was constantly causing problems on their shared server. We had them turn on caching, but it would still spike frequently and suck bazookas of memory. I guess it was just TOO POPULAR! Imagine, tens of thousands of good old human beings reading that blog, every single day! It only made sense that a site of this magnitude would need its own private server.

In fact, judging by the amount of load we see on servers, we must host a lot of sites in the five figures of daily visitors. But something about that just didn’t sit right with me. Just from running a few of my own stupid sites, I know how hard it is just to get in the ones figure of daily visitors.

So, I decided to look at that blog’s log file from yesterday, August 9th, 2007… and lo and behold.. The Internet is not for People:

Pages    Percent    Type
11406    100%       Total Page Views
 8033    70.4%      Spiders.. Yahoo, Google, MSN, Ask.. including 20% mystery spiders (I assume up to no good!)
 1943    17%        Comment Spammers
  798    7%         RSS Readers and Aggregators
  632    5.6%       Actual Humans ©

We’re a minority out there, you and I! The Internet circa 2007 is made of robots, by robots, for robots! By rampant extrapolation, almost 95% of the page views to the entire Internet are made by machines!
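If you want to run the same head count on your own site, here is a minimal sketch of the idea in Python. It is not the script I used. The substring lists are made-up examples, and it only looks at the User-Agent string, which means it cannot tell a comment spammer from a person (spammers tend to dress up as browsers).

```python
from collections import Counter

# Hypothetical substrings, for illustration only. Real bots announce
# themselves in many more ways, and the rude ones don't announce at all.
FEED_HINTS = ("feed", "rss", "bloglines")
SPIDER_HINTS = ("bot", "slurp", "spider", "crawl")


def classify(user_agent):
    """Put one page view in a rough bucket based on its User-Agent string."""
    ua = user_agent.lower()
    if any(hint in ua for hint in FEED_HINTS):
        return "feed reader"
    if any(hint in ua for hint in SPIDER_HINTS):
        return "spider"
    # Comment spammers usually pose as browsers, so this bucket is optimistic.
    return "maybe human"


def tally(user_agents):
    """Count page views per bucket."""
    return Counter(classify(ua) for ua in user_agents)


def non_human_share(total_views, human_views):
    """Percentage of page views that did not come from people."""
    return 100 * (total_views - human_views) / total_views


# Using the totals from the table above: 11406 page views, 632 of them human.
print(f"{non_human_share(11406, 632):.1f}% robots")
```

Feed `tally` the User-Agent column of your own access log to get the per-bucket counts. The last line is just the arithmetic behind the "almost 95%" figure.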

Our future?

However, in the end, all these machines are doing is trying to organize things a little better for us humans. It would be no fun at all to visit every website in the world each and every single time you wanted to find a picture of a monkey eating ice cream! Better to let our future omnipotent masters do the dirty work for now.

(Also! When I examined the “actual humans” visits more closely, 40% of those hits were the result of an image search.. and 35% were multiple pages by the same human. Meaning overall, only 149 different people actually visited that blog yesterday to actually view it in its intended entirety … barely 1% of the total page visits!)

All these robots cause problems though. It’s been well known since 1994 that 99.99% of the sites on the Internet get absolutely NO traffic. It’s how web hosts make money.

But now, that’s all changed! The only thing safe to say is 99.99% of the sites on the Internet get absolutely no HUMAN traffic! Every site now gets search engine spiders, feed aggregators, and spammers.. a veritable ARMY of undead automotrons! And those undead robot hits hit your site just as hard as living human hits.

A ton of times in the past, a site was crashing a shared server, and it turned out all we had to do was block Googlebot from visiting it and everything was fine. We figured that was better than just disabling the entire site, and yet sometimes we caught some crazy flak for it! Which saddened us so greatly we even made a wiki page to try and explain the situation.

It probably would have been better to just disable these sites.. it’s not like any humans were actually visiting them anyway.
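For the curious: one common way to ask Googlebot to stay away is a robots.txt file. I am not saying that is how we did the blocking described above; treat the following as an assumption-laden sketch. It uses Python's standard urllib.robotparser module to check what a given set of rules would allow, and example.com is only a placeholder.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that turns away Googlebot and leaves everyone else alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a parser we can ask questions of."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Placeholder URL; swap in a page from your own site.
page = "http://example.com/blog/"
print("Googlebot allowed:", parser.can_fetch("Googlebot", page))
print("Browser allowed:", parser.can_fetch("Mozilla/5.0", page))
```

Keep in mind robots.txt is an honor system. The polite spiders obey it. Comment spammers do not.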

After we blocked Googlebot, people then had two options:

1. Keep blocking.
2. Fix their site’s inefficiencies and un-block.

At least now our Happy Customers have a simple third option:

3. Pay us more money.

Just in time too, I’ve noticed my robot attack insurance premiums have been increasing recently… how strange.