I don't want to bash the hosting provider, especially since they did make efforts to address my concerns. I often wondered whether I cost them more in tech support than the meager fee I was paying for the service. I also don't have experience with many different ISPs for comparison.
In the midst of my anguish over Anibit's performance, I researched a lot of metrics tools and ideas for being smarter about keeping track of performance history. Anibit's performance could vary by an order of magnitude over the course of a week, but I had no idea when the problems would first show up. I wanted a tool that could track that for me, and I wanted it for free. I could not find anything, so I decided to write something.
Enter "Stupid Simple Website Metrics"
The Stupid Simple Website Metrics tool (from now on shortened to SSWM) is made up of two parts: the metrics collector and the results viewer. Both are implemented in Python 3.4; they may work in other Python versions, but you probably need Python 3.x at the very least. The metrics collection application is meant to be run from the command line, and uses a JSON-based configuration file to control its operation. The results viewer is a Python script that implements a web server you can point your browser at to see the historic metrics data.
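The exact configuration format is documented in the repository; as a rough sketch (the field names here are illustrative, not necessarily the ones SSWM actually uses), a config might look something like:

```json
{
  "database": "metrics.db",
  "urls": [
    "http://example.com/",
    "http://example.com/blog"
  ]
}
```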
The development was driven by the following goals: 1) keep it simple (to develop and use), and 2) work without requiring any system-wide software or Python modules to be installed. It all works with the standard Python 3.4 library. The only external software you need is curl. On Windows, I just put a curl executable in the same location as the script; curl comes standard with most Linux distributions.
The metrics collector loads a list of URLs from the configuration file, then launches curl to fetch the initial HTML for each URL. It does not, at least for now, load any files other than the root document. This means its metrics do not measure the timing of the complete browser experience, just the time spent loading the initial HTML. That is still important information; in my case, that was where the bottleneck was most prominent. Curl is launched with a parameter to log page-load timings. The metrics application then parses this log to determine page-loading benchmarks, which are saved to a SQLite database. It does this for each URL in the configuration.
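The real collector parses a curl log file, but the same idea can be sketched using curl's `-w` write-out variables (`time_starttransfer` and `time_total`, which curl does provide). The table schema and function names below are my illustrative assumptions, not SSWM's actual code:

```python
import sqlite3
import subprocess

# curl -w format string: time to first received byte, then total time (seconds).
CURL_FORMAT = "%{time_starttransfer} %{time_total}"

def parse_timings(curl_output):
    """Parse the two space-separated timing values curl writes out."""
    first_byte, total = curl_output.split()
    return float(first_byte), float(total)

def fetch_timings(url, curl="curl"):
    """Run curl against a URL, discard the body, and return (first_byte_s, total_s)."""
    out = subprocess.check_output(
        [curl, "-o", "/dev/null", "-s", "-w", CURL_FORMAT, url],
        universal_newlines=True)
    return parse_timings(out)

def record(db, url, first_byte, total):
    """Append one sample to a (hypothetical) metrics table."""
    db.execute("CREATE TABLE IF NOT EXISTS metrics "
               "(url TEXT, ttfdb REAL, total REAL)")
    db.execute("INSERT INTO metrics VALUES (?, ?, ?)", (url, first_byte, total))
    db.commit()

if __name__ == "__main__":
    db = sqlite3.connect("metrics.db")
    for url in ["http://example.com/"]:
        record(db, url, *fetch_timings(url))
```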
There are currently two metrics captured for each URL tested. In a nutshell, I wanted to know how long the server spent in CPU-intensive work preparing my content, and how much time the download took in total. For static HTML pages, the first metric should be almost negligible, but for many PHP, Python, Perl, Ruby, Node, etc. based websites, there is a significant amount of processing and database I/O that the server must do to determine what HTML needs to be sent to the browser. Drupal is known for being a powerful PHP web platform, but it is also a heavyweight in the processing power it needs. A single page on Anibit can require dozens, even hundreds, of database queries. Anibit makes heavy use of multiple caching strategies to alleviate this: generated HTML is cached so that the same page does not have to be regenerated every time, and database queries are cached in memory, sometimes making frequent queries a thousand times faster! When I was having severe performance problems, it was the generation of content and the queries that were responsible for a large part of the slowness.

You can tell how long a server was "thinking" about your web page in most desktop browsers. Chrome has a superb set of tools for timing and analyzing any webpage built into it (Firefox is great too). Let's take a look at Chrome's analysis of the Anibit home page:
That green bar, labeled "Waiting TTFB", represents "Time To First Byte": the time until the web server started sending a response. Before then, it is "thinking". The SSWM tool measures something similar, though not quite the same thing. The metric it measures is "TTFDB", "Time To First Data Byte", since it is possible that some servers may send _some_ bytes that are HTTP headers rather than the actual web page contents. The amount of time spent sending HTTP header information is typically very small, but it could lead to overly optimistic metrics; see this article from a real expert. SSWM tries to mitigate that by looking for actual HTTP data traffic timings in the curl log.
The second metric captured is the total loading time, which in the Chrome timings above corresponds to adding in the blue bar.
What this tool does not capture is the total time to load and render all parts of a given page; it only instruments the root HTML of a given URL. This means it's not great for comparison against timings from other tools, but it is good for comparing your site's performance today to your site's performance two weeks ago.
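The results viewer follows the same standard-library-only philosophy. A minimal sketch of the idea, a tiny web server that renders the SQLite history as an HTML table, might look like the following (the schema, function names, and port are my illustrative assumptions, not SSWM's actual code):

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

def render_history(db):
    """Render recorded timings as a bare-bones HTML table."""
    rows = db.execute("SELECT url, ttfdb, total FROM metrics").fetchall()
    cells = "".join(
        "<tr><td>%s</td><td>%.3f</td><td>%.3f</td></tr>" % row for row in rows)
    return ("<html><body><table>"
            "<tr><th>URL</th><th>TTFDB (s)</th><th>Total (s)</th></tr>"
            "%s</table></body></html>" % cells)

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Re-open the database per request; fine for a single-user viewer.
        page = render_history(sqlite3.connect("metrics.db")).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(page)

if __name__ == "__main__":
    HTTPServer(("", 8000), MetricsHandler).serve_forever()
```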
You can check out the code and documentation on GitHub here. And don't forget, if it doesn't do what you want, I can add features for a little bit of coin. :)