{"id":1226,"date":"2012-04-25T15:23:58","date_gmt":"2012-04-25T19:23:58","guid":{"rendered":"http:\/\/mossiso.com\/?p=1226"},"modified":"2014-09-22T14:09:13","modified_gmt":"2014-09-22T18:09:13","slug":"filling-in-the-missing-dates-with-awstats","status":"publish","type":"post","link":"https:\/\/mossiso.com\/2012\/04\/25\/filling-in-the-missing-dates-with-awstats\/","title":{"rendered":"Filling in the missing dates with AWStats"},"content":{"rendered":"
Sometimes AWStats misses a few days when calculating stats for your site, and that leaves a big hole in your records. Usually, as in my case, it’s because I messed up. I reinstalled some software on our AWStats machine and forgot to reinstall cron, the tool that runs jobs on a schedule, so the AWStats updates stopped running. I didn’t notice until several days later, which left a large gap in the stats for April.<\/p>\n
Fortunately, there is a fix. Unfortunately, it’s a bit labor-intensive and depends on how you rotate your Apache logs (if at all, which you should). The AWStats Documentation<\/a> (see FAQ-COM350 and FAQ-COM360) has some basic steps to fix the issue, outlined below:<\/p>\n Again, depending on how you have Apache logs set up, this can be an intensive process. Here’s how I have Apache set up, and the process I went through to get the missing days back into AWStats.<\/p>\n We have our Apache logs rotate each day for each domain on the server (or sub-directory that is calculated separately). This means I’ll have to do this process about 140 times. Looks like I need to write a script…<\/p>\n AWStats can’t run the update on older months if there are more recent months located in the data directory. So we’ll need to move the more recent months’ stats to a temporary location out of the way. So, if the missing dates are in June, and it is currently August, you’ll need to move the data files for June, July, and August (they look like this The first step is to get all of the logs for each domain for the month. This will work out to about 30 or 31 files (if the month is already over), or however many days have passed in the current month. For me, each domain archives the day’s logs in the following format We’ll use the find command to find the correct files. Before we construct that command, we’ll need to create a couple of files to use for our start and end dates.<\/p>\n Now we can use those files in the actual find command. You may need to create the Now unzip those files so they are usable. Move into the If you are doing the current month, then copy in the current Apache log for that domain.<\/p>\n This puts all of the domain’s log files for the month into a directory that we can use in the AWStats update command.<\/p>\n Things to note: You need to make sure that each of the log files you have just copied uses the same format. 
You also need to make sure they only contain data for one month. You can edit the files by hand or throw some fancy sed commands at the files to remove any extraneous data.<\/p>\n Now comes the fun part. We first run the logresolvemerge tool on the log files we created in the previous step to create one single log file for the whole month. While in the <code>\/tmp\/apachelogs\/<\/code> directory, run:<\/p>\n Now, we need to run the AWStats update tool with a few parameters to account for the location of the new log file.<\/p>\n If you moved any of the AWStats data files ( <\/p>\n Yeah, that fixed it!<\/p>\n <\/p>\n <\/p>\n","protected":false},"excerpt":{"rendered":" Doh! Sometimes AWStats will miss some days in calculating stats for your site, and that leaves a big hole in your records. Usually, as in my case, it’s because I messed up. I reinstalled some software on our AWStats machine, and forgot to reinstall cron. Cron is the absolutely necessary tool for getting the server … Continue reading Filling in the missing dates with 
AWStats<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":1227,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[167,170],"tags":[235,236,23],"class_list":["post-1226","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technical","category-websites","tag-awstats","tag-bash","tag-bash-code"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/mossiso.com\/wp-content\/uploads\/2012\/04\/Screen-Shot-2012-04-24-at-1.52.52-PM.png","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9wosP-jM","_links":{"self":[{"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/posts\/1226"}],"collection":[{"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/comments?post=1226"}],"version-history":[{"count":20,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/posts\/1226\/revisions"}],"predecessor-version":[{"id":1646,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/posts\/1226\/revisions\/1646"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/media\/1227"}],"wp:attachment":[{"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/media?parent=1226"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/categories?post=1226"},{"taxonomy":"post_ta
g","embeddable":true,"href":"https:\/\/mossiso.com\/wp-json\/wp\/v2\/tags?post=1226"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}\n
The Devil’s in the Details<\/h3>\n
Step 1. Move the data files of newer months<\/h3>\n
awstatsMMYYYY.domain-name.com.txt<\/code> where MM is the two digit month and YYYY is the four digit year) to a temporary directory so they are out of the way.<\/p>\n
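Step 1 can be scripted once the file naming is known. What follows is a minimal sketch, assuming the data files follow the awstatsMMYYYY.domain-name.com.txt pattern described above. The function name hold_newer_months, its arguments, and the demo directories are made up for illustration and are not part of AWStats.

```shell
# Sketch only: move the data files for one month and every later month
# into a holding directory. Function name and arguments are hypothetical.
hold_newer_months() (
  data_dir=$1; hold_dir=$2; domain=$3; from=$4   # from = YYYYMM, e.g. 201206
  mkdir -p "$hold_dir"
  for f in "$data_dir"/awstats[0-9][0-9][0-9][0-9][0-9][0-9]."$domain".txt; do
    [ -e "$f" ] || continue                # glob matched nothing
    base=${f##*/}
    rest=${base#awstats}                   # MMYYYY.domain-name.com.txt
    stamp=${rest%%.*}                      # MMYYYY
    mm=${stamp%????}
    yyyy=${stamp#??}
    # Compare as YYYYMM so that year boundaries order correctly.
    if [ "$yyyy$mm" -ge "$from" ]; then
      mv -- "$f" "$hold_dir"/
    fi
  done
)

# Demo in a throwaway directory with empty stand-in files.
demo_a=$(mktemp -d)
mkdir "$demo_a/data"
for m in 05 06 07 08; do
  touch "$demo_a/data/awstats${m}2012.domain-name.com.txt"
done
hold_newer_months "$demo_a/data" "$demo_a/hold" domain-name.com 201206
ls "$demo_a/data" "$demo_a/hold"
```

In the demo, May stays in the data directory while June, July, and August land in the holding directory, matching the June-to-August example in the text.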
Step 2. Get the Apache logs for the month<\/h3>\n
domain-name.com-access_log-X.gz<\/code> and
domain-name.com-error_log-X.gz<\/code> where the X is a sequential number. So the first problem is how to find the right files without opening each one to check which day it covers. Fortunately for me, nothing touches these files after they are created, so their mtime (the timestamp of when they were last modified) is intact and usable. Now for a quick one-liner that grabs all of the files within a certain date range and copies them into a working directory.<\/p>\n
touch --date YYYY-MM-DD \/tmp\/start<\/pre>\n
touch --date YYYY-MM-DD \/tmp\/end<\/pre>\n
\/tmp\/apachelogs\/<\/code> directory first.<\/p>\n
find \/path\/to\/apache\/logs\/archive\/ -name \"domain-name.com-*\" -newer \/tmp\/start -not -newer \/tmp\/end -exec cp '{}' \/tmp\/apachelogs\/ \\;<\/pre>\n
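The marker files and the find call can be wrapped together so the date range is passed as arguments. This is a rough sketch under a few assumptions: the helper name logs_in_range is invented, the markers are set with the POSIX `touch -t` form rather than `--date`, and the demo files are empty stand-ins with hand-set mtimes. It only lists the matching files instead of copying them.

```shell
# Sketch only: list archived logs for one domain whose mtime is after
# start_date and not after end_date (both YYYY-MM-DD, taken at 00:00).
# The function name and arguments are hypothetical.
logs_in_range() (
  archive_dir=$1; domain=$2; start_date=$3; end_date=$4
  markers=$(mktemp -d) || exit 1
  # touch -t takes [[CC]YY]MMDDhhmm, so strip the dashes and append 0000.
  touch -t "$(printf '%s' "$start_date" | tr -d '-')0000" "$markers/start"
  touch -t "$(printf '%s' "$end_date" | tr -d '-')0000" "$markers/end"
  find "$archive_dir" -type f -name "${domain}-*" \
    -newer "$markers/start" ! -newer "$markers/end" | sort
  rm -r "$markers"
)

# Demo: three stand-in archives dated before, inside, and after April 2012.
demo_b=$(mktemp -d)
touch -t 201203311200 "$demo_b/domain-name.com-access_log-1.gz"
touch -t 201204100005 "$demo_b/domain-name.com-access_log-2.gz"
touch -t 201205020005 "$demo_b/domain-name.com-access_log-3.gz"
found=$(logs_in_range "$demo_b" domain-name.com 2012-04-01 2012-05-01)
echo "$found"
```

Keep in mind that mtime records when a file was last written, so depending on when your rotation runs, a file dated just after midnight may hold the previous day's entries. It is worth spot-checking the first and last files in the range.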
\/tmp\/apachelogs\/<\/code> directory, and run the gunzip command.<\/p>\n
gunzip *log*<\/pre>\n
cp \/path\/to\/apache\/logs\/current\/domain-name.com* \/tmp\/apachelogs\/<\/pre>\n
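For the requirement that the files contain data for only one month, a filter on the timestamp field is one option. The sketch below assumes the logs use Apache's usual bracketed timestamp such as [01/Apr/2012:00:00:03 -0400]; the function name keep_month and the three sample lines are invented for the demo.

```shell
# Sketch only: keep the lines whose Apache timestamp falls in one month.
# $1 = month abbreviation as Apache writes it (Jan, Feb, ...), $2 = year.
keep_month() {
  grep -E "\[[0-9]{2}/$1/$2:"
}

# Demo with made-up log lines (documentation-range IP addresses).
sample=$(mktemp)
cat > "$sample" <<'EOF'
192.0.2.1 - - [31/Mar/2012:23:59:58 -0400] "GET / HTTP/1.1" 200 512
192.0.2.2 - - [01/Apr/2012:00:00:03 -0400] "GET /about HTTP/1.1" 200 1024
192.0.2.3 - - [01/May/2012:00:00:01 -0400] "GET / HTTP/1.1" 200 512
EOF
kept=$(keep_month Apr 2012 < "$sample")
echo "$kept"
```

On a real file this might be run as `keep_month Apr 2012 < some-log > some-log.april`, where both file names are placeholders.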
Step 3. Run the AWStats logresolvemerge and update tools<\/h3>\n
perl \/path\/to\/logresolvemerge.pl *log* > domain-name.com-YYYY-MM-log<\/pre>\n
perl \/path\/to\/awstats.pl -update -configdir=\"\/path\/to\/awstats\/configs\" -config=\"domain-name.com\" -LogFile=\"\/tmp\/apachelogs\/domain-name.com-YYYY-MM-log\"<\/pre>\n
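With about 140 domains to process, typing these two commands by hand gets old fast. One cautious approach is to generate the commands first and read them over before running anything. The sketch below does only that: it prints the merge and update commands for each domain and executes nothing. The function name, the variable names, the second demo domain, and the per-domain sub-directory under /tmp/apachelogs/ are all assumptions for illustration; the paths are the same placeholders used above.

```shell
# Placeholders copied from the post; replace them before running anything.
MERGE_TOOL=/path/to/logresolvemerge.pl
AWSTATS=/path/to/awstats.pl
CONFIG_DIR=/path/to/awstats/configs
WORK_ROOT=/tmp/apachelogs

# Sketch only: read one domain per line on stdin and print the merge and
# update commands for the given month (YYYY-MM). Nothing is executed.
print_rebuild_commands() {
  month=$1
  while IFS= read -r domain; do
    [ -n "$domain" ] || continue
    work="${WORK_ROOT}/${domain}"                 # assumed per-domain directory
    merged="${WORK_ROOT}/${domain}-${month}-log"  # merged file kept outside it
    printf 'perl %s %s/*log* > %s\n' "$MERGE_TOOL" "$work" "$merged"
    printf 'perl %s -update -configdir="%s" -config="%s" -LogFile="%s"\n' \
      "$AWSTATS" "$CONFIG_DIR" "$domain" "$merged"
  done
}

cmds=$(printf '%s\n' domain-name.com other-site.example | print_rebuild_commands 2012-04)
printf '%s\n' "$cmds"
```

Writing the merged file outside the per-domain directory keeps it from being picked up by the `*log*` glob if the merge is run a second time.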
Step 4. Move back any remaining files<\/h3>\n
awstatsMMYYYY.domain-name.com.txt<\/code> like for July and August in our example) now’s the time to move them back where they go.<\/p>\n
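Moving the files back can be sketched the same way. The one thing to watch is the month that was just rebuilt: its old data file should stay in the holding directory so it does not replace the new one. The function name restore_held_months, its arguments, and the demo directories are hypothetical.

```shell
# Sketch only: move held data files back, skipping the rebuilt month.
# Function name and arguments are hypothetical; rebuilt = MMYYYY.
restore_held_months() (
  hold_dir=$1; data_dir=$2; rebuilt=$3
  for f in "$hold_dir"/awstats*.txt; do
    [ -e "$f" ] || continue                # glob matched nothing
    base=${f##*/}
    case $base in
      awstats"$rebuilt".*) continue ;;     # regenerated by the update run
    esac
    if [ -e "$data_dir/$base" ]; then
      echo "skipping $base: already present in $data_dir" >&2
      continue
    fi
    mv -- "$f" "$data_dir"/
  done
)

# Demo: June was rebuilt, July and August are waiting in the holding directory.
demo_e=$(mktemp -d)
mkdir "$demo_e/data" "$demo_e/hold"
touch "$demo_e/data/awstats062012.domain-name.com.txt"
for m in 06 07 08; do
  touch "$demo_e/hold/awstats${m}2012.domain-name.com.txt"
done
restore_held_months "$demo_e/hold" "$demo_e/data" 062012
ls "$demo_e/data"
```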
<\/a>