Means.Us.Com

Website DIY - tricks and solutions

PHP: Automate the update of your Maxmind Geoip lite Database files

Prerequisites: Basic knowledge and experience of PHP. Ability to schedule a script (cron job/task scheduler).
It’s assumed you use Maxmind’s GeoLite (geoip) and have their free code and database files installed on your site.


Introduction:



Maxmind updates its GeoLite data files every month. Installing the latest data file on my site was one of those (never) do later tasks – increasing the chances of incorrectly identifying a visitor’s location. To prevent this I decided to automate the update.

I found a few cron jobs and Bash scripts, but they didn’t cater for error notification and recovery, so I thought I’d write my own PHP script and share it here.

Please read the Warning/Disclaimer (below), and test before live use. You can set NOTIFICATION_MODE to “3” (screen display), to avoid bombarding yourself with emails during initial testing.



I’d appreciate feedback, especially on its use with Windows Servers (not tested in this environment); and what, if any, modifications are required.


About the Script:


Although designed for Maxmind files, it also works with gzips from other remote sites. The extracted file is given the filename of the gzip minus its “.gz” extension.

Most Maxmind files are gzips; and this script ignores Zips (less code for you to read); a later article will cover a general purpose script (any site, many archive types). If you need a Zip option now, post a comment and I’ll reply with the unZip function from that script.

The script is intended to:

  • get a gzip file from Maxmind and copy it to your server
  • extract the .mmdb, dat, or csv “database” to a file in the same directory
  • check for errors and if necessary revert back to previous database file
  • provide information to identify the cause of errors during testing/live running
  • notify you of errors by email

Change the constants and variables to suitable values, e.g. your own email address and file paths. The current URLs for Maxmind’s database archives are listed here: GeoLite (legacy) | GeoLite2 (public beta).

The script backs up the current gzip and database file before each update (prior backups are overwritten). On error, only the database file is recovered from the backup.
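The backup and recovery idea described above can be sketched roughly as follows. This is a minimal illustration, not the script's own backupFile() code: the function names and the ".bak" suffix are assumptions made for the example.

```php
<?php
// Hypothetical helpers: the names and the ".bak" suffix are assumptions,
// not the article script's actual backupFile() implementation.

// Copy $file to "$file.bak", overwriting any earlier backup.
// Returns true when there is nothing to back up (first run) or the copy worked.
function sketchBackupFile($file) {
    if (!file_exists($file)) {
        return true; // first run: nothing to back up yet
    }
    return copy($file, $file . '.bak');
}

// Put the backup copy back in place after a failed update.
function sketchRestoreFile($file) {
    $backup = $file . '.bak';
    if (!file_exists($backup)) {
        return false; // no backup available to restore from
    }
    return copy($backup, $file);
}
```

Keeping a single backup per file mirrors the "prior backups are overwritten" behaviour; if you want a longer history you would need to add a date to the backup name yourself.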

See below for scheduling, code explanation, and issues that may affect you.


Scheduling the script




This is not a tutorial; you should already be competent using your server’s scheduler. If necessary, seek advice from your provider. Useful guides: CronJobs | Windows Task Scheduler.

Cron job emails:

Cron notification emails do not differentiate between a script’s success and failure; and according to my host, most users do not read them.


The script sends its own email to alert you to errors more effectively, but obviously it can’t identify errors in the cron job command itself, so you may still want to set the cron job to email you its stderr output.

Example commands that may work for you:

php /path/to/update_maxmind_dbfile.php

or:

wget -O - http://example.com/update_maxmind_dbfile.php

You may also need to insert a path in front of the php/wget command.
Some hosts block use of wget (but may allow curl).


Code Explanation


Hopefully the only parts of the code requiring explanation or comment are:


031 DEFINE ('ADMIN_EMAIL_ADDRESS','admin@example.com');
032 DEFINE ('THIS_SITE', 'OneOfYourDomains.com');
033 DEFINE ('NOTIFICATION_MODE', 1); // 1=email, 2=screen + email, 3=screen
034 DEFINE ('FROM_URL', 'insert URL for required file here');  //see article
035 DEFINE ('FROM_FILENAME',basename(FROM_URL));
036 // path and file on our server:
037 $ourDirPath = realpath(dirname(__FILE__)) . '/'; // see article
038 $ourGzFile = $ourDirPath . FROM_FILENAME;
039
040 // data file name is the name of the gz archive without the ".gz" or ".gzip" extension
041 $ourGeoDataFile = preg_replace('/\.gz\z|\.gzip\z/i', '', $ourGzFile);
042 if ($ourGeoDataFile == $ourGzFile) notifyAndExit('SCRIPT NOT RUN', 'FROM_FILENAME "' . FROM_FILENAME. '" does not have a ".gz/.gzip" extension');

Lines 31 to 37: You will have to set the constants and variables to suit your own and your site’s requirements. (Most of you will only use one Maxmind database, and using POST and command-line arguments (see Scheduling) instead would have made the code and article unnecessarily long.)

If you modify the code to use input variables, then I suggest you whitelist allowed files as part of your security.
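As a rough illustration of the whitelisting suggestion, the sketch below maps short keys to allowed archive URLs and rejects everything else. The keys, the helper name and the example.com URLs are placeholders invented for this example; substitute the Maxmind URLs you actually use.

```php
<?php
// Hypothetical whitelist: the keys and the example.com URLs are placeholders.
$allowedFiles = array(
    'country' => 'http://example.com/GeoLite2-Country.mmdb.gz',
    'city'    => 'http://example.com/GeoLite2-City.mmdb.gz',
);

// Return the whitelisted URL for a requested key, or null if it is not allowed.
function sketchResolveUrl($requested, array $allowed) {
    if (!is_string($requested) || !isset($allowed[$requested])) {
        return null;
    }
    return $allowed[$requested];
}

// Possible use: read the key from a CLI argument or a POST field.
// $key = isset($argv[1]) ? $argv[1] : (isset($_POST['db']) ? $_POST['db'] : '');
// $url = sketchResolveUrl($key, $allowedFiles);
// if ($url === null) { /* reject the request and stop */ }
```

Because the caller only ever supplies a key, never a URL or a path, a request cannot make the script fetch or overwrite an arbitrary file.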

Line 37 sets the directory path for our files to the same directory the script is in:

037 $ourDirPath = realpath(dirname(__FILE__)) . '/';

If you want to use another folder then apply normal directory traversal techniques, e.g. if the script is in “/parentDir/scriptDir/” but you want the data file in “/parentDir/geoip/” then change the $ourDirPath value to realpath(dirname(__FILE__)) . '/../geoip/';.

Many of you could build the path using $_SERVER['DOCUMENT_ROOT']; however this won’t work for everyone in all circumstances (for example, it is usually empty when the script is run from the command line by cron).

This post explains why it may be necessary to use realpath(dirname(__FILE__)) or, from PHP 5.3,  realpath(__DIR__).  (link to be added – when written!)
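The path handling discussed above can be tried out with a short sketch like the one below. It reuses the “geoip” folder from the example in the text; the variable names other than $ourDirPath are assumptions for illustration.

```php
<?php
// Sketch: build the data directory path relative to this script.
// "geoip" is the example folder name used in the text above.
$scriptDir  = realpath(dirname(__FILE__));   // realpath(__DIR__) from PHP 5.3
$ourDirPath = $scriptDir . '/../geoip/';

// realpath() returns false when a path does not exist, so it is applied only
// to the script's own directory (which must exist), not to the target folder
// that the script may still have to create.
$resolved = realpath($ourDirPath);
if ($resolved === false) {
    echo "Target folder not found yet: $ourDirPath\n";
} else {
    echo "Target folder resolves to: $resolved\n";
}
```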



048 // if directory doesn't exist (e.g. first time use) then create it
049 if ( ! file_exists($ourDirPath) && ! mkdir($ourDirPath,0755) ) {
050   notifyAndExit('FAIL!!!', $ourDirPath . ': no permissions to "view" dir, or dir not found and unable to create. Last error=' . implode(' | ',error_get_last()) );
051 }
052
053 backupFile($ourGzFile);
054 // the next 2 lines also handle recovery + notification + exit on error detection
055 copyMaxmindFile(FROM_URL,$ourGzFile);
056 backupDataFileThenGzExtract($ourGzFile, $ourGeoDataFile);
057 notifyAndExit('Success', 'file updated');

Lines 49 to 51: If the directory does not exist, create it with 755 “file” permissions; if that fails, report the error and exit.

if ( ! file_exists($ourDirPath) && ! mkdir($ourDirPath,0755) ) {...

Note: the permissions parameter is ignored on Windows systems.

file_exists() ALSO RETURNS FALSE if the script does not have directory “execute” permissions – see Issues.  The script should identify this problem.
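When testing, it can help to separate the possible causes before blaming the script. The helper below is a hypothetical diagnostic, not part of the script itself, and the wording of its messages is my own.

```php
<?php
// Hypothetical diagnostic helper (not part of the article's script).
// Returns a short description of why a directory may be unusable.
function sketchDescribeDir($dir) {
    if (!file_exists($dir)) {
        // Also reached when a parent directory lacks "execute" permission.
        return 'not found, or not visible to this user';
    }
    if (!is_dir($dir)) {
        return 'exists but is not a directory';
    }
    if (!is_writable($dir)) {
        return 'exists but is not writable by this user';
    }
    return 'ok';
}

echo sketchDescribeDir(sys_get_temp_dir()), "\n";
```

Run it both from the command line and through the web server if you use both, since the two often run as different users.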



090 function copyMaxmindFile($fromURL,$ourGzFile) {
091   // first open file on our server for create/overwrite by CURL
092   if (! $fh = fopen($ourGzFile, 'wb')) {
093     notifyAndExit('FAIL!!!', 'Failed to fopen ' . $ourGzFile . ' for write :' . implode(' | ',error_get_last()) );
094   }
095
096   $ch = curl_init($fromURL);
097   curl_setopt($ch,CURLOPT_FAILONERROR,TRUE); // identify as error if http status code >= 400
098   curl_setopt($ch, CURLOPT_HEADER, 0);
099   if( !curl_setopt($ch, CURLOPT_FILE, $fh) ) notifyAndExit('FAIL!!!', 'ABORTED - curl_setopt(CURLOPT_FILE) fail: ' . $ourGzFile);
100   curl_exec($ch);
101   if(curl_errno($ch)|| curl_getinfo($ch, CURLINFO_HTTP_CODE) != 200 ) {
102     fclose($fh);
103     $msgDetail = 'CURL error: ' . curl_error($ch) . ' for ' . FROM_URL . ' (HTTP status '. curl_getinfo($ch, CURLINFO_HTTP_CODE) . ')';
104     curl_close($ch);
105     notifyAndExit('FAIL!!!', $msgDetail);
106   }
107   curl_close($ch);
108   fflush($fh);
109   fclose($fh);
110   clearstatcache(TRUE,$ourGzFile); // without this filesize may return old file size
111   if(filesize($ourGzFile) < 1) notifyAndExit('Abnormal End', $ourGzFile . ' written to your server is empty');
112 }

Standard cURL code is used in the function – if cURL is new to you check out one of the many CURL tutorials.

You could replace all bar the last line of this function with a single line:

if (!copy(FROM_URL,$ourGzFile)) notifyAndExit('FAIL!','some msg');

However this won’t provide sufficient error information, and might even be blocked by your server security (copy() from a URL only works when PHP’s allow_url_fopen setting is enabled). See other reasons why you should use cURL. (link to be added).


Issues


Permission Denied (Unix/Linux and users, owners, permissions):

The script may report that it is unable to create or write to the files in your Maxmind directory. This is probably an ownership/permissions problem.

If you originally created a file via, say, anonymous FTP, it will be “owned” by that user; this probably won’t be the same user that your (CLI or web served) scripts run under. So 644 permissions (Owner: read/write; Group and World: read) would prevent the script writing to the file; likewise a script cannot access a directory without execute permissions.

In these circumstances, you can either change to less secure permissions, or delete the dir & file (make sure there are no other files you need in the dir) and use the script to create from scratch.

This forum answer provides more info on “Permission Denied” Problems.

Zip/gzip extraction falls over

Although zip and gzip extraction are “built in” to PHP, they rely on “external” libraries. A few hosts (sack them) fail to build PHP with these libraries.

You can remove the script’s extract function and alter the success notification to prompt you to extract. You will have to uncompress by other means e.g. manually via your control panel file manager; or adding a command line function (like gunzip on Linux) to a scheduled job.
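To check whether your host’s PHP build can extract gzips at all, something along these lines may help. It is only a sketch of a gzip extract step using PHP’s zlib functions; the function name is hypothetical and it is not the script’s own backupDataFileThenGzExtract().

```php
<?php
// Sketch of a gzip extract step; the function name is hypothetical.
// Returns an empty string on success, otherwise a short error message.
function sketchGzExtract($gzFile, $outFile) {
    if (!function_exists('gzopen')) {
        return 'PHP was built without zlib support';
    }
    $in = gzopen($gzFile, 'rb');
    if (!$in) {
        return 'cannot open ' . $gzFile;
    }
    $out = fopen($outFile, 'wb');
    if (!$out) {
        gzclose($in);
        return 'cannot open ' . $outFile . ' for writing';
    }
    // Copy in chunks so a large database never has to fit in memory.
    while (!gzeof($in)) {
        $chunk = gzread($in, 8192);
        if ($chunk === false || fwrite($out, $chunk) === false) {
            gzclose($in);
            fclose($out);
            return 'read/write error during extraction';
        }
    }
    gzclose($in);
    fclose($out);
    return '';
}
```

If the first check fails, zlib is missing from your PHP build and you will need one of the workarounds described above.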

Problems with getting the file (cURL)

File “links” on some sites may take too long, redirect, or require cURL to provide a user agent, referer etc. (at the time of writing this is not the case with Maxmind). You can set cURL options to cater for these types of problems.
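For example, the extra options might look something like this. The helper name, the timeout values, the user agent string and the referer are all assumptions to be tuned for the site you download from; the CURLOPT_* constants themselves are standard PHP cURL options.

```php
<?php
// Sketch only: the option values and strings below are placeholders.
function sketchApplyCurlOptions($ch) {
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow HTTP redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // but not endlessly
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);    // seconds to wait for a connection
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);          // seconds allowed for the whole transfer
    curl_setopt($ch, CURLOPT_USERAGENT, 'update_maxmind_dbfile/1.0'); // placeholder user agent
    curl_setopt($ch, CURLOPT_REFERER, 'http://example.com/');         // placeholder referer
}
```

In copyMaxmindFile() you would call it straight after curl_init($fromURL), i.e. sketchApplyCurlOptions($ch);. With redirects followed, the HTTP status check in that function applies to the final response rather than the redirect.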


Warning and disclaimer.


Use the script and suggestions on this page at your own risk.

I don’t claim to be an expert; I write about problems I’ve encountered and how I resolved them in my server environment. I’m human and make mistakes. Any code, content and suggestions are intended as a rough guide only.  I can’t guarantee that the solutions on this site will work for you.  Nor can I guarantee that they won’t be harmful to your particular site or business.


As always comments are welcome.

Andy Wrigley has worked in IT and Computer Audit for 30 years, and loves independent travel.



4 Comments

  1. Hi There,

    Would be helpful if you could post a sample file with basic variables filled in for maxmind lite with sample paths filled in. I keep getting “Could not open input file: /path/to/update_maxmind_dbfile.php”

    Thanks

    Mark

    • AW

      May 25, 2014 at 10:25 pm

      Hi Mark, you are right. I’ve been meaning to add additional comments to the sample script but that would also mean altering the line numbers mentioned in this article. I’ll do it when I get chance.

      That said the error msg you mention looks like the problem is in how you’ve set up your server scheduling (cron). To save cluttering these comments I’ve emailed you about it.

  2. This method is useful for updating the database from Maxmind, many thanks for sharing this article.

    My cron job is set to:
    5 8 * * 3 /usr/bin/wget -O /dev/null http://example.com/update_maxmind_dbfile.php

    GeoLite2 databases are updated on the first Tuesday of the month, so I run the script on Wednesdays at 8:05am. On HostGator hosting I use the command above.

    • AW

      May 29, 2014 at 12:29 pm

      Glad it works for you and thanks for providing an example cron used to schedule your script


Copyright © 2013-2017  Means.Us.Com