Back Up Your GMail Account with Linux and GetMail

About 150,000 Gmail users logged in over the weekend to find that everything had gone – email, chat logs, contacts, and attachments. Google is apparently working to restore the accounts, but in the meantime, take it as another vivid reminder of the flimsy nature of the Cloud and the value of local backups. Here, I show you how to back up your GMail account using Linux and GetMail.

First, of course, you need to install GetMail if it isn’t already installed on your system. If you’re using Ubuntu, type: –

sudo apt-get install getmail4

If you’re using Fedora, type: –

sudo yum install getmail

If you’re not using either of these, you can do the install manually with: –

cd /tmp
wget http://pyropus.ca/software/getmail/old-versions/getmail-4.20.0.tar.gz
tar xzvf getmail*.tar.gz
cd getmail-4.20.0
sudo python setup.py install
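
Whichever route you took, it's worth confirming that the getmail command actually ended up on your PATH before going any further. A minimal sketch (the getmail_status variable is just a name I've made up for illustration):

```shell
# Check whether the getmail command is reachable from this shell.
if command -v getmail >/dev/null 2>&1; then
    getmail_status="installed"
    # Print the version banner so you know which release you have.
    getmail --version
else
    getmail_status="missing"
    echo "getmail not found on PATH; re-check the install step" >&2
fi
echo "getmail is $getmail_status"
```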

Next you need to change some settings in your GMail account by turning on POP3, which will allow the downloading of your e-mails. Because you want a copy of all your mail, I recommend that you choose the “Enable POP for all mail” option. On the “When messages are accessed with POP” option, I would choose “Keep Gmail’s copy in the Inbox” so that Gmail still keeps your email after you back it up.

There are a couple of well-known methods to store email on UNIX-based systems – mbox and Maildir. When mail is stored in mbox format, all your mail is concatenated together in one huge file. In the Maildir format, each email is stored in a separate file. Needless to say, each method has different strengths and weaknesses. The mbox format is convenient because you only need to keep track of one file, but editing or deleting email from that huge file can be a real pain. And when one program is trying to write new email while another program is trying to edit the file, things can sometimes go wrong unless both programs are careful. Maildir is more robust, but it chews through inodes because each email is a separate file. It can also be harder to process Maildir files with regular Unix command-line tools, just because there are so many email files.

I’ll cover both options here.
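
To make the difference concrete, here is a small sketch that builds a throwaway two-message archive in each format and counts the messages. The addresses, file names and message text are all invented for the demo, and nothing here touches your real mail.

```shell
# Throwaway demo data in a temporary directory.
demo_dir=$(mktemp -d)

# mbox: one file, each message introduced by a "From " separator line.
cat > "$demo_dir/demo.mbox" <<'EOF'
From alice@example.com Mon Feb 28 10:00:00 2011
Subject: first

Hello.

From bob@example.com Mon Feb 28 11:00:00 2011
Subject: second

Hi again.

EOF

# Maildir: one file per message, kept under new/ or cur/.
mkdir -p "$demo_dir/Maildir/tmp" "$demo_dir/Maildir/new" "$demo_dir/Maildir/cur"
printf 'Subject: first\n\nHello.\n' > "$demo_dir/Maildir/new/msg1"
printf 'Subject: second\n\nHi again.\n' > "$demo_dir/Maildir/cur/msg2"

# Counting messages: separator lines for mbox, files for Maildir.
mbox_count=$(grep -c '^From ' "$demo_dir/demo.mbox")
maildir_count=$(find "$demo_dir/Maildir/new" "$demo_dir/Maildir/cur" -type f | wc -l | tr -d ' ')

echo "mbox messages:    $mbox_count"
echo "Maildir messages: $maildir_count"
```

The same two counting commands should work on a real backup once you have one; just point them at your own archive instead of the demo directory.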

mbox format

Make a directory called “.getmail” in your home directory with the command:-

mkdir ~/.getmail

This directory will store your configuration data and the debugging logs that GetMail generates. Next, make a directory called gmail-archive with the command:-

mkdir ~/gmail-archive

This directory will store your email. Then you need to make a file with the following:-

vi ~/.getmail/getmail.gmail

…and put the following text in it:-

[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
# replace with your GMail username!
username = bob@gmail.com
# replace with your GMail password!
password = bobpassword

[destination]
type = Mboxrd
path = ~/gmail-archive/gmail-backup.mbox

[options]
# print messages about each action (verbose = 2)
# Other options:
# 0 prints only warnings and errors
# 1 prints messages about retrieving and deleting messages only
verbose = 2
message_log = ~/.getmail/gmail.log

From the above, you’ll see that you need to create the file named in the path setting under [destination] with the command: –

touch ~/gmail-archive/gmail-backup.mbox

If you change the path in the file above, touch whatever filename you used. This command creates an empty file that GetMail can then append data to.
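
One extra step I'd suggest, although it isn't part of the setup above: getmail.gmail holds your GMail password in plain text, so it seems sensible to make sure only your own user can read it. A minimal sketch, assuming the same file locations used in this post:

```shell
# Make sure the directory and config file exist (touch leaves existing content alone).
mkdir -p ~/.getmail
touch ~/.getmail/getmail.gmail

# Owner-only access: 700 for the directory, 600 for the file with the password.
chmod 700 ~/.getmail
chmod 600 ~/.getmail/getmail.gmail

# Show the resulting permissions.
ls -ld ~/.getmail ~/.getmail/getmail.gmail
```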

Maildir format

You’d still run “mkdir ~/.getmail” and “mkdir ~/gmail-archive”. But the Maildir format uses three directories (tmp, new, and cur). We need to make those directories with the following:-

mkdir ~/gmail-archive/tmp 
mkdir ~/gmail-archive/new 
mkdir ~/gmail-archive/cur
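
If you'd rather not type the three mkdir commands one by one, the same layout can be sketched as a loop. This assumes the ~/gmail-archive location used above; mkdir -p also creates the parent directory if it's missing and doesn't complain if a directory already exists.

```shell
# Create the three Maildir subdirectories in one go.
for sub in tmp new cur; do
    mkdir -p "$HOME/gmail-archive/$sub"
done

# List what was created.
ls -d "$HOME/gmail-archive/tmp" "$HOME/gmail-archive/new" "$HOME/gmail-archive/cur"
```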

As above, make a file: –

vi ~/.getmail/getmail.gmail

…and put the following text in it:-

[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
username = bob@gmail.com
password = bobpassword

[destination]
type = Maildir
path = ~/gmail-archive/

[options]
# print messages about each action (verbose = 2)
# Other options:
# 0 prints only warnings and errors
# 1 prints messages about retrieving and deleting messages only
verbose = 2
message_log = ~/.getmail/gmail.log

You’ll notice only the [destination] part has changed from the mbox format example above. Then you run GetMail with a command such as: –

getmail -r /home/bob/.getmail/getmail.gmail

With any luck, you’ll see something like:-

getmail version 4.6.5
Copyright (C) 1998-2006 Charles Cazabon. Licensed under the GNU GPL version 2.
SimplePOP3SSLRetriever:bob@gmail.com@pop.gmail.com:995:
msg 1/99 (7619 bytes) from <info@example.com> delivered to Mboxrd /home/bob/gmail-archive/gmail-backup.mbox
msg 2/99 (6634 bytes) from <sales@example.com> delivered to Mboxrd /home/bob/gmail-archive/gmail-backup.mbox
…
99 messages retrieved, 0 skipped
Summary:
Retrieved 99 messages from SimplePOP3SSLRetriever:bob@gmail.com@pop.gmail.com:995

GMail will only allow you to download a certain number of messages at once. You can repeat the command until all of your email is downloaded, but remember to let GetMail finish each time before you run it again.
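
Repeating the command by hand gets tedious, so here is a rough sketch of a helper that re-runs a fetch command, one run at a time, until the summary says nothing was retrieved. The function name fetch_until_done and the run limit are my own inventions, and the "0 messages ... retrieved" pattern is only an assumption based on the sample output above; check what your version of GetMail actually prints before relying on it.

```shell
# Re-run a fetch command until it reports that nothing new was retrieved.
# Usage: fetch_until_done MAX_RUNS COMMAND [ARGS...]
fetch_until_done() {
    max_runs=$1
    shift
    run=0
    while [ "$run" -lt "$max_runs" ]; do
        run=$((run + 1))
        # Each run finishes completely before we decide whether to go again.
        if ! output=$("$@" 2>&1); then
            printf '%s\n' "$output"
            echo "run $run failed; stopping" >&2
            return 1
        fi
        printf '%s\n' "$output"
        # Assumed summary format, e.g. "0 messages retrieved, 0 skipped".
        if printf '%s\n' "$output" | grep -Eq '(^|[[:space:]])0 messages[^,]*retrieved'; then
            return 0
        fi
    done
    echo "still fetching after $max_runs runs; giving up" >&2
    return 2
}
```

You would then call it with something like: fetch_until_done 20 getmail -r /home/bob/.getmail/getmail.gmail. Note that it needs the summary line, so don't combine it with the -q option.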

While you can back up GMail manually simply by running the above command, it would be nice if Linux automatically updated its backup of your GMail account on a regular schedule. This can be achieved by setting up a cron job. Create a new script file in your home directory with the following command:-

vi ~/fetch-gmail.sh

and put the following lines in that file:-

#!/bin/bash
# Note: -q means fetch quietly so that this program is silent
/usr/bin/getmail -q -r /home/bob/.getmail/getmail.gmail

Remember to replace /home/bob with your actual Linux account home directory path! :-) Make sure that the file is readable/executable with the command: –

chmod u+rx /home/bob/fetch-gmail.sh

If you want to make sure the program works, run the command:-

/home/bob/fetch-gmail.sh

The program should execute without generating any output, but if there’s new email waiting for you it will be downloaded. This script needs to be silent or else you’ll get warnings when you run the script using cron. Now you’ll need to set up the cron job for this script by using:-

crontab -e

and add the following lines:-

# Every hour at 5 minutes past the hour, fetch my email from GMail.
5 * * * * /home/bob/fetch-gmail.sh

This crontab entry tells cron “Every hour, at five minutes past the hour, run the script fetch-gmail.sh”. You can obviously set cron to run the script and check your mail backups as often as you like.
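
If you schedule the script often, cron could start a new run while the previous one is still downloading, which is exactly the "let GetMail finish first" situation mentioned earlier. One way to guard against that, sketched here on the assumption that the flock utility from util-linux is available on your system, is to wrap the command in a lock. The function name run_locked and the lock file path are hypothetical.

```shell
# Run a command only if no other run currently holds the lock file.
# Usage: run_locked LOCK_FILE COMMAND [ARGS...]
run_locked() {
    lock_file=$1
    shift
    # -n: give up immediately with a non-zero exit instead of waiting for the lock.
    flock -n "$lock_file" "$@"
}
```

Inside fetch-gmail.sh, the getmail line would then become something like: run_locked /home/bob/.getmail/fetch-gmail.lock /usr/bin/getmail -q -r /home/bob/.getmail/getmail.gmail. A run that can't get the lock simply exits and leaves the job to the one already in progress.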

That’s it – you’ve successfully configured Linux to back up your GMail account. Feel smug!
