Wednesday, August 21, 2013

Caching Apt packages with Nginx - and without Squid

For a few years I have been using Squid-Deb-Proxy to cache downloaded Ubuntu deb packages for multiple machines on my home network. Early on in the project I contributed some code and fixed some bugs in Squid-Deb-Proxy, so I am fairly comfortable with the code and how it works.

Recently I wondered about changing the backend from a Squid Caching server to a Nginx Cache. Why? Well, because it seemed interesting, and Nginx is a fair bit less resource hungry than Squid.

The less expected advantages have been Nginx's better logging, which helped me track down a few issues, and finer control over the caching.

I installed Nginx on a Debian machine to cache packages for a handful of (K)ubuntu machines, but it could just as easily be installed on an Ubuntu machine.

I only needed the light version, so it's the usual 'sudo apt-get install nginx-light'. The configuration files were already set up in a useful fashion so I didn't touch them. I did need to add a site configuration to /etc/nginx/sites-available though:

proxy_cache_path /var/cache/nginx levels=1 keys_zone=STATIC:10m inactive=30d max_size=12g;

server {
  listen 8080;
  server_name mirror.optus.net;

  location ~ \.(deb|udeb)$ {
    proxy_pass http://mirror.optus.net;
    include /etc/nginx/proxy.conf;
  }

  location / {
    proxy_pass http://mirror.optus.net;
  }
}

server {
  # security.ubuntu.com, extras.ubuntu.com, changelogs.ubuntu.com and others...
  listen 8080;
  server_name ~^(.*)\.ubuntu\.com$;
  set $prefix $1;
  resolver 127.0.0.1;

  location ~ \.(deb|udeb)$ {
    proxy_pass http://$prefix.ubuntu.com;
    include /etc/nginx/proxy.conf;
  }

  location / {
    proxy_pass http://$prefix.ubuntu.com;
  }
}

server {
  listen 8080;
  server_name ppa.launchpad.net;

  location ~ \.(deb|udeb)$ {
    proxy_pass http://ppa.launchpad.net;
    include /etc/nginx/proxy.conf;
  }

  location / {
    proxy_pass http://ppa.launchpad.net;
  }
}

This is a bit of a mouthful, so let's step through it:

proxy_cache_path /var/cache/nginx levels=1 keys_zone=STATIC:10m inactive=30d max_size=12g;


The cache will be stored in /var/cache/nginx (so make sure this directory exists!) with one level of subdirectories. Packages that haven't been requested for 30 days will be removed, and the maximum total cache size is 12 GB.

If necessary, the Nginx cache manager will remove packages to keep within these limits.

server {
    listen 8080;
    server_name mirror.optus.net ;

  location ~ \.(deb|udeb)$ {
    proxy_pass http://mirror.optus.net ;
    include /etc/nginx/proxy.conf ;
  }

  location / {
    proxy_pass http://mirror.optus.net ;
  }
}

This is my local mirror. Nginx will listen on port 8080 for requests to mirror.optus.net. Requests for '.deb' and '.udeb' files are passed upstream, and the downloaded packages are cached as defined in /etc/nginx/proxy.conf – more on that later.

Other requests to mirror.optus.net are just passed on to the remote server. No caching for these. 

server {
  #security.ubuntu.com, extras.ubuntu.com, changelogs.ubuntu.com and others...
  listen 8080;
  server_name ~^(.*)\.ubuntu\.com$;
  set $prefix $1 ;
  resolver 127.0.0.1;


This is a bit more complicated and uses a regex as the server name. The regex will match any hostname ending in '.ubuntu.com', such as security.ubuntu.com or nz-archive.ubuntu.com, and captures the prefix so requests can be proxied on to the matching upstream host.

The resolver line is required because proxy_pass uses a variable, so Nginx has to resolve the upstream hostname at request time; without it you get an error. 127.0.0.1 assumes a local DNS resolver - substitute your router or ISP's DNS server if you don't run one.

Having put this file into /etc/nginx/sites-available, you need to make a symbolic link to it in /etc/nginx/sites-enabled. You could also put a copy of the file directly in /etc/nginx/sites-enabled, but symbolic links seem to be the Debian way.
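Assuming the file was saved as 'apt-cache' (the filename is illustrative - use whatever you called it), enabling it looks like:

```shell
# Enable the site the Debian way: link from sites-available into sites-enabled
sudo ln -s /etc/nginx/sites-available/apt-cache /etc/nginx/sites-enabled/apt-cache
```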

The final piece of work is to add the previously mentioned cache file /etc/nginx/proxy.conf:

proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
client_max_body_size 100m;
client_body_buffer_size 1m;
proxy_cache STATIC;
proxy_temp_path /var/cache/nginx/partial;
proxy_store_access user:rw group:rw all:r;
proxy_cache_valid 30d;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control;


Of note here is that /var/cache/nginx/partial needs to exist to store partially downloaded files, and the cache validity is 30 days.
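Both directories can be created in one go. The www-data owner is an assumption - use whichever user Nginx runs as on your system:

```shell
# Create the cache and temp directories and hand them to the Nginx worker user
sudo mkdir -p /var/cache/nginx/partial
sudo chown -R www-data:www-data /var/cache/nginx
```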

Reload Nginx with 'sudo service nginx reload' (running 'sudo nginx -t' first will catch any syntax errors) and check the logs for error messages.

Note that you do need to configure apt on each machine to connect to the proxy, and I'll cover this in a later post.
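As a teaser, the usual approach is a one-line proxy setting on each client; 'nginx-cache-host' below is a placeholder for your proxy machine's hostname or IP:

```
# /etc/apt/apt.conf.d/01proxy on each client machine
Acquire::http::Proxy "http://nginx-cache-host:8080";
```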

Saturday, December 26, 2009

Backups with Duplicity and Amazon S3

For a while now I have been keeping remote backups on the Amazon S3 service. Why Amazon S3? Well, it's very cheap, costing me only about US 20c a month for about 1-2 GB of compressed backups. Don't expect great support at these prices though - you are very much on your own.

My criteria for backups were the ability to do incremental backups, a site remote from my own computer, and ease of use. Less critical was the ability to encrypt data (my stuff isn't that private), but it seemed like a good idea anyway.

The obvious choice for doing the actual backup was Duplicity. Great program, though the documentation is a little hard to follow. Duplicity reads your files, compresses them into tarballs, then uploads the tarballs to the remote site along with an index file. When you do the next incremental backup, Duplicity downloads the index file(s) and checks what has changed. It then uploads incremental change files, with new indexes.

Duplicity can also encrypt the tarballs with your pgp key, making them secure from prying eyes. Not that I think Amazon really wants to sift through my files.

That's the background, the purpose of this post is to show a shell script that automates it all:

#!/bin/bash

# Pull in the Amazon credentials and gpg keys for this user
source ${HOME}/.backup-data/duplicity-amazon.conf

# List of directories to back up
INCLUDE_FILE="${HOME}/.backup-data/include_file_amazon.txt"

SOURCE="/"
DESTINATION="s3+http://your-bucket"

# Incremental backup, promoted to a full backup once a month
/usr/bin/duplicity -v5 --encrypt-key ${ENCRKEY} --sign-key ${SIGNKEY} \
--full-if-older-than 1M \
--include-globbing-filelist ${INCLUDE_FILE} \
--exclude '**' \
--allow-source-mismatch \
${SOURCE} \
${DESTINATION}

# Keep only the two most recent full backup chains
/usr/bin/duplicity remove-all-but-n-full 2 \
-v4 --encrypt-key ${ENCRKEY} --sign-key ${SIGNKEY} \
--force \
${DESTINATION}

# Reset the ENV variables so the credentials don't linger
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export PASSPHRASE=


This shell script is designed to be run in a multi-user system and backs up specified directories for each user - such as their home drive.

The first lines incorporate the user's pgp key and list of files to be backed up. These are the files duplicity-amazon.conf and include_file_amazon.txt, from the user's home directory.

The source for backing up is set as the file system root ('/'), and the backup destination is your chosen Amazon S3 bucket.

The first call to Duplicity does a full backup if the current backup is older than one month. All matching directories in the 'include file' are backed up; everything else is excluded by the '--exclude **' pattern.

The second call to Duplicity removes all but the two most recent full backup chains, which with monthly full backups keeps roughly the last two months.

Now the two files mentioned above. First the pgp key file:

#!/bin/sh

# Amazon keys
export AWS_ACCESS_KEY_ID='Your Amazon ID'
export AWS_SECRET_ACCESS_KEY='Your Amazon key'

# gpg encryption key
SIGNKEY='Your pgp key'
ENCRKEY=$SIGNKEY
# gpg passphrase
export PASSPHRASE='Your secret passphrase'

It is important that this file is stored in the user's home directory with read/write permissions for the user only. You don't want everyone seeing your pgp key and Amazon ID.

Note that the key environment variables are set to null before the main script ends. Again, this is for security.

Now the include file. Note that entries are relative to the system root we specified earlier:

/home/asimpson/Documents
/home/asimpson/simpson image media
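For completeness, getting files back follows the same pattern. This is a hedged sketch, assuming the same bucket and credentials file as above; note that paths inside the archive are relative to the backup root ('/'):

```shell
#!/bin/sh
# Restore one directory from the newest backup into /tmp/restored
. ${HOME}/.backup-data/duplicity-amazon.conf

/usr/bin/duplicity restore --encrypt-key ${ENCRKEY} \
  --file-to-restore home/asimpson/Documents \
  s3+http://your-bucket /tmp/restored

# Clear the credentials again
export AWS_ACCESS_KEY_ID= AWS_SECRET_ACCESS_KEY= PASSPHRASE=
```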

Saturday, December 5, 2009

Palm and Bluetooth

Having purchased a bluetooth dongle for next to nothing, I wanted to put it to good use. The thought was to use bluetooth to connect to my Palm Tungsten E2 PDA with Ubuntu Karmic, however it all got more difficult than I'd thought.

A quick howto on avoiding the problems:

Firstly the Palm has to have bluetooth enabled, and be made discoverable. A little bluetooth icon conveniently appears next to the battery level indicator to show that all is well.

Now over to the computer; plug in the bluetooth dongle and start up the 'bluetooth-applet' from the command line. With the Palm in range, it's possible to discover the device and 'pair' it to the Ubuntu desktop. The bluetooth applet will show a code number on the screen and ask you to type the number into the Palm to confirm that you 'own' both machines.

Using the bluetooth-applet you can now add files to the Palm. Interestingly the one file that I tried by this method got corrupted. Perhaps I was unlucky. I haven't tried this method since.

So far, so good, but this was where it all went wrong for me. I wanted to hotsync the Palm. To do this I installed J-Pilot, which has a nice graphical interface. Pilot-Link, which is command-line based, looks good too.

Nothing would make J-Pilot hotsync with the Palm. I constantly got error messages on the Palm about the serial port being in use by another application. Using Google Fu, I found articles that said to change the port in J-Pilot to 'net:any', add entries to /etc/bluetooth/rfcomm.conf, or even enable ip forwarding, set up PPP and iptables!

Like most things, the answer is real simple: The developers have kindly changed the port definitions. Maybe it's been communicated somewhere, or written on stone tablets in the source code, but I had a helluva time finding out this useful piece of information.

In J-Pilot (or Pilot-Link, etc), change the port setting to 'bt:'. Then it just works.