I use Django to develop web sites. I develop and test the code on my machine, uploading it from time to time to the test area on the server for the customer to review, and then upload the approved version to the main site. There are a number of ways to do that. The simplest one, which I used at the beginning, was to upload the files to the server over ssh using mc. This method is time-consuming and error-prone, since I had to take care of the following issues:
File permissions. On my machine, I own all files. On the server, I want all files to belong to root to minimize the possible consequences if a user account is broken into. In addition, settings.py should be readable only by the user the web server runs as (www-data on my server) and not by other users, because it contains the database password.
Configuration files. I want to have different configurations for development and production. For example, DEBUG in settings.py is set in development but not in production. Similarly, Apache configuration files have different settings and directories on the server and at home. These files do not change very often, so I usually either forget to update them or overwrite the server version.
Deleted files. If a file is deleted from the source tree, it should also be deleted from the server. One solution could be to remove the whole project directory on the server before copying the files; however, this solution has the configuration file problem.
File selection. I have some files in my source code directory that I don't want deployed on the server, for example, the .git directory, which contains the git repository of the project. So I have to exclude it whenever I upload the project for review or release.
Generated files. My projects contain a number of files that are not tracked in the SCM but need to be present in the deployed copy. For instance, byte-compiled Python code (.pyc files) cannot be saved by the web server process, since the directories are owned by root and the server runs as www-data; without pre-built .pyc files, the sources have to be recompiled every time a module is loaded, which hurts performance (and I want to keep the permissions that way for security). Another example is gettext translations (.mo files), which are necessary for the translations to work.
Different deployment locations. The customer review area and the main site are located in different directories on the server. They may have different databases; by definition, they have different code most of the time. Aside from the configuration file issue, this poses an additional problem: all of the above has to be dealt with twice, once for the customer review and once more for the actual release.
I deploy my sites on Debian GNU/Linux. Its packaging tools solve all of the above problems during package installation and upgrades. A package is a file containing a number of (usually related) files to be installed on a system in one step. With the introduction of packaging, my workflow didn't change: it's still develop, review, deploy. After testing the site at home, I create the packages (using dpkg-buildpackage), transfer them to the server, and install them there (using dpkg).
File permissions. When the package is installed on the server, the files are created with owner root, group root, and mode 0644. In addition, a post-installation script, executed automatically after the packaged files have been extracted, changes the permissions of settings.py.
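As a rough illustration of what such a post-installation step amounts to, here is a minimal Python sketch. It is not the actual maintainer script (those are usually shell scripts); the function name is hypothetical, and the 0640 mode and the root/www-data ownership are assumptions derived from the requirements described above.

```python
import os
import shutil
import stat
import tempfile


def restrict_settings(path, owner=None, group=None):
    """Make a settings file readable by its owner and group only (mode 0640).

    owner and group are optional because changing ownership needs root;
    on a real server they might be "root" and "www-data" (an assumption).
    """
    if owner is not None or group is not None:
        shutil.chown(path, user=owner, group=group)
    os.chmod(path, 0o640)
    # Return the resulting permission bits so the caller can verify them.
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # Demonstrate on a throwaway file instead of a real settings.py.
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "settings.py")
        with open(demo, "w") as fh:
            fh.write("DEBUG = False\n")
        print(oct(restrict_settings(demo)))
```

Run as root, a call like restrict_settings(path, owner="root", group="www-data") would let the web server read the file through its group while keeping it unreadable for everyone else.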
Configuration files are flagged as such during the package creation. If any of them change during an upgrade, I'm alerted by the package management system; I may choose to keep the existing version or overwrite it with the new version from the package. The tool offers the possibility to review the changes right in the middle of the installation, so I can choose either way and apply the differences after the installation is finished. In any case, the change doesn't go unnoticed.
Deleted files. Similarly, the package management system keeps track of the files that belong to the package and removes old non-configuration files before extracting the new ones, so the old files are not left on the server.
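The bookkeeping behind this can be pictured as a set difference between the file lists of the old and the new package version. The sketch below only illustrates the idea and is not how dpkg is implemented; the function name and the paths are hypothetical.

```python
def obsolete_files(old_files, new_files, conffiles=()):
    """Return files shipped by the old version but not by the new one.

    Configuration files are left out, since they are handled separately.
    """
    return sorted(set(old_files) - set(new_files) - set(conffiles))


# Hypothetical file lists of two versions of the same package.
old = [
    "/usr/share/mysite/views.py",
    "/usr/share/mysite/legacy.py",
    "/etc/mysite/settings.py",
]
new = [
    "/usr/share/mysite/views.py",
    "/etc/mysite/settings.py",
]
print(obsolete_files(old, new, conffiles=["/etc/mysite/settings.py"]))
```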
File selection. A package should contain the files to be installed. The Debian approach is to copy the files to be packaged into a separate directory (debian/<package name>) and then create the package from it, along with the control information. During the installation, those files are created, preserving the hierarchy, starting in the system's root directory. This also helps to solve a more general problem: you may want to install files into a different hierarchy than that of the source tree. E.g., you may want to put executables in /bin and configuration files in /etc in order to reduce surprise and comply with LSB.
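The staging step can be sketched in Python as follows. This is only an illustration under assumptions: the function name, the usr/share/mysite install prefix, and the exclusion patterns are hypothetical and not taken from the actual packages.

```python
import os
import shutil
import tempfile


def stage_tree(source_dir, staging_dir, install_prefix="usr/share/mysite"):
    """Copy source_dir into staging_dir/install_prefix, skipping files
    that should not be deployed (the patterns below are assumptions)."""
    target = os.path.join(staging_dir, install_prefix)
    shutil.copytree(
        source_dir,
        target,
        ignore=shutil.ignore_patterns(".git", "debian", "*.pyc"),
    )
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny fake source tree.
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, ".git"))
        os.makedirs(os.path.join(src, "polls"))
        for name in (".git/config", "polls/views.py", "polls/views.pyc"):
            with open(os.path.join(src, name), "w") as fh:
                fh.write("")
        # Stage it the way a package build would.
        staged = stage_tree(src, os.path.join(tmp, "debian", "mysite"))
        for root, _dirs, files in os.walk(staged):
            for name in files:
                print(os.path.relpath(os.path.join(root, name), staged))
```

The staging directory mirrors the target file system, so whatever ends up under debian/mysite/usr/share/mysite in this sketch would be installed under /usr/share/mysite.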
Generated files. Here, the optimal solution depends on the task details. Translations are pretty straightforward: one just needs to run msgfmt on them, so I build them during the package creation.
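A minimal sketch of that build step, assuming the usual Django layout of locale/<language>/LC_MESSAGES/django.po: the helper below (a hypothetical name) only collects the msgfmt invocations, one per .po file, without running them.

```python
import os
import tempfile


def msgfmt_commands(locale_dir):
    """Return one msgfmt command (as an argv list) per .po file found."""
    commands = []
    for root, dirs, files in os.walk(locale_dir):
        dirs.sort()  # walk subdirectories in a stable order
        for name in sorted(files):
            if name.endswith(".po"):
                po = os.path.join(root, name)
                mo = po[:-len(".po")] + ".mo"
                commands.append(["msgfmt", "-o", mo, po])
    return commands


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        messages = os.path.join(tmp, "locale", "de", "LC_MESSAGES")
        os.makedirs(messages)
        with open(os.path.join(messages, "django.po"), "w") as fh:
            fh.write('msgid ""\nmsgstr ""\n')
        for command in msgfmt_commands(os.path.join(tmp, "locale")):
            print(" ".join(command))
```

Each returned list could be passed to subprocess.check_call on a machine where gettext is installed.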
Byte-compilation is trickier, since the format of the generated files depends on the Python version. One solution could be to build one package for each target Python version. Since I have two deployment locations and use different Debian releases for development and production, with this approach I would end up with four to six packages, of which I would use only two. In addition, if I gave a package to a friend running a Python version I don't yet have, he would have to rebuild the package from source.
That is why I use python-support, one of the solutions provided by Debian. It frees the package maintainer from the burden of byte-compiling for all possible target Python versions and handles it transparently: the binary package includes only the Python source files, which are byte-compiled for every installed Python version during the package installation. With this solution, python-support is required on the target system, which is not a problem since it is in stable (even if this weren't the case, it would be very easy to backport since it doesn't have any dependencies except Python). In theory, this should also make the installation of a new Python version take longer if many packages using python-support are installed; I haven't tried that.
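To make the install-time byte-compilation concrete, here is a small sketch using the standard compileall module. It is not how python-support itself works internally; it only compiles for the interpreter that runs it, whereas python-support repeats the step for every installed Python version. The function name is hypothetical.

```python
import compileall
import importlib.util
import os
import tempfile


def byte_compile(tree):
    """Byte-compile every .py file under tree with the running interpreter.

    Returns True if all files compiled successfully.
    """
    return bool(compileall.compile_dir(tree, quiet=1))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "views.py")
        with open(source, "w") as fh:
            fh.write("ANSWER = 42\n")
        print(byte_compile(tmp))
        # Where this interpreter stores the compiled file for views.py.
        print(importlib.util.cache_from_source(source))
```

Current Python versions place the compiled file in a __pycache__ directory with the interpreter version in the file name, rather than next to the source as a plain .pyc file.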
Different deployment locations. I solve this issue by building two binary packages from a single source tree. First, the source code is configured for the main site location (file paths, database access parameters, etc.) and its binary package (mysite in the example below) is built. Then the source code is configured for the test area location and the respective package (mysite-test) is built.
Packaging does add another level of complexity, which one needs to understand and maintain. It can also be an advantage, though. For instance, I can quickly test a package on my home machine before uploading it to the review area; without packaging, that would not be a zero-cost operation, and I just wouldn't do it. Or I can check out the project in another directory, try to build the package, and see whether I forgot to add any files to version control (this can also be done automatically). This does sometimes happen in spite of heavy git-status usage (which is more or less mandatory for git index management, so I use it much more often, and read its output much more carefully :-), than I used to with similar tools of other SCM systems).
In Debian, the main package building logic is contained in the debian/rules file. It is usually a makefile with one or more rules describing how to build the package (see the Debian New Maintainers' Guide for more information). A minimal debian/rules file for a Django-based project could look like this:
    #!/usr/bin/make -f

    clean:
        # Delete debian/mysite
        # Delete debian/mysite-test

    build:
        # Create .mo files

    install: build
        # Copy .py files and templates to debian/mysite
        # Copy .py files and templates to debian/mysite-test
dpkg-buildpackage -uc -us -rfakeroot performs the following steps:
1. fakeroot debian/rules clean
2. Build the source package (a gzipped tar file with a description file).
3. debian/rules build. We need this rule only because dpkg-buildpackage executes it.
4. fakeroot debian/rules install
For some time after packaging ("debianizing") my Django projects, I performed all file selection and file moving operations in debian/rules. It worked well until I got bored with reinventing the wheel for the different deployment locations: I had started writing scripts that picked files and installed them into two directories, updating the paths accordingly. This, along with file generation, is something automake does very well, so I started using it.
automake's main file is Makefile.am; it specifies the targets to build and install in its own macro language. The software author runs automake, which generates the Makefile.in file, a makefile with some variables named like @var@. At build time, the Debian package maintainer runs autoconf's configure script, which substitutes the variables with the values supplied to or detected by configure (see the autotools tutorial for more information).
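The substitution idea itself is simple enough to sketch in a few lines of Python. This is an illustration of the @var@ mechanism only, not of autoconf's implementation; the function name and the template contents, including the DATABASE_PASSWORD setting name, are assumptions.

```python
import re


def substitute(template, values):
    """Replace every @name@ placeholder with values[name].

    Placeholders without a value are left untouched in this sketch.
    """
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)@", replace, template)


# A hypothetical settings template with two placeholders.
template = 'DEBUG = @DEBUG@\nDATABASE_PASSWORD = "@DB_PASSWD@"\n'
print(substitute(template, {"DEBUG": "True", "DB_PASSWD": "xyzzy"}))
```

Rendering the same template with two different value sets gives one output per deployment location, which is the property the two-package setup relies on.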
In my projects, I'm both the software author and the package maintainer. I set things up once and have rarely touched them since. dpkg-buildpackage runs configure, make, and make install for me every time I need to build a package.
On my home machine, I also run ./configure --datadir=`pwd` DEBUG=True DB_PASSWD=xyzzy whenever I check out a project into a new location (e.g., when I'm in the middle of a feature that isn't ready to commit and want to commit an unrelated, logically complete change). Here, autoconf puts the password in settings.py, and automake generates a makefile that can be used to build the .mo files, clean them, or install the Python files into a specified DESTDIR.
It would be possible to use a custom combination of cat and sed instead of autoconf. I use autoconf since it is required by automake anyway, and I think that configure.ac and its template files will be more readable for an experienced developer as complexity grows. A disadvantage is that, e.g., the Apache configuration files are not completely DRY and have a few repeating blocks. Perhaps these could be replaced with some nested macros; I haven't looked into that.
I need to change the autotools files whenever I add a new file to build or install, or when I add a new substitution variable.
I've debianized an example from the Django tutorial. You can create Debian packages or run configure, make, and make install DESTDIR=/tmp/mysite with it.
The binary packages won't work out of the box. They include an apache2 configuration for Debian; you'll need to enable it with a2ensite and fix or remove the certificate and authentication bits. You'll also have to run CREATE USER mysite PASSWORD 'xyzzy'; CREATE DATABASE mysite; GRANT ALL ON DATABASE mysite TO mysite; in su - postgres -c 'psql template1', assuming you're running PostgreSQL. It is possible to do that automatically after the package installation; I don't do that in my packages, since I don't have to deploy them on many systems and package upgrades often require schema changes anyway, so the effort to automate it is usually not worth it.
A tip if you use this approach: I run make-messages only after fakeroot debian/rules clean, since otherwise it also scans the Python sources in the debian/mysite and debian/mysite-test directories, which leads to spurious changes in the .po files.
Feel free to write to me if you have corrections, comments, or suggestions.