PostgreSQL

From HerzbubeWiki
Jump to navigation Jump to search

Debian packages

postgresql
postgresql-client


References

External:


On this wiki:

  • TODO


On a Debian system:

  • TODO


Glossary

Postgres / PostgreSQL
The original project name was Postgres, indicating that this was the successor of the Ingres database system. At some point the project was renamed to PostgreSQL to indicate its use of SQL. There has been some discussion of renaming the project back to Postgres because of various problems with the new name (pronounciation among them), but so far it has not happened. I am not sure whether this is still an open issue.
Database cluster
Where all databases of a PostgreSQL instance are stored


Starting/stopping the server

Debian allows to have more than one version of the PostgreSQL database server installed on the system at the same time. If you want to start/stop a specific version, do it like this:

/etc/init.d/postgresql start|stop 9.4


Managing database clusters

List clusters

The following command lists all PostgreSQL database clusters that currently exist (even if they are not running at the moment):

pg_lsclusters

Note that the command lists the TCP/IP port that each cluster is listening on. If you want the psql utility to connect to a specific non-default cluster, then you can specify the port to psql's command line option --port.


Start/stop clusters

To start/stop all existing clusters, use the usual init command:

/etc/init.d/postgresql start|stop

For finer grained control you can also start/stop a single cluster with the pg_ctlcluster command:

pg_ctlcluster 9.4 foo start
pg_ctlcluster 9.4 foo stop

Discussion:

  • 9.4 is the cluster version
  • foo is the name of the cluster; the default cluster created by Debian (possibly on other distros, too) is named "main"


Create a new cluster

The following utility creates a new PostgreSQL database cluster for version 9.4, named "foo":

pg_createcluster 9.4 foo

The database, configuration and log files are stored in

/var/lib/postgresql/9.4/foo
/etc/postgresql/9.4/foo
/var/log/postgresql/postgresql-9.4-foo.log


Delete a cluster

The following utility completely destroys an PostgreSQL database cluster:

pg_dropcluster 9.4 foo


Upgrade a cluster

The following command upgrades the version 9.4 cluster named "foo" to the latest version of PostgreSQL available on the system:

pg_upgradecluster 9.4 foo

Notes (from the man page):

  • The old cluster is given a new port so that the new cluster can keep the port
  • The old cluster is set to manual startup mode, i.e. it is no longer automatically started during system boot
  • After verifying that the new cluster works as expected, the old one should be destroyed with pg_dropcluster

Note from experience: Databases will be offline for the duration of the upgrade because pg_upgradecluster shuts down the old cluster and it can take several minutes until the new cluster is ready.


Database super user

Database role

Installing the PostgreSQL packages creates a database cluster whose database super user by default is

postgres

This is a role defined in the administration database "postgres", in the table "pg_roles", whose "rolsuper" attribute is set to true. Although I haven't been able to confirm this (yet), I strongly suspect that the role by default is not secured with a password. The section "access control" further down explains why this is not a problem.


System user

Installing the PostgreSQL packages creates the system user

postgres

The user does not have a password set in /etc/shadow, which makes it impossible to login using a regular login mechanism (PAM). However, it is possible to use the su command to assume the user's identity.


Access control

The default configuration of PostgreSQL on Debian ensures that only the system user "postgres" is allowed to connect as the database super user "postgres". This is achieved by this entry in

/etc/postgresql/9.4/main/pg_hba.conf

The entry is this:

# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
local   all         postgres                          peer

To quote from the documentation:

The peer authentication method works by obtaining the client's operating system user name from the kernel and using it as the allowed database user name (with optional user name mapping). This method is only supported on local connections.

Since nobody can login as the system user "postgres", and only the system super user "root" is allowed to switch user (su command line utility), access as the database super user is effectively restricted to the system super user.


Roles

Users vs. roles

In PostgreSQL versions before 8.1, users and groups were distinct kinds of entities. In newer versions of PostgreSQL there are only roles. The SQL commands "CREATE USER", "ALTER USER", "DROP USER", etc., still exist but are aliases to "CREATE ROLE", "ALTER ROLE", "DROP ROLE" etc.

While the concept of "users" is implemented by using roles, the two are not entirely synonymous. Specifically, a role that represents a user usually has the privilege to log in, while a non-user role does not.

Consequently, the SQL command "CREATE USER" creates a role with the login privilege, while "CREATE ROLE" creates a role without login privilege by default.


Managing roles

The following SQL commands are used to manage roles:

  • CREATE ROLE
  • ALTER ROLE
  • DROP ROLE

For managing user roles, there are two convenience command line tools that are wrappers around their corresponding SQL commands. The two utilities have man pages that can be consulted for details:

createuser   # wraps "CREATE ROLE"
dropuser     # wraps "DROP ROLE"

To list all roles that currently exist:

su postgres -c "psql -c 'SELECT rolname FROM pg_roles;'"

Changing a role's password:

alter role foo with password 'secret';
alter role foo with password null;

Because the SQL command includes single quotes, I was unable to form a single command line that uses both su and psql to change the password with SQL commands. The only alternative I have found so far is to change the password interactively in a psql session.


Access control

Access control is performed according to the rules in

/etc/postgresql/9.4/main/pg_hba.conf

The most important thing to know is that the first matching rule is used to perform authentication. All fields of a rule must match, i.e.

  • Connection type
  • Client address
  • Requested database
  • User name

For more details see the official documentation.


~/.pgpass

Passwords for certain users/roles can be stored in a password file which by default is located in the user's home directory and is named

~/.pgpass

The libpq library (which is also used by the psql utility) reads this file and uses passwords it finds there to perform automatical authentication. The format of the file content looks like this:

hostname:port:database:username:password 

For instance:

*:*:davical:davical_dba:secret

The password file must be protected like this:

root@pelargir:~# l /var/lib/postgresql/.pgpass 
-rw------- 1 postgres postgres 51 Jun 13 23:03 /var/lib/postgresql/.pgpass

In the example above, the file is located in the home directory /var/lib/postgresql of the PostgreSQL super user postgres.


Storage on disk

Depends on the version of PostgreSQL being used, and the name of the database cluster. For instance, for PostgreSQL 9.4 the directory that stores the data for the "main" database cluster is this:

/var/lib/postgresql/9.4/main


Databases

Note: The following commands all work with the default cluster on the local system. If you want to connect to a different cluster on the same host, specify the --port option. Read the next section for details about how psql connects to clusters.


List all databases, including their owners

sudo -u postgres psql -c '\l'

Create a new database

sudo -u postgres createdb dbname
sudo -u postgres createdb -O rolename dbname   # specify the role that is the owner of the database
sudo -u postgres psql -c 'CREATE DATABASE dbname OWNER rolename;'

Drop an existing database

sudo -u postgres dropdb dbname
sudo -u postgres psql -c 'DROP DATABASE dbname;'

Database backup:

sudo -u postgres pg_dumpall >/var/backups/postgresql/all-databases.sql
sudo -u postgres pg_dump --clean davical >/tmp/davical.sql  # davical = database name

Restore of a database:

sudo -u postgres psql -d davical </tmp/davical.sql

Connect to a database "davical":

sudo -u postgres psql davical


psql

Summary

The command line utility psql is a terminal-based frontend to PostgreSQL. It can be used both interactively and in "automated mode". Read the tool's man page for details.


Connection handling / authentication

If you specify neither --host nor --port the connection will be made via the default Unix domain socket. In that case, peer authentication will most likely kick in. The reason is this line in pg_hba.conf. The type "local" refers to connections made via Unix domain socket.

# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             all                                     peer


With peer authentication the name of the system user you are logged in as is taken as the database user name. This means that usually you will have to run psql in conjunction with su, something like this:

su username -c "psql [...]"


To use password-based authentication, you have to specify --host localhost. The reason is this line in pg_hba.conf. The type "host" refers to connections made via TCP/IP socket. 127.0.0.1 is, of course, equivalent to localhost.

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    all             all             127.0.0.1/32            md5


If you don't specify --port, connections are made to the default PostgreSQL port 5432, which means you will be talking to whichever PostgreSQL cluster is listening on that port. Use --port to address a different cluster. Use the pg_lsclusters command to see each cluster's TCP/IP port. Note that although you specify a TCP/IP port, psql will still connect via UNIX domain socket and therefore use peer authentication. The reason why this works is that there is a naming convention that maps port numbers to UNIX domain sockets. In the following example you see two sockets that are mapped to the TCP/IP ports 5432 (the default port) and 5433 (a non-default port) via their filesystem name:

root@pelargir:~# ls -l /var/run/postgresql/.*543* srwxrwxrwx 1 postgres postgres 0 Feb 2 14:16 /var/run/postgresql/.s.PGSQL.5432 srwxrwxrwx 1 postgres postgres 0 Feb 2 13:50 /var/run/postgresql/.s.PGSQL.5433


Basic usage

The following two variants can be used to run a list of commands. Note: In the variant that uses --command you cannot mix SQL and psql meta-commands (i.e. commands that begin with a backslash).

psql --host localhost --dbname=foo --username=bar --command="select count(*) from tablename"
echo "select count(*) from tablename" | psql --host localhost --dbname=foo --username=bar


The following two variants can be used to run the commands in a file. Note: In the variant that uses --file you will get error messages prefixed with line numbers so that you can quickly pin-point the location of any errors.

psql --host localhost --dbname=foo --username=bar </path/to/file
psql --host localhost --dbname=foo --username=bar --file=/path/to/file


Meta-commands

Meta-commands are commands that begin with a backslash (\).

The meta-command "\d" lists all tables (and other objects) of the current database, including the owner. For instance:

psql --host=localhost --dbname=roundcube --username=roundcube --command="\d"


phpPgAdmin

Web server configuration

The phppgadmin package adds its own configuration file

/etc/apache2/conf-available/phppgadmin.conf

I am assigning an Apache vhost to phpPgAdmin that is accessible under https://postgres.herzbube.ch/. These are the configuration details:

# --------------------------------------------------------------------------------
# postgres.herzbube.ch
# --------------------------------------------------------------------------------
<VirtualHost *:80>
  ServerName postgres.herzbube.ch
  Redirect permanent "/" "https://postgres.herzbube.ch/"
</VirtualHost>

# --------------------------------------------------------------------------------
# SSL Host
# --------------------------------------------------------------------------------
<VirtualHost *:443>
  ServerName postgres.herzbube.ch
  ServerAdmin webmaster@herzbube.ch
  ErrorLog ${APACHE_LOG_DIR}/postgres.herzbube.ch/error.log
  CustomLog ${APACHE_LOG_DIR}/postgres.herzbube.ch/access.log combined

  DocumentRoot /usr/share/phppgadmin
  # phpPgAdmin has a robots.txt that disallows everything

  <Directory /usr/share/phppgadmin/>
    Require all granted
    php_admin_flag engine on
  </Directory>

  Include conf-available/pelargir-herzbube.ch-vhosts-ssl.conf
</VirtualHost>


Database access

phpPgAdmin uses password authentication. The user must enter the credentials on the main phpPgAdmin login page. phpPgAdmin connects to the PostgreSQL server via TCP/IP on localhost - that's the channel where PostgreSQL allows password authentication.


Upgrading

Overview

Installing a new Debian PostgreSQL package will create a new cluster for the new version. There are two variants how you can go about migrating the data from the old to the new cluster. I prefer the first because it is less work.


Variant 1: Upgrading an existing cluster

Sequence of commands that illustrate the upgrade from 9.4 to 9.6. For a few more details see the Upgrade a cluster section above.

Notes:

  • The pg_upgradecluster command shuts down the old cluster. Since the upgrade can take several minutes your databases will be offline for the duration.
  • The pg_upgradecluster command takes care of reconfiguring the ports used by the old and the new clusters, so that after the upgrade psql commands operate on the new cluster.
  • It is possible to perform an upgrade across several versions. Tested when upgrading from a PostgreSQL 11 cluster (Debian 10) directly to a PostgreSQL 15 cluster (Debian 12), skipping the intermediate PostgreSQL 13 cluster (Debian 11).


# Drop the cluster that was auto-created by the new Debian package
pg_dropcluster --stop 9.6 main

# Upgrade the specified cluster to the newest PostgreSQL version installed on the system
pg_upgradecluster 9.4 main

# Drop the old cluster.
# !!! Only do this after you have verified that the new cluster works !!!
pg_dropcluster 9.4 main

# Delete old packages
aptitude


Variant 2: Move databases from old to new cluster

An alternative for upgrading to a new cluster version is to manually move databases from the old to the new cluster via backup/restore. I used to perform an upgrade like this before I started to trust the much simpler variant 1 (outlined in the previous section). I keep this section around for historical reasons, or if I ever need to perform an upgrade without disrupting services (the upgrade via pg_upgradecluster requires that databases are offline during the upgrade, which can take several minutes).


  • Copy access rules from old to new cluster. The file that needs updating is
/etc/postgresql/9.4/main/pg_hba.conf
  • Probably it will also be necessary to upgrade the new cluster's port in
/etc/postgresql/9.4/main/postgresql.conf
  • In addition, create database backups from the old cluster and restore the backups to the new cluster. Example for the davical database:
# old cluster
sudo -u postgres pg_dump --clean davical >/tmp/davical.sql
#
# new cluster; specify -p <port> to address the new cluster
#
# recreate roles on new cluster
sudo -u postgres createuser --no-superuser --no-createdb --no-createrole --pwprompt --encrypted davical_app
sudo -u postgres createuser --no-superuser --no-createdb --no-createrole --pwprompt --encrypted davical_dba
# recreate database
sudo -u postgres createdb -O davical_dba davical
# restore database content from backup
sudo -u postgres psql -d davical </tmp/davical.sql