Contents
Introduction
A generated column or a computed column is to columns what a view is to a table. PostgreSQL uses the term ‘Generated‘ columns for Computed columns. The value of the column is always computed or generated from other columns in the table. A generated column can either be virtual or stored. The values for virtual columns are computed on the fly during query time and they don’t take storage space. The values for stored columns are pre-computed and stored as part of table data.
Background
Modern databases like SQL Server and Oracle have long had Computed Columns and the lack of computed columns in PostgreSQL made migrations from other Databases quite difficult. This article attempts to explore the different ways of achieving the same functionality in different versions of PostgreSQL.
PostgreSQL 12
PostgreSQL 12 is the next major release of the world’s most popular and feature-rich open source database. The stable version of PostgreSQL 12 is scheduled to be released in late 2019 and supports Generated Columns.
The GENERATED ALWAYS AS
clause is used to create Generated columns. The expression used to define a generated column is called generation expression.
-- PostgreSQL syntax CREATE TABLE employee ( ..., dob timestamp, age integer GENERATED ALWAYS AS _ (date_part('year', CURRENT_TIME) - date_part('year', dob)) STORED );
Comparison with SQL Server Computed Column
Contrast this to SQL Server syntax. There is no special keyword in SQL Server to declare a computed column. You just specify the expression that makes us the computed column after the AS
clause.
-- SQL Server Syntax CREATE TABLE Employee ( ..., dob datetime, age AS DATEDIFF(year,dob,GETDATE()) PERSISTED );
Note the PERSISTED
clause which is the equivalent of PostgreSQL’s STORED
clause. SQL Server also supports non-persisted computed columns, you just don’t specify the PERSISTED
clause. But PostgreSQL currently implements only stored generated columns.
So How Is a Generated Column Different From a Regular Column With a DEFAULT Clause?
- The column default is evaluated once when the row is first inserted if no other value was provided; a generated column is updated whenever the row changes and cannot be overridden.
- A column with
DEFAULT
constraint can be given a value in anINSERT
orUPDATE
statement. Generated columns cannot be given values, they’re always computed. - A column default cannot refer to other columns of the table, whereas a generated columns is specifically meant to do so.
- A column default can use volatile functions, for example,
random()
orcurrent_time
, generated columns cannot.
Limitations and Restrictions
Several restrictions apply to the definition of generated columns and tables involving generated columns:
- The generation expression can only use immutable functions and not volatile functions. In SQL Server terminology, they’re called deterministic and non-deterministic functions, respectively. posgtres doc2
- The generation expression cannot use subqueries or reference anything other than the current row in any way. posgtres doc2
- A generation expression cannot reference another generated column. posgtres doc2
- A generation expression cannot reference a system column, except tableoid. posgtres doc2
- A generated column cannot have a column default or an identity definition. posgtres doc2
- A generated column cannot be part of a partition key. posgtres doc2
Additional Considerations
- Foreign tables can have generated columns. posgtres doc2
- Access privileges for Generated columns are maintained separately from their underlying base columns. So, you can grant access to roles to read from a generated column but not from the underlying base columns. posgtres doc2
- Generated columns are, conceptually, updated after
BEFORE
triggers have run. Therefore, changes made to base columns in aBEFORE
trigger will be reflected in generated columns. But generated columns themselves cannot be accessed inBEFORE
triggers. posgtres doc2
PostgreSQL 11.x and Older
The stable version of PostgreSQL 12 is yet to be released as of this writing and workloads running on older versions might still need this functionality. In PostgreSQL 11.x and older, there are two ways to achieve this:
Use a View
In this approach, the table doesn’t have the ‘age
‘ column. The view is used wherever ‘age
‘ column is needed.
CREATE VIEW v_employee AS SELECT dob, date_part('year', CURRENT_TIME) - date_part('year', dob) as age FROM employee;
Use a Normal Column and Update It With a Trigger
In this approach, ‘age
‘ is declared as a regular integer column and a trigger
populates the column during insert
s and update
s.
CREATE TABLE employee ( ..., dob timestamp, age integer );
CREATE OR REPLACE FUNCTION calc_age() RETURNS TRIGGER AS $body$ BEGIN NEW.age := date_part('year', CURRENT_TIME) - date_part('year', dob); RETURN NEW; END; $body$ LANGUAGE plpgsql;
The major drawback to this approach is the additional maintenance overhead of creating and maintaining views/triggers and remembering to update them when business logic changes, Upgrading your databases to PostgreSQL 12 when the stable version is released would be a good idea to mitigate the drawbacks and remove the maintenance overheads.
I hope you found this article useful. Please leave a comment if you have any questions, feedback or suggestions to improve the article or expand its content.