Contents
Introduction
PostgreSQL 12 is the next major release of the world’s most popular and feature-rich open source database. The stable version of PostgreSQL 12 is scheduled to be released in late 2019. Beta 4 of version 12 was released on 12th Sep 2019.
This article walks you through the summary of most important new features. enhancements and breaking changes in PostgreSQL 12.
New Features & Enhancements
CTE Enhancements
In PostgreSQL, CTEs are optimization fences (outer query restrictions are not passed on to CTEs) and the database evaluates the query inside the CTE and caches the results (i.e., materialized results) and outer WHERE clauses are applied later when the outer query is processed, which means either a full table scan or a full index seek is performed and results in horrible performance for large tables.
WITH AllUsers AS (SELECT * FROM Users) SELECT * FROM AllUsers WHERE Id = 100;
To overcome this, you rewrite this query in PostgreSQL as:
WITH UserRecord AS (SELECT * FROM Users WHERE Id = 100) SELECT * FROM UserRecord;
PostgreSQL 12 addresses this problem by introducing query optimizer hints to enable us to control if the CTE should be materialized or not: MATERIALIZED, NOT MATERIALIZED. So, the query can be rewritten as follows to achieve better performance.
WITH AllUsers AS NOT MATERIALIZED (SELECT * FROM Users) SELECT * FROM AllUsers WHERE Id = 100;
This effectively in-lines the CTE expression and results in significant performance improvement.
Generated Columns
SQL Server and Oracle have long had Computed Columns and the lack of computed columns in PostgreSQL made migrations from other Databases quite difficult. PostgreSQL 12 introduces Generated Columns to address this problem.
A generated column is to columns what a view is to a table. The value of the column is always computed from other columns in the table. A generated column can either be virtual or stored. The values for virtual columns are computed on the fly during query time and they don’t take storage space. The values for stored columns are pre-computed and stored as part of table data. PostgreSQL currently implements only stored generated columns.
The GENERATED ALWAYS AS
clause is used to create Generated columns
CREATE TABLE Employee ( ..., dob timestamp, age integer GENERATED ALWAYS AS (date_part('year', CURRENT_TIME - dob)) STORED );
In PostgreSQL 11.x and older, the only way to do this was to declare age as a normal column and write a trigger to populate the column during inserts and updates
CREATE OR REPLACE FUNCTION calc_age() RETURNS TRIGGER AS $body$ BEGIN NEW.age := date_part('year', CURRENT_TIME - NEW.dob::timestamp); RETURN NEW; END; $body$ LANGUAGE plpgsql;
The SQL Server Syntax for Computed Columns
Contrast this to SQL Server syntax. There is no special keyword in SQL Server to declare a computed column. You just specify the expression that makes us the computed column after the AS
clause.
-- SQL Server Syntax CREATE TABLE Employee ( ..., dob datetime, age AS DATEDIFF(year,dob,GETDATE()) PERSISTED );
Note the PERSISTED
clause which is the equivalent of PostgreSQL’s STORED
clause. SQL Server also supports non-persisted computed columns, you just don’t specify the PERSISTED
clause.
So how is a Generated column different from a normal column with a DEFAULT
clause?
- The column default is evaluated once when the row is first inserted if no other value was provided; a generated column is updated whenever the row changes and cannot be overridden.
- A column with
DEFAULT
constraint can be given a value in anINSERT
or UPDATE statement. Generated columns cannot be given values, they’re always computed. - A column default cannot refer to other columns of the table, whereas a generated columns is specifically meant to do so.
- A column default can use volatile functions, for example
random()
orcurrent_time
, generated columns cannot.
Re-Index Improvements
REINDEX CONCURRENTLY
can rebuild an index without blocking writes to its table. The CONCURRENTLY
clause was earlier available only in the CREATE INDEX
statement.
CREATE TABLE test (x INTEGER); INSERT INTO test SELECT generate_series(1, 1000000); CREATE INDEX i_test ON test (x); REINDEX INDEXCONCURRENTLY i_test;
In previous version of PostgreSQL, REINDEX
on large tables frequently caused blocking writes and severe performance issues.
JIT Compilation enabled by default
Enable Just-in-Time (JIT) compilation by default, if the server has been built with support for it. Note that this support is not built by default, but has to be selected explicitly while configuring the build. This is especially useful for data warehouse queries.
Query Parallelism Improvements
Allow parallelized queries when in SERIALIZABLE
isolation mode. Previously, parallelism was disabled when in this mode.
PL/pgSQL Enhancements
Allow SQL commands and variables with the same names as those commands to be used in the same PL/pgSQL function. For example, allow a variable called comment
to exist in a function that calls the COMMENT
SQL command. Previously this combination caused a parse error.
SQL/JSON Path Queries
PostgreSQL 12 now allows execution of JSON path queries per the SQL/JSON specification in the SQL:2016 standard. JSON path expressions let you evaluate a variety of arithmetic expressions and functions and compare values within JSON documents, similar to XPath expressions for XML. These expressions can be accelerated with GIN indexes which improves the execution of lookups across JSON data.
Partitioning Improvements
There is significant performance improvement in PostgreSQL 12 when processing tables with thousands of partitions for command that only need to use a subset of all the available partitions. The INSERT
and COPY
into a partitioned table also perform significantly faster compared to previous versions. ATTACH PARTITION
can now be performed without blocking concurrent queries on the partitioned table. Partitioned tables can now be referenced as foreign keys. Partition bounds can now be expressions.
Collation Improvements
PostgreSQL 12 now supports case-insensitive and accent-insensitive comparisons for ICU provided collations, also known as “nondeterministic collations“. When used, these collations can provide convenience for comparisons and sorts, but can also lead to a performance penalty as a collation may need to make additional checks on a string.
MCV statistics now supports multiple columns
The CREATE STATISTICS
command allows most-common-value statistics for multiple columns; previously only a single correlation value was recorded for multiple columns.
CREATE STATISTICS stts3 (mcv) ON state, city FROM zipcodes; ANALYZE zipcodes;
Migrating from older versions
Some changes in PostgreSQL 12 may affect compatibility with previous releases.
-
recovery.conf
is no longer used, and the server will not start if that file exists. recovery.signal andstandby.signal
files are now used to switch into non-primary mode.trigger_file
has been renamed to promote_trigger_file. Thestandby_mode
setting has been removed. - In new btree indexes, the maximum index entry length is reduced by eight bytes, to improve handling of duplicate entries. This means that a
REINDEX
operation on an index pg_upgrade’d from a previous release could potentially fail. DROP IF EXISTS FUNCTION
/PROCEDURE
/AGGREGATE
/ROUTINE
to generate an error if no argument list is supplied and there are multiple matching objects
References
https://www.postgresql.org/docs/12/release-12.html
https://www.postgresql.org/developer/roadmap/
https://www.postgresql.org/docs/12/ddl-generated-columns.html