PostgreSQL Internals

The Data

Where is all the data stored

In Postgres, all data is generally stored inside /base folder.

You can check where this folder is with:

Find which data folder belongs to which database with:

There are some different types of files for different purposes.

The Heap or Heap file, is the file that contains all the data (rows) of a table, (usually the files that has a number as name).

A single Heap file is divided into Blocks or Pages, where each page/block stores some number of rows.

At last, an individual row is called Tuple or Item.

Drawing
Drawing

The Connection

circle-exclamation

Connection pools

A pool internally maintains several different clients that can be reused.

Drawing

Explain & Explain Analyze

About

EXPLAIN is a tool to show the Planner Step of a query, which is the step that decides the most optimal way to execute the desired query.

EXPLAIN builds a query plan and display info about it.

EXPLAIN ANALYZE builds a query plan, run it, and info about it.

EXPLAIN ANALYZE output

circle-check

Lines starting with arrows -> are query nodes, which are the steps trying to access data inside the database or trying to do some processing. (The first line is also a query node, even though it doesn't have the arrow)

The order to read the EXPLAIN planning result, is starting from the inner most rows.

And the results from each the query nodes, will be passed to the nearest query node PARENT.

triangle-exclamation
Drawing

Costs

Not the full calculation, only some of the basic variables.

Drawing
circle-info

Tip

Not all query types will use all the variables.

For a sequencial read, as example, only #page read sequentially and #rows scanned.

pg_stats

circle-check

Check them with:

Scan types

Sequential scan

Will be used:

  • As default when no other valuable alternative is available.

  • When reading from the begining of the dataset. (e.g. SELECT * FROM table_name)

  • When the filtering clause is not very limiting, so that the end result will be almost the whole table contents.

It is fast when data is read sequencially.

Index nodes scans

Will be used:

  • When an index is used to access the dataset.

  • When data file and index files are separated but they are nearby.

Scans on index nodes can be separated in:

  • Index scan: index -> seeking the tuples -> the read again the data. (2 trip scan, it must get data of the index col and then get the rest of the data for other columns) (e.g. SELECT * FROM table_name WHERE index_col = 1)

  • Index Only scan: requested index columns only -> directly get data from index file. (when it only needs to return data from the index column) (e.g. SELECT index_col FROM table_name WHERE index_col = 1)

  • Bitmap index scan: builds a memory bitmap of where the tuples that satistfy the statement clauses.

Join nodes scans

Will be used:

  • When joining tables.

  • When joins are performaed on two tables at a time; If more tables are joined together, the output of one join is treated as input to a subsequent join.

When joining a large number of tables, the genetic query optimizer settings may affect what combinstions of joins are considered.

Scans on join nodes can be separated in:

  • Hashed join: Builds a hash table from the inner table, keyed by the join key, then scan the outer table, checking if a corresponding value is present.

  • Merged join: Joins two children already sorted bu their shared join key. This only needs to scan each relation once, but both inputs need to be sorted buy the join key first.

  • Nested loop: For each row in the outer table, iterate through all the rows in the inner table and see if they match the join condition. If the inner relation can be scanned with an index, that can improve the performance of a nested loop join. (This is generally an inefficient way to process joins but is always available and sometimes mey be the only option)

Indexes

triangle-exclamation
triangle-exclamation

About

Index is a data structure that efficiently tells which block/index a record is stored at, thus significantly reducing search time, and avoiding Full Table Scans.

circle-exclamation
circle-exclamation
Drawing

Types of index

Type
Description

B-Tree

General purpose index. 99% of the time you want this.

Hash

Speeds up simple equality checks.

GiST

Geometry, full-text search.

SP-GiST

Clustered data, such as dates - many rows might have the same year.

GIN

For columns that contain arrays of JSON data.

BRIN

Specialized for really large datasets.

B-Tree

It is the default index type when creating an index. B-Tree is a self-balancing tree that can run SELECT, INSERT, DELETE and Sequential reads in logarithmic time log n.

Hash

An index type only used for equality operators =.

They also can consume more space compared to B-Tree.

GIN

Stands for (Generalized Inverted Index), and its core capability is to speed up full text searches.

GIN index stores "key" (or an element or a value) and the "position list" pairs. The position list is the rowID of the key. This means, if the "key" occurs at multiple places in the document, GIN index stores the key only once along with its position of occurrences which not only keeps the GIN index compact in size and also helps speed-up the searches.

triangle-exclamation

BRIN

It is based on the range of values and data blocks on a storage.

It is smaller and less contly to maintain than B-Tree index.

Creating

Where index_name conventionally follows table_colname1_colname2_idx.

circle-exclamation

Deleting

CONCURRENTLY means when you execute the DROP statement, Postgres aquires an exclusive lock on the table and block other accesses until the index removal completes.

CASCADE means if the index has dependent objects, it will automatically drop these objects and all objects that depend on those objects. (It is recursive)

RESTRICT tells Postgres to refuse to drop the index if any objects depend on it. (Used by default)

Managing

Listing all indexes

Check the size of index table

Check how much space indexes take in relation to the table

Check stats for all indexes

Schemas

Schemas in Postgres are namespaces (like folders) to help organize things in a database. Each Schema can have its own separate copy of a table.

Drawing

Every database gets a default schema called "public".

circle-exclamation
circle-check
circle-check

Creating

When creating a new schema in a database, it will not pull the tables nor data from the public schema. It will be totally empty.

Creating tables in this schema

Moving tables between schemas

Access rights

There are two access levels rights:

  • USAGE: to access schema.

  • CREATE: to create objects in the schema.

More on user priviledges Roles/Users.

Changing default schema

Show how Postgres is handling the default value

Usually, the default result of this query will be "$user", public. This means that it will always - if not specified - try to use:

  1. First, a schema named after the User that holds the connection.

  2. Or if it does not exist, the schema named public.

Change the search path

This will change the search path to be test, public.

Meaning it will first, try to find a schema named test. And if not existent, then the public one.

Domains Data Types

CREATE DOMAIN statement creates a user-defined data type with a range, optional DEFAULT, NOT NULL and CHECK constraint.

They have to be unique within a Schema scope and cannot be used outside of the Schema scope where they are defined.

circle-info

They help standardize your database data types in one place.

circle-exclamation

Creating

Then use it in other tables like:

Listing

List all the Domain data types in a Schema with:

Deleting

triangle-exclamation

Last updated