Course Content
ACID Properties and Query Processing
0/2
PostgreSQL Replication
0/1
PostgreSQL Upgrade
0/1
PostgreSQL Tutorial for Absolute Beginners [Administration]
About Lesson

How optimizer chooses the best plan?

The optimizer chooses the best plan with the statistics that it has. PostgreSQL’s analyze will updates the statistics with Optimizer and assist it to choose the best plan.

We can view the statistics details with pg_stats and pg_statistic view.

Example.,

test_db=> SELECT * FROM pg_stats WHERE tablename = ‘dept’;

-[ RECORD 1 ]———-+———————————————

schemaname             | test_user

tablename              | dept

attname                | dept_name

inherited              | f

null_frac              | 0

avg_width              | 3

n_distinct             | -1

most_common_vals       |

most_common_freqs      |

histogram_bounds       | {CSE,ECE,EEE,IT}

correlation            | 0.8

most_common_elems      |

most_common_elem_freqs |

elem_count_histogram   |

-[ RECORD 2 ]———-+———————————————

schemaname             | test_user

tablename              | dept

attname                | dept_id

inherited              | f

null_frac              | 0

avg_width              | 8

n_distinct             | -1

most_common_vals       |

most_common_freqs      |

histogram_bounds       | {10,20,30,40}

correlation            | 1

most_common_elems      |

most_common_elem_freqs |

elem_count_histogram   |

<OUTPUT TRUNCATED>

 

Based on this data, PostgreSQL can estimate how many rows will be returned by any given part of query, and based on this information it can decide whether it’s better to use seq scan or index scan or bitmap index scan. And when joining – which one should be faster – Hash Join, Merge Join or perhaps Nested Loop.

So, during analyzing table, your postgres takes 300*default_statistc_target pages from disk and you can see that in your statistics view.

You can change this value in postgresql.conf file and you can do that at column  level with

ALTER TABLE dept ALTER COLUMN dept_name SET statistics 10;

There are few more settings in postgresql.conf file which can be used to influence your optimizer.

They are

  • enable_bitmapscan = ON
  • enable_hashagg = ON
  • enable_hashjoin = ON
  • enable_indexscan = ON
  • enable_indexonlyscan = ON
  • enable_material = ON
  • enable_mergejoin = ON
  • enable_nestloop = ON
  • enable_seqscan = ON
  • enable_sort = ON
  • enable_tidscan = ON

Example.,
test_db=> explain analyze select emp_sal from emp;
QUERY PLAN
———————————————————————————–
Bitmap Heap Scan on emp (cost=43.01..83.46 rows=2045 width=8) (actual time=0.140..0.584 rows=2045 loops=1)
Heap Blocks: exact=20
-> Bitmap Index Scan on emp_ix (cost=0.00..42.50 rows=2045 width=0) (actual time=0.114..0.114 rows=2045 loops=1)
Planning Time: 0.254 ms
Execution Time: 0.724 ms
(5 rows)

test_db=> SET enable_bitmapscan = OFF;
SET
test_db=> explain analyze select emp_sal from emp;
QUERY PLAN
———————————————————————————-
Index Only Scan using emp_ix on emp (cost=0.28..142.90 rows=2045 width=8) (actual time=0.067..1.325 rows=2045 loops=1)
Heap Fetches: 2045
Planning Time: 0.140 ms
Execution Time: 1.481 ms
(4 rows)