I have an application ported from using Oracle to Postgres – tuning queries is quite different between the two. While I know many of the Postgres devs have detailed blogs, I haven’t found many practical articles on the subject, so I’m writing up what I’ve found. I appreciate any further comments or resources people may have.
- When you fix a problem in a query in Oracle, you can often achieve a 10x speed-up. In Postgres, this is often 100x- not because Postgres is better, but because you seem to hit problems sooner.
- TRUNCATE behaves very differently than Oracle. In either case, TRUNCATE empties the table. In Oracle, it is as if you deleted the files on disk, and operates outside a transaction. In Postgres, it behaves as a DELETE without a where clause, although it does reclaim space immediately, and works within a transaction. That said I found that if you a script with a lot of DELETE/TRUNCATE statements, it can be helpful to explicitly call ANALYZE, otherwise you get maddeningly inconsistent query timings (without these, in my testing the total time for a script of ~50-100 queries was consistent, but the actual time of each query varied dramatically)
- Beware of common table expressions (WITH) – a lot of people seem to be discovering these lately, but not understanding the risks. These are a great way to break up code and make it readable, but in Postgres they are also optimization boundaries. When you write a table with a lot of WITH blocks, you see it materialize each block. Mathematically speaking, an operation may be much faster rearranged (see relational algebra).
- In some, but not all cases, I’ve had success turning a common table expression into a temp table, and adding indexes (It seems that one does not get indexes automatically when Postgres materializes a table).
- Like other database software, Postgres supports grouping values in sets based on either sorting or hashing (think “GROUP BY” or count(distinct X)). In my limited testing, I found Postgres was much more likely to use hashing when there were few columns (e.g., two), even though it is generally much faster. I haven’t yet found a way to encourage this for wider tables (if you think this might be an issue, you can prove this out by concatenating all the columns into one and doing your operation, even with the added work, it is pretty fast).
- Postgres seems to be very sensitive about re-ordering joins. In Oracle, I never had to think about this – if you can, just remove all your outer joins, or put them at the end. In some applications, you can trivially remove outer joins by adding a “null” row (when using numeric IDs, I like to set the ID on this to 0 by convention, which makes it easy to find).
- Pay attention to how often you see sorting – depending on the application, you may want to give more RAM for this, if you can. If Postgres does disk-sorting, it will be very slow.
- Bitmap indexes are awesome, but good luck figuring out how to make them turn on. Oracle has a bitmap index which acts completely differently than Postgres (other than the mashing of bits – there is no “star transform” in Postgres). I had a lot of success tuning full-text queries with these, and when they work they make everything super fast, but I have yet to figure out how to make them happen more often.
Hope this all helps – if you have any useful tips, please post in the comments below!
Need help with Postgres? Contact me for Postgres consulting.