{"id":2257,"date":"2014-04-08T20:54:41","date_gmt":"2014-04-08T20:54:41","guid":{"rendered":"http:\/\/www.garysieling.com\/blog\/?p=2257"},"modified":"2014-04-08T20:54:41","modified_gmt":"2014-04-08T20:54:41","slug":"postgres-developers-notes-pgconf-nyc-2014","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/postgres-developers-notes-pgconf-nyc-2014\/","title":{"rendered":"&#8220;Postgres for Developers&#8221; &#8211; Notes from PGConf NYC 2014"},"content":{"rendered":"<p>I saw a talk by one of the core Postgres developers, which showed a bunch of interesting tricks to handle business rules in Postgres specific SQL. These are all things you could find by reading the documentation, but they are interesting enough to write up examples to highlight some interesting things you can do. A lot of these end up being useful for writing systems with immutable data (especially auditing, and sometimes reporting systems).<\/p>\n<p><b>Example 1: Array Aggregation<\/b> <\/p>\n<p>&#8220;array_agg&#8221; can be used to combine rows, which sort of resembles a pivot table operation (this is the same set of values that would be passed as arguments to other aggregation functions)<\/p>\n<pre lang=\"sql\">\nselect y, array_agg(x) from (\n  select 1 x, 2 y\n  union all\n  select 2 x, 2 y\n  union all \n  select 3 x, 3 y\n) a\ngroup by y\n\n2;\"{1,2}\"\n3;\"{3}\"\n<\/pre>\n<p>If you use the above table as a common table expression, you can also rename the columns in the with block. You can even join on the arrays:<\/p>\n<pre lang=\"sql\">\nwith t(a, b) as\n(\n  select y, array_agg(x) from (\n    select 1 x, 2 y\n    union all\n    select 2 x, 2 y\n    union all \n    select 3 x, 3 y\n  ) a\n  group by y\n)\nselect * \nfrom t t1 join t t2 on t1.b[2] = t2.a\n\n2;\"{1,2}\";2;\"{1,2}\"\n<\/pre>\n<p><b>Example 2: Named Window Functions<\/b> <\/p>\n<p>I&#8217;m not sure yet whether this is just syntactic sugar or has real value, but you can set up named &#8220;windows.&#8221;<\/p>\n<p>By way of explanation, a lot of times when you start using aggregate functions (min, max, array_agg, etc), you end up using window functions, which resemble the following:<\/p>\n<pre lang=\"sql\">\nselect a, max(b) over (partition by a) \nfrom (\n  select 1 a, 1 b\n  union all \n  select 2 a, 1 b\n  union all \n  select 1 a, 2 b\n) t1\n\n1;2\n1;2\n2;1\n<\/pre>\n<p>These allow you do calculate aggregate functions (like min\/max) without combining all the rows.<\/p>\n<p>For instance, if you sort these values, you can find the &#8220;next&#8221; or &#8220;previous&#8221; row in the partition, which is pretty standard sql stuff:<\/p>\n<pre lang=\"sql\">\nselect a, lag(b) over (partition by a order by b) \nfrom (\n  select 1 a, 1 b\n  union all \n  select 2 a, 1 b\n  union all \n  select 1 a, 2 b\n) t1\n\n1;\n1;1\n2;\n<\/pre>\n<p>If you use the above table as a common table expression, you can then rename the columns in the WITH block. You can even join on the arrays:<\/p>\n<pre lang=\"sql\">\nwith t(a, b) as\n(\n  select y, array_agg(x) from (\n    select 1 x, 2 y\n    union all\n    select 2 x, 2 y\n    union all \n    select 3 x, 3 y\n  ) a\n  group by y\n)\nselect * \nfrom t t1 join t t2 on t1.b[2] = t2.a\n\n2;\"{1,2}\";2;\"{1,2}\"\n<\/pre>\n<p>What&#8217;s cool is you can move the &#8220;over partition by&#8221; part out of the query to the end as a named window, which presumably would be really nice if you had a lot of them, or wanted to re-use the same window for multiple fields:<\/p>\n<pre lang=\"sql\">\nselect a, lag(b) over w\nfrom (\n  select 1 a, 1 b\n  union all \n  select 2 a, 1 b\n  union all \n  select 1 a, 2 b\n) t1\nwindow w as (partition by a order by b) \n\n1;\n1;1\n2;\n<\/pre>\n<p><b>Example 3: Ranges<\/b><br \/>\nPostgres has a really cool feature, as of 9.2, where you can query whether something is in a range (ranges are a special type, kind of like the arrays above). This example is a bit contrived, to show that you could combine array_agg and range creation:<\/p>\n<pre lang=\"sql\">\nwith _data as (\n  select 1 a, 1 b\n  union all \n  select 2 a, 1 b\n  union all \n  select 1 a, 2 b\n  union all\n  select 2 a, 2 b\n),\n_history as (\n  select a, array_agg(b) _start, array_agg(b) _end\n  from _data\n  group by a\n)\nselect a, \n       _start[1], \n       _end[1], \n       int4range(_start[1]::integer, _end[2]::integer, '(]'::text) \nfrom _history\n\n1;1;1;\"[2,3)\"\n2;1;1;\"[2,3)\"\n<\/pre>\n<p>There are a bunch of range types built in (based on numerics, timestamps). Note that you can specify whether the endpoints on ranges are inclusive or exclusive.<\/p>\n<p>You can find out if a data value is within a range with the @> operator and see if two ranges overlap with &#038;&#038;. This set of functionality is great for exploring audit records &#8211; if you make a range with &#8220;[valid_from, valid_to)&#8221; you can query to find out what rows were effective on a particular date\/time, for instance.<\/p>\n<p>If you&#8217;re in this area of functionality, also check out <a href=\"http:\/\/www.postgresql.org\/docs\/9.1\/static\/btree-gist.html\">btree_gist indexes<\/a>, which may be helpful for tuning this.<\/p>\n<p><b>Example 4: DISTINCT ON<\/b><br \/>\nPostgres has a feature to pull back the first value for a row in a group by. I assume this is a performance feature, but at the least it&#8217;s a very concise syntax for something that would otherwise require the use of RANK(). I imagine that you&#8217;d always want to use an ORDER BY with it.<\/p>\n<p>The example from the docs for this one is pretty clear:<\/p>\n<pre lang=\"sql\">\nSELECT DISTINCT ON (location) location, time, report\nFROM weather_reports\nORDER BY location, time DESC;\n<\/pre>\n<p>There are a few other features that got a lot of play at the conference (e.g. foreign data wrappers) &#8211; more to come.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I saw a talk by one of the core Postgres developers, which showed a bunch of interesting tricks to handle business rules in Postgres specific SQL. These are all things you could find by reading the documentation, but they are interesting enough to write up examples to highlight some interesting things you can do. A &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/postgres-developers-notes-pgconf-nyc-2014\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;&#8220;Postgres for Developers&#8221; &#8211; Notes from PGConf NYC 2014&#8243;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[160,437],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/2257"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=2257"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/2257\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=2257"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=2257"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=2257"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}