
Commit bb766cd

JSON_TABLE: Add support for NESTED paths and columns
A NESTED path allows extracting data from nested levels of the JSON objects given by the parent path expression. That data is projected as columns specified using a nested COLUMNS clause, just like the parent COLUMNS clause. Rows constructed from NESTED columns are "joined" to the row constructed from the parent columns. If a particular NESTED path evaluates to 0 rows, then the nested COLUMNS will emit NULLs, making it an OUTER join.

NESTED columns themselves may include NESTED paths to allow extracting data from arbitrary nesting levels, which are likewise joined against the rows at the parent level.

Multiple NESTED paths at a given level are called "sibling" paths and their rows are combined by UNIONing them, that is, after being joined against the parent row as described above.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Jian He <jian.universality@gmail.com>

Reviewers have included (in no particular order):

Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
1 parent f6a2529 commit bb766cd
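
The outer-join behaviour described in the commit message can be modelled outside the database. The following is a minimal Python sketch, not PostgreSQL's implementation: `join_nested` and the `favorites` sample data are hypothetical names introduced only for illustration, and `None` stands in for SQL NULL.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def join_nested(parent: Row, items: List[dict], child_cols: List[str]) -> List[Row]:
    """Outer-join one parent row to the rows of a single NESTED path.

    `items` stands in for the JSON values matched by the nested path.
    """
    if not items:
        # No matches: keep the parent row and fill child columns with None (NULL).
        return [{**parent, **{col: None for col in child_cols}}]
    return [{**parent, **{col: item.get(col) for col in child_cols}} for item in items]


# Hypothetical input: two favorites, the second one with no films.
favorites = [
    {"kind": "comedy", "films": [{"title": "Bananas"}, {"title": "The Dinner Game"}]},
    {"kind": "drama", "films": []},
]

rows: List[Row] = []
for fav in favorites:
    rows.extend(join_nested({"kind": fav["kind"]}, fav["films"], ["title"]))

for row in rows:
    print(row)
```

The "drama" entry still yields one row, with `title` set to `None`, which is the outer-join effect the message refers to.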


17 files changed, +1207 -31 lines changed


doc/src/sgml/func.sgml

+153 -1
@@ -18893,6 +18893,24 @@ DETAIL: Missing "]" after array dimensions.
     row.
    </para>
 
+   <para>
+    JSON data stored at a nested level of the row pattern can be extracted using
+    the <literal>NESTED PATH</literal> clause.  Each
+    <literal>NESTED PATH</literal> clause can be used to generate one or more
+    columns using the data from a nested level of the row pattern.  Those
+    columns can be specified using a <literal>COLUMNS</literal> clause that
+    looks similar to the top-level COLUMNS clause.  Rows constructed from
+    NESTED COLUMNS are called <firstterm>child rows</firstterm> and are joined
+    against the row constructed from the columns specified in the parent
+    <literal>COLUMNS</literal> clause to get the row in the final view.  Child
+    columns themselves may contain a <literal>NESTED PATH</literal>
+    specification, thus allowing extraction of data located at arbitrary nesting
+    levels.  Columns produced by multiple <literal>NESTED PATH</literal>s at the
+    same level are considered to be <firstterm>siblings</firstterm> of each
+    other and their rows after joining with the parent row are combined using
+    UNION.
+   </para>
+
    <para>
     The rows produced by <function>JSON_TABLE</function> are laterally
     joined to the row that generated them, so you do not have to explicitly join
@@ -18924,6 +18942,7 @@ where <replaceable class="parameter">json_table_column</replaceable> is:
         <optional> { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON ERROR </optional>
   | <replaceable>name</replaceable> <replaceable>type</replaceable> EXISTS <optional> PATH <replaceable>path_expression</replaceable> </optional>
         <optional> { ERROR | TRUE | FALSE | UNKNOWN } ON ERROR </optional>
+  | NESTED <optional> PATH </optional> <replaceable>json_path_specification</replaceable> <optional> AS <replaceable>json_path_name</replaceable> </optional> COLUMNS ( <replaceable>json_table_column</replaceable> <optional>, ...</optional> )
 </synopsis>
 
   <para>
@@ -18971,7 +18990,8 @@ where <replaceable class="parameter">json_table_column</replaceable> is:
     <listitem>
      <para>
       Adds an ordinality column that provides sequential row numbering starting
-      from 1.
+      from 1.  Each <literal>NESTED PATH</literal> (see below) gets its own
+      counter for any nested ordinality columns.
      </para>
     </listitem>
    </varlistentry>
@@ -19060,6 +19080,33 @@ where <replaceable class="parameter">json_table_column</replaceable> is:
      </note>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <literal>NESTED <optional> PATH </optional></literal> <replaceable>json_path_specification</replaceable> <optional> <literal>AS</literal> <replaceable>json_path_name</replaceable> </optional>
+     <literal>COLUMNS</literal> ( <replaceable>json_table_column</replaceable> <optional>, ...</optional> )
+    </term>
+    <listitem>
+
+     <para>
+      Extracts SQL/JSON values from nested levels of the row pattern,
+      generates one or more columns as defined by the <literal>COLUMNS</literal>
+      subclause, and inserts the extracted SQL/JSON values into those
+      columns.  The <replaceable>json_table_column</replaceable>
+      expression in the <literal>COLUMNS</literal> subclause uses the same
+      syntax as in the parent <literal>COLUMNS</literal> clause.
+     </para>
+
+     <para>
+      The <literal>NESTED PATH</literal> syntax is recursive,
+      so you can go down multiple nested levels by specifying several
+      <literal>NESTED PATH</literal> subclauses within each other.
+      This allows unnesting the hierarchy of JSON objects and arrays
+      in a single function invocation rather than chaining several
+      <function>JSON_TABLE</function> expressions in an SQL statement.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
 
   <note>
@@ -19189,6 +19236,111 @@ SELECT jt.* FROM
   1 | horror   | Psycho  | "Alfred Hitchcock"
   2 | thriller | Vertigo | "Alfred Hitchcock"
 (2 rows)
+</screen>
+
+  </para>
+  <para>
+   The following is a modified version of the above query to show the usage
+   of <literal>NESTED PATH</literal> for populating title and director
+   columns, illustrating how they are joined to the parent columns id and
+   kind:
+
+<programlisting>
+SELECT jt.* FROM
+ my_films,
+ JSON_TABLE ( js, '$.favorites[*] ? (@.films[*].director == $filter)'
+   PASSING 'Alfred Hitchcock' AS filter
+   COLUMNS (
+    id FOR ORDINALITY,
+    kind text PATH '$.kind',
+    NESTED PATH '$.films[*]' COLUMNS (
+      title text FORMAT JSON PATH '$.title' OMIT QUOTES,
+      director text PATH '$.director' KEEP QUOTES))) AS jt;
+</programlisting>
+
+<screen>
+ id |   kind   |  title  |      director
+----+----------+---------+--------------------
+  1 | horror   | Psycho  | "Alfred Hitchcock"
+  2 | thriller | Vertigo | "Alfred Hitchcock"
+(2 rows)
+</screen>
+
+  </para>
+
+  <para>
+   The following is the same query but without the filter in the root
+   path:
+
+<programlisting>
+SELECT jt.* FROM
+ my_films,
+ JSON_TABLE ( js, '$.favorites[*]'
+   COLUMNS (
+    id FOR ORDINALITY,
+    kind text PATH '$.kind',
+    NESTED PATH '$.films[*]' COLUMNS (
+      title text FORMAT JSON PATH '$.title' OMIT QUOTES,
+      director text PATH '$.director' KEEP QUOTES))) AS jt;
+</programlisting>
+
+<screen>
+ id |   kind   |      title      |      director
+----+----------+-----------------+--------------------
+  1 | comedy   | Bananas         | "Woody Allen"
+  1 | comedy   | The Dinner Game | "Francis Veber"
+  2 | horror   | Psycho          | "Alfred Hitchcock"
+  3 | thriller | Vertigo         | "Alfred Hitchcock"
+  4 | drama    | Yojimbo         | "Akira Kurosawa"
+(5 rows)
+</screen>
+
+  </para>
+
+  <para>
+   The following shows another query using a different <type>JSON</type>
+   object as input.  It shows the UNION "sibling join" between
+   <literal>NESTED</literal> paths <literal>$.movies[*]</literal> and
+   <literal>$.books[*]</literal> and also the usage of
+   <literal>FOR ORDINALITY</literal> column at <literal>NESTED</literal>
+   levels (columns <literal>movie_id</literal>, <literal>book_id</literal>,
+   and <literal>author_id</literal>):
+
+<programlisting>
+SELECT * FROM JSON_TABLE (
+'{"favorites":
+    {"movies":
+      [{"name": "One", "director": "John Doe"},
+       {"name": "Two", "director": "Don Joe"}],
+     "books":
+      [{"name": "Mystery", "authors": [{"name": "Brown Dan"}]},
+       {"name": "Wonder", "authors": [{"name": "Jun Murakami"}, {"name":"Craig Doe"}]}]
+}}'::json, '$.favorites[*]'
+COLUMNS (user_id FOR ORDINALITY,
+  NESTED '$.movies[*]'
+    COLUMNS (
+    movie_id FOR ORDINALITY,
+    mname text PATH '$.name',
+    director text),
+  NESTED '$.books[*]'
+    COLUMNS (
+      book_id FOR ORDINALITY,
+      bname text PATH '$.name',
+      NESTED '$.authors[*]'
+        COLUMNS (
+          author_id FOR ORDINALITY,
+          author_name text PATH '$.name'))));
+</programlisting>
+
+<screen>
+ user_id | movie_id | mname | director | book_id |  bname  | author_id | author_name
+---------+----------+-------+----------+---------+---------+-----------+--------------
+       1 |        1 | One   | John Doe |         |         |           |
+       1 |        2 | Two   | Don Joe  |         |         |           |
+       1 |          |       |          |       1 | Mystery |         1 | Brown Dan
+       1 |          |       |          |       2 | Wonder  |         1 | Jun Murakami
+       1 |          |       |          |       2 | Wonder  |         2 | Craig Doe
+(5 rows)
 </screen>
 
   </para>
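
The sibling UNION and the per-path ordinality counters documented above can be reproduced with a small model. This is a Python sketch under stated assumptions, not the executor's algorithm: the dictionary-based path spec, `all_columns`, and `expand` are hypothetical, the root path is assumed to match the single "favorites" object, and a parent row with no child rows at all is assumed to be emitted once with NULL child columns.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def all_columns(spec: dict) -> List[str]:
    """List every output column of a path spec, including nested ones."""
    cols = [spec["ordinality"], *spec["columns"]]
    for child in spec.get("nested", []):
        cols += all_columns(child)
    return cols


def expand(items: List[dict], spec: dict) -> List[Row]:
    """Produce rows for one evaluation of a path over `items`."""
    rows: List[Row] = []
    siblings = spec.get("nested", [])
    nulls = dict.fromkeys(c for s in siblings for c in all_columns(s))
    for n, item in enumerate(items, start=1):  # counter restarts on every call
        own = {spec["ordinality"]: n}
        own.update({col: item.get(key) for col, key in spec["columns"].items()})
        joined = [
            {**own, **nulls, **child}
            for sib in siblings  # rows of sibling paths are UNIONed
            for child in expand(item.get(sib["key"], []), sib)
        ]
        # Assumption: with no child rows at all, the parent row is kept
        # once with NULL child columns (outer join).
        rows.extend(joined or [{**own, **nulls}])
    return rows


authors = {"key": "authors", "ordinality": "author_id", "columns": {"author_name": "name"}}
books = {"key": "books", "ordinality": "book_id", "columns": {"bname": "name"},
         "nested": [authors]}
movies = {"key": "movies", "ordinality": "movie_id",
          "columns": {"mname": "name", "director": "director"}}
root = {"ordinality": "user_id", "columns": {}, "nested": [movies, books]}

favorites = {
    "movies": [{"name": "One", "director": "John Doe"},
               {"name": "Two", "director": "Don Joe"}],
    "books": [{"name": "Mystery", "authors": [{"name": "Brown Dan"}]},
              {"name": "Wonder", "authors": [{"name": "Jun Murakami"},
                                             {"name": "Craig Doe"}]}],
}

rows = expand([favorites], root)
for row in rows:
    print(row)
```

Because `enumerate` starts over on each call to `expand`, `author_id` restarts at 1 for every book, which matches the `author_id` values in the last example's output.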

src/backend/catalog/sql_features.txt

+1 -1
@@ -553,7 +553,7 @@ T823 SQL/JSON: PASSING clause YES
 T824 JSON_TABLE: specific PLAN clause NO
 T825 SQL/JSON: ON EMPTY and ON ERROR clauses YES
 T826 General value expression in ON ERROR or ON EMPTY clauses YES
-T827 JSON_TABLE: sibling NESTED COLUMNS clauses NO
+T827 JSON_TABLE: sibling NESTED COLUMNS clauses YES
 T828 JSON_QUERY YES
 T829 JSON_QUERY: array wrapper options YES
 T830 Enforcing unique keys in SQL/JSON constructor functions YES

src/backend/nodes/nodeFuncs.c

+2
@@ -4159,6 +4159,8 @@ raw_expression_tree_walker_impl(Node *node,
                     return true;
                 if (WALK(jtc->on_error))
                     return true;
+                if (WALK(jtc->columns))
+                    return true;
             }
             break;
         case T_JsonTablePathSpec:
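
The added `WALK(jtc->columns)` call makes the raw expression tree walker descend into the column list of a NESTED column. A rough Python analogy, assuming a hypothetical `Column` node type and a `descend` switch that exist only in this sketch, shows what a walker would miss without that descent:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Column:
    """Hypothetical stand-in for a JSON_TABLE column node."""
    name: str
    columns: List["Column"] = field(default_factory=list)  # non-empty for NESTED


def walk(node: Column, visit: Callable[[Column], bool], descend: bool = True) -> bool:
    """Return True as soon as visit() returns True, mirroring the walker convention."""
    if visit(node):
        return True
    if descend:
        # Analogous to the added WALK(jtc->columns) call.
        for child in node.columns:
            if walk(child, visit, descend):
                return True
    return False


tree = Column("books", [Column("bname"), Column("authors", [Column("author_name")])])


def is_author(col: Column) -> bool:
    return col.name == "author_name"


found_with_descent = walk(tree, is_author)
found_without_descent = walk(tree, is_author, descend=False)
print(found_with_descent, found_without_descent)  # prints: True False
```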

src/backend/parser/gram.y

+36 -2
@@ -755,7 +755,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     MAPPING MATCH MATCHED MATERIALIZED MAXVALUE MERGE MERGE_ACTION METHOD
     MINUTE_P MINVALUE MODE MONTH_P MOVE
 
-    NAME_P NAMES NATIONAL NATURAL NCHAR NEW NEXT NFC NFD NFKC NFKD NO
+    NAME_P NAMES NATIONAL NATURAL NCHAR NESTED NEW NEXT NFC NFD NFKC NFKD NO
     NONE NORMALIZE NORMALIZED
     NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF
     NULLS_P NUMERIC
@@ -884,8 +884,11 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * the same precedence as IDENT.  This allows resolving conflicts in the
  * json_predicate_type_constraint and json_key_uniqueness_constraint_opt
  * productions (see comments there).
+ *
+ * Like the UNBOUNDED PRECEDING/FOLLOWING case, NESTED is assigned a lower
+ * precedence than PATH to fix ambiguity in the json_table production.
  */
-%nonassoc   UNBOUNDED       /* ideally would have same precedence as IDENT */
+%nonassoc   UNBOUNDED NESTED        /* ideally would have same precedence as IDENT */
 %nonassoc   IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
             SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
 %left       Op OPERATOR     /* multi-character ops and user-defined operators */
@@ -14270,6 +14273,35 @@ json_table_column_definition:
                     n->location = @1;
                     $$ = (Node *) n;
                 }
+            | NESTED path_opt Sconst
+                COLUMNS '(' json_table_column_definition_list ')'
+                {
+                    JsonTableColumn *n = makeNode(JsonTableColumn);
+
+                    n->coltype = JTC_NESTED;
+                    n->pathspec = (JsonTablePathSpec *)
+                        makeJsonTablePathSpec($3, NULL, @3, -1);
+                    n->columns = $6;
+                    n->location = @1;
+                    $$ = (Node *) n;
+                }
+            | NESTED path_opt Sconst AS name
+                COLUMNS '(' json_table_column_definition_list ')'
+                {
+                    JsonTableColumn *n = makeNode(JsonTableColumn);
+
+                    n->coltype = JTC_NESTED;
+                    n->pathspec = (JsonTablePathSpec *)
+                        makeJsonTablePathSpec($3, $5, @3, @5);
+                    n->columns = $8;
+                    n->location = @1;
+                    $$ = (Node *) n;
+                }
+        ;
+
+path_opt:
+            PATH
+            | /* EMPTY */
         ;
 
 json_table_column_path_clause_opt:
@@ -17688,6 +17720,7 @@ unreserved_keyword:
             | MOVE
             | NAME_P
             | NAMES
+            | NESTED
             | NEW
             | NEXT
             | NFC
@@ -18304,6 +18337,7 @@ bare_label_keyword:
             | NATIONAL
             | NATURAL
             | NCHAR
+            | NESTED
             | NEW
             | NEXT
             | NFC
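
The two new productions accept `NESTED`, an optional `PATH` keyword (`path_opt`), a string constant, an optional `AS name`, and then `COLUMNS`. The following Python sketch is a hand-written approximation of that shape over a pre-split token list. It is not how the Bison-generated parser works, and `parse_nested_header` is a hypothetical helper that stops right after `COLUMNS`.

```python
from typing import List, Optional, Tuple


def parse_nested_header(tokens: List[str]) -> Tuple[str, Optional[str], int]:
    """Parse NESTED [PATH] 'pathspec' [AS name] COLUMNS.

    Returns (path, name, index of the token following COLUMNS).
    """
    def tok(i: int) -> str:
        # Out-of-range reads yield "" so that checks fail with SyntaxError.
        return tokens[i] if i < len(tokens) else ""

    pos = 0
    if tok(pos).upper() != "NESTED":
        raise SyntaxError("expected NESTED")
    pos += 1
    if tok(pos).upper() == "PATH":  # path_opt: PATH | /* EMPTY */
        pos += 1
    path = tok(pos)
    if len(path) < 2 or not (path.startswith("'") and path.endswith("'")):
        raise SyntaxError("expected a string constant")
    pos += 1
    name = None
    if tok(pos).upper() == "AS":  # second production: ... Sconst AS name
        name = tok(pos + 1)
        if not name:
            raise SyntaxError("expected a path name after AS")
        pos += 2
    if tok(pos).upper() != "COLUMNS":
        raise SyntaxError("expected COLUMNS")
    return path[1:-1], name, pos + 1


with_path = parse_nested_header(["NESTED", "PATH", "'$.films[*]'", "COLUMNS", "("])
without_path = parse_nested_header(["NESTED", "'$.books[*]'", "AS", "b", "COLUMNS", "("])
print(with_path)
print(without_path)
```

When no name is given the sketch returns `None` for it, loosely mirroring the `NULL` name argument passed to `makeJsonTablePathSpec` in the first production.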
