README.md: 135 additions & 0 deletions
@@ -770,6 +770,141 @@ initialization of VOPS extension. After invocation of this function,
770
770
extension will be loaded and all subsequent queries will be normally
771
771
transformed and produce expected results.
772
772
773
+
## <span id="projections">VOPS projections and automatic table substitution</span>
774
+
775
+
VOPS provides functions that simplify the creation and use of projections.
776
+
In the future this may be added to the SQL grammar, making it possible to write
777
+
`CREATE PROJECTION xxx OF TABLE yyy(column1, column2,...) GROUP BY (column1, column2, ...)`.
778
+
For now, a projection is created with the `create_projection(projection_name text, source_table regclass, vector_columns text[], scalar_columns text[] default null, order_by text default null)` function.
779
+
The first argument of this function specifies the name of the projection, the second refers to an existing Postgres table, `vector_columns` is an array of
780
+
column names which should be stored as VOPS tiles, `scalar_columns` is an array of grouping columns whose type is preserved, and
781
+
the optional `order_by` parameter specifies the name of the ordering attribute (explained below).
782
+
The `create_projection(PNAME,...)` function does the following:
783
+
784
+
1. Creates the projection table with the specified name and attributes.
785
+
2. Creates the `PNAME_refresh()` function, which can be used to update the projection.
786
+
3. Creates functional BRIN indexes on the `first()` and `last()` functions of the ordering attribute (if any).
787
+
4. Creates a BRIN index on the grouping attributes (if any).
788
+
5. Inserts information about the created projection into the `vops_projections` table. This table is used by the optimizer to
789
+
automatically substitute the table with the projection.
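
The steps above are triggered by a single call. A minimal sketch, assuming a hypothetical source table `trades` (the table, its columns, and the projection name `vops_trades` are illustrative and not part of VOPS):

```sql
-- Hypothetical source table used only for illustration.
create table trades(
    symbol     text,     -- scalar (grouping) column, its type is preserved
    trade_date date,     -- vector column, also used as the ordering attribute
    price      float8,   -- vector column
    volume     int4      -- vector column
);

-- Creates table vops_trades and function vops_trades_refresh(),
-- builds the BRIN indexes, and registers the projection in vops_projections.
select create_projection(
    'vops_trades',                            -- projection_name
    'trades',                                 -- source_table
    array['trade_date', 'price', 'volume'],   -- vector_columns
    array['symbol'],                          -- scalar_columns
    'trade_date'                              -- order_by
);
```

With these names, the generated refresh function would be called as `select vops_trades_refresh();`.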
790
+
791
+
The `order_by` attribute is one of the projection's vector columns, by which the data is sorted. Usually it is some kind of timestamp
792
+
used in *time series* (for example, a trade date). The presence of such a column allows the projection to be updated incrementally.
793
+
The generated `PNAME_refresh()` function calls the `populate` function with the corresponding values of the `predicate` and
794
+
`sort` parameters, selecting from the original table only rows whose `order_by` value is greater than the maximal
795
+
value of this column in the projection. This assumes that `order_by` is unique, or at least that the refresh is done at a moment when there is a gap
796
+
in the collected events. In addition to `order_by`, the sort list for `populate` includes all scalar (grouping) columns.
797
+
This makes it possible to efficiently group the imported data by scalar columns and fill VOPS tiles (vector columns) with data.
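
As an illustration of incremental refresh, the sketch below assumes a projection `vops_trades` created over a hypothetical `trades(symbol, trade_date, price, volume)` table with `trade_date` as `order_by`. All of these names are assumptions made for the example.

```sql
-- First call: the projection holds no data yet, so every row of trades qualifies.
select vops_trades_refresh();

-- New events arrive in the source table. Their dates are assumed to be
-- later than anything imported so far.
insert into trades values
    ('AAA', '2018-01-02', 10.5, 100),
    ('BBB', '2018-01-02', 20.0, 250);

-- Second call: only rows whose trade_date is greater than the maximal
-- trade_date already stored in the projection are imported.
select vops_trades_refresh();
```

A row inserted later with a `trade_date` equal to or smaller than the maximum already in the projection would not be picked up, which is why the refresh should run when there is a gap in the incoming events.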
798
+
799
+
When the `order_by` attribute is specified, VOPS creates two functional BRIN indexes on the `first()` and `last()`
800
+
functions of this attribute. These indexes make it possible to select time slices efficiently. If the original query contains
801
+
a predicate like `(trade_date between '01-01-2017' and '01-01-2018')`, then the VOPS projection substitution mechanism adds
802
+
`(first(trade_date) >= '01-01-2017' and last(trade_date) <= '01-01-2018')` conjuncts, which allow the Postgres optimizer to use the BRIN
803
+
indexes to locate the affected pages.
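
One way to check whether the BRIN indexes are used for a time slice is to inspect the plan of a range query. A sketch, assuming a hypothetical `trades` table with a `vops_trades` projection ordered by `trade_date`, automatic substitution enabled, and an `EXPLAIN` output that reflects the substituted query (none of this is confirmed by the text above):

```sql
-- Allow queries on trades to be redirected to the projection.
set vops.auto_substitute_projections to on;

-- Time-slice aggregate over a vector column of the hypothetical table.
explain
select sum(price)
from trades
where trade_date between '01-01-2017' and '01-01-2018';
```

The plan actually chosen depends on data volume and planner settings, so an index scan is not guaranteed.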
804
+
805
+
In addition to the BRIN indexes for the `order_by` attribute, VOPS also creates a BRIN index on the grouping (scalar) columns.
806
+
This index makes it possible to select groups efficiently and to perform index joins.
807
+
808
+
Like materialized views, VOPS projections are not updated automatically. It is the responsibility of the programmer to refresh them periodically.
809
+
It is certainly possible to define a trigger or rule that automatically inserts data into the projection table when the original table is updated.
810
+
But such an approach would be extremely inefficient and slow. To take advantage of vector processing, VOPS has to group data into tiles.
811
+
This can be done only when there is a batch of data that can be grouped by scalar attributes. If you insert records into the projection table one by one,
812
+
most VOPS tiles will contain just one element.
813
+
The most convenient way to refresh a projection is to use the generated `PNAME_refresh()` function.
814
+
If the `order_by` attribute is specified, this function imports from the original table only the new data (not yet present in the projection).
815
+
816
+
The main advantage of the VOPS projection mechanism is that it can automatically substitute projections for the original tables in queries.
817
+
The `vops.auto_substitute_projections` configuration parameter switches this substitution on.
818
+
By default it is switched off, because VOPS projections may not be synchronized with the original table, and a query on the projection may return a different result.
819
+
Currently, projections can be automatically substituted only if:
820
+
821
+
1. The query doesn't contain joins.
822
+
2. The query performs aggregation of vector (tile) columns.
823
+
3. All other expressions in the target list and the `ORDER BY` / `GROUP BY` clauses refer only to scalar attributes of the projection.
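
The conditions above can be illustrated with two queries against a hypothetical `trades(symbol, trade_date, price, volume)` table, where `symbol` is the only scalar column of its projection. The table and column names are assumptions made for this sketch.

```sql
-- Substitution is off by default; enable it for the current session.
set vops.auto_substitute_projections to on;

-- Candidate for substitution: no joins, aggregates over vector columns,
-- and only the scalar column symbol in the target list, GROUP BY and ORDER BY.
select symbol, sum(volume), avg(price)
from trades
group by symbol
order by symbol;

-- Not a candidate: the target list references vector columns
-- (trade_date, price) outside of any aggregate.
select trade_date, price
from trades
where symbol = 'AAA';
```

Because the projection may lag behind the source table, the first query can return results that differ from the same query run with substitution switched off.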
824
+
825
+
A projection can be removed using the `drop_projection(projection_name text)` function.
826
+
It not only drops the corresponding table, but also removes information about it from the `vops_projections` table.
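
For example, assuming a projection that was created under the hypothetical name `vops_trades`:

```sql
-- Drops the projection table and removes its registration,
-- so queries are no longer redirected to it.
select drop_projection('vops_trades');
```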