A <firstterm>final function</firstterm>
can also be specified, in case the desired result of the aggregate
is different from the data that needs to be kept in the running
- state value. The final function takes the last state value
+ state value. The final function takes the ending state value
and returns whatever is wanted as the aggregate result.
In principle, the transition and final functions are just ordinary
functions that could also be used outside the context of the
@@ -509,6 +509,102 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
and therefore there is no need for them to support moving-aggregate mode.
</para>

+ </sect2>
+
+ <sect2 id="xaggr-partial-aggregates">
+ <title>Partial Aggregation</title>
+
+ <indexterm>
+ <primary>aggregate function</primary>
+ <secondary>partial aggregation</secondary>
+ </indexterm>
+
+ <para>
+ Optionally, an aggregate function can support <firstterm>partial
+ aggregation</>. The idea of partial aggregation is to run the aggregate's
+ state transition function over different subsets of the input data
+ independently, and then to combine the state values resulting from those
+ subsets to produce the same state value that would have resulted from
+ scanning all the input in a single operation. This mode can be used for
+ parallel aggregation by having different worker processes scan different
+ portions of a table. Each worker produces a partial state value, and at
+ the end those state values are combined to produce a final state value.
+ (In the future this mode might also be used for purposes such as combining
+ aggregations over local and remote tables; but that is not implemented
+ yet.)
+ </para>
+
+ <para>
+ To support partial aggregation, the aggregate definition must provide
+ a <firstterm>combine function</>, which takes two values of the
+ aggregate's state type (representing the results of aggregating over two
+ subsets of the input rows) and produces a new value of the state type,
+ representing what the state would have been after aggregating over the
+ combination of those sets of rows. It is unspecified what the relative
+ order of the input rows from the two sets would have been. This means
+ that it's usually impossible to define a useful combine function for
+ aggregates that are sensitive to input row order.
+ </para>
+
+ <para>
+ As simple examples, <literal>MAX</> and <literal>MIN</> aggregates can be
+ made to support partial aggregation by specifying the combine function as
+ the same greater-of-two or lesser-of-two comparison function that is used
+ as their transition function. <literal>SUM</> aggregates just need an
+ addition function as combine function. (Again, this is the same as their
+ transition function, unless the state value is wider than the input data
+ type.)
+ </para>
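The MAX and SUM cases described above can be sketched with ordinary CREATE AGGREGATE commands. This is only an illustration, not part of the patch: the aggregate names my_max and my_sum are hypothetical, while int4larger and int4pl are existing built-in functions. The sketch assumes a server version that accepts the COMBINEFUNC and PARALLEL options (9.6 or later).

```sql
-- Hypothetical aggregate: greatest int4 value.
-- The same greater-of-two function serves as transition and combine function.
CREATE AGGREGATE my_max(int4) (
    SFUNC = int4larger,
    STYPE = int4,
    COMBINEFUNC = int4larger,
    PARALLEL = SAFE
);

-- Hypothetical aggregate: int4 sum with an int4 state.
-- State type equals input type, so plain addition works for both roles.
-- (A wider state type would need a separate combine function.)
CREATE AGGREGATE my_sum(int4) (
    SFUNC = int4pl,
    STYPE = int4,
    COMBINEFUNC = int4pl,
    PARALLEL = SAFE
);
```

Because my_sum keeps an int4 state it can overflow on large inputs; it is meant to show the shape of the definition rather than to replace the built-in sum.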
+
+ <para>
+ The combine function is treated much like a transition function that
+ happens to take a value of the state type, not of the underlying input
+ type, as its second argument. In particular, the rules for dealing
+ with null values and strict functions are similar. Also, if the aggregate
+ definition specifies a non-null <literal>initcond</>, keep in mind that
+ that will be used not only as the initial state for each partial
+ aggregation run, but also as the initial state for the combine function,
+ which will be called to combine each partial result into that state.
+ </para>
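To make the role of a non-null initcond concrete, here is a minimal sketch of an averaging aggregate written entirely in SQL. Every name in it (my_avg, my_avg_trans, my_avg_combine, my_avg_final) is hypothetical and chosen for illustration only. The state is assumed to be a two-element int8 array holding the running sum and the row count.

```sql
-- Transition function: add one input value to {sum, count}.
CREATE FUNCTION my_avg_trans(s int8[], v int4) RETURNS int8[]
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT ARRAY[s[1] + v, s[2] + 1] $$;

-- Combine function: second argument is a state value, not an input value.
CREATE FUNCTION my_avg_combine(a int8[], b int8[]) RETURNS int8[]
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT ARRAY[a[1] + b[1], a[2] + b[2]] $$;

-- Final function: divide sum by count, yielding NULL for zero rows.
CREATE FUNCTION my_avg_final(s int8[]) RETURNS numeric
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT CASE WHEN s[2] = 0 THEN NULL
                  ELSE s[1]::numeric / s[2] END $$;

CREATE AGGREGATE my_avg(int4) (
    SFUNC = my_avg_trans,
    STYPE = int8[],
    FINALFUNC = my_avg_final,
    COMBINEFUNC = my_avg_combine,
    INITCOND = '{0,0}',   -- also the starting state for the combine step
    PARALLEL = SAFE
);
```

The initial condition {0,0} is an identity element for my_avg_combine, so starting the combine step from it leaves the result unchanged. A query such as SELECT my_avg(x) FROM generate_series(1, 10) AS g(x) should agree with the built-in avg over the same input.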
+
+ <para>
+ If the aggregate's state type is declared as <type>internal</>, it is
+ the combine function's responsibility that its result is allocated in
+ the correct memory context for aggregate state values. This means in
+ particular that when the first input is <literal>NULL</> it's invalid
+ to simply return the second input, as that value will be in the wrong
+ context and will not have sufficient lifespan.
+ </para>
+
+ <para>
+ When the aggregate's state type is declared as <type>internal</>, it is
+ usually also appropriate for the aggregate definition to provide a
+ <firstterm>serialization function</> and a <firstterm>deserialization
+ function</>, which allow such a state value to be copied from one process
+ to another. Without these functions, parallel aggregation cannot be
+ performed, and future applications such as local/remote aggregation will
+ probably not work either.
+ </para>
+
+ <para>
+ A serialization function must take a single argument of
+ type <type>internal</> and return a result of type <type>bytea</>, which
+ represents the state value packaged up into a flat blob of bytes.
+ Conversely, a deserialization function reverses that conversion. It must
+ take two arguments of types <type>bytea</> and <type>internal</>, and
+ return a result of type <type>internal</>. (The second argument is unused
+ and is always zero, but it is required for type-safety reasons.) The
+ result of the deserialization function should simply be allocated in the
+ current memory context, as unlike the combine function's result, it is not
+ long-lived.
+ </para>
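How the serialization and deserialization functions are attached to an aggregate can be sketched as follows. This is an illustration under stated assumptions, not part of the patch: my_numeric_avg is a hypothetical name, and the support functions are the built-in ones that avg(numeric) is understood to use in PostgreSQL 9.6 and later. Creating an aggregate whose state type is internal is expected to require superuser privileges.

```sql
-- Hypothetical aggregate reusing built-in support functions.
CREATE AGGREGATE my_numeric_avg(numeric) (
    SFUNC = numeric_avg_accum,
    STYPE = internal,
    FINALFUNC = numeric_avg,
    COMBINEFUNC = numeric_avg_combine,
    SERIALFUNC = numeric_avg_serialize,      -- (internal) returns bytea
    DESERIALFUNC = numeric_avg_deserialize,  -- (bytea, internal) returns internal
    PARALLEL = SAFE
);
```

The serialization and deserialization functions are specified as a pair, and only for an internal state type.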
+
+ <para>
+ Worth noting also is that for an aggregate to be executed in parallel,
+ the aggregate itself must be marked <literal>PARALLEL SAFE</>. The
+ parallel-safety markings on its support functions are not consulted.
+ </para>
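One way to inspect how an existing aggregate is marked is to query the system catalogs. The query below is a sketch that assumes the 9.6-era catalog layout, in which pg_proc has a proparallel column and pg_aggregate has aggcombinefn, aggserialfn, and aggdeserialfn columns.

```sql
-- List parallel-safety marking and partial-aggregation support functions
-- for every built-in aggregate named avg.
SELECT p.oid::regprocedure AS aggregate,
       p.proparallel,          -- 's' = safe, 'r' = restricted, 'u' = unsafe
       a.aggcombinefn,
       a.aggserialfn,
       a.aggdeserialfn
FROM pg_aggregate a
JOIN pg_proc p ON p.oid = a.aggfnoid
WHERE p.proname = 'avg';
```

The proparallel value shown here belongs to the aggregate's own pg_proc entry, which is the marking the paragraph above refers to.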
+
</sect2>

<sect2 id="xaggr-support-functions">
@@ -521,7 +617,7 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
<para>
A function written in C can detect that it is being called as an
- aggregate transition or final function by calling
+ aggregate support function by calling
<function>AggCheckCallContext</>, for example:
<programlisting>
if (AggCheckCallContext(fcinfo, NULL))