Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Commit 5f7c2bd

Browse files
committed
sum() on int2 and int4 columns now uses an int8, not numeric, accumulator
for speed reasons; its result type also changes to int8. avg() on these datatypes now accumulates the running sum in int8 for speed; but we still deliver the final result as numeric, so that fractional accuracy is preserved. count() now counts and returns in int8, not int4. I am a little nervous about this possibly breaking users' code, but there didn't seem to be a strong sentiment for avoiding the problem. If we get complaints during beta, we can change count back to int4 and add a "count8" aggregate. For that matter, users can do it for themselves with a simple CREATE AGGREGATE command; the int4inc function is still present, so no C hacking is needed. Also added max() and min() aggregates for OID that do proper unsigned comparison, instead of piggybacking on int4 aggregates. initdb forced.
1 parent 6f2943b commit 5f7c2bd

File tree

13 files changed

+210
-54
lines changed

13 files changed

+210
-54
lines changed

doc/src/sgml/func.sgml

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.64 2001/07/11 22:14:01 momjian Exp $ -->
1+
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.65 2001/08/14 22:21:58 tgl Exp $ -->
22

33
<chapter id="functions">
44
<title>Functions and Operators</title>
@@ -3768,7 +3768,7 @@ SELECT NULLIF(value, '(none)') ...
37683768
<row>
37693769
<entry>COUNT(*)</entry>
37703770
<entry>number of input values</entry>
3771-
<entry>The return value is of type <type>integer</type>.</entry>
3771+
<entry>The return value is of type <type>bigint</type>.</entry>
37723772
</row>
37733773

37743774
<row>
@@ -3777,7 +3777,7 @@ SELECT NULLIF(value, '(none)') ...
37773777
Counts the input values for which the value of <replaceable
37783778
class="parameter">expression</replaceable> is not NULL.
37793779
</entry>
3780-
<entry></entry>
3780+
<entry>The return value is of type <type>bigint</type>.</entry>
37813781
</row>
37823782

37833783
<row>
@@ -3822,7 +3822,9 @@ SELECT NULLIF(value, '(none)') ...
38223822
<type>smallint</type>, <type>integer</type>,
38233823
<type>bigint</type>, <type>real</type>, <type>double
38243824
precision</type>, <type>numeric</type>, <type>interval</type>.
3825-
The result is of type <type>numeric</type> for any integer type
3825+
The result is of type <type>bigint</type> for <type>smallint</type>
3826+
or <type>integer</type> input, <type>numeric</type> for
3827+
<type>bigint</type>
38263828
input, <type>double precision</type> for floating point input,
38273829
otherwise the same as the input data type.
38283830
</entry>
@@ -3836,7 +3838,8 @@ SELECT NULLIF(value, '(none)') ...
38363838
<primary>variance</primary>
38373839
</indexterm>
38383840
The variance is the square of the standard deviation. The
3839-
supported data types are the same.
3841+
supported data types and result types are the same as for
3842+
standard deviation.
38403843
</entry>
38413844
</row>
38423845

@@ -3848,7 +3851,8 @@ SELECT NULLIF(value, '(none)') ...
38483851
It should be noted that except for <function>COUNT</function>,
38493852
these functions return NULL when no rows are selected. In
38503853
particular, <function>SUM</function> of no rows returns NULL, not
3851-
zero as one might expect.
3854+
zero as one might expect. <function>COALESCE</function> may be
3855+
used to substitute zero for NULL when necessary.
38523856
</para>
38533857

38543858
</sect1>

src/backend/utils/adt/int8.c

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
* Portions Copyright (c) 1994, Regents of the University of California
88
*
99
* IDENTIFICATION
10-
* $Header: /cvsroot/pgsql/src/backend/utils/adt/int8.c,v 1.30 2001/06/07 00:09:29 momjian Exp $
10+
* $Header: /cvsroot/pgsql/src/backend/utils/adt/int8.c,v 1.31 2001/08/14 22:21:58 tgl Exp $
1111
*
1212
*-------------------------------------------------------------------------
1313
*/
@@ -503,6 +503,14 @@ int8fac(PG_FUNCTION_ARGS)
503503
PG_RETURN_INT64(result);
504504
}
505505

506+
Datum
507+
int8inc(PG_FUNCTION_ARGS)
508+
{
509+
int64 arg = PG_GETARG_INT64(0);
510+
511+
PG_RETURN_INT64(arg + 1);
512+
}
513+
506514
Datum
507515
int8larger(PG_FUNCTION_ARGS)
508516
{

src/backend/utils/adt/numeric.c

Lines changed: 118 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
*
66
* 1998 Jan Wieck
77
*
8-
* $Header: /cvsroot/pgsql/src/backend/utils/adt/numeric.c,v 1.42 2001/06/07 00:09:29 momjian Exp $
8+
* $Header: /cvsroot/pgsql/src/backend/utils/adt/numeric.c,v 1.43 2001/08/14 22:21:58 tgl Exp $
99
*
1010
* ----------
1111
*/
@@ -1772,7 +1772,11 @@ numeric_accum(PG_FUNCTION_ARGS)
17721772

17731773
/*
17741774
* Integer data types all use Numeric accumulators to share code and
1775-
* avoid risk of overflow.
1775+
* avoid risk of overflow. For int2 and int4 inputs, Numeric accumulation
1776+
* is overkill for the N and sum(X) values, but definitely not overkill
1777+
* for the sum(X*X) value. Hence, we use int2_accum and int4_accum only
1778+
* for stddev/variance --- there are faster special-purpose accumulator
1779+
* routines for SUM and AVG of these datatypes.
17761780
*/
17771781

17781782
Datum
@@ -1979,71 +1983,74 @@ numeric_stddev(PG_FUNCTION_ARGS)
19791983
/*
19801984
* SUM transition functions for integer datatypes.
19811985
*
1982-
* We use a Numeric accumulator to avoid overflow. Because SQL92 defines
1983-
* the SUM() of no values to be NULL, not zero, the initial condition of
1984-
* the transition data value needs to be NULL. This means we can't rely
1985-
* on ExecAgg to automatically insert the first non-null data value into
1986-
* the transition data: it doesn't know how to do the type conversion.
1987-
* The upshot is that these routines have to be marked non-strict and
1988-
* handle substitution of the first non-null input themselves.
1986+
* To avoid overflow, we use accumulators wider than the input datatype.
1987+
* A Numeric accumulator is needed for int8 input; for int4 and int2
1988+
* inputs, we use int8 accumulators which should be sufficient for practical
1989+
* purposes. (The latter two therefore don't really belong in this file,
1990+
* but we keep them here anyway.)
1991+
*
1992+
* Because SQL92 defines the SUM() of no values to be NULL, not zero,
1993+
* the initial condition of the transition data value needs to be NULL. This
1994+
* means we can't rely on ExecAgg to automatically insert the first non-null
1995+
* data value into the transition data: it doesn't know how to do the type
1996+
* conversion. The upshot is that these routines have to be marked non-strict
1997+
* and handle substitution of the first non-null input themselves.
19891998
*/
19901999

19912000
Datum
19922001
int2_sum(PG_FUNCTION_ARGS)
19932002
{
1994-
Numeric oldsum;
1995-
Datum newval;
2003+
int64 oldsum;
2004+
int64 newval;
19962005

19972006
if (PG_ARGISNULL(0))
19982007
{
19992008
/* No non-null input seen so far... */
20002009
if (PG_ARGISNULL(1))
20012010
PG_RETURN_NULL(); /* still no non-null */
20022011
/* This is the first non-null input. */
2003-
newval = DirectFunctionCall1(int2_numeric, PG_GETARG_DATUM(1));
2004-
PG_RETURN_DATUM(newval);
2012+
newval = (int64) PG_GETARG_INT16(1);
2013+
PG_RETURN_INT64(newval);
20052014
}
20062015

2007-
oldsum = PG_GETARG_NUMERIC(0);
2016+
oldsum = PG_GETARG_INT64(0);
20082017

20092018
/* Leave sum unchanged if new input is null. */
20102019
if (PG_ARGISNULL(1))
2011-
PG_RETURN_NUMERIC(oldsum);
2020+
PG_RETURN_INT64(oldsum);
20122021

20132022
/* OK to do the addition. */
2014-
newval = DirectFunctionCall1(int2_numeric, PG_GETARG_DATUM(1));
2023+
newval = oldsum + (int64) PG_GETARG_INT16(1);
20152024

2016-
PG_RETURN_DATUM(DirectFunctionCall2(numeric_add,
2017-
NumericGetDatum(oldsum), newval));
2025+
PG_RETURN_INT64(newval);
20182026
}
20192027

20202028
Datum
20212029
int4_sum(PG_FUNCTION_ARGS)
20222030
{
2023-
Numeric oldsum;
2024-
Datum newval;
2031+
int64 oldsum;
2032+
int64 newval;
20252033

20262034
if (PG_ARGISNULL(0))
20272035
{
20282036
/* No non-null input seen so far... */
20292037
if (PG_ARGISNULL(1))
20302038
PG_RETURN_NULL(); /* still no non-null */
20312039
/* This is the first non-null input. */
2032-
newval = DirectFunctionCall1(int4_numeric, PG_GETARG_DATUM(1));
2033-
PG_RETURN_DATUM(newval);
2040+
newval = (int64) PG_GETARG_INT32(1);
2041+
PG_RETURN_INT64(newval);
20342042
}
20352043

2036-
oldsum = PG_GETARG_NUMERIC(0);
2044+
oldsum = PG_GETARG_INT64(0);
20372045

20382046
/* Leave sum unchanged if new input is null. */
20392047
if (PG_ARGISNULL(1))
2040-
PG_RETURN_NUMERIC(oldsum);
2048+
PG_RETURN_INT64(oldsum);
20412049

20422050
/* OK to do the addition. */
2043-
newval = DirectFunctionCall1(int4_numeric, PG_GETARG_DATUM(1));
2051+
newval = oldsum + (int64) PG_GETARG_INT32(1);
20442052

2045-
PG_RETURN_DATUM(DirectFunctionCall2(numeric_add,
2046-
NumericGetDatum(oldsum), newval));
2053+
PG_RETURN_INT64(newval);
20472054
}
20482055

20492056
Datum
@@ -2076,6 +2083,90 @@ int8_sum(PG_FUNCTION_ARGS)
20762083
}
20772084

20782085

2086+
/*
2087+
* Routines for avg(int2) and avg(int4). The transition datatype
2088+
* is a two-element int8 array, holding count and sum.
2089+
*/
2090+
2091+
typedef struct Int8TransTypeData
2092+
{
2093+
#ifndef INT64_IS_BUSTED
2094+
int64 count;
2095+
int64 sum;
2096+
#else
2097+
/* "int64" isn't really 64 bits, so fake up properly-aligned fields */
2098+
int32 count;
2099+
int32 pad1;
2100+
int32 sum;
2101+
int32 pad2;
2102+
#endif
2103+
} Int8TransTypeData;
2104+
2105+
Datum
2106+
int2_avg_accum(PG_FUNCTION_ARGS)
2107+
{
2108+
ArrayType *transarray = PG_GETARG_ARRAYTYPE_P_COPY(0);
2109+
int16 newval = PG_GETARG_INT16(1);
2110+
Int8TransTypeData *transdata;
2111+
2112+
/*
2113+
* We copied the input array, so it's okay to scribble on it directly.
2114+
*/
2115+
if (ARR_SIZE(transarray) != ARR_OVERHEAD(1) + sizeof(Int8TransTypeData))
2116+
elog(ERROR, "int2_avg_accum: expected 2-element int8 array");
2117+
transdata = (Int8TransTypeData *) ARR_DATA_PTR(transarray);
2118+
2119+
transdata->count++;
2120+
transdata->sum += newval;
2121+
2122+
PG_RETURN_ARRAYTYPE_P(transarray);
2123+
}
2124+
2125+
Datum
2126+
int4_avg_accum(PG_FUNCTION_ARGS)
2127+
{
2128+
ArrayType *transarray = PG_GETARG_ARRAYTYPE_P_COPY(0);
2129+
int32 newval = PG_GETARG_INT32(1);
2130+
Int8TransTypeData *transdata;
2131+
2132+
/*
2133+
* We copied the input array, so it's okay to scribble on it directly.
2134+
*/
2135+
if (ARR_SIZE(transarray) != ARR_OVERHEAD(1) + sizeof(Int8TransTypeData))
2136+
elog(ERROR, "int4_avg_accum: expected 2-element int8 array");
2137+
transdata = (Int8TransTypeData *) ARR_DATA_PTR(transarray);
2138+
2139+
transdata->count++;
2140+
transdata->sum += newval;
2141+
2142+
PG_RETURN_ARRAYTYPE_P(transarray);
2143+
}
2144+
2145+
Datum
2146+
int8_avg(PG_FUNCTION_ARGS)
2147+
{
2148+
ArrayType *transarray = PG_GETARG_ARRAYTYPE_P(0);
2149+
Int8TransTypeData *transdata;
2150+
Datum countd,
2151+
sumd;
2152+
2153+
if (ARR_SIZE(transarray) != ARR_OVERHEAD(1) + sizeof(Int8TransTypeData))
2154+
elog(ERROR, "int8_avg: expected 2-element int8 array");
2155+
transdata = (Int8TransTypeData *) ARR_DATA_PTR(transarray);
2156+
2157+
/* SQL92 defines AVG of no values to be NULL */
2158+
if (transdata->count == 0)
2159+
PG_RETURN_NULL();
2160+
2161+
countd = DirectFunctionCall1(int8_numeric,
2162+
Int64GetDatumFast(transdata->count));
2163+
sumd = DirectFunctionCall1(int8_numeric,
2164+
Int64GetDatumFast(transdata->sum));
2165+
2166+
PG_RETURN_DATUM(DirectFunctionCall2(numeric_div, sumd, countd));
2167+
}
2168+
2169+
20792170
/* ----------------------------------------------------------------------
20802171
*
20812172
* Local functions follow

src/backend/utils/adt/oid.c

Lines changed: 19 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $Header: /cvsroot/pgsql/src/backend/utils/adt/oid.c,v 1.45 2001/03/22 03:59:52 momjian Exp $
11+
* $Header: /cvsroot/pgsql/src/backend/utils/adt/oid.c,v 1.46 2001/08/14 22:21:58 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -231,6 +231,24 @@ oidgt(PG_FUNCTION_ARGS)
231231
PG_RETURN_BOOL(arg1 > arg2);
232232
}
233233

234+
Datum
235+
oidlarger(PG_FUNCTION_ARGS)
236+
{
237+
Oid arg1 = PG_GETARG_OID(0);
238+
Oid arg2 = PG_GETARG_OID(1);
239+
240+
PG_RETURN_OID((arg1 > arg2) ? arg1 : arg2);
241+
}
242+
243+
Datum
244+
oidsmaller(PG_FUNCTION_ARGS)
245+
{
246+
Oid arg1 = PG_GETARG_OID(0);
247+
Oid arg2 = PG_GETARG_OID(1);
248+
249+
PG_RETURN_OID((arg1 < arg2) ? arg1 : arg2);
250+
}
251+
234252
Datum
235253
oidvectoreq(PG_FUNCTION_ARGS)
236254
{

src/backend/utils/fmgr/fmgr.c

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $Header: /cvsroot/pgsql/src/backend/utils/fmgr/fmgr.c,v 1.53 2001/06/01 02:41:36 tgl Exp $
11+
* $Header: /cvsroot/pgsql/src/backend/utils/fmgr/fmgr.c,v 1.54 2001/08/14 22:21:58 tgl Exp $
1212
*
1313
*-------------------------------------------------------------------------
1414
*/
@@ -1414,10 +1414,24 @@ fmgr(Oid procedureId,...)
14141414
Datum
14151415
Int64GetDatum(int64 X)
14161416
{
1417+
#ifndef INT64_IS_BUSTED
14171418
int64 *retval = (int64 *) palloc(sizeof(int64));
14181419

14191420
*retval = X;
14201421
return PointerGetDatum(retval);
1422+
#else /* INT64_IS_BUSTED */
1423+
/*
1424+
* On a machine with no 64-bit-int C datatype, sizeof(int64) will not be
1425+
* 8, but we want Int64GetDatum to return an 8-byte object anyway, with
1426+
* zeroes in the unused bits. This is needed so that, for example,
1427+
* hash join of int8 will behave properly.
1428+
*/
1429+
int64 *retval = (int64 *) palloc(Max(sizeof(int64), 8));
1430+
1431+
MemSet(retval, 0, Max(sizeof(int64), 8));
1432+
*retval = X;
1433+
return PointerGetDatum(retval);
1434+
#endif /* INT64_IS_BUSTED */
14211435
}
14221436

14231437
Datum

src/include/catalog/catversion.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@
3737
* Portions Copyright (c) 1996-2001, PostgreSQL Global Development Group
3838
* Portions Copyright (c) 1994, Regents of the University of California
3939
*
40-
* $Id: catversion.h,v 1.88 2001/08/13 18:45:36 tgl Exp $
40+
* $Id: catversion.h,v 1.89 2001/08/14 22:21:58 tgl Exp $
4141
*
4242
*-------------------------------------------------------------------------
4343
*/
@@ -53,6 +53,6 @@
5353
*/
5454

5555
/* yyyymmddN */
56-
#define CATALOG_VERSION_NO 200108131
56+
#define CATALOG_VERSION_NO 200108132
5757

5858
#endif

0 commit comments

Comments
 (0)