author Alvaro Herrera 2016-12-02 03:34:01 +0000
committer Alvaro Herrera 2016-12-02 03:34:01 +0000
commit fa2fa995528023b2e6ba1108f2f47558c6b66dcd
tree 0e406f0cdbb3dec81277fe7a1c4902567e3feefe /src/backend/lib/stringinfo.c
parent 78c8c814390f14398e8fd43fe7282cb2e260b50f
Permit dump/reload of not-too-large >1GB tuples
Our documentation states that our maximum field size is 1 GB, and that our maximum row size is 1.6 TB. However, while this might be attainable in theory with enough contortions, it is not workable in practice; for starters, pg_dump fails to dump tables containing rows larger than 1 GB, even if individual columns are well below the limit; and even if one does manage to manufacture a dump file containing a row that large, the server refuses to load it anyway.

This commit enables dumping and reloading of such tuples, provided two conditions are met:

1. no single column is larger than 1 GB (in output size -- for bytea this includes the formatting overhead)
2. the whole row is not larger than 2 GB

There are three related changes to enable this:

a. StringInfo's API now has two additional functions that allow creating a string that grows beyond the typical 1 GB limit (a "long" string). ABI compatibility is maintained. We still limit these strings to 2 GB, though, for reasons explained below.

b. COPY now uses long StringInfos, so that pg_dump doesn't choke trying to emit rows longer than 1 GB.

c. heap_form_tuple now uses the MCXT_ALLOW_HUGE flag in its allocation for the input tuple, which means that large tuples are accepted on input. Note that at this point we do not apply any further limit to the input tuple size.

The main reason to limit to 2 GB is that the FE/BE protocol uses 32-bit length words to describe each row; and because the documentation is ambiguous on its signedness and libpq does consider it signed, we cannot use the highest-order bit. Additionally, the StringInfo API uses "int" (which is 4 bytes wide on most platforms) in many places, so we'd need to change that API too in order to raise the limit further, which has lots of fallout.

Backpatch to 9.5, which is the oldest branch that has MemoryContextAllocExtended, a necessary piece of infrastructure. We could apply to 9.4 with very minimal additional effort, but any further than that would require backpatching "huge" allocations too.

This is the largest set of changes we could find that can be back-patched without breaking compatibility with existing systems. Fixing a bigger set of problems (for example, dumping tuples bigger than 2 GB, or dumping fields bigger than 1 GB) would require changing the FE/BE protocol and/or changing the StringInfo API in an ABI-incompatible way, neither of which would be back-patchable.

Authors: Daniel Vérité, Álvaro Herrera
Reviewed by: Tomas Vondra
Discussion: https://postgr.es/m/20160229183023.GA286012@alvherre.pgsql
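
To make the two size limits concrete, here is a minimal standalone sketch of how the limit is chosen and how an append request is checked against it. It is not code from the patch: the names `SKETCH_MAX_ALLOC_SIZE`, `sketch_stringinfo_limit` and `sketch_request_too_large` are hypothetical, `size_t` and `int32_t` stand in for PostgreSQL's `Size` and `int32`, and the constant is assumed to match the backend's MaxAllocSize (1 GB minus one byte).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: same value as the backend's MaxAllocSize (1 GB - 1). */
#define SKETCH_MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/*
 * Hypothetical helper: upper size limit for a string buffer.
 * A "long" buffer may grow up to the largest signed 32-bit value
 * (2^31 - 1); a regular one stays below 1 GB.
 */
size_t
sketch_stringinfo_limit(int long_ok)
{
    return long_ok
        ? (((size_t) 1) << (sizeof(int32_t) * 8 - 1)) - 1
        : SKETCH_MAX_ALLOC_SIZE;
}

/*
 * Hypothetical helper: returns 1 if appending "needed" bytes to a buffer
 * currently holding "len" bytes would reach or exceed the limit.
 * The ">=" leaves room for one trailing byte (the terminating NUL).
 * Precondition: len <= limit, so the subtraction cannot wrap.
 */
int
sketch_request_too_large(size_t len, size_t needed, int long_ok)
{
    size_t limit = sketch_stringinfo_limit(long_ok);

    return needed >= (limit - len);
}
```

Under these assumptions, a request for 0x3fffffff bytes on an empty buffer is rejected for a regular string but accepted for a long one, which is the difference that lets COPY emit rows between 1 GB and 2 GB.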
Diffstat (limited to 'src/backend/lib/stringinfo.c')
-rw-r--r-- src/backend/lib/stringinfo.c | 66
1 file changed, 54 insertions, 12 deletions
diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index 7382e08077a..b618b37e09f 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -4,7 +4,8 @@
*
* StringInfo provides an indefinitely-extensible string data type.
* It can be used to buffer either ordinary C strings (null-terminated text)
- * or arbitrary binary data. All storage is allocated with palloc().
+ * or arbitrary binary data. All storage is allocated with palloc() and
+ * friends.
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -37,10 +38,28 @@ makeStringInfo(void)
}
/*
+ * makeLongStringInfo
+ *
+ * Same as makeStringInfo, for larger strings.
+ */
+StringInfo
+makeLongStringInfo(void)
+{
+ StringInfo res;
+
+ res = (StringInfo) palloc(sizeof(StringInfoData));
+
+ initLongStringInfo(res);
+
+ return res;
+}
+
+
+/*
* initStringInfo
*
* Initialize a StringInfoData struct (with previously undefined contents)
- * to describe an empty string.
+ * to describe an empty string; don't enable long strings yet.
*/
void
initStringInfo(StringInfo str)
@@ -49,10 +68,23 @@ initStringInfo(StringInfo str)
str->data = (char *) palloc(size);
str->maxlen = size;
+ str->long_ok = false;
resetStringInfo(str);
}
/*
+ * initLongStringInfo
+ *
+ * Same as initStringInfo, plus enable long strings.
+ */
+void
+initLongStringInfo(StringInfo str)
+{
+ initStringInfo(str);
+ str->long_ok = true;
+}
+
+/*
* resetStringInfo
*
* Reset the StringInfo: the data buffer remains valid, but its
@@ -142,7 +174,7 @@ appendStringInfoVA(StringInfo str, const char *fmt, va_list args)
/*
* Return pvsnprintf's estimate of the space needed. (Although this is
* given as a size_t, we know it will fit in int because it's not more
- * than MaxAllocSize.)
+ * than either MaxAllocSize or half an int's width.)
*/
return (int) nprinted;
}
@@ -244,7 +276,17 @@ appendBinaryStringInfo(StringInfo str, const char *data, int datalen)
void
enlargeStringInfo(StringInfo str, int needed)
{
- int newlen;
+ Size newlen;
+ Size limit;
+
+ /*
+ * Determine the upper size limit. Because of overflow concerns outside
+ * of this module, we limit ourselves to 4-byte signed integer range,
+ * even for "long_ok" strings.
+ */
+ limit = str->long_ok ?
+ (((Size) 1) << (sizeof(int32) * 8 - 1)) - 1 :
+ MaxAllocSize;
/*
* Guard against out-of-range "needed" values. Without this, we can get
@@ -252,7 +294,7 @@ enlargeStringInfo(StringInfo str, int needed)
*/
if (needed < 0) /* should not happen */
elog(ERROR, "invalid string enlargement request size: %d", needed);
- if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
+ if (((Size) needed) >= (limit - (Size) str->len))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("out of memory"),
@@ -261,7 +303,7 @@ enlargeStringInfo(StringInfo str, int needed)
needed += str->len + 1; /* total space required now */
- /* Because of the above test, we now have needed <= MaxAllocSize */
+ /* Because of the above test, we now have needed <= limit */
if (needed <= str->maxlen)
return; /* got enough space already */
@@ -276,14 +318,14 @@ enlargeStringInfo(StringInfo str, int needed)
newlen = 2 * newlen;
/*
- * Clamp to MaxAllocSize in case we went past it. Note we are assuming
- * here that MaxAllocSize <= INT_MAX/2, else the above loop could
- * overflow. We will still have newlen >= needed.
+ * Clamp to the limit in case we went past it. Note we are assuming here
+ * that limit <= INT_MAX/2, else the above loop could overflow. We will
+ * still have newlen >= needed.
*/
- if (newlen > (int) MaxAllocSize)
- newlen = (int) MaxAllocSize;
+ if (newlen > limit)
+ newlen = limit;
- str->data = (char *) repalloc(str->data, newlen);
+ str->data = (char *) repalloc_huge(str->data, (Size) newlen);
str->maxlen = newlen;
}
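
The growth step at the end of enlargeStringInfo can be illustrated in isolation. The sketch below is a simplified model, not the function itself: `sketch_grow_capacity` is a hypothetical name, plain `size_t` replaces `Size`, and it assumes the doubling starts from twice the current capacity, since the loop's initialization is not visible in the hunk above.

```c
#include <stddef.h>

/*
 * Hypothetical helper: compute the new buffer capacity by repeated
 * doubling, then clamp to the limit.
 *
 * Preconditions (assumed, mirroring the checks earlier in the function):
 *   maxlen > 0, and needed <= limit <= 2^31 - 1.
 * With those bounds the doubling cannot wrap even a 32-bit size_t,
 * because the loop only doubles while newlen < needed.
 */
size_t
sketch_grow_capacity(size_t maxlen, size_t needed, size_t limit)
{
    size_t newlen = 2 * maxlen;   /* assumption: start at twice the capacity */

    while (needed > newlen)
        newlen = 2 * newlen;

    /* Doubling may overshoot the limit; needed <= limit keeps this safe. */
    if (newlen > limit)
        newlen = limit;

    return newlen;
}
```

For example, growing a 1024-byte buffer to hold 5000 bytes yields 8192, while doubling a 0x40000000-byte buffer under the long limit is clamped to 2^31 - 1 rather than reaching 2^31.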