Commit 461ef73

Add API for 64-bit large object access.

Users can now access large objects of up to 4TB (with the standard 8KB BLCKSZ). To this end, new libpq functions lo_lseek64, lo_tell64 and lo_truncate64 are added, along with corresponding backend functions of the same names. inv_api.c is changed to handle 64-bit offsets.

Patch contributed by Nozomi Anzai (backend side) and Yugo Nagata (frontend side, docs, regression tests and example program). Reviewed by Kohei Kaigai. Committed by Tatsuo Ishii with minor edits.
1 parent ae835c7 commit 461ef73
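
The 4TB figure in the commit message can be checked with a little arithmetic. The sketch below assumes that a large-object page holds BLCKSZ / 4 bytes and that page numbers are signed 32-bit values. Neither definition appears in the hunks shown on this page, so treat both as assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: a large-object page holds BLCKSZ / 4 bytes
 * (PostgreSQL calls this LOBLKSIZE).
 */
static int64_t
sketch_lo_page_size(int64_t blcksz)
{
    assert(blcksz > 0 && blcksz % 4 == 0);
    return blcksz / 4;
}

/*
 * Assumption: page numbers are signed 32-bit values, so an object can
 * span at most INT32_MAX pages.
 */
static int64_t
sketch_lo_max_size(int64_t blcksz)
{
    return (int64_t) INT32_MAX * sketch_lo_page_size(blcksz);
}
```

With an 8192-byte block the page size is 2048 bytes and the limit is 2147483647 * 2048 = 4398046509056 bytes, which is 2048 bytes short of 4 TiB.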

16 files changed

+856
-32
lines changed


doc/src/sgml/lobj.sgml

+31-3
@@ -41,7 +41,7 @@
 larger than a single database page into a secondary storage area per table.
 This makes the large object facility partially obsolete. One
 remaining advantage of the large object facility is that it allows values
-up to 2 GB in size, whereas <acronym>TOAST</acronym>ed fields can be at
+up to 4 TB in size, whereas <acronym>TOAST</acronym>ed fields can be at
 most 1 GB. Also, large objects can be randomly modified using a read/write
 API that is more efficient than performing such operations using
 <acronym>TOAST</acronym>.
@@ -237,7 +237,9 @@ int lo_open(PGconn *conn, Oid lobjId, int mode);
 <function>lo_open</function> returns a (non-negative) large object
 descriptor for later use in <function>lo_read</function>,
 <function>lo_write</function>, <function>lo_lseek</function>,
-<function>lo_tell</function>, and <function>lo_close</function>.
+<function>lo_lseek64</function>, <function>lo_tell</function>,
+<function>lo_tell64</function>, <function>lo_truncate</function>,
+<function>lo_truncate64</function>, and <function>lo_close</function>.
 The descriptor is only valid for
 the duration of the current transaction.
 On failure, -1 is returned.
@@ -312,6 +314,7 @@ int lo_read(PGconn *conn, int fd, char *buf, size_t len);
 large object descriptor, call
 <synopsis>
 int lo_lseek(PGconn *conn, int fd, int offset, int whence);
+pg_int64 lo_lseek64(PGconn *conn, int fd, pg_int64 offset, int whence);
 </synopsis>
 <indexterm><primary>lo_lseek</></> This function moves the
 current location pointer for the large object descriptor identified by
@@ -321,7 +324,16 @@ int lo_lseek(PGconn *conn, int fd, int offset, int whence);
 <symbol>SEEK_CUR</> (seek from current position), and
 <symbol>SEEK_END</> (seek from object end). The return value is
 the new location pointer, or -1 on error.
+<indexterm><primary>lo_lseek64</></> <function>lo_lseek64</function>
+is a function for large objects larger than 2GB. <symbol>pg_int64</>
+is defined as 8-byte integer type.
 </para>
+<para>
+<function>lo_lseek64</> is new as of <productname>PostgreSQL</productname>
+9.3; if this function is run against an older server version, it will
+fail and return a negative value.
+</para>
+
 </sect2>
 
 <sect2 id="lo-tell">
@@ -332,9 +344,17 @@ int lo_lseek(PGconn *conn, int fd, int offset, int whence);
 call
 <synopsis>
 int lo_tell(PGconn *conn, int fd);
+pg_int64 lo_tell64(PGconn *conn, int fd);
 </synopsis>
 <indexterm><primary>lo_tell</></> If there is an error, the
 return value is negative.
+<indexterm><primary>lo_tell64</></> <function>lo_tell64</function> is
+a function for large objects larger than 2GB.
+</para>
+<para>
+<function>lo_tell64</> is new as of <productname>PostgreSQL</productname>
+9.3; if this function is run against an older server version, it will
+fail and return a negative value.
 </para>
 </sect2>
 
@@ -345,21 +365,24 @@ int lo_tell(PGconn *conn, int fd);
 To truncate a large object to a given length, call
 <synopsis>
 int lo_truncate(PGcon *conn, int fd, size_t len);
+int lo_truncate64(PGcon *conn, int fd, pg_int64 len);
 </synopsis>
 <indexterm><primary>lo_truncate</></> truncates the large object
 descriptor <parameter>fd</> to length <parameter>len</>. The
 <parameter>fd</parameter> argument must have been returned by a
 previous <function>lo_open</function>. If <parameter>len</> is
 greater than the current large object length, the large object
 is extended with null bytes ('\0').
+<indexterm><primary>lo_truncate64</></> <function>lo_truncate64</function>
+is a function for large objects larger than 2GB.
 </para>
 
 <para>
 The file offset is not changed.
 </para>
 
 <para>
-On success <function>lo_truncate</function> returns
+On success <function>lo_truncate</function> and <function>lo_truncate64</function> returns
 zero. On error, the return value is negative.
 </para>
 
@@ -368,6 +391,11 @@ int lo_truncate(PGcon *conn, int fd, size_t len);
 8.3; if this function is run against an older server version, it will
 fail and return a negative value.
 </para>
+<para>
+<function>lo_truncate64</> is new as of <productname>PostgreSQL</productname>
+9.3; if this function is run against an older server version, it will
+fail and return a negative value.
+</para>
 </sect2>
 
 <sect2 id="lo-close">
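
The documentation added above notes that the 64-bit functions fail against servers older than 9.3. A client that must support both could pick the call up front. The following is only a sketch of that decision: it assumes the server version is given in the integer form that libpq's PQserverVersion() reports (for example 90300 for 9.3.0), and the enum and function names are hypothetical, not part of libpq.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not part of libpq. */
enum sketch_seek_api
{
    SKETCH_SEEK_32,             /* use lo_lseek */
    SKETCH_SEEK_64,             /* use lo_lseek64 */
    SKETCH_SEEK_NONE            /* offset not reachable on this server */
};

/*
 * Decide which seek function a client could call.
 * server_version is assumed to be in PQserverVersion() format.
 */
static enum sketch_seek_api
sketch_choose_seek_api(int server_version, int64_t offset)
{
    assert(server_version > 0);

    /* lo_lseek64 is documented as new in 9.3. */
    if (server_version >= 90300)
        return SKETCH_SEEK_64;

    /* Older servers only accept offsets that fit in a 32-bit int. */
    if (offset >= INT32_MIN && offset <= INT32_MAX)
        return SKETCH_SEEK_32;

    return SKETCH_SEEK_NONE;
}
```

The same test would apply to lo_tell64 and lo_truncate64, since all three are documented as new in 9.3.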

src/backend/libpq/be-fsstubs.c

+99-2
@@ -39,6 +39,7 @@
 #include "postgres.h"
 
 #include <fcntl.h>
+#include <limits.h>
 #include <sys/stat.h>
 #include <unistd.h>
 
@@ -216,7 +217,7 @@ lo_lseek(PG_FUNCTION_ARGS)
     int32 fd = PG_GETARG_INT32(0);
     int32 offset = PG_GETARG_INT32(1);
     int32 whence = PG_GETARG_INT32(2);
-    int status;
+    int64 status;
 
     if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
         ereport(ERROR,
@@ -225,9 +226,45 @@ lo_lseek(PG_FUNCTION_ARGS)
 
     status = inv_seek(cookies[fd], offset, whence);
 
+    if (INT_MAX < status)
+    {
+        ereport(ERROR,
+                (errcode(ERRCODE_BLOB_OFFSET_OVERFLOW),
+                 errmsg("offset overflow: %d", fd)));
+        PG_RETURN_INT32(-1);
+    }
+
     PG_RETURN_INT32(status);
 }
 
+
+Datum
+lo_lseek64(PG_FUNCTION_ARGS)
+{
+    int32 fd = PG_GETARG_INT32(0);
+    int64 offset = PG_GETARG_INT64(1);
+    int32 whence = PG_GETARG_INT32(2);
+    MemoryContext currentContext;
+    int64 status;
+
+    if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+    {
+        ereport(ERROR,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("invalid large-object descriptor: %d", fd)));
+        PG_RETURN_INT64(-1);
+    }
+
+    Assert(fscxt != NULL);
+    currentContext = MemoryContextSwitchTo(fscxt);
+
+    status = inv_seek(cookies[fd], offset, whence);
+
+    MemoryContextSwitchTo(currentContext);
+
+    PG_RETURN_INT64(status);
+}
+
 Datum
 lo_creat(PG_FUNCTION_ARGS)
 {
@@ -262,15 +299,48 @@ lo_create(PG_FUNCTION_ARGS)
 
 Datum
 lo_tell(PG_FUNCTION_ARGS)
+{
+    int32 fd = PG_GETARG_INT32(0);
+    int64 offset = 0;
+
+    if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+        ereport(ERROR,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("invalid large-object descriptor: %d", fd)));
+
+    offset = inv_tell(cookies[fd]);
+
+    if (INT_MAX < offset)
+    {
+        ereport(ERROR,
+                (errcode(ERRCODE_BLOB_OFFSET_OVERFLOW),
+                 errmsg("offset overflow: %d", fd)));
+        PG_RETURN_INT32(-1);
+    }
+
+    PG_RETURN_INT32(offset);
+}
+
+
+Datum
+lo_tell64(PG_FUNCTION_ARGS)
 {
     int32 fd = PG_GETARG_INT32(0);
 
     if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+    {
         ereport(ERROR,
                 (errcode(ERRCODE_UNDEFINED_OBJECT),
                  errmsg("invalid large-object descriptor: %d", fd)));
+        PG_RETURN_INT64(-1);
+    }
 
-    PG_RETURN_INT32(inv_tell(cookies[fd]));
+    /*
+     * We assume we do not need to switch contexts for inv_tell. That is
+     * true for now, but is probably more than this module ought to
+     * assume...
+     */
+    PG_RETURN_INT64(inv_tell(cookies[fd]));
 }
 
 Datum
@@ -533,6 +603,33 @@ lo_truncate(PG_FUNCTION_ARGS)
     PG_RETURN_INT32(0);
 }
 
+Datum
+lo_truncate64(PG_FUNCTION_ARGS)
+{
+    int32 fd = PG_GETARG_INT32(0);
+    int64 len = PG_GETARG_INT64(1);
+
+    if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+        ereport(ERROR,
+                (errcode(ERRCODE_UNDEFINED_OBJECT),
+                 errmsg("invalid large-object descriptor: %d", fd)));
+
+    /* Permission checks */
+    if (!lo_compat_privileges &&
+        pg_largeobject_aclcheck_snapshot(cookies[fd]->id,
+                                         GetUserId(),
+                                         ACL_UPDATE,
+                                         cookies[fd]->snapshot) != ACLCHECK_OK)
+        ereport(ERROR,
+                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                 errmsg("permission denied for large object %u",
+                        cookies[fd]->id)));
+
+    inv_truncate(cookies[fd], len);
+
+    PG_RETURN_INT32(0);
+}
+
 /*
  * AtEOXact_LargeObject -
  * prepares large objects for transaction commit
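
The 32-bit entry points lo_lseek and lo_tell in the hunks above now receive a 64-bit position from inv_seek and inv_tell and raise an error when it exceeds INT_MAX. The sketch below isolates that narrowing step outside the backend. The function name is hypothetical, it reports failure through a return code instead of ereport, and it additionally rejects negative positions, which the committed check does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Narrow a 64-bit position to 32 bits for a legacy caller.
 * Returns 0 and stores the value in *out on success, or -1 if the
 * position cannot be represented (the case the commit reports as
 * ERRCODE_BLOB_OFFSET_OVERFLOW).  *out is left untouched on failure.
 */
static int
sketch_narrow_offset(int64_t pos, int32_t *out)
{
    assert(out != NULL);

    /* The negative check is an extra assumption of this sketch. */
    if (pos < 0 || pos > INT32_MAX)
        return -1;

    *out = (int32_t) pos;
    return 0;
}
```

A position of exactly 2147483647 still fits, while 2147483648 is rejected, which matches the strict `INT_MAX < status` comparison in the diff.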

src/backend/storage/large_object/inv_api.c

+27-20
@@ -324,10 +324,10 @@ inv_drop(Oid lobjId)
  * NOTE: LOs can contain gaps, just like Unix files. We actually return
  * the offset of the last byte + 1.
  */
-static uint32
+static uint64
 inv_getsize(LargeObjectDesc *obj_desc)
 {
-    uint32 lastbyte = 0;
+    uint64 lastbyte = 0;
     ScanKeyData skey[1];
     SysScanDesc sd;
     HeapTuple tuple;
@@ -368,7 +368,7 @@ inv_getsize(LargeObjectDesc *obj_desc)
                 heap_tuple_untoast_attr((struct varlena *) datafield);
             pfreeit = true;
         }
-        lastbyte = data->pageno * LOBLKSIZE + getbytealen(datafield);
+        lastbyte = (uint64) data->pageno * LOBLKSIZE + getbytealen(datafield);
         if (pfreeit)
             pfree(datafield);
     }
@@ -378,30 +378,31 @@ inv_getsize(LargeObjectDesc *obj_desc)
     return lastbyte;
 }
 
-int
-inv_seek(LargeObjectDesc *obj_desc, int offset, int whence)
+int64
+inv_seek(LargeObjectDesc *obj_desc, int64 offset, int whence)
 {
     Assert(PointerIsValid(obj_desc));
 
     switch (whence)
     {
         case SEEK_SET:
-            if (offset < 0)
-                elog(ERROR, "invalid seek offset: %d", offset);
+            if (offset < 0 || offset >= MAX_LARGE_OBJECT_SIZE)
+                elog(ERROR, "invalid seek offset: " INT64_FORMAT, offset);
             obj_desc->offset = offset;
             break;
         case SEEK_CUR:
-            if (offset < 0 && obj_desc->offset < ((uint32) (-offset)))
-                elog(ERROR, "invalid seek offset: %d", offset);
+            if ((offset + obj_desc->offset) < 0 ||
+                (offset + obj_desc->offset) >= MAX_LARGE_OBJECT_SIZE)
+                elog(ERROR, "invalid seek offset: " INT64_FORMAT, offset);
             obj_desc->offset += offset;
             break;
         case SEEK_END:
             {
-                uint32 size = inv_getsize(obj_desc);
+                int64 pos = inv_getsize(obj_desc) + offset;
 
-                if (offset < 0 && size < ((uint32) (-offset)))
-                    elog(ERROR, "invalid seek offset: %d", offset);
-                obj_desc->offset = size + offset;
+                if (pos < 0 || pos >= MAX_LARGE_OBJECT_SIZE)
+                    elog(ERROR, "invalid seek offset: " INT64_FORMAT, offset);
+                obj_desc->offset = pos;
             }
             break;
         default:
@@ -410,7 +411,7 @@ inv_seek(LargeObjectDesc *obj_desc, int offset, int whence)
     return obj_desc->offset;
 }
 
-int
+int64
 inv_tell(LargeObjectDesc *obj_desc)
 {
     Assert(PointerIsValid(obj_desc));
@@ -422,11 +423,11 @@ int
 inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes)
 {
     int nread = 0;
-    int n;
-    int off;
+    int64 n;
+    int64 off;
     int len;
     int32 pageno = (int32) (obj_desc->offset / LOBLKSIZE);
-    uint32 pageoff;
+    uint64 pageoff;
     ScanKeyData skey[2];
     SysScanDesc sd;
     HeapTuple tuple;
@@ -437,6 +438,9 @@ inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes)
     if (nbytes <= 0)
         return 0;
 
+    if ((nbytes + obj_desc->offset) > MAX_LARGE_OBJECT_SIZE)
+        elog(ERROR, "invalid read request size: %d", nbytes);
+
     open_lo_relation();
 
     ScanKeyInit(&skey[0],
@@ -467,7 +471,7 @@ inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes)
          * there may be missing pages if the LO contains unwritten "holes". We
          * want missing sections to read out as zeroes.
          */
-        pageoff = ((uint32) data->pageno) * LOBLKSIZE;
+        pageoff = ((uint64) data->pageno) * LOBLKSIZE;
         if (pageoff > obj_desc->offset)
         {
             n = pageoff - obj_desc->offset;
@@ -560,6 +564,9 @@ inv_write(LargeObjectDesc *obj_desc, const char *buf, int nbytes)
     if (nbytes <= 0)
         return 0;
 
+    if ((nbytes + obj_desc->offset) > MAX_LARGE_OBJECT_SIZE)
+        elog(ERROR, "invalid write request size: %d", nbytes);
+
     open_lo_relation();
 
     indstate = CatalogOpenIndexes(lo_heap_r);
@@ -718,10 +725,10 @@ inv_write(LargeObjectDesc *obj_desc, const char *buf, int nbytes)
 }
 
 void
-inv_truncate(LargeObjectDesc *obj_desc, int len)
+inv_truncate(LargeObjectDesc *obj_desc, int64 len)
 {
     int32 pageno = (int32) (len / LOBLKSIZE);
-    int off;
+    int32 off;
     ScanKeyData skey[2];
     SysScanDesc sd;
     HeapTuple oldtuple;
src/backend/utils/errcodes.txt

+1
@@ -199,6 +199,7 @@ Section: Class 22 - Data Exception
 2200N E ERRCODE_INVALID_XML_CONTENT invalid_xml_content
 2200S E ERRCODE_INVALID_XML_COMMENT invalid_xml_comment
 2200T E ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION invalid_xml_processing_instruction
+22P07 E ERRCODE_BLOB_OFFSET_OVERFLOW blob_offset_overflow
 
 Section: Class 23 - Integrity Constraint Violation
 