feat: add Connection interface #1374

Merged
stephaniewang526 merged 311 commits into googleapis:main from stephaniewang526:query-interface
May 6, 2022

Conversation

@stephaniewang526
Contributor

@stephaniewang526 stephaniewang526 commented Jun 15, 2021

Background: go/bq-sql-client-java

This PR provides a completely new Connection interface that defines separate APIs for different types of queries. This gives database applications an industry-standard way to build against the Java client library. We provide 3 new JDBC-esque API methods:

[In scope for this PR] executeSelect - supports read-only SELECT queries only.
[In scope for this PR] dryRun - returns some query processing statistics, including the schema and query parameters.
[Not in this PR] executeUpdate - supports DML and DDL only.
[Not in this PR] execute - any SQL: scripts, DML, DDL, SELECT, etc.

We also integrate with the BigQueryStorage client library and, when applicable, use the high-throughput Read API to read query results in Arrow format. Arrow performed better than Avro as the row serialization format in this experiment.

@google-cla google-cla bot added the cla: yes label Jun 15, 2021
@product-auto-label product-auto-label bot added the api: bigquery label Jun 15, 2021
@suztomo

This comment was marked as outdated.

@stephaniewang526

This comment was marked as outdated.

@suztomo

This comment was marked as outdated.

@Neenu1995 Neenu1995 added the kokoro:force-run label Jul 27, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run label Jul 27, 2021
@stephaniewang526 stephaniewang526 changed the title from "feat: add QueryConnection interface" to "feat: add Connection interface" Aug 3, 2021
@stephaniewang526 stephaniewang526 added the kokoro:force-run label Aug 12, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run label Aug 12, 2021
@stephaniewang526 stephaniewang526 force-pushed the query-interface branch 2 times, most recently from e2dc6b9 to 4584ef2 September 24, 2021 15:35
@stephaniewang526 stephaniewang526 added and removed the kokoro:force-run label Sep 24, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run label Sep 27, 2021
@google-cla google-cla bot added the cla: no label and removed the cla: yes label May 4, 2022
@stephaniewang526 stephaniewang526 added the owlbot:run label May 4, 2022
@stephaniewang526 stephaniewang526 removed the owlbot:run label May 5, 2022
@o-shevchenko
Contributor

o-shevchenko commented Nov 7, 2024

Hi,
I use executeSelect to read data from BQ. Reads are extremely slow, and I don't see many settings for tuning them.

Reading 100,000 rows takes 23,930 ms.
Is this the expected speed? Is it possible to improve it?

Mono.fromCallable { bigQueryOptionsBuilder.build().service }
    .flatMap { context ->
        val connectionSettings = ConnectionSettings.newBuilder()
            .setRequestTimeout(10L)
            .setUseReadAPI(true)
            .setMaxResults(1000)
            .setNumBufferedRows(1000)
            .setUseQueryCache(true)
            .build()
        val connection = context.createConnection(connectionSettings)
        val bqResult = connection.executeSelect(sql)
        val result = Flux.usingWhen(
            Mono.just(bqResult.resultSet),
            { resultSet -> resultSet.toFlux(bqResult.schema) },
            { _ -> Mono.fromRunnable<Unit> { connection.close() } }
        )
        Mono.just(Data(result, bqResult.schema.toSchema()))
    }
    ...

fun ResultSet.toFlux(schema:Schema): Flux<DataRecord> {
    return Flux.generate<DataRecord> { sink ->
        if (next()) {
            sink.next(toDataRecord(schema))
        } else {
            sink.complete()
        }
    }
}
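
For scale, the figures reported above (100,000 rows in 23,930 ms) work out to roughly 4,180 rows per second, or about 0.24 ms per row. A minimal Java sketch of that arithmetic follows; the class and method names are hypothetical and are not part of the BigQuery client library.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of google-cloud-bigquery.
final class ReadThroughput {

    // Rows read per second, given a row count and elapsed wall-clock time in milliseconds.
    static double rowsPerSecond(long rows, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return rows * 1000.0 / elapsedMillis;
    }

    // Average time spent per row, in milliseconds.
    static double millisPerRow(long rows, long elapsedMillis) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        return (double) elapsedMillis / rows;
    }

    public static void main(String[] args) {
        long rows = 100_000;          // row count from the comment above
        long elapsedMillis = 23_930;  // elapsed time from the comment above
        System.out.println(String.format(Locale.ROOT, "%.0f rows/s, %.3f ms/row",
                rowsPerSecond(rows, elapsedMillis), millisPerRow(rows, elapsedMillis)));
    }
}
```

Measuring the same two numbers before and after changing a single ConnectionSettings value is one way to see which setting, if any, affects read speed.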

Labels

api: bigquery, cla: yes, size: xl