Source MySQL: MySQL to ClickHouse sync fails - Incremental | Append or Deduped + history #11087

Open
arashlayeghi opened this issue Mar 13, 2022 · 3 comments
Labels
area/connectors · community · connectors/source/mysql · connectors/sources-database · frozen · releaseStage/beta · team/db-dw-sources · type/bug

Comments


arashlayeghi commented Mar 13, 2022

Environment

  • Airbyte version: 0.35.47-alpha
  • OS Version / Instance: AWS EC2
  • Deployment: Docker
  • Source Connector and version: MySQL (0.5.6)
  • Destination Connector and version: Clickhouse (0.1.4)
  • Source name/version: MySQL 8.0.28
  • Destination name/version: Clickhouse 22.1.3.7
  • Severity: Critical
  • Step where error happened: Deploy / Create a MySQL source with either the STANDARD or CDC replication method / Create a new ClickHouse destination / Set up a new connection between MySQL and ClickHouse with sync mode Incremental | Deduped + history or Incremental | Append / Choose transformation: Raw data (JSON) / Press Sync now

Slack issue conversation link.

Current Behavior

The sync fails with logs-5808.txt for Incremental | Deduped + history and with logs-5810.txt for Incremental | Append, while Reset data succeeds for both. It is worth mentioning that the other sync modes, Full Refresh | Overwrite and Full Refresh | Append, work fine. Below you can see the table we are trying to sync, via the DESCRIBE trips command. One important note: all of this happened on an instance that was recently upgraded from MySQL 5.7.33 to MySQL 8.0.28. I tested another instance still on MySQL 5.7.33 and it worked fine, at least without any normalization (Raw data (JSON)).

logs-5808.txt
logs-5810.txt
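Not part of the original report, but since the failure appeared right after the 5.7.33 → 8.0.28 upgrade and CDC is one of the replication methods being tested, a quick sanity check of the binlog settings on the upgraded instance can rule out a configuration change. A minimal sketch (trips is the table named above; the variables are standard MySQL 8.0 settings):

-- Run on the upgraded MySQL 8.0.28 instance (assumes a client session there).
SHOW VARIABLES LIKE 'binlog_format';              -- CDC requires ROW
SHOW VARIABLES LIKE 'binlog_row_image';           -- FULL is the expected default
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds'; -- retention must outlast the sync interval
DESCRIBE trips;                                   -- schema of the table being synced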

Expected Behavior

It should sync successfully with Incremental | Deduped + history (and Incremental | Append) using either the STANDARD or CDC replication method.

Logs

logs-5808.txt
logs-5810.txt
[screenshot: DESCRIBE trips output]


Steps to Reproduce

1. Deploy Airbyte.
2. Create a MySQL source with either the STANDARD or CDC replication method.
3. Create a new ClickHouse destination.
4. Set up a new connection between MySQL and ClickHouse with sync mode Incremental | Deduped + history or Incremental | Append.
5. Choose transformation: Raw data (JSON).
6. Press Sync now.

@arashlayeghi arashlayeghi added needs-triage type/bug Something isn't working labels Mar 13, 2022
@bleonard bleonard added autoteam team/tse Technical Support Engineers labels Apr 26, 2022
@igrankova igrankova changed the title MySQL to ClickHouse sync fails - Incremental | Append or Deduped + history Source MySQL: MySQL to ClickHouse sync fails - Incremental | Append or Deduped + history Jun 7, 2023
@igrankova igrankova added area/connectors Connector related issues connectors/sources-database connectors/source/mysql team/db-dw-sources Backlog for Database and Data Warehouse Sources team releaseStage/beta and removed needs-triage team/tse Technical Support Engineers autoteam labels Jun 7, 2023
@prateekmukhedkar
Contributor

@arashlayeghi could you check with the latest MySQL Source Connector version and see if you are still running into this error? Thank you

@squreshicc

@prateekmukhedkar I have just tried it. The problem persists. I think the problem is with the destination ClickHouse connector, because data is fetched correctly from the MySQL source; it fails when the dbt normalization models run on the destination ClickHouse database. For example, the tables do not have the _airbyte_unique_key column, which was supposed to be created automatically during the sync.
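A quick way to confirm which Airbyte bookkeeping columns normalization actually created on the destination is to query ClickHouse's system.columns. A sketch (the database name is a placeholder, not taken from this report):

-- 'your_destination_db' is a placeholder; use the destination database of the connection.
SELECT database, table, name
FROM system.columns
WHERE database = 'your_destination_db'
  AND startsWith(name, '_airbyte_')
ORDER BY table, name;

If the raw tables show the usual _airbyte_ab_id / _airbyte_data / _airbyte_emitted_at columns but the final tables lack _airbyte_unique_key, that narrows the failure to the dbt normalization step rather than the data movement itself.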

@amit-cashify

Any update on this? I am also facing this issue.

2023-10-17 06:55:13 normalization > 06:55:13 Completed with 1 error and 0 warnings:
2023-10-17 06:55:13 normalization > 06:55:13
2023-10-17 06:55:13 normalization > 06:55:13 Database Error in model pdd_request_scd (models/generated/airbyte_incremental/scd/airbyte_stage_db/pdd_request_scd.sql)
2023-10-17 06:55:13 normalization > 06:55:13 :HTTPDriver for http://172.16.X.YYY:8123 returned response code 404)
2023-10-17 06:55:13 normalization > 06:55:13 Code: 47. DB::Exception: Missing columns: 'inc_data._ab_cdc_log_file' 'inc_data._ab_cdc_log_pos' 'inc_data._ab_cdc_cursor' while processing query: 'SELECT _airbyte_pdd_request_hashid, status_code, method, pdd_id, inc_data._ab_cdc_log_pos AS

2023-10-17 06:55:16 normalization > 06:55:12.117260 [debug] [Thread-1 (]: Timing info for model.airbyte_utils.pdd_request_scd (execute): 2023-10-17 06:55:12.062759 => 2023-10-17 06:55:12.117175
2023-10-17 06:55:16 normalization > 06:55:12.119386 [debug] [Thread-1 (]: Database Error in model pdd_request_scd (models/generated/airbyte_incremental/scd/airbyte_stage_db/pdd_request_scd.sql)
2023-10-17 06:55:16 normalization > :HTTPDriver for http://172.16.X.YYY:8123 returned response code 404)
2023-10-17 06:55:16 normalization > Code: 47. DB::Exception: Missing columns: 'inc_data._ab_cdc_log_file' 'inc_data._ab_cdc_log_pos' 'inc_data._ab_cdc_cursor' while processing query: 'SELECT _airbyte_pdd_request_hashid, status_code, method, pdd_id, inc_data._ab_cdc_log_pos AS
2023-10-17 06:55:16 normalization > 06:55:12.119979 [error] [Thread-1 (]: 6 of 9 ERROR creating sql incremental model airbyte_stage_db.pdd_request_scd ........................................... [ERROR in 0.13s]

2023-10-17 06:55:16 normalization > 06:55:13.471437 [info ] [MainThread]: Completed with 1 error and 0 warnings:
2023-10-17 06:55:16 normalization > 06:55:13.471743 [info ] [MainThread]:
2023-10-17 06:55:16 normalization > 06:55:13.472043 [error] [MainThread]: Database Error in model pdd_request_scd (models/generated/airbyte_incremental/scd/airbyte_stage_db/pdd_request_scd.sql)
2023-10-17 06:55:16 normalization > 06:55:13.472410 [error] [MainThread]: :HTTPDriver for http://172.16.X.YYY:8123 returned response code 404)
2023-10-17 06:55:16 normalization > 06:55:13.472752 [error] [MainThread]: Code: 47. DB::Exception: Missing columns: 'inc_data._ab_cdc_log_file' 'inc_data._ab_cdc_log_pos' 'inc_data._ab_cdc_cursor' while processing query: 'SELECT _airbyte_pdd_request_hashid, status_code, method, pdd_id, inc_data._ab_cdc_log_pos AS

06:55:12.119979 [error] [Thread-1 (]: 6 of 9 ERROR creating sql incremental model airbyte_stage_db.pdd_request_scd ........................................... [ERROR in 0.13s]
06:55:13.472043 [error] [MainThread]: Database Error in model pdd_request_scd (models/generated/airbyte_incremental/scd/airbyte_stage_db/pdd_request_scd.sql)
06:55:13.472410 [error] [MainThread]: :HTTPDriver for http://172.16.X.YYY:8123 returned response code 404)
06:55:13.472752 [error] [MainThread]: Code: 47. DB::Exception: Missing columns: 'inc_data._ab_cdc_log_file' 'inc_data._ab_cdc_log_pos' 'inc_data._ab_cdc_cursor' while processing query: 'SELECT _airbyte_pdd_request_hashid, status_code, method, pdd_id, inc_data._ab_cdc_log_pos AS,retryable=,timestamp=1697525716373,additionalProperties={}], io.airbyte.config.FailureReason@4fb84519[failureOrigin=normalization,failureType=system_error,internalMessage=[0m06:55:13.472410 [error] [MainThread]: :HTTPDriver for http://172.16.X.YYY:8123 returned response code 404),externalMessage=Normalization failed during the dbt run. This may indicate a problem with the data itself.,metadata=io.airbyte.config.Metadata@c98d1ce[additionalProperties={attemptNumber=5, jobId=100, from_trace_message=true}],stacktrace=AirbyteDbtError:
06:55:12.119979 [error] [Thread-1 (]: 6 of 9 ERROR creating sql incremental model airbyte_stage_db.pdd_request_scd ........................................... [ERROR in 0.13s]
06:55:13.472043 [error] [MainThread]: Database Error in model pdd_request_scd (models/generated/airbyte_incremental/scd/airbyte_stage_db/pdd_request_scd.sql)
06:55:13.472410 [error] [MainThread]: :HTTPDriver for http://172.16.X.YYY:8123 returned response code 404)
06:55:13.472752 [error] [MainThread]: Code: 47. DB::Exception: Missing columns: 'inc_data._ab_cdc_log_file' 'inc_data._ab_cdc_log_pos' 'inc_data._ab_cdc_cursor' while processing query: 'SELECT _airbyte_pdd_request_hashid, status_code, method, pdd_id, inc_data._ab_cdc_log_pos AS,retryable=,timestamp=1697525716373,additionalProperties={}]],additionalProperties={}]
2023-10-17 06:55:16 INFO i.a.c.i.LineGobbler(voidCall):149 -
2023-10-17 06:55:16 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END DEFAULT NORMALIZATION -----
2023-10-17 06:55:16 INFO i.a.c.i.LineGobbler(voidCall):149 -
2023-10-17 06:55:16 INFO i.a.w.t.s.a.AppendToAttemptLogActivityImpl(log):56 - Retry State: RetryManager(completeFailureBackoffPolicy=BackoffPolicy(minInterval=PT10S, maxInterval=PT30M, base=3), partialFailureBackoffPolicy=null, successiveCompleteFailureLimit=5, totalCompleteFailureLimit=10, successivePartialFailureLimit=1000, totalPartialFailureLimit=10, successiveCompleteFailures=5, totalCompleteFailures=5, successivePartialFailures=0, totalPartialFailures=1)
Backoff before next attempt: 13 minutes 30 seconds
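Not a confirmed diagnosis, but the Code: 47 "Missing columns" error above reads as the generated SCD model referencing CDC metadata fields (_ab_cdc_log_file, _ab_cdc_log_pos, _ab_cdc_cursor) that the raw records do not carry. A sketch of a check against the raw table, assuming Airbyte's _airbyte_raw_<stream> naming and an _airbyte_data JSON column (the exact table name is an assumption, not from the log):

-- Substitute your destination database and raw table name if they differ.
SELECT
    JSONHas(_airbyte_data, '_ab_cdc_log_file') AS has_log_file,
    JSONHas(_airbyte_data, '_ab_cdc_log_pos')  AS has_log_pos,
    JSONHas(_airbyte_data, '_ab_cdc_cursor')   AS has_cursor
FROM airbyte_stage_db._airbyte_raw_pdd_request
LIMIT 5;

If all three come back 0, the source is emitting records without CDC metadata (e.g. STANDARD replication) while the generated normalization still expects the CDC columns; refreshing the source schema and resetting the connection would be the usual thing to try in that case.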

@bleonard bleonard added the frozen Not being actively worked on label Mar 22, 2024