ApsaraDB for ClickHouse: Migrate data between ApsaraDB for ClickHouse Community-compatible Edition clusters

Last Updated: Apr 17, 2025

You can use the cluster migration feature in the ApsaraDB for ClickHouse console to migrate data from an ApsaraDB for ClickHouse Community-compatible Edition cluster to another ApsaraDB for ClickHouse cluster of the same edition. This feature supports full data migration and incremental data migration to ensure the integrity of your data.

Prerequisites

  • Both the source and destination clusters must meet the following requirements:

    • The clusters are ApsaraDB for ClickHouse Community-compatible Edition clusters.

      Note

      If you want to migrate data between an ApsaraDB for ClickHouse Community-compatible Edition cluster and an ApsaraDB for ClickHouse Enterprise Edition cluster, see Does ApsaraDB for ClickHouse support data migration from a Community-compatible Edition cluster to an Enterprise Edition cluster?.

    • The clusters are in the Running state.

    • Database accounts and passwords are created for both clusters.

    • Tiered storage of hot data and cold data is either enabled for both the source and destination clusters or disabled for both clusters.

    • The clusters are deployed in the same region and use the same virtual private cloud (VPC). The IP address of the source cluster is added to the whitelist of the destination cluster, and the IP address of the destination cluster is added to the whitelist of the source cluster. If the clusters cannot connect to each other, resolve the network issue first. For more information, see the What do I do if a connection fails to be established between the destination cluster and the data source? section of the FAQ topic.

      Note

      You can execute the SELECT * FROM system.clusters; statement to query the IP addresses of the nodes in an ApsaraDB for ClickHouse cluster, as shown in the example after this list. For more information about how to configure a whitelist, see Configure a whitelist.

  • The destination cluster must meet the following additional requirements:

    • The version of the destination cluster is later than or the same as the version of the source cluster. For more information, see Release notes.

    • The available storage space (excluding cold storage) of the destination cluster is greater than or equal to 1.2 times the used storage space (excluding cold storage) of the source cluster.

  • Each local table in the source cluster corresponds to a unique distributed table.
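
The following statements are a minimal sketch of how you can check the network and storage prerequisites from a ClickHouse client. They use only standard ClickHouse system tables; adapt them to your own environment.

    -- Query the IP addresses of the nodes in a cluster. Add the addresses of each
    -- cluster to the whitelist of the other cluster.
    SELECT cluster, host_address FROM system.clusters;

    -- Estimate the used storage space of the source cluster.
    -- If tiered storage is enabled, this sum also includes parts on the cold tier.
    SELECT formatReadableSize(sum(bytes_on_disk)) AS used_space
    FROM system.parts
    WHERE active;

    -- Check the available storage space of the destination cluster.
    SELECT name, formatReadableSize(free_space) AS free, formatReadableSize(total_space) AS total
    FROM system.disks;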

Usage notes

  • Migration speed: In most cases, when you migrate data in the ApsaraDB for ClickHouse console, the migration speed of a single node in the destination cluster is higher than 20 MB/s. If the write speed of a single node in the source cluster is also higher than 20 MB/s, make sure that the migration speed of the destination cluster keeps pace with the write speed of the source cluster. Otherwise, the migration may fail.

  • During the migration, the destination cluster stops merging data parts, but the source cluster does not.

  • Migration content:

    • You can migrate the following objects from the source cluster: the cluster, databases, tables, data dictionaries, materialized views, user permissions, and cluster configurations.

    • You cannot migrate Kafka or RabbitMQ tables.

      Important

      To prevent the source and destination clusters from splitting the consumption of the same Kafka or RabbitMQ data, you must delete the Kafka and RabbitMQ tables in the source cluster before you create the corresponding tables in the destination cluster, or use different consumer groups.

    • You can migrate only the schemas of non-MergeTree tables such as external tables and log tables.

      Note

      If the source cluster contains non-MergeTree tables, the non-MergeTree tables in the destination cluster have only a table schema and no business data after the migration. You can use the remote function to migrate the business data, as shown in the example at the end of this section. For more information, see the Use the remote function to migrate data section of the Migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster topic.

  • Data volume:

    • Cold data: The migration speed of cold data is relatively slow. We recommend that you clean up the cold data in the source cluster so that the total amount of cold data does not exceed 1 TB. If the migration takes too long, the migration may fail.

    • Hot data: If the amount of hot data exceeds 10 TB, the migration task may fail. We recommend that you use other migration methods.

    If the preceding requirements are not met, you can manually migrate data. For more information, see Manual migration.
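
For non-MergeTree tables whose data is not migrated automatically, you can copy the rows with the remote table function after the schemas are migrated. The following statement is a minimal sketch that is run on the destination cluster; the host, database, table, and account names are placeholders that you must replace with your own values.

    -- Pull the remaining data from the source cluster over the native TCP port (9000 by default).
    INSERT INTO target_db.target_table
    SELECT * FROM remote('source-cluster-host:9000', 'source_db', 'source_table', 'migration_user', 'password');

If you recreate a Kafka or RabbitMQ table in the destination cluster while the table in the source cluster still exists, specify a different consumer group (for example, a different kafka_group_name value for the Kafka engine) so that the two clusters do not consume the same data stream in the same group.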

Impacts on clusters

  • Source cluster: During data migration, you can read data from and write data to the tables in the source cluster. However, you cannot perform DDL operations on the source cluster, such as operations that add, delete, or modify the metadata of databases and tables.

Important
  • To ensure the completion of the migration task, the source cluster automatically suspends data write operations within the predefined time window when the estimated remaining migration time displayed on the console is less than or equal to 10 minutes.

  • When all data is migrated or the predefined time window for data write suspension ends, the source cluster automatically resumes data writing.

  • Destination cluster: After migration, the cluster performs frequent merge operations for a period of time. This leads to increased I/O usage and higher latency in service requests. We recommend that you make a plan to handle the potential impacts of service request latency. You must calculate the duration of the merge operations. For more information, see Calculate the merge operation duration after migration.
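
To observe the merge activity on the destination cluster after the migration, you can query the ClickHouse system tables. The following statements are a sketch that reports only the current activity; they do not predict the total merge duration.

    -- Merges that are currently running, with their progress.
    SELECT database, table, elapsed, progress FROM system.merges;

    -- Number of active data parts per table. A high count indicates that more merges are pending.
    SELECT database, table, count() AS active_parts
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY active_parts DESC;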

Procedure

Important

The following steps are performed on the destination cluster but not the source cluster.

Step 1: Create a migration task

  1. Log on to the ApsaraDB for ClickHouse console.

  2. On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.

  3. In the left-side navigation pane, click Data Migration and Synchronization > Migration from Self-managed ClickHouse or ApsaraDB for ClickHouse.

  4. On the page that appears, click Create Migration Task.

    1. Configure the source and destination clusters.

      Configure the parameters in the Source Instance Information section and the Destination Instance Information section and click Test Connectivity and Proceed.

      Note

      After the connection test succeeds, proceed to the Migration Content step. If the connection test fails, configure the source and destination clusters again as prompted.

    2. Confirm the migration content.

      On the page that appears, read the information about the data migration content and click Next: Pre-detect and Start Synchronization.

    3. The system performs prechecks on the migration configuration and then starts the migration task in the background after the prechecks pass.

      The system performs the following prechecks on the source and destination clusters: Instance Status Detection, Storage Space Detection, and Local Table and Distributed Table Detection.

      • If the prechecks pass, perform the following operations:

        1. Read the information about the impacts of data migration on clusters.

        2. Configure the Time of Stopping Data Writing parameter.

          Note
          • To ensure data consistency, you need to configure the source cluster to suspend data write operations during the final 10 minutes of the migration.

          • To ensure the success rate of data migration, we recommend that you specify a value greater than or equal to 30 minutes.

          • A migration task must end within five days after it is created and started. Therefore, the end time of the Time of Stopping Data Writing window must be no later than five days after the current date.

          • To reduce the impact of data migration on your business, we recommend that you configure a time range during off-peak hours.

        3. Click Completed.

          Note

          After you click Completed, the task is created and started.

      • If the prechecks fail, you need to follow the on-screen instructions to resolve the issue and then configure the migration task parameters again. The following items describe the precheck items.

        • Cluster Status Detection: Before you migrate data, make sure that no management operations, such as scale-out, upgrade, or downgrade operations, are being performed on the source or destination cluster. If such operations are in progress, the system cannot start a migration task.

        • Storage Space Detection: Before a migration task is started, the system checks the storage space of the source and destination clusters. Make sure that the available storage space of the destination cluster is greater than or equal to 1.2 times the used storage space of the source cluster.

        • Local Table and Distributed Table Detection: If no distributed table is created for a local table, or multiple distributed tables are created for the same local table of the source cluster, the precheck fails. You must delete the redundant distributed tables or create a unique distributed table. For a way to list the distributed tables, see the example below.
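
        The following statement is a sketch that lists the distributed tables in the source cluster together with their engine definitions, which include the local tables they point to. You can use the result to find local tables that have no distributed table or more than one.

          -- List all distributed tables and their full engine definitions.
          SELECT database, name, engine_full
          FROM system.tables
          WHERE engine = 'Distributed';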

Step 2: Check whether the migration task can be completed

If the write speed of the source cluster is lower than 20 MB/s, skip this step.

If the write speed of the source cluster is higher than 20 MB/s, you must check the write speed of the destination cluster. In most cases, the write speed of a single node in the destination cluster is higher than 20 MB/s. To ensure a successful migration, the write speed of the destination cluster must keep pace with the source cluster. You can perform the following operations to check the actual write speed of the destination cluster:

  1. Check the disk throughput of the destination cluster to determine the actual write speed. For more information about how to view the disk throughput, see View cluster monitoring information.

  2. Compare the write speeds.

    1. If the write speed of the destination cluster is higher than that of the source cluster, the migration may succeed. Proceed to Step 3.

    2. If the write speed of the destination cluster is lower than that of the source cluster, the migration may fail. We recommend that you cancel the migration task and manually migrate data. For more information, see Cancel a migration task and Manual migration.
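
In addition to the monitoring page, you can roughly estimate the recent write speed of a cluster from its query log, assuming that query logging is enabled (it is by default). The one-hour window in the following sketch is an example value.

    -- Average bytes written per second over the last hour on the node that you are connected to.
    SELECT formatReadableSize(sum(written_bytes) / 3600) AS avg_write_speed_per_second
    FROM system.query_log
    WHERE type = 'QueryFinish' AND event_time > now() - INTERVAL 1 HOUR;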

Step 3: View the migration task

  1. On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.

  2. In the left-side navigation pane, click Migrate Instance.

    On the page that appears, view the following information about the migration task: Migration Status, Running Information, and Data Write-Stop Window.

    Note

    When the estimated remaining migration time that is displayed in the Running Information column is less than or equal to 10 minutes and the migration state is Migrating, data write suspension is triggered for the source cluster. The following rules apply to data write suspension:

    • If the time when data write suspension is triggered falls within the predefined time window, the source cluster suspends data write operations.

    • If the time when data write suspension is triggered does not fall within the predefined time window and is less than or equal to the task creation and start date plus 5 days, you can modify the time window to continue the migration task.

    • If the time when data write suspension is triggered does not fall within the predefined time window and is greater than the task creation and start date plus 5 days, the migration task fails. You must cancel the migration task, clear the migrated data in the destination cluster, and recreate a migration task to migrate data.

(Optional) Step 4: Cancel the migration task

  1. On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.

  2. In the left-side navigation pane, click Migrate Instance.

  3. Click Cancel Migration in the Actions column of the migration task that you want to manage.

  4. In the Cancel Migration message, click OK.

    Note
    • After the migration task is canceled, the task state is not updated immediately. We recommend that you refresh the page at intervals to view the task state.

    • After the task is canceled, the value of the Migration Status parameter for the task changes to Completed.

    • Before you restart a migration task, you must clear the migrated data in the destination cluster to avoid data duplication, as shown in the following example.
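
    To clear the migrated data, you can drop the migrated databases or tables in the destination cluster. The following statement is a sketch; migrated_db is a placeholder, and the statement assumes that the cluster identifier is default, which is common for ApsaraDB for ClickHouse Community-compatible Edition clusters. The statement permanently deletes the data, so double-check the database name before you run it.

      DROP DATABASE IF EXISTS migrated_db ON CLUSTER default SYNC;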

(Optional) Step 5: Modify the data write-stop time window

  1. On the Clusters page, click the Clusters of Community-compatible Edition tab and click the ID of the cluster that you want to manage.

  2. In the left-side navigation pane, click Migrate Instance.

  3. Click Modify Data Write-Stop Time Window in the Actions column of the migration task that you want to manage.

  4. In the Modify Data Write-Stop Time Window dialog box, configure the Time of Stopping Data Writing parameter.

    Note

    The rules for setting the Time of Stopping Data Writing parameter are the same as those that apply when you create a migration task.

  5. Click OK.

References

For more information about how to migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster, see Migrate data from a self-managed ClickHouse cluster to an ApsaraDB for ClickHouse cluster that runs Community-compatible Edition.