AWS Database Migration Service User Guide
API Version 2016-01-01
Table of Contents
What Is AWS Database Migration Service? ............................................................................................. 1
Migration Tasks That AWS DMS Performs ...................................................................................... 1
How AWS DMS Works at the Basic Level ...................................................................................... 2
How AWS DMS Works ........................................................................................................................ 4
High-Level View of AWS DMS ...................................................................................................... 4
Components .............................................................................................................................. 5
Sources ..................................................................................................................................... 9
Targets .................................................................................................................................... 10
With Other AWS Services .......................................................................................................... 10
Support for AWS CloudFormation ...................................................................................... 11
Constructing an ARN ........................................................................................................ 11
Setting Up ....................................................................................................................................... 14
Sign Up for AWS ...................................................................................................................... 14
Create an IAM User .................................................................................................................. 14
Migration Planning for AWS Database Migration Service ............................................................... 16
Getting Started ................................................................................................................................ 17
Start a Database Migration ........................................................................................................ 17
Step 1: Welcome ...................................................................................................................... 17
Step 2: Create a Replication Instance .......................................................................................... 18
Step 3: Specify Source and Target Endpoints ............................................................................... 22
Step 4: Create a Task ................................................................................................................ 25
Monitor Your Task .................................................................................................................... 29
Security ........................................................................................................................................... 31
IAM Permissions Required ......................................................................................................... 31
IAM Roles for the CLI and API .................................................................................................... 34
Fine-Grained Access Control ...................................................................................................... 38
Using Resource Names to Control Access ............................................................................. 38
Using Tags to Control Access ............................................................................................. 40
Setting an Encryption Key ......................................................................................................... 44
Network Security ...................................................................................................................... 46
Using SSL ................................................................................................................................ 47
Limitations on Using SSL with AWS Database Migration Service ............................................. 48
Managing Certificates ....................................................................................................... 48
Enabling SSL for a MySQL-compatible, PostgreSQL, or SQL Server Endpoint ............................ 49
SSL Support for an Oracle Endpoint ................................................................................... 50
Changing the Database Password ............................................................................................... 54
Limits ............................................................................................................................................. 56
Limits for AWS Database Migration Service .................................................................................. 56
Replication Instance .......................................................................................................................... 57
Replication Instances in Depth ................................................................................................... 58
Public and Private Replication Instances ...................................................................................... 60
AWS DMS Maintenance ............................................................................................................. 60
AWS DMS Maintenance Window ......................................................................................... 60
Replication Engine Versions ....................................................................................................... 63
Deprecating a Replication Instance Version .......................................................................... 63
Upgrading the Engine Version of a Replication Instance ........................................................ 63
Setting Up a Network for a Replication Instance .......................................................................... 65
Network Configurations for Database Migration ................................................................... 65
Creating a Replication Subnet Group .................................................................................. 70
Setting an Encryption Key ......................................................................................................... 71
Creating a Replication Instance .................................................................................................. 72
Modifying a Replication Instance ............................................................................................... 76
Rebooting a Replication Instance ............................................................................................... 78
Deleting a Replication Instance ................................................................................................. 80
Error: Unsupported Character Set Causes Field Data Conversion to Fail .................................. 305
Error: Codepage 1252 to UTF8 [120112] A field data conversion failed .................................. 306
Troubleshooting PostgreSQL Specific Issues ............................................................................... 306
JSON data types being truncated ..................................................................................... 306
Columns of a user defined data type not being migrated correctly ........................................ 307
Error: No schema has been selected to create in ................................................................. 307
Deletes and updates to a table are not being replicated using CDC ........................................ 307
Truncate statements are not being propagated .................................................................. 307
Preventing PostgreSQL from capturing DDL ....................................................................... 307
Selecting the schema where database objects for capturing DDL are created .......................... 308
Oracle tables missing after migrating to PostgreSQL ........................................................... 308
Task Using View as a Source Has No Rows Copied .............................................................. 308
Troubleshooting Microsoft SQL Server Specific Issues .................................................................. 308
Special Permissions for AWS DMS user account to use CDC .................................................. 308
Errors Capturing Changes for SQL Server Database ............................................................. 309
Missing Identity Columns ................................................................................................. 309
Error: SQL Server Does Not Support Publications ............................................................... 309
Changes Not Appearing in Target ..................................................................................... 309
Troubleshooting Amazon Redshift Specific Issues ....................................................................... 309
Loading into an Amazon Redshift Cluster in a Different Region Than the AWS DMS Replication Instance ......................... 310
Error: Relation "awsdms_apply_exceptions" already exists .................................................... 310
Errors with Tables Whose Name Begins with "awsdms_changes" ........................................... 310
Seeing Tables in Cluster with Names Like dms.awsdms_changes000000000XXXX .................... 310
Permissions Required to Work with Amazon Redshift .......................................................... 310
Troubleshooting Amazon Aurora MySQL Specific Issues ............................................................... 310
Error: CHARACTER SET UTF8 fields terminated by ',' enclosed by '"' lines terminated by '\n' ....... 311
Best Practices ................................................................................................................................. 312
Improving Performance ........................................................................................................... 312
Sizing a replication instance ..................................................................................................... 314
Reducing Load on Your Source Database ................................................................................... 315
Using the Task Log ................................................................................................................. 315
Schema conversion ................................................................................................................. 315
Migrating Large Binary Objects (LOBs) ..................................................................................... 315
Using Limited LOB Mode ................................................................................................ 316
Ongoing Replication ............................................................................................................... 316
Changing the User and Schema for an Oracle Target ................................................................. 317
Improving Performance When Migrating Large Tables ................................................................ 317
Reference ...................................................................................................................................... 319
AWS DMS Data Types ............................................................................................................. 319
Release Notes ................................................................................................................................ 321
AWS DMS 3.1.2 Release Notes ................................................................................................. 321
AWS DMS 3.1.1 Release Notes ................................................................................................. 321
AWS DMS 2.4.4 Release Notes ................................................................................................. 323
AWS DMS 2.4.3 Release Notes ................................................................................................. 324
AWS DMS 2.4.2 Release Notes ................................................................................................. 324
AWS DMS 2.4.1 Release Notes ................................................................................................. 327
AWS DMS 2.4.0 Release Notes ................................................................................................. 328
AWS DMS 2.3.0 Release Notes ................................................................................................. 329
Document History .......................................................................................................................... 332
Earlier Updates ....................................................................................................................... 332
AWS Glossary ................................................................................................................................. 335
With AWS DMS, you can perform one-time migrations, and you can replicate ongoing changes to keep
sources and targets in sync. If you want to change database engines, you can use the AWS Schema
Conversion Tool (AWS SCT) to translate your database schema to the new platform. You then use AWS
DMS to migrate the data. Because AWS DMS is a part of the AWS Cloud, you get the cost efficiency,
speed to market, security, and flexibility that AWS services offer.
For information about which AWS Regions support AWS DMS, see Working with an AWS DMS Replication
Instance (p. 57). For information on the cost of database migration, see the AWS Database Migration
Service pricing page.
• In a traditional solution, you need to perform capacity analysis, procure hardware and software, install
and administer systems, and test and debug the installation. AWS DMS automatically manages the
deployment, management, and monitoring of all hardware and software needed for your migration.
Your migration can be up and running within minutes of starting the AWS DMS configuration process.
• With AWS DMS, you can scale up (or scale down) your migration resources as needed to match your
actual workload. For example, if you determine that you need additional storage, you can easily
increase your allocated storage and restart your migration, usually within minutes. On the other
hand, if you discover that you aren't using all of the resource capacity you configured, you can easily
downsize to meet your actual workload.
• AWS DMS uses a pay-as-you-go model. You only pay for AWS DMS resources while you use them,
as opposed to traditional licensing models with up-front purchase costs and ongoing maintenance
charges.
• AWS DMS automatically manages all of the infrastructure that supports your migration server,
including hardware and software, software patching, and error reporting.
• AWS DMS provides automatic failover. If your primary replication server fails for any reason, a backup
replication server can take over with little or no interruption of service.
• AWS DMS can help you switch to a modern, perhaps more cost-effective, database engine than the one
you are running now. For example, AWS DMS can help you take advantage of the managed database
services provided by Amazon RDS or Amazon Aurora. Or it can help you move to the managed data
warehouse service provided by Amazon Redshift, NoSQL platforms like Amazon DynamoDB, or low-
cost storage platforms like Amazon Simple Storage Service. Conversely, if you want to migrate away
from old infrastructure but continue to use the same database engine, AWS DMS also supports that
process.
• AWS DMS supports nearly all of today’s most popular DBMS engines as data sources, including Oracle,
Microsoft SQL Server, MySQL, MariaDB, PostgreSQL, Db2 LUW, SAP, MongoDB, and Amazon Aurora.
• AWS DMS provides a broad coverage of available target engines including Oracle, Microsoft SQL
Server, PostgreSQL, MySQL, Amazon Redshift, SAP ASE, Amazon S3, and Amazon DynamoDB.
• You can migrate from any of the supported data sources to any of the supported data targets. AWS
DMS supports fully heterogeneous data migrations between the supported engines.
• AWS DMS ensures that your data migration is secure. Data at rest is encrypted with AWS Key
Management Service (AWS KMS) encryption. During migration, you can use Secure Sockets Layer (SSL)
to encrypt your in-flight data as it travels from source to target.
1. To start a migration project, identify your source and target data stores. These data stores can reside
on any of the data engines mentioned preceding.
2. For both the source and target, configure endpoints within AWS DMS that specify the connection
information to the databases. The endpoints use the appropriate ODBC drivers to communicate with
your source and target.
3. Provision a replication instance, which is a server that AWS DMS automatically configures with
replication software.
4. Create a replication task, which specifies the actual data tables to migrate and data transformation
rules to apply. AWS DMS manages running the replication task and provides you status on the
migration process.
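If you use the AWS CLI, these steps map to a handful of commands. The following is a minimal sketch
only; the identifiers, instance class, engine names, connection details, and the table-mappings file are
placeholder values that you would replace with your own, and the ARNs returned by the first commands
are passed to the last one.

# Provision a replication instance (step 3).
aws dms create-replication-instance \
    --replication-instance-identifier my-replication-instance \
    --replication-instance-class dms.t2.medium \
    --allocated-storage 50

# Create the source and target endpoints (step 2).
aws dms create-endpoint \
    --endpoint-identifier my-source \
    --endpoint-type source \
    --engine-name mysql \
    --server-name source.example.com \
    --port 3306 \
    --username admin \
    --password my-password

aws dms create-endpoint \
    --endpoint-identifier my-target \
    --endpoint-type target \
    --engine-name postgres \
    --server-name target.example.com \
    --port 5432 \
    --username admin \
    --password my-password

# Create the replication task (step 4), referencing the ARNs returned by the commands above.
aws dms create-replication-task \
    --replication-task-identifier my-migration-task \
    --source-endpoint-arn <source-endpoint-arn> \
    --target-endpoint-arn <target-endpoint-arn> \
    --replication-instance-arn <replication-instance-arn> \
    --migration-type full-load \
    --table-mappings file://table-mappings.json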
• If you are new to AWS DMS but familiar with other AWS services, start with How AWS Database
Migration Service Works (p. 4). This section dives into the key components of AWS DMS and the
overall process of setting up and running a migration.
• If you want to switch database engines, the AWS Schema Conversion Tool can convert your existing
database schema, including tables, indexes, and most application code, to the target platform.
• For information on related AWS services that you might need to design your migration strategy, see
AWS Cloud Products.
• Amazon Web Services provides a number of database services. For guidance on which service is best
for your environment, see Running Databases on AWS.
• For an overview of all AWS products, see What is Cloud Computing?
For information on the cost of database migration, see the AWS Database Migration Service pricing page.
Topics
• High-Level View of AWS DMS (p. 4)
• Components of AWS Database Migration Service (p. 5)
• Sources for AWS Database Migration Service (p. 9)
• Targets for AWS Database Migration Service (p. 10)
• Using AWS DMS with Other AWS Services (p. 10)
During a full load migration, where existing data from the source is moved to the target, AWS DMS loads
data from tables on the source data store to tables on the target data store. While the full load is in
progress, any changes made to the tables being loaded are cached on the replication server; these are
the cached changes. It’s important to note that AWS DMS doesn't capture changes for a given table until
the full load for that table is started. In other words, the point when change capture starts is different for
each individual table.
When the full load for a given table is complete, AWS DMS immediately begins to apply the cached
changes for that table. When all tables have been loaded, AWS DMS begins to collect changes as
transactions for the ongoing replication phase. After AWS DMS applies all cached changes, tables are
transactionally consistent. At this point, AWS DMS moves to the ongoing replication phase, applying
changes as transactions.
At the start of the ongoing replication phase, a backlog of transactions generally causes some lag
between the source and target databases. The migration eventually reaches a steady state after working
through this backlog of transactions. At this point, you can shut down your applications, allow any
remaining transactions to be applied to the target, and bring your applications up, now pointing at the
target database.
AWS DMS creates the target schema objects necessary to perform the migration. However, AWS DMS
takes a minimalist approach and creates only those objects required to efficiently migrate the data. In
other words, AWS DMS creates tables, primary keys, and in some cases unique indexes, but doesn't create
any other objects that are not required to efficiently migrate the data from the source. For example, it
doesn't create secondary indexes, nonprimary key constraints, or data defaults.
In most cases, when performing a migration, you also migrate most or all of the source schema. If
you are performing a homogeneous migration (between two databases of the same engine type), you
migrate the schema by using your engine’s native tools to export and import the schema itself, without
any data.
If your migration is heterogeneous (between two databases that use different engine types), you can use
the AWS Schema Conversion Tool (AWS SCT) to generate a complete target schema for you. If you use
the tool, any dependencies between tables such as foreign key constraints need to be disabled during
the migration's "full load" and "cached change apply" phases. If performance is an issue, removing or
disabling secondary indexes during the migration process helps. For more information on the AWS SCT,
see AWS Schema Conversion Tool in the AWS SCT documentation.
An AWS DMS migration consists of three components: a replication instance, source and target
endpoints, and a replication task. You create an AWS DMS migration by creating the necessary
replication instance, endpoints, and tasks in an AWS Region.
Replication instance
At a high level, an AWS DMS replication instance is simply a managed Amazon Elastic Compute
Cloud (Amazon EC2) instance that hosts one or more replication tasks.
The figure following shows an example replication instance running several associated replication
tasks.
A single replication instance can host one or more replication tasks, depending on the characteristics
of your migration and the capacity of the replication server. AWS DMS provides a variety of
replication instances so you can choose the optimal configuration for your use case. For more
information about the various classes of replication instances, see Selecting the Right AWS DMS
Replication Instance for Your Migration (p. 58).
AWS DMS creates the replication instance on an Amazon Elastic Compute Cloud (Amazon EC2)
instance. Some of the smaller instance classes are sufficient for testing the service or for small
migrations. If your migration involves a large number of tables, or if you intend to run multiple
concurrent replication tasks, you should consider using one of the larger instances. We recommend
this approach because AWS DMS can consume a significant amount of memory and CPU.
Depending on the Amazon EC2 instance class you select, your replication instance comes with either
50 GB or 100 GB of data storage. This amount is usually sufficient for most customers. However, if
your migration involves large transactions or a high volume of data changes, you might want to
increase the base storage allocation. Change data capture (CDC) might cause data to be written to
disk, depending on how fast the target can write the changes.
AWS DMS can provide high availability and failover support using a Multi-AZ deployment. In a
Multi-AZ deployment, AWS DMS automatically provisions and maintains a standby replica of the
replication instance in a different Availability Zone. The primary replication instance is synchronously
replicated to the standby replica. If the primary replication instance fails or becomes unresponsive,
the standby resumes any running tasks with minimal interruption. Because the primary is constantly
replicating its state to the standby, Multi-AZ deployment does incur some performance overhead.
For more detailed information about the AWS DMS replication instance, see Working with an AWS
DMS Replication Instance (p. 57).
Endpoints
AWS DMS uses an endpoint to access your source or target data store. The specific connection
information is different, depending on your data store, but in general you supply the following
information when you create an endpoint.
• Endpoint type — Source or target.
• Engine type — Type of database engine, such as Oracle, Postgres, or Amazon S3.
• Server name — Server name or IP address, reachable by AWS DMS.
• Port — Port number used for database server connections.
• Encryption — SSL mode, if used to encrypt the connection.
• Credentials — User name and password for an account with the required access rights.
When you create an endpoint using the AWS DMS console, the console requires that you test the
endpoint connection. The test must be successful before using the endpoint in a DMS task. Like the
connection information, the specific test criteria are different for different engine types. In general,
AWS DMS verifies that the database exists at the given server name and port, and that the supplied
credentials can be used to connect to the database with the necessary privileges to perform a
migration. If the connection test is successful, AWS DMS downloads and stores schema information,
including table definitions and primary/unique key definitions, that can be used later during task
configuration.
A single endpoint can be used by more than one replication task. For example, you may have
two logically distinct applications hosted on the same source database that you want to migrate
separately. You would create two replication tasks, one for each set of application tables, but you
can use the same AWS DMS endpoint in both tasks.
You can customize the behavior of an endpoint by using extra connection attributes. These attributes
can control various behavior such as logging detail, file size, and other parameters. Each data
store engine type has different extra connection attributes available. You can find the specific
extra connection attributes for each data store in the source or target section for that data store.
For a list of supported source and target data stores, see Sources for AWS Database Migration
Service (p. 9) and Targets for AWS Database Migration Service (p. 10).
For more detailed information about AWS DMS endpoints, see Working with AWS DMS
Endpoints (p. 83).
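As a sketch of how extra connection attributes are supplied, the following AWS CLI call modifies an
existing endpoint. The endpoint ARN is a placeholder, and the initstmt attribute shown is only an
illustration based on attributes documented for MySQL-compatible targets; check the section for your
data store for the attributes that apply to it.

aws dms modify-endpoint \
    --endpoint-arn <endpoint-arn> \
    --extra-connection-attributes "initstmt=SET FOREIGN_KEY_CHECKS=0"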
Replication Tasks
You use an AWS DMS replication task to move a set of data from the source endpoint to the target
endpoint. Creating a replication task is the last step you need to take before you start a migration.
When you create a replication task, you specify the following task settings:
• Replication instance – the instance that will host and run the task
• Source endpoint
• Target endpoint
• Migration type options – a migration type can be one of the following:
• Full load (Migrate existing data) – If you can afford an outage long enough to copy your existing
data, this option is a good one to choose. This option simply migrates the data from your source
database to your target database, creating tables when necessary.
• Full load + CDC (Migrate existing data and replicate ongoing changes) – This option performs a
full data load while capturing changes on the source. Once the full load is complete, captured
changes are applied to the target. Eventually the application of changes reaches a steady state.
At this point you can shut down your applications, let the remaining changes flow through to
the target, and then restart your applications pointing at the target.
• CDC only (Replicate data changes only) – In some situations it might be more efficient to copy
existing data using a method other than AWS DMS. For example, in a homogeneous migration,
using native export/import tools might be more efficient at loading the bulk data. In this
situation, you can use AWS DMS to replicate changes starting when you start your bulk load to
bring and keep your source and target databases in sync.
For a full explanation of the migration type options, see Creating a Task (p. 218).
• Target table preparation mode options. For a full explanation of target table modes, see Creating
a Task (p. 218).
• Do nothing – AWS DMS assumes that the target tables are pre-created on the target.
• Drop tables on target – AWS DMS drops and recreates the target tables.
• Truncate – If you created tables on the target, AWS DMS truncates them before the migration
starts. If no tables exist and you select this option, AWS DMS creates any missing tables.
• LOB mode options. For a full explanation of LOB modes, see Setting LOB Support for Source
Databases in a AWS DMS Task (p. 238).
• Don't include LOB columns – LOB columns are excluded from the migration.
• Full LOB mode – Migrate complete LOBs regardless of size. AWS DMS migrates LOBs piecewise
in chunks controlled by the Max LOB Size parameter. This mode is slower than using Limited
LOB mode.
• Limited LOB mode – Truncate LOBs to the value specified by the Max LOB Size parameter. This
mode is faster than using Full LOB mode.
• Table mappings – indicates the tables to migrate
• Data transformations – changing schema, table, and column names
• Data validation
• CloudWatch logging
You use the task to migrate data from the source endpoint to the target endpoint, and the task
processing is done on the replication instance. You specify what tables and schemas to migrate and
any special processing, such as logging requirements, control table data, and error handling.
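To give a sense of how the target table preparation, LOB, and logging options above appear in a task's
settings, the following is a partial sketch of task-settings JSON. The field names follow the AWS DMS
task-settings format, and the values shown are illustrative only.

{
  "TargetMetadata": {
    "SupportLobs": true,
    "FullLobMode": false,
    "LimitedSizeLobMode": true,
    "LobMaxSize": 32
  },
  "FullLoadSettings": {
    "TargetTablePrepMode": "DROP_AND_CREATE"
  },
  "Logging": {
    "EnableLogging": true
  }
}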
Conceptually, an AWS DMS replication task performs two distinct functions as shown in the diagram
following:
The full load process is straightforward to understand. Data is extracted from the source in a bulk
extract manner and loaded directly into the target. You can specify the number of tables to extract
and load in parallel on the AWS DMS console under Advanced Settings.
For more information about AWS DMS tasks, see Working with AWS DMS Tasks (p. 214).
Ongoing replication, or change data capture (CDC)
You can also use an AWS DMS task to capture ongoing changes to the source data store while
you are migrating your data to a target. The change capture process that AWS DMS uses when
replicating ongoing changes from a source endpoint collects changes to the database logs by using
the database engine's native API.
In the CDC process, the replication task is designed to stream changes from the source to the target,
using in-memory buffers to hold data in-transit. If the in-memory buffers become exhausted for any
reason, the replication task will spill pending changes to the Change Cache on disk. This could occur,
for example, if AWS DMS is capturing changes from the source faster than they can be applied on
the target. In this case, you will see the task’s target latency exceed the task’s source latency.
You can check this by navigating to your task on the AWS DMS console, and opening the Task
Monitoring tab. The CDCLatencyTarget and CDCLatencySource graphs are shown at the bottom of
the page. If a task is showing target latency, the target endpoint likely needs some tuning to
increase the rate at which changes are applied.
The replication task also uses storage for task logs as discussed above. The disk space that comes
pre-configured with your replication instance is usually sufficient for logging and spilled changes.
If you need additional disk space, for example, when using detailed debugging to investigate a
migration issue, you can modify the replication instance to allocate more space.
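For example, allocating more storage is a single modification to the replication instance. The following
AWS CLI call is a sketch with a placeholder ARN and an illustrative storage size.

aws dms modify-replication-instance \
    --replication-instance-arn <replication-instance-arn> \
    --allocated-storage 100 \
    --apply-immediately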
Schema and code migration
AWS DMS doesn't perform schema or code conversion. You can use tools such as Oracle SQL
Developer, MySQL Workbench, or pgAdmin III to move your schema if your source and target are the
same database engine. If you want to convert an existing schema to a different database engine, you
can use AWS SCT. It can create a target schema and also can generate and create an entire schema:
tables, indexes, views, and so on. You can also use AWS SCT to convert PL/SQL or TSQL to PgSQL
and other formats. For more information on AWS SCT, see AWS Schema Conversion Tool.
Whenever possible, AWS DMS attempts to create the target schema for you. Sometimes, AWS DMS
can't create the schema—for example, AWS DMS doesn't create a target Oracle schema for security
reasons. For MySQL database targets, you can use extra connection attributes to have AWS DMS
migrate all objects to the specified database and schema or create each database and schema for
you as it finds the schema on the source.
• Oracle versions 10.2 and later, 11g, and up to 12.1, for the Enterprise, Standard, Standard One, and
Standard Two editions
• Microsoft SQL Server versions 2005, 2008, 2008R2, 2012, 2014, and 2016, for the Enterprise,
Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
• MySQL versions 5.5, 5.6, and 5.7.
• MariaDB (supported as a MySQL-compatible data source).
• PostgreSQL version 9.4 and later.
• MongoDB versions 2.6.x and 3.x and later.
• SAP Adaptive Server Enterprise (ASE) versions 12.5, 15, 15.5, 15.7, 16 and later.
• Db2 LUW versions:
• Version 9.7, all Fix Packs are supported.
• Version 10.1, all Fix Packs are supported.
• Version 10.5, all Fix Packs except for Fix Pack 5 are supported.
Microsoft Azure
• Azure SQL Database
• Oracle versions 11g (versions 11.2.0.3.v1 and later) and 12c, for the Enterprise, Standard, Standard
One, and Standard Two editions.
• Microsoft SQL Server versions 2008R2, 2012, 2014, and 2016 for the Enterprise, Standard, Workgroup,
and Developer editions. The Web and Express editions are not supported.
• MySQL versions 5.5, 5.6, and 5.7.
• MariaDB (supported as a MySQL-compatible data source).
• PostgreSQL 9.4 and later. Change data capture (CDC) is only supported for versions 9.4.9 and higher
and 9.5.4 and higher. The rds.logical_replication parameter, which is required for CDC, is
supported only in these versions and later.
• Amazon Aurora (supported as a MySQL-compatible data source).
• Amazon Simple Storage Service.
• Oracle versions 10g, 11g, 12c, for the Enterprise, Standard, Standard One, and Standard Two editions
• Microsoft SQL Server versions 2005, 2008, 2008R2, 2012, 2014, and 2016, for the Enterprise,
Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
• MySQL, versions 5.5, 5.6, and 5.7
• MariaDB (supported as a MySQL-compatible data target)
• PostgreSQL, versions 9.4 and later
• SAP Adaptive Server Enterprise (ASE) versions 15, 15.5, 15.7, 16 and later
Amazon RDS instance databases, Amazon Redshift, Amazon DynamoDB, and Amazon S3
• Oracle versions 11g (versions 11.2.0.3.v1 and later) and 12c, for the Enterprise, Standard, Standard
One, and Standard Two editions
• Microsoft SQL Server versions 2008R2, 2012, and 2014, for the Enterprise, Standard, Workgroup, and
Developer editions. The Web and Express editions are not supported.
• MySQL, versions 5.5, 5.6, and 5.7
• MariaDB (supported as a MySQL-compatible data target)
• PostgreSQL, versions 9.4 and later
• Amazon Aurora with MySQL compatibility
• Amazon Aurora with PostgreSQL compatibility
• Amazon Redshift
• Amazon S3
• Amazon DynamoDB
• You can use an Amazon EC2 instance or Amazon RDS DB instance as a target for a data migration.
• You can use the AWS Schema Conversion Tool (AWS SCT) to convert your source schema and SQL code
into an equivalent target schema and SQL code.
• You can use Amazon S3 as a storage site for your data or you can use it as an intermediate step when
migrating large amounts of data.
• You can use AWS CloudFormation to set up your AWS resources for infrastructure management or
deployment. For example, you can provision AWS DMS resources such as replication instances, tasks,
certificates, and endpoints. You create a template that describes all the AWS resources that you want,
and AWS CloudFormation provisions and configures those resources for you.
As a developer or system administrator, you can create and manage collections of these resources that
you can then use for repetitive migration tasks or deploying resources to your organization. For more
information about AWS CloudFormation, see AWS CloudFormation Concepts in the AWS CloudFormation
User Guide.
AWS DMS supports creating the following AWS DMS resources using AWS CloudFormation:
• AWS::DMS::Certificate
• AWS::DMS::Endpoint
• AWS::DMS::EventSubscription
• AWS::DMS::ReplicationInstance
• AWS::DMS::ReplicationSubnetGroup
• AWS::DMS::ReplicationTask
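As a sketch of what such a template can look like, the following JSON fragment declares a replication
instance and a source endpoint. The property values are placeholders, and a complete template would
also declare a target endpoint and a replication task.

{
  "Resources": {
    "DMSReplicationInstance": {
      "Type": "AWS::DMS::ReplicationInstance",
      "Properties": {
        "ReplicationInstanceClass": "dms.t2.medium",
        "AllocatedStorage": 50
      }
    },
    "DMSSourceEndpoint": {
      "Type": "AWS::DMS::Endpoint",
      "Properties": {
        "EndpointType": "source",
        "EngineName": "mysql",
        "ServerName": "source.example.com",
        "Port": 3306,
        "Username": "admin",
        "Password": "my-password"
      }
    }
  }
}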
arn:aws:dms:<region>:<account number>:<resourcetype>:<resourcename>
In this syntax:
• <region> is the ID of the AWS Region where the AWS DMS resource was created, such as us-west-2.
The following table shows AWS Region names and the values you should use when constructing an
ARN.
• <account number> is your account number with dashes omitted. To find your account number, log
in to your AWS account at http://aws.amazon.com, choose My Account/Console, and then choose My
Account.
• <resourcetype> is the type of AWS DMS resource.
The following table shows the resource types that you should use when constructing an ARN for a
particular AWS DMS resource.
• <resourcename> is the resource name assigned to the AWS DMS resource. This is a generated
arbitrary string.
The following example shows an ARN for an AWS DMS resource created in the US East (N. Virginia)
Region under the AWS account 123456789012, with a generated resource name:
Endpoint: arn:aws:dms:us-east-1:123456789012:endpoint:D3HMZ2IGUCGFF3NTAXUXGF6S5A
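After you have an ARN, you can use it to reference the resource in AWS CLI and API calls. For example,
the following sketch lists the tags on the endpoint shown above.

aws dms list-tags-for-resource \
    --resource-arn arn:aws:dms:us-east-1:123456789012:endpoint:D3HMZ2IGUCGFF3NTAXUXGF6S5A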
With AWS DMS, you pay only for the resources you use. The AWS DMS replication instance that you
create will be live (not running in a sandbox). You will incur the standard AWS DMS usage fees for the
instance until you terminate it. For more information about AWS DMS usage rates, see the AWS DMS
product page. If you are a new AWS customer, you can get started with AWS DMS for free; for more
information, see AWS Free Usage Tier.
If you close your AWS account, all AWS DMS resources and configurations associated with your account
are deleted after two days. These resources include all replication instances, source and target endpoint
configuration, replication tasks, and SSL certificates. If after two days you decide to use AWS DMS again,
you recreate the resources you need.
If you do not have an AWS account, use the following procedure to create one.
Part of the sign-up procedure involves receiving a phone call and entering a verification code using
the phone keypad.
Note your AWS account number, because you'll need it for the next task.
You can create access keys for your AWS account to access the command line interface or
API. However, we don't recommend that you access AWS using the credentials for your AWS account; we
recommend that you use AWS Identity and Access Management (IAM) instead. Create an IAM user, and
then either add the user to an IAM group with administrative permissions or grant this user administrative
permissions directly. You can then access AWS using a special URL and the credentials for the IAM user.
If you signed up for AWS but have not created an IAM user for yourself, you can create one using the IAM
console.
To create an IAM user for yourself and add the user to an Administrators group
1. Use your AWS account email address and password to sign in as the AWS account root user to the
IAM console at https://console.aws.amazon.com/iam/.
Note
We strongly recommend that you adhere to the best practice of using the Administrator
IAM user below and securely lock away the root user credentials. Sign in as the root user
only to perform a few account and service management tasks.
2. In the navigation pane of the console, choose Users, and then choose Add user.
3. For User name, type Administrator.
4. Select the check box next to AWS Management Console access, select Custom password, and then
type the new user's password in the text box. You can optionally select Require password reset to
force the user to create a new password the next time the user signs in.
5. Choose Next: Permissions.
6. On the Set permissions page, choose Add user to group.
7. Choose Create group.
8. In the Create group dialog box, for Group name type Administrators.
9. For Filter policies, select the check box for AWS managed - job function.
10. In the policy list, select the check box for AdministratorAccess. Then choose Create group.
11. Back in the list of groups, select the check box for your new group. Choose Refresh if necessary to
see the group in the list.
12. Choose Next: Tags to add metadata to the user by attaching tags as key-value pairs.
13. Choose Next: Review to see the list of group memberships to be added to the new user. When you
are ready to proceed, choose Create user.
You can use this same process to create more groups and users, and to give your users access to your
AWS account resources. To learn about using policies to restrict users' permissions to specific AWS
resources, go to Access Management and Example Policies.
To sign in as this new IAM user, sign out of the AWS console, then use the following URL, where
your_aws_account_id is your AWS account number without the hyphens (for example, if your AWS
account number is 1234-5678-9012, your AWS account ID is 123456789012):
https://your_aws_account_id.signin.aws.amazon.com/console/
Enter the IAM user name and password that you just created. When you're signed in, the navigation bar
displays "your_user_name @ your_aws_account_id".
If you don't want the URL for your sign-in page to contain your AWS account ID, you can create an
account alias. On the IAM dashboard, choose Customize and type an alias, such as your company name.
To sign in after you create an account alias, use the following URL.
https://your_account_alias.signin.aws.amazon.com/console/
To verify the sign-in link for IAM users for your account, open the IAM console and check under AWS
Account Alias on the dashboard.
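If you prefer the AWS CLI, you can also create the account alias with a single command; the alias shown
here is a placeholder.

aws iam create-account-alias --account-alias example-corp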
• You will need to configure a network that connects your source and target databases to an AWS DMS
replication instance. This can range from connecting two AWS resources in the same VPC as the
replication instance to more complex configurations, such as connecting an on-premises database to an
Amazon RDS DB instance over a VPN. For more information, see Network Configurations for Database
Migration (p. 65).
• Source and Target Endpoints – You will need to know what information and tables in the source
database need to be migrated to the target database. AWS DMS supports basic schema migration,
including the creation of tables and primary keys. However, AWS DMS doesn't automatically create
secondary indexes, foreign keys, user accounts, and so on in the target database. Note that, depending
on your source and target database engine, you may need to set up supplemental logging or modify
other settings for a source or target database. See the Sources for Data Migration (p. 83) and
Targets for Data Migration (p. 147) sections for more information.
• Schema/Code Migration – AWS DMS doesn't perform schema or code conversion. You can use tools
such as Oracle SQL Developer, MySQL Workbench, or pgAdmin III to convert your schema. If you want
to convert an existing schema to a different database engine, you can use the AWS Schema Conversion
Tool. It can create a target schema and also can generate and create an entire schema: tables, indexes,
views, and so on. You can also use the tool to convert PL/SQL or TSQL to PgSQL and other formats.
For more information on the AWS Schema Conversion Tool, see AWS Schema Conversion Tool.
• Unsupported Data Types – Some source data types need to be converted into the equivalent data
types for the target database. See the source or target section for your data store to find more
information on supported data types.
For information on the cost of database migration using AWS Database Migration Service, see the AWS
Database Migration Service pricing page.
Topics
• Start a Database Migration with AWS Database Migration Service (p. 17)
• Step 1: Welcome (p. 17)
• Step 2: Create a Replication Instance (p. 18)
• Step 3: Specify Source and Target Endpoints (p. 22)
• Step 4: Create a Task (p. 25)
• Monitor Your Task (p. 29)
To use the wizard, select Getting started from the navigation pane on the AWS DMS console. You
can use the wizard to help create your first data migration. Following the wizard process, you allocate
a replication instance that performs all the processes for the migration, specify a source and a target
database, and then create a task or set of tasks to define what tables and replication processes you
want to use. AWS DMS then creates your replication instance and performs the tasks on the data being
migrated.
Alternatively, you can create each of the components of an AWS DMS database migration by selecting
the items from the navigation pane. For a database migration, you must do the following:
• Complete the tasks outlined in Setting Up for AWS Database Migration Service (p. 14)
• Allocate a replication instance that performs all the processes for the migration
• Specify a source and a target database endpoint
• Create a task or set of tasks to define what tables and replication processes you want to use
Step 1: Welcome
If you start your database migration using the AWS DMS console wizard, you will see the Welcome page,
which explains the process of database migration using AWS DMS.
• Choose Next.
The procedure following assumes that you have chosen the AWS DMS console wizard. Note that you can
also do this step by selecting Replication instances from the AWS DMS console's navigation pane and
then selecting Create replication instance.
Replication engine version – By default, the replication instance runs the latest version of the AWS DMS replication engine software. We recommend that you accept this default; however, you can choose a previous engine version if necessary.
Publicly accessible – Choose this option if you want the replication instance to be accessible from the Internet.
4. Choose the Advanced tab, shown following, to set values for network and encryption settings if you
need them. The following table describes the settings.
Allocated storage (GB) – Storage is primarily consumed by log files and cached transactions. For cached transactions, storage is used only when the cached transactions need to be written to disk. Therefore, AWS DMS doesn't use a significant amount of storage. Some exceptions include the following:
Replication Subnet Group – Choose the replication subnet group in your selected VPC where you want the replication instance to be created. If your source database is in a VPC, choose the subnet group that contains the source database as the location for your replication instance. For more information about replication subnet groups, see Creating a Replication Subnet Group (p. 70).
Availability zone – Choose the Availability Zone where your source database is located.
VPC Security group(s) – The replication instance is created in a VPC. If your source database is in a VPC, select the VPC security group that provides access to the DB instance where the database resides.
KMS master key – Choose the encryption key to use to encrypt replication storage and connection information. If you choose (Default) aws/dms, the default AWS Key Management Service (AWS KMS) key associated with your account and region is used. A description and your account number are shown, along with the key's ARN. For more information on using the encryption key, see Setting an Encryption Key and Specifying KMS Permissions (p. 44).
5. Specify the Maintenance settings. The following table describes the settings. For more information
about maintenance settings, see AWS DMS Maintenance Window (p. 60)
Auto minor version upgrade – Select to have minor engine upgrades applied automatically to the replication instance during the maintenance window.
The procedure following assumes that you have chosen the AWS DMS console wizard. Note that you can
also do this step by selecting Endpoints from the AWS DMS console's navigation pane and then selecting
Create endpoint. When using the console wizard, you create both the source and target endpoints on
the same page. When not using the console wizard, you create each endpoint separately.
1. On the Connect source and target database endpoints page, specify your connection information
for the source or target database. The following table describes the settings.
Endpoint identifier – Type the name you want to use to identify the endpoint. You might want to include in the name the type of endpoint, such as oracle-source or PostgreSQL-target. The name must be unique for all replication instances.
Source engine and Target engine – Choose the type of database engine that is the endpoint.
User name – Type the user name with the permissions required to allow data migration. For information on the permissions required, see the security section for the source or target database engine in this user guide.
Password – Type the password for the account with the required permissions. If you want to use special characters in your password, such as "+" or "&", enclose the entire password in curly braces "{}".
2. Choose the Advanced tab, shown following, to set values for connection string and encryption key if
you need them. You can test the endpoint connection by choosing Run test.
Extra connection attributes – Type any additional connection parameters here. For more information about extra connection attributes, see the documentation section for your data store.
KMS master key – Choose the encryption key to use to encrypt replication storage and connection information. If you choose (Default) aws/dms, the default AWS Key Management Service (AWS KMS) key associated with your account and region is used. For more information on using the encryption key, see Setting an Encryption Key and Specifying KMS Permissions (p. 44).
When you create a task, you can choose the migration type: migrate existing data, migrate existing data
and replicate ongoing changes, or replicate data changes only.
Using AWS DMS, you can specify precise mapping of your data between the source and the target
database. Before you specify your mapping, make sure you review the documentation section on data
type mapping for your source and your target database.
You can choose to start a task as soon as you finish specifying information for that task on the
Create task page, or you can start the task from the Dashboard page once you finish specifying task
information.
The procedure following assumes that you have chosen the AWS DMS console wizard and specified
replication instance information and endpoints using the console wizard. Note that you can also do this
step by selecting Tasks from the AWS DMS console's navigation pane and then selecting Create task.
1. On the Create Task page, specify the task options. The following table describes the settings.
Migration type – Choose the migration method you want to use. You can choose to have just the existing data migrated to the target database or have ongoing changes sent to the target database in addition to the migrated data.
Start task on create – When this option is selected, the task begins as soon as it is created.
2. Choose the Task Settings tab, shown following, and specify values for your target table, LOB
support, and to enable logging. The task settings shown depend on the Migration type value you
select. For example, when you select Migrate existing data, the following options are shown:
Target table preparation mode – Do nothing - Data and metadata of the target tables are not changed.
Include LOB columns in replication – Don't include LOB columns - LOB columns will be excluded from the migration.
Max LOB size (kb) – In Limited LOB Mode, LOB columns which exceed the setting of Max LOB Size will be truncated to the specified Max LOB Size.
When you select Migrate existing data and replicate for Migration type, the following options are
shown:
Target table preparation mode – Do nothing - Data and metadata of the target tables are not changed.
Stop task after full load completes – Don't stop - Do not stop the task, immediately apply cached changes and continue on.
Include LOB columns in replication – Don't include LOB columns - LOB columns will be excluded from the migration.
Max LOB size (kb) – In Limited LOB Mode, LOB columns which exceed the setting of Max LOB Size will be truncated to the specified Max LOB Size.
3. Choose the Table mappings tab, shown following, to set values for schema mapping and the
mapping method. If you choose Custom, you can specify the target schema and table values. For
more information about table mapping, see Using Table Mapping to Specify Task Settings (p. 245).
4. Once you have finished with the task settings, choose Create task.
The AWS DMS console shows the status and table statistics of a database migration. For more
information about monitoring, see Monitoring AWS DMS Tasks (p. 261).
The VPC based on the Amazon Virtual Private Cloud (Amazon VPC) service that you use with your
replication instance must be associated with a security group that has rules that allow all traffic on all
ports to leave (egress) the VPC. This approach allows communication from the replication instance to
your source and target database endpoints, as long as correct ingress is enabled on those endpoints.
If you want to view database migration logs, you need the appropriate Amazon CloudWatch Logs
permissions for the IAM role you are using.
Topics
• IAM Permissions Needed to Use AWS DMS (p. 31)
• Creating the IAM Roles to Use With the AWS CLI and AWS DMS API (p. 34)
• Fine-Grained Access Control Using Resource Names and Tags (p. 38)
• Setting an Encryption Key and Specifying KMS Permissions (p. 44)
• Network Security for AWS Database Migration Service (p. 46)
• Using SSL With AWS Database Migration Service (p. 47)
• Changing the Database Password (p. 54)
The following set of permissions gives you access to AWS DMS, and also permissions for certain actions
needed from other Amazon services such as AWS KMS, IAM, Amazon Elastic Compute Cloud (Amazon
EC2), and Amazon CloudWatch. CloudWatch monitors your AWS DMS migration in real time and collects
and tracks metrics that indicate the progress of your migration. You can use CloudWatch Logs to debug
problems with a task.
Note
You can further restrict access to AWS DMS resources using tagging. For more information about
restricting access to AWS DMS resources using tagging, see Fine-Grained Access Control Using
Resource Names and Tags (p. 38)
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "dms:*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"kms:ListAliases",
"kms:DescribeKey"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"iam:GetRole",
"iam:PassRole",
"iam:CreateRole",
"iam:AttachRolePolicy"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeVpcs",
"ec2:DescribeInternetGateways",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:ModifyNetworkInterfaceAttribute",
"ec2:CreateNetworkInterface",
"ec2:DeleteNetworkInterface"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"cloudwatch:Get*",
"cloudwatch:List*"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:FilterLogEvents",
"logs:GetLogEvents"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"redshift:Describe*",
"redshift:ModifyClusterIamRoles"
],
"Resource": "*"
}
]
}
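One way to put this policy to use, sketched here with the AWS CLI, is to save it to a file, create a
customer managed policy from it, and attach that policy to an IAM user. The file name, policy name, user
name, and account number below are placeholders.

aws iam create-policy \
    --policy-name dms-console-access \
    --policy-document file://dmsConsoleAccessPolicy.json

aws iam attach-user-policy \
    --user-name dms-user \
    --policy-arn arn:aws:iam::123456789012:policy/dms-console-access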
A breakdown of these permissions might help you better understand why each one is necessary.
This section is required to allow the user to call AWS DMS API operations.
{
"Effect": "Allow",
"Action": "dms:*",
"Resource": "*"
}
This section is required to allow the user to list their available AWS KMS keys and aliases for display in
the console. This entry is not required if the KMS key ARN is known and you are using only the CLI.
{
"Effect": "Allow",
"Action": [
"kms:ListAliases",
"kms:DescribeKey"
],
"Resource": "*"
}
This section is required for certain endpoint types that require a Role ARN to be passed in with the
endpoint. In addition, if the required AWS DMS roles are not created ahead of time, the AWS DMS
console has the ability to create the role. If all roles are configured ahead of time, all that is required is
iam:GetRole and iam:PassRole. For more information about roles, see Creating the IAM Roles to Use With
the AWS CLI and AWS DMS API (p. 34).
{
"Effect": "Allow",
"Action": [
"iam:GetRole",
"iam:PassRole",
"iam:CreateRole",
"iam:AttachRolePolicy"
],
"Resource": "*"
}
This section is required because AWS DMS needs to create the EC2 instance and configure the network for
the replication instance that is created. These resources exist in the customer's account, so AWS DMS
must be able to perform these actions on the customer's behalf.
{
"Effect": "Allow",
"Action": [
"ec2:DescribeVpcs",
"ec2:DescribeInternetGateways",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:ModifyNetworkInterfaceAttribute",
"ec2:CreateNetworkInterface",
"ec2:DeleteNetworkInterface"
],
"Resource": "*"
}
This section is required to allow the user to view replication instance metrics.
{
"Effect": "Allow",
"Action": [
"cloudwatch:Get*",
"cloudwatch:List*"
],
"Resource": "*"
}
This section is required to allow the user to view the replication logs.
{
"Effect": "Allow",
"Action": [
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:FilterLogEvents",
"logs:GetLogEvents"
],
"Resource": "*"
}
This section is required when using Redshift as a target. It allows AWS DMS to validate that the Redshift
cluster is set up properly for AWS DMS.
{
"Effect": "Allow",
"Action": [
"redshift:Describe*",
"redshift:ModifyClusterIamRoles"
],
"Resource": "*"
}
The AWS DMS console creates several roles that are automatically attached to your AWS account when
you use the AWS DMS console. If you use the AWS Command Line Interface (AWS CLI) or the AWS DMS
API for your migration, you need to add these roles to your account. For more information on adding
these roles, see Creating the IAM Roles to Use With the AWS CLI and AWS DMS API (p. 34).
Updates to managed policies are automatic. If you are using a custom policy with the IAM roles, be
sure to periodically check for updates to the managed policy in this documentation. You can view the
details of the managed policy by using a combination of the get-policy and get-policy-version
commands.
For example, the following get-policy command retrieves information about the managed policy.
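Here, the policy is identified by the ARN shown in the output that follows:
aws iam get-policy --policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole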
{
"Policy": {
"PolicyName": "AmazonDMSVPCManagementRole",
"Description": "Provides access to manage VPC settings for AWS managed customer
configurations",
"CreateDate": "2015-11-18T16:33:19Z",
"AttachmentCount": 1,
"IsAttachable": true,
"PolicyId": "ANPAJHKIGMBQI4AEFFSYO",
"DefaultVersionId": "v3",
"Path": "/service-role/",
"Arn": "arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole",
"UpdateDate": "2016-05-23T16:29:57Z"
}
}
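The get-policy-version command then retrieves the policy document for the default version (v3) reported in that output:
aws iam get-policy-version --policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole --version-id v3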
{
"PolicyVersion": {
"CreateDate": "2016-05-23T16:29:57Z",
"VersionId": "v3",
"Document": {
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"ec2:CreateNetworkInterface",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeInternetGateways",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs",
"ec2:DeleteNetworkInterface",
"ec2:ModifyNetworkInterfaceAttribute"
],
"Resource": "*",
"Effect": "Allow"
}
]
},
"IsDefaultVersion": true
}
}
The same commands can be used to get information on the AmazonDMSCloudWatchLogsRole and the
AmazonDMSRedshiftS3Role managed policy.
Note
If you use the AWS DMS console for your database migration, these roles are added to your AWS
account automatically.
To create the dms-vpc-role IAM role for use with the AWS CLI or AWS DMS API
1. Create a JSON file with the following IAM policy. Name the JSON file
dmsAssumeRolePolicyDocument.json.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
2. Create the role by using the AWS CLI with the following command.
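A sketch of the commands follows. The role name and policy file name are the ones used in step 1; attaching the AmazonDMSVPCManagementRole managed policy described earlier in this section is shown as an additional step.
aws iam create-role --role-name dms-vpc-role \
--assume-role-policy-document file://dmsAssumeRolePolicyDocument.json
aws iam attach-role-policy --role-name dms-vpc-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole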
To create the dms-cloudwatch-logs-role IAM role for use with the AWS CLI or AWS DMS
API
1. Create a JSON file with the following IAM policy. Name the JSON file
dmsAssumeRolePolicyDocument2.json.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
2. Create the role by using the AWS CLI with the following command.
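A sketch of the commands follows, using the policy file from step 1 and attaching the AmazonDMSCloudWatchLogsRole managed policy mentioned earlier in this section.
aws iam create-role --role-name dms-cloudwatch-logs-role \
--assume-role-policy-document file://dmsAssumeRolePolicyDocument2.json
aws iam attach-role-policy --role-name dms-cloudwatch-logs-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole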
If you use Amazon Redshift as your target database, you must create the IAM role dms-access-for-
endpoint to provide access to Amazon Simple Storage Service (Amazon S3).
To create the dms-access-for-endpoint IAM role for use with Amazon Redshift as a target
database
1. Create a JSON file with the following IAM policy. Name the JSON file
dmsAssumeRolePolicyDocument3.json.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "1",
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
},
{
"Sid": "2",
"Effect": "Allow",
"Principal": {
"Service": "redshift.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
2. Create the role by using the AWS CLI with the following command.
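A sketch of the commands follows, using the policy file from step 1 and attaching the AmazonDMSRedshiftS3Role managed policy mentioned earlier in this section.
aws iam create-role --role-name dms-access-for-endpoint \
--assume-role-policy-document file://dmsAssumeRolePolicyDocument3.json
aws iam attach-role-policy --role-name dms-access-for-endpoint \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSRedshiftS3Role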
You should now have the IAM policies in place to use the AWS CLI or AWS DMS API.
The following policy denies access to the AWS DMS replication instance with the ARN arn:aws:dms:us-
east-1:152683116:rep:DOH67ZTOXGLIXMIHKITV:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"dms:*"
],
"Effect": "Deny",
"Resource": "arn:aws:dms:us-east-1:152683116:rep:DOH67ZTOXGLIXMIHKITV"
}
]
}
For example, the following commands would fail when the policy is in effect:
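For instance, AWS CLI calls such as the following, which target the denied replication instance by its ARN, would be rejected:
aws dms delete-replication-instance \
--replication-instance-arn "arn:aws:dms:us-east-1:152683116:rep:DOH67ZTOXGLIXMIHKITV"
aws dms modify-replication-instance \
--replication-instance-arn "arn:aws:dms:us-east-1:152683116:rep:DOH67ZTOXGLIXMIHKITV" \
--allocated-storage 30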
You can also specify IAM policies that limit access to AWS DMS endpoints and replication tasks.
The following policy limits access to an AWS DMS endpoint using the endpoint's ARN:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"dms:*"
],
"Effect": "Deny",
"Resource": "arn:aws:dms:us-east-1:152683116:endpoint:D6E37YBXTNHOA6XRQSZCUGX"
}
]
}
For example, the following commands would fail when the policy using the endpoint's ARN is in effect:
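For instance, a call such as the following, which targets the denied endpoint by its ARN, would be rejected:
aws dms delete-endpoint \
--endpoint-arn "arn:aws:dms:us-east-1:152683116:endpoint:D6E37YBXTNHOA6XRQSZCUGX"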
The following policy limits access to an AWS DMS task using the task's ARN:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"dms:*"
],
"Effect": "Deny",
"Resource": "arn:aws:dms:us-east-1:152683116:task:UO3YR4N47DXH3ATT4YMWOIT"
}
]
}
For example, the following commands would fail when the policy using the task's ARN is in effect:
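For instance, a call such as the following, which targets the denied task by its ARN, would be rejected:
aws dms delete-replication-task \
--replication-task-arn "arn:aws:dms:us-east-1:152683116:task:UO3YR4N47DXH3ATT4YMWOIT"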
The following lists the standard tags available for use with AWS DMS:
• aws:CurrentTime – Represents the request date and time, allowing the restriction of access based on
temporal criteria.
• aws:EpochTime – This tag is similar to the aws:CurrentTime tag above, except that the current time is
represented as the number of seconds elapsed since the Unix Epoch.
• aws:MultiFactorAuthPresent – This is a boolean tag that indicates whether or not the request was
signed via multi-factor authentication.
• aws:MultiFactorAuthAge – Provides access to the age of the multi-factor authentication token (in
seconds).
• aws:principaltype - Provides access to the type of principal (user, account, federated user, etc.) for the
current request.
• aws:SourceIp - Represents the source ip address for the user issuing the request.
• aws:UserAgent – Provides information about the client application requesting a resource.
• aws:userid – Provides access to the ID of the user issuing the request.
• aws:username – Provides access to the name of the user issuing the request.
• dms:InstanceClass – Provides access to the compute size of the replication instance host(s).
• dms:StorageSize - Provides access to the storage volume size (in GB).
You can also define your own tags. Customer-defined tags are simple key/value pairs that are persisted
in the AWS Tagging service and can be added to AWS DMS resources, including replication instances,
endpoints, and tasks. These tags are matched via IAM "Conditional" statements in policies, and are
referenced using a specific conditional tag. The tag keys are prefixed with "dms", the resource type, and
the "tag" prefix. The following shows the tag format:
For example, suppose you want to define a policy that only allows an API call to succeed for a replication
instance that contains the tag "stage=production". The following conditional statement would match a
resource with the given tag:
"Condition":
{
"streq":
{
"dms:rep-tag/stage":"production"
}
You would add the following tag to a replication instance that would match this policy condition:
Key: stage
Value: production
In addition to tags already assigned to AWS DMS resources, policies can also be written to limit the tag
keys and values that may be applied to a given resource. In this case, the tag prefix would be "req".
For example, the following policy statement would limit the tags that a user can assign to a given
resource to a specific list of allowed values:
"Condition":
{
"streq":
{
"dms:req-tag/stage": [ "production", "development", "testing" ]
}
}
The following policy examples limit access to an AWS DMS resource based on resource tags.
The following policy limits access to a replication instance where the tag value is "Desktop" and the tag
key is "Env":
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"dms:*"
],
"Effect": "Deny",
"Resource": "*",
"Condition": {
"StringEquals": {
"dms:rep-tag/Env": [
"Desktop"
]
}
}
}
]
}
The following commands succeed or fail based on the IAM policy that restricts access when the tag value
is "Desktop" and the tag key is "Env":
The following policy limits access to an AWS DMS endpoint where the tag value is "Desktop" and the tag
key is "Env":
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"dms:*"
],
"Effect": "Deny",
"Resource": "*",
"Condition": {
"StringEquals": {
"dms:endpoint-tag/Env": [
"Desktop"
]
}
}
}
]
}
The following commands succeed or fail based on the IAM policy that restricts access when the tag value
is "Desktop" and the tag key is "Env":
The following policy limits access to a replication task where the tag value is "Desktop" and the tag key is
"Env":
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"dms:*"
],
"Effect": "Deny",
"Resource": "*",
"Condition": {
"StringEquals": {
"dms:task-tag/Env": [
"Desktop"
]
}
}
}
]
}
The following commands succeed or fail based on the IAM policy that restricts access when the tag value
is "Desktop" and the tag key is "Env":
The default KMS key (aws/dms) is created when you first launch a replication instance and you have not
selected a custom KMS master key from the Advanced section of the Create Replication Instance page.
If you use the default KMS key, the only permissions you need to grant to the IAM user account you are
using for migration are kms:ListAliases and kms:DescribeKey. For more information about using
the default KMS key, see IAM Permissions Needed to Use AWS DMS (p. 31).
To use a custom KMS key, assign permissions for the custom KMS key using one of the following options.
• Add the IAM user account used for the migration as a Key Administrator/Key User for the KMS custom
key. This will ensure that necessary KMS permissions are granted to the IAM user account. Note that
this action is in addition to the IAM permissions that you must grant to the IAM user account to use
AWS DMS. For more information about granting permissions to a key user, see Allows Key Users to
Use the CMK.
• If you do not want to add the IAM user account as a Key Administrator/Key User for your custom KMS
key, then add the following additional permissions to the IAM permissions that you must grant to the
IAM user account to use AWS DMS.
{
"Effect": "Allow",
"Action": [
"kms:ListAliases",
"kms:DescribeKey",
"kms:CreateGrant",
"kms:Encrypt",
"kms:ReEncrypt*"
],
"Resource": "*"
},
AWS DMS does not work with KMS key aliases, but you can use the KMS key's Amazon Resource Name
(ARN) when specifying the KMS key information. For more information on creating your own KMS keys
and giving users access to a KMS key, see the AWS KMS Developer Guide.
If you don't specify a KMS key identifier, then AWS DMS uses your default encryption key. KMS creates
the default encryption key for AWS DMS for your AWS account. Your AWS account has a different default
encryption key for each AWS region.
To manage the KMS keys used for encrypting your AWS DMS resources, you use KMS. You can find
KMS in the AWS Management Console by choosing Identity & Access Management on the console
home page and then choosing Encryption Keys on the navigation pane. KMS combines secure, highly
available hardware and software to provide a key management system scaled for the cloud. Using KMS,
you can create encryption keys and define the policies that control how these keys can be used. KMS
supports AWS CloudTrail, so you can audit key usage to verify that keys are being used appropriately.
Your KMS keys can be used in combination with AWS DMS and supported AWS services such as Amazon
RDS, Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon Elastic Block Store
(Amazon EBS).
Once you have created your AWS DMS resources with the KMS key, you cannot change the encryption
key for those resources. Make sure to determine your encryption key requirements before you create
your AWS DMS resources.
• The replication instance must have access to the source and target endpoints. The security group for
the replication instance must have network ACLs or rules that allow egress from the instance out on
the database port to the database endpoints.
• Database endpoints must include network ACLs and security group rules that allow incoming access
from the replication instance. You can achieve this using the replication instance's security group, the
private IP address, the public IP address, or the NAT gateway’s public address, depending on your
configuration.
• If your network uses a VPN Tunnel, the EC2 instance acting as the NAT Gateway must use a security
group that has rules that allow the replication instance to send traffic through it.
By default, the VPC security group used by the AWS DMS replication instance has rules that allow egress
to 0.0.0.0/0 on all ports. If you modify this security group or use your own security group, egress must,
at a minimum, be permitted to the source and target endpoints on the respective database ports.
The network configurations you can use for database migration each require specific security
considerations:
• Configuration with All Database Migration Components in One VPC (p. 66) — The security group
used by the endpoints must allow ingress on the database port from the replication instance. You can
do this either by ensuring that the security group used by the replication instance is allowed ingress
at the endpoints, or by creating a rule in the security group used by the endpoints that allows the
private IP address of the replication instance access.
• Configuration with Two VPCs (p. 66) — The security group used by the replication instance must
have a rule for the VPC range and the DB port on the database.
• Configuration for a Network to a VPC Using AWS Direct Connect or a VPN (p. 66) — A VPN tunnel
allows traffic to tunnel from the VPC into an on-premises VPN. In this configuration, the VPC
includes a routing rule that sends traffic destined for a specific IP address or range to a host that
can bridge traffic from the VPC into the on-premises VPN. In this case, the NAT host includes its own
security group settings that must allow traffic from the replication instance's private IP address or
security group into the NAT instance.
• Configuration for a Network to a VPC Using the Internet (p. 67) — The VPC must include routing
rules that send traffic not destined for the VPC to the Internet gateway. In this configuration, the
connection to the endpoint appears to come from the public IP address of the replication instance.
• Configuration with an Amazon RDS DB instance not in a VPC to a DB instance in a VPC Using
ClassicLink (p. 67) — When the source or target Amazon RDS DB instance is not in a VPC and does
not share a security group with the VPC where the replication instance is located, you can set up a
proxy server and use ClassicLink to connect the source and target databases.
• Source endpoint is outside the VPC used by the replication instance and uses a NAT gateway — You
can configure a network address translation (NAT) gateway using a single Elastic IP address bound to a
single elastic network interface, which then receives a NAT identifier (nat-#####). If the VPC includes
a default route to that NAT gateway instead of the Internet gateway, the replication instance instead
appears to contact the database endpoint using the public IP address of the NAT gateway. In this
case, the ingress to the database endpoint outside the VPC needs to allow ingress from the NAT
address instead of the replication instance's public IP address.
Not all databases use SSL in the same way. Amazon Aurora with MySQL compatibility uses the server
name, the endpoint of the primary instance in the cluster, as the endpoint for SSL. An Amazon Redshift
endpoint already uses an SSL connection and does not require an SSL connection set up by AWS DMS.
An Oracle endpoint requires additional steps; for more information, see SSL Support for an Oracle
Endpoint (p. 50).
Topics
• Limitations on Using SSL with AWS Database Migration Service (p. 48)
• Managing Certificates (p. 48)
• Enabling SSL for a MySQL-compatible, PostgreSQL, or SQL Server Endpoint (p. 49)
• SSL Support for an Oracle Endpoint (p. 50)
To assign a certificate to an endpoint, you provide the root certificate or the chain of intermediate
CA certificates leading up to the root (as a certificate bundle) that was used to sign the server SSL
certificate deployed on your endpoint. Certificates are accepted only as PEM-formatted X.509 files.
When you import a certificate, you receive an Amazon Resource Name (ARN) that you can use to specify
that certificate for an endpoint. If you use Amazon RDS, you can download the root CA and certificate
bundle provided by Amazon RDS at https://s3.amazonaws.com/rds-downloads/rds-combined-ca-
bundle.pem.
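For example, a certificate can be imported with an AWS CLI call of the following form; the certificate identifier and file name shown here are placeholders:
aws dms import-certificate --certificate-identifier my-server-cert \
--certificate-pem file://my-server-cert.pem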
You can choose from several SSL modes to use for your SSL certificate verification.
• none – The connection is not encrypted. This option is not secure, but requires less overhead.
• require – The connection is encrypted using SSL (TLS) but no CA verification is made. This option is
more secure, and requires more overhead.
• verify-ca – The connection is encrypted. This option is more secure, and requires more overhead. This
option verifies the server certificate.
• verify-full – The connection is encrypted. This option is more secure, and requires more overhead.
This option verifies the server certificate and verifies that the server hostname matches the hostname
attribute for the certificate.
Not all SSL modes work with all database endpoints. The following table shows which SSL modes are
supported for each database engine.
SSL mode:          none      require           verify-ca         verify-full
Amazon Redshift    Default   SSL not enabled   SSL not enabled   SSL not enabled
SAP ASE            Default   SSL not enabled   SSL not enabled   Supported
Managing Certificates
You can use the DMS console to view and manage your SSL certificates. You can also import your
certificates using the DMS console.
1. Sign in to the AWS Management Console and choose AWS Database Migration Service.
Note
If you are signed in as an AWS Identity and Access Management (IAM) user, you must have
the appropriate permissions to access AWS DMS. For more information on the permissions
required for database migration, see IAM Permissions Needed to Use AWS DMS (p. 31).
2. In the navigation pane, choose Certificates.
3. Choose Import Certificate.
4. Upload the certificate you want to use for encrypting the connection to an endpoint.
Note
You can also upload a certificate using the AWS DMS console when you create or modify an
endpoint by selecting Add new CA certificate on the Create database endpoint page.
5. Create an endpoint as described in Step 3: Specify Source and Target Endpoints (p. 22)
1. Sign in to the AWS Management Console and choose AWS Database Migration Service.
Note
If you are signed in as an AWS Identity and Access Management (IAM) user, you must have
the appropriate permissions to access AWS DMS. For more information on the permissions
required for database migration, see IAM Permissions Needed to Use AWS DMS (p. 31).
2. In the navigation pane, choose Certificates.
3. Choose Import Certificate.
4. Upload the certificate you want to use for encrypting the connection to an endpoint.
Note
You can also upload a certificate using the AWS DMS console when you create or modify an
endpoint by selecting Add new CA certificate on the Create database endpoint page.
5. In the navigation pane, choose Endpoints, select the endpoint you want to modify, and choose
Modify.
6. Choose an SSL mode.
If you select either the verify-ca or verify-full mode, you must specify the CA certificate that you
want to use, as shown following.
7. Choose Modify.
8. When the endpoint has been modified, select the endpoint and choose Test connection to
determine if the SSL connection is working.
After you create your source and target endpoints, create a task that uses these endpoints. For more
information on creating a task, see Step 4: Create a Task (p. 25).
Topics
• Using an Existing Certificate for Oracle SSL (p. 50)
• Using a Self-Signed Certificate for Oracle SSL (p. 51)
To use an existing Oracle client installation for Oracle SSL with AWS DMS
1. Set the ORACLE_HOME system variable to the location of your dbhome_1 directory by running the
following command:
prompt>export ORACLE_HOME=/home/user/app/user/product/12.1.0/dbhome_1
prompt>export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib
prompt>mkdir $ORACLE_HOME/ssl_wallet
4. Put the CA certificate .pem file in the ssl_wallet directory. Amazon RDS customers can download the
RDS CA certificates file from https://s3.amazonaws.com/rds-downloads/rds-ca-2015-root.pem.
5. Run the following commands to create the Oracle wallet:
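The wallet is typically created and the CA certificate added with orapki commands along the following lines (a sketch; the certificate file name matches the RDS CA file from step 4):
orapki wallet create -wallet $ORACLE_HOME/ssl_wallet -auto_login_only
orapki wallet add -wallet $ORACLE_HOME/ssl_wallet -trusted_cert \
-cert $ORACLE_HOME/ssl_wallet/rds-ca-2015-root.pem -auto_login_only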
When you have completed the previous steps, you can import the wallet file with the ImportCertificate
API by specifying the certificate-wallet parameter. You can then use the imported wallet certificate when
you select verify-ca as the SSL mode when creating or modifying your Oracle endpoint.
Note
Oracle wallets are binary files. AWS DMS accepts these files as-is.
1. Create a directory you will use to work with the self-signed certificate.
mkdir <SELF_SIGNED_CERT_DIRECTORY>
cd <SELF_SIGNED_CERT_DIRECTORY>
4. Self sign a root certificate using the root key you created in the previous step.
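For illustration only, a self-signed root certificate can be produced with an OpenSSL command along these lines; the key and certificate file names are placeholders, not necessarily the names used elsewhere in this procedure, and the command prompts for the certificate subject:
openssl req -x509 -new -key self-rootCA.key -sha256 -days 3650 -out self-rootCA.pem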
mkdir $ORACLE_HOME/self_signed_ssl_wallet
8. List the contents of the Oracle wallet. The list should include the root certificate.
9. Generate the Certificate Signing Request (CSR) using the ORAPKI utility.
13. If the output from step 12 is sha256WithRSAEncryption, then run the following code.
14. If the output from step 12 is md5WithRSAEncryption, then run the following code.
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = <ORACLE_HOME>/self_signed_ssl_wallet)
)
)
SQLNET.AUTHENTICATION_SERVICES = (NONE)
SSL_VERSION = 1.0
SSL_CLIENT_AUTHENTICATION = FALSE
SSL_CIPHER_SUITES = (SSL_RSA_WITH_AES_256_CBC_SHA)
lsnrctl stop
SSL_CLIENT_AUTHENTICATION = FALSE
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = <ORACLE_HOME>/self_signed_ssl_wallet)
)
)
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = <SID>)
(ORACLE_HOME = <ORACLE_HOME>)
(SID_NAME = <SID>)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCPS)(HOST = localhost.localdomain)(PORT = 1522))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)
<SID>=
(DESCRIPTION=
(ADDRESS_LIST =
(ADDRESS=(PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = <SID>)
)
)
<SID>_ssl=
(DESCRIPTION=
(ADDRESS_LIST =
(ADDRESS=(PROTOCOL = TCPS)(HOST = localhost.localdomain)(PORT = 1522))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = <SID>)
)
)
lsnrctl start
lsnrctl status
22. Test the SSL connection to the database from localhost using sqlplus and the SSL tnsnames entry.
sqlplus -L <ORACLE_USER>@<SID>_ssl
SYS_CONTEXT('USERENV','NETWORK_PROTOCOL')
--------------------------------------------------------------------------------
tcps
cd <SELF_SIGNED_CERT_DIRECTORY>
25. Create a new client Oracle wallet that AWS DMS will use.
27. List the contents of the Oracle wallet that AWS DMS will use. The list should include the self-signed
root certificate.
28. Upload the Oracle wallet you just created to AWS DMS.
1. Sign in to the AWS Management Console and choose AWS DMS. Note that if you are signed in as
an AWS Identity and Access Management (IAM) user, you must have the appropriate permissions to
access AWS DMS. For more information on the permissions required, see IAM Permissions Needed to
Use AWS DMS (p. 31).
2. In the navigation pane, choose Tasks.
3. Choose the task that uses the endpoint you want to change the database password for, and then
choose Stop.
4. While the task is stopped, you can change the password of the database for the endpoint using the
native tools you use to work with the database.
5. Return to the DMS Management Console and choose Endpoints from the navigation pane.
6. Choose the endpoint for the database you changed the password for, and then choose Modify.
7. Type the new password in the Password box, and then choose Modify.
8. Choose Tasks from the navigation pane.
9. Choose the task that you stopped previously, and choose Start/Resume.
10. Choose either Start or Resume, depending on how you want to continue the task, and then choose
Start task.
The maximum size of a database that AWS DMS can migrate depends on your source environment,
the distribution of data in your source database, and how busy your source system is. The best way to
determine whether your particular system is a candidate for AWS DMS is to test it out. Start slowly so
you can get the configuration worked out, then add some complex objects, and finally, attempt a full
load as a test.
The 6 TB limit for storage applies to the DMS replication instance. This storage is used to cache changes
if the target cannot keep up with the source and for storing log information. This limit does not apply to
the target size; target endpoints can be larger than 6 TB.
The following table lists the AWS DMS resources and their limits per region.
Resource                 Limit per Region
Replication instances    20
Event subscriptions      20
Endpoints                100
Tasks                    200
In a Multi-AZ deployment, AWS DMS automatically provisions and maintains a synchronous standby
replica of the replication instance in a different Availability Zone. The primary replication instance is
synchronously replicated across Availability Zones to a standby replica. This approach provides data
redundancy, eliminates I/O freezes, and minimizes latency spikes.
AWS DMS uses a replication instance to connect to your source data store, read the source data, and
format the data for consumption by the target data store. A replication instance also loads the data into
the target data store. Most of this processing happens in memory. However, large transactions might
require some buffering on disk. Cached transactions and log files are also written to disk.
You can create an AWS DMS replication instance in the following AWS Regions.
AWS DMS supports a special AWS Region called AWS GovCloud (US) that is designed to allow US
government agencies and customers to move more sensitive workloads into the cloud. AWS GovCloud
(US) addresses the US government's specific regulatory and compliance requirements. For more
information about AWS GovCloud (US), see What Is AWS GovCloud (US)?
Following, you can find out more details about replication instances.
Topics
• Selecting the Right AWS DMS Replication Instance for Your Migration (p. 58)
• Public and Private Replication Instances (p. 60)
• AWS DMS Maintenance (p. 60)
• Working with Replication Engine Versions (p. 63)
• Setting Up a Network for a Replication Instance (p. 65)
• Setting an Encryption Key for a Replication Instance (p. 71)
• Creating a Replication Instance (p. 72)
• Modifying a Replication Instance (p. 76)
• Rebooting a Replication Instance (p. 78)
• Deleting a Replication Instance (p. 80)
• DDL Statements Supported by AWS DMS (p. 81)
• The T2 instance classes are low-cost standard instances designed to provide a baseline level of CPU
performance with the ability to burst above the baseline. They are suitable for developing, configuring,
and testing your database migration process. They also work well for periodic data migration tasks
that can benefit from the CPU burst capability.
• The C4 instance classes are designed to deliver the highest level of processor performance for
compute-intensive workloads. They achieve significantly higher packet-per-second (PPS) performance,
lower network jitter, and lower network latency. AWS DMS can be CPU-intensive, especially when
performing heterogeneous migrations and replications such as migrating from Oracle to PostgreSQL.
C4 instances can be a good choice for these situations.
• The R4 instance classes are memory optimized for memory-intensive workloads. Ongoing migrations
or replications of high-throughput transaction systems using DMS can, at times, consume large
amounts of CPU and memory. R4 instances include more memory per vCPU.
Each replication instance has a specific configuration of memory and vCPU. The following table shows
the configuration for each replication instance type. For pricing information, see the AWS Database
Migration Service pricing page.
Replication instance class    vCPU    Memory (GiB)
General Purpose
dms.t2.micro                  1       1
dms.t2.small                  1       2
dms.t2.medium                 2       4
dms.t2.large                  2       8
Compute Optimized
dms.c4.large                  2       3.75
dms.c4.xlarge                 4       7.5
dms.c4.2xlarge                8       15
dms.c4.4xlarge                16      30
Memory Optimized
dms.r4.large                  2       15.25
dms.r4.xlarge                 4       30.5
dms.r4.2xlarge                8       61
dms.r4.4xlarge                16      122
dms.r4.8xlarge                32      244
To help you determine which replication instance class would work best for your migration, let’s look at
the change data capture (CDC) process that the AWS DMS replication instance uses.
Let’s assume that you’re running a full load plus CDC task (bulk load plus ongoing replication). In this
case, the task has its own SQLite repository to store metadata and other information. Before AWS DMS
starts a full load, these steps occur:
• AWS DMS starts capturing changes for the tables it's migrating from the source engine’s transaction
log (we call these cached changes). After full load is done, these cached changes are collected and
applied on the target. Depending on the volume of cached changes, these changes can directly be
applied from memory, where they are collected first, up to a set threshold. Alternatively, they can be
applied from disk, where changes are written when they can't be held in memory.
• After cached changes are applied, by default AWS DMS starts a transactional apply on the target
instance.
During the applied cached changes phase and ongoing replications phase, AWS DMS uses two stream
buffers, one each for incoming and outgoing data. AWS DMS also uses an important component called
a sorter, which is another memory buffer. The sorter component has several uses; two of the most
important follow:
• It tracks all transactions and makes sure that it forwards only relevant transactions to the outgoing
buffer.
• It makes sure that transactions are forwarded in the same commit order as on the source.
As you can see, we have three important memory buffers in this architecture for CDC in AWS DMS. If
any of these buffers experience memory pressure, the migration can have performance issues that can
potentially cause failures.
When you plug heavy workloads with a high number of transactions per second (TPS) into this
architecture, you can find the extra memory provided by R4 instances useful. You can use R4 instances
to hold a large number of transactions in memory and prevent memory-pressure issues during ongoing
replications.
A private replication instance has a private IP address that you can't access outside the replication
network. A replication instance should have a private IP address when both source and target databases
are in the same network that is connected to the replication instance's VPC by using a VPN, AWS Direct
Connect, or VPC peering.
A VPC peering connection is a networking connection between two VPCs that enables routing using each
VPC’s private IP addresses as if they were in the same network. For more information about VPC peering,
see VPC Peering in the Amazon VPC User Guide.
Maintenance items require that AWS DMS take your replication instance offline for a short time.
Maintenance that requires a resource to be offline includes required operating system or instance
patching. Required patching is automatically scheduled only for patches that are related to security
and instance reliability. Such patching occurs infrequently (typically once or twice a year) and seldom
requires more than a fraction of your maintenance window. You can have minor version updates applied
automatically by choosing the Auto minor version upgrade console option.
If AWS DMS determines that maintenance is required during a given week, the maintenance occurs
during the 30-minute maintenance window you chose when you created the replication instance. AWS
DMS completes most maintenance during the 30-minute maintenance window. However, a longer time
might be required for larger changes.
The 30-minute maintenance window that you selected when you created the replication instance is from
an 8-hour block of time allocated for each AWS Region. If you don't specify a preferred maintenance
window when you create your replication instance, AWS DMS assigns one on a randomly selected day
of the week. For a replication instance that uses a Multi-AZ deployment, a failover might be required for
maintenance to be completed.
The following table lists the maintenance window for each AWS Region that supports AWS DMS.
• If the tables in the migration task are in the replicating ongoing changes phase (CDC), AWS DMS
pauses the task for a moment while the patch is applied. The migration then continues from where it
was interrupted when the patch was applied.
• If AWS DMS is migrating a table when the patch is applied, AWS DMS restarts the migration for the
table.
To adjust the preferred maintenance window, use the AWS CLI modify-replication-instance
command with the following parameters.
• --replication-instance-identifier
• --preferred-maintenance-window
Example
The following AWS CLI example sets the maintenance window to Tuesdays from 4:00–4:30 a.m. UTC.
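A sketch of such a call follows; the instance is identified here by its ARN, which is a placeholder:
aws dms modify-replication-instance \
--replication-instance-arn arn:aws:dms:us-west-2:123456789012:rep:EXAMPLE \
--preferred-maintenance-window "Tue:04:00-Tue:04:30"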
To adjust the preferred maintenance window, use the AWS DMS API ModifyReplicationInstance
action with the following parameters.
• ReplicationInstanceIdentifier = myrepinstance
• PreferredMaintenanceWindow = Tue:04:00-Tue:04:30
Example
The following code example sets the maintenance window to Tuesdays from 4:00–4:30 a.m. UTC.
https://dms.us-west-2.amazonaws.com/
?Action=ModifyReplicationInstance
&DBInstanceIdentifier=myrepinstance
&PreferredMaintenanceWindow=Tue:04:00-Tue:04:30
&SignatureMethod=HmacSHA256
&SignatureVersion=4
&Version=2014-09-01
&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=AKIADQKE4SARGYLE/20140425/us-east-1/dms/aws4_request
&X-Amz-Date=20140425T192732Z
&X-Amz-SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date
&X-Amz-Signature=1dc9dd716f4855e9bdf188c70f1cf9f6251b070b68b81103b59ec70c3e7854b3
When you launch a new replication instance, it runs the latest AWS DMS engine version unless you
specify otherwise. For more information, see Working with an AWS DMS Replication Instance (p. 57).
If you have a replication instance that is currently running, you can upgrade it to a more recent engine
version. (AWS DMS doesn't support engine version downgrades.) For more information, including a list of
replication engine versions, see the following section.
Beginning on August 5, 2018, at 0:00 UTC, all DMS replication instances running version 1.9.0 will
be scheduled for automatic upgrade to the latest available version during the maintenance window
specified for each instance. We recommend that you upgrade your instances before that time, at a time
that is convenient for you.
You can initiate an upgrade of your replication instance by using the instructions in the following section,
Upgrading the Engine Version of a Replication Instance (p. 63).
For migration tasks that are running when you choose to upgrade the replication instance, tables in the
full load phase at the time of the upgrade are reloaded from the start once the upgrade is complete.
Replication for all other tables should resume without interruption once the upgrade is complete. We
recommend testing all current migration tasks on the latest available version of AWS DMS replication
instance before upgrading the instances from version 1.9.0.
Version Summary
2.2.x • Support for Microsoft SQL Server 2016, as either an AWS DMS source or an
AWS DMS target.
• Support for SAP ASE 16, as either an AWS DMS source or an AWS DMS target.
• Support for Microsoft SQL Server running on Microsoft Azure, as an AWS
DMS source only. You can perform a full migration of existing data; however,
change data capture (CDC) is not available.
Note
Upgrading the replication instance takes several minutes. When the instance is ready, its status
changes to available.
1. Determine the Amazon Resource Name (ARN) of your replication instance by using the following
command.
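One way to do this is to list your replication instances and their ARNs, for example:
aws dms describe-replication-instances \
--query "ReplicationInstances[].ReplicationInstanceArn"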
In the output, take note of the ARN for the replication instance you want to upgrade, for example:
arn:aws:dms:us-east-1:123456789012:rep:6EFQQO6U6EDPRCPKLNPL2SCEEY
2. Determine which replication instance versions are available by using the following command.
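A call along the following lines lists the orderable replication instance configurations, including the engine versions available for each instance class:
aws dms describe-orderable-replication-instances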
In the output, take note of the engine version number or numbers that are available for your
replication instance class. You should see this information in the output from step 1.
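3. Upgrade the replication instance. A command of the following form performs the upgrade.
aws dms modify-replication-instance \
--replication-instance-arn arn \
--engine-version n.n.n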
Replace arn in the preceding with the actual replication instance ARN from the previous step.
Replace n.n.n with the engine version number that you want, for example: 2.2.1
Note
Upgrading the replication instance takes several minutes. You can view the replication instance
status using the following command.
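For example, a filtered describe call shows the status of the instance being upgraded; the ARN is a placeholder:
aws dms describe-replication-instances \
--filters Name=replication-instance-arn,Values=<replication instance ARN>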
The Elastic Network Interface (ENI) allocated for the replication instance in your VPC must be associated
with a security group that has rules that allow all traffic on all ports to leave (egress) the VPC. This
approach allows communication from the replication instance to your source and target database
endpoints, as long as correct ingress rules are enabled on those endpoints. We recommend that you use the
default settings for the endpoints, which allow egress on all ports to all addresses.
The source and target endpoints access the replication instance that is inside the VPC either by
connecting to the VPC or by being inside the VPC. The database endpoints must include network access
control lists (ACLs) and security group rules (if applicable) that allow incoming access from the replication
instance. Depending on the network configuration you are using, you can use the replication instance
VPC security group, the replication instance's private or public IP address, or the NAT gateway's public IP
address. These connections form a network that you use for data migration.
Topics
• Configuration with All Database Migration Components in One VPC (p. 66)
• Configuration with Two VPCs (p. 66)
• Configuration for a Network to a VPC Using AWS Direct Connect or a VPN (p. 66)
• Configuration for a Network to a VPC Using the Internet (p. 67)
• Configuration with an Amazon RDS DB instance not in a VPC to a DB instance in a VPC Using
ClassicLink (p. 67)
The following illustration shows a configuration where a database on an Amazon EC2 instance connects
to the replication instance and data is migrated to an Amazon RDS DB instance.
The VPC security group used in this configuration must allow ingress on the database port from the
replication instance. You can do this either by ensuring that the security group used by the replication
instance is allowed ingress at the endpoints, or by explicitly allowing the private IP address of the
replication instance.
A VPC peering connection is a networking connection between two VPCs that enables routing using
each VPC’s private IP addresses as if they were in the same network. We recommend this method for
connecting VPCs within an AWS Region. You can create VPC peering connections between your own VPCs
or with a VPC in another AWS account within the same AWS Region. For more information about VPC
peering, see VPC Peering in the Amazon VPC User Guide.
The following illustration shows an example configuration using VPC peering. Here, the source database
on an Amazon EC2 instance in a VPC connects by VPC peering to a VPC. This VPC contains the replication
instance and the target database on an Amazon RDS DB instance.
The VPC security groups used in this configuration must allow ingress on the database port from the
replication instance.
With AWS Direct Connect or a VPN, you can connect a remote network to your VPC and to internal
systems such as monitoring, authentication, security, data, or other systems, by extending an internal
network into the AWS Cloud. By using this type of network extension, you can seamlessly connect to
AWS-hosted resources such as a VPC.
The following illustration shows a configuration where the source endpoint is an on-premises database
in a corporate data center. It is connected by using AWS Direct Connect or a VPN to a VPC that contains
the replication instance and a target database on an Amazon RDS DB instance.
In this configuration, the VPC must include a routing rule that sends traffic destined for a
specific IP address or range to a host. This host must be able to bridge traffic from the VPC into the on-
premises VPN. In this case, the NAT host includes its own security group settings that must allow traffic
from the replication instance's private IP address or security group into the NAT instance.
To add an Internet gateway to your VPC, see Attaching an Internet Gateway in the Amazon VPC User
Guide.
The VPC must include routing rules that send traffic not destined for the VPC by default
to the Internet gateway. In this configuration, the connection to the endpoint appears to come from the
public IP address of the replication instance, not the private IP address.
ClassicLink allows you to link an EC2-Classic DB instance to a VPC in your account, within the same AWS
Region. After you've created the link, the source DB instance can communicate with the replication
instance inside the VPC using their private IP addresses.
Because the replication instance in the VPC cannot directly access the source DB instance on the EC2-
Classic platform using ClassicLink, you must use a proxy server. The proxy server connects the source DB
instance to the VPC containing the replication instance and target DB instance. The proxy server uses
ClassicLink to connect to the VPC. Port forwarding on the proxy server allows communication between
the source DB instance and the target DB instance in the VPC.
The following procedure shows how to use ClassicLink to connect an Amazon RDS source DB instance
that is not in a VPC to a VPC containing an AWS DMS replication instance and a target DB instance.
• Create an AWS DMS replication instance in a VPC. (All replication instances are created in a VPC).
• Associate a VPC security group to the replication instance and the target DB instance. When two
instances share a VPC security group, they can communicate with each other by default.
• Set up a proxy server on an EC2 Classic instance.
• Create a connection using ClassicLink between the proxy server and the VPC.
• Create AWS DMS endpoints for the source and target databases.
• Create an AWS DMS task.
1. Step 1: Create an AWS DMS replication instance and assign a VPC security group:
a. Sign in to the AWS Management Console and choose AWS Database Migration Service. Note
that if you are signed in as an AWS Identity and Access Management (IAM) user, you must have
the appropriate permissions to access AWS DMS. For more information on the permissions
required for database migration, see IAM Permissions Needed to Use AWS DMS (p. 31).
b. On the Dashboard page, choose Replication Instance. Follow the instructions at Step 2: Create
a Replication Instance (p. 18) to create a replication instance.
c. After you have created the AWS DMS replication instance, open the EC2 service console. Select
Network Interfaces from the navigation pane.
d. Select the DMSNetworkInterface, and then choose Change Security Groups from the Actions
menu.
e. Select the security group you want to use for the replication instance and the target DB
instance.
2. Step 2: Associate the security group from the last step with the target DB instance.
a. Open the Amazon RDS service console. Select Instances from the navigation pane.
b. Select the target DB instance. From Instance Actions, select Modify.
c. For the Security Group parameter, select the security group you used in the previous step.
d. Select Continue, and then Modify DB Instance.
3. Step 3: Set up a proxy server on an EC2 Classic instance using NGINX. Use an AMI of your choice to
launch an EC2 Classic instance. The example below is based on the AMI Ubuntu Server 14.04 LTS
(HVM).
a. Connect to the EC2 Classic instance and install NGINX using the following commands:
b. Edit the NGINX daemon file, /etc/init/nginx.conf, using the following code:
env DAEMON=/usr/local/nginx/sbin/nginx
env PID=/usr/local/nginx/logs/nginx.pid
expect fork
respawn
respawn limit 10 5
pre-start script
$DAEMON -t
if [ $? -ne 0 ]
then exit $?
fi
end script
exec $DAEMON
c. Create the NGINX configuration file using the following code, replacing the placeholder values with the port and endpoint of your source DB instance:
worker_processes 1;
events {
    worker_connections 1024;
}
stream {
    server {
        listen <DB instance port number>;
        proxy_pass <DB instance endpoint>:<DB instance port number>;
    }
}
d. From the command line, start NGINX using the following commands:
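Because the daemon is set up as an Upstart job in the previous step, commands along the following lines (a sketch for Ubuntu 14.04, which uses Upstart) reload the job definitions and start NGINX:
sudo initctl reload-configuration
sudo initctl start nginx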
4. Step 4: Create a ClassicLink connection between the proxy server and the target VPC that contains
the target DB instance and the replication instance
Use ClassicLink to connect the proxy server with the target VPC
a. Open the EC2 console and select the EC2 Classic instance that is running the proxy server.
b. Select ClassicLink under Actions, then select Link to VPC.
c. Select the security group you used earlier in this procedure.
d. Select Link to VPC.
5. Step 5: Create AWS DMS endpoints using the procedure at Step 3: Specify Source and Target
Endpoints (p. 22). You must use the internal EC2 DNS hostname of the proxy as the server name
when specifying the source endpoint.
6. Step 6: Create an AWS DMS task using the procedure at Step 4: Create a Task (p. 25).
You create a replication instance in a subnet that you select, and you can manage what subnet a source
or target endpoint uses by using the AWS DMS console.
You create a replication subnet group to define which subnets to use. You must specify at least one
subnet in two different Availability Zones.
1. Sign in to the AWS Management Console and choose AWS Database Migration Service. If you are
signed in as an AWS Identity and Access Management (IAM) user, you must have the appropriate
permissions to access AWS DMS. For more information on the permissions required for database
migration, see IAM Permissions Needed to Use AWS DMS (p. 31).
2. In the navigation pane, choose Subnet Groups.
3. Choose Create Subnet Group.
4. On the Edit Replication Subnet Group page, shown following, specify your replication subnet group
information. The following table describes the settings.
VPC Choose the VPC you want to use for database migration.
Keep in mind that the VPC must have at least one subnet
in at least two Availability Zones.
Available Subnets Choose the subnets you want to include in the replication
subnet group. You must select subnets in at least two
Availability Zones.
To encrypt the replication storage and connection information, AWS DMS uses a master key that is unique
to your AWS account. You can view and manage this master key with AWS Key Management Service (AWS KMS).
You can use the default master key in your account (aws/dms) or a custom master key that you create. If
you have an existing AWS KMS encryption key, you can also use that key for encryption.
You can specify your own encryption key by supplying a KMS key identifier to encrypt your AWS DMS
resources. When you specify your own encryption key, the user account used to perform the database
migration must have access to that key. For more information on creating your own encryption keys and
giving users access to an encryption key, see the AWS KMS Developer Guide.
If you don't specify a KMS key identifier, then AWS DMS uses your default encryption key. KMS creates
the default encryption key for AWS DMS for your AWS account. Your AWS account has a different default
encryption key for each AWS Region.
To manage the keys used for encrypting your AWS DMS resources, you use KMS. You can find KMS in the
AWS Management Console by choosing Identity & Access Management on the console home page and
then choosing Encryption Keys on the navigation pane.
KMS combines secure, highly available hardware and software to provide a key management system
scaled for the cloud. Using KMS, you can create encryption keys and define the policies that control how
these keys can be used. KMS supports AWS CloudTrail, so you can audit key usage to verify that keys
are being used appropriately. Your KMS keys can be used in combination with AWS DMS and supported
AWS services such as Amazon RDS, Amazon S3, Amazon Elastic Block Store (Amazon EBS), and Amazon
Redshift.
When you have created your AWS DMS resources with a specific encryption key, you can't change the
encryption key for those resources. Make sure to determine your encryption key requirements before you
create your AWS DMS resources.
The procedure following assumes that you have chosen the AWS DMS console wizard. You can also do
this step by selecting Replication instances from the AWS DMS console's navigation pane and then
selecting Create replication instance.
1. On the Create replication instance page, specify your replication instance information. The
following table describes the settings.
Replication engine version By default, the replication instance runs the latest
version of the AWS DMS replication engine software. We
recommend that you accept this default; however, you
can choose a previous engine version if necessary.
Publicly accessible Choose this option if you want the replication instance to
be accessible from the Internet.
2. Choose the Advanced tab, shown following, to set values for network and encryption settings if you
need them. The following table describes the settings.
Allocated storage (GB) Storage is primarily consumed by log files and cached
transactions. For cached transactions, storage is used only
when the cached transactions need to be written to disk.
Therefore, AWS DMS doesn’t use a significant amount of
storage. Some exceptions include the following:
Replication Subnet Group Choose the replication subnet group in your selected VPC
where you want the replication instance to be created.
If your source database is in a VPC, choose the subnet
group that contains the source database as the location
for your replication instance. For more information about
replication subnet groups, see Creating a Replication
Subnet Group (p. 70).
Availability zone Choose the Availability Zone where your source database
is located.
VPC Security group(s) The replication instance is created in a VPC. If your source
database is in a VPC, select the VPC security group that
provides access to the DB instance where the database
resides.
KMS master key Choose the encryption key to use to encrypt replication
storage and connection information. If you choose
(Default) aws/dms, the default AWS Key Management
Service (AWS KMS) key associated with your account
and AWS Region is used. A description and your account
number are shown, along with the key's ARN. For more
information on using the encryption key, see Setting an
Encryption Key and Specifying KMS Permissions (p. 44).
3. Specify the Maintenance settings. The following table describes the settings. For more information
about maintenance settings, see AWS DMS Maintenance Window (p. 60).
Auto minor version upgrade Select to have minor engine upgrades applied
automatically to the replication instance during the
maintenance window.
When you modify a replication instance, you can apply the changes immediately. To apply
changes immediately, you select the Apply changes immediately option in the AWS Management
Console, you use the --apply-immediately parameter when calling the AWS CLI, or you set the
ApplyImmediately parameter to true when using the AWS DMS API.
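For example, a CLI call of the following form changes the instance class and applies the change immediately; the ARN is a placeholder:
aws dms modify-replication-instance \
--replication-instance-arn <replication instance ARN> \
--replication-instance-class dms.c4.large \
--apply-immediately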
If you don't choose to apply changes immediately, the changes are put into the pending modifications
queue. During the next maintenance window, any pending changes in the queue are applied.
Note
If you choose to apply changes immediately, any changes in the pending modifications queue
are also applied. If any of the pending modifications require downtime, choosing Apply changes
immediately can cause unexpected downtime.
Instance class You can change the instance class. Choose an instance
class with the configuration you need for your migration.
Changing the instance class causes the replication
instance to reboot. This reboot occurs during the next
maintenance window or can occur immediately if you
select the Apply changes immediately option.
Replication engine version You can upgrade the engine version that is used by the
replication instance. Upgrading the replication engine
version causes the replication instance to shut down while
it is being upgraded.
Allocated storage (GB) Storage is primarily consumed by log files and cached
transactions. For cached transactions, storage is used only
when the cached transactions need to be written to disk.
Therefore, AWS DMS doesn’t use a significant amount of
storage. Some exceptions include the following:
VPC Security Group(s) The replication instance is created in a VPC. If your source
database is in a VPC, select the VPC security group that
provides access to the DB instance where the database
resides.
Auto minor version upgrade Choose this option to have minor engine upgrades
applied automatically to the replication instance during
the maintenance window or immediately if you select the
Apply changes immediately option.
Apply changes immediately Choose this option to apply any modifications you made
immediately. Depending on the settings you choose,
choosing this option could cause an immediate reboot of
the replication instance.
If the AWS DMS instance is configured for Multi-AZ, the reboot can be conducted with a failover. An AWS
DMS event is created when the reboot is completed.
If your AWS DMS instance is a Multi-AZ deployment, you can force a failover from one AWS Availability
Zone to another when you reboot. When you force a failover of your AWS DMS instance, AWS DMS
automatically switches to a standby instance in another Availability Zone. Rebooting with failover is
beneficial when you want to simulate a failure of an AWS DMS instance for testing purposes.
If there are migration tasks running on the replication instance when a reboot occurs, no data loss
occurs and the task resumes once the reboot is completed. If the tables in the migration task are in the
middle of a bulk load (full load phase), DMS restarts the migration for those tables from the beginning.
If tables in the migration task are in the ongoing replication phase, the task resumes once the reboot is
completed.
You can't reboot your AWS DMS replication instance if its status is not in the Available state. Your AWS
DMS instance can be unavailable for several reasons, such as a previously requested modification or a
maintenance-window action. The time required to reboot an AWS DMS replication instance is typically
small (under 5 minutes).
• --replication-instance-arn
The following AWS CLI example reboots a replication instance with failover.
aws dms reboot-replication-instance \
--replication-instance-arn arnofmyrepinstance \
--force-failover
• ReplicationInstanceArn = arnofmyrepinstance
https://dms.us-west-2.amazonaws.com/
?Action=RebootReplicationInstance
&DBInstanceArn=arnofmyrepinstance
&SignatureMethod=HmacSHA256
&SignatureVersion=4
&Version=2014-09-01
&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=AKIADQKE4SARGYLE/20140425/us-east-1/dms/aws4_request
&X-Amz-Date=20140425T192732Z
&X-Amz-SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date
&X-Amz-Signature=1dc9dd716f4855e9bdf188c70f1cf9f6251b070b68b81103b59ec70c3e7854b3
The following code example reboots a replication instance and fails over to another AWS Availability
Zone.
https://dms.us-west-2.amazonaws.com/
?Action=RebootReplicationInstance
&DBInstanceArn=arnofmyrepinstance
&ForceFailover=true
&SignatureMethod=HmacSHA256
&SignatureVersion=4
&Version=2014-09-01
&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=AKIADQKE4SARGYLE/20140425/us-east-1/dms/aws4_request
&X-Amz-Date=20140425T192732Z
&X-Amz-SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date
&X-Amz-Signature=1dc9dd716f4855e9bdf188c70f1cf9f6251b070b68b81103b59ec70c3e7854b3
If you close your AWS account, all AWS DMS resources and configurations associated with your account
are deleted after two days. These resources include all replication instances, source and target endpoint
configuration, replication tasks, and SSL certificates. If after two days you decide to use AWS DMS again,
you must re-create the resources you need.
To delete a replication instance using the AWS CLI, call the delete-replication-instance command with the
following parameter:
• --replication-instance-arn
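For example, the following AWS CLI command deletes a replication instance. The ARN shown is a
placeholder; replace it with the ARN of your replication instance.
# Delete the replication instance identified by its ARN
aws dms delete-replication-instance \
    --replication-instance-arn arnofmyrepinstance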
To delete a replication instance using the AWS DMS API, call the DeleteReplicationInstance action with the
following parameter:
• ReplicationInstanceArn = <arnofmyrepinstance>
https://dms.us-west-2.amazonaws.com/
?Action=DeleteReplicationInstance
&ReplicationInstanceArn=arnofmyrepinstance
&SignatureMethod=HmacSHA256
&SignatureVersion=4
&Version=2016-01-01
&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=AKIADQKE4SARGYLE/20140425/us-east-1/dms/aws4_request
&X-Amz-Date=20140425T192732Z
&X-Amz-SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date
&X-Amz-Signature=1dc9dd716f4855e9bdf188c70f1cf9f6251b070b68b81103b59ec70c3e7854b3
• Create table
• Drop table
• Rename table
• Add column
• Drop column
• Rename column
• Change column data type
For information about which DDL statements are supported for a specific source, see the topic describing
that source.
Topics
• Sources for Data Migration (p. 83)
• Targets for Data Migration (p. 147)
• Creating Source and Target Endpoints (p. 210)
• Oracle versions 10.2 and later, 11g, and up to 12.2, for the Enterprise, Standard, Standard One, and
Standard Two editions.
• Microsoft SQL Server versions 2005, 2008, 2008R2, 2012, 2014, and 2016 for the Enterprise,
Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
• MySQL versions 5.5, 5.6, and 5.7.
• MariaDB (supported as a MySQL-compatible data source).
• PostgreSQL 9.4 and later.
• SAP Adaptive Server Enterprise (ASE) versions 12.5.3 or higher, 15, 15.5, 15.7, 16 and later.
• MongoDB versions 2.6.x and 3.x and later.
• Db2 LUW versions:
• Version 9.7, all Fix Packs are supported.
• Version 10.1, all Fix Packs are supported.
• Version 10.5, all Fix Packs except for Fix Pack 5 are supported.
Microsoft Azure
• AWS DMS supports full data load when using Azure SQL Database as a source. Change data capture
(CDC) is not supported.
• Oracle versions 11g (versions 11.2.0.3.v1 and later), and 12c, for the Enterprise, Standard, Standard
One, and Standard Two editions.
• Microsoft SQL Server versions 2008R2, 2012, 2014, and 2016 for both the Enterprise and Standard
editions. CDC is supported for all versions of Enterprise Edition. CDC is only supported for Standard
Edition version 2016 SP1 and later. The Web, Workgroup, Developer, and Express editions are not
supported by AWS DMS.
• MySQL versions 5.5, 5.6, and 5.7. Change data capture (CDC) is only supported for versions 5.6 and
later.
• PostgreSQL 9.4 and later. CDC is only supported for versions 9.4.9 and higher and 9.5.4 and higher.
The rds.logical_replication parameter, which is required for CDC, is supported only in these
versions and later.
• MariaDB, supported as a MySQL-compatible data source.
• Amazon Aurora with MySQL compatibility.
• AWS DMS supports full data load and change data capture (CDC) when using Amazon Simple Storage
Service as a source.
Topics
• Using an Oracle Database as a Source for AWS DMS (p. 84)
• Using a Microsoft SQL Server Database as a Source for AWS DMS (p. 100)
• Using Microsoft Azure SQL Database as a Source for AWS DMS (p. 109)
• Using a PostgreSQL Database as a Source for AWS DMS (p. 110)
• Using a MySQL-Compatible Database as a Source for AWS DMS (p. 122)
• Using an SAP ASE Database as a Source for AWS DMS (p. 129)
• Using MongoDB as a Source for AWS DMS (p. 132)
• Using Amazon Simple Storage Service as a Source for AWS DMS (p. 138)
• Using an IBM Db2 for Linux, Unix, and Windows Database (Db2 LUW) as a Source for AWS
DMS (p. 144)
For self-managed Oracle databases as sources, AWS DMS supports all Oracle database editions for versions
10.2 and later, 11g, and up to 12.2. For Amazon-managed Oracle
databases provided by Amazon RDS, AWS DMS supports all Oracle database editions for versions 11g
(versions 11.2.0.3.v1 and later) and up to 12.2.
You can use SSL to encrypt connections between your Oracle endpoint and the replication instance. For
more information on using SSL with an Oracle endpoint, see Using SSL With AWS Database Migration
Service (p. 47).
The steps to configure an Oracle database as a source for AWS DMS are as follows:
1. If you want to create a CDC-only or full load plus CDC task, then you must choose either Oracle
LogMiner or Oracle Binary Reader to capture data changes. Choosing LogMiner or Binary Reader
determines some of the subsequent permission and configuration tasks. For a comparison of LogMiner
and Binary Reader, see the next section.
2. Create an Oracle user with the appropriate permissions for AWS DMS. If you are creating a full-load-
only task, then no further configuration is needed.
3. If you are creating a full load plus CDC task or a CDC-only task, configure Oracle for LogMiner or
Binary Reader.
4. Create a DMS endpoint that conforms with your chosen configuration.
For additional details on working with Oracle databases and AWS DMS, see the following sections.
Topics
• Using Oracle LogMiner or Oracle Binary Reader for Change Data Capture (CDC) (p. 85)
• Working with a Self-Managed Oracle Database as a Source for AWS DMS (p. 87)
• Working with an Amazon-Managed Oracle Database as a Source for AWS DMS (p. 89)
• Limitations on Using Oracle as a Source for AWS DMS (p. 92)
• Extra Connection Attributes When Using Oracle as a Source for AWS DMS (p. 93)
• Source Data Types for Oracle (p. 97)
By default, AWS DMS uses Oracle LogMiner for change data capture (CDC). The advantages of using
LogMiner with AWS DMS include the following:
• LogMiner supports most Oracle options, such as encryption options and compression options. Binary
Reader doesn't support all Oracle options, in particular options for encryption and compression.
• LogMiner offers a simpler configuration, especially compared to Oracle Binary Reader's direct access
setup or if the redo logs are on Automatic Storage Management (ASM).
• LogMiner fully supports most Oracle encryption options, including Oracle Transparent Data Encryption
(TDE).
• LogMiner supports the following HCC compression types for both full load and on-going replication
(CDC):
• QUERY HIGH
• ARCHIVE HIGH
• ARCHIVE LOW
• QUERY LOW
Binary Reader supports QUERY LOW compression only for full load replications, not ongoing (CDC)
replications.
• LogMiner supports table clusters for use by AWS DMS. Binary Reader does not.
The advantages to using Binary Reader with AWS DMS, instead of LogMiner, include the following:
• For migrations with a high volume of changes, LogMiner might have some I/O or CPU impact on the
computer hosting the Oracle source database. Binary Reader has less chance of having I/O or CPU
impact because the archive logs are copied to the replication instance and mined there.
• For migrations with a high volume of changes, CDC performance is usually much better when using
Binary Reader compared with using Oracle LogMiner.
• Binary Reader supports CDC for LOBs in Oracle version 12c. LogMiner does not.
• Binary Reader supports the following HCC compression types for both full load and continuous
replication (CDC):
• QUERY HIGH
• ARCHIVE HIGH
• ARCHIVE LOW
The QUERY LOW compression type is only supported for full load migrations.
In general, use Oracle LogMiner for migrating your Oracle database unless you have one of the following
situations:
• You need to run several migration tasks on the source Oracle database.
• The volume of changes or the redo log volume on the source Oracle database is high.
• You are migrating LOBs from an Oracle 12.2 or later source endpoint.
• If your workload includes UPDATE statements that update only LOB columns, you must use Binary
Reader. These UPDATE statements are not supported by Oracle LogMiner.
• If your source is Oracle version 11 and you perform UPDATE statements on XMLTYPE and LOB
columns, you must use Binary Reader. These statements are not supported by Oracle LogMiner.
• On Oracle 12c, LogMiner does not support LOB columns. You must use Binary Reader if you are
migrating LOB columns from Oracle 12c.
LogMiner is used by default, so you don't have to explicitly specify its use. In order to enable Binary
Reader to access the transaction logs, add the following extra connection attributes.
useLogMinerReader=N; useBfile=Y;
If the Oracle source database is using Oracle Automatic Storage Management (ASM), the extra
connection attribute needs to include the ASM user name and ASM server address. When you create the
source endpoint, the password field needs to have both passwords, the source user password and the
ASM password.
For example, the following extra connection attribute format is used to access a server that uses Oracle
ASM.
useLogMinerReader=N;asm_user=<asm_username>;asm_server=<first_RAC_server_ip_address>:<port_number>/
+ASM
If the Oracle source database is using Oracle ASM, the source endpoint password field must have both
the Oracle user password and the ASM password, separated by a comma. For example, the following
works in the password field.
<oracle_user_password>,<asm_user_password>
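As an illustration only, the following AWS CLI command sketches creating an Oracle source endpoint that
uses Binary Reader with ASM. All identifiers, addresses, and credentials shown are placeholders, and the
exact extra connection attributes depend on your environment.
# Create an Oracle source endpoint configured for Binary Reader with ASM (placeholder values)
aws dms create-endpoint \
    --endpoint-identifier oracle-asm-source \
    --endpoint-type source \
    --engine-name oracle \
    --server-name 10.0.0.25 \
    --port 1521 \
    --database-name ORCL \
    --username oracle_user \
    --password "oracle_user_password,asm_user_password" \
    --extra-connection-attributes "useLogMinerReader=N;useBfile=Y;asm_user=asm_user;asm_server=10.0.0.30:1521/+ASM"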
• AWS DMS doesn't capture changes made by the Oracle DBMS_REDEFINITION package, such as
changes to table metadata and the OBJECT_ID value.
• AWS DMS doesn't support index-organized tables with an overflow segment in CDC mode when using
BFILE. An example is when you access the redo logs without using LogMiner.
User Account Privileges Required on a Self-Managed Oracle Source for AWS DMS
To use an Oracle database as a source in an AWS DMS task, the user specified in the AWS DMS Oracle
database definitions must be granted the following privileges in the Oracle database. When granting
privileges, use the actual name of objects (for example, V_$OBJECT including the underscore), not the
synonym for the object (for example, V$OBJECT without the underscore).
When using ongoing replication (CDC), you need these additional permissions.
• The following permission is required when using CDC so that AWS DMS can add to Oracle LogMiner
redo logs for both 11g and 12c.
• The following permission is required when using CDC so that AWS DMS can add to Oracle LogMiner
redo logs for 12c only.
If you are using any of the additional features noted following, the given additional permissions are
required:
• Provide Oracle account access – You must provide an Oracle user account for AWS DMS. The user
account must have read/write privileges on the Oracle database, as specified in the previous section.
• Ensure that ARCHIVELOG mode is on – Oracle can run in two different modes, the ARCHIVELOG
mode and the NOARCHIVELOG mode. To use Oracle with AWS DMS, the source database must be in
ARCHIVELOG mode.
• Set up supplemental logging – If you are planning to use the source in a CDC or full-load plus CDC
task, then you need to set up supplemental logging to capture the changes for replication.
There are two steps to enable supplemental logging for Oracle. First, you need to enable database-level
supplemental logging. Doing this ensures that the LogMiner has the minimal information to support
various table structures such as clustered and index-organized tables. Second, you need to enable table-
level supplemental logging for each table to be migrated.
1. Run the following query to determine if database-level supplemental logging is already enabled.
The returned result should be greater than or equal to 9.0.0.
2. Run the following query. The returned result should be YES or IMPLICIT.
There are two methods to enable table-level supplemental logging. In the first one, if your database
user account has ALTER TABLE privileges on all tables to be migrated, you can use the extra connection
parameter addSupplementalLogging as described following. Otherwise, you can use the steps
following for each table in the migration.
1. If the table has a primary key, add PRIMARY KEY supplemental logging for the table by running the
following command.
ALTER TABLE <table_name> ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
2. If no primary key exists and the table has multiple unique indexes, then AWS DMS uses the first
unique index in alphabetical order of index name.
In some cases, the target table primary key or unique index is different than the source table primary
key or unique index. In these cases, add supplemental logging on the source table columns that make up
the target table primary key or unique index. If you change the target table primary key, you should add
supplemental logging on the selected index's columns, instead of the columns of the original primary
key or unique index.
Add additional logging if needed, such as if a filter is defined for a table. If a table has a unique index
or a primary key, you need to add supplemental logging on each column that is involved in a filter if
those columns are different than the primary key or unique index columns. However, if ALL COLUMNS
supplemental logging has been added to the table, you don't need to add any additional logging.
ALTER TABLE <table_name> ADD SUPPLEMENTAL LOG GROUP <group_name> (<column_list>) ALWAYS;
Grant the following to the AWS DMS user account used to access the source Oracle endpoint.
• Provide Oracle account access – You must provide an Oracle user account for AWS DMS. The user
account must have read/write privileges on the Oracle database, as specified in the previous section.
• Set the backup retention period for your Amazon RDS database to one day or longer – Setting
the backup retention period ensures that the database is running in ARCHIVELOG mode. For more
information about setting the backup retention period, see the Working with Automated Backups in
the Amazon RDS User Guide.
• Set up archive retention – Run the following to retain archived redo logs of your Oracle database
instance. Running this command lets AWS DMS retrieve the log information using LogMiner. Make sure
that you have enough storage for the archived redo logs during the migration period.
• Set up supplemental logging – If you are planning to use the source in a CDC or full-load plus CDC
task, then set up supplemental logging to capture the changes for replication.
There are two steps to enable supplemental logging for Oracle. First, you need to enable database-
level supplemental logging. Doing this ensures that the LogMiner has the minimal information to
support various table structures such as clustered and index-organized tables. Second, you need to
enable table-level supplemental logging for each table to be migrated.
exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
• Run the following command to enable PRIMARY KEY logging for tables that have primary keys.
For tables that don’t have primary keys, use the following command to add supplemental logging.
If you create a table without a primary key, you should either include a supplemental logging clause in
the create statement or alter the table to add supplemental logging. The following command creates a
table and adds supplemental logging.
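The exact statement depends on your table definition. The following is a minimal sketch for a hypothetical
table without a primary key; it includes a supplemental logging clause that logs all columns.
CREATE TABLE app_events (
  event_id    NUMBER,                         -- hypothetical columns for illustration
  event_name  VARCHAR2(100),
  SUPPLEMENTAL LOG DATA (ALL) COLUMNS         -- supplemental logging clause included at create time
);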
If you create a table and later add a primary key, you need to add supplemental logging to the table.
Add supplemental logging to the table using the following command.
alter table <table_name> add supplemental log data (PRIMARY KEY) columns;
Configuring Change Data Capture (CDC) for an Amazon RDS for Oracle Source
for AWS DMS
You can configure AWS DMS to use an Amazon RDS for Oracle instance as a source of CDC. You use
Oracle Binary Reader with an Amazon RDS for Oracle source (Oracle versions 11.2.0.4.v11 and later, and
12.1.0.2.v7 and later).
You must include the following extra connection attributes when you create the Amazon RDS for Oracle
source endpoint:
oraclePathPrefix=/rdsdbdata/db/ORCL_A/; usePathPrefix=/rdsdbdata/log/;
replacePathPrefix=true
• AWS DMS supports Oracle transparent data encryption (TDE) tablespace encryption and AWS Key
Management Service (AWS KMS) encryption when used with Oracle LogMiner. All other forms of
encryption are not supported.
• Tables with LOBs must have a primary key to use CDC.
• AWS DMS supports the rename table <table name> to <new table name> syntax with Oracle
version 11 and higher.
• Columns in an Oracle source database that are created using explicit CHAR semantics are transferred to a target
Oracle database using BYTE semantics. You must create tables containing columns of this type on the
target Oracle database before migrating.
• AWS DMS doesn't replicate data changes resulting from partition or subpartition operations—data
definition language (DDL) operations such as ADD, DROP, EXCHANGE, or TRUNCATE. To replicate such
changes, you must reload the table being replicated. AWS DMS replicates any future data changes to
newly added partitions without you having to reload the table. However, UPDATE operations on old
data records in partitions fail and generate a 0 rows affected warning.
• The DDL statement ALTER TABLE ADD <column> <data_type> DEFAULT <> doesn't replicate
the default value to the target. The new column in the target is set to NULL. If the new column
is nullable, Oracle updates all the table rows before logging the DDL itself. As a result, AWS DMS
captures the changes to the counters but doesn't update the target. Because the new column is set to
NULL, if the target table has no primary key or unique index, subsequent updates generate a 0 rows
affected warning.
• Data changes resulting from the CREATE TABLE AS statement are not supported. However, the new
table is created on the target.
• When limited-size LOB mode is enabled, AWS DMS replicates empty LOBs on the Oracle source as
NULL values in the target.
• When AWS DMS begins CDC, it maps a timestamp to the Oracle system change number (SCN). By
default, Oracle keeps only five days of the timestamp to SCN mapping. Oracle generates an error if the
timestamp specified is too old (greater than the five-day retention period). For more information, see
the Oracle documentation.
• AWS DMS doesn't support connections to an Oracle source by using an ASM proxy.
• AWS DMS doesn't support virtual columns.
The following table shows the extra connection attributes you can use to configure an Oracle database
as a source for AWS DMS.
Name Description
Default value: N
Example: addSupplementalLogging=Y
Note
If you use this option, you still need to enable
database-level supplemental logging as discussed
previously.
Default value: Y
Example:
useLogminerReader=N;asm_user=<asm_username>;
asm_server=<first_RAC_server_ip_address>:<port_number>/
+ASM
Default value: N
accessAlternateDirectly You must set this attribute to false in order to use the
Binary Reader to capture change data for an Amazon RDS
for Oracle as the source. This tells the DMS instance to
not access redo logs through any specified path prefix
replacement using direct file access. For more information,
see Configuring Change Data Capture (CDC) for an Amazon
RDS for Oracle Source for AWS DMS (p. 92).
useAlternateFolderForOnline You must set this attribute to true in order to use the Binary
Reader to capture change data for an Amazon RDS for
Oracle as the source. This tells the DMS instance to use any
specified prefix replacement to access all online redo logs.
For more information, see Configuring Change Data Capture
(CDC) for an Amazon RDS for Oracle Source for AWS DMS
(p. 92).
oraclePathPrefix You must set this string attribute to the required value in
order to use the Binary Reader to capture change data for
an Amazon RDS for Oracle as the source. This value specifies
the default Oracle root used to access the redo logs. For
more information, see Configuring Change Data Capture
(CDC) for an Amazon RDS for Oracle Source for AWS DMS
(p. 92).
usePathPrefix You must set this string attribute to the required value in
order to use the Binary Reader to capture change data for
an Amazon RDS for Oracle as the source. This value specifies
the path prefix used to replace the default Oracle root to
access the redo logs. For more information, see Configuring
Change Data Capture (CDC) for an Amazon RDS for Oracle
Source for AWS DMS (p. 92).
replacePathPrefix You must set this attribute to true in order to use the
Binary Reader to capture change data for an Amazon RDS
for Oracle as the source. This setting tells the DMS instance
to replace the default Oracle root with the specified
usePathPrefix setting to access the redo logs. For more
information, see Configuring Change Data Capture (CDC) for
an Amazon RDS for Oracle Source for AWS DMS (p. 92).
Default value: 5
Example: retryInterval=6
Default value: 0
Example: archivedLogDestId=1
archivedLogsOnly When this field is set to Y, AWS DMS only accesses the
archived redo logs. If the archived redo logs are stored on
Oracle ASM only, the AWS DMS user account needs to be
granted ASM privileges.
Default value: N
Example: archivedLogsOnly=Y
numberDataTypeScale Specifies the number scale. You can select a scale up to 38,
or you can select FLOAT. By default, the NUMBER data type
is converted to precision 38, scale 10.
Default value: 10
failTasksOnLobTruncation When set to true, this attribute causes a task to fail if the
actual size of an LOB column is greater than the specified
LobMaxSize.
Example: failTasksOnLobTruncation=true
standbyDelayTime Use this attribute to specify a time in minutes for the delay
in standby sync.
With AWS DMS version 2.3.0 and later, you can create an
Oracle ongoing replication (CDC) task that uses an Oracle
active data guard standby instance as a source for replicating
on-going changes to a supported target. This eliminates
the need to connect to an active database that may be in
production.
Default value: 0
Example: standbyDelayTime=1
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
Oracle Data Type AWS DMS Data Type
BINARY_FLOAT REAL4
BINARY_DOUBLE REAL8
BINARY BYTES
DATE DATETIME
TIME DATETIME
TIMESTAMP DATETIME
CHAR STRING
VARCHAR2 STRING
NCHAR WSTRING
NVARCHAR2 WSTRING
RAW BYTES
REAL REAL8
BLOB BLOB
To use this data type with AWS DMS, you must enable the use of BLOB
data types for a specific task. AWS DMS supports BLOB data types only
in tables that include a primary key.
CLOB CLOB
To use this data type with AWS DMS, you must enable the use of CLOB
data types for a specific task. During change data capture (CDC), AWS
DMS supports CLOB data types only in tables that include a primary
key.
NCLOB NCLOB
To use this data type with AWS DMS, you must enable the use of
NCLOB data types for a specific task. During CDC, AWS DMS supports
NCLOB data types only in tables that include a primary key.
LONG CLOB
XMLTYPE CLOB
Support for the XMLTYPE data type requires the full Oracle Client
(as opposed to the Oracle Instant Client). When the target column
is a CLOB, both full LOB mode and limited LOB mode are supported
(depending on the target).
Oracle tables used as a source with columns of the following data types are not supported and cannot be
replicated. Replicating columns with these data types results in a null column.
• BFILE
• ROWID
• REF
• UROWID
• Nested Table
• User-defined data types
• ANYDATA
Note
Virtual columns are not supported.
AWS DMS supports, as a source, on-premises and Amazon EC2 instance databases for Microsoft SQL
Server versions 2005, 2008, 2008R2, 2012, 2014, and 2016. The Enterprise, Standard, Workgroup, and
Developer editions are supported. The Web and Express editions are not supported.
AWS DMS supports, as a source, Amazon RDS DB instance databases for SQL Server versions 2008R2,
2012, 2014, and 2016. The Enterprise and Standard editions are supported. CDC is supported for all
versions of Enterprise Edition. CDC is only supported for Standard Edition version 2016 SP1 and later.
The Web, Workgroup, Developer, and Express editions are not supported.
You can have the source SQL Server database installed on any computer in your network. A SQL Server
account with the appropriate access privileges to the source database for the type of task you chose is
also required for use with AWS DMS.
AWS DMS supports migrating data from named instances of SQL Server. You can use the following
notation in the server name when you create the source endpoint.
IPAddress\InstanceName
For example, the following is a correct source endpoint server name. Here, the first part of the name
is the IP address of the server, and the second part is the SQL Server instance name (in this example,
SQLTest).
10.0.0.25\SQLTest
You can use SSL to encrypt connections between your SQL Server endpoint and the replication instance.
For more information on using SSL with a SQL Server endpoint, see Using SSL With AWS Database
Migration Service (p. 47).
To capture changes from a source SQL Server database, the database must be configured for full
backups and must be either the Enterprise, Developer, or Standard Edition.
For additional details on working with SQL Server source databases and AWS DMS, see the following.
Topics
• Limitations on Using SQL Server as a Source for AWS DMS (p. 100)
• Using Ongoing Replication (CDC) from a SQL Server Source (p. 101)
• Supported Compression Methods (p. 105)
• Working with SQL Server AlwaysOn Availability Groups (p. 105)
• Configuring a SQL Server Database as a Replication Source for AWS DMS (p. 105)
• Extra Connection Attributes When Using SQL Server as a Source for AWS DMS (p. 106)
• Source Data Types for SQL Server (p. 107)
• The identity property for a column is not migrated to a target database column.
• In AWS DMS engine versions before version 2.4.x, changes to rows with more than 8000 bytes of
information, including header and mapping information, are not processed correctly due to limitations
in the SQL Server TLOG buffer size. Use the latest AWS DMS version to avoid this issue.
• The SQL Server endpoint does not support the use of sparse tables.
• Windows Authentication is not supported.
• Changes to computed fields in a SQL Server are not replicated.
• Temporal tables are not supported.
• SQL Server partition switching is not supported.
• A clustered index on the source is created as a nonclustered index on the target.
• When using the WRITETEXT and UPDATETEXT utilities, AWS DMS does not capture events applied on
the source database.
• The following data manipulation language (DML) pattern is not supported:
For example, running the following command causes AWS DMS to fail:
USE [master]
GO
ALTER SERVER AUDIT [my_audit_test-20140710] WITH (STATE=on)
GO
AWS DMS supports ongoing replication for these SQL Server configurations:
• For source SQL Server instances that are on-premises or on Amazon EC2, AWS DMS supports ongoing
replication for SQL Server Enterprise, Standard, and Developer Edition.
• For source SQL Server instances running on Amazon RDS, AWS DMS supports ongoing replication for
SQL Server Enterprise through SQL Server 2016 SP1. Beyond this version, AWS DMS supports CDC for
both SQL Server Enterprise and Standard editions.
If you want AWS DMS to automatically set up the ongoing replication, the AWS DMS user account that
you use to connect to the source database must have the sysadmin fixed server role. If you don't want to
assign the sysadmin role to the user account you use, you can still use ongoing replication by following
the series of manual steps discussed following.
The following requirements apply specifically when using ongoing replication with a SQL Server
database as a source for AWS DMS:
• SQL Server must be configured for full backups, and you must perform a backup before beginning to
replicate data.
• The recovery model must be set to Bulk logged or Full.
• SQL Server backup to multiple disks isn't supported. If the backup is defined to write the database
backup to multiple files over different disks, AWS DMS can't read the data and the AWS DMS task fails.
• For self-managed SQL Server sources, be aware that SQL Server Replication Publisher definitions for
the source database used in a DMS CDC task aren't removed when you remove a task. A SQL Server
system administrator must delete these definitions from SQL Server for self-managed sources.
• During CDC, AWS DMS needs to look up SQL Server transaction log backups to read changes. AWS
DMS doesn't support using SQL Server transaction log backups that were created using third-party
backup software.
• For self-managed SQL Server sources, be aware that SQL Server doesn't capture changes on newly
created tables until they've been published. When tables are added to a SQL Server source, AWS DMS
manages creating the publication. However, this process might take several minutes. Operations made
to newly created tables during this delay aren't captured or replicated to the target.
• AWS DMS change data capture requires full logging to be turned on in SQL Server. To turn on
full logging in SQL Server, either enable MS-REPLICATION or CHANGE DATA CAPTURE (CDC).
• You can't reuse the SQL Server tlog until the changes have been processed.
• CDC operations aren't supported on memory-optimized tables. This limitation applies to SQL Server
2014 (when the feature was first introduced) and later.
For a self-managed SQL Server source, AWS DMS uses the following:
• MS-Replication, to capture changes for tables with primary keys. You can configure this automatically
by giving the AWS DMS endpoint user sysadmin privileges on the source SQL Server instance.
Alternatively, you can follow the steps provided in this section to prepare the source and use a non-
sysadmin user for the AWS DMS endpoint.
• MS-CDC, to capture changes for tables without primary keys. MS-CDC must be enabled at the
database level, and for all of the tables individually.
For a SQL Server source running on Amazon RDS, AWS DMS uses MS-CDC to capture changes for tables,
with or without primary keys. MS-CDC must be enabled at the database level, and for all of the tables
individually, using the Amazon RDS-specific stored procedures described in this section.
There are several ways you can use a SQL Server database for ongoing replication (CDC):
• Set up ongoing replication using the sysadmin role. (This applies only to self-managed SQL Server
sources.)
• Set up ongoing replication to not use the sysadmin role. (This applies only to self-managed SQL Server
sources.)
• Set up ongoing replication for an Amazon RDS for SQL Server DB instance.
First, enable MS-CDC for the database by running the following command. Use an account that has the
sysadmin role assigned to it.
use [DBname]
EXEC sys.sp_cdc_enable_db
Next, enable MS-CDC for each of the source tables by running the following command.
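The following is a minimal sketch of that command. The database, schema, and table names are
placeholders, and you typically run it once for each table that you want to replicate.
use [DBname]
GO
-- Enable MS-CDC on one source table (placeholder schema and table names)
EXEC sys.sp_cdc_enable_table
@source_schema = N'schema_name',
@source_name = N'table_name',
@role_name = NULL
GO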
For more information on setting up MS-CDC for specific tables, see the SQL Server documentation.
To set up a SQL Server database source for ongoing replication without using the sysadmin
role
1. Create a new SQL Server account with password authentication using SQL Server Management
Studio (SSMS). In this example, we use an account called dmstest.
2. In the User Mappings section of SSMS, choose the MSDB and MASTER databases (which gives public
permission) and assign the DB_OWNER role for the database you want to use ongoing replication.
3. Open the context (right-click) menu for the new account, choose Security and explicitly grant the
Connect SQL privilege.
4. Run the following grant commands.
5. In SSMS, open the context (right-click) menu for the Replication folder, and then choose Configure
Distribution. Follow all default steps and configure this SQL Server instance for distribution. A
distribution database is created under databases.
6. Create a publication using the procedure following.
7. Create a new AWS DMS task with SQL Server as the source endpoint using the user account you
created.
Note
The steps in this procedure apply only for tables with primary keys. You still need to enable MS-
CDC for tables without primary keys.
7. Expand Tables and select the tables that have a primary key (these are the tables that you want to publish). Choose Next.
8. You don't need to create a filter, so choose Next.
9. You don't need to create a Snapshot Agent, so choose Next.
10. Choose Security Settings and choose Run under the SQL Server Agent service account. Make sure
to choose By impersonating the process account for publisher connection. Choose OK.
11. Choose Next.
12. Choose Create the publication.
13. Provide a name of the publication in the following format:
Unlike self-managed SQL Server sources, Amazon RDS for SQL Server doesn't support MS-Replication.
Therefore, AWS DMS needs to use MS-CDC for tables with or without primary keys.
Amazon RDS does not grant sysadmin privileges for setting replication artifacts that AWS DMS uses
for on-going changes in a source SQL Server instance. You must enable MS-CDC on the Amazon RDS
instance using master user privileges in the following procedure.
2. For each table with a primary key, run the following query to enable MS-CDC.
exec sys.sp_cdc_enable_table
@source_schema = N'db_name',
@source_name = N'table_name',
@role_name = NULL,
@supports_net_changes = 1
GO
For each table with unique keys but no primary key, run the following query to enable MS-CDC.
exec sys.sp_cdc_enable_table
@source_schema = N'db_name',
@source_name = N'table_name',
@index_name = N'unique_index_name',
@role_name = NULL,
@supports_net_changes = 1
GO
For each table with no primary key nor unique keys, run the following query to enable MS-CDC.
exec sys.sp_cdc_enable_table
@source_schema = N'db_name',
@source_name = N'table_name',
@role_name = NULL
GO
3. Set the retention period for changes to be available on the source using the following command.
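One way to do this is to adjust the settings of the CDC capture job; the following sketch sets the
capture job polling interval, and the value shown is illustrative only.
-- Adjust the CDC capture job polling interval (illustrative value)
EXEC sys.sp_cdc_change_job @job_type = 'capture', @pollinginterval = 86399;
GO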
SQL Server Version Row/Page Compression (at Partition Level) Vardecimal Storage Format
2005 No No
2008 Yes No
2012 Yes No
2014 Yes No
Note
Sparse columns and columnar structure compression are not supported.
• Enable the Distribution option on all SQL Server instances in your Availability Replicas.
• In the AWS DMS console, open the SQL Server source database settings. For Server Name, specify
the Domain Name Service (DNS) name or IP address that was configured for the Availability Group
Listener.
When you start an AWS DMS task for the first time, it might take longer than usual to start because the
creation of the table articles is being duplicated by the Availability Groups Server.
To capture changes, the source SQL Server database must be configured for full
backups. In addition, AWS DMS must connect with a user (a SQL Server instance login) that has the
sysadmin fixed server role on the SQL Server database you are connecting to.
Following, you can find information about configuring SQL Server as a replication source for AWS DMS.
The following table shows the extra connection attributes you can use with SQL Server as a source:
Name Description
Default value:
RELY_ON_SQL_SERVER_REPLICATION_AGENT
Example: safeguardPolicy=
RELY_ON_SQL_SERVER_REPLICATION_AGENT
readBackupOnly Set this parameter to Y to have AWS DMS read changes from
transaction log backups rather than from the active
transaction log file. Doing this prevents
active transaction log file growth during full load and
ongoing replication tasks.
Example: readBackupOnly=Y
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
SQL Server Data Type AWS DMS Data Type
BIGINT INT8
BIT BOOLEAN
DECIMAL NUMERIC
INT INT4
MONEY NUMERIC
SMALLINT INT2
SMALLMONEY NUMERIC
TINYINT UINT1
REAL REAL4
FLOAT REAL8
DATETIME DATETIME
SMALLDATETIME DATETIME
DATE DATE
TIME TIME
DATETIMEOFFSET WSTRING
CHAR STRING
VARCHAR STRING
TEXT
NCHAR WSTRING
NTEXT
BINARY BYTES
VARBINARY BYTES
IMAGE
TIMESTAMP BYTES
UNIQUEIDENTIFIER STRING
XML NCLOB
AWS DMS doesn't support tables that include fields with the following data types:
• CURSOR
• SQL_VARIANT
• TABLE
Note
User-defined data types are supported according to their base type. For example, a user-defined
data type based on DATETIME is handled as a DATETIME data type.
For more information, see Using a Microsoft SQL Server Database as a Source for AWS DMS (p. 100).
Note
AWS DMS doesn't support change data capture operations (CDC) with Azure SQL Database.
You can use SSL to encrypt connections between your PostgreSQL endpoint and the replication instance.
For more information on using SSL with a PostgreSQL endpoint, see Using SSL With AWS Database
Migration Service (p. 47).
For a homogeneous migration from a PostgreSQL database to a PostgreSQL database on AWS, the
following is true:
• JSONB columns on the source are migrated to JSONB columns on the target.
• JSON columns are migrated as JSON columns on the target.
• HSTORE columns are migrated as HSTORE columns on the target.
For a heterogeneous migration with PostgreSQL as the source and a different database engine as the
target, the situation is different. In this case, JSONB, JSON, and HSTORE columns are converted to the
AWS DMS intermediate type of NCLOB and then translated to the corresponding NCLOB column type
on the target. In this case, AWS DMS treats JSONB data as if it were a LOB column. During the full load
phase of a migration, the target column must be nullable.
AWS DMS supports change data capture (CDC) for PostgreSQL tables with primary keys. If a table
doesn't have a primary key, the write-ahead logs (WAL) don't include a before image of the database row
and AWS DMS can't update the table.
AWS DMS supports CDC on Amazon RDS PostgreSQL databases when the DB instance is configured to
use logical replication. Amazon RDS supports logical replication for a PostgreSQL DB instance version
9.4.9 and higher and 9.5.4 and higher.
For additional details on working with PostgreSQL databases and AWS DMS, see the following sections.
Topics
• Migrating from PostgreSQL to PostgreSQL Using AWS DMS (p. 111)
• Prerequisites for Using a PostgreSQL Database as a Source for AWS DMS (p. 113)
• Security Requirements When Using a PostgreSQL Database as a Source for AWS DMS (p. 113)
• Limitations on Using a PostgreSQL Database as a Source for AWS DMS (p. 114)
• Setting Up an Amazon RDS PostgreSQL DB Instance as a Source (p. 115)
• Removing AWS DMS Artifacts from a PostgreSQL Source Database (p. 117)
• Additional Configuration Settings When Using a PostgreSQL Database as a Source for AWS
DMS (p. 117)
• Using PostgreSQL Version 10.x and Later as a Source for AWS DMS (p. 117)
• Extra Connection Attributes When Using PostgreSQL as a Source for AWS DMS (p. 119)
We recommend that you use native PostgreSQL database migration tools such as pg_dump under the
following conditions:
• You have a homogeneous migration, where you are migrating from a source PostgreSQL database to a
target PostgreSQL database.
• You are migrating an entire database.
• The native tools allow you to migrate your data with minimal downtime.
The pg_dump utility uses the COPY command to create a schema and data dump of a PostgreSQL
database. The dump script generated by pg_dump loads data into a database with the same name and
recreates the tables, indexes, and foreign keys. You can use the pg_restore command and the -d
parameter to restore the data to a database with a different name.
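The following shell sketch illustrates that flow. Host names, user names, and database names are
placeholders, and you might need additional options (for example, for SSL) in your environment.
# Create a custom-format dump of the source database
pg_dump -h source-host -U source_user -Fc mydb -f mydb.dump
# Restore the dump into a database with a different name on the target
pg_restore -h target-host -U target_user -d mynewdb mydb.dump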
For more information about importing a PostgreSQL database into Amazon RDS for PostgreSQL or
Amazon Aurora (PostgreSQL), see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide//
PostgreSQL.Procedural.Importing.html.
Data types that are supported on the source database but are not supported on the target may not
migrate successfully. AWS DMS streams some data types as strings if the data type is unknown. Some
data types, such as XML and JSON, can successfully migrate as small files but can fail if they are large
documents.
The following table shows source PostgreSQL data types and whether they can be migrated successfully:
INTEGER X
SMALLINT X
BIGINT X
REAL X
DOUBLE X
SMALLSERIAL X
SERIAL X
BIGSERIAL X
MONEY X
CHAR X (without specified precision)
CHAR(n) X
VARCHAR X (without specified precision)
VARCHAR(n) X
TEXT X
BYTEA X
TIMESTAMP X
TIMESTAMP(Z) X
DATE X
TIME X
TIME (z) X
INTERVAL X
BOOLEAN X
ENUM X
CIDR X
INET X
MACADDR X
TSVECTOR X
TSQUERY X
XML X
POINT X
LINE X
LSEG X
BOX X
PATH X
POLYGON X
CIRCLE X
JSON X
ARRAY X
COMPOSITE X
RANGE X
The max_replication_slots value should be set according to the number of tasks that you
want to run. For example, to run five tasks you need to set a minimum of five slots. Slots open
automatically as soon as a task starts and remain open even when the task is no longer running. You
need to manually delete open slots.
• Set max_wal_senders to a value greater than 1.
The max_wal_senders parameter sets the number of concurrent tasks that can run.
• Set wal_sender_timeout = 0.
The wal_sender_timeout parameter terminates replication connections that are inactive longer
than the specified number of milliseconds. Although the default is 60 seconds, we recommend that
you set this parameter to zero, which disables the timeout mechanism. A configuration sketch that
covers these parameters follows this list.
• The parameter idle_in_transaction_session_timeout in PostgreSQL versions 9.6 and later lets
you cause idle transactions to time out and fail. Some AWS DMS transactions are idle for some time
before the AWS DMS engine uses them again. Do not end idle transactions when you use AWS DMS.
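The following postgresql.conf sketch pulls the preceding settings together for a self-managed PostgreSQL
source. The values shown are illustrative; size max_replication_slots and max_wal_senders for the number
of AWS DMS tasks you plan to run. The wal_level line is an assumption added for completeness, because
logical decoding requires it even though it isn't listed in the items above.
wal_level = logical              # assumption: required for logical decoding
max_replication_slots = 5        # at least one slot for each AWS DMS task you plan to run
max_wal_senders = 5              # number of concurrent WAL sender processes
wal_sender_timeout = 0           # disable the timeout so idle AWS DMS connections aren't dropped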
• A captured table must have a primary key. If a table doesn't have a primary key, AWS DMS ignores
DELETE and UPDATE record operations for that table.
• Timestamp with a time zone type column is not supported.
• AWS DMS ignores an attempt to update a primary key segment. In these cases, the target identifies
the update as one that didn't update any rows. However, because the results of updating a primary key
in PostgreSQL are unpredictable, no records are written to the exceptions table.
• AWS DMS doesn't support the Start Process Changes from Timestamp run option.
• AWS DMS supports full load and change processing on Amazon RDS for PostgreSQL. For information
on how to prepare a PostgreSQL DB instance and to set it up for using CDC, see Setting Up an Amazon
RDS PostgreSQL DB Instance as a Source (p. 115).
• Replication of multiple tables with the same name but where each name has a different case (for
example table1, TABLE1, and Table1) can cause unpredictable behavior, and therefore AWS DMS
doesn't support it.
• AWS DMS supports change processing of CREATE, ALTER, and DROP DDL statements for tables unless
the tables are held in an inner function or procedure body block or in other nested constructs.
Note
To replicate partitioned tables from a PostgreSQL source to a PostgreSQL target, you first need
to manually create the parent and child tables on the target. Then you define a separate task
to replicate to those tables. In such a case, you set the task configuration to Truncate before
loading.
Note
The PostgreSQL NUMERIC data type is not fixed in size. When transferring data of the
NUMERIC data type without a specified precision and scale, DMS uses NUMERIC(28,6) (a precision of
28 and a scale of 6) by default. For example, the value 0.611111104488373 from the source is
converted to 0.611111 on the PostgreSQL target.
You use the AWS master user account for the PostgreSQL DB instance as the user account for the
PostgreSQL source endpoint for AWS DMS. The master user account has the required roles that allow
it to set up change data capture (CDC). If you use an account other than the master user account, the
account must have the rds_superuser role and the rds_replication role. The rds_replication role grants
permissions to manage logical slots and to stream data using logical slots.
If you don't use the master user account for the DB instance, you must create several objects from the
master user account for the account that you use. For information about creating the needed objects, see
Migrating an Amazon RDS for PostgreSQL Database Without Using the Master User Account (p. 115).
• In general, use the AWS master user account for the PostgreSQL DB instance as the user account for
the PostgreSQL source endpoint. The master user account has the required roles that allow it to set up
CDC. If you use an account other than the master user account, you must create several objects from
the master account for the account that you use. For more information, see Migrating an Amazon RDS
for PostgreSQL Database Without Using the Master User Account (p. 115).
• Set the rds.logical_replication parameter in your DB parameter group to 1. This is a
static parameter that requires a reboot of the DB instance for the parameter to take effect.
As part of applying this parameter, AWS DMS sets the wal_level, max_wal_senders,
max_replication_slots, and max_connections parameters. These parameter changes can
increase WAL generation, so set the rds.logical_replication parameter only when
you are using logical slots. (An AWS CLI example of setting this parameter follows this list.)
• A best practice is to set the wal_sender_timeout parameter to 0. Setting this parameter to 0
prevents PostgreSQL from terminating replication connections that are inactive longer than the
specified timeout. When AWS DMS is migrating data, replication connections need to be able to last
longer than the specified timeout.
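For example, the following AWS CLI command is one way to set rds.logical_replication in a DB parameter
group. The parameter group name is a placeholder, and because the parameter is static, the change takes
effect after the next reboot of the DB instance.
# Enable logical replication in the DB parameter group attached to the source instance
aws rds modify-db-parameter-group \
    --db-parameter-group-name my-postgres-parameters \
    --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"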
Migrating an Amazon RDS for PostgreSQL Database Without Using the Master
User Account
If you don't use the master user account for the Amazon RDS PostgreSQL DB instance that you are using
as a source, you need to create several objects to capture data definition language (DDL) events. You
create these objects in the account other than the master account and then create a trigger in the master
user account.
Note
If you set the captureDDL parameter to N on the source endpoint, you don't have to create the
following table and trigger on the source database.
Use the following procedure to create these objects. The user account other than the master account is
referred to as the NoPriv account in this procedure.
To create objects
1. Choose the schema where the objects are to be created. The default schema is public. Ensure that
the schema exists and is accessible by the NoPriv account.
2. Log in to the PostgreSQL DB instance using the NoPriv account.
3. Create the table awsdms_ddl_audit by running the following command, replacing
<objects_schema> in the code following with the name of the schema to use.
5. Log out of the NoPriv account and log in with an account that has the rds_superuser role
assigned to it.
When you have completed the procedure preceding, you can create the AWS DMS source endpoint using
the NoPriv account.
Note
Dropping a schema should be done with extreme caution, if at all. Never drop an operational
schema, especially not a public one.
• You can add values to the extra connection attribute to capture DDL events and to specify the
schema in which the operational DDL database artifacts are created. For more information, see Extra
Connection Attributes When Using PostgreSQL as a Source for AWS DMS (p. 119).
• You can override connection string parameters. Select this option if you need to do either of the
following:
• Specify internal AWS DMS parameters. Such parameters are rarely required and are therefore not
exposed in the user interface.
• Specify pass-through (passthru) values for the specific database client. AWS DMS includes pass-
through parameters in the connection string passed to the database client.
PostgreSQL version 10.x renamed several system functions (for example, functions with "xlog" in their
names now use "wal"). Because most of the name changes are superficial, AWS DMS has created wrapper
functions that let AWS DMS work with PostgreSQL version 10.x and later. The wrapper functions are prioritized higher than
functions in pg_catalog. In addition, we ensure that schema visibility of existing schemas isn't changed
so that we don't override any other system catalog functions such as user-defined functions.
To use these wrapper functions before you perform any migration tasks, run the following SQL code on
the source PostgreSQL database. Use the same AWS DMS user account as you are using for the target
database.
BEGIN;
CREATE SCHEMA IF NOT EXISTS fnRenames;
CREATE OR REPLACE FUNCTION fnRenames.pg_switch_xlog() RETURNS pg_lsn AS $$
SELECT pg_switch_wal(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_replay_pause() RETURNS VOID AS $$
SELECT pg_wal_replay_pause(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_replay_resume() RETURNS VOID AS $$
SELECT pg_wal_replay_resume(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_location() RETURNS pg_lsn AS $$
SELECT pg_current_wal_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_is_xlog_replay_paused() RETURNS boolean AS $$
SELECT pg_is_wal_replay_paused(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlogfile_name(lsn pg_lsn) RETURNS TEXT AS $$
SELECT pg_walfile_name(lsn); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_last_xlog_replay_location() RETURNS pg_lsn AS $$
SELECT pg_last_wal_replay_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_last_xlog_receive_location() RETURNS pg_lsn AS $$
SELECT pg_last_wal_receive_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_flush_location() RETURNS pg_lsn AS $$
SELECT pg_current_wal_flush_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_insert_location() RETURNS pg_lsn AS $$
SELECT pg_current_wal_insert_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_location_diff(lsn1 pg_lsn, lsn2 pg_lsn)
RETURNS NUMERIC AS $$
SELECT pg_wal_lsn_diff(lsn1, lsn2); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlogfile_name_offset(lsn pg_lsn, OUT TEXT, OUT
INTEGER) AS $$
SELECT pg_walfile_name_offset(lsn); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_create_logical_replication_slot(slot_name name,
plugin name,
temporary BOOLEAN DEFAULT FALSE, OUT slot_name name, OUT xlog_position pg_lsn) RETURNS
RECORD AS $$
SELECT slot_name::NAME, lsn::pg_lsn FROM
pg_catalog.pg_create_logical_replication_slot(slot_name, plugin,
temporary); $$ LANGUAGE SQL;
ALTER USER <user name> SET search_path = fnRenames, pg_catalog, "$user", public;
Note
If you do not invoke this preparatory code on a source PostgreSQL 10.x database, an error is
raised like this.
The following table shows the extra connection attributes you can use when using PostgreSQL as a
source for AWS DMS:
Name Description
Default value: Y
Example: captureDDLs=Y
Example: ddlArtifactsSchema=xyzddlschema
failTasksOnLobTruncation When set to true, this value causes a task to fail if the
actual size of a LOB column is greater than the specified
LobMaxSize.
Example: failTasksOnLobTruncation=true
Example: executeTimeout=100
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
PostgreSQL Data Type AWS DMS Data Type
INTEGER INT4
SMALLINT INT2
BIGINT INT8
REAL REAL4
DOUBLE REAL8
SMALLSERIAL INT2
SERIAL INT4
BIGSERIAL INT8
MONEY NUMERIC(38,4)
TEXT NCLOB
BYTEA BLOB
TIMESTAMP TIMESTAMP
DATE DATE
TIME TIME
UUID STRING
TSVECTOR CLOB
TSQUERY CLOB
XML CLOB
JSON NCLOB
JSONB NCLOB
ARRAY NCLOB
COMPOSITE NCLOB
HSTORE NCLOB
You can use SSL to encrypt connections between your MySQL-compatible endpoint and the replication
instance. For more information on using SSL with a MySQL-compatible endpoint, see Using SSL With
AWS Database Migration Service (p. 47).
In the following sections, the term "self-managed" applies to any database that is installed either on-
premises or on Amazon EC2. The term "Amazon-managed" applies to any database on Amazon RDS,
Amazon Aurora, or Amazon Simple Storage Service.
For additional details on working with MySQL-compatible databases and AWS DMS, see the following
sections.
Topics
• Migrating from MySQL to MySQL Using AWS DMS (p. 122)
• Using Any MySQL-Compatible Database as a Source for AWS DMS (p. 124)
• Using a Self-Managed MySQL-Compatible Database as a Source for AWS DMS (p. 124)
• Using an Amazon-Managed MySQL-Compatible Database as a Source for AWS DMS (p. 125)
• Limitations on Using a MySQL Database as a Source for AWS DMS (p. 126)
• Extra Connection Attributes When Using MySQL as a Source for AWS DMS (p. 127)
• Source Data Types for MySQL (p. 128)
We recommend that you use native MySQL database migration tools such as mysqldump under the
following conditions:
• You have a homogeneous migration, where you are migrating from a source MySQL database to a
target MySQL database.
• You are migrating an entire database.
• The native tools allow you to migrate your data with minimal downtime.
You can import data from an existing MySQL or MariaDB database to an Amazon RDS MySQL or MariaDB
DB instance. You do so by copying the database with mysqldump and piping it directly into the Amazon
RDS MySQL or MariaDB DB instance. The mysqldump command-line utility is commonly used to make
backups and transfer data from one MySQL or MariaDB server to another. It is included with MySQL and
MariaDB client software.
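The following shell sketch illustrates the approach. Host names, user names, the RDS endpoint, and the
database name are placeholders; you are prompted for passwords because of the -p options.
# Dump the source database and pipe it directly into the Amazon RDS MySQL DB instance
mysqldump -h source-host -u source_user -p mydb | mysql -h my-rds-endpoint -u rds_master_user -p mydb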
For more information about importing a MySQL database into Amazon RDS for MySQL or Amazon
Aurora (MySQL), see Importing Data into a MySQL DB Instance and Importing Data from a MySQL or
MariaDB DB to an Amazon RDS MySQL or MariaDB DB Instance.
Data types that are supported on the source database but are not supported on the target may not
migrate successfully. AWS DMS streams some data types as strings if the data type is unknown. Some
data types, such as XML and JSON, can successfully migrate as small files but can fail if they are large
documents.
The following table shows source MySQL data types and whether they can be migrated successfully:
INT X
BIGINT X
MEDIUMINT X
TINYINT X
DECIMAL(p,s) X
BINARY X
BIT(M) X
BLOB X
LONGBLOB X
MEDIUMBLOB X
TINYBLOB X
DATE X
DATETIME X
TIME X
TIMESTAMP X
YEAR X
DOUBLE X
FLOAT X
VARCHAR(N) X
VARBINARY(N) X
CHAR(N) X
TEXT X
LONGTEXT X
MEDIUMTEXT X
TINYTEXT X
GEOMETRY X
POINT X
LINESTRING X
POLYGON X
MULTILINESTRING X
MULTIPOLYGON X
GEOMETRYCOLLECTION X
ENUM X
SET X
You must have an account for AWS DMS that has the Replication Admin role. The role needs the
following privileges:
• REPLICATION CLIENT – This privilege is required for change data capture (CDC) tasks only. In other
words, full-load-only tasks don't require this privilege.
• REPLICATION SLAVE – This privilege is required for change data capture (CDC) tasks only. In other
words, full-load-only tasks don't require this privilege.
• SUPER – This privilege is required only in MySQL versions before 5.6.6.
The AWS DMS user must also have SELECT privileges for the source tables designated for replication.
You must enable binary logging if you plan to use change data capture (CDC). To enable binary logging,
the following parameters must be configured in MySQL’s my.ini (Windows) or my.cnf (UNIX) file.
Parameter Value
log-bin Set the path to the binary log file, such as log-bin=E:\MySql_Logs\BinLog.
Don't include the file extension.
log_slave_updates Set this parameter to TRUE if you are using a MySQL or MariaDB read-replica
as a source.
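The following my.cnf (or my.ini on Windows) sketch shows these settings in context. The paths and values
are illustrative; the server-id and binlog_format lines are assumptions added for completeness because
they aren't part of the table above, so confirm them for your environment.
[mysqld]
server-id          = 1                       # assumption: a unique server ID when binary logging is enabled
log-bin            = /var/lib/mysql/bin-log  # path to the binary log file, without a file extension
binlog_format      = ROW                     # assumption: row-based logging for change data capture
log_slave_updates  = TRUE                    # only if the source is a MySQL or MariaDB read replica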
If your source uses the NDB (clustered) database engine, the following parameters must be configured to
enable CDC on tables that use that storage engine. Add these changes in MySQL’s my.ini (Windows) or
my.cnf (UNIX) file.
Parameter Value
ndb_log_bin Set this parameter to ON. This value ensures that changes in clustered tables
are logged to the binary log.
ndb_log_update_as_write Set this parameter to OFF. This value prevents writing UPDATE statements
as INSERT statements in the binary log.
ndb_log_updated_only Set this parameter to OFF. This value ensures that the binary log contains
the entire row and not just the changed columns.
When using an Amazon-managed MySQL-compatible database as a source for AWS DMS, make sure that
you have the following prerequisites:
• You must enable automatic backups. For more information on setting up automatic backups, see
Working with Automated Backups in the Amazon RDS User Guide.
• You must enable binary logging if you plan to use change data capture (CDC). For more information on
setting up binary logging for an Amazon RDS MySQL database, see Working with Automated Backups
in the Amazon RDS User Guide.
• You must ensure that the binary logs are available to AWS DMS. Because Amazon-managed MySQL-
compatible databases purge the binary logs as soon as possible, you should increase the length of time
that the logs remain available. For example, to increase log retention to 24 hours, run the following
command.
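For example, the following statement, run on the Amazon RDS MySQL instance, increases binary log
retention to 24 hours. Adjust the hours value for your migration window.
-- Keep binary logs available to AWS DMS for 24 hours
call mysql.rds_set_configuration('binlog retention hours', 24);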
• Change data capture (CDC) is not supported for Amazon RDS MySQL 5.5 or lower. For Amazon RDS
MySQL, you must use version 5.6 or higher to enable CDC.
• The data definition language (DDL) statements DROP TABLE and RENAME TABLE are not supported.
Additionally, all DDL statements for partitioned tables are not supported.
• For partitioned tables on the source, when you set Target table preparation mode to Drop tables
on target, AWS DMS creates a simple table without any partitions on the MySQL target. To migrate
partitioned tables to a partitioned table on the target, pre-create the partitioned tables on the target
MySQL database.
• Using an ALTER TABLE <table_name> ADD COLUMN <column_name> statement to add columns to the
beginning (FIRST) or the middle of a table (AFTER) is not supported. Columns are always added to the
end of the table.
• CDC is not supported when a table name contains uppercase and lowercase characters, and the source
engine is hosted on an operating system with case-insensitive file names. An example is Windows or
OS X using HFS+.
• The AR_H_USER header column is not supported.
• The AUTO_INCREMENT attribute on a column is not migrated to a target database column.
• Capturing changes when the binary logs are not stored on standard block storage is not supported. For
example, CDC doesn't work when the binary logs are stored on Amazon Simple Storage Service.
• AWS DMS creates target tables with the InnoDB storage engine by default. If you need to use a storage
engine other than InnoDB, you must manually create the table and migrate to it using “do nothing”
mode.
• You can't use Aurora MySQL read replicas as a source for AWS DMS.
• If the MySQL-compatible source is stopped during full load, the AWS DMS task doesn't stop with an
error. The task ends successfully, but the target might be out of sync with the source. If this happens,
either restart the task or reload the affected tables.
• Indexes created on a portion of a column value aren't migrated. For example, the index CREATE INDEX
first_ten_chars ON customer (name(10)) isn't created on the target.
• In some cases, the task is configured to not replicate LOBs ("SupportLobs" is false in task settings or
“Don't include LOB columns” is checked in the task console). In these cases, AWS DMS doesn't migrate
any MEDIUMBLOB, LONGBLOB, MEDIUMTEXT, and LONGTEXT columns to the target.
BLOB, TINYBLOB, TEXT, and TINYTEXT columns are not affected and are migrated to the target.
The following table shows the extra connection attributes available when using Amazon RDS MySQL as a
source for AWS DMS.
Name Description
eventsPollInterval Specifies how often to check the binary log for new
changes/events when the database is idle.
Default value: 5
Valid values: 1 - 60
Example: eventsPollInterval=5
initstmt=SET time_zone Specifies the time zone for the source MySQL database.
Timestamps are translated to the specified time zone.
CleanSrcMetadataOnMismatch Cleans and recreates table metadata information on the replication
instance when a mismatch occurs. An example follows.
Example: CleanSrcMetadataOnMismatch=false
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
MySQL Data Types AWS DMS Data Types
INT INT4
MEDIUMINT INT4
BIGINT INT8
TINYINT INT1
BINARY BYTES(1)
BIT BOOLEAN
BIT(64) BYTES(8)
BLOB BYTES(65535)
LONGBLOB BLOB
MEDIUMBLOB BLOB
TINYBLOB BYTES(255)
DATE DATE
DATETIME DATETIME
TIME STRING
TIMESTAMP DATETIME
YEAR INT2
DOUBLE REAL8
FLOAT REAL(DOUBLE)
CHAR WSTRING
LONGTEXT NCLOB
MEDIUMTEXT NCLOB
GEOMETRY BLOB
POINT BLOB
LINESTRING BLOB
POLYGON BLOB
MULTIPOINT BLOB
MULTILINESTRING BLOB
MULTIPOLYGON BLOB
GEOMETRYCOLLECTION BLOB
Note
If the DATETIME and TIMESTAMP data types are specified with a "zero" value (that is,
0000-00-00), make sure that the target database in the replication task supports "zero" values
for the DATETIME and TIMESTAMP data types. Otherwise, these values are recorded as null on
the target.
The following MySQL data types are supported in full load only.
ENUM STRING
SET STRING
For additional details on working with SAP ASE databases and AWS DMS, see the following sections.
Topics
• Prerequisites for Using an SAP ASE Database as a Source for AWS DMS (p. 130)
• Limitations on Using SAP ASE as a Source for AWS DMS (p. 130)
• Permissions Required for Using SAP ASE as a Source for AWS DMS (p. 131)
• Removing the Truncation Point (p. 131)
• Source Data Types for SAP ASE (p. 131)
• Enable SAP ASE replication for tables by using the sp_setreptable command.
• Disable RepAgent on the SAP ASE database.
• To replicate to SAP ASE version 15.7 on an Amazon EC2 instance on Microsoft Windows configured for
non-Latin characters (for example, Chinese), install SAP ASE 15.7 SP121 on the target computer.
• Only one AWS DMS task can be run for each SAP ASE database.
• You can't rename a table. For example, the following command fails:
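A representative example that fails (the table names are hypothetical):
sp_rename 'SalesRegion', 'SalesReg'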
• You can't rename a column. For example, the following command fails:
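A representative example that fails (the table and column names are hypothetical):
sp_rename 'SalesRegion.RegionID', 'RegID'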
• Zero values located at the end of binary data type strings are truncated when replicated to the target
database. For example, 0x0000000000000000000000000100000100000000 in the source table
becomes 0x00000000000000000000000001000001 in the target table.
• If the database default is set not to allow NULL values, AWS DMS creates the target table with columns
that don't allow NULL values. Consequently, if a full load or change data capture (CDC) replication
task contains empty values, AWS DMS throws an error. You can prevent these errors by allowing NULL
values in the source database by using the following commands.
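For example, commands along the following lines (run from isql with a placeholder database name and
the appropriate permissions) allow NULL values by default.
sp_dboption <database_name>, 'allow nulls by default', 'true'
go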
The user account that AWS DMS uses to connect to the SAP ASE source database must be granted the
following roles:
• sa_role
• replication_role
• sybase_ts_role
• If you enable the Automatically enable Sybase replication option (in the Advanced tab) when you
create the SAP ASE source endpoint, also give permission to AWS DMS to run the stored procedure
sp_setreptable.
After the $replication_truncation_point entry is established, keep the AWS DMS task running
to prevent the database log from becoming excessively large. If you want to stop the AWS DMS task
permanently, remove the replication truncation point by issuing the following command:
dbcc settrunc('ltm','ignore')
After the truncation point is removed, you can't resume the AWS DMS task. The log continues to be
truncated automatically at the checkpoints (if automatic truncation is set).
For information on how to view the data type that is mapped in the target, see the Targets for Data
Migration (p. 147) section for your target endpoint.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
SAP ASE Data Types AWS DMS Data Types
BIGINT INT8
BINARY BYTES
BIT BOOLEAN
CHAR STRING
DATE DATE
DATETIME DATETIME
DECIMAL NUMERIC
DOUBLE REAL8
FLOAT REAL8
IMAGE BLOB
INT INT4
MONEY NUMERIC
NCHAR WSTRING
NUMERIC NUMERIC
NVARCHAR WSTRING
REAL REAL4
SMALLDATETIME DATETIME
SMALLINT INT2
SMALLMONEY NUMERIC
TEXT CLOB
TIME TIME
TINYINT UINT1
UNITEXT NCLOB
UNIVARCHAR UNICODE
VARBINARY BYTES
VARCHAR STRING
If you are new to MongoDB, be aware of the following important MongoDB database concepts:
• A record in MongoDB is a document, which is a data structure composed of field and value pairs. The
value of a field can include other documents, arrays, and arrays of documents. A document is roughly
equivalent to a row in a relational database table.
• A collection in MongoDB is a group of documents, and is roughly equivalent to a relational database
table.
• Internally, a MongoDB document is stored as a binary JSON (BSON) file in a compressed format that
includes a type for each field in the document. Each document has a unique ID.
AWS DMS supports two migration modes when using MongoDB as a source. You specify the migration
mode using the Metadata mode parameter using the AWS Management Console or the extra connection
attribute nestingLevel when you create the MongoDB endpoint. The choice of migration mode affects
the resulting format of the target data as explained following.
Document mode
In document mode, the MongoDB document is migrated as is, meaning that the document data
is consolidated into a single column named _doc in a target table. Document mode is the default
setting when you use MongoDB as a source endpoint.
For example, consider the following documents in a MongoDB collection called myCollection.
> db.myCollection.find()
{ "_id" : ObjectId("5a94815f40bd44d1b02bdfe0"), "a" : 1, "b" : 2, "c" : 3 }
{ "_id" : ObjectId("5a94815f40bd44d1b02bdfe1"), "a" : 4, "b" : 5, "c" : 6 }
After migrating the data to a relational database table using document mode, the data is structured
as follows. The data fields in the MongoDB document are consolidated into the _doc column.
oid_id _doc
5a94815f40bd44d1b02bdfe0 { "a" : 1, "b" : 2, "c" : 3 }
5a94815f40bd44d1b02bdfe1 { "a" : 4, "b" : 5, "c" : 6 }
You can optionally set the extra connection attribute extractDocID to true to create a second
column named "_id" that acts as the primary key. If you are going to use change data capture
(CDC), set this parameter to true.
In document mode, AWS DMS manages the creation and renaming of collections like this:
• If you add a new collection to the source database, AWS DMS creates a new target table for the
collection and replicates any documents.
• If you rename an existing collection on the source database, AWS DMS doesn't rename the target
table.
Table mode
In table mode, AWS DMS transforms each top-level field in a MongoDB document into a column in
the target table. If a field is nested, AWS DMS flattens the nested values into a single column. AWS
DMS then adds a key field and data types to the target table's column set.
For each MongoDB document, AWS DMS adds each key and type to the target table’s column set.
For example, using table mode, AWS DMS migrates the previous example into the following table.
oid_id a b c
5a94815f40bd44d1b02bdfe0
1 2 3
5a94815f40bd44d1b02bdfe1
4 5 6
Nested values are flattened into a column containing dot-separated key names. The column is
named the concatenation of the flattened field names separated by periods. For example, AWS DMS
migrates a JSON document with a field of nested values such as {"a" : {"b" : {"c": 1}}} into
a column named a.b.c.
To create the target columns, AWS DMS scans a specified number of MongoDB documents and
creates a set of all the fields and their types. AWS DMS then uses this set to create the columns of
the target table. If you create or modify your MongoDB source endpoint using the console, you can
specify the number of documents to scan. The default value is 1000 documents. If you use the AWS
CLI, you can use the extra connection attribute docsToInvestigate.
In table mode, AWS DMS manages documents and collections like this:
• When you add a document to an existing collection, the document is replicated. If there are fields
that don't exist in the target, those fields aren't replicated.
• When you update a document, the updated document is replicated. If there are fields that don't
exist in the target, those fields aren't replicated.
• Deleting a document is fully supported.
• Adding a new collection doesn't result in a new table on the target when done during a CDC task.
• Renaming a collection is not supported.
The following code creates a user with root privileges in the admin database.
use admin
db.createUser(
{
user: "root",
pwd: "<password>",
roles: [ { role: "root", db: "admin" } ]
}
)
The following code creates a user with minimal privileges on the database to be migrated.
use <database_to_migrate>
db.createUser(
{
user: "<dms-user>",
pwd: "<password>",
roles: [ { role: "read", db: "local" }, "read"]
})
You can use CDC with either the primary or secondary node of a MongoDB replica set as the source
endpoint.
Connect to the MongoDB instance by using the mongo shell, for example:
mongo localhost
Then test the connection to the replica set using the following commands:
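For example, the standard mongo shell command rs.status() returns the current replica set status:
rs.status()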
If you plan to perform a document mode migration, select option _id as a separate column when
you create the MongoDB endpoint. Selecting this option creates a second column named _id that acts
as the primary key. This second column is required by AWS DMS to support data manipulation language
(DML) operations.
If an authentication method is not specified, AWS DMS uses the default method for the version of the
MongoDB source.
• When the _id option is set as a separate column, the ID string can't exceed 200 characters.
• Object ID and array type keys are converted to columns that are prefixed with oid and array in table
mode.
Internally, these columns are referenced with the prefixed names. If you use transformation rules
in AWS DMS that reference these columns, you must specify the prefixed column. For example, you
specify ${oid__id} and not ${_id}, or ${array__addresses} and not ${_addresses}.
• Collection names can't include the dollar symbol ($).
• Table mode and document mode have the limitations discussed preceding.
Extra Connection Attributes When Using MongoDB as a Source for AWS DMS
When you set up your MongoDB source endpoint, you can specify extra connection attributes. Extra
connection attributes are specified by key-value pairs and separated by semicolons.
The following table describes the extra connection attributes available when using MongoDB databases
as an AWS DMS source.
Name Description
docsToInvestigate A positive integer greater than 0. Default value: 1000. Use this attribute when
nestingLevel is set to ONE.
authSource A valid MongoDB database name. Default value: admin. This attribute isn't used when
authType=NO.
Note
If the source endpoint is MongoDB, then the following extra connection attributes must be
enabled:
• nestingLevel=NONE
• extractDocID=FALSE
For more information, see Using Amazon DocumentDB as a Target for AWS Database Migration
Service (p. 198).
The following table shows the MongoDB source data types that are supported when using AWS DMS
and the default mapping from AWS DMS data types. For more information about MongoDB data types,
see BSON Types in the MongoDB documentation.
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint that you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
MongoDB Data Types AWS DMS Data Types
Boolean Bool
Binary BLOB
Date Date
Timestamp Date
Int INT4
Long INT8
Double REAL8
Array CLOB
OID String
REGEX CLOB
CODE CLOB
The source data files must be present in the Amazon S3 bucket before the full load starts. You specify
the bucket name using the bucketName parameter.
The source data files must be in comma separated value (CSV) format. Name them using the naming
convention shown following. In this convention, schemaName is the source schema and tableName is
the name of a table within that schema.
/schemaName/tableName/LOAD001.csv
/schemaName/tableName/LOAD002.csv
/schemaName/tableName/LOAD003.csv
...
For example, suppose that your data files are in mybucket, at the following Amazon S3 path.
s3://mybucket/hr/employee
At load time, AWS DMS assumes that the source schema name is hr, and that the source table name is
employee.
In addition to bucketName (which is required), you can optionally provide a bucketFolder parameter
to specify where AWS DMS should look for data files in the Amazon S3 bucket. Continuing the previous
example, if you set bucketFolder to sourcedata, then AWS DMS reads the data files at the following
path.
s3://mybucket/sourcedata/hr/employee
You can specify the column delimiter, row delimiter, null value indicator, and other parameters using
extra connection attributes. For more information, see Extra Connection Attributes for Amazon S3 as a
Source for AWS DMS (p. 142).
Suppose that you have a data file that includes the following.
101,Smith,Bob,4-Jun-14,New York
102,Smith,Bob,8-Oct-15,Los Angeles
103,Smith,Bob,13-Mar-17,Dallas
104,Smith,Bob,13-Mar-17,Dallas
{
"TableCount": "1",
"Tables": [
{
"TableName": "employee",
"TablePath": "hr/employee/",
"TableOwner": "hr",
"TableColumns": [
{
"ColumnName": "Id",
"ColumnType": "INT8",
"ColumnNullable": "false",
"ColumnIsPk": "true"
},
{
"ColumnName": "LastName",
"ColumnType": "STRING",
"ColumnLength": "20"
},
{
"ColumnName": "FirstName",
"ColumnType": "STRING",
"ColumnLength": "30"
},
{
"ColumnName": "HireDate",
"ColumnType": "DATETIME"
},
{
"ColumnName": "OfficeLocation",
"ColumnType": "STRING",
"ColumnLength": "20"
}
],
"TableColumnsTotal": "5"
}
]
}
TableCount—the number of source tables. In this example, there is only one table.
Tables—an array consisting of one JSON map per source table. In this example, there is only one map.
Each map consists of the following elements:
In the example preceding, some of the columns are of type STRING. In this case, use the ColumnLength
element to specify the maximum number of characters.
You can use the ColumnLength element for the following data types:
• BYTE
• STRING
If you don't specify otherwise, AWS DMS assumes that ColumnLength is zero.
For a column of the NUMERIC type, you need to specify the precision and scale. Precision is the total
number of digits in a number, and scale is the number of digits to the right of the decimal point. You use
the ColumnPrecision and ColumnScale elements for this, as shown following.
...
{
"ColumnName": "HourlyRate",
"ColumnType": "NUMERIC",
"ColumnPrecision": "5"
"ColumnScale": "2"
}
...
Change data capture (CDC) files are named sequentially, as shown following.
CDC00001.csv
CDC00002.csv
CDC00003.csv
...
To indicate where AWS DMS can find the files, you must specify the cdcPath parameter. Continuing the
previous example, if you set cdcPath to changedata, then AWS DMS reads the CDC files at the following
path.
s3://mybucket/changedata
• Operation—the change operation to be performed: INSERT, UPDATE, or DELETE. These keywords are
case-insensitive.
• Table name—the name of the source table.
• Schema name—the name of the source schema.
• Data—one or more columns that represent the data to be changed.
INSERT,employee,hr,101,Smith,Bob,4-Jun-14,New York
UPDATE,employee,hr,101,Smith,Bob,8-Oct-15,Los Angeles
UPDATE,employee,hr,101,Smith,Bob,13-Mar-17,Dallas
DELETE,employee,hr,101,Smith,Bob,13-Mar-17,Dallas
The AWS Identity and Access Management (IAM) role assigned to the user account used to create the
migration task must have the following set of permissions.
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::mybucket*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::mybucket*"
]
}
]
}
Option Description
bucketFolder An optional parameter that sets a folder name in the S3 bucket. If it's provided,
AWS DMS reads the source data files from <bucketFolder>/<schemaName>/<tableName>/.
An example follows.
bucketFolder=testFolder;
bucketName The name of the S3 bucket. This parameter is required. An example follows.
bucketName=buckettest;
cdcPath The location of change data capture (CDC) files. This attribute is required if
a task captures change data; otherwise, it's optional. If cdcPath is present,
then AWS DMS reads CDC files from this path and replicate the data changes
to the target endpoint. For more information, see Using CDC with Amazon
S3 as a Source for AWS DMS (p. 141). An example follows.
cdcPath=dataChanges;
csvRowDelimiter The delimiter used to separate rows in the source files. The default is a
carriage return (\r). An example follows.
csvRowDelimiter=\n;
csvDelimiter The delimiter used to separate columns in the source files. The default is a
comma. An example follows.
csvDelimiter=,;
Option Description
externalTableDefinition A JSON object that describes how AWS DMS should interpret the data in
the Amazon S3 bucket during the migration. For more information, see
Defining External Tables for Amazon S3 as a Source for AWS DMS (p. 139).
An example follows.
externalTableDefinition=<json_object>
ignoreHeaderRows When set to 1, AWS DMS ignores the first row header in a CSV file. A value of
1 enables the feature, a value of 0 disables the feature. The default is 0.
ignoreHeaderRows=1
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
BYTE
Requires ColumnLength. For more information, see Defining External Tables for Amazon S3 as a
Source for AWS DMS (p. 139).
DATE
TIME
DATETIME
TIMESTAMP
INT1
INT2
INT4
INT8
NUMERIC
Requires ColumnPrecision and ColumnScale. For more information, see Defining External Tables
for Amazon S3 as a Source for AWS DMS (p. 139).
REAL4
REAL8
STRING
UINT1
UINT2
UINT4
UINT8
BLOB
CLOB
BOOLEAN
You can use SSL to encrypt connections between your Db2 LUW endpoint and the replication instance.
You must be using AWS DMS engine version 2.4.2 or higher to use SSL. For more information on using
SSL with a Db2 LUW endpoint, see Using SSL With AWS Database Migration Service (p. 47).
To enable ongoing replication, also called change data capture (CDC), you must do the following:
• The database must be set to be recoverable. To capture changes, AWS DMS requires that the
database is configured to be recoverable. A database is recoverable if either or both of the database
configuration parameters LOGARCHMETH1 and LOGARCHMETH2 are set to ON.
• The user account must be granted the following permissions:
SYSADM or DBADM
DATAACCESS
• When truncating a table with multiple partitions, the number of DDL events shown in the AWS DMS
console will be equal to the number of partitions. This is because Db2 LUW records a separate DDL for
each partition.
• The following DDL actions are not supported on partitioned tables:
• ALTER TABLE ADD PARTITION
• ALTER TABLE DETACH PARTITION
• ALTER TABLE ATTACH PARTITION
• The DECFLOAT data type is not supported. Consequently, changes to DECFLOAT columns are ignored
during ongoing replication.
• The RENAME COLUMN statement is not supported.
• When performing updates to MDC (Multi-Dimensional Clustering) tables, each update is shown in the
AWS DMS console as INSERT + DELETE.
• When the task setting Include LOB columns in replication is disabled, any table that has LOB columns
is suspended during ongoing replication.
• When the Audit table option is enabled, the first timestamp record in the audit table is NULL.
• When the Change table option is enabled, the first timestamp record in the table is zero (that is,
1970-01-01 00:00:00.000000).
• For Db2 LUW versions 10.5 and higher: Variable-length string columns with data that is stored out-of-
row are ignored. Note that this limitation applies only to tables created with extended row size.
The following table shows the extra connection attributes you can use with Db2 LUW as a source:
Name Description
For information on how to view the data type that is mapped in the target, see the section for the target
endpoint you are using.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
Db2 LUW Data Types AWS DMS Data Types
INTEGER INT4
SMALLINT INT2
BIGINT INT8
FLOAT REAL8
DOUBLE REAL8
REAL REAL4
REAL8
STRING
GRAPHIC WSTRING
n<=127
VARGRAPHIC WSTRING
n<=255
n<=32k
n<=32k
DATE DATE
TIME TIME
TIMESTAMP DATETIME
BLOB BLOB
CLOB CLOB
DBCLOB CLOB
XML CLOB
• Oracle versions 10g, 11g, 12c, for the Enterprise, Standard, Standard One, and Standard Two editions
• Microsoft SQL Server versions 2005, 2008, 2008R2, 2012, 2014, and 2016, for the Enterprise,
Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
• MySQL versions 5.5, 5.6, and 5.7
• MariaDB (supported as a MySQL-compatible data target)
• PostgreSQL versions 9.4 and later
• SAP Adaptive Server Enterprise (ASE) versions 15, 15.5, 15.7, 16 and later
You can also use the following Amazon-managed databases and AWS services as targets: Amazon RDS
instance databases, Amazon Redshift, Amazon Simple Storage Service, Amazon DynamoDB, Amazon
Kinesis Data Streams, and Amazon Elasticsearch Service.
• Amazon RDS Oracle versions 11g (versions 11.2.0.3.v1 and later) and 12c, for the Enterprise, Standard,
Standard One, and Standard Two editions
• Amazon RDS Microsoft SQL Server versions 2008R2, 2012, and 2014, for the Enterprise, Standard,
Workgroup, and Developer editions. The Web and Express editions are not supported.
• Amazon RDS MySQL versions 5.5, 5.6, and 5.7
• Amazon RDS MariaDB (supported as a MySQL-compatible data target)
• Amazon RDS PostgreSQL versions 9.4 and later
• Amazon Aurora with MySQL compatibility
• Amazon Aurora with PostgreSQL compatibility
• Amazon Redshift
• Amazon Simple Storage Service
• Amazon DynamoDB
• Amazon Elasticsearch Service
• Amazon Kinesis Data Streams
Topics
• Using an Oracle Database as a Target for AWS Database Migration Service (p. 148)
• Using a Microsoft SQL Server Database as a Target for AWS Database Migration Service (p. 152)
• Using a PostgreSQL Database as a Target for AWS Database Migration Service (p. 156)
• Using a MySQL-Compatible Database as a Target for AWS Database Migration Service (p. 159)
• Using an Amazon Redshift Database as a Target for AWS Database Migration Service (p. 163)
• Using a SAP ASE Database as a Target for AWS Database Migration Service (p. 170)
• Using Amazon Simple Storage Service as a Target for AWS Database Migration Service (p. 171)
• Using an Amazon DynamoDB Database as a Target for AWS Database Migration Service (p. 175)
• Using Amazon Kinesis Data Streams as a Target for AWS Database Migration Service (p. 189)
• Using an Amazon Elasticsearch Service Cluster as a Target for AWS Database Migration
Service (p. 195)
• Using Amazon DocumentDB as a Target for AWS Database Migration Service (p. 198)
AWS DMS supports Oracle versions 10g, 11g, and 12c for on-premises and EC2 instances for the
Enterprise, Standard, Standard One, and Standard Two editions as targets. AWS DMS supports Oracle
versions 11g (versions 11.2.0.3.v1 and later) and 12c for Amazon RDS instance databases for the
Enterprise, Standard, Standard One, and Standard Two editions.
When using Oracle as a target, we assume that the data should be migrated into the schema or user
that is used for the target connection. If you want to migrate data to a different schema, you need to
use a schema transformation to do so. For example, suppose that your target endpoint connects to the
user RDSMASTER and you want to migrate from the user PERFDATA to PERFDATA. In this case, create a
transformation as follows.
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "rename",
"rule-target": "schema",
"object-locator": {
"schema-name": "PERFDATA"
},
"value": "PERFDATA"
}
For more information about transformations, see Specifying Table Selection and Transformations by
Table Mapping Using JSON (p. 250).
For additional details on working with Oracle databases as a target for AWS DMS, see the following
sections:
Topics
• Limitations on Oracle as a Target for AWS Database Migration Service (p. 149)
• User Account Privileges Required for Using Oracle as a Target (p. 149)
• Configuring an Oracle Database as a Target for AWS Database Migration Service (p. 150)
• Extra Connection Attributes When Using Oracle as a Target for AWS DMS (p. 150)
• AWS DMS doesn't create a schema on the target Oracle database. You have to create any schemas you
want on the target Oracle database. The schema name must already exist for the Oracle target. Tables
from the source schema are imported to the user/schema that AWS DMS uses to connect to the target
instance. If you have to migrate multiple schemas, you must create multiple replication tasks.
• AWS DMS doesn't support the Use direct path full load option for tables with INDEXTYPE
CONTEXT. As a workaround, you can use array load.
• In Batch Optimized Apply mode, loading into the net changes table uses Direct Path, which doesn't
support XMLType. As a workaround, you can use Transactional Apply mode.
For the requirements specified following, grant the additional privileges named:
• To use a specific table list, grant SELECT on any replicated table and also ALTER on any replicated
table.
• To allow a user to create a table in their default tablespace, grant the UNLIMITED TABLESPACE
privilege.
• For logon, grant the privilege CREATE SESSION.
• If you are using a direct path, grant the privilege LOCK ANY TABLE.
• If the "DROP and CREATE table" or "TRUNCATE before loading" option is selected in the full load
settings, and the target table schema is different from that for the AWS DMS user, grant the privilege
DROP ANY TABLE.
• To store changes in change tables or an audit table when the target table schema is different from that
for the AWS DMS user, grant the privileges CREATE ANY TABLE and CREATE ANY INDEX.
Read Privileges Required for AWS Database Migration Service on the Target
Database
The AWS DMS user account must be granted read permissions for the following DBA tables:
• SELECT on DBA_USERS
• SELECT on DBA_TAB_PRIVS
• SELECT on DBA_OBJECTS
• SELECT on DBA_SYNONYMS
• SELECT on DBA_SEQUENCES
• SELECT on DBA_TYPES
• SELECT on DBA_INDEXES
• SELECT on DBA_TABLES
• SELECT on DBA_TRIGGERS
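For example, assuming the AWS DMS account is named dms_user (a hypothetical name), the grants might
look like the following when run as a suitably privileged user such as SYS; adjust the list to match
your configuration.
GRANT CREATE SESSION TO dms_user;
GRANT UNLIMITED TABLESPACE TO dms_user;
GRANT SELECT ON DBA_USERS TO dms_user;
GRANT SELECT ON DBA_TAB_PRIVS TO dms_user;
GRANT SELECT ON DBA_OBJECTS TO dms_user;
GRANT SELECT ON DBA_SYNONYMS TO dms_user;
GRANT SELECT ON DBA_SEQUENCES TO dms_user;
GRANT SELECT ON DBA_TYPES TO dms_user;
GRANT SELECT ON DBA_INDEXES TO dms_user;
GRANT SELECT ON DBA_TABLES TO dms_user;
GRANT SELECT ON DBA_TRIGGERS TO dms_user;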
If any of the required privileges cannot be granted to V$xxx, then grant them to V_$xxx.
The following table shows the extra connection attributes available when using Oracle as a target.
Name Description
useDirectPathFullLoad Specifies whether to use the OCI direct path protocol for bulk
loading Oracle tables (the Use direct path full load option).
Default value: Y
Name Description
Valid values: Y/N
Example: useDirectPathFullLoad=N
charLengthSemantics Specifies whether column length semantics are byte-based or
character-based. An example follows.
Example: charLengthSemantics=CHAR
DATE DATETIME
REAL4 FLOAT
REAL8 FLOAT
BLOB BLOB
To use this data type with AWS DMS, you must enable the use of BLOBs
for a specific task. BLOB data types are supported only in tables that
include a primary key.
CLOB CLOB
To use this data type with AWS DMS, you must enable the use of CLOBs
for a specific task. During CDC, CLOB data types are supported only in
tables that include a primary key.
NCLOB NCLOB
To use this data type with AWS DMS, you must enable the use
of NCLOBs for a specific task. During CDC, NCLOB data types are
supported only in tables that include a primary key.
When the source database is Oracle, the source data types are
replicated "as is" to the Oracle target. For example, an XMLTYPE data
type on the source is created as an XMLTYPE data type on the target.
For on-premises and Amazon EC2 instance databases, AWS DMS supports as a target SQL Server versions
2005, 2008, 2008R2, 2012, 2014, and 2016, for the Enterprise, Standard, Workgroup, and Developer
editions. The Web and Express editions are not supported.
For Amazon RDS instance databases, AWS DMS supports as a target SQL Server versions 2008R2, 2012,
2014, and 2016, for the Enterprise, Standard, Workgroup, and Developer editions. The Web and Express
editions are not supported.
For additional details on working with AWS DMS and SQL Server target databases, see the following.
Topics
• Limitations on Using SQL Server as a Target for AWS Database Migration Service (p. 153)
• Security Requirements When Using SQL Server as a Target for AWS Database Migration
Service (p. 153)
• Extra Connection Attributes When Using SQLServer as a Target for AWS DMS (p. 153)
• Target Data Types for Microsoft SQL Server (p. 154)
• When you manually create a SQL Server target table with a computed column, full load replication is
not supported when using the BCP bulk-copy utility. To use full load replication, disable the Use BCP
for loading tables option in the console's Advanced tab. For more information on working with BCP,
see the Microsoft SQL Server documentation.
• When replicating tables with SQL Server spatial data types (GEOMETRY and GEOGRAPHY), AWS DMS
replaces any spatial reference identifier (SRID) that you might have inserted with the default SRID. The
default SRID is 0 for GEOMETRY and 4326 for GEOGRAPHY.
• Temporal tables are not supported. Migrating temporal tables may work with a replication-only task in
transactional apply mode if those tables are manually created on the target.
• The AWS DMS user account must have at least the db_owner user role on the Microsoft SQL Server
database you are connecting to.
• A Microsoft SQL Server system administrator must provide this permission to all AWS DMS user
accounts.
The following table shows the extra connection attributes that you can use when SQL Server is the
target.
Name Description
useBCPFullLoad Specifies whether AWS DMS uses BCP (bulk copy) for loading tables during the full
load.
Default value: Y
Valid values: Y/N
Example: useBCPFullLoad=Y
BCPPacketSize The maximum size of the packets (in bytes) used to transfer
data using BCP.
Example : BCPPacketSize=16384
Example: controlTablesFileGroup=filegroup1
AWS DMS Data Types SQL Server Data Types
BOOLEAN TINYINT
BYTES VARBINARY(length)
DATE For SQL Server 2008 and later, use DATE.
For earlier versions, if the scale is 3 or less use DATETIME. In all other
cases, use VARCHAR (37).
TIME For SQL Server 2008 and later, use DATETIME2 (%d).
For earlier versions, if the scale is 3 or less use DATETIME. In all other
cases, use VARCHAR (37).
DATETIME For SQL Server 2008 and later, use DATETIME2 (scale).
For earlier versions, if the scale is 3 or less use DATETIME. In all other
cases, use VARCHAR (37).
INT1 SMALLINT
INT2 SMALLINT
INT4 INT
INT8 BIGINT
REAL4 REAL
REAL8 FLOAT
UINT1 TINYINT
UINT2 SMALLINT
UINT4 INT
UINT8 BIGINT
BLOB VARBINARY(max)
IMAGE
To use this data type with AWS DMS, you must enable the use of BLOBs
for a specific task. AWS DMS supports BLOB data types only in tables
that include a primary key.
CLOB VARCHAR(max)
To use this data type with AWS DMS, you must enable the use of CLOBs
for a specific task. During CDC, AWS DMS supports CLOB data types
only in tables that include a primary key.
NCLOB NVARCHAR(max)
To use this data type with AWS DMS, you must enable the use of
NCLOBs for a specific task. During CDC, AWS DMS supports NCLOB
data types only in tables that include a primary key.
AWS DMS takes a table-by-table approach when migrating data from source to target in the Full Load
phase. Table order during the full load phase cannot be guaranteed. Tables are out of sync during the
full load phase and while cached transactions for individual tables are being applied. As a result, active
referential integrity constraints can result in task failure during the full load phase.
In PostgreSQL, foreign keys (referential integrity constraints) are implemented using triggers. During
the full load phase, AWS DMS loads each table one at a time. We strongly recommend that you disable
foreign key constraints during a full load, using one of the following methods:
• Temporarily disable all triggers from the instance, and finish the full load.
• Use the session_replication_role parameter in PostgreSQL.
At any given time, a trigger can be in one of the following states: origin, replica, always, or
disabled. When the session_replication_role parameter is set to replica, only triggers in
the replica state are active, and they are fired when they are called. Otherwise, the triggers remain
inactive.
PostgreSQL has a failsafe mechanism to prevent a table from being truncated, even when
session_replication_role is set. You can use this as an alternative to disabling triggers, to help
the full load run to completion. To do this, set the target table preparation mode to DO_NOTHING.
Otherwise, DROP and TRUNCATE operations fail when there are foreign key constraints.
In Amazon RDS, you can set this parameter using a parameter group. For a PostgreSQL instance
running on Amazon EC2, you can set the parameter directly.
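For example, a superuser can set the parameter for the current session as follows; on Amazon RDS for
PostgreSQL, change session_replication_role in the instance's parameter group instead.
SET session_replication_role = 'replica';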
For additional details on working with a PostgreSQL database as a target for AWS DMS, see the following
sections:
Topics
• Limitations on Using PostgreSQL as a Target for AWS Database Migration Service (p. 157)
• Security Requirements When Using a PostgreSQL Database as a Target for AWS Database Migration
Service (p. 157)
• Extra Connection Attributes When Using PostgreSQL as a Target for AWS DMS (p. 157)
• Target Data Types for PostgreSQL (p. 157)
• The JSON data type is converted to the Native CLOB data type.
• In an Oracle to PostgreSQL migration, if a column in Oracle contains a NULL character (Hex value
U+0000), AWS DMS converts the NULL character to a space (Hex value U+0020). This is due to a
PostgreSQL limitation.
The following table shows the extra connection attributes you can use to configure PostgreSQL as a
target for AWS DMS.
Name Description
maxFileSize Specifies the maximum size (in KB) of any CSV file used to
transfer data to PostgreSQL.
Example: maxFileSize=512
executeTimeout Specifies the client statement timeout (in seconds) for the PostgreSQL
instance. An example follows.
Example: executeTimeout=100
afterConnectScript=SET session_replication_role='replica'
Add this attribute to have AWS DMS bypass all foreign keys and user triggers. This action greatly
reduces the time it takes to bulk load data when using full load mode.
Example: afterConnectScript=SET session_replication_role='replica'
The following table shows the PostgreSQL target data types that are supported when using AWS
DMS and the default mapping from AWS DMS data types. Unsupported data types are listed following
the table.
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
AWS DMS Data Types PostgreSQL Data Types
BOOL BOOL
BYTES BYTEA
DATE DATE
TIME TIME
INT1 SMALLINT
INT2 SMALLINT
INT4 INTEGER
INT8 BIGINT
REAL4 FLOAT4
REAL8 FLOAT8
STRING If the length is from 1 through 21,845, then use VARCHAR (length in
bytes).
UINT1 SMALLINT
UINT2 INTEGER
UINT4 BIGINT
UINT8 BIGINT
WSTRING If the length is from 1 through 21,845, then use VARCHAR (length in
bytes).
BCLOB BYTEA
NCLOB TEXT
CLOB TEXT
Note
When replicating from a PostgreSQL source, AWS DMS creates the target table with the same
data types for all columns, apart from columns with user-defined data types. In such cases, the
data type is created as "character varying" in the target.
You can use SSL to encrypt connections between your MySQL-compatible endpoint and the replication
instance. For more information on using SSL with a MySQL-compatible endpoint, see Using SSL With
AWS Database Migration Service (p. 47).
You can use the following MySQL-compatible databases as targets for AWS DMS: MySQL versions 5.5,
5.6, and 5.7, as well as MariaDB and Amazon Aurora MySQL.
Note
Regardless of the source storage engine (MyISAM, MEMORY, and so on), AWS DMS creates a
MySQL-compatible target table as an InnoDB table by default. If you need to have a table that
uses a storage engine other than InnoDB, you can manually create the table on the MySQL-
compatible target and migrate the table using the "do nothing" mode. For more information
about the "do nothing" mode, see Full Load Task Settings (p. 228).
For additional details on working with a MySQL-compatible database as a target for AWS DMS, see the
following sections.
Topics
• Using Any MySQL-Compatible Database as a Target for AWS Database Migration Service (p. 160)
• Limitations on Using a MySQL-Compatible Database as a Target for AWS Database Migration
Service (p. 160)
• Extra Connection Attributes When Using a MySQL-Compatible Database as a Target for AWS
DMS (p. 161)
• Target Data Types for MySQL (p. 162)
• You must provide a user account to AWS DMS that has read/write privileges to the MySQL-compatible
database. To create the necessary privileges, run the following commands.
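A minimal sketch follows, assuming a hypothetical account named dms_user and a target schema named
target_db; awsdms_control is shown as the schema for the AWS DMS control tables, so confirm the
control schema name used in your configuration.
CREATE USER 'dms_user'@'%' IDENTIFIED BY 'choose_a_strong_password';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, INDEX, ALTER ON target_db.* TO 'dms_user'@'%';
GRANT ALL PRIVILEGES ON awsdms_control.* TO 'dms_user'@'%';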
• During the full-load migration phase, you must disable foreign keys on your target tables. To disable
foreign key checks on a MySQL-compatible database during a full load, you can add the following
command to the Extra Connection Attributes in the Advanced section of the target endpoint.
initstmt=SET FOREIGN_KEY_CHECKS=0
• The data definition language (DDL) statements TRUNCATE PARTITION, DROP TABLE, and RENAME
TABLE aren't supported.
• Using an ALTER TABLE <table_name> ADD COLUMN <column_name> statement to add columns
to the beginning or the middle of a table isn't supported.
• When only the LOB column in a source table is updated, AWS DMS doesn't update the corresponding
target column. The target LOB is only updated if at least one other column is updated in the same
transaction.
• When loading data to a MySQL-compatible target in a full load task, AWS DMS doesn't report
duplicate key errors in the task log.
• When you update a column's value to its existing value, MySQL-compatible databases return a 0
rows affected warning. Although this behavior isn't technically an error, it is different from how
the situation is handled by other database engines. For example, Oracle performs an update of one
row. For MySQL-compatible databases, AWS DMS generates an entry in the awsdms_apply_exceptions
control table and logs the following warning.
Some changes from the source database had no impact when applied to
the target database. See awsdms_apply_exceptions table for details.
The following table shows extra configuration settings that you can use when creating a MySQL-
compatible target for AWS DMS.
Name Description
targetDbType Specifies where to migrate source tables on the target, either to a single
database or multiple databases. An example follows.
Example: targetDbType=MULTIPLE_DATABASES
parallelLoadThreads Specifies how many threads to use to load data into the MySQL-compatible
target database.
Default value: 1
Example: parallelLoadThreads=1
initstmt=SET time_zone Specifies the time zone for the target MySQL-compatible
database.
maxFileSize Specifies the maximum size (in KB) of any CSV file used to
transfer data to a MySQL-compatible database.
Name Description
Default value: 32768 KB (32 MB)
Example: maxFileSize=512
Example: CleanSrcMetadataOnMismatch=false
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
AWS DMS Data Types MySQL Data Types
BOOLEAN BOOLEAN
DATE DATE
TIME TIME
INT1 TINYINT
INT2 SMALLINT
INT4 INTEGER
INT8 BIGINT
REAL4 FLOAT
The Amazon Redshift cluster must be in the same AWS account and same AWS Region as the replication
instance.
During a database migration to Amazon Redshift, AWS DMS first moves data to an Amazon S3 bucket.
Once the files reside in an Amazon S3 bucket, AWS DMS then transfers them to the proper tables in
the Amazon Redshift data warehouse. AWS DMS creates the S3 bucket in the same AWS Region as the
Amazon Redshift database. The AWS DMS replication instance must be located in that same region.
If you use the AWS Command Line Interface (AWS CLI) or the AWS DMS API to migrate data to Amazon
Redshift, you must set up an AWS Identity and Access Management (IAM) role to allow S3 access. For
more information about creating this IAM role, see Creating the IAM Roles to Use With the AWS CLI and
AWS DMS API (p. 34).
The Amazon Redshift endpoint provides full automation for the following:
AWS Database Migration Service supports both full load and change processing operations. AWS DMS
reads the data from the source database and creates a series of comma-separated value (CSV) files.
For full-load operations, AWS DMS creates files for each table. AWS DMS then copies the table files
for each table to a separate folder in Amazon Simple Storage Service. When the files are uploaded to
Amazon Simple Storage Service, AWS DMS sends a copy command and the data in the files are copied
into Amazon Redshift. For change-processing operations, AWS DMS copies the net changes to the CSV
files. AWS DMS then uploads the net change files to Amazon Simple Storage Service and copies the data
to Amazon Redshift.
For additional details on working with Amazon Redshift as a target for AWS DMS, see the following
sections:
Topics
• Prerequisites for Using an Amazon Redshift Database as a Target for AWS Database Migration
Service (p. 164)
• Limitations on Using Amazon Redshift as a Target for AWS Database Migration Service (p. 165)
• Configuring an Amazon Redshift Database as a Target for AWS Database Migration Service (p. 165)
• Using Enhanced VPC Routing with an Amazon Redshift as a Target for AWS Database Migration
Service (p. 166)
• Extra Connection Attributes When Using Amazon Redshift as a Target for AWS DMS (p. 166)
• Target Data Types for Amazon Redshift (p. 168)
• Use the AWS Management Console to launch an Amazon Redshift cluster. You should note the basic
information about your AWS account and your Amazon Redshift cluster, such as your password, user
name, and database name. You need these values when creating the Amazon Redshift target endpoint.
• The Amazon Redshift cluster must be in the same AWS account and the same AWS Region as the
replication instance.
• The AWS DMS replication instance needs network connectivity to the Amazon Redshift endpoint
(hostname and port) that your cluster uses.
• AWS DMS uses an Amazon Simple Storage Service bucket to transfer data to the Amazon Redshift
database. For AWS DMS to create the bucket, the DMS console uses an Amazon IAM role, dms-
access-for-endpoint. If you use the AWS CLI or DMS API to create a database migration with
Amazon Redshift as the target database, you must create this IAM role. For more information about
creating this role, see Creating the IAM Roles to Use With the AWS CLI and AWS DMS API (p. 34).
• AWS DMS converts BLOBs, CLOBs, and NCLOBs to a VARCHAR on the target Amazon Redshift instance.
Amazon Redshift doesn't support VARCHAR data types larger than 64 KB, so you can't store traditional
LOBs on Amazon Redshift.
• The following DDL statement isn't supported:
ALTER TABLE <table name> MODIFY COLUMN <column name> <data type>;
• AWS DMS cannot migrate or replicate changes to a schema with a name that begins with
underscore (_). If you have schemas that have a name that begins with an underscore, use mapping
transformations to rename the schema on the target.
• Amazon Redshift doesn't support VARCHARs larger than 64 KB. LOBs from traditional databases can't
be stored in Amazon Redshift.
Property Description
server The name of the Amazon Redshift cluster you are using.
port The port number for Amazon Redshift. The default value is 5439.
password The password for the user named in the username property.
database The name of the Amazon Redshift data warehouse (service) you are working
with.
If you want to add extra connection string attributes to your Amazon Redshift endpoint, you can
specify the maxFileSize and fileTransferUploadStreams attributes. For more information on
these attributes, see Extra Connection Attributes When Using Amazon Redshift as a Target for AWS
DMS (p. 166).
When you use Enhanced VPC Routing with your Amazon Redshift target, all COPY traffic between your
Amazon Redshift cluster and your data repositories goes through your VPC. AWS DMS can be affected
by this behavior because it uses the COPY command to move data in S3 to an Amazon Redshift cluster.
Following are the steps AWS DMS takes to load data into an Amazon Redshift target:
1. AWS DMS copies data from the source to CSV files on the replication server.
2. AWS DMS uses the AWS SDK to copy the CSV files into an S3 bucket on your account.
3. AWS DMS then uses the COPY command in Amazon Redshift to copy data from the CSV files in S3 to
an appropriate table in Amazon Redshift.
If Enhanced VPC Routing is not enabled, Amazon Redshift routes traffic through the Internet, including
traffic to other services within the AWS network. If the feature is not enabled, you do not have to
configure the network path. If the feature is enabled, you must specifically create a network path
between your cluster's VPC and your data resources. For more information on the configuration required,
see Enhanced VPC Routing in the Amazon Redshift documentation.
The following table shows the extra connection attributes available when Amazon Redshift is the target.
Name Description
maxFileSize Specifies the maximum size (in KB) of any CSV file used to
transfer data to Amazon Redshift.
Example: maxFileSize=512
fileTransferUploadStreams Specifies the number of threads used to upload a single file.
Default value: 10
Valid values: 1 - 64
Example: fileTransferUploadStreams=20
Name Description
Valid values: true | false
Example: acceptanydate=true
dateformat Specifies the date format. This is a string input and is empty
by default. The default format is YYYY-MM-DD but you can
change it to, for example, DD-MM-YYYY. If your date or
time values use different formats, use the auto argument
with the dateformat parameter. The auto argument
recognizes several formats that are not supported when
using a dateformat string. The auto keyword is case-
sensitive.
Example: dateformat=auto
timeformat Specifies the time format. This is a string input and is empty
by default. The auto argument recognizes several formats
that are not supported when using a timeformat string.
If your date and time values use formats different from
each other, use the auto argument with the timeformat
parameter.
Default value: 10
Example: timeformat=auto
Example: emptyasnull=true
Example:
truncateColumns=true;
Name Description
Example:
removeQuotes=true;
Example:
trimBlanks=false;
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
AWS DMS Data Types Amazon Redshift Data Types
BOOLEAN BOOL
DATE DATE
TIME VARCHAR(20)
TIMESTAMP (s)
VARCHAR (37)
INT1 INT2
INT2 INT2
INT4 INT4
INT8 INT8
NUMERIC (p,s)
VARCHAR (Length)
REAL4 FLOAT4
REAL8 FLOAT8
UINT1 INT2
UINT2 INT2
UINT4 INT4
SAP ASE versions 15, 15.5, 15.7, 16 and later are supported.
• You must provide SAP ASE account access to the AWS DMS user. This user must have read/write
privileges in the SAP ASE database.
• When replicating to SAP ASE version 15.7 installed on a Windows EC2 instance configured with a non-
Latin language (for example, Chinese), AWS DMS requires SAP ASE 15.7 SP121 to be installed on the
target SAP ASE machine.
The following table shows the extra connection attributes available when using SAP ASE as a target:
Name Description
Note
If the user name or password specified in the connection string contains non-Latin characters
(for example, Chinese), the following property is required: charset=gb18030
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
AWS DMS Data Types SAP ASE Data Types
BOOLEAN BIT
DATE DATE
TIME TIME
INT1 TINYINT
INT2 SMALLINT
INT4 INTEGER
INT8 BIGINT
REAL4 REAL
UINT1 TINYINT
BLOB IMAGE
CLOB UNITEXT
NCLOB TEXT
AWS DMS does not support tables that include fields with the following data types. Replicated columns
with these data types show as null.
AWS DMS names files created during a full load using an incremental counter, for example
LOAD00001.csv, LOAD00002..., LOAD00009, LOAD0000A, and so on. AWS DMS names CDC files using
timestamps, for example 20141029-1134010000.csv. For each source table, AWS DMS creates a folder
under the specified target folder. AWS DMS writes all full load and CDC files to the specified
Amazon S3 bucket.
The parameter bucketFolder contains the location where the .csv files are stored before being
uploaded to the S3 bucket. Table data is stored in the following format in the S3 bucket:
<schema_name>/<table_name>/LOAD00000001.csv
<schema_name>/<table_name>/LOAD00000002.csv
...
<schema_name>/<table_name>/LOAD00000009.csv
<schema_name>/<table_name>/LOAD0000000A.csv
<schema_name>/<table_name>/LOAD0000000B.csv
...
<schema_name>/<table_name>/LOAD0000000F.csv
<schema_name>/<table_name>/LOAD00000010.csv
...
You can specify the column delimiter, row delimiter, and other parameters using the extra connection
attributes. For more information on the extra connection attributes, see Extra Connection Attributes
When Using Amazon S3 as a Target for AWS DMS (p. 174) at the end of this section.
When you use AWS DMS to replicate data changes, the first column of the CSV output file indicates how
the data was changed as shown following:
I,101,Smith,Bob,4-Jun-14,New York
U,101,Smith,Bob,8-Oct-15,Los Angeles
U,101,Smith,Bob,13-Mar-17,Dallas
D,101,Smith,Bob,13-Mar-17,Dallas
For this example, suppose that there is an EMPLOYEE table in the source database. AWS DMS writes data
to the CSV file, in response to the following events:
• A new employee (Bob Smith, employee ID 101) is hired on 4-Jun-14 at the New York office. In the CSV
file, the I in the first column indicates that a new row was INSERTed into the EMPLOYEE table at the
source database.
• On 8-Oct-15, Bob transfers to the Los Angeles office. In the CSV file, the U indicates that the
corresponding row in the EMPLOYEE table was UPDATEd to reflect Bob's new office location. The rest
of the line reflects the row in the EMPLOYEE table as it appears after the UPDATE.
• On 13-Mar-17, Bob transfers again to the Dallas office. In the CSV file, the U indicates that this row
was UPDATEd again. The rest of the line reflects the row in the EMPLOYEE table as it appears after the
UPDATE.
• After some time working in Dallas, Bob leaves the company. In the CSV file, the D indicates that the
row was DELETEd in the source table. The rest of the line reflects how the row in the EMPLOYEE table
appeared before it was deleted.
The AWS account you use for the migration must have write and delete access to the Amazon Simple
Storage Service bucket you are using as a target. The role assigned to the user account used to create the
migration task must have the following set of permissions.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::buckettest2*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::buckettest2*"
]
}
]
}
• Only the following data definition language (DDL) commands are supported: TRUNCATE TABLE, DROP
TABLE, and CREATE TABLE.
• Full LOB mode is not supported.
• Changes to the source table structure during full load are not supported. Changes to the data are
supported during full load.
• Multiple tasks that replicate data from the same source table to the same target S3 endpoint bucket
result in those tasks writing to the same file. We recommend that you specify different target
endpoints (buckets) if your data source is from the same table.
Security
To use Amazon Simple Storage Service as a target, the account used for the migration must have write
and delete access to the Amazon Simple Storage Service bucket that is used as the target. You must
specify the Amazon Resource Name (ARN) of an IAM role that has the permissions required to access
Amazon Simple Storage Service.
AWS DMS supports a set of predefined grants for Amazon Simple Storage Service, known
as canned ACLs. Each canned ACL has a set of grantees and permissions you can use to set
permissions for the Amazon Simple Storage Service bucket. You can specify a canned ACL using the
cannedAclForObjects extra connection attribute for your S3 target endpoint. For more information
about using the extra connection attribute cannedAclForObjects, see Extra Connection Attributes
When Using Amazon S3 as a Target for AWS DMS (p. 174). For more information about Amazon Simple
Storage Service canned ACLs, see Canned ACL.
The IAM role that you use for the migration must be able to perform the s3:PutObjectAcl API action.
Option Description
addColumnName An optional parameter that allows you to add column name information to
the .csv output file. The default is false.
Example:
addColumnName=true;
Example:
bucketFolder=testFolder;
Example:
bucketName=buckettest;
cannedAclForObjects Allows AWS DMS to specify a predefined (canned) access control list for
objects written to the S3 bucket. For more information about Amazon
Simple Storage Service canned ACLs, see Canned ACL in the Amazon Simple
Storage Service Developer Guide.
Example:
cannedAclForObjects=PUBLIC_READ;
cdcInsertsOnly An optional parameter to write only INSERT operations to the .CSV output
files. By default, the first field in a .CSV record contains the letter I (insert),
U (update) or D (delete) to indicate whether the row was inserted, updated,
or deleted at the source database. If cdcInsertsOnly is set to true, then
only INSERTs are recorded in the CSV file, without any I annotation.
Example:
cdcInsertsOnly=true;
compressionType An optional parameter to use GZIP to compress the target files. Set to NONE
(the default) or do not use to leave the files uncompressed.
Example:
Option Description
compressionType=GZIP;
csvRowDelimiter The delimiter used to separate rows in the source files. The default is a
carriage return (\n).
Example:
csvRowDelimiter=\n;
csvDelimiter The delimiter used to separate columns in the source files. The default is a
comma.
Example:
csvDelimiter=,;
maxFileSize Specifies the maximum size (in KB) of any CSV file to be created while
migrating to S3 target during full load.
Example:
maxFileSize=512
rfc4180 An optional parameter used to control RFC compliance behavior with data
migrated to Amazon Simple Storage Service. When using Amazon Simple
Storage Service as a target, if the data has quotes or a new line character
in it then AWS DMS encloses the entire column with an additional ". Every
quote mark within the data is repeated twice. This is in compliance with RFC
4180.
Example:
rfc4180=false;
In DynamoDB, tables, items, and attributes are the core components that you work with. A table is a
collection of items, and each item is a collection of attributes. DynamoDB uses primary keys, called
partition keys, to uniquely identify each item in a table. You can also use keys and secondary indexes to
provide more querying flexibility.
You use object mapping to migrate your data from a source database to a target DynamoDB table.
Object mapping lets you determine where the source data is located in the target.
When AWS DMS creates tables on an Amazon DynamoDB target endpoint, it creates as many tables as
in the source database endpoint. AWS DMS also sets several Amazon DynamoDB parameter values. The
cost for the table creation depends on the amount of data and the number of tables to be migrated.
When AWS DMS sets Amazon DynamoDB parameter values for a migration task, the default Read
Capacity Units (RCU) parameter value is set to 200.
The Write Capacity Units (WCU) parameter value is also set, but its value depends on several other
settings:
Currently, AWS DMS supports single-table to single-table restructuring into DynamoDB scalar type
attributes. If you are migrating data into DynamoDB from a relational database table, you take data from
a table and reformat it into DynamoDB scalar data type attributes. These attributes can accept data from
multiple columns, and you can map a column to an attribute directly.
• String
• Number
• Boolean
Note
NULL data from the source is ignored on the target.
To use AWS DMS with a DynamoDB target endpoint, create an IAM role that AWS DMS can assume. The role must have the following trust relationship.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
The role that you use for the migration to DynamoDB must have the following permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"dynamodb:PutItem",
"dynamodb:CreateTable",
"dynamodb:DescribeTable",
"dynamodb:DeleteTable",
"dynamodb:DeleteItem"
],
"Resource": [
"arn:aws:dynamodb:us-west-2:account-id:table/Name1",
"arn:aws:dynamodb:us-west-2:account-id:table/OtherName*",
]
},
{
"Effect": "Allow",
"Action": [
"dynamodb:ListTables"
],
"Resource": "*"
}
]
}
• DynamoDB limits the precision of the Number data type to 38 places. Store all data types with a
higher precision as a String. You need to explicitly specify this using the object mapping feature.
• Because Amazon DynamoDB doesn't have a Date data type, data using the Date data type are
converted to strings.
• Amazon DynamoDB doesn't allow updates to the primary key attributes. This restriction is important
when using ongoing replication with change data capture (CDC) because it can result in unwanted
data in the target. Depending on how you have the object mapping, a CDC operation that updates the
primary key can either fail or insert a new item with the updated primary key and incomplete data.
• AWS DMS only supports replication of tables with non-composite primary keys, unless you specify an
object mapping for the target table with a custom partition key or sort key, or both.
• AWS DMS doesn't support LOB data unless it is a CLOB. AWS DMS converts CLOB data into a
DynamoDB string when migrating data.
Object mapping lets you define the attribute names and the data to be migrated to them. You must
have selection rules when you use object mapping.
Amazon DynamoDB doesn't have a preset structure other than having a partition key and an optional
sort key. If you have a noncomposite primary key, AWS DMS uses it. If you have a composite primary key
or you want to use a sort key, define these keys and the other attributes in your target DynamoDB table.
To create an object mapping rule, you specify the rule-type as object-mapping. This rule specifies what
type of object mapping you want to use.
{ "rules": [
{
"rule-type": "object-mapping",
"rule-id": "<id>",
"rule-name": "<name>",
"rule-action": "<valid object-mapping rule action>",
"object-locator": {
"schema-name": "<case-sensitive schema name>",
"table-name": ""
},
"target-table-name": "<table_name>",
}
}
]
}
AWS DMS currently supports map-record-to-record and map-record-to-document as the only valid values
for the rule-action parameter. map-record-to-record and map-record-to-document specify what AWS
DMS does by default to records that aren't excluded as part of the exclude-columns attribute list;
these values don't affect the attribute-mappings in any way.
• map-record-to-record can be used when migrating from a relational database to DynamoDB. It uses
the primary key from the relational database as the partition key in Amazon DynamoDB and creates
an attribute for each column in the source database. When using map-record-to-record, for any
column in the source table not listed in the exclude-columns attribute list, AWS DMS creates a
corresponding attribute on the target DynamoDB instance regardless of whether that source column is
used in an attribute mapping.
• map-record-to-document puts source columns into a single, flat, DynamoDB map on the target using
the attribute name "_doc". When using map-record-to-document, for any column in the source
table not listed in the exclude-columns attribute list, AWS DMS places the data into a single, flat,
DynamoDB map attribute on the target called "_doc".
One way to understand the difference between the rule-action parameters map-record-to-record and
map-record-to-document is to see the two parameters in action. For this example, assume that you are
starting with a relational database table row with the following structure and data:
To migrate this information to DynamoDB, you create rules to map the data into a DynamoDB table
item. Note the columns listed for the exclude-columns parameter. These columns are not directly
mapped over to the target. Instead, attribute mapping is used to combine the data into new items, such
as where FirstName and LastName are grouped together to become CustomerName on the DynamoDB
target. NickName and income are not excluded.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "object-mapping",
"rule-id": "1",
"rule-name": "TransformToDDB",
"rule-action": "map-record-to-record",
"object-locator": {
"schema-name": "test",
"table-name": "customer",
},
"target-table-name": "customer_t",
"mapping-parameters": {
"partition-key-name": "CustomerName",
"exclude-columns": [
"FirstName",
"LastName",
"HomeAddress",
"HomePhone",
"WorkAddress",
"WorkPhone"
],
"attribute-mappings": [
{
"target-attribute-name": "CustomerName",
"attribute-type": "scalar",
"attribute-sub-type": "string",
"value": "${FirstName},${LastName}"
},
{
"target-attribute-name": "ContactDetails",
"attribute-type": "document",
"attribute-sub-type": "dynamodb-map",
"value": {
"M": {
"Home": {
"M": {
"Address": {
"S": "${HomeAddress}"
},
"Phone": {
"S": "${HomePhone}"
}
}
},
"Work": {
"M": {
"Address": {
"S": "${WorkAddress}"
},
"Phone": {
"S": "${WorkPhone}"
}
}
}
}
}
}
]
}
}
]
}
By using the rule-action parameter map-record-to-record, the data for NickName and income is
mapped to attributes of the same name in the DynamoDB target.
However, suppose that you use the same rules but change the rule-action parameter to map-record-
to-document. In this case, the columns not listed in the exclude-columns parameter, NickName and
income, are mapped to a _doc attribute.
You can also specify a condition expression as part of your object mapping, to control whether AWS
DMS applies an operation to the DynamoDB target. A condition expression takes the following
parameters:
• an expression (required)
• expression attribute values (optional). Specifies a DynamoDB JSON structure of the attribute value.
• expression attribute names (optional)
• options for when to use the condition expression (optional). The default is apply-during-cdc = false
and apply-during-full-load = true.
"target-table-name": "customer_t",
"mapping-parameters": {
"partition-key-name": "CustomerName",
"condition-expression": {
"expression":"<conditional expression>",
"expression-attribute-values": [
{
"name":"<attribute name>",
"value":<attribute value>
}
],
"apply-during-cdc":<optional Boolean value>,
"apply-during-full-load": <optional Boolean value>
}
The following sample highlights the sections used for condition expression.
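For instance, a condition expression that writes an item only when no item with the same partition key already exists might look like the following sketch. Here, attribute_not_exists is standard DynamoDB condition expression syntax, and CustomerName is the partition key used in the examples in this section.
"condition-expression": {
"expression": "attribute_not_exists(CustomerName)",
"apply-during-cdc": false,
"apply-during-full-load": true
}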
The following example shows the structure of the source database and the desired structure of the
DynamoDB target. First is shown the structure of the source, in this case an Oracle database, and then
the desired structure of the data in DynamoDB. The example concludes with the JSON used to create the
desired target structure.
Source table (Oracle) columns: FirstName, LastName, StoreId, HomeAddress, HomePhone, WorkAddress,
WorkPhone, DateOfBirth. Primary Key: N/A.

Desired DynamoDB item:
CustomerName (partition key): Randy,Marsh
StoreId (sort key): 5
ContactDetails:
{
"Name": "Randy",
"Home": {
"Address": "221B Baker Street",
"Phone": 1234567890
},
"Work": {
"Address": "31 Spooner Street, Quahog",
"Phone": 9876541230
}
}
DateOfBirth: 02/29/1988
The following JSON shows the object mapping and column mapping used to achieve the DynamoDB
structure:
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "object-mapping",
"rule-id": "2",
"rule-name": "TransformToDDB",
"rule-action": "map-record-to-record",
"object-locator": {
"schema-name": "test",
"table-name": "customer"
},
"target-table-name": "customer_t",
"mapping-parameters": {
"partition-key-name": "CustomerName",
"sort-key-name": "StoreId",
"exclude-columns": [
"FirstName",
"LastName",
"HomeAddress",
"HomePhone",
"WorkAddress",
"WorkPhone"
],
"attribute-mappings": [
{
"target-attribute-name": "CustomerName",
"attribute-type": "scalar",
"attribute-sub-type": "string",
"value": "${FirstName},${LastName}"
},
{
"target-attribute-name": "StoreId",
"attribute-type": "scalar",
"attribute-sub-type": "string",
"value": "${StoreId}"
},
{
"target-attribute-name": "ContactDetails",
"attribute-type": "scalar",
"attribute-sub-type": "string",
"value": "{\"Name\":\"${FirstName}\",\"Home\":{\"Address\":\"${HomeAddress}\",
\"Phone\":\"${HomePhone}\"}, \"Work\":{\"Address\":\"${WorkAddress}\",\"Phone\":
\"${WorkPhone}\"}}"
}
]
}
}
]
}
Another way to use column mapping is to use DynamoDB format as your document type. The following
code example uses dynamodb-map as the attribute-sub-type for attribute mapping.
{
"rules": [
{
"rule-type": "object-mapping",
"rule-id": "1",
"rule-name": "TransformToDDB",
"rule-action": "map-record-to-record",
"object-locator": {
"schema-name": "test",
"table-name": "customer",
},
"target-table-name": "customer_t",
"mapping-parameters": {
"partition-key-name": "CustomerName",
"sort-key-name": "StoreId",
"exclude-columns": [
"FirstName",
"LastName",
"HomeAddress",
"HomePhone",
"WorkAddress",
"WorkPhone"
],
"attribute-mappings": [
{
"target-attribute-name": "CustomerName",
"attribute-type": "scalar",
"attribute-sub-type": "string",
"value": "${FirstName},${LastName}"
},
{
"target-attribute-name": "StoreId",
"attribute-type": "scalar",
"attribute-sub-type": "string",
"value": "${StoreId}"
},
{
"target-attribute-name": "ContactDetails",
"attribute-type": "document",
"attribute-sub-type": "dynamodb-map",
"value": {
"M": {
"Name": {
"S": "${FirstName}"
}"Home": {
"M": {
"Address": {
"S": "${HomeAddress}"
},
"Phone": {
"S": "${HomePhone}"
}
}
},
"Work": {
"M": {
"Address": {
"S": "${WorkAddress}"
},
"Phone": {
"S": "${WorkPhone}"
}
}
}
}
}
}
]
}
}
]
}
The table-mapping rules that map the two source tables (nfl_data and sport_team) to the two
DynamoDB tables are shown following:
{
"rules":[
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "dms_sample",
"table-name": "nfl_data"
},
"rule-action": "include"
},
{
"rule-type": "selection",
"rule-id": "2",
"rule-name": "2",
"object-locator": {
"schema-name": "dms_sample",
"table-name": "sport_team"
},
"rule-action": "include"
},
{
"rule-type":"object-mapping",
"rule-id":"3",
"rule-name":"MapNFLData",
"rule-action":"map-record-to-record",
"object-locator":{
"schema-name":"dms_sample",
"table-name":"nfl_data"
},
"target-table-name":"NFLTeams",
"mapping-parameters":{
"partition-key-name":"Team",
"sort-key-name":"PlayerName",
"exclude-columns": [
"player_number", "team", "Name"
],
"attribute-mappings":[
{
"target-attribute-name":"Team",
"attribute-type":"scalar",
"attribute-sub-type":"string",e
"value":"${team}"
},
{
"target-attribute-name":"PlayerName",
"attribute-type":"scalar",
"attribute-sub-type":"string",
"value":"${Name}"
},
{
"target-attribute-name":"PlayerInfo",
"attribute-type":"scalar",
"attribute-sub-type":"string",
"value":"{\"Number\": \"${player_number}\",\"Position\": \"${Position}\",
\"Status\": \"${status}\",\"Stats\": {\"Stat1\": \"${stat1}:${stat1_val}\",\"Stat2\":
\"${stat2}:${stat2_val}\",\"Stat3\": \"${stat3}:${
stat3_val}\",\"Stat4\": \"${stat4}:${stat4_val}\"}"
}
]
}
},
{
"rule-type":"object-mapping",
"rule-id":"4",
"rule-name":"MapSportTeam",
"rule-action":"map-record-to-record",
"object-locator":{
"schema-name":"dms_sample",
"table-name":"sport_team"
},
"target-table-name":"SportTeams",
"mapping-parameters":{
"partition-key-name":"TeamName",
"exclude-columns": [
"name", "id"
],
"attribute-mappings":[
{
"target-attribute-name":"TeamName",
"attribute-type":"scalar",
"attribute-sub-type":"string",
"value":"${name}"
},
{
"target-attribute-name":"TeamInfo",
"attribute-type":"scalar",
"attribute-sub-type":"string",
"value":"{\"League\": \"${sport_league_short_name}\",\"Division\":
\"${sport_division_short_name}\"}"
}
]
}
}
]
}
The sample output for the SportTeams DynamoDB table is shown following:
{
"abbreviated_name": "IND",
"home_field_id": 53,
"sport_division_short_name": "AFC South",
"sport_league_short_name": "NFL",
"sport_type_name": "football",
"TeamInfo": "{\"League\": \"NFL\",\"Division\": \"AFC South\"}",
"TeamName": "Indianapolis Colts"
}
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
When AWS DMS migrates data from heterogeneous databases, we map data types from the source
database to intermediate data types called AWS DMS data types. We then map the intermediate data
types to the target data types. The following table shows each AWS DMS data type and the data type it
maps to in DynamoDB:
AWS DMS Data Type – DynamoDB Data Type
String – String
WString – String
Boolean – Boolean
Date – String
DateTime – String
INT1 – Number
INT2 – Number
INT4 – Number
INT8 – Number
Numeric – Number
Real4 – Number
Real8 – Number
UINT1 – Number
UINT2 – Number
UINT4 – Number
UINT8 – Number
CLOB – String
A Kinesis data stream is made up of shards. Shards are uniquely identified sequences of data records
in a stream. For more information on shards in Amazon Kinesis Data Streams, see Shard in the Amazon
Kinesis Data Streams Developer Guide.
AWS Database Migration Service publishes records to a Kinesis data stream using JSON. During
conversion, AWS DMS serializes each record from the source database into an attribute-value pair in
JSON format.
You must use AWS Database Migration Service engine version 3.1.2 or higher to migrate data to Amazon
Kinesis Data Streams.
You use object mapping to migrate your data from any supported data source to a target stream. With
object mapping, you determine how to structure the data records in the stream. You also define a
partition key for each table, which Kinesis Data Streams uses to group the data into its shards.
To migrate data to a Kinesis data stream, create an IAM role that AWS DMS can assume. The role must have the following trust relationship.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "1",
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
The role that you use for the migration to a Kinesis data stream must have the following permissions.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kinesis:ListStreams",
"kinesis:PutRecords"
],
"Resource": "arn:aws:kinesis:region:account-id:stream/stream-name"
},
]
}
• AWS DMS publishes each update to a single record in the source database as one data record in a
given Kinesis data stream. As a result, applications consuming the data from the stream lose track of
transaction boundaries.
• Kinesis data streams don't support deduplication. Applications that consume data from a stream need
to handle duplicate records. For more information, see Handling Duplicate Records in the Amazon
Kinesis Data Streams Developer Guide.
• AWS DMS supports the following two forms for partition keys:
• SchemaName.TableName: A combination of the schema and table name.
• ${AttributeName}: The value of one of the fields in the JSON, or the primary key of the table in
the source database.
Kinesis data streams don't have a preset structure other than having a partition key.
To create an object mapping rule, you specify rule-type as object-mapping. This rule specifies what
type of object mapping you want to use.
{ "rules": [
{
"rule-type": "object-mapping",
"rule-id": "<id>",
"rule-name": "<name>",
"rule-action": "<valid object-mapping rule action>",
"object-locator": {
"schema-name": "<case-sensitive schema name>",
"table-name": ""
}
}
]
}
Use map-record-to-record when migrating from a relational database to a Kinesis data stream. This
rule type uses the taskResourceId.schemaName.tableName value from the relational database
as the partition key in the Kinesis data stream and creates an attribute for each column in the source
database. When using map-record-to-record, for any column in the source table not listed in the
exclude-columns attribute list, AWS DMS creates a corresponding attribute in the target stream.
This corresponding attribute is created regardless of whether that source column is used in an attribute
mapping.
One way to understand map-record-to-record is to see it in action. For this example, assume that
you are starting with a relational database table row with the following structure and data.
To migrate this information to a Kinesis data stream, you create rules to map the data to the target
stream. The following rule illustrates the mapping.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "object-mapping",
"rule-id": "1",
"rule-name": "DefaultMapToKinesis",
"rule-action": "map-record-to-document",
"object-locator": {
"schema-name": "Test",
"table-name": "Customer",
},
}
]
}
The following illustrates the resulting record format in the Kinesis data stream.
• StreamName: XXX
• PartitionKey: Test.Customers //schemaName.tableName
• Data: //The following JSON message
{
"FirstName": "Randy",
"LastName": "Marsh",
"StoreId": "5",
"HomeAddress": "221B Baker Street",
"HomePhone": "1234567890",
"WorkAddress": "31 Spooner Street, Quahog",
"WorkPhone": "9876543210",
"DateOfBirth": "02/29/1988"
}
You can also restructure the data and define your own partition key by using attribute mapping, as in the following object mapping.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "object-mapping",
"rule-id": "1",
"rule-name": "TransformToKinesis",
"rule-action": "map-record-to-document",
"object-locator": {
"schema-name": "Test",
"table-name": "Customer",
},
"mapping-parameters": {
"partition-key":{
"attribute-name": "CustomerName",
"value": "${FirstName},${LastName}"
},
"exclude-columns": [
"FirstName", "LastName", "HomeAddress", "HomePhone", "WorkAddress",
"WorkPhone"
],
"attribute-mappings": [
{
"attribute-name": "CustomerName",
"value": "${FirstName},${LastName}"
},
{
"attribute-name": "ContactDetails",
"value": {
"Home":{
"Address":"${HomeAddress}",
"Phone":${HomePhone}
},
"Work":{
"Address":"${WorkAddress}",
"Phone":${WorkPhone}
}
}
},
{
"attribute-name": "DateOfBirth",
"value": "${DateOfBirth}"
}
]
}
}
]
}
To set a constant value for partition-key, specify a partition-key value. For example, you might do
this to force all the data to be stored in a single shard. The following mapping illustrates this approach.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "object-mapping",
"rule-id": "1",
"rule-name": "TransformToKinesis",
"rule-action": "map-record-to-document",
"object-locator": {
"schema-name": "Test",
"table-name": "Customer",
},
"mapping-parameters": {
"partition-key":{
"value": "ConstantPartitionKey"
},
"exclude-columns": [
"FirstName", "LastName", "HomeAddress", "HomePhone", "WorkAddress",
"WorkPhone"
],
"attribute-mappings": [
{
"attribute-name": "CustomerName",
"value": "${FirstName},${LastName}"
},
{
"attribute-name": "ContactDetails",
"value": {
"Home":{
"Address":"${HomeAddress}",
"Phone":${HomePhone}
},
"Work":{
"Address":"${WorkAddress}",
"Phone":${WorkPhone}
}
}
},
{
"attribute-name": "DateOfBirth",
"value": "${DateOfBirth}"
}
]
}
}
]
}
RecordType
The record type can be either data or control. Data records represent the actual rows in the source.
Control records are for important events in the stream, for example a restart of the task.
Operation
The operation that the record represents, for example an insert, update, or delete for a data
record, or a task event for a control record.
SchemaName
The source schema for the record. This field can be empty for a control record.
TableName
The source table for the record. This field can be empty for a control record.
Timestamp
The timestamp for when the JSON message was constructed. The field is formatted with the ISO
8601 format.
Note
The partition-key value for a control record that is for a specific table is
TaskId.SchemaName.TableName. The partition-key value for a control record that is for a
specific task is that record's TaskId. Specifying a partition-key value in the object mapping
has no impact on the partition-key for a control record.
In Elasticsearch, you work with indexes and documents. An index is a collection of documents, and a
document is a JSON object containing scalar values, arrays, and other objects. Elasticsearch provides
a JSON-based query language, so that you can query data in an index and retrieve the corresponding
documents.
When AWS DMS creates indexes for a target endpoint for Amazon Elasticsearch Service, it creates one
index for each table from the source endpoint. The cost for creating an Elasticsearch index depends on
several factors. These are the number of indexes created, the total amount of data in these indexes, and
the small amount of metadata that Elasticsearch stores for each document.
You must use AWS Database Migration Service engine version 3.1.2 or higher to migrate data to Amazon
Elasticsearch Service.
Configure your Elasticsearch cluster with compute and storage resources that are appropriate for the
scope of your migration. We recommend that you consider the following factors, depending on the
replication task you want to use:
• For a full data load, consider the total amount of data that you want to migrate, and also the speed of
the transfer.
• For replicating ongoing changes, consider the frequency of updates, and your end-to-end latency
requirements.
Also, configure the index settings on your Elasticsearch cluster, paying close attention to the shard and
replica count.
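As an illustration, index settings that control the shard and replica count are expressed as a JSON document like the following. This is a sketch with illustrative values; you might apply it, for example, through an Elasticsearch index template before starting the migration.
{
"settings": {
"index": {
"number_of_shards": 4,
"number_of_replicas": 1
}
}
}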
AWS DMS supports migrating data of the following types when Amazon Elasticsearch Service is the target:
• Boolean
• Date
• Float
• Int
• String
AWS DMS converts data of type Date into type String. You can specify custom mapping to interpret
these dates.
To migrate data to Elasticsearch, create an IAM role that AWS DMS can assume. The role must have the following trust relationship.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "1",
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
The role that you use for the migration to Elasticsearch must have the following permissions.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"es:ESHttpDelete",
"es:ESHttpGet",
"es:ESHttpHead",
"es:ESHttpPost",
"es:es:ESHttpPut"
],
"Resource": "arn:aws:es:region:account-id:domain/domain-name/*"
}
]
}
In the preceding example, replace region with the AWS Region identifier, account-id with your AWS
account ID, and domain-name with the name of your Amazon Elasticsearch Service domain. An example
is arn:aws:es:us-west-2:123456789012:domain/my-es-domain
The following table describes the extra connection attributes available when using an Elasticsearch
instance as an AWS DMS target.
fullLoadErrorPercentage – Valid values: a positive integer greater than 0 but no larger than 100.
Default value: 10. For a full load task, this attribute determines the threshold of errors allowed
before the task fails. For example, suppose that there are 1,500 rows at the source endpoint and this
parameter is set to 10. Then the task fails if AWS DMS encounters more than 150 errors (10 percent of
the row count) when writing to the target endpoint.

errorRetryDuration – Valid values: a positive integer greater than 0. Default value: 300. If an error
occurs at the target endpoint, AWS DMS retries for this many seconds. Otherwise, the task fails.
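A sketch of creating an Elasticsearch target endpoint with the AWS CLI and setting these values through the Elasticsearch settings follows. The endpoint identifier, IAM role name, account ID, AWS Region, and domain endpoint are placeholders.
aws dms create-endpoint \
    --endpoint-identifier es-target \
    --endpoint-type target \
    --engine-name elasticsearch \
    --elasticsearch-settings ServiceAccessRoleArn=arn:aws:iam::account-id:role/dms-es-role,EndpointUri=search-my-es-domain-xxxx.us-west-2.es.amazonaws.com,FullLoadErrorPercentage=10,ErrorRetryDuration=300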
• AWS DMS only supports replication of tables with noncomposite primary keys. The primary key of the
source table must consist of a single column.
• Elasticsearch uses dynamic mapping (auto guess) to determine the data types to use for migrated
data.
• Elasticsearch stores each document with a unique ID. The following is an example ID.
"_id": "D359F8B537F1888BC71FE20B3D79EAE6674BE7ACA9B645B0279C7015F6FF19FD"
Each document ID is 64 bytes long, so anticipate this as a storage requirement. For example, if you
migrate 100,000 rows from an AWS DMS source, the resulting Elasticsearch index requires storage for
an additional 6,400,000 bytes.
• With Amazon ES, you can't make updates to the primary key attributes. This restriction is important
when using ongoing replication with change data capture (CDC) because it can result in unwanted
data in the target. In CDC mode, primary keys are mapped to SHA256 values, which are 32 bytes long.
These are converted to human-readable 64-byte strings, and are used as Elasticsearch document IDs.
• If AWS DMS encounters any items that can't be migrated, it writes error messages to Amazon
CloudWatch Logs. This behavior differs from that of other AWS DMS target endpoints, which write
errors to an exceptions table.
AWS DMS Data Type – Elasticsearch Data Type
Boolean – boolean
Date – string
Time – date
Timestamp – date
INT4 – integer
Real4 – float
UINT4 – integer
For additional information about AWS DMS data types, see Data Types for AWS Database Migration
Service (p. 319).
You can use AWS DMS to replicate source data to Amazon DocumentDB databases, collections, or
documents.
If the source endpoint is MongoDB, make sure to enable the following extra connection attributes:
• nestingLevel=NONE
• extractDocID=FALSE
For more information, see Extra Connection Attributes When Using MongoDB as a Source for AWS
DMS (p. 135).
MongoDB stores data in a binary JSON format (BSON). AWS DMS supports all of the BSON data types
that are supported by Amazon DocumentDB. For a list of these data types, see Supported MongoDB
APIs, Operations, and Data Types in the Amazon DocumentDB Developer Guide.
If the source endpoint is a relational database, AWS DMS maps database objects to Amazon
DocumentDB as follows:
If the source endpoint is Amazon S3, then the resulting Amazon DocumentDB objects correspond to AWS
DMS mapping rules for Amazon S3. For example, consider the following URI.
s3://mybucket/hr/employee
In this case, AWS DMS maps the objects in mybucket to Amazon DocumentDB as follows:
For more information on mapping rules for Amazon S3, see Using Amazon Simple Storage Service as a
Source for AWS DMS (p. 138).
For additional details on working with Amazon DocumentDB as a target for AWS DMS, including a
walkthrough of the migration process, see the following sections.
Topics
• Mapping Data from a Source to an Amazon DocumentDB Target (p. 199)
• Ongoing Replication with Amazon DocumentDB as a Target (p. 202)
• Limitations to Using Amazon DocumentDB as a Target (p. 203)
• Target Data Types for Amazon DocumentDB (p. 203)
• Walkthrough: Migrating from MongoDB to Amazon DocumentDB (p. 204)
If the resulting JSON document contains a field named _id, then that field is used as the unique _id in
Amazon DocumentDB.
If the JSON doesn't contain an _id field, then Amazon DocumentDB generates an _id value
automatically.
• If one of the columns is named _id, then the data in that column is used as the target_id.
• If there is no _id column, but the source data has a primary key or a unique index, then AWS DMS uses
that key or index value as the _id value. The data from the primary key or unique index also appears
as explicit fields in the JSON document.
• If there is no _id column, and no primary key or a unique index, then Amazon DocumentDB generates
an _id value automatically.
To coerce a data type, you can prefix the source column name with json_ (that is, json_columnName)
either manually or using a transformation. In this case, the column is created as a nested JSON document
within the target document, rather than as a string field.
For example, suppose that you want to migrate the following document from a MongoDB source
endpoint.
{
"_id": "1",
"FirstName": "John",
"LastName": "Doe",
"ContactDetails": "{\"Home\": {\"Address\": \"Boston\",\"Phone\": \"1111111\"},\"Work\":
{ \"Address\": \"Boston\", \"Phone\": \"2222222222\"}}"
}
If you don't coerce any of the source data types, the embedded ContactDetails document is migrated
as a string.
{
"_id": "1",
"FirstName": "John",
"LastName": "Doe",
"ContactDetails": "{\"Home\": {\"Address\": \"Boston\",\"Phone\": \"1111111\"},\"Work
\": { \"Address\": \"Boston\", \"Phone\": \"2222222222\"}}"
}
However, you can add a transformation rule to coerce ContactDetails to a JSON object. For example,
suppose that the original source column name is ContactDetails. Suppose also that the renamed
source column is to be json_ContactDetails. AWS DMS replicates the ContactDetails field as
nested JSON, as follows.
{
"_id": "1",
"FirstName": "John",
"LastName": "Doe",
"ContactDetails": {
"Home": {
"Address": "Boston",
"Phone": "1111111111"
},
"Work": {
"Address": "Boston",
"Phone": "2222222222"
}
}
}
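A sketch of a transformation rule that performs this rename follows. The schema (database) and table (collection) names are placeholders for your source naming, and the rule ID is arbitrary.
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "coerce-contact-details",
"rule-target": "column",
"object-locator": {
"schema-name": "test",
"table-name": "customer",
"column-name": "ContactDetails"
},
"rule-action": "rename",
"value": "json_ContactDetails"
}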
To coerce a data type, you can prefix a column name with array_ (that is, array_columnName), either
manually or using a transformation. In this case, AWS DMS considers the column as a JSON array, and
creates it as such in the target document.
Suppose that you want to migrate the following document from a MongoDB source endpoint.
{
"_id" : "1",
"FirstName": "John",
"LastName": "Doe",
"ContactAddresses": ["Boston", "New York"],
"ContactPhoneNumbers": ["1111111111", "2222222222"]
}
If you don't coerce any of the source data types, the embedded ContactAddresses and
ContactPhoneNumbers values are migrated as strings.
{
"_id": "1",
"FirstName": "John",
"LastName": "Doe",
"ContactAddresses": "[\"Boston\", \"New York\"]",
"ContactPhoneNumbers": "[\"1111111111\", \"2222222222\"]"
}
However, you can add transformation rules to coerce ContactAddress and ContactPhoneNumbers to
JSON arrays, as shown in the following table.
ContactAddress – array_ContactAddress
ContactPhoneNumbers – array_ContactPhoneNumbers
AWS DMS then replicates these fields as arrays, as follows.
{
"_id": "1",
"FirstName": "John",
"LastName": "Doe",
"ContactAddresses": [
"Boston",
"New York"
],
"ContactPhoneNumbers": [
"1111111111",
"2222222222"
]
}
You can download the public key for Amazon DocumentDB as the rds-combined-ca-bundle.pem file from
an AWS-hosted Amazon S3 bucket.
After you download this .pem file, you can import the file into AWS DMS as described following.
• For Certificate identifier, enter a unique name for the certificate, for example docdb-cert.
• For Import file, navigate to the location where you saved the .pem file.
When the settings are as you want them, choose Add new CA certificate.
AWS CLI
Use the aws dms import-certificate command, as shown in the following example.
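A sketch of that command, assuming that you saved the bundle as rds-combined-ca-bundle.pem in the current directory and want to name the certificate docdb-cert:
aws dms import-certificate \
    --certificate-identifier docdb-cert \
    --certificate-pem file://rds-combined-ca-bundle.pem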
When you create an AWS DMS target endpoint, provide the certificate identifier (for example, docdb-
cert). Also, set the SSL mode parameter to verify-full.
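A sketch of creating the target endpoint with the AWS CLI follows. The server name, credentials, database name, and certificate ARN are placeholders, and the docdb engine name assumes a replication engine version that supports it.
aws dms create-endpoint \
    --endpoint-identifier docdb-target \
    --endpoint-type target \
    --engine-name docdb \
    --server-name mydocdb-cluster.cluster-xxxxxxxxxxxx.us-west-2.docdb.amazonaws.com \
    --port 27017 \
    --username master-user \
    --password master-password \
    --database-name mydatabase \
    --ssl-mode verify-full \
    --certificate-arn arn:aws:dms:us-west-2:account-id:cert:docdb-cert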
• If the source record has a column named _id, the value of that column determines the corresponding
_id in the Amazon DocumentDB collection.
• If there is no _id column, but the source data has a primary key or unique index, then AWS DMS uses
that key or index value as the _id for the Amazon DocumentDB collection.
• If the source record doesn't have an _id column, a primary key, or a unique index, then AWS DMS
matches all of the source columns to the corresponding fields in the Amazon DocumentDB collection.
When a new source record is created, AWS DMS writes a corresponding document to Amazon
DocumentDB. If an existing source record is updated, AWS DMS updates the corresponding fields in the
target document in Amazon DocumentDB. Any fields that exist in the target document but not in the
source record remain untouched.
When a source record is deleted, AWS DMS deletes the corresponding document from Amazon
DocumentDB.
Statement that adds a column to a table (ALTER TABLE...ADD and similar) – The DDL statement is
ignored, and a warning is issued. When the first INSERT is performed at the source, the new field is
added to the target document.
Statement that changes the column data type (ALTER COLUMN...MODIFY and similar) – The DDL statement
is ignored, and a warning is issued. When the first INSERT is performed at the source with the new
data type, the target document is created with a field of that new data type.
• In Amazon DocumentDB, collection names can't contain the dollar symbol ($). In addition, database
names can't contain any Unicode characters.
• AWS DMS doesn't support merging of multiple source tables into a single Amazon DocumentDB
collection.
• When AWS DMS processes changes from a source table that doesn't have a primary key, any LOB
columns in that table are ignored.
• If the Change table option is enabled and AWS DMS encounters a source column named "_id", then
that column appears as "__id" (two underscores) in the change table.
• If you choose Oracle as a source endpoint, then the Oracle source must have full supplemental logging
enabled. Otherwise, if there are columns at the source that weren't changed, then the data is loaded
into Amazon DocumentDB as null values.
AWS DMS Data Type – Amazon DocumentDB Data Type
BOOLEAN – Boolean
DATE – Date
DATETIME – Date
REAL4 – Double
REAL8 – Double
BLOB – Binary
Important
Before you begin, make sure to launch an Amazon DocumentDB cluster in your default virtual
private cloud (VPC). For more information, see Getting Started in the Amazon DocumentDB
Developer Guide.
Topics
• Step 1: Launch an Amazon EC2 Instance (p. 205)
• Step 2: Install and Configure MongoDB Community Edition (p. 206)
• Step 3: Create an AWS DMS Replication Instance (p. 207)
• Step 4: Create Source and Target Endpoints (p. 208)
• Step 5: Create and Run a Migration Task (p. 209)
a. On the Choose an Amazon Machine Image (AMI) page, at the top of the list of AMIs, go to
Amazon Linux AMI and choose Select.
b. On the Choose an Instance Type page, at the top of the list of instance types, choose t2.micro.
Then choose Next: Configure Instance Details.
c. On the Configure Instance Details page, for Network, choose your default VPC. Then choose
Next: Add Storage.
d. On the Add Storage page, skip this step by choosing Next: Add Tags.
e. On the Add Tags page, skip this step by choosing Next: Configure Security Group.
f. On the Configure Security Group page, do the following:
• If you don't have an Amazon EC2 key pair, choose Create a new key pair and follow the
instructions. You are asked to download a private key file (.pem file). You need this file later when
you log in to your Amazon EC2 instance.
• If you already have an Amazon EC2 key pair, for Select a key pair choose your key pair from
the list. You must already have the private key file (.pem file) available in order to log in to your
Amazon EC2 instance.
In the console navigation pane, choose EC2 Dashboard, and then choose the instance that you
launched. In the lower pane, on the Description tab, find the Public DNS location for your instance,
for example: ec2-11-22-33-44.us-west-2.compute.amazonaws.com.
It takes a few minutes for your Amazon EC2 instance to become available.
5. Use the ssh command to log in to your Amazon EC2 instance, as in the following example.
Specify your private key file (.pem file) and the public DNS name of your EC2 instance. The login ID is
ec2-user. No password is required.
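For example, assuming a key pair file named my-key-pair.pem and the public DNS name shown earlier, the command might look like the following.
ssh -i my-key-pair.pem ec2-user@ec2-11-22-33-44.us-west-2.compute.amazonaws.com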
For further details about connecting to your EC2 instance, see Connecting to Your Linux Instance
Using SSH in the Amazon EC2 User Guide for Linux Instances.
1. Go to Install MongoDB Community Edition on Amazon Linux in the MongoDB documentation and
follow the instructions there.
2. By default, the MongoDB server (mongod) only allows loopback connections from IP address
127.0.0.1 (localhost). To allow connections from elsewhere in your Amazon VPC, do the following:
a. Edit the /etc/mongod.conf file and look for the following lines.
# network interfaces
net:
port: 27017
bindIp: 127.0.0.1 # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or,
alternatively, use the net.bindIpAll setting.
b. Modify the bindIp line so that it reads as follows.
bindIp: public-dns-name
c. Replace public-dns-name with the actual public DNS name for your instance, for example
ec2-11-22-33-44.us-west-2.compute.amazonaws.com.
d. Save the /etc/mongod.conf file, and then restart mongod.
a. Use the wget command to download a JSON file containing sample data.
wget http://media.mongodb.org/zips.json
b. Use the mongoimport command to import the data into a new database (zips-db).
c. After the import completes, use the mongo shell to connect to MongoDB and verify that the
data was loaded successfully.
d. Replace public-dns-name with the actual public DNS name for your instance.
e. At the mongo shell prompt, enter the following commands.
use zips-db
db.zips.count()
db.zips.aggregate( [
{ $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
{ $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } }
] )
exit
When the settings are as you want them, choose Create replication instance.
Note
You can begin using your replication instance when its status becomes available. This can take
several minutes.
When the settings are as you want them, choose Create endpoint.
Next, you create a target endpoint. This endpoint is for your Amazon DocumentDB cluster, which should
already be running. For more information on launching your Amazon DocumentDB cluster, see Getting
Started in the Amazon DocumentDB Developer Guide.
Important
Before you proceed, do the following:
• Have available the master user name and password for your Amazon DocumentDB cluster.
• Have available the DNS name and port number of your Amazon DocumentDB cluster, so
that AWS DMS can connect to it. To determine this information, use the following AWS CLI
command, replacing cluster-id with the name of your Amazon DocumentDB cluster.
• Download a certificate bundle that Amazon DocumentDB can use to verify SSL connections.
To do this, enter the following command.
wget https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem
When the settings are as you want them, choose Create endpoint.
Now that you've created the source and target endpoints, test them to ensure that they work correctly.
Also, to ensure that AWS DMS can access the database objects at each endpoint, refresh the endpoints'
schemas.
To test an endpoint
If the Status changes to failed instead, review the failure message. Correct any errors that might be
present, and test the endpoint again.
Note
Repeat this procedure for the target endpoint (docdb-target).
To refresh schemas
Note
Repeat this procedure for the target endpoint (docdb-target).
• For Task name, enter a name that's easy to remember, for example my-dms-task.
• For Replication instance, choose the replication instance that you created in Step 3: Create an
AWS DMS Replication Instance (p. 207).
• For Source endpoint, choose the source endpoint that you created in Step 4: Create Source and
Target Endpoints (p. 208).
• For Target endpoint, choose the target endpoint that you created in Step 4: Create Source and
Target Endpoints (p. 208).
• For Migration type, choose Migrate existing data.
• For Start task on create, enable this option.
In the Task Settings section, keep all of the options at their default values.
In the Table mappings section, choose the Guided tab, and then enter the following information:
When the settings are as you want them, choose Create task.
AWS DMS now begins migrating data from MongoDB to Amazon DocumentDB. The task status changes
from Starting to Running. You can monitor the progress by choosing Tasks in the AWS DMS console.
After several minutes, the status changes to Load complete.
Note
After the migration is complete, you can use the mongo shell to connect to your Amazon
DocumentDB cluster and view the zips data. For more information, see Access Your Amazon
DocumentDB Cluster Using the mongo Shell in the Amazon DocumentDB Developer Guide.
The procedure following assumes that you have chosen the AWS DMS console wizard. Note that you can
also do this step by selecting Endpoints from the AWS DMS console's navigation pane and then selecting
Create endpoint. When using the console wizard, you create both the source and target endpoints on
the same page. When not using the console wizard, you create each endpoint separately.
1. On the Connect source and target database endpoints page, specify your connection information
for the source or target database. The following table describes the settings.
Select RDS DB Instance Choose this option if the endpoint is an Amazon RDS DB
instance.
Endpoint identifier Type the name you want to use to identify the endpoint.
You might want to include in the name the type of
endpoint, such as oracle-source or PostgreSQL-target.
The name must be unique for all replication
instances.
Source engine and Target engine Choose the type of database engine that is the endpoint.
User name Type the user name with the permissions required to
allow data migration. For information on the permissions
required, see the security section for the source or target
database engine in this user guide.
Password Type the password for the account with the required
permissions. If you want to use special characters in your
password, such as "+" or "&", enclose the entire password
in curly braces "{}".
Database name The name of the database you want to use as the
endpoint.
2. Choose the Advanced tab, shown following, to set values for connection string and encryption key if
you need them. You can test the endpoint connection by choosing Run test.
Extra connection attributes Type any additional connection parameters here. For
more information about extra connection attributes, see
the documentation section for your data store.
KMS master key Choose the encryption key to use to encrypt replication
storage and connection information. If you choose
(Default) aws/dms, the default AWS Key Management
Service (AWS KMS) key associated with your account
and region is used. For more information on using the
encryption key, see Setting an Encryption Key and
Specifying KMS Permissions (p. 44).
Test endpoint connection (optional) Add the VPC and replication instance name. To test the
connection, choose Run test.
• Before you can create a task, you must create a source endpoint, a target endpoint, and a replication
instance.
• You can specify many task settings to tailor your migration task. You can set these by using the AWS
Management Console, AWS Command Line Interface (AWS CLI), or AWS DMS API. These settings
include specifying how migration errors are handled, error logging, and control table information.
• After you create a task, you can run it immediately. The target tables with the necessary metadata
definitions are automatically created and loaded, and you can specify ongoing replication.
• By default, AWS DMS starts your task as soon as you create it. However, in some situations, you might
want to postpone the start of the task. For example, when using the AWS CLI, you might have a
process that creates a task and a different process that starts the task based on some triggering event.
As needed, you can postpone your task's start.
• You can monitor, stop, or restart tasks using the AWS DMS console, AWS CLI, or AWS DMS API.
The following are actions that you can take when working with an AWS DMS task.

Creating a Task Assessment Report – Creating a Task Assessment Report (p. 215)
Creating an Ongoing Replication Task – You can set up a task to provide continuous replication
between the source and target. For more information, see Creating Tasks for Ongoing Replication
Using AWS DMS (p. 239).
Applying Task Settings – Each task has settings that you can configure according to the needs of
your database migration. You create these settings in a JSON file or, with some settings, you can
specify the settings using the AWS DMS console. For more information, see Specifying Task Settings
for AWS Database Migration Service Tasks (p. 224).
Reloading Tables During a Task – Reloading Tables During a Task (p. 242)
Managing Task Logs – Managing AWS DMS Task Logs (p. 267)
The task assessment report includes a summary that lists the unsupported data types and the column
count for each one. It includes a list of data structures in JSON for each unsupported data type. You can
use the report to modify the source data types and improve the migration success.
There are two levels of unsupported data types. Data types that are shown on the report as “not
supported” can’t be migrated. Data types that are shown on the report as “partially supported” might be
converted to another data type and not migrate as you expect.
{
"summary":{
"task-name":"test15",
"not-supported":{
"data-type": [
"sql-variant"
],
"column-count":3
},
"partially-supported":{
"data-type":[
"float8",
"jsonb"
],
"column-count":2
}
},
"types":[
{
"data-type":"float8",
"support-level":"partially-supported",
"schemas":[
{
"schema-name":"schema1",
"tables":[
{
"table-name":"table1",
"columns":[
"column1",
"column2"
]
},
{
"table-name":"table2",
"columns":[
"column3",
"column4"
]
}
]
},
{
"schema-name":"schema2",
"tables":[
{
"table-name":"table3",
"columns":[
"column5",
"column6"
]
},
{
"table-name":"table4",
"columns":[
"column7",
"column8"
]
}
]
}
]
},
{
"datatype":"int8",
"support-level":"partially-supported",
"schemas":[
{
"schema-name":"schema1",
"tables":[
{
"table-name":"table1",
"columns":[
"column9",
"column10"
]
},
{
"table-name":"table2",
"columns":[
"column11",
"column12"
]
}
]
}
]
}
]
}
You can view the latest task assessment report from the Assessment tab on the Tasks page on the AWS
console. AWS DMS stores previous task assessment reports in an Amazon S3 bucket. The Amazon S3
bucket name is in the following format.
dms-<customerId>-<customerDNS>
The report is stored in the bucket in a folder named with the task name. The report’s file name is the
date of the assessment in the format yyyy-mm-dd-hh-mm. You can view and compare previous task
assessment reports from the Amazon S3 console.
AWS DMS also creates an AWS Identity and Access Management (IAM) role to allow access to the S3
bucket; the role name is dms-access-for-tasks. The role uses the AmazonDMSRedshiftS3Role policy.
You can enable the task assessment feature using the AWS console, the AWS CLI, or the DMS API:
• On the console, choose Task Assessment when creating or modifying a task. To view the task
assessment report using the console, choose the task on the Tasks page and choose the Assessment
results tab in the details section.
• The CLI commands are start-replication-task-assessment to begin a task assessment and
describe-replication-task-assessment-results to receive the task assessment report in
JSON format.
• The AWS DMS API uses the StartReplicationTaskAssessment action to begin a task assessment
and the DescribeReplicationTaskAssessment action to receive the task assessment report in
JSON format.
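For example, the CLI commands named in the preceding list might be run as follows. The replication task ARN is a placeholder.
aws dms start-replication-task-assessment \
    --replication-task-arn arn:aws:dms:us-west-2:account-id:task:task-id

aws dms describe-replication-task-assessment-results \
    --replication-task-arn arn:aws:dms:us-west-2:account-id:task:task-id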
Creating a Task
There are several things you must do to create an AWS DMS migration task:
• Create a source endpoint, a target endpoint, and a replication instance before you create a migration
task.
• Select a migration method:
• Migrating Data to the Target Database – This process creates files or tables in the target database
and automatically defines the metadata that is required at the target. It also populates the tables
with data from the source. The data from the tables is loaded in parallel for improved efficiency. This
process is the Migrate existing data option in the AWS console and is called Full Load in the API.
• Capturing Changes During Migration – This process captures changes to the source database
that occur while the data is being migrated from the source to the target. When the migration of
the originally requested data has completed, the change data capture (CDC) process then applies
the captured changes to the target database. Changes are captured and applied as units of single
committed transactions, and you can update several different target tables as a single source
commit. This approach guarantees transactional integrity in the target database. This process is
the Migrate existing data and replicate ongoing changes option in the AWS console and is called
full-load-and-cdc in the API.
• Replicating Only Data Changes on the Source Database – This process reads the recovery log
file of the source database management system (DBMS) and groups together the entries for each
transaction. In some cases, AWS DMS can't apply changes to the target within a reasonable time
(for example, if the target is not accessible). In these cases, AWS DMS buffers the changes on the
replication server for as long as necessary. It doesn't reread the source DBMS logs, which can take
a large amount of time. This process is the Replicate data changes only option in the AWS DMS
console.
• Determine how the task should handle large binary objects (LOBs) on the source. For more
information, see Setting LOB Support for Source Databases in a AWS DMS Task (p. 238).
• Specify migration task settings. These include setting up logging, specifying what data is written to
the migration control table, how errors are handled, and other settings. For more information about
task settings, see Specifying Task Settings for AWS Database Migration Service Tasks (p. 224).
• Set up table mapping to define rules to select and filter data that you are migrating. For more
information about table mapping, see Using Table Mapping to Specify Task Settings (p. 245).
Before you specify your mapping, make sure that you review the documentation section on data type
mapping for your source and your target database.
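If you create the task with the AWS CLI instead of the console, the choices in the preceding list correspond to parameters of the create-replication-task command. The following is a sketch; the ARNs and file names are placeholders.
aws dms create-replication-task \
    --replication-task-identifier my-dms-task \
    --source-endpoint-arn arn:aws:dms:us-west-2:account-id:endpoint:source-id \
    --target-endpoint-arn arn:aws:dms:us-west-2:account-id:endpoint:target-id \
    --replication-instance-arn arn:aws:dms:us-west-2:account-id:rep:instance-id \
    --migration-type full-load-and-cdc \
    --table-mappings file://table-mappings.json \
    --replication-task-settings file://task-settings.json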
You can choose to start a task as soon as you finish specifying information for that task on the Create
task page. Alternatively, you can start the task from the Dashboard page after you finish specifying task
information.
The procedure following assumes that you have chosen the AWS DMS console wizard and specified
replication instance information and endpoints using the console wizard. You can also do this step by
selecting Tasks from the AWS DMS console's navigation pane and then selecting Create task.
1. On the Create Task page, specify the task options. The following table describes the settings.
Migration type Choose the migration method you want to use. You can
choose to have just the existing data migrated to the
target database or have ongoing changes sent to the
target database in addition to the migrated data.
Start task on create When this option is selected, the task begins as soon as it
is created.
2. Choose the Task Settings tab, shown following, and specify values for your target table, LOB
support, and to enable logging. The task settings shown depend on the Migration type value you
select. For example, when you select Migrate existing data, the following options are shown:
Target table preparation mode Do nothing – In Do nothing mode, AWS DMS assumes
that the target tables have been pre-created on the
target. If the migration is a full load or full load plus
CDC, you must ensure that the target tables are empty
before starting the migration. Do nothing mode is an
appropriate choice for CDC-only tasks when the target
tables have been pre-backfilled from the source, and
ongoing replication is applied to keep the source and
target in sync. You can use the AWS Schema Conversion
Tool (AWS SCT) to pre-create target tables for you.
Include LOB columns in replication Don't include LOB columns – LOB columns are excluded
from the migration.
Max LOB size (kb) In Limited LOB Mode, LOB columns that exceed the
setting of Max LOB size are truncated to the specified
Max LOB Size.
When you select Migrate existing data and replicate ongoing changes for Migration type, the
following options are shown:
Target table preparation mode Do nothing – Data and metadata of the target tables are
not changed.
Stop task after full load completes Don't stop – Don't stop the task but immediately apply
cached changes and continue on.
Include LOB columns in replication Don't include LOB columns – LOB columns are excluded
from the migration.
Max LOB size (KB) In Limited LOB Mode, LOB columns that exceed the
setting of Max LOB Size are truncated to the specified
Max LOB Size.
3. Choose the Table mappings tab, shown following, to set values for schema mapping and the
mapping method. If you choose Custom, you can specify the target schema and table values. For
more information about table mapping, see Using Table Mapping to Specify Task Settings (p. 245).
4. After you have finished with the task settings, choose Create task.
Topics
• Target Metadata Task Settings (p. 227)
• Full Load Task Settings (p. 228)
• Logging Task Settings (p. 228)
• Parallel Loading of Tables (p. 229)
• Control Table Task Settings (p. 230)
• Stream Buffer Task Settings (p. 232)
• Change Processing Tuning Settings (p. 233)
• Data Validation Task Settings (p. 234)
• Task Settings for Change Processing DDL Handling (p. 234)
• Error Handling Task Settings (p. 234)
• Saving Task Settings (p. 237)
{
"TargetMetadata": {
"TargetSchema": "",
"SupportLobs": true,
"FullLobMode": false,
"LobChunkSize": 64,
"LimitedSizeLobMode": true,
"LobMaxSize": 32,
"BatchApplyEnabled": true
},
"FullLoadSettings": {
"TargetTablePrepMode": "DO_NOTHING",
"CreatePkAfterFullLoad": false,
"StopTaskCachedChangesApplied": false,
"StopTaskCachedChangesNotApplied": false,
"MaxFullLoadSubTasks": 8,
"TransactionConsistencyTimeout": 600,
"CommitRate": 10000
},
"Logging": {
"EnableLogging": false
},
"ControlTablesSettings": {
"ControlSchema":"",
"HistoryTimeslotInMinutes":5,
"HistoryTableEnabled": false,
"SuspendedTablesTableEnabled": false,
"StatusTableEnabled": false
},
"StreamBufferSettings": {
"StreamBufferCount": 3,
"StreamBufferSizeInMB": 8
},
"ChangeProcessingTuning": {
"BatchApplyPreserveTransaction": true,
"BatchApplyTimeoutMin": 1,
"BatchApplyTimeoutMax": 30,
"BatchApplyMemoryLimit": 500,
"BatchSplitSize": 0,
"MinTransactionSize": 1000,
"CommitTimeout": 1,
"MemoryLimitTotal": 1024,
"MemoryKeepTime": 60,
"StatementCacheSize": 50
},
"ChangeProcessingDdlHandlingPolicy": {
"HandleSourceTableDropped": true,
"HandleSourceTableTruncated": true,
"HandleSourceTableAltered": true
},
"ValidationSettings": {
"EnableValidation": false,
"ThreadCount": 5
},
"ErrorBehavior": {
"DataErrorPolicy": "LOG_ERROR",
"DataTruncationErrorPolicy":"LOG_ERROR",
"DataErrorEscalationPolicy":"SUSPEND_TABLE",
"DataErrorEscalationCount": 50,
"TableErrorPolicy":"SUSPEND_TABLE",
"TableErrorEscalationPolicy":"STOP_TASK",
"TableErrorEscalationCount": 50,
"RecoverableErrorCount": 0,
"RecoverableErrorInterval": 5,
"RecoverableErrorThrottling": true,
"RecoverableErrorThrottlingMax": 1800,
"ApplyErrorDeletePolicy":"IGNORE_RECORD",
"ApplyErrorInsertPolicy":"LOG_ERROR",
"ApplyErrorUpdatePolicy":"LOG_ERROR",
"ApplyErrorEscalationPolicy":"LOG_ERROR",
"ApplyErrorEscalationCount": 0,
"FullLoadIgnoreConflicts": true
}
• TargetSchema – The target table schema name. If this metadata option is empty, the schema from
the source table is used. AWS DMS automatically adds the owner prefix for the target database to
all tables if no source schema is defined. This option should be left empty for MySQL-type target
endpoints.
• LOB settings – Settings that determine how large objects (LOBs) are managed. If you set
SupportLobs=true, you must set one of the following to true:
• FullLobMode – If you set this option to true, then you must enter a value for the LobChunkSize
option. Enter the size, in kilobytes, of the LOB chunks to use when replicating the data to the target.
The FullLobMode option works best for very large LOB sizes but tends to cause slower loading.
• InlineLobMaxSize – This value determines which LOBs AWS Database Migration Service
transfers inline during a full load. Transferring small LOBs is more efficient than looking them
up from a source table. During a full load, AWS Database Migration Service checks all LOBs and
performs an inline transfer for the LOBs that are smaller than InlineLobMaxSize. AWS Database
Migration Service transfers all LOBs larger than the InlineLobMaxSize in FullLobMode. The
default value for InlineLobMaxSize is 0 and the range is 1 kilobyte–2 gigabyte. Set a value for
InlineLobMaxSize only if you know that most of the LOBs are smaller than the value specified in
InlineLobMaxSize.
• LimitedSizeLobMode – If you set this option to true, then you must enter a value for the
LobMaxSize option. Enter the maximum size, in kilobytes, for an individual LOB.
• LoadMaxFileSize – An option for PostgreSQL and MySQL target endpoints that defines the
maximum size on disk of stored, unloaded data, such as CSV files. This option overrides the connection
attribute. You can provide values from 0, which indicates that this option doesn't override the
connection attribute, to 100,000 KB.
• BatchApplyEnabled – Determines if each transaction is applied individually or if changes are
committed in batches. The default value is false.
When LOB columns are included in the replication, BatchApplyEnabled can only be used in Limited-size LOB mode.
• ParallelLoadThreads – Specifies the number of threads AWS DMS uses to load each table into the
target database. The maximum value for a MySQL target is 16; the maximum value for a DynamoDB
target is 32. The maximum limit can be increased upon request.
• ParallelLoadBufferSize – Specifies the maximum number of records to store in the buffer used
by the parallel load threads to load data to the target. The default value is 50. Maximum value is 1000.
This field is currently only valid when DynamoDB is the target. This parameter should be used with
ParallelLoadThreads and is valid only when ParallelLoadThreads > 1.
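As an illustration, a TargetMetadata section for a DynamoDB target that uses parallel load might look like the following sketch. The thread and buffer values here are examples only, not recommendations.
"TargetMetadata": {
    "TargetSchema": "",
    "SupportLobs": true,
    "LimitedSizeLobMode": true,
    "LobMaxSize": 32,
    "ParallelLoadThreads": 8,
    "ParallelLoadBufferSize": 500
}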
• To indicate how to handle loading the target at full-load startup, specify one of the following values
for the TargetTablePrepMode option:
• DO_NOTHING – Data and metadata of the existing target table are not affected.
• DROP_AND_CREATE – The existing table is dropped and a new table is created in its place.
• TRUNCATE_BEFORE_LOAD – Data is truncated without affecting the table metadata.
• To delay primary key or unique index creation until after full load completes, set the
CreatePkAfterFullLoad option.
When this option is selected, you cannot resume incomplete full load tasks.
• For full load and CDC-enabled tasks, you can set the following Stop task after full load
completes options:
• StopTaskCachedChangesApplied – Set this option to true to stop a task after a full load
completes and cached changes are applied.
• StopTaskCachedChangesNotApplied – Set this option to true to stop a task before cached
changes are applied.
• MaxFullLoadSubTasks – Set this option to indicate the maximum number of tables to load in
parallel. The default is 8; the maximum value is 50.
• To set the number of seconds that AWS DMS waits for transactions to close before
beginning a full-load operation, if transactions are open when the task starts, set the
TransactionConsistencyTimeout option. The default value is 600 (10 minutes). AWS DMS begins
the full load after the timeout value is reached, even if there are open transactions. A full-load-only
task doesn't wait for 10 minutes but instead starts immediately.
• To indicate the maximum number of events that can be transferred together, set the CommitRate
option.
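Taken together, these options make up the FullLoadSettings section of the task settings JSON. The following sketch uses illustrative values only.
"FullLoadSettings": {
    "TargetTablePrepMode": "DROP_AND_CREATE",
    "CreatePkAfterFullLoad": false,
    "StopTaskCachedChangesApplied": false,
    "StopTaskCachedChangesNotApplied": false,
    "MaxFullLoadSubTasks": 8,
    "TransactionConsistencyTimeout": 600,
    "CommitRate": 10000
}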
There are several ways to enable Amazon CloudWatch logging. You can select the EnableLogging
option on the AWS Management Console when you create a migration task or set the EnableLogging
option to true when creating a task using the AWS DMS API. You can also specify "EnableLogging":
true in the JSON of the logging section of task settings.
To delete the task logs, you can specify "DeleteTaskLogs": true in the JSON of the logging section
of task settings.
After you specify a component activity, you can then specify the amount of information that is logged. The following severity values are listed in order from the lowest level of information to the highest level of information; the higher levels always include information from the lower levels:
• LOGGER_SEVERITY_ERROR
• LOGGER_SEVERITY_WARNING
• LOGGER_SEVERITY_INFO
• LOGGER_SEVERITY_DEFAULT
• LOGGER_SEVERITY_DEBUG
• LOGGER_SEVERITY_DETAILED_DEBUG
For example, the following JSON section gives task settings for logging for all component activities.
…
"Logging": {
"EnableLogging": true,
"LogComponents": [{
"Id": "SOURCE_UNLOAD",
"Severity": "LOGGER_SEVERITY_DEFAULT"
},{
"Id": "SOURCE_CAPTURE",
"Severity": "LOGGER_SEVERITY_DEFAULT"
},{
"Id": "TARGET_LOAD",
"Severity": "LOGGER_SEVERITY_DEFAULT"
},{
"Id": "TARGET_APPLY",
"Severity": "LOGGER_SEVERITY_INFO"
},{
"Id": "TASK_MANAGER",
"Severity": "LOGGER_SEVERITY_DEBUG"
}]
},
…
To use parallel loading, you create a rule of type table-settings with the parallel-load option.
Within the table-settings rule, you specify the selection criteria for the table or tables that you want
to load in parallel. To specify the selection criteria, set the type element for parallel-load to one of
the following:
• partitions-auto
• subpartitions-auto
• none
The following example illustrates how to create a table-settings rule to load table partitions in
parallel.
{
"rules": [{
"rule-type": "table-settings",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "table1"
},
"parallel-load": {
"type": "partitions-auto"
}
}]
}
• Replication Status (dmslogs.awsdms_status) – This table provides details about the current task.
These include task status, amount of memory consumed by the task, and the number of changes not
yet applied to the target. This table also gives the position in the source database where AWS DMS is
currently reading and indicates if the task is a full load or change data capture (CDC).
• Suspended Tables (dmslogs.awsdms_suspended_tables) – This table provides a list of suspended
tables as well as the reason they were suspended.
• Replication History (dmslogs.awsdms_history) – This table provides information about replication
history. This information includes the number and volume of records processed during the task,
latency at the end of a CDC task, and other statistics.
The Replication Status (dmslogs.awsdms_status) table contains the current status of the task and the target database. The task activity recorded in this table is either FULL LOAD or CHANGE PROCESSING (CDC).
• ControlSchema – Use this option to indicate the database schema name for the AWS DMS target
Control Tables. If you do not enter any information in this field, then the tables are copied to the
default location in the database.
• HistoryTimeslotInMinutes – Use this option to indicate the length of each time slot in the
Replication History table. The default is 5 minutes.
• StreamBufferCount – Use this option to specify the number of data stream buffers for the
migration task. The default stream buffer number is 3. Increasing the value of this setting might
increase the speed of data extraction. However, this performance increase is highly dependent on the
migration environment, including the source system and instance class of the replication server. The
default is sufficient for most situations.
• StreamBufferSizeInMB – Use this option to indicate the maximum size of each data stream
buffer. The default size is 8 MB. You might need to increase the value for this option when you
work with very large LOBs. You also might need to increase the value if you receive a message in
the log files that the stream buffer size is insufficient. When calculating the size of this option, you
can use the following equation: [Max LOB size (or LOB chunk size)]*[number of LOB
The following settings apply only when the target metadata parameter BatchApplyEnabled is set to
true.
• BatchApplyPreserveTransaction – If set to false, there can be temporary lapses in transactional integrity to improve performance. There is no guarantee that all the changes within a transaction from the source are applied to the target in a single batch. The default is true.
• BatchApplyTimeoutMin – Sets the minimum amount of time in seconds that AWS DMS waits
between each application of batch changes. The default value is 1.
• BatchApplyTimeoutMax – Sets the maximum amount of time in seconds that AWS DMS waits
between each application of batch changes before timing out. The default value is 30.
• BatchApplyMemoryLimit – Sets the maximum amount of memory in (MB) to use for pre-processing
in Batch optimized apply mode. The default value is 500.
• BatchSplitSize – Sets the maximum number of changes applied in a single batch. The default value is 0, meaning that no limit is applied.
The following settings apply only when the target metadata parameter BatchApplyEnabled is set to
false.
• MinTransactionSize – Sets the minimum number of changes to include in each transaction. The
default value is 1000.
• CommitTimeout – Sets the maximum time in seconds for AWS DMS to collect transactions in batches
before declaring a timeout. The default value is 1.
AWS DMS attempts to keep transaction data in memory until the transaction is fully committed to the
source and/or the target. However, transactions that are larger than the allocated memory or that are
not committed within the specified time limit are written to disk.
The following settings apply to change processing tuning regardless of the change processing mode.
• MemoryLimitTotal – Sets the maximum size (in MB) that all transactions can occupy in memory
before being written to disk. The default value is 1024.
• MemoryKeepTime – Sets the maximum time in seconds that each transaction can stay in memory
before being written to disk. The duration is calculated from the time that AWS DMS started capturing
the transaction. The default value is 60.
• StatementCacheSize – Sets the maximum number of prepared statements to store on the server for
later execution when applying changes to the target. The default value is 50. The maximum value is
200.
"ValidationSettings": {
"EnableValidation": true,
"ThreadCount": 5
}
For an Oracle endpoint, AWS DMS uses DBMS_CRYPTO to validate BLOBs. If your Oracle endpoint uses
BLOBs, then you must grant the execute permission on dbms_crypto to the user account that is used to
access the Oracle endpoint. You can do this by running the following statement.
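The exact statement depends on your environment; a typical form, with dms_user standing in as a placeholder for the account that AWS DMS uses to access the endpoint, is the following.
GRANT EXECUTE ON SYS.DBMS_CRYPTO TO dms_user;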
• HandleSourceTableDropped – Set this option to true to drop the target table when the source table is dropped.
• HandleSourceTableTruncated – Set this option to true to truncate the target table when the source table is truncated.
• HandleSourceTableAltered – Set this option to true to alter the target table when the source table is altered.
• DataErrorPolicy – Determines the action AWS DMS takes when there is an error related to data
processing at the record level. Some examples of data processing errors include conversion errors,
errors in transformation, and bad data. The default is LOG_ERROR.
• IGNORE_RECORD – The task continues and the data for that record is ignored. The error counter for
the DataErrorEscalationCount property is incremented. Thus, if you set a limit on errors for a
table, this error counts toward that limit.
• LOG_ERROR – The task continues and the error is written to the task log.
• SUSPEND_TABLE – The task continues but data from the table with the error record is moved into an
error state and the data is not replicated.
• STOP_TASK – The task stops and manual intervention is required.
• DataTruncationErrorPolicy – Determines the action AWS DMS takes when data is truncated. The
default is LOG_ERROR.
• IGNORE_RECORD – The task continues and the data for that record is ignored. The error counter for
the DataErrorEscalationCount property is incremented. Thus, if you set a limit on errors for a
table, this error counts toward that limit.
• LOG_ERROR – The task continues and the error is written to the task log.
• SUSPEND_TABLE – The task continues but data from the table with the error record is moved into an
error state and the data is not replicated.
• STOP_TASK – The task stops and manual intervention is required.
• DataErrorEscalationPolicy – Determines the action AWS DMS takes when the maximum number of errors (set in the DataErrorEscalationCount parameter) is reached. The default is SUSPEND_TABLE.
• SUSPEND_TABLE – The task continues but data from the table with the error record is moved into an
error state and the data is not replicated.
• STOP_TASK – The task stops and manual intervention is required.
• DataErrorEscalationCount – Sets the maximum number of errors that can occur to the data for a specific record. When this number is reached, the data for the table that contains the error record is handled according to the policy set in the DataErrorEscalationPolicy. The default is 0.
• TableErrorPolicy – Determines the action AWS DMS takes when an error occurs when processing
data or metadata for a specific table. This error only applies to general table data and is not an error
that relates to a specific record. The default is SUSPEND_TABLE.
• SUSPEND_TABLE – The task continues but data from the table with the error record is moved into an
error state and the data is not replicated.
• STOP_TASK – The task stops and manual intervention is required.
• TableErrorEscalationPolicy – Determines the action AWS DMS takes when the maximum number of errors (set using the TableErrorEscalationCount parameter) is reached. The default and only user setting is STOP_TASK, where the task is stopped and manual intervention is required.
• TableErrorEscalationCount – The maximum number of errors that can occur to the general
data or metadata for a specific table. When this number is reached, the data for the table is handled
according to the policy set in the TableErrorEscalationPolicy. The default is 0.
• RecoverableErrorCount – The maximum number of attempts made to restart a task when an
environmental error occurs. After the system attempts to restart the task the designated number of
times, the task is stopped and manual intervention is required. The default value is -1, which instructs
AWS DMS to attempt to restart the task indefinitely. Set this value to 0 to never attempt to restart a
task. If a fatal error occurs, AWS DMS stops attempting to restart the task after six attempts.
• RecoverableErrorInterval – The number of seconds that AWS DMS waits between attempts to
restart a task. The default is 5.
• RecoverableErrorThrottling – When enabled, the interval between attempts to restart a task is
increased each time a restart is attempted. The default is true.
This approach doesn't work with PostgreSQL or any other source endpoint that doesn't replicate DDL
table truncation.
• FailOnNoTablesCaptured – Set this to true to cause a task to fail when the transformation rules
defined for a task find no tables when the task starts. The default is false.
• FailOnTransactionConsistencyBreached – This option applies to tasks using Oracle as a source
with CDC. Set this to true to cause a task to fail when a transaction is open for more time than the
specified timeout and could be dropped.
When a CDC task starts with Oracle, AWS DMS waits for a limited time for the oldest open transaction
to close before starting CDC. If the oldest open transaction doesn't close until the timeout is reached,
then we normally start CDC anyway, ignoring that transaction. If this setting is set to true, the task
fails.
• FullLoadIgnoreConflicts – Set this to true to have AWS DMS ignore "zero rows affected" and "duplicates" errors when applying cached events. If set to false, AWS DMS reports all errors instead of ignoring them. The default is true.
For example, the following JSON file contains settings saved for a task.
{
"TargetMetadata": {
"TargetSchema": "",
"SupportLobs": true,
"FullLobMode": false,
"LobChunkSize": 64,
"LimitedSizeLobMode": true,
"LobMaxSize": 32,
"BatchApplyEnabled": true
},
"FullLoadSettings": {
"TargetTablePrepMode": "DO_NOTHING",
"CreatePkAfterFullLoad": false,
"StopTaskCachedChangesApplied": false,
"StopTaskCachedChangesNotApplied": false,
"MaxFullLoadSubTasks": 8,
"TransactionConsistencyTimeout": 600,
"CommitRate": 10000
},
"Logging": {
"EnableLogging": false
},
"ControlTablesSettings": {
"ControlSchema":"",
"HistoryTimeslotInMinutes":5,
"HistoryTableEnabled": false,
"SuspendedTablesTableEnabled": false,
"StatusTableEnabled": false
},
"StreamBufferSettings": {
"StreamBufferCount": 3,
"StreamBufferSizeInMB": 8
},
"ChangeProcessingTuning": {
"BatchApplyPreserveTransaction": true,
"BatchApplyTimeoutMin": 1,
"BatchApplyTimeoutMax": 30,
"BatchApplyMemoryLimit": 500,
"BatchSplitSize": 0,
"MinTransactionSize": 1000,
"CommitTimeout": 1,
"MemoryLimitTotal": 1024,
"MemoryKeepTime": 60,
"StatementCacheSize": 50
},
"ChangeProcessingDdlHandlingPolicy": {
"HandleSourceTableDropped": true,
"HandleSourceTableTruncated": true,
"HandleSourceTableAltered": true
},
"ErrorBehavior": {
"DataErrorPolicy": "LOG_ERROR",
"DataTruncationErrorPolicy":"LOG_ERROR",
"DataErrorEscalationPolicy":"SUSPEND_TABLE",
"DataErrorEscalationCount": 50,
"TableErrorPolicy":"SUSPEND_TABLE",
"TableErrorEscalationPolicy":"STOP_TASK",
"TableErrorEscalationCount": 50,
"RecoverableErrorCount": 0,
"RecoverableErrorInterval": 5,
"RecoverableErrorThrottling": true,
"RecoverableErrorThrottlingMax": 1800,
"ApplyErrorDeletePolicy":"IGNORE_RECORD",
"ApplyErrorInsertPolicy":"LOG_ERROR",
"ApplyErrorUpdatePolicy":"LOG_ERROR",
"ApplyErrorEscalationPolicy":"LOG_ERROR",
"ApplyErrorEscalationCount": 0,
"FullLoadIgnoreConflicts": true
}
}
When you migrate data from one database to another, you might take the opportunity to rethink how
your LOBs are stored, especially for heterogeneous migrations. If you want to do so, there’s no need to
migrate the LOB data.
If you decide to include LOBs, you can then decide the other LOB settings:
than other methods. The maximum size of a VARCHAR in Oracle is 64 K. Therefore, a limited
LOB size of less than 64 K is optimal when Oracle is your source database.
• When a task is configured to run in limited LOB mode, the Max LOB size (K) option sets the maximum size LOB that AWS DMS accepts. Any LOBs that are larger than this value are truncated to this value.
• When a task is configured to use full LOB mode, AWS DMS retrieves LOBs in pieces. The LOB chunk
size (K) option determines the size of each piece. When setting this option, pay particular attention to
the maximum packet size allowed by your network configuration. If the LOB chunk size exceeds your
maximum allowed packet size, you might see disconnect errors.
Some reasons to create multiple tasks for a migration include the following:
• The target tables for the tasks reside on different databases, such as when you are fanning out or
breaking a system into multiple systems.
• You want to break the migration of a large table into multiple tasks by using filtering.
Note
Because each task has its own change capture and log reading process, changes are not
coordinated across tasks. Therefore, when using multiple tasks to perform a migration, make
sure that source transactions are wholly contained within a single task.
Each source engine has specific configuration requirements for exposing this change stream to a given
user account. Most engines require some additional configuration to make it possible for the capture
process to consume the change data in a meaningful way, without data loss. For example, Oracle requires
the addition of supplemental logging, and MySQL requires row-level binary logging (bin logging).
To read ongoing changes from the source database, AWS DMS uses engine-specific API actions to read
changes from the source engine’s transaction logs. Following are some examples of how AWS DMS does
that:
• For Oracle, AWS DMS uses either the Oracle LogMiner API or binary reader API (bfile API) to read
ongoing changes. AWS DMS reads ongoing changes from the online or archive redo logs based on the
system change number (SCN).
• For Microsoft SQL Server, AWS DMS uses MS-Replication or MS-CDC to write information to the SQL
Server transaction log. It then uses the fn_dblog() or fn_dump_dblog() function in SQL Server to
read the changes in the transaction log based on the log sequence number (LSN).
• For MySQL, AWS DMS reads changes from the row-based binary logs (binlogs) and migrates those
changes to the target.
• For PostgreSQL, AWS DMS sets up logical replication slots and uses the test_decoding plugin to
read changes from the source and migrate them to the target.
• For Amazon RDS as a source, we recommend ensuring that backups are enabled to set up CDC. We also recommend ensuring that the source database is configured to retain change logs for a sufficient time; 24 hours is usually enough.
• Full load plus CDC – The task migrates existing data and then updates the target database based on
changes to the source database.
• CDC only – The task migrates ongoing changes after you have data on your target database.
• From a custom CDC start time – You can use the AWS Management Console or AWS CLI to provide
AWS DMS with a timestamp where you want the replication to start. AWS DMS then starts an ongoing
replication task from this custom CDC start time. AWS DMS converts the given timestamp (in UTC) to a
native start point, such as an LSN for SQL Server or an SCN for Oracle. AWS DMS uses engine-specific
methods to determine where exactly to start the migration task based on the source engine’s change
stream.
Note
PostgreSQL as a source doesn't support a custom CDC start time. This is because the
PostgreSQL database engine doesn't have a way to map a timestamp to an LSN or SCN as
Oracle and SQL Server do.
• From a CDC native start point – You can also start from a native point in the source engine’s
transaction log. In some cases, you might prefer this approach because a timestamp can indicate
multiple native points in the transaction log. AWS DMS supports this feature for the following source
endpoints:
• SQL Server
• Oracle
• MySQL
Following are examples of how you can find the CDC native start point from a supported source engine:
SQL Server
A CDC native start point for SQL Server is a log sequence number (LSN), a three-part value that identifies the virtual log file (VLF) sequence number, the offset of the log block, and the slot number.
To get the start point for a SQL Server migration task based on your transaction log backup settings,
use the fn_dblog() or fn_dump_dblog() function in SQL Server.
To use CDC native start point with SQL Server, create a publication on any table participating in
ongoing replication. For more information about creating a publication, see Creating a SQL Server
Publication for Ongoing Replication (p. 103). AWS DMS creates the publication automatically when
you use CDC without using a CDC native start point.
Oracle
A system change number (SCN) is a logical, internal time stamp used by Oracle databases. SCNs
order events that occur within the database, which is necessary to satisfy the ACID properties of a
transaction. Oracle databases use SCNs to mark the location where all changes have been written to
disk so that a recovery action doesn't apply already written changes. Oracle also uses SCNs to mark
the point where no redo exists for a set of data so that recovery can stop. For more information
about Oracle SCNs, see the Oracle documentation.
To get the current SCN in an Oracle database, run the following command:
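-- Standard Oracle query for the current SCN
SELECT CURRENT_SCN FROM V$DATABASE;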
MySQL
Before the release of MySQL version 5.6.3, the log sequence number (LSN) for MySQL was a 4-byte
unsigned integer. In MySQL version 5.6.3, when the redo log file size limit increased from 4 GB to
512 GB, the LSN became an 8-byte unsigned integer. The increase reflects that additional bytes
were required to store extra size information. Applications built on MySQL 5.6.3 or later that use
LSN values should use 64-bit rather than 32-bit variables to store and compare LSN values. For more
information about MySQL LSNs, see the MySQL documentation.
To get the current LSN in a MySQL database, run the following command:
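-- Returns the current binlog file name and position, among other values
SHOW MASTER STATUS;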
The query returns a binlog file name, the position, and several other values. The CDC native start point is a combination of the binlog file name and the position, for example mysql-bin-changelog.000024:373. In this example, mysql-bin-changelog.000024 is the binlog file name and 373 is the position where AWS DMS needs to start capturing changes.
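A native start point like this can then be supplied when you start a task. The following AWS CLI sketch is illustrative only; the task ARN is a placeholder, and the appropriate start type depends on your task.
aws dms start-replication-task \
    --replication-task-arn arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASKARN \
    --start-replication-task-type start-replication \
    --cdc-start-position mysql-bin-changelog.000024:373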
You can get the checkpoint information in one of the following two ways:
• Run the API command DescribeReplicationTasks and view the results. You can filter the information by task and search for the checkpoint. You can retrieve the latest checkpoint when the task is in a stopped or failed state. An AWS CLI example follows this list.
• View the metadata table named awsdms_txn_state on the target instance. You can query the table
to get checkpoint information. To create the metadata table, set the TaskRecoveryTableEnabled
parameter to Yes when you create a task. This setting causes AWS DMS to continuously write
checkpoint information to the target metadata table. This information is lost if a task is deleted.
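As a sketch of the first approach, the following AWS CLI call returns the task details; the task ARN is a placeholder, and the checkpoint appears in the RecoveryCheckpoint field of the response, assuming your AWS DMS version reports it.
aws dms describe-replication-tasks \
    --filters Name=replication-task-arn,Values=arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASKARN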
You can modify a task and set a time in UTC to stop as required. The task automatically stops based
on the commit or server time that you set. Additionally, if you know an appropriate time to stop the
migration task at task creation, you can set a stop time when you create the task.
Modifying a Task
You can modify a task if you need to change the task settings, table mapping, or other settings. You
modify a task in the DMS console by selecting the task and choosing Modify. You can also use the AWS
CLI or AWS DMS API command ModifyReplicationTask.
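For example, the following AWS CLI sketch changes only the task settings of an existing task; the task ARN is a placeholder, and this is illustrative rather than a complete procedure.
aws dms modify-replication-task \
    --replication-task-arn arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASKARN \
    --replication-task-settings '{"Logging":{"EnableLogging":true}}'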
1. Sign in to the AWS Management Console and choose AWS DMS. If you are signed in as an AWS
Identity and Access Management (IAM) user, you must have the appropriate permissions to access
AWS DMS. For more information on the permissions required, see IAM Permissions Needed to Use
AWS DMS (p. 31).
2. Choose Tasks from the navigation pane.
3. Choose the running task that has the table you want to reload.
4. Choose the Table Statistics tab.
5. Choose the table you want to reload. If the task is no longer running, you can't reload the table.
6. Choose Reload table data.
When AWS DMS is preparing to reload a table, the console changes the table status to Table is being
reloaded.
You can include transformations in a table mapping after you have specified at least one selection rule.
You can use transformations to rename a schema or table, add a prefix or suffix to a schema or table, or
remove a table column.
The following example shows how to set up selection rules for a table called Customers in a schema
called EntertainmentAgencySample. You create selection rules and transformations on the Guided
tab. This tab appears only when you have a source endpoint that has schema and table information.
To specify a table selection, filter criteria, and transformations using the AWS console
1. Sign in to the AWS Management Console and choose AWS DMS. If you are signed in as an AWS
Identity and Access Management (IAM) user, you must have the appropriate permissions to access
AWS DMS. For more information on the permissions required, see IAM Permissions Needed to Use
AWS DMS (p. 31).
2. On the Dashboard page, choose Tasks.
3. Choose Create Task.
4. Enter the task information, including Task name, Replication instance, Source endpoint, Target
endpoint, and Migration type. Choose Guided from the Table mappings section.
5. In the Table mapping section, choose the schema name and table name. You can use "%" as a
wildcard value when specifying the table name. Specify the action to be taken, to include or exclude
data defined by the filter.
6. Specify filter information using the Add column filter and the Add condition links.
The following example shows a filter for the Customers table that includes AgencyIDs between
01 and 85.
7. When you have created the selections you want, choose Add selection rule.
8. After you have created at least one selection rule, you can add a transformation to the task. Choose
add transformation rule.
9. Choose the target that you want to transform, and enter the additional information requested.
The following example shows a transformation that deletes the AgencyStatus column from the
Customer table.
You can specify what tables or schemas you want to work with, and you can perform schema and table
transformations. You create table mapping rules using the selection and transformation rule types.
load-order – A positive integer; the maximum value is 2147483647. Indicates the priority for loading tables. Tables with higher values are loaded first.
The following example migrates all tables from a schema named Test in your source to your target
endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "%"
},
"rule-action": "include"
}
]
}
The following example migrates all tables except those starting with DMS from a schema named Test in
your source to your target endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "selection",
"rule-id": "2",
"rule-name": "2",
"object-locator": {
"schema-name": "Test",
"table-name": "DMS%"
},
"rule-action": "exclude"
}
]
}
The following example migrates two tables. Table loadfirst (with priority 2) is migrated before table
loadsecond.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "loadfirst"
},
"rule-action": "include",
"load-order": "2"
},
{
"rule-type": "selection",
"rule-id": "2",
"rule-name": "2",
"object-locator": {
"schema-name": "Test",
"table-name": "loadsecond"
},
"rule-action": "include",
"load-order": "1"
}
]
}
For table mapping rules that use the transformation rule type, the following values can be applied.
• object-locator – The schema and table that the rule applies to; schema-name is the name of the schema.
• value – An alphanumeric value that follows the naming rules for the target type. The new value for actions that require input, such as rename.
• old-value – An alphanumeric value that follows the naming rules for the target type. The old value for actions that require replacement, such as replace-prefix.
The following example renames a schema from Test in your source to Test1 in your target endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "rename",
"rule-target": "schema",
"object-locator": {
"schema-name": "Test"
},
"value": "Test1"
}
]
}
The following example renames a table from Actor in your source to Actor1 in your target endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "Test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "rename",
"rule-target": "table",
"object-locator": {
"schema-name": "Test",
"table-name": "Actor"
},
"value": "Actor1"
}
]
}
The following example renames the first_name column in the Actor table to fname in your target endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "transformation",
"rule-id": "4",
"rule-name": "4",
"rule-action": "rename",
"rule-target": "column",
"object-locator": {
"schema-name": "test",
"table-name": "Actor",
"column-name" : "first_name"
},
"value": "fname"
}
]
}
The following example removes all columns beginning with col from the Actor table in your target endpoint.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
}, {
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "remove-column",
"rule-target": "column",
"object-locator": {
"schema-name": "test",
"table-name": "Actor",
"column-name": "col%"
}
}]
}
The following example converts the table name ACTOR to lowercase in your target endpoint.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
}, {
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "convert-lowercase",
"rule-target": "table",
"object-locator": {
"schema-name": "test",
"table-name": "ACTOR"
}
}]
}
The following example converts all column names to uppercase in your target endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "convert-uppercase",
"rule-target": "column",
"object-locator": {
"schema-name": "%",
"table-name": "%",
"column-name": "%"
}
}
]
}
The following example adds the prefix DMS_ to all table names in your target endpoint.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
}, {
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "add-prefix",
"rule-target": "table",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"value": "DMS_"
}]
}
The following example replaces the column name prefix Pre_ with NewPre_ in your target endpoint.
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "replace-prefix",
"rule-target": "column",
"object-locator": {
"schema-name": "%",
"table-name": "%",
"column-name": "%"
},
"value": "NewPre_",
"old-value": "Pre_"
}
]
}
The following example removes the suffix _DMS from table names in your target endpoint.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"rule-action": "include"
}, {
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-action": "remove-suffix",
"rule-target": "table",
"object-locator": {
"schema-name": "test",
"table-name": "%"
},
"value": "_DMS"
}]
}
The following example uses a table-settings rule to load the partitions of a table in parallel.
{
"rules": [{
"rule-type": "table-settings",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "table1"
},
"parallel-load": {
"type": "partitions-auto"
}
}]
}
The following filter replicates all employees where empstartdate >= January 1, 2002 to the
target database.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "employee"
},
"rule-action": "include",
"filters": [{
"filter-type": "source",
"column-name": "empstartdate",
"filter-conditions": [{
"filter-operator": "gte",
"value": "2002-01-01"
}]
}]
}]
}
The following parameters are used for source filtering:
• filter-type – source
• column-name – The name of the source column that you want the filter applied to. The name is case-sensitive.
• filter-conditions – One or more conditions to apply to the column. Each condition specifies a filter-operator (such as ste, gte, eq, or between) and a value (or a start-value and end-value when the operator is between).
The following examples show some common ways to use source filters.
The following filter replicates all employees where empid >= 100 to the target database.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "employee"
},
"rule-action": "include",
"filters": [{
"filter-type": "source",
"column-name": "empid",
"filter-conditions": [{
"filter-operator": "gte",
"value": "100"
}]
}]
}]
}
The following filter applies multiple filter operators to a single column of data. The filter replicates all
employees where (empid <=10) OR (empid is between 50 and 75) OR (empid >= 100) to the
target database.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "employee"
},
"rule-action": "include",
"filters": [{
"filter-type": "source",
"column-name": "empid",
"filter-conditions": [{
"filter-operator": "ste",
"value": "10"
}, {
"filter-operator": "between",
"start-value": "50",
"end-value": "75"
}, {
"filter-operator": "gte",
"value": "100"
}]
}]
}]
}
The following filter applies multiple filters to two columns in a table. The filter replicates all employees where (empid <= 100) AND (dept = tech) to the target database.
{
"rules": [{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "1",
"object-locator": {
"schema-name": "test",
"table-name": "employee"
},
"rule-action": "include",
"filters": [{
"filter-type": "source",
"column-name": "empid",
"filter-conditions": [{
"filter-operator": "ste",
"value": "100"
}]
}, {
"filter-type": "source",
"column-name": "dept",
"filter-conditions": [{
"filter-operator": "eq",
"value": "tech"
}]
}]
}]
}
You can also monitor the progress of your tasks using Amazon CloudWatch. By using the AWS
Management Console, the AWS Command Line Interface (CLI), or AWS DMS API, you can monitor the
progress of your task and also the resources and network connectivity used.
Finally, you can monitor the status of your source tables in a task by viewing the table state.
Note that the "last updated" column the DMS console only indicates the time that AWS DMS last
updated the table statistics record for a table. It does not indicate the time of the last update to the
table.
Topics
• Task Status (p. 261)
• Table State During Tasks (p. 262)
• Monitoring Replication Tasks Using Amazon CloudWatch (p. 263)
• Data Migration Service Metrics (p. 265)
• Managing AWS DMS Task Logs (p. 267)
• Logging AWS DMS API Calls with AWS CloudTrail (p. 268)
Task Status
The task status indicates the condition of the task. The following list shows the possible statuses a task can have:
• Deleting – The task is being deleted, usually from a request for user intervention.
• Failed – The task has failed. See the task log files for more information.
• Ready – The task is ready to run. This status usually follows the "creating" status.
• Modifying – The task is being modified, usually due to a user action that modified the task settings.
The task status bar gives an estimation of the task's progress. The quality of this estimate depends on
the quality of the source database’s table statistics; the better the table statistics, the more accurate the
estimation. For tasks with only one table that has no estimated rows statistic, we are unable to provide
any kind of percentage complete estimate. In this case, the task state and the indication of rows loaded
can be used to confirm that the task is indeed running and making progress.
• Table does not exist – AWS DMS cannot find the table on the source endpoint.
• Before load – The full load process has been enabled, but it hasn't started yet.
The AWS DMS console shows basic CloudWatch statistics for each task, including the task status, percent
complete, elapsed time, and table statistics, as shown following. Select the replication task and then
select the Task monitoring tab.
The AWS DMS console shows performance statistics for each table, including the number of inserts,
deletions, and updates, when you select the Table statistics tab.
In addition, if you select a replication instance from the Replication Instance page, you can view
performance metrics for the instance by selecting the Monitoring tab.
• Host Metrics – Performance and utilization statistics for the replication host, provided by Amazon
CloudWatch. For a complete list of the available metrics, see Replication Instance Metrics (p. 265).
• Replication Task Metrics – Statistics for replication tasks including incoming and committed changes,
and latency between the replication host and both the source and target databases. For a complete list
of the available metrics, see Replication Task Metrics (p. 266).
• Table Metrics – Statistics for tables that are in the process of being migrated, including the number of
insert, update, delete, and DDL statements completed.
Task metrics are divided into statistics between the replication host and the source endpoint, and
statistics between the replication host and the target endpoint. You can determine the total statistic for
a task by adding two related statistics together. For example, you can determine the total latency, or
replica lag, for a task by combining the CDCLatencySource and CDCLatencyTarget values.
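For example, you can retrieve each latency metric from CloudWatch and add the values together. The following AWS CLI sketch assumes the AWS/DMS namespace and the ReplicationInstanceIdentifier and ReplicationTaskIdentifier dimensions, with placeholder identifiers and times.
aws cloudwatch get-metric-statistics \
    --namespace "AWS/DMS" \
    --metric-name CDCLatencySource \
    --dimensions Name=ReplicationInstanceIdentifier,Value=my-repl-instance \
                 Name=ReplicationTaskIdentifier,Value=my-task-id \
    --start-time 2018-08-01T00:00:00Z --end-time 2018-08-01T01:00:00Z \
    --period 300 --statistics Average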
Task metric values can be influenced by current activity on your source database. For example, if a
transaction has begun, but has not been committed, then the CDCLatencySource metric continues to
grow until that transaction has been committed.
For the replication instance, the FreeableMemory metric requires clarification. Freeable memory is not an indication of the actual free memory available. It is the memory that is currently in use that can be freed and used for other purposes; it is a combination of buffers and cache in use on the replication instance.
While the FreeableMemory metric does not reflect actual free memory available, the combination of the
FreeableMemory and SwapUsage metrics can indicate if the replication instance is overloaded.
If you see either of these two conditions, they indicate that you should consider moving to a larger
replication instance. You should also consider reducing the number and type of tasks running on the
replication instance. Full Load tasks require more memory than tasks that just replicate changes.
CPUUtilization
Units: Percent
FreeStorageSpace
Units: Bytes
FreeableMemory
Units: Bytes
WriteIOPS
Units: Count/Second
ReadIOPS
Units: Count/Second
WriteThroughput
Units: Bytes/Second
ReadThroughput
Units: Bytes/Second
WriteLatency
The average amount of time taken per disk I/O (output) operation.
Units: Milliseconds
ReadLatency
The average amount of time taken per disk I/O (input) operation.
Units: Milliseconds
SwapUsage
Units: Bytes
NetworkTransmitThroughput
The outgoing (Transmit) network traffic on the replication instance, including both customer
database traffic and AWS DMS traffic used for monitoring and replication.
Units: Bytes/second
NetworkReceiveThroughput
The incoming (Receive) network traffic on the replication instance, including both customer database
traffic and AWS DMS traffic used for monitoring and replication.
Units: Bytes/second
FullLoadThroughputBandwidthSource
Incoming network bandwidth from a full load from the source in kilobytes (KB) per second.
FullLoadThroughputBandwidthTarget
Outgoing network bandwidth from a full load for the target in KB per second.
FullLoadThroughputRowsSource
Incoming changes from a full load from the source in rows per second.
FullLoadThroughputRowsTarget
Outgoing changes from a full load for the target in rows per second.
CDCIncomingChanges
The total number of change events at a point-in-time that are waiting to be applied to the target.
Note that this is not the same as a measure of the transaction change rate of the source endpoint.
A large number for this metric usually indicates AWS DMS is unable to apply captured changes in a
timely manner, thus causing high target latency.
CDCChangesMemorySource
Amount of rows accumulating in memory and waiting to be committed from the source.
CDCChangesMemoryTarget
Amount of rows accumulating in memory and waiting to be committed to the target.
CDCChangesDiskSource
Amount of rows accumulating on disk and waiting to be committed from the source.
CDCChangesDiskTarget
Amount of rows accumulating on disk and waiting to be committed to the target.
CDCThroughputBandwidthSource
Network bandwidth for the source in KB per second. CDCThroughputBandwidth records bandwidth
on sampling points. If no network traffic is found, the value is zero. Because CDC does not issue
long-running transactions, network traffic may not be recorded.
CDCThroughputBandwidthTarget
Network bandwidth for the target in KB per second. CDCThroughputBandwidth records bandwidth
on sampling points. If no network traffic is found, the value is zero. Because CDC does not issue
long-running transactions, network traffic may not be recorded.
CDCThroughputRowsSource
Incoming task changes from the source in rows per second.
CDCThroughputRowsTarget
Outgoing task changes for the target in rows per second.
CDCLatencySource
The gap, in seconds, between the last event captured from the source endpoint and the current system
time stamp of the AWS DMS instance. If no changes have been captured from the source due to task
scoping, AWS DMS sets this value to zero.
CDCLatencyTarget
The gap, in seconds, between the first event timestamp waiting to commit on the target and the
current timestamp of the AWS DMS instance. This value occurs if there are transactions that are
not handled by the target. Otherwise, target latency is the same as source latency if all transactions are
applied. Target latency should never be smaller than the source latency.
For example, the following AWS CLI command shows the task log metadata in JSON format.
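Such a command is a call to describe-replication-instance-task-logs, for example using the replication instance ARN shown in the response below:
aws dms describe-replication-instance-task-logs \
    --replication-instance-arn arn:aws:dms:us-east-1:237565436:rep:CDSFSFSFFFSSUFCAY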
{
"ReplicationInstanceTaskLogs": [
{
"ReplicationTaskArn": "arn:aws:dms:us-
east-1:237565436:task:MY34U6Z4MSY52GRTIX3O4AY",
"ReplicationTaskName": "mysql-to-ddb",
"ReplicationInstanceTaskLogSize": 3726134
}
],
"ReplicationInstanceArn": "arn:aws:dms:us-east-1:237565436:rep:CDSFSFSFFFSSUFCAY"
}
In this response, there is a single task log (mysql-to-ddb) associated with the replication instance. The size of this log is 3,726,134 bytes.
To delete the task logs for a task, set the task setting DeleteTaskLogs to true. For example, the
following JSON deletes the task logs when modifying a task using the AWS CLI modify-replication-
task command or the AWS DMS API ModifyReplicationTask action.
{
"Logging": {
"DeleteTaskLogs":true
}
}
To learn more about CloudTrail, see the AWS CloudTrail User Guide.
For an ongoing record of events in your AWS account, including events for AWS DMS, create a trail. A
trail enables CloudTrail to deliver log files to an Amazon S3 bucket. By default, when you create a trail
in the console, the trail applies to all regions. The trail logs events from all regions in the AWS partition
and delivers the log files to the Amazon S3 bucket that you specify. Additionally, you can configure
other AWS services to further analyze and act upon the event data collected in CloudTrail logs. For more
information, see:
All AWS DMS actions are logged by CloudTrail and are documented in the AWS Database Migration
Service API Reference. For example, calls to the CreateReplicationInstance, TestConnection and
StartReplicationTask actions generate entries in the CloudTrail log files.
Every event or log entry contains information about who generated the request. The identity
information helps you determine the following:
• Whether the request was made with root or IAM user credentials.
• Whether the request was made with temporary security credentials for a role or federated user.
• Whether the request was made by another AWS service.
The following example shows a CloudTrail log entry that demonstrates the
RebootReplicationInstance action.
{
"eventVersion": "1.05",
"userIdentity": {
"type": "AssumedRole",
"principalId": "AKIAIOSFODNN7EXAMPLE:johndoe",
"arn": "arn:aws:sts::123456789012:assumed-role/admin/johndoe",
"accountId": "123456789012",
"accessKeyId": "ASIAYFI33SINADOJJEZW",
"sessionContext": {
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2018-08-01T16:42:09Z"
},
"sessionIssuer": {
"type": "Role",
"principalId": "AKIAIOSFODNN7EXAMPLE",
"arn": "arn:aws:iam::123456789012:role/admin",
"accountId": "123456789012",
"userName": "admin"
}
}
},
"eventTime": "2018-08-02T00:11:44Z",
"eventSource": "dms.amazonaws.com",
"eventName": "RebootReplicationInstance",
"awsRegion": "us-east-1",
"sourceIPAddress": "72.21.198.64",
"userAgent": "console.amazonaws.com",
"requestParameters": {
"forceFailover": false,
"replicationInstanceArn": "arn:aws:dms:us-
east-1:123456789012:rep:EX4MBJ2NMRDL3BMAYJOXUGYPUE"
},
"responseElements": {
"replicationInstance": {
"replicationInstanceIdentifier": "replication-instance-1",
"replicationInstanceStatus": "rebooting",
"allocatedStorage": 50,
"replicationInstancePrivateIpAddresses": [
"172.31.20.204"
],
"instanceCreateTime": "Aug 1, 2018 11:56:21 PM",
"autoMinorVersionUpgrade": true,
"engineVersion": "2.4.3",
"publiclyAccessible": true,
"replicationInstanceClass": "dms.t2.medium",
"availabilityZone": "us-east-1b",
"kmsKeyId": "arn:aws:kms:us-east-1:123456789012:key/f7bc0f8e-1a3a-4ace-9faa-
e8494fa3921a",
"replicationSubnetGroup": {
"vpcId": "vpc-1f6a9c6a",
"subnetGroupStatus": "Complete",
"replicationSubnetGroupArn": "arn:aws:dms:us-
east-1:123456789012:subgrp:EDHRVRBAAAPONQAIYWP4NUW22M",
"subnets": [
{
"subnetIdentifier": "subnet-cbfff283",
"subnetAvailabilityZone": {
"name": "us-east-1b"
},
"subnetStatus": "Active"
},
{
"subnetIdentifier": "subnet-d7c825e8",
"subnetAvailabilityZone": {
"name": "us-east-1e"
},
"subnetStatus": "Active"
},
{
"subnetIdentifier": "subnet-6746046b",
"subnetAvailabilityZone": {
"name": "us-east-1f"
},
"subnetStatus": "Active"
},
{
"subnetIdentifier": "subnet-bac383e0",
"subnetAvailabilityZone": {
"name": "us-east-1c"
},
"subnetStatus": "Active"
},
{
"subnetIdentifier": "subnet-42599426",
"subnetAvailabilityZone": {
"name": "us-east-1d"
},
"subnetStatus": "Active"
},
{
"subnetIdentifier": "subnet-da327bf6",
"subnetAvailabilityZone": {
"name": "us-east-1a"
},
"subnetStatus": "Active"
}
],
"replicationSubnetGroupIdentifier": "default-vpc-1f6a9c6a",
"replicationSubnetGroupDescription": "default group created by console for
vpc id vpc-1f6a9c6a"
},
"replicationInstanceEniId": "eni-0d6db8c7137cb9844",
"vpcSecurityGroups": [
{
"vpcSecurityGroupId": "sg-f839b688",
"status": "active"
}
],
"pendingModifiedValues": {},
"replicationInstancePublicIpAddresses": [
"18.211.48.119"
],
"replicationInstancePublicIpAddress": "18.211.48.119",
"preferredMaintenanceWindow": "fri:22:44-fri:23:14",
"replicationInstanceArn": "arn:aws:dms:us-
east-1:123456789012:rep:EX4MBJ2NMRDL3BMAYJOXUGYPUE",
"replicationInstanceEniIds": [
"eni-0d6db8c7137cb9844"
],
"multiAZ": false,
"replicationInstancePrivateIpAddress": "172.31.20.204",
"patchingPrecedence": 0
}
},
"requestID": "a3c83c11-95e8-11e8-9d08-4b8f2b45bfd5",
"eventID": "b3c4adb1-e34b-4744-bdeb-35528062a541",
"eventType": "AwsApiCall",
"recipientAccountId": "123456789012"
}
AWS DMS provides support for data validation, to ensure that your data was migrated accurately from
the source to the target. If you enable it for a task, then AWS DMS begins comparing the source and
target data immediately after a full load is performed for a table.
Data validation is optional. AWS DMS compares the source and target records, and reports any
mismatches. In addition, for a CDC-enabled task, AWS DMS compares the incremental changes and
reports any mismatches.
During data validation, AWS DMS compares each row in the source with its corresponding row at
the target, and verifies that those rows contain the same data. To accomplish this, AWS DMS issues
appropriate queries to retrieve the data. Note that these queries will consume additional resources at the
source and the target, as well as additional network resources.
AWS DMS supports data validation for the following source and target database engines:
• Oracle
• PostgreSQL
• MySQL
• MariaDB
• Microsoft SQL Server
• Amazon Aurora (MySQL)
• Amazon Aurora (PostgreSQL)
Data validation requires additional time, beyond the amount required for the migration itself. The extra
time required depends on how much data was migrated.
For example, the following JSON turns on validation and increases the number of threads from the
default setting of 5 to 8.
ValidationSettings": {
"EnableValidation":true,
"ThreadCount":8
}
• ValidationState—The validation state of the table. The parameter can have the following values:
• Not enabled—Validation is not enabled for the table in the migration task.
• Pending records—Some records in the table are waiting for validation.
• Mismatched records—Some records in the table don't match between the source and target. A mismatch might occur for a number of reasons. For more information, check the awsdms_validation_failures table on the target endpoint.
• Suspended records—Some records in the table can't be validated.
• No primary key—The table can't be validated because it had no primary key.
• Table error—The table wasn't validated because it was in an error state and some data wasn't
migrated.
• Validated—All rows in the table are validated. If the table is updated, the status can change from
Validated.
• Error—The table can't be validated because of an unexpected error.
• ValidationPending—The number of records that have been migrated to the target, but that haven't
yet been validated.
• ValidationSuspended—The number of records that AWS DMS can't compare. For example, if a record at the source is constantly being updated, AWS DMS can't compare the source and the target. For more information, see Error Handling Task Settings (p. 234).
• ValidationFailed—The number of records that didn't pass the data validation phase. For more
information, see Error Handling Task Settings (p. 234).
• ValidationSucceededRecordCount— Number of rows that AWS DMS validated, per minute.
• ValidationAttemptedRecordCount— Number of rows for which validation was attempted, per minute.
• ValidationFailedOverallCount— Number of rows where validation failed.
• ValidationSuspendedOverallCount— Number of rows where validation was suspended.
• ValidationPendingOverallCount— Number of rows where the validation is still pending.
• ValidationBulkQuerySourceLatency— AWS DMS can do data validation in bulk, especially in certain
scenarios during a full-load or on-going replication when there are many changes. This metric
indicates the latency required to read a bulk set of data from the source endpoint.
• ValidationBulkQueryTargetLatency— AWS DMS can do data validation in bulk, especially in certain
scenarios during a full-load or on-going replication when there are many changes. This metric
indicates the latency required to read a bulk set of data on the target endpoint.
• ValidationItemQuerySourceLatency— During ongoing replication, data validation can identify ongoing changes and validate those changes. This metric indicates the latency in reading those changes from the source. Validation can run more queries than required, based on the number of changes, if there are errors during validation.
• ValidationItemQueryTargetLatency— During ongoing replication, data validation can identify ongoing changes and validate the changes row by row. This metric indicates the latency in reading those changes from the target. Validation might run more queries than required, based on the number of changes, if there are errors during validation.
You can view the data validation information using the console, the AWS CLI, or the AWS DMS API.
• On the console, you can choose to validate a task when you create or modify the task. To view the data
validation report using the console, choose the task on the Tasks page and choose the Table statistics
tab in the details section.
• Using the CLI, set the EnableValidation parameter to true when creating or modifying a task to
begin data validation. The following example creates a task and enables data validation.
create-replication-task
--replication-task-settings '{"ValidationSettings":{"EnableValidation":true}}'
--replication-instance-arn arn:aws:dms:us-east-1:5731014:
rep:36KWVMB7Q
--source-endpoint-arn arn:aws:dms:us-east-1:5731014:
endpoint:CSZAEFQURFYMM
--target-endpoint-arn arn:aws:dms:us-east-1:5731014:
endpoint:CGPP7MF6WT4JQ
--migration-type full-load-and-cdc
--table-mappings '{"rules": [{"rule-type": "selection", "rule-id": "1",
"rule-name": "1", "object-locator": {"schema-name": "data_types", "table-name":
"%"},
"rule-action": "include"}]}'
Use the describe-table-statistics command to receive the data validation report in JSON
format. The following command shows the data validation report.
{
"ReplicationTaskArn": "arn:aws:dms:us-west-2:5731014:task:VFPFTYKK2RYSI",
"TableStatistics": [
{
"ValidationPendingRecords": 2,
"Inserts": 25,
"ValidationState": "Pending records",
"ValidationSuspendedRecords": 0,
"LastUpdateTime": 1510181065.349,
"FullLoadErrorRows": 0,
"FullLoadCondtnlChkFailedRows": 0,
"Ddls": 0,
"TableName": "t_binary",
"ValidationFailedRecords": 0,
"Updates": 0,
"FullLoadRows": 10,
"TableState": "Table completed",
"SchemaName": "d_types_s_sqlserver",
"Deletes": 0
}
]
}
• Using the AWS DMS API, create a task using the CreateReplicationTask action and set the
EnableValidation parameter to true to validate the data migrated by the task. Use the
DescribeTableStatistics action to receive the data validation report in JSON format.
Troubleshooting
During validation, AWS DMS creates a new table at the target endpoint:
awsdms_validation_failures_v1. If any record enters the ValidationSuspended or the
ValidationFailed state, AWS DMS writes diagnostic information to awsdms_validation_failures_v1.
You can query this table to help troubleshoot validation errors.
• TABLE_OWNER – VARCHAR(128) NOT NULL. Schema (owner) of the table.
• FAILURE_TIME – DATETIME(3) NOT NULL. Time when the failure occurred.
• KEY – TEXT NOT NULL. The primary key for the row record type.
• FAILURE_TYPE – VARCHAR(128) NOT NULL. Severity of the validation error. Can be either Failed or Suspended.
The following query shows all the failures for a task by querying the awsdms_validation_failures_v1 table. The task name should be the external resource ID of the task, which is the last value in the task ARN. For example, for a task with an ARN value of arn:aws:dms:us-west-2:5599:task:VFPFKH4FJR3FTYKK2RYSI, the external resource ID of the task is VFPFKH4FJR3FTYKK2RYSI.
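A query along the following lines returns those failures, assuming the table's task-name column is named TASK_NAME.
SELECT * FROM awsdms_validation_failures_v1
WHERE TASK_NAME = 'VFPFKH4FJR3FTYKK2RYSI';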
Once you have the primary key of the failed record, you can query the source and target endpoints to see
what part of the record does not match.
Limitations
• Data validation requires that the table has a primary key or unique index.
• Primary key columns can't be of type CLOB, BLOB, or BYTE.
• For primary key columns of type VARCHAR or CHAR, the length must be less than 1024.
• If the collation of the primary key column in the target PostgreSQL instance isn't set to "C", the sort
order of the primary key is different compared to the sort order in Oracle. If the sort order is different
between PostgreSQL and Oracle, data validation fails to validate the records.
• Data validation generates additional queries against the source and target databases. You must ensure
that both databases have enough resources to handle this additional load.
• Data validation isn't supported if a migration uses customized filtering or when consolidating several
databases into one.
• For a source or target Oracle endpoint, AWS DMS uses DBMS_CRYPTO to validate BLOBs. If your
Oracle endpoint uses BLOBs, then you must grant the execute permission on dbms_crypto to the
user account that is used to access the Oracle endpoint. You can do this by running a statement like the following:
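Here, dms_user is a placeholder for the account used by your Oracle endpoint; substitute your own user name.

GRANT EXECUTE ON SYS.DBMS_CRYPTO TO dms_user;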
• If the target database is modified outside of AWS DMS during validation, then discrepancies might not
be reported accurately. This result can occur if one of your applications writes data to the target table,
while AWS DMS is performing validation on that same table.
• If one or more rows are being continuously modified during the validation, then AWS DMS can't
validate those rows. However, you can validate those rows manually, after the task completes.
• If AWS DMS detects more than 10,000 failed or suspended records, it stops the validation. Before you
proceed further, resolve any underlying problems with the data.
You can add tags to the following AWS DMS resources:
• Replication instances
• Endpoints
• Replication tasks
• Certificates
An AWS DMS tag is a name-value pair that you define and associate with an AWS DMS resource. The
name is referred to as the key. Supplying a value for the key is optional. You can use tags to assign
arbitrary information to an AWS DMS resource. A tag key could be used, for example, to define a
category, and the tag value could be an item in that category. For example, you could define a tag key of
"project" and a tag value of "Salix", indicating that the AWS DMS resource is assigned to the Salix project.
You could also use tags to designate AWS DMS resources as being used for test or production by using a
key such as environment=test or environment=production. We recommend that you use a consistent set
of tag keys to make it easier to track metadata associated with AWS DMS resources.
Use tags to organize your AWS bill to reflect your own cost structure. To do this, sign up to get your AWS
account bill with tag key values included. Then, to see the cost of combined resources, organize your
billing information according to resources with the same tag key values. For example, you can tag several
resources with a specific application name, and then organize your billing information to see the total
cost of that application across several services. For more information, see Cost Allocation and Tagging in
About AWS Billing and Cost Management.
Each AWS DMS resource has a tag set, which contains all the tags that are assigned to that AWS DMS
resource. A tag set can contain as many as ten tags, or it can be empty. If you add a tag to an AWS DMS
resource that has the same key as an existing tag on the resource, the new value overwrites the old value.
AWS does not apply any semantic meaning to your tags; tags are interpreted strictly as character strings.
AWS DMS might set tags on an AWS DMS resource, depending on the settings that you use when you
create the resource.
• The tag key is the required name of the tag. The string value can be from 1 to 128 Unicode characters
in length and cannot be prefixed with "aws:" or "dms:". The string can contain only the set of
Unicode letters, digits, white space, '_', '.', '/', '=', '+', '-' (Java regex: "^([\\p{L}\\p{Z}\\p{N}_.:/=+\\-]*)$").
• The tag value is an optional string value of the tag. The string value can be from 1 to 256 Unicode
characters in length and cannot be prefixed with "aws:" or "dms:". The string can contain only the
set of Unicode letters, digits, white space, '_', '.', '/', '=', '+', '-' (Java regex: "^([\\p{L}\\p{Z}\\p{N}_.:/=+\\-]*)$").
Values do not have to be unique in a tag set and can be null. For example, you can have a key-value
pair in a tag set of project/Trinity and cost-center/Trinity.
You can use the AWS CLI or the AWS DMS API to add, list, and delete tags on AWS DMS resources. When
using the AWS CLI or the AWS DMS API, you must provide the Amazon Resource Name (ARN) for the AWS
DMS resource you want to work with. For more information about constructing an ARN, see Constructing
an Amazon Resource Name (ARN) for AWS DMS (p. 11).
Note that tags are cached for authorization purposes. Because of this, additions and updates to tags on
AWS DMS resources might take several minutes before they are available.
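For example, commands like the following add a tag to a replication instance and then list the tags on it. The ARN here is a placeholder; substitute the ARN of your own resource.

aws dms add-tags-to-resource --resource-arn arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE --tags Key=project,Value=Salix

aws dms list-tags-for-resource --resource-arn arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE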
API
You can add, list, or remove tags for an AWS DMS resource using the AWS DMS API.
To learn more about how to construct the required ARN, see Constructing an Amazon Resource Name
(ARN) for AWS DMS (p. 11).
When working with XML using the AWS DMS API, tags use the following schema:
<Tagging>
   <TagSet>
      <Tag>
         <Key>Project</Key>
         <Value>Trinity</Value>
      </Tag>
      <Tag>
         <Key>User</Key>
         <Value>Jones</Value>
      </Tag>
   </TagSet>
</Tagging>
The following table provides a list of the allowed XML tags and their characteristics. Note that values for
Key and Value are case sensitive. For example, project=Trinity and PROJECT=Trinity are two distinct
tags.
TagSet A tag set is a container for all tags assigned to an AWS DMS resource.
There can be only one tag set per resource. You work with a TagSet only
through the AWS DMS API.

Key A key is the required name of the tag. The string value can be from 1 to 128
Unicode characters in length and cannot be prefixed with "dms:" or "aws:".
The string can contain only the set of Unicode letters, digits, white space,
'_', '.', '/', '=', '+', '-' (Java regex: "^([\\p{L}\\p{Z}\\p{N}_.:/=+\\-]*)$").
Keys must be unique to a tag set. For example, you cannot have a key-pair
in a tag set with the same key but different values, such as project/Trinity
and project/Xanadu.

Value A value is the optional value of the tag. The string value can be from 1 to
256 Unicode characters in length and cannot be prefixed with "dms:" or
"aws:". The string can contain only the set of Unicode letters, digits, white
space, '_', '.', '/', '=', '+', '-' (Java regex: "^([\\p{L}\\p{Z}\\p{N}_.:/=+\\-]*)$").
Values do not have to be unique in a tag set and can be null. For example,
you can have a key-value pair in a tag set of project/Trinity and cost-center/
Trinity.
AWS DMS groups events into categories that you can subscribe to, so you can be notified when an event
in that category occurs. For example, if you subscribe to the Creation category for a given replication
instance, you are notified whenever a creation-related event occurs that affects your replication instance.
If you subscribe to a Configuration Change category for a replication instance, you are notified when the
replication instance's configuration is changed. You also receive notification when an event notification
subscription changes. For a list of the event categories provided by AWS DMS, see AWS DMS Event
Categories and Event Messages (p. 281), following.
AWS DMS sends event notifications to the addresses you provide when you create an event subscription.
You might want to create several different subscriptions, such as one subscription receiving all event
notifications and another subscription that includes only critical events for your production DMS
resources. You can easily turn off notification without deleting a subscription by setting the Enabled
option to No in the AWS DMS console or by setting the Enabled parameter to false using the AWS DMS
API.
Note
AWS DMS event notifications using SMS text messages are currently available for AWS DMS
resources in all regions where AWS DMS is supported. For more information on using text
messages with SNS, see Sending and Receiving SMS Notifications Using Amazon SNS.
AWS DMS uses a subscription identifier to identify each subscription. You can have multiple AWS DMS
event subscriptions published to the same Amazon SNS topic. When you use event notification, Amazon
SNS fees apply; for more information on Amazon SNS billing, see Amazon SNS Pricing.
1. Create an Amazon SNS topic. In the topic, you specify what type of notification you want to receive
and the address or phone number where the notification is sent.
2. Create an AWS DMS event notification subscription by using the AWS Management Console, AWS CLI,
or AWS DMS API.
3. AWS DMS sends an approval email or SMS message to the addresses you submitted with your
subscription. To confirm your subscription, click the link in the approval email or SMS message.
4. When you have confirmed the subscription, the status of your subscription is updated in the AWS DMS
console's Event Subscriptions section.
5. You then begin to receive event notifications.
For the list of categories and events that you can be notified of, see the following section. For more
details about subscribing to and working with AWS DMS event subscriptions, see Subscribing to AWS
DMS Event Notification (p. 282).
The following table shows the possible categories and events for the replication instance source type.
The following table shows the possible categories and events for the replication task source type.
In a notification subscription, you can specify the type of source you want to be notified of and the
AWS DMS source that triggers the event. You define the AWS DMS source type by using a SourceType
value. You define the source generating the event by using a SourceIdentifier value. If you specify both
SourceType and SourceIdentifier, such as SourceType = replication-instance and SourceIdentifier
= my-replication-instance, you receive all the events for the specified source. If you specify
SourceType but not SourceIdentifier, you receive notice of the events for that source type for all your
AWS DMS sources. If you don't specify either SourceType or SourceIdentifier, you are notified of events
generated from all AWS DMS sources belonging to your customer account.
1. Sign in to the AWS Management Console and choose AWS DMS. Note that if you are signed in as
an AWS Identity and Access Management (IAM) user, you must have the appropriate permissions to
access AWS DMS.
2. In the navigation pane, choose Event Subscriptions.
3. On the Event Subscriptions page, choose Create Event Subscription.
4. On the Create Event Subscription page, do the following:
f. Choose Create.
The AWS DMS console indicates that the subscription is being created.
• Call CreateEventSubscription.
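With the AWS CLI, an equivalent call might look like the following sketch. The subscription name, SNS topic ARN, source ID, and event categories here are placeholders that you replace with your own values.

aws dms create-event-subscription --subscription-name my-dms-events --sns-topic-arn arn:aws:sns:us-east-1:123456789012:my-dms-topic --source-type replication-instance --source-ids my-replication-instance --event-categories creation deletion --enabled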
Using AWS DMS and the AWS Schema Conversion Tool (AWS SCT), you migrate your data in two stages.
First, you use the AWS SCT to process the data locally and then move that data to the AWS Snowball
Edge appliance. AWS Snowball then automatically loads the data into an Amazon S3 bucket. Next, when
the data is available on Amazon S3, AWS DMS takes the files and migrates the data to the target data
store. If you are using change data capture (CDC), those updates are written to the Amazon S3 bucket
and the target data store is constantly updated.
AWS Snowball is an AWS service you can use to transfer data to the cloud at faster-than-network speeds
using an AWS-owned appliance. An AWS Snowball Edge device can hold up to 100 TB of data. It uses
256-bit encryption and an industry-standard Trusted Platform Module (TPM) to ensure both security and
full chain-of-custody for your data.
Amazon S3 is a storage and retrieval service. To store an object in Amazon S3, you upload the file you
want to store to a bucket. When you upload a file, you can set permissions on the object and also on any
metadata.
• Migration from an on-premises data warehouse to Amazon Redshift. This approach involves a client-
side software installation of the AWS Schema Conversion Tool. The tool reads information from
the warehouse (the extractor), and then moves data to S3 or Snowball. Then in the AWS Cloud,
information is either read from S3 or Snowball and injected into Amazon Redshift.
• Migration from an on-premises relational database to an Amazon RDS database. This approach
again involves a client-side software installation of the AWS Schema Conversion Tool. The tool reads
information from a local database that AWS supports. The tool then moves data to S3 or Snowball.
When the data is in the AWS Cloud, AWS DMS writes it to a supported database in either Amazon EC2
or Amazon RDS.
Process Overview
The process of using AWS DMS and AWS Snowball involves several steps, and it uses not only AWS DMS
and AWS Snowball but also the AWS Schema Conversion Tool (AWS SCT). The sections following this
overview provide a step-by-step guide to each of these tasks.
Note
We recommend that you test your migration before you use the AWS Snowball device. To do so,
you can set up a task to send data, such as a single table, to an Amazon S3 bucket instead of the
AWS Snowball device.
The migration involves a local task, in which AWS SCT moves the data to the AWS Snowball Edge device;
an intermediate action, in which the data is copied from the AWS Snowball Edge device to an S3 bucket;
and a remote task, in which AWS DMS loads the data from the Amazon S3 bucket to the target data
store on AWS.
The following steps need to occur to migrate data from a local data store to an AWS data store using
AWS Snowball.
1. Create an AWS Snowball job using the AWS Snowball console. For more information, see Create an
Import Job in the AWS Snowball documentation.
2. Download and install the AWS SCT application on a local machine. The machine must have network
access and be able to access the AWS account to be used for the migration. For more information
about the operating systems AWS SCT can be installed on, see Installing and Updating the AWS
Schema Conversion Tool.
3. Install the AWS SCT DMS Agent (DMS Agent) on a local, dedicated Linux machine. We recommend that
you do not install the DMS Agent on the same machine that you install the AWS SCT application.
4. Unlock the AWS Snowball Edge device using the local, dedicated Linux machine where you installed
the DMS Agent.
5. Create a new project in AWS SCT.
6. Configure the AWS SCT to use the DMS Agent.
7. Register the DMS Agent with the AWS SCT.
8. Install the database driver for your source database on the dedicated machine where you installed the
DMS Agent.
9. Create and set permissions for the Amazon S3 bucket to use.
10. Edit the AWS Service Profile in AWS SCT.
11. Create a Local & DMS Task in SCT.
12. Run and monitor the Local & DMS Task in SCT.
13. Run the AWS SCT task and monitor progress in SCT.
You can install the DMS Agent on the following Linux platforms:
• Red Hat Enterprise Linux versions 6.2 through 6.8, 7.0 and 7.1 (64-bit)
• SUSE Linux version 11.1 (64-bit)
To configure the DMS Agent, you must provide a password and port number. You use the password
in AWS SCT, so keep it handy. The port is the one that the DMS Agent should listen on for AWS SCT
connections. You might have to configure your firewall to allow connectivity.
sudo /opt/amazon/aws-schema-conversion-tool-dms-agent/bin/configure.sh
For example, the following command lists the Amazon S3 bucket used by the device.
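The exact command depends on how you configured the device; one common approach, assuming the S3 adapter on the AWS Snowball Edge device listens at the IP address and port shown, is the following.

aws s3 ls --profile snowballEdge --endpoint-url http://192.0.2.0:8080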
1. Start AWS SCT, and choose New Project for File. The New Project dialog box appears.
2. Add the following project information.
Project Name Type a name for your project, which is stored locally on
your computer.
To update the AWS SCT profile to work with the DMS Agent
2. Choose Settings, and then choose Global Settings. Choose AWS Service Profiles.
3. Choose Add New AWS Service Profile.
AWS Access Key Type the AWS access key for the AWS account and AWS
Region that you plan to use for the migration.
AWS Secret Key Type the AWS secret key for the AWS account and AWS
Region that you plan to use for the migration.
Region Choose the AWS Region for the account you are using.
Your DMS replication instance, S3 bucket, and target data
store must be in this AWS Region.
S3 Bucket folder Type the name of the S3 bucket that was assigned when
you created the AWS Snowball job.
5. After you have entered the information, choose Test Connection to verify that AWS SCT can connect
to the Amazon S3 bucket.
The OLTP Local & DMS Data Migration section in the pop-up window should show all entries with
a status of Pass. If the test fails, the failure is probably because the account you are using does not
have the correct privileges to access the Amazon S3 bucket.
6. If the test passes, choose OK and then OK again to close the window and dialog box.
1. Start AWS SCT, choose View, and then choose Database Migration View (DMS).
2. Choose the Agent tab, and then choose Register. The New Agent Registration dialog box appears.
Host Name Type the IP address of the machine where you installed
the DMS Agent.
Port Type the port number that you used when you configured
the DMS Agent.
Password Type the password that you used when you configured
the DMS Agent.
4. Choose Register to register the agent with your AWS SCT project.
To restart the DMS Agent after database driver installation, change the working directory to
<product_dir>/bin and use the steps listed following for each source database.
cd <product_dir>/bin
./arep.ctl stop
./arep.ctl start
To install on Oracle
Install Oracle Instant Client for Linux (x86-64) version 11.2.0.3.0 or later.
In addition, if not already included in your system, you need to create a symbolic link in the
$ORACLE_HOME/lib directory. This link should be called libclntsh.so, and should point to a specific
version of this file. For example, on an Oracle 12c client:
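The library version depends on your Oracle client installation; the 12.1 file name here is only an example.

cd $ORACLE_HOME/lib
ln -s libclntsh.so.12.1 libclntsh.so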
In addition, the LD_LIBRARY_PATH environment variable should be appended with the Oracle lib
directory and added to the site_arep_login.sh script under the lib folder of the installation. Add this
script if it doesn't exist.
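For example, the entry added to site_arep_login.sh might look like the following, assuming ORACLE_HOME points to your Oracle client installation.

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib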
To install on Microsoft SQL Server
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/microsoft/msodbcsql/lib64/
DriverManagerEncoding=UTF-16
ODBCInstLib=libodbcinst.so
To install on SAP Sybase
export SYBASE_HOME=/opt/sap
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SYBASE_HOME/DataAccess64/ODBC/lib:$SYBASE_HOME/DataAccess/ODBC/lib:$SYBASE_HOME/OCS-16_0/lib:$SYBASE_HOME/OCS-16_0/lib3p64:$SYBASE_HOME/OCS-16_0/lib3p
[Sybase]
Driver=/opt/sap/DataAccess64/ODBC/lib/libsybdrvodb.so
Description=Sybase ODBC driver
To install on MySQL
Make sure that the /etc/odbcinst.ini file contains an entry for MySQL, as in the following example:
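The driver name and library path in this entry are examples; they vary with the MySQL ODBC driver version and package that you install.

[MySQL ODBC 5.3 Unicode Driver]
Driver = /usr/lib64/libmyodbc5w.so
UsageCount = 1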
To install on PostgreSQL
Make sure that the /etc/odbcinst.ini file contains an entry for PostgreSQL, as in the following
example:
[PostgreSQL]
Description = PostgreSQL ODBC driver
Driver = /usr/pgsql-9.4/lib/psqlodbc.so
Setup = /usr/pgsql-9.4/lib/psqlodbcw.so
Debug = 0
CommLog = 1
UsageCount = 2
1. Start AWS SCT, choose View, and then choose Database Migration View (Local & DMS).
2. In the left panel that displays the schema from your source database, choose a schema object to
migrate. Open the context (right-click) menu for the object, and then choose Create Local & DMS
Task.
Replication Instance Choose the AWS DMS replication instance that you want
to use. You can only specify a Replication Instance that is
version 2.4.3.
Target table preparation mode Choose the preparation mode you want to use.
IAM role Choose the predefined IAM role that has permissions to
access the Amazon S3 bucket and the target database.
For more information about the permissions required
to access an Amazon S3 bucket, see Prerequisites When
Using Amazon S3 as a Source for AWS DMS (p. 141).
Job Name Choose the AWS Snowball job name you created.
Port Type the port value for the AWS Snowball appliance.
Local AWS S3 Access key Type the AWS access key for the account you are using for
the migration.
Local AWS S3 Secret key Type the AWS secret key for the account you are using for
the migration.
You can monitor the DMS Agent logs by choosing Show Log. The log details include agent server (Agent
Log) and local running task (Task Log) logs. Because the endpoint connectivity is done by the server
(since the local task is not running and there are no task logs), connection issues are listed under the
Agent Log tab.
To do so, disconnect the AWS Snowball appliance and ship it back to AWS. For more information about
returning the AWS Snowball appliance to AWS, see the steps outlined in Getting Started with AWS
Snowball Edge: Your First Job in the AWS Snowball documentation. You can use the AWS Snowball
console or AWS SCT (show details of the DMS task) to check the status of the appliance and find out
when AWS DMS begins to load data to the Amazon S3 bucket.
After the AWS Snowball appliance arrives at AWS and unloads data to the S3 bucket, you can see that the
remote (DMS) task starts to run. If the migration type you selected for the task was Migrate existing
data, the status for the DMS task shows 100% complete when the data has been transferred from
Amazon S3 to the target data store. If you set the task mode to include ongoing replication, then
after the full load is complete the task status shows that the task continues to run, while AWS DMS applies
ongoing changes.
Topics
• Slow Running Migration Tasks (p. 297)
• Task Status Bar Not Moving (p. 298)
• Missing Foreign Keys and Secondary Indexes (p. 298)
• Amazon RDS Connection Issues (p. 298)
• Networking Issues (p. 298)
• CDC Stuck After Full Load (p. 299)
• Primary Key Violation Errors When Restarting a Task (p. 299)
• Initial Load of Schema Fails (p. 299)
• Tasks Failing With Unknown Error (p. 299)
• Task Restart Loads Tables From the Beginning (p. 300)
• Number of Tables Per Task (p. 300)
• Troubleshooting Oracle Specific Issues (p. 300)
• Troubleshooting MySQL Specific Issues (p. 302)
• Troubleshooting PostgreSQL Specific Issues (p. 306)
• Troubleshooting Microsoft SQL Server Specific Issues (p. 308)
• Troubleshooting Amazon Redshift Specific Issues (p. 309)
• Troubleshooting Amazon Aurora MySQL Specific Issues (p. 310)
For more information about determining the size of your replication instance, see Choosing the
Optimum Size for a Replication Instance (p. 314).
You can increase the speed of an initial migration load by doing the following:
• If your target is an Amazon RDS DB instance, ensure that Multi-AZ is not enabled for the target DB
instance.
• Turn off any automatic backups or logging on the target database during the load, and turn back on
those features once the migration is complete.
• If the feature is available on the target, use Provisioned IOPS.
• If your migration data contains LOBs, ensure that the task is optimized for LOB migration. See Target
Metadata Task Settings (p. 227) for more information on optimizing for LOBs.
To migrate secondary objects from your database, use the database engine's native tools if you are
migrating to the same database engine as your source database. If you are migrating to a different
database engine, use the AWS Schema Conversion Tool to migrate secondary objects.
Networking Issues
The most common networking issue involves the VPC security group used by the AWS DMS replication
instance. By default, this security group has rules that allow egress to 0.0.0.0/0 on all ports. If you
modify this security group or use your own security group, egress must, at a minimum, be permitted to
the source and target endpoints on the respective database ports.
• Replication instance and both source and target endpoints in the same VPC — The security group
used by the endpoints must allow ingress on the database port from the replication instance. Ensure
that the security group used by the replication instance has ingress to the endpoints, or you can create
a rule in the security group used by the endpoints that allows the private IP address of the replication
instance access.
• Source endpoint is outside the VPC used by the replication instance (using Internet Gateway) —
The VPC security group must include routing rules that send traffic not destined for the VPC to the
Internet Gateway. In this configuration, the connection to the endpoint appears to come from the
public IP address on the replication instance.
• Source endpoint is outside the VPC used by the replication instance (using NAT Gateway) — You
can configure a network address translation (NAT) gateway using a single Elastic IP Address bound to
a single Elastic Network Interface which then receives a NAT identifier (nat-#####). If the VPC includes
a default route to that NAT Gateway instead of the Internet Gateway, the replication instance will
instead appear to contact the Database Endpoint using the public IP address of the Internet Gateway.
In this case, the ingress to the Database Endpoint outside the VPC needs to allow ingress from the NAT
address instead of the Replication Instance’s public IP Address.
Monitor your replication instance's memory, swap files, and IOPS to ensure that your instance has
enough resources to perform the migration. For more information on monitoring, see Data Migration
Service Metrics (p. 265).
Topics
• Pulling Data from Views (p. 300)
• Migrating LOBs from Oracle 12c (p. 300)
• Switching Between Oracle LogMiner and Binary Reader (p. 301)
• Error: Oracle CDC stopped 122301 Oracle CDC maximum retry counter exceeded. (p. 301)
• Automatically Add Supplemental Logging to an Oracle Source Endpoint (p. 301)
• LOB Changes not being Captured (p. 302)
• Error: ORA-12899: value too large for column <column-name> (p. 302)
• NUMBER data type being misinterpreted (p. 302)
exposeViews=true
useLogminerReader=N
6. Use an Oracle developer tool such as SQL*Plus to grant the following additional privilege to the
AWS DMS user account used to connect to the Oracle endpoint:
SELECT ON V_$TRANSPORTABLE_PLATFORM
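As a sketch, assuming the endpoint account is named dms_user, the grant looks like the following.

GRANT SELECT ON V_$TRANSPORTABLE_PLATFORM TO dms_user;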
4. Select Modify.
5. Select Advanced, and then add the following code to the Extra connection attributes text box:
addSupplementalLogging=Y
6. Choose Modify.
• Add a primary key to the table. This can be as simple as adding an ID column and populating it with a
sequence using a trigger.
• Create a materialized view of the table that includes a system generated ID as the primary key and
migrate the materialized view rather than the table.
• Create a logical standby, add a primary key to the table, and migrate from the logical standby.
Topics
• CDC Task Failing for Amazon RDS DB Instance Endpoint Because Binary Logging Disabled (p. 303)
• Connections to a target MySQL instance are disconnected during a task (p. 303)
• Adding Autocommit to a MySQL-compatible Endpoint (p. 303)
• Disable Foreign Keys on a Target MySQL-compatible Endpoint (p. 304)
• Characters Replaced with Question Mark (p. 304)
• "Bad event" Log Entries (p. 304)
To solve the issue where a task is being disconnected from a MySQL target, do the following:
• Check that you have your database variable max_allowed_packet set large enough to hold your
largest LOB.
• Check that you have the following variables set to have a large timeout value. We suggest you use a
value of at least 5 minutes for each of these variables.
• net_read_timeout
• net_write_timeout
• wait_timeout
• interactive_timeout
3. Select the MySQL-compatible target endpoint that you want to add autocommit to.
4. Select Modify.
5. Select Advanced, and then add the following code to the Extra connection attributes text box:
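For example, the following Initstmt attribute turns on autocommit for the target connection.

Initstmt=SET AUTOCOMMIT=1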
6. Choose Modify.
Initstmt=SET FOREIGN_KEY_CHECKS=0
6. Choose Modify.
When AWS DMS is set to create the tables and primary keys in the target database, it currently does
not use the same names for the primary keys that were used in the source database. Instead, AWS
DMS creates the primary key name based on the table name. When the table name is long, the auto-
generated identifier can be longer than the limits allowed for MySQL. To solve this issue, currently,
pre-create the tables and primary keys in the target database and use a task with the task setting
Target table preparation mode set to Do nothing or Truncate to populate the target tables.
"[SOURCE_CAPTURE ]E: Column ‘<column name>' uses an unsupported character set [120112]
A field data conversion failed. (mysql_endpoint_capture.c:2154)
This error often occurs because of tables or databases using UTF8MB4 encoding. AWS DMS does
not support the UTF8MB4 character set. In addition, check your database's parameters related to
connections. The following command can be used to see these parameters:
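For example, a standard MySQL statement like the following displays the relevant character set parameters.

SHOW VARIABLES LIKE '%char%';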
As a workaround, you can use the CharsetMapping extra connection attribute with your source MySQL
endpoint to specify character set mapping. You might need to restart the AWS DMS migration task from
the beginning if you add this extra connection attribute.
For example, the following extra connection attribute could be used for a MySQL source endpoint where
the source character set is utf8 or latin1. 65001 is the UTF8 code page identifier.
CharsetMapping=utf8,65001
CharsetMapping=latin1,65001
Topics
• JSON data types being truncated (p. 306)
• Columns of a user defined data type not being migrated correctly (p. 307)
• Error: No schema has been selected to create in (p. 307)
• Deletes and updates to a table are not being replicated using CDC (p. 307)
• Truncate statements are not being propagated (p. 307)
• Preventing PostgreSQL from capturing DDL (p. 307)
• Selecting the schema where database objects for capturing DDL are created (p. 308)
• Oracle tables missing after migrating to PostgreSQL (p. 308)
• Task Using View as a Source Has No Rows Copied (p. 308)
For example, the following log information shows JSON that was truncated due to the Limited LOB
mode setting and failed validation.
2017-09-19T03:00:49 [TARGET_APPLY ]E: Failed to execute statement:
'UPDATE "public"."delivery_options_quotes" SET "id"=? , "enabled"=? ,
"new_cart_id"=? , "order_id"=? , "user_id"=? , "zone_id"=? , "quotes"=? ,
"start_at"=? , "end_at"=? , "last_quoted_at"=? , "created_at"=? ,
"updated_at"=? WHERE "id"=? ' [1022502] (ar_odbc_stmt.c:2415)

2017-09-19T03:00:49 [TARGET_APPLY ]E: RetCode: SQL_ERROR SqlState:
22P02 NativeError: 1 Message: ERROR: invalid input syntax for type json;,
Error while executing the query [1022502] (ar_odbc_stmt.c:2421)
captureDDLs=N
ddlArtifactsSchema=xyzddlschema
Your tables and data are still accessible; if you migrated your tables without using transformation rules
to convert the case of your table names, you will need to enclose your table names in quotes when
referencing them.
Topics
• Special Permissions for AWS DMS user account to use CDC (p. 308)
• Errors Capturing Changes for SQL Server Database (p. 309)
• Missing Identity Columns (p. 309)
• Error: SQL Server Does Not Support Publications (p. 309)
• Changes Not Appearing in Target (p. 309)
SOURCE_CAPTURE E: No FULL database backup found (under the 'FULL' recovery model).
To enable all changes to be captured, you must perform a full database backup.
120438 Changes may be missed. (sqlserver_log_queries.c:2623)
Review the prerequisites listed for using SQL Server as a source in Using a Microsoft SQL Server
Database as a Source for AWS DMS (p. 100).
AWS DMS currently does not support SQL Server Express as a source or target.
The SIMPLE recovery model logs the minimal information needed to allow users to recover their
database. All inactive log entries are automatically truncated when a checkpoint occurs. All operations
are still logged, but as soon as a checkpoint occurs the log is automatically truncated, which means
that it becomes available for reuse and older log entries can be overwritten. When log entries are
overwritten, changes cannot be captured, and that is why AWS DMS doesn't support the SIMPLE
recovery model. For information on other required prerequisites for using SQL Server as a source, see
Using a Microsoft SQL Server Database as a Source for AWS DMS (p. 100).
Topics
• Loading into an Amazon Redshift Cluster in a Different Region Than the AWS DMS Replication
Instance (p. 310)
• Error: Relation "awsdms_apply_exceptions" already exists (p. 310)
• Errors with Tables Whose Name Begins with "awsdms_changes" (p. 310)
• Seeing Tables in Cluster with Names Like dms.awsdms_changes000000000XXXX (p. 310)
To see all the prerequisites required for using Amazon Redshift as a target, see Using an Amazon
Redshift Database as a Target for AWS Database Migration Service (p. 163).
Topics
• Error: CHARACTER SET UTF8 fields terminated by ',' enclosed by '"' lines terminated by '\n' (p. 311)
2016-11-02T14:23:48 [TARGET_LOAD ]E: Load data sql statement. load data local infile
"/rdsdbdata/data/tasks/7XO4FJHCVON7TYTLQ6RX3CQHDU/data_files/4/LOAD000001DF.csv" into
table
`VOSPUSER`.`SANDBOX_SRC_FILE` CHARACTER SET UTF8 fields terminated by ','
enclosed by '"' lines terminated by '\n'( `SANDBOX_SRC_FILE_ID`,`SANDBOX_ID`,
`FILENAME`,`LOCAL_PATH`,`LINES_OF_CODE`,`INSERT_TS`,`MODIFIED_TS`,`MODIFIED_BY`,
`RECORD_VER`,`REF_GUID`,`PLATFORM_GENERATED`,`ANALYSIS_TYPE`,`SANITIZED`,`DYN_TYPE`,
`CRAWL_STATUS`,`ORIG_EXEC_UNIT_VER_ID` ) ; (provider_syntax_manager.c:2561)
Topics
• Improving the Performance of an AWS DMS Migration (p. 312)
• Choosing the Optimum Size for a Replication Instance (p. 314)
• Reducing the Load on Your Source Database (p. 315)
• Using the Task Log to Troubleshoot Migration Issues (p. 315)
• Converting Schema (p. 315)
• Migrating Large Binary Objects (LOBs) (p. 315)
• Ongoing Replication (p. 316)
• Changing the User and Schema for an Oracle Target (p. 317)
• Improving Performance When Migrating Large Tables (p. 317)
In our tests, we've migrated a terabyte of data in approximately 12 to 13 hours using a single AWS DMS
task and under ideal conditions. These ideal conditions included using source databases running on
Amazon EC2 and in Amazon RDS with target databases in Amazon RDS, all in the same Availability Zone.
Our source databases contained a representative amount of relatively evenly distributed data with a few
large tables containing up to 250 GB of data. The source data didn't contain complex data types, such as
BLOB.
You can improve performance by using some or all of the best practices mentioned following. Whether
you can use one of these practices or not depends in large part on your specific use case. We mention
limitations as appropriate.
By default, AWS DMS loads eight tables at a time. You might see some performance improvement by
increasing this slightly when using a very large replication server, such as a dms.c4.xlarge or larger
instance. However, at some point, increasing this parallelism reduces performance. If your replication
server is relatively small, such as a dms.t2.medium, we recommend that you reduce the number of
tables loaded in parallel.
To change this number in the AWS Management Console, open the console, choose Tasks, choose
to create or modify a task, and then choose Advanced Settings. Under Tuning Settings, change the
Maximum number of tables to load in parallel option.
To change this number using the AWS CLI, change the MaxFullLoadSubTasks parameter under
TaskSettings.
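As a sketch, a modify-replication-task call along these lines raises the value. The task ARN is a placeholder, and this assumes the setting lives in the FullLoadSettings section of the task settings JSON.

aws dms modify-replication-task --replication-task-arn arn:aws:dms:us-east-1:123456789012:task:EXAMPLE --replication-task-settings '{"FullLoadSettings":{"MaxFullLoadSubTasks":16}}'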
Working with Indexes, Triggers and Referential Integrity Constraints
Indexes, triggers, and referential integrity constraints can affect your migration performance and
cause your migration to fail. How these affect migration depends on whether your replication task is
a full load task or an ongoing replication (CDC) task.
For a full load task, we recommend that you drop primary key indexes, secondary indexes, referential
integrity constraints, and data manipulation language (DML) triggers. Alternatively, you can delay
their creation until after the full load tasks are complete. You don't need indexes during a full load
task and indexes will incur maintenance overhead if they are present. Because the full load task
loads groups of tables at a time, referential integrity constraints are violated. Similarly, insert,
update, and delete triggers can cause errors, for example, if a row insert is triggered for a previously
bulk loaded table. Other types of triggers also affect performance due to added processing.
You can build primary key and secondary indexes before a full load task if your data volumes
are relatively small and the additional migration time doesn't concern you. Referential integrity
constraints and triggers should always be disabled.
For a full load + CDC task, we recommend that you add secondary indexes before the CDC phase.
Because AWS DMS uses logical replication, secondary indexes that support DML operations should
be in place to prevent full table scans. You can pause the replication task before the CDC phase to
build indexes, create triggers, and create referential integrity constraints before you restart the task.
Disable Backups and Transaction Logging
When migrating to an Amazon RDS database, it’s a good idea to disable backups and Multi-AZ on
the target until you’re ready to cut over. Similarly, when migrating to non-Amazon RDS systems,
disabling any logging on the target until after cut over is usually a good idea.
Use Multiple Tasks
Sometimes using multiple tasks for a single migration can improve performance. If you have
sets of tables that don’t participate in common transactions, you might be able to divide your
migration into multiple tasks. Transactional consistency is maintained within a task, so it’s important
that tables in separate tasks don't participate in common transactions. Additionally, each task
independently reads the transaction stream, so be careful not to put too much stress on the source
database.
You can use multiple tasks to create separate streams of replication to parallelize the reads on the
source, the processes on the replication instance, and the writes to the target database.
Optimizing Change Processing
By default, AWS DMS processes changes in a transactional mode, which preserves transactional
integrity. If you can afford temporary lapses in transactional integrity, you can use the batch
optimized apply option instead. This option efficiently groups transactions and applies them in
batches for efficiency purposes. Using the batch optimized apply option almost always violates
referential integrity constraints, so you should disable these during the migration process and enable
them again as part of the cut over process.
During a full load task, AWS DMS loads tables individually. By default, eight tables are loaded at a time.
AWS DMS captures ongoing changes to the source during a full load task so the changes can be applied
later on the target endpoint. The changes are cached in memory; if available memory is exhausted,
changes are cached to disk. When a full load task completes for a table, AWS DMS immediately applies
the cached changes to the target table.
After all outstanding cached changes for a table have been applied, the target endpoint is in a
transactionally consistent state. At this point, the target is in-sync with the source endpoint with respect
to the last cached changes. AWS DMS then begins ongoing replication between the source and target.
To do so, AWS DMS takes change operations from the source transaction logs and applies them to the
target in a transactionally consistent manner (assuming batch optimized apply is not selected). AWS DMS
streams ongoing changes through memory on the replication instance, if possible. Otherwise, AWS DMS
writes changes to disk on the replication instance until they can be applied on the target.
You have some control over how the replication instance handles change processing, and how memory
is used in that process. For more information on how to tune change processing, see Change Processing
Tuning Settings (p. 233).
From the preceding explanation, you can see that total available memory is a key consideration. If the
replication instance has sufficient memory that AWS DMS can stream cached and ongoing changes
without writing them to disk, migration performance increases greatly. Similarly, configuring the
replication instance with enough disk space to accommodate change caching and log storage also
increases performance. The maximum IOPS possible depends on the selected disk size.
Consider the following factors when choosing a replication instance class and available disk storage:
• Table size – Large tables take longer to load and so transactions on those tables must be cached until
the table is loaded. After a table is loaded, these cached transactions are applied and are no longer
held on disk.
• Data manipulation language (DML) activity – A busy database generates more transactions. These
transactions must be cached until the table is loaded. Transactions to an individual table are applied as
soon as possible after the table is loaded, until all tables are loaded.
• Transaction size – Long-running transactions can generate many changes. For best performance, if
AWS DMS applies changes in transactional mode, sufficient memory must be available to stream all
changes in the transaction.
• Total size of the migration – Large migrations take longer and they generate a proportionally large
number of log files.
• Number of tasks – The more tasks, the more caching is likely to be required, and the more log files are
generated.
• Large objects – Tables with LOBs take longer to load.
Anecdotal evidence shows that log files consume the majority of space required by AWS DMS. The
default storage configurations are usually sufficient.
However, replication instances that run several tasks might require more disk space. Additionally, if your
database includes large and active tables, you might need to increase disk space for transactions that are
cached to disk during a full load task. For example, if your load takes 24 hours and you produce 2GB of
transactions each hour, you might want to ensure that you have 48GB of space for cached transactions.
Also, the more storage space you allocate to the replication instance, the higher the IOPS you get.
The guidelines preceding don’t cover all possible scenarios. It’s critically important to consider the
specifics of your particular use case when you determine the size of your replication instance. After your
migration is running, monitor the CPU, freeable memory, storage free, and IOPS of your replication
instance. Based on the data you gather, you can size your replication instance up or down as needed.
If you find you are overburdening your source database, you can reduce the number of tasks or tables for
each task for your migration. Each task gets source changes independently, so consolidating tasks can
decrease the change capture workload.
Converting Schema
AWS DMS doesn't perform schema or code conversion. If you want to convert an existing schema
to a different database engine, you can use the AWS Schema Conversion Tool (AWS SCT). AWS SCT
converts your source objects, tables, indexes, views, triggers, and other system objects into the target data
definition language (DDL) format. You can also use AWS SCT to convert most of your application code,
like PL/SQL or TSQL, to the equivalent target language.
You can get AWS SCT as a free download from AWS. For more information on AWS SCT, see the AWS
Schema Conversion Tool User Guide.
If your source and target endpoints are on the same database engine, you can use tools such as Oracle
SQL Developer, MySQL Workbench, or PgAdmin4 to move your schema.
1. AWS DMS creates a new row in the target table and populates the row with all data except the
associated LOB value.
2. AWS DMS updates the row in the target table with the LOB data.
This migration process for LOBs requires that, during the migration, all LOB columns on the target table
must be nullable. This is so even if the LOB columns aren't nullable on the source table. If AWS DMS
creates the target tables, it sets LOB columns to nullable by default. If you create the target tables using
some other mechanism, such as import or export, you must ensure that the LOB columns are nullable
before you start the migration task.
This requirement has one exception. Suppose that you perform a homogeneous migration from an
Oracle source to an Oracle target, and you choose Limited Lob mode. In this case, the entire row is
populated at once, including any LOB values. For such a case, AWS DMS can create the target table LOB
columns with not-nullable constraints, if needed.
• Limited LOB mode migrates all LOB values up to a user-specified size limit (default is 32 KB). LOB
values larger than the size limit must be manually migrated. Limited LOB mode, the default for all
migration tasks, typically provides the best performance. However, you need to ensure that the Max
LOB size parameter setting is correct. This parameter should be set to the largest LOB size for all your
tables.
• Full LOB mode migrates all LOB data in your tables, regardless of size. Full LOB mode provides the
convenience of moving all LOB data in your tables, but the process can have a significant impact on
performance.
For some database engines, such as PostgreSQL, AWS DMS treats JSON data types like LOBs. Make sure
that if you have chosen Limited LOB mode the Max LOB size option is set to a value that doesn't cause
the JSON data to be truncated.
AWS DMS provides full support for using large object data types (BLOBs, CLOBs, and NCLOBs). The
following source endpoints have full LOB support:
• Oracle
• Microsoft SQL Server
• ODBC
The following target endpoints have full LOB support:
• Oracle
• Microsoft SQL Server
The following target endpoint has limited LOB support. You can't use an unlimited LOB size for this
target endpoint.
• Amazon Redshift
For endpoints that have full LOB support, you can also set a size limit for LOB data types.
Ongoing Replication
AWS DMS provides ongoing replication of data, keeping the source and target databases in sync. It
replicates only a limited amount of data definition language (DDL). AWS DMS doesn't propagate items
such as indexes, users, privileges, stored procedures, and other database changes not directly related to
table data.
If you plan to use ongoing replication, you should enable the Multi-AZ option when you create your
replication instance. By choosing the Multi-AZ option you get high availability and failover support for
the replication instance. However, this option can have an impact on performance.
For example, if you want to migrate from the source schema PERFDATA to the target schema
PERFDATA, you'll need to create a transformation as follows:
{
    "rule-type": "transformation",
    "rule-id": "2",
    "rule-name": "2",
    "rule-action": "rename",
    "rule-target": "schema",
    "object-locator": {
        "schema-name": "PERFDATA"
    },
    "value": "PERFDATA"
}
For more information about transformations, see Specifying Table Selection and Transformations by
Table Mapping Using JSON (p. 250).
To apply row filtering in the AWS Management Console, open the console, choose Tasks, and create
a new task. In the Table mappings section, add a value for Selection Rule. You can then add a
column filter with either a less than or equal to, greater than or equal to, equal to, or range condition
(between two values). For more information about column filtering, see Specifying Table Selection and
Transformations by Table Mapping from the Console (p. 245).
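For the same row filtering defined in JSON table mappings, a selection rule might look like the following sketch. The schema, table, and column names here are hypothetical, and the filter keeps only rows with an order_date value on or after 2018-01-01.

{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "PERFDATA",
                "table-name": "orders"
            },
            "rule-action": "include",
            "filters": [
                {
                    "filter-type": "source",
                    "column-name": "order_date",
                    "filter-conditions": [
                        {
                            "filter-operator": "gte",
                            "value": "2018-01-01"
                        }
                    ]
                }
            ]
        }
    ]
}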
Alternatively, if you have a large partitioned table that is partitioned by date, you can migrate data based
on date. For example, suppose that you have a table partitioned by month, and only the current month’s
data is updated. In this case, you can create a full load task for each static monthly partition and create a
full load + CDC task for the currently updated partition.
AWS DMS maintains data types when you do a homogenous database migration where both source and
target use the same engine type. When you do a heterogeneous migration, where you migrate from
one database engine type to a different database engine, data types are converted to an intermediate
data type. To see how the data types appear on the target database, consult the data type tables for the
source and target database engines.
Be aware of a few important things about data types when migrating a database:
• The UTF-8 4-byte character set (utf8mb4) isn't supported and can cause unexpected behavior in a
source database. Plan to convert any data using the UTF-8 4-byte character set before migrating.
• The FLOAT data type is inherently an approximation. When you insert a specific value in FLOAT, it
might be represented differently in the database. This difference is because FLOAT isn't an exact data
type, such as a decimal data type like NUMBER or NUMBER(p,s). As a result, the internal value of
FLOAT stored in the database might be different than the value that you insert. Thus, the migrated
value of a FLOAT might not match exactly the value in the source database.
Topics
• Data Types for AWS Database Migration Service (p. 319)
Support for latency recalculation when there is a daylight saving time change: You can now recalculate time offsets during a daylight saving time change when using Oracle or PostgreSQL as a source.
July 19, 2018 Fixed an issue where PostgreSQL as a source was sending Null values
as Empty to Oracle during change data capture in Full LOB mode.
September 19, 2018 Fixed an issue where Null values in SQL Server varchar columns
were migrated differently to all targets.
October 7, 2018 Fixed an issue where the LOB setting didn't work when transformation
rules were present.
October 12, 2018 Fixed an issue where ongoing replication tasks with Oracle as a source
failed to resume after a stop in certain cases.
October 12, 2018 Fixed an issue where ongoing replication tasks with SQL Server as a
source failed to resume after a stop in certain cases.
Multiple dates Fixed multiple issues with PostgreSQL as a source that were present in
version 3.1.1.
Migration of 4-byte UTF8 characters: AWS DMS now supports all 4-byte character sets, such as UTF8MB4, and so on. This feature works without any configuration changes.
Support for Microsoft SQL Server 2017 as a source: Added support for SQL Server 2017 as a source. For more details, see Using a Microsoft SQL Server Database as a Source for AWS DMS (p. 100).
Support for parallel full load of tables: Added support for parallel full load of large tables based on partitions and subpartitions. This feature uses a separate unload thread for each table partition or subpartition to speed up the bulk load process. You can also specify specific ranges or partitions to migrate a subset of the table data. Supported sources are Oracle, SQL Server, Sybase, MySQL, PostgreSQL, and IBM Db2 for Linux, UNIX, and Windows (Db2 LUW). For more information, see Parallel Loading of Tables (p. 229).
Control large object (LOB) settings per table: You can now control LOB settings per table with additional table mapping settings. For more information, see Target Metadata Task Settings (p. 227). Supported sources are Oracle, SQL Server, MySQL, and PostgreSQL.
Control the order for loading tables in a single migration task: You can now control the order for loading tables with table mappings in a migration task. You can specify the order by tagging the table with a load-order unsigned integer in the table mappings. Tables with higher load-order values are migrated first. For more information, see Using Table Mapping to Specify Task Settings (p. 245).
Support for updates to primary key values when using PostgreSQL as a source: Updates to primary key values are now replicated when you use PostgreSQL as a source for ongoing replication.
April 24, 2018 Fixed an issue where users couldn't create Azure SQL as a source
endpoint for SQL Server 2016.
May 5, 2018 Fixed an issue where CHR(0) in an Oracle source was migrated as
CHR(32) in an Aurora with MySQL compatibility target.
May 10, 2018 Fixed an issue where ongoing replication from Oracle as a source didn't
work as expected when using Oracle LogMiner to migrate changes
from an Oracle physical standby.
May 27, 2018 Fixed an issue where characters of various data types in PostgreSQL
were tripled during migration to PostgreSQL.
June 12, 2018 Fixed an issue where data was changed during a migration from TEXT
to NCLOB (PostgreSQL to Oracle) due to differences in how these
engines handle nulls within a string.
June 17, 2018 Fixed an issue where the replication task failed to create primary keys
in a target MySQL version 5.5 instance when migrating from a source
MySQL version 5.5.
June 23, 2018 Fixed an issue where JSON columns were truncated in full LOB mode
when migrating from a PostgreSQL instance to Aurora with PostgreSQL
compatibility.
June 27, 2018 Fixed an issue where batch application of changes to PostgreSQL as a
target failed because of an issue creating the intermediate net changes
table on the target.
June 30, 2018 Fixed an issue where the MySQL timestamp '0000-00-00
00:00:00' wasn't migrated as expected while performing a full load.
July 2, 2018 Fixed an issue where a DMS replication task didn't continue as expected
after the source Aurora MySQL failover occurred.
July 9, 2018 Fixed an issue with a migration from MySQL to Amazon Redshift where
the task failed with an unknown column and data type error.
July 21, 2018 Fixed an issue where null characters in a string migrated differently
from SQL Server to PostgreSQL in limited LOB and full LOB modes.
July 23, 2018 Fixed an issue where the safeguard transactions in SQL Server as a
source filled up the transaction log in SQL Server.
July 26, 2018 Fixed an issue where null values were migrated as empty values in a
roll-forward migration from PostgreSQL to Oracle.
Multiple dates Fixed various logging issues to keep users better informed about
migration through Amazon CloudWatch Logs.
Validation for migrations with filter clauses: You can now validate data when migrating a subset of a table using table filters.
Open Database Connectivity (ODBC) driver upgrades: The underlying ODBC driver for MySQL was upgraded to 5.3.11-1, and the underlying ODBC driver for Amazon Redshift was upgraded to 1.4.2-1010.
Latency recalculation in case of daylight saving time changes: You can now recalculate the time offset during daylight saving time changes for Oracle and PostgreSQL as a source. Source and target latency calculations are accurate after the daylight saving time change.
UUID data type conversion (SQL Server to MySQL): You can now convert a UNIQUEIDENTIFIER data type (that is, a universally unique identifier or UUID) to bytes when migrating between SQL Server as a source and MySQL as a target.
Ability to change encryption modes for Amazon S3 as a source and Amazon Redshift as a target: You can now change the encryption mode when migrating between S3 as a source and Amazon Redshift as a target. You specify the encryption mode with a connection attribute. Server-side encryption and AWS KMS are both supported.
July 17, 2018 Fixed an issue where PostgreSQL as a source sent null values as empty
values to target Oracle databases during change data capture (CDC) in
full large binary object (LOB) mode.
July 29, 2018 Fixed an issue where migration tasks to and from Amazon S3 failed to
resume after upgrading from DMS version 1.9.0.
September 12, 2018 Fixed an issue where DMS working with SQL Server safeguard
transactions blocked the transaction log from being reused.
September 21, 2018 Fixed an issue with failed bulk loads from PostgreSQL as a source to
Amazon Redshift as a target. The failed tasks did not report a failure
when the full load was interrupted.
October 3, 2018 Fixed an issue where a DMS migration task didn't fail when
prerequisites for ongoing replication weren't properly configured for
SQL Server as a source.
Multiple dates Fixed multiple issues related to data validation, and added enhanced
support for validating multibyte UTF-8 characters.
Table metadata recreation on mismatch: Added a new extra connection attribute for MySQL endpoints: CleanSrcMetadataOnMismatch.
February 12, 2018 Fixed an issue in ongoing replication using batch apply where AWS
DMS was missing some inserts when a unique constraint in the table was
being updated.
March 16, 2018 Fixed an issue where an Oracle to PostgreSQL migration task was
crashing during the ongoing replication phase due to Multi-AZ failover
on the source Amazon RDS for Oracle instance.
Binary Reader support for Amazon RDS for Oracle during change data capture: Added support for using Binary Reader in change data capture (CDC) scenarios from an Amazon RDS for Oracle source during ongoing replication.
Additional COPY command Introduced support for the following additional Amazon Redshift copy
parameters for Amazon parameters using extra connection attributes. For more information,
Redshift as a target see Extra Connection Attributes When Using Amazon Redshift as a
Target for AWS DMS (p. 166).
• TRUNCATECOLUMNS
Option to fail a migration Introduced support to fail a task when a truncate is encountered
task when a table is in a PostgreSQL source when using a new task setting. For more
truncated in a PostgreSQL information, see the ApplyErrorFailOnTruncationDdl setting in
source the section Error Handling Task Settings (p. 234).
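As a minimal sketch, assuming the setting lives under ErrorBehavior in the task settings
JSON (confirm this in the Error Handling Task Settings section), it could be applied to an
existing task like this; the task ARN and file name are placeholders:

# task-settings.json is assumed to contain: { "ErrorBehavior": { "ApplyErrorFailOnTruncationDdl": true } }
aws dms modify-replication-task \
    --replication-task-arn arn:aws:dms:us-east-1:123456789012:task:EXAMPLE \
    --replication-task-settings file://task-settings.json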
Validation support for JSON/JSONB/HSTORE in PostgreSQL endpoints: Introduced data validation
support for JSON, JSONB, and HSTORE columns for PostgreSQL as a source and target.
Improved logging for MySQL sources: Improved log visibility for issues when reading MySQL
binary logs (binlogs) during change data capture (CDC). Logs now clearly show an error or
warning if there are issues accessing MySQL source binlogs during CDC.
Additional data validation statistics: Added more replication table statistics. For more
information, see Replication Task Statistics (p. 273).
January 14, 2018 Fixed all issues with handling zero dates (0000-00-00) when migrating
to MySQL targets during full load and CDC. MySQL doesn't accept
0000-00-00 (it is invalid in MySQL), although some engines do. All these
dates become 0101-01-01 for a MySQL target.
January 21, 2018 Fixed an issue where migration failed for a table whose name contains
a $ sign.
February 3, 2018 Fixed an issue where a JSON column from a PostgreSQL source was
truncated when migrated to any supported target.
February 12, 2018 Fixed an issue where a migration task failed after a failover on an
Aurora MySQL target.
February 21, 2018 Fixed an issue where a migration task couldn't start its ongoing
replication phase after a network connectivity issue.
February 23, 2018 Fixed an issue where certain transformation rules in table mappings
were causing migration task crashes during ongoing replication to
Amazon Redshift targets.
JSONB support for PostgreSQL sources: Introduced support for JSONB migration from PostgreSQL
as a source. JSONB is treated as a LOB data type and requires appropriate LOB settings to
be used.
HSTORE support for PostgreSQL sources: Introduced support for HSTORE data type migration from
PostgreSQL as a source. HSTORE is treated as a LOB data type and requires appropriate LOB
settings to be used.
Additional COPY command parameters for Amazon Redshift as a target: Introduced support for
the following additional COPY parameters by using these extra connection attributes:
• ACCEPTANYDATE
• DATEFORMAT
• TIMEFORMAT
• EMPTYASNULL
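For example, a hedged sketch of passing these parameters as extra connection attributes on
an Amazon Redshift target endpoint; the attribute spellings and values shown are assumptions
to verify against the Redshift target documentation:

# Placeholder ARN; acceptanydate, dateformat, timeformat, and emptyasnull are assumed attribute spellings.
aws dms modify-endpoint \
    --endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:EXAMPLE \
    --extra-connection-attributes "acceptanydate=true;dateformat=auto;timeformat=auto;emptyasnull=true"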
July 12, 2017 Fixed an issue where a migration task hung before the full load phase
started when reading from an Oracle table with TDE column encryption
enabled.
October 3, 2017 Fixed an issue where a JSON column from a PostgreSQL source didn't
migrate as expected.
October 5, 2017 Fixed an issue where a DMS migration task showed 0 source latency when
an archive redo log file was not found on the source Oracle instance.
With this fix, source latency increases linearly under such conditions.
November 20, 2017 Fixed an issue with LOB migration where a TEXT column in PostgreSQL
was migrating to a CLOB column in Oracle with extra spaces after each
character in the LOB entry.
November 20, 2017 Fixed an issue with a migration task not stopping as expected after an
underlying replication instance upgrade from version 1.9.0 to 2.4.0.
November 30, 2017 Fixed an issue where a DMS migration task didn't properly capture
changes made by a copy command run on a source PostgreSQL
instance.
December 11, 2017 Fixed an issue where a migration task failed when reading change data
from a nonexistent binlog from a MySQL source.
December 11, 2017 Fixed an issue where DMS was reading change data from a nonexistent
table on a MySQL source.
December 20, 2017 Includes several fixes and enhancements for the data validation
feature.
December 22, 2017 Fixed an issue with the maxFileSize parameter for Amazon Redshift
targets. This parameter was incorrectly interpreted as bytes instead
of kilobytes.
January 4, 2018 Fixed a memory allocation bug for migration tasks with Amazon DynamoDB
as a target. In certain conditions, AWS DMS didn't allocate enough
memory if the object mapping being used contained a sort key.
January 10, 2018 Fixed an issue with Oracle 12.2 as a source where data manipulation
language (DML) statements weren't captured as expected when
ROWDEPENDENCIES are used.
Replicating Oracle index tablespaces: Added functionality to support replication of Oracle
index tablespaces. For more details about index tablespaces, see the Oracle documentation.
Support for cross-account Amazon S3 access: Added functionality to support canned ACLs
(predefined grants) to support cross-account access with S3 endpoints. For more details
about canned ACLs, see Canned ACL in the Amazon Simple Storage Service Developer Guide.
The supported values are:
• NONE
• PRIVATE
• PUBLIC_READ
• PUBLIC_READ_WRITE
• AUTHENTICATED_READ
• AWS_EXEC_READ
• BUCKET_OWNER_READ
• BUCKET_OWNER_FULL_CONTROL
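As an illustrative sketch only, assuming a cannedAclForObjects extra connection attribute on
the S3 endpoint (verify the exact attribute name in the S3 endpoint documentation), the
bucket owner could be granted full control like this:

# Placeholder ARN; cannedAclForObjects is an assumed attribute name for the canned ACL applied to written objects.
aws dms modify-endpoint \
    --endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:EXAMPLE \
    --extra-connection-attributes "cannedAclForObjects=BUCKET_OWNER_FULL_CONTROL"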
July 19, 2017 Fixed an issue where replication task runs in a retry loop forever when
a source PostgreSQL instance runs out of replication slots. With this
fix, the task fails with an error reporting that DMS can't create a logical
replication slot.
July 27, 2017 Fixed an issue in the replication engine where the enum MySQL data
type caused task failure with a memory allocation error.
August 7, 2017 Fixed an issue that caused unexpected behavior with migration tasks
with Oracle as a source when the source is down for more than five
minutes. This issue caused the ongoing replication phase to hang even
after the source became available.
August 24, 2017 Fixed an issue with PostgreSQL target where the fraction part in the
TIME data type was handled incorrectly.
September 14, 2017 Fixed an issue where incorrect values were being written to TOAST
fields in PostgreSQL-based targets during updates in the CDC phase.
October 8, 2017 Fixed an issue from version 2.3.0 where ongoing replication with
MySQL 5.5 sources would not work as expected.
October 12, 2017 Fixed an issue with reading changes from a SQL Server 2016 source
during the ongoing replication phase. This fix needs to be used
with the following extra connect attribute in the source SQL Server
endpoint: IgnoreTxnCtxValidityCheck=true
SQL Azure as a source: Added support for Microsoft Azure SQL Database as a source. For more
information, see Using Microsoft Azure SQL Database as a Source for AWS DMS (p. 109).
Platform – AWS SDK update: Updated the AWS SDK in the replication instance to 1.0.113. The
AWS SDK is used for certain endpoints (such as Amazon Redshift and S3) to upload data on
customers' behalf into these endpoints. Usage is unrestricted.
Oracle source – support replication of tablespaces in Oracle: Ability to migrate tablespaces
from an Oracle source, eliminating the need to precreate tablespaces in the target before
migration. Usage: Set the ReadTableSpaceName setting in the extra connect attributes of the
Oracle source endpoint to true to support tablespace replication. This option is set to
false by default.
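For example (the connection values below are placeholders), the attribute can be set when
creating the Oracle source endpoint with the AWS CLI:

# Placeholder connection values; ReadTableSpaceName=true enables tablespace replication as described above.
aws dms create-endpoint \
    --endpoint-identifier oracle-source-example \
    --endpoint-type source \
    --engine-name oracle \
    --server-name oracle.example.com \
    --port 1521 \
    --database-name ORCL \
    --username admin \
    --password "example-password" \
    --extra-connection-attributes "ReadTableSpaceName=true"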
Oracle source – CDC support for Oracle Active Data Guard standby as a source: Ability to use
a standby instance for Oracle Active Data Guard as a source for replicating ongoing changes
to a supported target. This change eliminates the need to connect to an active database that
might be in production.
PostgreSQL source – add WAL heartbeat: Added a write-ahead log (WAL) heartbeat (that is,
running dummy queries) for replication from a PostgreSQL source. This feature was added so
that idle logical replication slots don't hold onto old WAL logs, which can result in storage
full situations on the source. This heartbeat keeps restart_lsn moving and prevents storage
full scenarios.
All endpoints – maintain homogeneous replication with transformation: Ability to do
like-to-like migrations for homogeneous migration tasks (from a table structure/data type
perspective) came in 2.2.0. However, DMS still converted data types internally when a task
was launched with table transformations. This feature maintains data types from the source
on the target for homogeneous lift-and-shift migrations, even when transformations are used.
All endpoints – fail task when no tables are found: Ability to force replication task failure
when include transformation rules find no matches.
Oracle source – stop task when archive redo log is missing: Ability to come out of a retry
loop and stop a task when the archive redo log on the source is missing. Usage: Use the
RetryTimeoutInMinutes extra connect attribute to specify the stop timeout in minutes.
January 5, 2017 Fixed a server ID collision issue when launching multiple DMS tasks to
the same MySQL instance (version 5.6+).
February 21, 2017 Fixed an issue where table creation fails for nestingLevel=ONE when
_id in MongoDB is a string in the document. Before this fix, _id (when
May 5, 2017 Fixed an issue where NULL LOBs were migrating as empty when using
full LOB mode with an Oracle source.
May 5, 2017 Fixed an issue where a task with MySQL as a source fails with a too
many connections error when custom CDC start time is older than 24
hours.
May 24, 2017 Fixed an issue where a task stayed in the starting status for too long when
multiple tasks were launched on the replication instance at one time.
July 7, 2017 Fixed an issue that caused a PostgreSQL error message about using
all available connection slots to appear. Now an error is logged in the
default logging level when all available connection slots to PostgreSQL
are used up and DMS can't get more slots to continue with replication.
July 19, 2017 Fixed an issue where updates and deletes from Oracle to DynamoDB
were not being migrated correctly.
August 8, 2017 Fixed an issue that caused unexpected behavior during CDC when an
Oracle source database instance went down for more than five minutes
during a migration.
August 12, 2017 Fixed an issue where nulls from any source were being migrated as
amazon_null, causing issues when inserted into data types other than
varchar in Amazon Redshift.
August 27, 2017 For MongoDB, fixed an issue where a full load task crashes when
nestingLevel=NONE and _id is not ObjectId.
Document History
The following table describes the important changes to the AWS Database Migration Service user guide
documentation after January 2018.
Support for Elasticsearch and Kinesis Data Streams as a target (November 15, 2018): Added
support for Amazon Elasticsearch Service and Amazon Kinesis Data Streams as targets for
data migration.
CDC native start support (June 28, 2018): Added support for native start points when using
change data capture (CDC).
Db2 LUW support (April 26, 2018): Added support for IBM Db2 LUW as a source for data
migration.
Task log support (February 8, 2018): Added support for seeing task log usage and purging
task logs.
SQL Server as target support (February 6, 2018): Added support for Amazon RDS for Microsoft
SQL Server as a target.
Earlier Updates
The following table describes the important changes to the AWS Database Migration Service user guide
documentation prior to January 2018.
New feature (November 17, 2017): Added support for using AWS DMS with AWS Snowball to
migrate large databases. For more information, see Migrating Large Data Stores Using AWS
Database Migration Service and AWS Snowball (p. 285).
New feature (November 17, 2017): Added support for the task assessment report and data
validation. For more information about the task assessment report, see Creating a Task
Assessment Report (p. 215). For more information about data validation, see Data Validation
Task Settings (p. 234).
New feature (July 11, 2017): Added support for AWS CloudFormation templates. For more
information, see AWS DMS Support for AWS CloudFormation (p. 11).
New feature (April 10, 2017): Added support for using Amazon DynamoDB as a target. For more
information, see Using an Amazon DynamoDB Database as a Target for AWS Database Migration
Service (p. 175).
New feature (April 10, 2017): Added support for using MongoDB as a source. For more
information, see Using MongoDB as a Source for AWS DMS (p. 132).
New feature (March 27, 2017): Added support for using Amazon S3 as a target. For more
information, see Using Amazon Simple Storage Service as a Target for AWS Database Migration
Service (p. 171).
New feature (March 7, 2017): Added support for reloading database tables during a migration
task. For more information, see Reloading Tables During a Task (p. 242).
New feature (January 26, 2017): Added support for events and event subscriptions. For more
information, see Working with Events and Notifications in AWS Database Migration
Service (p. 280).
New feature (December 5, 2016): Added support for SSL endpoints for Oracle. For more
information, see SSL Support for an Oracle Endpoint (p. 50).
New feature (September 14, 2016): Added support for using change data capture (CDC) with an
Amazon RDS PostgreSQL DB instance. For more information, see Setting Up an Amazon RDS
PostgreSQL DB Instance as a Source (p. 115).
New region support (August 3, 2016): Added support for the Asia Pacific (Mumbai), Asia
Pacific (Seoul), and South America (São Paulo) regions. For a list of supported regions, see
What Is AWS Database Migration Service? (p. 1).
New feature (July 13, 2016): Added support for ongoing replication. For more information,
see Ongoing Replication (p. 316).
New feature (July 13, 2016): Added support for secured connections using SSL. For more
information, see Using SSL With AWS Database Migration Service (p. 47).
New feature (July 13, 2016): Added support for SAP Adaptive Server Enterprise (ASE) as a
source or target endpoint. For more information, see Using an SAP ASE Database as a Source
for AWS DMS (p. 129) and Using an SAP ASE Database as a Target for AWS Database Migration
Service (p. 170).
New feature (May 2, 2016): Added support for filters to move a subset of rows from the
source database to the target database. For more information, see Using Source
Filters (p. 257).
New feature (May 2, 2016): Added support for Amazon Redshift as a target endpoint. For more
information, see Using an Amazon Redshift Database as a Target for AWS Database Migration
Service (p. 163).
General availability (March 14, 2016): Initial release of AWS Database Migration Service.
Public preview release (January 21, 2016): Released the preview documentation for AWS
Database Migration Service.
AWS Glossary
For the latest AWS terminology, see the AWS Glossary in the AWS General Reference.