Microsoft Integration Runtime - Release Notes: Azure Data Factory
May 2020
Microsoft Integration Runtime is a client agent that enables cloud access for on-premises data sources within your organization.
It acts as a connection bridge between Microsoft’s cloud services and the customer’s on-premises data sources. With Integration Runtime, you
can copy data from on-premises data sources to the cloud and vice versa.
For more information on the products and services that currently use Integration Runtime, please refer to:
o Copy activity: added support for data consistency verification when copying files as-is between file-based data stores (public preview).
o Copy activity: added support for generating execution session logs (public preview).
o Copy activity: added support for skipping error files during copy (public preview).
Enhancements –
o Oracle connector: fixed an issue where the password could not contain a semicolon.
Enhancements –
o When copying data from partition-option-enabled data stores in parallel (e.g. Oracle/Teradata/SAP HANA/etc.), copy activity
adapts the parallel copy count according to the number of Self-hosted Integration Runtime nodes.
o Dynamics 365/Dynamics CRM/Common Data Service for Apps connectors: when copying or looking up data from an entity in these data
stores without a FetchXML query, ADF now retrieves all attributes instead of sampling the top rows.
What’s new –
o New Snowflake connector: added support for copying data to and from Snowflake using copy activity.
o OData connector: added support for configuring authentication headers in linked service.
o Added option to preserve ACLs when copying data between Azure Data Lake Storage Gen2.
Enhancements –
o DB2 connector: fixed the "invalid codepage: 5348" error, in which case DB2 connector cannot recognize IBM CCSID 5348 – a
form of Windows Latin-1 (with Euro) code page for compatibility on IBM z/OS and i.
Enhancements –
o DB2 driver upgrade: fixed an issue where copying data from DB2 hit the “SQLSTATE=HY000 SQLCODE=-343” error.
Enhancements –
o SFTP connector as sink: added an option to disable upload with temp file rename, to avoid hitting errors like
“UserErrorSftpPermissionDenied”, “UserErrorSftpPathNotFound”, and “SftpOperationFail” when the SFTP server doesn’t support
the rename operation.
o Fixed: copying data from Salesforce Attachment object hit error "Couldn't resolve host name".
Enhancements –
o Fixed: copying data into SQL sink failed with the error "Specified argument was out of the range of valid values" when the write
batch timeout was set to longer than 30 days.
Enhancements –
o Fixed: copying data to SFTP sink hit error “failed to rename temp file” when extension method is
not supported by the server.
Enhancements –
o Get Metadata activity: increased the maximum size of returned metadata from 1MB to 2MB.
What’s new –
o Azure Blob connector: added support for using prefix to filter source blobs.
o DB2 connector: added support for connection string configuration in linked service with advanced connection options.
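As a rough illustration of what filtering source blobs by prefix means, the sketch below keeps only the names that start with a given prefix. It is not the connector's implementation: the blob names and the `filter_by_prefix` helper are hypothetical, and case-sensitive matching is an assumption.

```python
def filter_by_prefix(blob_names, prefix):
    """Return the blob names that start with the given prefix (case-sensitive)."""
    return [name for name in blob_names if name.startswith(prefix)]


# Hypothetical blob names, for illustration only.
blobs = ["logs/2020/05/01.csv", "logs/2020/04/30.csv", "archive/2019.csv"]
selected = filter_by_prefix(blobs, "logs/2020/05/")
print(selected)
```

Because the filter is a plain string prefix, it can also select part of a file name (for example "logs/2020/0"), not only whole folders.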
Enhancements –
o Fixed an issue in the SAP BW MDX connector where null values of decimal type could not be copied in copy activity.
Enhancements –
o Fixed the retry logic for transient Azure Blob & SQL operation failure.
What’s new –
o SFTP connector: added support for writing data into SFTP server using Copy activity.
o SAP HANA connector: added support for parallel load from SAP HANA in Copy activity to improve performance.
Enhancements –
o [Fixed] Proxy changes done directly in the diahost.exe.config & diawp.exe.config weren’t honored during self-hosted IR update.
What’s new –
o Azure Synapse Analytics (formerly SQL Data Warehouse) connector: added support for loading data using new COPY statement
(preview) in addition to PolyBase and bulk insert options.
Enhancements –
o Auto-update resilience: Fixed issue with auto-updates on machines with low disk space
o Accessibility improvements
What’s new –
o Added support for Server-to-Server authentication for the Dynamics 365, Dynamics CRM, and Common Data Service for Apps connectors.
o Upgraded the Azure Cosmos DB SQL API connector, including retrieving/writing data in hierarchical shape and more available settings.
Enhancements –
o Improved automation of self-hosted integration runtime registration using the PowerShell script
(RegisterIntegrationRuntime.ps1), which now allows passing a node name parameter.
What’s new –
o Added support for copy activity to write to files in ORC format with "snappy" compression.
Enhancements –
o General reliability improvements that fix the random occurrence of transient activity execution delays when capacity is full.
With the release of Version 4.0, we target .NET Framework 4.6.2 for enhanced functionality and security. Please ensure your machine meets the
prerequisite of .NET Framework 4.6.2. It is fully backwards compatible with all existing pipelines and provides the new features and
enhancements below.
What’s new –
o New dataset model for JSON format on all file-based data stores, supported by Copy/Lookup/GetMetadata/Delete activities.
o Added support for data mapping from hierarchical source to hierarchical sink in copy activity, including Azure Cosmos DB’s API for
MongoDB, MongoDB, JSON format, REST, OData, SAP ECC, and SAP C4C connectors.
o Added support for test connection to subfolder for ADLS Gen2, Azure Blob and Amazon S3 connectors, to improve the authoring
experience if the identity you use to access the data store (e.g. service principal, managed identity) only has permission to
a subdirectory instead of the entire account.
o New dataset model for ORC format on all file-based data stores, supported by Copy/Lookup/GetMetadata/Delete activities
o Added new metric 'Activity queue duration' for monitoring queue time on the self-hosted integration runtime
o Added support for upsert using alternate key in copy activity when writing data to Dynamics 365, Dynamics CRM, and Common
Data Service
Enhancements –
o When copying data to files and you don’t specify a sink file name, copy activity now always generates the sink file name with
correct file extension.
What’s new –
o New dataset model for Avro format on all file-based data stores, supported by Copy/Lookup/GetMetadata/Delete activities.
Enhancements -
o Parallel load from Oracle and Teradata (when data partitioning feature is enabled) can now leverage multiple Self-hosted IR
nodes to scale out and get better performance.
o Fixed: copy activity column mapping didn’t work as configured in a few cases.
o General reliability improvements that fix the random occurrence of transient activity execution delays when capacity is full.
o Upgraded the following connectors: Amazon Marketplace Web Service, Amazon Redshift, Concur, Drill, Oracle Eloqua, Google
AdWords, HBase, HubSpot, Jira, Magento, Marketo, Paypal, QuickBooks Online, Oracle Responsys, Salesforce, Salesforce
Marketing Cloud, ServiceNow, Shopify, Spark, Square, Xero.
Enhancements -
o General reliability improvements that fix the random occurrence of transient activity execution delays when capacity is full.
What’s new –
o New Binary dataset model for all file-based data stores to treat files as-is without parsing, supported by
Copy/GetMetadata/Delete activities.
o New dataset model for the following connectors, to split original single table name into separate schema and table name so that
you don’t need to quote the names in any cases even with special characters: Azure SQL Database, Azure SQL Data Warehouse,
SQL Server, Oracle, DB2, Google Big Query, Hive, PostgreSQL, Redshift, Impala, Drill, Greenplum, Phoenix, Presto, Spark, Vertica
Enhancements -
o Fixed: Dynamics 365 / CRM as source – copy activity fails when Link Entity in FetchXML is a data type of EntityReference or
o Fixed: Parquet format as source – data truncation issue in copy activity where, when skipping incompatible rows is enabled, the
remaining rows were ignored after the first bad row was read.
What’s new –
o Added support for copying TABLE along with existing QUERY support from SAP HANA with improved reliability with configurable
packet size.
o Added support for Cluster and Pooled table for SAP table connector with support for ConvertDateToDatetime and
o Added new authentication mechanism for SQL Managed Instance - Service Principal Name (SPN) and system-assigned managed
identity (MSI).
Enhancements -
o General reliability improvements that fix the random occurrence of transient activity execution delays.
What’s new –
o Added support in copy activity to auto create sink table if not exists when writing to Azure SQL Database/SQL DW/SQL Server/
o SQL Server/Azure SQL Database/SQL DW connectors: supported "schema" and "table" as two separate properties in dataset
instead of one combined "tableName", to eliminate the complexity of quoting special characters.
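To illustrate why separate "schema" and "table" properties remove the quoting burden, the sketch below builds a combined name by hand and contrasts it with the two-property form. The `quote_identifier` and `combined_table_name` helpers and the sample names are hypothetical; the dictionaries only mirror the property names mentioned above and are not complete dataset definitions.

```python
def quote_identifier(name):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def combined_table_name(schema, table):
    """Build the single quoted name that a combined "tableName" would need."""
    return quote_identifier(schema) + "." + quote_identifier(table)


# Hypothetical schema and table names containing special characters.
old_style = {"tableName": combined_table_name("sales.eu", "Orders 2019")}
new_style = {"schema": "sales.eu", "table": "Orders 2019"}

print(old_style["tableName"])
print(new_style)
```

With the combined form, the dot inside the schema name is ambiguous unless each part is quoted; with two properties, each value can be given as-is.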
Enhancements -
o Fixed: when copying data from ADLS Gen1 to Gen2 with preserving ACLs, recursive setting didn’t take effect.
What’s new –
o Teradata connector:
Powered by a new out-of-box Teradata ODBC driver to save you from installing the .NET Data Provider for Teradata, and
offers more connection options e.g. query timeout;
Added support for parallel load from Teradata in Copy activity to improve performance.
Added support for listing HANA tables in authoring UI and copying data from HANA tables;
o Added support for using PolyBase to copy data into Azure SQL DW from Blob storage with VNET endpoint.
Enhancements -
o Fixed: Salesforce connector issue which caused incomplete records read from Salesforce object without error, due to
uninitialized memory usage.
o Fixed: ServiceNow connector issue when copying data with large number of columns which only returned limited records.
What’s new –
o Oracle connector: added support for parallel load from Oracle in Copy activity to improve performance.
Enhancements -
o When copying data in-parallel from a tabular source (e.g. SAP Table, SAP Open Hub) to a file-based sink data store, Copy activity
generated files will have appropriate file extension based on the sink dataset specification.
o Fixed: Self-hosted Integration Runtime capacity leak that caused jobs to remain in the ‘In-progress’ state indefinitely.
o Fixed: race conditions when reading from the SAP Open Hub connector in parallel.
Enhancements -
o Fixed: copying data from folder in Azure Blob/ADLS Gen1/ADLS Gen2 to Azure SQL DW cannot use PolyBase when source is
Parquet/Delimited Text type dataset.
Enhancements -
o Fixed: for Parquet/DelimitedText dataset on Amazon S3 connector, copy activity/import schema/preview data failure when
using "prefix" to filter files.
o Fixed: for Parquet/DelimitedText dataset on HTTP connector, copy activity failure and import schema/preview data issue which
always returns no data.
What’s new -
o New dataset model for SQL, Parquet and DelimitedText (on all file-based data stores) to support physical schema and to align
across Copy/Data Flow/Lookup/GetMetadata activities.
o Added support for copying from ADLS Gen2 into Azure SQL Data Warehouse using PolyBase
Enhancements -
o Fixed: SQL connector connection number reaches session limit, by disabling connection pooling for SQL authentication
Enhancements -
o Fixed: Intermittent Copy activity failure when copying to Azure Blob storage
What’s new-
o Added support for on-premises data access from Azure-SSIS IR using SSIS activity
o Bug fixes
o Fixed: Copy activity failure when copying data from source - SAP HANA, Sybase, MySQL to sink – Azure SQL DB (ADF v1), SQL
Server using Stored Procedure in the activity
What’s new-
o Added support for Delete Activity. The 'Delete Activity' supports file-based connectors, including Azure Blob Storage,
Amazon S3, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, File System, FTP and SFTP.
o Added support for Azure Data Explorer connector both as source and sink
o Added support for copying files incrementally in Copy Activity based on lastModifiedTime for HDFS, SFTP, Azure Data
Lake Storage Gen1, Azure Data Lake Storage Gen2.
o Added support for Azure Data Lake Storage Gen2 as supported connector for GetMetadata Activity.
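The idea behind incremental copy based on lastModifiedTime can be sketched as a time-window filter: each run picks up only the files modified since the previous run. This is a minimal sketch, not the service's implementation; the `select_modified` helper, the file list, and the choice of an inclusive start and exclusive end are assumptions made for illustration.

```python
from datetime import datetime, timezone


def select_modified(files, window_start, window_end):
    """Return paths whose last-modified time falls in [window_start, window_end)."""
    return [path for path, modified in files if window_start <= modified < window_end]


# Hypothetical (path, last-modified) pairs.
files = [
    ("in/a.csv", datetime(2019, 3, 1, 8, 0, tzinfo=timezone.utc)),
    ("in/b.csv", datetime(2019, 3, 2, 9, 30, tzinfo=timezone.utc)),
    ("in/c.csv", datetime(2019, 3, 3, 0, 0, tzinfo=timezone.utc)),
]
start = datetime(2019, 3, 2, tzinfo=timezone.utc)
end = datetime(2019, 3, 3, tzinfo=timezone.utc)
picked = select_modified(files, start, end)
print(picked)
```

Using a half-open window means that consecutive runs with adjacent windows neither skip a file nor copy it twice.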
What’s new-
o Added support for SAP BW Open Hub Connector as source
o Added support for OpenJDK in addition to JRE when writing/parsing Parquet and ORC formats
What’s new-
o Added support for Azure Cosmos DB MongoDB API connector as source and sink
o Added support for copying data from generic OData by using service principal and MSI authentication
What’s new-
o Added support for copying data to/from Azure Data Lake Storage Gen2 by using service principal and MSI authentication.
o Fixed: Issue with OData connector when using windows authentication while connecting to the data store
What’s new-
o FairFax Support
o Fixed: Intermittent failure in accessing key vault secrets. Improved robustness in accessing key vault secrets from ADF
What’s new-
o Added support for controlling max number of concurrent connections established to the data store during copy activity run
o Improved Self-hosted IR auto-update strategy to adapt to the current activity execution status
o Added support for reading data from Google AdWords and Oracle Service Cloud
o Added support for Azure Active Directory Authentication while connecting to Azure Blob
o Fixed: Issue with navigation in ADF UI while using generic ODBC connector
o Fixed: Issue with large payloads during Interactive queries in the ADF UI (like UI Navigation, Test Connection, etc.)
What’s new-
o Support for Azure Data Lake Storage Gen2 connector (as both source and sink)
o Improved copy resilience while copying huge number of files from a File Share
What’s new-
o Fixed: Issue when writing uncompressed Parquet files to the sink.
What’s new-
o Support for passing Azure Key Vault linked service in Custom Activity
o File-filter support for Azure Blob, Azure Data Lake Store, Amazon S3 and HDFS stores
o Ability to enable/disable remote access used for node-to-node communication in HA setup and for storing on-premises credentials
o Memory optimizations
o Fixed: Reduced worker process count and memory limit for lower specification machine
What’s new-
o Support for storing Linked Service credential in Azure Key Vault (AKV) for data store and compute linked services. Currently all
activity types except custom activity support retrieving credential from AKV during execution. (v2 only)
What’s new-
o You can now specify a custom port for remote access endpoint.
o Fixed: Issue with incorrect copy of files when copying from zipped files (3rd party).
What’s new-
o You can now see the name of the Data Factory to which the Integration Runtime belongs
o Fixed: ‘Invalid Payload’ issue during some (transform) activities’ execution (v2 only)
o Fixed: Issue during re-registering same node to another IR. You will not be able to register a node to another Integration
Runtime if this node is already registered.
o Fixed: Credential synchronization issue due to token expiry (across nodes in multi-node/ High availability setup) causing some
activities to fail in case of failover.
o Fixed: Issue while setting HTTP Proxy during lack of internet connection
With the release of Version 3.0, what was formerly called the Data Management Gateway (DMG) is now the Self-Hosted Azure Data Factory
Integration Runtime (ADF-IR). It is fully backwards compatible with all existing pipelines and provides the new features and enhancements
below.
What’s new-
o Support for Azure Data Factory version 2 (also referred to as v2)- You can now dispatch transform activities through the Self-
hosted Integration Runtime. Please see Self-hosted Integration Runtime in v2 for more details.
o You can copy data into ODBC-compatible data stores using the generic ODBC connector (v2 only)
o You can copy data from Dynamics CRM and Dynamics 365 into all the supported sink data stores. (v2 only)
o You can copy data from HDFS using built-in DistCp support. (v2 only)
o You can copy data into/from Oracle data store using the built-in Microsoft driver without needing any additional driver
installation. (v2 only)
o Name change to Integration Runtime- This change is transparent to your existing pipelines.
o Upgraded to use .NET Framework 4.6.1- Requires you to have .NET 4.6.1 runtime installed.
o Fixed: Test Connection failure while using DFS (Distributed File System) from Copy Wizard. (v1 only)
What’s new-
o Added a notification to help customers upgrade their .NET Framework (to 4.6.1 or above), to enable future auto-updates of the
Self-hosted IR.
What’s new-
o High Availability and Scalability Preview- You can add up to 4 nodes to a single logical gateway to enable high availability and
scalability. All existing gateways can be enrolled from the Portal (to be rolled out during August 2017)
What’s new-
o You can now skip incompatible rows during copy by enabling ‘Skip incompatible rows’
o Fixed: Issue regarding language/culture setting which didn’t honor the value selected during DMG Setup
o You can add DNS entries to allow Service Bus rather than allowing all Azure IP addresses from your firewall (if needed). You can
find the respective DNS entries in the Azure portal (Data Factory -> ‘Author and Deploy’ -> ‘Gateways’ -> "serviceUrls" (in JSON)).
o HDFS connector now supports self-signed public certificate by letting you skip SSL validation.
o Fixed: Issue with gateway offline during update (due to clock skew)
o Fixed: Out of memory issue while unzipping several small files during copy activity.
o Fixed: Index out of range issue while copying from Document DB to on-premises SQL with idempotency feature.
o Fixed: SQL cleanup script doesn't work with on-premises SQL from Copy Wizard.
o Fixed: Column name with space at the end does not work in copy activity.
o Fixed: Issue with registration during gateway restore using backup file.
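The ‘Skip incompatible rows’ behavior mentioned above can be pictured with the following sketch: rows that cannot be converted to the sink's types are set aside instead of failing the whole copy. It is a simplified illustration only; the `copy_rows` helper, the row shape, and the integer `id` column are hypothetical.

```python
def copy_rows(rows, skip_incompatible=False):
    """Convert rows for a sink that expects an integer id; optionally skip bad rows."""
    copied, skipped = [], []
    for row in rows:
        try:
            copied.append((int(row["id"]), row["name"]))
        except (ValueError, TypeError, KeyError):
            if not skip_incompatible:
                raise  # default behavior: the first bad row fails the copy
            skipped.append(row)  # keep going and remember what was skipped
    return copied, skipped


# Hypothetical source rows; the second one has a non-numeric id.
rows = [{"id": "1", "name": "a"}, {"id": "x", "name": "b"}, {"id": "3", "name": "c"}]
copied, skipped = copy_rows(rows, skip_incompatible=True)
print(copied, skipped)
```

Note that processing continues after a bad row, so the rows that follow it are still copied.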
What’s new-
o We have added support for copying data from SFTP (SSH File Transfer Protocol) Server.
o We have added support for copying data from HTTP/ HTTPS endpoints.
o Users can now easily attach screenshot of the Gateway along with the feedback.
o Bug fix: Issue in copying data to on-premises SQL Server with Stored Procedure using Copy Wizard.
o Support for richer authentication types for SFTP- SSH Public Key, HTTP/ HTTPS - Digest, Windows, Client Certificate.
What’s new-
o You can now authenticate into your Azure Data Lake Store using service principal. Previously we only supported OAuth.
o We have packaged a new driver in the gateway for reading data from on-premises Oracle data stores.
o Support JSON format with nested Array
What’s new-
o Bug fix for gateway auto update, gateway parallel processing capacity.
What’s new-
o ADF Copy activity now manages schema migration automatically in the destination SQL DW when copying data from
on-premises SQL Server.
o Improved and more robust Gateway registration experience - Now you can track progress status during the Gateway registration
process, which makes the registration experience more responsive.
o Improvement in Gateway Restore Process- You can still recover gateway even if you do not have the gateway backup file with
this update. This would require you to reset Linked Service credentials in Portal.
o Bug fix.
What’s new-
o You can now store data source credentials locally. The credentials are encrypted. The data source credentials can be recovered
and restored using the backup file that can be exported from the existing Gateway, all on-premises.
o Support auto detection of QuoteChar configuration for Text format in copy wizard, and improve the overall format detection
Support firstRowAsHeader and SkipLineCount auto detection in copy wizard for text files in on-premises File system and HDFS
Enhance the stability of network connection between gateway and Service Bus
Bug fixes
Support setting HTTP proxy for the gateway using gateway Configuration Manager
Support Text format header handling when copying data from/to Azure Blob, Azure Data Lake Store, on-prem File System and on-prem
Support copying data from Append Blob and Page Blob along with the already supported Block Blob
Support auto detection on file format settings in Copy Wizard for on-prem File System and on-prem HDFS
Introduce a new gateway status “Online (Limited)”, which indicates the main gateway functionality works except the interactive
operation support for Copy Wizard
Users can connect to on-premises SQL server via gateway with remote logon privilege
Support read/query/copy wizard functions via Salesforce/Cassandra/MongoDb/Redshift ODBC drivers
Users are able to select the language/culture to be used by a gateway during manual installation
In case users have gateway issues, they can choose to send gateway logs of the last 7 days to Microsoft to better troubleshoot their
issues. If gateway is not connected, they can choose to save and archive gateway logs
Users are able to copy non-Blob data into SQL DW via PolyBase & staging blob in code free copy preview
Users can leverage Data Management Gateway to directly ingress data from on-premises SQL Server database into Azure Machine
Performance improvements
o Improve performance on Schema/Preview against SQL Server in code free copy preview
Add meaningful messages to error pages for auto-update
Gateway tray is automatically launched in system tray when gateway installation finishes
Bug fixes:
o Early detection of wrong format settings against CSV format in code free copy preview
Support ORC format for File, HDFS, Azure Blob and Azure Data Lake (as source or destination)
Security improvement on credential handling for 9 on-prem data source types (SQL Server, MySQL, DB2, Sybase, PostgreSQL, Teradata,
Oracle, File and ODBC)
Bug fixes
Increase the default maximum size of gateway event log from 1MB to 40MB.
Add a warning dialog in case a restart is needed during gateway auto-update. Users can choose to restart right then or later.
In case auto-update fails, gateway installer will retry auto-updating 3 times at maximum.
Performance improvements
o Improve performance for loading large tables from on-premises server in code-free copy scenario.
Bug fixes
Performance improvements
Bug fixes
Zero touch auto update capability
Performance improvements
Bug fixes
Improve troubleshooting experience
Performance improvements
Bug fixes
Performance improvements
Bug fixes
Performance improvements
Bug fixes
Support On-Prem HDFS Source/Sink
Performance improvements
Bug fixes
Performance improvements
Bug fixes
Support diagnostic tools on Configuration Manager
Support table columns for tabular data sources for Azure Data Factory
Support CopyBehavior – MergeFiles, PreserveHierarchy and FlattenHierarchy in BlobSink and FileSink with Binary Copy for Azure Data
Bug fixes
Support table name for ODBC data source for Azure Data Factory
Performance improvements
Bug fixes
Support File Sink for Azure Data Factory
Bug fixes
Support 3 more data sources for Azure Data Factory (ODBC, OData, HDFS)
Bug fixes
Support 5 relational databases for Azure Data Factory (MySQL, PostgreSQL, DB2, Teradata, and Sybase)
Performance improvements
Bug fixes
Add Oracle data source support for Azure Data Factory
Performance improvements
Bug fixes
Unified binary that supports both Microsoft Azure Data Factory and Office 365 Power BI services
Support scheduled data refresh for Power Query data connection with additional data sources:
o Folder
o SharePoint List
o OData Feed
o Azure Marketplace, Azure HDInsight, Azure Blob Storage and Azure Table Storage
o Only Power Query connection string from Excel DATA tab can be recognized in Admin Center. Direct copying Power Query
connection string from Power Pivot is not supported. A valid Power Query connection string should contain:
Data Source=$EmbeddedMashup(SomeGUID)$;
o Data sources other than SQL Server and Oracle can only be used for scheduled data refresh for Power Query connections.
o All data sources in the Power Query connection must be hosted on the same gateway.
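A quick way to check whether a connection string has the required `$EmbeddedMashup(...)$` data source is a pattern match like the one sketched below. This is only an illustration: the `has_embedded_mashup` helper, the sample strings, and the assumption that the GUID uses the usual 8-4-4-4-12 hexadecimal layout are not taken from the product.

```python
import re

# Assumed GUID layout: 8-4-4-4-12 hexadecimal digits.
GUID = r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
PATTERN = re.compile(r"Data Source=\$EmbeddedMashup\(" + GUID + r"\)\$;")


def has_embedded_mashup(connection_string):
    """Return True if the string contains Data Source=$EmbeddedMashup(<GUID>)$;"""
    return PATTERN.search(connection_string) is not None


# Made-up sample strings, for illustration only.
from_data_tab = (
    "Provider=Microsoft.Mashup.OleDb.1;"
    "Data Source=$EmbeddedMashup(3f2504e0-4f89-11d3-9a0c-0305e82c3301)$;"
    "Location=Query1"
)
other = "Provider=Microsoft.Mashup.OleDb.1;Data Source=Query1;"

print(has_embedded_mashup(from_data_tab), has_embedded_mashup(other))
```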
Enhance the scalability for Data Management Gateway. Power BI provides a way for customers to easily scale out the service to multiple
machines/gateway instances (up to 10 instances), in order to meet growing demands.
Fix timeout issue to support more time-consuming data source connections. Now gateway supports data refresh requests lasting up to
30 minutes.
Support scheduled data refresh for Power Query data connection with SQL Server and Oracle data source only.
o Only Power Query connection string from Excel DATA tab can be recognized in Admin Center. Direct copying Power Query
connection string from Power Pivot is not supported.
o All data sources in the Power Query connection must be hosted on the same gateway.
Support scheduled data refresh for SQL Server and Oracle data sources.
Publish and index SQL Server and Oracle data sources as corporate OData feeds.
o Certain data types are not supported. Please refer to Supported Data Sources and Data Types for more information.
This is a security design where you can only configure on-premises data sources for cloud access within your corporate network, and your
credentials will not flow outside of your corporate firewall. Ensure your computer can reach the machine where the gateway is installed.
• The gateway is too busy so that you want to migrate heavy data sources to other gateways, or
• The database is overloaded so that you need to scale the server, or
• The query takes too long (>50 seconds) so that you may want to optimize the query or cache the result in data mart instead.
Currently it is not supported to get Power Query connection string from Power Pivot. Please copy Power Query connection string from the data
table instead. Please refer to Get a connection string from a data table for more information.