Everything you need to know about the On-premises Data Gateway

What’s an On-premises Data Gateway? When should you use it?

As the name suggests, the On-premises Data Gateway is a gateway, or hub, that connects on-premises data sources to Microsoft cloud services such as Power BI and Power Apps. You need the Gateway whenever you work with an on-premises data source. Cloud data sources usually don’t need it.

Architecture

On-premises Data Gateway Architecture

Types of Gateway

1. Personal mode Gateway

This is a personal Gateway for a single user who wants to connect to on-premises data sources. It can’t be shared with other users. It’s a good way for developers to get started and test different data sources. You can install this Gateway on your own machine.

2. Shared mode Gateway

Use this Gateway when you are working in a team or on production-level reports. Microsoft’s documentation calls it standard mode. It performs the same task of connecting to on-premises data sources. It’s recommended to install the Gateway on the same machine as the data source, or as close to it as possible. If you work with multiple data sources in a production setting, install the Gateway on a dedicated machine that can be kept on at all times. In fact, it’s better to run the production Gateway as a cluster of multiple machines: traffic is load-balanced across the members, and when one member fails the others take over, which avoids a single point of failure.

3. Virtual Gateway

Use this Gateway when you want to connect to data sources through virtual networks. You don’t need to install it on any machine, because it is managed by Microsoft.

How to install the Gateway?

1. Manually

  • Go to the Gateway download page.
  • Accept the terms and conditions.
  • Provide your email address. Whoever installs the Gateway becomes its admin by default.
  • Register the Gateway, give it a name, and create a Recovery key. The Recovery key is extremely important because it acts like a password for the Gateway. You will need it whenever you create a cluster or migrate the Gateway.
  • For more details, see the Microsoft documentation.

2. Code

  • If you want to automate any Power BI REST API tasks that involve Power BI data sources, I highly recommend installing the Power BI Gateway using a Service principal.
  • Use the code below to install the Power BI Gateway with PowerShell 7 or later.
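#Enter Gateway details (replace the placeholders with your own values)
$AppID = "<application-id>"          #Enter Service Principal Application ID.
$Cert = "<certificate-thumbprint>"   #Enter Service Principal Certificate Thumbprint.
$Tenant = "<tenant-id>"              #Enter Service Principal Tenant.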

#Install DataGateway modules
Install-Module -Name DataGateway

#Connect using Service Principal credentials.
Connect-DataGatewayServiceAccount -ApplicationId $AppID -CertificateThumbprint $Cert -Tenant $Tenant

#Installs the latest Data Gateway. Accept all conditions.
Install-DataGateway -AcceptConditions

#Register the Gateway and create the RecoveryKey (you can use any string). It's like a password that will be required whenever you configure the Gateway.
#Read-Host appends its own colon to the prompt, so none is needed in the prompt text.
Add-DataGatewayCluster -Name "SK_Gateway_ServicePrincipal" -OverwriteExistingGateway -RecoveryKey (Read-Host "Enter RecoveryKey" -AsSecureString)

#Add an additional user as Gateway admin (optional). Execute after you have installed the Gateway.
$GCID = "<gateway-cluster-id>"       #Enter Gateway Cluster ID.
$PObjID = "<principal-object-id>"    #Enter Object ID of the user. Find it under Overview > Managed application in local directory.
Add-DataGatewayClusterUser -GatewayClusterId $GCID -PrincipalObjectId $PObjID -Role "Admin" -AllowedDataSourceType $null
  • Once you have successfully installed the Power BI On-premises Data Gateway and added the user to it, you should see something similar to the image below under Settings > Manage gateways. The Service principal is shown in red. This is how you can make a Service principal a Power BI Gateway administrator.
Service Principal as a Gateway Administrator
  • You can use the Service principal, in this case “pbi-usr-dataunlock”, for any automated tasks related to the Gateway.

Security and Network

  • When you use the On-premises Gateway, data is encrypted end-to-end, including in transit, using TLS 1.2 or above. Any request that attempts to use the service with TLS 1.1 or below is rejected. TLS (Transport Layer Security) is the successor to SSL (Secure Sockets Layer) and is the protocol behind HTTPS. Only data being processed in memory isn’t encrypted.
  • The data source credentials are encrypted and stored in the Gateway cloud service. Their decryption happens on-premises.
  • You can force the Gateway to communicate with Azure Service Bus over HTTPS instead of direct TCP. The Gateway depends on Azure Service Bus to communicate with cloud services. Newer Gateway releases already default to HTTPS on fresh installations, so this mainly matters for older or upgraded installations.

Troubleshooting issues

  • Check whether the Gateway has been upgraded to the latest version. Microsoft releases a Gateway update every month and supports only the last 6 releases, so you need to upgrade the Gateway at least every 6 months.
  • Latency and speed issues can be tested using https://azurespeedtest.azurewebsites.net/
  • To check for firewall or proxy blocking, open the Gateway app and run the network ports test. The Gateway checks whether it can reach and connect to all the required ports, which confirms that nothing is being blocked.
On-premise Gateway – Network ports test
  • In rare instances, you may want to check the settings in the Gateway config file (Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config). You can find this file on the machine where the Gateway is installed, under C:\Program Files\On-premises data gateway. For example, check whether the CPU and memory threshold settings are set to 0, which means the Gateway can use all the CPU and memory available on the machine.
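
The network ports test mentioned above can also be scripted, which helps when you need to repeat it on several Gateway machines. The following is a minimal sketch, not the Gateway app’s own implementation. Test-GatewayPort is a hypothetical helper name, and the host name and port list in the usage comment are what I recall from Microsoft’s documentation for outbound Gateway traffic, so please verify them against the current documentation before relying on them.

```powershell
# Hypothetical helper: tries a plain TCP connection to each port and reports the result.
function Test-GatewayPort {
    param(
        [Parameter(Mandatory)] [string] $HostName,
        [Parameter(Mandatory)] [int[]] $Port,
        [int] $TimeoutMs = 3000
    )
    foreach ($p in $Port) {
        $client = [System.Net.Sockets.TcpClient]::new()
        $reachable = $false
        try {
            # ConnectAsync plus Wait gives a bounded timeout instead of hanging on blocked ports.
            $task = $client.ConnectAsync($HostName, $p)
            $reachable = $task.Wait($TimeoutMs) -and $client.Connected
        }
        catch {
            # A refused or failed connection surfaces as an exception; treat it as unreachable.
            $reachable = $false
        }
        finally {
            $client.Dispose()
        }
        [pscustomobject]@{ HostName = $HostName; Port = $p; Reachable = $reachable }
    }
}

# Example usage (assumed host and ports, verify against Microsoft's documentation):
# Test-GatewayPort -HostName 'watchdog.servicebus.windows.net' -Port 443, 5671, 5672, 9350, 9351, 9352, 9353, 9354
```

A port reported as unreachable only tells you that the TCP connection failed or timed out from this machine. It doesn’t tell you whether a firewall, a proxy, or something else is responsible.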

If scheduled dataset refreshes are running slowly and mash-up memory usage is high, you can set the “StreamBeforeRequestCompletes” setting to True. Mash-up memory is the memory used by the Power Query queries in the datasets. A dataset with multiple queries, each with several complex steps, may run into the mash-up memory limit. This can also happen if the Gateway is installed with a different cloud vendor or is not close to the data sources.
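
Before changing anything in the config file, it can help to read the current values first. Below is a minimal sketch for doing that. Get-GatewayConfigSetting is a hypothetical helper name, and it assumes the file stores each setting as a setting element with a name attribute and a nested value element. StreamBeforeRequestCompletes comes from the text above, while the other setting names in the usage comment are the ones I recall from Microsoft’s documentation, so check them against your own file.

```powershell
# Hypothetical helper: looks up settings by name anywhere in the Gateway config XML.
function Get-GatewayConfigSetting {
    param(
        [Parameter(Mandatory)] [xml] $Config,
        [Parameter(Mandatory)] [string[]] $Name
    )
    foreach ($n in $Name) {
        # Assumed layout: <setting name="..."><value>...</value></setting>
        $node = $Config.SelectSingleNode("//setting[@name='$n']/value")
        [pscustomobject]@{
            Name  = $n
            Value = if ($null -ne $node) { $node.InnerText } else { $null }
        }
    }
}

# Example usage on the Gateway machine (setting names other than
# StreamBeforeRequestCompletes are assumptions, verify them in your file):
# $path = 'C:\Program Files\On-premises data gateway\Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config'
# Get-GatewayConfigSetting -Config (Get-Content -Path $path -Raw) -Name @(
#     'StreamBeforeRequestCompletes',
#     'CPUUtilizationPercentageThreshold',
#     'MemoryUtilizationPercentageThreshold',
#     'ServiceBusSystemConnectivityModeString'
# )
```

A setting that comes back with an empty Value was not found under that name. Take a backup of the config file before editing it; in my experience the Gateway service needs a restart before changes take effect.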

Recommendations

  • Install the On-premises Gateway close to the data sources and to the Azure data centers. This minimizes network latency.
  • Run the production Gateway as a cluster of multiple dedicated machines. This mitigates the risk of a single point of failure, distributes traffic evenly across the cluster members, and allows maintenance without disrupting the service.
  • Upgrade Gateway every 6 months to avoid any issues related to the Gateway.
  • Enforce HTTPS communication over Azure Service Bus instead of TCP.
  • Install separate non-production and production Gateways instead of using a single Gateway for every purpose.
  • Create a Gateway monitoring report to get insights into Gateway usage. Microsoft provides a template for this.
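
The upgrade recommendation above follows from the “last 6 releases” support window, and that check is easy to script once you keep a list of release versions. This is a minimal sketch under that assumption. Test-GatewayVersionSupported is a hypothetical helper, and you have to supply the list of released versions yourself (for example from the Gateway release notes), since nothing here looks it up for you.

```powershell
# Hypothetical helper: is the installed version among the N newest released versions?
function Test-GatewayVersionSupported {
    param(
        [Parameter(Mandatory)] [version] $Installed,
        [Parameter(Mandatory)] [version[]] $Released,
        [int] $SupportedCount = 6
    )
    # Newest releases first, duplicates removed, keep only the supported window.
    $supported = $Released |
        Sort-Object -Descending -Unique |
        Select-Object -First $SupportedCount
    $supported -contains $Installed
}

# Example usage (version numbers are placeholders, not real Gateway releases):
# $released = '3000.1.1', '3000.2.1', '3000.3.1', '3000.4.1', '3000.5.1', '3000.6.1', '3000.7.1'
# Test-GatewayVersionSupported -Installed '3000.2.1' -Released $released
```

Versions are compared as [version] values rather than strings, so 3000.10.1 correctly sorts after 3000.9.1.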

Wrap-up

If you are dealing with on-premises data sources, you will need the On-premises Data Gateway. This blog post attempts to give you all the important details about the Gateway. After working with multiple Gateways for quite some time, I have come across a range of challenges, along with recommendations for tackling them.
