Setting up the right infrastructure to support your Power BI deployment can be a tedious task loaded with tough decisions and lots of different variables to consider.
Our goal is to make one part of your enterprise deployment – choosing the right virtual machines (VMs) for your gateways – a little bit easier.
Choosing the right VMs matters because they determine whether your deployment performs and scales as it should. In this post, we cover the different types of Azure VMs and what to consider when choosing the type and size of your virtual machine.
What is a Virtual Machine, and Why Does it Matter?
A virtual machine is essentially a dedicated slice of capacity on a physical server, with all of the components of a physical computer, such as CPU, disks, and memory. This “virtual” computer is not a physical desktop computer, but rather a virtualized version of a computer that exists purely as code. It is important to understand this concept because virtual machines are where you will host your Power BI gateways, instead of hosting them on a physical machine of your own.
Hosting gateways on a VM is important because it gives you a central place to house your gateways that will not disappear if an employee leaves the company. Virtual machines can also be scaled up or down easily to meet the needs of your workloads, whereas scaling up a physical machine usually means replacing hardware.
There are many different types of VMs available in the Azure portal, each with different capabilities and purposes. Choosing the right VM for your needs is important for the long-term success of your Power BI deployment.
In the next section, we’ll discuss the different types of VMs available and how to choose the right one for your business.
How to Choose the Right Virtual Machine
Before we dive into the different types of virtual machines, we first need to understand the use case for the VMs. These VMs will handle the workloads for your data refreshes and/or queries to your data sources. It is important to note whether developers will be using DirectQuery or Import mode for their data models. You will also want to know how large the datasets you are querying will be. Our recommendation for VM size and type is based on this information.
Once you’ve gathered the appropriate information, we can take a look at which virtual machines fit your use case. Let’s go over the types of VMs available in Azure Portal. You can find a link to the Microsoft documentation that includes all of these VMs here.
In the table below (also found in the previous link), you will find the different VMs broken down by type, SKU, and a brief description.
Type | Sizes | Descriptions |
---|---|---|
General purpose | B, Dsv3, Dv3, Dasv4, Dav4, DSv2, Dv2, Av2, DC, DCv2, Dv4, Dsv4, Ddv4, Ddsv4, Dv5, Dsv5, Ddv5, Ddsv5, Dasv5, Dadsv5 | Balanced CPU-to-memory ratio. Ideal for testing and development, small to medium databases, and low to medium traffic web servers. |
Compute optimized | F, Fs, Fsv2, FX | High CPU-to-memory ratio. Good for medium traffic web servers, network appliances, batch processes, and application servers. |
Memory optimized | Esv3, Ev3, Easv4, Eav4, Ebdsv5, Ebsv5, Ev4, Esv4, Edv4, Edsv4, Ev5, Esv5, Edv5, Edsv5, Easv5, Eadsv5, Mv2, M, DSv2, Dv2 | High memory-to-CPU ratio. Great for relational database servers, medium to large caches, and in-memory analytics. |
Storage optimized | Lsv2, Lsv3, Lasv3 | High disk throughput and IO ideal for Big Data, SQL, NoSQL databases, data warehousing, and large transactional databases. |
GPU | NC, NCv2, NCv3, NCasT4_v3, ND, NDv2, NV, NVv3, NVv4, NDasrA100_v4, NDm_A100_v4 | Specialized virtual machines targeted for heavy graphic rendering and video editing, as well as model training and inferencing (ND) with deep learning. Available with single or multiple GPUs. |
High performance compute | HB, HBv2, HBv3, HC, H | Azure's fastest and most powerful CPU virtual machines with optional high-throughput network interfaces (RDMA). |
There are a lot of different options here, so to save you some time, let’s look into the different category types and use them to narrow down the list.
- General purpose – As the name suggests, these VMs are a jack of all trades and are meant for smaller workloads.
- Compute optimized – Meant for compute-heavy work, which can make them a better fit for DirectQuery workloads.
- Memory optimized – Suited to large caches, so they work best when most of your datasets use Import mode.
- Storage optimized – Best for large data warehousing jobs.
- GPU – Very specialized use cases that include heavy graphics and video editing.
- High performance compute – These are the most powerful (and most expensive) VMs offered, designed for real-time use cases.
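The rules of thumb in this list can be written down as a small lookup, which is handy if you are triaging several gateway clusters at once. This is only a sketch of the guidance in this post, not official Microsoft sizing logic; the function name and the workload labels are hypothetical.

```python
# Rules of thumb from the list above, keyed by the dominant gateway workload.
# The workload labels are illustrative names, not Power BI API values.
CATEGORY_BY_WORKLOAD = {
    "small": "General purpose",
    "directquery": "Compute optimized",
    "import": "Memory optimized",
}


def suggest_category(workload: str) -> str:
    """Return the VM category this post suggests for a workload label."""
    key = workload.strip().lower().replace(" ", "")
    # Fall back to general purpose when the workload is unknown or mixed.
    return CATEGORY_BY_WORKLOAD.get(key, "General purpose")


print(suggest_category("Direct Query"))  # Compute optimized
print(suggest_category("Import"))        # Memory optimized
```

Treat the result as a starting category only; the sizing steps that follow still decide the actual SKU.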
Once you have determined which general category fits your use cases, the next step is to determine which VM model fits the rest of your requirements. An easy way to do this is to view the VM selection screen in the Azure portal.
In your Portal, select Virtual Machines from Azure services, click Create, then Azure virtual machines. About halfway down the screen you will see a section called Instance Details where you will be required to choose a size for your VM. You can click on the See all sizes button to view each of the SKUs with their associated prices.
Check out the screenshot below for this view. You can also find the Azure VM pricing calculator here.
At the beginning of this section, I stated that you will need to find out the size of your datasets to make the right VM decision. This is where that information will be applied!
Each virtual machine SKU will have different specifications in terms of CPU cores, RAM, disks, and temporary storage. These different specifications will also come with different price tags. It is important to understand these different requirements to make sure that you choose the right VM for your Power BI deployment.
Microsoft recommends a VM with at least 8 CPU cores and 8 GB of RAM for a Power BI gateway. This guidance can serve as a starting point for excluding the VMs within your category that do not meet these criteria.
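As a rough illustration, the baseline filter might look like the sketch below. The candidate list and the `cores` and `memory_gb` field names are made up for this example, and the specs are illustrative, so confirm current values on the See all sizes screen before relying on them.

```python
# Baseline from the gateway guidance quoted above.
MIN_CORES = 8
MIN_MEMORY_GB = 8

# Illustrative candidates; confirm current specs in the Azure portal.
CANDIDATES = [
    {"name": "Standard_D4s_v3", "cores": 4, "memory_gb": 16},
    {"name": "Standard_D8s_v3", "cores": 8, "memory_gb": 32},
    {"name": "Standard_F8s_v2", "cores": 8, "memory_gb": 16},
    {"name": "Standard_E8s_v3", "cores": 8, "memory_gb": 64},
]


def meets_baseline(size: dict) -> bool:
    """True when a size has at least the baseline cores and memory."""
    return size["cores"] >= MIN_CORES and size["memory_gb"] >= MIN_MEMORY_GB


shortlist = [s["name"] for s in CANDIDATES if meets_baseline(s)]
print(shortlist)
```

Here the 4-core size is dropped and the three 8-core sizes remain for the next step.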
Once you’ve isolated the VMs that will meet the base requirements, you will need to identify a VM that meets your needs in terms of data size. If you are caching your data, you will want a VM with an amount of RAM that meets or preferably exceeds your largest dataset sizes.
As an example, if your post-compression dataset sizes max out at 32 GB, you might want to find a VM with 64 GB of RAM to ensure that you will be able to handle the load.
Finally, once you’ve found a few VMs that meet your requirements, you will want to consider the price of procuring the right number of VMs for your deployment. Since using at least two gateways to form a cluster is a deployment best practice, you will want to procure at least two of your chosen VM.
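The same arithmetic can be sketched in a few lines. The 2x headroom factor is an assumption taken from the 32 GB to 64 GB example above, not a fixed rule, and the size list and helper names are hypothetical.

```python
from typing import Optional

# Illustrative memory-optimized candidates; verify specs in the Azure portal.
SIZES = [
    {"name": "Standard_D8s_v3", "memory_gb": 32},
    {"name": "Standard_E8s_v3", "memory_gb": 64},
    {"name": "Standard_E16s_v3", "memory_gb": 128},
]


def required_ram_gb(largest_dataset_gb: float, headroom: float = 2.0) -> float:
    """RAM target: largest post-compression dataset times a headroom factor."""
    return largest_dataset_gb * headroom


def smallest_fit(sizes: list, needed_gb: float) -> Optional[dict]:
    """Pick the size with the least RAM that still meets the target."""
    fits = [s for s in sizes if s["memory_gb"] >= needed_gb]
    return min(fits, key=lambda s: s["memory_gb"]) if fits else None


needed = required_ram_gb(32)
choice = smallest_fit(SIZES, needed)
print(needed, choice["name"])  # 64.0 Standard_E8s_v3
```

If nothing in your category is large enough, `smallest_fit` returns `None`, which is a signal to look at a larger SKU family or revisit the dataset sizes.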
Make sure that you’re anticipating the cost of multiple virtual machines when you are taking the price into account.
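A quick way to sanity-check the budget is to multiply the hourly price by the number of VMs in the cluster and the hours they run. The hourly price below is a placeholder, not a real Azure rate, and 730 hours is a common approximation for one month of always-on usage; take actual prices from the pricing calculator.

```python
HOURS_PER_MONTH = 730  # common approximation for an always-on month


def monthly_cluster_cost(hourly_price: float, vm_count: int = 2,
                         hours: float = HOURS_PER_MONTH) -> float:
    """Estimated monthly compute cost for a gateway cluster."""
    return hourly_price * vm_count * hours


# Placeholder price of 0.50 per hour, two-node cluster.
print(monthly_cluster_cost(0.50))  # 730.0
```

This covers compute only; disks, licensing, and networking are not included in the estimate.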
Closing
As you can see, sizing virtual machines can be a difficult exercise because of the complexity and number of variables involved in the process. Although not an exact science, we hope this blog post has helped answer some of your questions and removed some of the uncertainty around choosing the correct virtual machine.
Thanks for reading! Feel free to reach out with any questions about your Power BI deployment. Our experienced team of Power Platform professionals will be able to assist you with any questions about this content or your next Power BI deployment!