Quick Bites:
- Understanding and optimizing memory usage in virtual environments like VMware is crucial for efficient resource management and maximum performance. Inadequate memory allocation can lead to performance degradation, resource contention, and potential system failures, impacting overall workload performance and user experience
- This blog delves into understanding various memory metrics in VMware vSphere, aiming to clarify their meanings and implications for troubleshooting
- It outlines key metrics like Consumed, Granted, Active, Overhead, Shared, Ballooned, Compressed, Swapped, SwapIn, and SwapOut, explaining their significance and how they impact virtual machine performance
- Emphasizing the importance of proper memory management, the blog provides insights into memory reclamation techniques such as ballooning, compression, and swapping, offering practical tips for optimizing memory usage in virtualized environments
The virtual machine monitoring tab of vCenter provides a lot of very useful information about the various resources used by the VM. One of them is memory, which includes many metrics that can be confusing at first: what do they mean, and what should you do with them?
We are not going to describe all of them here; instead, we will focus on the main ones you are most likely to use for troubleshooting.
I will start by defining a few terms to avoid confusion in the following sections.
Host physical memory: refers to the memory that is visible to the hypervisor as available on the system.
Guest physical memory: refers to the memory that is visible to the guest operating system running in the virtual machine.
Guest virtual memory: refers to a contiguous virtual address space presented by the guest operating system to applications. It is the memory that is visible to the applications running inside the virtual machine. It is backed by host physical memory, which means the hypervisor provides a mapping from the guest to the host memory.
The built-in memory view of the VM monitoring tab only includes “Consumed”, “Active”, “Ballooned” and “Granted”.
Click on “Chart options” and add the metrics you want to see. You can then save the view for quick access later by clicking “Save option as…” and giving it a name.
Now, let us get started with the memory metrics.
Consumed
This metric is used to measure the amount of machine memory allocated to the VM. The consumed memory represents the amount of host memory that has been allocated to the VM upon request minus the savings made with inter-VM TPS (Transparent Page Sharing).
A way to calculate the consumed host memory is to count the number of blocks accessed by the VM, dividing each shared block by the number of VMs accessing it, then multiplying the total by the block size (i.e. 4K).
Here is a good example from VMware’s documentation center. Virtual machine 1 touches 3 unique blocks, but 1 is shared with Virtual Machine 2.
If TPS were disabled, block “b” would be stored twice in host memory, so the consumed memory value of Virtual Machine 1 would be 12K. Note that, for security reasons, transparent page sharing is restricted to intra-VM sharing by default, so different virtual machines won’t share the host’s memory pages unless you change the setting (more on that later).
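The calculation above can be sketched in a few lines of Python. This is a toy model, not ESXi code: the 4 KB block size matches the example, and the `sharers` map (block ID to number of VMs referencing it) is an illustrative data structure.

```python
# Toy model of the "Consumed" calculation with inter-VM TPS enabled:
# each unique block counts fully, each shared block is divided by the
# number of VMs referencing it.

BLOCK_SIZE_KB = 4

def consumed_kb(vm_blocks, sharers):
    """vm_blocks: block IDs touched by the VM.
    sharers: dict mapping a block ID to the number of VMs referencing it."""
    return sum(BLOCK_SIZE_KB / sharers.get(block, 1) for block in vm_blocks)

# Virtual Machine 1 touches blocks a, b, c; block "b" is shared with VM 2.
sharers = {"a": 1, "b": 2, "c": 1}
print(consumed_kb(["a", "b", "c"], sharers))  # 4 + 2 + 4 = 10.0 KB
```

Without sharing, the same three blocks would account for the full 12K shown in the example.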
When a VM touches a memory page that hasn’t been allocated yet, ESXi allocates it and counts it in the “Consumed” metric. However, it is difficult for the host to know when the virtual machine deallocates a page in the guest: even when it is no longer used, it remains allocated in the host’s memory and still appears in the consumed graph.
This is why you will very often see a huge difference between Active and Consumed memory, and it is where memory reclamation methods come into play in case of contention (more on this later).
Extract from the memory resource management guide from VMware:
“Virtual machine memory deallocation acts just like an operating system, such that the guest operating system frees a piece of physical memory by adding these memory page numbers to the guest free list, but the data of the “freed” memory may not be modified at all. As a result, when a particular piece of guest physical memory is freed, the mapped host physical memory will usually not change its state and only the guest free list will be changed.
The hypervisor knows when to allocate host physical memory for a virtual machine because the first memory access from the virtual machine to a host physical memory will cause a page fault that can be easily captured by the hypervisor. However, it is difficult for the hypervisor to know when to free host physical memory upon virtual machine memory deallocation because the guest operating system free list is generally not publicly accessible. Hence, the hypervisor cannot easily find out the location of the free list and monitor its changes.”
Granted
Use this metric to measure the virtual machine memory. It accounts for the amount of guest physical memory that has been provided to the VM. Memory is not granted to the VM until it has been touched at least once. As opposed to consumed memory, the focus shifts to the VM rather than the host. Granted memory does not take page sharing into consideration as it looks at the guest physical memory.
If you run multiple VMs that leverage TPS to share memory, you will notice that consumed memory goes down as shared memory grows, while granted memory remains the same. You can clearly see this pattern in the graph below.
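That divergence between the two counters can be illustrated with a small sketch. It is a toy model under the same assumptions as before (4 KB blocks); the function names are mine, not vSphere terminology.

```python
BLOCK_SIZE_KB = 4

def granted_kb(n_touched_blocks):
    # Granted counts every touched guest physical page at full size;
    # host-side sharing is invisible at this level.
    return n_touched_blocks * BLOCK_SIZE_KB

def consumed_kb(n_unique, n_shared, sharers_per_block):
    # Consumed divides each shared host page by its number of sharers.
    return n_unique * BLOCK_SIZE_KB + n_shared * BLOCK_SIZE_KB / sharers_per_block

# A VM touching 3 blocks, one of which becomes shared by more and more VMs:
for sharers in (1, 2, 4):
    print(granted_kb(3), consumed_kb(2, 1, sharers))
# granted stays at 12 while consumed shrinks: 12.0, 10.0, 9.0
```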
Active
This metric is probably the most confusing one. It is important to understand what it means before taking action, as acting on it could have unexpected negative impacts on the VM’s performance. The name seems self-explanatory; you would think “active memory” stands for the memory the guest is actually using right now.
Unfortunately, it is not as straightforward as it seems.
The purpose of this metric is actually to estimate how much memory the guest is using. To measure it exactly, the host would need to monitor every page that is touched, but the overhead of doing so wouldn’t be worth it. Instead, the host uses a sampling mechanism to estimate the amount of memory that has been touched. It therefore represents what the VMkernel thinks is currently being actively used by the VM.
Now there are two considerations to be aware of with regard to this:
- During a sampling period, touched memory pages are counted regardless of whether they are unique or were already accessed in a previous sampling period. So, if the active memory of a VM remains at 1GB for 10 sampling periods, the VM may have accessed anywhere between 1GB and 10GB of RAM, even though vCenter shows a straight line at 1GB
- Because the metric only looks for pages that have been touched, it won’t necessarily reflect what’s going on in the guest OS. A clear example is a SQL database that caches part of its data in RAM. The DB may hold 20GB in RAM but only access 2GB of it during a sampling period: the active memory value of the VM will be 2GB, but if you look in Windows you will see 22GB of RAM used
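The sampling idea can be sketched as follows. This is a toy model of statistical sampling, not the VMkernel’s actual algorithm: the sample size, seed, and function names are all illustrative assumptions.

```python
import random

def estimate_active_pages(touched_pages, total_pages, sample_size=100, seed=0):
    """Toy sampling estimator: probe sample_size random pages, count how
    many were touched during the period, and scale the ratio up to the
    whole VM. All parameters here are illustrative, not VMkernel values."""
    rng = random.Random(seed)
    sample = rng.sample(range(total_pages), sample_size)
    touched_in_sample = sum(1 for page in sample if page in touched_pages)
    return total_pages * touched_in_sample // sample_size

# A VM with 1000 pages, 20% of which were touched this period:
touched = set(range(200))
print(estimate_active_pages(touched, 1000))  # roughly 200, not an exact count
```

The key point the sketch makes is that the result is an estimate with sampling error, which is why Active should not be read as an exact guest memory figure.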
This metric was originally created for DRS to obtain real-time resource usage and help distribute the load across hosts. For these reasons, it is important to understand that this metric must NOT be used for capacity planning or for monitoring memory usage of the guest OS. For the latter, you should use “guest-aware” tools like Zabbix (i.e. SNMP or agent-based). However, if a VM has a high active memory counter, you know it is actively working in memory.
There is a good VMware blog that gives some background on the Active memory metric and why it works this way.
A quick note about a feature introduced in vSphere 6.5 that allows you to force DRS to use consumed memory rather than active memory. This is a really good feature I had been waiting for, especially for customers that don’t overcommit memory on their hosts.
Overhead
The overhead memory accounts for the amount of machine memory used by the VMkernel to run the virtual machine. Its size depends on the number of virtual CPUs and the configured memory for the guest operating system. A VM requires a certain amount of memory to be able to power on which qualifies as overhead memory.
You can find these requirements below, but note that they do not cover all scenarios, only the most common ones.
Shared
We already covered a fair bit about Transparent Page Sharing with the previous metrics, but there is a lot more to be said about it. The shared memory metric reflects the amount of host memory pages shared by two or more virtual machines. If a virtual machine requests a memory page that has already been stored by another VM, the host will not create a new copy but will map the existing page to the VM, some kind of memory deduplication if you wish. This has the benefit of saving memory and allowing for greater levels of overcommitment.
ESXi works out which pages can be shared by running a background activity that scans at intervals for sharing opportunities. The more similar and constant your workloads are, the greater your memory sharing will get over time. The interval can be controlled with the advanced settings Mem.ShareScanTime and Mem.ShareScanGHz. To identify which pages are similar, ESXi creates a hash of every page and stores it in a hash table. When a page requested by a VM generates a hash that already exists in the table, the VM is mapped to the corresponding address in memory.
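A hash-table lookup of this kind can be sketched with a toy model. The class and field names are mine, and SHA-1 merely stands in for whatever hash the hypervisor uses; note that real ESXi also performs a full byte comparison of candidate pages before sharing them, to guard against hash collisions.

```python
import hashlib

class PageSharer:
    """Toy model of TPS: a hash table maps page-content hashes to a single
    backing host page, so identical pages share one copy."""
    def __init__(self):
        self.hash_table = {}      # content hash -> host page number
        self.host_pages_used = 0

    def map_page(self, content: bytes) -> int:
        key = hashlib.sha1(content).hexdigest()
        if key not in self.hash_table:          # first time this content is seen:
            self.hash_table[key] = self.host_pages_used  # back it with a new page
            self.host_pages_used += 1
        return self.hash_table[key]             # otherwise reuse the existing page

sharer = PageSharer()
zero_page = b"\x00" * 4096
p1 = sharer.map_page(zero_page)   # VM 1 touches a zero-filled page
p2 = sharer.map_page(zero_page)   # VM 2 touches an identical page
print(p1 == p2, sharer.host_pages_used)  # True 1
```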
Do note that inter-VM page sharing is disabled by default for security reasons and sharing is restricted to intra-VM. See this extract from VMware’s resource management guide.
“Due to security concerns, inter-virtual machine transparent page sharing is disabled by default and page sharing is being restricted to intra-virtual machine memory sharing. This means page sharing does not occur across virtual machines and only occurs inside of a virtual machine. The concept of salting has been introduced to address concerns system administrators may have over the security implications of transparent page sharing. Salting can be used to allow more granular management of the virtual machines participating in transparent page sharing than was previously possible. With the new salting settings, virtual machines can share pages only if the salt value and contents of the pages are identical. A new host config option Mem.ShareForceSalting can be configured to enable or disable salting”
A quick note about the concept of salting in TPS: a salt value is used in the generation of the hash. By setting the VM advanced setting sched.mem.pshare.salt, you can allow a group of VMs to share memory with each other. This can be interesting if you have a bunch of memory-hungry VMs known to have a similar purpose. More info on TPS salting in KB2097593.
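The effect of salting on the hash is simple to sketch: mixing a per-VM salt into the hash means identical pages only collide (and can therefore only be shared) between VMs configured with the same salt. The function name is mine, and SHA-1 is just a stand-in hash.

```python
import hashlib

def page_key(content: bytes, salt: str) -> str:
    # The salt is mixed into the hash, so pages can only match between
    # VMs configured with the same sched.mem.pshare.salt value.
    return hashlib.sha1(salt.encode() + content).hexdigest()

page = b"\x00" * 4096
assert page_key(page, "groupA") == page_key(page, "groupA")  # same salt: shareable
assert page_key(page, "groupA") != page_key(page, "groupB")  # different salt: no match
```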
Ballooned
We are now getting into memory reclamation techniques.
VMware ballooning is a feature that leverages the balloon driver (vmmemctl), included in the VMware Tools installed in the guest OS, to release memory and give it back to the host in case of contention. Note that if VMware Tools is not running in the guest, ballooning will not be available for the virtual machine.
To force the guest to release memory pages, the balloon driver simulates a memory contention scenario in the guest by increasing memory pressure on the OS (inflating), which makes it use its own native reclamation methods and release its least valuable pages.
If resources are really tight, it will force the guest OS to swap pages to its virtual disk. The fact that the guest OS itself decides which pages to swap is a good thing: it will select the ones with the least impact on the system, instead of the random pages the host would pick if it forced the VM to swap to its swap file (.vswp), which could very well be important and heavily accessed pages, seriously hindering performance. Once the reclamation is done, the balloon driver stops and gives the pages back to the host (deflating).
It is possible to limit the amount of memory the balloon driver can reclaim with the advanced setting sched.mem.maxmemctl.
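The inflate logic can be sketched as a toy model. It is not the driver's actual algorithm: the function and parameter names are mine, and `maxmemctl_mb` merely stands in for the sched.mem.maxmemctl cap mentioned above.

```python
def balloon_inflate(request_mb, guest_free_mb, maxmemctl_mb):
    """Toy model of balloon inflation: reclamation is capped by the
    sched.mem.maxmemctl-style limit; the guest satisfies the balloon
    from its free list first and only then swaps its least valuable
    pages to its own page file."""
    reclaimed = min(request_mb, maxmemctl_mb)        # cap the balloon size
    from_free_list = min(reclaimed, guest_free_mb)   # cheap: free guest pages
    from_guest_swap = reclaimed - from_free_list     # costly: guest-side swap
    return from_free_list, from_guest_swap

# Host asks for 2 GB back; guest has 1.5 GB free, cap set at 4 GB:
print(balloon_inflate(2048, 1500, maxmemctl_mb=4096))  # (1500, 548)
```

The second number is the part the guest has to swap itself, which is exactly the scenario described above.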
Compressed
Memory compression is another reclamation technique used to accommodate overcommitment. It tries to compress memory pages and store them in a small portion of the VM’s memory, which allows for much faster access than swapping should the VM need the page again. If a page to be swapped can be compressed to 2 KB or smaller, it is stored in the virtual machine’s compression cache, increasing the capacity of the host.
However, if the page can’t be compressed, it is swapped to disk (see next chapter). The default size of the cache used to store compressed pages is 10% of a VM’s memory, but it can be changed with the Mem.MemZipMaxPct advanced setting (between 5 and 100%).
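The compress-or-swap decision can be sketched with zlib standing in for ESXi's compressor (the actual algorithm differs; the function name and return values are illustrative):

```python
import random
import zlib

PAGE_SIZE = 4096
CACHE_THRESHOLD = 2048  # a page must compress to 2 KB or less to be cached

def reclaim_page(page: bytes):
    """Toy model: if the 4 KB page compresses to 2 KB or less it goes to
    the per-VM compression cache, otherwise it is swapped to disk."""
    compressed = zlib.compress(page)
    if len(compressed) <= CACHE_THRESHOLD:
        return ("compression cache", len(compressed))
    return ("swapped to disk", PAGE_SIZE)

zero_page = b"\x00" * PAGE_SIZE                       # highly compressible
random_page = random.Random(42).randbytes(PAGE_SIZE)  # incompressible
print(reclaim_page(zero_page)[0])    # compression cache
print(reclaim_page(random_page)[0])  # swapped to disk
```

This also shows why compression helps most with pages full of repetitive data: random content simply won't fit under the 2 KB threshold.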
Swapped
Memory swapping happens when things are starting to smell toasty. It is the last resort among reclamation mechanisms, invoked when TPS, ballooning, and compression can’t save any more memory, and it is the one that will impact performance the most.
We talked about swapping in the ballooning chapter, where it is executed by the guest OS running inside the virtual machine, meaning the OS is aware of the best pages to swap. Those pages are swapped to the virtual disk; performance will still be terrible should they be needed again, but there is less chance of that happening (or it would be less critical). This type of swapping will not show up in the monitoring view as it happens inside the guest.
The swapped metric is different: when ballooning and compression have done their job and there is no more memory to scavenge, ESXi starts grabbing pages from the VM’s memory and throwing them to disk in its swap file. When this happens there is no choosing which page is best to swap, as ESXi has no idea how critical each page is. It is basically a blindfolded darts game with the VM’s performance taped to the target. The performance when accessing swapped-out pages will be the performance of your datastore.
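The contrast between guest-driven and host-driven victim selection can be sketched as follows. Both functions are toy models with made-up names; the point is only that the hypervisor picks blindly while the guest can pick its least recently used pages.

```python
import random

def host_pick_victims(resident_pages, n, seed=1):
    """The hypervisor cannot tell hot guest pages from cold ones, so
    victims for host-level swapping are effectively picked at random."""
    return random.Random(seed).sample(sorted(resident_pages), n)

def guest_pick_victims(last_use_by_page, n):
    """The guest OS tracks usage and evicts its least recently used
    pages first, which is why balloon-driven swapping hurts less."""
    return sorted(last_use_by_page, key=last_use_by_page.get)[:n]

# The guest knows "hot_db_cache" was used recently and spares it:
last_use = {"hot_db_cache": 100, "idle_buffer": 1, "startup_code": 2}
print(guest_pick_victims(last_use, 2))  # ['idle_buffer', 'startup_code']
```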
Note that the swap file is stored with the VM by default, but you can specify a location on better performing storage, like an SSD-backed datastore, to mitigate the pain a little if your systems are short on memory.
You will also notice two swap metrics, depending on whether data is written to disk or read back into memory.
SwapIn
It refers to the amount of data swapped back into memory from disk: the amount read into host memory from the swap file since the virtual machine was powered on. So if the VM swapped in the past due to memory contention that no longer exists, you will slowly see SwapIn increase as the VM accesses swapped pages, which are moved back to RAM.
SwapOut
Swapping out refers to the amount the VMkernel has written to the virtual machine’s swap file from machine memory. As mentioned before, this is different from guest OS swapping.
Conclusion
Managing memory is a tricky thing to do, so hopefully this blog post gives you a better idea of what’s what. Note that I didn’t cover monitoring: as mentioned earlier, the best way to monitor virtual machine memory is from within the guest OSes. This article does not cover everything and there is still a lot more to be said; we focused primarily on the metrics, but if you are interested in memory reclamation mechanisms you can check out this article by running-system.com.