Storage virtualization and data protection
Welcome to part 6 of our virtualization trends series, in which we discuss storage virtualization and data protection. These are big topics that are highly relevant to BDRSuite, one of the leading data protection platforms, especially when it comes to virtualization.
Just like the other topics we covered in the previous parts of this series, the world of storage has gone through major evolutions over the years, going hand in hand with the growing needs of virtual machines. High-performance workloads are also an important driver of innovation, with a constant need to push the envelope in use cases such as machine learning, artificial intelligence, data science and so on.
What is storage management?
Storage is one of the most critical areas of any SDDC (Software-Defined Data Center) in several respects:
- Data persistence: Some data is ephemeral, like logs and caches, and usually requires fairly high levels of performance. Other data is persistent and must remain available regardless of the status of the workloads generating or consuming it (e.g., databases)
- Performance: Not all workloads require the same level of performance, which often relates directly to the cost of the storage used. A good example of this is the storage tiers offered by cloud providers: different tiers are available at different price points, so you don’t store your backup copies on expensive flash storage, for instance
- Capacity: Obviously, capacity is an important factor in storage solutions, as you want to future-proof your installation without leaving 80% of it unused during its lifecycle. This, unfortunately, is often the case in hyperconverged infrastructures
20 years of storage evolution
The world of IT has gone through several stages in the storage space over the last 20 years or so. It started with old-school bare-metal servers using dedicated RAID cards to present local drives to the operating system running on the machine and the services installed on it.
Then came shared storage technologies, with protocols such as iSCSI and FC presenting a block volume to multiple servers, as long as it is formatted with a file system that supports concurrent access or at least offers a solid locking mechanism, like VMFS. File-sharing protocols like NFS, on the other hand, already offered concurrent access at the OS level.
Then object-based storage made its appearance, where data is stored as components on the underlying storage backend with the use of storage policies. Implementations like VMware’s vVols (Virtual Volumes) offer a great deal of flexibility by allowing storage management at a much more granular level than traditional block storage.
Building on this technology, virtualized storage, or software-defined storage (SDS), disrupted the IT landscape by leveraging local server disks to create a shared pool of storage across the servers participating in a cluster. VMware offers vSAN, and Microsoft has Storage Spaces Direct, which is now part of Azure Stack HCI.
Importance of staying up to date with the latest trends and technologies
Given all these evolutions and the rate at which they happen, it is critical for organizations to keep up with new developments in this space. That said, it is rarely wise to burn the entire yearly budget jumping on the shiny all-NVMe (Non-Volatile Memory Express) train, or whatever hot new technology is rising at the time, if it does not solve a current or future problem. Business needs are usually what drive the decision to move to a modern storage solution rather than keep what is currently in place, in which case it is important for the IT department to be able to deliver and offer expertise in this area.
While no one can be expected to be an expert in everything, having basic knowledge of the technological advancements in their space allows engineers to make informed recommendations and help business departments keep up with their needs.
Emerging Trends in Storage Management
Modern workloads require ever more performance, flexibility, mobility, capacity; you name it, they need it.
Hybrid cloud storage solutions and multi-cloud strategies
Hybrid cloud storage solutions are becoming increasingly popular among businesses and organizations, as they offer a flexible and scalable approach to data storage that combines the benefits of the public cloud and on-premises datacenter environments. Here are a few of the main hybrid cloud storage solutions:
- AWS Storage Gateway: AWS Storage Gateway is a hybrid cloud storage service that integrates on-premises applications with Amazon S3 or Amazon Glacier cloud storage. It offers several storage options, such as file, volume and even tape gateways. These provide the benefits of storing and retrieving data from the cloud while retaining data locally for low-latency access (a short tiering sketch follows this list)
- Microsoft Azure Stack: Azure Stack is a hybrid platform that extends Azure services to on-premises environments. It allows organizations to build and deploy applications using Azure services such as Azure Blob storage and Azure Virtual Machines, in their own data centers. As a result, you get the advantage of the scalability and flexibility that comes from the cloud while keeping sensitive data on-premises
- Google Cloud Storage: paired with on-premises storage solutions such as Network Attached Storage (NAS) and Storage Area Network (SAN) environments, Google Cloud Storage adds the scalability and durability of the cloud. That way, data can move seamlessly between on-premises and cloud storage based on the organization’s needs
Other solutions include IBM Cloud Object Storage, Dell EMC Unity Cloud Edition, Oracle Cloud Infrastructure Storage Gateway and a plethora of other offerings from most of the big IT vendors.
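As a concrete illustration of the cloud-side tiering these services rely on, here is a minimal sketch using the AWS SDK for Python (boto3). It uploads a backup copy to S3 and attaches a lifecycle rule that transitions objects to Glacier after 30 days; the bucket name, prefix and file name are hypothetical, used for illustration only.

```python
import boto3

# Hypothetical bucket and prefix used for illustration only
BUCKET = "example-backup-bucket"
PREFIX = "backups/"

s3 = boto3.client("s3")

# Upload a local backup copy to the S3 "hot" tier
s3.upload_file("daily-backup.tar.gz", BUCKET, PREFIX + "daily-backup.tar.gz")

# Lifecycle rule: transition objects under backups/ to Glacier after 30 days,
# so cold copies stop consuming the more expensive standard tier
s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-old-backups",
                "Filter": {"Prefix": PREFIX},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```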
Edge storage for IoT, 5G, and AI applications
Edge storage refers to the practice of storing and processing data locally at the edge of a network, closer to the source of data generation. This avoids transmitting data to a central datacenter or cloud over the WAN, which adds latency and can be costly. Edge storage is especially relevant for Internet of Things (IoT), 5G, and artificial intelligence (AI) applications, which generate massive amounts of data in real time that must be processed quickly to enable real-time decision-making.
Edge storage solutions usually involve deploying a basic storage infrastructure on the site where data is generated. More and more vendors offer solutions to address the growing needs of edge workloads; a few examples (a minimal store-and-forward sketch follows the list):
- VMware Edge Compute Stack: A purpose-built and integrated stack offering HCI and SDN for small-scale VM and container workloads to effectively extend an organization’s SDDC to the Edge and manage edge-native applications at the Near and Far Edge
- NVIDIA EGX Platform: A combination of NVIDIA GPUs with edge servers to enable AI inference and other compute-intensive workloads at the edge. Made of NVIDIA EGX servers, purpose-built edge machines with powerful GPUs and NVIDIA Metropolis (an edge-to-cloud platform for video analytics and AI-powered applications)
- AWS Greengrass: Edge computing service that allows users to run AWS Lambda functions and containerized applications at the edge of the network. It enables local data processing, device management, and seamless integration with AWS services for cloud-based analytics
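The common denominator of these platforms is the store-and-forward pattern: data is written to local storage first, so the site keeps working when disconnected, and is synchronized to the cloud when connectivity and cost allow. Below is a minimal, vendor-neutral sketch of that pattern in Python; the spool directory, bucket name and S3 endpoint are assumptions for illustration, not any vendor's actual mechanism.

```python
import json
import time
from pathlib import Path

import boto3  # any S3-compatible endpoint works for the sync step

SPOOL_DIR = Path("/var/spool/edge-readings")   # hypothetical local buffer
BUCKET = "example-central-bucket"              # hypothetical central bucket

s3 = boto3.client("s3")

def record_reading(sensor_id: str, value: float) -> None:
    """Persist a reading locally first: the edge keeps working offline."""
    SPOOL_DIR.mkdir(parents=True, exist_ok=True)
    fname = SPOOL_DIR / f"{sensor_id}-{time.time_ns()}.json"
    fname.write_text(json.dumps({"sensor": sensor_id, "value": value,
                                 "ts": time.time()}))

def sync_to_cloud() -> None:
    """Forward buffered readings to central storage, then free local space."""
    for f in sorted(SPOOL_DIR.glob("*.json")):
        s3.upload_file(str(f), BUCKET, f"edge/{f.name}")
        f.unlink()  # only delete after a successful upload
```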
Hyperconverged infrastructure (HCI)
HCI is nowhere near a new contender in the storage space, as software-defined storage has been around for a long time. Players like VMware and Microsoft started rolling out their implementations in the early 2010s and have been building on them since.
Hyperconverged infrastructure refers to scenarios where the shared storage pool is provided by the nodes themselves as opposed to a central storage array like you would typically see in most virtualized environments. Each participating host in the cluster is equipped with local drives and a proprietary protocol takes care of synchronizing data between the nodes.
Both approaches have pros and cons. For instance, it can be tricky to keep resource usage optimized, as adding nodes for compute also adds storage that is perhaps not needed, and vice versa. Still, many organizations went with hyperconverged architectures, as they have a number of significant upsides over traditional storage array architectures, such as:
- Ease of set up and configuration and multiple available architectures
- Same lifecycle as the hosts themselves
- Benefits of object storage capabilities (per-VM storage policies)
- Flexibility in failure domain configuration (see the sketch after this list)
- Less of a single point of failure (SPOF) as storage is distributed across all nodes
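To make the per-VM policy and failure-domain ideas concrete, here is a toy Python sketch (not any vendor's actual placement algorithm) that places the mirror copies of a VM's objects on hosts in distinct failure domains, in the spirit of a "failures to tolerate = 1" policy: losing a whole rack never takes out every copy. Host and rack names are hypothetical.

```python
# Hypothetical cluster: host -> failure domain (e.g., rack or site)
HOSTS = {
    "esx-01": "rack-A", "esx-02": "rack-A",
    "esx-03": "rack-B", "esx-04": "rack-B",
}

def place_replicas(num_copies: int) -> list[str]:
    """Pick one host per failure domain so a single rack failure
    never takes out every copy of the object."""
    by_domain = {}
    for host, domain in HOSTS.items():
        by_domain.setdefault(domain, host)  # first host seen per domain
    domains = list(by_domain.values())
    if num_copies > len(domains):
        raise ValueError("not enough failure domains for the requested copies")
    return domains[:num_copies]

# A "failures to tolerate = 1" style policy means two mirrored copies
print(place_replicas(2))  # e.g. ['esx-01', 'esx-03']
```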
VMware pushed the concept even further by integrating with Dell VxRail to fully automate the update process for versions, drivers and firmware. They also simplified compatibility and hardware support with vSAN ReadyNodes, certified hardware configurations tested by VMware themselves.
From hybrid to full flash storage
Before SSD devices were common hardware in the IT landscape, you would achieve storage tiering by building RAID groups backed by hard disks with faster or slower spindles. Drives dedicated to cold data and large capacities would usually spin at 5,400 RPM (rotations per minute), then 7,200 RPM, 10K RPM, and up to 15K RPM for workloads that required faster disk access. You then get into RAID design considerations such as RAID type, sector size and cache/buffer size, which could wildly change the outcome of a benchmark.
Then came hybrid deployments, where a set of SSD devices is dedicated to caching only. For read operations, the most frequently accessed blocks are also stored on the SSD tier for faster access. Some providers also offer a write cache, in which case, when a workload issues a write operation, the ACK is sent back as soon as the block is written to the cache tier (which is much faster than spindles) and then offloaded to the capacity tier, backed by regular hard drives. VMware vSAN offered hybrid storage based on disk groups made of cache devices and capacity devices.
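Here is a deliberately simplified Python sketch of that write-back behavior, assuming nothing about any particular vendor's implementation: writes are acknowledged as soon as they land in the fast cache tier, and a background worker destages them to the capacity tier.

```python
import threading
import queue

class WriteBackCache:
    """Acknowledge writes once they hit the fast tier; destage later."""

    def __init__(self):
        self.cache = {}                    # stands in for the SSD tier
        self.capacity = {}                 # stands in for the HDD tier
        self.dirty = queue.Queue()         # blocks waiting to be destaged
        threading.Thread(target=self._destage, daemon=True).start()

    def write(self, block_id, data):
        self.cache[block_id] = data        # fast-tier write...
        self.dirty.put(block_id)
        return "ACK"                       # ...acknowledged immediately

    def _destage(self):
        while True:
            block_id = self.dirty.get()    # background: offload to capacity
            self.capacity[block_id] = self.cache[block_id]

    def read(self, block_id):
        # Hot blocks are served from cache; fall back to the capacity tier
        return self.cache.get(block_id, self.capacity.get(block_id))

cache = WriteBackCache()
print(cache.write("blk-0", b"some data"))  # -> ACK, before the block hits HDD
```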
Finally, as chip prices went down over the years, SSD devices became more affordable, and more organizations decided to go the all-flash route in order to future-proof their hardware and ensure that their workloads would get the performance they required.
In the last few years, the performance of flash devices has increased further with new types of SSD technologies such as:
- NVMe (Non-Volatile Memory Express): NVMe SSDs are designed to take advantage of the high-speed capabilities of PCIe. A full PCIe 4.0 x16 link can carry up to 32 GB/s, and even the common x4 devices deliver several GB/s, far beyond SATA’s roughly 600 MB/s ceiling
- 3D NAND: 3D NAND flash memory is a type of non-volatile memory that stacks layers of memory cells on top of each other, allowing for greater storage capacity in a smaller space
- Intel Optane: A memory technology designed to bridge the gap between volatile and non-volatile memory. It is a form of non-volatile memory based on 3D XPoint technology, faster than traditional NAND flash memory (around 10 times lower latency) while being persistent, unlike DRAM. It can be used as a standalone storage device or as a cache to accelerate existing storage devices
Object storage and data management
Unlike traditional file-based storage or block-based storage, object storage systems manage data as discrete units, known as objects. Each object contains data, metadata, and a unique identifier, allowing for efficient and flexible data management and retrieval. In the 2020s, object storage has seen a significant rise in popularity due to the growth of unstructured data and the need for scalable, cost-effective storage solutions. Let’s explore some of the key factors driving the rise of object storage.
- Scalability: Object storage systems can scale horizontally by adding more nodes, allowing for virtually limitless storage capacity. This is particularly useful for organizations that need to store and manage large amounts of unstructured data
- Cost-Effectiveness: Since it allows organizations to store data on commodity hardware, it eliminates the need for expensive storage arrays and proprietary hardware, making it an attractive option for organizations looking to reduce their storage costs
- Durability: Erasure coding is a way to protect data against hardware failures (comparable to RAID levels). It splits data into multiple pieces and distributes them across different nodes in the storage cluster, ensuring that even if a node fails, the data can be reconstructed from the remaining pieces (a toy illustration follows this list)
- Accessibility: Objects can be accessed from anywhere using a RESTful API. This allows organizations to easily integrate object storage into their existing applications and workflows. In this use case we are talking about providers like Amazon S3
- Flexibility: Unlike file or block storage, object storage lets you manage the placement of components with policies that can be as granular as the file level, meaning your most critical VMs can be mirrored several times and encrypted, while test VMs can forgo redundancy and save storage by keeping a single copy of their components
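To see why erasure coding lets a cluster lose a node without losing data, here is a deliberately simplified XOR-parity example in Python. Real systems use schemes such as Reed-Solomon across many more pieces; this toy version only shows the principle that any one of the three pieces can be rebuilt from the other two.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

data = b"critical"                          # 8 bytes of payload
half = len(data) // 2
piece1, piece2 = data[:half], data[half:]   # two data pieces
parity = xor_bytes(piece1, piece2)          # one parity piece

# The three pieces live on three different nodes; suppose node 1 fails:
rebuilt_piece1 = xor_bytes(piece2, parity)  # XOR is self-inverse
assert rebuilt_piece1 == piece1
print((rebuilt_piece1 + piece2).decode())   # -> "critical"
```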
Conclusion
Over the last 20 years, datacenter storage technologies have seen dramatic advancements and innovations, from the introduction of solid-state drives to the rise of cloud storage and object storage. These evolutions have allowed for faster, lower-latency, more efficient, and more scalable storage solutions, enabling organizations to manage and process ever-increasing amounts of data and keep up with the data decade.
Looking ahead, there are several potential disruptive technologies to watch out for, including non-volatile memory express (NVMe), Intel Optane, and machine learning-driven storage management. These technologies have the potential to revolutionize the datacenter storage landscape, offering even faster speeds, higher capacities, and more efficient data management.
As these new technologies emerge and existing technologies continue to evolve, it is crucial for organizations and IT professionals to stay up to date and be prepared for the changing storage landscape. By keeping up with the latest advancements and understanding how they can benefit an organization, IT departments can ensure they are using the most efficient and effective storage solutions at their disposal for their specific use cases.
Follow our Twitter and Facebook feeds for new releases, updates, insightful posts and more.