Moving to Cloud: When to Migrate and When to Tier

This blog was adapted from its original version on The New Stack.

The cloud has had its ups and downs in the last year, but it remains a viable and increasingly vital infrastructure play for the enterprise. Moving your data and workloads the right way can cut costs dramatically and set the stage for AI projects. With so many storage tiers now available, IT leaders need to understand the options while navigating the choice: migrate, tier, or do some of each?

Compare Cloud Data Migration to Cloud Data Tiering

First, let’s review the differences between cloud data migration and cloud tiering of files.

Cloud data migration means taking data that is currently stored on-prem and moving it to a cloud storage service (such as Amazon Elastic File System or Azure Files) that makes the data instantly accessible from the cloud. Cloud data migrations may occur when it’s time to refresh storage or as part of an overall move-to-the-cloud strategy.

Migrating data to the cloud serves at least two purposes. One is to leverage cloud file systems and run applications in the cloud to achieve scale and on-demand pricing. The other is to use the cloud as an offline archive, using low-cost object storage classes such as Amazon S3 Glacier and S3 Glacier Instant Retrieval.
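For the archive use case, moving a file into an archive tier can be as simple as an S3 upload with an archive storage class. Below is a minimal sketch using boto3; the bucket name and paths are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Upload a file directly into the S3 Glacier Instant Retrieval storage
# class. "my-archive-bucket" and both paths are placeholder values.
s3.upload_file(
    Filename="/data/projects/2019/results.csv",
    Bucket="my-archive-bucket",
    Key="archive/projects/2019/results.csv",
    ExtraArgs={"StorageClass": "GLACIER_IR"},
)
```

Choosing GLACIER_IR keeps the object instantly readable; the deeper GLACIER and DEEP_ARCHIVE classes cost less still but require a restore step before access.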

In contrast, cloud data tiering is the process of continuously offloading older, cold data that has not been accessed in months to cloud storage services. Tiering creates an “online archive” in the cloud in which files still appear to be on-prem and can be accessed by simply double-clicking them. Archival storage in the cloud (aka Glacier) costs much less than standard S3 storage. Because tiering continuously moves older data to the cloud, it reduces the amount of expensive, high-performance storage you need on-prem as well as the amount of backup storage required. This can reduce storage costs by as much as 70%.
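To see where a figure like 70% can come from, here is a rough, illustrative calculation. All prices and ratios below are assumptions for the sketch, not published rates:

```python
# Illustrative cost model; every number here is an assumption.
total_tb = 100                 # total file data on-prem
cold_fraction = 0.8            # share of data untouched for months
onprem_cost_per_tb = 30.0      # $/TB-month, high-performance NAS (assumed)
archive_cost_per_tb = 4.0      # $/TB-month, cloud archive tier (assumed)

before = total_tb * onprem_cost_per_tb
after = (total_tb * (1 - cold_fraction) * onprem_cost_per_tb
         + total_tb * cold_fraction * archive_cost_per_tb)

print(f"Before tiering: ${before:,.0f}/month")   # $3,000/month
print(f"After tiering:  ${after:,.0f}/month")    # $920/month
print(f"Savings: {100 * (1 - after / before):.0f}%")  # ~69%
```

Under these assumptions, tiering 80% of the data cuts the monthly bill by roughly 69%, before even counting the reduction in backup storage.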

Cloud Migration Considerations

Pre-assessment of data: It’s important to use an analytics-first approach to identify what should be moved to the cloud and what should be deleted or archived. This will reduce cloud costs and migration times and ensure that you are choosing the right strategy for the right data sets at the right time.
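As a rough illustration of an analytics-first pass, the sketch below walks a share and buckets capacity by last-access age. The mount point is a placeholder, and note that atime can be unreliable on volumes mounted with noatime:

```python
import os
import time
from collections import defaultdict

ROOT = "/mnt/nas/projects"  # placeholder NAS mount to assess
DAY = 86400
buckets = defaultdict(int)  # age bucket -> total bytes
now = time.time()

for dirpath, _, filenames in os.walk(ROOT):
    for name in filenames:
        try:
            st = os.stat(os.path.join(dirpath, name))
        except OSError:
            continue  # skip files that vanish or deny access mid-scan
        age_days = (now - st.st_atime) / DAY
        if age_days > 365:
            buckets["cold (>1 year)"] += st.st_size
        elif age_days > 90:
            buckets["warm (90-365 days)"] += st.st_size
        else:
            buckets["hot (<90 days)"] += st.st_size

for label, size in buckets.items():
    print(f"{label}: {size / 1e12:.2f} TB")
```

A report like this makes it concrete which data sets are candidates for tiering, which for migration, and which for deletion.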

Pre-assessment of environment and network: Too often, migration performance is poor due to bottlenecks in the on-premises infrastructure and associated network settings. Some migration solutions provide a tool that runs standard tests to identify bottlenecks within your environment, which can fundamentally improve the success of your migration project. Read more about Komprise ACE.
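A crude version of such a pre-check can be scripted yourself: time the upload of a fixed-size test object and derive effective throughput. A minimal sketch with boto3; the bucket name is a placeholder:

```python
import os
import time
import boto3

s3 = boto3.client("s3")
payload = os.urandom(64 * 1024 * 1024)  # 64 MiB of random test data

start = time.monotonic()
s3.put_object(Bucket="my-test-bucket", Key="netcheck/64mib.bin", Body=payload)
elapsed = time.monotonic() - start

mbps = len(payload) * 8 / elapsed / 1e6
print(f"Uploaded 64 MiB in {elapsed:.1f}s ({mbps:.0f} Mbit/s effective)")
```

If the measured rate is far below your provisioned WAN bandwidth, investigate the on-prem storage, firewall, and proxy path before blaming the migration tool.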

Performance: Migrating large volumes of data, and especially lots of small files, to the cloud can be painfully slow over high-latency WANs, particularly if the migration depends on chatty network protocols like SMB to transfer data. Look for solutions designed to work over WANs and improve file transfer times, such as Komprise Hypertransfer. Network bandwidth limitations and outages can also impede migration performance, and some file attributes or metadata can be lost in the transfer process. Look for solutions that retry in the event of network issues and that perform a checksum test to ensure that all the bits of each file have been properly transferred. Read about Komprise Elastic Data Migration.
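As an illustration of those last two safeguards, the sketch below wraps an upload in retries with backoff and verifies the result by reading the object back and comparing SHA-256 digests. Bucket and paths are placeholders; reading the whole object back is expensive, but it makes the check unambiguous:

```python
import hashlib
import time
import boto3
from botocore.exceptions import ClientError, ConnectionError as BotoConnError

s3 = boto3.client("s3")

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def upload_verified(path, bucket, key, attempts=3):
    """Upload with retries, then read back and compare checksums."""
    digest = sha256_of(path)
    for attempt in range(1, attempts + 1):
        try:
            s3.upload_file(path, bucket, key)
            body = s3.get_object(Bucket=bucket, Key=key)["Body"]
            h = hashlib.sha256()
            for chunk in iter(lambda: body.read(1 << 20), b""):
                h.update(chunk)
            if h.hexdigest() == digest:
                return  # transfer verified bit for bit
            raise IOError("checksum mismatch after upload")
        except (ClientError, BotoConnError, IOError):
            if attempt == attempts:
                raise
            time.sleep(2 ** attempt)  # back off, then retry
```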

Security: If you migrate data over the network, you’ll want to ensure the data is encrypted in transit to prevent eavesdropping. In addition, it’s important to configure proper access controls once your data is in the cloud to prevent data leakage or exfiltration.
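On AWS, for example, encryption in transit can be enforced with a bucket policy that denies any request not made over TLS. A sketch using boto3; the bucket name is a placeholder:

```python
import json
import boto3

BUCKET = "my-archive-bucket"  # placeholder

# Deny any S3 request to this bucket that does not use TLS.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            f"arn:aws:s3:::{BUCKET}",
            f"arn:aws:s3:::{BUCKET}/*",
        ],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```

Pair a policy like this with least-privilege IAM roles so that migrated data cannot leak through overly broad read access.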

Cloud Tiering Considerations

Block vs. file-level tiering: Block-level tiering is ideal for system data such as snapshots but has drawbacks when tiering regular user and application data. Because files are stored as proprietary blocks, they cannot be accessed natively from the cloud. Also, when you need to replace the on-prem file system, all the data tiered from it must be rehydrated, which requires adequate capacity on the existing file server, followed by a migration of the rehydrated data to the new file server. Then you’ll need to tier the cold data back to the cloud. This can be daunting if you have tiered petabytes of data, and it will be expensive due to egress fees and cloud API costs.

File-level tiering, in contrast, tiers the entire file, which can then be accessed natively from the cloud for use in AI and other cloud applications. Rather than requiring rehydration, file-level tiering, available in Komprise, allows tiered files to be accessed from the new file server without re-hydrating all the tiered data. This is a huge advantage that should not be overlooked.

Transparency: Tiering should provide transparency so that users can access their data by simply double-clicking what appears to be the file on the on-prem file server, which then redirects to the location to which it was tiered. Transparency allows IT administrators to tier cold data automatically and continuously without disrupting users or making them hunt for data that has been moved. Learn more about Komprise Transparent Move Technology.
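Komprise implements this with its own standards-based technology, but the core idea of a transparent pointer can be illustrated with a plain symbolic link: move the file, then leave a link at the original path so access is redirected. A toy sketch, with local paths standing in for the cloud target:

```python
import os
import shutil

src = "/mnt/nas/projects/report.pdf"      # original on-prem path (placeholder)
dst = "/mnt/archive/projects/report.pdf"  # tiered location (placeholder)

# Move the cold file to the archive tier, then leave a link at the
# original path so users and applications still open it in place.
os.makedirs(os.path.dirname(dst), exist_ok=True)
shutil.move(src, dst)
os.symlink(dst, src)
```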

Bulk recall: When needed, tiering solutions should allow you to recall data en masse. If a project whose data has been tiered needs revision, rather than restoring files one by one as they are required, you should be able to recall all the files ahead of time for the best performance.
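In scripting terms, a bulk recall is a parallel prefetch of everything under a project’s prefix before work resumes. A sketch with boto3; bucket, prefix, and destination are placeholders, and objects in deep archive classes would first need a restore step:

```python
import os
from concurrent.futures import ThreadPoolExecutor
import boto3

s3 = boto3.client("s3")
BUCKET = "my-archive-bucket"         # placeholder
PREFIX = "archive/projects/apollo/"  # placeholder project prefix
DEST = "/mnt/nas/recalled"

def recall(key):
    local = os.path.join(DEST, key)
    os.makedirs(os.path.dirname(local), exist_ok=True)
    s3.download_file(BUCKET, key, local)

# List every object under the project prefix, then fetch in parallel.
keys = []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    keys.extend(obj["Key"] for obj in page.get("Contents", []))

with ThreadPoolExecutor(max_workers=16) as pool:
    pool.map(recall, keys)
```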

Conclusion

Cloud data migration is great if your goal is to reduce on-premises storage capacity, adopt new storage technologies, and invest in the more flexible, on-demand nature of cloud storage. Data tiering is better when you want to lower storage costs and capacity for data that you access infrequently but may still need to recall on-premises in the future.

Getting Started with Komprise:

Contact | Data Assessment