Cloud Migrations, Simplified!

Rajesh Dangi / August 06, 2018

Cloud Migrations

Digital transformation is driving cloudification, and cloud migrations are among the core activities every enterprise is grappling with. For a long time there was a debate about cloud adoption, but enough water has passed under the bridge: most enterprises now have some workload footprint in the cloud, and the rest are following suit to catch up. Cloudification was easier for those who started with cloud computing as their only choice, built all their applications and associated IT environments there, and had no baggage of legacy systems to worry about. For others, however, strategizing and migrating remains a humongous task. With this article I am attempting to organize a few thoughts and strategies about cloud migrations: the what's, how's and when's. (We agree the why is already answered!)

 

 

With their healthy mix of on-demand reliability, high availability, security and reduced operations costs, hybrid cloud implementations can be attractive. Going hybrid can sometimes give you the best of both worlds, and multi-cloud is already a reality according to the RightScale 2017 State of the Cloud report.

 

What is Cloud Migration?

 

Simply put, cloud migration is a program for deploying, partially or completely, an organization's digital assets, services, IT resources, tools or applications to the cloud from their current non-cloud environments, taking advantage of the consolidation, technology refresh and re-engineering opportunities that arise along the way. Depending on the use case and the complexity of the "workloads", the migration deserves careful planning, design, preparation, testing and execution.

 

The re-engineering opportunities can span processes and workflows, code creation, re-architecture, and the redesign of part or all of the landscape in which the workloads will reside in the cloud.

The approach, process, plan and resources required for migration thus depend on the use case and on the criteria applied, which cover the scope, schedule, cost/benefit and risk associated with the migration. The assessment and planning for workload categorization determine how much effort, time and cost are required, so this is a critical milestone well before the actual migration can begin.

 

What are the strategic imperatives and considerations involved in stages of Cloud Migration?

 

There are key stages in the cloud migration journey; I would call them the 9 A's of cloud migration.

 

Awareness - Establish an understanding of cloud services, their benefits and the available choices. The drivers can range from time-to-market, delivering new capabilities, platform upgrades and technology refresh to cost savings, ROI, operational efficiency, scalability and leveraging new technology frameworks. It is prudent for any organization to stay informed about the changing landscape of this competitive market and to choose the CSP (read, cloud service provider) wisely.

 

There are six key criteria that can help you assess the right cloud service provider:

  • Business health & company profile - the vendor's standing and long-term sustainability; after all, you have to run your business with them and will depend heavily on them for continuity of services.
  • Reliability & performance - check whether the provider has established, documented and proven processes for dealing with planned and unplanned downtime. They should have documented plans for how they will communicate with customers during disruptions, including timeliness, prioritization and severity-level assessment of issues.
  • Certifications and compliance - providers that comply with recognized standards and quality frameworks demonstrate adherence to industry best practices.
  • Technologies, product roadmap and services - check that the provider's platform and preferred technologies align with your current environment and/or support your cloud objectives, and that the choice of cloud services is at least on par with the industry and technology trends. Understand the provider's data loss and breach notification processes and ensure they are aligned with your organization's risk appetite and legal or regulatory obligations.
  • Service dependencies & partnerships - the provider's relationships with key vendors, their accreditation levels, technical capabilities and staff certifications.
  • Commercials & SLAs - predictable outgo, hidden charges if any, pay-as-you-go model, etc. Service level objectives (SLOs) typically cover accessibility, service availability (usually uptime as a percentage), service capacity (the upper limit in terms of users, connections, resources, etc.), response time and elasticity (how quickly changes can be accommodated). Check for valid exit terms in the contract to secure your data assets.
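
Since availability is usually quoted as an uptime percentage, it can help to translate that figure into the downtime it actually permits. The following is a minimal sketch of that arithmetic; the function name and the 30-day billing period are assumptions made for illustration and are not taken from any provider's SLA.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float = 30.0) -> float:
    """Return the downtime, in minutes, that an uptime percentage permits over a period.

    period_days defaults to 30, an assumed billing month; real SLAs define
    their own measurement window.
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - uptime_percent / 100.0)


# Compare a few commonly quoted availability levels.
for sla in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime permits about {minutes:.1f} minutes of downtime in 30 days")
```

Reading the SLA this way makes it easier to judge whether the permitted downtime fits the business continuity needs of the workload in question.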

 

Assessment - Clarity of purpose; identifying scope, source systems, data sets, risks and mitigations; high-level effort estimation; cloud models (public, private & hybrid); and CSP selection criteria.

  • Financial Assessment - it is not always about the cost, but cost is an important lever for decision making, with both internal and external facing implications. Not all CSPs have a predictable pricing model, and with pay-per-use models any unintended use or unattended resource will inflate the outgo, so usage needs to be governed tightly to remain within budget. Most public clouds operate on commodity pricing, and getting a picture of the running cost structure often requires expertise and professional services from external partners. There are cases where an entire program stalled because costs ballooned unexpectedly due to bad design or the subscription of services (read, hidden charges). In short, a good financial model for cloud services is essential, along with good governance of subscribed services.
  • Security & Compliance Assessment - Most CSPs offer a great range of security tools and services, but nothing comes by default. It is important to know that the CSP is responsible for security of the cloud and the customer is responsible for security in the cloud. Any workload you run in the cloud is your responsibility, and so are the required services such as backups, antivirus / endpoint security, SSL certificates, firewall policies and network configurations, to name a few. The CSP only takes care of the security of the hypervisors, cloud components, orchestration tools and management access of the cloud on which the customer deploys its workloads. The CSP will not have access to customer workloads unless managed services are subscribed, and even then it has a very limited role in the data, user administration and access profiles.
  • Technical and Functional Assessment - Oversight and risk ownership are a mandate for management; to have transparent visibility of risk and draw up a mitigation strategy, one must clearly understand the risks and their consequences. The key areas are awareness, assessment and establishing accountability. Variations exist depending on the service model (SaaS, PaaS or IaaS) and the cloud deployment model (public, private, hybrid or multi-cloud). In any case, a cloud-consuming business needs to be aware of how risk varies within each cloud model and remains accountable for risk and security regardless of the cloud model or the contractual obligations of the cloud service provider.
  • Vendor Assessment - as mentioned above, a detailed assessment is required for vendor / service selection, and validating that the terms and conditions align with your needs is a must.
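
To make the point about unattended resources concrete, here is a small sketch of a pay-per-use cost model. Everything in it is hypothetical: the `Resource` fields, the resource names and the hourly rates are illustrative assumptions, not any CSP's actual pricing or billing API.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    hourly_rate: float    # hypothetical price per running hour
    hours_running: float  # hours the resource was billed for in the month
    hours_needed: float   # hours the workload actually needed it


def monthly_cost(resources):
    """Total pay-per-use bill: every running hour is charged, needed or not."""
    return sum(r.hourly_rate * r.hours_running for r in resources)


def unattended_cost(resources):
    """Portion of the bill caused by resources left running beyond what was needed."""
    return sum(
        r.hourly_rate * max(r.hours_running - r.hours_needed, 0.0)
        for r in resources
    )


# Hypothetical estate: one production server and one forgotten test VM.
estate = [
    Resource("web-server", hourly_rate=0.10, hours_running=720, hours_needed=720),
    Resource("test-vm", hourly_rate=0.20, hours_running=720, hours_needed=80),
]
print(f"Monthly bill: {monthly_cost(estate):.2f}")
print(f"Of which unattended: {unattended_cost(estate):.2f}")
```

Even a rough model like this, fed with real usage data, gives governance teams a number to track instead of a surprise on the invoice.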

 

Approach & Alternatives - Migration strategy, cost benefit and technology refresh opportunities, usability, risk tolerance, differentiation, training needs, preparing the business case with cost outlay, key measures, implementation planning, etc. A few alternative approaches, based on the Gartner framework with some additional R's of my own, are:

 

  • Re-Host - on IaaS: managed infrastructure with no changes to the application; instance-level migration or host cloning, minimum downtime and low risk; only the underlying platform / OS changes.
  • Refactor - on PaaS: the application is made cloud ready through re-engineering; coding effort is required for the cloud, which means significant change, development effort and cost. The target is a "backward-compatible" PaaS, so existing programming models, languages and frameworks can be used and extended.
  • Revise - for PaaS or IaaS: modify or extend the existing codebase to help port the application for Re-Host or Refactor. It brings the benefits of adopting new technologies, containerization, etc., but significant cost and effort might be required, since the entire business logic and database structure gets overhauled.
  • Rebuild - application to run on a new platform or framework: re-engineering in which the code must be recompiled for a different platform / OS or database, and the application re-architected for a new container (J2EE), etc.
  • Replace - with SaaS (Software as a Service): no changes to application functionality; move from own hardware to a fully managed service provider cloud through data replication and cutover. Exit barriers apply: reverse migration is not possible, which means vendor lock-in.
  • Retire - rebuild the entire environment on a new cloud stack, port only the data from the current setup, and then retire the current setup.
  • Render - port only specific application services or modules from the legacy setup to the cloud rather than the entire OS-level workload. This is serverless computing, in line with Rebuild but going further by leveraging emerging trends in application modernization across the cloud computing spectrum.

Apart from these approaches, many tactical techniques such as bubble setups, mobile storage and migration prerequisites play a major part in the migration journey. If the setup has multiple themes running for different combinations of applications, hosts and databases, the complexities will differ from one use case to another.

 

For example, consider workloads running in parallel in the existing datacenter and a remote cloud setup: all the dependent services must coexist and operate seamlessly at both locations and across all levels. This is a tough task. DNS, DHCP, IP subnets, NTP, FTP, FQDNs and VLANs each demand attention to detail, and they must be planned meticulously and tested well before the actual migrations commence. Test workloads should be created and checked individually to confirm they work seamlessly with the existing setup. Rollback is only possible if granular control and activity isolation are applied. Database and storage migrations take longer and work best if the data is already synchronized and live at both locations, so traffic hitting them from both locations must be tested successfully.

 

The network and application teams must work in tandem, and any references to hard-coded IP addresses, VLAN tagging, etc. must become software driven. Using FQDNs rather than IP addresses to connect app and web servers is recommended; VIPs (read, virtual IPs) for automated load balancing, with host resolution via DNS, make life easier when a few workloads operate from the cloud and a few from the local datacenter or another cloud provider. Security policies and controls must be deployed in advance so that all workloads have uniform policies, with access management run as a unified service, ensuring that only the right personnel make changes, and only to authorized systems and resources, to safeguard datasets and configurations before, during and after migration. All these aspects must be discussed, drafted in the plan and presented to all stakeholders along with a RACI matrix to set the right expectations and secure support.
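
One practical way to start on the hard-coded IP problem is to scan configuration files for IPv4 literals that should become FQDNs. The sketch below is one possible approach using only the Python standard library; the function name and the sample configuration are made up for illustration, and the pattern will also flag four-part version strings such as 1.2.3.4, so findings need a human review.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 digits; validity is checked separately below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_hardcoded_ips(text):
    """Return (line_number, address) pairs for every valid IPv4 literal in text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for candidate in IPV4_CANDIDATE.findall(line):
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # out-of-range octets such as 999.1.1.1 are not addresses
            findings.append((line_number, candidate))
    return findings


# Hypothetical application configuration used only for this example.
sample_config = """db_host = 10.20.30.40
app_url = https://app.example.internal/api
ntp_server = 999.1.1.1
"""
for line_number, address in find_hardcoded_ips(sample_config):
    print(f"line {line_number}: replace {address} with an FQDN")
```

Running a scan like this across application and web server configurations before the migration gives both teams a shared list of references to convert.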

 

Acceptance - The high-level stages involve scope sign-off, risk acceptance, budget allocation, resource mobilization, governance framework, WBS (work breakdown structure), POA (plan of action), vendor negotiations and Go/No-Go.

 

All of these tenets must be dwelt upon before making an informed decision and commencing the work. Typically very few migrations are completed within the estimated cost, effort and time, since multiple activities are involved and there can be surprises on compatibility and integration. Even in lift-and-shift operations, data migration over the internet, conversion of physical to virtual workloads, the capacity of the customer's own infrastructure (many organizations still run on 1G ports, which limits data transfer) and many such nitty-gritties cause overruns unless a very detailed WBS is created and validated on these aspects. Many cloud providers do not have their own teams to help execute migrations and advise customers to use professional services, so any delay due to bottlenecks will escalate cost and timelines.
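
The effect of a 1G port on the schedule is easy to estimate up front. The following is a rough sketch of that calculation; the function name is hypothetical, and the default efficiency factor of 0.7 is purely an assumption standing in for protocol overhead and link contention, which should be replaced with a measured figure.

```python
def transfer_time_hours(data_gb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb gigabytes over a link_gbps link.

    efficiency is the assumed fraction of the nominal bandwidth that is
    actually usable for the migration traffic (0 < efficiency <= 1).
    """
    if data_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data size, link speed or efficiency")
    data_gigabits = data_gb * 8  # bytes to bits
    seconds = data_gigabits / (link_gbps * efficiency)
    return seconds / 3600


# Example: 10 TB (10,000 GB) over a 1 Gbps port versus a 10 Gbps port.
for link in (1.0, 10.0):
    hours = transfer_time_hours(10_000, link)
    print(f"{link:g} Gbps link: about {hours:.1f} hours")
```

An estimate like this belongs in the WBS, because it shows early whether the data can realistically move within the planned migration window or whether alternatives such as mobile storage are needed.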

 

Adoption - Contract award, BoM & SoW, training & upskilling, technicalities of user access, integration / API gateways, prerequisite services and their security.

 

Technical blueprinting, deployment and security architecture, feature validation and risk handling. A systematic approach to application, system and security architecture, pre and post migration, helps capture all the dependencies, changes and prerequisite validations discussed above. The activities involved must be carried out by the application and IT teams together and debated to surface weak links in the assumptions. Any external agency delivering value as part of the workflow must also validate its part, to avoid last-minute "who will do what" questions during execution.

 

Acclimatization - Pre-production POCs / test setups of a few services, users and functionalities; integration validation and change management; UAT. A POC that resembles production helps visualise the functionality, and a UAT sign-off from the person accountable provides confidence. This step is often skipped out of overconfidence, which can prove risky during implementation. A good POC period lets the setup prove it can maintain performance across a range of environmental conditions before actual production workloads are migrated. This might not be applicable to forklift operations but is still recommended, since the environment is new and the toolsets differ between a legacy on-premise data center and the cloud. Written acceptance criteria and a UAT checklist will help validation and sign-off, resting accountability with identified stakeholders.

 

Action (read, actual migration) - Application virtualization; migrating data, services and users; monitoring utilization and availability; feedback and fine tuning; final UAT; failover to cloud. This is the actual phase of migration, in which the workloads and data are moved. In most cases the seeding of data replication is set up well in advance, and a cutover is planned once the databases are synchronized; the static images or templates of application and web instances are made ready but only activated when the databases are up and running in the cloud. Let us look at three aspects of migration: application workloads, database workloads, and file server or storage migrations.

  • Application Workloads - typically static, as only patches and upgrades change the workload dynamics. To move these workloads to the cloud we need to look at VM conversions. There are multiple methods and tools to migrate workloads to the cloud, but basically two types of conversion are considered: P2V (physical to virtual) and V2V (virtual to virtual).
  • P2V - a physical-to-virtual conversion tool creates a virtual machine in a virtual host environment and copies all the OS, application and data files from the source running on bare metal servers. One should check the compatibility matrix of the P2V tool, though most of them support all Wintel platforms. During conversion it is wise to shut down running databases and applications, and even the antivirus, as they can create issues, and delta changes held in memory might not get captured in the target image. Mounted storage volumes might also require a separate block replication to the destination, since the source OS has a limited view of data volumes; for example, if you use GPT on the disk partitions, even the VMware converter will not be able to see those volumes on a Windows OS. Remember, some legacy hardware or applications simply cannot be emulated in a virtual infrastructure running on the cloud. A P2V migration of a DOS-based application package running on 16-bit hardware, for instance, will not work, because none of the major virtualization platforms support 16-bit guest OSes anymore.
  • V2V - virtual to virtual is the approach when workloads are already running on a virtualized platform and are destined to migrate to the cloud. This is simple enough if the source and destination hypervisors are the same and use the same disk format. Disk formats differ from one hypervisor to another, however, and that is where it becomes challenging: portability is limited, since most disk formats do not convert easily and may require multiple conversions. VM images must be converted between the incompatible virtual disk formats. Tool vendors that support the Open Virtualization Format (OVF) generally do so only for virtual appliances, so typical hypervisors expect to handle proprietary formats such as QED (KVM), QCOW2 (KVM, Xen), VHD (Hyper-V) and VMDK (VMware).
  • Hypervisor replication - most advanced hypervisors provide the ability to replicate between two locations. If the source location and the cloud location are managed by a single hypervisor instance (a rare case), replication of workloads can be set up and failovers triggered automatically. Many tools on the market work at the hypervisor level for granular control, but their use is limited because most cloud providers will not allow administrative access at the hypervisor level on public or shared-resource clouds. This will only work if the CSP has a multitenant cloud and allows API calls to the cloud from third-party tools at remote locations. It is the best-suited strategy if control of the orchestration and hypervisor-level admin access is available and can be tested well in advance.
  • Database migrations - there are multiple approaches, and the approach and the duration of the migration window differ with the size and type of database. For legacy databases, the destination cloud setup is often made available well in advance and replication is set up using native tools to replicate the data between the two instances in real time. If the databases are distributed systems by design, ensure as a prerequisite that the clusters, data nodes and gateway nodes are spun up in the destination cloud and rebalanced well in advance. In any case, all database services must be running in the target location before application or web services start; this ensures the right sequencing, reduces the risk of data corruption and maintains consistency.
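
The sequencing rule above (databases first, then application, then web services) can be expressed as a dependency graph and ordered automatically. This is a minimal sketch using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the service names and the `startup_order` helper are hypothetical and only illustrate the idea.

```python
from graphlib import CycleError, TopologicalSorter


def startup_order(dependencies):
    """Return services in an order where every dependency starts before its dependents.

    dependencies maps each service to the set of services that must already
    be running before it can start.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical cutover plan: web tier needs the app tier, which needs the database.
cutover_plan = {
    "database": set(),
    "app-server": {"database"},
    "web-server": {"app-server"},
}

try:
    for step, service in enumerate(startup_order(cutover_plan), start=1):
        print(f"step {step}: start {service}")
except CycleError as error:
    # A cycle means two services each wait on the other; the plan needs rework.
    print(f"dependency cycle detected: {error}")
```

For larger estates the same structure doubles as a checklist for the cutover runbook, and a detected cycle is an early warning that the dependency mapping is wrong.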

 

Abstraction - Resource pooling, server & storage consolidation, application re-engineering, app containerization, etc. Cloud computing is fundamentally an abstraction based on the notion of pooling physical / bare metal resources and presenting them as virtual resources to the workloads consuming them. It logically separates the workloads from the physical hardware and enables dynamic allocation, consolidation and thus abstraction of infrastructure resources such as compute, memory, storage, network and bandwidth, down to instance-level allocation of services. It is crucial to envisage the end state of an application, and even to re-engineer the application architecture to become cloud native, adopting modern frameworks, technologies and code reviews to harness the benefits of cloud and distributed architecture at the design level. Microservices and serverless computing trends are driving application architecture into a completely new paradigm, and porting the business logic directly onto the cloud provides an opportunity to rethink, redesign and render applications and their ecosystem in their entirety.

 

Automation - Policy-based provisioning, optimization, auto scaling, failovers, business continuity, self-service, etc. Cloud benefits are proven and adoption is increasing day by day. Digital transformation demands agility, availability and a kind of self-healing ability that is delivered via horizontal and vertical scaling of resources and services, enabling multi-location, multi-tenant and multi-mode operations with the native capabilities of cloud offerings. Automation in technologically advanced streams such as artificial intelligence, the Internet of Things and big data all rests on the principle of distributed, always-on cloud resources and CI/CD development environments. This is driving adoption as well as pushing the features and functionalities of cloud orchestration and abstraction beyond imagination.
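
To give a flavour of what a policy-based horizontal scaling rule looks like, here is a simplified sketch that sizes an instance pool so that average CPU utilization moves toward a target. It is not any provider's auto scaling API: the function name, the 60% target and the instance limits are all assumptions chosen for illustration.

```python
def desired_instances(
    current: int,
    cpu_percent: int,
    target_percent: int = 60,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Return the instance count that would bring average CPU close to the target.

    Uses integer ceiling division so that a pool is never sized below what
    the observed load requires, then clamps the result to the policy limits.
    """
    if current < 1 or target_percent <= 0 or cpu_percent < 0:
        raise ValueError("current and target_percent must be positive, cpu_percent non-negative")
    wanted = -(-current * cpu_percent // target_percent)  # ceil(current * cpu / target)
    return max(min_instances, min(max_instances, wanted))


# Four instances running hot at 90% CPU should scale out; at 30% they can scale in.
print(desired_instances(current=4, cpu_percent=90))
print(desired_instances(current=4, cpu_percent=30))
```

The minimum and maximum limits are what keep such a policy within both the availability requirement and the budget, which ties automation back to the financial governance discussed earlier.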

 

Conclusion

 

In summary, a good cloud migration strategy provides greater business benefits through efficiency, performance and scalability, and reduces the management burden and clutter of service management, simplifying the usability of applications for consumers. The value proposition of cloud still depends on the approach and on the organization's risk-taking ability to fast forward technology adoption; it goes beyond lowering costs, ease of utilization, quality and reliability of services, maintainability and simplified pay-per-use opex models. To reap these benefits, one must start by looking at what needs to be carried to the cloud and what can be gracefully discarded, building alternatives on the cloud and taking calculated risks while balancing the data portability and security of the organization's data assets!