Thursday, October 21, 2010

Shortening the Distance from There to Here - The Benefits of Virtualization

A large corporation has a development team in India. An application developer there needs to see the latest revisions to the data warehouse design in order to finalize a new web portal. Unfortunately, the model containing the design was not saved to the correct shared drive before the US team members left for the night. The India team does not have direct access to the data warehouse, so the project is delayed for another day.

A corporation has downsized and is now forced to do more with less. A newly reduced staff of technicians on the East Coast requires an expert data architect with Teradata experience. A perfect candidate exists within the corporation, one whose Los Angeles office was recently closed. Unfortunately, attempts to integrate this team member prove inefficient and add too many steps between design and implementation, so the company is unable to take advantage of its asset.

Due to a merger, two teams are attempting to consolidate their system processes. However, because team members use different operating systems, they are unable to collaborate using the same software tools.

These are a few simple examples of situations that can be ameliorated by virtualizing infrastructure. As our enterprises become more and more geographically disparate and a 24-hour work cycle becomes more commonplace, the question becomes how best to merge our processes and assets. Teams implementing a repository-based modeling solution are ideal candidates for this approach.

Users of the CA ERwin/Model Manager suite get a dynamic and customizable data modeling tool with more robust features than any comparable product on the market. This solution includes a repository for model storage and global reporting. The repository allows for complex three-way model merges as well as model lineage and history tracking. The trade-off for this complexity, however, is performance.

Anyone using Model Manager on a geographically dispersed team has dealt with the issues that arise when merging models: team members remote to the repository server send data to the server, await verification of synchronized and divergent objects, save the appropriate changes, and receive information back for difference reconciliation. This back-and-forth traffic to remote users can run into many bottlenecks.

Oftentimes, these remote users are accessing the network via VPN, or the data is passing through multiple subnets. Meanwhile, users on the local network may have to wait for those operations to complete before saving their own recent changes. This leads to a cascading effect of performance issues as the queue of users awaiting server access grows longer. Worse yet, this ever-lengthening delay increases the likelihood of a network or server failure, in which case any unsaved model changes are lost.

While it would be possible to fine-tune every step along this complex network to improve the movement of data from one subnet to another, virtualization provides a more elegant solution. It also brings benefits that go beyond raw performance.

In a virtualized environment, the server and the virtual desktops reside on a single physical server. Since the client and server components run local to each other, users see huge improvements in performance. There is a compounding effect as each model merge executes rapidly, minimizing the queue of demands on the network. Previous workarounds, such as saving files locally for a later merge or scheduling explicit save times for users, can also be retired, giving a truer picture of the project at any time.
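To make this concrete, here is a minimal sketch, assuming a VirtualBox host and hypothetical VM names (the hypervisor and naming are illustrative, not something prescribed above): putting the repository server VM and the modelers' desktop VMs on the same host-only network means their merge traffic never leaves the physical machine.

    import subprocess

    # Hypothetical VM names for the repository server and modeler desktops.
    VMS = ["mm-repository-server", "modeler-desktop-01", "modeler-desktop-02"]

    def attach_to_hostonly_network(vm: str, adapter: str = "vboxnet0") -> None:
        """Put the VM's first NIC on the host-only network so client/server
        traffic stays inside the physical server (VM must be powered off)."""
        subprocess.run(
            ["VBoxManage", "modifyvm", vm,
             "--nic1", "hostonly",
             "--hostonlyadapter1", adapter],
            check=True,
        )

    if __name__ == "__main__":
        for vm in VMS:
            attach_to_hostonly_network(vm)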

But these are only some of the benefits. Any network has to contend with inconsistent performance and the risk of data loss. When an application accesses a database, packet delivery can fail at the database server, at any bridge along the network, or over the VPN connection. A failure at any of these points can crash the software and lose data if the current model changes have not been saved.

Once the components are virtualized, a network failure no longer leads to data loss; the remote user simply reconnects to the virtual environment and picks up right where they left off.

But wait, there’s more! Containing the entire infrastructure on a single physical server makes backup and restore for disaster recovery possible as a single step. Depending on the frequency of our backups, we can ensure that no more than a few minutes of work is lost.
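As a rough sketch of how simple that single step can be (again assuming a VirtualBox-style hypervisor and a hypothetical all-in-one VM name; the article does not prescribe a platform), a scheduled job could take a live snapshot of the whole appliance every few minutes:

    import subprocess
    from datetime import datetime

    APPLIANCE_VM = "erwin-mm-appliance"  # hypothetical all-in-one VM

    def snapshot_appliance(vm: str = APPLIANCE_VM) -> str:
        """Take a live snapshot of the entire appliance -- database,
        repository, and client images -- in one step."""
        name = "auto-" + datetime.now().strftime("%Y%m%d-%H%M%S")
        subprocess.run(
            ["VBoxManage", "snapshot", vm, "take", name, "--live"],
            check=True,
        )
        return name

    if __name__ == "__main__":
        print("Created snapshot:", snapshot_appliance())

Run from a cron job or scheduled task, this bounds the work that could be lost to the snapshot interval.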

Additionally, multiple users can share the same login at different points in the design process. Suppose one person does the modeling during the data integration phase of a data management initiative, while another takes over modeling during the data warehouse design phase. By simply revoking one user’s network access to the image and granting it to another, we can maintain our design flow with fewer licenses. Consultants working on one phase of a project can be seamlessly replaced by another group of users. In effect, this provides the functionality of floating licenses.

Virtualization also helps as the data management initiative progresses. Upgrades to the database, repository, and client software can all be managed by a single administrator on a single device. No longer will many users be running multiple versions of the software on dissimilar operating systems.

Similarly, scaling upwards would simply require upgrading a single physical server or adding a second server on a shared subnet. No longer will multiple users in different offices need to add more RAM to their individual environments. Even a lightweight laptop on an unstable Wi-Fi connection in an airport can request massive processing on a remote server, since the laptop behaves like a console. A user can quickly disconnect, go through security, and reconnect to find their project exactly where they left off. There would no longer be any reason to keep model files saved on remote PCs.
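As one more illustration (same assumptions as before: a VirtualBox-style host and a hypothetical VM name), scaling up means granting the single server VM more memory and CPUs rather than touching dozens of individual desktops:

    import subprocess

    def scale_up(vm: str, memory_mb: int, cpus: int) -> None:
        """Resize a powered-off VM; every connected modeler benefits
        the next time the appliance is started."""
        subprocess.run(
            ["VBoxManage", "modifyvm", vm,
             "--memory", str(memory_mb),
             "--cpus", str(cpus)],
            check=True,
        )

    if __name__ == "__main__":
        # Hypothetical upgrade: take the appliance to 16 GB of RAM and 8 vCPUs.
        scale_up("erwin-mm-appliance", memory_mb=16384, cpus=8)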

As the complexity of our data continues to grow unabated along with our ever-expanding enterprises in this flattened business world, virtualizing the infrastructure of these processes and containing them as independent, easily scaled appliances holds more and more value. The need for our businesses to become more agile without significant new resources grows ever more pressing. In a world where we must learn to do more with less, here is an opportunity to actually improve performance and scalability while simplifying our business processes.