21 November 2012

Isilon vs. SONAS Part 4: Migration

# Update, April 2014: the following article lacks current information on isi_vol_copy. isi_vol_copy is part of OneFS and allows volume-based migration from NetApp and EMC NAS systems. It's very fast and handles ACLs etc. well. I'll do a later post on this, I just cannot find the time to play with it at the moment.

One of the challenges when implementing a large-scale NAS system is the migration of existing data to the new system. To my knowledge SONAS doesn’t provide any special tool that supports the migration, while Isilon typically uses the EMC Cloud Tiering Appliance (CTA) for migrations. CTA’s primary use case is file-based tiering in NAS environments with various targets such as other NAS systems or object storage like Atmos or Amazon’s S3 cloud storage. However, here I will focus on how CTA supports data migrations. The CTA is available as a hardware appliance or as a virtual machine for a VMware deployment.
In general there are four areas to consider when planning the migration to an Isilon or SONAS system:

1.    Authentication (UID/GID/SID)
2.    File Meta Data
3.    Files
4.    Shares


Authentication and UID/SID translation. Users need to be authenticated in one way or another. SONAS supports Microsoft Active Directory (AD) as well as LDAP, while Isilon supports AD, LDAP, local user/group authentication, local file databases and NIS. Special care needs to be taken when migrating systems with different authentication sources and when users have different UIDs/GIDs or SIDs, for example when the data comes from filesystems with local user authentication. In that case a consolidation is required that results in a single data source with unique UIDs/GIDs and/or SIDs. Ideally this is done before the data migration takes place, because otherwise a translation is required while data is migrated from source to target.
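To make the consolidation problem a bit more concrete, here is a minimal sketch that scans the user tables of several sources and reports the two kinds of collisions that need to be resolved: the same user name with different UIDs, and the same UID used by different user names. The source names, user names and UID values are made up for this example, and the function is not part of any SONAS or Isilon tooling.

```python
from collections import defaultdict


def find_id_conflicts(sources):
    """Report UID collisions across authentication sources.

    sources: dict of source name -> dict of user name -> UID.
    Returns (same_name_diff_uid, same_uid_diff_name).
    """
    uids_by_name = defaultdict(set)
    names_by_uid = defaultdict(set)
    for users in sources.values():
        for name, uid in users.items():
            uids_by_name[name].add(uid)
            names_by_uid[uid].add(name)

    # A user name that maps to more than one UID needs to be consolidated.
    same_name_diff_uid = {
        name: sorted(uids) for name, uids in uids_by_name.items() if len(uids) > 1
    }
    # A UID shared by different user names would mix up file ownership.
    same_uid_diff_name = {
        uid: sorted(names) for uid, names in names_by_uid.items() if len(names) > 1
    }
    return same_name_diff_uid, same_uid_diff_name


# Hypothetical local user tables of two source filers.
filer_a = {"alice": 1001, "bob": 1002}
filer_b = {"alice": 2001, "carol": 1002}

name_conflicts, uid_conflicts = find_id_conflicts(
    {"filer_a": filer_a, "filer_b": filer_b}
)
print(name_conflicts)
print(uid_conflicts)
```

Every entry in either result is a decision to make before (or a translation to apply during) the data copy. The same idea applies to GIDs.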

SONAS doesn’t provide special tooling for this, while the Cloud Tiering Appliance has SID translation capabilities, at least for Windows. The SID translator is a tool (available on Powerlink) that is installed on a Windows server and helps with the automatic creation of a mapping table in the following way: if the source server contains a local user User1, it will create a User1 on the destination server or in AD with a new unique SID. This mapping is then written to a text file and loaded into the CTA before the actual data migration is kicked off. If the CTA then hits a file during the migration that belongs to User1 on the source system, it copies all ACLs and writes them to the destination server using the new SID. User1 on the destination system now has a new SID, but the effective security is the same. This is an elegant way to perform the consolidation for the Windows world.

 
Figure 1: CTA SID translation
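The idea behind the mapping table can be sketched in a few lines. Note that this is only an illustration of the principle: the "old SID, new SID" text format, the SID values and the function names are my own assumptions and do not reflect the actual file format or API of the EMC SID translator or the CTA.

```python
def load_sid_map(text):
    """Parse a hypothetical mapping file: one 'old_sid, new_sid' pair per line."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        old_sid, new_sid = (part.strip() for part in line.split(",", 1))
        mapping[old_sid] = new_sid
    return mapping


def translate_acl(acl, sid_map):
    """Rewrite the SIDs of an ACL given as a list of (sid, permission) tuples.

    SIDs without a mapping entry (for example well-known or domain SIDs)
    are passed through unchanged.
    """
    return [(sid_map.get(sid, sid), permission) for sid, permission in acl]


# Made-up example: a local User1 on the source gets a new SID on the destination.
MAPPING_TEXT = """
# old (source, local) SID, new (destination) SID
S-1-5-21-111-222-333-1001, S-1-5-21-999-888-777-5001
"""

sid_map = load_sid_map(MAPPING_TEXT)
source_acl = [
    ("S-1-5-21-111-222-333-1001", "FULL_CONTROL"),  # local User1
    ("S-1-1-0", "READ"),                            # well-known SID "Everyone"
]
dest_acl = translate_acl(source_acl, sid_map)
print(dest_acl)
```

The permissions stay untouched and only the identity behind each entry changes, which is why the effective security on the destination is the same.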

File Meta Data such as POSIX bits and ACLs for NFS and CIFS needs to be copied and possibly modified according to the new environment. For UNIX this can be achieved with tools such as cp and tar, while for Windows environments robocopy is a commonly used tool. In mixed environments Windows tools should be used because Windows ACLs are more complex than POSIX bits. Many available tools for mixed environments are based on the Network Data Management Protocol, NDMP (so is CTA), which is capable of copying ACLs and extended attributes as well as POSIX bits correctly.
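Whatever tool does the copy, it is worth verifying afterwards that the metadata really arrived. The following sketch compares the basic POSIX attributes (mode bits, owner, group, modification time) of a source and a destination file. The helper names are my own, the file names are throwaway examples, and ACLs and extended attributes are deliberately not covered here. The demo uses Python's shutil.copy2, which preserves mode bits and timestamps but does not change ownership.

```python
import os
import shutil
import stat
import tempfile


def posix_metadata(path):
    """Return the basic POSIX attributes of a file (no ACLs, no xattrs)."""
    st = os.stat(path)
    return {
        "mode": stat.S_IMODE(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mtime": int(st.st_mtime),  # whole seconds are enough for this check
    }


def metadata_diff(src, dst):
    """Return {attribute: (source value, destination value)} for mismatches."""
    a, b = posix_metadata(src), posix_metadata(dst)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


# Small self-contained demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    dst = os.path.join(tmp, "dst.txt")
    with open(src, "w") as f:
        f.write("data")
    os.chmod(src, 0o640)
    os.utime(src, (1_000_000_000, 1_000_000_000))

    shutil.copy2(src, dst)  # copies content, mode bits and timestamps
    diff = metadata_diff(src, dst)
    print(diff)
```

An empty result means the basic attributes match. In a real migration across systems the uid and gid entries are the ones to watch, because they depend on the identity consolidation described above.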

Files: This is probably the most straightforward part but at the same time, depending on the amount of data, the most time-consuming one. A good strategy for migration is to use snapshots of SONAS or OneFS to migrate a large portion of the data from the snapshot while production can continue to read and write from the original filesystem. Once the snapshot data has been copied completely, a new snapshot is taken and a new copy process copies the data that has been modified in the meantime. This process continues until the deltas are small enough to copy the remaining changes and switch over to the new system. This of course requires a short interruption of production.

For the migration to an Isilon system, the CTA can be used to automate that process if the source system is a VNX, Celerra, NetApp or Windows host. Supported target systems for the automated migration are Isilon, VNX, VNXe and Celerra. Why is CTA limited to these systems? Because CTA relies on vendor-specific interfaces, for example the FileMover API for Distributed Hierarchical Storage Management (DHSM). This way CTA can perform a stub-aware migration that doesn’t require stubs to be recalled before the migration. The XML API (VNX/Celerra) and ONTAPI (NetApp filers) are also used to automate the migration process.
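The iterative snapshot strategy can be simulated with plain dictionaries. This is only a toy model under my own assumptions: a dict stands in for a filesystem (path to content), `dict(source)` stands in for a snapshot, and the list of "writes between passes" simulates production changes during each copy run. It does not handle deletes or renames and has nothing to do with how CTA is implemented.

```python
def changed_paths(snapshot, target):
    """Paths whose content is missing on, or differs from, the target."""
    return [path for path, data in snapshot.items() if target.get(path) != data]


def migrate(source, target, simulated_writes, cutover_threshold=1):
    """Copy from successive snapshots until the delta is small enough to cut over.

    Returns the number of online copy passes before the final cutover.
    """
    passes = 0
    writes = iter(simulated_writes)
    while True:
        snapshot = dict(source)  # point-in-time view; production keeps using `source`
        delta = changed_paths(snapshot, target)
        if len(delta) <= cutover_threshold:
            break  # small enough: stop production and do the final sync
        for path in delta:
            target[path] = snapshot[path]
        passes += 1
        # Production wrote to the source while the copy was running.
        source.update(next(writes, {}))

    # Cutover: production is stopped, copy whatever is still different.
    for path in changed_paths(source, target):
        target[path] = source[path]
    return passes


source = {"a": 1, "b": 2, "c": 3}
target = {}
# Hypothetical production writes during the first and second copy pass.
simulated_writes = [{"a": 10, "d": 4}, {"b": 20}]

passes = migrate(source, target, simulated_writes)
print(passes, target == source)
```

The point of the loop is that each pass only has to move what changed since the previous snapshot, so the final pass during the production interruption is short.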

The following steps outline the migration of files from a NetApp filer to an Isilon system (VNX to Isilon is quite similar):

  1. A new snapshot of the source dataset is created on the source filer or vFiler.
  2. The destination is scanned and compared against the source snapshot to synchronize renames and deletes for incremental copies.
  3. An NDMP connection is initiated to the source and destination. The first run of the task is a full copy, all subsequent runs are incremental copies based on the last successful copy date.
  4. The snapshot of the source is dumped to the NDMP connection and the migration policy is run against the snapshot.
  5. A file matching the migration policy is read from the snapshot. If the file is a stub file, the following steps are taken:
    a.) The stub file is converted to the Celerra format based on the stub contents and the CTA configuration.
    b.) The required DHSM connection is verified on the destination. If it is not properly configured or missing, the stub file is not copied.
  6. If a file exists under the same path with the same name on the destination, it will be overwritten if one of the following is true:
    a.) The source file 'last modified' timestamp is newer than the destination file timestamp
    b.) The source and destination 'last modified' timestamps are equal but the destination file size is smaller
  7. If the destination file 'last modified' timestamp is newer than the source file, it is not overwritten and the source file is not migrated.
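The overwrite rules in steps 6 and 7 boil down to a small decision function. The sketch below is my own restatement of those rules, not CTA code. The FileInfo type and the function name are hypothetical, and I assume that "smaller" in step 6b means smaller than the source file and that a file missing on the destination is simply copied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    mtime: float  # 'last modified' timestamp in seconds
    size: int     # file size in bytes


def should_copy(src: FileInfo, dst: Optional[FileInfo]) -> bool:
    """Decide whether the source file is written to the destination."""
    if dst is None:
        return True  # assumption: nothing on the destination yet, so copy
    if src.mtime > dst.mtime:
        return True  # step 6a: source is newer
    if src.mtime == dst.mtime and dst.size < src.size:
        return True  # step 6b: same timestamp, destination is smaller
    return False     # step 7 (destination newer) and all remaining cases


# A few example cases with made-up timestamps and sizes.
newer_source = should_copy(FileInfo(200.0, 10), FileInfo(100.0, 10))
short_destination = should_copy(FileInfo(100.0, 10), FileInfo(100.0, 4))
newer_destination = should_copy(FileInfo(100.0, 10), FileInfo(200.0, 10))
identical = should_copy(FileInfo(100.0, 10), FileInfo(100.0, 10))
print(newer_source, short_destination, newer_destination, identical)
```

Rule 6b is what catches a partially written file from an interrupted earlier run: same timestamp, but the destination copy is shorter.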

CIFS Shares and NFS Exports
The CTA also migrates CIFS shares and NFS exports. For CIFS the CTA requires administrative access to the filer(s) that own the source data. For NFS the CTA must be given root and read/write permissions on source and target.

Summary
Migrating NAS filers can be much more complex than block storage migration. There is no general method or tool that covers all scenarios, and migrations require more thought than just the file movement. I outlined the major aspects here: Authentication, Meta Data, Files and Shares/Exports. Sometimes it can be hard work, especially if you need to consolidate different authentication or user-admin sources. The considerations are quite similar for Isilon and SONAS. If you migrate to an Isilon system, the EMC Cloud Tiering Appliance helps to automate the migration process. It takes care of any required SID translations and can perform a stub-aware migration. For the file migration itself NDMP is used.


