SONAS vs. Isilon Part 6: Remote Replication
In this article we’ll look at the remote replication
implementations of both systems. The comparison is based on SONAS 1.4 and
Isilon OneFS 7.0. Both systems provide asynchronous replication with
configurable RPOs.
Use Cases
Asynchronous replication is typically used for large scale
filesystems for the following use cases:
- Disaster Recovery
- Business continuance
- Disk-to-Disk backup
- Remote disk archive
Although a zero RPO cannot be achieved using asynchronous replication, asynchronous replication is typically feasible for non-transactional data. The achievable RPO as well as the RTO depend on several factors, such as the change rate of the data, the available bandwidth, latency and others. In practice an RPO of a few minutes can be achieved (for example, in Isilon this can be set down to one minute, but whether that is realistic in practice depends on the other factors).
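To get a feeling for how change rate and bandwidth interact, here is a rough back-of-the-envelope sketch. It is a simplified model of my own and not how either product calculates anything: it assumes a fixed replication interval, a constant change rate and a constant usable link speed, and it ignores latency and per-file overhead. The function and parameter names are made up for illustration.

```python
def estimate_worst_case_rpo_seconds(change_bytes_per_s, link_bytes_per_s, interval_s):
    """Rough worst-case RPO for interval-based asynchronous replication.

    Assumption (not taken from any vendor documentation): a change made
    right after a replication cycle starts has to wait one full interval
    and then the transfer time of everything that changed in that interval.
    Returns None if the link cannot keep up with the change rate at all.
    """
    if change_bytes_per_s >= link_bytes_per_s:
        return None  # the backlog grows forever, no stable RPO
    changed_bytes = change_bytes_per_s * interval_s
    transfer_s = changed_bytes / link_bytes_per_s
    return interval_s + transfer_s


# 10 MB/s of changes, 100 MB/s of usable bandwidth, one-minute interval
print(estimate_worst_case_rpo_seconds(10_000_000, 100_000_000, 60))  # 66.0
```

Even this crude model shows why a one-minute schedule is only realistic if the link comfortably outruns the change rate.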
Performance
Both systems are capable of parallel replication, leveraging multiple nodes on source and target. On both systems the number of processes (SONAS) or worker threads (Isilon) can be modified to balance throughput against resource utilization. CPU and I/O throttling as well as sub-directory replication are supported only on Isilon.
Picture 1: Replication is done in parallel from all or multiple nodes (Isilon). SONAS uses multiple or all of its interface nodes to replicate.
For the differences in functionality please refer to the
performance section in the table below.
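To illustrate the general idea of spreading replication work over several workers, here is a minimal sketch. It is not how SONAS or Isilon actually schedule their processes or threads; it is a generic greedy balancing scheme (largest file first, always to the least loaded worker) and all names in it are hypothetical.

```python
import heapq


def partition_files(file_sizes, workers):
    """Greedily distribute files over workers so the byte load is balanced.

    file_sizes: dict mapping a file name to its size in bytes.
    workers:    number of workers (processes or threads) to fill.
    Returns one list of file names per worker.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    # Min-heap of (bytes assigned so far, worker index)
    heap = [(0, i) for i in range(workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(workers)]
    # Place the largest files first, each on the currently least loaded worker
    for name, size in sorted(file_sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, idx = heapq.heappop(heap)
        assignment[idx].append(name)
        heapq.heappush(heap, (load + size, idx))
    return assignment


print(partition_files({"a": 8, "b": 7, "c": 3, "d": 2}, workers=2))
```

Raising the worker count shortens the longest per-worker queue but uses more CPU and I/O on the nodes, which is the trade-off the tunables on both systems are meant to address.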
What gets replicated?
The list of things that one could think of replicating can be quite long. The basic thing that needs to be replicated is obvious: the data that is stored in the filesystem. Then there are a couple of things that are needed on the remote site in case of a failover, for example:
- SMB shares
- NFS exports
- User/Group Quotas
- UID/GID mapping
Then we have a couple of things that you may or may not want to be replicated, depending on the environment:
- Network configuration (IP addresses, routes, etc.). This is only useful if both sites are within the same layer 2 network. It might be more appropriate to re-route the clients using other mechanisms, such as changing the DNS records or making use of DFS.
- Authentication configuration (only if site B is supposed to use the same AD or LDAP servers). In a DR use case this is quite unlikely, since the original AD or LDAP server may also have been 'failed over' to something else.
The list does not end here (for example, you could also think about replicating Access Zone configuration, file policies and more), but discussing all the potential use cases and implications would go beyond this article.
The fact is that both platforms of course replicate file system data. Of the additional items mentioned above, Isilon replicates the UID/GID mapping, which is a very important aspect. SONAS does not do that and therefore requires AD authentication with Unix Services installed on the AD. That should be best practice for UID/GID mapping anyway, but I know of many customers who don't have it for various reasons. Having AD SFU installed centralizes the UID/GID to SID mapping and is therefore very helpful in many other respects too.
The other big difference is that SONAS can only replicate the whole filesystem (or at fileset level), whereas Isilon can also replicate at subdirectory level. This may be an important aspect, since you can schedule the replication of different portions of your filesystem at different times. The ability to replicate at directory level also allows you to implement an active/active-like concept where the clients access data on both sites and the relevant directories are replicated in the opposite direction (see figure 2). For further differences see the 'What gets replicated' section in the table below.
Failover/Failback
In case of an outage of the primary site you may want to fail over to the secondary site. For this to happen we typically have to perform these steps:
- Stop the replication from A to B (if not already done by the outage)
- Roll back to the last known good snapshot on site B
- Make the filesystem (or directory) writeable on site B
- Once site A is available again, reverse the replication direction (B to A) to replicate the changes that were made on B while A was down.
- Once sites B and A are at the same level you may decide to fail back to site A.
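Since the order of these steps matters, it can help to encode them in a small checklist that refuses to skip ahead. The following is only a sketch of such a runbook helper. It does not call any SONAS or Isilon command, and the class and step names are hypothetical.

```python
from enum import Enum


class Step(Enum):
    # Members are listed in the order the steps have to be performed
    STOP_REPLICATION = "stop replication A to B"
    ROLLBACK_TARGET = "roll back B to last known good snapshot"
    MAKE_TARGET_WRITABLE = "make filesystem or directory writeable on B"
    REVERSE_REPLICATION = "replicate changes from B back to A"
    FAIL_BACK = "fail back to A"


class FailoverRunbook:
    """Tracks completed steps and rejects any step done out of order."""

    def __init__(self):
        self.done = []

    def next_step(self):
        remaining = [s for s in Step if s not in self.done]
        return remaining[0] if remaining else None

    def complete(self, step):
        expected = self.next_step()
        if step is not expected:
            raise ValueError(f"expected {expected}, got {step}")
        self.done.append(step)


runbook = FailoverRunbook()
runbook.complete(Step.STOP_REPLICATION)
print(runbook.next_step().value)
```

In a real environment each step would wrap the documented procedure of the respective system. The point here is only that the sequence is fixed and checked.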
In SONAS you have to reverse the replication direction manually, which can be a source of errors if the procedure is not well documented and trained. Also consider that in outage situations there is often an increased level of adrenaline in the admins' blood, so a pre-defined routine may help a lot to avoid mistakes. Isilon provides a one-push-button solution for it (well, it's basically the failover that requires only one push of a button, while the failback is performed in three steps). However, these pre-defined steps and the ability to perform them via the WebUI as well are very helpful.
Some comments on automatic failover
Many customers ask for automatic failover/failback capabilities. This is because of their experience with synchronous replication, where that might be appropriate. However, in an unstructured, file-based scale-out environment synchronous replication is not really an option. You have to consider that with asynchronous replication a failover will, with high probability, cause some data loss (remember that we have to roll back to the last known good snapshot). Therefore you might want an administrator to decide whether to perform a failover or to initiate other appropriate actions. Nevertheless, automatic failover can be done. I have seen projects where this has been implemented on Isilon using a so-called Automatic Failover Management solution (AFM), which was implemented by some clever services colleagues.
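To make the trade-off a bit more concrete, here is a sketch of what a decision rule for such automation could look like. This is purely my own illustration and not the logic of the AFM solution mentioned above. The function name, the thresholds and the return values are all assumptions.

```python
def failover_decision(failed_health_checks, seconds_since_last_sync,
                      min_failed_checks=3, max_auto_loss_seconds=300):
    """Decide whether to fail over automatically or involve an administrator.

    failed_health_checks:    consecutive failed checks of the primary site
    seconds_since_last_sync: age of the last good snapshot on the target,
                             i.e. roughly the data loss a failover would cause
    The default thresholds are arbitrary example values.
    """
    if failed_health_checks < min_failed_checks:
        return "wait"        # could still be a short glitch
    if seconds_since_last_sync <= max_auto_loss_seconds:
        return "failover"    # expected loss is within the agreed limit
    return "ask_admin"       # too much data at stake for an automatic decision


print(failover_decision(failed_health_checks=5, seconds_since_last_sync=1800))
```

The important design choice is the third branch: when the expected data loss exceeds what has been agreed on, a human makes the call.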
Topology
Typically the replication is built as a 1:1 relation. SONAS does it at filesystem level whereas Isilon can be configured to replicate at subdirectory level. Therefore a bi-directional replication can be set up for different directories, as figure 2 illustrates. The target directory (or filesystem) is read-only until the replication relation is stopped. A single system can be the target for multiple sources. In SONAS that requires the target directory to be different from the root /gpfs. For a potential failover this requires modification at the application (or DFS) level, since source and target directory paths are not identical.
Figure 2: OneFS bi-directional replication on different sub-directories
Bi-directional replication (in support of an active-active-like solution) can only be implemented with Isilon, because in SONAS the same system cannot be source and target at the same time (since replication is only possible at filesystem level).
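The constraint behind bi-directional replication is that a directory which is a (read-only) replication target must not overlap with a directory that the same cluster replicates out in the other direction. The sketch below checks a set of replication relations for that. It is a plain data model for illustration, not a SyncIQ API. The `Policy` class, the cluster names and the example paths are hypothetical.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Policy:
    source_cluster: str
    source_path: str
    target_cluster: str
    target_path: str


def overlaps(a, b):
    """True if the two paths are equal or one contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def find_conflicts(policies):
    """Return (p, q) pairs where p writes into a directory that q replicates out."""
    conflicts = []
    for p in policies:
        for q in policies:
            if p is q:
                continue
            if (p.target_cluster == q.source_cluster
                    and overlaps(p.target_path, q.source_path)):
                conflicts.append((p, q))
    return conflicts


policies = [
    Policy("A", "/ifs/data/eng", "B", "/ifs/data/eng"),      # A is active for eng
    Policy("B", "/ifs/data/sales", "A", "/ifs/data/sales"),  # B is active for sales
]
print(find_conflicts(policies))  # [] because the directories differ
```

With whole-filesystem replication every relation covers the root, so any second relation in the opposite direction would immediately show up as a conflict in this model.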
[update 11.Feb 2014]
Aspera Integration
One important aspect that I forgot to mention in the initial post is the Aspera integration in Isilon. Aspera is a third-party solution for WAN-optimized data replication and synchronization (for more information you may visit their web site http://asperasoft.com/software/synchronization/ ). Isilon has the Aspera solution integrated into the code and can therefore replicate and synchronize at a high performance level. Ironically, IBM has recently acquired Asperasoft, but the solution is not integrated into SONAS.[/update]
Management
Both systems allow Web UI and CLI configuration of the replication. Isilon also provides a RESTful API for management, but its support for SyncIQ (that's the name of the replication module in Isilon) is still very limited. Watch future releases for complete support of all functions here.
Failover/failback is an important aspect in disaster situations. In SONAS you have to reverse the replication direction manually, which can be a source of errors if the procedure is not well documented and trained. Isilon provides a one-push-button solution for it (well, in practice only the failover is one push; failback of course requires some additional steps).
Both solutions have documentation that is accessible online. The SONAS information center (google it) seems to me a well-structured and complete resource, available in HTML and PDF format and with search capabilities. The Isilon PDF documentation can be downloaded from support.emc.com, where you can also find some best practices papers.
| Feature | Isilon | SONAS |
|---|---|---|
| Performance | | |
| Parallel replication using multiple or all nodes | yes | yes 1) |
| Throttle throughput | yes | no |
| Throttle CPU usage | yes | no |
| Transfer compressed data | no 6) | yes |
| Target aware initial replication | yes | yes |
| Efficient block based deltas | yes | yes |
| Modifiable number of processes or threads | yes | yes |
| Aspera integration | yes | no |
| What gets replicated | | |
| Subdirectory replication | yes | no 2) |
| Replicate UID/GID mapping | yes | no 3) |
| Replicate shares/quotas | no | no |
| Include/Exclude policies | yes | no |
| Topology & Security | | |
| 1:N replication | yes 5) | ? |
| N:1 replication | yes | yes |
| A source cluster can also be a target | yes 5) | no |
| Cascading replication of same directory | no | no |
| Can encrypt replication data on wire | no 6) | yes |
| Management | | |
| GUI configuration and Management | yes | yes |
| RESTful API for management | yes 7) | no |
| Push button Failover/Failback | yes | no |
| Performance/throughput monitoring | yes | no |
| Failover dry-run support | yes | no |
| Additional Snapshots on target | yes 4) | ? |
| Rating on available online documentation | + | ++ |
1) Only interface nodes
2) Can use fileset replication
3) SONAS async replication requires AD with Unix Services installed
4) Requires SyncIQ license
5) Only on different directories
6) Requires external solution
7) Limited in 7.0 and almost complete in 7.1
Summary
Both SONAS and Isilon do asynchronous replication. Although customers often ask for synchronous replication, it is hard to achieve with a scale-out NAS system due to latency and other issues. Both systems can leverage their parallel architecture to move data, and both can incrementally move changed data at block level. Both systems are very limited in replicating configuration data such as shares, quotas and networking configuration, so manual and/or scripted actions are required for failover/failback. In this regard Isilon has more automated failover/failback functionality, while SONAS has advantages with its compression and encryption capabilities.
Disclaimer
As always: this article reflects my own personal view of the facts. At the time of writing, Isilon release 7.02 and SONAS 1.4 were the current releases. Please consult the appropriate manuals for details and the current release. If I got something wrong or missed something, please feel free to use the comment function to post your comments or send me a mail. See also the general disclaimer.