27 April 2018

Scaling Splunk with the Qumulo File Fabric

Splunk is a market-leading platform for machine data. It gathers all kinds of log and machine-generated data in a scalable manner to index, analyze, and visualize large data sets. It provides historic and real-time data analytics and a large ecosystem around it, including machine learning libraries and many more tools.
Figure 1: Splunk harnesses machine data of any kind for indexing, searching, analysis etc.


The main components of any Splunk implementation are Forwarders, Indexers, and Search Heads. Forwarders are typically software agents that run on the monitored devices and forward streams of logs to the indexers. Indexers are the heart of Splunk's architecture. This is where data is parsed and

02 January 2018

Attributes of a Modern File Storage System

About two decades ago, a number of parallel and distributed file systems were developed. The impetus was that, as data began growing exponentially, it became clear that scale-out storage was the paradigm to follow for large data sets. Some examples of good scale-out file systems are WAFL (not really scale-out), IBM Spectrum Scale (aka GPFS), Lustre, ZFS, and OneFS. All these systems have something in common: they had their "first boot" sometime around the year 2000. They also all have their strengths and weaknesses. Some of these systems are not really scale-out; others are difficult to install and operate; some require special hardware or don't support common NAS protocols; they may have scalability limits, or lack speed of innovation.

Just the fact that these systems were designed 20 years ago is a problem. Many important Internet technology trends such as DevOps, big data, converged infrastructure, containers, IoT, or virtual everything were invented much later than 2000, so these file systems are now used in situations they were never designed to handle. It is clearly time for a new approach to file storage.

Recently, I became aware of a modern file storage system: Qumulo File Fabric (QF2). Gartner recently named Qumulo the only new visionary vendor in the 2017 Magic Quadrant for distributed file systems and object storage. QF2 was designed by several of the same engineers who built Isilon roughly 15 years ago, and obviously their experiences led them to a very modern and flexible solution.

This article highlights some of QF2's main features that I think are worth sharing here.

QF2 is hardware independent

Several vendors say their product is independent of hardware-specific requirements. They may have used the term "software defined." According to Wikipedia, two qualities of a software-defined product are:

 It operates independent of any hardware-specific dependencies and is programmatically extensible.

06 February 2017

How Isilon addresses the multi-petabyte growth in ADAS development and simulation

Advanced Driver Assistance Systems, or ADAS, are the fastest growing segment in automotive electronics [1][2]. The purpose of Advanced Driver Assistance Systems is to automate and improve safe driving. We already use several ADAS features built into our cars, such as adaptive light control, adaptive cruise control, lane departure warnings, traffic sign recognition, and many more. Almost all car manufacturers and all leading suppliers, such as Bosch, Autoliv, Continental, Mobileye, and many others, are working on ADAS, and the final goal is to build a car that can drive completely autonomously, without any driver involvement. The Society of Automotive Engineers has defined six levels to describe the degree of automation [3].

Table1: Six Levels of automation in ADAS

The higher the desired automation level, the larger the validation effort required to develop these assistance systems. The majority of the ADA Systems built into mass-production cars today are between Levels 2 and 4. For these systems, millions of kilometers need to be captured and simulated before the final control units are production-ready.


The majority of the data volume today is produced by video sensors. However, there are many other sensors generating data:
• Radar
• Lidar
• Ultrasonic
• Vehicle Data


26 January 2017

IoT Messaging at scale with MQTT and Kafka

Kafka vs. MQTT for IoT:

Kafka is a well-known streaming platform. It scales well because you can cluster the brokers, and it has intelligent but relatively thick clients. These intelligent clients make it a good fit for server-to-server communication and keep the brokers quite lightweight. However, heavy clients are not well suited for IoT, where you have tiny devices with very little CPU and memory. For these types of environments, MQTT is an often-used lightweight protocol.

However, MQTT is weak when it comes to scaling horizontally: you'd need load balancers on both sides (publishers and subscribers), and HTTP bridging is too heavy and not reliable (subscribers must always be on). In this video, Tim Kellog describes a method by which MQTT environments have been made scalable with Kafka. It is quite an interesting approach that combines the strength of MQTT (lightweight on the client side) with that of Kafka (a very scalable streaming platform).
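A minimal sketch of such a bridge in Python, assuming the paho-mqtt and kafka-python client libraries; the broker hostnames and topic names are illustrative placeholders, not details from the video:

```python
def mqtt_topic_to_kafka(topic: str) -> str:
    """Map an MQTT topic such as 'sensors/site1/temp' to a Kafka topic
    name ('sensors.site1.temp'); Kafka topic names cannot contain '/'."""
    return topic.replace("/", ".")

def run_bridge(mqtt_host: str = "mqtt-broker", kafka_servers: str = "kafka:9092"):
    """Subscribe to lightweight MQTT publishers and forward every message
    into Kafka, where it can be consumed at scale. Hostnames are
    placeholders; imports are local so the mapping above stays
    dependency-free."""
    import paho.mqtt.client as mqtt          # assumed library: paho-mqtt
    from kafka import KafkaProducer          # assumed library: kafka-python

    producer = KafkaProducer(bootstrap_servers=kafka_servers)

    def on_message(client, userdata, msg):
        # Relay the raw payload; keys and partitioning are left at defaults.
        producer.send(mqtt_topic_to_kafka(msg.topic), msg.payload)

    client = mqtt.Client()
    client.on_message = on_message
    client.connect(mqtt_host, 1883)
    client.subscribe("sensors/#")            # placeholder topic filter
    client.loop_forever()
```

Calling run_bridge() would block and relay messages; in a real deployment you would run several such bridges behind the MQTT broker(s) to scale out the ingest side while Kafka handles fan-out to consumers.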

27 December 2016

Isilon Search: Mining user-generated Data on OneFS in Real Time

During an average week, an 'interaction worker'[1] spends 19% of their time searching for and gathering information[2]. Another source specifies that in 2013, content searches cost companies over $14,000 and nearly 500 hours per worker[3]. Utilizing an efficient tool to assist in this process can have a considerable ROI.

On Dell EMC's Isilon Scale-Out NAS platforms, users generate petabytes of unstructured data and billions of files. Data is created by individual users, and machine-generated data is exploding due to the growing number of sensors, log files, security devices, etc. To be able to mine and search within these growing data lakes (imagine, the average size of Isilon clusters is approaching 1 PB!), Dell EMC is working on a search appliance that indexes data on OneFS in real time and allows users and admins to search metadata and content in a fraction of a second. Alongside the functionality to increase corporate efficiency through search, we are embarking on a journey to mine and analyze user-generated data and further leverage it to create additional business value.

In its first version, the planned features are:

  • Index files from multiple Isilon clusters 
  • Search for files by name, location, size, owner, file type, and date.
  • Index files within containers such as zip and tar files.
  • Perform a targeted full content index (FCI) on search results to view a preview of the content and search for keywords and content inside.
  • Perform advanced search queries including symbols, wildcards, filters, and operators.
  • Preview and download content.

For example, administrators and end-users can execute the following use-cases on Isilon arrays:
  • As an End-user, find all my MS Word files from last year, then index the full content of the files and show me all the files with 'project Andromeda' in them
  • As an End-user, show me a chart of how my files break down by size and/or last-accessed date
  • As an Admin, find all PDF files owned by corp/user1 that were modified in the first three months of this year, compact them, and export them to a specified location
  • As an Admin, find all MPG files that are over 1GB in the /ifs/recordings subtree
  • As an Admin, find all Word, Excel, and PowerPoint documents that have not been accessed in a year
To get an idea of the capabilities of the search appliance in its coming first release, watch the following video.

 It's a true Scale-Out Solution

The product is a virtual appliance with wizards for configuration, and it relies on Elasticsearch indexing technology and the Lucene search engine; it has a 'Google-like' UI with visual filtering capabilities. The technology is scalable: search nodes can be added 'hot', and it scales to billions of files while providing responses in 1-2 seconds. Once the user has filtered appropriately, s/he can execute actions such as export and full-content indexing on the results.
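Since the appliance builds on Elasticsearch, a metadata query of the kind listed above can be sketched in Elasticsearch's query DSL. The field names (filename, size, owner) are my own illustrative assumptions, not the appliance's actual schema:

```python
def build_metadata_query(name_pattern=None, min_size=None, owner=None):
    """Compose an Elasticsearch bool/filter query over file metadata.
    All field names here are hypothetical placeholders."""
    clauses = []
    if name_pattern is not None:
        clauses.append({"wildcard": {"filename": name_pattern}})
    if min_size is not None:
        clauses.append({"range": {"size": {"gte": min_size}}})
    if owner is not None:
        clauses.append({"term": {"owner": owner}})
    # "filter" clauses match without relevance scoring, which suits
    # the filter-style, sub-second UI described in the post.
    return {"query": {"bool": {"filter": clauses}}}

# e.g. the admin use case "all MPG files over 1 GB":
query = build_metadata_query(name_pattern="*.mpg", min_size=1024**3)
```

Such a dict would be passed to an Elasticsearch client's search() call against the metadata index; full-content search would add clauses over a separately indexed content field.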


Real time Indexing

While the initial index scan may take some time to complete, the solution then updates the index in real time by plugging into Isilon's audit notifications and the CEE framework. The solution indexes metadata such as filename, file type, path, size, dates (last modified, created, last accessed), owner/uid, group/gid, and access mode. Optionally, it can index full content and application-specific metadata.
Fig 1: Components of the solution

It uses the OneFS API to perform certain actions, such as deletes. Protocol auditing (create, delete, modify, ...) forwards notifications to a CEE server (typically running on a VM) so that index updates can be made in real time (watch the video to see it). A current limitation is that only file changes carried out via SMB and NFS are monitored and updated. Changes via FTP, the OneFS API (HTTP), HDFS, or on the local file system will not be reflected in the index without a re-scan at this point in time. User actions, such as downloading files that show up in a search result, are performed via an SMB share.
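The event-to-index flow can be illustrated with a toy dispatcher; the event names and the action schema are my assumptions for illustration, since the actual CEE wiring is internal to the appliance:

```python
def audit_event_to_index_action(event_type: str, path: str):
    """Translate a protocol-audit notification into an index update.
    Event names and the action dict are hypothetical; the real
    appliance consumes CEE notifications internally."""
    if event_type in ("create", "modify"):
        # (Re-)extract metadata for this path and upsert the document.
        return {"op": "upsert", "id": path}
    if event_type == "delete":
        return {"op": "delete", "id": path}
    if event_type == "rename":
        # A rename changes the document id (the path), so the caller
        # handles it as delete-old plus upsert-new.
        return {"op": "rename", "id": path}
    return None  # unaudited event types are ignored
```

Because only SMB and NFS are audited, changes arriving over other protocols never produce such an event, which is exactly why they require a re-scan.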



It is important to mention that searches are executed against the index; regardless of the complexity of the query, the OneFS cluster will not be affected by the search. The UI is very simple to use and allows filtering; it shows detailed metadata of search matches and visualizations, and it allows user actions such as preview, download, export, etc.

Fig 2: The search UI



The installation is self-contained. The user does not need to 'leave' the UI at all during the whole process.

Interested in Beta testing?

If your customer is interested in participating in the Beta test, please register here. Be aware that we are looking for serious feedback and discussion with the user; the program is not meant for a casual test-and-play experience.

Requirements for the Beta Test

The customer needs to provide the following to be able to run the Beta code:
  • VMware ESX v5.x or 6.x
  • Resources for the VM:
    • 32 GB RAM and 8 vCPUs (both can be reduced for smaller Isilon clusters)
    • 556 GB disk space (can be thin provisioned and increased up to 2 TB; 2 TB is enough for 6+ billion files and folders)
  • Isilon cluster with OneFS 7.2 or higher
  • Chrome or Firefox web browser (IE will be supported for GA)
  • External Active Directory or LDAP server(s) (optional)
    • The Isilon Search virtual appliance has a built-in OpenLDAP server
    • Add additional external AD or LDAP servers to support specific users/groups for search or administration
  • OneFS must expose an SMB share on /ifs. The user specified when Isilon Search is configured must have full access to this share. The share is used to download files and to access them for full-content indexing.
Isilon Search will automatically:
  • Enable protocol auditing for all Access Zones (indexing per Access Zone is planned for a future release)
  • Point "Event Forwarding" to the CEE server on the Isilon Search virtual appliance
Current limitations:
  • For the Beta, no existing CEE audit servers may be configured (this will not be a restriction for GA)
  • Only one Isilon Search system can point to a single Isilon cluster
  • Event forwarding can only be set for one destination

How many objects are on your cluster?

To determine the total number of objects on the Isilon cluster, SSH into one of the nodes and run isi job start lincount. This will return a job number. Use isi job reports view <job number> to see the results once it completes. It may take a while to complete: typically about 30 minutes per 1 billion objects (as always, depending on utilization, node types, etc.).
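That rule of thumb is easy to turn into a rough estimator (a sketch; the linear scaling and the helper name are my own assumptions, and real runtimes vary with cluster load and node types):

```python
def estimated_lincount_minutes(total_objects: int,
                               minutes_per_billion: float = 30.0) -> float:
    """Estimate lincount job runtime from the ~30 minutes per
    1 billion objects rule of thumb quoted above."""
    return total_objects / 1e9 * minutes_per_billion
```

So a cluster holding 6 billion files and folders (the upper sizing bound mentioned in the requirements) would need on the order of three hours for the job to finish.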


More to come

Join us for this journey of creating business-value from user-generated data.  The next stations are support for additional Dell EMC platforms, and for more high-value use-cases.



[1] Defined by McKinsey as “high-skill knowledge workers, including managers and professionals”
[2] McKinsey Global Institute (MGI) report, July 2012: “The social economy: Unlocking value and productivity through social technologies”.
[3] Source: https://aci.info/2013/11/06/the-cost-of-content-clutter-infographic/

30 August 2016

A Software-only Version of Isilon: IsilonSD Edge

For more than a decade, Isilon has been delivered as an appliance, based on standard x64 servers with internal disks on which the scale-out filesystem resides. Besides the scale-out character of the system and its ease of use, I strongly believe that the appliance idea is one of the main reasons for the success of Isilon (and of other NAS appliances in the market): you get a pre-installed, pre-tested, pre-optimized, and easy-to-support system. However, there are also use cases for which customers have asked for a software-only version of Isilon. And because it's based on FreeBSD and runs on standard x64 servers, it's just a matter of configuration, support, and testing. The nodes in the Isilon appliance that EMC delivers are nothing other than x64 servers with different numbers of CPUs, memory, SSDs, and HDDs, depending on the node type you choose. There is basically only one piece of hardware that has been specifically developed by the Isilon team: an NVRAM card used for the filesystem journal and caching. However, with OneFS 8.0, the code has been adapted so that SSDs can now be used for this purpose as well.

The Architecture

Considering the potential use cases, the decision was made to provide the first version of IsilonSD Edge (SD stands for Software Defined) in a virtualized environment, based on VMware ESXi 5.5 or 6. Since Isilon has a scale-out architecture and does erasure coding across nodes, a OneFS cluster must run on at least 3 ESX servers to maintain the availability Isilon users are used to. The OneFS cluster nodes run as virtual machines (VMs), and IsilonSD Edge supports only direct-attached disks and VMFS-5 datastores (data disks) created from those direct-attached disks. The maximum number of nodes in an IsilonSD Edge cluster is currently 6.
The components that are included in the download package [4] are:

  • The IsilonSD Edge Management Server  (runs in a VM)
  • IsilonSD Management Plug-in  for VMWare vCenter
  • the OneFS virtual machine files.


Figure 1: Architecture of IsilonSD Edge

Current Configuration Options


The software-only version of Isilon can currently be deployed with the following configurations:

  • Three to six nodes, with one node per ESXi server
  • Data disks:
    • Either 6 or 12 defined data disks
    • Minimum size of each data disk: 64 GB
    • Maximum size of each data disk: 2 TB
  • Minimum of 8 disks per node (6 data disks, 1 journal disk, and 1 boot disk)
  • Maximum of 14 disks per node (12 data disks, 1 SSD for the journal disk, and 1 boot disk)
  • Minimum cluster capacity: 1152 GB (calculated as minimum disk size x number of data disks per node x 3, the minimum number of nodes)
  • Maximum cluster capacity: varies depending on your licenses and the resources available on your system
  • Journal disks:
    • One SSD for the journal per node, with at least 1 GB of free space
  • Boot disk:
    • One SSD or HDD for the OS per node, with at least 20 GB of free space
  • Memory: minimum of 6 GB of free memory per node


Supported Servers


EMC IsilonSD Edge is supported on all VMware Virtual SAN compatible systems that meet the minimum deployment requirements. You can identify compatible systems in the VMware Compatibility Guide. Please note: although we use the vSAN compatibility HCL to identify supported systems, IsilonSD Edge itself does not support vSAN at this point in time. That might sound a bit strange, but think about how OneFS protects data: with erasure coding across nodes using native disks. Although it would most probably work, using vSAN would add a redundant level of data protection and would most probably be counter-productive for performance. A typical good fit for an IsilonSD Edge deployment would be Dell PowerEdge R630 servers. If you spend a bit more on a Dell PowerEdge FX2, you get a 4-node IsilonSD cluster in a single 2U chassis.

A Free Version of IsilonSD Edge

There are two versions of IsilonSD Edge available: the regular paid version and a free community version. Most features are enabled in both versions, except SyncIQ, SmartLock, and CloudPools, which are only enabled in the paid version. You can start by installing the free version and acquire license keys later, which can then be entered via the UI. The following table lists the feature differences between the two versions.

Feature | Function | Free license | Paid license
SmartPools | Groups nodes and files into pools | yes | yes
CloudPools | Transparent file archiving to the cloud | no | yes
NAS Protocols | NFS, SMB, HTTP, FTP, HDFS | yes | yes
Object Protocols | Swift | yes | yes
InsightIQ | Nice monitoring for performance, capacity, and forecasting | yes | yes
SyncIQ | Policy-based, parallel synchronization between clusters | no | yes
SmartLock | WORM functionality for directories that require compliance protection | no | yes
SmartConnect | Advanced load balancing: round robin or based on CPU utilization, connection count, or throughput | yes | yes
SmartDedupe | Post-process deduplication | yes | yes
SmartQuota | Enforcement and monitoring of quotas | yes | yes
SnapshotIQ | Filesystem- or directory-based snapshots | yes | yes
NDMP Backup | IsilonSD Edge only supports 3-way backups (no 2-way due to lack of Fibre Channel connectivity) | yes | yes

Table 1: Supported features for the free and paid version of IsilonSD Edge

Use Cases

The most obvious use case for the software-only version is remote offices where no data center is available, or locations where everything runs in a virtual environment. Using SyncIQ, you can pull data stored in the remote location into a central datacenter, for backup purposes for example. Or you can push content from a central location towards the edges (remote offices). You can even combine this with CloudPools [5], which enables you to keep frequently used content local while files that have not been used for some time are transparently pushed out to the cloud. This can be very powerful, because you get local performance with a small footprint while logically your NAS can be huge! What people also like is the fact that the Isilon appliance in the data center is managed in the same way as the virtual instances in remote locations (except for the additional management VM used for deployment).

The OneFS code is the very same in the appliance and the SD version of Isilon. Therefore, IsilonSD Edge might be a good vehicle for functional testing. Be aware that performance very much depends on the underlying hardware, so performance tests might not make sense if you want to know the characteristics of an appliance version.


Webcast and more Information

I am running a short 30-40 minute webcast explaining IsilonSD Edge:
Title: Extend your Data Center to your Remote Office with Software Defined Storage
When:  14 September 2016 – 14:00 UK time / 15:00 Berlin time
Register here:  This link works if you want to see the recording.
Here are some more links with useful information. Especially [1] contains almost everything you need.

[1] Isilon Info Hub: https://community.emc.com/docs/DOC-49267
[2] Technical Demo: EMC IsilonSD Edge Installation: https://www.youtube.com/watch?v=BgNzHRZMmo4
[3] EMC Community Network: Isilon: https://community.emc.com/community/products/isilon
[4] IsilonSD Edge Download packages: https://www.emc.com/products-solutions/trial-software-download/isilonsd-edge.htm
[5] Transparent Storage Tiering to the Cloud using Isilon Cloudpools : http://stefanradtke.blogspot.de/2016/02/transparent-storage-tiering-to-cloud.html

A good alternative: the Isilon x210 Fast Start Bundle

While I am writing this post, EMC has just announced an entry-level starter kit containing three or more X210 nodes at a very attractive price. If you want to start with a small Isilon cluster, this might be a fantastic option to consider as well. As with every appliance, it comes pre-installed, pre-tested, and ready to go. The new Fast Start bundle kit contains:

  • Three to eight X210 12TB Nodes
  • Two 8-port Mellanox InfiniBand switches (for the internal network); no need to manage them.
  • Enterprise Bundle:
    • SmartConnect
    • SnapshotIQ
    • Cables & Accessories
    • Optional Support

This bundle is aggressively priced, and the promotion runs until December 31st, 2016. It can be acquired through an EMC partner. The maximum cluster size you can get is 8 x 12 TB = 96 TB (raw). However, this is only a promo limit, not a technical one; you can extend your cluster with any other available node type, including additional X210s. In that case, please note two things: 1) you need to purchase bigger InfiniBand switches if you want to build bigger clusters; 2) for configuration options you choose beyond the special offer, your regular company discount applies.



IsilonSD Edge is the first software-only version of Isilon and is a good starting point for a virtualized environment (based on VMware ESXi; other hypervisors might be supported in the future). It's a good way to connect remote locations with your central Data Lake built on an EMC Isilon scale-out cluster. The functionality is equal to the appliance version of Isilon (the free version has some restrictions). My personal preferred alternative would be a small Isilon cluster based on the appliance with X210 nodes, but this attractive promotion only runs until the end of 2016.

26 May 2016

Isilon Backup with Networker and DD Boost to Data Domain

Quite frequently I am asked when we will integrate the DD Boost protocol into Isilon/OneFS for customers who want to back up to an EMC Data Domain system with a high degree of deduplication. Although we don't have a direct Boost integration in OneFS yet, there is a standard way of backing up a OneFS filesystem through DD Boost using a NetWorker Data Service Agent.

In the configuration shown below, we are leveraging NDMP to backup out of the front end of the Isilon storage platform over Ethernet to a NetWorker Storage Node. Depending on the amount of data and number of streams, you can have one or more Isilon nodes sending data and employ one or more NetWorker storage nodes to help facilitate the backup.

NetWorker features something called the Data Service Agent – commonly referred to as DSA – which is able to save or recover data via an NDMP connection between a NetWorker server or storage node and a NAS storage system. DSA essentially puts a “wrapper” around the backup data stream to enable the NetWorker software to write it to a non-NDMP backup device.




With this capability, NetWorker can then take advantage of Data Domain Boost software integrated at the NetWorker storage node. Parts of the deduplication process can then take place at the NetWorker storage node to increase throughput performance and improve network bandwidth utilization between the storage node and the Data Domain system.

The other advantage of this configuration, which leverages the integration of NetWorker with DD Boost, is that NetWorker can schedule, and maintain full indexing of, Data Domain replication to a second disaster recovery target.

In general, if each component shown here – Isilon, the Storage Node, and Data Domain – all have 10 Gigabit Ethernet links, performance of backup and recovery is optimal.  This configuration balances good throughput with the ability to leverage DD Boost integration with NetWorker. An Isilon Backup Accelerator is not required in this case because you don’t need a SAN here.

Side note: OneFS 8.0 and NetWorker 9 SP1 do support NDMP multi-stream. That yields much faster backups, as NetWorker can parallelize backups in a granular manner. This topic is definitely worth another blog post...


22 February 2016

Transparent Storage Tiering to the Cloud using Isilon Cloudpools

Isilon Storage Tiering

Isilon storage tiering (aka SmartPools) is functionality that has been around for many years. It allows you to send data to specific storage pools (a storage pool is a pool of nodes with the same node type or density), so data can be stored cost-effectively at the lowest suitable price level. For example (see Figure 1), you may create a policy that stores all new data on Pool1, built out of fast S210 nodes. Response times will be extremely good, but the price point is also higher (faster disks, faster CPUs, etc.). Then you create another policy that says: move all data that has not been accessed for 30 days into Pool2. This Pool2 may contain X410 nodes: much more capacity (36 SATA disks) but somewhat slower response times compared to Pool1. Further, you may have a third pool, Pool3, containing data that has not been touched for a year. This data is hosted on HD400 nodes (very dense, 59 SATA drives + 1 SSD per 4U chassis), with potentially slower response times than tier 2. However, since this tier only holds rarely accessed data, it would not impact the user experience significantly (this may vary from use case to use case, of course). The movement of data according to policies is done by the Job Engine in OneFS. It happens in the background, and the user is not impacted. The logical location of the files (the path) does not change. That means a single directory can contain files that reside on three different storage technologies.
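The three example policies can be sketched as a small decision function. This is a toy illustration only: real SmartPools policies are configured in OneFS and evaluated by the Job Engine, and the thresholds and pool names below simply mirror the example above:

```python
from datetime import datetime, timedelta

def choose_pool(last_accessed: datetime, now: datetime) -> str:
    """Toy version of the tiering policies described in the text:
    fresh data on the fast pool, data idle for more than 30 days on
    the capacity pool, data idle for more than a year on the dense
    archive pool."""
    idle = now - last_accessed
    if idle > timedelta(days=365):
        return "Pool3 (HD400)"
    if idle > timedelta(days=30):
        return "Pool2 (X410)"
    return "Pool1 (S210)"
```

The key point the sketch captures is that the decision depends only on access recency, never on the file's path, which is why the logical location stays unchanged as files move between pools.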

Figure 1:  Policy Based Storage Tiering with Smartpools

07 October 2015

EMC Isilon vs. IBM Elastic Storage Server or how IBM performs an Apple to Apple comparison

Readers of my blog know that I share best practices and client experiences here. In the post Isilon as a TSM Backup Target – Analyses of a Real Deployment [8], I described the "before and after" situation of an Isilon deployment for an IBM Spectrum Protect (formerly Tivoli Storage Manager, TSM) solution. These results were simply a view of what a production workload looked like and how throughput and the resulting backup windows evolved.

Interestingly, IBM, intending to position their IBM Spectrum Scale/Elastic Storage Server as the better solution, hired a marketing firm to run a performance benchmark on an IBM Elastic Storage Server (ESS), compare it against my mentioned post, and publish the result as a white paper [1]. The results highlighted in this paper indicate that IBM Spectrum Scale is 11 times faster than a "similar" EMC Isilon configuration. I'm just guessing at why IBM did not publish this themselves rather than paying a marketing firm to do it. I assume they are too serious to publish a comparison between snapshots of an averagely loaded production environment and a prepared benchmark test that was suited to evaluate the maximum performance of their solution.

The results published in my post were by no means showing any limits of the platform, and they were influenced by external clients, additional server and network traffic, etc. Also, nothing was said about server and storage utilization or any other potential limits or bottlenecks. It's obvious that the authors of the white paper did not read my blog; otherwise, it would be hard to explain how they could accept such a comparison.

To wit: IBM sponsored a white paper that compared an early customer workload of a new production environment with a well prepared benchmark. They used quite different equipment but called it a “similar environment”.

Not a like to like workload comparison


28 June 2015

How to optimize Tivoli Storage Manager operations with dsmISI and the OneFS Scale-Out File System

In a previous blog, Using Isilon as a Backup Target for TSM*), I explained why Isilon is a fantastic target for backup and archive data. While our field experience has matured through many successful implementations, there are still things that need improvement. In particular, TSM's unawareness of scale-out file systems has several side effects, which I'll explain in this blog, along with how they can be solved by a solution called dsmISI, developed by General Storage.

*) IBM has recently re-branded some of their products, and the IBM Tivoli Storage Manager, TSM, also got a new name. It is now called IBM Spectrum Protect (TM). However, since the name TSM has been known for decades in the user community, I'll still use it in this article.

The Manual Approach

Consider a setup illustrated in figure 1.
Figure 1: Two TSM Servers, connected via 10 Gigabit Ethernet to a 5 node Isilon cluster.

12 May 2015

Comparing Hadoop performance on DAS and Isilon and why disk locality is irrelevant

In a previous blog [3] I discussed how Isilon enables you to create a Data Lake that can serve multiple Hadoop compute clusters with different Hadoop versions and distributions simultaneously. I stated that many workloads run faster on Isilon than on traditional Hadoop clusters that use DAS storage. This statement has recently been confirmed by IDC [2] who ran various Hadoop benchmarks against Isilon and a native Hadoop cluster on DAS. Although I will show their results right at the beginning, the main purpose is to discuss why Isilon is capable of delivering such good results and what the differences are in regard to data distribution and balancing within the clusters.

The test environment

  • Cloudera Hadoop Distribution CDH5.
  • The Hadoop DAS cluster contained 7 nodes with one master and six worker nodes with eight 10k RPM 300 GB disks.
  • The Isilon cluster was built out of four X410 nodes, each with 57 TB of disk, 3.2 TB of SSDs, and 10 GbE connections.
  • For more details see the IDC validation report [2].

05 May 2015

How to access data with different Hadoop versions and distributions simultaneously

Many companies are rolling out, or at least thinking about, Hadoop for data analysis and processing. The Hadoop Distributed File System (HDFS) is the underlying filesystem, and typically local storage within the compute nodes provides the storage capacity. HDFS was designed to satisfy the workload characteristics of analytics, and it was born at a time when 1 Gigabit Ethernet was the standard networking technology in the datacenter. The idea was to bring the data to the compute nodes in order to minimize network utilization and bandwidth and to reduce latency through data locality. (I'll post another article to discuss this in more detail and show that this requirement is less important these days, when we have 10 Gigabit almost everywhere in the datacenter.) For the moment, we'll look at some other side effects of this strategy, which somehow reminds me of a lot of data silos. Business intelligence folks know what I am talking about.

Figure 1: The “Bring Data to Compute” strategy results in a lot of data silos, complex and time consuming workflows.
One thing you may already know is the fact that HDFS is not compatible with POSIX protocols like

01 April 2015

Some Aspects on Online Data vs. Tape

We often get involved in discussions about the cost of online data vs. data that has been backed up to tape. There are tons of TCO tools that you could use to make one or the other look cheaper. I'll give two examples here:

  • An argument often used in favor of tape is that its acquisition cost is about 1/3 of that of dense archive storage; as a result, the overall TCO should be something like a third of disk.
  • Well, that's obviously just part of the story. The real question is: how long does it take to restore business-relevant data, and what does it cost if you need to wait days or weeks for that restore to complete?

The latter aspect is no fiction. I have had customers tell me that they stopped the restore of an important database after eleven days. This database was just 11 TB in size, but during a restore from tape you have no idea how fragmented it is or how far the restore has progressed. In the end, the customer lost a lot of money because this database was not available and the company could not work on customer requests. It doesn’t require a lot of creativity to think of use cases where you lose thousands of Euros every minute.
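To put that anecdote into numbers, here is a quick back-of-envelope calculation (11 TB over eleven days, decimal units assumed):

```python
# Effective throughput of the eleven-day restore of the 11 TB database.
size_bytes = 11 * 10**12        # 11 TB, decimal units
duration_s = 11 * 24 * 3600     # eleven days in seconds
mb_per_s = size_bytes / duration_s / 10**6
print(f"effective restore rate: {mb_per_s:.1f} MB/s")
```

Roughly 11.6 MB/s on average: slower than a single commodity disk, for a business-critical dataset.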

This example shows how careful you need to be when someone shows you a TCO study that proves that one or the other technology is cheaper. You really need to understand the use case for a serious TCO calculation that includes all relevant aspects.

However, one thing is clear about tape: the only thing you can do with the data is restore it to disk. That’s it. And you can only hope that the media will still be readable after some time, and that the maintenance tasks of the backup software and the tape library went well enough to keep the data readable.

Need to analyze or access data with Hadoop ?

Furthermore, these days companies have realized that there is often big value in their data that they want to monetize. The keyword here is big data analytics, with solutions like Hadoop. That, of course, is only possible if you have all your data online and accessible to the relevant tools. Solutions like Isilon are ideal for this purpose, as they allow you to keep active data and archive data on the same platform [1]. Policies allow you to move the data to the most (cost-)efficient storage media while the logical access path doesn’t change. Applications and users will find the data always online and always in the same place. Since Isilon is a multiprotocol system, you can even access the data via NFS, CIFS/SMB, HDFS, OpenStack Swift, FTP etc.



Figure 1: Multiprotocol Access to data on Isilon with policy based tiering.

I’ll probably do another post about the advantages of using Isilon for Hadoop, but this one is about tape vs. online data. There are several tools out there that you could use for your TCO calculation, and I just wanted to remind you with this little article that there is more to consider than just €/TB.


Further Reading:

[1]  White Paper: Next Generation Storage Tiering With EMC Isilon SmartPools

[2] White Paper: Next-Generation Storage Efficiency with Isilon SmartDedupe

[3] White Paper: EMC Isilon OneFS: A Technical Overview

23 January 2015

Backup OneFS Data over NFS/CIFS with TSM

In several of my previous posts I have mentioned the shortcomings of NDMP. One of them is the lack of support for an incremental forever strategy, a feature that TSM users are typically accustomed to. Furthermore, TSM’s support for NDMP is way below average compared to other backup software solutions (for example, EMC NetWorker can create and maintain OneFS snapshots, roll them over to tape and index them; watch out for my blog post on this soon).

One way around this is to back up the files in the filesystem via NFS or CIFS. To avoid the required file system scan (or treewalk), the ideal solution would be something like the isi_change_list feature mentioned in a previous post. However, the first version of that change_list API has not proven very efficient with TSM, so we have to wait for the next version, which we anticipate at the end of 2015. Until then, the only way to accelerate a backup via CIFS/NFS is massive parallelism. General Storage has developed a solution for this called MAGS – Massive Attack General Storage.


During backup, TSM scans through file systems, compares their content with what it has already backed up, transfers changed/new files, expires deleted files, updates properties etc. TSM does backups in a multi-threaded fashion, spawning up to 128 independent threads (officially 10) that do the actual transport of data (resourceutilization parameter).
However, TSM does NOT multi-thread effectively when it comes to scanning a file system, hence the act of comparing existing files with backed-up versions may take a very long time, even if only little or no data has changed.
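How long is "a very long time"? A quick estimate, using an assumed mid-range scan rate and an example object count (both figures are discussed below in this post):

```python
# Theoretical single-stream scan time for a large file system.
objects = 500_000_000       # file system objects on the cluster, example
scan_rate = 7_000           # objects scanned per second, assumed mid-range
hours = objects / scan_rate / 3600
print(f"theoretical scan time: {hours:.1f} hours")
```

That is nearly 20 hours of pure scanning before a single changed byte reaches the TSM server.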
Backing up an Isilon Filesystem with TSM could be as easy as entering

dsmc incremental \\my_isilon_cluster\ifs

on any TSM Windows client. Provided the appropriate permissions are in effect, this will work, but it will take a very long time. Depending on the file system structure, network latency, the kind of Isilon cluster nodes and the TSM infrastructure, there will probably be no more than 5,000 to 10,000 file system objects (files and directories) scanned per second. On an Isilon hosting 500,000,000 file system objects, scanning alone would theoretically take about 20 hours. In real life, it usually takes much longer. Working around that “scanning” bottleneck usually involves trying to logically split the file system and backing it up with multiple jobs. So instead of running:

dsmc incremental \\my_isilon_cluster\ifs

You could run:

dsmc incremental \\my_isilon_cluster\ifs\my_first_dir
dsmc incremental \\my_isilon_cluster\ifs\my_second_dir
dsmc incremental \\my_isilon_cluster\ifs\my_third_dir

dsmc incremental \\my_isilon_cluster\ifs\my_nth_dir

That would certainly speed things up, but you’d have to consider the following:
  • You would have to keep track of your directories: adding a new one means it won’t get backed up unless you explicitly add it to your TSM jobs.
  • You have to balance the jobs against the directories manually. They won’t all be of the same size: there’ll be a couple of very big ones and others will be small.
  • It requires monitoring a potentially large number of jobs, their individual logs etc.
  • It won’t take care of whether your client can handle the number of parallel, memory-hungry jobs you’re starting, so you’ll constantly have to tune it yourself.
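The manual splitting described above can of course be scripted. The sketch below only shows the idea (directory discovery plus a bounded pool of backup jobs) and is in no way MAGS itself; the share path, depth and job limit are made-up examples, and the dsmc invocation is hypothetical:

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

SHARE = r"\\my_isilon_cluster\ifs"   # example share from this post
MAX_JOBS = 8                          # bound the number of parallel dsmc jobs

def subtrees(root, depth=1):
    """Collect sub-directories down to `depth` levels, one backup job each.

    Note: files sitting directly in `root` are not covered by the sub-jobs
    and would need one extra non-recursive job in a real setup.
    """
    if depth == 0:
        return [root]
    jobs = []
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            jobs.extend(subtrees(entry.path, depth - 1))
    return jobs or [root]

def backup(path):
    # One TSM incremental per sub-tree (hypothetical invocation).
    return subprocess.run(["dsmc", "incremental", path]).returncode

# Commented out so the sketch stays side-effect free:
# with ThreadPoolExecutor(MAX_JOBS) as pool:
#     return_codes = list(pool.map(backup, subtrees(SHARE)))
```

Even this toy version shows why a home-grown script gets complicated quickly: it still lacks the consistent restore view, memory throttling and unified logging listed below.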
So for the time being, General Storage developed MAGS to address these issues and automate the massively parallel approach to backup. It requires one or more Windows servers where MAGS runs as a TSM client wrapper. It is started exactly like TSM’s own dsmc:

mags incremental \\my_isilon_cluster\ifs

Then, MAGS performs the following steps:
  • It scans the file system recursively to a configurable depth (e.g. six levels, which usually takes no more than a few seconds).
  • It starts as many parallel TSM clients for the sub-trees found as can be handled by the memory of the machine it is running on (the maximum number of jobs and the memory threshold are configurable).
  • It preserves the entire file system structure in a single, consistent view. For a user running a restore via the standard TSM GUI or command line, there is only one TSM node to address and only one tree to search.
  • It can spread its TSM client sessions across more than one Windows machine (the practical limit is about 8).
  • It can be scheduled via the standard TSM client scheduler, logs to a single schedlog/errorlog and ends in a single return code.

MAGS usually shortens backup times to 10-30% of what a “normal” incremental would take, depending on the infrastructure and other bottlenecks associated with TSM servers, network etc. There are some large customers using it already and it seems to do a good job. General Storage plans to include version 2.0 of Isilon’s change_list API once it is available and tested. This will then accelerate the scan time dramatically and will most probably also reduce the required resources on the TSM client machines.
Figure 1: Workflow of massive parallel OneFS backups with TSM using MAGS


The requirements for MAGS are:
  • At least one Windows/Intel 2008 R2 or 2012 machine with at least 16 GB RAM
  • Microsoft .NET 4.5
  • TSM Windows Backup/Archive Client V6.3 or newer
  • EMC Isilon with configured SMB shares for the data to backup
  • At least 3 GB of free disk space on each machine running MAGS

Impact of SSDs for meta data acceleration

File system scans, as well as all other metadata operations, can be accelerated significantly by using SSDs for metadata (read/write) acceleration in OneFS. Until recently, using SSDs in NL-nodes, which are typically used as backup/archive targets, has not been supported. There was no technical reason for this, and EMC has recently announced that SSDs can now be used in NL- and HD-nodes as well. This is good news, because even a single SSD per node may help to accelerate scans quite significantly.

More Info

Thanks to Lars Henningsen from General Storage for providing the information. If you are interested in MAGS you may drop me a mail and I’ll forward to the right people or contact General Storage directly at http://www.general-storage.de


21 December 2014

How to remove an IPv6 Address from an interface in Windows

This may seem somewhat off-topic for my storage blog, but remember that we talk about NAS, and that means IP storage. While I was working on an IPv6 project I found this quite helpful:

If you prepare your site for IPv6, you’ll most probably run through some trial-and-error phases to understand what’s going on with all the things like DHCPv6, router advertisements, prefix delegation, DNS and so forth. While doing this, you might want to remove an acquired IPv6 address from a Windows interface. In the IPv4 world it has been sufficient to do a

ipconfig /release

and Windows would release its DHCP leases. You’ll figure out that the IPv6 equivalent will not work in all cases:

ipconfig /release6

I’ll not cover here why that is, but just share how you can force the removal of an IPv6 address. Let’s look at the interface status:

c:\Users\Stefan> ipconfig

Windows IP Configuration

Ethernet adapter Local Area Connection 2:

Connection-specific DNS Suffix . : lan
IPv6 Address. . . . . . . . . . . : fd7e:df1d:94d9:0:381d:a3b4:8849:b4bf
IPv6 Address. . . . . . . . . . . : fd6f:3a25:838a:0:381d:a3b4:8849:b4bf
Temporary IPv6 Address. . . . . . : fd6f:3a25:838a:0:8d17:6265:429e:d7eb
Temporary IPv6 Address. . . . . . : fd7e:df1d:94d9:0:8d17:6265:429e:d7eb
Link-local IPv6 Address . . . . . : fe80::381d:a3b4:8849:b4bf%16
IPv4 Address. . . . . . . . . . . :
Subnet Mask . . . . . . . . . . . :
Default Gateway . . . . . . . . . :

To remove one of the above IPv6 addresses, you need to use Microsoft’s netsh interface to the IPv6 stack:

delete address [[interface=]string] [address=]ipv6-address [[store=]{active | persistent}]

Let’s assume we want to remove one of the IPv6 addresses listed above:

netsh interface ipv6 delete address interface="Local Area Connection 2" address="fd7e:df1d:94d9:0:381d:a3b4:8849:b4bf"
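As a side note, netsh expects the address in its exact textual form. If you are scripting this, Python’s standard ipaddress module can normalize an address into its canonical compressed notation before you paste it into the command (this is just a small helper, not part of netsh):

```python
import ipaddress

# Normalize an address before pasting it into netsh: leading zeros removed,
# zero groups compressed -- the canonical textual form per RFC 5952.
addr = ipaddress.IPv6Address("fd7e:df1d:94d9:0000:381d:a3b4:8849:b4bf")
print(addr.compressed)    # canonical form of the address above
print(addr.is_private)    # fd00::/8 is a unique-local (private) prefix
```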

That’s basically it.  

The full netsh command reference can be found here.

23 October 2014

Direct I/O considerations for TSM Storage pools on OneFS (NFS)



For several reasons explained in previous posts, EMC’s scale-out NAS system Isilon is a fantastic target for TSM storage pools. It’s easy to use, it scales linearly, and it provides a shared and fast filesystem for TSM instances. This makes a TSM admin’s life much easier. During a recent project we learned that the throughput we can achieve with Tivoli Storage Manager (TSM) varies quite significantly depending on whether TSM is configured to perform buffered vs. unbuffered I/O (direct I/O) on the storage pools that reside on OneFS. This article describes some dependencies between buffered or direct I/O, CPU utilization, I/O latencies and throughput, and the role of Isilon’s SmartCache feature. Although we discuss this in the context of TSM and OneFS, the aspects discussed should be valid for other applications as well.
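One relationship worth keeping in mind when weighing buffered against direct I/O over NFS: a single synchronous, unbuffered stream is bounded by write size divided by round-trip latency, no matter how fast the storage behind it is. A toy calculation (block size and latency are assumed example values, not measurements):

```python
import math

# A single synchronous (unbuffered) write stream cannot exceed
# block_size / round_trip_latency, regardless of the storage behind it.
block = 256 * 1024          # 256 KiB per write, assumed
latency = 0.001             # 1 ms NFS round trip, assumed
per_stream = block / latency / 10**6          # MB/s for one stream
streams_for_10gbe = math.ceil(1200 / per_stream)

print(f"single-stream ceiling: {per_stream:.0f} MB/s")
print(f"streams needed to fill 10 GbE: {streams_for_10gbe}")
```

This is why parallel sessions (or buffering that coalesces writes) matter so much for storage pool throughput on NFS.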

17 September 2014

TSM Copy pool or Isilon SyncIQ to create a copy of your backup data ?


If you follow this blog, you are aware that many customers use Isilon scale-out NAS as a backup target for TSM. In this context I get asked quite frequently whether to use Isilon’s SyncIQ or TSM copy pools to replicate backup data. The answer is clear: use TSM copy pools. Here are the reasons:

  1. SyncIQ is a parallel replication engine that is designed to replicate massive amounts of data. It works in parallel, from multiple nodes to multiple nodes on the target side. Due to the nature of a scale-out NAS system, it is designed to replicate the data asynchronously (otherwise your response time on the primary side would suffer). Assume your primary site goes down due to a power outage. Transactions will not be lost, thanks to the non-volatile NVRAM buffer that OneFS uses to store transactions. However, if you fail over your cluster to the secondary side, OneFS will roll back the target directory to the last known good snapshot. That has two consequences:

    a.) Even if you have set the RPO to a few minutes, you will lose the backup data that was written between the last snapshot and the occurrence of the power outage (if you cannot recover your primary site’s volume).

    b.) TSM will recognize this and report inconsistencies between the database and the volume in the log. This might not be a disaster, but you would then need to perform an AUDIT VOLUME, which checks for inconsistencies between database information and a storage pool volume. This can take quite a long time!

  2. The second reason to use TSM copy pools to duplicate your data is that TSM uses them intelligently in case you need to restore data in a heavily utilized environment. Since TSM is aware of its (synchronous) copy pool volumes, it will mount them and use them in addition to the primary volume to restore data.

A final comment on disaster recovery for TSM. The above discusses how to make your backup data highly available, but not the TSM server itself. Take a look at the General Storage Cluster Controller for TSM. It is a backup recovery solution that allows you to create a highly available setup for TSM. It even covers the outage of TSM servers and works perfectly with storage pool volumes on Isilon.


[1]  White Paper: High Availability and Data Protection with EMC Isilon Scale-Out NAS
[2]  TSM Info Center: AUDIT VOLUME (Verify database information for a storage pool volume)
[3]  General Storage Cluster Controller for TSM: Brochure

02 September 2014

Challenges and Options for Filesystem Backups at Petabyte Scale

Unstructured data growth is accelerating, and filesystems that contain one or multiple petabytes are common. The current version of OneFS 7.1 has tested support for 20 PB of raw capacity; new node types and larger disks will lift that limit going forward. The challenges start even before a filesystem reaches a petabyte. In this post I will identify the challenges that come along with such large filesystems. Even though the Isilon data protection mechanisms are very mature [1], you may want to back up your data to protect against logical or physical data loss due to disasters, or for compliance reasons. We’ll define the attributes of an ideal backup solution and compare some existing technical solutions against this list of attributes and functions. The list of technical solutions is determined by my working experience and discussions with some renowned colleagues from EMC’s Data Protection and Availability Division (see Acknowledgement), and I make no claim of completeness.


The challenges


The NDMP challenge

Remember that the context of this discussion is LARGE filesystems. The industry-standard solution for backing up NAS appliances is the NDMP protocol. I have mentioned several disadvantages of NDMP in a recent post. By far the biggest challenge is that NDMP does not support a progressive incremental forever strategy (well, there is an exception that I will explain later, but without 3rd-party tools the statement is true). That means you need to perform a full backup every so often. In practice this is not feasible: assume we had a dedicated 10 Gigabit Ethernet link available for backup and could saturate it at 900 MB/s. A full petabyte backup would still take
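Filling in the arithmetic behind that scenario (1 PB at a sustained 900 MB/s, decimal units):

```python
# Duration of a full 1 PB backup over a saturated, dedicated 10 GbE link.
pb = 10**15                 # one petabyte in bytes, decimal
rate = 900 * 10**6          # 900 MB/s sustained
days = pb / rate / 86400
print(f"full backup duration: {days:.1f} days")
```

Nearly two weeks for a single full backup, assuming the link stays saturated the whole time, which is exactly why periodic fulls do not scale to this size.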

22 July 2014

Use Isilon's new change-list feature to accelerate your backups

The new OneFS Changelist API

As mentioned in a previous post, OneFS can be completely managed through an API [1]. That’s an important integration feature for environments like VMware or for interaction with other solutions. This post is about the changelist API, which is typically used by backup software to manage OneFS snapshots. NetWorker 8.2 and other backup software solutions, for example, use this API to manage and control snapshots. Going forward, we can expect more and more data management solutions to make use of the API. However, you can also use the new changelist function (through the API or CLI) to create a list of the files that have changed between two snapshots. If your backup software does not use the API itself yet, you can feed this list into your backup client for faster backups.
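Feeding a changelist into a backup client could look roughly like the sketch below. Everything here is simplified and hypothetical: the input format is reduced to one changed path per line (real changelist entries carry more fields), the UNC prefix is a made-up example, and you should check your TSM client documentation for the exact filelist option:

```python
def to_filelist(changelist_lines, prefix=r"\\my_isilon_cluster"):
    """Map cluster-internal /ifs paths to UNC paths for a Windows TSM client.

    Simplified: assumes one changed path per line; real changelist output
    carries additional fields per entry.
    """
    return [prefix + line.strip().replace("/", "\\")
            for line in changelist_lines if line.strip()]

changed = ["/ifs/data/report.csv\n", "/ifs/home/alice/notes.txt\n"]
for path in to_filelist(changed):
    print(path)

# The resulting list would then be written to a file and handed to the
# backup client, e.g. with something like:
#   dsmc selective -filelist=changed_files.txt
```

The point is that the expensive treewalk disappears: the client only touches the files the changelist names.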

This is not about NDMP

Although the new OneFS version 7.1.1 has some new and cool NDMP features like snapshot-based backup and parallel restore (see the OneFS 7.1.1 Release Notes [1]), many people prefer to avoid NDMP altogether. Here are some reasons:

19 June 2014

Isilon as a TSM Backup Target – Analysis of a Real Deployment

In my recent post “Using Isilon as a Backup Target” I explained why Isilon is a perfect target for TSM backups (the same surely applies to other backup and archive solutions as well, but we want to show a real-world example here, and this one was with TSM). Besides the nice and simple administration and the fact that you can get rid of SAN complexity to a large degree, one of the most appealing advantages is that your whole backup process becomes much faster. Why? Because