22 July 2014

Use Isilon's new change-list feature to accelerate your backups

The new OneFS Changelist API

As mentioned in a previous post, OneFS can be completely managed through an API [2]. That's an important integration feature for environments like VMware or for interaction with other solutions. This post is about the changelist API that is typically used by backup software to manage OneFS snapshots. NetWorker 8.2 and other backup software solutions, for example, use this API to manage and control snapshots. As we go forward we can expect that more and more data management solutions will make use of the API. However, you can also use the new changelist function yourself (through the API or the CLI) to create a list of the files that have changed between two snapshots. If your backup software does not yet use the API itself, you can take this list and feed it into your backup client for faster backups.

This is not about NDMP

Although the new OneFS version 7.1.1 has some new and cool NDMP features like snapshot-based backup and parallel restore (see the OneFS 7.1.1 Release Notes [1] for more), many people prefer to avoid NDMP altogether. Here are some reasons:

  • NDMP is not storage agnostic. In general you cannot back up data and restore it to an array from another vendor, or sometimes even to another OS version.
  • NDMP requires admin privileges. That is no problem for backups of large systems but not nice for restores, especially if a user wants to restore a single file.
  • The majority of backup software solutions do not index the files in NDMP backups. In TSM, for example, you can store a Table of Contents (TOC) with the backup, but if you want to restore a single file you have to load the TOC into a temporary table to work with it. This can be very time consuming.
  • NDMP doesn't really support an incremental-forever strategy. That means you have to do a full backup periodically, which is a no-go with large filesystems at petabyte scale that contain billions of files.
There are also some positive aspects of NDMP, but those would justify another article.

 

Avoid the treewalk problem

Assume that we want to perform a traditional backup by mounting the filesystem (NFS or SMB) on a backup server (or backup client) and doing a native backup without NDMP. By doing so we can avoid the above problems that are associated with NDMP. For large environments we have to use multiple backup clients (or servers) to perform the backup in parallel. However, the major issue with backing up large-scale filesystems is not the data movement itself but the treewalk. I have seen filesystems with about 150 million files where a parallel treewalk takes about 6-8 hours. This may still be acceptable, but many clusters have grown to file counts in the billions. That means we have to avoid the treewalk to get back to an acceptable backup duration.

OneFS 7.1.1 now provides a feature that allows you to create changelists. A changelist contains all files that have been changed between two consecutive snapshots. You can feed this changelist into your backup client to back up these files directly, without the backup software having to perform a treewalk. The creation of the changelist is very fast and is done by a OneFS job that is controlled by the job engine. It works in parallel and multi-threaded on all nodes.

Requirements

In OneFS version 7.1.1 the function depends on the existence of a SyncIQ license and (unfortunately) a second cluster that is a target for replication. The target cluster can be similar to or different from the source cluster since OneFS can replicate per directory. One reason for this requirement is that some backup vendors have asked for this implementation to be able to control replication and to index what's in OneFS's snapshots. Another reason (my guess) is that the development team could use a so-called repstate that is automatically produced by SyncIQ policies. A repstate is a system b-tree that contains all Logical Inode Numbers (LINs) that have changed between two snapshots. The changelist job can easily use this repstate to create the changelist. However, the plan is to remove this dependency, and I am quite optimistic that the dependency on SyncIQ will be removed in future versions.

Step by Step - Overview

Assume we have two clusters which we call jaws1 and jaws2. jaws1 contains a directory /ifs/data that we want to back up. Here is the overview of the steps that we need to perform in order to create the changelist:

  1. Run a full backup of /ifs/data (not required to create the changelist but best practice when doing backups)
  2. Create the SyncIQ policy to replicate jaws1:/ifs/data to jaws2:/ifs/data
  3. Run the replication policy. This will create snapshot1 (on the source cluster)
  4. Modify the data in /ifs/data (modify, create, delete files)
  5. Run the replication policy again. This will not only replicate the data but also create snapshot2.
  6. Create the changelist from snapshot1 and snapshot2 (on the source cluster)
  7. Parse and format the changelist and feed it into your backup client
For illustration of the steps see figure 1.

Figure 1: The changelist is created from the repstate. Differences between snapshot1 and snapshot2 appear in the changelist, which is used to feed the backup client (or server) without a treewalk.
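
The CLI part of this sequence (steps 2 to 6) can be sketched as a small Python helper that only assembles the command lines used in the detailed walkthrough of this post. The function names and the idea of returning argument lists are my own assumptions for illustration; nothing here talks to a cluster.

```python
def replication_commands(policy, source_path, target_host, target_path):
    """Build the SyncIQ commands for steps 2, 3 and 5 (hypothetical helper)."""
    return {
        # Step 2: create the policy once ...
        "create": ["isi", "sync", "policies", "create", policy, "sync",
                   source_path, target_host, target_path],
        # ... and enable changelist support on it.
        "enable_changelist": ["isi", "sync", "policies", "modify", policy,
                              "--changelist", "true"],
        # Steps 3 and 5: every run of the policy creates a new snapshot.
        "run": ["isi", "sync", "jobs", "start", policy],
    }


def changelist_command(older_snapid, newer_snapid, retain_repstate=False):
    """Build the ChangelistCreate job command for step 6 (hypothetical helper)."""
    cmd = ["isi", "job", "jobs", "start", "ChangelistCreate",
           "--older-snapid", str(older_snapid),
           "--newer-snapid", str(newer_snapid)]
    if retain_repstate:
        cmd.append("--retain-repstate")
    return cmd


if __name__ == "__main__":
    for name, cmd in replication_commands(
            "data_repl", "/ifs/data", "jaws2", "/ifs/data").items():
        print(name, "->", " ".join(cmd))
    print(" ".join(changelist_command(4, 6)))
```

On a cluster node each list could be handed to subprocess.run(cmd, check=True). Keeping the command construction separate from the execution makes the sequence easy to review before it touches real data.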


For step 7 we'll just list the content of the changelist here. I leave it to creative guys to write a script that converts the changelist into the format required by the backup software. (If I come across some nice gurus who do so and share it, I am happy to post it here.)


Step by Step - Detail

OneFS can be managed through the WebUI [3], the CLI [4] as well as an API [2]. I will use the CLI and the API here for demonstration. To create a fully automated solution, the API might be the best choice.

First let's see what the content of our directory is (three files):


# ls -ali
total 5734
4295229440 drwxrwxr-x +  2 root    wheel      166 Jul 22 08:54 .
         2 drwxrwxrwx    5 root    wheel       65 Jul 22 08:23 ..
4296605699 -rwxrwx--- +  1 stefan  wheel   712506 Jul  9 08:36 OneFS-7.1.1-Backup-and-Recovery-Guide.pdf
4296015893 -rwxrwx--- +  1 stefan  wheel  3416848 Jul  9 10:04 OneFS-7.1.1-CLI-Administration-Guide.pdf
4296015891 -rwxrwx--- +  1 stefan  wheel  1696985 Jul  9 08:19 OneFS-7.1.1-Event-Reference.pdf


 

Create The SyncIQ Policy

Now we create a SyncIQ policy:

# isi sync policies create data_repl sync /ifs/data jaws2 /ifs/data

Here data_repl is the name that we use for the policy (it can be anything else). To enable changelists you need to modify the policy (alternatively you can create the policy with the --changelist true option and omit this command):

# isi sync policies modify data_repl --changelist true

 

Run the replication policy


# isi sync jobs start data_repl

Verify that the first snapshot is created

# isi snapshot snapshots list
ID   Name                                     Path
------------------------------------------------------------
4    SIQ-e929c5b6d171171bbe14ac48b7140579-latest /ifs/data
------------------------------------------------------------
Total: 1


Modify working data

Now you would work as normal on your directory. For demonstration we add some files and delete one:

# ls -ali /ifs/data
total 7295
4295229440 drwxrwxr-x +  2 root    wheel      280 Jul 22 09:47 .
        2 drwxrwxrwx    5 root    wheel       65 Jul 22 08:23 ..
4296015893 -rwxrwx--- +  1 stefan  wheel  3416848 Jul  9 10:04 OneFS-7.1.1-CLI-Administration-Guide.pdf
4296671237 -rwxrwx--- +  1 stefan  wheel  1935409 Jul  9 08:17 OneFS-7.1.1-Event-Reference-for-Technical-Support.pdf
4296015891 -rwxrwx--- +  1 stefan  wheel  1696985 Jul  9 08:19 OneFS-7.1.1-Event-Reference.pdf
4296671236 -rwxrwx--- +  1 stefan  wheel   366104 Jul  9 10:05 OneFS-7.1.1-Migration-Tools-Guide.pdf
4296540181 -rwxrwx--- +  1 stefan  wheel   470604 Jul  9 08:26 OneFS-7.1.1-Release-Notes.pdf



Re-run the Policy

Re-run the policy to replicate the data and create a second snapshot:

# isi sync jobs start data_repl

Now look at the snapshots and remember the IDs:

# isi snapshot snapshots list
ID   Name                                           Path
------------------------------------------------------------------
4    SIQ-Changelist-data_repl-2014-07-22_09-48-44   /ifs/data
6    SIQ-e929c5b6d171171bbe14ac48b7140579-latest    /ifs/data
------------------------------------------------------------------
Total: 2
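
If you want to pick up the two IDs in a script rather than by eye, a minimal parsing sketch could look like this. It assumes that the command prints one snapshot per line with the three columns ID, Name and Path, that snapshot IDs grow over time, and that only the SyncIQ snapshots of the policy exist on the path. The function names are hypothetical.

```python
def parse_snapshot_list(text):
    """Return (id, name, path) tuples from `isi snapshot snapshots list` output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three columns and start with a numeric ID;
        # the header, the separator lines and the "Total:" line do not.
        if len(fields) == 3 and fields[0].isdigit():
            rows.append((int(fields[0]), fields[1], fields[2]))
    return rows


def snapshot_pair(rows, path):
    """Return (older_id, newer_id) for the two most recent snapshots of path."""
    ids = sorted(snap_id for snap_id, _name, snap_path in rows
                 if snap_path == path)
    if len(ids) < 2:
        raise ValueError("need at least two snapshots for " + path)
    return ids[-2], ids[-1]
```

The resulting pair is what the ChangelistCreate job in the next step expects as --older-snapid and --newer-snapid.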


Create the changelist

Now you can create the changelist that contains all files that have been added, changed or removed between the two snapshots. You need to have the snapshot IDs from the previous step at hand:


# isi job jobs start ChangelistCreate --older-snapid 4 --newer-snapid 6 --retain-repstate

The --retain-repstate parameter retains the repstate. This is only required if you want to re-create the changelist later. Be aware that the repstate consumes space, so usually you would omit the flag and let the repstate be deleted.

View the IDs of changelist(s) by running the following command:

# isi_changelist_mod -l
4_6

The output 4_6 contains the IDs of both snapshots. This changelist ID can now be used to view the content of the changelist.

View the content of the changelist

 

# isi_changelist_mod -a 4_6
lin: 000000001  entry_type: metadata  size: 67  reserved: 8
        root_path: /ifs/data
        owning_job_id: 3  num_cl_entries: 5
        root_path_size: 10  root_path_offset: 57
lin: 100040000  entry_type: file  size: 58  reserved: 0
        path:
        type: directory  size: 280  path_size: 1  path_offset: 57
        atime: 1406022442  atimensec: 989987921
        ctime: 1406022442  ctimensec: 989987921
        mtime: 1406022442  mtimensec: 989987921
lin: 100180015  entry_type: file  size: 88  reserved: 0
        path: /OneFS-7.1.1-Release-Notes.pdf
        type: regular  size: 470604  path_size: 31  path_offset: 57
        atime: 1404894388  atimensec: 723925300
        ctime: 1406022454  ctimensec: 43381506
        mtime: 1404894388  mtimensec: 723925300
lin: 100190003  entry_type: file  size: 100  reserved: 0
        path: /OneFS-7.1.1-Backup-and-Recovery-Guide.pdf
        type: (REMOVED)  size: 0  path_size: 43  path_offset: 57
        atime: 0  atimensec: 0
        ctime: 0  ctimensec: 0
        mtime: 0  mtimensec: 0
lin: 1001a0004  entry_type: file  size: 96  reserved: 0
        path: /OneFS-7.1.1-Migration-Tools-Guide.pdf
        type: regular  size: 366104  path_size: 39  path_offset: 57
        atime: 1404900325  atimensec: 285772900
        ctime: 1406022430  ctimensec: 43256370
        mtime: 1404900325  mtimensec: 285772900
lin: 1001a0005  entry_type: file  size: 112  reserved: 0
        path: /OneFS-7.1.1-Event-Reference-for-Technical-Support.pdf
        type: regular  size: 1935409  path_size: 55  path_offset: 57
        atime: 1404893867  atimensec: 731767700
        ctime: 1406022430  ctimensec: 44918531
        mtime: 1404893867  mtimensec: 731767700
Entries found: 6



In the first section (entry_type: metadata) you can see the root directory that has been used for replication and changelist creation (root_path: /ifs/data). The entry with the empty path is that root directory itself. Then you see three entries for files that have been changed or added. Added or changed files are indicated by type: regular. There are more types, see [4,5] for details. Removed files are listed as well and are indicated by type: (REMOVED).
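
As a starting point for scripting against the CLI output, here is a minimal parsing sketch. It relies only on the layout visible in the listing above: each entry starts with a "lin:" line, followed by indented "path:" and "type:" lines. The function name and the dictionary layout are my own assumptions, and all other fields are ignored.

```python
import re


def parse_changelist_text(text):
    """Parse `isi_changelist_mod -a` output into a list of dictionaries."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("lin:"):
            # A "lin:" line opens a new entry.
            match = re.match(r"lin:\s+(\S+)\s+entry_type:\s+(\S+)", line)
            current = None
            if match:
                current = {"lin": match.group(1),
                           "entry_type": match.group(2),
                           "path": None, "type": None}
                entries.append(current)
        elif current is None:
            continue
        elif line.startswith("path:"):
            # Take the rest of the line so that paths with spaces survive.
            current["path"] = line[len("path:"):].strip()
        elif line.startswith("type:"):
            parts = line[len("type:"):].split()
            current["type"] = parts[0] if parts else None
    return entries


def changed_and_removed(entries):
    """Split file entries into changed/added paths and removed paths."""
    changed = [e["path"] for e in entries if e["type"] == "regular"]
    removed = [e["path"] for e in entries if e["type"] == "(REMOVED)"]
    return changed, removed
```

The paths in the changelist are relative to root_path, so a real script would prepend /ifs/data (or whatever the policy's root is) before handing them to a backup client.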

Using the API

For scripting you should preferably use the OneFS API [2]. For demonstration I'll just show here how you would use it for the last step which is to list all available changelists and display the content:

List the available changelists:

# curl https://192.168.245.101:8080/platform/1/snapshot/changelists --insecure --basic --user root:passw0rd
{
"changelists" :
[
{
"id" : "4_6",
"job_id" : 3,
"num_entries" : 5,
"root_path" : "/ifs/data",
"snap1" : 4,
"snap2" : 6,
"status" : "ready"
}
],
"resume" : null,
"total" : 1
}


The --insecure flag tells curl to skip the verification of the server's SSL certificate, so no trusted certificate is required on the cluster.
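
The same request can be issued from Python with the standard library only. This is a minimal sketch: the URL layout, port and basic authentication are taken from the curl examples in this post, while the function names and the verify_tls switch are my own assumptions.

```python
import base64
import json
import ssl
import urllib.request


def changelist_request(host, user, password, changelist_id=None, port=8080):
    """Build a request for the changelist collection or for one changelist's lins."""
    url = "https://{}:{}/platform/1/snapshot/changelists".format(host, port)
    if changelist_id is not None:
        url += "/{}/lins".format(changelist_id)
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def fetch_json(request, verify_tls=True):
    """Send the request and decode the JSON body."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Same effect as curl's --insecure: the certificate is not checked.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


if __name__ == "__main__":
    # Address and credentials are the demo values used in this post.
    request = changelist_request("192.168.245.101", "root", "passw0rd")
    for changelist in fetch_json(request, verify_tls=False)["changelists"]:
        print(changelist["id"], changelist["status"])
```

Passing changelist_id="4_6" to changelist_request addresses the lins resource of that changelist instead of the collection.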

With the listed information we have the ID of the changelist (built from the IDs of the two snapshots) and can now list the content of the changelist using the API:

# curl https://192.168.245.101:8080/platform/1/snapshot/changelists/4_6/lins --insecure --basic --user root:passw0rd
{
"lins" :
[
{
"atime" :
{
"nsec" : 989987921,
"sec" : 1406022442
},
"ctime" :
{
"nsec" : 989987921,
"sec" : 1406022442
},
"id" : "4295229440",
"mtime" :
{
"nsec" : 989987921,
"sec" : 1406022442
},
"path" : "",
"size" : 280,
"type" : "directory"
},
{
"atime" :
{
"nsec" : 723925300,
"sec" : 1404894388
},
"ctime" :
{
"nsec" : 43381506,
"sec" : 1406022454
},
"id" : "4296540181",
"mtime" :
{
"nsec" : 723925300,
"sec" : 1404894388
},
"path" : "/OneFS-7.1.1-Release-Notes.pdf",

"size" : 470604,
"type" : "regular"
},
{
"atime" :
{
"nsec" : 0,
"sec" : 0
},
"ctime" :
{
"nsec" : 0,
"sec" : 0
},
"id" : "4296605699",
"mtime" :
{
"nsec" : 0,
"sec" : 0
},
"path" : "/OneFS-7.1.1-Backup-and-Recovery-Guide.pdf",
"size" : 0,

"type" : "(REMOVED)"
},
{
"atime" :
{
"nsec" : 285772900,
"sec" : 1404900325
},
"ctime" :
{
"nsec" : 43256370,
"sec" : 1406022430
},
"id" : "4296671236",
"mtime" :
{
"nsec" : 285772900,
"sec" : 1404900325
},
"path" : "/OneFS-7.1.1-Migration-Tools-Guide.pdf",
"size" : 366104,
"type" : "regular"

},
{
"atime" :
{
"nsec" : 731767700,
"sec" : 1404893867
},
"ctime" :
{
"nsec" : 44918531,
"sec" : 1406022430
},
"id" : "4296671237",
"mtime" :
{
"nsec" : 731767700,
"sec" : 1404893867
},
"path" : "/OneFS-7.1.1-Event-Reference-for-Technical-Support.pdf",
"size" : 1935409,
"type" : "regular"

}
],
"resume" : null,
"total" : 5
}




As said, the final step of parsing the changelist and converting it into a format that the backup software can understand is omitted here. If you use the described method to script and automate your backup, I'd appreciate it if you shared your work with me so that I can publish it here.
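
As a generic starting point only, the following sketch turns the JSON document returned by the lins request above into two plain lists of absolute paths: files to back up and files that were removed. It assumes that your backup client accepts a text file with one path per line (check your product's documentation), and it deliberately ignores every entry type other than regular and (REMOVED). The function names are hypothetical.

```python
import json


def split_changelist(document, root_path):
    """Return (to_back_up, removed) lists of absolute paths from a lins response."""
    to_back_up, removed = [], []
    for entry in document.get("lins", []):
        path = entry.get("path", "")
        if not path:
            # The entry with the empty path is the root directory itself.
            continue
        full_path = root_path.rstrip("/") + path
        if entry.get("type") == "(REMOVED)":
            removed.append(full_path)
        elif entry.get("type") == "regular":
            to_back_up.append(full_path)
    return to_back_up, removed


def format_file_list(paths):
    """One path per line, each terminated by a newline."""
    return "".join(path + "\n" for path in paths)


if __name__ == "__main__":
    # A shortened version of the response shown above.
    response = json.loads("""
    {"lins": [
      {"path": "", "type": "directory"},
      {"path": "/OneFS-7.1.1-Release-Notes.pdf", "type": "regular"},
      {"path": "/OneFS-7.1.1-Backup-and-Recovery-Guide.pdf", "type": "(REMOVED)"}
    ]}
    """)
    to_back_up, removed = split_changelist(response, "/ifs/data")
    print(format_file_list(to_back_up), end="")
```

The removed list is kept separate because most backup products handle expired files differently from new or changed ones. What to do with it depends entirely on the software in use.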

Cleanup

Please remember to remove snapshots, repstates (only if you used --retain-repstate) and changelists that you don't need anymore:

List changelists: isi_changelist_mod -l
Remove a changelist: isi_changelist_mod -k <repstate>

List snapshots:  isi snapshot snapshots list
Remove snapshots: isi snapshot snapshots delete <snapshot>

 

Summary

Starting with OneFS 7.1.1 you can accelerate your (non-NDMP) backup dramatically using the changelist function. A changelist lists all files that have been changed between two consecutive snapshots, and it can be created very fast with the new ChangelistCreate job. At this stage the feature requires a SyncIQ policy (and therefore a replication target), but this requirement will most probably be lifted in a future release.

 

References

[1] OneFS 7.1.1 Release Notes
[2] OneFS 7.1.1 API Reference
[3] OneFS 7.1.1 Web Administration Guide
[4] OneFS 7.1.1 CLI Administration Guide
[5] OneFS 7.1.1 Online Help

Acknowledgement

Thanks to Dan Knudson, who is one of the famous OneFS developers, for several discussions around changelist and his input to this post.

Updates

 [31.07.2012] Changed introduction slightly to mention that several backup solutions are already using the new changelist function.

2 comments:

  1. There is so much good informations that is otherwise not published in you blog! Thank you for sharing Stefan.

  2. FWIW, you can use two snapshotiq snapshots (same path) and changelistmod them. Same with ndmp snapshots, especially if using fast-incremental feature.
