NetWorker and Avamar Integration Overview

NetWorker 7.4.1 introduces a new component to the NetWorker family: the dedupe node. This new type of storage node is used only for dedupe operations and leverages Avamar's data de-duplication technology.

The NetWorker Management Console (NMC) interface has been updated to support data de-duplication capabilities using NetWorker workflows.

De-duplication is performed on the NetWorker client. The dedupe node is a combination of software and optional hardware, such as the Avamar Data Store available from EMC.

Terminology

NetWorker Name | Avamar Equivalent | Function
De-dupe client | avtar client | Contains the data; de-duplication is done on the avtar client
De-dupe node | Avamar/Axion server | Stores de-duplicated data
Storage node | (none) | Stores the hashid for each dedupe file (adv_file device)
Media database | (none) | Stores the snapup-id for the data
nsravtar binary (nsravamar on Linux) | avtar binary | Provides the backup/recover de-dupe logic; used in place of the avtar binary for NetWorker backups, saveset creation and deletion

Data de-duplication is the process of saving only one instance of common data. This concept is also called a Single-Instance Store and makes use of a snapup process to locate, capture and store unique data. The first snapup takes longer to perform than a traditional backup, but subsequent snapups are much faster.
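The single-instance idea can be illustrated with a small Python sketch. This is not Avamar's algorithm: the fixed 4-byte chunk size, the use of SHA-1, and the SingleInstanceStore class are all assumptions made only to keep the example short. It shows why a second snapup of mostly unchanged data writes very little.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny (assumption, for readability only)


class SingleInstanceStore:
    """Toy store that keeps one copy of each unique chunk, keyed by its hash."""

    def __init__(self):
        self.chunks = {}  # hash id -> chunk bytes

    def snapup(self, data: bytes):
        """Store data; return (ordered hash ids, number of new chunks written)."""
        hash_ids = []
        new_chunks = 0
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            hash_id = hashlib.sha1(chunk).hexdigest()
            if hash_id not in self.chunks:  # only unseen chunks are stored
                self.chunks[hash_id] = chunk
                new_chunks += 1
            hash_ids.append(hash_id)
        return hash_ids, new_chunks

    def restore(self, hash_ids):
        """Rebuild the original data from its ordered list of hash ids."""
        return b"".join(self.chunks[h] for h in hash_ids)


store = SingleInstanceStore()
first_ids, first_new = store.snapup(b"AAAABBBBCCCCAAAA")
second_ids, second_new = store.snapup(b"AAAABBBBCCCCDDDD")
print(first_new, second_new)
```

The ordered list of hash ids plays the role of the metadata needed for a recover: without it, the stored chunks cannot be reassembled.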

Data de-duplication is done on the de-dupe client before data is sent to the de-dupe node. This reduces the media, network bandwidth and time required to perform backups, because redundant data is neither transmitted over the network nor stored. De-dupe functionality is configured and managed via the NetWorker Management Console. No add-on software is required to enable de-duplication, because it is included in the standard NetWorker server and client installation packages. No additional licensing is required on the NetWorker side for backing up de-duplication clients.
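To make the bandwidth saving concrete, the following Python sketch models a client that asks the node which chunks it is missing and sends only those. The DedupeNode class, the client_backup function and the 4-byte chunk size are hypothetical names and values chosen for illustration; they do not describe the real avtar protocol.

```python
import hashlib


class DedupeNode:
    """Toy de-dupe node: remembers every chunk it has received, by hash id."""

    def __init__(self):
        self.stored = {}  # hash id -> chunk bytes

    def missing(self, hash_ids):
        """Return the hash ids the node does not hold yet."""
        return [h for h in hash_ids if h not in self.stored]

    def receive(self, chunks):
        self.stored.update(chunks)


def client_backup(data: bytes, node: DedupeNode, size: int = 4) -> int:
    """Hash chunks on the client and send only the chunks the node lacks.

    Returns the number of chunk bytes that crossed the (simulated) network.
    """
    local = {}
    for i in range(0, len(data), size):
        chunk = data[i:i + size]
        local[hashlib.sha1(chunk).hexdigest()] = chunk
    wanted = node.missing(list(local))        # only hash ids are exchanged here
    payload = {h: local[h] for h in wanted}   # chunk data for unknown hashes only
    node.receive(payload)
    return sum(len(c) for c in payload.values())


node = DedupeNode()
sent_first = client_backup(b"AAAABBBBCCCCDDDD", node)
sent_second = client_backup(b"AAAABBBBCCCCEEEE", node)
print(sent_first, sent_second)
```

In this toy run the second backup transmits a quarter of the data, because three of its four chunks are already on the node.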

New binaries

Four new binaries are included with a client install:

  • nsravtar
  • libavctl
  • nsravamar - Linux only
  • nsrmccli (symlink to native mccli) - Linux only

Supported platforms

The following platforms are supported as deduplication clients:

  • AIX
  • HP-UX
  • Linux
  • Mac OS X
  • Solaris
  • Windows

New Directories

The de-dupe client creates new directories:

  • /nsr/dedup
  • /nsr/dedup/cache - contains the avtar client-side cache
  • /nsr/dedup/logs - contains the avtar client-side logs
  • On the Avamar server/de-dupe node the nsravamar.raw file is located under /nsr/logs.

Tips and Recommendations

  • The savegrp command is still used, and a new ASM called AvASM is applied to de-dupe enabled clients
  • De-Dupe clients should be in a group separate from non-dedupe clients
  • It is recommended that de-dupe clients are backed up to an adv_file device
  • For existing clients, the first de-dupe backup must be a full backup
  • If any file on the client is not de-duplicated, the entire backup is registered as a standard NetWorker saveset
  • The client parallelism of a de-dupe client should not exceed 4
  • During a backup, the data is stored on the de-dupe node (Avamar server) and the hashids are stored on the NetWorker storage node
  • To query de-dupe savesets in the media database, use: mminfo -S -q de-dupe
  • The scanner command cannot be used to restore de-dupe data directly from tape, because the hashes required for the recover are stored on the NetWorker storage node, not on the tape
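Several of the recommendations above are mechanical enough to check in a script. The Python sketch below assumes client settings have already been exported into plain dictionaries; the keys (name, group, dedupe, parallelism, device_type) are hypothetical and are not NetWorker resource attribute names.

```python
MAX_DEDUPE_PARALLELISM = 4  # limit stated in the recommendations above


def check_clients(clients):
    """Return warning strings for a list of client dicts (hypothetical keys)."""
    warnings = []
    groups = {}  # group name -> set of dedupe flags seen in that group
    for c in clients:
        groups.setdefault(c["group"], set()).add(c["dedupe"])
        if not c["dedupe"]:
            continue  # the remaining checks apply to de-dupe clients only
        if c["parallelism"] > MAX_DEDUPE_PARALLELISM:
            warnings.append(
                f"{c['name']}: parallelism {c['parallelism']} "
                f"exceeds {MAX_DEDUPE_PARALLELISM}"
            )
        if c["device_type"] != "adv_file":
            warnings.append(
                f"{c['name']}: device type {c['device_type']} is not adv_file"
            )
    for group, kinds in sorted(groups.items()):
        if len(kinds) > 1:  # both True and False were seen in this group
            warnings.append(f"group {group}: mixes de-dupe and non-dedupe clients")
    return warnings


clients = [
    {"name": "alpha", "group": "dedupe_grp", "dedupe": True,
     "parallelism": 4, "device_type": "adv_file"},
    {"name": "beta", "group": "default", "dedupe": True,
     "parallelism": 8, "device_type": "adv_file"},
    {"name": "gamma", "group": "default", "dedupe": False,
     "parallelism": 12, "device_type": "tape"},
]
result = check_clients(clients)
for line in result:
    print(line)
```

Here beta is flagged for its parallelism and the default group is flagged for mixing client types; gamma is not checked against the de-dupe limits because it is a traditional client.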

Deletion of de-dupe savesets

De-dupe savesets are deleted based on the retention policy. Deletion can also be invoked with nsrmm, or occurs when a volume is recycled.

Label information for the de-dupe node is created in the RAP database and contains the information needed to delete the snapup on the Axion server. nsrd runs the delete process every 6 hours to delete the snapup and clean up the RAP entries; nsrd also cleans up the RAP resource at startup. The status of a deletion can be viewed through the saveset properties in NMC.

Replication

The replication node should be configured as a NetWorker de-dupe node.

Replication should be scheduled separately on the de-dupe node as a one-time operation. At the time of backup, the replication node is registered in the saveset.

At the time of restore, if the primary Axion server is down, the restore is directed to the replication node.
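The redirection rule can be expressed as a short Python sketch. The function name, the host names and the is_up callback are hypothetical; how NetWorker actually probes the primary server is not described here.

```python
def pick_restore_node(primary, replication, is_up):
    """Return the node a restore should read from.

    primary, replication: host names (hypothetical values in this example).
    is_up: callable taking a host name and returning True if it is reachable.
    """
    if is_up(primary):
        return primary
    if replication is not None and is_up(replication):
        return replication  # fall back only when the primary is down
    raise RuntimeError("neither the primary nor the replication node is reachable")


# Simulate an outage of the primary node.
target = pick_restore_node(
    "primary-node", "replica-node", lambda host: host != "primary-node"
)
print(target)
```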

Cloning or staging the metadata (hash files) to another adv_file device on another node is recommended.

Limitations

  • Directives that modify data are ignored for de-dupe clients (that is, compressasm and AES)
  • NetWorker Module client instances do not support de-duplication with 7.4 SP1
  • Client parallelism for a de-dupe client should be no higher than 4
  • Only metadata (hash files) can be staged and cloned. To make a second copy of the data via NetWorker (and not via replication), a second, traditional client would need to be configured for the de-dupe client and backed up to tape.
  • Backups cannot be performed while the Avamar server is in a scheduled read-only state. Overlapping these tasks can adversely affect the Avamar server and should be avoided.
  • Snapup deletion does not occur automatically on the replication server, only on the primary Axion server.
  • De-dupe clients are I18N compliant but not localized: non-English locales can be backed up, but message logs and screens for de-dupe remain in English only.

References

Refer to EMC knowledge base article esg92533 for more information.