ZFS file systems are created within pools. Data sets allow more granular control over some elements of your file systems. Data set boundaries follow directories: any property set at that level flows down to the subdirectories below until a new data set is defined lower down. By default in Solaris 11, each user's home directory is defined by its own data set.
Listing data sets
Data sets in ZFS are controlled with the zfs command (/usr/sbin/zfs). The simplest use of the command, the list sub-command, displays the available ZFS data sets:
zfs list
This lists all data sets. Passing a data set name as an argument to the list sub-command lists just that data set:
zfs list rpool/data1
A data set name is prefixed by the pool it is created in: <poolname>/<dataset name>
Data sets are used to represent both the current and previous versions of the file system. Snapshots and clones are contained in their own data sets. Note that in the first zfs list output, rpool/nozone and rpool/solaris are both Boot Environment clones: solaris is the default and nozone is a clone taken, in this case, before a zone was installed. Clones and snapshots are covered separately.
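Because the pool is always the first component of the name, a script can separate the two parts with plain shell parameter expansion. This is only a minimal sketch using the rpool/data1 name from above; the variable names are illustrative.

```shell
# Split a data set name into its pool and the path below the pool.
dataset="rpool/data1"     # example name taken from the text
pool="${dataset%%/*}"     # strip everything from the first slash onward
path="${dataset#*/}"      # strip everything up to and including the first slash
echo "pool=$pool path=$path"
```

For a nested name such as rpool/export/home the same expansions give the pool rpool and the path export/home.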
Creating ZFS Data sets
The create sub-command can be used to create a new data set. In its simplest form we only need the data set name:
zfs create rpool/d1
zfs list rpool/d1
We can see that if we do not use the mountpoint option, the data set is automatically mounted at a path that mirrors its position in the pool. In this case the directory /rpool/d1 is created and the data set is mounted on that new directory. Should you want more control over the mountpoint, you can specify your own location. If the directory does not exist it will be created; if it does exist it must be empty. If required, all parent directories of the mountpoint are created as if you had used the mkdir -p command. ZFS manages mounting of the data set, and no additional entries need to be added to /etc/vfstab.
zfs create -o mountpoint=/data2 rpool/d2
zfs list rpool/d2
Options
We can see the use of -o for the mountpoint, but other options exist. If we want to use more than one option, we specify -o for each option that we want set. Options exist, among others, for:
- mountpoint
- quota
- userquota
- compression
- atime
- exec
- dedup
These options include some that can be set when mounting traditional file systems and others that are new to ZFS, such as dedup.
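To illustrate repeating -o once per option, the sketch below builds a zfs create command line and prints it instead of running it, so it can be reviewed first. The helper name show_create, the data set rpool/d3 and the mountpoint /data3 are hypothetical and used only for illustration.

```shell
# Print (rather than run) a zfs create command with one -o per property.
# Usage: show_create DATASET PROP=VALUE [PROP=VALUE ...]
show_create() {
    dataset="$1"
    shift
    cmd="zfs create"
    for prop in "$@"; do
        cmd="$cmd -o $prop"    # each property gets its own -o
    done
    echo "$cmd $dataset"
}

show_create rpool/d3 mountpoint=/data3 compression=on atime=off
```

The printed line can then be run as-is by a suitably privileged user once it looks right.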
Setting compression on a data set
Setting the compression option for a data set ensures data is compressed as it is stored, reducing the need for additional storage. The possible values for compression include on and off; I guess that you may have been able to work those options out:
- compression=on
- compression=off
- compression=zle
- compression=lzjb
- compression=gzip-1 through to gzip-9
On Solaris 11, compression=on selects the lzjb algorithm. The bare value compression=gzip is equivalent to the mid-range gzip setting of gzip-6.
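A script that takes a compression value from a user may want to check it before passing it to zfs. The following is a sketch only: the function name is hypothetical, and it accepts just the values listed above plus the bare value gzip, so the installed ZFS version may well support values this check rejects.

```shell
# Return success if the argument is one of the compression values
# named in this article (plus bare "gzip"); this is not an exhaustive list.
is_listed_compression() {
    case "$1" in
        on|off|zle|lzjb|gzip|gzip-[1-9]) return 0 ;;
        *) return 1 ;;
    esac
}

if is_listed_compression "gzip-5"; then
    echo "gzip-5 is a listed value"
fi
```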
Reading data set options
The get sub-command can be used to retrieve information about options that have been set on a data set:
zfs get all rpool/data1
To read just a single option, name it in place of all:
zfs get compression rpool/data1
As compression has not yet been set on the data set or its parents, we can see that it is unset: the source shows default, so the option is neither set at this level nor inherited.
Setting options post creation
We can set and change options for ZFS data sets after creation as well as during creation. To add compression now we could use:
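When the value is needed inside a script, the header and extra columns get in the way. zfs get accepts -H to suppress the header and -o to choose which columns to print, so a small wrapper can return only the value. The wrapper name get_prop is hypothetical; this is a sketch rather than a recommended interface.

```shell
# Print only the value of one property of one data set.
# -H drops the header line; -o value limits output to the value column.
# Usage: get_prop PROPERTY DATASET
get_prop() {
    zfs get -H -o value "$1" "$2"
}

# Example call (needs a system with ZFS and the data set from the text):
#   get_prop compression rpool/data1
```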
zfs set compression=gzip-5 rpool/data1
Using the get sub-command we can now see that the source for the option is local rather than default. Local simply means that the setting is made at this level rather than being inherited or left at the default.
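The source column can also be interpreted in a script. The sketch below assumes the source text has already been read, for example with zfs get -H -o source compression rpool/data1, and turns it into a short description. The function name and the wording of the descriptions are illustrative only.

```shell
# Describe where a property value comes from, given the text of
# the SOURCE column reported by zfs get.
describe_source() {
    case "$1" in
        local)              echo "set directly on this data set" ;;
        default)            echo "never set; using the built-in default" ;;
        "inherited from "*) echo "set on ancestor ${1#inherited from }" ;;
        *)                  echo "other source: $1" ;;
    esac
}

describe_source "local"
describe_source "inherited from rpool"
```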
Summary
ZFS data sets allow granular control of elements of the file system through configuration options such as compression and data deduplication.