About ZFS datasets

ZFS, the file system used by TeraCLOUD, organizes storage into datasets, which nest like directories.

Each dataset can have its own properties, so the dataset is the unit of control for functions such as reading values like capacity, creating snapshots, rollback, cloning, quotas, and encryption.

In addition, properties are inherited from parent to child: a child dataset takes its basic settings from its parent, and figures such as used capacity include all descendant datasets.

Datasets are managed by names that look like paths:

DATASET name
someuser
someuser/backup
someuser/backup/data.ZsXw
someuser/backup/data.WtPO
someuser/files
someuser/someappdata
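As a rough illustration of the inheritance and capacity rules described above, here is a minimal Python sketch. It is a toy model and not how ZFS is implemented. The property names (`compression`, `atime`), their values, and the byte counts are hypothetical; only the dataset names come from the list above.

```python
# Hypothetical per-dataset data. LOCAL_PROPS holds only the properties set
# directly on each dataset; OWN_BYTES holds the space used by the dataset's
# own files. All values are illustrative.
LOCAL_PROPS = {
    "someuser": {"compression": "on", "atime": "off"},
    "someuser/backup": {"compression": "off"},
    "someuser/backup/data.ZsXw": {},
    "someuser/files": {},
}
OWN_BYTES = {
    "someuser": 10,
    "someuser/backup": 5,
    "someuser/backup/data.ZsXw": 20,
    "someuser/files": 100,
}


def effective_property(dataset, prop):
    """Walk up from the dataset through its ancestors until prop is set."""
    name = dataset
    while True:
        value = LOCAL_PROPS.get(name, {}).get(prop)
        if value is not None:
            return value
        if "/" not in name:
            return None  # reached the top-level dataset without finding it
        name = name.rsplit("/", 1)[0]  # move to the parent dataset name


def used_capacity(dataset):
    """Sum the dataset's own usage and that of every descendant."""
    prefix = dataset + "/"
    return sum(
        size
        for name, size in OWN_BYTES.items()
        if name == dataset or name.startswith(prefix)
    )
```

In this model, `someuser/backup/data.ZsXw` sets nothing itself and so takes `compression` from `someuser/backup`, while the used capacity of `someuser` counts every dataset beneath it.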

Each dataset is mounted on a directory path as follows.

DATASET name                 MOUNTPOINT
someuser                     /
someuser/backup              /backup
someuser/backup/data.ZsXw    /backup/data.ZsXw
someuser/backup/data.WtPO    /backup/data.WtPO
someuser/files               /files
someuser/someappdata         /someappdata

The mount point is the path at which a dataset is placed.

If there is a directory named /foo in the dataset structure above, it is just a directory inside the dataset named someuser; there is no dataset called someuser/foo (although you could create one if you wanted to). If there is a file /foo/xyz.jpg, it is a file stored in the dataset called someuser.
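The rule above (a file belongs to the dataset whose mount point is its deepest ancestor) can be sketched in Python as follows. This is only an illustration: the function name `owning_dataset` is hypothetical, and the mount table is copied from the example in this document.

```python
from pathlib import PurePosixPath

# Mount table from the example above (dataset name -> mount point).
MOUNTS = {
    "someuser": "/",
    "someuser/backup": "/backup",
    "someuser/backup/data.ZsXw": "/backup/data.ZsXw",
    "someuser/backup/data.WtPO": "/backup/data.WtPO",
    "someuser/files": "/files",
    "someuser/someappdata": "/someappdata",
}


def owning_dataset(path, mounts=MOUNTS):
    """Return the dataset whose mount point is the deepest ancestor of path."""
    target = PurePosixPath(path)
    best_name, best_depth = None, -1
    for name, mountpoint in mounts.items():
        mp = PurePosixPath(mountpoint)
        # The dataset holds the path if its mount point is the path itself
        # or one of the path's ancestors.
        if target == mp or mp in target.parents:
            depth = len(mp.parts)
            if depth > best_depth:
                best_name, best_depth = name, depth
    if best_name is None:
        raise ValueError(f"{path} is not under any mount point")
    return best_name
```

Here `owning_dataset("/foo/xyz.jpg")` returns `someuser`, because no dataset is mounted at /foo, while a path under /files resolves to `someuser/files`.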

Creating a dataset is more expensive than creating a directory, so a dataset is not created for every directory.

Reference: Oracle's ZFS documentation

What is a snapshot?

A snapshot preserves an image of the file system exactly as it is at that moment. It can be taken almost instantaneously, and when data is written after the snapshot is created, the file system only needs to record the differences. The state at that moment can therefore be kept at very low cost.

Snapshots are taken per dataset. If you specify the recursive option, snapshots of the descendant datasets are taken at the same time.

For reference, each dataset has a readable directory for every snapshot, named after the snapshot, under .zfs/snapshot/ directly below the dataset's mount point. A snapshot only covers the files in the dataset itself, so in the example above, a snapshot of the parent someuser does not include the files of someuser/files.

Therefore, snapshots of someuser/files must be taken individually or recursively; once taken, they appear under /files/.zfs/snapshot/.
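To make the snapshot directory layout concrete, the following Python sketch builds the path at which a file can be read inside a snapshot. The helper name `snapshot_path` and the snapshot name `monday` are hypothetical. The caller is assumed to pass the mount point of the dataset that actually contains the file, since a snapshot only covers its own dataset.

```python
from pathlib import PurePosixPath


def snapshot_path(mountpoint, snapshot, file_path):
    """Return where file_path appears inside the named snapshot.

    mountpoint must be the mount point of the dataset holding file_path.
    Raises ValueError if file_path is not under mountpoint.
    """
    mp = PurePosixPath(mountpoint)
    relative = PurePosixPath(file_path).relative_to(mp)
    return str(mp / ".zfs" / "snapshot" / snapshot / relative)
```

For example, a file /files/docs/a.txt in a snapshot named `monday` of someuser/files would be read from /files/.zfs/snapshot/monday/docs/a.txt.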

Note that a snapshot only preserves the state at the time it was taken; it is not a backup. It is, however, very useful for recovering from logical mistakes.

What is a file system?

In ZFS, a file system is a dataset that can be mounted on a directory path.

In other words, "dataset" is the generic term covering file systems, snapshots, and the like.

What is a rollback?

A rollback rewinds the file system to the point of a snapshot. It discards everything written after the snapshot and restores the file system to the state it had when the snapshot was taken.

A rollback also destroys every snapshot taken after the target snapshot, so in principle a rollback cannot be undone.
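The effect of a rollback on later snapshots can be illustrated with a small in-memory model. This is a toy sketch only: the class `ToyDataset` is hypothetical, and it copies the whole file table for each snapshot, whereas ZFS records only differences.

```python
import copy


class ToyDataset:
    """Toy model of one dataset: a file table plus an ordered snapshot list."""

    def __init__(self):
        self.files = {}      # file name -> contents
        self.snapshots = []  # (name, frozen copy of files), oldest first

    def snapshot(self, name):
        """Record the current state under the given snapshot name."""
        self.snapshots.append((name, copy.deepcopy(self.files)))

    def rollback(self, name):
        """Rewind to the named snapshot and drop every later snapshot."""
        names = [n for n, _ in self.snapshots]
        index = names.index(name)  # raises ValueError for an unknown name
        self.files = copy.deepcopy(self.snapshots[index][1])
        # Snapshots taken after the target are lost, which is why the
        # rollback cannot be undone.
        del self.snapshots[index + 1:]
```

After taking snapshots `s1` and `s2` and rolling back to `s1`, the model holds only the files that existed at `s1`, and `s2` is gone.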

What is a clone?

A clone is a new file system created from a snapshot.

Because only the differences from the snapshot are stored, clones that share most of their data put little additional pressure on capacity.

What is file system level encryption?

This function encrypts data with AES or a similar cipher on a per-file-system basis.

An encrypted file system requires a password or key when it is mounted. If you lose the password or key, the data stays encrypted and cannot be recovered.

Datasets in TeraCLOUD

TeraCLOUD can be described as a service that exposes part of the functionality of ZFS through a REST API and uses HTTP (WebDAV) as its file transfer protocol.

Each TeraCLOUD user is given one dataset when the account is created and can have multiple datasets beneath it. These relate to WebDAV and to the area visible from the TeraCLOUD web interface as follows.

DATASET name         MOUNTPOINT            URL
(ROOT)               /                     none
backup               /backup               private
backup/data.ZsXw     /backup/data.ZsXw     private
backup/data.WtPO     /backup/data.WtPO     private
files                /files                /dav/  ← area visible in the web UI
someappdata          /someappdata          none
someappdata2         /someappdata2         none

Everything under the user's dataset (ROOT) is owned by the user, and the user's used capacity means the used capacity of (ROOT), which, as described above, includes the capacity of all descendant datasets. Note that (ROOT) is only a label used here for convenience; there is no dataset actually named (ROOT).

In addition, all of the datasets can be managed with the user's ID and password, and up to 32 datasets can be created at any location through the REST API.

In TeraCLOUD, a file-system-type dataset is always mounted at the same location as its dataset name, starting from the root. In other words, when you create a dataset named someappdata, it is mounted at the path /someappdata, and when you create a dataset called someappdata/foo under it, it is mounted at the path /someappdata/foo.

However, an attempt to create a dataset called files/foo fails if the user has already created a directory named foo under /files and placed files in it.
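The naming and mounting rules above can be summarized in a short Python sketch. This is not the TeraCLOUD REST API: the function names, the error messages, and the way the limit of 32 is counted (against all of the user's existing datasets) are assumptions made for illustration.

```python
MAX_DATASETS = 32  # limit quoted in the text; how it is counted is an assumption


def mountpoint_for(dataset_name):
    """A dataset is mounted at the same location as its name, from the root."""
    return "/" + dataset_name.strip("/")


def check_create(dataset_name, existing_datasets, occupied_dirs):
    """Return the mount path, or raise ValueError if creation would fail.

    existing_datasets: names of the datasets the user already has.
    occupied_dirs: directory paths that already exist and contain files.
    """
    if dataset_name in existing_datasets:
        raise ValueError(f"dataset {dataset_name} already exists")
    if len(existing_datasets) >= MAX_DATASETS:
        raise ValueError("dataset limit reached")
    path = mountpoint_for(dataset_name)
    if path in occupied_dirs:
        raise ValueError(f"{path} is already a directory containing files")
    return path
```

With `occupied_dirs={"/files/foo"}`, creating `someappdata/foo` succeeds and mounts at /someappdata/foo, while creating `files/foo` is rejected.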

/dav/: the range that WebDAV can access

As mentioned above, the dataset name uniquely determines the mount point, but there is no such fixed relationship between a dataset and its published URL.

Normally, the area that the user can view through the web interface and WebDAV is accessed under the path "/dav/".

URL layer                          /dav/
URL path conversion rule (fixed)   /dav/ → /dav/
Directory path layer               /dav/
Mapper                             /dav/ → /files/
                                   /files
Mount path rule (fixed)            /files → files
Dataset layer                      files

For users created after April 2015, the path /dav/ is mapped to the path /files/, and the dataset files is mounted on the path /files/. Users created before that have no files dataset; instead, a directory called /files/ is created in the root dataset.
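The three layers described above (URL, directory path, dataset) can be traced with the following Python sketch. It is an illustrative model only: the function `resolve` and its parameters are hypothetical, the mapper is represented as a list of prefix pairs, and `(ROOT)` is the label this document uses for the user's root dataset.

```python
def resolve(url_path, datasets, mapper=(("/dav/", "/files/"),)):
    """Map a URL path to (directory path, dataset name).

    datasets: names of the user's datasets, for example {"files", "backup"}.
    mapper: (source prefix, destination prefix) pairs applied in order.
    """
    # URL layer -> directory path layer: the fixed rule /dav/ -> /dav/
    # leaves the path unchanged.
    directory_path = url_path
    # Mapper: rewrite the first matching prefix, here /dav/ -> /files/.
    for src, dst in mapper:
        if directory_path.startswith(src):
            directory_path = dst + directory_path[len(src):]
            break
    else:
        raise ValueError(f"{url_path} is not a published path")
    # Mount path rule: a dataset named X is mounted at /X, so look for the
    # longest leading part of the path that names a dataset.
    parts = directory_path.strip("/").split("/")
    for depth in range(len(parts), 0, -1):
        candidate = "/".join(parts[:depth])
        if candidate in datasets:
            return directory_path, candidate
    # No dataset matches: the path is a plain directory in the root dataset.
    return directory_path, "(ROOT)"
```

For a newer account with a files dataset, `resolve("/dav/photos/a.jpg", {"files"})` lands in the dataset `files`. For an older account without one, the same URL resolves to the directory /files/ inside `(ROOT)`.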

The mapper above is scheduled to become switchable through a mapper API that will be released in the future. It will then be possible to map a dataset created with the dataset API freely to /dav/. However, users may then have trouble reaching files that were uploaded with a specific application, so this should be done with care.

Some applications also need to keep files in an area that the user cannot reach through normal operations. In that case, a dataset other than files must be created.

However, at present there are no plans to expose individual paths under anything other than /dav/.