1 Basic concepts: metadata, extended attributes
For most file system types, such as ext3, ext4, and xfs, a file has two kinds of data: its content, and its metadata, which describes the file's attributes. The most common metadata is the file permissions; many other attributes have been developed for special purposes, such as ACLs and SELinux contexts. Attributes other than permissions, size, and timestamps are not always present and are less standardized, so they are called "extended attributes".
2 Examples
2.1 Set attributes using setfattr
# An attribute is a key/value pair
$ setfattr --name=user.id --value="s001" 1.c
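The name must carry a namespace prefix. For user-defined attributes that prefix is "user."; without a namespace prefix the command fails.
The name is a string, while the value can be given as a string, as hexadecimal (prefixed with 0x), or as base64 (prefixed with 0s).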
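The hex form of a value can be checked against its string form. The sketch below is only an illustration: it assumes the attr tools are installed and that the current directory is on a file system that allows "user." attributes, and demo.txt is a made-up scratch file. od is used here just to derive the hex digits of "s001".

```shell
# Hex digits of the string "s001", computed with od.
hex=$(printf '%s' 's001' | od -An -tx1 | tr -d ' \n')
echo "0x$hex"    # prints 0x73303031

# Scratch file; the name is only for illustration.
: > demo.txt

# Store the value in hex form, then read it back as text and as hex.
# The guard skips this part when setfattr is missing or the
# file system refuses user attributes.
if command -v setfattr >/dev/null 2>&1 &&
   setfattr -n user.id -v "0x$hex" demo.txt 2>/dev/null; then
    getfattr -n user.id -e text demo.txt
    getfattr -n user.id -e hex demo.txt
fi
rm -f demo.txt
```

The -e option of getfattr only changes how the value is displayed; the stored bytes are the same in both cases.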
2.2 List user-defined attributes using getfattr
# use -d to dump all attributes with name like "user.*"
$ getfattr -d 1.c
# file: 1.c
user.id="s001"
2.3 List all attributes
As mentioned above, SELinux and ACLs use extended attributes to label files, but no such attributes appeared in the last example. That is because, by default, "getfattr -d" prints only attributes whose names start with "user.". To list all attributes, add the "-m" option.
# use -m to dump all attributes
$ getfattr -d -m ".*" 1.c
# file: 1.c
security.selinux="unconfined_u:object_r:user_home_t:s0"
user.id="s001"
The -m option accepts a regular expression that is matched against attribute names. The special pattern "-" selects all attributes, the same as ".*".
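A few more ways to select attributes, as a hedged sketch. It assumes the attr tools and a file system that allows "user." attributes; demo.txt and the attribute user.owner are made up for the example. The -n and --only-values options belong to getfattr but are not used elsewhere in this post.

```shell
: > demo.txt
# The guard skips the demo when setfattr is missing or the
# file system refuses user attributes.
if command -v setfattr >/dev/null 2>&1 &&
   setfattr -n user.id -v s001 demo.txt 2>/dev/null; then
    setfattr -n user.owner -v alice demo.txt

    # Only attributes whose name matches the regular expression.
    getfattr -d -m '^user\.o' demo.txt

    # "-" selects every attribute, like ".*".
    getfattr -d -m - demo.txt

    # A single attribute by name, raw value only.
    id=$(getfattr -n user.id --only-values demo.txt)
    echo "$id"
fi
rm -f demo.txt
```

The --only-values form is convenient in scripts because it avoids parsing the "# file:" header and the quoting around the value.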
3 Further thought: Where should attributes reside
Attributes can be stored as metadata, as discussed above. But they can also be embedded in the file content. Many file formats support custom attributes. For example, a JPEG file can contain GPS coordinates, the camera name, and many other attributes, and it also allows user-defined attributes.
Let's list the pros and cons of each approach.
3.1 Attributes as metadata
Pros:
- It doesn't touch the file's content, so it's transparent to end-users.
- Reading metadata is usually much faster than reading and parsing the file content.
Cons:
- Not all file systems support user-defined metadata.
- It's not portable across file system types. When a file is copied from an ext4 file system to an NTFS file system, for example, user-defined metadata may be lost.
- Usually, a file system's metadata has a limited size.
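User-defined attributes can get lost even without changing file systems, because a plain copy does not carry them over. The sketch below assumes GNU cp (for --preserve=xattr), the attr tools, and a file system that allows "user." attributes; all file names are made up for the example.

```shell
: > src.txt
# The guard skips the demo when setfattr is missing or the
# file system refuses user attributes.
if command -v setfattr >/dev/null 2>&1 &&
   setfattr -n user.id -v s001 src.txt 2>/dev/null; then
    # Plain copy: extended attributes are not carried over.
    cp src.txt plain.txt
    # GNU cp: explicitly ask for extended attributes to be copied too.
    cp --preserve=xattr src.txt kept.txt 2>/dev/null || true

    plain=$(getfattr -d plain.txt 2>/dev/null)
    kept=$(getfattr -n user.id --only-values kept.txt 2>/dev/null)
    echo "plain copy:     ${plain:-<no user attributes>}"
    echo "preserved copy: user.id=${kept:-<not preserved>}"
fi
rm -f src.txt plain.txt kept.txt
```

Whether the preserved copy keeps its attributes still depends on the destination file system, which is exactly the portability problem listed above.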
3.2 Attributes embedded into file content
Pros:
- It doesn't depend on the underlying file system, so it is portable across file system types.
- The attribute can be of any size.
Cons:
- It modifies the file content, so it may break software that processes the file.
- To get or set attributes, the file content must be read or written, which is much slower than operating on metadata.
- It requires a special file format, so the file cannot be plain text, and generic text tools cannot easily be used on it.
3.3 Any other way?
Yes. Attributes can also be saved in a separate file rather than mixed into the original file's content. Git's implementation is worth studying for an in-depth look at this approach.
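As a rough illustration of the separate-file idea, the sketch below keeps attributes in a sidecar file next to the original. The ".attrs" suffix, the key=value line format, and the function names attr_set and attr_get are all made up for this example. This is not how Git stores its data, only a minimal stand-in for the general approach, and it assumes keys and values contain no newlines or backslashes.

```shell
# attr_set FILE KEY VALUE
# Store KEY=VALUE in FILE.attrs, replacing any previous value of KEY.
attr_set() {
    side="$1.attrs"
    tmp="$side.tmp"
    if [ -f "$side" ]; then
        # Keep every line that does not start with "KEY=".
        awk -v k="$2" 'index($0, k "=") != 1' "$side" > "$tmp"
    else
        : > "$tmp"
    fi
    printf '%s=%s\n' "$2" "$3" >> "$tmp"
    mv "$tmp" "$side"
}

# attr_get FILE KEY
# Print the value of KEY; return non-zero if KEY is not set.
attr_get() {
    side="$1.attrs"
    [ -f "$side" ] || return 1
    awk -v k="$2" '
        index($0, k "=") == 1 { print substr($0, length(k) + 2); found = 1 }
        END { exit !found }
    ' "$side"
}

# Usage: the original file is never modified.
printf 'int main(void) { return 0; }\n' > demo.c
before=$(cksum < demo.c)
attr_set demo.c user.id s001
attr_set demo.c user.id s002     # overwrites the earlier value
attr_get demo.c user.id          # prints s002
after=$(cksum < demo.c)
[ "$before" = "$after" ] && echo "content untouched"
rm -f demo.c demo.c.attrs
```

The trade-off is that the sidecar file has to travel with the original: renaming or copying demo.c without demo.c.attrs loses the attributes, much like copying to a file system without extended attribute support.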
