Sunday, January 1, 2023

[New post] Inside APFS: from containers to clones

Posted by hoakley at The Eclectic Light Company

hoakley

Jan 2

This article is an attempt to explain some of the key features in APFS, as of macOS Ventura. Older file systems are relatively simple and straightforward to use; modern file systems like ZFS, Btrfs and APFS are much richer in features, many of which challenge our understanding.

Partition table

Whatever the type of storage, hard disk or solid-state, its space needs to be organised to store files and metadata. At the top level of each disk is its partitioning scheme, dividing storage into large contiguous blocks for use with file systems. Conventional usage refers to these as partitions, but in APFS they're also known as containers. The scheme now used universally for macOS is the GUID Partition Table (GPT), shown diagrammatically below with the start of the storage at the top.

[Figure: GPT layout diagram]

Near the start of the storage is a Primary GPT Header, containing the table mapping where the partitions are on the disk. This is repeated at the end of the storage as the Secondary GPT Header, a backup which should of course describe the same partitions at all times.

In the header, there's an initial block containing information about the storage as a whole, followed by an entry for each partition. Those entries specify the type of partition, give each its own unique GUID/UUID, give the start and end locations of that partition, its attributes, and name. Following the header and its list of entries are the partitions themselves, each containing file-system specific data.
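As a rough illustration of what one of those partition entries holds, here is a minimal Python sketch. It assumes the 128-byte entry layout given in the UEFI specification (type GUID, unique GUID, first and last block addresses, attributes, then a UTF-16 name). The names GPTEntry, parse_gpt_entry and build_gpt_entry are invented for this example, and the sample values are made up apart from the partition type GUID generally listed for APFS.

```python
import struct
import uuid
from typing import NamedTuple

# Assumed layout of one GPT partition entry (128 bytes, little-endian):
# type GUID, unique GUID, first LBA, last LBA, attribute flags, UTF-16LE name.
ENTRY_FORMAT = "<16s16sQQQ72s"

# Partition type GUID generally listed for APFS containers.
APFS_TYPE_GUID = uuid.UUID("7C3457EF-0000-11AA-AA11-00306543ECAC")


class GPTEntry(NamedTuple):
    type_guid: uuid.UUID    # what kind of partition this is
    unique_guid: uuid.UUID  # identifies this particular partition
    first_lba: int          # first block of the partition
    last_lba: int           # last block of the partition (inclusive)
    attributes: int         # attribute flags
    name: str               # human-readable partition name


def parse_gpt_entry(raw: bytes) -> GPTEntry:
    """Decode one 128-byte partition entry."""
    type_raw, unique_raw, first, last, attrs, name_raw = struct.unpack(
        ENTRY_FORMAT, raw[:128]
    )
    # GUIDs are stored with their first three fields little-endian.
    name = name_raw.decode("utf-16-le").split("\x00", 1)[0]
    return GPTEntry(
        uuid.UUID(bytes_le=type_raw),
        uuid.UUID(bytes_le=unique_raw),
        first,
        last,
        attrs,
        name,
    )


def build_gpt_entry(entry: GPTEntry) -> bytes:
    """Encode an entry, used here only to make sample data for the parser."""
    return struct.pack(
        ENTRY_FORMAT,
        entry.type_guid.bytes_le,
        entry.unique_guid.bytes_le,
        entry.first_lba,
        entry.last_lba,
        entry.attributes,
        entry.name.encode("utf-16-le"),
    )


# Hypothetical entry for a single APFS container.
sample = GPTEntry(
    APFS_TYPE_GUID,
    uuid.UUID("12345678-1234-5678-1234-567812345678"),
    409640,
    976773127,
    0,
    "Container",
)
decoded = parse_gpt_entry(build_gpt_entry(sample))
```

Reading real entries would also mean locating the entry array from the header and checking its checksums, which this sketch leaves out.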

Container

In HFS+ each volume, with its own file system, is a separate partition. If you want to change the size of a volume, that requires changing the disk's partition table, which may be impossible without losing data. This also means that HFS+ volumes can't share free space.

In APFS partitions are known as containers, which have fixed size and don't share storage with other containers. Within each container are one or more volumes, each containing its own file system and sharing the same space within that container. An APFS container stores all the higher-level information common to the file systems within it. These include volume metadata, snapshots, and provision for space management and crash protection.

Each APFS container has one instance of the Space Manager, a major feature of APFS which keeps track of free space within the container, and allocates and frees storage blocks on demand. A container also has one instance of the Reaper, to manage the deletion of objects too large to be deleted between file system transactions. This tracks the deletion state of those large objects so they can be removed across multiple transactions.

Volume

An APFS volume contains file system directories, file metadata and file data. Each has its own superblock, containing the location of the root file system tree, the extent reference tree, and the snapshot metadata tree, as well as the volume object map.

Objects stored on disk are never modified in place, a major departure from HFS+. Instead, a copy of the object is modified and written out to a new location on disk. This is the overriding principle of copy on write and applies both to objects being stored by the file system, and within the file system itself.
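To make that principle concrete, here is a toy Python model of copy on write. It isn't APFS's real on-disk structure: ToyObjectStore, its append-only list of blocks and its dictionary-based object map are all invented for illustration. Each write goes to a new location, and only the map entry changes to point at it.

```python
class ToyObjectStore:
    """Toy copy-on-write store: objects are never overwritten in place."""

    def __init__(self):
        self.blocks = []      # pretend disk: the list index is the block address
        self.object_map = {}  # object id -> address of its current version

    def write(self, object_id, data):
        """Write a new version to a fresh address and repoint the map."""
        self.blocks.append(data)
        address = len(self.blocks) - 1
        self.object_map[object_id] = address
        return address

    def read(self, object_id):
        """Read the current version of an object through the map."""
        return self.blocks[self.object_map[object_id]]


store = ToyObjectStore()
first_address = store.write(1, b"version 1")

# Keeping a copy of the map preserves a view of the old state.
saved_map = dict(store.object_map)

second_address = store.write(1, b"version 2")
current = store.read(1)                 # the new version
previous = store.blocks[saved_map[1]]   # the old version is still intact
```

Because the old block is left untouched, a saved copy of the map still reaches the earlier version, which hints at why snapshots fit naturally with this design.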

[Figure: APFS disk structure diagram]

Hard links

These are available in both HFS+ (where directory hard links are also available) and in APFS, which only supports hard links to files. They can only be created in Terminal using a command like
ln /Users/myname/Movies/myMovie.mov /Users/myname/Documents/Project1/myNewMovie.mov

That command creates a second entry in the file system pointing to the same file data. The file system keeps a count of those references to determine when to delete the file, so when you've finished using a hard link, you can put it into the Trash without the original being deleted. Only when there are no remaining references to that file will its data be deleted from the file system.
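The reference count can be watched from a script. This is a minimal Python sketch using os.link, the standard library's equivalent of the ln command, in a temporary folder. The function name hard_link_demo is invented here, and the file names simply echo the example above.

```python
import os
import tempfile


def hard_link_demo(directory):
    """Create a file and a hard link to it, then remove the original name."""
    original = os.path.join(directory, "myMovie.mov")
    link = os.path.join(directory, "myNewMovie.mov")

    with open(original, "wb") as f:
        f.write(b"movie data")

    os.link(original, link)  # same effect as: ln original link

    # Both names refer to the same inode, which now has two references.
    same_inode = os.stat(original).st_ino == os.stat(link).st_ino
    count_while_linked = os.stat(original).st_nlink

    # Removing one name leaves the data reachable through the other.
    os.remove(original)
    count_after_removal = os.stat(link).st_nlink
    with open(link, "rb") as f:
        surviving_data = f.read()

    return same_inode, count_while_linked, count_after_removal, surviving_data


with tempfile.TemporaryDirectory() as folder:
    result = hard_link_demo(folder)
```

The data only become eligible for deletion once the last remaining name has been removed and the count drops to zero.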

Hard links look and work exactly like the original file, and can be moved around freely within the same volume. Copy one to another volume, though, and the copy will be a complete unlinked file. Hard links to files and to directories are one of the essential ingredients of Time Machine backups on HFS+, but as APFS doesn't support directory hard links, Time Machine has to use a different backup format when stored on APFS.

Clones

Duplicate or copy a file in HFS+ and a new entry is made in the file system for the copy, and all the data in the original file are copied to a new storage area to create a different file. Whenever it can, APFS doesn't copy any data at all, but creates a clone file instead. This resembles a hard link, in that the file record points to the same data as the original, but a clone is a separate file with its own inode.
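The difference between a hard link and a clone can be sketched with a toy model in Python. This is only an illustration of the idea, not how APFS implements it: ToyVolume and its dictionaries are invented, and files are reduced to lists of small chunks. A hard link is a second name for one inode, while a clone is a new inode that starts out sharing the same blocks.

```python
class ToyVolume:
    """Toy volume: names point to inodes, and inodes point to shared blocks."""

    def __init__(self):
        self.names = {}     # file name -> inode number
        self.inodes = {}    # inode number -> list of block ids
        self.blocks = {}    # block id -> stored bytes
        self.refcount = {}  # block id -> number of inodes using the block
        self._next_inode = 1
        self._next_block = 1

    def _store(self, data):
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = data
        self.refcount[block_id] = 1
        return block_id

    def _new_inode(self, block_ids):
        inode = self._next_inode
        self._next_inode += 1
        self.inodes[inode] = block_ids
        return inode

    def create(self, name, chunks):
        self.names[name] = self._new_inode([self._store(c) for c in chunks])

    def hard_link(self, existing, new_name):
        # A second name for the same inode: no new file is created.
        self.names[new_name] = self.names[existing]

    def clone(self, existing, new_name):
        # A new inode whose block list starts as a copy of the original's.
        shared = list(self.inodes[self.names[existing]])
        for block_id in shared:
            self.refcount[block_id] += 1
        self.names[new_name] = self._new_inode(shared)

    def write_chunk(self, name, index, data):
        # Copy on write: store the new data in a fresh block and release
        # this inode's claim on the old one.
        block_list = self.inodes[self.names[name]]
        old = block_list[index]
        block_list[index] = self._store(data)
        self.refcount[old] -= 1
        if self.refcount[old] == 0:
            del self.blocks[old]
            del self.refcount[old]

    def read(self, name):
        return b"".join(self.blocks[b] for b in self.inodes[self.names[name]])


volume = ToyVolume()
volume.create("original", [b"one", b"two"])
volume.hard_link("original", "link")
volume.clone("original", "copy")
blocks_before_edit = len(volume.blocks)  # the clone added no new blocks

volume.write_chunk("copy", 1, b"TWO")    # edit the clone only
```

After that edit the clone reads differently from the original, and only the changed chunk takes extra space. Editing through the hard link would instead change the original too, as both names lead to the same inode.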

Conditions which have to be met for macOS to create a clone are:

  • both the original and copy files must be on the same APFS volume, so sharing the same file system;
  • copying must be performed using either of two specific commands (both forms of copyItem()) in the FileManager.

In practice, these include all copies and duplicates made within the same volume by the Finder, and most made by apps. This also applies to whole folders, provided they're copied according to these rules.

Where this gets confusing is that the Finder doesn't tell you that the duplicate takes no extra space. Put three duplicates in a folder, and the Finder assures you that they take three times the space of one of them, but that isn't true. What's more, when Time Machine backs them up to an APFS backup store, it doesn't copy three files, just the one and two clones. However, if you copy those three clones to a different volume, that copy doesn't meet the requirements for cloning, and three separate files are created on the destination volume.

Sparse files

Many apps, such as databases, now work with files that are largely empty. Stored conventionally, those would take a lot of space to keep no actual data, so APFS introduces a new type of file, the sparse file. These avoid wasting space by skipping all the empty data, and only storing contents that aren't empty.

For this to work, the app writing the sparse file has to follow strict rules. If it assembles a block of sparse data, consisting of a few bytes of regular data, 5 GB of zero bytes, and another few bytes of regular data, writing that in the normal way to a file doesn't create a sparse file. To write a sparse file, the app needs to work with file handles, and seek to file offsets to skip writing empty data. Only regions skipped using the seek call are left out of the sparse file's storage; zero bytes that are written explicitly are stored like any other data.
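The seek technique described above can be sketched in a few lines of Python. The function name write_sparse_file and the sample data are invented for this example, and it uses a 16 MB gap rather than 5 GB to keep it quick. Whether the gap really takes no space depends on the file system holding the file.

```python
import os
import tempfile


def write_sparse_file(path, hole_size):
    """Write a little data, skip hole_size bytes with seek, then write more.

    Returns the logical size and the space allocated on disk, both in bytes.
    """
    head = b"start of data"
    tail = b"end of data"
    with open(path, "wb") as f:
        f.write(head)
        # Move the file position past the gap without writing any zeros.
        f.seek(len(head) + hole_size)
        f.write(tail)
    info = os.stat(path)
    # st_blocks counts 512-byte units actually allocated to the file.
    return info.st_size, info.st_blocks * 512


with tempfile.TemporaryDirectory() as folder:
    sparse_path = os.path.join(folder, "sparse.bin")
    logical_size, allocated = write_sparse_file(sparse_path, 16 * 1024 * 1024)
    with open(sparse_path, "rb") as f:
        f.seek(1024)
        byte_in_gap = f.read(1)  # the skipped region reads back as zeros
```

On a file system that supports sparse files, allocated should come out far smaller than logical_size. Writing the same zeros with an ordinary write call would instead store every one of them.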

What you end up with behaves quite unusually. Use Finder's Get Info and you'll see that its size is 5 GB, but it only uses 8 KB on disk.

Duplicate it to fill a folder, and that will be reported as having a size of, say, 55 GB, but only taking 90 KB on disk. Results from Terminal are no more helpful: ls -la simply says that each of those sparse files is 5 GB in size.
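The two figures come from different fields of each file's metadata, and a short script can list both. This Python sketch is only one way to do it, and isn't how the Finder or any particular utility works. The function size_report is invented here, and it treats a file as possibly sparse when its allocated space is smaller than its logical size, which is a heuristic rather than a definite test.

```python
import os
import tempfile


def size_report(folder):
    """List (name, logical size, bytes allocated, maybe_sparse) for each file."""
    report = []
    with os.scandir(folder) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if not entry.is_file(follow_symlinks=False):
                continue
            info = entry.stat(follow_symlinks=False)
            logical = info.st_size            # the size that ls -la reports
            allocated = info.st_blocks * 512  # the space actually allocated
            report.append((entry.name, logical, allocated, allocated < logical))
    return report


with tempfile.TemporaryDirectory() as folder:
    # One ordinary file and one written with a seek past a 4 MB gap.
    with open(os.path.join(folder, "ordinary.txt"), "wb") as f:
        f.write(b"hello")
    with open(os.path.join(folder, "gappy.bin"), "wb") as f:
        f.seek(4 * 1024 * 1024)
        f.write(b"end")
    rows = size_report(folder)
```

Here the logical sizes are always 5 bytes and 4 MB plus 3 bytes, while the allocated figures vary with the file system.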

Time taken for each of these operations is a good indicator of whether APFS has kept the sparse file, or exploded it to full size. Creating, moving and copying a sparse file takes an instant; the moment a progress indicator appears, you know that the sparse file has exploded.

Duplicating or moving a sparse file within the same APFS volume retains its sparseness. Originally, copying a sparse file between volumes, or using cp in Terminal, could result in its sparseness being lost. Now, copying sparse files between APFS volumes, even on different disks, should retain their format. Sparse files should also be preserved when backed up using Time Machine to APFS, but other backup utilities may not be as successful.

Sparse files invariably explode to full size when copied to a different file system, for instance when backing up from APFS to HFS+, even when the destination file system offers its own sparse file format. There currently appears to be no solution: compressing sparse files breaks their format, and when decompressed they explode to full size.

If you want to experiment with sparse files, or survey folders for clones and sparse files, try my free utility Sparsity.

Key point summary

  • HFS+ volumes are fixed-size partitions of a disk.
  • APFS volumes vary in size and share space within a single partition or container.
  • Clone files are different from hard links, as they refer to different files with common data.
  • Clones can only exist on the same volume, but are preserved in Time Machine backups to APFS.
  • Sparse files are a special format containing only non-empty data, requiring special creation.
  • Sparse files are preserved across APFS volumes, even between different disks, and in Time Machine backups to APFS.
  • Sparse files explode to full size if not (re)written correctly, or when transferred to other file systems.
  • Never compress sparse files, as they will explode to full size during compression or when decompressed.
Trouble clicking? Copy and paste this URL into your browser:
http://eclecticlight.co/2023/01/02/inside-apfs-from-containers-to-clones/
