The AbsolutelyBaching Flac Checker

1.0 What is AFC?

FLAC music files can, over time and for no very obvious reasons, become internally corrupt. The AbsolutelyBaching Flac Checker (or AFC for short) is a shell script which checks all FLAC files found in a specified location for signs of this internal corruption. If it finds any corrupt files, it will warn you and tell you which ones they are. By regularly running the script against your music collection, you can thus identify internal corruption if it happens, or get assurance that it isn't happening.

AFC does not, however, ever attempt to repair or fix corrupt FLAC files. It advises you of their existence, but it's entirely up to you how to fix up the problem. Fixing corrupt FLACs is usually done by re-ripping the music from the original CD or restoring a known-good copy of the FLAC file in question from an older backup.

As of version 2.02, AFC will also check whether the year contained within an ALBUM tag matches the year stored in a music file's DATE or YEAR tag. That is, if you tag a track as belonging to the album Symphony No. 5 (Kubelik - 1970), for example, but then go on to tag the file's recording year to be 1971, then we now have a situation where 1970 isn't the same as 1971, which is not a good or consistent look! There may be nothing wrong at all with the audio signal of such a file: it may pass all the consistency tests in that regard which AFC performs. But it fails a consistent metadata test, and so will be reported as a separate problem. AFC cannot hope to fix such a metadata inconsistency: it cannot know whether 1970 or 1971 is 'right'. Indeed, it's perfectly possible that both dates are wrong! So, if you ask it to in verbose mode, it will simply tell you which files are affected so that you can sort the problem out yourself manually.

2.0 What does AFC do?

AFC runs entirely as a command-line tool, from a terminal session window. You tell it which directory contains the music files you want validated, and it then opens each file in turn. If a file was last checked for corruption more than a set number of days ago (30 by default), AFC re-checks it to determine whether its audio component has become corrupted. If a file was checked more recently than that, AFC won't waste CPU time and memory re-checking it, but will simply skip over it without taking any further action.

In this way, AFC is an efficient method of detecting corruption: it only checks files which, by their age since their last check, are deemed to be worth re-validating, whilst more freshly-checked files won't be re-validated unnecessarily.

If AFC does check a FLAC file, and it passes validation, it will modify the file and write the date of the corruption check as a Vorbis Comment called TAGDATE. It is by checking the value of TAGDATE that AFC can determine when a file was last checked for internal corruption.
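
The check-then-stamp cycle described above can be sketched roughly as follows. This is not AFC's own code: it assumes that TAGDATE holds a date in YYYY-MM-DD form, that GNU date is available, and it uses 'flac --test' as a stand-in for whatever audio test AFC really performs. The function names is_due and check_file are made up for illustration.

```shell
# Sketch only, not AFC's actual code. Assumes TAGDATE looks like 2020-04-19
# and that GNU date, flac and metaflac are installed.

# is_due LAST_CHECKED MAX_AGE_DAYS [TODAY]
# Succeeds (exit status 0) when the file ought to be re-checked.
is_due() {
    local last="$1" max_days="$2" today="${3:-$(date +%F)}"
    local last_s today_s
    # No TAGDATE at all means the file has never been checked.
    [ -n "$last" ] || return 0
    # An unparseable date is treated the same way.
    last_s=$(date -u -d "$last" +%s 2>/dev/null) || return 0
    today_s=$(date -u -d "$today" +%s) || return 0
    # Due only if MORE than max_days whole days have passed.
    [ $(( (today_s - last_s) / 86400 )) -gt "$max_days" ]
}

# check_file FILE [MAX_AGE_DAYS]
check_file() {
    local file="$1" max_days="${2:-30}" last
    # metaflac prints "TAGDATE=value"; keep only the part after the "=".
    last=$(metaflac --show-tag=TAGDATE "$file" | head -n 1)
    last="${last#*=}"
    if ! is_due "$last" "$max_days"; then
        echo "skipped: $file"
        return 0
    fi
    if flac --test --silent "$file"; then
        # Stamp today's date so that the next run can skip this file.
        metaflac --remove-tag=TAGDATE --set-tag="TAGDATE=$(date +%F)" "$file"
        echo "clean: $file"
    else
        echo "corrupt: $file"
        return 1
    fi
}
```

Note that in this sketch the tag is only rewritten when the file passes, which matches the behaviour described above: a failed file keeps its old TAGDATE (if any) and so will be re-examined on the next run.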

If AFC checks a FLAC file for audio signal corruption, it will also check that its ALBUM and DATE tags are consistent with each other.

3.0 Obtaining AFC

Installing AFC is a simple process of issuing the following commands:

wget https://absolutelybaching.com/abc_installer
bash abc_installer --afc

The first command downloads a generic installer script. The second command is an instruction to run that installer script, specifying (with a double-hyphen parameter) that it's AFC you want to install. You will be prompted at one point for your sudo password, without which the installer cannot install the software in the /usr/bin folder correctly.

Once installed, AFC can be run using a variant of the afc command, which I'll elaborate on shortly. When you run it, it will first check that some software prerequisites are present on your system and, if they are not, will advise you how to install them. The installation instructions will be distro-appropriate. For example, here's what you might see on OpenSuse:

Note how AFC knows that on OpenSuse, you install software at the command line with the 'zypper' utility. Similarly, on Ubuntu, it will advise you to run an 'apt install' command; on Manjaro, a 'pacman -S' command. On Fedora, however, you might see this:

Uniquely, Fedora doesn't contain ffmpeg in its standard repositories (for software patent reasons, I think), so to get it installed you first have to install and enable the RPM Fusion repository, which is done by issuing these two commands:

sudo dnf install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm
sudo dnf install https://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm

Once those repositories are enabled, the software installation command shown in yellow/orange in the above screenshot will work without problem. Fortunately, Fedora is the only distro I am familiar with on a daily basis that this applies to.

Once the software prerequisites are installed on a system, AFC should just get on and start working, with no need to display any other messages.

As I've mentioned, the basic command to run the Flac Checker is simply to type afc in a folder full of FLAC files, but the program takes a number of command line options which can alter its behaviour and change what it will do. In summary, the full set of command line options is as follows:

afc --music=<path> --logdir=<path> [--force] [--checkdays=<number>] [--verbose]

Taking these switches and options in turn, then:

You can supply an absolute path to wherever your music files are located with the --music switch. This means you don't need to cd (change directory) to a folder full of FLACs before launching AFC, but can inspect them 'remotely' from another folder entirely.

You can also specify an absolute path to wherever you want the log file written to with the --logdir switch. This needn't be the same folder as the music folder.

If you specify neither music nor logging folders, then the current working directory (i.e., the directory you are in when you run AFC) is assumed to be the foldername that should be used for both. If you specify a music folder but not a log directory, the log directory is set to be whatever the music directory was set to. If you specify (or imply) a log directory which doesn't exist or to which you do not have write permissions, the /tmp directory is used for logging instead.
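
Those fallback rules can be written down as a small shell function. This is only a sketch of the rules as stated in the paragraph above, not a copy of what AFC does internally, and the function name resolve_dirs is hypothetical.

```shell
# Sketch only. resolve_dirs MUSIC LOGDIR
# Prints the music directory on the first line and the log directory
# on the second. Pass "" for any value that wasn't supplied.
resolve_dirs() {
    local music logdir
    # No music folder given: use the current working directory.
    music="${1:-$PWD}"
    # No log folder given: log alongside the music.
    logdir="${2:-$music}"
    # Missing or unwritable log folder: fall back to /tmp.
    if [ ! -d "$logdir" ] || [ ! -w "$logdir" ]; then
        logdir="/tmp"
    fi
    printf '%s\n%s\n' "$music" "$logdir"
}
```

For example, resolve_dirs "/music/flac" "" would print /music/flac twice if that folder exists and is writable by you, or /music/flac followed by /tmp if it isn't.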

Logging is always performed: the log file will show what folder is being checked, how many files were checked and how many have been found to be 'clean' or 'corrupt'. If you specify the --verbose option, then the full path and file name of every file being checked or skipped will be listed. The full path and file name of any files found corrupt are always recorded in the log, regardless of whether verbose mode was specified or not. Log files are automatically named in the form AFC-<timestamp>.log. The "timestamp" component of the name will be a number, consisting of year, month, day, hour and minute. So, for example, you might have a file called AFC-202004191334.log. This would be created by a run of AFC that started on April 19th, 2020 at 1:34pm.
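
A log file name of that shape can be produced with the date command. The following is a sketch, assuming GNU date (for the -d option); the function name afc_log_name is made up for illustration and is not part of AFC.

```shell
# Sketch only. afc_log_name [WHEN]
# Prints a log file name of the form AFC-YYYYMMDDhhmm.log.
afc_log_name() {
    if [ -n "$1" ]; then
        # Format a specific moment (needs GNU date for -d).
        date -d "$1" +"AFC-%Y%m%d%H%M.log"
    else
        # Format the current moment.
        date +"AFC-%Y%m%d%H%M.log"
    fi
}

# afc_log_name "2020-04-19 13:34" prints AFC-202004191334.log
```

Because the fields run from largest (year) to smallest (minute), sorting the log files by name also sorts them by the time the run started.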

By default, AFC will only re-check a file for corruption if it determines it was previously checked more than 30 days ago. If you prefer to check files more frequently, you can either specify --force (which essentially forces every file to be re-checked every time AFC encounters it) or a --checkdays=x parameter, where 'x' is a number of days. If you said --checkdays=6, for example, then AFC would only re-check a file if it determined that it was previously checked more than 6 days ago. If you specify both --force and --checkdays parameters, then --force takes precedence and the --checkdays setting is simply ignored.
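
The precedence between the two switches might be expressed like this. It's a sketch of the rule just described rather than AFC's real option parser, and the function name effective_threshold is hypothetical.

```shell
# Sketch only. effective_threshold ARGS...
# Prints "force" if every file should be re-checked, otherwise the number
# of days a file must have gone unchecked before it is looked at again.
effective_threshold() {
    local days=30 force=0 arg
    for arg in "$@"; do
        case "$arg" in
            --force)       force=1 ;;
            --checkdays=*) days="${arg#--checkdays=}" ;;
        esac
    done
    # --force wins, whatever order the switches were given in.
    if [ "$force" -eq 1 ]; then
        echo "force"
    else
        echo "$days"
    fi
}
```

So effective_threshold --checkdays=6 prints 6, whereas effective_threshold --checkdays=6 --force prints force, and with no switches at all you get the default of 30.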

Whilst AFC can be run in an interactive manner, directly in a command terminal, it is most suitably run from a crontab. For example:

0 2 1 * * /usr/bin/afc.sh --music="/music/flac/hjr/classical" --logdir="/home/hjr/Logs"

...in my personal crontab would mean that on the 1st day of each month, my entire classical music collection would be checked by AFC, with the results of the scan being written to my personal Logs directory. Note that a full path to the executable is advised in crontab, because the environment that applies (including any PATH statement) when invoking a crontab entry may not be the one that you are expecting.

4.0 Worked Examples

For a complete set of worked examples using AFC, and a lengthy discussion of the logic underpinning what it does and why and how it does it, see this page, especially section 9.

5.0 Metadata Consistency Checks

New to AFC in version 2.02 and above is the ability to perform a metadata consistency check as well as a corruption check of the audio signal in a FLAC file. Specifically, if the ALBUM tag has a year in it, and the DATE tag has a year in it, it will check that the two years match. Writing a year into the ALBUM tag is essential if it is to always be possible to uniquely identify one recording of a composition from another (see this article for an explanation as to why this is the case). Therefore, we should always be tagging our albums as, for example, Peter Grimes (Britten - 1958), and now there is a year number occupying the 5th, 4th, 3rd and 2nd characters from the end of the ALBUM tag. Entirely separately, you may have tagged the recording date for this album as 1959. There would then be a year mismatch between the two pieces of metadata, but AFC cannot know, nor work out, which one (if any!) is the 'correct' year to use. It will therefore flag this sort of problem as a 'bad date' issue.
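
As a rough illustration of that comparison, here is a sketch in shell. It is not lifted from AFC: it assumes the DATE tag begins with a four-digit year, and the function names album_year and years_consistent are invented for the example.

```shell
# Sketch only. album_year ALBUM
# Prints the four digits sitting 5, 4, 3 and 2 characters from the end of
# the ALBUM tag (e.g. the 1958 in "...1958)"), or nothing if they aren't digits.
album_year() {
    printf '%s\n' "$1" | sed -n 's/.*\([0-9]\{4\}\).$/\1/p'
}

# years_consistent ALBUM DATE
# Succeeds if the two years match, or if there is nothing to compare.
years_consistent() {
    local a d
    a=$(album_year "$1")
    # Assumption: the DATE tag starts with the year, e.g. 1959 or 1959-03-01.
    d=$(printf '%s\n' "$2" | sed -n 's/^\([0-9]\{4\}\).*/\1/p')
    # If either tag lacks a year, no mismatch is possible.
    if [ -z "$a" ] || [ -z "$d" ]; then
        return 0
    fi
    [ "$a" = "$d" ]
}
```

With these, years_consistent "Peter Grimes (Britten - 1958)" "1959" fails (a 'bad date'), whilst years_consistent "Peter Grimes" "1959" succeeds, because an ALBUM tag without a year gives nothing to compare against.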

Note that AFC will only check the ALBUM and DATE metadata tags if it was going to check the audio component of a file for internal consistency. That is, if the file was previously checked within the last 30 days (by default), the audio signal would not be checked for corruption and no metadata consistency check would be carried out either. But if you specify --checkdays or --force to make a file be corruption-checked then it would be checked for metadata consistency, too.

It is perfectly possible for a file to pass the audio integrity checks but fail the metadata consistency checks (or vice versa). Passing the one implies nothing about the state of the other.

Here are some screenshots to explain this consistency check in more detail:

In this screenshot, we see that 42 files were skipped from an audio integrity check. None failed. No files are reported to have 'OK Audio, but bad year data' either: that's simply because the ALBUM and DATE metadata were not checked, because the audio check was not performed. The files were skipped because they had already been checked within the past 30 days but, even if they had been checked, no 'bad year' errors would have been shown: if you look closely, the ALBUM tag does not have a year in it at all. That's bad from the point of view of uniquely identifying the recording, but since there's no possibility of an ALBUM-year/DATE mismatch, the metadata consistency check would have been skipped anyway.

Here's another check on the same files:

Note that in this screenshot, the ALBUM now has a year in it: 1970. Note the DATE tag (labelled, perhaps annoyingly, as Year!) says 1971. This is therefore now a candidate for a metadata inconsistency warning... but we don't see one being issued in the program output! Why? Because all five files were skipped from checking. The program tells us it's only checking files last checked more than 30 days ago and none of these files meets that criterion. So if the audio test is skipped, the metadata test is skipped too.

Finally:

This time, 'Force checking has been specified' (with the --force switch). Now all 5 audio files are validated and none are shown to be in error. That means all 5 files passed their audio integrity checks: there's no internal corruption of the FLAC audio signal in any of these files. However, every single file has triggered the 'year in ALBUM doesn't match the DATE tag' warning, and 5 files are counted in the 'Audio OK, but bad year data' summary line at the end of the output. In other words, because AFC was made to check the FLAC internal consistency, it also checked the ALBUM and DATE tags. As it did so, it spotted the difference between 1970 and 1971 and declares an error on a per-file basis as a result.

AFC can never fix these metadata inconsistencies, because it can neither know nor guess which of the two dates is correct. But it at least tells you that you have a metadata issue which you need to sort out, manually, later on.

Author

AFC was devised and written by Howard Rogers ([email protected]).

License

AFC is copyright © Howard Rogers 2019, but is made available freely under the GPL v2.0 only. That license may be downloaded here.

Bug Tracking, Feature Requests, Comments

There is no formal mechanism for reporting and tracking bugs, feature requests or general comments. But you are very welcome to email your comments to [email protected].