AMP Doing Its Job

It’s been a little over a fortnight since I modified my AMP player to work with a database -and, when it does so, to record every ‘play’ it decides on in a database table of its own.

So now, 15 days later, I can analyze that ‘plays’ table to determine if AMP has been doing the job I designed it for: picking a wide variety of composers and music genres, at random, and thus not creating any ‘favourites’!

The first results look good. Click on the thumbnail…

CAO – Updated Already

Well, that didn’t take long!

First, I discovered rather late in the day that a cue sheet describing a large ‘super-FLAC’ audio file cannot, by technical design, list more than 99 tracks. So, if you’ve got more than 99 FLAC tracks that you want to combine into a single super-FLAC then you can’t, because the cue sheet cannot contain enough entries to describe them all.
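
A check for that limit might look something like the sketch below. It is only an illustration, not CAO’s own code: the function names count_flacs and can_build_superflac are made up for the example, and it assumes the tracks to be combined all sit directly inside one folder.

```shell
# Hypothetical pre-flight check (not taken from CAO): count the FLAC files in
# a folder and refuse to continue if one cue sheet could not describe them all.
MAX_CUE_TRACKS=99

count_flacs() {
    # Count the *.flac files directly inside the folder given as $1
    find "$1" -maxdepth 1 -type f -iname '*.flac' | wc -l | tr -d ' '
}

can_build_superflac() {
    # Succeeds (exit status 0) only if the folder holds between 1 and 99 FLACs
    n=$(count_flacs "$1")
    [ "$n" -ge 1 ] && [ "$n" -le "$MAX_CUE_TRACKS" ]
}
```

Used as `can_build_superflac /some/folder || echo "Too many tracks for one cue sheet"`, it stops the job before any combining starts.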

CAO version 1.01 therefore now checks that you don’t…


I’ve been having an interesting discussion of late, over on the Talk Classical forums.

It began life as someone saying they still preferred to use physical media for their classical music listening pleasure rather than any of the streaming, YouTube or similar ‘consume-but-don’t-own’ musical options available these days.

I agree with that sentiment

AUAC Bug Fixes

I have bumped the Absolutely Baching Universal Audio Converter script up to version 2.0 (from 1.13). The changes are minor and mostly cosmetic, but there are some bug fixes applied too, so the upgrade is recommended. Upgrade by downloading the script by clicking this link. Then, assuming you downloaded it to your $HOME/Downloads folder, just issue this command:

sudo mv $HOME/Downloads/ /usr/bin/

The full

Let’s not get physical!

I had a slight mishap with my main PC on New Year’s Eve: Manjaro released a new kernel and I installed it without thinking -and, though I believe the PC rebooted fine, I couldn’t actually see anything on my monitor, so whether it had or not was really kind of moot!

So, a swift rebuild later, and we’re back in business -though it’s not quite how I imagined I would spend my New Year’s Eve!

Anyway: doing the rebuild, I made the decision not to re-install Strawberry or Clementine music players, which have long been my go-to music player/managers on all manner of Linuxes. Why not? Because I’d just released AMP and it was working nicely and I couldn’t imagine a need for either of the two ‘fat’ music players ever again.

I’m still happy with AMP: it does exactly what I need a media player to do. I particularly like the ‘random play’ feature: tell it to play music from a folder that doesn’t itself contain music, but which contains sub-folders which, at some point in the tree hierarchy, themselves contain music files, and AMP will randomly pick a folder to play. It’s meant I’ve been listening to a bewildering variety of composers since New Year’s Day that I probably would never have listened to if left to my own devices. For me, random selection of complete compositions to play is a very important feature to have -and it’s something no other music player I know of, on Windows or Linux, is able to offer.

But here’s the thing: the way AMP does its random selection at the moment is to search through every physical folder that contains FLAC music files and then sort the results of that search into a random order, before picking the top of the scrambled list. Whilst that works, physically traversing a hard disk that contains 1.8TB of music in 10,000+ folders, containing nearly 70,000 FLAC files, is inevitably going to take time. Practically, the result is that every time I run AMP in random-play mode, it takes maybe 30 seconds or more for it to find something to play.
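
A physical search of that kind can be sketched roughly as follows. This is an illustration of the idea rather than anything lifted from AMP: the function name random_music_folder is made up, GNU find and shuf are assumed to be available, and collapsing the file list to one line per folder before shuffling is an assumption too.

```shell
# Sketch of a 'physical' random pick: walk the disk, list every folder that
# holds at least one FLAC file, shuffle the list and take the top entry.
# random_music_folder is a hypothetical name, not a function from AMP.
random_music_folder() {
    find "$1" -type f -iname '*.flac' |
        sed 's|/[^/]*$||' |   # keep only the folder part of each path
        sort -u |             # one line per folder (an assumption)
        shuf |                # scramble the list
        head -n 1             # pick the top of the scrambled list
}
```

Every run has to visit the whole tree before it can pick anything, which is where the waiting comes from.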

I can live with that. (I wouldn’t have released it if I couldn’t!) It’s a bit like the old days, when you had to extract an LP from its cardboard sleeve, then remove the LP itself from a plastic or paper inner sleeve, place it on the turntable, and wait for the play arm to position itself, drop down, crackle a bit in the lead-in track, and then the music would begin… The anticipation that AMP will finally find something worth playing is worth enduring, I reckon! It’s all part of the (new and digital!) listening experience. 🙂 And yes, I realise that younger readers who have no idea what an LP is (or was!) won’t have the faintest idea what I’m on about, but that’s fine too: let’s just agree that deferred gratification is a ‘thing’ that can be desirable, shall we?!

But: really, the problem is in making AMP do a physical trawl through a music collection before it picks something to play. That is inevitably slow -and, worse, it’s inevitably biased. If I have 10,000 Bach tracks and 3,000 Mozart tracks and 4 Ēriks Ešenvalds tracks, what do you think the chances are that AMP will ever pick an Ēriks Ešenvalds track?! Practically zero as it turns out: the physical dimensions of your music collection will inevitably affect the results of a physical random walk through that collection. Composers with lots of tracks will tend to be favoured for selection over those that have fewer tracks. I noticed this almost immediately upon AMP’s release, when it showed an unusual propensity to select Bach, Mozart and Handel -three composers of whom I have an inordinately large number of ‘tracks’ of music.
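
To put rough numbers on that, here is a small worked calculation. It assumes every track is equally likely to be picked, which is a simplification of a real folder-by-folder search, and the helper name track_pick_percent is made up for the example.

```shell
# Chance (as a percentage) that a uniformly random track belongs to a composer.
# Arguments: tracks by that composer, total tracks in the collection.
# track_pick_percent is a hypothetical helper, used only for this illustration.
track_pick_percent() {
    awk -v n="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * n / total }'
}

total=$((10000 + 3000 + 4))
track_pick_percent 10000 "$total"   # Bach: 76.90
track_pick_percent 3000 "$total"    # Mozart: 23.07
track_pick_percent 4 "$total"       # Ešenvalds: 0.03
```

On those figures, Ešenvalds would come up about three times in every ten thousand picks.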

I partially fixed this issue a day or two after release by letting you create an ‘excludes.txt’ file, in which you name composers you don’t want randomly selected. But the probability bias created by physically searching through music collections of uneven sizes is still inherent, even if mitigated somewhat.

Fundamentally: the only thing that will fix that physical size bias is to stop using physical searches through your music and to switch, instead, to logical searches.

By a logical search, I mean that we somehow extract the metadata from our music files and store it in accessible fashion within a database table. If you’re not knowledgeable about databases, think of a database table as the equivalent of an Excel spreadsheet, with rows of items -each row representing ‘a track from a CD’ and each row being made up of separate columns, where column A stores the composer name, column B the composition name, column C the track title and so on. Either way, the problem now becomes one of picking a row at random from a single ‘spreadsheet’, rather than having to physically walk through 10,000+ folders before making a random choice. Guess which is quicker to do?!

Once you’re dealing with a database table, you can select rows from it extremely quickly: it’s one ‘object’ stored in one physical location, so scanning through it is trivially easy. Since you’re not physically visiting 10,000+ folders, but merely querying a 10,000-row table, we’re talking sub-second search times, for any reasonably-sized music collection (by which I mean, 30,000 physical CDs or so).

But it gets better: imagine you capture the composer and composition names from your physical music collection into a table we’ll call ALBUMS. Because you have a huge collection of Bach and a tiny one of Ešenvalds, there will be many rows in this table of Bach and only a few of Ešenvalds. The ‘proportion’ problem still raises its head, therefore: you’re much more likely to randomly select a Bach row than an Ešenvalds one.

But this is where databases -and logical data manipulation- work their magic. Because, when working with databases, you can do this:

create table my_composers as select distinct composer from ALBUMS

That is, you can create a new table as a selection of data column(s) from another table. And -the key point- you are populating this new table with only the unique -or ‘distinct’- composer names. So from a table containing 10,000 rows of Bach and 45 of Ešenvalds, you have just created a new table that contains one row for Bach and one for Ešenvalds: suddenly, the two composers are on an equal probability footing!

So, if you create a table -say, called COMPOSERS- that contains one row per composer as a selection made out of your ALBUMS table, you now have a maybe 1000-row table, one row per composer. A random selection from that table means every composer stands as equal a chance as any other to be selected. Having randomly selected a composer, you can go back to your ALBUMS table and ask for a random selection from there where the composer name matches the composer name you’ve previously selected at random.

At this point, something by Ešenvalds (his Ubi Caritas, say) is as likely to be selected as something by Bach (the Mass in B Minor, say) -because the choice of which composer’s music to select was initially made on a completely equal basis, because of the one-row-per-composer in the my_composers table/spreadsheet.
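
The two-stage pick described above can be tried out without a database at all. In the sketch below a few lines of text stand in for the ALBUMS table; the sample rows and the function names albums_table, pick_composer and pick_album_for are invented for the illustration, and GNU shuf is assumed. It shows the principle only, not AMP’s own queries.

```shell
# Stand-in for the ALBUMS table: one "composer|album" line per row.
# (Hypothetical sample data; a real table would hold thousands of rows.)
albums_table() {
    cat <<'EOF'
Bach|Mass in B Minor
Bach|St Matthew Passion
Bach|Goldberg Variations
Mozart|Requiem
Mozart|Don Giovanni
Ešenvalds|Ubi Caritas
EOF
}

# Stage 1: the equivalent of "select distinct composer", then pick one row.
pick_composer() {
    albums_table | cut -d'|' -f1 | sort -u | shuf -n 1
}

# Stage 2: pick one album at random from the rows matching that composer.
pick_album_for() {
    albums_table | awk -F'|' -v c="$1" '$1 == c { print $2 }' | shuf -n 1
}

composer=$(pick_composer)
album=$(pick_album_for "$composer")
echo "$composer: $album"
```

Because stage 1 works on a list with one line per composer, Bach’s three rows give him no advantage over Ešenvalds’ single row when the composer is chosen.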

All of which is by way of prelude to explaining why AMP, merely days after its release, has been bumped to version 1.02: a new option to create or refresh a music database (or even several music databases) as an extraction of logical data from your physical music files has been provided; and another new option to use that database to generate random selections of music has also been made available. Let me elaborate on what the new options are.

First, there’s this:

amp --musicdir=/root/folder/of/your/music/collection --createdb

That tells AMP to scan your physical music collection and extract its metadata into a database. The database is called “music”, because I didn’t tell it otherwise. But I could have done this:

amp --musicdir=/root/folder/of/your/music/collection --createdb --dbname=main
amp --musicdir=/root/folder/of/your/music/collection --createdb --dbname=overflow

…in which I name the database(s) and can therefore have different databases for different music collections. The ‘music’ name is just the default.

When creating a database, AMP will populate it with a fast scan of the specified music collection. That is, it will race off to the folder specified, find the first FLAC file in each sub-folder, and extract the metadata from just that one file. If you’ve tagged everything correctly, the ALBUM, COMMENTS, GENRE and so on for track 1 of a rip should be just the same in all the other tracks, too, of course -but since AMP doesn’t check all those other files, it’s a lot quicker to scan that way.
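
The ‘first FLAC file in each sub-folder’ idea can be sketched like this. It is an illustration under assumptions rather than AMP’s own scanning code: the function name first_flac_per_folder is made up, and ‘first’ is taken to mean first in sorted name order.

```shell
# List one FLAC file per folder: the first one in sorted name order.
# first_flac_per_folder is a hypothetical name, not a function from AMP.
first_flac_per_folder() {
    find "$1" -type f -iname '*.flac' | sort | awk '
        {
            dir = $0
            sub("/[^/]*$", "", dir)    # strip the file name, keep the folder
            if (!(dir in seen)) {      # only the first file seen per folder
                seen[dir] = 1
                print
            }
        }'
}
```

Each path it prints could then be handed to metaflac to read the tags, so a folder of twenty tracks costs one metadata read instead of twenty.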

If you prefer, however, you can do this at any time after database creation:

amp --musicdir=/root/folder/of/your/music/collection --refreshdb --scanmode=full

That is, you can ask for the database contents to be updated using a full scan, which is where AMP runs off to the physical folder specified and reads the metadata out of every FLAC file found in all folders, not just the first one. This is more thorough (and can expose some tagging errors!), but obviously much slower to perform and complete.
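
As an example of the sort of tagging error a full scan can expose, the sketch below reports folders whose files disagree about the ALBUM tag. The input format (one "folder|album" line per file) and the function name folders_with_mixed_albums are assumptions made for the illustration; it says nothing about how AMP itself stores or checks the data.

```shell
# Input: one "folder|ALBUM value" line per FLAC file (a hypothetical format).
# Output: each folder whose files do not all carry the same ALBUM value.
folders_with_mixed_albums() {
    awk -F'|' '
        !(($1 SUBSEP $2) in pair) {    # first time this folder/album pair is seen
            pair[$1 SUBSEP $2] = 1
            distinct[$1]++             # count distinct albums per folder
        }
        END {
            for (d in distinct)
                if (distinct[d] > 1) print d
        }
    ' | sort
}
```

A fast scan would never see such a mismatch, because it reads only one file per folder.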

As ever, since I didn’t mention a database name in that full scan command, AMP will perform a full scan of the default ‘music’ database. If you need it to full-scan a database with a non-default name, you just tack on the ‘dbname’ parameter as before:

amp --musicdir=/root/folder/of/your/music/collection --refreshdb --dbname=main --scanmode=full

And if you like, you can also specify --scanmode=fast explicitly, though that scan mode is the default anyway.

So that’s how you create and populate and keep up-to-date a musical database. How do you tell AMP to use that database for the purposes of actually playing music? Like this


Birthday Presents

It soon being my birthday (and Christmas having just been and gone), it seemed appropriate to buy myself some presents.

The results are as you see them on the left (which you can click on, to make bigger), which encapsulates the current state of my study’s approach to things audio-visual.

We may skip over the small collection of deco glass vases and paperweights, which became something of an obsession during lockdown, and which is only one of three collections in my study. We can also…

The Absolutely Minimal Media Player…

I suppose it had to come sooner or later: since all my media manipulation is done by scripts I’ve written myself (and which are freely available to download for anyone capable of installing ffmpeg and one or two other packages), it seemed appropriate to consider creating a scripted, minimally-functional media player.

The Absolutely Baching Media Player (AMP, to its friends) is the result.

AMP isn’t terribly sophisticated! It takes a physical folder full of FLAC files (it’s…

Happy Christmas!

A Happy Christmas to all my readers.

It’s been a pretty rough year, but hopefully there are happier (and healthier!) times ahead.

May all your musical journeys in the year to come be interesting and enduring ones 🙂

See you in the New Year…


Once you’ve been collecting music for a while, you will suffer from an abundance of riches: those 15 different versions of the Beethoven symphonies; those 5 complete sets of Bach cantatas; at some point, they will all become difficult to navigate and make playing any particular recording increasingly difficult.

It is for such times that an ‘overflow’ library is a good idea: a separate physical storage area on disk where you move your lesser-played recordings to as you come…

It’s not just Easytag… :(

In my last post, I pointed out the shenanigans that ensue when you use Easytag to update the metadata associated with your FLAC music files. Specifically, I demonstrated how Easytag will silently, and without the opportunity to configure the behaviour, change the name of a tag from COMMENT to DESCRIPTION.

When you then go on to use software which expects a tag to be called COMMENT, this causes problems!

Well, that chance

Easytag Garbage Creation

I had been subconsciously aware of a problem for quite a while, but had paid it no real attention: whenever I looked, a lot of my music files seemed to lack proper performer details! The tracks played fine and everything else about them seemed OK, but just no ‘conducted by, orchestra playing is, soprano singing is…’ stuff. I’d noticed it, in a casual sort of way, from time to time… but because you can live without knowing that particular information at your fingertips, I mostly just did.

I had also vaguely registered (erroneously, as it turns out!) that the problem was with a lot of music I remember buying in the 1990s and ripping in the early 2000s, so I put it down to sloppy tagging habits, waaay back before I knew any better. But it wasn’t and isn’t. It is a tale of software shenanigans and garbage programming that really kind of ticks me off, to be frank. So, first: let me show you the problem…

Here is the metadata associated with some Darius Milhaud music files:

You may be able to work out that there are 10 tags (the ‘canonical ten’) and one of them is ‘COMMENT=’ and that this particular tag has no value assigned to it. There is nothing showing on the right-hand side of that particular tag’s equals sign. This music therefore has no performer information associated with it!

Well, assigning data to canonical tags is what my very own CCDT software was designed to do, so let me use that to fix this lack-of-data problem:

So that’s me specifying the conductor and orchestra. If I save that data and re-check the metadata tags associated with these files:

So now, we have a COMMENT tag containing the relevant information. It’s currently showing as tag [10], but if I save this in CCDT and then re-load it into CCDT and re-display:

…you can see that CCDT tidies tags up as it saves them, so that the COMMENT tag is saved into ‘canonical position’ [7].

So, now that the data is safely in the COMMENT tag, let’s say that for some reason or other I opened up these music files in a graphical tag editor such as Easytag, a Linux-based tag editor I’ve been using for years to do quick-and-simple tag inspections and the occasional tag ‘touch-up’ when needed.

Something I’ve always noticed about Easytag at this point, and which you can see happening in that screenshot, is that whenever it opens any of my tagged music files, it has always emboldened their file names in the left-hand panel you see here. Now, it usually uses bold in that panel to indicate that ‘there are unsaved changes that have been made to these files’… but since I’ve literally just opened the files in the program and haven’t changed a thing, I’ve always assumed this to be a ‘quirk’ and nothing more. How wrong I was! As we shall see!!

Anyway, I’ve opened the files in Easytag, just to inspect the data and you can see that it’s all present as CCDT said it should be: the ‘Comment’ tag is there, nicely labelled ‘Comment’, too.

Now, I’ve seen what I needed to see and haven’t typed a thing, so I go to close Easytag down… and this happens:

That is, a pop-up appears warning you that there are unsaved changes and they’ll be lost if you don’t say to ‘Save’ them. Just press [Enter] at this point to make the pop-up go away (as I think a lot of users would tend to do!) and guess what: you’ve just taken the ‘Save’ option!

So let’s say you’ve hit [Enter] without thinking about it much. Let me now immediately re-inspect the files in my own CCDT program:

…and what do we notice? In ‘canonical position’ [7], we now see a tag called ‘DESCRIPTION’, not ‘COMMENT’. Without really telling us, or explaining why, Easytag has renamed the COMMENT tag so that it is now called DESCRIPTION.

Never mind that its own user interface calls this tag ‘Comment’ in its own right-hand pane. Under-the-hood, Easytag thinks it’s OK to change tag names -and the fact that it had changed them behind the scenes is why it displays the files in bold when they are first opened and declares that there are ‘unsaved changes’ when you go to close the program without having manually changed a thing.
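
For anyone wanting to see what undoing that rename would involve, here is a small text filter that turns a DESCRIPTION line back into a COMMENT line. It works on plain NAME=value lines, such as those produced by metaflac’s --export-tags-to=- option; the function name restore_comment_tag is made up, and the sketch only previews the result rather than writing anything back to a file.

```shell
# Read NAME=value tag lines on stdin and rename DESCRIPTION back to COMMENT,
# leaving every other line untouched. Tag names are matched case-insensitively.
# restore_comment_tag is a hypothetical helper, not part of CCDT or DTC.
restore_comment_tag() {
    awk '{
        eq = index($0, "=")                     # position of the first "="
        name = toupper(substr($0, 1, eq - 1))   # the tag name, upper-cased
        if (eq > 0 && name == "DESCRIPTION")
            print "COMMENT=" substr($0, eq + 1)
        else
            print
    }'
}

# Preview only (assumes metaflac is installed and $f names a FLAC file):
#   metaflac --export-tags-to=- "$f" | restore_comment_tag
```

Only the text before the first equals sign is treated as the tag name, so a performer note that itself contains an ‘=’ survives intact.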

Is the fact that a tag got silently renamed the end of the world? Well, not in-and-of itself, no. It’s not. The music file will play fine; it will display the composer, album and track names properly. Even the embedded album art remains safe. Indeed, even if you inspect the tags in another program, you won’t see any indication of a problem at all:

That’s the Clementine music player’s tag editor, being used to investigate the tags associated with the same music files as before. Note that it still displays the file as having a ‘Comment’, so it’s not particularly fazed by having a differently-named tag supplying the information it wants to display there. It’s reading something now called “DESCRIPTION” and displaying it in something called “Comment”, totally transparently and automatically.

This tells you that a lot of music players will quite happily play these files just fine; they’ll also (usually) allow you to search what is now your DESCRIPTION tag as effectively as they’d previously allow you to search your COMMENT tag. So, functionally, not a lot changes because of Easytag’s sleight-of-hand.

But I wrote a tool a while ago now, called the Dizwell Tag Cleaner (DTC to its friends). If run against a set of previously-tagged music files, it ‘cleans’ their tags up, removing ‘non-canonical’ tags and leaving only the canonical ones behind.

Here’s a little code snippet to explain what it does:

# Fetch existing metadata into variables, stripping the leading "NAME=" from
# metaflac's output (COMPOSER and ARTIST are both read from the Artist tag)
COMPOSER=$(metaflac --show-tag=Artist "$f" | sed 's/^[^=]*=//')
ARTIST=$(metaflac --show-tag=Artist "$f" | sed 's/^[^=]*=//')
ALBUM=$(metaflac --show-tag=Album "$f" | sed 's/^[^=]*=//')
TRACKNUMBER=$(metaflac --show-tag=TRACKNUMBER "$f" | sed 's/^[^=]*=//')