Uniqueness in Giocoso v. Giocoso Pro

1.0 Defining Uniqueness in Giocoso

Giocoso uses synthetic primary keys for its local database tables: each new insert is assigned a sequence number that is entirely meaningless except for the fact that it's guaranteed to be unique. The reason is that the natural primary key for music is (as this article explains at length) a combination of Composer Name+Extended Composition Name, and those two things together can make a very long string, which makes searching, and thus guaranteeing uniqueness, rather slow (searching for 'Jean-Joseph Cassanéa de Mondonville + In decachordo psalterio (Higginbottom - 1987)' is always going to be slower and harder than searching for the number '5123', for example). When Giocoso was first designed, speed and lightness of touch were considered key design goals and, indeed, they remain so. If you run Giocoso Version 3.30 or above in local-only mode, your primary keys are still going to be synthetic sequence numbers.

Giocoso Pro introduces new design considerations, however: it requires the use of a 'proper' relational database (MySQL or its cousin MariaDB), probably on semi-dedicated hardware, so maintaining 'lightness of touch' can't really be considered a major issue any longer. Accordingly, Giocoso running in Pro mode uses new GLOBAL_PLAYS and GLOBAL_RECORDINGS tables which do use natural primary keys, though somewhat indirectly! If you take a recording's composer and concatenate it with its extended composition name, you can pass the result through a hashing algorithm to produce a 'hash value' which is, for all practical purposes, unique to those particular textual input values (two different inputs could in theory produce the same hash, but for data of this kind the chance is vanishingly small). For speed of computation, Giocoso Pro uses the MD5 hashing algorithm to derive these 'semi-synthetic' key values. If you feed that algorithm the value "BeethovenSymphony No. 5 (Karajan - 1962)", which is a mash-up of one example composer+extended composition name, you get a return value of "8bd561c329191dc779d7b607ccfa6a03", which certainly looks ugly and synthetic, but actually counts as a natural key, because its source data are two completely natural key values. Feed the same algorithm the value "BeethovenSymphony No. 6 (Karajan - 1962)" and you get a return of "c22e22b360def939f694033cc78f18db": you get a wildly different result from simply changing the text value of the symphony name from '5' to '6', so the hash value is effectively unique per composer+composition data.
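As a minimal sketch of that derivation (not Giocoso Pro's actual code), the concatenate-then-hash step might look like this in Python. The function name rec_hash, the UTF-8 encoding and the absence of any separator between the two fields are all assumptions made for illustration; if Giocoso prepares the string differently (with a trailing newline, say), the digests printed here would not match the ones quoted above.

```python
import hashlib


def rec_hash(composer: str, extended_composition: str) -> str:
    """Return an MD5 hex digest of composer + extended composition name.

    Assumptions: plain concatenation with no separator, UTF-8 encoding.
    """
    natural_key = composer + extended_composition
    return hashlib.md5(natural_key.encode("utf-8")).hexdigest()


fifth = rec_hash("Beethoven", "Symphony No. 5 (Karajan - 1962)")
sixth = rec_hash("Beethoven", "Symphony No. 6 (Karajan - 1962)")

print(fifth)           # 32 hexadecimal characters
print(sixth)           # a completely different 32 characters
print(fifth != sixth)  # True
```

The same inputs always give the same digest, which is what lets the hash stand in for the long natural key.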

Of course, you can play the same recording multiple times, so whilst that composer+composition key is valid for the GLOBAL_RECORDINGS table (and is there called the RECHASHVALUE), it won't suffice in the GLOBAL_PLAYS table. For that table, you need to bolt on the time of play, so that the natural key becomes composer+composition+play date, and passing that mash-up of text through the MD5 algorithm returns something that Giocoso Pro calls the PLAYHASHVALUE.
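A play key could be sketched the same way. The article doesn't say how the play date is formatted before hashing, so the timestamp format below is purely an assumption, as are the function name play_hash and the example play times.

```python
import hashlib
from datetime import datetime


def play_hash(composer: str, extended_composition: str, played_at: datetime) -> str:
    """Return an MD5 hex digest of composer + composition + time of play.

    Assumption: the play time is rendered as 'YYYY-MM-DD HH:MM:SS' and
    appended with no separator; the real format may well differ.
    """
    stamp = played_at.strftime("%Y-%m-%d %H:%M:%S")
    natural_key = composer + extended_composition + stamp
    return hashlib.md5(natural_key.encode("utf-8")).hexdigest()


# Two hypothetical plays of the same recording on different evenings.
first_play = play_hash("Beethoven", "Symphony No. 5 (Karajan - 1962)",
                       datetime(2025, 1, 28, 20, 0, 0))
second_play = play_hash("Beethoven", "Symphony No. 5 (Karajan - 1962)",
                        datetime(2025, 1, 29, 20, 0, 0))

print(first_play != second_play)  # True: same recording, two distinct plays
```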

2.0 Consequences of different uniqueness definitions

The consequences of this new definition of 'uniqueness' or primary key-ness in the Pro plays and recording tables are quite significant.

Since non-Pro Giocoso doesn't really have meaningful unique identifiers (synthetic sequence numbers by definition can't really be used to mean anything), it has previously decided whether a recording is unplayed with a simple test: "if the recording's full path and filename appears in the PLAYS table, that recording has been played; if it doesn't, then it hasn't". In other words, the physical file locations were used to functionally identify a recording and its plays. Of course, this means that if you ever alter the physical location of a recording, it will seem to be a new recording to Giocoso and thus previously unplayed. If you've stored a Sinfonietta in the /orchestral folder, for example, and then decide that it really ought to be stored in the /symphonic folder instead, the physical movement of the file will mean a 'new' recording appears in the RECORDINGS table, and since that particular /symphonic path has never appeared in the PLAYS table, it will count as an unplayed recording, even if you'd played it 104 times from its original /orchestral location!
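The path-based test can be sketched in a few lines. This is only an illustration of the logic, with the PLAYS table stood in for by a Python set; the function name and the file names under /orchestral and /symphonic are hypothetical.

```python
def is_played_non_pro(file_path: str, played_paths: set) -> bool:
    """Non-Pro style test: a recording counts as played only if its
    exact full path and filename has been recorded as a play."""
    return file_path in played_paths


# Hypothetical stand-in for the paths held in the PLAYS table.
played_paths = {"/orchestral/Sinfonietta/01 - Allegretto.flac"}

old_path = "/orchestral/Sinfonietta/01 - Allegretto.flac"
new_path = "/symphonic/Sinfonietta/01 - Allegretto.flac"  # same file, moved

print(is_played_non_pro(old_path, played_paths))  # True
print(is_played_non_pro(new_path, played_paths))  # False: looks unplayed
```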

Giocoso Pro's use of Composer+Composition Name to act as the new primary key for a recording (albeit after being translated into a RECHASHVALUE piece of MD5 gibberish) means that the unique identifier for a recording is now derived entirely from its metadata, or tagging values, not its physical location on disk. Beethoven+Symphony No. 5 (Karajan - 1962) will always be 'Beethoven+Symphony No. 5 (Karajan - 1962)' whether you store that recording in /symphonic, move it to /orchestral, move it again to /symphonies/Karajan or what have you. The test for 'have I played this recording before' likewise becomes 'does this RECHASHVALUE appear in the GLOBAL_PLAYS table, or not?' Since RECHASHVALUE is invariant so long as the metadata in the FLACs doesn't change, the new test will show a recording has been played no matter how many times you move it on disk ...but the second you alter its metadata, it will once more appear to be a new recording that has never been played before.
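Here is a rough sketch of the metadata-based test, again not taken from Giocoso Pro itself. It assumes the composer and extended composition name are read from the COMPOSER and ALBUM tags, that they are joined without a separator and UTF-8 encoded, and it uses a set as a stand-in for the hash values stored in the plays table. The "1963" tag edit is a made-up example.

```python
import hashlib


def rec_hash(composer: str, extended_composition: str) -> str:
    """MD5 of composer + extended composition name (assumed: no separator, UTF-8)."""
    return hashlib.md5((composer + extended_composition).encode("utf-8")).hexdigest()


def is_played_pro(tags: dict, played_hashes: set) -> bool:
    """Pro-style test: only the tag values matter, never the file path."""
    return rec_hash(tags["COMPOSER"], tags["ALBUM"]) in played_hashes


recording = {
    "path": "/symphonic/Beethoven/Symphony No. 5 (Karajan - 1962)",
    "tags": {"COMPOSER": "Beethoven", "ALBUM": "Symphony No. 5 (Karajan - 1962)"},
}

# Record one play of the recording as currently tagged.
played_hashes = {rec_hash(recording["tags"]["COMPOSER"], recording["tags"]["ALBUM"])}

# Moving the file changes nothing that feeds the hash.
recording["path"] = "/orchestral/Beethoven/Symphony No. 5 (Karajan - 1962)"
print(is_played_pro(recording["tags"], played_hashes))  # True

# Editing a tag (hypothetical change of year) produces a new hash.
recording["tags"]["ALBUM"] = "Symphony No. 5 (Karajan - 1963)"
print(is_played_pro(recording["tags"], played_hashes))  # False: looks unplayed
```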

As a result of this shift from 'physical location' to 'metadata contents' as the defining characteristic of a unique recording, you may find that your non-Pro statistics differ a little from your Pro ones.

Think of a recording that lived in /orchestral and which you later moved to /symphonic, but haven't yet played from that new location: Giocoso non-Pro would say you have one unplayed recording. Assuming the metadata wasn't changed, however, Giocoso Pro would say that the recording has been played, no matter where the file is physically stored, so would report zero unplayed recordings.

Conversely, think of a recording whose physical location you haven't altered but whose metadata has been altered. Imagine you stored a recording of Vaughan Williams's 'Job' in a folder called /ballet/Job (Boult - 1971), for example, but accidentally tagged it as "Job (Bolt - 1971)" when applying metadata to the file. You played that file in non-Pro Giocoso, then noticed the mis-spelling of Boult's name and corrected it in the ALBUM tag. Since you only modified the metadata tag, non-Pro Giocoso still sees that "/ballet/Job (Boult - 1971)" has been played, because the physical path has not changed... but Giocoso Pro will see "Job (Boult - 1971)" as an entirely different recording from "Job (Bolt - 1971)" and will therefore regard the metadata-altered recording as new and unplayed: on this occasion, Pro will report more unplayed recordings than the non-Pro statistics would have shown.

3.0 Did it have to be this way?

You might reasonably ask why Giocoso Pro changes its uniqueness identifiers to be metadata-based, rather than physical-location-on-disk-based. After all, Axiom 17 of the Classical Tagging article says that the physical should map the logical and vice versa, so there shouldn't really be a significant difference between either method of determining uniqueness. However, Giocoso Pro was explicitly designed to allow multiple devices to share a database ...and different devices might well 'see' their music files in different locations, even though they are all accessing the one NAS or shared hard drive over the network. My desktop PC, for example, has a folder structure of /home/hjr/Music/R/Ralph Vaughan Williams... but my laptop has things mapped as /sourcedata/Music/R/Ralph Vaughan Williams, and my listening room PC has exactly the same folder mapped as /netmusic/R/Ralph Vaughan Williams. In a world of heterogeneous computing devices and storage paths, you cannot rely on the physical path of a music file to be a reliable indicator of uniqueness or played-ness. All three of the computing devices I just mentioned, however, see the same COMPOSER and ALBUM tags, so the same recording will be unique for all of them, regardless of the path to the specific file each has to use.
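To make the point concrete, here is a small sketch using the three mount points just mentioned. The COMPOSER tag value and the choice of 'Job (Boult - 1971)' as the recording are assumptions for illustration, as is the way the hash is built (no separator, UTF-8).

```python
import hashlib


def rec_hash(composer: str, extended_composition: str) -> str:
    """MD5 of composer + extended composition name (assumed: no separator, UTF-8)."""
    return hashlib.md5((composer + extended_composition).encode("utf-8")).hexdigest()


# The same shared folder as seen by three different machines.
device_paths = {
    "desktop": "/home/hjr/Music/R/Ralph Vaughan Williams",
    "laptop": "/sourcedata/Music/R/Ralph Vaughan Williams",
    "listening room": "/netmusic/R/Ralph Vaughan Williams",
}

# The tags live inside the files, so every device reads identical values.
tags = {"COMPOSER": "Ralph Vaughan Williams", "ALBUM": "Job (Boult - 1971)"}

device_hashes = {
    device: rec_hash(tags["COMPOSER"], tags["ALBUM"]) for device in device_paths
}

print(len(set(device_paths.values())))   # 3 distinct paths
print(len(set(device_hashes.values())))  # 1 shared key
```

A path-based key would give three different identities for one recording here; the tag-based key gives one.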

So, the switch from physical location to logical metadata values as the uniqueness identifier for a recording made sense for Giocoso Pro. Giocoso run locally without any Pro capabilities, however, continues to use the physical location as the determinant of uniqueness: it's something to be aware of if you're switching back and forth between Pro and non-Pro modes of operation.

Would it be nice to go back and re-design non-Pro Giocoso to also use RECHASH and PLAYHASH unique keys? Probably... but for compatibility reasons, I won't (for now, at least: if there's ever a Giocoso Version 4, that will be one reason for the jump to the new whole number version).

4.0 And finally...

The last thing to mention about the consequences of switching from non-Pro to Pro modes of operation is that the change in uniqueness and unplayed-ness directly affects this website. It has long tracked how much of my music collection remains unplayed by Giocoso. I switched the production listening machine over to Giocoso Pro on January 28th, 2025... with an immediate leap in the unplayed percentage 🙁 That immediate upwards lunge in the number of unplayed recordings is directly attributable to the fact that, in the past, I have noticed tagging errors and corrected them without also modifying the physical paths of the files involved. So non-Pro Giocoso was happy that they'd been played, but Giocoso Pro was computing new RECHASHVALUES based on their modified tag data, which made them look new and unplayed. Fortunately, only around 40 recordings were affected in this way, so the upward movement in the 'percentage unplayed' statistic was/will be only temporary as those 40-odd recordings get played once more, triggering the arrival of their corrected RECHASHVALUES in the global plays table at last.