Linefeed

Are you also one of the unlucky ones who uploaded music to YouTube Music without proper local backups, and who now depend on Google's Takeout service to get the files back? Welcome.

The main issue is that you basically get back one huge folder containing all the .mp3's, plus a .csv that lists these songs together with the corresponding album and artist(s).

This would be inconvenient enough, but what is much worse is that the song titles in the .csv often do not match the names of the song files. For instance, the .mp3's have truncated names, several special characters are replaced, etc. So it is basically a huge mess to sort these files back into a meaningful folder structure. Maybe there are some easy ways to do so (please tell me!).

One thing you can do is import all the files into a media player and hope that everything gets sorted automatically. I tried that, and it worked for some albums, but for the most part it didn't, probably because a lot of my files are not tagged. You can tag them manually, but this is an even bigger mess when the files are not grouped by album.
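To check the guess that missing tags are the cause, you can look for files without an ID3v2 header. This is only a rough sketch: it relies on the fact that an ID3v2 tag starts with the three bytes "ID3" at the beginning of the file, it ignores the older ID3v1 tags at the end of a file, and a present header does not guarantee that album and artist are filled in. The function names has_id3v2 and list_untagged are made up for this example.

```shell
#!/bin/sh
# has_id3v2 FILE: succeed if FILE begins with the 3-byte marker "ID3".
has_id3v2() {
    head -c 3 "$1" 2>/dev/null | LC_ALL=C grep -q '^ID3'
}

# list_untagged DIR: print every .mp3 directly inside DIR
# that does not start with an ID3v2 header.
list_untagged() {
    for f in "$1"/*.mp3; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        has_id3v2 "$f" || printf '%s\n' "$f"
    done
}

# Example call (hypothetical folder name):
# list_untagged ./music
```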

So I wrote a little gawk script that at least creates the correct folders and puts as many files as possible into them. All the script does is adjust the song title from the .csv until it matches the real file name, and then copy the file into one of two folder structures: Artist -> Album -> Title or Album -> Title. The first one is probably closer to what you want, but the songs of one album sometimes have different artists. In that case you do not get one folder per album, but one folder for each artist on that album. The second structure avoids this. You may want to use both structures to help you sort everything correctly in the end.

This is all far from perfect, but it helped me a lot to get started. Several issues are still open, for example:

  • when several songs have the same title, they may not be handled correctly
  • even when all songs of an album end up in the correct folder, their order within the album is still unknown (unless the song names happen to sort lexicographically in the right order)
  • I may have missed necessary corrections of the .csv entries, since I only tested the script with my own songs
  • Google may have changed the .csv structure by the time you read this, in which case the script will no longer work
  • ...
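For the first issue it helps to know up front which titles occur more than once. A small sketch, assuming the song title is the first column of the .csv (as in the script below); the function name duplicate_titles is made up for this example. It only finds exact duplicates, not titles that become identical after truncation.

```shell
#!/bin/sh
# duplicate_titles: read one title per line on stdin and print
# each title that occurs more than once (printed once per title).
duplicate_titles() {
    LC_ALL=C sort | uniq -d
}

# Example call; needs a gawk version that supports --csv:
# gawk --csv 'NR > 1 { print $1 }' "music uploads metadata.csv" | duplicate_titles
```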

This script is meant more as a template that you need to adapt to your needs. I also only tested it on a Mac, so use it at your own risk. Note that the awk script does not change the original .csv, and since it copies rather than moves, the .mp3's stay where they are. But, of course, only run all this in a backed-up test folder.

Preparation

I assume here that you got a bunch of zipped archives named something like takeout-20251213T042017Z-3-001.zip. Inside each archive you find this directory structure: YouTube and YouTube Music -> music (library and uploads). The music folder mainly contains a bunch of .mp3's. Copy the contents of all these music folders into one new common folder. It will then contain all the .mp3's and two .csv's:

  • music library songs.csv – this seems to be the whole library (not only uploads), so I ignored this one.
  • music uploads metadata.csv – this holds some information about the uploaded (and now downloaded) songs.

Also copy the sort_takeout.awk script (see below) into this folder.
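The copy step can be scripted as well. A minimal sketch, assuming all archives have already been unzipped somewhere below one directory and that the music folders are really named "music (library and uploads)"; the function name collect_music and the example paths are made up. Files with the same name overwrite each other in the target folder.

```shell
#!/bin/sh
# collect_music SRC DEST: copy the contents of every folder named
# "music (library and uploads)" below SRC into the single folder DEST.
collect_music() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # "dir/." copies the contents of dir, not the folder itself
    find "$src" -type d -name 'music (library and uploads)' \
        -exec sh -c 'cp -R "$1"/. "$2"/' sh {} "$dest" \;
}

# Example call (hypothetical paths):
# collect_music ~/Downloads/takeout ~/takeout_music
```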

Running the Script

The script creates the Artist -> Album -> Title structure by default. If you want the Album -> Title structure instead, first change the createArtistAlbumTitle($1,$2,$3) call to createAlbumTitle($1,$2,$3).

Run the script (the --csv option needs gawk 5.3 or newer): gawk --csv -f sort_takeout.awk music\ uploads\ metadata.csv > log

There should be no output on the terminal at all, unless something is going fundamentally wrong. When the script is done, check the log file to see whether all files were copied (search for File not found:).
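Searching the log can be done with grep. A small sketch that relies on the two message prefixes the script prints ("Copying " and "File not found: "); the function name summarize_log is made up for this example.

```shell
#!/bin/sh
# summarize_log LOG: print how many files were copied and how many
# were not found, followed by the names of the missing files.
summarize_log() {
    log=$1
    copied=$(grep -c '^Copying ' "$log")
    missing=$(grep -c '^File not found: ' "$log")
    echo "copied: $copied, missing: $missing"
    grep '^File not found: ' "$log" | sed 's/^File not found: //'
}

# Example call:
# summarize_log log
```

Note that a "Copying" line only means the script tried to copy the file; errors from cp itself show up on the terminal, not in the log.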

Here is the sort_takeout.awk script:

# Run with:
# gawk --csv -f sort_takeout.awk music\ uploads\ metadata.csv > log

NR==1 {next} # skip header line in CSV

# Clean up "Song Title" from CSV
{if (length($1) > 47) $1 = substr($1, 1, 47)} # titles longer than 47 are truncated
{sub(/\.mp3$/,"",$1)} # in case the title already ends with .mp3, remove the ending
{gsub(/'/,"_",$1)}    # ' -> _
{gsub(/"/,"_",$1)}    # " -> _
{gsub(/:/,"_",$1)}    # : -> _
{gsub(/\//,"_",$1)}   # / -> _
{gsub(/\?+/,"_",$1)}  # ? -> _   (also multiple '?')
{$1 = $1".mp3"}       # add the ending .mp3

# Clean up "Album Title" from CSV
{if ($2 == "") $2 = "unknown"}    # if no album title -> "unknown"
{gsub(/\//,"_",$2)}               # '/' -> '_'

# Clean up "Artist Name 1" from CSV
{if ($3 == "") $3 = "unknown"}    # if no artist name -> "unknown"
{gsub(/\//,"_",$3)}               # '/' -> '_'

# create directories for each file and copy the file there
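{
    if ($1 == ".mp3")
        print "No title given!"
    else if (checkIfFileExists($1) == 0)
        createArtistAlbumTitle($1,$2,$3)
    else
        print "File not found: " $1
}

# Wrap a string in single quotes so the shell passes it on unchanged,
# even if it contains ", $, ` or spaces
function shquote(s)
{
    gsub(/'/, "'\\''", s)
    return "'" s "'"
}

# returns 0 if the file exists (exit status of "test -f")
function checkIfFileExists(file)
{
    return system("test -f " shquote(file))
}

# returns 0 if the directory exists (exit status of "test -d")
function checkIfDirExists(dir)
{
    return system("test -d " shquote(dir))
}

# copy the file into Artist/Album/, creating the folders if needed
function createArtistAlbumTitle(title, album, artist)
{
    print "Copying " artist " - " album " - " title
    if (checkIfDirExists(artist) == 0) true; else system("mkdir " shquote(artist))
    if (checkIfDirExists(artist "/" album) == 0) true; else system("mkdir " shquote(artist "/" album))
    system("cp " shquote(title) " " shquote(artist "/" album))
}

# copy the file into Album/, creating the folder if needed
function createAlbumTitle(title, album, artist)
{
    print "Copying " album " - " title
    if (checkIfDirExists(album) == 0) true; else system("mkdir " shquote(album))
    system("cp " shquote(title) " " shquote(album))
}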

I left some if (bla) true; in there to easily add some debugging if needed.