Migrating Media Meta-Data from Banshee to Rhythmbox

When I upgraded my home system from Debian 9 (Stretch) to Debian 10 (Buster), I noticed that my long-time music player Banshee could no longer start its UI. The last stable version of Banshee was published in 2014 and the project died after that.

Then why is this worth a blog entry? Before I was using Banshee, I had used iTunes on OS X since approximately 2004. When I moved my mp3 music archive from OS X to GNU/Linux, I migrated all the meta-data which was not stored within the mp3 files themselves from iTunes to Banshee.

Having accumulated over fifteen years, this listening meta-data does have some value to me.

Rather than joining the tedious game of trying to get Banshee to run by downgrading libraries, running some container stuff I don't really want to use, or similar, I decided to move away from this unmaintained project to a maintained one.

Requirements

Since my requirements are not that complicated, several programs might be suitable.

Choice of a Tool

Almost any non-trivial music manager should meet my requirements. Therefore, I looked for any migration tool that helps with moving away from Banshee.

It was difficult to find anything.

The only promising page I found was this one, which describes a migration path to Rhythmbox. Rhythmbox seems to be very much alive and widely used. It matches my set of requirements. Ironically, there once was a migration going on in the opposite direction, from Rhythmbox to Banshee, because Rhythmbox was expected to die.

Migration

One comment on the article even linked to a Python script for migrating ratings and play-count meta-data from Banshee to Rhythmbox. A quick test run showed that this script works.

I had to extend this script so that the date-added and last-played information is migrated as well. So I took a look at the Banshee SQLite database format as well as the XML of Rhythmbox.

Then I studied how the Python script works and extended it with the additional meta-data.
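
For anyone who wants to repeat that look at the Banshee side, here is a minimal sketch that uses only Python's standard library. The helper name list_columns is made up for this illustration; CoreTracks is the table that the migration script further below queries.

```python
import sqlite3


def list_columns(con, table="CoreTracks"):
    """Return (column name, declared type) pairs for the given table.

    `con` is an open sqlite3 connection, for example
    sqlite3.connect('banshee.db').
    """
    # PRAGMA table_info yields rows of
    # (cid, name, type, notnull, dflt_value, pk).
    # PRAGMA statements do not accept bound parameters, so the table name
    # is put into the statement text. Only pass trusted table names.
    rows = con.execute('PRAGMA table_info("{}")'.format(table)).fetchall()
    return [(row[1], row[2]) for row in rows]
```

Calling list_columns(sqlite3.connect('banshee.db')) prints nothing by itself; wrap it in print() to see whether columns such as Rating, DateAddedStamp and LastPlayedStamp are present in your database.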

If anybody else still uses Banshee and needs to migrate to a supported alternative, this blog article should help.

Here are the steps of my migration:

  1. Locate the Banshee database, which is usually at: ~/.config/banshee-1/banshee.db
  2. It is crucial that you don't move or rename mp3 files between the time you stopped using Banshee and the migration to Rhythmbox. This is because the absolute file path is used to match meta-data between the two databases.
  3. Start Rhythmbox.
    • Change its music library root to the mp3 library sub-hierarchy.
    • Wait until all the mp3 files have been indexed and written to its XML file.
  4. Quit Rhythmbox.
  5. Locate the Rhythmbox XML file, which is at: ~/.local/share/rhythmbox/rhythmdb.xml
  6. Get the Python script and place it in a directory with the two music library files from above.
  7. Invoke the meta-data migration with python3 THISPYTHONSCRIPT.py
  8. Move the modified rhythmdb.xml to ~/.local/share/rhythmbox/ and keep a backup of the previous version just to be sure.
  9. Start Rhythmbox
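
Because the matching described in step 2 only works when both files contain identical locations, it can be worth counting the matches before running the migration in step 7. The following is a sketch, not part of the original script: the function name count_matches is hypothetical, and it assumes the same layout the migration script relies on, namely song entries with a location child in rhythmdb.xml and a Uri column in Banshee's CoreTracks table.

```python
import sqlite3
import xml.etree.ElementTree as ET


def count_matches(root, con):
    """Return (matched, total) over all song entries of a Rhythmbox XML root.

    An entry counts as matched when its <location> text is found
    verbatim in the Uri column of Banshee's CoreTracks table.
    """
    matched = 0
    total = 0
    for entry in root:
        if entry.get("type") != "song":
            continue  # skip radio stations, podcasts and ignored files
        total += 1
        location = entry.findtext("location")
        row = con.execute(
            "SELECT 1 FROM CoreTracks WHERE Uri = ? LIMIT 1", (location,)
        ).fetchone()
        if row is not None:
            matched += 1
    return matched, total
```

With the two files in the current directory, count_matches(ET.parse('rhythmdb.xml').getroot(), sqlite3.connect('banshee.db')) returns the pair of numbers. If the first number is much smaller than the second, the file paths probably differ between the two libraries and the migration would carry over little or nothing.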

Now you should be able to see the migrated meta-data from Banshee in Rhythmbox.
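
If you prefer to check the result without clicking through the Rhythmbox UI, the XML file can be summarized directly. This is a small sketch with a hypothetical helper name (summarize); the tag names are the ones the migration script writes, and everything else is an assumption about your file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tags written by the migration script.
FIELDS = ("rating", "play-count", "first-seen", "last-played", "comment")


def summarize(root):
    """Count song entries and how many of them carry each meta-data tag."""
    counts = Counter()
    for entry in root:
        if entry.get("type") != "song":
            continue
        counts["songs"] += 1
        for field in FIELDS:
            if entry.find(field) is not None:
                counts[field] += 1
    return counts
```

Running summarize(ET.parse('rhythmdb.xml').getroot()) on the file before and after the migration and comparing the two results shows how many tracks gained ratings, play counts and so on.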

#!/usr/bin/env python3

"""
Copyright (c) 2009 Wolfgang Steitz
Additional adaptations by Karl Voit

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA
"""

import sys
import sqlite3
from lxml import etree

RB_DB = 'rhythmdb.xml'
BA_DB = 'banshee.db'

class banshee_db():
    def __init__(self, file):
        self.con = sqlite3.connect(file)

    def get_song_info(self, url):
        try:
            res = self.con.execute(
                'select Rating, Playcount, DateAddedStamp, LastPlayedStamp, Comment from CoreTracks where uri = ?',
                (url,) ).fetchone()
            if res is None:
                return None, None, None, None, None
            else:
                return res
        except sqlite3.Error:
            return None, None, None, None, None


banshee = banshee_db(BA_DB)

tree = etree.parse(RB_DB)
root = tree.getroot()
for song in root:
    if song.get("type") == 'song':
        #print('song: ' + str(song))

        location = None
        rating = None
        playcount = None
        dateadded = None   ## Rhythmbox: first-seen
        lastplayed = None  ## Rhythmbox: last-played
        comment = None     ## Rhythmbox: comment

        # Iterate over a snapshot of the children so that removing
        # elements cannot disturb the loop.
        for attr in list(song):
            if attr.tag == 'location':
                location = attr.text
            elif attr.tag == 'rating':
                rating = attr.text
                # Removed here and re-added below; otherwise the entry
                # would end up with two rating elements.
                song.remove(attr)
            elif attr.tag == 'play-count':
                playcount = int(attr.text)
                song.remove(attr)
            elif attr.tag == 'first-seen':
                dateadded = attr.text
                song.remove(attr)
            elif attr.tag == 'last-played':
                lastplayed = attr.text
                song.remove(attr)
            elif attr.tag == 'comment':
                comment = attr.text
                song.remove(attr)



        rating_banshee, playcount_banshee, \
            dateadded_banshee, lastplayed_banshee, \
            comment_banshee = banshee.get_song_info(location)

        if rating is None: # no rating in rhythmbox XML
            if not (rating_banshee == 0 or rating_banshee is None):
                rating = rating_banshee
                #print('set rating to ' + str(rating_banshee))

        if not (playcount_banshee == 0 or playcount_banshee is None):
            if playcount is None:
                playcount = playcount_banshee
                #print('set playcount to ' + str(playcount_banshee))
            else:
                playcount += playcount_banshee

        # insert rating into rb db
        if rating is not None:
            element = etree.Element('rating')
            element.text = str(rating)
            song.append(element)

        # update playcount of rb db
        if playcount is not None:
            element = etree.Element('play-count')
            element.text = str(playcount)
            song.append(element)

        # Prefer the Banshee value. Fall back to the value Rhythmbox already
        # had, because the loop above removed those elements from the entry.
        for tag, banshee_value, rhythmbox_value in (
                ('first-seen', dateadded_banshee, dateadded),
                ('last-played', lastplayed_banshee, lastplayed),
                ('comment', comment_banshee, comment)):
            value = banshee_value if banshee_value is not None else rhythmbox_value
            if value is not None:
                #print('set ' + tag + ' to ' + str(value))
                element = etree.Element(tag)
                element.text = str(value)
                song.append(element)

tree.write(RB_DB)	  

Initial Opinion on Rhythmbox

It does the trick. However, there are some more or less minor drawbacks.

Rhythmbox does not find search terms in comments when using "Search all fields". I consider this a bug and will try to find out more about it.
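
Until then, comments can be searched outside of Rhythmbox by reading its XML file. The following is only a workaround sketch with a made-up function name (find_by_comment); it relies on the comment and location tags that the migration script uses.

```python
import xml.etree.ElementTree as ET


def find_by_comment(root, term):
    """Return the locations of all songs whose comment contains `term`.

    The comparison ignores upper and lower case.
    """
    needle = term.casefold()
    hits = []
    for entry in root:
        if entry.get("type") != "song":
            continue
        comment = entry.findtext("comment") or ""
        if needle in comment.casefold():
            hits.append(entry.findtext("location"))
    return hits
```

For example, find_by_comment(ET.parse('rhythmdb.xml').getroot(), 'live') returns the locations of all tracks with "live" somewhere in their comment.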

Furthermore, Rhythmbox does not present the album art images in a way that lets me enjoy them at a decent size. They are part of the UI, but more or less as thumbnails. This is a sad situation given that I run Rhythmbox in full screen on a high-definition screen with lots of unused screen real estate within Rhythmbox.

So far, I could not find out how to modify the comments of all selected tracks of an album at once.

Existing "Automatic Playlists" cannot be modified and I can't see how they are defined.

I get mixed feelings when I search for Rhythmbox plugins. Every plugin I was interested in was unmaintained or not prepared for the Rhythmbox version 3.4.3 I've got. Parts of the Rhythmbox ecosystem do seem to be dead somehow.
