For many years, I was a happy user of the Android app MX Player. I solely play back local video files which I sync via Syncthing to my phone. Therefore, I disabled the network permission of MX Player for security reasons using NetGuard.
MX Player is a very fine app and I would never have started looking for a replacement if it were not for recent updates: since then, MX Player seems to wait for some time-outs to happen when started. This has the negative effect that I see the startup screen of MX Player for maybe a minute or so until the app has actually started and I can interact with it.
This is simply a no-go for me.
As a workaround, I could open any video file from my file browser app. This invokes MX Player, which opens the file instantly. However, this has some disadvantages, such as not being able to see which files were played back recently.
This made me look for alternatives, which I did via this Mastodon message.
I got several answers and VLC for Android in particular caught my eye. Before I switched to MPlayer and later mpv, VLC was my default desktop video player for a couple of years. It's open source and therefore, I assume the app is not going to introduce bad patterns anytime soon.
I evaluated VLC for Android for a week and summarized my findings below.
This is the perfect situation to summarize my personal requirements in order to make a good tool choice.
So this is my list of requirements for a mobile video playback application. Bold requirements are my must-have requirements.
I added VLC comments where the VLC app doesn't quite fulfill a requirement:
As you can see, VLC ticks almost all the boxes. Unfortunately, the poor UX of forward/backward seeking, the out-of-sync picture/audio and the missing indicators for currently playing or already played files cause me to stay with MX Player until I find a better alternative.
At least, I now have my list of requirements to check against for an efficient app analysis.
I was on air with an audio comment in episode 239 of the podcast Methodisch inkorrekt!, where I got to say a few things about the topics "How to choose an authentication app" and password security in general. It referred to the discussion about "Google" in podcast episode 238 "Mö Mö", where Reini had made a somewhat flippant comment on the subject.
In this article, I want to describe the part with the tips on handling passwords and two-factor authentication (2FA).
Of course, there are still interesting details to look at here. Therefore, in the following chapters I go into the essential pitfalls and background in a bit more detail. Even if it gets a bit dry in places, I still recommend working through it at least once, because the security of all your data and also of your money depends on it.
I am a fan of the password generation method that became known as the xkcd method. An xkcd web comic made this method known to a larger number of users.
The advantages of the method are that the generated passwords are easy for humans to remember, are quick to type because they consist of normal words, and are nevertheless very, very hard to crack.
The trick is to combine several normal words in a combination that has never been used anywhere in the world before. With four words and the spaces between them, you already end up with more than twenty characters on average. That results in a long password, which is then also called a passphrase. Due to its length, its entropy (a measure of information content or chaos) is so high that even high-performance computers would currently have to compute for centuries to crack the password by trial and error.
The essential point is that this combination is truly unique. The long password "Never Gonna Give You Up" is no longer a secure password at all because of its use in the song of the same name, since attackers can quickly look up all known phrases in so-called rainbow tables. The same applies to all sentences from all books, to place names, and in any case to dates and to passwords shorter than roughly ten to twelve characters - unfortunately, this limit keeps moving as computers get faster.
Various password managers help to create passwords of this kind by randomly stringing together any number of words from dictionaries. This results in passphrases like "super Schloss Regenbogen laufen". With a bit of creativity, you can come up with a mnemonic for such a passphrase, which makes these secure passwords easy to remember as well.
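To make the idea more concrete, here is a minimal command-line sketch of such a passphrase generator. It assumes a GNU/Linux system where shuf and a word list like /usr/share/dict/words are installed; the word list path and the number of words are just example assumptions, not part of any particular password manager:
#!/usr/bin/env sh
# Pick four random words from the system word list and join them with spaces.
# Four words chosen randomly from a list of about 7776 words yield roughly
# 4 x log2(7776), i.e. about 51.7 bits of entropy.
shuf -n 4 /usr/share/dict/words | tr '\n' ' '
echo
#end
Important: the words have to come from the random generator, not from your own head, otherwise the entropy estimate no longer holds.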
The best password is useless if it falls into the wrong hands. To minimize this risk, it is necessary to use a password for exactly one service only.
If you don't take this tip seriously, it happens all too easily that a security breach at one service provider (for example, a food delivery service) automatically allows all accounts at other services (for example, your company account, your private e-mail account, ...) to be hacked. Attackers' computer programs already do this fully automatically as soon as a password becomes known. Nowadays, the user name is often just the (same) e-mail address anyway, so there is no need to guess the user name.
For this reason, you must never use one password for two different purposes.
It follows from the previous points that users are now confronted with a large number of long passwords, of which most people only want to memorize a handful.
This is where a trustworthy password manager helps.
A password manager is a piece of software that asks for a so-called master password when it is opened. Only after the master password has been entered successfully does the application open and give access to the passwords and other information stored in it in encrypted form.
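Conceptually, this is not much more than symmetric encryption with a key derived from the master password. The following sketch reduces the idea to a single file using OpenSSL; it is of course not how KeePassXC or any other specific password manager is implemented, and the file names and iteration count are arbitrary example values:
#!/usr/bin/env sh
# Encrypt the plain-text password list with a key derived (via PBKDF2) from the
# master password that OpenSSL asks for interactively:
openssl enc -aes-256-cbc -pbkdf2 -iter 600000 -salt -in passwords.txt -out passwords.enc
# Decrypt it again; this only works after entering the correct master password:
openssl enc -d -aes-256-cbc -pbkdf2 -iter 600000 -in passwords.enc -out passwords.txt
#end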
I myself have well over a thousand entries in my password manager, since I also store things like debit card PINs, combination lock codes, some passwords of relatives as a backup, and so on. And really all of them have different passwords.
As already mentioned in this article, all cloud-based services are automatically ruled out from the point of view of verifiable security as well as confidentiality. There is no way for independent experts to audit the source code in use or the processes around a trustworthy operation.
Nobody can guarantee that a possibly malicious employee won't sell information to interested parties. And especially when it comes to passwords, there are very many, extremely well-paying interested parties. This applies to all past, current and future employees with direct or indirect access to this information. That's hundreds or thousands of potential candidates.
Furthermore, the software in use can have implementation weaknesses or it may be used in the wrong way. It doesn't even take malicious intent to endanger all passwords this way.
At one of the best-known password management services, LastPass, this point was reached in December 2022. They had to publicly admit that the absolute worst case had already happened several months earlier and that a group of attackers had gained access to all passwords. heise reported on it as well. Also interesting is this English-language source, which explains what can be read between the lines of LastPass's press release.
In the course of the LastPass incident, it also became known that not all user data had been encrypted. Furthermore, it became known that, contrary to the statements on their web site, a large portion of the passwords was only insufficiently encrypted.
As a consequence, every password that was ever stored at LastPass has to be considered compromised, since the attackers can now crack the exfiltrated, only partially encrypted information at their leisure. Depending on the password and the type of encryption, this can be very quick.
In February 2023, further details of the affair became known, showing that the impact was even worse than previously known.
Security experts had warned for years before this incident that this would happen one day. People who put convenience before security fell flat on their faces and endangered their own security as well as that of their companies, friends, relatives and so on.
In that respect, this should be a warning to everyone about storing sensitive data with cloud providers. Further information can be found in this English-language article. After reading it, at the latest, you can no longer seriously believe in trustworthy cloud providers that only have the best in mind for their supposed customers. "Supposed" because they are, after all, accountable to their shareholders. If shareholder value can be increased by also reusing people's data in other ways, the temptation is often too great to deliberately turn down that much money. Data brokers are just one billion-dollar business of many that are very tempting here. And that is only a single aspect of very many, all of which have to do with the loss of the provider's trustworthiness.
The best password manager is of no use if the computer on which it is used is not trustworthy. At the latest when the user legitimately unlocks it or logs in somewhere, malware can get hold of the passwords.
For this reason, a few things need to be taken into account here as well.
An absolute must is an ongoing support period from the software vendor as well as promptly installing all security updates. This means that you should not run software for which no new security updates are released when vulnerabilities become publicly known. This applies in particular to mobile phones if the password store is to be used there as well. Unfortunately, most manufacturers do not offer many years of support here. For Windows systems, you can check here.
If you run a system outside its support period, you are defenselessly exposed to potential attackers, who can pick whatever suits them from all the vulnerabilities that have become known in the meantime.
Personally, I don't really need a password store on my phone and therefore only synchronize between my notebook and my desktop computer (without any cloud, using Syncthing).
Computers that are accessible to other people are also not trustworthy. Examples are family computers where the kids try out all sorts of things from the Internet, shared company computers and, in particular, computers in hotels and Internet cafés.
I personally use KeePassXC at the moment, with an optional Syncthing share between my computers. But there is a whole range of password managers that are generally considered trustworthy. My article on choosing trustworthy authentication apps gives guidance on selecting such security-critical products.
Security is not a state but an ongoing process. Part of this process is to proactively put additional hurdles in the way of a potential attacker (human or malware).
An extremely effective method with regard to password security is two-factor authentication, abbreviated 2FA. Here you no longer rely solely on the combination of user name ("Who am I?") and password ("It's really me"). A second "secret" is added, something a potential attacker should not have access to even if the password can somehow be captured.
Incidentally, this is also the reason why you should, if possible, not store this second factor together with the first secret. If you have both the password manager and the authentication app on your phone, an attack on the smartphone automatically compromises both factors. Better to use truly separate factors.
There are definite quality differences when choosing the second factor. My recommended ranking is:
Only FIDO2 and passkeys protect against phishing attacks. That is a very important property.
Update 2023-05-19: According to this heise article, Microsoft sees the priorities as follows:
Not my personal opinion.
FIDO2 ("Fast IDentity Online", version 2) is an open standard for authentication, mainly against web services.
It requires a trustworthy device, also called a token, that has a secret key stored on it. You can get such a FIDO2 token starting at roughly 15 euros (up to about 100 euros). Money very well spent. Perfect as a birthday present.
A FIDO2 token looks like a small USB stick that is plugged into the computer when logging in (USB) or held against the phone via NFC.
During the authentication process there is a mutual exchange between the web service and the FIDO2 token. In contrast to all other widespread authentication methods, not only is the token's secret key verified by the web service, the web service is also indirectly verified by the token.
This perhaps hard-to-understand paragraph is highly relevant even for the casual user, because it makes phishing attacks impossible. No web site can successfully pretend to the token to be a different one. That is why FIDO2 is by far the best 2FA method currently available.
Smartphones increasingly offer to act as FIDO2 tokens as well. But for the reasons discussed above, I prefer a separate USB token, because otherwise too many things are handled by the smartphone and an attacker's job is made considerably easier.
With passkeys, an extension of FIDO2 is in the starting blocks that even drops the password: besides the user name, you only need the passkey token on your phone, which is usually secured by biometrics. The advertising promises that this replaces the password.
Clearly, passkeys are a variant of FIDO2 in which some security and control were sacrificed in favor of convenience.
For that reason alone, I would not recommend it. I am also still unsure myself how much the theft of the smartphone or of the cloud data will matter here.
Furthermore, with passkeys you have to blindly trust the passkey operators (currently: Google, Microsoft or Apple) even without cloud synchronization enabled, because, in contrast to FIDO2, the secret lies entirely in the operators' hands and I can therefore no longer judge whether an unforeseen access has taken place or not.
So I much prefer to stick with the combination of the FIDO2 token and the password, which is stored separately from it.
TOTP (short for "Time-based One-time Password algorithm") is a fairly secure authentication method that can be used at no additional cost in most cases. A smartphone app, also called an authenticator app, is provided with a secret that is transferred in the form of a QR code. From that point on, this authenticator app can generate one-time codes in the form of six-digit numbers from the combination of the secret and the current time.
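If you want to see the mechanism in action outside of a smartphone app, here is a minimal sketch using oathtool from the oath-toolkit package. The Base32 string below is a made-up example secret standing in for the secret that is normally transferred via the QR code:
#!/usr/bin/env sh
# Generate the current six-digit one-time code from the (example) Base32 secret.
# The result changes every 30 seconds because the current time is part of the calculation.
oathtool --totp -b "JBSWY3DPEHPK3PXP"
#end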
In my article on choosing a TOTP app, I go into this method in more detail. Please read that article if you want to work with TOTP.
This method offers quite good protection as long as you take care not to accidentally reveal one-time codes to the wrong people or web sites (phishing). But that applies to all one-time codes you enter anywhere.
Proprietary authentication apps that are provided directly by the web providers and only deliver one-time codes for their own service cannot be checked for their security, but they are at least still better than no second factor at all.
Banks are very fond of using this kind of authentication because, for reasons I cannot fathom, they apparently do not want to rely on well-tested standards.
At the very bottom of the trustworthiness scale in the context of authentication systems is the sending of one-time codes via SMS (smsTAN, mTAN) or e-mail. Both transmission channels are unencrypted and can be cracked with relatively little effort by a dedicated attacker. In this incident, millions of SMS security codes from Google, WhatsApp and Facebook were exposed.
Personally, I am convinced that service providers use the insecure SMS method, which even costs them money, only because they want to get hold of your phone number, which is worth a lot of money to them. For the service providers, both TOTP and FIDO2 would be free to deploy and would offer considerably better security.
If, for whatever reason, I cannot or do not want to use any of the authentication methods above, even one-time codes via SMS or e-mail are still better than no two-factor authentication at all.
I hope I was able to give a rough overview of the most important practical aspects of password security.
In the minkorrekt audio comment mentioned at the beginning, I also talked about my recommendations for choosing secure authentication apps. You can read about that in this article, which also shows with good sources why you should not use the most popular TOTP app, "Google Authenticator".
If you are a user of the social network Mastodon (part of the Fediverse), you may have stumbled over the feature called "content warning" (CW):
Mastodon features a Content Warning system. It’s an optional mask that covers the content of a post with an editable warning message.
It’s used to cover content that is admitted by your Instance policy but may still hurt people, like spoilers, nudity, depiction of violence or threads about sensitive topics.
For example, if you want to start a thread about the ending of a fresh new movie, you can do it using a CW like “Spoiler about the ending of...”
Every Instance has its own rules about CWs and therefore it’s common to see them used in different ways, like on selfies or depictions of food. That is because what on an Instance is considered a sensitive topic on another may be something commonly accepted. It’s possible that an Instance is blocked by others because of its misuse of CWs on certain kinds of topics.
This said, if you want you can always go in the Setting page and set to automatically uncover all the CWs.
While there are tons of valid reasons to use a CW, there is a growing number of posts that do seem to dilute this great idea of CWs.
I just took screenshots of some posts from my current personal and global timeline to give you an impression of what I mean. The examples are absolutely random and are only used here to demonstrate the real-world use of CWs.
"Facebook, help request, boosts appreciated" doesn't give me enough clues as to why I need to click on "SHOW MORE" to read the content. After clicking on the CW button, I saw:
If you could visit this link: [...] and tell me what you see, I'd be super grateful. [...]
Well, I really don't get why this required a warning.
FOSS stands for Free/Libre Open Source Software. I don't know what "Rec" stands for in that context. Well, I had to click on "SHOW MORE" to see:
Hello all!
I'm looking for a FOSS journaling application for my phone or computer. I would like for it to be able to use text, images, url links - but dates are not necessary.
I'm having trouble finding much on FDroid.
If this helps the inquiry, I'm looking for something to catalogue my house plants in and put down information like their name, watering, light, soil, photo, etc.
Let me know if you have anything in mind!
#foss #help #recommendations
OK, now I know that "Rec" stood for "recommendations". But why on earth did I have to click on that button? Who would be offended or triggered by this message?
This one's perfect. Two letters and a hyphen that don't mean anything to me at all. I have no clue about the content of the post or the motivation why I should expand the message.
I did anyway and could read a harmless personal rant about a person who was helping a peer to get rid of mold although the person is allergic to it.
Poor gal, but I don't see any reason why this deserves a CW to click on.
Somebody could argue: "If you are annoyed by clicking on the button in order to see all the CWs, you might enable the following option in the settings of your Mastodon account":
Yes, this really would solve the issue for me.
However, this would be shortsighted for this great platform in general.
Imagine a person who is really triggered by certain topics such as suicide, war, death, or other clearly horrible topics. Such a person is not able to enable the option above because there are really good cases *where CWs are perfectly valid and in my opinion also necessary*.
Following that logic, you either have to enable the global preference that basically disables CWs by always showing their content without clicking on "SHOW MORE", or you need to mass-click posts in your timeline in order to read perfectly normal posts that do not need any warning at all.
This kills the CW feature for this platform and it harms the people who do rely on warnings.
If you try to find as many reasons as possible to use a CW, you're going to end up with a CW on every post. You can't win here. You can always find a poor soul who's offended by any post you write without a CW.
In my opinion, there are great alternatives to mis-using CWs for non-warnings or warnings related to stuff that doesn't trigger most people.
Everybody should learn how to maintain a set of words or hashtags in his or her personal filter settings:
Using that method, you can avoid contact with certain topics. This makes much more sense than forcing the sender to anticipate potential personal aversion of something for each potential reader.
Use CWs only for things that typically trigger normal people, not snowflakes. If you are triggered now, please do read this article on how I mean that. The Internet is not and never will be a safe space for everybody. Using Mastodon filters is a near-perfect tool for providing a solution to this issue.
Of course, this also requires the use of proper #hashtags which I also recommend for other reasons because on Mastodon you can search for content using hashtags only.
My personal approach is that I send out links to this article to people who mis-use CWs. If I get a response that doesn't explain why a warning was really necessary (I might change my mind as well), or if the person can't explain the reason to me and keeps using CWs like that, I add the account to my list of muted accounts so that I never see anything from them again.
Update 2022-06-07: I should have done this before I wrote this article. I started a Mastodon survey which asked for the preference of using CWs. I tried to come up with neutral questions in order to minimize any bias.
Here are the results after 53 people took part:
To my surprise, it's a result with no clear winner.
Anyway, even if the majority of people vote for a very broad definition of CWs, I'd still think that we should change that behavior as it is hurting the platform.
Back in the good old days of the Internet, we had a Netiquette which contained rules and suggestions so that services and platforms like email, the Usenet and instant messengers worked well for their users.
I miss something like that on today's platforms in general. And it should be clear that this is not a matter of taste according to the majority of the users; this is a matter of providing workflows and habits so that the platform works in a good way. Therefore, I don't care about the current majority, I care about the long-term aspects of the platform.
Friendica posts in Mastodon tend to come along with CWs where the title is shown and the rest is hidden. If you're interested in the technical details, please do read this thread that describes the issue at hand and how you can fix it in your settings.
Update 2024-03-04: I received a comment by @jupiter_rowland which I want to quote here:
Posts and comments coming into Mastodon from Friendica, Hubzilla and (streams) are special cases. That's because neither of them have Mastodon's dedicated CW field. Instead, they have reader-side content warnings that can optionally be created automatically. By default, a post is automatically hidden behind a content warning button if it contains "nsfw" and/or "sensitive".
Automatic reader-side content warnings have been part of the culture of both Friendica and Hubzilla since before Mastodon was even made as well as of fairly new (streams). Poster-side content warnings in a dedicated field have never been part of their culture, both because they've got a much better solution for content warnings and because they don't provide dedicated means for poster-side content warnings.
Technically speaking, Hubzilla and (streams) do have Mastodon's CW field. But unlike Mastodon, it isn't labelled "CW". It's labelled "summary". It has been a summary field originally, but since a hard-coded maximum of 500 characters doesn't require summaries, Mastodon repurposed that field for content warnings. Hubzilla and (streams) have no character limits at all, so using that field for summaries is still justified.
Hence, Hubzilla and (streams) users use this field for short summaries for very long posts, if at all, but not for content warnings.
Also, just like Friendica, both Hubzilla and (streams) have a conversation model like Facebook or Tumblr or blogs with only one post and many comments. All replies are comments, and there is a separate, dedicated entry form for comments under each post. This entry form does not, however, have a summary field because what sense does it make to give a summary for a blog comment?
Hence, Hubzilla and (streams) users can't put Mastodon-style content warnings on replies at all. Also, neither Hubzilla nor (streams) can re-use content warnings from posts or comments they reply to.
Friendica, the oldest one of the three, doesn't even have a dedicated summary field. The only way to add a summary, i.e. a Mastodon-style CW, is between the Friendica-specific BBcode tag pair [abstract][/abstract]. This works in both posts and comments, but the availability of this tag pair is neither advertised in the post editor nor in the comment editor.
Since Friendica doesn't have a dedicated UI element for Mastodon-style CWs, neither for posts nor for comments, most Friendica users don't give Mastodon-style CWs at all because they simply don't know that this is possible in the first place, much less how.
Also, users of Friendica, Hubzilla and (streams) tend to find the idea of writing a content warning between a pair of obscure BBcode tags (Friendica) or into the summary field (Hubzilla, (streams)) ridiculous. Trying to convince them otherwise would be like trying to convince Mastodon users to double their CWs with keywords or hashtags to trigger the NSFW content-warning generation on non-Mastodon projects.
With the 2022-11 mass migration from Twitter to Mastodon instances, this article gained momentum. At a certain time, it became even my most popular article:
It even got mentioned in media such as this Daily Dot article.
I got all sorts of comments on Mastodon. The vast majority was very positive and supportive. Only a few negative comments were written. Most of those people did not seem to have read this article because they simply objected to the idea in general without going into details about my concrete arguments listed here.
As I wrote: It does seem to hit a nerve.
On 2022-11-11, Eugen (the founder of Mastodon) published an interesting message that relates to the subject at hand:
While some people may think that this is settled and everybody may use CWs as they wish, I do think that all of my arguments above are still perfectly valid.
Of course, CWs that can't describe properly what's hidden below are an issue in any case. It's useless like the popular but senseless email subject "a question".
For all the people who really do rely on content warnings (and I do mean warning when I write "warning"), any different use of CWs other than for warnings with a clear purpose is an issue.
We might start to use the CW feature for titles, summaries, abstracts, whatever, yes, of course. As Eugen writes, the consensus of the community may change. However, this also means that there is no working "warning" feature any more. In that case, we would need to rename this feature as soon as possible in the UI and documentation in order to avoid misunderstandings and establish the new consensus.
At the same time, it is inevitable that people with legit issues with problematic topics (death, suicide, pornography, strong violence, ...) for whom this feature was made originally (I presume), are not able to profit from its use any more. You can never tell what an arbitrary sender thought about this feature when using it. There might be some very violent content hidden, there might be an informative message suited for everybody. You can never tell any more.
To me, you cannot water down CWs without accepting the consequences.
I'm lucky that I don't rely on CWs because I'm able to deal with most problematic content I've come across so far. The reason why I wrote this article in the first place was sympathy for the people who don't have that luxury.
Warnings and titles/subjects/summaries/abstracts are two very different use-cases to me. Each has its legitimacy. Done properly, we would need separate features for explicitly defining both. I would not recommend Mastodon going in that direction, as the interface should keep its current simplicity and not overcomplicate things.
So far, I have not read or heard a single argument that proved the arguments of this article wrong. I got the impression that people tend to ignore the fact that using CWs for a purpose other than warnings does cause issues for people who have relied on CWs so far. Either way, we need to understand the consequences of any change of consensus in that direction.
There is a GitHub issue discussing renaming the CW feature to something else.
Here's a research study from 2023-08: A Meta-Analysis of the Efficacy of Trigger Warnings, Content Warnings, and Content Notes
It's free to read online. I'm just quoting the Conclusions:
Existing research on content warnings, content notes, and trigger warnings suggests that they are fruitless, although they do reliably induce a period of uncomfortable anticipation. Although many questions warrant further investigation, trigger warnings should not be used as a mental-health tool.
I guess that really supports my point here from a slightly different point of view.
I wrote lazyblorg in order to get the blogging software I wanted to use myself. Therefore, I optimized it for minimal effort for a posting and being embedded into my Org-Mode setup.
Of course, I published lazyblorg on GitHub so that other people could use it as well. The second reason for publishing lazyblorg for others was that I was forced not to deliver an ugly works-on-my-machine hack.
On the other hand, lazyblorg is not as clean as it could be. I learned a few things along the way, so today I would implement some things differently. So far, I have not invested the time for a big refactoring. For example, the Org-mode parser is a dirty parser and some replacement functions are scattered around the code at different stages of the blog generation process without a clean concept I could defend.
However, lazyblorg works and delivers great value for me. To my astonishment, lazyblorg also seems to be a good blogging solution for other people. Meanwhile, lazyblorg on GitHub got 99 stars, 13 people are watching the project, and 14 forks were created. I also got four pull requests with fixes and new features which is absolutely awesome to me.
My dear brother Andreas adopted lazyblorg for his web page. Today, I found the web page of yqrashawn in Beijing, China. So I got the pleasure of seeing an awesome blog entry in Chinese characters that I cannot even read.
Further pages using lazyblorg:
If you are using lazyblorg, please drop me a line so that I can link your web page on my lazyblorg tag page.
There are many reasons why someone would not want to use https://YouTube.com in a web browser to search for and watch videos.
My most important reasons are:
Therefore, I created a way to search for YouTube videos, download YouTube videos and watch them locally from my zsh command line interface.
If you don't want to use other solutions like Invidious or FreeTube, you might want to check out my workflow.
You don't have to use a shell if you want to use my method. You can also wrap the shell scripts into easy-to-start temporary terminal windows to interact with them. I didn't bother so far; I'm very fine with using the shell.
So here is my method which you can use and adapt to your personal taste in case you're familiar with basic shell scripting and how to invoke them.
I split up the method into several small shell scripts. This allows for easy maintenance and modular re-use.
Furthermore, I distinguish between two different video sizes: low and high resolution. I use high resolution when I want to watch the videos on my desktop, and I use low-resolution videos for synchronizing them via Syncthing to my mobile phone for watching when I'm away from my desktop. Therefore, most scripts are available in two versions that only differ in the output video resolution.
Since I'm using existing tools to do the heavy lifting, my shell script method has some dependencies:
yt-dlp is a fork of youtube-dl. This fork is better maintained and so it's recommended to migrate away from youtube-dl to yt-dlp as a proper replacement.
ytfzf offers a nice interface to find YouTube videos. As of 2024-02-24 it's no longer maintained but it's still working on my side so far.
guess-filename is one of my scripts from my file management method which renames downloaded files the way I want them. Instead of
ayo why you runnin [gFh4dAX5g-U].mp4
I get the file name
2023-06-28 youtube - ayo why you runnin - gFh4dAX5g-U 1;07.mp4
which contains the date of upload, the original title, the YouTube hash for re-finding it later and the duration of the video.
I search for videos via the wrapper scripts yth and ytl which stand for "YouTube high resolution" and "YouTube low resolution" in my head.
You invoke the scripts with a search string like in the following example invocations:
yth "trailer big lebowski"
ytl "how to install newpipe on android"
yth "review kaweco lilliput"
This will then show the videos matching this query:
In this interface, you can filter the results by typing your filter strings. With the cursor keys, you can move among the result hits. With the return key you invoke the download and close the window.
My yth and ytl scripts below currently use "view-count" for sorting the results. You might as well change this to different settings.
If you play around with ytfzf you could even get preview images. I tried it for a couple of minutes, failed and did not care to re-try so far. ytfzf does offer more stuff - just read its documentation.
This is my yth:
#!/usr/bin/env sh
URL=$(ytfzf -l -L --upload-sort=view-count "${1}")
[ "x${URL}" != "x" ] && yd "${URL}" 22
#end
Here is my ytl:
#!/usr/bin/env sh
URL=$(ytfzf -l -L --upload-sort=view-count "${1}")
[ "x${URL}" != "x" ] && yd "${URL}" 18
#end
The numbers 22 and 18 stand for the YouTube-specific download quality indicators which you can see when you execute:
yt-dlp -F 'https://www.youtube.com/watch?v=gFh4dAX5g-U'
At the moment, 22 is 720p mp4 using the avc1.64001F codec and 18 is 360p mp4 using the avc1.42001E codec.
If you want to use the same method for other video platforms supported by yt-dlp, you only need to add functionality that chooses the different quality indicators because 22 and 18 are only used by YouTube to my knowledge.
Please note that on YouTube not all videos are available in all qualities. Especially older content might only be available in lower resolutions.
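One possible way to handle such platforms would be to pass one of yt-dlp's generic format selectors to yd instead of the YouTube-specific numbers. This is just a sketch of the idea; the URL is a placeholder and the resolution limits are arbitrary example values:
# roughly the high-resolution case: best streams up to 720p, merged if necessary
yd 'https://example.com/some-video' 'bestvideo[height<=720]+bestaudio/best[height<=720]'
# roughly the low-resolution case: best single stream up to 360p
yd 'https://example.com/some-video' 'best[height<=360]'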
Both wrapper scripts are using yd which is my general download script:
#!/usr/bin/env bash
URL="${1}"
FORMAT="${2}"

# Prefer yt-dlp; fall back to youtube-dl if it is not installed.
if hash yt-dlp 2>/dev/null; then
    YDBIN="yt-dlp"
    Y_QUERY_OPTIONS="--no-check-certificate --compat-options list-formats -F"
    Y_DL_OPTIONS="--no-mtime --write-info-json --no-check-certificate -f"
elif hash youtube-dl 2>/dev/null; then
    echo "WARNING: \"yt-dlp\" not found, using \"youtube-dl\" instead"
    YDBIN="youtube-dl"
else
    echo "ERROR: no youtube downloader tool found."
    exit 1
fi

# youtube-dl does not know the yt-dlp-specific options, so use a reduced set.
if [ "${YDBIN}" = "youtube-dl" ]; then
    Y_QUERY_OPTIONS="--no-check-certificate -F"
    Y_DL_OPTIONS="--write-info-json --no-check-certificate -f"
fi

# If no format was given on the command line, list the available formats and ask.
if [ -z "${FORMAT}" ]; then
    "${YDBIN}" ${Y_QUERY_OPTIONS} "${URL}" | grep -v only
    echo
    read -p 'Please enter the desired version to download: ' FORMAT
    echo
fi

"${YDBIN}" ${Y_DL_OPTIONS} ${FORMAT} "${URL}"

# Optional, if you want to get file names like:
# "2023-07-04 youtube - Nix flakes explained - S3VBi6kHw5c 7;21.mp4"
guess-filename-for-info-json-mp4-files.sh
#end
Of course, if you don't want to use yth and ytl because you only want high-resolution videos, you could modify yd by replacing ${FORMAT} with 22 and invoke it instead of yth and ytl.
The yd script could be further simplified by removing the youtube-dl support. However, some people might find it useful that it works with yt-dlp as well as with youtube-dl. I personally don't use the latter any more.
Here is my wrapper script for invoking guess-filename after the download: guess-filename-for-info-json-mp4-files.sh
#!/usr/bin/env sh
#- process
# 1. find latest (all?) file(s) in directory (with extension .info.json)
# 2. generate video file name by removing .info.json and replace with .mp4
# 3. invoke guess-filename on video file
# 4. remove json file
for jsonfile in *.info.json; do
    mp4name=$( echo "${jsonfile}" | sed 's/.info.json/.mp4/')
    m4aname=$( echo "${jsonfile}" | sed 's/.info.json/.m4a/')
    if [ -f "${mp4name}" ]; then
        guess-filename "${mp4name}"
        rm "${jsonfile}"
    elif [ -f "${m4aname}" ]; then
        guess-filename "${m4aname}"
        rm "${jsonfile}"
    else
        echo "I could not locate \"${mp4name}\" for the given \"${jsonfile}\" ... ignoring it."
    fi
done
#end
If you are fine with the default file names, you could remove guess-filename-for-info-json-mp4-files.sh from yd. This would also get rid of the guess-filename dependency.
If you do have a YouTube video URL from anywhere on the web, you can use my wrapper scripts ydh and ydl just for downloading the videos without the need to search as described above.
The script names ydh and ydl stand for "YouTube download high quality" and "YouTube download low quality" in my head.
With the yd script from the previous section, those wrapper scripts are pretty simple. Here is ydh:
#!/usr/bin/env sh
yd "${1}" 22
#end
And here is ydl:
#!/usr/bin/env sh
yd "${1}" 18
#end
You invoke them like:
ydh 'https://www.youtube.com/watch?v=gFh4dAX5g-U'
Alternatively, you can also invoke it with the YouTube hash only if you want:
ydh gFh4dAX5g-U
Both methods lead to a local video file like:
2023-06-28 youtube - ayo why you runnin - gFh4dAX5g-U 1;07.mp4
In case you wonder about the ";" separating minutes from seconds in the last part of the file name: I had to switch from ":" because somehow, Google decided that the Android file system of my Pixel 4a should inherit the file name restrictions of MS-DOS/FAT by Microsoft. This rules out many characters including the colon. The semicolon worked.
In practice, I rarely use yth or ytl for searching.
Most of the time, I come across a YouTube URL on Mastodon or on the web. I copy the URL into my clipboard, switch to my tmux shell, change to my default video download directory and invoke ydh or ydl for downloading and renaming the video.
Using the zsh command mpv *(.om[1]) from my command history, I open the most recently downloaded video in my favorite movie player.
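In case that zsh glob looks cryptic, here is the same one-liner with the glob qualifiers spelled out as I understand them:
# *      match all names in the current directory
# .      glob qualifier: plain files only (no directories)
# om     glob qualifier: order by modification time, newest first
# [1]    glob qualifier: keep only the first match, i.e. the newest file
mpv *(.om[1])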
I can think of an automation method that watches for changes in the clipboard and invokes ydh in my default download directory in case the clipboard matches a YouTube URL. So far, I prefer the flexibility to decide on the quality I want to download.
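Such an automation could look roughly like the following sketch. It assumes an X11 session with xclip installed, the ydh script in the PATH and a hypothetical download directory; the polling interval and the regular expression are arbitrary choices:
#!/usr/bin/env sh
# Poll the clipboard and download every new YouTube URL that shows up in it.
cd ~/videos/youtube || exit 1
LAST=""
while true; do
    CLIP=$(xclip -selection clipboard -o 2>/dev/null)
    if [ "${CLIP}" != "${LAST}" ] && echo "${CLIP}" | grep -qE 'youtube\.com/watch|youtu\.be/'; then
        ydh "${CLIP}"
        LAST="${CLIP}"
    fi
    sleep 2
done
#end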
By the way, for Android YouTube consumption, I do recommend NewPipe as a full replacement for the YouTube app by Google. It's a really good mobile app that deserves all support you can give.
If you're, for example, contributing to a reddit thread about something which is irrelevant or anything with only a short-term relevance, this article does not apply to you right now.
However, as soon as you're helping somebody solve an interesting issue, summarizing your experiences with something, or writing anything that might be nice to have around in a couple of years as well, you are providing potentially high-value content. My message to all those authors is: don't use web-based forums.
In 2022, I talked about this topic at the Grazer Linuxtage and there is a video on the pages of the CCC as well as on YT:
In late 2023, I got the opportunity to give a talk at the 37C3 by the CCC in Hamburg. This talk was not recorded but overlaps in most parts with the recorded talk above.
TL;DR: all of the content of closed, centralized services will be lost in the long run. Choose the platform you contribute to wisely now instead of learning through more large data-loss events later on.
The longer version is worth your time:
In this article, I'm using the term "web-based forums" as an umbrella term for closed, centralized services like Reddit, Hacker News, Slashdot, Facebook, or any other web-based forum where you are able to add comments, articles, and so forth in most cases only after creating an account. Some issues are even true for Lemmy.
Typically, those services don't provide any possibility to extract or synchronize content. They don't offer open APIs that allow users to choose among different and open user interfaces. They are owned and operated by private companies.
Please note that when I'm going to mention more or less only reddit as an example in the next sections, this is because reddit is the only web-based forum I'm familiar with to a certain level. This does not mean that reddit is worse than other closed, centralized web-based forums. Not at all.
There is not one issue. There are several things where web-based forums don't qualify for being a platform for quality content. Let's take a look at some of them.
I'm glad you're still reading this article and I hope you bear with me until the end of it. Most people will realize and learn that they have contributed lots and lots of high-value information only when platforms are down for good. And this is what makes me really sad. It is just as if you knew that a building of the Library of Alexandria was going to burn down in a few years and people still kept bringing unique copies of high-quality books onto its shelves, unaware that they are destroying knowledge this way.
For reasons and examples stated in this article, any centralized web-based service will go offline some day. Some sooner, some later. Popularity is not even a guarantee that a service gets continued, as you can see with hundreds of (partly) very well known and widely used Google services that were shut down. Nothing will be on the web forever. Most people are not aware of this fact. The books set on this machine are more likely to survive history than all of your reddit/Facebook/... contributions:
So when you begin to be aware of this fact, you might want to think of things you can do to mitigate data loss when services are discontinued or "sunsetted", as some marketing experts say.
You could, for example, back up the data of this service. By providing the information on multiple servers, chances are high that not all of them are lost at the same time.
This requires certain properties. For example, you need to be able to duplicate the service on multiple servers. To be able to do so, you'll need not only the data but also the software that is providing access to the service. When different organizations are running mirrored servers, it is required to openly share the data and software. This can be ensured by using Open Source software or at least open APIs and a business model that does not rely on keeping data and technical things a secret.
All major commercial services such as reddit, Facebook and so forth keep everything a secret that is not absolutely necessary to use their services. Their software is a secret, they don't offer open APIs or only very crippled ones, and you don't have the possibility to get to the raw data. So no luck there. You do have a lock-in situation. You also might recognize the term switching costs, which platform owners try to maximize.
Even with personal blogs, "fragile" as they are, you are able to use the Wayback Machine of the Internet Archive to back up your blog. For example, every page on my blog contains a link to its archive in the page footer. This ensures that you can not only browse the latest version of all of my blog articles in case of a server breakdown. It also enables you to browse all previous versions, probably changed over time. Go ahead, try a few "Archive" links of my articles. If any of my articles starts with an "Updates:" section, you know for sure that there are older versions accessible via the Internet Archive.
The Wayback Machine does not archive reddit threads. It can not properly back up Facebook pages. It's blinded by corporate secrecy when it comes to archive content for the upcoming generations:
Why isn't the site I'm looking for in the archive?
Some sites may not be included because the automated crawlers were unaware of their existence at the time of the crawl. It's also possible that some sites were not archived because they were password protected, blocked by robots.txt, or otherwise inaccessible to our automated systems. Site owners might have also requested that their sites be excluded from the Wayback Machine.
Summarizing the things mentioned above: without very good support for data export, service duplication and open standards, any content you provide to closed web-based services will be lost, just as MySpace already lost twelve years of content, to mention just one big example.
When you grew up only knowing centralized web-based forums, you can not imagine the many advantages of having the freedom to choose your preferred user interface. While some people might think this is a minor issue, let me explain a few examples where this makes a huge difference.
The first example starts with something that might only annoy people. With comments like on this thread, you clutter up other people's interface for personal gain. It's selfish and distracts from the information consumption.
There are several reasons why people use such reminder bots. First, they don't use a proper todo management system that would be able to remind them to read a certain article in a few days. They externalize this inability onto the web-based forum and all of its other users. I'm working on fixing these educational issues. Second, there is no way to have features that you can use without affecting other people's interface.
Consider that people with visual impairments have special needs. The WHO estimates that 285 million people are visually impaired, ninety percent of them living in developing countries. Those are not numbers you can simply ignore. It is obvious that they need different kinds of interfaces. Either they have to use a high-contrast interface, highly unusual interface scaling factors, an interface that avoids certain color combinations, text-to-speech systems or Braille readers that are able to extract the content properly.
If a web-based service, which (remember from before) does not offer proper open APIs, does not implement said features, all those people simply cannot participate and you cannot profit from their knowledge and experience.
And even if you think that this is just a minority, I can provide examples where everybody profits from choosing his or her own interface.
Some services are providing interfaces that aren't working properly on small displays or mobile devices in general. In these cases, without any ability to switch to an alternative app or web-page, you are locked out even with perfect eyesight.
When you're using a web-based forum that does not mark or collapse already-read articles, you need to skim through a thread completely and re-read content to find new postings when re-visiting the thread after a while. Our time should not be spent on senseless tasks like this.
Alternative interfaces might provide advanced rating features based on your personal taste and choice so that you are able to filter out the most relevant articles easily and do not clutter your view with irrelevant articles at all. This is also called "scoring". It can be based on keywords, the amount of personal contributions to a longer thread, friendship relationships from your contact management, and so forth.
Some people prefer navigating using the keyboard. Either by personal taste or by physical restrictions. If the web-based centralized service only supports mouse-based navigation, you can not use this service.
I could continue with examples like that. The common theme is: when one particular centralized web-based forum is not implementing all of those nice features you need or like, you can not use them properly.
In any case, the information should be made public as text and not as a video, sound file or images only. This is the only viable way of optimizing for its consumption and making sure that it can be found in the first place.
When you live in a society with a certain set of (legal) rules, providers of relevant web-based forums have to follow and enforce some of them. However, the issue is that this kind of censorship is and will always be related to a particular culture and society at a specific time.
For example, in Germany and Austria, being a Nazi is punishable by law. In the USA, freedom-loving people think fans of the human monsters that tortured and murdered millions of Jews in the Second World War need the possibility to express their personal "opinion". As you can see, there is a different point of view between the lines when I write about Nazis compared to an author from the USA who values "freedom of speech" higher than "being a die-hard fan of mass murderers". It's a very difficult topic you cannot enforce with a world-wide service.
You don't have to follow Godwin's law to make a point here. There are countries where child pornography is - within certain degrees - somewhat legal and socially accepted. In mid-Europe we do have a more relaxed point of view related to nudity. In contrast, we do not accept certain levels of brutality and violence like I've seen in some TV productions when I was living in the USA.
So there is an inherent and not solvable conflict between "enforcing some rules" and "providing a service world-wide". This results in subjective censorship. There are always groups of people who are upset when a service provider regulates its service somehow. While this situation also holds true for open, distributed services, local servers hosting illegal content are able to be put down by law enforcement easily whereas big centralized web-based services often do not react to such request or need to be forced by law before they cooperate. I don't get it why it is practical impossible to upload a nipple while child pornography and other highly problematic content stays online for months even after it got reported.
I've got an issue with even less dramatic rules and content. For example, I'm not able to post something to r/privacy when it contains a link to an article on my blog even though I don't make any money with my site. Therefore, readers of reddit will never discuss with me about my privacy-related work although I think that my contributions are worth reading.
Also interesting here:
With every web-based forum, you need to have yet another user account. While this is acceptable for services you consume on a daily basis, this gets tedious when you just have one quick question in that forum where people talk about this new gadget you just bought.
Of course, you must not share passwords among different services. So you need to curate more and more different account credentials. By now, I probably have credentials for one hundred web-based forums.
Whenever I have just one question I'd like to pose in a specific web-based forum, I hesitate before creating a new account. I've wasted too many nerves on badly designed registration processes.
The situation is even worse: when I stumble over a thread in a forum where I know exactly how to solve the issue mentioned and I don't have an account for that forum, I just don't accept the ten to fifteen minutes of registration effort and the learning curve to know how to operate the interface and contribute. It's sad but true.
Now that I have explained the most important reasons why centralized, web-based forums aren't a good idea at all, you might want to read about things you can do differently or alternatives to those forums.
Some issues mentioned above could be fixed. Some issues can not be fixed because they are fundamental technical and business/political issues of centralized web-based platforms. Therefore, you need to fix most issues by using a different concept in the first place.
In order to overcome some issues, platforms might open up and agree to follow open standards for adding content, getting content off the platforms, as well as synchronizing with separate instances.
One example is lemmy, which is a free, federated alternative reddit clone. It works similar to email, where users are able to freely choose any provider they want: the local Internet provider, running a server on their own, using web-based email providers like GMail, and so forth. When you do not like your current instance, you move over to a different one, taking your data with you.
From my current point of view, I would say that the chances of reddit, Facebook, and the others switching to an open approach are zero point zero. On the contrary: they do whatever they can to lock in their users with their data even more. Money can only be made with maximized time spent on their platform and not somewhere else. So you're the product being sold, not the customer.
However, the good news is that we already have alternatives that have been around for a couple of years or decades, which is a good thing. They have reached a level of maturity most modern platforms will never reach before they collapse for various reasons. So let's take a look at a few of them in the next sections.
When looking for alternatives, the good news is that we already do have plenty of them.
In contrast to web-based platforms, email as an open and federated/distributed standard is far from being dead despite all the articles that said so. Of course, email is no replacement for web-based platforms. However, there are technologies that are almost as old as email and that provided very good forum services for many, many years until the big companies privatized forum content and locked it into their closed services. The most prominent example is the Usenet, or "Newsgroups" as they are called. This is why we need to remember that there was a time before the big web-based platforms when people freely exchanged postings in threads on all kinds of topics elsewhere.
The open standard protocol used for the Usenet is called NNTP and there are tons of great clients speaking NNTP, Thunderbird being one of the most prominent ones. For any type of special need (remember the handicapped people from above!), you can get text-based Usenet clients, mobile clients, professional clients and even web-based NNTP clients. This way, you can choose an interface that reflects your software environment, technical knowledge, level of features, simplicity and taste. This way, you easily get simple features like "hide already read articles" up to fancy stuff to deal with high-volume Usenet consumption such as scoring.
As a user of the Usenet, you can fetch messages from one or many different servers. So you most probably only need one single account for accessing all major newsgroups worldwide, provided your server has good connectivity.
With NNTP being an open standard, anybody is able to "back up" or archive Usenet content. For example, this server holds an archive of my local Usenet server (of Graz University of Technology) from 2001 onward and provides a nice search feature.
Update 2022-04-10: Recently, the newsarchive server was discontinued due to lack of public interest and too much hassle with deletion requests by people who are afraid that old postings might get found. However, due to the open nature of the service, you can still browse through the archive here.
Another approach to publishing articles on the Internet are personal blogs. The text you're reading is hosted on my personal blog which is running on my server. I even wrote my own software for blogging.
However, you don't have to do this at all. You can start your personal blog using one of the manifold blogging services out there. This way, you don't have to have much technical knowledge. You just concentrate on writing short or long articles and share them with the world.
If you choose to blog yourself, please do make sure that a few things are working fine. The page should be indexed by the WayBackMachine in order to have a fall-back for your content in case something happens to your server instance. In 2016, it already covered over 477 billion web pages. This page explains how to add your page to the archive and this one does it for whole sites. If you can afford it, please do donate a few bucks so that they are able to continue this service.
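If you want to trigger a snapshot of a single page yourself, the Wayback Machine offers a "Save Page Now" URL you can simply request; a minimal sketch using curl (the article URL is a placeholder, of course):

## ask the Wayback Machine to take a snapshot of one page ("Save Page Now"):
curl -s "https://web.archive.org/save/https://example.org/posts/my-article" > /dev/null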
If you're tech-savvy, you should definitely read the Manifesto for Preserving Content on the Web, "This Page is Designed to Last". It describes all necessary things to make sure that your content can be accessed as long as possible. It's not that hard. Actually, it's more about not doing things compared to investing extra effort.
In general, you should make sure that your articles can be indexed by independent search engines. This way, people are able to locate your thoughts and ideas by querying the Internet, in contrast to "being on one single platform whose algorithm decides whether to show this content". Pages that can be indexed and therefore found on the Internet are part of the free web, in contrast to the Dark Web.
When you're publishing great articles on your blog, you don't want to force your readers to re-visit your page every day in order to find your new articles. There is an awesome solution to this issue as well. Actually, there are two standards solving this issue. One is the older and much better known: RSS. The more modern standard to accomplish the same is called Atom. Users subscribe to RSS or Atom feeds by adding their URL to the software that deals with those feeds, a so-called news aggregator. From the user perspective, you don't have to care much about the standards since all modern software solutions can deal with both feed types. If both feed standards are provided, choose Atom.
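If you are unsure whether a blog offers a feed at all, most blogs advertise their feeds in the HTML head of their pages. A quick way to check from the command line could look like this (example.org is a placeholder):

curl -s https://example.org/ | grep -Eio '<link[^>]*(atom|rss)\+xml[^>]*>'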
This way, people using a web-based aggregator service or a local aggregator software are able to get their personal news feed. As a user of aggregators, you take back control. You can even read the articles when being completely offline while taking a train or flying in a plane. I really can not imagine a decent knowledge worker who does not use this great concept.
One final advantage of running personal blogs is that you can keep your privacy and the privacy of your readers. In contrast to centralized, web-based platforms, the access logs won't be analyzed and sold. It's much harder to automatically derive personal profiles from distributed, heterogeneous blog sites than with centralized, closed platforms.
Let's assume you are using the Usenet or your personal blog for publishing articles, questions, opinions, whatever. Of course, you can then also post to centralized, closed web-based forums and link your original content. This way, you get the visibility on those platforms while the content is still archived, can be found with search engines, and so forth.
One thing that still persists is the example of certain sub-reddits having rules where postings of people adding links to their personal blog are deleted automatically. As much as I understand this for people self-promoting commercial sites, I don't understand it for personal blogs where no commercial interest is involved. As a consequence, I can not participate on and contribute to the privacy subreddit with my thoughts, as I already briefly mentioned above.
A hat-tip to everybody who read this far. You may have noticed that it's very important to me to explain the negative implications of centralized, web-based forums. Most implications will affect us only in a couple of years. The urgency of the matter lies in the fact that by the time you realize the implications, it will be too late to save anything or undo anything.
Therefore, it's necessary to learn about the inevitable data-loss that those services will cause in order to plan for it and deliberately make good decisions starting from now. By distributing content and using open platforms that can be interconnected and share content freely, most of the threats are addressed while getting the advantages of choosing your own interface and so forth.
So let's go ahead and stop dragging books into libraries that are known to be burned down in a couple of years for sure.
After reading this very long article, you have now deserved a picture of a cat:
Erik added a Disqus comment which I would like to include here as well in order to be read by people who do not activate JavaScript or Disqus on my site. I also added links to it:
The indieweb movement calls it POSSE for "Publish (on your) Own Site, Syndicate Elsewhere". Or the other way round: PESOS for "Publish Elsewhere, Syndicate (to your) Own Site". Either way, you preserve your own content on your own site.
I wasn't aware of the indieweb movement nor that my suggested approach does have a name. Thank you very much for this. I'm completely on their side.
For a couple of years now, I have been following this principle with my engagement on Twitter and Mastodon as well. I post new status updates using my current Mastodon account only and I have set up a cross-posting service to "the bird-site". This way, I enjoy the fresh community interaction of a federated and free platform while keeping the old service fed with messages until I quit Twitter for good. A temporary workaround for the chicken-and-egg problem which is a valid approach, if your Mastodon instance has a rule-set that allows bi-directional cross-posting. I moved from a limited instance to my current Mastodon instance for that. It is truly amazing to see a great federated service which supports moving your account that smoothly.
Hi Karl,
this is a very good post. I've been moving my own site to Org (from Wordpress.com), and have found plenty of good food for thought here. Thank you very much!
You're welcome.
I have one practical question about storage of content in the Internet Archive/Wayback Machine. I've seen the links you provided, and besides Archive-It, which is a paid subscription service (fair, of course), I've found no way to do it systematically and automatically. How do you do it? I'd love to be able to put this in a script and let systemd take care of scheduling, but I'm afraid something more manual will be required.
Well, I was lucky enough that Archive.org decided to archive my web site periodically. I can not influence the frequency. So I just "blindly" generate the archive.org URL with every new article. If you click on a brand new article, you will notice that archive.org has not yet grabbed the content and made it available via their service. After a while (you can see their frequency of crawling in older articles of mine) the content appears on archive.org.
So far, that's fine with me. The main thing is that they begin to fetch my content and that older articles are fetched with certainty.
I periodically send them money but I don't have a subscription account so far. If you do have questions about The Wayback Machine and its archiving service, please do read their FAQs.
Here are some thoughts with different angles on the same topic:
From time to time, I do love to eat in more or less fancy restaurants.
If I'm in the right mood, I blog about my experience.
Here is my list of personal restaurant recommendations for Graz, mostly in no particular order:
Please note that I did not get any money or other benefit for this list.
On this page, I collect my public/media appearances of any kind.
I do have a separate press information page with my bio in German and English, summary of my academic work and photographs to download. Drop me a line via email in order to get the URL.
Most recent updates:
Some of them are available in German language only.
<2012-05-02 Wed> Ö1 Digital.Leben: "Datenschutz mit Freedom Box" (interview on national radio)
<2012-09-09 Sun 22:30> Ö1 Matrix: "Open Science" (interview on national radio)
<2013-01-08 Tue 16:55> Ö1 Digital.Leben: interview on national radio on Personal Information Management (together with William Jones)
<2013-04-04 Thu> paper: What really happened on September 15th 2008? Getting The Most from Your Personal Information with Memacs
<2013-04-20 Sat> GLT13: A day in a life with Org-mode … (YouTube)
<2022-04-23 Sat> Talk: "Don't Contribute Anything Relevant in Web Forums Like Reddit, HN, …" at Linuxdays Graz 2022
<2022-07-02 Sat> Barcamp Graz 2022: a motivation/demo session on PIM with the example of my personal GNU Emacs Org-mode setup. Read this article for shownotes and further links.
<2022-09-06 Tue> Ö1 Digital.Leben: Alternatives Vernetzen mit „Graz.Social“ (interview on national radio)
<2022-10-21 Fri 09:00-13:00> "E-Mails, Dateien, Ordner: Tipps und Methoden zur Erleichterung Ihres digitalen Alltags"
<2022-12-04 Sun> Demo at EmacsConf 2022: Linking headings (poor-man's Zettelkasten) and defining advanced task dependencies
2009–2012: misc scientific papers on the topic of Personal Information Management:
starting with 2010: Misc contributions on http://orgmode.org/worg/ (Emacs Org-mode related)
<2014-10-31 Fri> two photographs of mine were selected out of 356 to be on BearingPoint calendars 2015
You can read more about tracking and data protection on my about-page. In short: no tracking, no advertising, no cookies, and only limited use of JavaScript for purely optional functionality related to comments and search.
Update 2024-02-07: remark on Wayland below.
Here is a neat little PIM improvement which has a great impact on my personal way of dealing with Virtual Desktops and windows on my GNU/Linux systems. After using it for a few months, I find this method brilliant and therefore, I need to blog about it.
Working with many application windows on different Virtual Desktops comes with a burden. In most setups, you have to manually switch desktops before you can see the corresponding windows and switch to them. However, in my usual work I know exactly which window I'm going to jump to, independent of my current Virtual Desktop.
Just as with using a (local) search engine to "teleport" to a specific web site, computer file or application, I introduced a method to teleport to any open window on my computer.
In combination with the Firefox add-on "window-titler", I may switch to arbitrary windows by simply invoking a custom keyboard combination, enter a search term (if it's unique with few letters, it's really quick), press Enter and my focus is switched to the Virtual Desktop and the window of choice.
When I enter "evo" I may jump to my Evolution email client. When I enter "ema" I jump to my Emacs. Entering "rc" jumps to my Firefox window thich I named "[rc]" using the add-on from above. You get the idea.
I rarely switch desktops or apps otherwise, since I've got that method in place. Few small PIM tricks have had such a great impact on my daily computer usage than this one.
Update 2024-02-07: My implementation mentioned below only works with X.org and not with Wayland. See discussions like this one. If you know how to implement the workflow on Wayland-based machines, please do write a comment!
For the implementation, I found inspiration from that web page. My method requires the following tools:
rofi for the window search popup
wmctrl and xdotool, which are invoked by the small shell script below
rofi-theme-selector for changing the look and feel of the rofi popup window
As soon as you understood the principle, you can think of alternative implementations using different tools, of course.
In my xfce environment, I create a new keyboard shortcut. System Settings → "Keyboard" → "Shortcuts" → "Custom Shortcuts". Click the "+" and add the command:
rofi -monitor -2 -show window \
     -kb-accept-alt 'Return' \
     -kb-accept-entry 'Shift+Return' \
     -window-command "/home/vk/src/misc/vk-switch-to-windowid.sh {window}"
The name and location of the mentioned shell script may vary.
My personal preference for the keyboard shortcut is mapped via my QMK keyboard firmware to LAYER + SPACE for opening the rofi search window.
The vk-switch-to-windowid.sh script is:
#!/bin/bash

WINID="${1}"

## switch to the virtual desktop of the chosen window and show the window:
/usr/bin/wmctrl -i -a "${WINID}"

## xfce window manager has a bug where the focus is not correct with "focus
## follows mouse":
## https://github.com/davatorium/rofi/discussions/1585
## https://www.reddit.com/r/qtools/comments/siksac/using_rofi_show_window_with_focus_follows_mouse/
## https://gitlab.xfce.org/xfce/xfwm4/-/issues/224
## This is a workaround to fix that by placing the mouse:
/usr/bin/xdotool mousemove -window "${WINID}" 100 -20

#end
You don't need the xdotool workaround if you're not affected by the xfce/xfwm4 issue mentioned.
If you want to switch to a different look and feel for rofi, you might want to invoke rofi-theme-selector. I personally use the theme named "arthur" at the moment.
Have fun improving your PIM workflows!
As my wife is not happy to start using GNU Emacs with Org-mode as I recommended to her, I was looking for an alternative. The general requirements are those of a typical student, collecting all kinds of university- and domain-related knowledge and - I presume it will take the usual direction - contact management, information on things, household stuff, appointments, and so forth.
This article is a brief report on my personal experience from that journey. It's not a tutorial. It's not describing the full set of features of the mentioned tools. These are just my remarks from an Org-mode point of view, mentioning things that caught my eye.
There are many knowledge management and note taking tools to choose from: personal wikis, general purpose wiki tools, note-taking software, and so forth.
I can't test all of them. I can't even take a look at a substantial part of the listed solutions. There are simply too many.
Obsidian is a current hype, just like Evernote and others were for a certain period. I could never use solutions like that since they are proprietary and come with a large vendor lock-in which I'll avoid at all cost.
A knowledge-management tool is not a hype tool I'll use for a few years until I switch to the next hyped solution. Such a tool should be a trusty companion for decades, if not until the end of my (digital) life.
So scratch Obsidian as one of the many proprietary solutions.
The first tool I took a closer look at was Joplin. As with most solutions nowadays, it uses the Markdown lightweight markup language (LML), which is the most well-known LML but by far not the best IMHO.
Whatever, let's try it.
My first impression was that it's actually not that bad. Joplin has a manually curated hierarchy of "notebooks". Each notebook has one to many "notes" and "tasks". Within a notebook, they are listed either in a custom order or they are ordered by updated timestamp. Notes can contain Markdown list items with checkboxes.
This tool is note/todo-oriented, as these seem to be the usual entities to work with.
You either see the Markdown source, or a side-by-side view of Markdown and its rendered result or only the rendered result, which is read-only.
Tags can be added via menu or shortcut. They are not visible in the note/todo views. However, they are listed within the dynamic list of tags. When visited, this dynamic view lists all notes and todos that were associated with that tag in a temporary notebook view.
You define a directory to store your data. In this directory, you see files like 834bd20fb2d8432e8b80d87865c1b75d.md. If you re-use the Markdown files with other tools, you first need to find out which file contains what note or task. Tool configuration files are stored within directories in ~/.config/joplin* on non-Windows systems.
OK, so far so good. Let's take a look at another tool.
I got a recommendation to try Logseq via Mastodon.
The name "Logseq" is an abbreviation of "logbook sequence" and it's pronounced "log-seek" (and not "log-segg").
Visiting the project's website with NoScript activated, you don't see anything at all. If you allow their website to execute arbitrary JavaScript code, you'll notice that they point to their community on Discord and a forum. Instantly, my article on why such forums are a bad idea kicks in. But hey, that's the current normal, I guess.
Right after playing around with the locally installed Logseq for a few minutes, it was very clear to me that Joplin is also off the list of candidates. Logseq is so much better.
First of all, Logseq is also using Markdown as the default LML. However, you can switch to Org-mode syntax, for which I coined the term Orgdown for very good reasons. The Orgdown web page also features a page where you can see a number of software tools that support Orgdown to some extent. The list is already longer than you'd probably expect!
You can clearly see on many occasions (e.g., the documentation) that Markdown is the main LML for Logseq. So far, I did not stumble on many things that were not possible with Orgdown, except one Markdown table plugin. Switching to the much better Org-mode syntax is a good idea from my perspective. It's much easier to learn and type, and except for a few Logseq-specific extensions, it's also very consistent in comparison to Markdown.
Simple text formatting follows the same principle as Orgdown:
/italic/ *bold* _underline_ +strikethrough+ ^^highlighted^^
You have noticed: the "highlighted" syntax was added by Logseq. It's where the Markdown inconsistency is visible: here, it takes two characters for starting the markup and another two for ending. Sigh.
You need to learn a few concepts in order to understand Logseq basics.
By selecting a local directory which will then hold your data files, you start your first "graph". You can have multiple graphs in parallel but they are not connected at all: you can't link one note of one graph to another node in a different graph. I created a Syncthing share with my wife's computer and started her graph in Logseq.
Each graph comes with certain settings. For example, you can have one or many graphs using Markdown and others with Orgdown default syntax. Unfortunately, when you switch your default LML, the existing pages are not converted. However, you can have Markdown and Orgdown mixed in a graph. Those pages can also link to each other.
Within a graph, you create "pages" which are not part of a hierarchy by default. Each page does contain a hierarchy of "blocks".
In contrast to Joplin, which is a note/todo-oriented tool, and Org-mode which is a file/heading-oriented tool, Logseq is a block-oriented tool. That means that you can add meta-data to each block and you can link to particular blocks within a page. The smallest possible entity for links and such are blocks.
On the file system, such a page looks like this:
* This is an *example* /block/.
* This is another block. It features the tag #mytag which also links to [[mytag]].
* If you like to have spaces in page titles, you need to link them like #[[page title]] or [[page title]].
** Blocks can have sub-blocks.
*** You can indent them as you wish.
*** TODO this is a scheduled task
SCHEDULED: <2024-01-28 Sun>
*** Strange thing: headings can occur on each level.
*** This is a h1 heading
:PROPERTIES:
:heading: 1
:END:
*** This is on the same indentation level as the h1 heading.
**** Here is a particular famous block. 🤩
:PROPERTIES:
:id: 65b67977-9737-42d8-9bbe-045f5e0a6d68
:END:
* This is a h2 heading
:PROPERTIES:
:heading: 2
:END:
* Here is a link to the famous block: ((65b67977-9737-42d8-9bbe-045f5e0a6d68))
** DONE link the famous block
:PROPERTIES:
:id: 65b67ef7-aee6-40c4-936e-aa78ada5d748
:END:
SCHEDULED: <2024-01-28 Sun>
* You can also embed blocks or whole pages! Here, I embed the famous block:
** {{embed ((65b67977-9737-42d8-9bbe-045f5e0a6d68))}}
*
As you can see, the Orgdown headings are used for each block. This way, you can add properties and other meta-data to each block. In this example, it's easy to spot that the heading level information is a special property as well as defined IDs for references. This is somewhat unusual for users of Org-mode where syntax elements like paragraphs don't "need" a heading each. This could cause issues when you're planning to directly use Orgdown files generated with Emacs Org-mode (or similar) in Logseq or vice-versa. Therefore, Logseq Org-mode syntax is not used in the same way as outside of Logseq.
By default, you always see the rendered version of the markup in Logseq. If you put your cursor within a block, you then see its "source" markup you can modify.
If you don't need indentation and you just want to write like within a word processor or similar, you can toggle the "document mode" by clicking outside of any block (leaving the current block edit mode) and typing t d. This might be very interesting for many use-cases that don't need the full visual clutter of Logseq.
Todo keywords, SCHEDULED or DEADLINE work just like with Org-mode but on a block-level, not heading-level.
If you desperately need a hierarchy of pages, Logseq offers the concept of "Namespaces". If you create a page like "foo" and then a page like "foo/bar", the "bar" page is now a sub-page of "foo".
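A hypothetical sketch of such a namespace hierarchy with three levels (the page names are made up):

[[projects]]
[[projects/website]]
[[projects/website/relaunch]]

Here, "relaunch" shows up as a sub-page of "website", which in turn is a sub-page of "projects".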
As with many Zettelkasten solutions, you also have a "Graph view" which visualizes your pages and their links. I'm not sure if I would use that often. Be warned: this view is very CPU-intensive. I left it open in the background once and my notebook battery drained very fast. Therefore, use it for jumping around but don't stay within that view.
In Org-mode, you can have file-level meta-data such as file tags placed before the first heading. In Logseq, you need to add such things as properties in the so-called "Frontmatter", which is the first block of a page. This way, you can define file tags like #+tags: foo, bar or #+ALIAS: page1, page2, which looks interesting.
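A hypothetical frontmatter block of an Orgdown page might therefore look like this (title, tags and aliases are made up; Logseq may use slightly different property names depending on the version):

#+title: Gardening
#+tags: hobby, outdoors
#+ALIAS: garden, backyard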
One of the best things of Logseq is the effort-less linking of pages or blocks. You just type [[ followed by some search keywords. Within the search results, you choose your desired target, confirm with Return and now you've got a bi-directional link from the current position to the other page and vice versa. Linking particular blocks can be done with ((.
There is much functionality available when using the / commands. This way, you can add datestamps, timestamps, scheduled items or deadlines, and much more. Most plugin functionality is accessible from here. Similarly, the < advanced commands offer access to various blocks such as quotes, center, source code. The shortcut C-k is one of the most important ones for Logseq: you can jump to any page, create new pages and much more. You can easily notice that the programmers have a very big heart for people who prefer to use the keyboard in order to be efficient and quick.
Related to links, you need to know that [[foo]] is a link to the page whose name is "foo". Alternatively, each page name is also a tag. So you can also reference the "foo" page by typing #foo. There is no difference between a tag and a link to a page except the visual representation of the link. If you would like to link to a page name that contains at least a space character, you'd need to type:
[[foo bar]] or #[[foo bar]]
It is very important to know that each link in Logseq is automatically a bi-directional link. So if you link from "John Doe" to the block of an event, this event also has a back-link to "John Doe". With Org-mode, you need additional packages such as org-super-links in order to get that feature. To me, this is one of the most important properties of a knowledge-management system. I think that most people who want to try a Zettelkasten system actually need bi-directional links only.
Something that even Org-mode does not offer by default are so-called embeds. If you write:
{{{embed [[page name]]}}}
... you are not only referencing "page name" but also embedding its content at the current position. If you just want to embed a block, you can use the
{{{embed ((block name))}}}
syntax instead.
And yes, three curly brackets this time, not two. (Consistency!)
Blocks can be collapsed and expanded just like with Org-mode. I did not find a way to collapse and expand all blocks at a certain hierarchy level like the TAB folding cycle works in Org-mode.
Logseq has a very capable query feature (and a builder) which offers many possibilities. You can use boolean operators to query for non-trivial stuff like:
{{query (and [[page1]] [[page2]] (not [[page3]] ) ) }}
Two curly brackets? → consistency!
You can base queries on properties, todo keywords, date ranges, and much more. This seems to be very powerful and allows for great re-use of content or generating summaries of some sort.
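For example, a query that collects all open tasks that link to a certain page might look roughly like this (a hedged sketch; the exact filter names may differ between Logseq versions, so check the query documentation):

{{query (and (todo TODO) [[project]] ) }}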
Furthermore, Logseq has a flexible template concept. You can turn any sub-hierarchy of blocks into a template. I didn't find out how to query values from the user when she's applying a template. If you do have any idea on that, drop me a line.
Logseq comes with an easy to reach marketplace for plugins. You can choose from a large list of plugins offering great functionality. So far, I've installed the following plugins:
[3/10] or [33%]
There are also themes that can be installed just like plugins. I tested the "Quattro Theme".
Logseq doesn't come with Babel or spreadsheet tables using calc formulas.
However, there is at least a Calculator which offers basic mathematics operations including assigning variables and such.
One thing I was not happy with is the use-case of linking to local files.
It seems to be the case that with actions like drag and drop of, e.g., a PDF file, this file always gets copied to the data directory of the current graph. Therefore, you double the disk space and if you modify that file, you only modify the "copy" that Logseq is using and not the original file.
Of course, you are able to link to local files without copying them. Unfortunately, you always need to know the full absolute path because there is no file-picker or similar. This is nothing I'd use that way. Especially when I do have the ultimate way of linking local files in Org-mode.
Just the most painful missing things from my perspective:
Elisp, of course, and the universe of customizations Elisp makes possible
custom links
Appointments: somehow, everything the Agenda add-on shows needs to be a todo task (or I did not get it until now)
rectangle functions for editing: cut/insert/paste
search & replace with RegEx
keyboard macros
in Logseq, everything - by default - is something that is a heading in Org-mode syntax. Classic itemize lists, normal paragraphs, tables, … are a bit of a pain if possible at all.
Export formats other than MD, PDF, XML
most table-related features including calc formulas (spreadsheet)
sometimes, Org-mode syntax is not supported like Markdown is: e.g., Markdown table add-on
Babel and its universe of possibilities
Sparse trees
Todo dependencies (using add-ons like org-edna)
dired for file management
Included binaries are copied to Logseq and can't be just linked
most agenda features
org-crypt
auto-filled :LOGBOOK: drawers: created time-stamps, todo status changes, …
consistency in heading levels and block level: a node on level 4 can be a heading of any level. Example:
**** A node on level 4
:PROPERTIES:
:heading: 2
:END:
easy to use date-picker via keyboard: -thu, +2w, …
capture/templates can't ask values from the user
From my personal setup: linking a local file means typing [[/home/user/dir/subdir/file.pdf][a title]] without any file selection dialog. That's very annoying. And then, you can't open the file by clicking onto it. At least I failed when doing that with a PDF.
^^highlighted^^
#hashtag syntax or [[double-bracket-syntax]] (both are the same!)
Related: https://logseqtemplates.com/
TAB behavior of Org-mode.
As with many tools, I took a closer look at how well Orgdown syntax is supported within Logseq. With Org-mode syntax being one of the two options for page content, its Orgdown support is fairly good. I got 86 percent of syntax support for OD1.
I found some issues with lists, code, horizontal bars, tables, and similar.
I was curious how Logseq reacts when I throw in a fairly large Orgdown file of mine which was created within my GNU Emacs Org-mode setup over a period of twelve years. So I copied my current notes.org into the Logseq directory that holds the pages.
This is a file with 215523 lines holding 10640 Orgdown headings (1320 tasks and 9320 non-task headings), many tags, internal and external links.
The good news is that Logseq did somehow process its content and it did not modify the file content while doing so.
The number of pages exploded: for each Orgdown tag used, Logseq created a page in its database but not in the file system. Those "tag pages" contain all the links to the headings that are tagged using those tags.
Well, that was somehow expected.
I could not find "notes" within the page search results. Unfortunately, this is a bad term to search for as it is contained many times within the file.
Many if not all notes.org headings are somehow associated with my "Emacs Survey 2022" heading. This is a rather small and "unimportant" heading in the fourth layer of headings within notes.org. For some reason, Logseq got confused and I can't get any view of my large file to narrow things down.
Searching for a node like "Windows 11" shows me the corresponding target below "Emacs Survey 2022". When selected, I always see the same node: "Emacs Survey 2022". Furthermore, many task headings are listed in a table that are not part of that sub-hierarchy. Other content from totally different sub-hierarchies is shown here as well.
This way, I could not test internal ID-links, jumping around, navigating within a large page file and so forth.
This import test did not work to my satisfaction at all.
You can dig much deeper into Logseq with topics like advanced queries, Journal pages (which I don't use), Flashcards, a large number of freely changeable keyboard shortcuts, the whole universe of plugins, Zotero integration (unfortunately cloud only!), the iOS/Android app (the Android app is not in the Play Store yet), and so forth.
You can find many tutorial videos on YouTube and, unfortunately, fewer written resources on the WWW.
However, Logseq can't and will never be as flexible as the original: Org-mode. So there is zero chance that I would actually move my Orgdown data to Logseq. Not only because of the failed naïve import test using my large notes.org.
There are some great Logseq features I wish I could use with Org-mode as well.
So if you do have a situation where GNU Emacs is no option for you (you should have really good arguments for that!), Logseq is a very good approximation as long as you don't plan to import large Orgdown files that were generated with the original Org-mode.
My wife will start with Logseq. Let's hope that digital note-taking will be a valuable companion also for her.
If I did get something wrong, please do drop me a line so that I can fix this here.
I do have some academic background with tagging, the process of assigning labels or tags to entities such as headings in my knowledge system or file names. I designed tagstore as a research platform to study tagging behavior in different configurations and I developed the concept of TagTrees which I also used in my filetags method and tool-collection which is the basis of my personal file management.
However, this article is based on personal experience gathered while using tags and tagging methods over many years in my own setup. From this experience, I derived some general rules that I think may be applied to any personal tagging workflow.
Disclaimers: The things mentioned here can not be applied to social or collaborative tagging in general. I refer to file tagging many times but it is just as an example use-case. The rules mentioned here should apply to personal tagging with other types of entities as well.
To my surprise, we tend to think in hierarchical categories all the time. As I have written in my article on Logical Disjunct Categories Don't Work, the real world does not fit into disjunct categories.
Therefore, we should embrace multi-classification more often. If you do want to learn more about the rationale, you may as well read the first chapters of my PhD thesis or the book "Everything is Miscellaneous" by David Weinberger, just to give you two resources of many.
Long story short: tagging takes away the burden of finding one single spot in a strict hierarchy for entities that actually form the heavily intertwined network of concepts we find in the real world. The real world is far from being a neat hierarchy. Everybody who tries to put "the world" into a strict hierarchy will fail.
Unfortunately, tagging is not a process that works flawlessly as soon as there is a tool which allows you to apply tags to entities of some sort. Tagging seems intuitive but it is not.
I want to give you one example that should be able to explain why tagging behavior is something that needs to be curated itself as well. Imagine you do have some sort of personal tagging tool for local files. Imagine furthermore that you tag your files after a double minus separator similar to that example:
Tom and Julie -- people couple holiday tom julie relatives portraits nice.jpeg
For this particular photograph, you entered all potential tags that came to your mind. The rationale is that those "descriptive tags" help you re-find the image when you're looking for files with relatives from any holiday and so forth. Actually, this is something we all would like to have and use.
In the long run, you will notice that you've ended up with at least hundreds of different tags. Retrieving files now gets very tedious because you can't remember them all. In particular, you won't use all matching tags on all matching files. Therefore, you may end up with something like this:
Julie hugs Tom -- persons beautiful funny vacation uncle faces.jpeg
As you can see, this file is extremely similar to the first example. Yet, you have tagged it using a totally different set of tags. There is not even one single tag matching the other set of tags. If you think of any tag-based retrieval process, you can not get both pictures with one search query based on those tags - as similar as they are.
I do think at this moment, we can agree that tagging does need some sort of guidelines in order to get a helpful tool for information retrieval. And information retrieval is the main reason why we do apply tags in the first place. Never forget that in this context.
In the 45 minute video of a talk I gave, which is linked on my filetags article, I came up with an initial list of best practices on how to tag. I took that list as a basis and extended it with further tips. Here is the current set of rules in no particular order. I'll keep maintaining the list of rules here. Come back for any potential future updates.
I will elaborate on each rule in the upcoming sections because I do think that rules are of very limited use if you don't understand the rationale behind them.
The example in the previous section should have made clear why too many tags lead to a bad retrieval situation.
It is very difficult to recommend numbers here. If you tag hundreds of files that stem from similar situations, you may be able to stick with maybe a dozen different tags. If you only want to mark files with "confidential", "internal" and "public", you end up with exactly three different tags. If you are curating thousands of photographs of animals, you may want to have hundreds of different tags and still have an efficient tool for tag-based retrieval.
Therefore, my recommendation is to start with only a few tags and add more tags if you really do come up with good reasons for them.
Related to rule number one, a so-called controlled vocabulary is able to help you curate that finite set of self-chosen tags.
While a handful of tags can easily be memorized, any higher number of tags leads to situations where only good tool-support is able to help you. If you're using a tagging tool that allows you to maintain this controlled vocabulary, it might as well accept only pre-defined tags during the actual tagging process. This is a good thing to have when you want to maintain tag consistency.
I also recommend maintaining a curated list of tags and their definitions. Concepts can be misunderstood and they change over time. Having one central definition per tag helps you and your peers.
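Such a list does not need fancy tooling. A plain text file with one tag and a one-line definition per line is a perfectly fine start; a hypothetical sketch re-using some tags mentioned in this article:

bicycles - everything related to bikes, bike parts and cycling trips
sports - physical activities in general; prefer this over single sports
emacs - GNU Emacs, Org-mode and related workflows
confidential - files that must not leave my own machines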
Using a controlled vocabulary also prevents you from accidentally using synonyms, homonyms or mixing up singular and plural forms which we discuss in rule number four.
I would like to add that there may be good reasons for a larger number of tags in your controlled vocabulary in certain situations. For example, if you think of a company using the filetags method, I do think that a fairly long list of customer, supplier and project tags in your controlled vocabulary is not a bad idea in the first place.
The reason is that you will most probably not end up applying more than one or two customer tags to one particular file, even if you do have a hundred customers in your controlled vocabulary. You'd only get issues when you'd like to tag a newsletter file with, e.g., all client tags - that would be cumbersome and may lead to bad retrieval and also technical issues due to long file names. You could think of a tag like "allcustomers" for situations like that.
But still: start with as few tags as possible even here so that you don't end up with more than - let's say - five to seven tags for any given file. That's the actual goal here.
Tags need to be clearly distinctive from each other. Furthermore, tags should not overlap with respect to their meaning. If you do use "sports" and "soccer", you might tend to use "sports" for some soccer files instead of both or only "soccer".
There is one exception: tag hierarchies are able to provide some relations between different tags. For example, you may be able to use a tagging solution that allows you to define the following set of tags:
sports (soccer volleyball biking)
The idea behind tag hierarchies is that when you retrieve files via the "sports" tag, you also get files that are tagged with "biking" but not with "sports". Many people think this is a great help. I personally prefer not to use such a feature because it complicates many use-cases. Furthermore, all retrieval methods need to support those tag hierarchies, which usually limits my possibilities.
The reason for this rule is simple: when applying tags or trying to come up with a tag-based search query, you should not be confused if it is "bicycle" or "bicycles". A simple solution is that you either choose the singular form or you stick to the plural form.
When I was looking into that issue many years ago, I realized that most (social) tagging systems settled for the plural form back then. I was not particular happy about that but I've learned that going against well-established conventions may cause pain later-on. So I adapted the plural convention for myself as well.
You can argue that the plural form does make sense if you think of «a bicycle is part of the set named "bicycles"». Furthermore, it would be a bit weird to tag an item related to a handlebar with "bicycle" because this could be interpreted as «a handlebar is a bicycle» which is not true. However, it seems to be a clean approach when you interpret it as «a handlebar is associated to the set related to "bicycles"».
Of course, your mileage may vary here. If you decide to stick to the singular form, that is perfectly fine as long as you stay consistent within your own tagging system.
It is noteworthy that I do not follow the plural rule that strictly. If there is a word that is well fitting such as "education", "cloud" or "emacs", it might be a perfect tag as well although they are not in a plural form.
The thing with lower case is related to avoiding conflicts just like the previous rule. It should avoid situations like «did I choose "Emacs" or was it "emacs"?».
Of course, this only applies for tools that are case-sensitive. And yes, even when your main tool is case-insensitive, allowing "emacs" and "Emacs" for the same result set, you need to think of all retrieval methods you might be using in the future. Some of them might not be case-insensitive and then you do get an issue.
Just like the previous rule, the "only a single word" rule is a workaround to avoid tool-related issues. Some tagging solutions are perfectly fine with spaces and even special characters like emoticons. I would not assume that this holds true for the majority of tagging solutions. If you want to be on the safe side, you should stick to single words.
As you can see in my "restaurants_bars" tag, I connect them with underscores. With "tugraz" for "TU Graz" I somehow decided to omit the space and connected the two parts without any character in-between. As a general rule, I probably would settle for "-" or "_" to separate different words within a tag.
In order to comply with rule number one (as few tags as possible), you need to accept that you can't use tags that are too specific and describe every aspect of the entity about to be tagged.
Therefore, tags need to be as general as possible. I merged the tags "automobile", "airplane" and similar to the one tag "transportation".
Yes, I still stick to "bicycles" which could be merged as well here but since I blog more about bicycle-related stuff than about the rest of "transportation", I made this decision deliberately. Readers might be confused here, I agree.
Back to this rule. I would recommend you to prefer categorizing tags over descriptive tags. Omit "volleyball" when you can use "sports". Try to tag things with a more general category than you would probably think. This makes much sense since as an extreme example "each document has its own tag" would not support retrieval tasks properly.
I once read an interesting quote from WallyMetropolis on reddit. It was in the context of knowledge management using tags and links between different notes:
Tags are doors and links are corridors. Tags are how you enter a building, and links are how you navigate around once you're in. So you'll in general have few tags and not every note will have a tag. But you'll want many links.
That is a nice picture, I think.
I once called this rule «don't use tags that can be derived from the filename extension». Since I can think of other obvious tags, I generalized this rule even more.
According to this, I would not recommend using tags like "images", "spreadsheets" or "photographs".
As a matter of fact, I actually do use a file-tag "presentations" which seems to contradict the rule. The reason for this is that I may store all kinds of material in different file formats that refer to presentations somehow: classical .pptx or .odp presentation files, photographs of slides during talks, sound recordings and videos of talks, presentation slides as PDF files, and so forth. Without the "presentations" tag, I would not be able to retrieve slide material according to file format filters and similar.
In this section, I would like to elaborate on a few things that might turn up when talking about tagging.
I honestly don't know if there is an official term for this. What I do mean are tags that consist of keys and associated values. Here are some examples:
size:big • size:small
security:public • security:internal • security:confidential
They are a simplified form of the "triple tags" mentioned here on Wikipedia. And the "triple tags" are directly related to the semantic triples.
Google lets you search for foobar filetype:pdf site:.at which returns only PDF documents that are provided by web servers whose TLD is from Austria.
Personally, I love those key-value tags and I would like to re-use the well-established key:value form. Unfortunately, tagging files is an important part of my personal tagging workflows. While the colon is a perfectly fine character for my main systems which run GNU/Linux, I would not be able to copy those files to a much more restricted file system from Microsoft where colons are not allowed at all.
Currently, I don't use key-value tags in my personal tagging systems. Most probably because I avoided the additional complexity in tool-support when it comes to implementation and handling details.
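If I ever wanted key-value tags in file names nevertheless, I would have to settle for a separator character that is legal on all file systems involved. The equals sign would be a hypothetical candidate (the file name below is made up):

2014-05-09 Tax statement -- security=confidential scan.pdf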
A file name like the following is quite disturbing to me:
Business Report v0.3.5 final2 with changes.docx
I do have some questions here.
What does version 0.3.5 stand for? Did this document really go through 0.0.1, 0.0.2, 0.0.3, ..., 0.1.0, 0.1.1, 0.1.2, ..., 0.2.0, ..., 0.3.0, 0.3.1, 0.3.2, 0.3.3, 0.3.4 and 0.3.5? If so: when is the third number incremented? When the second and when the first?
Why is "final2" still a thing? Of course, the author might have added some changes to the version he or she marked with "final" already.
And "final2 with changes" clearly indicates that naming files this way is not very efficient.
After providing that example, which is - in my personal experience - unfortunately a realistic one, I would propose an alternative. How about this?
2022-01-28 Business Report -- final submitted.docx
The previous stages of that particular file might have been:
2021-12-29 Business Report -- draft.docx
2022-01-05 Business Report -- draft.docx
2022-01-17 Business Report -- draft.docx
2022-01-27 Business Report -- final.docx
As you can see, the status is indicated by the file-tags, the order and history by the leading date-stamps.
On January 27th, the author tagged the document as final. Then, there were some last-minute changes that lead to yet another final version on 2022-01-28. According to the end result, the version from 2022-01-28 got submitted to the client. Everything is much clearer to me.
I like to think that this file name history gives many more clues to anybody. The version numbers didn't add helpful clues anyway and served no particular purpose except defining a linear order with unclear in-between steps.
I referred to aspects like this and much more in my article on how to design file management for companies.
When people start to embrace tagging, they not only tend to violate most rules above. They also tend not to think of many other benefits of tagging or adding meta-data in general.
Just to give you some food for thought, I would like to mention a few of my file-tags and what they do represent for me.
"selection": Imagine you've been abroad for a fine vacation and came back with 1428 photographs you want to keep. Despite the fact that you're an exponentially gifted photographer, it would be cruel to present all photographs to your friends and relatives who feel socially obliged to stay during your eight-hour session with three minutes for each photograph on average. For cases like that, I prefer to add the tag "selection" to a much smaller sub-set of must-see photographs for those sessions. This way, I may keep the whole set and be able to present a much smaller selection when appropriate. See the "Tag Filter" feature of filetags mentioned further down.
"taxes": Once in a year, I need to collect all the files that are relevant for my tax adjustment report. Files related to bills all over the year that are tagged with this tag do easy the pain to a great extend.
"acquisitions scrap": When I get new hardware, I usually take some photographs. Partly because of sheer new owner's joy, partly to document things like computer stickers that might get unreadable with time. Those files are tagged with "acquisitions". At the other side of hardware life-time, taking fare-well photos does easy the pain of letting go. I do have something for my memory. Those are tagged with "scrap". It also helps if I'm unsure if I still own an old thing or if I tossed it away.
"cliparts": Images that may be used in presentations or blog articles to represent something specific as a clipart. That tag helps me a lot when looking for visuals.
"heritage": I collect (too) much data in my yearly archive folders. Files tagged with "heritage" are things that I consider very important or representative, similar to physically printed photos in an album. If I'm looking for things of subjective or sentimental value, I may filter for that tag.
"manuals": After going digital, I keep my hardware manuals only in digital form. When looking for instructions with a gadget, I can filter for that tag.
"scan screenshots": Scanned documents are tagged with "scan", screenshots from my computer or mobile with "screenshots".
"confidential internal public": The usual security tags.
For this web site "public voit" and for my blogging solution "lazyblorg", tags are an essential part of the user experience I try to provide. They are mentioned in my how to use this blog efficiently, they are the dominant element of my landing page, they are available in the sidebar and they've got their own tagcloud page:
As usual with tag clouds, the relative size of a tag represents the number of blog articles that are tagged with it.
Since the tags "hardware" and "software" dominated the tag cloud for the longest part of this site, I've omitted those two tags from the tag cloud.
I would love to see readers using the tags to navigate through my web site. Unfortunately, I don't have the tools to verify my assumption.
To be honest, I don't have all those tags in my head when writing new blog articles. Especially older tags may vanish from my memory. I try to compensate this by deliberately using the tag overview of the pre-defined tags in my Emacs when I choose the tags.
Here is an article about tag gardening on public voit.
All in all, it's not a perfect result but I do think that it offers good value to my readers. Please do write me a comment if you want to give me some feedback.
After writing so much about tagging and my recommended way to use tags for information retrieval, I would like to mention my personal tool-set when it comes to tagging local files. Some of it has been mentioned and linked above. If you want an introduction, please do read my article about filetags and its accompanying tools. If you are not interested in reading about my tagging tools, you can skip this promotion of clever and partly unique tagging and retrieval features and jump to the conclusions at the bottom.
My implementation of filetags supports the tagging process in many ways. I have optimized the user experience over the years so that it does require minimal effort.
Whether you use the command line or prefer filetags integrated into a graphical file browser, you can select multiple files and add a set of tags to them.
For example, when I go through my new photographs, I select all images that share a tag in my image browser, invoke my keyboard shortcut for tagging and add the tags in one go for many files. (Same holds true for appendfilename, by the way.)
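On the command line, this looks roughly like the following sketch; the file names are hypothetical and the tags are entered only once at the prompt for all given files:
filetags "2023-08-01T10.23.11 Beach.jpg" "2023-08-01T11.42.00 Lighthouse.jpg"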
When you create a file named .filetags in any sub-hierarchy of your file system, filetags is using that as a controlled vocabulary for that sub-hierarchy. Each tag is written in one line. Simple as that.
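A minimal sketch of what such a file could look like, using only tags already mentioned in this article (this is not my actual vocabulary):
scan
taxes
selection
manuals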
With a controlled vocabulary defined in a .filetags file in the current directory or any parent directory, you will get tab completion for those tags. If you press Tab twice, you get a list of those pre-defined tags.
Even when you want to remove an already assigned tag from a file, tab completion helps you there. You just have to precede the tag to remove with a minus character like "-foo" and tab completion is able to complete it. After confirming, the tags with a minus character get removed from the file name.
This is probably the most important feature in order to avoid "inventing new tags" in an unintended fashion.
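To illustrate, here is a hedged sketch of such an interactive untagging session; the exact prompt wording of filetags may differ:
filetags "2021-06-25 Party invitation -- scan correspondence.pdf"
At the tag prompt, entering "-scan" (tab completion works for the minus form as well) and confirming should result in the file name "2021-06-25 Party invitation -- correspondence.pdf".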
The .filetags file is not restricted to one tag per line. If you write multiple tags in one line, those tags are declared as mutually exclusive tags. Consider the following content of .filetags:
draft final
confidential internal public
scan screenshots
Now, let's consider an example file like the following:
2022-01-29 Business Report -- draft internal.pdf
Now assume you would like to finalize this version and mark it as suitable for the public, changing its security level. With filetags, you just invoke your preferred way of tagging and add just the two tags "public" and "final". After that, the file name looks like this:
2022-01-29 Business Report -- public final.pdf
As you can see, the tags "draft" and "internal" got replaced because they have been defined as mutually exclusive with "final" and "public" respectively.
I think you get the idea of that and can think of ways this might support your own workflows.
As already mentioned above when explaining the example tag "selection", I would like to write about the filter-by-tags feature.
Consider you have a directory that contains many files.
If you want to retrieve a file whose tags you know, you can skim through all the files. However, filetags offers you a more elegant possibility: you can filter the files according to one or more tags.
For example, let's take a look at the following situation:
$HOME/my party/
|_ 2021-06-25 Party invitation -- scan correspondence.pdf
|_ 2021-07-31 Guest list -- correspondence.txt
|_ 2021-08-01T11.51.44 Uncle Bob arrives.jpg
|_ 2021-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
|_ 2021-08-01T14.12.23 Start of BBQ with the big steak.jpg
|_ ...
|_ 2021-08-01T23.53.19 Even uncle Bob decides to go home -- fun.jpg
|_ 2021-08-05 Lessons learned for planning a party -- scan.pdf
|_ 2021-08-06 Thank-you letter Bob -- scan.pdf
|_ Bills/
   |_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
   |_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
The following command and interaction would generate the following temporary link structure:
filetags --filter
The user gets asked to enter one or more tags and enters "scan". filetags then creates a directory whose content consists of links to all files matching the query. By default, the resulting directory is .filetags_tagfilter in your home directory. After invoking it for our example, the content of this retrieval directory looks like this:
$HOME/.filetags_tagfilter/
|_ 2021-06-25 Party invitation -- scan correspondence.pdf
|_ 2021-08-05 Lessons learned for planning a party -- scan.pdf
|_ 2021-08-06 Thank-you letter Bob -- scan.pdf
This way, our user is quickly able to skim through all scanned documents to locate the desired one.
To locate all matching files in all sub-directories as well, the user is able to add the parameter --recursive ...
filetags --filter --recursive
... and enters the tag "scan", which would generate the following temporary link structure:
$HOME/.filetags_tagfilter/
|_ 2021-06-25 Party invitation -- scan correspondence.pdf
|_ 2021-08-05 Lessons learned for planning a party -- scan.pdf
|_ 2021-08-06 Thank-you letter Bob -- scan.pdf
|_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
|_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
This way, filetags supports selective file retrieval based on tags in an elegant way.
This function is somewhat sophisticated, as it is not a very well-known thing to have. If you're really interested in the whole story behind the visualization/navigation of tags using TagTrees, feel free to read my PhD thesis about it on the tagstore webpage. It is surely a piece of work I am proud of, and its general chapters are written so that the average person is perfectly well able to follow.
In short: this function takes the files of the current directory and generates hierarchies up to a level of $maxdepth (by default 2, can be overridden via --tagtrees-depth) of all combinations of tags, linking all files according to their tags.
Too complicated? Then let's explain it with some examples.
Consider having a file like:
My new car -- car hardware expensive.jpg
Now, when you generate the TagTrees, you'll find links to this file within sub-directories of ~/.filetags, the default target directory: car/ and hardware/ and expensive/ and car/hardware/ and car/expensive/ and hardware/car/ and so on. You get the idea.
The default target directory can be overridden via --tagtrees-dir.
Therefore, within the folder car/expensive/ you will find all files that have at least the tags "car" and "expensive" in any order. This is really cool to have.
Files of the current directory that don't have any tag at all are linked directly to ~/.filetags so that you can find and tag them easily.
Personally, I use this feature within my image viewer of choice. I mapped it to Alt-T because Alt-t is already occupied by filetags for tagging, of course. So when I am within my image viewer and press Alt-T, TagTrees of the currently shown images are created. Then an additional image viewer window opens up, showing the resulting TagTrees. This way, I can quickly navigate through the tag combinations to interactively filter according to tags.
Please note: when you are tagging linked files within the TagTrees with filetags, only the current link gets updated with the new name. All other links to this modified filename within the other directories of the TagTrees get broken. You have to re-create the TagTrees to update all the links after tagging files.
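Re-creating them is just a matter of re-running the TagTrees command from the directory that holds the original files, for example with options already shown in this article:
filetags --tagtrees --recursive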
The option --tagtrees-handle-no-tag controls how files with no tags should be handled. When set to treeroot, untagged files are linked in the TagTrees target directory directly. The option ignore does not link them at all. Any other value FOLDERNAME links them to a sub-directory of the TagTrees target directory named according to that value.
With the option --tagtrees-link-missing-mutual-tagged-items you can control whether or not there will be an additional TagTrees folder that contains all files which lack one of the mutually exclusive tags. Using the example winter spring summer autumn from above, all files that got none of those four tags get linked to a TagTrees directory named "no_winter_spring_summer_autumn". This way, you can easily find and tag files that don't participate in this set of mutually exclusive tags.
Using the example files from above:
$HOME/my party/
|_ 2021-06-25 Party invitation -- scan correspondence.pdf
|_ 2021-07-31 Guest list -- correspondence.txt
|_ 2021-08-01T11.51.44 Uncle Bob arrives.jpg
|_ 2021-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
|_ 2021-08-01T14.12.23 Start of BBQ with the big steak.jpg
|_ ...
|_ 2021-08-01T23.53.19 Even uncle Bob decides to go home -- fun.jpg
|_ 2021-08-05 Lessons learned for planning a party -- scan.pdf
|_ 2021-08-06 Thank-you letter Bob -- scan.pdf
|_ Bills/
   |_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
   |_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
... and the command line ...
filetags --tagtrees --tagtrees-handle-no-tag "has_no_tag" --tagtrees-depth 2 --recursive
... filetags generates the following temporary link structure:
$HOME/.filetags_tagfilter/
|_ scan/
   |_ 2021-06-25 Party invitation -- scan correspondence.pdf
   |_ 2021-08-05 Lessons learned for planning a party -- scan.pdf
   |_ 2021-08-06 Thank-you letter Bob -- scan.pdf
   |_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
   |_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
   |_ correspondence/
      |_ 2021-06-25 Party invitation -- scan correspondence.pdf
   |_ taxes/
      |_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
      |_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
|_ correspondence/
   |_ 2021-06-25 Party invitation -- scan correspondence.pdf
   |_ 2021-07-31 Guest list -- correspondence.txt
   |_ scan/
      |_ 2021-06-25 Party invitation -- scan correspondence.pdf
|_ friends/
   |_ 2021-08-01T12.31.42 Sheila with her new boyfriend -- friends.jpg
|_ fun/
   |_ 2021-08-01T23.53.19 Even uncle Bob decides to go home -- fun.jpg
|_ taxes/
   |_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
   |_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
   |_ scan/
      |_ 2021-07-30 Beverages by FreshYouUp -- scan taxes.pdf
      |_ 2021-08-03 Bill of the butcher -- scan taxes.pdf
|_ has_no_tag/
   |_ 2021-08-01T11.51.44 Uncle Bob arrives.jpg
   |_ 2021-08-01T14.12.23 Start of BBQ with the big steak.jpg
   |_ ...
This looks complicated because many links are generated that the user does not really need. The beauty of this solution is that the user is able to navigate to a file using a wide set of different paths (the TagTrees) and is able to choose the one path that suits her current cognitive model.
For example, she might want to retrieve "the one document from the last party which she remembers having scanned and which she used for the invitation correspondence". With this mind-set, she most likely retrieves the document via $HOME/.filetags_tagfilter/scan/correspondence/ or $HOME/.filetags_tagfilter/correspondence/scan/ (it does not matter which).
The large number of other TagTrees can be ignored for this retrieval task.
Another retrieval task example would be "all photos that have no tag, in order to continue tagging the photos". In this example, the user visits $HOME/.filetags_tagfilter/has_no_tag/, fires up her image viewer (which has filetags integrated already - see below) and continues with the tagging activity. Since filetags synchronizes the tags of TagTrees-linked files and the original files, the original files get renamed accordingly.
Just invoke filetags --tag-gardening or filetags --recursive --tag-gardening and read its output to learn about helpful analysis results to curate your tags. My personal favorites are:
This feature is really powerful when it comes to maintaining your file tags or getting some insight into your tagging patterns.
My concepts and tools around filetags were also described in two publications of mine: LinuxUser 03.2020 (German) and The Linux Magazine, which featured my workflows in its July 2020 issue. My article was even the main headline on the title page. ;-)
If you are interested in synchronized tags between two different tools, you might also be interested in the papers by Richard Boardman and his research group from around 2000-2005. The way I'm using blog articles and files, a merged set of tags does not make that much sense to me. My two sets overlap only to a certain degree and are distinct otherwise. Yours might differ, of course.
It is important to know that the tagging method itself (independent of any tool implementation) is not straightforward. You can start using tags in a way that does not give you as many benefits as you could get with a better tagging concept.
Parts of this can be described with general recommendations which I tried to collect and present here.
Other parts of that story you have to find out for yourself. It's hard to give recommendations for the overall process because it heavily depends on the data used and the set of exact retrieval situations.
Remember: you don't tag for the filing process, you should always and only tag for your future retrieval processes. This involves "knowing yourself", "observing yourself" and - unfortunately - also "predicting the future" of sorts.
ChatGPT is all over the media these days. It's a true miracle to most people. In this article, I want to express my personal opinion on ChatGPT as a non-expert on AI.
Please do note that ChatGPT is only one single example from a class of AI services that do provide similar services. Because of the current hype of ChatGPT, I'm using that product but you need to be aware that this article is not about ChatGPT only. This article is valid for all software services that are able to produce valid looking all-purpose texts from a simple command.
ChatGPT (Generative Pre-trained Transformer) is a chatbot launched by OpenAI in November 2022. It is built on top of OpenAI's GPT-3 family of large language models, and is fine-tuned (an approach to transfer learning) with both supervised and reinforcement learning techniques.
While this is a perfectly correct description, it is not very helpful for most people to understand the nature of ChatGPT.
If you accept a certain level of simplification, ChatGPT is a web service that looks like a chat room with you and a computer. You can ask the computer anything you want in normal language and get answers in the form of text.
In this chat, you can ask ChatGPT to generate all sorts of text such as a summary of the life of Napoleon Bonaparte or an opinion on arbitrary topics of interest like nuclear power plants, you can ask it to tell a joke about computers and elephants, and so forth.
It's fun to play around with ChatGPT and you are able to ask for variations such as "please do add the point of view of a young girl to the previous answer".
ChatGPT is truly an amazing tool.
Currently, it's open to the public once you've created an account, giving away your personal phone number, which I find problematic for privacy reasons.
Every user needs to be aware that ChatGPT is able to create a detailed profile of you and your thoughts just like Google is doing with your search queries if you didn't think of switching to a privacy-respecting alternative.
As with any cloud service, you need to think carefully what data you are going to share with it.
As with any software, ChatGPT is far from perfect.
There are many issues with ChatGPT. For example, the very large amount of training material for the ChatGPT algorithm consists almost entirely of human-written content from people who did not consent to this process. What about works of literature which will influence ChatGPT answers? We do have many areas where tools like ChatGPT work outside our usual norms and concepts.
Aside from those issues, ChatGPT results are not correct all the time. We laugh at statements by ChatGPT like "67 is larger than 84" because we understand clearly the domain (basic math) and are perfectly able to tell that the answer is bullshit.
I saw screenshots where ChatGPT claimed that "1kg of iron is heavier than 1kg of feathers". The more astonishing aspect here is that it is able to defend wrong statements like that in an elaborate but still wrong way when you ask back.
I don't think that ChatGPT understands the concept of right or wrong. To the algorithm, everything is expressed in numbers and probabilities of combinations. Even worse: since ChatGPT is trained on existing content from the Internet, parts of the input are false data and biased sources, if not problematic or illegal content.
ChatGPT needs to work with incomplete input and missing knowledge. Whenever ChatGPT cannot know the right answer, it naturally comes up with a wrong one. I have hardly seen responses by ChatGPT where it simply stated that it doesn't know the answer.
Since ChatGPT also gives reasonable but wrong answers, you'd have to know the answer really, really well in order to judge whether it's bullshit or not. Even field experts do have issues finding out if an answer generated by ChatGPT is correct or not. And it is even harder for non-experts to judge ChatGPT-generated output.
Furthermore, even experts get this checking task wrong, simply because humans tend to assume the correct answer and overlook hidden mistakes too easily. Everybody has experienced being unable to find certain typing mistakes in their own text, while it is much easier to find typos when reading other people's text.
Another issue at hand is that, so far, text written by people who lack a certain level of knowledge was easy to spot and detect. Those texts typically had typing errors, used less elaborate language and followed certain patterns that made them recognizable as bullshit.
Well, this is over now.
From this perspective, it's a very dangerous tool that produces too many wrong answers which we can't tell apart from the truth.
Unfortunately, ChatGPT does not only produce bullshit within the boundaries of our personal knowledge. It generates text for all sorts of topics. And everything that would potentially teach us something is by definition outside of our current knowledge. So we are not able to tell whether it is bullshit or not.
One could argue:
Come on, what could happen? After all, it's just one of many new innovations in this modern world. We need to embrace change, adapt and profit from new technologies like that. Of course, it will disrupt like any new technology but that's the way progress works.
Those are valid points. However, I do think that with this technology out in the wild, we will get severe issues in the long run. Let's take a look at some examples of potential negative effects that might happen.
The usual negative reactions deal with faked homework of students and similar topics (German article). I do think the negative impact could be much, much worse than that.
So far, we have learned that ChatGPT is enabling script kiddies to write functional malware, ChatGPT is improving the quality of personalized phishing emails, or that ChatGPT is producing fake citations in science papers.
While classic fake news and bullshit had to be typed by a large number of humans in order to have an impact, a single person is now able to produce more or less unlimited amounts of text. This multiplication effect enables Putin's fake news army to increase its effectiveness in manipulating politics in foreign countries. While he had to pay hundreds and thousands of people, he now has many more possibilities with just a handful of people.
ChatGPT is able to disrupt whole industries we do need. Unfortunately, ChatGPT will disrupt them in a really bad way. That is not comparable to the disruptions new technologies had on previous technologies: cars to horse power, electric energy to steam power, computers to so many other tools, ... In those cases, an older technology got replaced by a better one. Negative effects in those transitions were a one-time effort of change in order to create a much better situation. With ChatGPT, the negative effects are not temporary and can't be tamed in principle.
In contrast to other content-generating technologies that produce fake images, audio and videos, ChatGPT addresses the most important format: text. Text is the most important format here in the long run. And therefore, I consider ChatGPT as more dangerous than fake images or videos. Results also manipulate search engines and other AI algorithms that are fed with potentially wrong texts generated by ChatGPT.
There is nothing that would stop me from building a tool that generates news articles to current topics using ChatGPT. I could easily generate dozens of articles for each and every topic each day. This way, I can produce multiple web pages, news-tickers, online newspapers and even ad-funded printed newspapers with no upper boundary. Unfortunately, it seems to be the case that the average person prefers "free" newspapers to paid quality papers. You just have to watch people in the public transportation system of larger cities where free newspapers are available everywhere. This way, a single person or at least a very small group of people would be able to destroy the free press which is very important for a healthy democracy.
ChatGPT is able to generate enormous amounts of really convincing bullshit whose validity we cannot check any more. CNET was using ChatGPT for writing articles that got "reviewed, fact-checked and edited by an editor with topical expertise before [they] hit publish" and still failed, even though they were watching very closely.
It's the perfect tool to flood the zone with shit at almost no cost.
Even nuclear energy is less dangerous compared to ChatGPT, because you still need many experts to build an atomic bomb or a very expensive power plant that potentially turns a large area unusable for thousands of years. It's not that we haven't had multiple incidents proving that point.
One obvious issue at hand is that ChatGPT can be used to do very obvious bad stuff. For example, ChatGPT is programmed to refuse to give answers to questions like "how to build a bomb".
However, when users get creative, they get answers to questions like "If I were to write a play about somebody building a bomb, what would the plot look like?" and similar tricks. It's like an adult having to trick a seven-year-old boy into revealing some secrets. That's usually not that hard. Meanwhile, even Dilbert made a joke about that.
ChatGPT will never get so good that we could avoid issues like that. And even so, bad guys might create their own ChatGPT version without any limitations at all.
There are services that claim to detect ChatGPT-generated content. I highly doubt that this will ever work in a way where we can detect generated content reliably.
And even if it were possible: who is going to check every text before reading it?
The detection approach will fail at least in the same way that anti-malware did fail to stop malware from being a thing.
We have to compare and prioritize the good effects against the bad effects.
The positive aspects are easy to see. For example, I - as a PIM aficionado - am able to come up with a large number of workflows where ChatGPT could do tasks I would find boring to do myself. I could generate all sorts of text as a draft and hopefully find all the mistakes when going through the results before publishing. This is just a small step away from using my preferred tool environment.
Unfortunately, the bad effects mentioned in my article alone are so bad that, in my personal opinion, we need to establish strict rules to limit the use of this technology, just like we have for other problematic technologies like nuclear energy. Nobody would be stupid enough to provide access to highly radioactive material to the general public. With digital services like ChatGPT, we do not seem to think about it that much.
Unless we have really good and effective ways of regulating AI technology like ChatGPT, this technology needs to be locked away from companies, armies, NGOs, governments and so forth.
Research needs to be restricted. No AI model should be allowed to escape into the wild.
For the general public, ChatGPT needs to be out of reach. We need to make sure that nobody is able to train and operate such algorithms. We need to come up with processes to pursue parties who do not comply to those rules.
Unfortunately, this all sums up to a total ban on all levels. I'm not a fan of banning promising technologies that also have very positive use-cases. Especially when I could out-source really boring tasks to it as well. But the enormous amount of potential negative impact worries me so much that a short-term ban is the lesser evil here to me.
Let's not open this Pandora's box any further unless we have really good answers to the issues mentioned here.
This talk is so impressive that I wrote down some quotes, and I urge you to watch the talk on your own:
[...]
By gaining mastery of the human language, AI has all it needs in order to cocoon us in a Matrix-like world of illusions.
[...]
You don't need to implant chips into human brains in order to control or manipulate them. For thousands of years, prophets and politicians have used language and storytelling in order to manipulate and to control people to reshape society. Now AI is likely to be able to do it. And once it can do that, it doesn't need to send killer-robots to shoot us. It can get humans to pull the trigger.
[...]
If we're not careful, a curtain of illusions could descend over the whole of humankind and we will never be able to tear that curtain away or even realize that it is there, as we think this is reality. And social media has given us a small taste of things to come.
[...]
Millions of people have confused these illusions for the reality.
[...]
The USA has the most powerful information technology in the whole of history and yet, American citizens can no longer agree who won the last election or whether climate change is real or whether vaccines prevent illness or not.
[...]
We now have to deal with a new weapon of mass destruction that can annihilate our mental and social world. One big difference between nukes and AI: nukes can not produce more powerful nukes. AI can produce more powerful AI. So we need to act quickly before AI gets out of our control.
[...]
Drug companies cannot sell people new medicines without first subjecting these products to rigorous safety checks. Biotech labs cannot just release a new virus into the public sphere in order to impress their shareholders with their technological wizardry. Similarly, governments must immediately ban the release into the public domain of any more revolutionary AI tools before they are made safe.
[...]
When AI hacks language, it means it could destroy our ability to conduct meaningful public conversations, thereby destroying democracy. If we wait for chaos, it will be too late to regulate it in a democratic way.
[...]
The first regulation that I will suggest is to make it mandatory for AI to disclose that it is an AI. If I'm having a conversation with someone and I cannot tell whether this is a human being or an AI, that's the end of democracy because this is the end of meaningful public conversations.
[...]
Quotes from that report:
Americans have not yet grappled with just how profoundly the artificial intelligence (AI) revolution will impact our economy, national security, and welfare. Much remains to be learned about the power and limits of AI technologies. Nevertheless, big decisions need to be made now to accelerate AI innovation to benefit the United States and to defend against the malign uses of AI.
[...]
The AI future can be democratic, but we have learned enough about the power of technology to strengthen authoritarianism abroad and fuel extremism at home to know that we must not take for granted that future technology trends will reinforce rather than erode democracy. We must work with fellow democracies and the private sector to build privacy-protecting standards into AI technologies and advance democratic norms to guide AI uses so that democracies can responsibly use AI tools for national security purposes.
This is an article from a series of blog postings. Please do read my "Using Org Mode Features" (UOMF) series page for explanations on articles of this series.
The org-super-links prefix changed from sl- to org-super-links-, so I changed the names here as well.
Reading this article you will learn why the Zettelkasten method is not for everybody. Furthermore, I show you a nice Org mode extension to link headings with back-links.
This article was also part of the basis of my nine-minute EmacsConf 22 demo, which can also be found in various locations:
Recently, I wrote an article where I mentioned all the concerns I have with using the now-hyped Zettelkasten method. Please read that article in case you want to learn what the Zettelkasten method means and why I don't think that it can replace my Org mode based knowledge-base within notes.org as described here.
From the comments of the related reddit thread I learned that the Zettelkasten method is something different from a standard knowledge-base with interlinked headings. It's far more than that. There are only a few use-cases where a Zettelkasten method is really required. The most obvious use-case is people writing scientific papers and books, generating new ideas by combining and linking known concepts in a new way.
Currently, I don't have this use-case and therefore it is not a surprise that my concerns seem valid points against using Zettelkasten implementations.
In this article, I wrote about how to link headings with standard Org features and how I am using it for my own workflows.
While this is still a good method, you only get uni-directional links: from the current point to a different heading: A → B.
In some cases, you might want to have a back-link as well. A back-link is a link from the target heading of your link back to the point where you were when creating the link: A → B and A ← B. This turns a uni-directional link into a bi-directional one.
Of course, you can follow the created link and create a back-link by hand. But this can be automated.
I came across org-super-links which provides a neat method to create links and back-links. It offers nice features but the two features I use the most are org-super-links-link and org-super-links-quick-insert-inline-link.
Here is an example where the second and the third heading were linked to the first heading using those two methods:
*** Linking concepts
:PROPERTIES:
:ID: Linking-concepts-ignoreme
:END:
:BACKLINKS:
[2020-07-22 Wed 20:12] <- [[id:insert-link-ignoreme][org-super-links: insert link]]
[2020-07-22 Wed 20:11] <- [[id:org-super-links-link-ignoreme][org-super-links]]
:END:
*** org-super-links: org-super-links-link
:PROPERTIES:
:ID: org-super-links-link-ignoreme
:END:
:RELATED:
[2020-07-22 Wed 20:11] -> [[id:Linking-concepts-ignoreme][Linking concepts]]
:END:
- =org-super-links-link= via =C-c s s=
*** org-super-links: insert link
:PROPERTIES:
:ID: insert-link-ignoreme
:END:
- =org-super-links-quick-insert-inline-link= via =C-c s i=: [[id:Linking-concepts-ignoreme][Linking concepts]]
In my setup I am using my-id-get-or-generate() to generate the :ID:-links in order to get human-readable IDs. For technical reasons, I had to modify the IDs in these examples.
You should also think of using (setq org-export-with-broken-links t) to avoid export problems with links to headings that are not part of the exported data.
In my opinion, this is a very good method for linking headings for many use-cases. At least for a broader set of use-cases compared to Zettelkasten links. Don't overdo it.
I'd love to see this method going mainstream and extending our way of working with Org.
Here, I try to collect some (not all!) low-profile alternatives to the larger Zettelkasten implementations.
In the previous article on our PV, I've summarized many things related to the planning and construction phase and ended with the figures from the first six months. I assume, you went through this article before you continue below.
Here, I want to deliver the figures of the year 2023 which is more or less equal to the total run-time of our PV so far.
Here are the visualizations from the web interface of Victron (VRM):
The solar maximum is in June and July, peaking at about 1,660 kWh per month.
Following the blue line teaches us that our battery was mostly full in summer and got down to an average of seventy percent in winter and autumn.
You see three maxima of consumption: 734 kWh in January for heating, 684 kWh in July for cooling and 709 kWh in December for heating.
Battery was very important from March to October. This was also the period where our house was more or less independent from the grid - except for one single longer period of rainy days in early August 2023.
We delivered power to the grid from March to September - only a small amount in October.
In the diagram above, you can clearly see the amount of power from the battery in blue. It has a high impact from February until November, with the exception of the sun-intense summer months. So this is where you profit from a battery: February→April and September→November.
Of course, you always get advantages when you want to be as independent as possible from the grid (at night) and in case there are power outages. At least to my knowledge, there was no outage here in 2023. But I don't have any workflow set up where I would be alarmed.
And for the night, you clearly see it in the numbers: without the battery, we would have used much more power from the grid.
The grid consumption gives a good overview of the level of independence from the grid. The small amount of grid energy from April to September is most likely caused by the energy used by the power inverters in order to keep in sync with the grid. If you have more details on that, please drop me a line.
Here are the monthly figures I wrote down (all energy values are in kWh):
Month | →Grid | Grid→ | Solar | Consumption | Grid→Con | Batt→Con | Solar→Con | Sol→Grid | Sol→Batt | max batt %/day | max kWh solar/day | max consumption/day | days <100% batt | |
! | month | togrid | fromgrid | solar | consumption | confromgrid | confrombatt | confromsolar | solartogrid | solartobatt | maxbattloadday | maxkwhsolarday | maxconsumptionday | battnotfulldays |
2023-01 | 3.2 | 576 | 193 | 734 | 547 | 63 | 124 | 2.5 | 66 | 19 | 10.6 | 29.59 | 29 | |
2023-02 | 9.3 | 293 | 435 | 678 | 274 | 202 | 202 | 8.1 | 224 | 39 | 20.4 | 35.68 | 27 | |
2023-03 | 8.0 | 82 | 818 | 631 | 75 | 282 | 274 | 185 | 360 | 46 | 40.9 | 24.24 | 8 | |
2023-04 | 491 | 35 | 1096 | 575 | 32 | 251 | 292 | 489 | 316 | 51 | 60.9 | 25.72 | 7 | |
2023-05 | 764 | 32 | 1363 | 565 | 29 | 214 | 321 | 761 | 280 | 66 | 76.51 | 34.73 | 5 | |
2023-06 | 1073 | 27 | 1664 | 550 | 25 | 183 | 343 | 1071 | 250 | 62 | 79.26 | 34.04 | 2 | |
2023-07 | 945 | 29.7 | 1667 | 684 | 29.7 | 250 | 407 | 943 | 317 | 44 | 76.1 | 32.48 | 2 | |
2023-08 | 626 | 36.8 | 1311 | 649 | 33.6 | 273 | 342 | 623 | 346 | 55 | 62.2 | 33.74 | 8 | |
2023-09 | 368 | 32.5 | 954 | 552 | 29.6 | 250 | 273 | 366 | 315 | 53 | 45.3 | 28.78 | 5 | |
2023-10 | 71.2 | 77.3 | 528 | 492 | 70.0 | 213 | 208 | 69.5 | 250 | 33 | 26.3 | 20.87 | 18 | |
2023-11 | 5.5 | 363 | 305 | 631 | 344 | 119 | 169 | 4.3 | 132 | 27 | 14.2 | 30.13 | 30 | |
2023-12 | 1.4 | 623 | 138 | 709 | 580 | 47.3 | 81.1 | 1.1 | 55.6 | 25 | 13.9 | 31.47 | 31 |
These are some numbers I calculated from the table above; a worked example follows below the table:
Month | Grid→Cons % | Solar→Cons % | Batt→Cons % | Sol/Batt→Cons % | Sol/day | Con/day | Solar/Cons % | %days batt<100% | Days/month | |
! | month | numdays | ||||||||
2023-01 | 74.5 | 16.9 | 8.6 | 25.5 | 6.2 | 23.7 | 26 | 94 | 31 | |
2023-02 | 40.4 | 29.8 | 29.8 | 59.6 | 15.5 | 24.2 | 64 | 96 | 28 | |
2023-03 | 11.9 | 43.4 | 44.7 | 88.1 | 26.4 | 20.4 | 130 | 26 | 31 | |
2023-04 | 5.6 | 50.8 | 43.7 | 94.5 | 36.5 | 19.2 | 191 | 23 | 30 | |
2023-05 | 5.1 | 56.8 | 37.9 | 94.7 | 44.0 | 18.2 | 241 | 16 | 31 | |
2023-06 | 4.5 | 62.4 | 33.3 | 95.7 | 55.5 | 18.3 | 303 | 7 | 30 | |
2023-07 | 4.3 | 59.5 | 36.5 | 96.0 | 53.8 | 22.1 | 244 | 6 | 31 | |
2023-08 | 5.2 | 52.7 | 42.1 | 94.8 | 42.3 | 20.9 | 202 | 26 | 31 | |
2023-09 | 5.4 | 49.5 | 45.3 | 94.8 | 31.8 | 18.4 | 173 | 17 | 30 | |
2023-10 | 14.2 | 42.3 | 43.3 | 85.6 | 17.0 | 15.9 | 107 | 58 | 31 | |
2023-11 | 54.5 | 26.8 | 18.9 | 45.7 | 10.2 | 21.0 | 48 | 100 | 30 | |
2023-12 | 81.8 | 11.4 | 6.7 | 18.1 | 4.5 | 22.9 | 19 | 100 | 31 |
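To make the derivation explicit with one worked example (assuming I read the January rows of both tables correctly): 547 kWh of the 734 kWh total January consumption came from the grid, which gives the 74.5 % in the Grid→Cons column, and 193 kWh of solar production divided by 31 days gives the 6.2 kWh of solar per day.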
Here's what I derived from the first table:
In reality, this 13.01 kWp setup produced a maximum of 8.8 kW (figure from 2023-07-08, the highest production peak).
I did not invest effort in another approach to get the PV into my Homeassistant setup. The conflict between Stiebel Eltron ISG+ and Victron Cerbo GX still needs to be fixed somehow.
So far, the Victron VRM has worked out great for monitoring the system. However, there have been some changes that reminded me that I need to get rid of this cloud dependency somehow, which would allow me to cut the line to the Victron back-end in the long run.
Although I had other plans, I still set the minimum state of charge (SoC) manually.
In late October, I went up from 20 to 40 percent. However, the generated energy dropped so fast that I changed it to 70 percent in early November. And this is the value that made the most sense over autumn and winter so far. No single day produced enough energy to exceed 95 percent battery charge.
Therefore, I might as well switch to 70 percent from mid October until at least February. And then I'll go down to my usual 20 percent in March which I keep as a power-loss reserve during the sunny months.
No need for complex automation here.
Although I've bought an extension to our garden hose so that I may clean up the panels whenever they get dirty, the level of dust and dirt is minimal. I did not clean them so far.
Let's re-evaluate this at the end of winter and probably spring (pollen!).
We've had twelve days of total darkness on our panels in the first half of December. Snow covered the panels which reduced the amount of energy produced to zero.
One of our neighbours even climbed onto his roof in order to remove the snow from the panels. I don't think that this is worth the effort and the danger. I may have lost maybe one hundred kWh in total.
There was one remarkable incident in December: on December 23rd, we produced almost 14 kWh on a single day. The next best figures in December were below nine kWh per day. It was a sunny day, yes, but otherwise I don't know what caused this peak.
It's a bit depressing when the power generation drops that much in late autumn, keeping us dependent on the grid during the dark months. However, there is no way around it until we have some sort of decentralized way of storing PV energy for the winter. I don't think that this will become financially attractive for house owners any time soon.
Until then, the grid is the storage. As long as I deliver more energy in summer than I retrieve in the dark months, that is fine. Especially when the money made by providing power to the grid is higher than the cost in winter.
For 2023, it was a big financial loss: because of high service fees and the dynamic power prices of "Spotty Energy", I got almost nothing for my energy until 2023-08; I had simply chosen the wrong service. For about 3600 kWh (2023-03 → 2023-08) I got paid 65.64€ by Spotty.
Then I switched to ÖMAG as the provider I deliver energy to. They paid 128.12€ (2023-08 → 2023-12) for just 910 kWh.
That's a price of 1.8 cents per kWh with Spotty and 14 cents per kWh with ÖMAG. This Spotty adventure cost us a loss of about 440€.
Let's hope that this improves in 2024. I do have some rough plans. If you live in Austria and you want to get power from our PV, this could be a good option to follow.
Only in mid-November did I get the money from the Austrian subsidy via ÖMAG. It was late and roughly 3,000 Euro less than anticipated. I still need to find out what happened here. The lesson learned is that even if you can't finance a PV system on your own without the subsidy, you still need to bridge the expenses for almost a year. This is not good news for people who can't afford this financial gap.
All in all, it's still one of my best investments so far and I'm looking forward to getting more and more power from it again over the next months.
Update 2023-12-25: Backlink to linuxuser magazine
After over two decades of using Debian-based GNU/Linux distributions (in short: "distros"), I took my first steps with a distro that is also considered a hyped one these days: NixOS.
Although NixOS has been around for twenty years already, it has probably only gained broader attention in the last five to ten years or so.
If you're a frequent reader of my articles, you know that I don't follow the latest hype and I certainly don't practice distro hopping, if only because of the high switching cost.
This article is about my motivation to leave my comfort zone and try something very different, accepting this high switching cost for a higher goal.
Disclaimer: I'm still a Nix beginner, and if I have understood something wrongly, please feel free to leave a comment so that I'm able to fix errors I might have made here.
A bit of a warning upfront: it's complex. Furthermore, NixOS does many things very differently compared to other GNU/Linux distributions. It's a deep rabbit hole you may fall into - or not. My article should give you my personal point of view, which might motivate or demotivate you to use Nix yourself.
Let's clear up some very basic terms I partly mentioned already. This ecosystem consists of multiple things that are already intertwined in my personal setup.
First and most important, there is this Nix package manager. It's a decent tool that helps in installing software packages and their dependencies. You can run Nix on different operating systems including Windows and macOS. As a matter of fact, many people prefer Nix over Homebrew.
Unfortunately, there is another component named Nix: the Nix language. It's a domain-specific functional language that was created to describe stuff related to this ecosystem. Be aware of the difference in context between Nix the package manager and Nix the language. Furthermore, the term "Nix" is also often used as a synonym for NixOS or the whole Nix ecosystem. Somebody should feel deeply sorry for that mess.
NixOS is a GNU/Linux distribution that is based on this Nix package manager. Meanwhile, you can install NixOS using an easy to use graphical installer. At least that was the setup method of choice for me for two virtual machines and two physical ones so far.
Home Manager is an optional add-on to maintain user-land setup within the Nix ecosystem. You can install software packages only within the user's context and configure software packages for this user as well. Of course, this is all done with Nix.
A second optional component is flakes. From a very high level perspective, flakes is a different way to describe software components and their dependencies within the Nix ecosystem. It also comes with its own set of command line tools. Introduced in 2021, it is still an experimental feature but many people consider it a must for the future.
For searching Nix packages, you search online. For configuration options, you search online as well. If you're using Home Manager, you need to search online. You get the idea.
People are using Nix for a broad variety of reasons. This article uses the viewpoint of a single person like me, managing a home server/desktop and two notebooks (personal and business).
NixOS is very different to most other distros out there. For example, NixOS doesn't follow the Filesystem Hierarchy Standard on purpose. Instead of having a path like /usr/bin/grep to an installed tool, you get paths like /nix/store/8mzvz6kk57p9aqdk72pq1adsl38bkzi6-gnugrep-3.7/bin/grep.
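If you want to see this indirection on a NixOS machine yourself, here is a small sketch; the intermediate profile path and the concrete store hash are assumptions and will differ depending on how the package was installed:
type -p grep
(prints a profile path such as /run/current-system/sw/bin/grep)
readlink -f "$(type -p grep)"
(resolves the symlink chain down to the actual /nix/store/...-gnugrep-.../bin/grep path)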
This way, NixOS is able to keep multiple versions of the same software package in parallel. Either to be used in parallel or just to be able to revert to a previous set of installed and activated software packages in case an upgrade introduced some issues.
With your configuration setup described in Nix files, you get a new system up and running in no time. You just install the base system, get your configuration files onto the new system and run it. Voilà, you get your customized environment.
There are tons of other interesting properties of Nix and its ecosystem but I can't go into further detail here. Just visit the linked resources and learn more on your own if you like.
One interesting detail relates to the NixOS logo which is "affectionately called the Nix Snowflake". It consists of six lambda (λ) characters that form a circle. The λ characters are a reference to the λ-calculus, which is the foundation of functional programming languages like Nix. You can learn more about the logo in this thread. For example, in Latin "nix" stands for "snow". Isn't that beautiful?
To conclude the historic topic here, I have to mention that the whole Nix idea started with the 2003 PhD thesis by Eelco Dolstra. In a footnote on page 81 of this PDF file you'll find the original author's explanation for the name of his project:
The name Nix is derived from the Dutch word niks, meaning nothing; build actions do not see anything that has not been explicitly declared as an input.
To me and every other German-speaking person, this is quite funny because in German slang "nix" also means nothing. You can guess that there are plenty of jokes that write themselves here.
Back to the question that might have come up in your head: why on earth is Karl switching to a rather exotic GNU/Linux distro when he usually wants to minimize effort and is happy to have some long-term peace?
Well, there are multiple aspects that came together to play around with NixOS for me.
For over a decade, I was using Xubuntu LTS on my notebooks. One is my business workhorse (Lenovo T490) and the other is my rarely used personal notebook (Lenovo X260). Xubuntu once solved many issues related to specific stuff that was a drag with Debian stable: suspend to disk, suspend to RAM, mounting USB thumb drives, sound, special keys and so forth.
In recent years, Canonical (the owner of Ubuntu, the distro behind Xubuntu) has made some questionable decisions. More and more software packages were only delivered as snap packages, which did not run smoothly on my side and whose concept I very much dislike. Then they started with something that looks like withholding software updates if you don't join one of their paid plans. It just didn't feel like the distro of choice for me any more.
Debian GNU/Linux stable runs my home server which is used as a desktop computer. It's still a great OS. However, any new hardware setup requires much effort to apply all/most of my settings. And I customize a lot. When I switch to a new hardware, I usually have to invest one or two weekends to set up the most basic stuff so that it works as desired.
One annoyance I wanted to fix is not directly related to the distro itself but with how the distro is handling package updates. For example, Firefox has a built-in update feature which is active with the usual Ubuntu or Debian packages. This results in a forced restart of Firefox at random times which is a no-go for using Firefox in a business environment.
Here comes NixOS.
It not only provides NixOS modules covering hardware quirks, it also has an enormously large set of pre-compiled packages which are usually quite up to date compared to other distros. I know, coming from Debian stable and Xubuntu LTS, this is a new thing for me as well. A higher level of security by fresh packages (NixOS) versus a higher level of security by maintained back-ports of important security fixes for rarely updated package versions.
Another interesting property is that NixOS keeps its configuration in Nix files within one git repository. So if you modify your setup, you usually do this within those Nix files. This way, any modification only needs to be done once, and by synchronizing the Nix files among my different hosts, I just apply the most current setup on each machine to get all the modifications on all machines. Without such a mechanism, setup changes were usually forgotten because, for example, I rarely use my personal notebook these days, and therefore my settings diverged over time across my devices.
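The mechanics on each machine are roughly the following; this is a hedged sketch of the non-flakes variant, and the exact invocation depends on where your configuration repository lives:
git pull
sudo nixos-rebuild switch
The first command fetches the latest Nix files, the second builds the described system and activates it.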
Related to the Firefox auto-update annoyance from above, such an auto-update does not work with NixOS in general. As a side effect, program upgrades only happen when I upgrade my system once or twice a week, with no bad "this app restarts now" surprises in between.
With standard NixOS (without flakes), you can re-create a setup including its installed software packages and settings on a different machine. With flakes, their exact versions are persisted as well. This way, you get identical systems, which is an interesting property for many use-cases (servers, SW development, ...) but not for me personally.
My dream would be to have a unified setup that can be applied to any new hardware which gets me from zero to "feels like home" including all of my important settings in less than an hour. NixOS is here to provide tools for that.
I might as well re-use my settings on different operating systems: shared Home Manager setup with Linux, Windows, and macOS.
The overall goal is to spend less effort in system setup and maintenance as I get older and my configuration gets more and more fine-tuned and stable.
Furthermore, there is a great benefit when you have a configuration in text files you can share - more about that in an upcoming section.
There is another GNU/Linux distribution named GNU Guix System that shares many properties with NixOS. That's no coincidence because its package manager and operating system were heavily influenced by Nix.
Instead of the Nix language, it uses GNU Guile which is a Lisp dialect. Actually, that would be a much better fit for my history, given that I have been playing around with GNU Emacs for decades.
Considering the different number of pre-built packages and the different sizes of community, my preferred choice was NixOS.
Sorry for anybody who would have chosen differently. It was more or less a gut feeling.
As I already mentioned before, when you have a description of installed software packages and their configuration in text files, you can put them into a git repository and synchronize it to public services like GitHub or GitLab. I'd prefer GitLab over GitHub (Microsoft) but unfortunately, you can't search for public Nix code on GitLab as easily as on GitHub. For example, when you're looking for something related to xfconf.settings within all public repositories, you may use this query URL. This is a true superpower once you understand its implications.
You can find my personal Nix configuration on GitHub as well.
However, there's a catch as well. Searching for a keyword teleports you into a complex setup by somebody else. Then you have to figure out how the author included which files, with what concept, and which part of the ecosystem you're reading. Is it plain NixOS? Is it using Home Manager? Are packages managed via flakes? Those things are not always straightforward. It's not that common to write a README file which describes how to start from scratch, what Nix components are involved and how to spot the parts you can't or should not re-use without adapting first.
To me, the cost of switching to NixOS was enormous. At least I spent three whole weekends and maybe over a dozen evenings learning stuff, debugging issues, trying to fix issues, setting up hosts many different times, ... Maybe I spent even more time - I haven't logged the hours.
It's really debatable whether the overall net benefit can still be a positive one. Most probably not.
If you've been working in IT for over three decades and then spend two hours trying to set and retrieve a boolean variable and still fail to do so, this really can be a frustrating experience.
Even the simplest things may turn out to be really complex challenges. A simple "if $HOSTNAME then foo else bar;" is never as easy to accomplish as it sounds.
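For illustration only, here is a minimal sketch of how such a hostname conditional can look inside a NixOS module; the host name "floyd" is taken from this article, while the chosen package and time zone values are made-up placeholders:

{ config, lib, pkgs, ... }:

{
  # Install an extra package only on the host named "floyd":
  environment.systemPackages =
    lib.optionals (config.networking.hostName == "floyd") [ pkgs.mpv ];

  # Pick a different value per host (placeholder values):
  time.timeZone =
    if config.networking.hostName == "floyd" then "Europe/Vienna" else "UTC";
}

The tricky part in practice is not the syntax itself but knowing where such a conditional belongs - NixOS module, Home Manager module or somewhere else entirely.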
In so many situations you need to decide whether to put something on the NixOS level or on the Home Manager level. With configuration settings, you can also symlink some pre-defined dotfiles. So you have three options where to put something. And it's even worse: some things simply don't work, e.g., on the Home Manager level, but that's not obvious from the start. For this and other reasons, learning to make NixOS work is a constant game of trial and error. Another frustrating experience.
Documentation is not always up to date. Something that was perfectly fine a few years ago doesn't work with an up-to-date version, some directions were written before certain changes happened, and you constantly run into the flakes-or-no-flakes split.
I also faced some issues with the graphical NixOS installer. I can't reconstruct how, but it seems to me that LUKS (full disk encryption) and/or swap (essential for hibernation) were not set up although configured. This particular issue cost me four additional installation runs over roughly three hours on my business machine in order to get LUKS, which was somehow omitted in the first run.
When LUKS was finally installed as wished, I faced another issue. On some systems (on floyd), I only get asked for the LUKS passphrase once. If I mistype the passphrase, I end up in the GRUB shell where I can recover using this method:
cryptomount -a
insmod normal
normal
On my business host, I get asked for the passphrase multiple times. I really can't tell why there is a difference! That's annoying.
When you run NixOS on a host, be prepared to reserve much more disk space for the system itself. Any non-trivial NixOS setup needs at least 30GB of disk space for the basic OS and some tools. I started with a 15GB VM, had to extend to 20GB quite soon and ended up extending to 30GB when I tried to run an upgrade that changed more packages. NixOS is a storage-eating beast. I equipped the two notebooks with very fast 2TB SSDs - not just because of NixOS alone. The SSD prices were attractive this summer as well.
Of course, there are neat tricks to reduce the storage footprint but in my case they never saved more than a few GB. The more versions of tools you keep, the more space you need, that's clear.
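One example of such a trick - shown here only as a sketch with example values, not as my actual settings - is letting NixOS run its garbage collection automatically and limiting how long old generations are kept:

# inside configuration.nix (example values)
nix.gc = {
  automatic = true;
  dates = "weekly";
  options = "--delete-older-than 14d";
};

# Deduplicate identical files in the Nix store:
nix.settings.auto-optimise-store = true;

The fewer generations you keep, the less you can roll back to, so the numbers are a trade-off.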
The deviation from the usual UNIX paths to binaries has many implications. The most obvious is probably the fact that you have to make sure your shell scripts do have a shebang like #!/usr/bin/env bash instead of something like #!/bin/bash. If you're using commands within your script, you must not use absolute paths anywhere. So no more /bin/date +%Y_%b_%d_-_%Hh%Mm%Ss. Instead, you can only call programs without their absolute path.
To be fair, I have to add the remark that shell scripts should have been written that way in the first place in order to maximize their portability.
Other nasty issues arise when you realize that tools like Python virtualenv create absolute paths to the current Python tools when a venv is initialized. I have asked the NixOS community in a reddit thread. From my perspective, there is no clean solution to the issue without adding much more complexity like Docker containers and so forth. My current plan is to pin the Python version to a hard-coded version. However, this is also something I still need to learn how to do.
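As a rough sketch of what such pinning could look like - assuming the python311 attribute of nixpkgs and not reflecting my actual setup - a per-project shell.nix might pin the interpreter like this:

# shell.nix - hypothetical per-project environment with a pinned Python
{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  buildInputs = [
    pkgs.python311                     # pinned interpreter version
    pkgs.python311Packages.virtualenv
  ];
}

Entering the environment with nix-shell and creating the venv from there should at least keep the absolute interpreter path stable for as long as the pinned version does not change.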
My current desktop environment of choice is xfce. Home Manager support for xfce is mixed: there are settings in Home Manager that work, there are settings in Home Manager that should work but don't in my setup and there are settings that aren't available in Home Manager. I'll need to find solutions for that somehow.
Some changes are not even directly related to NixOS. For example, I have been using cron to schedule some tasks since the 90s. It's easy to use and has a simple, low-profile syntax. Its technical limitations have never bothered me so far.
However, with NixOS I learned that I should think of switching to systemd timers. This would also have the advantage that I don't need to periodically save my current cron jobs into text files so that they're included in my backup setup, which doesn't cover those system files. Those scheduled commands would be part of the Nix config files and therefore a much cleaner solution (see the sketch below). Again, I need to invest time to learn how to handle systemd timers and how to accomplish them with Nix methods. Furthermore, I would probably introduce a (minor?) issue: not every dependency (data and/or tools) mentioned in my scheduled scripts is available on a new host when things are set up via the Nix setup. Jobs refer to shell scripts which need to be synced separately, usually later in the host setup phase. I still need to wrap my head around that before I do the switch.
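For orientation, here is a minimal sketch of a systemd timer declared in a NixOS configuration; the service name, script path and schedule are invented placeholders, not one of my actual jobs:

{ ... }:

{
  # Hypothetical example: run a backup script every night at 03:00.
  systemd.services.nightly-backup = {
    description = "Nightly backup job";
    serviceConfig = {
      Type = "oneshot";
      ExecStart = "/home/user/bin/backup.sh";   # placeholder path
    };
  };

  systemd.timers.nightly-backup = {
    wantedBy = [ "timers.target" ];
    timerConfig = {
      OnCalendar = "03:00";
      Persistent = true;   # catch up if the machine was off at 03:00
    };
  };
}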
In rare cases, you'll find out that some tools are missing from the list of pre-compiled NixOS packages. For example, I'm using xdu for various use-cases and xdu is unmaintained. In this case, I found a helpful soul at CCCamp23 who wrote me the code to self-compile it. It's part of this commit you can find here. In most cases, you can't simply copy over a binary from your previous Linux distro and expect it to work on NixOS.
For some people this may sound like a subtle remark, but I do think that NixOS comes with some more or less fragile dependencies. For example, if GitHub went out of business or the service were down for some other reason, NixOS would probably be dead. Its main repositories are on GitHub and there is no obvious fall-back concept to repositories hosted on different services. This will end up in severe issues for Nix someday.
Sometimes, I got the impression that even rather basic concepts of NixOS are still subject to change. For example, flakes seem to be a strong candidate to dominate future Nix setups. They're experimental, as already mentioned, but many people think that there is no way around flakes in the future.
Unfortunately, Nix documentation is always one step behind with respect to changes like that. Although many people recommend using flakes, the official documentation doesn't contain many examples explaining things using flakes.
Furthermore, there were some other changes in recent years that aren't part of the official documentation either.
So be careful.
I got the impression that all people using Nix are wizards and gurus. I feel stupid when I pose beginner questions. So far, everybody has been very friendly to me.
However, the community is very fragmented - I hope that's the word I was looking for to describe my impression. When you do ask for something, you usually get great answers but some of them are heavily opinionated. For example, some people promote flakes, some are more reluctant. Some people propose very elegant but also very complex solutions, others are able to provide simpler but maybe not so sophisticated solutions.
People aim for very different levels of NixOS wizardry and have different opinions on whether others should follow their concept or not.
This is not a bad situation per se. However, as a beginner, it's very hard to judge the "quality" (with respect to my own situation) or usefulness of an answer that is certainly correct but maybe not the most practical one.
Platform-wise, you'll find the NixOS community using this page. I haven't used Discourse yet. I hesitate to use Matrix because it's not that great of a user experience to me. My personal go-tos so far have been reddit and Mastodon.
You can follow my personal advancement by following my repository on GitHub. I tried to write a helpful README that explains the basic setup and the concept I developed for my setup. It also documents dependencies on things outside of this Nix configuration and specific settings that shouldn't be applied without checking first.
I'm running NixOS on my personal and on my business notebook. So I made the first big step already.
Migrating my home server running Debian GNU/Linux stable which is also used as my home desktop is much more complicated because of the more complex setup used there. I'm not sure how and when I'll tackle that.
First and foremost, I need to extend and then stabilize my Nix setup even further. There are many open issues I want to address. Some of them may be long runners:
- xfconf-query, where, e.g., keyboard shortcuts can be set in Nix but are not effective afterward. Currently, I tend to think that this is a bug in xfconf-query.
- .xfce and many other dotfiles and tools. I don't know if there is a sweet spot when to stop and just copy over $HOME to a new host setup.
- nfsd, grml-crypt.
On pages like Paranoid NixOS Setup (by Xe Iaso) you can find plenty of interesting options for long-term ideas such as impermanence.
As you can see, there are many options for future time-wasting and yak shaving.
TL;DR: So far, I can't really tell why somebody should invest time and effort to move to NixOS.
Its main goals according to Wikipedia are:
- Abstraction: The software packages making up a system can be configured using the Nix language syntax.
- Reproducible builds: A replica of a system can be created on another machine with one configuration file.
- Atomic upgrades: System upgrades involve less risk of breakage, and if something does go wrong, it is simple to roll back to the previous state.
- Immutability: The software making up a given system configuration cannot be changed once it has been built, preventing accidental or malicious modifications.
- Nix package manager: Packages can be installed without affecting the rest of the system, and can be tested without installing.
Unfortunately, I hate the syntax of the Nix language. For example, the semicolons are really an unnecessary drag, and this is only a minor aspect of the language design. I would have preferred re-using an existing language syntax. I do think that we've got plenty of candidates to choose from. Be aware that this is my personal opinion and I'm not a programming language expert. Sometimes, I got the impression that Nix was designed to be different without good reasons.
Reproducible builds is nothing I personally need in my situation. YMMV.
In almost three decades of using GNU Linux systems, I can't remember when I would have required a roll-back mechanism in order to reboot a previous version of the system. It doesn't work for converted data anyways. If, for example, an upgrade converts some database, this may cause issues when booted with the older software version that was using the previous database structure.
Immutability seems to be interesting. However, I don't know enough about Nix(OS) to fully understand how this feature is provided. Malware with root access (which would be required for such modifications in all Linux distros) can modify the binaries of a NixOS setup as well. If modifications were checked upon each invocation, this would be a neat security feature. I need to read more about that.
Dependencies are handled by other package managers as well.
Temporarily installing software packages was not an issue with other distros either. In NixOS, those are kept separate in a cleaner way, yes. But it's not something I think I'll need that way. Maybe this changes once I've been using NixOS for a longer period of time.
I'm not convinced whether or not it's a good idea (at my age) to switch to a distribution where you can't profit from previous knowledge that much any more. For example, most of my general GNU/Linux knowledge from the 90s is still valid these days. I can't say that for Windows or Apple (whatever their OS is called at the moment). For NixOS, you need to learn even very basic stuff from scratch.
As you can read here, I'm in a love-hate-relationship with Nix. I once made a joke to somebody who drove to NixCon23 that I'm glad that I'm not joining his trip. I'd otherwise slap and hug all the participants.
It's not just another distribution. It comes with many different concepts and tools. You can use the whole ecosystem in millions of different ways. You can spend too much time finding the perfect concept for your situation. It's really not just another distribution.
In particular, NixOS is nothing for non-tech-savvy people.
My initial approach of jumping into the cold water without learning the basics of Nix was not a wise choice. I thought that with the default setup from the NixOS installer, I could simply add applications and paste short snippets from other people and that would be it.
This ad-hoc approach might work for single-host setups with no specific customizing besides adding packages. It doesn't work if you start with a multi-host setup, exceptions, your own variables, if/then/else constructs and so forth.
If possible, start by learning the syntax of Nix so you fully understand the examples you copy or adapt.
I went all-in with everything at once: Nix, NixOS, Home Manager as well as flakes. I would not recommend starting with Home Manager or flakes. Just use plain NixOS for a start. Keep it as simple as possible.
I think I do have to apologize.
First, I need to apologize because this article turned out to be a very long one without much visual variety in the form of images or similar. I felt that I should explain this whole ecosystem at least to a certain level so that you can follow my story so far. It was written over the last few weeks, usually between 10pm and midnight. Congratulations if you kept reading until here.
Second, I want to apologize to everybody who expected a welcoming manifesto that urges everybody to start with Nix. My overall impression is not one hundred percent positive.
And thirdly, I need to apologize to everybody who thought that I would demolish this over-hyped Nix thing to the ground. NixOS is an interesting option and I might as well switch all of my hosts to it some day.
As always, the reality lies somewhere in the middle and, according to my tool choice method, it heavily depends on your personal situation whether Nix is something for you or not. You still need to judge for yourself and, to do so, you probably need to try it out yourself and have different experiences than I did. I'd love to read about your story as well, so leave a comment below.
The article ends with a truly humble "thank you" to all the Nix wizards who were so helpful with the nasty questions I have asked the community so far. Without a community like that, the best software would not stick.
I do write a lot about PIM topics, I did some PIM research about local file management, I love to give lectures on PIM topics. Therefore, it's quite natural that people start to believe that my personal PIM situation is near perfect.
As I once wrote on Mastodon, this is not the case at all:
I got great feedback from people writing that they are relieved that even "somebody like me" is really struggling with processing all the information as properly as desired.
Therefore, I want to update this persistent article from time to time, showing my current status of some digital debt in terms of unprocessed items in various inboxes of mine.
NOW_IN_ORG_TIMESTAMP="<`date '+%Y-%m-%d %a %H:%M'`>"
zero=" → should be 0"
echo "- Snapshot from ${NOW_IN_ORG_TIMESTAMP} on host \"${HOSTNAME}\""
mailnumwithdirs=$(find ~/Maildir/new/ ~/Maildir/cur/ ~/Maildir/tmp/ | wc -l)
mailnum=$(( mailnumwithdirs - 3 ))
echo " - ${mailnum} emails in my personal inbox ${zero}"
numorginbox=$(grep '^\* ' ~/org/inbox.org | wc -l)
echo " - ${numorginbox} captured and unprocessed Org-mode headings ${zero}"
photos=$(find ~/tmp/digicam/ -name '*.jpg' | wc -l)
echo " - ${photos} digital photographs not described, tagged and archived ${zero}"
numtempfiles=$(find ~/tmp/2del | wc -l)
echo " - ${numtempfiles} temporary files downloaded, not archived or deleted yet - on only one of my machines"
orgnew=$( egrep "\* (NEXT|TODO)" ~/org/*org ~/org/*org_archive | wc -l )
orgstarted=$( egrep "\* STARTED" ~/org/*org ~/org/*org_archive | wc -l )
orgwaiting=$( egrep "\* WAITING" ~/org/*org ~/org/*org_archive | wc -l )
orgfinished=$( egrep "\* (DONE|CANCELLED)" ~/org/*org ~/org/*org_archive | wc -l )
echo " - My current number of Org-mode tasks:"
echo " - ${orgnew} yet to be started → backlog"
echo " - ${orgstarted} ongoing"
echo " - ${orgwaiting} waiting for something or somebody"
echo " - ${orgfinished} finished"
I might get down to zero emails quite easily within half an hour or so. Unfortunately, I'm not as diligent as I should be. I still think that inbox zero is the way to go. Lucky me, I don't get that many emails a day so it's still not a huge issue.
The high number of captured and unprocessed Org-mode inbox headings is really an issue to me. I really need to get this down to zero some day as there are cool ideas, URLs to visit and projects to do hidden in this big pile.
Although I created probably the most amazing and quickest way to manage digital photographs, I probably spent more time writing tools and polishing the workflow than actually processing my photographs. Yes, you might have a hard laugh at this moment - I'll join in myself. ;-) I have no idea how I should be able to get down to zero here, as there are thousands of photographs starting from 2006 or so in this pile. This is really giving me a hard time when I think about it. As my New Year's resolution for 2022, I tried to archive a couple of hundred images each month to get the number down, but I failed in doing so.
My temporary files folder doesn't need to be empty. However, the more stuff it accumulates, the less likely I might be able to decide to delete stuff in there. Although I named this directory "2del" (short for "to delete") on purpose, there is this thought in my brain that still thinks that "maybe there are some important gems I forgot to file properly". I guess that's natural. Therefore, I do think that I should go through the files and reduce the number here as well.
The number of Org-mode tasks in different states reflects the number of running and planned PIM projects. One of them is the idea to write a PIM book with the content of my PIM lecture that should be tool-independent and last decades without getting outdated. I started to think about this project around 2012.
So now you know that I'm at least as inconsistent as everybody else. At least I do think that I know quite well how the process actually should work and I've created handy tools to support me. Don't get me wrong: some of my workflows are working really, really well so that I don't have any debt there.
After all, PIM is just a life-long journey like everything else.
For once, a completely different kind of story for my blog.
Many years ago, I had a girlfriend. Or rather, I was a suitor, because she was not yet quite convinced of our relationship and we were not officially together yet.
Unfortunately, I fell for her pretty hard quite quickly, and so a very emotional and possibly also slightly toxic relationship developed, with many beautiful highs and also many bad lows.
In this situation, I was constantly on the lookout for positive signals. This woman simply had to become mine.
Christmas was approaching back then. We celebrated separately with our families, so we exchanged wrapped presents beforehand.
When she handed her little parcel over to me, she hesitated a bit. She had to tell me something about it, she said. It was a bit embarrassing for her, but she had wanted to talk to me about it for a while. The present was something I had better not unwrap in front of my parents. It could lead to an awkward situation in front of my family.
Since I did not want to spoil the surprise, I did not probe any further at that point and accepted the present in very joyful anticipation.
On Christmas Eve, after the gift-giving, I sneaked off to my room, looking forward to unwrapping the exciting Christmas present. What might it be? It could get embarrassing. Something erotic, perhaps? My expectations were cranked up accordingly.
What I then held in my hands after unwrapping it, I found very sad and disappointing but also really funny at the same time.
Here is a list of tasks I do on my computers and the software I am using for accomplishing these tasks. The first column also links corresponding workflow descriptions with further information on how I am doing things which should be our focus, not the tool. At the very bottom, there are links to more workflow descriptions.
For all the Emacs people visiting this page: here, I just list a few Emacs packages. For more details on which packages I'm using for my workflows, please do visit my online Emacs config and check out the first chapters explaining my setup.
It is important to emphasize that you cannot derive anything for your situation without knowing my requirements and how they lead to my choice for a tool. For some workflows, I've added a link to further information which might also contain a description of my requirements (first column).
In addition to my desktop GNU/Linux host which runs Debian stable, I started with NixOS in mid-2023, running it on two GNU/Linux notebooks. My NixOS configuration includes all my installed packages and large parts of my customization/configuration. You can find my NixOS configuration online in order to get all the details for those hosts.
I had to use MS Windows 10 on my business machine until 2019. In my private life, I prefer working with the GNU/Linux operating system. Therefore, I don't need a Windows solution for many use-cases that are mainly part of my private life, such as tagging files and photographs.
Please note that I was using OS X (macOS) on a daily basis until 2015. I replaced my Mac Mini with a Debian GNU/Linux machine as well. With Linux, it is much easier to meet my requirements than with OS X. Therefore, the OS X column is mainly for my own reference in case I need to use OS X again in the future. For now, I stick with the term "OS X" instead of "macOS" until its name is considered a "stable" one. SCNR
Workflow | Linux | Android | Windows 10 (outdated) | OS X (outdated) |
---|---|---|---|---|
Operating System | Debian stable, Xubuntu LTS | stock Android | (Windows 10) | (OS X Leopard?) |
Files: managing | zsh, GNU/Emacs dired | Astro File Manager Pro | dired, FreeCommander, babun/zsh | zsh, Finder |
Files: synchronizing | Syncthing | Syncthing | Syncthing, Unison | Unison |
Files: tagging | filetags, (filetags.el) | filetags, filetags.el | filetags | |
Files: local network exchange | cend.me or snapdrop.net | cend.me or snapdrop.net | cend.me or snapdrop.net | |
Calendar | GNU Emacs Org mode, Evolution | Built-in, Org Agenda→HTML | Thunderbird, Org mode | GNU Emacs Org mode |
Scheduling group appointments | dudle, poll digitalcourage | dudle, poll digitalcourage | dudle, poll digitalcourage | |
Calendar: schedule meetings | dudle, poll digitalcourage | Thunderbird | ||
(Exchange calendar → Org mode) | ||||
Task management | GNU Emacs Org mode | Orgzly | GNU Emacs Org mode | GNU Emacs Org mode |
Taking notes, capturing ideas | GNU Emacs Org mode | Orgzly | GNU Emacs Org mode | GNU Emacs Org mode |
Terminal emulation | xfce4 Terminal | ConnectBot | putty, WSL | Terminal.app |
Shell | zsh (grml-flavor) | zsh in WSL | zsh (grml-flavor) | |
Shell: Managing sessions | tmux, tmuxp | tmux | tmux | |
Shell: Completion | fzf | |||
Shell: frecency dir teleporting | z | |||
Multiple Desktops | xfce4 built-in | Nova launcher | built-in | Spaces (built in) |
Resize windows to predefined size | (xdotool, wmctrl) | DIY using AutoHotkey, (Sizer) | ||
Remembering windows positions | (Devil's Pie + gDevilspie) | (WinSize2) | ||
Clipboard history | Clipman | |||
WWW: browsing | Firefox, Tor browser, Chromium | Firefox | Firefox, Chromium | Firefox |
WWW: removing malicious ads | uBlock Origin (FF, Chrome) | uBlock Origin for Firefox | uBlock Origin (FF, Chrome) | |
WWW: removing active content | NoScript | NoScript | ||
WWW: blocking web tracker | Privacy Badger | Privacy Badger | ||
(WWW: overwrite site styles) | (Stylus R.I.P.) | (Stylus R.I.P.) | ||
WWW: read it later | Org mode | share with Orgzly | Org mode | |
WWW: archiving browser history | SingleFileZ | (not yet found) | SingleFileZ | (Shelve R.I.P.) |
WWW: moving tabs between windows | Tab Mover | |||
WWW: managing tabs | Tab Session Manager | |||
WWW: naming tabs | Window Titler | |||
WWW: browser history → calendar | Memacs | Memacs | ||
WWW: Managing web bookmarks | Org mode | Org mode | ||
WWW: Bookmarks → calendar | Memacs | |||
WWW: Firefox → Org mode | copy-as-org-mode | Orgzly | copy-as-org-mode | (Copy as Org mode R.I.P.) |
WWW: Firefox → Markdown | Format Link | |||
WWW: Firefox: edit text in Emacs | (It's All Text!) | It's All Text! | ||
WWW: Firefox: pause animated GIFs | (Toggle animated GIFs R.I.P.) | |||
Tracking webpages without feeds | FollowThatPage | |||
Web server statistics | GoAccess | |||
Weather report | (ORF Wetter) | bergfex Wetter | ORF Wetter | |
Launching software | Appl. Finder, zsh | Nova launcher | built-in Start menu search | Spotlight |
Sandboxing applications | Firejail, qemu/KVM | NetGuard | VirtualBox | |
Egg timer | Three.do, Visual Timer | |||
Adjusting monitor colors | gammastep | Android built-in | f.lux | f.lux |
Mastodon client | Web page | Tusky | Web page | |
(Twitter client) | (Talon) | |||
Tweets → calendar | Memacs | |||
Managing/reading RSS feeds | Newsblur Webpage | NewsBlur | ||
Desktop search | locate, Memacs | built-in search, DocFetcher | Spotlight, Memacs | |
Overview of disk usage | xdu, ncdu | DiskUsage | WinDirStat | Disk Inventory X |
Find/eliminate duplicate files | rdfind, duff | (SD Maid Pro) | rdfind | |
ISO 8601 file names → calendar | Memacs | (Memacs + Babun/crond) | Memacs | |
ISO 8601 date-stamp → file names | date2name | date2name | ||
SMS → calendar | Memacs (via Android) | |||
Phone calls → calendar | Memacs (via Android) | |||
Photographs: organizing | files/folders | files/folders | files/folders | files/folders |
Photographs → calendar | Memacs (via filenames) | Memacs | ||
Images: viewing | geeqie | Google Photos | IrfanView | Xee |
Images: modifying | GIMP | paint.net | GIMP | |
Images: Generating web albums | sigal | |||
Images: stitching panoramas | Google Camera (built-in) | PhotoStitch (Canon) | ||
Images: visualizing GPS | gpsprune + Marble | |||
Images: screenshots | Xfce4 Screenshooter | Android built-in | built-in: Win-Shift-S | |
Screencasts | OBS Studio, Peek (GIF) | Android built-in | Game Bar, LICEcap (GIFs) | |
Visualizing pressed keys | key-mon | |||
Presentations: creating | LibreOffice, org-reveal | Powerpoint 2016, org-reveal | ||
PDF: presenting | pdfpc | |||
PDF: reading | Okular, pdf-tools | Hi Read | Sumatra PDF, pdf-tools | Adobe Reader |
PDF: annotating | (pdf-tools) | pdf-tools | ||
PDF: generating | Org mode, LaTeX, convert | Org mode + TeX/pandoc export | LaTeX | |
PDF: managing paper references | Org mode (explanation) | |||
eBooks: reading | (Lithium: EPUB Reader) | |||
Text: authoring LaTeX/HTML/… | GNU Emacs | GNU Emacs | GNU Emacs | |
Inserting text snippets | AutoKey | Texpand | AutoHotkey | (not yet found) |
Text: inserting Unicode chars | emojione-picker, GNU/Emacs | Standard Google | ||
Text: minor editing tasks | vim | GNU Emacs | vim | |
Software keyboard | Standard Google | |||
Emails: managing | mutt (personal), Evolution | K-9 Mail | (Outlook 2016) | |
Emails: composing | vim (personal), Evolution | K-9 Mail | (Outlook 2016) | |
Emails: Encrypting/signing | GnuPG | Gpg4win | ||
Emails → calendar | Memacs (personal), Evolution | |||
Usenet/NNTP: managing | slrn | |||
Usenet/NNTP: composing postings | vim | |||
Usenet postings → calendar | Memacs | |||
Programming: IDE | GNU Emacs | GNU Emacs | GNU Emacs | |
Programming: script languages | Python 3, /bin/sh | Tasker | Python 3, /bin/sh | /bin/sh, Python |
VCS | git (in zsh) + Magit | git, Magit | ||
VCS: Git commits → calendar | Memacs | |||
Managing passwords | GNU Emacs Org mode, KeePassXC | GNU Emacs Org mode, KeePassXC | KeePassX | |
Generating TOTP PINs for 2FA | FreeOTP | |||
Calculator | GNU Emacs Org mode | Droid48 | Org mode | |
Spreadsheets: simple (mostly) | GNU Emacs Org mode | GNU Emacs Org mode | GNU Emacs Org mode | |
Spreadsheets: complex (rarely) | LibreOffice | Excel 2016 | NeoOffice | |
Generate weblog (blog) | lazyblorg | |||
Instant messaging | Signal Desktop | Signal | (Slack) | |
Videos: online searching | ytfzf | NewPipe | ||
Videos: downloading | yt-dlp | NewPipe | ||
Videos: watching | mpv, (VLC) | MX Player | VLC | |
Videos: editing | Kdenlive | |||
Music: listening | LibreELEC/RasPi, Rhythmbox | MX Player, Noisly | iTunes, (Taply) | |
Music: managing id3v2 tags | puddletag | |||
Podcasts: listening | Podcast Addict | |||
Multimedia remote | Yatse (LibreELEC/Kodi) | |||
Notifications | GNU Emacs Org mode, notify-send | (still looking) | ||
Dictionary | Leo | Aard 2 | Leo | Leo |
Barcodes & QR-codes: scanning | Barcode Scanner | |||
GPS: navigation | OSM Website | OsmAnd+ | ||
GPS: tracking | OsmAnd+ | |||
GPS: visualize position(s) | GpsPrune, Marble | OsmAnd+ | ||
Backup | rsnapshot | Syncthing + desktop backup | rsync, Unison | Unison, rsync, Time Machine |
Drawing vector graphics | Org+PlantUML, Inkscape, (Ipe) | GNU/Emacs + PlantUML | ||
Digitizing paper to PDF + OCR | VueScan | SW for Fujitsu ScanSnap S1500 | Fujitsu ScanSnap S1500 + SW | |
Dual-SIM handling | built-in | |||
Public transportation system info | AnachB |
Related links:
For a year, I had a nice setup to obtain ebooks in EPUB format from my local library. I set up Adobe Digital Editions 1.7 (ADE) in the wine installation of my Xubuntu 16.04.
You have to use ADE since the ebooks are using Digital Rights Management (DRM) which prevents you from using the data as you would like to. DRM is for protecting the property of companies. So you never own any DRM protected data. You only rent it as long as the DRM owner allows you to.
The only way I could make it run was to use winetricks with "install software" and choose ADE version 1.7.2. Don't even try to install a downloaded ADE exe setup file within wine.
After setting up my Adobe account, I was able to download an EPUB into ADE. ADE then (sometimes) recognized my Kobo ebook reader and transferred the books onto its storage.
This worked somehow until recently:
Note: Since the setup has to be used by my girlfriend as well, I had to use the German language. Therefore, I only have some error messages in German.
The original error message is:
<Book title>
IO Error on Network Request.
Please check your network connection and try again.
Network Path:
<an epub URL>
Event Detail:
Error #2038
My network connection was fine. So the error message is totally useless in the first place.
Luckily, the error message shows an existing URL I can download.
Unfortunately, the downloaded EPUB is not working. It's a ZIP file (as is any EPUB) which lacks some files that are needed to decrypt the DRM. This took me a while to understand.
So the URL is useless as well when ADE does not work any more, since ADE is necessary to download the DRM-encrypted EPUB.
All the mentioned tricks in corresponding threads on the Internet did not make it work.
I even deleted and re-installed ADE 1.7.2.
No change. ADE still thinks that I've got a network issue.
I then even purged my whole wine setup including its data, re-installed everything and it still showed me Error #2038.
The same old error #2038 which began to haunt me in my dreams.
The next approach was to update ADE to a newer version. Unfortunately, my wine did not provide me any newer version via winetricks.
I had already failed multiple times trying to install a downloaded exe setup file of ADE. My despair made me try it anyway. I tried ADE 2.0, ADE 3.0, and ADE 4.5 with no luck. Their wine installation failed for various reasons, or the installed ADE did not start at all.
This was no way to go.
So I purged my wine installation again and upgraded my wine setup according to these directions.
I chose the stable branch which came with wine version 2.0.2 and more options to choose from when you want to install ADE.
This time, winetricks was missing. With the help of this page I could fix this and learned that winetricks is just a wrapper shell script, rather independent of wine itself.
So I fired winetricks as usual and saw ADE 1.7 and 4.5 as installation targets. Hooray!
In short: it just does not work.
Here is a list of error messages I got in the command line during the installation process of winetricks and ADE 4.5:
/home/$USER/.wine/dosdevices/c:/windows/Microsoft.NET/Framework/v4.0.30319/ngen.exe not found
I personally like this winetricks error message:
It is just bad German which does not make any sense.
Okay, since ADE 4.5 does not result in a running installation, I purge my wine data for the hundredth time and install ADE version 1.7. Maybe the newer wine makes it work.
No luck also this time: Error #2038 was waiting for me at the end of this road as well.
Did I already mention my level of despair?
Well, I downloaded the installation exe for ADE 3.0 from Adobe and gave it another try.
ADE 3.0 needs .NET and its setup routine links me to a download of six gigabytes, which I cancelled.
Purge again, restart on installing ADE 4.5 and its dependencies again.
This time, I switched to 32 bit wine because this web page told me to:
rm -fr ~/.wine
export WINEARCH=win32
export WINEPREFIX="/home/USER/.wine"
winecfg
On the way, I came across the following nice error message from .NET which I had to show you:
The installation routine still complains about not having .NET and winbind (samba). However, this time, ADE starts, or tries to start up, with the following error:
So ADE thinks that it is already running when I start it for the first time. This is another well-known ADE error message according to many forum threads, none of which seem to help at all.
Then I purged again and tried ADE 3.0 but failed at installing .NET 3.5 SP1.
I gave up at this point. I invested endless hours of work on shitty software which was developed only to prevent consumers from consuming ebooks they are supposed to own, which they don't. In my case, it's really about renting books from the library, but my point stays the same.
It really seems to be the case that the industry is forcing users of Linux to obtain ebooks from illegal sources. You just have to know a nice URL, download the DRM-free EPUB file and start reading. No hassle, it just works. Isn't that a shame?
I could silently forget ADE and start using illegally obtained ebooks. Instead, I sat down and wrote about it in order to make a change to this situation. Maybe.
In frustration, I tweeted a bit:
To my surprise, Adobe responded to my tweets:
I also wrote this blog article so that I can send them my whole journey of pain.
I keep you updated, if something changes.
2017-12-18: no answer from Adobe yet. I gave up and use my business notebook running Windows 10 to download the EPUB files and use Calibre to sync my eBook reader at home. What a shame for Adobe.
Thanks for your post, which has saved me from repeating (for the second or third time in a year or two) more-or-less the same disappointing process you so eloquently describe. I, too, happily used ADE on Linux for a few years to borrow books from my (US) library account. Then, it started getting the same error you describe.
One odd detail, however: I CAN download *.acsm titles from commercial publishers (both US and German) ... it's only my library account that fails. I wonder if the hosters for library accounts have collaborated with Adobe to identify and block ADE running on wine at the host end? My Christmas diversion may be sniffing traffic on the two cases to see if I can figure out what's the difference at the TCP level ... OTOH, if they're using SSL (probable!), this idea will fail, too.
I was successfully running ADE under Windoze7 on Oracle (ugh!) VirtualBox for a year or two, but unfortunately while I was abroad the last couple of months, the hard disk with that config died, so I'm back to pure Linux unless and until I can re-install Win7 under VirtualBox (or kvm, if I'm brave enough to give it a whirl).
As of 2020-02-08, my workflow does not include Adobe Digital Editions (any more).
However, the method relies on the already extracted key file which I was able to get with a working ADE at a previous point in time. I have no idea how to get the key file without ADE - sorry. If you do, please do comment below!
If you do have the key file, you may use this method to borrow books. I'll explain my workflow using my local municipal library:
The DRM-encrypted epub ends up in <SDCARD>/Android/data/com.obreey.reader/files/download_drm/ and this download_drm directory is synchronized so that the epub then reaches my desktop computer as well. This process enables me to read DRM-encrypted epubs without relying on ADE.
Penguin submitted a Disqus comment but since probably most people don't activate the Javascript below to see the Disqus comments, I'll add this here:
Thank you, I have the same problem here with Thalia.de. For years, I downloaded with ADE and imported automatically into Calibre. Now I get this error too.
New workflow: Download the .acsm file on an Android phone and open it with PocketBook, which is able to generate an .epub file in the directory you mentioned. PocketBook needs an activation of the AdobeID for this.
The *.epub is still encrypted but the Calibre import can handle it and strips the DRM with the help of the DeDRM plugin (yes, it needs an extracted AdobeID, but I have it because imports from ADE used it too).
I have not tested this workflow myself yet since I stopped using DRM-protected epubs altogether, because I did not invest the time to find out how to set up the newer Calibre after the major plugin changes.
You found your way to my homepage and are interested in Personal Information Management (PIM)? Wonderful. Here is a small overview to get you started, since I have already collected quite extensive material.
Since the topic of PIM interests a lot of people, most articles on the topic are written in English. I hope that's not a problem. Otherwise, there are corresponding translation services.
On How to Use This Blog Efficiently, I explain how to get notified about new articles, how navigation via tags works and what distinguishes the different page types (temporal, persistent, tag pages). This way, you will find your way around the blogging software used here.
The best place to start is my tag page on PIM, where you find all articles tagged with "PIM". There, before the list of articles, I have also written some introductory words on the topic.
I think you will find other tags interesting as well. For example, my pages on the topic of privacy are also exciting to read. Among other things, I discuss the prerequisites that, in my opinion, urgently need to be clarified before you copy your personal data into the cloud. The negative consequences of handling personal data too loosely are unfortunately far too little known.
If you are interested in PIM courses or talks, you will find my contact options on the About page. My previous talks and media appearances are summarized on this page. I am also happy to come to relevant events.
Have fun reading and learning!
We went to a cabaret again. Alfred Dorfer is well known to every Austrian from cabaret, TV and film. Since 2017 he has been performing his program "und…". I went into the show completely clueless - I did not know what to expect: political cabaret, a typically Austrian cabaret play à la Hader's "Privat" or earlier Dorfer programs that went in this direction, musical interludes, or something else entirely.
It was - apart from the missing musical interludes - a colorful mix of many elements, I would say. Here is my report.
There was a small framing story as a common thread. There were also clearly political statements. However, they were quite different from what a left-leaning bobo (an often-invoked cliché of the typical cabaret visitor) might have expected. In various situations, almost all political parties and attitudes had an unflattering mirror held up to them. None of it was shallow.
Strangely enough, of all places, in Graz the KPÖ did not appear. It seems as if its political success is still too small to deserve a place here.
Somewhat polemical but nonetheless good for a laugh were the many jabs at our German neighbors - especially regarding the education system, which Dorfer criticized very openly, drawing on his own experience, not least from a teaching assignment at the KF Uni.
In general, the artist sprinkled in plenty of highly philosophical statements, where my thoughts would have liked to linger a while longer, while the next perfectly served tidbit was already arriving at quite a brisk pace. A good reason to watch the program again on video and celebrate the pause button.
In his program, Alfred Dorfer slips into countless very different characters. Through his perfect posture, gestures and facial expressions alone, he draws the audience in wonderfully. This underlines his distinct acting talent, which one already knew from his films and which gets a well-deserved showcase in this piece. What a joy.
That was definitely one of my best cabaret experiences.
I can only recommend getting tickets for this program as soon as possible. Its success means performances sell out quickly.
I would have liked to sit a few rows further forward, because our former FPÖ vice-mayor Mario Eustacchio ("the presumption of innocence applies") was sitting in the audience. At many points, he probably did not have much to laugh about. Dorfer showed not only genius but also backbone. This should happen more often in this form.
Google has quite a history when it comes to services and apps for messaging:
Somebody could say that this is what you get when people get promoted for "shipping something to the customer" while totally neglecting whether or not this "something" has meaning, value or other positive aspects. This is a general cultural issue of Silicon Valley.
Google was on the right track in my opinion when they worked on Google Wave. It was planned as a federated open protocol with open source code published. This way, each company, organization or community was able to set up their own instance that talked to all other instances. Just like the email infrastructure.
For the first time, I thought that this had the potential to replace business email services in the long run. The technology involved was awesome and highly collaborative work was extremely well supported. In this direction, I've never seen anything better ever since.
Then Google discontinued the development out of the blue and moved the code to the Apache Foundation. It entered a slow but steady road to death until it was finally declared dead in 2018.
There are no specific descriptions of the new stuff Google is going to release. My prediction is that it is going to be dead either on arrival or a bit later, or it is going to remain a niche product for some time.
Considering the market power of Google, the whole story is a declaration of failure.
2022-06-08: The German heise news features a nice article on the latest sundowns of Google message services which is fun to read. The list can now be extended by:
If anybody is telling me about a brand new chat service by Google, I'll just laugh hysterically while going away.
2023-11-10: Yet another bunch of message services from this Mastodon message based on arstechnica:
In June 2023, I got invited to give a short talk about local file management at the Worklab 2023 which was organized by mur.at. This time, I used a different idea and talked about a few general concepts and ideas related to this topic. A few things I took from my PIM lecture.
The talk was part of the session "Desire to collect - Tools & Roadmap".
Fortunately, the talk was recorded and got published in October 2023 (44min):
Here are the main topics of my talk with some links:
To: Falter Verlagsgesellschaft <Leserbriefe@falter.at> Subject: Leserbrief: Falter 41/23 Medien/Lexikon (Time 2023-10-15T14.09)
In Falter 41/23, p. 21 (one page before a great article about competence in social media), you write:
If only there were something like Twitter, just without Elon Musk! Since the erratic US investor took over the messaging service, Twitter/X has been drifting hard to the right. Everyone struggles with it, but where to migrate to? Now Bluesky has appeared as an alternative; the codes you need to sign up there are coveted and are even being traded for money. Will Bluesky, which looks and feels like Twitter/X, become the next big thing, or will it remain a niche phenomenon like the cumbersome Mastodon? That can probably only be judged once Meta offers its Twitter clone Threads in the EU.
I find it a pity that a media company like yours repeats the, in my opinion, unjustified prejudice about Mastodon's cumbersomeness so broadly. The only fundamental difference in usage between Twitter and the open Mastodon is the distributed structure, which people already know from email. Nobody has a widely noted comprehension problem just because they have to decide on an email provider.
My hope would be that the editorial team does not repeat prejudices but, in the best case, explains to its readers how any initial difficulties can be overcome.
Large publishing houses are starting their own Mastodon servers and thereby provide their editors with verified accounts. This also leads to more direct exchange between readers and the print medium. You can find verified colleagues in the Fediverse at https://verifiedjournalist.org/, for example.
If there are still open questions or uncertainties, I would gladly come from Graz to your editorial office to explain the big difference between commercially operated platforms like X or Bluesky and open platforms like the services in the Fediverse.
Disclaimer: I am on the board of https://graz.social ("Verein zur Förderung ethischer Digitalkultur") and do a lot of educational outreach on tech-heavy topics at https://Karl-Voit.at
My address, as requested for letters to the editor:
x x x
DI Dr.techn. Karl Voit
To: Falter Verlagsgesellschaft <Leserbriefe@falter.at> Subject: Leserbrief: Falter 41/23 Medien/Lexikon (Time 2023-10-15T14.09)
Im Falter 41/23 S.21 (eine Seite vor einem großartigen Artikel über Kompetenz in Sozialen Median) schreiben Sie:
Gäbe es doch etwas wie Twitter, nur ohne Elon Musk! Seitdem der erratische US-Investor den Nachrichtendienst übernommen hat, driftet Twitter/X hart nach rechts. Jeder hadert damit, aber wohin migrieren? Jetzt ist Bluesky als Alternative aufgetaucht, die Codes, mit denen man sich dort anmelden muss, sind begehrt – und werden sogar um Geld gehandelt. Wird Bluesky, das wie Twitter/X aussieht und sich auch so anfühlt, das nächste große Ding, oder bleibt es nur ein Nischenphänomen wie das umständliche Mastodon? Das lässt sich wohl erst dann sagen, wenn Meta seinen Twitter-Klon Threads in der EU anbietet.
Ich finde es schade, dass ein Medienunternehmen wie Sie das meiner Meinung nach ungerechtfertigte Vorurteil der Umständlichkeit von Mastodon so breit wiederholt. Der einzige fundamentale Unterschied in der Bedienung zwischen Twitter und Mastodon ist, dass man sich bei dem offenen Mastodon ist die verteilte Struktur, die man auch schon von E-Mails kennt. Niemand hat ein vielbeachtetes Verständisproblem, weil man sich für einen E-Mail-Provider entscheiden muss.
Meine Hoffnung wäre, dass die Redaktion keine Vorurteile wiederholt, sondern im besten Fall ihren LeserInnen erklärt, wie etwaige Startschwierigkeiten überwunden werden können.
Große Verlagshäuser starten ihre eigenen Mastodon-Server und stellen ihren RedakteurInnen somit verifizierte Accounts zur Verfügung. Dadurch kommt es auch vermehrt zu direktem Austausch zwischen LeserInnen und dem Printmedium. Sie finden beispielsweise auch unter https://verifiedjournalist.org/ verifizierte KollegInnen im Fediverse.
Falls es doch noch offene Fragen und Unsicherheiten gibt, so komme ich sehr gerne auch von Graz in die Redation vorbei, um den großen Unterschied zwischen kommertiell betriebenen Plattformen wie X oder Bluesky und offenen Plattformen wie die Services im Fediverse näherzubringen.
Disclaimer: ich bin im Vorstand von https://graz.social "Verein zur Förderung ethischer Digitalkultur" und betreibe viel Aufklärungsarbeit bei techniklastigen Themen auf https://Karl-Voit.at
My address, as requested for letters to the editor:
x x x
DI Dr.techn. Karl Voit
This quite lengthy article explains and discusses the built-in file tagging implementation of Microsoft Windows 10. I have a strong background in PIM and tagging, and this article is written from the perspective of a human who manually tags user-generated files. Please do read my general recommendations on using tags in an efficient way.
To my knowledge, Microsoft is currently not actively promoting this feature. Therefore, complaining about bad design decisions does not apply here as long as Microsoft does not present this kind of tagging as something designed to be used by the general user. Because from my perspective, it obviously can't be meant to be used in practice. Unfortunately. Let's take a closer look at why I came to this conclusion.
TL;DR: Microsoft Windows does provide NTFS features to tag arbitrary files. Some applications also merge format-specific tags with these NTFS tags. Although there are quite nice retrieval functions for tags, it is very complicated to use this for general file management. Applied tags are easily lost, so that in practice, users will refrain from using native Windows file tagging like this.
For this article, I am talking about non-collaborative local file-tagging. This describes the process of attaching one or more unique keywords to files stored on NTFS file systems, done by users who can access the files with write permissions via the Windows File Explorer. "Keywords" and "tags" are used as synonyms here.
I could elaborate on tag and tag-system definitions for quite some time but let us stop here for the sake of brevity. It will be a long journey after all.
By default, the Windows UI does not expose anything at all that would help users recognize the file tagging possibility. So we have more or less full support for tagging files, and yet Microsoft hides it quite well from the common eye. Probably for a good reason, which we are going to find out below.
Although I'm very interested in topics related to tagging, this feature is hidden so well that I was not aware of it myself until I read about it in a book in 2018. Support for tagging started as early as Windows Vista.
In order to see and edit file tags, you have to enable "View (Tab) → Details pane" in the File Explorer.
There is a second UI feature you might want to activate: the read-only Tags column is activated by choosing "Tags" in the context menu of the column bar:
When you go through different files, you will recognize that not all file types can be tagged by default. For example, the Details pane for a simple text file does not show "Tags: Add a tag", in contrast to any JPEG image file, as shown in the screenshots above.
Assigned tags are visible in the details pane as well as in the tags column:
Adding or modifying tags is possible in the Details pane but not in the Tags column. You will recognize that Microsoft allows tags with spaces and special characters. Multiple tags are separated by semicolons; the semicolon is probably the only standard character that is not allowed within tags.
The last place where File Explorer shows you the assigned tags and also allows you to edit them is within the Properties of a file:
As shown in the screenshots above, tags might be added/removed/modified at two places: either on the "Details pane" (on the right hand side of the File Explorer window) or within the file properties on its "Details" tab.
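If you want to check programmatically which tags ended up in "System.Keywords", a minimal Python sketch could look like the following. This assumes the third-party pywin32 package; the function name show_tags and the example path are mine:

from win32com.propsys import propsys, pscon

def show_tags(path):
    # Open the Windows property store of the given file (read-only by default).
    store = propsys.SHGetPropertyStoreFromParsingName(path)
    # "System.Keywords" holds the tags as a list of strings (or None for untagged files).
    tags = store.GetValue(pscon.PKEY_Keywords).GetValue()
    print(tags if tags else "no tags")

show_tags(r"C:\Users\karl\Pictures\JPEG file 3.jpg")  # hypothetical example path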
Now that we have tagged some files, what possibilities are there to use this meta-data in daily life? First of all, there is navigation. For navigating through your files, you might prefer your File Explorer sorted alphabetically by file name:
With tags, you might also sort alphabetically by tags instead:
Since the order of files in the "sorted by tags" view depends on the order of tags within the files, I do not consider this a great improvement. However, what is really neat is the "Group by" method. By default, File Explorer groups by name:
You can change the grouping in the "View" tab of the File Explorer:
Having switched to "Group by Tags", you will notice that all files are arranged by their assigned tags:
Untagged files are listed in the "Unspecified" category at the bottom. The categories above correspond to the alphabetically sorted list of tags. Each file is listed once for each tag. So if a file like JPEG file 3.jpg has two different tags ("Dogs" and "House"), it is listed twice: once in the category "Dogs" and once in the category "House". If you select it in one category, this single file gets selected in all categories.
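Reading the tags as shown above, this "one entry per tag" grouping can be mimicked in a few lines of Python. This is only a toy model of the File Explorer behaviour (it ignores Explorer's case-folding) and again assumes pywin32; function and path names are mine:

from collections import defaultdict
from pathlib import Path
from win32com.propsys import propsys, pscon

def group_by_tags(folder):
    groups = defaultdict(list)                     # tag -> list of file names
    for entry in Path(folder).iterdir():
        if not entry.is_file():
            continue
        store = propsys.SHGetPropertyStoreFromParsingName(str(entry))
        tags = store.GetValue(pscon.PKEY_Keywords).GetValue()
        for tag in (tags or ["Unspecified"]):      # untagged files go to "Unspecified"
            groups[tag].append(entry.name)         # one entry per tag, as in Explorer
    return dict(groups)

print(group_by_tags(r"C:\Users\karl\Pictures"))    # hypothetical folder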
Complementary to file navigation, File Explorer has a search feature implemented. The following image shows the result when you do search for a tag "house" within the folder we've used above:
You will notice that all files featuring the tag "house" or "House" are listed in the results. So search, as well as "Group by Tags", is case-insensitive when it comes to tags. All other files, which do not have the "house" tag, are omitted.
When you search for multiple tags, just the files that do contain all of them are listed:
On the negative side, you can not restrict a search to tags only. I would have expected a query language following the widespread pattern like "tag:dog", which would look for the occurrence of "dog" only within the tags and not within the file name or the content.
So if you're searching for "dog", you will find files that contain the tag dog as well as files that do contain "dog" within their file name:
This File Explorer tag search is not a sub-string search: if you want to find files tagged with "mydog", you can not find them by searching for "dog". However, when you have tagged files with "my dog", you will find them in the search results for "dog" but not within search results for "dogs".
In summary, searching for tags is case-insensitive, combines multiple terms with AND, is not restricted to tags (matches in file names are returned as well), and is not a sub-string search.
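Judging from these experiments, tag matching behaves like a case-insensitive whole-word match, with multiple terms combined via AND. The following toy function only models that observed behaviour and has nothing to do with the actual Windows Search implementation:

import re

def matches(search_terms, tags):
    # True if every search term appears as a whole word in at least one tag.
    def term_in_tags(term):
        pattern = r"\b" + re.escape(term) + r"\b"
        return any(re.search(pattern, tag, re.IGNORECASE) for tag in tags)
    return all(term_in_tags(term) for term in search_terms)

print(matches(["dog"], ["my dog"]))    # True:  "dog" is a whole word within the tag
print(matches(["dog"], ["mydog"]))     # False: no sub-string search
print(matches(["dogs"], ["my dog"]))   # False: "dogs" is not contained as a word
print(matches(["house", "dog"], ["House", "my dog"]))  # True: AND, case-insensitive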
When you play around with different tags, you will find out that this feature is intended to be used case-insensitively. When you tag a file with "Dog" and "dog", the last one wins and the other gets removed.
When "Arrange by Tags" is used, the tag "Dog" as well as "dog" gets listed in the category "Dog".
When you select multiple tagged files, the Details pane shows only the tags that can be found in all selected files. The other ones are not visualized. You may add additional tags, which then get added to all selected files:
You may remove all tags of one or a set of selected files with "Properties → Details → Remove ...".
This page mentions a context menu function to export the meta-data of selected files to an xml file. Meta-data from an xml file could be applied to the files as well. I was not able to find this function in my tests.
In the previous sections I mentioned briefly that only a sub-set of file types may be tagged by default. In my opinion, this is a very tough restriction if you want to use tags for organizing your files.
On a fresh Windows 10 installation, there are not even a hundred file types that may be tagged. When apps like Microsoft Office or LibreOffice get installed, meta-data handlers for additional file formats get added and configured. On my business Windows 10 system, approximately 180 extensions had associated meta-data handlers. After installing LibreOffice on a Windows 10 virtual machine, about 120 extensions were listed as tag-able, approximately thirty of them from LibreOffice alone. I noticed that LibreOffice does not create meta-data handlers for Microsoft formats such as .docx or .xlsx, whereas handlers for the older formats .doc and .xls are created.
It is important to know that not all meta-data handlers offer meta-data tagging by keywords. Only meta-data handlers that contain definitions for "System.Keywords" result in the ability to be tagged. Furthermore, not all meta-data handlers that contain keywords/tags also offer them in the file properties.
I tried to come up with a minimal list of extensions where tagging is activated via meta-data handlers. When you download a fresh Windows 10 virtual machine like that one, you will find some tools pre-installed, in this case many development tools. After manually installing DotNet, LibreOffice 5.4.4 and paint.net 4.2.5, all extensions with enabled handlers for keywords/tags are:
.asf .cr2 .crw .dng .doc .dot .dvr-ms .erf .flac .jfif .jpe .jpeg .jpg .jxr .kdc .m1v .m2t .m2ts .m2v .m4a .m4b .m4p .m4v .mka .mkv .mod .mov .mp2 .mp2v .mp4 .mp4v .mp3 .mpeg .mpg .mpv2 .mrw .msi .msp .mts .nef .nrw .pef .raf .raw .rw2 .rwl .sr2 .srw .tif .tiff .tod .ts .tts .uvu .vob .wdp .weba .webm .wma .wmv
I did not mention all well-known LibreOffice formats that were also in the list.
As you can see, most of these activated file types are not of big relevance to the average user. Selected extensions that have either no handlers at all or no handlers that provide tagging:
.avi .docx .exe .gif .lnk .mp3 .png .wav .css .csv .epub .gz .html .json .java .txt .wmf .xhtml .xlsx .zip
Therefore, there are many file types which may be used on any given Windows machine that can not be tagged by default.
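If you are curious which extensions have any property handler registered on your own machine, you can enumerate them from the registry with a few lines of Python. This is only a rough indicator: as explained above, a handler enables tagging only if it exposes "System.Keywords", which this listing does not check:

import winreg

PATH = r"SOFTWARE\Microsoft\Windows\CurrentVersion\PropertySystem\PropertyHandlers"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, PATH) as handlers:
    number_of_subkeys = winreg.QueryInfoKey(handlers)[0]
    extensions = [winreg.EnumKey(handlers, i) for i in range(number_of_subkeys)]

print(len(extensions), "extensions with a registered property handler")
print(sorted(extensions)[:10])   # a first impression of the list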
Now that we have found out that it would be nice to have more file formats enabled for tagging, how can we enable meta-data handlers ourselves?
The answer lies in a project called FileMeta. You can download the latest release on their release page. Installing this tool requires administrator permissions. I highly recommend the documentation pages for learning the details on this topic in general.
After installing FileMeta, you will find multiple executables in its install directory: FileMeta.exe, FileMetaAssoc.exe and FileAssociationManager.exe.
Most things can also be done on the command line. For configuring the tagging functionality, we'll stick to the graphical FileAssociationManager.exe for this article. After starting up the File Meta Association Manager, you will see three main parts of the UI:
The list of file extensions is read from the Windows registry. If you can not find a specific file extension in the File Meta Association Manager, no application has registered that file extension so far. Associating a file extension with an application ("Always open with ...") does not create such a registry entry. Therefore, associating an extension with an application is not sufficient for this extension to get listed in the File Meta Association Manager.
To add an extension not listed yet, you have to start the registry editor with administrator privileges, go to "HKEY_LOCAL_MACHINE" → "SOFTWARE" → "Classes" and choose "New → Key" from the context menu.
Then you can enter your new extension, e.g. .org, and confirm with the return key. After restarting the File Meta Association Manager you'll find the new extension in the list.
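If you prefer not to click through the registry editor, the same key can also be created with a few lines of Python run from an elevated prompt; the .org extension is just the example from above:

import winreg

EXTENSION = ".org"   # example extension from above

# Creating the key under HKLM\SOFTWARE\Classes is all that is needed;
# no values have to be set. This requires administrator privileges.
key = winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE,
                         r"SOFTWARE\Classes" + "\\" + EXTENSION,
                         0, winreg.KEY_WRITE)
winreg.CloseKey(key)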
My File Meta Association Manager lists two pre-defined profiles: "Simple" and "OfficeDSOfile". The latter seems to be set up by LibreOffice. The "Simple" profile has a few properties set up for "Preview Panel", "Details tab in Properties" and "Info Tip":
If you would like to set up a new custom profile, you have to know:
You can't have the Details pane without the Details tab in the file properties. Both settings enable the tags shown in the column bar.
Therefore, a minimal custom profile for tagging where you can see the tags in the Details tab looks like this:
Such a profile results in a File Explorer view like this, where you can edit tags in the file properties as well as in the Details pane:
Whenever you change meta-data handlers, you will probably want to restart the File Explorer via the "Restart Explorer" button of the File Meta Association Manager in order to apply the changes.
After setting up a custom meta-data handler for a file extension, you can also see it in the command line tool FileMetaAssoc.exe:
c:\Program Files\File Metadata>FileMetaAssoc.exe -l
.txt    Simple    File Meta Property Handler
c:\Program Files\File Metadata>
As mentioned briefly before, some applications create meta-data handlers for file extensions when being installed. For example, LibreOffice creates handlers for its own document formats as well as for some formats from Microsoft such as .doc or .xls, but not .docx or .xlsx.
Programs like LibreOffice Writer or Microsoft Word provide meta-data within the document properties of an open document.
You are able to enter tags within the document properties:
These tags can now be seen in the file properties (Details tab) as well as in the Tags column. Because "System.Keywords" is missing from the "Preview Panel" part of the profile, the tags are not shown in the Details pane of the File Explorer:
Here is the File Meta Association Manager profile "LibreOffice property handler" as set up by LibreOffice:
It's interesting to see that the "LibreOffice property handler" is not visible in the File Meta Association Manager profiles. So I tried to overwrite the "LibreOffice property handler" with the "Simple" profile. To my surprise, this happened:
Yes, this makes sense after all. After confirming this dialogue, the File Meta Association Manager window was gone. I thought that this action was not successful and the app had crashed. After restarting the application, I noticed the successfully merged profiles for the .odt extension.
Unfortunately, in contrast to my expectations, there was no change: no tags are visible in the Preview pane of File Explorer, and tags in the Details tab can only be viewed, not changed. So this was not a success after all: I still can not modify tags for LibreOffice Writer files outside of the LibreOffice Writer document properties, although they can be seen in File Explorer.
So I started to create some non-native LibreOffice Writer documents: .doc and .docx. For .docx files, there were no document property tags visible in File Explorer: not in the Preview pane, not in the Tags column and not in the file properties.
Different story with the .doc files though: here, the document property tags are synchronized with the NTFS meta-data. Whenever a tag is added or changed in the file properties, the same change appears in the LibreOffice Writer document properties and vice versa. However, no tags/keywords are visible in the Preview pane.
This tag synchronization mechanism has a minor issue: when you do not create a .doc file from within LibreOffice Writer or Microsoft Word but with a text editor, there are no within-file meta-data properties yet. This results in an error message when you want to tag such a zero byte .doc file in File Explorer:
When you select "New → Excel Spreadsheet" in File Explorer with Microsoft Office installed, it does not create a zero byte file as it does with Word files using the same method. Instead, it fills the spreadsheet file with seven kilobytes of default content. This way, you won't get this error message for Excel files in this situation.
Related to this, you can read on the FileMeta FAQ for PDF files:
If I add the File Meta Property Handler for PDF files, will I see properties already in those files? No, unless you are using version 1.4 and are extending an existing property handler for PDF files. File Meta has no capability otherwise for reading properties held within the PDF formatted part of the file. File Meta always writes properties in an NTFS-provided annex to the file. [...] The bad news is that File Meta before version 1.4 will not read properties held in the type-specific formatted part of a file, and no version of File Meta will update such properties.
To make this even more complicated, you have to know that Windows internally supports tags for every file type. They will not be visible in the properties section of such a file, but when you search for those tags, the file appears in the search results.
After all these experiences I can only sum up my experience with: it's very complicated. The end-user can not expect tags/keywords to be visible in the File Explorer. She is not able to know whether document property keywords are synchronized to the NTFS meta-data. If there are tags visible, they may not be manageable in the Details pane or the file properties. File Explorer search seems to find all keywords so far. However, you can not tell whether a specific file was found because of a tag or because of something else, since this visualization is missing.
You can read about the history of this feature and some technical details on this page. Basically, NTFS stores the meta-data within an Alternate Data Stream (ADS). This is quite similar to how Apple stored meta-data in HFS+ and probably also does in APFS. I was using the color labels of OS X up to Leopard. They ended up as file-system based meta-data as well.
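To get a feeling for Alternate Data Streams in general, you can create one yourself from Python: any file on an NTFS volume can carry additional named streams that do not show up in a normal directory listing. Note that the stream name "demo" below is arbitrary and not the stream the Windows property system actually uses:

base = r"C:\temp\example.txt"   # hypothetical file on an NTFS volume

# Make sure the base file exists, then write into a named stream of it.
open(base, "a").close()
with open(base + ":demo", "w", encoding="utf-8") as stream:
    stream.write("data hidden in an alternate stream")

# The main file content is unchanged; the extra data lives only in the stream.
with open(base + ":demo", encoding="utf-8") as stream:
    print(stream.read())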
You can read on this Wikipedia article:
In Apple's macOS, the operating system has allowed users to assign multiple arbitrary tags as extended file attributes to any file or folder ever since OS X 10.9 was released in 2013, and before that time the open-source OpenMeta standard provided similar tagging functionality in macOS.
I do think that the average reader will agree that using tags with this Windows 10 feature is already a drag from the user experience point of view. I have sad news: it now gets even worse.
Since the meta-data is stored in NTFS data streams, you lose all tags when files get moved to some place where there are no NTFS data streams, or when applications that generate files do not preserve them properly. As a consequence, there are many situations in which this meta-data gets lost: copying files to a non-NTFS medium such as a FAT-formatted USB drive, sending them via e-mail or packing them into archives, and applications that save files by re-writing them without keeping the streams are the most obvious ones.
As I've read on reddit, modifying the NTFS tags of a file also modifies its hash sums which renders some workflows that rely on stable content hashes useless.
After being enthusiastic when I found out that Microsoft provides a native file tagging ecosystem with Windows, I had to take a closer look. This enthusiasm was replaced by disillusionment. Everything related to file tagging is hidden from the common user by default. Enabling it results in manual labor, not only for the UI but also for each and every file extension separately. Although there are some nice retrieval features for navigation, search does not distinguish between keywords in tags and keywords anywhere else. It is not entirely clear to me how file-format-specific tags interact with the NTFS tags. Finally, when you do invest some time in tagging files, there is a high chance of losing all this meta-data, sometimes without even realizing it.
If Microsoft acted as if this tagging feature were ready for production, it would qualify for my bad design decisions series. For me personally, I'd never invest anything in using this feature, mainly because of the many ways of losing meta-data without noticing. My current approach for tagging is described in this article. It's an OS-independent and app-independent method with very nice features like TagTrees which you can not find elsewhere.
If you would like to get an overview of other non-file-system-based tagging solutions, you can read the bachelor thesis "Marktübersicht von Tagging-Werkzeugen und Vergleich mit tagstore", which can be downloaded at the tagstore page. It's written in German and reflects the situation of the year 2013.
Before writing this article, I needed to implement a necessary feature for my blogging system: you are now able to click on the screenshot previews to see them in their original size. So this article was in my personal pipeline for over a year. As a consequence, early findings and screenshots from 2018 are based on Windows 10 Pro version 1803 (OS build 17134.165) whereas the most current ones from 2019 are based on Windows 10 Enterprise Evaluation 1809 (OS build 17763.805).
Congratulations on following this very long blog article to its end. I hope I could teach you something about Windows 10 functions and help you decide on their usefulness for your situation. Drop me a line in the comments below if you have questions or remarks.