KDE's Semantic Desktop

The search for a meaningful representation of data

There was a recent discussion on the Sugar mailing list about whether files need to die. It included a reference to the following article: http://radar.oreilly.com/2011/07/why-files-need-to-die.html. Computers need to understand the semantics of human language. There needs to be a new view of the data so that you can search for it in ways that are more meaningful to you.

The first time I tried desktop search was with Google Desktop on a Windows machine. The machine did not even have enough disk space for the index, and I dropped the experiment. In any case, it would not have been very useful as I was normally using Linux. There was a lot of excitement about Beagle on Gnome desktops. Again, on resource-constrained systems it wasn't worth the overhead, and I would disable it.

By contrast, my desktops now run indexing servers and I am rarely aware of them.

I normally use the KDE desktop these days but do use Gnome as well. I was intrigued by processes like tracker-store and virtuoso-t. Most likely, I became aware of them just after an upgrade, when one or the other would consume a lot of CPU time. Both are related to desktop search. Tracker is an external dependency of the Gnome desktop. Virtuoso organises the metadata needed by KDE's Nepomuk.

Both Gnome and KDE enable indexing of documents. Hence, the software was creating an index for my desktop without significant overhead. But which applications were using this data?

In this article, you will get an idea of the current state of the ambitious Nepomuk project. KDE's semantic desktop is currently evolving. Some parts are now useful. However, the utility will be far greater as more applications get integrated with Nepomuk.

KDE Nepomuk

You can control the desktop search option in the system settings. The primary options are whether to enable 'Nepomuk Semantic Desktop' and 'Strigi Desktop File Indexer'. Additional options allow you to control the memory used and the directories you may wish to index.

Dolphin, the default file manager for KDE, is integrated with the Nepomuk project. It allows you to add metadata to the files. You do it by creating tags and associating them with files. You can also add comments to files.

How do you then use them?

Using the tags is easy. Dolphin will give you the option to filter the results based on tags. This is in addition to filtering them by type of document and date of document.

It also has the option of filtering the files on the basis of a rating. Ratings are commonly used in the music playing applications like Amarok. So, if you have given a rating in Amarok, will it be available in Dolphin? Sadly, not so far. Amarok is not integrated with Nepomuk yet and uses its own database. Hopefully, the integration will happen soon!

Dolphin's find menu has options to search file names and content. The latter relies upon Strigi's indexing. Any comments you may have added to a file will also be searched if you use the content option.

Krunner, the tool for searching and launching applications, is integrated with Nepomuk. You can enable the Nepomuk Desktop Search plugin in Krunner. It will then find not just applications but also files which may be relevant.

As you would have noticed, Nepomuk is integrated with Strigi for analysis and indexing of files, but the process name is virtuoso-t. Virtuoso is the backend storage server used by Nepomuk for the RDF (http://en.wikipedia.org/wiki/Resource_Description_Framework) data. It is now the default for Nepomuk.
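To make the idea of RDF data more concrete, here is a minimal Python sketch of metadata held as subject-predicate-object statements. It is a toy in-memory model only and does not use Nepomuk's or Virtuoso's API; the file URLs, the predicate names (hasTag, comment) and the helper functions are all made up for illustration.

```python
# Toy in-memory triple store. Illustrative only: this is NOT how Nepomuk
# or Virtuoso is accessed, and every name below is hypothetical.
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements about two files.
TRIPLES: List[Triple] = [
    ("file:///home/user/report.odt", "hasTag", "work"),
    ("file:///home/user/report.odt", "comment", "draft for review"),
    ("file:///home/user/trek.jpg", "hasTag", "holiday"),
]


def match(triples: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def files_with_tag(triples: List[Triple], tag: str) -> List[str]:
    """Return the sorted subjects that carry the given tag."""
    return sorted(s for (s, _, _) in match(triples, predicate="hasTag", obj=tag))


if __name__ == "__main__":
    print(files_with_tag(TRIPLES, "work"))
```

The point of the sketch is that tags, comments and ratings all become statements of the same shape, so one query mechanism can combine them.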

Nepomuk uses Strigi libraries. So, do not install the standalone Strigi package, which has its own indexing daemon and query tool but does not integrate with the Nepomuk environment.

KDE Akonadi

The Akonadi server has a long history. It stores the data needed for personal information management (PIM), e.g. email addresses, chat contacts, email attachments, email contents, etc. PIM applications use Akonadi for handling personal information. The Akonadi server stores or caches the data in a MySQL database and also passes it to Nepomuk for analysis, indexing and storage of RDF data.

Integration of KDE's PIM applications with the semantic desktop environment is in progress. For example, at the time of writing, kmail2 is included with Arch Linux, while Fedora 15 is staying with kdepim 4.4, waiting to make sure that the upgrade is stable and smooth. Both distributions use KDE 4.6.

It was very easy to create a kmail2 account using Gmail's IMAP settings. The IMAP data is cached by Akonadi. Installing the akonadi-googledata package made it easy to add the address book of a Google account by just giving the email address and the password.

Unlike earlier versions, kmail2 relies on Akonadi for storing local messages as well. The database is maintained in $HOME/.local/share/akonadi.
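Since mail is now cached under that directory, it can be useful to see how much disk space it takes. The following is a small sketch that only assumes the path given above; the function names are my own and nothing here talks to Akonadi itself.

```python
# Report the disk space used by the Akonadi data directory.
# Assumes the location mentioned in the text: $HOME/.local/share/akonadi.
import os
from pathlib import Path
from typing import Optional


def akonadi_dir(home: Optional[str] = None) -> Path:
    """Build the Akonadi data path under the given (or current) home."""
    base = Path(home) if home is not None else Path.home()
    return base / ".local" / "share" / "akonadi"


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of regular files below path; 0 if it does not exist."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # do not count symlink targets
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


if __name__ == "__main__":
    target = akonadi_dir()
    print(f"{target}: {dir_size_bytes(target) / (1024 * 1024):.1f} MiB")
```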

Searching with Krunner will show email messages and contacts as well as files.

At present, there does not seem to be an application that makes complex SPARQL queries easy. You can build and install nepomukshell. The possibilities of what can be done are illustrated in http://trueg.wordpress.com/2011/04/05/nice-things-to-do-with-nepomuk-%E2%80%93-part-two/ .
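To give a feel for what such a query looks like, here is a short Python sketch that only builds the text of a SPARQL query for resources carrying a given tag. It assumes the Nepomuk Annotation Ontology namespace and its nao:hasTag and nao:prefLabel properties as I understand them; please verify them against your installation. The function names are hypothetical, and the code does not send the query anywhere.

```python
# Build (but do not run) a SPARQL query for resources with a given tag.
# Assumption: the NAO namespace and the nao:hasTag / nao:prefLabel
# properties below match the ontology installed with Nepomuk.
NAO_PREFIX = "http://www.semanticdesktop.org/ontologies/2007/08/15/nao#"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def tagged_resources_query(tag_label: str) -> str:
    """Return query text selecting every resource tagged with tag_label."""
    return (
        f"PREFIX nao: <{NAO_PREFIX}>\n"
        "SELECT ?resource WHERE {\n"
        "  ?resource nao:hasTag ?tag .\n"
        f'  ?tag nao:prefLabel "{escape_literal(tag_label)}" .\n'
        "}"
    )


if __name__ == "__main__":
    print(tagged_resources_query("work"))
```

Depending on how labels are stored, the literal may need a datatype or language tag to match, so treat the output as a starting point to paste into a query tool rather than a finished query.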

Will Nepomuk succeed?

When it does, you may be able to find a document by recalling that it was sent as an attachment by your boss around the same time as you were planning a mountain trek.

What if cloud-based computing succeeds? The role of the desktop would then shrink. But even in that case, you may still use Nepomuk-like functionality, provided by the cloud service instead.