“The way you use data is the way you store it”
When we need to store large volumes of data, we are accustomed to using a relational database. We rarely look for alternatives unless we run into a bottleneck. Even then, we are likely to spend a lot of effort optimising the database rather than step outside the relational model. Non-relational databases have been around for many years. When object-oriented programming became popular, a number of object databases were created, but none captured any substantial mind share. Object-relational mapping software like Hibernate for Java, SQLAlchemy for Python and ActiveRecord for Ruby fulfilled the need to use relational databases within the object-oriented programming paradigm.
SQL is a wonderful tool for arbitrary queries on a relational database. However, we may overestimate how often we need it. For example, when dealing with a content management system, we are more likely to need keyword retrieval than a flexible SQL query. We use keyword search with Gmail, and I have rarely felt the need to narrow the search to, say, the subject only. Even if I search the subject line, I still need a keyword search. I can't recall needing a search where an index on the subject would have helped, e.g., matching a prefix. Hence, a keyword search tool like Apache Lucene (http://lucene.apache.org/) along with any database, whether relational or not, can be a superb solution.
In the last few years, the need for web-scale databases has increased the interest in 'nosql' databases, a misleading term which is now often interpreted as 'not only sql' (http://nosql-database.org/). One category of such databases is the object database management system (ODBMS), and among them is a native object database for Python, ZODB (http://www.zodb.org/). Object databases provide ACID support. They reduce the friction of transforming objects into relational table rows and vice versa, thus improving the efficiency of accessing and manipulating objects. There is no need to map all our information needs into a well-defined schema, which can be very difficult at times. Imagine a shopping engine. Each category, or even each product group, may need a combination of attributes unique to that product. Do we create a superset of all attributes, or do we store them as key-value pairs? Or should we just dump them in a string description and interpret the string at runtime?
ZODB is like a dictionary. It stores data as key-value pairs, where the value is a pickled (serialised) object. An object could be a container, which is itself like a dictionary and can store a very large number of elements.
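The key-to-pickled-object idea can be illustrated with nothing but the standard library. The sketch below is an analogy, not ZODB itself: a plain dict stands in for the database, and the key name and album contents are made up for illustration.

```python
import pickle
from types import SimpleNamespace

# A plain dict standing in for the database: key -> pickled bytes.
store = {}

# Any picklable object can be a value; here, a small attribute bag.
album = SimpleNamespace(title="Studio Album", tracks=["Opening Song", "Closing Song"])
store["studio-album"] = pickle.dumps(album)

# What is actually stored is an opaque byte string ...
stored_type = type(store["studio-album"]).__name__

# ... and reading it back rebuilds an equivalent object.
restored = pickle.loads(store["studio-album"])
print(stored_type, restored.title, restored.tracks)
```

ZODB adds what this toy lacks: transactions, storage on disk, and loading of objects only when they are touched.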
Let us look at a simple example which would be perfectly suitable for a relational database and see how it may be implemented in ZODB. We have a set of albums and a set of tracks. We may wish to access a track and from there, if need be, access the album on which it is available. Or we may access an album and then access the tracks which make up the album. In the relational model, we will need a table each for albums and tracks, and a foreign key from a track to its album to maintain the relationship between them. Should we realise that a track can be on multiple albums, we will need to create one more table for that relationship instead of using a foreign key.
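For comparison, here is one way the many-to-many relational version might look, using the sqlite3 module from the standard library. The table and column names, and the sample rows, are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE track (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Association table: one row per (album, track) pairing.
CREATE TABLE album_track (
    album_id INTEGER NOT NULL REFERENCES album(id),
    track_id INTEGER NOT NULL REFERENCES track(id),
    PRIMARY KEY (album_id, track_id)
);
INSERT INTO album VALUES (1, 'Studio Album'), (2, 'Greatest Hits');
INSERT INTO track VALUES (1, 'Opening Song'), (2, 'Closing Song');
INSERT INTO album_track VALUES (1, 1), (1, 2), (2, 1);
""")

# Albums that carry track 1: a join through the association table.
albums_for_track = [row[0] for row in conn.execute(
    "SELECT album.title FROM album "
    "JOIN album_track ON album.id = album_track.album_id "
    "WHERE album_track.track_id = ? ORDER BY album.id", (1,))]
print(albums_for_track)
conn.close()
```

Every traversal from a track to its albums, or from an album to its tracks, goes through a join on the third table.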
Now, let us see how we do the same in ZODB. The initial step is to create/open the database, open a connection and access its root. Let's write this basic code in app_db.py as we will need to use it in each script.
Let us write a script, create_containers.py, to create BTree containers for albums and tracks.
The next step is to define the models we need. Let's write them in app_models.py. Each track can appear on multiple albums and each album contains multiple tracks. The only noteworthy line is the assignment of 1 to the _p_changed attribute, which tells ZODB that a mutable structure like a list or a dictionary has changed.
Let us create a simple script, store_data.py, to add some tracks and an album.
Finally, we print the data to see how to access the data in ZODB. We iterate over each album and each track and print the values of the object. A details flag prevents infinite recursion, since each album refers to its tracks and each track refers back to its albums.
Working with ZODB is almost as easy as dealing with dictionaries. One can use Python's built-in isinstance function to determine the type of the object we are dealing with and write very versatile and flexible code. ZODB has been around for over a decade and has been used in various production environments, though the Zope community does not seem to have been successful in marketing it to developers for use outside the Zope (or Plone) environments.