3 : Preserving and Searching Personal Information

Data has a way of growing, and the larger it gets, the harder it is to extract information from it. I recall a piece of advice from early in my career: the best way to avoid giving out information is to hand over reports with lots of data!

Our list of loaned items keeps getting longer, and we need to manage it better. The issues are how to store it and how to retrieve what we need. Python's dictionaries are great: we can access what we need using a key, and the values can be quite complex. Since we have already learnt something about dictionaries, it would be nice to build on that.

Simple Databases

Python supports several simple databases, e.g. Berkeley DB or gdbm, which are available on Unix systems. Python allows them to be used just like a dictionary. For example:

    >>> import anydbm as dbm
    >>> newdb = dbm.open('mydb.db', 'n')  # specify the file name and that it is a new database
    >>> newdb
    {}
    >>> newdb['a'] = 'apple'
    >>> newdb['b'] = 'boy'
    >>> newdb['c'] = 'cat'
    >>> newdb.close()
    >>> olddb = dbm.open('mydb.db')
    >>> olddb['b']
    'boy'

As we mentioned, Python supports a number of dbm's. The module 'anydbm' selects the most appropriate one available. We have imported 'anydbm' under the alias 'dbm'. Aliases are, of course, most useful for packages with long names.

The usage is exactly like that of a dictionary. The disadvantage is that these dbm's require both the key and the value to be simple strings, whereas we want to store objects. The 'pickle' module converts an object into a string. Python offers a package, 'shelve', which extends the dbm's by using pickle to convert the value objects to strings before storing them in the dbm. The key must still be a string. We can live with that constraint.
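As a rough illustration, the sketch below stores a dictionary and a list in a shelf and reads them back. It is written for Python 3 (where 'anydbm' has been renamed 'dbm'); the file name 'demo_shelf' and the sample values are assumptions made only for this example.

```python
import os
import shelve
import tempfile

# Hypothetical location: a temporary directory keeps the sketch self-contained.
db_path = os.path.join(tempfile.mkdtemp(), 'demo_shelf')

# Keys must be strings, but values can be any object that pickle can handle.
shelf = shelve.open(db_path)
shelf['ms'] = {'name': 'Meena Shanbhag', 'phone': None}
shelf['codes'] = ['2001', 'office']
shelf.close()

# Reopen the shelf: the values come back as a dict and a list, not as strings.
shelf = shelve.open(db_path)
stored_friend = shelf['ms']
stored_codes = shelf['codes']
shelf.close()

print(stored_friend['name'])
print(stored_codes)
```

The conversion to and from strings happens behind the scenes, so the calling code looks the same as it would with an ordinary dictionary.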

Version 2 of the Loans Module

We will need to keep track of our friends and of the items which have been borrowed. We write the following code in 'loans_v2.py'. The code is modelled on the one in the previous article. We will, however, create more classes so that the code is easier to understand. Let us add a class for a friend:

    class friend:
        def __init__(self, name, phone=None, email=None):
            self.name = name
            self.phone = phone
            self.email = email

Now let us add a class for the item with the methods for loaning and returning an item:

    class item:
        def __init__(self, item_name, borrower=None):
            self.item_name = item_name
            self.borrower = borrower

        def loan_item(self, borrower):
            self.borrower = borrower

        def return_item(self):
            self.borrower = None

Finally, let us add a loans class which makes use of friends and items and manages them using simple databases:

    import shelve

    class loans:
        def __init__(self, friends_db, items_db):
            # Open (or create) one shelf for friends and one for items.
            self.friends = shelve.open(friends_db)
            self.items = shelve.open(items_db)

        def close(self):
            # Closing the shelves saves the changes.
            self.friends.close()
            self.items.close()

        def add_friend(self, short_name, name, phone=None, email=None):
            self.friends[short_name] = friend(name, phone, email)

        def add_item(self, item_code, item_name):
            self.items[item_code] = item(item_name)

        def loan_item(self, item_code, short_name):
            it = self.items[item_code]
            it.loan_item(short_name)
            # Store the modified item back so that the shelf records the change.
            self.items[item_code] = it
            return 'Loaned to ' + self.friends[short_name].name

        def return_item(self, item_code):
            it = self.items[item_code]
            it.return_item()
            self.items[item_code] = it

We can test our module as follows:

    >>> from loans_v2 import *
    >>> my_loans = loans('friends.db', 'items.db')
    >>> my_loans.add_friend('ms', 'Meena Shanbhag')
    >>> my_loans.add_item('2001', '2001, A Space Odyssey')
    >>> my_loans.loan_item('2001', 'ms')
    'Loaned to Meena Shanbhag'
    >>> my_loans.close()

We recall that the '__init__' method is called whenever an object of a class is created. We pass the names of the two dbm's we need while creating a loans object. We can add as many friends and items as we like in any order and loan the items or return them.

We must not forget to save the changes by calling the close method on the my_loans object just before we finish. So, other than creating or opening the dbm's and closing them, nothing in this code differs from what we have done earlier.
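One detail of loan_item and return_item is worth a closer look: each fetches the item, changes it, and then stores it back under the same key. The sketch below suggests why, assuming Python 3 and a shelf opened with the default settings (writeback turned off). A plain dictionary stands in for the item object, and the file name is hypothetical.

```python
import os
import shelve
import tempfile

# Hypothetical location for the demonstration shelf.
db_path = os.path.join(tempfile.mkdtemp(), 'items_demo')

shelf = shelve.open(db_path)  # writeback defaults to False
shelf['2001'] = {'item_name': '2001, A Space Odyssey', 'borrower': None}

# Each lookup returns a fresh copy, so changing it in place is not recorded.
shelf['2001']['borrower'] = 'ms'
after_in_place = shelf['2001']['borrower']

# Fetch, modify, store back: the pattern used in loan_item.
record = shelf['2001']
record['borrower'] = 'ms'
shelf['2001'] = record
after_store_back = shelf['2001']['borrower']
shelf.close()

print(after_in_place)    # None
print(after_store_back)  # ms
```

Opening the shelf with writeback=True is an alternative, at the cost of keeping every accessed entry in memory until the shelf is synced or closed.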

We reorganised (refactored) the code, making 'friend' and 'item' into separate classes to make it easier to understand. We needed to add a 'short_name' for a friend and an 'item_code' for an item to use as keys for our dbm's. This is actually more convenient, as we do not have to type the long names. Another small improvement is that we retain an item and keep track of whether it is borrowed or not.

Let us verify that the data we have entered is still present.
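A check along the following lines would do. This is only a sketch for Python 3: it uses a hypothetical file name and stores plain strings rather than friend objects, to stay self-contained. Note that in Python 3 the keys method returns a view rather than a list, so the sketch sorts it into a list.

```python
import os
import shelve
import tempfile

# Hypothetical location for the demonstration shelf.
db_path = os.path.join(tempfile.mkdtemp(), 'friends_demo')

# First session: store one friend and close the shelf to save it.
friends = shelve.open(db_path)
friends['ms'] = 'Meena Shanbhag'
friends.close()

# Second session: the earlier entry is still there, and we add another.
friends = shelve.open(db_path)
friends['mms'] = 'Man Mohan Singh'
short_names = sorted(friends.keys())
friends.close()

print(short_names)  # ['mms', 'ms']
```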

Not surprisingly, the 'keys' method on a dictionary or a dbm returns a list of the keys, and it includes both the original and the newly added entries. If we store a new name with the same short name, the old name will disappear. We could improve the code to prevent that. Let us modify the add_friend method in the loans class:

    def add_friend(self, short_name, name, phone=None, email=None):
        if short_name in self.friends:
            return 'Sorry. The short name exists'
        self.friends[short_name] = friend(name, phone, email)
        return short_name + ' added'

The expression 'short_name in self.friends' evaluates to True if short_name is a key in the dictionary or dbm. The one-line method has been replaced by four lines. The quality of a program depends on how well it handles exceptional cases, and handling them takes a lot of effort. We can verify that the code works as intended. We will return to exceptions in a later article.
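One way to carry out that verification is sketched below. To keep it self-contained, a plain dictionary stands in for the shelf and a free function (a hypothetical rewrite of the method, storing only the name) stands in for the loans class; the 'in' test behaves the same way on both.

```python
def add_friend(friends, short_name, name):
    """Add a friend unless the short name is already taken."""
    if short_name in friends:
        return 'Sorry. The short name exists'
    friends[short_name] = name
    return short_name + ' added'

friends = {}  # a plain dict standing in for the shelf

first = add_friend(friends, 'ms', 'Meena Shanbhag')
second = add_friend(friends, 'ms', 'Someone Else')  # same short name again

print(first)          # ms added
print(second)         # Sorry. The short name exists
print(friends['ms'])  # Meena Shanbhag
```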

Customising the Description of an Object

When we loan an item, we want to see more information about the borrower. Python classes can define a very useful method, '__repr__'. The best way to know what it does is to see it in action. Let us add a __repr__ method to the friend and item classes:

    class friend:
        # __init__ stays as before
        def __repr__(self):
            return self.name + ', Email ' + str(self.email) + ', Phone ' + str(self.phone)

    class item:
        # __init__, loan_item and return_item stay as before
        def __repr__(self):
            return self.item_name

Now let us modify the 'loan_item' method in the loans class:

    def loan_item(self, item_code, short_name):
        it = self.items[item_code]
        it.loan_item(short_name)
        self.items[item_code] = it
        return str(it) + ' loaned to ' + str(self.friends[short_name])

We have changed only the return line. Since neither class defines a '__str__' method, str(it) returns the string produced by the __repr__ method of the item class. Similarly, str(self.friends[short_name]) returns the string produced by the __repr__ method of the friend class. (You may wish to compare it to the info_friend method in last month's article.) So, if we now loan the item 'office' to the friend 'mms', we will get:

    >>> print the_loans.loan_item('office', 'mms')
    The Office Space loaned to Man Mohan Singh, Email None, Phone None
    >>> the_loans.close()

We have used the print statement because it would be needed to display the returned message if the code were run from a file rather than typed at the interactive prompt. The call to close before we exit is essential to ensure that the new loan is saved.
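The effect of __repr__ can also be seen in isolation. The sketch below is for Python 3 and uses a trimmed-down class named Friend (capitalised here only to mark it as a stand-in for the article's friend class). It shows that str falls back to __repr__ when no __str__ method is defined, and that containers use __repr__ for their elements.

```python
class Friend:
    """Stand-in for the article's friend class."""

    def __init__(self, name, phone=None, email=None):
        self.name = name
        self.phone = phone
        self.email = email

    def __repr__(self):
        return self.name + ', Email ' + str(self.email) + ', Phone ' + str(self.phone)

meena = Friend('Meena Shanbhag')

as_str = str(meena)     # no __str__ defined, so __repr__ is used
as_repr = repr(meena)
in_list = str([meena])  # a list shows the repr of each element

print(as_str)   # Meena Shanbhag, Email None, Phone None
print(in_list)  # [Meena Shanbhag, Email None, Phone None]
```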

Selecting only the Desired Data

I want to know which of my items with 'space' in the title have been loaned, and to whom. So, we define a method get_borrowers in the loans class as follows:

    def get_borrowers(self, keyword):
        borrowers = {}
        for item_code in self.items:
            an_item = self.items[item_code]
            if an_item.borrower and keyword.lower() in an_item.item_name.lower():
                borrowers[an_item.item_name] = str(self.friends[an_item.borrower])
        return borrowers

This method expects a parameter, keyword. The method finds all the items for which a borrower exists and the keyword occurs in the item name. We convert both the keyword and the name to lower case for a case-insensitive comparison. We return a dictionary of item name and borrower information pairs. Let us test this method:

    >>> from loans_v2 import *
    >>> the_loans = loans('friends.db', 'items.db')
    >>> the_loans.get_borrowers('space')
    {'The Office Space': 'Man Mohan Singh, Email None, Phone None', '2001, A Space Odyssey': 'Meena Shanbhag, Email None, Phone None'}

Python strings have a wealth of operations – try help('str'). Much of the power of computers comes not from computation but from their ability to manipulate strings, which makes them useful in fields as diverse as linguistics, genetics and search engines.
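A few of these operations are sketched below, including the case-insensitive test used in get_borrowers. The title 'Gandhi' is a made-up extra entry, added only so that the filter has something to reject.

```python
titles = ['The Office Space', '2001, A Space Odyssey', 'Gandhi']
keyword = 'SPACE'

# Lower-casing both sides gives a case-insensitive substring test.
matches = [t for t in titles if keyword.lower() in t.lower()]

# Some other string methods: split into words, test a prefix, replace text.
words = 'The Office Space'.split()
starts_with_the = 'The Office Space'.startswith('The')
renamed = 'The Office Space'.replace('Office', 'Outer')

print(matches)          # ['The Office Space', '2001, A Space Odyssey']
print(words)            # ['The', 'Office', 'Space']
print(starts_with_the)  # True
print(renamed)          # The Outer Space
```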

Summary

Using a simple database in Python is no harder than using a dictionary. We refactored the earlier version to make better use of the simple databases available.

We explored some capabilities of strings for selecting information. If we need to search for more complex patterns, we can use the 're' module, which supports regular expressions. The best place to start if you want to know more about regular expressions is, not surprisingly, Wikipedia.
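As a small taste of what 're' adds, the sketch below matches 'space' only as a whole word, which a plain substring test cannot do. The title 'Spaceballs' is a hypothetical entry included to show the difference.

```python
import re

titles = ['The Office Space', '2001, A Space Odyssey', 'Spaceballs']

# \b marks a word boundary; IGNORECASE makes the match case-insensitive.
pattern = re.compile(r'\bspace\b', re.IGNORECASE)

whole_word = [t for t in titles if pattern.search(t)]
substring = [t for t in titles if 'space' in t.lower()]

print(whole_word)  # ['The Office Space', '2001, A Space Odyssey']
print(substring)   # all three titles
```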

If a user cannot intuitively deal with our application, it is destined to remain unused. A convenient user interface is graphical. So, in the next instalment, we will explore adding a GUI to the work done so far.

