5 : Integrating with OpenOffice

The spreadsheet is the bread and butter of financial wizards. If you could just get the data into a spreadsheet, you could be empowered to explore the data in powerful ways.

Another angle is that if you can move your data to a spreadsheet, you do not have to write a gui. It is already written for you. Why reinvent the wheel.

If only it weren't so hard to get started! The problem is that OpenOffice is very powerful and all its features are exposed to the developer. May be with OpenOffice 3, extending OpenOffice for simple, common tasks will become easier. For now, we will take the getting started code on faith.

Getting Started

The first step is that OpenOffice must be started with the option to listen to requests from other programs. So, on the command line,

$ ooffice -invisible '-accept=socket,host=localhost,port=2002;urp;'

Now, we want to open a spreadsheet. We will need to connect to the open office we have started and send a request to open a spreadsheet. The “invisible” option in starting OpenOffice and the “Hidden” option below ensure that the work is done in the background. (Neither is essential.) Insert the following code in a file “open_office.py”:

import uno

from com.sun.star.beans import PropertyValue

def oo_calc(uri='private:factory/scalc'): # if no URI is passed, open a blank calc document

localContext = uno.getComponentContext()

resolver = localContext.ServiceManager.createInstanceWithContext(

'com.sun.star.bridge.UnoUrlResolver', localContext)

ctx = resolver.resolve('uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext')

p=PropertyValue()

p.Name = 'Hidden'

p.Value = True

properties=(p,)

desktop = ctx.ServiceManager.createInstance('com.sun.star.frame.Desktop')

return desktop.loadComponentFromURL(uri, '_blank', 0, properties)

Since everything is invisible, how do you tell that you have succeeded? Well, change the 'p.Value' to 'False' while you are testing. Now, start the python interpreter and run the following commands.

>>> from open_office import *

>>> calc = oo_calc()

If the OpenOffice program is running and all is well, you will have a new calc document.

Working with the Spreadsheet

A new calc document will consist of 3 sheets by default. You will load the data of friends in the first sheet and the loaned items data in the second sheet. Before doing that, add the functions to select a sheet. So, add the following code in 'open_office.py'.

def get_sheet(index, workbook):

""" Return the desired sheet. If the index is wrong, return the 0th sheet"""

sheets = workbook.getSheets()

if index >= sheets.getCount():

print "Warning: Index out of Range. Returning 1st Sheet"

index = 0

return sheets.getByIndex(index)

You can now test this code. Start the OpenOffice with options as above. Now, start the python interpreter and give the commands:

>>>from open_office import *

>>>calc = oo_calc()

>>>sheet = get_sheet(0, calc)

>>>sheet.setName('Friends')

>>>dir(sheet)

The sheet would now have a new name. The 'dir' command will give you information about all the attributes and methods available for the object, 'sheet'. As we had mentioned above, there is a lot of power available to the programmer.

You are interested in changing the values of the cells in the sheet. The method 'getCellByPosition' is promising. Google helps in finding the details of the parameters if you search for 'pyuno getCellByPosition'.

>>>cell = sheet.getCellByPosition(1,2)

>>>dir(cell)

Now, you can see the various options available with the cell object. The methods 'setString' and 'setValue' look promising.

>>>cell.setString('The Price')

>>>cell2 = sheet.getCellByPosition(1, 3)

>>>cell2.setValue(99.95)

You can explore and discover that the names are meaningful - one method for storing text and the other for numeric values. Since Python uses dynamic typing, you can now add a convenience function in 'open_office.py' to store data in a cell depending on the type of data.

def set_cell(sheet, row, col, data):

xCell = sheet.getCellByPosition(col, row)

if type(data) in (type(str()), type(unicode())) : # String/Unicode type?

xCell.setString(data)

else: # assume a numeric value

try:

xCell.setValue(data)

except: # ignore cells with invalid data

print 'Invalid data ', data

Currently, you can assume that the data is either numeric or string. OpenOffice works with unicode strings. So, you will need to handle unicode strings as well. You will also need to handle exceptions because some fields may have no values. You can test your code as before:

>>>from open_office import *

>>>calc = oo_calc()

>>>sheet = get_sheet(0,calc)

>>>sheet.setName('Active')

>>>set_cell(sheet, 0,0, 'This is it')

>>>set_cell(sheet, 1,1, "99.99")

>>>set_cell(sheet, 1,2, 99.99)

You will want to save the work. So, you need to add a function to save the file and will want to allow overwriting an existing file.

def save_document(calc,filename,overwrite):

p = PropertyValue()

p.Name = 'Overwrite'

p.Value = overwrite

properties = (p,)

calc.storeAsURL('file://' + filename, properties)

You can test the method loading data into the spreadsheet as above, saving the file in your home folder and closing (disposing off) the document.

>>>save_document(calc, '~/workbook.ods', True)

>>>calc.dispose()

If you repeat this test, the file will be replaced.

Database to the Spreadsheet

You should reuse the 'loans_v2.py' created earlier. The minimal code needed for this exercise is given below:

import shelve

class friend:

def __init__(self, name, phone=None, email=None):

self.name = name

self.phone = phone

self.email = email

class item:

def __init__(self, item_name, borrower = None):

self.item_name = item_name

self.borrower = borrower

class loans:

def __init__(self, friends_db, items_db):

self.friends = shelve.open(friends_db)

self.items = shelve.open(items_db)

def close(self):

self.friends.close()

self.items.close()

You will make use the two files and write your application code in 'db2oo.py'. The 'friends.db' and 'items.db' should contain the data as discussed in an earlier article. If they do not exist, your spreadsheet will contain only column headings and you can add the data. You will need to create the loans object and the calc document:

from open_office import *

from loans_v2 import *

my_loans = loans('friends.db','items.db')

calc = oo_calc()

You can add a utility function to insert column values into a row:

def add_row(sheet,row, row_data):

col = 0

for data in row_data:

set_cell(sheet, row, col, data)

col += 1

Now, you can insert the data from the friends database into the first sheet:

sheet = get_sheet(0, calc)

sheet.setName('Friends')

add_row(sheet, 0, ['Key', 'Name', 'Phone', 'Email']) # column headings

row = 1

for key in my_loans.friends:

f = my_loans.friends[key]

add_row(sheet, row, [key, f.name, f.phone, f.email]) # add the key and attributes

row += 1

It would be nice to have the column widths of the proper size. So, let us add a utility function, which takes the sheet and the number of columns at parameters, and call it:

def optimise_column_widths(sheet,n):

columns = sheet.getColumns()

for col in range(n):

column = columns.getByIndex(col)

column.setPropertyValue('OptimalWidth',True)

optimise_column_widths(sheet, 4)

Now, we repeat similar code for the items loaned.

sheet = get_sheet(1,calc)

sheet.setName('Items')

add_row(sheet, 0, ('Code', 'Item Name', 'Borrower'))

row = 1

for key in my_loans.items:

it = my_loans.items[key]

add_row(sheet, row, [key, it.item_name, it.borrower])

row += 1

optimise_column_widths(sheet, 3)


Finally, we save the file and close the databases.

save_document(calc, '~/workbook.ods', True)

calc.dispose()

my_loans.close()

Now, you can modify and change the data in the spreadsheet by using OpenOffice and when satisfied, reverse the process.

From Spreadsheet to the Databases

You can start with making changes to the 'open_office.py' file. You will need a function to get the data from a cell.

def get_cell(sheet, row, col):

EMPTY = uno.Enum("com.sun.star.table.CellContentType", "EMPTY")

TEXT = uno.Enum("com.sun.star.table.CellContentType", "TEXT")

FORMULA = uno.Enum("com.sun.star.table.CellContentType", "FORMULA")

VALUE = uno.Enum("com.sun.star.table.CellContentType", "VALUE")

xcell = sheet.getCellByPosition(col,row)

dataType = xcell.getType()

if dataType == EMPTY:

return None

elif dataType == VALUE:

return xcell.getValue()

else:

return xcell.getString()

The type property of a cell tells you about the contents of a cell and then you can extract the data using the appropriate method. Your function will not handle the formulas properly. Interpretation of numeric data is controlled by formatting information. E.g. is the content of a cell a date? You need not handle such complexity at present.

Now, you can write the following code in 'oo2db.py':

from open_office import *

from loans_v2 import *

my_loans = loans('friends.db','items.db')

calc = oo_calc(uri='file://~/workbook.ods')

sheet = get_sheet(0,calc)

row = 1

while True:

key = get_cell(sheet,row, 0)

if key == None:

break

my_loans.friends[str(key)] = friend(get_cell(sheet,row,1), get_cell(sheet,row, 2), get_cell(sheet, row, 3))

row += 1

sheet = get_sheet(1,calc)

row = 1

while True:

key = get_cell(sheet,row, 0)

if key == None:

break

my_loans.items[str(key)] = item(get_cell(sheet,row,1), get_cell(sheet,row, 2))

row += 1

calc.dispose()

my_loans.close()

This code will read rows on a sheet from an existing document until it finds a an empty key column. The key has to be a string so the unicode value returned from the spreadsheet is forced to a string.

You can verify that there is a bug in this code. A row from a database cannot be deleted. If we change the key, an additional row is inserted in the database. If you delete a row from the spreadsheet, the original row still remains in the database. The program should also ensure that the data on the second sheet is consistent with the data on the first sheet before you store it in the database. It doesn't do that at present. You can practice making the program better.

You can know how to do more complex programming with OpenOffice by using the api documentation (http://api.openoffice.org/) along with the PyUNO bridge documentation (http://udk.openoffice.org/python/python-bridge.html). However, you may find it easier to get a working solution by using Google search with keywords 'pyuno' and a suitable keyword from the result of 'dir' on a pyUno object as mentioned earlier.

Finally, let's face it. This solution is clumsy. You do not expect a user to start the oofice in a shell in network mode. Then run the db2oo.py application. After that, modify the spreadsheet document just created. And finally, move the data back into the database using oo2db.py application.

What you need is an application which will manage the flow of the various applications. Python is a great glue language. So, next time, you will learn how to use Python to communicate with the OS and create an integrated work flow.



<Prev>  <Next>

Comments