Optimizing Database Transaction Tests

June 19, 2008

Using the profiler, Jeff and I discovered, as expected, that the unit tests were slow because of their setup. Each test fixture set up the in-memory database and populated it with some values. The mechanics of this involve a lot more than one would expect: a mock environment is created, a test database is created, system calls are invoked (which, as we know, take a long time), and much else besides.

If the database and environment were set up only once before running all the tests, that would provide a significant speedup.

We know that nose has hooks like setup_module, setup_class, etc. that get run once at their respective levels. The class setup function runs first, then takes a seat while all the tests run. So we can set up the database and mock environment in that function.

But what about classes whose tests don't just query the database, but create and delete records as well? My solution was to create a savepoint using Alchemy before each test, then do a rollback after the test.

A First Solution
import unittest
from elixir import objectstore, session

class MyTestCase(unittest.TestCase):
    @classmethod
    def setup_class(cls):
        <setup>

    def setUp(self):
        session.begin()

    def tearDown(self):
        objectstore.clear()
        session.rollback()

    <Test Cases>

But there was a problem with this. If somewhere in the middle of a test I expected an exception to be raised, the exception would knock the session out of the sub-transaction I had set up, and the rest of the test function would then run outside the sub-transaction, messing up the database for other tests. Alchemy gives you an error telling you to close the session.

A Second Solution
import unittest
from elixir import objectstore, session

class MyTestCase(unittest.TestCase):
    @classmethod
    def setup_class(cls):
        <setup>

    def setUp(self):
        session.begin()

    def tearDown(self):
        objectstore.clear()
        session.rollback()

    def test_x(self):
        self.assertRaises(<Exception>, <callable>, <args>, ...)
        session.close()

    <Other Test Cases>

I knew this was bad, but I loved the speedup so I tried to convince Jeff to allow it. Deep down, though, I knew it's horrible to make the programmer worry about the fixture inside the test case function.

So this was my solution: I’ll create an intermediate class between unittest.TestCase and the test class. It’ll override the assertRaises function and close the session if the expected exception is raised.

A Working Solution
import unittest
from elixir import objectstore, session

class TestCase(unittest.TestCase):

    def setup_func(self):
        pass

    def teardown_func(self):
        pass

    def setUp(self):
        session.begin()
        self.setup_func()

    def tearDown(self):
        objectstore.clear()
        session.rollback()
        self.teardown_func()

    def failUnlessRaises(self, excClass, callableObj, *args, **kwargs):
        try:
            callableObj(*args, **kwargs)
        except excClass:
            session.close()
            return
        else:
            if hasattr(excClass, '__name__'):
                excName = excClass.__name__
            else:
                excName = str(excClass)
            raise self.failureException("%s not raised." % excName)

    assertRaises = failUnlessRaises

------

from drproject.test import TestCase

class MyTestCase(TestCase):
    @classmethod
    def setup_class(cls):
        <setup fixture>

    def setup_func(self):
        <setup fixture>

    <Test Cases>

But Jeff made a good point: we want to be able to use the existing API, not our own functions called setup_func or whatever. Instead of going through the details of how I arrived at it, here is the solution itself.

The Final Solution
import unittest
from elixir import objectstore, session

class TestCase(unittest.TestCase):

    def __init__(self, methodName='runTest'):
        unittest.TestCase.__init__(self, methodName)

        old_setup = self.setUp
        def our_setup():
            session.begin()
            old_setup()
        self.setUp = our_setup

        old_teardown = self.tearDown
        def our_teardown():
            objectstore.clear()
            session.rollback()
            old_teardown()
        self.tearDown = our_teardown

    def failUnlessRaises(self, excClass, callableObj, *args, **kwargs):
        try:
            callableObj(*args, **kwargs)
        except excClass:
            session.close()
            return
        else:
            if hasattr(excClass, '__name__'):
                excName = excClass.__name__
            else:
                excName = str(excClass)
            raise self.failureException("%s not raised." % excName)

    assertRaises = failUnlessRaises

------

from drproject.test import TestCase

class MyTestCase(TestCase):
    @classmethod
    def setup_class(cls):
        <setup fixture>

    def setUp(self):
        <setup fixture>

    <Test Cases>

Based on the current number of tests, their resource usage, and this computer, the speedup was 5-fold. They’re lightning fast now!


NEWT Update

June 18, 2008

The Database Schema
Here is (probably) the final version of the ticketing system's new database schema, barring any difficulties in developing the UI.

Files in the New System
The following is a cheap diagram giving a general idea of which files are associated with which functionality on a DrP portal.

The red files are core files. The files under template/ directories have been excluded because it's obvious what they do, and the other files have poor-quality images beside them describing which screens they're associated with. Of course, the admin system has functionality for a lot more than just tickets, so those images have been excluded.

Progress and Next Steps
I have finished:

  1. initial design and functionality decisions
  2. database schema design
  3. back end implementation and core functions
  4. top layer functions in drproject/ticket
  5. testing of core model and top layer functions
  6. fixing whatever newt broke in drproject; this includes email notifications, users’ per-project statistics, and some admin system functions
  7. using profiler to optimize code and tests
  8. using coverage analysis to ensure tests are thorough (this is a gray area but I consider this done)
  9. ensuring code style is compliant with PEP 8

Now I have to:

  1. include functions in the admin system to create ticket fields, edit, delete, etc
  2. test those functions via unittest
  3. move on to build/modify the interface

I anticipate 1 and 2 will collectively take about a week, and 3 will (and might as well) take the entire rest of my time here.

Supplemental

June 16, 2008

Previewing Ticket Changes
So I solved the problem of name collisions easily by assigning the fields to a dictionary.
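A minimal sketch of that dictionary fix (the "ticket_fields" key is a hypothetical name): ticket fields live under their own key instead of at the top level of req.data, so a user-defined field can't shadow a core entry.

```python
# Sketch of namespacing ticket fields in their own dictionary so a
# user-defined field named "action" can't shadow the core "action" entry.
req_data = {"action": "view"}                     # existing non-field entry
req_data["ticket_fields"] = {"action": "custom"}  # user-defined ticket field

# Both values survive side by side.
assert req_data["action"] == "view"
assert req_data["ticket_fields"]["action"] == "custom"
```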

web_ui.py is actually coming along now. The problem with assigning objects, as I mentioned before, has also been solved, albeit in a roundabout way. More specifically, the problem is what happens when a user decides to preview a new ticket or changes to an existing one. In the previous version, the ticket was created/updated but no changes were flushed to the database. If it was a preview, the ticket (and its associated milestone) were expunged, so the database saw no change while the request object saw the newly updated ticket. Everyone's happy. But in the new version, creating a ticket is very involved: you have to update it with all the existing fields, many of which may be foreign keys. So you can't just expunge a ticket. Here's my roundabout solution (better ones are welcome with open arms):

When a user opts to create a new ticket, the ticket is created in the database. Necessary information is then passed back to the request object, and the ticket is deleted if it was a preview.

When a user opts to update a ticket, the changes are applied in the database and the necessary data is passed to the request object. If it was a preview, we do a rollback and assign the old values of the ticket fields.
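A hypothetical sketch of the update/preview path, with plain dicts standing in for the database and the request (names like update_ticket are mine, not DrProject's):

```python
# Hypothetical sketch: apply changes, capture the data for rendering,
# then restore the old values (standing in for a DB rollback) on preview.
def update_ticket(ticket, changes, preview):
    old = {k: ticket[k] for k in changes}  # remember old values
    ticket.update(changes)                 # apply in the "database"
    rendered = dict(ticket)                # data handed to the request object
    if preview:
        ticket.update(old)                 # undo: the rollback step
    return rendered

ticket = {"summary": "old", "priority": "low"}
shown = update_ticket(ticket, {"summary": "new"}, preview=True)
# The preview shows the new value, but the stored ticket is unchanged.
assert shown["summary"] == "new"
assert ticket["summary"] == "old"
```

Clicking preview repeatedly is safe here because each call starts from the restored state, which is exactly the case I still need to test for real.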

I have to test what happens when the user clicks preview multiple times one after the other.

June 16

June 16, 2008

Working from home
It’s about 12:30 right now and I’m working from home because I’m sick. I started work at about 10:00 this morning. Although I’m feeling well enough to think about DrP problems, I don’t think I would’ve been able to give up the comfort of my home and actually travel to work.

A Problem Solved
I don't work or think about work over the weekend, so the ticketing system was on hold with a certain problem. The HTML sends a request object to Python. The object has many fields, including two dictionaries, req.data and req.args. The names of the fields are in req.args, and their values are stored in different ways (e.g. strings, integers). When I was extracting the values of the fields, sometimes I was getting strings which I could parse (by sending them through the parse functions for the data types), and sometimes they were coming out as objects like Enums (which I couldn't really send to any parse function).

This caused a problem since the Ticket.setvalue function takes strings. I could try converting the objects I was getting to their string representations, but that turned out to be pretty much impossible (and besides, I would be converting objects to strings and back to objects).

So what I did was circumvent the setvalue() function altogether. The setvalue() function changes the value of a ticket field by assigning an object to it, then it creates a TicketChange instance to log the change. So I created a function that assigns the fields from the request object to a ticket all at once and returns a dict of changes. I then used that dict to log the changes.
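A hypothetical sketch of that bulk assignment, with plain dicts standing in for the ticket and the request (assign_fields is my name for it, not the actual DrProject function):

```python
# Hypothetical sketch: assign all request fields to a ticket at once and
# return a dict of changes so they can be logged afterwards.
def assign_fields(ticket_values, request_args, field_names):
    """Update ticket_values in place; return {field: (old, new)} for changes."""
    changes = {}
    for name in field_names:
        if name not in request_args:
            continue
        new = request_args[name]
        old = ticket_values.get(name)
        if new != old:
            ticket_values[name] = new
            changes[name] = (old, new)
    return changes

ticket = {"summary": "old summary", "priority": "low"}
req_args = {"summary": "new summary", "priority": "low"}
changes = assign_fields(ticket, req_args, ["summary", "priority"])
# Only the field that actually changed is recorded for the change log.
assert changes == {"summary": ("old summary", "new summary")}
```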

Name Collisions
I still need to find a place in the req object to put the values of ticket fields. In the current situation, users can create fields called "action", for example. This will collide with the existing req.data["action"], which is not a ticket field. I don't think the problem is that serious right now, but it is something I have to look into.

Adopting a DrP Component
Since I adopted the DrP ticketing system, I’ve had the overwhelming feeling that Ticket is now mine! I.e. even though it’s open source stuff, ticket is now in my branch, it’s mine, I will do with it as I please, and no one else can touch it until it’s merged into trunk. I am above the law and free to ignore style conventions etc. When it’s merged into trunk, people may change my style to comply with conventions and I can’t stop them, but until then it’s my world and I’m boss.

That’s bad, right?

10098

June 13, 2008

It seems the following (Python) files contain work that needs to be done once I finish newt:

  • drproject/admin/api.py
  • drproject/userpages/tests/test_api.py
  • drproject/userpages/api.py

That’s a lot fewer than I thought, but I may be incorrect and in for a surprise; you never know.

I’m about half a week ahead of schedule.

My Challenges and Their Resolutions

June 13, 2008

Setting up DrProject on Windows
Problem: I need to use a screen magnifier which has drivers only for Windows.

Solution: So in the first week of my job I spent time modifying the DrProject install and documenting instructions on installing DrProject on Windows. Here’s a more permanent link.

Tools for Developing DrProject
Problem: DrProject is larger than most software code bases I’ve worked with and some big-boy tools were in order.

Solution: Here are the tools I use:

Shortcuts
Problem: Even with the given tools, you need ways to do things quickly.

Solution: I’m working on the ticketing system for DrProject so I’ve added the following line to my ~/.bashrc in Cygwin so I’m where I want to be as soon as I open Cygwin:

cd /DrProject/drproject/ticket/tests

I've set up aliases in ~/.bashrc as follows:

alias tst='nosetests'
    (-v for verbose output which is often useful)
    (follow tst with the file name to run a particular test file only)
    (follow the filename with :<ClassName>.<test_function> to run a particular test)
alias profile='tst --with-profile --profile-stats-file=<output filename>'
    (here’s how to analyze the data in the output file)
alias setup='/setup.sh'
alias init='/init.sh'
alias go='/go.sh'
    (these aliases point to shell scripts which record the CWD so I can setup DrProject from any directory)
    (here is an example of the contents of one of these scripts; this could be done in other ways)
    #!/bin/sh
    DIR=`pwd`
    cd /DrProject/
    python setup.py develop
    cd $DIR
alias flake='c:/cygwin/Pyflakes/bin/pyflakes'

I find it very useful to fold and unfold functions and classes in Eclipse, which can be done using CTRL+9 and CTRL+0.

I also have a macro for IPython that sets up a fixture so I can immediately start using the DrProject code there. Using TAB for completion is very useful and can shed light on available functions. A "?" following a class name or function name gives the docstring, and "??" gives the code if available. Here's my macro as an example:

from datetime import *
from drproject import *
from drproject.api import *
from drproject.db.util import flush
from drproject.scripting import *
from drproject.ticket.model import *
from drproject.ticket.api import *
from drproject.ticket.types import *
from drproject.ticket.roadmap import *
from elixir import *
from sqlalchemy import *

env = Environment()
env.configure('/cygwin/DrProject/hacking/drp_root/')
setup_all()

p = Project.query.first()

t1 = TicketSystem().add_ticket(p, 'me1')
t2 = TicketSystem().add_ticket(p, 'me2')
t3 = TicketSystem().add_ticket(p, 'me3')
t4 = TicketSystem().add_ticket(p, 'me4')
t5 = TicketSystem().add_ticket(p, 'me5')
t6 = TicketSystem().add_ticket(p, 'me6')
t7 = TicketSystem().add_ticket(p, 'me7')
t8 = TicketSystem().add_ticket(p, 'me8')
t9 = TicketSystem().add_ticket(p, 'me9')

It can be created using %macro <macro name> <IPython line numbers with the instructions to include; these can be given sequentially with spaces between them or as a range using -; e.g. 1 2 3 4-10>.

Then you can save the macro using store <macro name>, and edit it in Notepad using ed <macro name>; don't forget to store it again when you change it.

Managing Imports
Problem: Imports can sometimes cause problems.

Solution: Try to avoid from X import * because you want to be explicit in what imports you’re using, and importing everything may lead to cyclic imports and other problems. Use PyFlakes to help with determining which imports are being used, which are unused, etc.

If you get an error like "cannot import X", you may have a cyclic import. If the cycle cannot be avoided, you can move the import in one of the files from the top of the file into the function that uses it.

When using Elixir, you can create ManyToOne, OneToMany, etc relationships. When you do this, you need to pass in the name of the class which represents the table being referenced. E.g. f = ManyToOne('Y', primary_key=True). Even though “Y” is a string here, that class still needs to be imported.
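A plain-Python analog (hypothetical, not Elixir's actual internals) of why the class named by that string must still exist: the string is resolved against a registry of defined entities, so an unimported class simply isn't there to be found.

```python
# Hypothetical analog of Elixir's name resolution -- NOT its real internals.
# A metaclass registers every Entity subclass by name; a string target can
# only be resolved if the class was actually defined or imported.
registry = {}

class EntityMeta(type):
    def __init__(cls, name, bases, ns):
        super().__init__(name, bases, ns)
        registry[name] = cls

class Entity(metaclass=EntityMeta):
    pass

class Milestone(Entity):
    pass

def many_to_one(target_name):
    # Raises KeyError if the named class was never defined/imported.
    return registry[target_name]
```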

Merging Trunk into your Branch
Problem: SVN's built-in merging is painful.

Solution: Use the svnmerge script.

Libraries that DrProject Uses
Problem: DrProject uses a lot of libraries and third-party stuff; learning to use all this is very necessary since these tools make life a lot easier.

Solution: DrProject uses at least the following:

  • the Python tutorial
  • SQLAlchemy for interacting with the database in an object oriented way
  • Elixir for creating the database schema using Python classes
  • Dojo as a layer on top of JavaScript; here are some demos
  • Kid to send Python objects back and forth between Python and HTML files and resolve Python code in HTML

Making Tests Run Faster
Problem: Some tests were just too slow.

Solution: In query.py, all I needed was to create the database once, since I wasn't doing any updates (just queries). But the setUp() function gets run before each test in unittest.TestCase. You can replace the setUp() function signature with this:

    @classmethod
    def setup_class(cls):
        ...

Eclipse will no longer be able to run the tests correctly, but you can use nose to run them and it’ll run the setup only once per class (not per test method). There is a similar function for module level setup.

Defining Default behaviour in Elixir
Problem: Elixir lets you set many options (including table names) using the using_options and using_table_options functions. E.g.

class MilestoneChange(Entity):
    milestone = ManyToOne('Milestone', primary_key=True)
    time = elixir.Field(DateTime, default=datetime.now, primary_key=True)
    ...

    using_options(tablename='milestone_change', order_by=['time'])

But some options persist for each class and you don’t want to declare them each time.

Solution: You can import and use (at the top of the file outside each class) the function options_defaults. E.g.

    options_defaults['shortnames']=True
    options_defaults['inheritance']='multi'

Database Polymorphism
Problem: Sometimes you want classes which extend Elixir's Entity class (and hence are part of the database schema) to use polymorphism, inheritance, all that good stuff.

Solution: Simply extend the parent class and remember to set the 'multi' inheritance option. The parent class will automatically be given a column called 'row_type' which it uses to differentiate between its descendants.

You can have many layers of inheritance. E.g.

class Field(Entity):
    details = ManyToOne('FieldDescription')
    ticket = ManyToOne('Ticket')

    def parse(self, val):
       ...

    def html(self):
       ...

    def __unicode__(self):
       ...

class PrimitiveField(Field):
    pass

class ComplexField(Field):
    def get_filter_col(self):
       ...

class WikiField(PrimitiveField):
    value = elixir.Field(UnicodeText)

    def parse(self, val):
        ...

    def html(self):
       ...

    def __unicode__(self):
       ...

class TicketField(ComplexField):
    value = ManyToOne('Ticket')
    filter_col = 'id'

    def parse(self, val):
        ...

    def __unicode__(self):
       ...

    def get_filter_col(self):
       ...

Querying Polymorphic Types
Problem: If I had a variable f of type Field and I knew that it was more specifically of type WikiField, I couldn't just assume that; the queries wouldn't work.

Solution: When dealing with polymorphism, use Alchemy's of_type(), with_parent(), and any() functions. Alchemy will do a join with the appropriate tables. E.g.

Assume the Ticket class has a relation called "fields" of type Field and I want THE WikiField named X with value Y:

        Ticket.fields.of_type(WikiField).any(and_(WikiField.name == X, WikiField.value == Y))

If I want ANY field of type WikiField with value Y:

        Ticket.fields.of_type(WikiField).any(WikiField.value == Y)

Getting Specific Module Members
Problem: I wanted all classes that extended Field in a dictionary that maps the name of the class to the class itself; e.g. 'WikiField' => WikiField.

Solution:
from inspect import isclass
def get_field_classes(cls, members=locals()):
    return dict((c.__name__, c) for c in filter(isclass, members.values())
                if issubclass(c, cls) and c is not cls)

Notice the named parameter "members" is assigned locals(). This is done because locals() within the function is different from locals() outside of it: default values are evaluated once, where the function is defined, so the default captures that outer namespace.
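A usage sketch (the Field classes are stand-ins, and the namespace is passed explicitly here to keep the example deterministic; the blog's version defaults it to locals() captured at definition time):

```python
from inspect import isclass

# Hypothetical stand-ins for the real field hierarchy.
class Field: pass
class WikiField(Field): pass
class TicketField(Field): pass

def get_field_classes(cls, members):
    """Map class name -> class for strict subclasses of cls found in members."""
    return dict((c.__name__, c) for c in filter(isclass, members.values())
                if issubclass(c, cls) and c is not cls)

# Non-class entries and the base class itself are filtered out.
namespace = {"Field": Field, "WikiField": WikiField,
             "TicketField": TicketField, "other": 42}
mapping = get_field_classes(Field, namespace)
assert mapping == {"WikiField": WikiField, "TicketField": TicketField}
```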

Merging Dictionaries
Problem: Sometimes I have two Python dictionaries which I want to combine into one big one.

Solution: I use the following idiom to merge dict a and b

c = dict(a.items() + b.items())
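The items()-concatenation idiom is Python 2 only (in Python 3, items() views don't support +); an equivalent that works in both, with the same behavior that b's values win on key collisions:

```python
a = {"x": 1, "y": 2}
b = {"y": 20, "z": 3}

# Portable merge: copy a, then overlay b (b wins on collisions,
# just as in dict(a.items() + b.items())).
c = dict(a)
c.update(b)
assert c == {"x": 1, "y": 20, "z": 3}
```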

Shadowing
Problem: I wanted to call a class Field but Elixir already uses that name.

Solution: So I called my class Field and imported elixir. I then used elixir.Field for elixir's version.

Accessing Table Properties
Problem: I want to access information about a database table.

Solution:

  • list of column names: <Table>.c.keys()
  • a particular column object: <Table>.c.<column name>
  • the value of a column for a particular row: <table_instance>.<column name>
  • the table name: <Table>.table.name
  • the table properties: <Table>.table
  • pretty list of columns: <Table>._descriptor._columns
  • primary keys: <Table>._descriptor.primary_keys
  • natural ordering of the table: <Table>._descriptor.order_by
  • foreign keys: <Table>._descriptor.relationships
  • foreign key table object: <Table>._descriptor.relationships[<#X>].target

Tables with Non-Primitive Values (Foreign Keys)
Problem: If I wanted to dynamically get values from tables, it was fine so long as the `value` field I was looking for was a primitive data type. But if it was a foreign key, I didn’t know which of the columns of the foreign table to return (or do whatever with).

Solution: I added a variable (table column) to the class that held the reference. It was a string naming the column of the foreign table to use. Its parent class had a function which resolved this string and returned the appropriate foreign column object.

 I also added a __unicode__ function to the classes which returned the desired column from the foreign table. So unicode(X) where X is of type MilestoneField will return the name of the milestone which X is referencing.
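Here's a stripped-down, hypothetical sketch of the idea: no Elixir, plain classes, and __str__ standing in for the blog-era __unicode__. The class names and get_filter_value helper are mine.

```python
# Hypothetical sketch of filter_col: a complex field names which column
# of its foreign table to use, and a shared method resolves the string.
class Milestone:
    def __init__(self, name, due):
        self.name = name
        self.due = due

class ComplexField:
    filter_col = None  # subclasses say which foreign column to use

    def __init__(self, value):
        self.value = value  # the referenced foreign object

    def get_filter_value(self):
        # Resolve the string to the actual attribute of the foreign object.
        return getattr(self.value, self.filter_col)

    def __str__(self):  # standing in for __unicode__
        return str(self.get_filter_value())

class MilestoneField(ComplexField):
    filter_col = "name"

f = MilestoneField(Milestone("1.0 release", "2008-07-01"))
assert str(f) == "1.0 release"
```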

June 12

June 12, 2008

In yesterday’s all-hands meeting, Greg mentioned that documentation is hard. NEWT is quite complex (esp. its queries) and there needs to be some documentation. I’ve been thinking about this and I’ve decided that documentation is easy, but finding the right level of abstraction to document at is an art. And I think that’s what Greg meant.

June 11

June 12, 2008

The Problem
I was working on web_ui.py in which there was a function that takes values from an HTTP Request object and creates a new ticket accordingly. If the value received is for a field of a primitive type, then the task is simple. But if the value is for a complex field (i.e. a foreign key reference), then which column in the foreign table is the value for? If a user selects a particular milestone, is she supplying the name, the due date, the id…?

Jeff taught me a little bit about how the requests actually work: I can ask for one type of data, use it to generate other data, and return the generated result. So I can ask for a milestone name, use that to figure out the id, and return the id in the HTTP request object.

As I started to do this, I realized that the newt infrastructure was not really designed to implement some of the details behind this. So at the end of the day and after I went home, I started to refactor the data type classes a little bit to make the code cleaner and allow seamless queries.

The Solution

  • At the time, Field was an “abstract” class (if I can call it that in Python) and the data types inherited directly from it. But I needed to distinguish between data type classes that held primitive values and those that held f-keys. So I created another level of indirection: Field has descendants PrimitiveField and ComplexField, which in turn have as descendants the data type classes (such as IntegerField for the former, and MilestoneField for the latter).
  • Each data type class has a classmethod function used to parse the default value from Unicode to its data type. Since the function was “static”, it required the project name (and sometimes some other metadata) for data types with f-keys. I changed the function to non-classmethod so that an object could be passed in and the function could derive the project name from the instance.
    • I had to find all the places where this parse function was being used statically. It was being used in two situations: 1) creating a new field and assigning the default value to it, and 2) querying a field. For the first case, I created the field without a default value, I used the newly created field object to parse its own default value, then I assigned that value to the field. For the second case, I used the fetch method to retrieve an object then queried rows that matched that object.

Another Issue with Enums
Admins will be able to create Enums, so enums should be specific to a particular project. So I added a project field to Enums and made it a primary key. I also updated a lot of code to match this new schema.

Those Darn Enums
When I finally finished all this refactoring, I noticed that in query.py the function that decides which input type a field should have in the HTML (e.g. text, select, radio) was failing because of Enums. All data types require only the project name plus some other piece of information to identify them; enums, however, required a project name, a whatfor clause, and a name clause. So things were inconsistent.

To solve this, I enforced the constraint that an enum’s whatfor field must be the same as the name of the field for which it is being used. This means that an enum can be used for only one particular ticket field. E.g. if you want a (high, medium, low) enum, you can create it then use it for only one ticket field, no other field can share this enum. In practice, this’ll mean creating enums when you create the field they’re for. Jeff approves.

Profilers
Mid-yesterday, Jeff was trying to make tests run faster in DrP. The tests for query.py were very sluggish. I had created a class extending unittest.TestCase with an overridden setUp(), which would get run before each test. But these tests really only required the database to be set up once; after that, all they were doing were queries; nothing was being changed.

So I added a function called setup_class() which nose runs on the class level, not on the function level. There’s also one that can be run at the module level. But I didn’t get rid of setUp(). I was under the impression that nose would use only the setup_class() function and Eclipse would use only the setUp() function. I had it something like this:

class QueryTestCase(unittest.TestCase):
    def setUp(self):
        self.setup_class()

    @classmethod
    def setup_class(cls):
        <DB setup>...

I actually worked for weeks under the assumption that Eclipse would set up the DB for every test but nose would do it only once. Jeff showed me how to use the profiler in nose, and we discovered, as Jeff had suspected, that setUp() was being run on each test even by nose. It makes sense, of course: it overrides the function in TestCase (or wherever), so it gets run on each test. So I got rid of setUp() and kept the other one.

Now, however, I can run the tests using only nose, not Eclipse. Oh well.

Addendum: The 71 tests used to run in about 400s; now they run in less than 7s.

Aliases I Use
Here are some of the contents of my bashrc file:

alias ls='ls --color'
alias tst='nosetests'
alias profile='tst --with-profile --profile-stats-file=foo'
alias go='/go.sh'
alias setup='/setup.sh'
alias init='/init.sh'
alias flake='python c:/cygwin/Pyflakes/bin/pyflakes'
cd /DrProject/drproject/ticket/tests

where go.sh, setup.sh and init.sh are shell scripts that 1) record the CWD, 2) change into the appropriate dirs, 3) run their respective functions, and 4) change back to the saved CWD.

So I can set up DrP using `setup; init` and run a local portal using `go`. I can use the profiler with `profile <filename>` and run unit tests using `tst <-option> <filename>`.

How Querying Works in NEWT

June 10, 2008

June 10

June 10, 2008

I spent Friday, parts of the weekend, Monday, and much of today working on query.py. I was just hammering at it and I didn’t intend to stop until I was done… and now I’m done. This is the class that deals with viewing the tickets and querying them.

Before I started, query.py looked very well written, and parts of it actually were, but the more I dirtied my hands in the code, the more I realized that it resisted extension and inhibited change.

I changed a function called get_columns which returns the names of the columns that will be shown when viewing all the tickets. There were many hard coded values in there since the programmers who designed it were assuming that tickets would only have a static set of fields. In the TicketQuery constructor, I looked up all the fields associated with a ticket in the given project and used that list instead of hard coding values. I did, however, need to assume that some fields (such as reporter, date created, etc) existed, so I wrote an assertion to check that in the constructor.

Further on, there were a set of functions designed to join the necessary tables for getting information about associated milestones, projects, etc. There was a huge issue here; in the old system, the only foreign references a ticket had were to projects, milestones, and users. Each of these tables has a `name` field so the code explicitly referenced that foreign column by name (if it didn’t find it, an exception was raised). But I have references to attributes of the ticket table, fields with primitive data types (such as integers, dates, etc), and fields with references to other tables (such as milestones, users, tickets). Notice that some of these referenced tables (e.g. Ticket) don’t have a `name` attribute.

But each field data type does have a class associated with it, so my solution was to create a function, implemented by each data type class, that returns the column object of its referenced table to filter by. So, for example, if a ticket has a field of type milestone and a user wants to query by that field, the data type class returns the milestone’s `name` attribute as a column object, telling me that this is what the user wants to filter/query by. Querying for something that’s in the Ticket table is easy. Querying for a field that holds a primitive type is also easy. But querying for a field which holds an f-key was hard, and this was my solution to that problem.

In the future, developers of DrP may choose to allow querying on these foreign tables by more than one column (e.g. list all tickets whose X field (which is of type milestone, say) has name A and is due between dates B and C.) This will certainly not be the case for version 1.0, however.

I’ve learned that determining when you should stop adding features to a version is hard.

So that was the issue with filtering. There was a similar, smaller issue with ordering: which column of these foreign tables do you order by when the user asks to order by a field? Say a user asks to display all tickets ordered by some field called X which is of type milestone. Do you order by milestone name, due date, …? I chose to order by the first element in the natural ordering of that foreign table. This is a sensible solution, it’s very easy, and it’s how it used to be done in the old system (even the old system suffered from this dilemma).
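The fallback can be sketched in a few lines (the tables and their orderings here are hypothetical data, not the real schema):

```python
# Hypothetical sketch: each foreign table has a natural ordering, and
# ordering by a foreign-key field falls back to its first column.
natural_ordering = {
    "milestone": ["name", "due"],
    "ticket": ["id"],
}

def order_column(foreign_table):
    """Column used when the user orders results by a foreign-key field."""
    return natural_ordering[foreign_table][0]

assert order_column("milestone") == "name"
```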

Further on, I also added the ability to query using integer operations (e.g. less than, greater than, etc). That was much easier than I thought it would be.

The querying class is very complicated in terms of what the user can do (queries are like that by nature). So the tests (most of which already existed and some of which I added) number around 60-70. They are sol involved, in fact, that I’ve separated them into two files. I spent most of the time during my work on query.py fixing these tests, getting the code to run on them, etc. It was a huge headache but it’s over now. Don’t get me wrong, though; the tests were very well written in terms of abstraction. In terms of concise tests, however, they were quite dense and verbose.

During my changes and debugging sessions in query.py, I looked at some of the SQL generated by Alchemy during the joins that I had implemented. I noticed that there were duplicate joins… if a user were to AND or OR two fields of the same data type, the tables would be joined each time. To correct this issue – the consequence of which may have been extreme sluggishness in production – I constructed a set of unique tables in the execute function and joined them all together before even doing any filtering or ordering-by. This did eliminate duplicate joins, but Jeff didn’t like this since it spread the code that deals with these joins and lookups into two places (once in the execute function and once in the filtering functions).
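The dedup step I tried can be sketched as collecting each table once before any filtering (the field/table names here are hypothetical):

```python
# Sketch of eliminating duplicate joins: gather the unique set of tables
# referenced by all filter clauses before building the query.
filters = [
    ("WikiField", "status", "open"),   # two clauses on the same table...
    ("WikiField", "owner", "me"),
    ("TicketField", "blocks", "42"),
]

tables_to_join, seen = [], set()
for table, _col, _val in filters:
    if table not in seen:              # ...but it is joined only once
        seen.add(table)
        tables_to_join.append(table)
```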

Normally I yield to Jeff’s suggestions, but this time I didn’t. Granted, the code was a little more difficult to follow, but it would save potentially seconds over the course of a few queries. Not only that, but I didn’t agree that the code dealing with joins and lookups was now spread into two places; the execute function joined the tables and the filtering functions merely looked up column objects… I didn’t see the big deal. I understand that readability is probably the number one priority, but this didn’t hamper readability much, and its benefits were immense in my mind.

It turned out that the reason the tests weren’t passing at this point was that I was missing a clause in the Alchemy expression for joining tables. I was doing a join (on WikiField, for example), retrieving a column (WikiField.text, for example), then filtering that column. I didn’t bother to check whether that WikiField was the wiki field associated with the field I was searching for or just some other wiki field I didn’t even care about. To give a more concrete example, I had two FloatField fields in a test project. When I did a greater-than, less-than, or whatever query, I was getting incorrect results: Alchemy was matching EITHER of the two FloatFields, not the particular one I told it to.

So in order to correct this, I had to add an extra filter clause to the Alchemy expression. This required the name of the field I was searching under, which entailed reverting the joins back to their original location, i.e. the way Jeff wanted it!

Anyway, the last thing I did was consider the “Show full description under each result” checkbox. Currently, checking it makes the ticket description show up under the ticket summary when viewing. But there is no guarantee that there will be a field called “description” in the new system. And what if users want to see some of their own fields after checking it? So I added an attribute to the FieldDescription database table called “is_verbose”. If it is True, the field will show up when a user checks that checkbox. I also added the functionality for this, which was trivial.
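A sketch of how the flag might gate display (the field data and the verbose_fields helper are hypothetical):

```python
# Hypothetical sketch of the is_verbose flag: when "show full description"
# is checked, only flagged fields appear under each result.
fields = [
    {"name": "summary",     "is_verbose": False},
    {"name": "description", "is_verbose": True},
    {"name": "steps",       "is_verbose": True},
]

def verbose_fields(fields, checkbox_checked):
    """Names of the fields to display under each result."""
    if not checkbox_checked:
        return []
    return [f["name"] for f in fields if f["is_verbose"]]

assert verbose_fields(fields, True) == ["description", "steps"]
```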

Now I’m going to move on to the final Python file, web_ui.py. I didn’t spend all the time I had budgeted for query.py, so I can use some of it here, because this looks like it’s going to be a pain. Then I can move on to fix DrP (which shouldn’t take long at all), and then go ahead with the interface stuff.