I've developed a very simple webapp.RequestHandler subclass that does a couple of useful things that I need in my GAE app. They seem like pretty basic things that are worth sharing – though anyone could have done it themselves in five minutes.

class BBRequestHandler(webapp.RequestHandler):
  # Render a template with standard data mixed in.
  def render(self, template_name, template_data=None):
    # Use None as the default: a mutable {} default would be shared
    # between calls and quietly accumulate data across requests.
    if template_data is None:
      template_data = {}
    template_data.update(self.sundries())
    self.response.out.write(template.render('view/' + template_name, template_data))
  
  # Forward the request to a different handler. Assumes GET.
  def forward(self, handler_class):
    handler = handler_class()
    handler.initialize(self.request, self.response)
    handler.get()
    
  # Standard template data required for most pages.
  def sundries(self):
    return {'logout':users.create_logout_url('/')}

The intention is that sundries() should return whatever standard data your pages require that's not specific to individual pages. In my case that's data that's required for the header, footer and sidebar – like the logout URL.

You may notice that I'm not using the Python docstring format. I just can't quite bring myself to do that just yet, though I have got used to the indentation-is-block thing.

The theory is good with the new buttonless MacBook trackpads, but in practice I think it’s flawed. The whole very large surface clicks down when pushed, though it’s hinged at the far side, so it works best at the side closest to the user. So you ought to be able to use your thumb to click exactly where the button used to be, and be no worse off than before. Problem is, the button was raised up a bit above the palmrest, whereas the tracking surface is a shade below it. So now your thumb has to reach down and over the lip to click, which is a bit uncomfortable. It’s a subtle difference but I think an important one.

I’ve only had a play in the shop, so maybe I’d get used to it and maybe even change my clicking habits entirely, but it put me right off.

I've noticed that a lot of the standard Python methods on lists and dictionaries don't return anything at all – they simply modify self and return nothing.

>>> l = [1,2,3]
>>> r = l.reverse()
>>> print l
[3, 2, 1]
>>> print r
None

This is becoming extremely frustrating as it prevents chaining of method calls, which would be possible if only self was returned. This keeps catching me out if I attempt to return some_list.reverse() or dict = {'foo':bar, 'x':y}.update(some_other_dict) as I end up with None when I expected Something.
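Where chaining is the point, the way round it is to use the built-ins that return new objects: reversed() and sorted() for lists, and constructing a merged copy for dicts. A quick sketch (variable names mine):

```python
l = [1, 2, 3]

# reversed() gives back an iterator over a reversed view; l is untouched
r = list(reversed(l))        # [3, 2, 1]

# sorted() returns a new list, unlike sort() which mutates and returns None
s = sorted(l, reverse=True)  # [3, 2, 1]

# for dicts, build a merged copy rather than relying on update()'s None
d = dict({'foo': 1}, x=2)    # {'foo': 1, 'x': 2}
```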

It's the little details that make a language a joy or a chore.

I ran into an interesting performance issue with my Google App Engine application, which I'm currently developing feverishly. It seems that if you use indexed array style access to an unfetched Query object, performance is atrocious. I'm fairly sure that it's doing a fetch for each access. It will also hit the 'DB' if you call count() on the Query object. Far superior is to explicitly fetch() from the Query to get a list, then use that. It's all fairly obvious really, but it took me a while to realise what was going on, perhaps because you can innocuously use the Query object as an iterator without this problem, which lulls you into a false sense of security.

So to give a contrived (and untested) example, this I would expect to be terrible performance wise:

start_date = date.today()
end_date = start_date + timedelta(21)
special_days = SpecialDay.all().filter('date >=', start_date).order('date')
day = start_date
sd_index = 0
while day <= end_date:
  # This count() call and the indexed access to special_days each cause a DB hit!
  if special_days.count() > sd_index and special_days[sd_index].date == day:
    print 'special day'
    sd_index += 1
  else:
    print 'ordinary day'
  day += timedelta(1)

It's easily fixed by simply adding fetch(1000) on the end of the first line, and using len(special_days) instead of special_days.count(). Also, having moved to working with a list, you can remove sd_index entirely and pop(0) items off the front of special_days until it's empty instead. It's a pain that there isn't a less skanky way to fetch 'all of them' without using that nasty 1000. I suppose at the very least I should create a MAX_FETCH_LIMIT constant for it in my own code, so I can centrally modify it when Google modify their max limit.
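To sketch the fixed shape of that loop – with a plain, date-ordered list of dates standing in for the entities you'd get back from fetch(1000), since the real thing needs the datastore and would compare against each entity's date property:

```python
from datetime import date, timedelta

day = date.today()
end_date = day + timedelta(21)

# Stand-in for query.fetch(1000): an already-fetched, date-ordered list.
special_days = [day + timedelta(3), day + timedelta(10)]

labels = []
while day <= end_date:
  # pop(0) takes the matching item off the front; no indexing into a
  # Query, no count() calls, and therefore no repeated datastore hits.
  if special_days and special_days[0] == day:
    special_days.pop(0)
    labels.append('special day')
  else:
    labels.append('ordinary day')
  day += timedelta(1)
```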

Why can't I just use an iterator anyway you may ask? Because my loop, which is similar in structure to the example above, is iterating over days between two dates and each time round the loop picking the item off the front of my list of data entities (which were fetched in date order) if it matches the day. It's all working very nicely now thanks, and I've added caching too, to save hitting the DB at all in many cases.

I've stumbled across a real irritation with the way the Google App Engine data model works. One among many frankly, but I'll restrict myself to just this one for now.

The issue stems from the fact that db.run_in_transaction(func) insists that func is a function with no side effects, since it may be run repeatedly in an attempt to get the transaction to go through (if optimistic locking fails). Fair enough, but that means it has to freshly fetch any model objects that it wants to modify, otherwise it would have side effects on objects outside its scope. But consider this situation, in which we have an increment() function on our model object, that must use a transaction because it also modifies other related objects at the same time and requires atomic behaviour:

class Person(db.Model):
  count = db.IntegerProperty(default=0, required=True)

  def increment(self):
    def tx():
      # Mess with some other related objects in the data store.
      # ... (omitted for brevity)
      # Must fetch a separate copy of self to avoid side effects.
      person = db.get(self.key())
      person.count += 1
      person.put()
    db.run_in_transaction(tx)

The problem here is that self hasn't actually been modified at all and is now out of date with respect to the data store (where the count is one bigger, assuming the transaction succeeded). This is a pain for the caller who had a Person object and called increment() on it and naturally expects their object's count to be one higher. But their object hasn't been modified at all – though the data store has, via the freshly fetched person. In case it's not obvious, we can't simply change the code above to use self instead of getting the new person object, since db.run_in_transaction(tx) may run our tx() function multiple times until it completes without an optimistic locking failure. If it did have to run multiple times, self's count would increment by one for each failed attempt, so the final successful attempt could end up with more than one added to the count. Or if the transaction eventually failed outright, self's count would still have been modified even though the data store had not been touched.
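You can demonstrate the hazard without App Engine at all, with a toy retry loop standing in for db.run_in_transaction (every name here is hypothetical – this just simulates two optimistic locking failures before success):

```python
class CollisionError(Exception):
  pass

def run_in_transaction_toy(func, retries=3):
  # Toy stand-in for db.run_in_transaction: re-run func on collision.
  for attempt in range(retries):
    try:
      return func()
    except CollisionError:
      continue
  raise CollisionError('transaction failed outright')

class Person(object):
  def __init__(self):
    self.count = 0

person = Person()
attempts = [0]

def tx():
  # WRONG: mutating the caller's object directly inside the body.
  person.count += 1
  attempts[0] += 1
  if attempts[0] < 3:
    raise CollisionError()  # simulate an optimistic locking failure

run_in_transaction_toy(tx)
# person.count is now 3, not 1: each failed attempt left its side effect behind
```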

So the only solutions I can see are:
  • Put code after the run_in_transaction() call, that synchronises self with the data store. There isn't a sync() or refresh() method on Model objects, so you have to do this painstakingly by getting another fresh person with db.get(self.key()) and then copying across just the fields you know might have changed.
  • Insist that the caller is aware that certain methods on the model objects won't modify the object itself so they need to get a fresh one. This completely wrecks the idea of an object model and encapsulation though. You might as well just have a purely functional interface to the data store.
It all seems like madness to me, that defeats the point of trying to have a neat, simple data storage object model. As usual, I can only hope that I've missed some crucial point and that in fact the problem is easily and elegantly solved. I shall look out for that solution, unless some kind reader can enlighten me!

I do quite a lot of futzing around developing web apps, and as a result I'm often needing to choose colours to use in them, and I want those colours in HTML hex form – e.g. #ff0000 (bright red). The standard Mac OS X colour picker is a pretty reasonable colour picker, with all the features I want: colour wheel, RGB sliders, HSB sliders etc. but it lacks the most crucial feature – easy output of HTML hex codes! This is clearly a criminal omission, and it meant I'd frequently have to convert from RGB 0-255 numbers down to hex, which is tedious in the extreme.
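For the record, the conversion I kept doing by hand is a one-liner with %02x formatting (the function name is my own):

```python
def rgb_to_hex(r, g, b):
  # %02x renders each 0-255 component as two lowercase hex digits
  return '#%02x%02x%02x' % (r, g, b)

rgb_to_hex(255, 0, 0)  # '#ff0000', bright red
```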

Hence I was muchly pleased when I found Hex Colour Picker, which adds exactly this missing feature to the standard Mac colour picker panel. Perfect. Now I just open TextEdit, hit cmd-shift-c to bring up the colour panel and I have everything I need for web colour picking.

I've been banging away at my Google App Engine application. It'll be a little while yet, but in the meantime, a particular observation about Python. I keep getting tripped up when I refer to a member function without the parentheses. For instance:

>>> list = [1,2,3,4,5]
>>> print list.count
<built-in method count of list object at 0x238fa8>

This doesn't print a number because list.count is a reference to the count method itself, which (like everything else in Python) is an object, so gets printed. You need parentheses to actually call the method – and in fact count() requires an argument, since it counts occurrences of a value (list.count(3) gives 1 here); for the length of the list, which is the 5 I was really after, you want len(list). I keep getting tripped up by this and taking a while to debug the problem every time. Some things like __class__ don't need parentheses because they're plain attributes rather than methods, which I think is part of the confusion. Furthermore, when you use dir() to look up the public features of a class, it doesn't do anything to show you which ones are functions:

>>> dir([])
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
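You can at least filter dir()'s output yourself with callable(), which is true for the bound methods and false for plain attributes like __doc__ (a quick sketch):

```python
lst = []
# Split dir() output by whether each attribute is callable
methods = [name for name in dir(lst) if callable(getattr(lst, name))]
attributes = [name for name in dir(lst) if not callable(getattr(lst, name))]
# 'append' and 'count' land in methods; '__doc__' (a string) in attributes
```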

I rather suspect I've missed a subtle but important bit of Python understanding, having only scratched the surface so far. If someone can point that out to me I'd be very grateful. 

My MacBook laptop has had to go in for repair as a sliver is splintering off the plastic palm-rest where the lid touches it at the edge. There are also myriad hairline cracks appearing in the bottom shell around the front and back. The repair guy I took it to assured me this wasn't my fault and that it would be covered under warranty, which is good since I've got less than a month left on the year long warranty. Phew!

It seems this has become a common problem and that Apple is sorting it out under warranty so I'd advise anyone else with similar issues to get it sorted.

It does mean I'm without laptop for much of this week though, which is a pain.

Update: It was indeed fixed under warranty but it took a bit of wrangling with Apple on the phone as they weren't keen. It all hinged on whether I had been treating it well or not.

Update 2 (April 2009): It looks like Apple is capitulating and more generally recognising and fixing the issue, even outside of warranty, if reports are to be believed – e.g. AppleInsider's report.

I've spent a bit more time learning Python, because I've been dabbling with Google App Engine (which I'm going off rapidly) but mainly because it's an interesting language. What's particularly intriguing to me is its similarity to Ruby, and hence where it differs in syntax or approach makes for a notable point of comparison. Of course it's probably more correct to say that Ruby is similar to Python than the other way around. From my still very small exposure to Python…

Nice things about Python:
  • Less confusion with the way classes work. Or maybe I just haven't stumbled into Python's equivalent of meta-classes and class vs instance variables.
  • @classmethod is a much neater way of stating what's a class method rather than an instance method, compared to all that self gubbins, or the dreaded <<.
  • Optional named arguments for functions, allowing more flexibility for optional arguments and greater clarity when calling. [As an aside I like Objective-C's way of building argument labels into the method signature, but not its square brackety syntax: [obj message:foo]. I'd much rather do obj.message(foo) and in fact with properties in Objective-C 2.0 we see more of this style.]
  • I think I probably prefer explicit return statements rather than the Ruby way of returning the last evaluated thing in any expression.
  • List comprehensions. To start with it just seems like a syntactic difference – Python: [x*2 for x in my_list]  Ruby: my_list.map {|x| x*2}. But Python's party trick is excluding elements as it goes: [x*2 for x in my_list if x != 3].
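To make the named-arguments and list-comprehension points concrete (the function and names here are my own invention):

```python
def greet(name, greeting='Hello', punctuation='!'):
  # greeting and punctuation are optional named arguments with defaults
  return '%s %s%s' % (greeting, name, punctuation)

greet('Bob')                   # 'Hello Bob!'
greet('Bob', punctuation='?')  # 'Hello Bob?' - skip straight to one argument

# Comprehension with the filtering party trick
doubled = [x * 2 for x in [1, 2, 3, 4] if x != 3]  # [2, 4, 8]
```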

Nasty things about Python:

  • Seriously – indentation to demarcate blocks? Apparently I'll get used to it.
  • Double underscores. __init__ is a complete pain to type. Why not just a single underscore at the start or something?
  • Fiddly module system. Why the need for an __init__.py file in a directory, just to make it a module? Why the need to explicitly define an __all__ list just to be able to import everything from a module? I want to be able to create a "model/" directory, put all the .py files for my DB classes in there, then import the whole lot from my other classes with ease. When I add a new model class, I shouldn't have to go and modify the __init__.py file. It seems to be a real pain to split code up into multiple files sensibly in Python. If I've missed a trick here – please somebody show me the light!
  • The string interpolation is OK I suppose, giving the full power of C style formatting, but most of the time you're just doing simple interpolation and Ruby's syntax is far more pleasant and readable. Python: "Hello %s, from %s." % (person, greeter)  Ruby: "Hello #{person}, from #{greeter}."  Ruby also has a full-on formatting system for the few times when you need it.
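For what it's worth, one partial answer to the module grumble is to have __init__.py glob its own directory into __all__, so new files are picked up without registration. This is only a sketch (built here against a throwaway package on disk so it's self-contained), not a claim that it's the blessed way:

```python
import os
import sys
import tempfile

# The interesting part: an __init__.py that lists every sibling .py
# module in __all__ automatically, so 'from model import *' finds them.
INIT_PY = """\
import glob
import os

__all__ = [
    os.path.basename(path)[:-3]
    for path in glob.glob(os.path.join(os.path.dirname(__file__), '*.py'))
    if not path.endswith('__init__.py')
]
"""

# Build a throwaway 'model' package on disk to demonstrate it working.
root = tempfile.mkdtemp()
pkg = os.path.join(root, 'model')
os.mkdir(pkg)
with open(os.path.join(pkg, '__init__.py'), 'w') as f:
  f.write(INIT_PY)
with open(os.path.join(pkg, 'widget.py'), 'w') as f:
  f.write('X = 1\n')

sys.path.insert(0, root)
namespace = {}
exec('from model import *', namespace)  # pulls in widget via __all__
```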


Having ploughed through much of the documentation, done the tutorial and started writing my own little web app, I have some half-formed thoughts about Google App Engine to throw out to the world.

  • As far as I can tell, any sort of data aggregation functionality (counting, averaging etc.) just won't be possible as the Datastore APIs don't allow for it. I've tried to think of ways to fake it but even my most elaborate machinations come up against the buffers. The only way to manage it at all is to do counting and averaging piecemeal, manually keeping the aggregate values you need up to date with each individual entity modification. Unfortunately, that means that you can't introduce new functionality requiring new aggregate values after you've already got a million users, because you've missed the chance to record those aggregates along the way.
  • Python's OK, but I don't like using indentation as the sole way to define blocks. But I'm sure I can get used to that.
  • I really don't like having to put an empty __init__.py file in any subdirectories of my python code. If I don't do that it seems I can't import foo.bar.Thingy. Breaking up code into multiple files in a sensible directory structure is surely a fairly common thing to do, so I'm amazed that Python makes it strangely difficult. I hope I've simply missed something and it's actually easier than that.
  • In fact all those double underscores look horrid and are a pain to type. Surely a single underscore would have been quite adequate?
  • The overall experience for learning GAE is very sorted. Smooth and well integrated – all you need to supply is your own decent text editor. I'm trying out TextMate, the darling of Mac OS X code editors, but I'm worried to see that it doesn't seem to have been updated for over a year.
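The piecemeal aggregation in the first point above can be sketched with an ordinary class standing in for a datastore entity (all names here are mine):

```python
class RunningAverage(object):
  # Keep count and total up to date with every individual write, so the
  # average is always available without scanning all the entities.
  def __init__(self):
    self.count = 0
    self.total = 0

  def record(self, value):
    self.count += 1
    self.total += value

  def average(self):
    return self.total / float(self.count)

stats = RunningAverage()
for score in [10, 20, 30]:
  stats.record(score)
# stats.average() is 20.0 without ever re-reading the individual scores
```

The catch described above is exactly this: if RunningAverage didn't exist from day one, there's no cheap way to backfill count and total later.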