EuroOSCON 2006
I think I’ve had enough time to digest last week’s EuroOSCON (the O’Reilly European Open Source Convention). I’m not going to attempt to write a coherent story: instead, I’m just going to put down some thoughts from my notes, and others as they occur to me.
One theme of the convention was that, paradoxically, Open Source doesn’t really matter any more. One speaker, I forget who, asserted that Open Source has already succeeded. It’s here to stay. There was a lot of discussion about access to data. To my mind, this is far more important: open source is useful, but it’s the data and the formats that really matter, and that provide a real barrier to entry or migration. Just think of the ubiquity of the proprietary Word document format.
Tor Nørretranders gave an excellent talk on the motivations that drive us as human beings, and the role of the hormone oxytocin in our behavioural choices. Some interesting points: humans are not rational when dealing with other humans—but they are rational when dealing with computers. (For example, humans won’t accept less than twenty percent in the Ultimatum Game when playing against other humans, but will accept any offer from a computer.) He asserted that being wasteful proves that you have resources to spare. As he reduced it, altruism helps you get laid!
On a more technical note, Peter Zaitsev explained full-text search possibilities for MySQL (Slides). The inbuilt full-text search is limited and slow (especially if there are additional conditions on the search that trigger a linear scan). It’s useful if the data set is small, you have plenty of hardware, and the queries are simple. He gave some hints on ways to improve performance: use a good list of stop words relevant to the application; keep the index unfragmented by optimising regularly; have enough memory to hold the working set; encode properties into the indexed text instead of using more complex queries. Finally, paging is broken (using LIMIT slows performance dramatically). Of the alternatives he discussed, Sphinx and Lucene seemed the most promising.
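To make the "encode properties into the indexed text" hint concrete, here is a toy sketch in plain Python. It is not MySQL and not anything from the talk: the inverted index, the `_lang_` and `_status_` token prefixes, and the sample documents are all my own assumptions, chosen only to show how a property filter can become just another search term instead of an extra condition on the query.

```python
from collections import defaultdict


def tokens_for(doc):
    """Return the body words plus synthetic tokens that encode properties."""
    words = doc["body"].lower().split()
    # Hypothetical convention: prefixed tokens that cannot collide with real words.
    props = [f"_lang_{doc['lang']}", f"_status_{doc['status']}"]
    return words + props


def build_index(docs):
    """Build a toy inverted index mapping each token to a set of document ids."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for tok in tokens_for(doc):
            index[tok].add(doc_id)
    return index


def search(index, *terms):
    """Return ids of documents containing every term (property tokens included)."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()


# Made-up sample data, for illustration only.
docs = {
    1: {"body": "MySQL full-text search tips", "lang": "en", "status": "published"},
    2: {"body": "Draft notes on MySQL search", "lang": "en", "status": "draft"},
    3: {"body": "Recherche MySQL", "lang": "fr", "status": "published"},
}

index = build_index(docs)
# The property filters are ordinary terms, so one index lookup answers everything.
result = search(index, "mysql", "_status_published", "_lang_en")
print(result)
```

The point of the design is that the filter never leaves the index: there is no second condition that could force a scan over the matching rows.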
Roger Magoulas of O’Reilly talked about data warehousing under the title of ‘Big Data and the Open Warehouse’. He had some interesting points on managing large amounts of data, like using temporary tables to take advantage of the fact that DELETE and INSERT are faster than UPDATE. He also mentioned the Star Schema, a denormalised database structure in which I saw some immediate potential applications.
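For anyone who hasn’t met the Star Schema before, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, columns, and sales figures are invented for illustration and have nothing to do with the talk; the only thing it is meant to show is the shape: one central fact table of measurements, surrounded by small dimension tables that describe them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: two dimension tables and one fact table that references them.
conn.executescript("""
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    month   INTEGER
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT
);
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    revenue    REAL
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, 2006, 8), (2, 2006, 9)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Rails book", "books"),
                  (2, "PHP book", "books"),
                  (3, "T-shirt", "clothing")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 3, 90.0),
                  (1, 3, 2, 30.0),
                  (2, 1, 1, 30.0),
                  (2, 2, 4, 100.0)])

# A typical warehouse query: join the fact table to its dimensions and aggregate.
rows = conn.execute("""
SELECT d.month, p.category, SUM(f.revenue)
FROM fact_sales f
JOIN dim_date d ON d.date_id = f.date_id
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY d.month, p.category
ORDER BY d.month, p.category
""").fetchall()

for month, category, revenue in rows:
    print(month, category, revenue)
```

Every reporting query has the same form (fact table joined one hop out to each dimension), which is a large part of the appeal.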
I found out one of the reasons why technical books are so fat: the binding has to stand out when fighting with other books on the shelf.
I attended talks about a couple of new(-ish) web frameworks, each of which had some interesting, distinctive features of its own. I’m planning to look at integrating some of the useful features into Rails. RIFE (Java) and Jifty (Perl) are worth a look for inspiration if you are interested in that kind of thing.
Meanwhile, Rasmus Lerdorf (of PHP fame—or infamy) gave some insights into how PHP is used at Yahoo!: it seems that they aren’t using frameworks in most cases. The performance numbers that he quoted were very impressive, but Yahoo! is perhaps a pathological case: I’m not sure that their approach would be a good idea for other businesses. I also got the impression that they use PHP as a very thin layer.
Dennis Linnell’s Do-It-Yourself Performance Benchmarks on Rails had some good ideas on tools and methodologies for doing just that.
Rob Savoye gave a short-notice talk on Gnash, the free Flash replacement. After an inopportune update failure on his Ubuntu laptop left it unusable, he did his presentation without any visual aids. In spite, or, rather, because of that, it was excellent. Given the shocking incompetence shown by Adobe in their attempts to port Flash to Linux, it was good to see that the open source equivalent is in good hands.
That’s far from everything, but it’s enough for now: I’m going to bed.