Previously: Part I, Part II

I’ve been a bit quiet on the rerouting project lately. I just haven’t managed to find the time to sit down and work on it.

Tonight, I got back down to it.

One of the problems that has been exercising me is how to cope with varying path component separators in Rails. Up until now, I’ve just been breaking the path into components on each slash (or, rather, solidus, but that sounds even more pretentious than I usually like to be), but it’s not a complete solution. A few examples might help to explain the problem.

:foo/:bar.jpg

The first path component, foo, should be terminated by a slash. The second, bar, should be terminated by a period. But what happens if foo contains a period, or bar a slash?

/blah.blah/blah.jpg

Instinctively, I’m inclined to say that the above line should match the route, giving

{:foo => 'blah.blah', :bar => 'blah'}

but that raises a few problems. If I always consider a period to be a path separator, foo is broken. On the other hand, if I consider it never to be a separator, foo works but bar doesn’t. What a dilemma!

The solution, of course, is that a period should sometimes be a path separator. One way to determine this is to look at the character that follows the component in the route: if it’s a slash, then the component should match everything that’s not a slash. (If the component ends the string, the same is probably true.)

What about this, though?

/blah/blah/blah.jpg

It could be argued that this should evaluate to

{:foo => 'blah', :bar => 'blah/blah'}

but my instinct says not. So, in fact, I’m formulating the following rules:

  • A slash always acts as a path separator.
  • If a route segment is followed by one of ; . , ?, that byte will also act as a separator for that segment.
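
To make those two rules concrete, here’s a rough sketch in plain Ruby that compiles a route into a regular expression. It isn’t the graph-based code from the project, and the names SEPARATORS, compile_route and recognize are made up purely for illustration. It also assumes that a component at the end of the route excludes only slashes.

```ruby
# Hypothetical illustration only: not Rails code, and not the rerouting
# project's graph implementation.
SEPARATORS = [';', '.', ',', '?'].freeze

# Compile a route such as ':foo/:bar.jpg' into [regexp, component_names].
def compile_route(route)
  names = []
  source = route.gsub(/:(\w+)|([^:]+)/) do
    name, literal = $1, $2
    if name
      names << name.to_sym
      # Character that follows this component in the route (nil at the end).
      following = $'[0]
      # Rule 1: a slash always separates.
      # Rule 2: so does the following character, if it is one of ; . , ?
      excluded = SEPARATORS.include?(following) ? "/#{following}" : '/'
      "([^#{Regexp.escape(excluded)}]+)"
    else
      Regexp.escape(literal)
    end
  end
  [Regexp.new("\\A/?#{source}\\z"), names]
end

# Return a hash of component values, or nil if the path does not match.
def recognize(route, path)
  regexp, names = compile_route(route)
  match = regexp.match(path)
  match && names.zip(match.captures).to_h
end
```

With this sketch, recognize(':foo/:bar.jpg', '/blah.blah/blah.jpg') gives foo as 'blah.blah' and bar as 'blah', while '/blah/blah/blah.jpg' returns nil, because bar is not allowed to swallow a slash.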

If you look at the graph I posted in an earlier episode, you’ll see a number of nodes marked ‘Any’. These are the only nodes affected by the separator issue; solving it is therefore a matter of creating specialised ‘Any’ nodes which, instead of excluding only slashes, also exclude another possible separator byte.
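
As a rough idea of what such a specialised node might look like, here’s a small sketch. The AnyNode class and its accepts? and consume methods are hypothetical stand-ins, since the real node classes aren’t shown in this post; the only point is that each node carries its own set of excluded bytes.

```ruby
# Hypothetical stand-in for an 'Any' node in the recognition graph.
class AnyNode
  # extra_separator is the optional second byte to exclude, e.g. '.'
  def initialize(extra_separator = nil)
    @excluded = ['/', extra_separator].compact
  end

  # True if this node may consume the given single character.
  def accepts?(char)
    !@excluded.include?(char)
  end

  # Consume characters from path starting at offset until an excluded
  # byte (or the end of the string) is reached. Returns the consumed
  # text and the offset at which consumption stopped.
  def consume(path, offset = 0)
    finish = offset
    finish += 1 while finish < path.length && accepts?(path[finish])
    [path[offset...finish], finish]
  end
end
```

A plain AnyNode.new stops only at a slash, so it would read 'blah.blah' from 'blah.blah/blah.jpg', whereas AnyNode.new('.') would stop at the period and read just 'blah' from 'blah.jpg'.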

I think that, once implemented, this should finish up the route recognition side of the code for now. I’m eager to get it done, at least to a reasonably stable design, so that I can move on to what is likely to be the more interesting and challenging half of the puzzle: route generation.

By the way, I’d love to be able to expose my SVN repository for anonymous checkouts if there’s an easy way to do that. Got any ideas? Let me know!