ChaSen, Ruby, Ubuntu Linux
ChaSen is a morphological analyser for Japanese. For me, it’s particularly useful in the context of full text search. Japanese doesn’t use spaces, so it’s very hard for a computer to work out where to break up the sentence in order to index the components. ChaSen handles this beautifully, delivering a full analysis of the sentence, showing each component’s pronunciation, basic form, and part of speech. It’s an example of standing on the shoulders of giants thanks to open source software: with such powerful tools available for free, it’s possible to achieve things that would otherwise be impossible.
I was trying to build the Ruby/ChaSen library on Ubuntu Linux. After a little trouble, I discovered that it was necessary to specify the library location:
> ruby extconf.rb -L/usr/lib
> make
> sudo make install
But every time I tried to require it in Ruby, I got this error:
[...]/chasen.so: undefined symbol: _Znwj
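Apparently, the problem lies in mkmf, which is choosing the wrong linker. _Znwj is the mangled name of the C++ operator new, so the extension needs the C++ runtime library, and that is only pulled in automatically when g++ does the linking.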
The solution? After using extconf.rb to create the Makefile, edit the Makefile before starting make. In the line:
LDSHARED = $(CC) -shared
change $(CC) to g++ and then run make as before. Don’t forget to make clean first if there are already files lying around from previous failed attempts.
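If you would rather not edit the Makefile by hand each time, the same change can be scripted. The following is only a rough sketch: patch_ldshared is a hypothetical helper name of my own, not part of mkmf or Ruby/ChaSen, and it assumes the generated Makefile contains the LDSHARED line exactly in the form shown above. It runs against a sample string so that you can see the effect without touching any files.

```ruby
# Hypothetical helper (not part of mkmf or Ruby/ChaSen): rewrite the
# LDSHARED line of a generated Makefile so that g++ does the linking.
# Assumes the line has the form "LDSHARED = $(CC) -shared".
def patch_ldshared(makefile_text)
  # Only touch a line that starts with LDSHARED; leave everything else alone.
  makefile_text.gsub(/^(LDSHARED\s*=\s*)\$\(CC\)/) do
    "#{Regexp.last_match(1)}g++"
  end
end

# Demonstration on a made-up Makefile fragment.
sample = "CC = gcc\nLDSHARED = $(CC) -shared\n"
puts patch_ldshared(sample)
```

To apply it to a real Makefile, you could read the file with File.read, pass the contents through patch_ldshared, and write the result back with File.write, then run make clean and make as described above.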
I hope that this brief explanation helps anyone else suffering from the same problem.