I have now seen a few people ask about using MySQL's FULLTEXT
indexing with Asian languages such as Chinese, Japanese and
Korean (hereafter referred to as CJK). However, there doesn't seem
to be a good centralised article that covers it.
The information is out there; I just don't think it has been well
presented yet.
As I have recently done a bunch of research on this topic for a
customer, I figured it might be a good opportunity to make my
debut in the MySQL blogosphere.
So here we go...
I'll open by saying that attempting to use FULLTEXT with CJK text
in MySQL 5.0 will be unsuccessful.
From the CJK FAQ in the MySQL manual:
"For FULLTEXT searches, we need to know where words begin and
end. With Western languages, this is rarely a problem because
most (if not all) of these use an easy-to-identify word boundary
— the space character. However, this is not …
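To make the word-boundary point from the FAQ concrete, here is a minimal sketch in Python. It is not MySQL's actual FULLTEXT parser, just an illustration of what happens when whitespace is the only boundary a tokenizer knows about. The function name `split_on_whitespace` and both sample sentences are my own assumptions, made up for the example.

```python
def split_on_whitespace(text):
    """Split text into tokens, treating whitespace as the only word boundary.

    Hypothetical helper for illustration; MySQL's real parser is more involved.
    """
    return text.split()


# Sample sentences invented for this sketch.
english = "full text search in MySQL"
chinese = "我喜欢使用数据库"  # "I like using databases", written without spaces

english_tokens = split_on_whitespace(english)
chinese_tokens = split_on_whitespace(chinese)

print(len(english_tokens), english_tokens)  # each English word is its own token
print(len(chinese_tokens), chinese_tokens)  # the whole sentence is one token
```

With the English sentence, each word comes out as a separate token that an index could store. With the Chinese sentence, the whole string comes back as a single token, so a search for any individual word inside it has nothing to match against.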
Dec 16, 2008