Showing entries 21 to 30 of 35
« 10 Newer Entries | 5 Older Entries »
Displaying posts with tag: search
Tuning Search In Drupal 5

In previous search benchmarks, I utilized random content generated with Drupal's devel module. In these latest benchmarks, I used an actual sanitized copy of the Drupal.org community website database, with email addresses and passwords removed. The first tests were intended to confirm that Xapian continues to perform well with large amounts of actual data. Additional tests were performed to measure the effect of various MySQL tunings and configurations. The following data was derived from several hundred benchmarks run on an Amazon AWS instance over the past week using the SearchBench module.

These tests confirm that Xapian continues to offer better search performance than Drupal's core search module. Contrary to popular belief, the data also shows that using the InnoDB storage engine for search tables significantly outperforms using the MyISAM storage engine for search tables, especially when your database server has sufficient RAM. The …

[Read more]
Spinn3r Hiring Senior Systems Administrator


Spinn3r is hiring an experienced Senior Systems Administrator with solid Linux and MySQL skills and a passion for building scalable, high-performance infrastructure.

About Spinn3r:

Spinn3r is a licensed weblog crawler used by search engines, weblog analytic companies, and generally anyone who needs access to high quality weblog data.

We crawl the entire blogosphere in realtime, remove spam, rank and classify blogs, and provide this information to our customers.

Spinn3r is rare in the startup world in that we’re actually profitable. We’ve proven our business model which gives us a significant advantage in future product design and expanding our current customer base and feature set.

We’ve also been smart and haven’t raised a dime of external VC funding which gives us a lot …

[Read more]
MySQL Full Text Search by Alex Rubin

Download the PDF: http://www.mysqlfulltextsearch.com/full_text.pdf

Default search by relevance, default sort is by relevance

Boolean search is also popular. cats AND dogs. No default sorting, so you need to order the results yourself

Phrase search

MySQL Full Text Index: only available with MyISAM; supports natural language and Boolean search. ft_min_word_len: by default, only words of at least 4 characters are indexed. Ranking is frequency based and doesn’t count distance between words

SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE);

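For Boolean, you use AGAINST ('+cat +dog' IN BOOLEAN MODE). The + prefix marks a word as required; MySQL's Boolean mode uses operators like + and - rather than an AND keyword.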
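To make the difference between the two modes concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the scoring is a plain word count, not MySQL's actual relevance formula, and the names tokenize, natural_language_search, boolean_search and the sample docs are made up for illustration. It shows the minimum word length filter, frequency-based ranking that ignores word distance, and an unranked "all words required" Boolean match.

```python
import re
from collections import Counter

MIN_WORD_LEN = 4  # mirrors the ft_min_word_len default mentioned above


def tokenize(text, min_len=MIN_WORD_LEN):
    """Lowercase the text and keep only words of at least min_len characters."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= min_len]


def natural_language_search(docs, query):
    """Rank documents by how often the query words occur (toy frequency score)."""
    terms = tokenize(query)
    scored = []
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def boolean_search(docs, required_words):
    """Return ids of documents containing every required word, unranked."""
    required = set(tokenize(" ".join(required_words)))
    if not required:
        # Every word was shorter than MIN_WORD_LEN, so nothing can match.
        return []
    return sorted(doc_id for doc_id, text in docs.items()
                  if required <= set(tokenize(text)))


# Hypothetical sample data, not taken from the talk.
docs = {
    1: "MySQL database tuning: database indexes and database caches",
    2: "A cat and a dog walked into a database",
    3: "Cats and dogs, dogs and cats",
}

print(natural_language_search(docs, "database"))  # document 1 ranks above 2
print(boolean_search(docs, ["cats", "dogs"]))     # only document 3 has both
print(boolean_search(docs, ["cat", "dog"]))       # words too short to index
```

Note that "cat" and "dog" never reach the toy index at all, because both are shorter than four characters. That is the practical effect of the minimum word length setting described above.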

An n-gram fulltext parser for CJK languages is available as a plugin

DRBD and MySQL FullText search? DRBD requires InnoDB; when there is a failover, DRBD needs to …

[Read more]
MySQL: “SOUNDS LIKE” vs. Full-Text search

A friend of mine asked me: I’m hoping you can help me out with something. I’m trying to optimize a search feature. Since it uses a MySQL database, the search already uses the LIKE statement to get matches for a search query, but we might need something more flexible. I found mention on MySQL’s website [...]

O’Reilly Open Source Conference Day Two

A

O’Reilly Open Source Conference Day One

A

Five months with MySQL Cluster

So, the whole world changed at dealnews when Yahoo! linked us. We realized that our current infrastructure was not scaling very well. We had to make a change.

The Problem

Even though we were using all sorts of cool techniques, the server architecture was really still just a bunch of web servers all serving the same content. In addition to that, our existing systems at the time used a pull method. When a request came in, memcache was checked; if the data was not there, it was fetched from our main MySQL server. So, when there was no data in the cache or when it expired, this was very bad. Like when Yahoo! hit us. Some cache item would expire and 60,000 users would hit a page and each page would try to create the cache item.
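The failure mode described here can be sketched in a few lines of Python. This is a simplified model, not dealnews code: a dict stands in for memcache, a counter stands in for the main MySQL server, and the names pull_through_get, simulate_burst and simulate_trickle are hypothetical. The burst case assumes every request checks the cache before any of them has written the item back.

```python
def pull_through_get(cache, key, load):
    """Pull method: check the cache, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)  # expensive query against the main database
        cache[key] = value
    return value


def simulate_trickle(n_requests, key="front_page"):
    """Requests arrive one after another: only the first one misses."""
    cache = {}
    db_queries = 0

    def load(k):
        nonlocal db_queries
        db_queries += 1
        return f"rendered {k}"

    for _ in range(n_requests):
        pull_through_get(cache, key, load)
    return db_queries


def simulate_burst(n_requests, key="front_page"):
    """Requests arrive together while the cache item is missing.

    Every request looks in the cache before any of them has stored the
    item, so each one sees a miss and queries the database itself.
    """
    cache = {}
    db_queries = 0

    def load(k):
        nonlocal db_queries
        db_queries += 1
        return f"rendered {k}"

    # Phase 1: all requests check the cache at effectively the same moment.
    misses = [cache.get(key) is None for _ in range(n_requests)]
    # Phase 2: every request that missed rebuilds the item on its own.
    for missed in misses:
        if missed:
            cache[key] = load(key)
    return db_queries


print(simulate_trickle(60000))  # a single database query
print(simulate_burst(60000))    # one database query per request
```

With steady traffic the pull method costs one query per expiry. With a burst on a cold or expired item, the database sees as many queries as there are simultaneous requests.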

The Solution

I …

[Read more]
Xapian Search Backend Revisited

I wrote previously about looking for a more powerful search solution, and I mentioned that Xapian wasn’t quite so convenient in indexing my data. I then chose to experiment with sphinx a little more, and proceeded to create a number of search engines and indexed a number of data sources in order to decide which direction to go. Unfortunately, while sphinx was convenient and still provides an excellent backend for basic search indexes, I’m revisiting Xapian once again based on its more-than-anticipated flexibility. I was brief in my explanation of Xapian, however, and didn’t mention some of the more important and powerful aspects of it.

Xapian provides an API

Xapian is primarily an API for search indexing/data retrieval. They do provide a handy utility called Omega (available here) for indexing static pages and a plethora of other mime-types. However, I’m in …

[Read more]
Sphinx Fulltext Search Engine Part III (continued)

I’m finally taking the time to continue this series =)

Lyrics Grep

For testing purposes, I went ahead and scraped about 60,000 song lyrics off of a number of sites and developed a simple search engine for them. The script that did the scraping is pretty nasty (to handle equally nasty HTML that I had to parse through), so I’m going to refrain from posting that script and save myself some embarrassment. Make a pot of coffee, sit down, and write one yourself (or something else that’s similar enough).

Database Schema

I created a new database called lyricsgrep with the following simple …

[Read more]
Sphinx Fulltext Search Engine Part II (continued)

Note: Part I is located at the page that describes part one

Disclaimer
Just a minor clarification to anyone that was confused: I am currently experiencing Sphinx for the first time. Everything I’m writing about is new to me as well, for the most part. So far, I’m drooling over some of its capabilities; I may come back in a month and rip it a new ass hole.

Back to Configuration… (not really, this is the bitching section)

In preparation for my previous post about Sphinx, I had originally played with a number of configuration options, and even encountered a couple of issues that caused my confused butt to have to debug a number of things, and even recompile --with-debug and gdb the thing …

[Read more]