Plugins - Searchable
ActiveRecord Integration for Rails
Overview
Searchable is a Rails plugin that uses the Ferret toolkit (a Lucene derivative) to provide full-text search integration with ActiveRecord.
Compatibility
Searchable has been tested with Rails 1.0+.
Installation
script/plugin install -x http://svn.mojodna.net/repository/acts_as_searchable/trunk
Configuration
environment.rb:
# Globally override the default index path ("#{RAILS_ROOT}/db/ferret_index")
MojoDNA::Searchable::Indexer::default_index_path "#{RAILS_ROOT}/db/my_index"
# Globally override the default analyzer (Ferret::Analysis::StandardAnalyzer)
MojoDNA::Searchable::Indexer::default_analyzer Ferret::Analysis::StopAnalyzer
# Force Searchable to use the DRb backend (default: local)
MojoDNA::Searchable::remote
Indexing Models
There are three ways to specify indexing strategies for models.
Using Defaults
The first way is to simply include Searchable, which triggers its default behavior: indexing all attributes returned by Model.content_columns, using each attribute's name as its field name within the index.
For example:
class Person < ActiveRecord::Base
  include Searchable
  # attrs: first_name, last_name
end
This model can then be searched:
Person.search("seth")
This will attempt to match "seth" on all fields available in the index (first_name and last_name). If you wish to target a specific field:
Person.search("first_name: seth")
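To make the difference between an unqualified query and a field-qualified query concrete, here is a toy, in-memory sketch. It is not how Searchable or Ferret work internally; TinyIndex and its methods are hypothetical names invented for this illustration, and the matching is a simple whole-word comparison rather than real analysis.

```ruby
# Toy in-memory index (hypothetical; NOT Searchable's or Ferret's implementation).
# It only illustrates "match any field" versus "match one named field".
class TinyIndex
  def initialize
    @docs = {} # id => { "field_name" => "lowercased value" }
  end

  # Store every attribute under its own name, mirroring the default behavior
  # described above (attribute name == field name).
  def add(id, attributes)
    @docs[id] = attributes.each_with_object({}) do |(name, value), doc|
      doc[name.to_s] = value.to_s.downcase
    end
  end

  # "first_name: seth" searches one field; "seth" searches all fields.
  def search(query)
    field, term = query.include?(":") ? query.split(":", 2).map(&:strip) : [nil, query.strip]
    term = term.downcase
    @docs.select do |_id, doc|
      candidates = field ? [doc[field]] : doc.values
      candidates.compact.any? { |value| value.split.include?(term) }
    end.keys
  end
end

index = TinyIndex.new
index.add(1, first_name: "Seth", last_name: "Fitzsimmons")
index.add(2, first_name: "Kerry", last_name: "Seth")

p index.search("seth")             # ids matching in any field
p index.search("first_name: seth") # ids matching in first_name only
```

The second record matches the unqualified query through its last name, but not the query restricted to first_name.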
Indexing DSL
Searchable includes a DSL which allows one to exercise fine-grained control over indexing strategies.
An overly complex example:
class Person < ActiveRecord::Base
  include Searchable
  # attrs: first_name, last_name, address (an Address object)
  has_one :address

  # (locally) override the index path
  index_path "#{RAILS_ROOT}/db/person_index"

  # (locally) override the default analyzer for this model
  default_analyzer Ferret::Analysis::StopAnalyzer

  # index the person's first name (specifying defaults as a hash)
  index_attr :first_name, :boost => 1.0, :indexed => true, :tokenized => true, :stored => false, :indexed_name => "first_name", :sortable => false

  # index the person's last name (overriding some defaults using the block format)
  index_attr :last_name do |attr|
    attr.indexed_name "last_name"
    # alias "surname" and "sn" to this field so queries for "sn:fitzsimmons" will work
    attr.aliases ["surname", "sn"]
    attr.boost 2.0
    attr.indexed true
    attr.tokenized true
    attr.stored false
    # allow result sets to be sortable by last name
    attr.sortable true
  end

  # index (part of) the person's address, including attributes of the address association
  index_attr :address do |attr|
    attr.include :city, :boost => 0.75
    attr.include :state do |state|
      # alias state to province so queries for "address.province: MA" will work
      state.aliases ["province"]
      state.boost 0.75
    end
    attr.include :country, :boost => 0.75
  end
end
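For readers curious how a declaration that accepts either a hash or a block can be built, the following is a minimal sketch in plain Ruby. It is an assumption about the general technique, not Searchable's source: FieldOptions, IndexDsl and indexed_fields are hypothetical names, and the defaults simply echo the option names used in the example above.

```ruby
# Hypothetical sketch of a hash-or-block configuration DSL, similar in spirit
# to index_attr. None of these class or module names come from Searchable.
class FieldOptions
  DEFAULTS = { boost: 1.0, indexed: true, tokenized: true,
               stored: false, sortable: false, aliases: [] }.freeze

  attr_reader :settings

  def initialize(name, overrides = {})
    @settings = DEFAULTS.merge(indexed_name: name.to_s).merge(overrides)
  end

  # One method per option: with an argument it sets the option,
  # without one it reads it, so a block can write `attr.boost 2.0`.
  (DEFAULTS.keys + [:indexed_name]).each do |option|
    define_method(option) do |*args|
      @settings[option] = args.first unless args.empty?
      @settings[option]
    end
  end
end

module IndexDsl
  def indexed_fields
    @indexed_fields ||= {}
  end

  # Accepts options as a hash, a block, or both.
  def index_attr(name, options = {})
    field = FieldOptions.new(name, options)
    yield field if block_given?
    indexed_fields[name] = field
  end
end

class Person
  extend IndexDsl

  index_attr :first_name, boost: 1.0
  index_attr :last_name do |attr|
    attr.aliases ["surname", "sn"]
    attr.boost 2.0
    attr.sortable true
  end
end

p Person.indexed_fields[:last_name].settings
```

Options that are not overridden keep their default values, which is why the hash form and the block form can be mixed freely.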
Custom to_doc
If the DSL doesn't provide enough flexibility and you want to be even more specific about how attributes are indexed, you can provide your own to_doc method that will be used in conjunction with the DSL to provide additional fields to the index.
If you have field/value pairs where you want the field name used as a field name:
# custom to_doc to handle field/value pairings
def to_doc(doc)
  fields.each do |field|
    field.values.each do |value|
      # use each field's own name as the field name within the index
      doc << Indexer::create_field("#{field.name}", value)
    end
  end
  doc
end
Creating the Index
The instance method model.add_to_index is used to add an instance of a model to the appropriate index. Searchable will handle creation of the index files and will remove any existing documents corresponding to the instance being indexed. model.remove_from_index is used to explicitly remove an instance from the index. Both methods are registered as ActiveRecord callbacks, so if an object is created, updated, or deleted, it will be created, updated, or deleted within the index as well.
If you wish to index all instances of a given model, use the class method Model.index_all. This will load all instances of a model (in batches of 500) and add them to the index. By default, it builds a temporary index and copies it over the existing index once processing is complete. This avoids any interruption of service when rebuilding an index on a live site. When running in local mode, avoid calling this from a process that shouldn't block (e.g. a FastCGI process).
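The "build in a temporary location, then swap" idea can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not Searchable's code: rebuild_index is a hypothetical function, records are plain hashes rather than ActiveRecord models, and the one-file-per-record layout is invented and bears no relation to real Ferret index files.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical sketch of a batched rebuild that writes to a temporary
# directory and only replaces the live directory at the very end.
def rebuild_index(records, live_path, batch_size: 500)
  temp_path = "#{live_path}.rebuilding"
  FileUtils.rm_rf(temp_path)
  FileUtils.mkdir_p(temp_path)

  batches = 0
  records.each_slice(batch_size) do |batch|
    batch.each do |record|
      File.write(File.join(temp_path, "#{record[:id]}.txt"), record[:text])
    end
    batches += 1
  end

  # Readers keep using live_path until here. This swap is not atomic:
  # there is a short window between the removal and the rename.
  FileUtils.rm_rf(live_path)
  File.rename(temp_path, live_path)
  batches
end

Dir.mktmpdir do |dir|
  records = (1..1200).map { |i| { id: i, text: "person #{i}" } }
  batches = rebuild_index(records, File.join(dir, "ferret_index"))
  puts "indexed #{records.size} records in #{batches} batches"
end
```

Processing in fixed-size batches keeps memory use bounded, and doing all the slow work away from the live path is what keeps searches available during a rebuild.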
Searching
By now, you've seen the basics of how to search for instances of models:
all_people_named_or_living_in_kerry = Person.search("kerry")
all_people_named_seth = Person.search("first_name: seth")
You can pass the search method either a String or a Ferret Query that you've constructed yourself. If you pass in a String, you can use the Ferret Query Language, which is very similar to Lucene's query syntax.
By default, all results returned by search will be fully-loaded models (loaded using the appropriate find method). If you wish to have just the ids returned, pass :load => false as part of the options hash.
Person.search("seth", :load => false)
You can sort results by any attribute that has been marked as sortable, optionally in reverse order:
Person.search("seth", :sort_by => :last_name)
Person.search("seth", :sort_by => :last_name, :reverse => true )
Specify the default fields to be searched:
Person.search("seth", :default_fields => ["first_name", "last_name"])
Paginate results using :offset and :limit:
Person.search("seth", :limit => 10, :offset => 10)
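If your application thinks in page numbers rather than offsets, a small helper can do the conversion. The function below is a hypothetical convenience, not part of Searchable; it only builds the :limit and :offset options shown above.

```ruby
# Hypothetical helper: turn a 1-based page number into :limit/:offset options.
# `page_options` is not part of Searchable.
def page_options(page, per_page = 10)
  raise ArgumentError, "page must be >= 1" if page < 1
  { :limit => per_page, :offset => (page - 1) * per_page }
end

p page_options(1)     # first page: offset 0
p page_options(2)     # second page: offset 10, matching the example above
p page_options(3, 25) # third page of 25 results: offset 50
```

The resulting hash could then be passed as the options argument, for example Person.search("seth", page_options(2)).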
Remote Mode (DRb)
Searchable includes a DRb backend that allows ActiveRecord models to communicate with a single "search server". This is a much better option for clustered environments where index consistency is important and where there are enough reads and writes that index file locking issues come into play.
To enable remote mode, add this to your environment.rb:
# Force Searchable to use the DRb backend (default: local)
MojoDNA::Searchable::remote
And copy the appropriate scripts to your script directory:
for f in searchable indexer; do
cp vendor/plugins/acts_as_searchable/script/$f script
chmod 755 script/$f
done
You can then use Searchable in exactly the same manner. The only difference is that your index(es) will be stored on the server running the DRb service.
To start the DRb service:
script/searchable &
script/indexer &
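For readers unfamiliar with DRb, the sketch below shows the general pattern of exposing one object as a service and calling it through a URI, using only Ruby's drb library. It is not a reproduction of script/searchable or script/indexer: ToySearchServer is a hypothetical stand-in, and the loopback address with an OS-assigned port is chosen purely so the demo can run anywhere.

```ruby
require "drb/drb"

# Hypothetical stand-in for a search server; Searchable's real scripts
# are not reproduced here.
class ToySearchServer
  def initialize
    @docs = {}
  end

  def add(id, text)
    @docs[id] = text.downcase
    true
  end

  def search(term)
    @docs.select { |_id, text| text.include?(term.downcase) }.keys
  end
end

# Port 0 asks the OS for any free port; DRb.uri then reports the real address.
DRb.start_service("druby://127.0.0.1:0", ToySearchServer.new)
server_uri = DRb.uri

# A client (normally another process or machine) reaches the server by URI.
remote_index = DRbObject.new_with_uri(server_uri)
remote_index.add(1, "Seth Fitzsimmons")
remote_index.add(2, "Kerry Example")
p remote_index.search("seth")

DRb.stop_service
```

In this demo the client and server share one process; in a real deployment every application process would hold only the client side, so all index reads and writes go through the single server.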
Additional backends
Alternative backends can be created by following the example of LocalSearchable and RemoteSearchable. You'll also need to modify Searchable to include the appropriate backend under appropriate conditions.
The Future
Support for searching across multiple models has not yet been added, nor has support for searching across multiple indexes. Date handling also needs work so that dates can be queried more efficiently.
Homepage: http://mojodna.net/searchable/ruby/
Repository: http://svn.mojodna.net/repository/acts_as_searchable/trunk
License: Rails' (MIT)
Category: Searching and Queries
