Rsemantic

A document vector search with flexible matrix transforms for Ruby. Currently supports:

  • Latent semantic analysis
  • Term frequency – inverse document frequency
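To give a feel for what a TF-IDF transform does to a term-document matrix, here is a minimal plain-Ruby sketch. It is not rsemantic's implementation: the `tfidf` helper, the matrix layout (rows are documents, columns are terms), and the exact weighting formula (term frequency times the natural log of inverse document frequency) are assumptions made for illustration.

```ruby
# Hypothetical helper, not part of rsemantic.
# counts: array of rows, one row per document, one column per term.
def tfidf(counts)
  doc_total = counts.size.to_f
  term_count = counts.first.size

  # Number of documents in which each term occurs at least once.
  doc_freq = Array.new(term_count) { |t| counts.count { |row| row[t] > 0 } }

  counts.map do |row|
    words_in_doc = row.sum.to_f
    row.each_with_index.map do |count, t|
      # Terms absent from a document keep a weight of zero.
      next 0.0 if count.zero? || words_in_doc.zero?

      tf = count / words_in_doc                # term frequency within this document
      idf = Math.log(doc_total / doc_freq[t])  # rarer terms get a larger weight
      tf * idf
    end
  end
end

# Made-up counts for the terms "cat", "hat", "pet" (assumed vocabulary).
counts = [
  [1, 1, 0], # document 0 mentions cat and hat
  [1, 0, 1], # document 1 mentions cat and pet
  [1, 0, 1]  # document 2 mentions cat and pet
]

weighted = tfidf(counts)
weighted.each { |row| p row }
```

In this sketch a term that appears in every document ("cat") ends up with a weight of zero, while a term found in only one document ("hat") is weighted most heavily.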

Usage


documents = ["The cat in the hat disabled", 
             "A cat is a fine pet ponies.", 
             "Dogs and cats make good pets.",
             "I haven't got a hat."]

# Log to stdout how the matrix is built and transformed.
search = Semantic::Search.new(documents, :verbose => true)

# Pass the transforms to apply to the matrix.
# Currently only :LSA and :TFIDF are supported.
# Transforms are applied to the matrix in the order given.
search = Semantic::Search.new(documents, :transforms => [:LSA])

# With no :transforms option, :TFIDF is applied first and then :LSA.
search = Semantic::Search.new(documents)

# Find documents related to documents[0], ranked by how closely related they are.
puts search.related(0)

# Search the documents for the word "cat".
# Returns a ranking of how relevant each document is to the query.
puts search.search(["cat"])
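
Rankings like the ones returned by `related` and `search` are commonly computed in vector space models as the cosine similarity between a query vector and each document vector. The sketch below assumes that scoring method; it does not use rsemantic, and the `cosine` helper, the vocabulary, and the document vectors are made up for illustration.

```ruby
# Hypothetical helper, not part of rsemantic: cosine similarity of two
# equal-length numeric vectors. Returns 0.0 if either vector is all zeros.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Assumed vocabulary and document vectors (one weight per vocabulary term).
vocabulary = %w[cat hat pet]
doc_vectors = [
  [1.0, 1.0, 0.0], # document 0: cat, hat
  [1.0, 0.0, 1.0], # document 1: cat, pet
  [0.0, 1.0, 0.0]  # document 2: hat only
]

# Build the query vector for ["cat"] over the same vocabulary.
query_terms = ["cat"]
query = vocabulary.map { |term| query_terms.include?(term) ? 1.0 : 0.0 }

# One score per document; a higher score means a closer match.
ranking = doc_vectors.map { |doc| cosine(query, doc) }
ranking.each_with_index { |score, i| puts "document #{i}: #{score.round(3)}" }
```

Here the two documents that mention "cat" receive the same positive score and the document without it scores zero. Finding documents related to a given document works the same way, with that document's vector used in place of the query.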

Rake Examples

There are some pre-built examples that can be run with rake. They all run in verbose mode so you can see what's going on.


rake example:lsa

rake example:vector_space
Last edited by josephwilk, Sat Sep 27 06:53:09 -0700 2008