• Home
  • About
    • Glen Cooney photo

      Glen Cooney

      Welcome to my official portfolio site

    • Learn More
    • Email
    • LinkedIn
    • Github
    • StackOverflow
  • Posts
    • All Posts
    • All Tags
  • Projects

DSet (v2)

19 Jul 2016

Reading time ~6 minutes

Github

The Ultimate Search Tool

DSet is a search tool for the board game Dominion. It allows players to filter sets of cards via chainable, partial queries to allow users to generate a set of cards that fit all their supplied criteria.

Originally conceived as a randomizer to speed up setup of a game session, my recent efforts have been focused on refining the core search functionality. The following are the changes I have made since my last post on this project.

Tools

Rails, Rspec, jQuery, AJAX, Arel, Bootstrap, Atom, Ubuntu

Overview

DSet is designed to use a minimalist, single-input field to accept user queries. It combines the functionality of a fuzzy search, chainable search, multi-category search, and live search into a single system.

Fuzzy Search

Each query is pattern matched against strings in eligible fields.

Example: “Vlg” matches with “Village”

Chainable search

Each space-separated query is treated as a chained filter, with each additional subquery further refining the result set.

Example: “v 3” will return a match of “Village” (“v” found in the name, with “3” as its cost).

Multi-Category Search

When results are presented to the user, they are grouped by the “categories” (ie database fields) on which they were matched. In other words, for any given query, it is compared against any possible matches across all indexable record fields.

Example: A query of “v 3” would return an array of results containing hashes like this:

["Name", "Terminality", "Cost"]=>[["Village", {:card=>#<Card id: 5, name: "Village", image_url: "http://wiki.dominions
trategy.com/images/thumb/5/5a...", cost: 3, created_at: "2018-03-23 02:08:14", updated_at: "2018-03-23 02:08:14">, :columns=>["Name", "Terminality", "Cost"], :terms=>["Village", "Splitter / Village", "3"]}]]

Each result is assigned an array as a key matching the columns on which it was matched (for multi-category display), followed by values for the card record, matched columns array, and term array (ie if the “column” is the key, the “term” is it’s value).

Process

Continuing from my previous work, I made some updates to lift the limitations of my previous solution while cleaning up the code to make it more readable.

Database Update

In earlier versions, DSet had a single “Card” model that contained all data corresponding to a particular card. In my newer version, I split this data into a “Descriptor” model, which not only helped with lookup but allowed for metadata about each descriptor to be included within each descriptor instance. As a result, the seed.rb file was refactored to create Card instances alongside the appropriate Descriptors like so:

card = Card.create!(name: "Cellar",
            image_url: base_url + "1/1c/Cellar.jpg/200px-Cellar.jpg",
            cost: 2)

assign_descriptors(%w(Base Action Sifter Non-Terminal), card)

...
def assign_descriptors(name, card)
  ...
  case name.downcase
  ...
  when 'Sifter'.downcase
        Descriptor.create!({
            name: 'Sifter',
            category: 'Archetype',
            description: 'Enables players to cycle through junk cards in their deck faster.',
            card: card
        })

Actualizing Full Search System Functionality

By far the most time-consuming aspect of this project has been tweaking the card search algorithm to behave as I intended. After many different refactor attempts, I eventually landed on the following code for the top-level method call:

def self.search(queries_string, slot)
    return [] if queries_string.blank?

    query_array = queries_string.to_s.split
    formatted_queries = format_multi_char_queries(query_array)
    matches = get_matches(formatted_queries)
    matches = matches.select{|_,v| v[:terms].length >= query_array.length}

    matches.group_by { |_, v| v[:columns] }
  end

Starting with the search function itself, I broke down each piece into separate functions for readability. In brief, the search method:

  1. Splits the user’s query into subqueries delineated by a space.
  2. Formats each query by inserting the wildcard operator between each character in each subquery string.
  3. The formatted query array is passed to get_matches to perform the search against each subquery.
  4. Matches that do not match against ALL supplied queries are discarded.
  5. Matches are grouped by columns to format output to be displayed to the user.

Getting #4 took the most time to get right. A frequent issue with my refactors was that my search function would return too few or too many results. In the latter case, it would return results that matched against one supplied query, but not against the rest (when it should only have results against all chained queries).

The Search System Itself

Given the complexity of the code, I will discuss it one piece at a time.

private_class_method def self.get_matches(subqueries)
    match_data = {}

    subqueries.each do |query|
      result = get_card_subset(query, match_data)

      match_data = result.nil? ? match_data : result
    end

    match_data
  end

At this level, each subquery is evaluated sequentially and returns a result hash containing matching cards. With each iteration, match_data is passed back into get_card_subset to limit subsequent searches to only search within the previous result set.

For the next part, I will use in-line annotation for readability.

private_class_method def self.get_card_subset(query, match_data)
  new_match_data = {}

  # If match data from a previous iteration is not available, test against all cards in the database
  if(match_data.empty?)
    card_set = Card.all
  else
    card_set = Card.where(name: match_data.keys)
  end

  # If the supplied subquery is numeric, test it against the "Cost" column
  if numeric?(query)
    matched_cards = card_set.where(cost: query)

    # For each match found, concatenate the results into the existing result hash (new_match_data). Explained further below.
    matched_cards.each do |card|
      merge_match_data(match_data, new_match_data, card, 'Cost', query)
    end
  else
      # If the supplied subquery is a string, test against the "name" column
      # of the Card model via fuzzy matching
      matched_cards = card_set.where('name ~* :pat', pat: query).distinct

      # Concatenate and format results into existing results
      matched_cards.each do |card|
        merge_match_data(match_data, new_match_data, card, 'Name', card[:name])
      end

      # This line allows for both Card name matches and Descriptor matches
      # to be returned within the same result hash
      match_data = (new_match_data.any? ? new_match_data : match_data)

      # Retrieves all Descriptor instances associated with a card within the current card_set (ie not filtered out by earlier subqueries)
      descriptor_set = Descriptor.where(Descriptor.arel_table[:card_id].in card_set.pluck(:id) ).distinct

      # Fuzzy match the subquery against the names from the filtered array of
      # Descriptor instances
      descriptor_matches = descriptor_set.where('name ~* :pat', pat: query).distinct

      # Concatenate and format results into existing results
      descriptor_matches.each do |kw|
          merge_match_data(match_data, new_match_data, kw.card, kw.category, kw.name)
      end
  end

  return new_match_data
end

The next two functions are a little less heavy in terms of complexity.

private_class_method def self.merge_match_data(match_data, new_match_data, card, column, term)
    new_match_data[card.name] = new_match_data[card.name].nil? ? new_match_data[card.name] : {}
    new_match_data[card.name][:card] = card
    new_match_data[card.name][:columns] = merge_result_hash(match_data, card.name, :columns, column)
    new_match_data[card.name][:terms] = merge_result_hash(match_data, card.name, :terms, term)
  end

This method creates result hashes, or concatenates new data into existing result hash elements. It generates a hash containing the card (as an ActiveRecord object), columns (ie what descriptor of the model was matched, such as name, cost, etc), and terms (the actual word matched against the supplied subquery).

private_class_method def self.merge_result_hash(hsh, key, sub_key, new_elem)
    result = {}

    if hsh[key].nil?
      result = [new_elem]
    else
      if hsh[key][sub_key].nil?
        result = [new_elem]
      else
        result = (hsh[key][sub_key].include?(new_elem) ?
          hsh[key][sub_key] : hsh[key][sub_key].push(new_elem))
      end
    end

    result
  end

Finally, merge_result_hash concatenates new term or category matches into the result hash, omitting duplicates. The key refers to the name of the card stored as a key in the result hash, and the subkey refers either to the column or term key of the nested result hash element.

Example: Given this result hash:

# Before
result_hsh = {
  'Village':{
    columns: ['Name'],
    terms: ['Village']
  }
}

...

# Methods invoked
merge_result_hash(result_hsh, "Village", :columns, "Cost")
merge_result_hash(result_hsh, "Village", :terms, 3)

...

# After
result_hsh = {
  'Village':{
    columns: ['Name', 'Cost'],
    terms: ['Village', 3]
  }
}

Next Steps

I am currently toying with porting this search system for use as a product lookup system for a local hardware store. My goal is to convert this over into a reusable gem to be shared with the open source community.



BlocCapstoneBackend Like Tweet +1