Fast, lightweight full-text search engine for objects in WPF and Silverlight controls

Here’s a neat filtering technique that lets you quickly and easily index your C# objects and give your users a single search box to search across entire records in any ItemsControl, e.g. an AutoCompleteBox or a DataGrid.

AutoCompleteBox:

image

DataGrid:

image

Features:

  • As the screenshots show, it indexes most basic types, e.g. dates (including month names) and decimals (including comma formatting), and it would be fairly easy to extend it to handle other search use cases.
  • It’s super-easy to implement: you just need to mark up your model (or view-model) with [Index] attributes; no other modifications are necessary.
  • It’s fast, using LINQ expression lambdas to generate compiled property accessors for the indexed properties.
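
As a rough illustration of that markup, a model might look like the sketch below. The `IndexAttribute` definition and the `Trade` class with its properties are assumptions made up for this example; the real attribute lives in the demo solution and may differ.

```csharp
using System;

// Hypothetical marker attribute; the real one ships with the demo solution.
[AttributeUsage(AttributeTargets.Property)]
public sealed class IndexAttribute : Attribute { }

// Hypothetical model: only properties tagged with [Index] become searchable.
public class Trade
{
    [Index] public string Counterparty { get; set; }
    [Index] public DateTime TradeDate { get; set; }
    [Index] public decimal Amount { get; set; }

    // Not indexed, so typing this value into the search box would not match.
    public Guid InternalId { get; set; }
}
```

The point of the attribute approach is that the model stays a plain class; the indexer discovers what to search by looking for the marker.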

I’ll attach a demo solution in WPF, but here are the key bits of code.

The fast LINQ property accessor. In my solution, this is called by another method that uses normal reflection to find the properties marked with [Index] attributes; the compiled lambdas are then cached so this is only done once.

    // Requires: using System; using System.Linq.Expressions;

    /// <summary>
    /// Get fast compiled property accessor using a compiled LINQ expression lambda
    /// </summary>
    private static Func<T, object> GetFastPropertyAccessor<T>(string property)
    {
        ParameterExpression t = Expression.Parameter(typeof(T), "t");
        MemberExpression prop = Expression.Property(t, property);

        // Convert boxes value types so a single delegate type fits every property
        return Expression.Lambda<Func<T, object>>(
            Expression.Convert(prop, typeof(object)), t).Compile();
    }
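
To show how the accessor might be wired up, here is a minimal sketch of the reflection-and-cache step described above. The `IndexAttribute`, `AccessorCache<T>` and `Person` names are assumptions for illustration, not the code from the demo solution.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical marker attribute standing in for the one in the demo solution.
[AttributeUsage(AttributeTargets.Property)]
public sealed class IndexAttribute : Attribute { }

public static class AccessorCache<T>
{
    // A static field is initialised at most once per closed type T, so the
    // reflection scan and the expression compilation are not repeated.
    public static readonly Func<T, object>[] Accessors =
        typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
                 .Where(p => Attribute.IsDefined(p, typeof(IndexAttribute)))
                 .Select(p => Compile(p.Name))
                 .ToArray();

    // Same technique as GetFastPropertyAccessor: build t => (object)t.Property
    private static Func<T, object> Compile(string property)
    {
        ParameterExpression t = Expression.Parameter(typeof(T), "t");
        MemberExpression prop = Expression.Property(t, property);
        return Expression.Lambda<Func<T, object>>(
            Expression.Convert(prop, typeof(object)), t).Compile();
    }
}

// Hypothetical model used only to exercise the cache.
public class Person
{
    [Index] public string Name { get; set; }
    [Index] public int Age { get; set; }
    public string Password { get; set; }
}
```

With this in place, `AccessorCache<Person>.Accessors.Select(a => a(person))` yields the values of the indexed properties only, which is the kind of field list the indexing method takes as input.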

The search index method is pretty basic stuff, but effective. Essentially, all we do is break the object down into distinct search tokens by combining all indexed fields into one string, then splitting the result into a string array of tokens. We’ll do the same with the search text and match them up. In WPF we can call AsParallel() on the query to make this very fast, but this works fine for approximately 30k records on my current project.

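    // StringJoin is a custom extension method (not shown here), not part of the framework
    public string[] IndexUsingFields(IEnumerable<object> fields)
    {
        // Materialise once so the source is not re-enumerated by each query below
        fields = fields.Where(a => a != null).ToArray();
        var decimals = fields.OfType<decimal>();
        var ints = fields.OfType<int>();
        var dateTimes = fields.OfType<DateTime>();

        // Decimals and dates get their own formatting, so keep them out of the default pass.
        // Both sides go through OfType<object>() because IEnumerable<DateTime> does not
        // convert to IEnumerable<object> (no variance for value types).
        var exclude = decimals.OfType<object>().Concat(dateTimes.OfType<object>());

        string indexString = string.Concat(
            fields.Except(exclude).StringJoin(" "), " ",
            dateTimes.Select(d => d.ToString("d M MMMM yy yyyy hh:mm:ss")).StringJoin(" "), " ",
            decimals.Select(d => d.ToString("#.########")).StringJoin(" "), " ",
            decimals.Where(d => d > 1000).Select(d => d.ToString("N8")).StringJoin(" "), " ",
            ints.Where(d => d > 1000).Select(d => d.ToString("N")).StringJoin(" "))
            .ToLower();

        // Split into tokens, strip leading zeros, drop empties and duplicates
        return this._searchSplitRegex.Split(indexString)
            .Select(s => this._removeLeadingZerosRegex.Replace(s, string.Empty))
            .Where(s => !string.IsNullOrEmpty(s))
            .Distinct()
            .ToArray();
    }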
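
The `StringJoin` extension and the two regexes aren’t shown above, so here is a minimal sketch of how they might look. The patterns, the `StringJoin` body and the `Tokenize` helper are all assumptions for illustration; the versions in the demo solution may well differ.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class IndexHelpers
{
    // Assumed pattern: split on runs of whitespace.
    private static readonly Regex SearchSplitRegex = new Regex(@"\s+");

    // Assumed pattern: strip zeros at the start of a token, so "09" is indexed as "9".
    private static readonly Regex RemoveLeadingZerosRegex = new Regex(@"^0+");

    // Assumed shape of StringJoin: a thin wrapper over string.Join.
    public static string StringJoin<T>(this IEnumerable<T> items, string separator)
    {
        return string.Join(
            separator,
            items.Select(i => Convert.ToString(i, CultureInfo.InvariantCulture)));
    }

    // Mirrors the tail of IndexUsingFields: lower-case, split, clean, de-duplicate.
    public static string[] Tokenize(string indexString)
    {
        return SearchSplitRegex.Split(indexString.ToLowerInvariant())
            .Select(s => RemoveLeadingZerosRegex.Replace(s, string.Empty))
            .Where(s => !string.IsNullOrEmpty(s))
            .Distinct()
            .ToArray();
    }
}
```

Under these assumptions, a date formatted with "d M MMMM yy yyyy" contributes the day, the month number, the month name and both year forms as separate tokens, which is what lets a user type either "sep" or "2010" and still hit the record.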

And then the search itself: again just a simple split to get the search tokens, which are then matched against the index using some basic LINQ:

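    private string[] GetSearchTokens(string searchText)
    {
        return _searchTextTokensRegex.Split(searchText);
    }

    public bool Matches<T>(T subject, string searchText)
    {
        // GetSearchTokensCached is not shown here; presumably a caching wrapper
        // around GetSearchTokens in the full solution
        string[] searchTokens = this.GetSearchTokensCached(searchText);

        // Every search token must be a substring of at least one index token
        return searchTokens.All(
            t => _searchIndexDict2[typeof(T)][subject].Any(s => s.Contains(t)));
    }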
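
To make the matching rule concrete, here is a minimal standalone sketch, with a plain dictionary standing in for `_searchIndexDict2`. The `SimpleMatcher` class, its whitespace split and the `Filter` method are assumptions for illustration rather than the code from the demo solution.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SimpleMatcher
{
    // Every search token must be contained in at least one index token.
    // Index tokens are expected to be lower-case already.
    public static bool Matches(string[] indexTokens, string searchText)
    {
        string[] searchTokens = searchText.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return searchTokens.All(t => indexTokens.Any(s => s.Contains(t)));
    }

    // Filters pre-built index entries; AsParallel() (PLINQ) spreads the
    // per-record matching across cores.
    public static List<TKey> Filter<TKey>(
        IDictionary<TKey, string[]> index, string searchText)
    {
        return index.AsParallel()
                    .Where(kv => Matches(kv.Value, searchText))
                    .Select(kv => kv.Key)
                    .ToList();
    }
}
```

Because the check is a substring test per token, typing "sep smi" matches a record indexed as "smith 22 september 2010" regardless of word order. Note that an empty search string produces no tokens, so `All` returns true and every record passes the filter.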

This is all a bit simplified and edited for the blog, so I encourage anyone to check out the solution to see how it fits together. It could do with some (weak) event handlers to update the indexes on property changes. I’ve also got some ideas on how to make it faster for over, say, 100k records (it currently works well for up to around 30k) by getting rid of the duplication in the search index, and it’d be neat if we could get some form of highlighting textbox into the mix. Overall though, it’s a very simple, quick and effective way to impress with the single-box style of filter that users are getting very used to in the web/search engine world.

FullTextSearchForSilverlightAndWpf.zip

September 22 2010