Fast, lightweight full-text search engine for objects in WPF and Silverlight controls

Here’s a neat filtering technique that lets you quickly and easily index your C# objects and give your users a single search box that searches across entire records in any ItemsControl, e.g. an AutoCompleteBox or a DataGrid.

AutoCompleteBox:

image

DataGrid

image

Features:

  • As the screenshots show, it indexes most basic types, e.g. dates (including month names) and decimals (including comma formatting), and it would be fairly easy to extend to other search use cases.
  • It’s super-easy to implement: you just need to mark up your model (or view-model) with [Index] attributes, with no other modifications necessary.
  • It’s fast, using LINQ expression lambdas to generate compiled property accessors for the indexed properties.

I’ll attach a demo solution in WPF, but here are the key bits of code.

The fast LINQ property accessor. In my solution, this is called by another method that uses normal reflection to find the properties marked with [Index] attributes; the compiled lambdas are then cached, so this is only done once.

/// <summary>
/// Get fast compiled property accessor using a compiled linq expression lambda
/// </summary>
private static Func<T, object> GetFastPropertyAccessor<T>(string property)
{
    ParameterExpression t = Expression.Parameter(typeof (T), "t");
    MemberExpression prop = Expression.Property(t, property);
    return Expression.Lambda<Func<T, object>>(
        Expression.Convert(prop, typeof (object)), t).Compile();
}
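To show how the accessor might be wired up to the [Index] attributes, here is a minimal sketch of the "reflect once, cache the compiled lambdas" step. The IndexAttribute, IndexedAccessorCache and Customer types are my own illustrative names, not the ones in the demo solution, and the caching here simply relies on a static field per closed generic type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical marker attribute; the real [Index] attribute is not shown in the post.
[AttributeUsage(AttributeTargets.Property)]
public sealed class IndexAttribute : Attribute { }

// Hypothetical cache: one array of compiled accessors per closed type T.
public static class IndexedAccessorCache<T>
{
    // Runs once per T: slow reflection finds the marked properties,
    // then each one is compiled into a fast delegate.
    private static readonly Func<T, object>[] Accessors =
        typeof(T).GetProperties()
                 .Where(p => p.IsDefined(typeof(IndexAttribute), true))
                 .Select(p => Compile(p.Name))
                 .ToArray();

    // Same technique as GetFastPropertyAccessor in the post.
    private static Func<T, object> Compile(string property)
    {
        ParameterExpression t = Expression.Parameter(typeof(T), "t");
        MemberExpression prop = Expression.Property(t, property);
        return Expression.Lambda<Func<T, object>>(
            Expression.Convert(prop, typeof(object)), t).Compile();
    }

    // Returns the current value of every [Index] property on the item.
    public static IEnumerable<object> GetIndexedValues(T item)
    {
        return Accessors.Select(a => a(item));
    }
}

// Hypothetical model used only for this example.
public class Customer
{
    [Index] public string Name { get; set; }
    [Index] public DateTime BirthDate { get; set; }
    public string Password { get; set; }   // not marked, so never indexed
}
```

Calling IndexedAccessorCache<Customer>.GetIndexedValues(customer) would then hand back just the Name and BirthDate values, ready to be passed to the indexing method.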

The search index method is pretty basic stuff but effective. Essentially, all we do is break the object down into distinct search tokens by combining all indexed fields into one string, then splitting the result into a string array of tokens. We’ll do the same with the search text and match the two up. In WPF we can AsParallel() it to make this very fast, but this works fine for approx. 30k records on my current project.

public string[] IndexUsingFields(IEnumerable<object> fields)
{
    fields = fields.Where(a => a != null).ToArray();
    var decimals = fields.OfType<decimal>();
    var ints = fields.OfType<int>();
    var dateTimes = fields.OfType<DateTime>();
    // Decimals and dates are formatted separately below, so exclude them from the plain join.
    // Both sides need Cast<object>() because value types are not covariant.
    var exclude = decimals.Cast<object>().Concat(dateTimes.Cast<object>());
    // StringJoin is a custom extension method (not shown here).
    string indexString = string.Concat(
        fields.Except(exclude).StringJoin(" "), " ",
        dateTimes.Select(d => d.ToString("d M MMMM yy yyyy hh:mm:ss")).StringJoin(" "), " ",
        decimals.Select(d => d.ToString("#.########")).StringJoin(" "), " ",
        decimals.Where(d => d > 1000).Select(d => d.ToString("N8")).StringJoin(" "), " ",
        ints.Where(d => d > 1000).Select(d => d.ToString("N")).StringJoin(" "))
        .ToLower();
    return this._searchSplitRegex.Split(indexString)
        .Select(s => this._removeLeadingZerosRegex.Replace(s, string.Empty))
        .Where(s => !string.IsNullOrEmpty(s))
        .Distinct()
        .ToArray();
}
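The method above leans on a few helpers that aren’t shown in the post: the StringJoin extension and the two regexes. Here is a rough sketch of what they could look like. The regex patterns and the IndexHelpers/Tokenize names are my assumptions for illustration, not the definitions from the demo solution.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class IndexHelpers
{
    // Assumed pattern: split on any run of characters that is not a letter,
    // a digit, '.' or ',' (so "1,234.50" survives as a single token).
    public static readonly Regex SearchSplitRegex = new Regex(@"[^\p{L}\p{N}.,]+");

    // Assumed pattern: drop zeros at the start of a token when a digit follows,
    // so "07" becomes "7" but "0" and "0.5" are left alone.
    public static readonly Regex RemoveLeadingZerosRegex = new Regex(@"^0+(?=\d)");

    // string.Join as an extension method over any sequence.
    public static string StringJoin<T>(this IEnumerable<T> source, string separator)
    {
        return string.Join(separator, source.Select(x => Convert.ToString(x)).ToArray());
    }

    // The final split/clean-up step of IndexUsingFields on its own.
    public static string[] Tokenize(string indexString)
    {
        return SearchSplitRegex.Split(indexString.ToLower())
            .Select(s => RemoveLeadingZerosRegex.Replace(s, string.Empty))
            .Where(s => !string.IsNullOrEmpty(s))
            .Distinct()
            .ToArray();
    }
}
```

With these patterns, a date such as 07 March would be findable by typing either "7" or "mar", which is the kind of behaviour the screenshots suggest.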

And then the search itself: again just a simple split to get the search tokens, then matching them up with the index using some basic LINQ:

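private string[] GetSearchTokens(string searchText)
{
    return _searchTextTokensRegex.Split(searchText);
}

public bool Matches<T>(T subject, string searchText)
{
    // GetSearchTokensCached is not shown; presumably a caching wrapper around GetSearchTokens
    string[] searchTokens = this.GetSearchTokensCached(searchText);
    // every search token must appear inside at least one of the subject's index tokens
    return searchTokens.All(
        t => _searchIndexDict2[typeof(T)][subject].Any(s => s.Contains(t)));
}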
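To make the matching rule concrete, here is a small standalone sketch of the same All/Any/Contains idea, plus the AsParallel() filtering mentioned earlier. TokenMatcher and its plain dictionary index are illustrative stand-ins for the cached structures in the demo solution, and the search text is split on spaces only to keep things short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenMatcher
{
    // True when every search token is a substring of at least one index token.
    // An empty search has no tokens, so it matches everything.
    public static bool Matches(string[] indexTokens, string searchText)
    {
        string[] searchTokens = searchText.ToLower()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return searchTokens.All(t => indexTokens.Any(s => s.Contains(t)));
    }

    // Filters the items in parallel; AsOrdered keeps the original item order.
    // The index is only read here, which is safe as long as nothing writes to it concurrently.
    public static List<T> Filter<T>(
        IDictionary<T, string[]> index, IEnumerable<T> items, string searchText)
    {
        return items.AsParallel().AsOrdered()
                    .Where(i => Matches(index[i], searchText))
                    .ToList();
    }
}
```

So a record indexed as "alice", "12", "march", "1985" matches the search "ali mar" but not "ali jul", because every search token has to land somewhere in the record.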

This is all a bit simplified and edited for the blog, so I encourage anyone to check out the solution to see how it fits together. It could do with some (weak) event handlers to update the indexes on property changes. I’ve also got some ideas on how to make it faster for over, say, 100k records (it currently works well for up to around 30k) by getting rid of the duplication in the search index, and it’d be neat to get some form of highlighting textbox into the mix. Overall though, it’s a very simple, quick and effective way to impress with the single-box style of filter that users are getting very used to in the web/search engine world.

image
FullTextSearchForSilverlightAndWpf.zip

September 22 2010

WPF/Silverlight MVVM Design Data - create once and share with Unit Tests

Getting data to appear in design view in Blend is key to being able to manipulate your layout and styles without needing to run up your app (to quote Marcin, ‘nudge, nudge, nudge instead of nudge-F5-waaaait, nudge-F5-waaaaait’).

As soon as a colleague pointed out to me that VS2010 now supports Blend’s d:DataContext, I brought forward getting things ‘working in design view’, as I realised here was a way to quickly and easily wire up a test rig of data in C# that I could share between my unit tests, a mock session of my WPF/Silverlight app and design view in both Blend and VS2010 (rather than XAML test data for the UI and separate C# data for the tests and runtime test UI). Down to it …

Working backwards, here’s my end result (Bob and Alice are a List<Person> bound to a ListView). It doesn’t look like much, but it works with all XAML controls including DataGrids, TreeViews etc., so you should quickly start to see the power in its simplicity.

image

Key XAML:

<UserControl ... xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:md="clr-namespace:MvvmDemo" mc:Ignorable="d"
        d:DesignHeight="600" d:DesignWidth="800"
        d:DataContext="{x:Static md:DesignData.PersonListViewModel}"
        x:Name="view" DataContext="{Binding ElementName=view, Path=ViewModel}">
... <TextBox Text="{Binding Filter.PersonName}"
... <DatePicker SelectedDate="{Binding Filter.BirthDate}"
... <ListBox ItemsSource="{Binding FilteredPeople}">

By the way, this is very cool: the VS2010 Properties window now lets me navigate my ViewModel object hierarchy, as long as I’m bound to my real DataContext in XAML. I still prefer the speed of ReSharper smart auto-completion though (Ctrl+Alt+Space).

image

Anyway, back to the XAML … note that d:DataContext is bound to a property on a static DesignData class. Here you could either inject a fake/mock repository and have the ViewModel do its thing, or stop it from loading and populate it in the DesignData class. Both are good techniques for unit testing and design data, and can be used interchangeably. Here I demo both: the view model loads existing people from the repository (hence the fake repository at design time), and I populate the form filter fields directly.

public static class DesignData
{
    static DesignData()
    {
        Container.Register<IPersonRepository, FakePersonRepository>();
    }

    public static PersonListViewModel PersonListViewModel
    {
        get
        {
            return new PersonListViewModel
                       {
                           Filter = new PersonFilter
                                        {
                                            PersonName = "Ca",
                                            BirthDate = DateTime.Now.AddYears(-20)
                                        }
                       };
        }
    }
}

I inject and reuse the FakePersonRepository in my TestUI exe (a separate exe, so as not to pollute my release code with accidental fake data).

And just for good measure, here’s a kind of contrived test. The fake data doesn’t really affect the test, but you get the idea: write once, use everywhere, same as the rest of your code.
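For reference, a fake repository along these lines can be as small as the sketch below. The IPersonRepository and FakePersonRepository names and the GetPeople() method appear in the code in this post, but the shape of the interface, the Person properties and the birth dates are my guesses for illustration rather than the actual demo code.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the model; only Name is confirmed by the test in this post.
public class Person
{
    public string Name { get; set; }
    public DateTime BirthDate { get; set; }
}

// Assumed shape of the repository contract.
public interface IPersonRepository
{
    List<Person> GetPeople();
}

// In-memory stand-in shared by design view, the test UI and unit tests.
public class FakePersonRepository : IPersonRepository
{
    public List<Person> GetPeople()
    {
        // Illustrative data only.
        return new List<Person>
        {
            new Person { Name = "Alice", BirthDate = new DateTime(1980, 5, 17) },
            new Person { Name = "Bob", BirthDate = new DateTime(1975, 11, 2) }
        };
    }
}
```

Because the fake returns fresh objects on every call, design view, the test UI and each unit test all start from the same known data without stepping on each other.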

[TestMethod]
public void search_on_name_should_filter()
{
   // ARRANGE 
   _sut = new PersonListViewModel();
   var people = new FakePersonRepository().GetPeople();
   _sut.People.AddRange(people);
   people = new List<Person> { new Person {Name = "Carol"},
                        new Person {Name = "Carlos"},
                        new Person {Name = "Charlie"} };
   _sut.People.AddRange(people);
   string search = "Ca";
   _sut.PersonName = search;

   // ACT 
   _sut.Search();

   // ASSERT
   Assert.IsFalse(_sut.FilteredPeople.Any(p => !p.Name.StartsWith(search)));
   Assert.IsTrue(_sut.FilteredPeople.Any(p => p.Name.StartsWith(search)));
}

Full code here:

image

April 1 2010

Static reflection INotifyPropertyChanged

Every MVVM framework contains a ViewModelBase with an implementation of INotifyPropertyChanged. Mine's no different, but I wanted the flexibility to control when property changes are fired, and whether to do other stuff or fire other property changes at the same time, all strongly typed (for ReSharper renaming, navigation etc). Static reflection came into play, and I came up with a solution that gave me the ability to do this:

if ( SetProperty( ref this._company, value, () => this.Company ) )
{
	// update something else, fire other property changes etc ... 
	this.FirePropertyChanged( () => this.Salary );
}

To achieve this I used static reflection as follows:

public bool SetProperty<T>( ref T field, T value, Expression<Func<T>> property )
{
	if ( ( Equals( field, default( T ) ) && Equals( value, default( T ) ) )
		  || ( !Equals( field, default( T ) ) && field.Equals( value ) ) )
	{
		return false;
	}
	field = value;
	if ( property != null )
	{
		this.FirePropertyChanged( property );
	}
	return true;
}

public void FirePropertyChanged<T>( Expression<Func<T>> property )
{
	// copy to a local so the null check and the call see the same delegate
	var handler = this.PropertyChanged;
	if ( handler != null )
	{
		handler( this, new PropertyChangedEventArgs( property.GetPropertyName() ) );
	}
}

public static class StaticReflectionExtensions
{
	public static string GetPropertyName<T>( this Expression<Func<T>> property )
	{
		return ( ( (MemberExpression)property.Body ).Member ).Name;
	}
}
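Pulling the pieces together, here is a compact, self-contained sketch of a base class and a view model using it. EmployeeViewModel is a made-up example class, and I've swapped the hand-rolled equality check for EqualityComparer<T>.Default purely to keep the sketch short; it is intended to behave the same for nulls and value types.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq.Expressions;

public abstract class ViewModelBase : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    // Returns true only when the value actually changed.
    protected bool SetProperty<T>(ref T field, T value, Expression<Func<T>> property)
    {
        // Swapped-in equality check (assumption: equivalent to the original logic).
        if (EqualityComparer<T>.Default.Equals(field, value))
        {
            return false;
        }
        field = value;
        if (property != null)
        {
            FirePropertyChanged(property);
        }
        return true;
    }

    protected void FirePropertyChanged<T>(Expression<Func<T>> property)
    {
        var handler = PropertyChanged;
        if (handler != null)
        {
            // The lambda body is a member access, so its member name is the property name.
            string name = ((MemberExpression)property.Body).Member.Name;
            handler(this, new PropertyChangedEventArgs(name));
        }
    }
}

// Hypothetical view model; Company and Salary echo the snippet earlier in the post.
public class EmployeeViewModel : ViewModelBase
{
    private string _company;
    private decimal _salary;

    public string Company
    {
        get { return _company; }
        set
        {
            if (SetProperty(ref _company, value, () => Company))
            {
                // Salary depends on Company in this example, so refresh it too.
                FirePropertyChanged(() => Salary);
            }
        }
    }

    public decimal Salary
    {
        get { return _salary; }
        set { SetProperty(ref _salary, value, () => Salary); }
    }
}
```

Setting Company to a new value raises PropertyChanged for Company and then Salary; setting it to the same value again raises nothing, which is what makes the bool return value handy.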

Upgrade to Linq AsParallel

I was amazed and delighted a couple of weeks ago when ReSharper 5 suddenly decided a huge piece of code I was refactoring could be converted into LINQ. My esteemed ex-colleague John Rayner has already blogged about this, so there’s no need to repeat it (check out the amazing chunk of code it reduced). I showed it to someone at work though and he asked the question ‘why?’, as to him the original was more readable. I tend to agree with John that LINQ (method chaining) is more readable now, and as everyone groks it I’m sure you will too. However, I have come up with another neat reason ‘why’: you can parallelize your (thread-safe) .NET 4 code for free.

Run this code:

int c = 0; 
foreach (var i in Enumerable.Range(1, 1000000))
{
    if (IsPrime(i))
    {
        c++;
    }
}
// ... 
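static bool IsPrime(int val)
{
    // 0 and 1 are not prime
    if (val < 2)
    {
        return false;
    }
    // Range takes (start, count), so this tests the divisors 2 .. val - 1
    foreach (var i in Enumerable.Range(2, val - 2))
    {
        if (val % i == 0)
        {
            return false;
        }
    }
    return true;
}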

Watch it churn a single CPU (to be fair this didn't dent my CPU too much, but it did take over 2 minutes)

image

Upgrade to LINQ (Alt+Enter on the foreach):

int c = Enumerable.Range(1, 1000000)
                .Count(IsPrime);

Stick in a strategic AsParallel, and watch it use up all of your CPUs nicely:

int c = Enumerable.Range(1, 1000000)
    .AsParallel()
    .Count(IsPrime);

image

The latter took ~30s.
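If you want to try the comparison yourself, a rough timing harness along these lines should do. The PrimeCount class and the smaller range of 100,000 are my own choices to keep the run short, and the numbers you get will depend entirely on your machine and core count.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class PrimeCount
{
    // Deliberately naive trial division so there is plenty of CPU work to spread around.
    public static bool IsPrime(int val)
    {
        if (val < 2)
        {
            return false;
        }
        for (int i = 2; i < val; i++)
        {
            if (val % i == 0)
            {
                return false;
            }
        }
        return true;
    }

    public static int CountSequential(int max)
    {
        return Enumerable.Range(1, max).Count(IsPrime);
    }

    public static int CountParallel(int max)
    {
        // IsPrime touches no shared state, so it is safe to run in parallel.
        return Enumerable.Range(1, max).AsParallel().Count(IsPrime);
    }

    public static void Main()
    {
        const int max = 100000; // smaller than the 1,000,000 used above

        var sw = Stopwatch.StartNew();
        int sequential = CountSequential(max);
        Console.WriteLine("Sequential: {0} primes in {1} ms", sequential, sw.ElapsedMilliseconds);

        sw.Restart();
        int parallel = CountParallel(max);
        Console.WriteLine("Parallel:   {0} primes in {1} ms", parallel, sw.ElapsedMilliseconds);
    }
}
```

Both counts should agree; only the elapsed time should differ.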

Credit to Christer for the IsPrime AsParallel example. I really did just paste two blog posts together here, but I hope someone finds it useful ;)
