Faceted Search - ΩJr. Programming Algorithms

This information lives on a webpage hosted at the following web address: 'https://www.omegajunior.net/code/programming-algorithms/'.

2 Apr. 2012.

Faceted search offers a user categories of filter tags (facets) for search results, but only those that give any results, with a tally that updates dynamically with any filter change.

This article describes 1 method to achieve this functionality, programmed in javascript, as portrayed in a proof of concept available right here on the OmegaJunior.Net.

A word from a sponsor:


We created this proof to find out whether it was possible at all using client-side scripting, and how many results a browser can filter before showing signs of delays. As it turns out, we can easily store and filter 1000 search results before running into delays by slower browsers (like Microsoft Internet Explorer 8 and older).


Visual

Here's what it can look like. The language in the picture is partially set in Dutch.

You see a search box, followed by a bread-crumb-like path of active filters, followed by the facets (with some filters activated), followed by paging buttons, followed by a couple of search results. A faceted search should not offer any filter that yields 0 results.

(Side note: In our proof of concept for dynamic result filtering, the text input serves no real purpose.)

Here's one real application: our Knowledge Base.


Think before you build

A data architect should define by which categories the search results should be allowed to get filtered. Think blog and hash tags and other context- and meaning-providing meta tags. These categories of tags shall be referred to as facets.

For instance, in a facet "location" one might find country, state, and city names. And in a facet "education" one might find names of studies and curricula, and possibly education levels.

Usually these tags are manageable through some kind of thesaurus interface, and editors and visitors are allowed to tag their articles. Some systems like to limit the tags their editors can apply. Others, like Twitter, leave the tagging completely to the whim of the author.

A system that normalises its facets into a separate data structure, like a thesaurus system, probably is most suited for building faceted search.


Interaction

Each clickable tag is followed by the exact amount of results that can be found through clicking it. After clicking a tag, the search results get filtered, all remaining tag result counts are updated, tags with a count of 0 results get hidden, and the filtered facet now shows a previously unseen method to undo that particular filter.

Our proof of concept shows it is possible to let the setting and resetting of the filters happen in real time, with search results loaded into the browser's memory. Those results only get updated from the server, when the user changes the search criteria.

The form sends out a request for search results to the web server. Our proof of concept employs zjrJS.Doc.lS(), passing the user's form input as query parameters. The server creates a Javascript-formatted result set, and returns it to the browser.

(Side note: though our proof of concept does retrieve a results file, it doesn't actually retrieve its results from the server. Instead, the script inside the results file generates results on the fly, attributing meta tags randomly. In a real situation, the server's DBMS should do the heavy lifting for generating the search results.)


Data Model

The data structure of the search results returned by the server is formatted as an array of Javascript Object Literals, which looks like this:
[
({
'title':'1st Search Result',
'desc':'Of all search results this is 1.',
'url':'somewhere.html',
'added':'20100903',
'experience':'3 years',
'education':'academic',
'location':'ams',
'language':'dutch'
})
,
({
'title':'Trivial Result Title',
'desc':'Descriptions should be read from a data source',
'url':'somewhere-else.html',
'added':'20100903',
'experience':'2 years',
'education':'academic',
'location':'ber',
'language':'german'
})
]


As you can see, a search result consists of a title, a description, an URL, and one (denormalised) tag per facet. (More tags per facet is possible of course, and we can think of situations where such would be needed. Adding support for this we deem trivial.)

The facets themselves are grouped and tallied by the server. These too are formatted as Javascript Object Literals and appended to the response the server sends back to the browser. Here's what one of our facets looks like:

Facets.Experience = [
{name: '0 years', tally: 101},
{name: '1 year', tally: 111},
{name: '3 years', tally: 112},
{name: '5 years', tally: 105},
{name: '6+ years', tally: 102}
];


This data is sent in a results script, entirely set in javascript. It should also contain a call-back to a pre-defined function in the search engine script. Thus, once the whole resuls script loads into the browser, it kicks the search engine into gear, which will parse and show the results and their faceted filters.


Parsing and Showing Received Results

Parsing the results in our case means moving the results into their own memory variable, as a filtered set (even though no filter has been applied yet). This means we're doubling the used memory, which is necessary. What we also need to do is calculate the amount of results, result pages, page size, and the current page number.

function parse () { "use strict"
var z,i,r,parsed,f;
z=Engine;
i=z.resultInfo;
r=i.results;
parsed=[];
for(var x in r){
//custom parsing goes here
if(r.hasOwnProperty(x)){
parsed[parsed.length]=new Object(r[x]);
}
}
f=({
tally: new Number(i.tally),
page: new Number(i.page),
pageSize: new Number(i.pageSize),
results: parsed
});
z.Filter.resultInfo=f;
z=i=r=parsed=f=null;
}



Showing the Facets

As the user clicks tags, the filter engine reparses the already filtered result set, reducing the amount of results (and as a bonus, memory), while at the same time calculating the amount of filtered results, pages, page size, and page number.

For that to happen, we must first show the facets and make their tags clickable:
function showFacet (x) {
var d,z,o,e,k,f,i,l,t,C,r;
d=document;
z=Engine;
C=z.Config;
e=d.getElementById( C.filter_container_id );

if(z){if(z.Facets[x]){f=z.Facets[x]}}
if(f){

//make a block to hold the facet
k=d.createElement('div');
k.className=C.filter_category_cn;

//add the facet's caption
k.appendChild( d.createElement('h3') ).appendChild( d.createTextNode( Locale.translate( 'facet_' + x ) ) );
o=k.appendChild( d.createElement('ul') );

//add a button to reset the facet filter
if( z.Filter.Choice[x] ){
l=o.appendChild( d.createElement('li') );
l.className=C.filter_reset_cn;
l.title=Locale.translate('filter_reset_title');
l.appendChild( d.createTextNode( Locale.translate( 'filter_reset_label' ) ) );
zjrJS.Doc.aE(l, 'click', new Function( 'Engine.Filter.expand("' + x + '")' ) );
}

//show this facet's tags and tallies
for( i=0; i<f.length; i++ ){
if( f[ i ].tally>0 ){ //avoid showing empty filters
l=o.appendChild( d.createElement('li') );
l.appendChild( d.createTextNode( f[ i ].name + ' (' + f[ i ].tally + ')' ) );//scrutinize
zjrJS.Doc.aE( l, 'click', new Function( 'Engine.Filter.limit("' + x + '","'+f[ i ].name+'")' ) );
}
}

e.appendChild( k );
}
d=z=o=e=k=f=i=l=t=C=r=null;
}


(Side note: To cross browser differences, we employ zjrJS.Doc.aE() to attach event listeners. Use whatever cross-browser library you prefer.)

Then from here, we need functions to limit the results (applying a filter) and, inversely, expand them (removing a filter). We chose to name those functions "narrow" and "expand".


Expanding the Result Set

Expanding in our case, or removing a filter, means reparsing the original set of search results into a fresh, unfiltered set, and then reapplying all previously chosen limitations. Why? Because the sheer amount of combinations possible with 5 facets makes remembering the chain of choices more resource-intensive than starting over.

function expand (facet) { "use strict";
document.body.className='busy';
if(facet){
var z,e,m,c,f;
z=Engine;
e=z.Filter.Choice;
m=Config.facets;
f=z.Filter;

f.parse();//start fresh

if(e[facet]){
delete Engine.Filter.Choice[facet];

//reapply previous filters
for(c=m.length; c>=0; c-=1){
if(e[m[c]]){
f.narrow(m[c], e[m[c]], true);
}
}

f.retally();
f.showAll();
}
}
z=e=m=c=f=null;
}/code]



Narrowing the Result Set

Narrowing (or applying a filter) entails running through the existing set of filtered results, and removing any result whose facets do not contain the chosen tag.

function narrow (facet, name, deferShowing) {
"use strict";
var f,i,l,t,e,r,s,z,copy;
z=Engine;
f=z.Facets;
i=z.Filter.resultInfo;
copy=[];
t=i.tally;
e=f[facet];
r=i.results;

if(e){if(e.tally>0){
z.Filter.Choice[facet]=name;
copy=[];
for(s=0;s<t;s+=1){
if(r[s]){
if(!r[s][facet]){}
else if(r[s][facet]!=name){}
else{copy[copy.length]=r[s]}
}
}
if(copy.length>0){
i.results=copy;
i.tally=Number(copy.length);
i.page=Number(1);
}
i=copy=null;
if(!deferShowing){
z.Filter.retally();
z.Filter.showAll();
}
}}
f=i=l=t=e=r=z=copy=null;
}



Conclusion

There you have it: dynamic filtering of faceted search results, where we offer the visitor only those meta tags that yield any results, no matter the combination of filters, and we tell them how many results to expect.

Let us know if you have any remarks or question!

Need problem solving?

Talk to me. Let's meet for coffee or over lunch. Mail me at “code at omegajunior dot net”.