Event related data – the buzz word at ECIR 2013

One of the major trends at the 35th annual European Conference on Information Retrieval was event related data. The conference took place between the 24th and 27th of March this year in a snowy Moscow, Russia. It attracted around 300 participants from all over the globe, 3 of them findwizards. While ECIR 2013 provided talks on a large variety of topics from across the field, event related data was definitely a buzz word.

The keynote speaker opening the second day of the conference was Mor Naaman, assistant professor at Rutgers University and CTO of Mahaya Inc. In his talk, Mr Naaman let the following image explain why Mahaya Inc. is in business.

Figure. Rome then and now: the past two papal elections.

The image above clearly shows that the way people act at events has changed considerably in the past few years. Nowadays everyone is a reporter, and their stories can be found on social media. Using platforms such as Twitter, Facebook and YouTube as data sources, Naaman’s company creates products which not only extract but also synchronize event coverage. One interesting feature in their latest product is the synchronization of video clips, which makes it possible for a user to easily switch views when watching video footage of, for example, a concert. An arguably even stronger feature of this use of social media is that news and event footage can reach the world even if no press is present at the scene. Slides from this inspiring talk can be found here.

Another presentation the same day displayed promising results in the task of automatic event detection. Using machine learning algorithms, a team of researchers from Hanover, Germany, has designed a system for detecting and summarizing entity-related events from Wikipedia edit history data. The basic idea is that when a Wikipedia article is edited by a large number of users in a short period of time, this can signal an important event concerning the subject of the article. More information about this research can be found here.
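To make the basic idea concrete, here is a minimal sketch in Python. It is not the researchers' system, which relies on machine learning; it is a plain threshold rule that flags an article when enough distinct editors touch it within a sliding time window. The function name, the one-hour window and the threshold of three editors are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def detect_edit_bursts(edits, window=timedelta(hours=1), min_editors=3):
    """Return {article: [window_start, ...]} for bursts of distinct editors.

    `edits` is an iterable of (article, editor, timestamp) tuples.
    The window size and editor threshold are illustrative assumptions.
    """
    by_article = defaultdict(list)
    for article, editor, timestamp in edits:
        by_article[article].append((timestamp, editor))

    bursts = {}
    for article, events in by_article.items():
        events.sort()  # order edits by time
        starts = []
        left = 0
        for right in range(len(events)):
            # Advance the left edge until the window spans at most `window`.
            while events[right][0] - events[left][0] > window:
                left += 1
            editors = {editor for _, editor in events[left:right + 1]}
            if len(editors) >= min_editors:
                start = events[left][0]
                if not starts or starts[-1] != start:
                    starts.append(start)
        if starts:
            bursts[article] = starts
    return bursts
```

Counting distinct editors rather than raw edits keeps a single busy contributor from triggering a false alarm.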

The last day of the conference opened with a presentation from Jimmy Lin of Twitter. His talk centered on the importance of fast real-time indexing in social media platform architecture. One of the strengths of Twitter is presenting the users with information about events as they happen. As an example of this he used the event of an earthquake hitting eastern USA in 2011. Tweets from locations closer to the epicenter of the earthquake reached Twitter users in New York City before the actual quake did. I have to admit “Twitter, faster than earthquakes” is a pretty good slogan.

So whether it’s using social media data to let people (re)visit events, automatic event detection in openly edited encyclopedias, making sure your indexing is fast enough to let your users cover events as they happen, or something else, event-related data seems to be one of the driving forces in the field of IR at the moment.

Query Rules in SharePoint 2013

Leaving both the SharePoint Conference in Las Vegas and the recent European SharePoint Conference in Copenhagen behind, Findwise continues sharing impressions about the new search in SharePoint 2013! We have previously given an overview of what is new in search in SharePoint 2013 and discussed Microsoft’s focus areas for the release. In this post, we focus more on the ranking of the search results using the query rules.

Understanding user intent in search is one of the key developments in the new release. The screenshots below, showing out-of-the-box functionality on some sample content, exemplify how the search engine adapts to the user query. Keywords such as ‘deck’, ‘expert’, or ‘video’ can express the user’s needs and expectations for different search results and information types. What the search engine does in this case is promote those results that have a higher probability of being relevant to the user’s search.

Figure. Query rules. Source: Microsoft

The adaptability of the search results can seem remarkable, as we see in these examples, aiming to provide more relevant search results through a better understanding of the user intent. Actually, this is powered by a new feature in SharePoint 2013 called query rules. Even more interesting maybe is that you can define your own custom query rules matching your specific needs without writing any code!

The simplest query rule would be to promote a specific result for a given search query. For example, you can promote a product’s instruction manual when the users search for that product name. Previously, in SharePoint 2010, you were able to define such promoted results (or “best bets”) using the Search Keywords. The query rules in SharePoint 2013 extend this functionality, providing an easy way to create powerful search experiences that adapt to user intent and business needs.

When defining a query rule, there are two main things to consider: conditions and corresponding actions. The conditions specify when the rule will be applied and the actions specify what to do when the rule is matched. There are six different condition types and three action types that can be defined.

For example, a query condition can be that a query keyword matches a specified phrase or a term from a dictionary (such as ‘picture’, ‘download’ or a product name from the term store), that the query is more popular for a certain result type (such as images when searching for ‘cameras’), or that it matches a given regular expression (useful for matching phone numbers, for example).

The correlated actions can consist of promoting individual results on top of the ranked search results (for example the image library), promoting a group of search results (such as image results, or search results federated from a web search engine), or changing the ranking of the search results by modifying the query (by changing the sorting of results or filtering on a content type).

Another thing to consider is where you define the rule. Query rules can be created at Search Service Application, Site Collection, or Site level. The rules are inherited by default, but you can remove, add, configure and change the order of query rules at each level. Fortunately, SharePoint also allows you to test a query and see which rules will fire.
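The condition/action pairing can be illustrated with a small Python sketch. This is not the SharePoint API (query rules are configured in the administration UI without code); it only models the idea of matching a query against conditions and reporting which rules would fire. All names, the sample rules and the phone number pattern are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryRule:
    name: str
    condition: Callable[[str], bool]  # decides when the rule applies
    action: str                       # description of what to do when it fires


def keyword_in(terms):
    """Condition: any query word matches a term from a dictionary."""
    lowered = {term.lower() for term in terms}
    return lambda query: any(word in lowered for word in query.lower().split())


def matches_regex(pattern):
    """Condition: the query matches a regular expression."""
    compiled = re.compile(pattern)
    return lambda query: compiled.search(query) is not None


# Hypothetical rules for illustration only.
RULES: List[QueryRule] = [
    QueryRule("media keywords", keyword_in({"picture", "download"}),
              "promote result: image library"),
    QueryRule("phone number", matches_regex(r"\b\d{3}-\d{3}-\d{4}\b"),
              "promote result group: people results"),
]


def fired_rules(query, rules=RULES):
    """Return the names of the rules whose condition matches the query."""
    return [rule.name for rule in rules if rule.condition(query)]
```

Calling `fired_rules("download camera picture")` plays the same role as the built-in query test: it shows which rules a given query would trigger before the rules go live.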

There is one more thing though that you need to take into account: some features of query rules are limited in some of the licensing plans. Some plans only allow you to add the promoted results, and the more advanced actions on query rules are disabled. Check TechNet for guidelines on managing query rules and a list of features available across different licensing plans.

With the query rules, you have the freedom and power to change the search experience and adapt it to your needs. Defining the right keywords to be matched on the user queries and mapping the conditions with the relevant actions is easy but the process must undoubtedly be well managed. The management of the query rules should definitely be part of your SharePoint 2013 search governance strategy.

Let’s have a chat about how you can create great search experiences that match the specific needs of your users and your business!

Presentation: Enterprise Search – Simple, Complex and Powerful

Every second, more and more information is created and stored in various applications: corporate websites, intranets, SharePoint sites, document management systems, social platforms and many more. Inside the firewall, the growth of information is similar to that of the internet. However, even though major players on the web have shown that navigation can’t compete with search, the Enterprise Search and Findability Report shows that most organisations have only a small or even a non-existent budget for search.

Web Search and Enterprise Search

Web search engines like Google have made search look easy. For enterprise search, some vendors promise a magic box. Buy a search engine, plug it in and wait for the magic to happen! Imagine the disappointment when both search results and performance are poor and users can’t find what they are looking for…

When you start planning your enterprise search project you soon realize the complexity and challenge – how do you meet the expectations created by Google?

The Presentation

This presentation was originally given by Mattias Brunnert at the joint NSW KM Forum and IIM September event in Sydney, Australia. It covers topics such as:

  • Why search is important and how to measure success
  • Why Enterprise Search and Information Management should be friends
  • How to kick off your search program

The Enterprise Search and Findability Report 2012 is ready

No strategy, no budget, no resources. This is the common scenario for enterprise search and findability in many organisations today. Still, enterprise search is considered a critical success factor in 75% of the organisations that responded to the global survey, which ran from March to May this year.

The Enterprise Search and Findability Report 2012 is now ready for download.

The Enterprise Search and Findability Report 2012 shows that 60% of the respondents find it very or moderately hard to find the right information. Only 11% stated that it is fairly easy to search for information, and as few as 3% consider it very easy to find the desired information. This shows that there is still a large untapped potential for any organisation to get great value from investing in enterprise search. With a relatively small investment, preferably in personnel, it is possible to make search a lot better. The survey also reveals that organisations that are very satisfied with their search have a (larger) budget, more resources and work systematically with analysing search.

Figure. What is your primary goal for utilising search technology in your organisation?

The primary goals for using search are to accelerate retrieval of known information sources (91%) and to improve the re-use of content, that is, information and knowledge (72%). This indicates that search within organisations is often used as a discovery tool for what is already known. Looking ahead over the next three years, as many as 77% think that the amount of information in the organisation will increase. This means that every year it will be even more important to be able to find the right information, and that enterprise search is still very much needed, as stated in the following great presentations (on video): Why Business Success Depends on Enterprise Search (by Martin White of Intranet Focus) and The Enterprise Search Market – What should be on your radar? (by Alan Pelz-Sharpe of 451 Research).

Download the full report.

A look at European Conference on Information Retrieval (ECIR) 2012

The 34th European Conference on Information Retrieval was held 1-5 April 2012 in the lovely but crowded city of Barcelona, Spain. The core conference attracted over 100 attendees, with a total of 35 accepted full papers, 28 posters, and 7 demos being presented. As opposed to the previous year, which had 2 parallel sessions, this year’s conference ran as a single session. The accepted papers covered a diverse range of topics, and were divided into query representation, blog and online-community search, semi-structured retrieval, applications, evaluation, retrieval models, classification, categorisation and clustering, image and video retrieval, and systems efficiency.

The best paper award went to Guido Zuccon, Leif Azzopardi, Dell Zhang and Jun Wang for their work entitled “Top-k Retrieval using Facility Location Analysis” and presented by Leif Azzopardi during the retrieval models session. The authors propose using facility location analysis taken from the discipline of operations research to address the top-k retrieval problem of finding “the optimal set of k documents from a number of relevant documents given the user’s query”.

Meanwhile, “Predicting IMDB Movie Ratings using Social Media” by Andrei Oghina, Mathias Breuss, Manos Tsagkias and Maarten de Rijke won the best poster award. With a different goal from the best paper, the authors of the poster experiment with a prediction model for rating movies using a set of qualitative and quantitative features extracted from the streams of two social media channels, YouTube and Twitter. Their findings show that the highest predictive performance is obtained by combining features from both channels, and they propose including other social media channels as future work.

Workshop Days

The conference was preceded by a full day of workshops and tutorials running in parallel. I attended two workshops: Information Retrieval Over Query Sessions (SIR) during the morning and Task-Based and Aggregated Search (TBAS) in the afternoon. The second workshop ended with an interactive discussion. A third, full-day workshop was Searching 4 Fun!.

Industry Day

The last day was the Industry Day, with only 2 papers, 5 oral contributions and around 50 attendees. A strong focus of the talks was opinion mining: four of the six participating companies and institutions presented work on sentiment analysis and opinion mining from social media streams. Jussi Karlgren, from Gavagai, argued that companies can use sentiment analysis of social media to, for example, find reviews or comments about their product or service, analyse their market position and predict price movements. Rianne Kaptein, from Oxyme, backed this up by adding that businesses are interested in what consumers say about their brand, products or campaigns on social media. Furthermore, Hugo Zaragoza from Websays identified two basic needs inside a company: a need for help in reading so that someone can act, and a need for help in explaining so that someone can convince. This is a very interesting topic indeed, and research in this direction will advance as companies become more aware of the business gains from opinion mining of social media.

Overall, ECIR 2012 was a very inspiring conference. It also seemed a very friendly conference, offering many opportunities to network with fellow attendees. Despite that, several participants said that the number of attendees at this year’s conference had decreased in comparison with previous years. The workshops and the core conference gave me the impression that it has a strong focus on young researchers, as many of the accepted contributions had a student as first author and presenter at the conference. The fact that there was only one session running at a time was a good decision in my opinion, as the attendees were not forced to miss presentations. Nevertheless, the workshops and tutorials were running in parallel, and although the proceedings of the workshops will be made freely available, I still feel that I missed something that day. The industry day was very exciting, offering the opportunity to share ideas between academia and industry. However, there were not many presentations, and the topics were not very diverse. I propose that next year Findwise will be among the speakers at the Industry track!

ECIR 2013 will be held in Moscow, Russia, on 24-28 March. See you there!

Mobile clients and Enterprise Search – What are the Implications?

As we all know, the smartphone user base is growing explosively. According to www.statcounter.com, internet access from handheld mobile devices has doubled yearly since 2009, adding up to 8.5% of all page views globally in January 2012. Mobile users want to be able to do all the same things that they can do on their PC, and that includes access to the company’s Enterprise Search solution!

The benefits of the sales force being able to search for vital customer information before a meeting or for field service personnel being able to find documentation quickly are quite obvious. So how can an organization tweak its search solution in order to provide convenient access for the mobile users? And above all, what will it cost?

Well, to answer the last question first: much less than you think. Providing for the mobile user is mainly about creating a new front end/UI. The main bulk of your search solution remains the same; indexing, metadata structure and content publishing, for instance, remain essentially unaffected.

But you do need to provide a quite different UI for the user interaction to work smoothly, given the specific characteristics of the mobile client, primarily screen size/resolution and text input. The smartphone also has a lot of features that the PC lacks: it is always available, it knows exactly where you are, and it always has a camera, microphone and speaker, possibly a magnetometer and accelerometer, and of course a touchscreen with gestures like pinching and swiping. Many of these features can be quite useful, as the following examples show:

Illustration 1. Google Mobile Voice Search on the iPhone. Courtesy of UX Matters, www.uxmatters.com

  • Google Mobile App for iPhone: in this app, the iPhone senses when the phone is lifted towards the ear and hence knows when to listen for a search command. Since the phone also knows where the user is, a search for “restaurant” automatically generates hits with restaurants in your vicinity.
  • Scanning a Barcode or QR-code: scanning a Barcode or QR-code with your phone is another way of entering a search string. An example could be a product in a store where the customer could open a price-search-engine and scan the QR-code of the product and see where the best price is.

As you can see, there are plenty of opportunities for those who want to be creative. But for the most part, the I/O will still be done via the screen. At UX Matters there is a great article by Greg Nudelman describing the considerations when implementing search for mobile clients and suggestions for various design patterns that can be efficient (see http://www.uxmatters.com/mt/archives/2010/04/design-patterns-for-mobile-faceted-search-part-i.php). I have included a brief summary below together with illustrations courtesy of UX Matters. But first, some general considerations for mobile clients:

  • Device detection: use JavaScript to detect what type of device is accessing your search solution and, if it is a mobile client, display the mobile interface.
  • Native App or Mobile Web App: creating a mobile web app is easier and cheaper than creating a native app. For one thing, you don’t have to create multiple versions for different operating systems (although you still need to test your solution with different browsers and resolutions). Performance-wise there isn’t a big difference between native apps and web apps, and mobile browsers are increasingly gaining access to most of the phone’s hardware as well.
  • Authentication: SSO for mobile web applications works the same as for desktop browsers. There are also new solutions currently being launched that enable usage of the company’s existing Active Directory infrastructure. One example is Centrify DirectControl for Mobile, which enables centralized administration within Active Directory of all device security settings, profiles, certificates and restrictions.
  • Use HTML5 instead of Flash: iPhones don’t support Flash, but HTML5 is a very capable alternative.
  • Testing: how the design looks at different resolutions can be tested through various emulators, but it is always advisable to test on a limited set of real smartphones as well.
  • Access needs to be quick and simple: user interaction is more cumbersome on a phone than on a PC. Normally try to avoid solutions that require more than 3 input actions.
  • Menu navigation: links on the right side are normally used to drill down in the menu hierarchy, and links on the left side to move up towards the home screen.
  • Gestures: a very powerful toolbox that can be used in many different ways to create an efficient UI. For example, use “pinch to show more” if you want to expand the summary information of a specific item in the search hit list or “swipe” to expose the metadata (or whatever action you want to assign to that gesture).
  • Be creative: the mobile client is inherently different from a PC, limited in some ways but more powerful in others. So if you just try to adopt design solutions from the PC and fit them into a mobile UI you are missing out on a lot of powerful design solutions that only make sense on a mobile client and you are definitely not giving the users the best possible search experience. Also, since mobile design is still evolving you don’t need to be limited by conventions and expectations as much as on the PC side – make the most of this freedom to be creative!
  • W3C mobile: for more information about mobile web development, see http://www.w3.org/Mobile/ which also includes a validating scheme to assess the readiness of content for the mobile web

Design patterns for mobile UI (courtesy of Greg Nudelman/UX Matters)

Mobile faceting can be tricky, but by using design patterns like “4 Corners”, “Modal Overlays”, “Watermarks” and “Teaser Design” the UI can become intuitive and easy to learn while still providing reasonably powerful functionality. As mentioned, these techniques are summarised from an article written by Greg Nudelman for UX Matters. If you are eager to learn more, feel free to check out Greg’s website and his upcoming workshops focused on mobile design: http://www.designcaffeine.com/category/workshops/

4 Corners: instead of stealing scarce real estate by adding faceting options directly on the screen together with the search result, semitransparent buttons are available in each corner enabling the user to bring up a faceting menu by tapping in a corner (see illustration 2).

Modal Overlays: the modal overlay is displayed on top of the original page. The modal overlay works well together with the 4 corners design – tapping a corner opens up the overlay containing faceting functions like filtering and sorting (see illustration 2).

Illustration 2. Four Corners and Modal Overlay patterns. Courtesy of UX Matters, www.uxmatters.com

Watermarks: a great technique for guiding users and showing the possibility of using new functions. The watermarks, possibly animated, show a symbol for the available action, for instance arrows indicating that a swiping gesture could be used (see illustration 3).

Full-Page Refinement Options Pattern: gives the user plenty of refinement options to choose from (see illustration 3).

Illustration 3. Two variations of the Watermark pattern and a Refinement Options pattern. Courtesy of UX Matters, www.uxmatters.com

Teaser Design: show part of the next available content so that the user is aware that there is more content available (see illustration 4).

Illustration 4. Teaser design pattern facilitates the discovery of faceted search filters. Courtesy of UX Matters, www.uxmatters.com

Persistent Status Bar: always maintain a persistent status bar containing the search string together with applied filters in the search result page. This helps the user maintain orientation. Note that all of the illustrations above have a persistent status bar.

Conclusion

Although Best Practices for mobile UI design are still evolving, plenty of progress has already been made and there are several solutions and design patterns to choose from depending on the specific circumstances at hand. So an implementation project need not be rocket science, as long as you learn the right tricks…

Bringing enterprise information to the field, readily available in a mobile handset or tablet, will mobilize your employees. The UI requires rethinking as we have seen. And security needs to be addressed properly to avoid having sensitive data compromised. But other than that, you are ready to go!

Google Search Appliance (GSA) 6.12 released

Google has released yet another version of the Google Search Appliance (GSA). It is good to see that Google stays active when it comes to improving its enterprise search product! Below is a list of the new features:

Dynamic navigation for secure search

The facet feature, new since 6.8, is still being improved. When filters are created, it is now possible to take into account that they should only include secure documents which the user is authorized to see.

Nested metadata queries

In previous Search Appliance releases there were restrictions for nesting meta tags in search queries. In this release many of those restrictions are lifted.

LDAP authentication with Universal Login

You can configure a Universal Login credential group for LDAP authentication.

Index removal and backoff intervals

When the Search Appliance encounters a temporary error while trying to fetch a document during crawl, it retains the document in the crawl queue and index. It schedules a series of retries after certain time intervals, known as “backoff” intervals, before removing the URL from the index.

One example of when this is useful is the processing pipeline that we have implemented for the GSA. The GSA uses an external component to index the content. If that component goes down, the GSA will receive a “404 – page does not exist” response when trying to crawl, and this may cause mass removal from the index. With this functionality turned on, that can be avoided.
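The retry-before-removal behaviour can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the appliance's implementation: the function name is hypothetical, and the intervals are whatever the caller passes in rather than the GSA's actual schedule.

```python
from typing import Callable, Sequence


def should_remove_from_index(fetch_ok: Callable[[], bool],
                             backoff_intervals: Sequence[float],
                             sleep: Callable[[float], None]) -> bool:
    """Return True only if the first fetch and every scheduled retry fail.

    `fetch_ok` tries to fetch the document and reports success.
    `backoff_intervals` lists the waits (in seconds) before each retry.
    `sleep` is injected so the waiting can be replaced in tests.
    """
    if fetch_ok():
        return False
    for interval in backoff_intervals:
        sleep(interval)      # wait out the backoff interval
        if fetch_ok():
            return False     # the source recovered, so keep the document
    return True              # all retries failed, removal is justified
```

In a real crawler `sleep` would be `time.sleep` or a scheduler callback. The point of the pattern is that a short outage of the external component only costs a few retries instead of emptying the index.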

Specify URLs to crawl immediately in feeds

Release 6.12 provides the ability to specify URLs to crawl immediately in a feed by using the crawl-immediately attribute. This is a nice feature in order to prioritise what needs to get indexed quickly.
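As a rough illustration, the Python sketch below builds a small feed in which selected records carry the crawl-immediately attribute. Only that attribute name comes from the release notes above. The surrounding element names (gsafeed, header, datasource, feedtype, group, record) and the feed type reflect my understanding of the GSA feed format and should be checked against the Feeds Protocol documentation. The function name and URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, urls, urgent=()):
    """Build a feed document, marking `urgent` URLs for immediate crawl.

    Element names are assumptions based on the GSA feed format;
    verify them against the official Feeds Protocol guide before use.
    """
    urgent = set(urgent)
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(feed, "group")
    for url in urls:
        record = ET.SubElement(group, "record",
                               {"url": url, "mimetype": "text/html"})
        if url in urgent:
            # Attribute introduced in release 6.12.
            record.set("crawl-immediately", "true")
    return ET.tostring(feed, encoding="unicode")
```

Keeping the urgent URLs as a separate argument makes the prioritisation decision explicit at the call site, for example when a news page must be searchable within minutes while the archive can wait for the normal crawl.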

X-robots-tag support

The appliance now supports the X-Robots-Tag HTTP header, which makes it possible to exclude non-HTML documents from the index.
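Since X-Robots-Tag is an HTTP response header, a non-HTML document such as a PDF can carry the same kind of directive that an HTML page would put in a robots meta tag. The following Python sketch shows how a crawler might interpret the header. It is a simplification under stated assumptions: the function name is hypothetical, only the noindex directive is handled, and user-agent-specific forms of the header are ignored.

```python
def is_indexable(headers):
    """Return False if an X-Robots-Tag header asks for 'noindex'.

    `headers` maps HTTP response header names to values.
    Only the 'noindex' directive is considered in this sketch.
    """
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":  # header names are case-insensitive
            directives = {part.strip().lower() for part in value.split(",")}
            if "noindex" in directives:
                return False
    return True
```

On the content side, this means the web server that serves the documents has to be configured to send the header for the files that should stay out of the index.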

Google Search Appliance documentation page