Update on Findability Day 2013

Findability Day 2013 is just a few weeks away and the agenda is now finalized. We will have great keynote speakers and cases for inspiration and of course the approximately 200 attendees will create a valuable networking opportunity.

The event takes place in Stockholm on May 30th and as organizers of the event we are proud to present the following list of speakers and topics:

Martin White – The future of search

Daniel Bergqvist, Google – The Star Trek Computer

Bjørn Olstad, Microsoft – Unveil the hidden values in your organization

Ravi Mynampaty, Harvard Business School – Developing a Search & Findability Practice for the Enterprise 

Kristian Norling, Findwise – The 2013 Findability Survey

Sebastian Forseland, Husqvarna – Master Data management + Enterprise Search = User Satisfaction

Christian Finstad, Meltwater – Big data for online insight

Martin Öhléen, SKF – Search as a driver in Mobile applications

Jonas Berg, Svensk Byggtjänst – The next generation business search engine

Johan Johansson, Municipality of Norrköping – Governance and the role of search in user satisfaction

Niclas Lillman, Scania – Search as a service

DJ Skillman, Senior Director Technical Services, Splunk – Big data

Check out the agenda here for more details and for registration. There are just a few seats left so make sure to register today!

It promises to be a great event and a day full of inspiration, knowledge sharing and networking opportunities to help develop your business, personal skills and professional network.

Hope to see you there!

Event related data – the buzz word at ECIR 2013

One of the major trends at the 35th annual European Conference on Information Retrieval was event related data. The conference took place between the 24th and 27th of March this year in a snowy Moscow, Russia. It attracted around 300 participants from all over the globe, 3 of them findwizards. While ECIR 2013 provided talks on a large variety of topics from across the field, event related data was definitely a buzz word.

The keynote speaker opening the second day of the conference was Rutgers University assistant professor and Mahaya Inc. CTO Mor Naaman. In his talk, Mr Naaman let the following image explain why Mahaya Inc. is in business.

[Image: rome-then-and-now]

The past two papal elections.

The image above clearly shows that the way people act at events has changed considerably in the past few years. Nowadays everyone is a reporter, and their stories can be found on social media. Using platforms such as Twitter, Facebook and YouTube as data sources, Naaman’s company creates products which not only extract but also synchronize event coverage. One interesting feature in their latest product is the synchronization of video clips, making it possible for a user to easily switch views when watching video footage of, for example, a concert. An arguably even stronger feature of this use of social media is the fact that news and event footage can reach the world even if no press is present at the scene. Slides from this inspiring talk can be found here.

Another presentation the same day displayed promising results in the task of automatic event detection. Using machine learning algorithms, a team of researchers from Hanover, Germany has designed a system for detecting and summarizing entity-related events from Wikipedia edit history data. The basic idea is that when a Wikipedia article is edited by a large number of users in a short period of time, this can mark an important event concerning the subject of the article. More information about this research can be found here.
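To make the basic idea concrete, here is a minimal sketch of burst detection over an article's edit history. It is not the researchers' system (which used machine learning); it is a plain sliding-window count, and the function name, the one-hour window and the threshold of five distinct editors are all assumptions chosen for illustration.

```python
from collections import deque
from datetime import datetime, timedelta


def detect_edit_bursts(edits, window=timedelta(hours=1), min_editors=5):
    """Return the timestamps at which at least `min_editors` distinct users
    have edited the article within the trailing `window`.

    `edits` is an iterable of (timestamp, username) pairs sorted by time.
    The window size and threshold are illustrative assumptions.
    """
    recent = deque()  # edits that fall inside the current window
    bursts = []
    for ts, user in edits:
        recent.append((ts, user))
        # Drop edits that have fallen out of the trailing window.
        while recent and ts - recent[0][0] > window:
            recent.popleft()
        distinct_editors = {name for _, name in recent}
        if len(distinct_editors) >= min_editors:
            bursts.append(ts)
    return bursts
```

An article edited once a day by different users never triggers the detector, while five different users editing within twenty minutes do. A real system would also have to summarize what the event was, which this sketch does not attempt.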

The last day of the conference opened with a presentation from Jimmy Lin of Twitter. His talk centered on the importance of fast real-time indexing in social media platform architecture. One of the strengths of Twitter is presenting the users with information about events as they happen. As an example of this he used the event of an earthquake hitting eastern USA in 2011. Tweets from locations closer to the epicenter of the earthquake reached Twitter users in New York City before the actual quake did. I have to admit “Twitter, faster than earthquakes” is a pretty good slogan.

So whether it’s using social media data to let people (re)visit events, automatic event detection in openly editable encyclopedias such as Wikipedia, making sure your indexing is fast enough to let your users cover events as they happen, or something else, event-based data seems to be one of the driving forces in the field of IR at the moment.

Big Data is a Big Challenge

Big Data is also a Big Challenge for a number of companies that would like to be ahead of the competition. I think Findwise can help a lot, both with technical expertise in text analytics and search technology and with how to put Big Data to use in a business.

During the last days of February I had the pleasure of attending the IDG Big Data conference in Warsaw, Poland. It brought together plenty of people from both vendors and industry who shared interesting insights on the topic. In general, big vendors that try to be associated with Big Data dominated the conference. IBM, SAS, SAP and Teradata provided massive amounts of marketing information on software products and capabilities around Big Data. Interestingly, every single presentation had its own definition of what Big Data is. This is probably because everybody tries to find the definition that best fits their own products.

From my perspective it was very nice to hear that everyone agrees text analytics and search components are of great importance in any Big Data solution. In many analytics applications (both predictive and deductive), and for mass social media data, one must use advanced linguistic techniques to retrieve and structure the data streams. This came across especially strongly in the IBM and SAS presentations.

A couple of companies revealed what they have already achieved in so-called Big Data. Orange and T-Mobile presented their approach of extending traditional business intelligence to harness Big Data. They want to go beyond the standard data collected in transaction databases and open up all the information they have from calls (answered and unanswered), SMS, data transmission logs, etc. Telecom companies consider this kind of information to be a good source of data about their clients.

But the most interesting sessions were held by companies that openly shared their experience of how their Big Data solutions, based mainly on open source software, have evolved. Adam Kawa from Spotify, for instance, showed how they built their platform on a Hadoop cluster that grew from a single server to a few hundred today. To me that seems like a good way to grow and adapt easily to changing business needs and external conditions.

Nasza Klasa, a Polish Facebook competitor, gave a very good presentation on several challenges in Big Data solutions, which might serve as a summary of this post:

  1. Lack of legal regulations – Currently there are no clear regulations on how the data might be used and how to make money out of it. It is especially important for social portals where all our personal information might be used for different kinds of analysis and sold in aggregated or non-aggregated form. But the laws might be changed soon, thus changing the business too.
  2. Big Data is a bit like research – it is hard to predict return on investment on Big Data as it is a novelty but also a very powerful tool. For many who are looking into this the challenge is internal, to convince executives to invest in something that is still rather vague.
  3. Lack of data scientists – even if there are tools for operating on Big Data, there is a huge lack of skilled people – Big Data operators. These are not IT people nor developers but rather open-minded people with a good mathematical background able to understand and find patterns in a constantly growing stream of various structured and unstructured information.

As I stated at the beginning of this post, Big Data is also a Big Challenge for a number of companies that would like to be ahead of the competition. I truly believe we at Findwise can help a lot within this area, since we have both the technical expertise and the experience of how to put Big Data to use in a business.

Welcome to Findability Day 2013

Don’t miss the opportunity to visit the biggest search event in Northern Europe focusing entirely on how to find and display corporate information.

Last year we took the first steps towards creating a new industry event for everyone interested in search and findability. This year we are taking it to the next level!

The agenda is a work in progress, but we can promise a day full of inspiration, knowledge sharing and networking opportunities to help develop your business, personal skills and professional network.

The event takes place in Stockholm on May 30th. For more details check out the event here.

Hope to see you there!

Graph Search from Down Under

We’ve already written about the new concept called Graph Search, which is being popularized by Facebook. Wouldn’t it be cool if we applied this to the enterprise as well, as I wrote in an earlier blog post on Enterprise Graph Search? That’s what the Australian startup Lumanetix thought when they created the SPAR-K graph search engine for the enterprise.

Applied graph search

As seen in the screenshots of the product, it runs queries against relational databases with linked data objects, such as Movies linked to People in Casts, or Managers of Departments in an organization. One difference from Facebook graph search is the more Google-like, keyword-based query syntax, whereas Facebook uses natural language processing to handle specific queries.

Graph search applied to the enterprise

It’s exciting to see that the market is picking up speed with new innovations in the enterprise search field, of which Lumanetix SPAR-K is an example.

 

/Christian Ubbesen

Speaking about Search as a Service @ PROMISE Technology Transfer day, want to meet up?

Tomorrow morning I leave Gothenburg to attend the PROMISE Technology Transfer day @ CeBIT 2013 in Hanover, Germany.

The event is a workshop introducing its participants to methodologies for the systematic evaluation and monitoring of search engines, and for discussing future trends and requirements for the next generation of information access systems. In other words, it is right up our alley at Findwise.

As Director of Research at Findwise I will speak about Search as a Service. If you are at the event or just nearby I would be happy to meet up and have a chat.  I will be around from Tuesday March 5 until Thursday March 7. Feel free to email me, henrik.strindberg@findwise.com or give me a call at +46709443905.

Hope to see you there!

Query Rules in SharePoint 2013

Leaving both the SharePoint Conference in Las Vegas and the recent European SharePoint Conference in Copenhagen behind, Findwise continues sharing impressions about the new search in SharePoint 2013! We have previously given an overview of what is new in search in SharePoint 2013 and discussed Microsoft’s focus areas for the release. In this post, we focus more on the ranking of the search results using the query rules.

Understanding user intent in search is one of the key developments in the new release. The screenshots below, showing out-of-the-box functionality on some sample content, exemplify how the search engine adapts to the user query. Keywords such as ‘deck’, ‘expert’, or ‘video’ can express the user’s needs and expectations for different search results and information types, and what the search engine does in this case is promote those results that have a higher probability of being relevant to the user’s search.

Query rules

Source: Microsoft

 

The adaptability of the search results can seem remarkable, as we see in these examples, aiming to provide more relevant search results through a better understanding of the user intent. Actually, this is powered by a new feature in SharePoint 2013 called query rules. Even more interesting maybe is that you can define your own custom query rules matching your specific needs without writing any code!

The simplest query rule would be to promote a specific result for a given search query. For example, you can promote a product’s instruction manual when the users search for that product name. Previously, in SharePoint 2010, you were able to define such promoted results (or “best bets”) using the Search Keywords. The query rules in SharePoint 2013 extend this functionality, providing an easy way to create powerful search experiences that adapt to user intent and business needs.

When defining a query rule, there are two main things to consider: conditions and corresponding actions. The conditions specify when the rule will be applied and the actions specify what to do when the rule is matched. There are six different condition types and three action types that can be defined.

For example, a query condition can be that a query keyword matches a specified phrase or a term from a dictionary (such as ‘picture’, ‘download’ or a product name from the term store), or that the query is more popular for a certain result type (such as images when, for example, searching for ‘cameras’), or that it matches a given regular expression (useful for matching phone numbers, for example).

The correlated actions can consist of promoting individual results on top of the ranked search results (promoting, for example, the image library), promoting a group of search results (such as image results, or search results federated from a web search engine), or changing the ranking of the search results by modifying the query (by changing the sorting of results or filtering on a content type).

Another thing to consider is where you define the rule. Query rules can be created at Search Service Application, Site Collection, or Site level. The rules are inherited by default, but you can remove, add, configure and change the order of query rules at each level. Fortunately, SharePoint also allows you to test a query and see which rules will fire.
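The condition-and-action model can be illustrated with a small sketch. This is not SharePoint's API (query rules are configured in the administration interface without code); it is a toy rule engine, and every name in it (`QueryRule`, `keyword_in`, `apply_rules`, the `ContentType` field and the example URL) is hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class QueryRule:
    name: str
    condition: Callable[[str], bool]  # when does the rule fire?
    action: Callable[[Dict], None]    # what does it do to the search request?


def keyword_in(terms):
    """Condition: any query word matches a term from a dictionary."""
    terms = {t.lower() for t in terms}
    return lambda query: any(word in terms for word in query.lower().split())


def matches_regex(pattern):
    """Condition: the whole query matches a regular expression."""
    compiled = re.compile(pattern)
    return lambda query: compiled.fullmatch(query.strip()) is not None


def promote(url):
    """Action: put an individual result on top of the ranked results."""
    return lambda request: request["promoted"].append(url)


def add_filter(field, value):
    """Action: change the ranked results by modifying the query."""
    return lambda request: request["filters"].append((field, value))


def apply_rules(query, rules):
    """Run the rules in order; return the modified request and the fired rule names."""
    request = {"query": query, "promoted": [], "filters": []}
    fired = []
    for rule in rules:
        if rule.condition(query):
            rule.action(request)
            fired.append(rule.name)
    return request, fired


# Illustrative rules only; the field name and URL are made up.
RULES = [
    QueryRule("image intent", keyword_in(["picture", "photo"]),
              add_filter("ContentType", "Image")),
    QueryRule("phone lookup", matches_regex(r"\+?\d[\d\s-]{6,}"),
              promote("https://intranet.example/phonebook")),
]
```

Calling `apply_rules("camera picture", RULES)` fires only the first rule and adds a filter, much like testing a query to see which rules will fire. Returning the list of fired rule names is a design choice made here to mimic that test facility.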

There is one more thing though that you need to take into account: some features of query rules are limited in some of the licensing plans. Some plans only allow you to add the promoted results, and the more advanced actions on query rules are disabled. Check TechNet for guidelines on managing query rules and a list of features available across different licensing plans.

With the query rules, you have the freedom and power to change the search experience and adapt it to your needs. Defining the right keywords to be matched on the user queries and mapping the conditions with the relevant actions is easy but the process must undoubtedly be well managed. The management of the query rules should definitely be part of your SharePoint 2013 search governance strategy.

Let’s have a chat about how you can create great search experiences that match your specific users’ and business needs!

Mobility 2013 top trend among Sweden’s CIOs

Each year CIO Sweden conducts a trend survey among Sweden’s CIOs. They also host an annual event where they discuss the results and the CIOs from some of Sweden’s largest companies talk about their vision. On February 6 I attended this year’s CIO Trends event at Münchenbryggeriet in Stockholm.

The main conclusion from this year’s survey is that not many things have changed compared to last year. However, one interesting change was that last year’s most important trend, cloud and cloud solutions, had been knocked off the top spot by mobility: mobility as in the ease of moving around, not only in the office but also on a larger scale around the world. Information should always be at your fingertips, no matter the device or connection. The cloud is still a hot topic and the focus on it remains high among companies. I guess we will henceforth see more of a combination of the two, where you use the cloud to create more mobility.

Fun fact of the day from the CIOs of Sweden: The most common CIO in Sweden is male (84%), around 45-49 years old (33%) and doesn’t like shopping (2%).

//Ludvig Aldrin, still Sweden’s youngest CIO (CIOs under 30, 1%)

Enterprise Graph Search

Facebook will soon launch their new Graph Search to the general public, and it has received a lot of interest lately.

With graph search, users will be able to query the social graph that millions of people have constructed over the years by friending each other and putting more and more personal information about themselves and their friends into the vast Facebook database. It will be possible to query for friends of friends who have similar interests to yours and invite them to a party, or to query for companies where people with similar beliefs to yours work, and so on and so forth. The information that is already available will all of a sudden become much more accessible through the power of graph search.

How can we bring this to an enterprise search environment? Well, there are lots of graphs to query in the enterprise as well, both social and of other types. For example, how about being able to query for people who have been members of a project in the last three years that involved successfully bringing a new product to the market? This would be an interesting list of people to know about if you’re a marketing director who wants to assemble a team in the company to create a new product and make sure it succeeds in the market.
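As a rough sketch of what such a query could look like once the relations are available, consider the following. The project records, field names and the `launched_product` flag are invented for illustration; in practice this data would come from project, HR and product systems indexed into one graph.

```python
from datetime import date, timedelta

# Hypothetical project records linking projects to their members.
PROJECTS = [
    {"name": "Alpha", "ended": date(2012, 6, 1), "launched_product": True,
     "members": {"Anna", "Bo"}},
    {"name": "Beta", "ended": date(2008, 3, 1), "launched_product": True,
     "members": {"Carl"}},
    {"name": "Gamma", "ended": date(2012, 11, 1), "launched_product": False,
     "members": {"Bo", "Dana"}},
]


def members_of_successful_launches(projects, today, years=3):
    """People who worked on a project that brought a product to market
    and ended within roughly the last `years` years."""
    cutoff = today - timedelta(days=365 * years)  # approximate, ignores leap days
    people = set()
    for project in projects:
        if project["launched_product"] and project["ended"] >= cutoff:
            people |= project["members"]
    return sorted(people)
```

With `today` set to 1 February 2013, only project Alpha qualifies, so the marketing director gets Anna and Bo as candidates.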

If we dissect graph search, we will find three important concepts:

  1. The information we want to query against doesn’t only need to be indexed into one central search engine; the relations and attributes of all information objects also need to be normalized to create the relational graph and provide standard attributes to query against. We could use the Open Graph Protocol as the foundation.
  2. We need a parser that takes human language and converts it to a formal query language that a search engine understands. We might want to query in different human languages as well.
  3. The presentation of results should be adapted to the kind of information sought. In Facebook’s example, if you query for people you will get a list of people with their pictures and some relevant personal information in the result list, and if you query for pictures you will get a collage of pictures (similar to Google image search).
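Concepts 2 and 3 can be sketched together in a few lines. This is a deliberately naive parser built on a single regular expression, not a real natural language parser, and the phrasing it accepts, the `VIEWS` mapping and the output fields are all assumptions made for the example.

```python
import re

# Hypothetical mapping from result type to presentation (concept 3).
VIEWS = {"people": "profile_list", "pictures": "collage"}

# One hand-written phrasing; real coverage would come from studying user queries.
PATTERN = re.compile(
    r"^(?P<entity>people|pictures) (?:of|from|in) (?P<target>.+?)"
    r"(?: in the last (?P<years>\d+) years?)?$",
    re.IGNORECASE,
)


def parse(text):
    """Turn a restricted natural-language phrase into a structured query,
    or return None when the phrase is not understood."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    entity = match.group("entity").lower()
    years = match.group("years")
    return {
        "entity": entity,
        "target": match.group("target"),
        "max_age_years": int(years) if years else None,
        "view": VIEWS[entity],
    }
```

For instance, `parse("People in project Alpha in the last 3 years")` yields a query for people with the target `project Alpha`, a three-year limit and the `profile_list` view, while an unrecognized phrase returns `None` so the caller can fall back to ordinary keyword search.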

So the recipe for success is to give the information management part of the project a big focus, making sure to create a unified information model of the content to be indexed. Then create a query parser for natural language based on actual user behavior; the same user studies would also give us information on how to visualize the different result set types.

I believe we will see more of this kind of solution in the enterprise search market in the coming years, and I look forward to exploring the possibilities together with our clients.

Predictive Analytics World 2012

At the end of November 2012, top predictive analytics experts, practitioners, authors and business thought leaders met in London at the Predictive Analytics World conference. The intimate nature of the conference, combined with the great variety of experiences brought by over 60 attendees and speakers, made it a unique opportunity to dive into the topic from a Findwise perspective.

Dive into Big Data

In the Opening Keynote, presented by Program Chairman Geert Verstraeten, PhD, we could hear about ways to increase the impact of Predictive Analytics. Unsurprisingly, a lot of the fuss is about embracing Big Data. As analysts have more and more data to process, their need for new tools is obvious. But business will cherish Big Data platforms only if it sees the value behind them. Thus, in my opinion, before anything else that has an impact on successful Big Data Analytics, we should consider improving business-oriented communication. Even the most valuable data has no value if you can’t convince decision makers that it’s worth digging into.

But being able to clearly present the benefits is not everything. Analysts must strive to create specific indicators and variables that are empirically measurable. Choose the right battles. As Gregory Piatetsky (data mining and predictive analytics expert) said: more data beats better algorithms, but better questions beat more data.

Finally, aim for impact. If you have a call center and want to persuade customers not to cancel your services, then it’s not wise just to call everyone. But it might also not be wise to call everyone you predict to have a high risk of leaving. Even if you lose fewer clients as a result, there might be a large group of customers who will leave only because of the call. Such customers may also be predicted. And as you split the clients at high risk of leaving into “persuadable” ones and “touchy” ones, you are able to fully leverage your analytics potential.
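The reasoning about “persuadable” and “touchy” customers, often called uplift modelling, can be shown with a small sketch. It assumes two model outputs per customer that the talk did not specify: the predicted probability of leaving without a call and the predicted probability of leaving if called. The field names, the threshold and all the numbers are made up for illustration.

```python
def choose_customers_to_call(customers, min_uplift=0.05):
    """Pick customers for whom a call is predicted to lower the risk of leaving.

    Each customer is a dict with two assumed model outputs:
    `p_leave_no_call` and `p_leave_if_called`.
    """
    to_call = []
    for customer in customers:
        # Uplift: how much the call is expected to reduce the risk of leaving.
        uplift = customer["p_leave_no_call"] - customer["p_leave_if_called"]
        if uplift >= min_uplift:
            to_call.append(customer["id"])
    return to_call


# Invented example customers, one per segment.
customers = [
    {"id": "loyal",       "p_leave_no_call": 0.05, "p_leave_if_called": 0.05},
    {"id": "persuadable", "p_leave_no_call": 0.60, "p_leave_if_called": 0.30},
    {"id": "touchy",      "p_leave_no_call": 0.55, "p_leave_if_called": 0.80},
]
```

Ranking by risk alone would put both the persuadable and the touchy customer on the call list. Ranking by the difference between the two predictions selects only the persuadable one.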

Find it exciting

The greatest thing about Predictive Analytics World 2012 was how diverse the presentations were. Many successful business cases from a large variety of domains and a lot of inspiring speeches make it hard not to get at least a bit excited about Predictive Analytics.

The cases ranged from banking and financial scenarios to sports training and performance prediction for a rugby team (if you like at least one of baseball, Predictive Analytics or Brad Pitt, I recommend you watch the movie Moneyball), not to mention a case study about reducing youth unemployment in England. But there are two particular presentations I would like to say a word about.

The first of them was a case study on Predicting Investor Behavior in First Social Media Sentiment-Based Hedge Fund, presented by Alexander Farfuła, Chief Data Scientist at MarketPsy Capital LLC. I find it very interesting because it shows how powerful Big Data can be. By using massive amounts of social media data (e.g. Twitter), they managed to predict a lot of global market behavior in certain industries. That is the essence of Big Data: harnessing a large number of small chunks of information that are useless alone, to get a useful Big Picture.

The second one was presented by Martine George, Head of Marketing Analytics & Research at BNP Paribas Fortis in Belgium. She gave a really great presentation about developing and growing teams of predictive analysts. As the topic is a lively one at Findwise, and probably in every company interested in analytics and Big Data, I was pleased to learn so much and to talk about it later on in person.

Big (Data) Picture

The day after the conference, John Elder from Elder Research led an excellent workshop. What was really nice is that we concentrated on the concepts, not the equations. It was like a semester in one day: a big picture that can be digested into technical knowledge over time. But the most valuable general conclusion was twofold:

  • Leverage – an incremental improvement will matter! When your turnover can be counted in millions of dollars, even half a percent of savings means large additional revenue.
  • Low hanging fruit – there is a lot to gain from what nobody else has tried yet. That includes reaching for new kinds of data (text data, social media data) and daring to make use of it in a new, cool way with tools that weren’t there a couple of years ago.

Plateau of Productivity

In conclusion, I would say that Predictive Analytics has become a mature discipline, and one of the most useful on the market. In terms of the famous Gartner Hype Cycle, Predictive Analytics has reached the Plateau of Productivity. Though often thankless and demanding in resources, money and time, it can offer your company a successful future.